Categories

## DATEDIF How to Calculate Difference Between 2 Dates

The `DATEDIF` function calculates the number of periods between two dates. The best way to remember this function is it calculates the DATE DIFference between two dates.

## What is DATEDIF?

The `DATEDIF` function is a popular formula that has 3 parameters which are all required. If you [encounter any error]({{< relref “#datedif-errors-and-problems” >}}) when using this function check you have populated all three fields correctly.

The function reference is as follows:

``=DATEDIF(start_date, end_date, unit) ``

The `start_date` parameter should be the earliest date, the `end_date` being the latest date, and the `unit` is the value of the periods you want returned from the difference between those two dates.

Accepted values in Google Sheets for the `unit` value are:

### Date Difference in Days

By setting the third parameter to the `DATEDIF` formula to `D` it means the resulting value returned will calculate the number of days between the two dates. Here are some examples:

Several things to note about these tests when using the `DATEDIF` function and the `unit` parameter as `D`:

1. The dates are not inclusive. Notice the first row is 0, if it were inclusive it would have been 1.
2. Should the `end_date` come before the `start_date` you will get an `#NUM!` error. Make sure your `end_date` occurs on or after the `start_date`.

### Date Difference in Months

If you want to calculate the difference between two dates in months, then set the third parameter of the formula to `"M"`, here are some interesting examples of what this looks like for certain values:

As can be seen from the table above, the day of the month needs to be either eclipsed or matched for the returning value to increment. In rows 5 and 6 especially we can see the days in the month of February 2020 would not have eclipsed the `start_date` of 31st January 2020, but when the `end_date` advances into the next month (1st March) the increment occurs.

Therefore, if your `start_date` is the end of a month containing 31 days it will not be until the 1st day of the following month, preceding a month containing only 28, 29 or 30 days, that the value will be incremented.

### Date Difference in Years

If you want to calculate the difference between two dates as years then set the third parameter value to `"Y"`. Here is what this would look like based on the following inputs:

As can be seen from the examples above, it’s not until the date and month of the `start_date` has been eclipsed or matched the value returned by the `DATEDIF` formula increments. This is very similar to when calculating the [date difference in months]({{< relref “#date-difference-in-months” >}}).

### Date Difference in Days Irrespective of Months

If you need to calculate the difference in days between two dates irrespective of the difference between months then you’ll need to use `"MD"` as the value of the third parameter, as demonstrated below:

As previously noted with the [difference in days]({{< relref “#date-difference-in-days” >}}) we can see from the examples above the difference in days is not inclusive. However, there are some interesting results:

Row 3 shows a result of -1. Why? This implies there was a day short in the previous month of getting to the same date, and the current date of 1st March 2020 is 1 day short of the 31 days (29 days in February + 1 day in March = 30 days – 31 days = -1). This then puts the remaining values of that month (in March) out, until we hit the new month April.

### Date Difference in Whole Months

If you want to calculate the difference between two dates in whole months you can set the third parameter to the `DATEDIF` formula to `"YM"`.

The results from this result set above are fairly simple to understand and are consistent with what we already know about – the day of the month needs to be matched or eclipsed to increment the value.

### Date Difference in Days Irrespective of Years

Similar to our previous example with [date differences in days irrespective of months]({{< relref “#date-differences-in-days-irrespective-of-months” >}}) we have a similar calculation but this time irrespective of years, here are some examples:

Consistent with the days difference irrespective of months we see that where the start date isn’t matched by day of month and month, it calculates the difference the previous month fell short.

## Benefits of DATEDIF Over Subtracting Two Dates

An objection to using the `DATEDIF` formula is that it achieves the same result when subtracting dates, which is certainly far simpler than trying to remember a formula, and it’s parameter values.

The biggest benefit of the `DATEDIF` formula is that it helps to calculate the difference between two dates by providing us with the type difference we need. Therefore, the `DATEDIF` formula is more versatile for various use cases.

Another minor benefit to using the `DATEDIF` function when calculating the difference in days between two dates over the simple subtraction method is if the input dates contain time.

You can still achieve the same answer as the `DATEDIF` function, but you would need to remember and apply to use the `TRUNC` formula to achieve the same answer in row 3 above, like so:

## DATEDIF Errors and Problems

There are a couple of things to be mindful of when using the `DATEDIF` function. As already shown above in the [DATEDIF days examples]({{< relref “#date-difference-in-days” >}}) we had a result on the last row that gave us an error.

If you do get a `#NUM!` error check the parameter values are correct, and the earliest date is the `start_date` value, and the later date is the `end_date` value.

Provided your parameter values have been set correctly, the only other type of error which may cause problems and may not be as obvious is if the locale setting of your dates are not what you anticipated.

### Incorrect DATEDIF Locale

If you’re getting an answer which definitely isn’t correct you will want to check the settings of your spreadsheet to ensure it correctly interprets a date field value.

For example, some countries, such as the UK and Australia, have dates set to the following format: `DD/MM/YYYY` whereas other countries, such as Canada and the Unites States, have date values set to the format of: `MM/DD/YYYY`.

Ways to minimise this error occurring would be to quickly check your locale settings are correct before starting a new spreadsheet, or when modifying a spreadsheet.

In a Google Sheet this would be simply going to File > Spreadsheet Settings > Locale – change this to your country.

{{< figure src=”/images/datedif-check-locale.png” caption=”Check spreadsheet locale settings” alt=”Google Sheets locale setting” >}}

Another way to check is to input a date that wouldn’t be a date in the other format, for example, try `01/13/2020` and `13/01/2020` – one of these will not work and your `DATEDIF` formula will return a `#VALUE!` error as the date fields entered aren’t actual dates.

## Conclusion

In this article we’ve done a deep dive into the `DATEDIF` formula and seen the results it provided based on different parameter values.

We’ve also seen how the formula can be better suited than simply subtracting two dates, and we’ve diagnosed our problems and fixed them.

Overall the `DATEDIF` function is a fantastic function to use, and you should look at incorporating it more in your own daily use. If you’d like to learn more about the function in Google Sheets, you can read the [documentation]({{% relref “datedif-docs” %}}).

Categories

## How to Flip ARRAY_CONSTRAIN Formula in Google Sheets

Previously I posted how you can use the `INDEX()` function to obtain the fields needed for a simple `SUM()` function.

Then I came across another handy Google Sheet function `ARRAY_CONSTRAIN()`.

## What ARRAY_CONSTRAIN Formula Does

There are three parameters with this formula:

1. `range` – insert the range for the formula to operate on.
2. `num_rows` – set the number of rows to compress.
3. `num_cols` – set the number of columns to compress.

### Example

Let’s look at a current depreciation problem I’m trying to solve.

I’m trying to compress the different types of depreciation on an array of different purchases projected over the next few years and to calculate what the new depreciation would be once these items have been purchased.

The data looks like this:

{{< figure src=”/images/purchase-schedule-with-depreciation-life.png” alt=”depreciation schedule” caption=”Simple depreciation schedule” >}}

What we’re trying to achieve is to determine what the depreciation would be each year according to the year when items have been purchased, and their corresponding depreciation life in years.

## Limitations of ARRAY_CONSTRAIN

Unfortunately the big limitation of `ARRAY_CONSTRAIN` is it works from the top-leftmost corner to cut the excess cells.

For example, if we apply the following formula on our data set:

``=ARRAY_CONSTRAIN(C1:G6, 3, 2)``

It will return for us the following highlighted cells:

This causes a problem for our depreciation schedule as we’d like to fetch the last two years when we’re in the `2024` column for row 3, so using the following formula:

``=ARRAY_CONSTRAIN(\$C3:G\$3, 1, B\$3)``

For the `Network Upgrade` row would only return `[20000,<blank>]`, as these are the nearest to the top-left cells.

## Workaround to ARRAY_CONSTRAIN

Is there a way to flip the input row?

There is!

Is there a `REVERSE()` function in Google Sheets we can apply to the range?

Unfortunately not yet.

So what we need to do as a workaround is to create our own `REVERSE()` function by sorting our data and reversing it backwards.

Thankfully my data already has a year row (`Row 1`), all I need to be able to do is sort the data set by that row, apply descending sort and then insert the new range back into the `ARRAY_CONSTRAIN()` formula.

Here’s how I flipped my data range:

``=ARRAY_CONSTRAIN(QUERY(TRANSPOSE(\$C\$1:G3),"SELECT Col" & ROWS(\$C\$1:G3) & " ORDER BY Col1 DESC", 0),\$B3,1)``

Let’s break apart this function and understand what’s happening, as usual we’ll start with the innermost functions and work out way out:

• `ROWS(\$C\$1:G3)` – I used this to be able to get the column number to return from the `QUERY` function. As the last Column in the data set is the one I wanted, I just obtained this from the height of the original input range.
• `TRANSPOSE(\$C\$1:G3)` – as my data set is in rows, I need to translate this into columns for it to be functional as a data set with the `QUERY()` function. The data set needs to include the row containing the years.
• `QUERY(..., "SELECT Col" & ... & " ORDER BY Col1 DESC", 0)` – the second `SELECT` parameter to our `QUERY()` function fetches just the one column, but orders by the first column being the column containing all the year values. The third parameter `0` just makes sure we don’t return any header rows.
• `ARRAY_CONSTRAIN(..., \$B3, 1)` – we finally wrap everything back into our original `ARRAY_CONSTRAIN` function. This time we change the parameters a little as they are in a column format, but the purposes are still the same – we want the nearest 2 years (the value of cell `\$B3`) – and as there’s only `1` column our third parameter is `1`.

To calculate depreciation on this row our final touch is to sum everything returned from the range and divide by the number of years:

``=SUM(ARRAY_CONSTRAIN(QUERY(TRANSPOSE(\$C\$1:G3, "SELECT Col" & ROWS(\$C\$1:G3) & " ORDER BY Col1 DESC", 0), \$B3, 1))/\$B3``

We further wrap the formula in a check to see if there are any depreciation values in the Dep’n (Years) column:

``=IF(LEN(\$B3),SUM(...),"")``

### Result

Here’s the final result showing the function as well as our accumulated depreciation for the items purchased:

## Conclusion

In this post you’ve been able to discover and learn about the `ARRAY_CONSTRAINT`. You’ve seen its limitations in fetching from the upper-leftmost corner, and perhaps in time Google Sheets will append an additional parameter to set where the corner should be fetched from.

In the meantime we’ve discovered how we can reverse or flip a range to make use of the `ARRAY_CONSTRAINT` function by using the `TRANSPOSE` and `QUERY` functions together.

Categories

## Ignore Blank Cells in QUERY Function Google Sheets

If you have a series of data where you’d like to apply the `QUERY` function over, but want to ignore certain rows if a particular cell is blank or empty then you would write your formula as follows:

``=QUERY(DataRange, "SELECT * WHERE Col1 IS NOT NULL", 1)``

The way you can have the Google Sheets `QUERY` function ignore blank or empty cells is by using the string `IS NOT NULL` in your formula.

## Example

As an example, suppose we had the following data on assets we’d like to purchase that have differing depreciation rates, but we only want to return the items where there is a depreciation value (implying purchases with no depreciation will be written off in the year they are purchased):

In the above diagram we have the following function:

``=QUERY(A2:E8, "SELECT * WHERE B IS NOT NULL", 0)``

Here’s what each of the parameters passed into the `QUERY` function mean:

## First Parameter – Data Range

Set the data range for which the `QUERY` function will be performing its task.

In the example above the `QUERY` is performed on the data inputs for each of the large purchase rows.

## Second Parameter – Query Filter

Applying a filter on the data range by using a form of SQL language. In our example, here’s what the statement translates into:

• `SELECT *` – select everything from the range;
• `WHERE B IS NOT NULL` – apply the filter on column B where the cell `is not null` (is not blank or empty).

## Third Parameter – Header Rows

Although our data set does contain a header row (`row A`) we don’t want to return this in our result, therefore, we set this value to `0`.

# Solution

The result of our formula is values from the original data set which need to be depreciated over several years.

We can further apply more logic to our `QUERY` function to perhaps `SUM` the value of all assets purchased containing the same depreciation value.

# Conclusion

In this post we’ve explored the powerful `QUERY` function and how we can filter our data by excluding rows where they contain cells which are blank or empty.

By applying the condition `IS NOT NULL` in our `QUERY` filter we can easily remove these from our results.

If you’d like to learn more about `QUERY` functions, especially how we can merge multiple rows into one you should check out our illustrated step-by-step article.

Categories

## 4 Simple Steps to Merge 2 Columns Into 1 With QUERY Function

If you have two or more columns of data and you want to merge these columns into one column then using the Google Sheets `QUERY` formula may be one way you can achieve this.

Here’s an in depth explanation of how to achieve this:

## Step 1 – Prep your data

If you data doesn’t contain any spaces then you’re good to go, if though your data does contain spaces then you will need to define a character at the end of each cell to get this to work.

## Step 2 – Understanding the QUERY Function

Let’s have two simple data sets, one which does not contain spaces in its data set, and another which does:

With both data sets we have highlighted in grey the returned values from the `QUERY` function.

### QUERY parameters

The `QUERY` function is a powerful formula in Google Sheets helps to treat data contained in tabular format like a database table which we can perform basic `SQL` queries on.

This function contains three parameters:

1. The data set.
2. The `SELECT` command.

It’s the third parameter of the `QUERY` function that will help us in being able to concatenate the data within each column.

At the moment, from above, we have set the third parameter to `0` which means our data set does not return any headers.

But what if I want to define the first row as being the header?

Then we would simply change this third parameter in the `QUERY` function to `1` – representing the first row. When we perform this change to our formula we see no change to our data, however, the returned values have now defined that first row as a header area.

What if we were to increase the number represented in our `QUERY` function? How is the result set returned if we define the `header` parameter as `2`?

Here’s how it looks for our data sets:

Did you notice what the `QUERY` function does when we define more than one header row?

It concatenates the header rows and inserts a space between each appended element. Therefore, the first row containing `A` and the second row containing `B` produce the header `A B`.

(Can you foresee the issues we’re going to run into when we merge our columnar data which contains spaces already?)

So if we wanted to get all the elements within our column wrapped into one row we would just need to change that third parameter to the maximum height of our columns. We could use the `ROWS()` function to capture this number:

## Step 3 – JOIN columns

As the columns have been merged into one header row, we now want to stitch these column headers together.

This is achieved by wrapping the `QUERY` function in a `JOIN` function, like so:

Notice here we used the first parameter of the `JOIN` function as a space to mimic the behaviour of what the `QUERY` data did to our data too.

Do be aware the `JOIN` function does have a limit of `50,000` characters. If you have a lot of data to concatenate you might want to break it up to smaller data sets.

### What if our data contains spaces?

As you can see this strategy isn’t going to work too well as our result now contains too many spaces between data cells, and looks as follows:

Therefore, we’re going to need to improvise on the spaces in our current data set.

What we’re going to do is to modify our original data by substituting spaces for an oblique character that isn’t used in our data. For example, my current data set with names isn’t likely to have a tilde character `~`.

Therefore, what we want to be able to do is to change our imported data set in the `QUERY` function so that we are operating on data that doesn’t contain any spaces – we want to be able to reserve these spaces for the explicit purpose of being able to split our data back into cells again.

One such way I achieved this was by changing our original formula:

``QUERY(A1:B3,,ROWS(A:A))``

To this:

``QUERY(ARRAYFORMULA(SUBSTITUTE(A1:B3," ","~")),,ROWS(A:A))``

What has happened with our first `QUERY` parameter?

Instead of referencing the original data set we modify the data set to substitute spaces for our oblique tilde `~` character.

The `ARRAYFORMULA` provides us the functionality of being able to operate on each cell within the range, and all we want to do is apply the `SUBSTITUTE` formula to each cell – if it has a space character replace it with the tilde character.

The result of our change should produce something like this:

We’ve now achieved our purpose of reserving the space characters for another specific purpose which we’ll see next.

## Step 4 – Transform back to one column

Our data is in one cell, split by the space character.

To stretch this back out into one single column all we need to do is `SPLIT` the data by the special space character and then `TRANSPOSE` it to leave the data as a column, otherwise it will remain in row fashion if we don’t.

Here’s what happens to our first data set when we `SPLIT` first:

``=SPLIT(JOIN(" ", QUERY(A1:B4,,ROWS(A:A)))," ")``

Then our concluding formula for the first data set:

``=transpose(split(join(" ", query(A1:B4,,rows(A:A)))," "))``

### What about our data set with spaces?

A little more work is going to be required with our original data set which contains spaces.

If we copy the above formulas into our working “space” formula here’s the result:

``=TRANSPOSE(SPLIT(JOIN(" ",QUERY(ARRAYFORMULA(SUBSITUTE(A1:B3," ","~")),,ROWS(A:A)))," "))``

As you can see we need to remove at the end each of those tilde characters.

How do you think we could do this?

If you were thinking `ARRAYFORMULA` and `SUBSTITUTE` again – well done! You’re acquiring a new skill today.

Here’s our concluding formula:

``=ARRAYFORMULA(SUBSTITUTE(TRANSPOSE(SPLIT(JOIN(" ",QUERY(ARRAYFORMULA(SUBSTITUTE(A1:B3," ","~")),,ROWS(A:A)))," ")),"~"," "))``

## Conclusion

So there you have it!

• What the third parameter of the `QUERY` formula does.
• How we can modify each cell within a range by using `ARRAYFORMULA` – specifically using the `SUBSTITUTE` formula.
• How to handle different data sets with the `QUERY` function, especially if they contain space characters.
• How to `JOIN` our `QUERY` data to form a single cell of data (careful of the 50,000 character limit).
• How to `SPLIT` our data set back into cells.
• How to `TRANSPOSE` our data set back into columns.

Well done if you’ve been able to follow along and understand these complex topics.

Categories

## Easily Merge Multiple Columns Into 1 with Query & ArrayFormula

In today’s post I’d like to explain how I used the following formula to help merge multiple columned data into a different columned arrangement as the purpose of the result was to export the original data set into a CSV file for import into NetSuite.

Here’s the formula we will dissect:

``=ARRAYFORMULA(SPLIT(ARRAYFORMULA(SUBSTITUTE(TRANSPOSE(SPLIT(JOIN(" ",QUERY(DataConcat!A:L,,ROWS(DataConcat!A:A)))," ")),"~"," ")),";",TRUE,FALSE))``

# The problem

I had exported the following salary data from our budgeting software which contained how much each staff person would be receiving each month throughout the financial year.

An example of the raw data looked something like this:

The original spreadsheet had all months of the year and is labelled as `Data`:

I had to import this data into NetSuite using the CSV Import process, which required changing the structure a little bit, so that the resulting CSV looked like so:

As you can see with the requirements of the task, we need to be able to expand the data set to rows (as it’s currently laid out in columns), then merge the data back into one columnar set. Once we have achieved this we then use the handy File > Download > Comma-separated values (current sheet) option in Google Sheets.

# Solution: Step 1 – Expansion

The first rule when manipulating data is to leave the original data set intact.

Therefore, we create a fresh new sheet and label this whatever name is appropriate for your needs. In this example we will name it appropriate to what we are doing: `DataExpand`.

In our expansion we want to achieve all necessary CSV fields required. This means we need to fetch the `ID` number from the original `Employee` field, and to copy the remaining data into another field for the `Employee Name`. Then we split the header column in our original sheet to a data range. Followed lastly with the amount in each date range.

Our first sheet, should look a little something like this:

Here are the details as displayed in the photo above:

• Cells `\$B\$1` and `\$G\$1` contained the `=COLUMN()` reference to each of the months on the original data set. This just helps with creating an easy `index` function that uses that reference which can make it easier copying the same columnar set 12 times.
• `Row 1` this contains the following formula to extract the `ID` number from the `Employee` field. Here we check to make sure we have data in the original `Data` sheet, if we do we want to perform a regular expression where we just want all the digits, therefore, we use the regular expression of `d+`:
``=if(len(Data!A2),REGEXEXTRACT(Data!A2,"d+"),"")``
• From `\$B\$2` down we then use another regex formula to extract those digits and replace them with nothing. We also want to check if the employee received any salaries or wages, if they didn’t we don’t want to add them into the data set.
``=if(index(Data!\$1:\$1000,row(),B\$1),trim(regexreplace(Data!\$A2,"d+","")),"")``
• From `\$C\$2` down we just insert the `ID` that is needed for each row if we have a confirmed value in the previous column.
``=if(len(B2),\$A2,"")``
• For `\$D\$2` down if we have a value in `\$B\$2` then we get the value of the salaries for that respective month. We use the handy `index` function here to get that value:
``=if(len(B2),index(Data!\$1:\$1000,row(),B\$1),"")``
• For `\$E\$2` down if we have a value in `\$B\$2` then we get the column heading of the respective month using, once again, the `index` function:
``=if(len(B2),index(Data!\$1:\$1000,1,B\$1),"")``
• Lastly, for `\$F\$2` down we get the value in the previous cell, if something exists, and calculate what the last day of the month will be for that cell:
``=if(len(E2),date(year(E2),month(E2)+1,0),"")``

As we have been mindful of the formulas used for each budget month, we now just need to copy cells `\$B\$2:\$F\$1` and then paste this 12 times across. We could make it even easier by, after our first paste in cells `\$G\$2:\$K\$4` changing cell `\$G\$1` to be a formula: `=\$B\$1 + 1`:

Then we copy this newly pasted range with formula edits across 10 more times.

# Solution: Step 2 – Concatenation

The purpose of the next sheet is to zip up our columnar collections. This is a relatively simple step, however, there a couple of things to be wary of:

1. Use `TEXTJOIN` when zipping up the columnar data rather than the `JOIN` or `CONCATENATE` functions. The primary reason for this is where there is no data in a cell `TEXTJOIN` creates the blank join, whereas the other functions neglect it. As an example, if we had the following table:

When using `=JOIN(",",A1:C1)` it produces the result `Missing,Puzzle,Piece` for the first row and `=JOIN(",",A2:C2)` will produce `Missing,Piece` for the second row.

Whereas using `=TEXTJOIN(",",FALSE,A1:C1)` produces the result `Missing,Puzzle,Piece` for the first row and `=TEXTJOIN(",", FALSE, A2:C2)` will produce `Missing,,Piece` for the second row.

As the `TEXTJOIN` function allows for the flexibility of the concatenation of your columnar data it’s import the data remain as it is represented elsewhere if there happens to be blank cells in your data.

1. The other important aspect with this formula is that we want to replace the space characters within our cells as the space character will be used to join and split data back into columns again later. Therefore, it’s important at this step (if your data also contains spaces) to replace the space character with an oblique character that is not used anywhere in your data set.

I am going to replace my space characters within my data to the tilde character `~`.

Therefore, I have created a new sheet, which I’ve labelled `DataConcat` and cell `\$A\$1` has the formula:

``=if(len(DataExpand!B2),SUBSTITUTE(TEXTJOIN(";",FALSE,DataExpand!B2:F2)," ","~"),"")``

Be mindful when transferring the data across to adjacent columns that you don’t copy paste or drag across the formula, as each column here needs to reference the collection in `DataExpand`. For example, in the next adjacent cell `\$B\$1` the formula is:

``=if(len(DataExpand!G2),substitute(textjoin(";",FALSE,DataExpand!G2:K2)," ","~"),"")``

You will need to do this 12 times for each month.

# Solution: Step 3 – Merge

So here’s where we merge all the columns from the `DataConcat` sheet in Step 2 above into one big column collection.

``=ARRAYFORMULA(SPLIT(ARRAYFORMULA(SUBSTITUTE(TRANSPOSE(SPLIT(JOIN(" ",QUERY(DataConcat!A:L,,ROWS(DataConcat!A:A)))," ")),"~"," ")),";",TRUE,FALSE))``

Wow, what a massive formula!

As with mathematics, we have to start with the inner most functions and work our way out. So let’s break up each component within this formula and explain what’s happening:

## QUERY function

``QUERY(DataConcat!A:L,,ROWS(DataConcat!A:A))``

Here we use the `QUERY` function to capture the data we want to operate on (the first parameter), being all the columns in the `DataConcat` sheet, and we then want to combine these cells into one row. We achieve this by setting the number of headers (the third parameter) to the maximum number of rows in the `DataConcat` sheet. This means every row in the `DataConcat` sheet will be zipped up into one row!

The way the `QUERY` function zips everything up is by applying the space character to each new cell. This was why it was important for us to remove the space characters from our own data set before we applied this function.

If you’d like to know more about the QUERY function I wrote up a more detailed post here explaining this type of use.

## MERGE – Using JOIN, SPLIT & TRANSPOSE functions

``TRANSPOSE(SPLIT(JOIN(" ", QUERY(...)), " "))``

The combination of these three functions is to merge the single row returned by the `QUERY` function into one column.

First we `JOIN` the rows together using the special space character which helps to delimit each individual row. Then we `SPLIT` by the space character which produces the rows across individual cells horizontally. Finally, the `TRANSPOSE` function translates the horizontal cells into vertical formation – a single column.

Unfortunately the `JOIN` function does have a character length limitation of 50,000 characters. You may need to break up your requirements or look at other means if this ends up being an impediment.

## EXPANSION – Using ARRAYFORMULA

The last step is expand our singular column so that it can be represented as needed for CSV export. If we’re expanding data sets we need to use the `ARRAYFORMULA` function, as demonstrated here:

``ARRAYFORMULA(SPLIT(SUBSTITUTE(transpose(...), "~", " "), ";", TRUE, FALSE))``

The `ARRAYFORMULA` provides us with the ability to loop through each of the cells in our data set and to apply individual functions to them. In our case we want to:

1. `SUBSTITUTE` back the oblique tilde `~` character with its original space character.
2. `SPLIT` back the data we concatenated in the `DataConcat` sheet. However, we need to be mindful there may be instances where we have blank cells, and we want the `SPLIT` function to return the blank cells, therefore, we need to change our fourth parameter in the `SPLIT` function to `FALSE` (default is `TRUE`). By switching this to `FALSE` it turns consecutive delimiters like `;;` into three separate cells, rather than just one cell.

With these functions all combined they produce the wonderful result intended, like so:

# Conclusion

Within this post we were able to learn how we can extrapolate data from one form to another using several intermediary steps and complex Google Sheet functions.

Hopefully this post has been helpful with learning:

• How to organise your steps by leaving the original data set (use multiple sheets if needed).
• How to use the `QUERY` function and especially its third `headers` parameter.
• Why you would use the `TEXTJOIN` function over `JOIN`.
• The limitations of the `JOIN` function.
• How to expand your data set using `ARRAYFORMULA` with any function when needing to loop through individual cells in the result.
• How to set the `SPLIT` function’s fourth parameter so that consecutive empty cells aren’t treated as one.
Categories

## How to Easily Sum Columns with Dynamic Range

I had a case in a spreadsheet where I needed to test the cash flow of payments made against future loans.

However, when it came to scenario testing the cash flow we had the ask and answer the question on the viability of the organisations cash flow depending on the loan repayment schedule.

Would we be able to repay our loans back quicker? What would our cash flow look like if repayments were 10 years, or 15 years, or 20 years’ terms?

To enable the spreadsheet to calculate this on the fly, I had to find a way whereby I could set a field value to represent the number of years the loan was to be repaid and for the single row in the report reflecting the repayment schedule to show those repayments.

For those familiar with a cash flow statement you will know there’s generally a section within the Financial Activities to detail any financial repayments made, ours was labelled Principal Repayments as shown in the basic table below of the first few columns (as we had to perform a 20-year projection there were more columns than this):

Now the issue at hand was that the New Loans row was an input row. Management wanted to insert how much they were willing to borrow for each project, but they wanted to see the impact on the Total Net Cash Flow which was the sum of the Total Operating Activities plus the Total Investing Activities plus the Total Financing Activities.

So with an input field on the New Loans row, and the output field on another row Principal Repayments all I needed to do was one more input cell representing the Loan Terms (in Years).

The cell I used as the input for the loan terms is defined in cell `\$B\$50`.

So here’s how the Principal Repayments row ended up:

``\$B\$37 = -sum(index(\$B\$35:B\$35,0,MAX(COLUMN(B\$35)-COLUMN(\$B\$35)-\$B\$50+1,1)):B\$35)/\$B\$50``

So let’s break this formula down, starting in the heart:

1. `MAX(COLUMN(B\$35)-COLUMN(\$B\$35)-\$B\$50+1, 1)` this formula helps to determine the column to start our range capture.

Here’s what we would get if the formula was copied across different cells:

`\$E\$35 = MAX(COLUMN(E\$35)-COLUMN(\$B\$35)-\$B\$50+1, 1)`

Which for each reference and function call results in:

• `COLUMN(E\$35) = 5`
• `COLUMN(\$B\$35) = 2`
• `\$B\$35 = 10`
• `MAX(5 - 2 - 10 + 1, 1) = MAX(-6, 1) = 1`

For the cell `\$Z\$35` which would contain this portion of the formula as `MAX(COLUMN(Z\$35)-COLUMN(\$B\$35)-\$B\$50+1, 1)` has the following results:

• `COLUMN(Z\$35) = 26`
• `COLUMN(\$B\$35) = 2`
• `\$B\$50 = 10`
• `MAX(26 - 2 - 10 + 1, 1) = MAX(15, 1) = 15`
1. `index(\$B\$35:B\$35, 0, ...)` the nifty thing about the `index` formula is that it returns a value or the reference to a value. The parameters of the `index` function are: `INDEX(reference, [row_offset], [column_offset])` and as we don’t want to change the `row_offset` this is represented as `0`, but we do want to change the `column_offset` and this is where the `MAX` formula helped us. From the value achieved from the `MAX` formula we then receive a reference.
2. `sum(index(...):B\$35)` as we’ve captured the cell reference from the `index` function we can add the range notation `:` and have the `sum` value add all values from that cell to the current cell we are in.
3. `=-sum(...)/\$B\$50` lastly as we have added up all the new loans in the `sum` formula we then divide the total New Loans amount by the loan term.

So if we had this cash flow sheet spread out of 25 time periods out, by the time we got to the formula in `\$Z\$37` it would look like this:

``\$Z\$37 = -sum(index(\$B\$35:Z\$35,0,MAX(COLUMN(Z\$35)-COLUMN(\$B\$35)-\$B\$50+1,1)):Z\$35)/\$B\$50``

Thereby only getting us the last `\$B\$50` (loan term) periods and dividing that amount by the loan term.

`INDEX` formula is a nifty little formula to learn!

Categories

## How To Join Multiple Columns Into 1 Sorted Unique Column

Further to my previous post on combining 2 columns into one column I’ve had to do some further work on this by now combining a number of unknown columns into one unique column and then sorting the result.

I’ve been able to achieve this and will demonstrate one way in how I’ve been able to get it to work. Let’s use a working example to show how it works.

First, we’ll acquaint ourselves with the old method of using this formula, which in this case would be the following for the following three columns (column `A` will be our results column):

``=unique(transpose(split(join(";",A:A)&join(";",B:B)&join(";",C:C),";")))``

Unfortunately the problem with this method is that is requires us to enter in the `join(delimiter,range)` function for each new column, and I want the function to be able to concatenate all cells in a range of columns.

Our first remedy at finding a solution avails itself when using the `ArrayFormula` function. As this function will loop through each of the cells in your array (being a range of cells) and will output each cell according to the way you define the function inside.

As an example, if we were to apply the following formula in cell A1 we would get the following:

``=arrayformula(concatenate(if(len(B:D)>0,B:D&";","")))``

As you can see what happens in the above function is that the `ArrayFormula` goes through my range `B:D` (being everything in columns B to D) and once it finds a cell that contains something (a string length greater than 0) it then gets that result and appends a semi-colon to it – this is denoted with the true statement in the `if` statement by `B:D&";"`.

Now we simply merge the two formulas together, in essence replacing the `join` section of our previous formula with the `ArrayFormula` above, producing:

``=unique(transpose(split(arrayformula(concatenate(if(len(B:D)>0,B:D&";",""))),";")))``

Lastly, if we wanted to sort the result of this output we would finally wrap the resulting formula in the `sort` function, where we would have everything sorted from A to Z:

``=sort(unique(transpose(split(arrayformula(concatenate(if(len(B:D)>0,B:D&";",""))),";"))))``

Of course if I didn’t know how many columns I would be expecting in this sheet I would amend the last column name to a much large letter, eg.

``=sort(unique(transpose(split(arrayformula(concatenate(if(len(B:Z)>0,B:Z&";",""))),";"))))``

However, do be aware processing time will increase with the more data columns you add.

Categories

## How To Easily Join 2 Columns Into 1

UPDATE (4 Nov 13): I’ve created another post that allows you to perform the same task demonstrated below, but works better with multiple columns (rather than just two as detailed below).

There are times within a Google Spreadsheet (or I guess any spreadsheet for that matter) where you need to merge two columns together to form one column. It comes in really handy when you are looking to use a series of cells for data validation and both bits of data span over two separate columns.

Just how can we get them into one column?

I had a case recently where I needed to append the contents of one column underneath the contents of another column, with the size of both column’s height unknown (this eliminated the easier approach of simply placing the contents underneath manually).

I was able to find a solution, and I’ll illustrate how it worked by using an example. Let’s assume the following columns of data:

The first thing we will do is move to the next adjacent column, column `C`, and in cell `C1` enter the following formula:

``=join(";",A:A)&join(";",B:B)``

What this formula does is mesh all the data in the first column into a single string where each cell’s content is appended together with a semi-colon. Now if your data contains semi-colons you will want to use another symbol that is not used in your data set.

Our output in cell `C1` would look a little something like this:

``Lions;Tigers;Elephants;Lemurs;ABC;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;Red;Yellow;Green;Blue;White;Black;ABC;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;``

Now we’re going to amend our code in `C1` so that we can split each item back into their own cell, our code in cell `C1` will now look like so:

``=split(join(";",A:A)&join(";",B:B),";")``

This would nor produce something a little like this:

Then as we need the data to be placed into a column we would transpose our data, by amending our cell in C1 to this:

``=transpose(split(join(";",A:A)&join(";",B:B),";"))``

Giving us:

Now if we want to remove any pieces of data that are the same within both columns we would just further wrap the UNIQUE function around the TRANSPOSE function. In our working example this would remove the second ABC in our new column. Here’s the formula firstly:

``=unique(transpose(split(join(";",A:A)&join(";",B:B),";")))``

Resulting in:

This has certainly helped me when operating with ImportRange function calls and then working with the imported data in the active sheet. Hopefully it helps you too!