How To Aggregate Data From Multiple Sheets With QUERY In Google Sheets

How can you aggregate data using multiple sheets with the QUERY function in Google Sheets?

To aggregate data sourced from multiple sheets, create a data set using the set notation {} by referencing each sheet then within the query statement of the QUERY function reference columns using ColX (with X being the index number of the column, starting at 1).

As the QUERY function contains three parameters, the first parameter data can be a range or a data set created using the {} notation. It’s important to know when creating your custom data set whether the additional ranges from multiple sheets will be added as additional columns or concatenated as additional rows .

Let’s explore in a little more detail what each data type looks like.

Additional Columns Approach

If the sheets contained within your spreadsheet contain data that can be added as additional columns then your data parameter in the QUERY function would simply look something like this:

=QUERY({Sheet1!B:B, Sheet2!B:B, Sheet3!B:B}, "SELECT *")

If you are in a locale where the comma is used as the decimal point, you would need to change the separator in the set notation to the backslash character, like so:

=QUERY({Sheet1!B:B \ Sheet2!B:B \ Sheet3!B:B}, "SELECT *")

An example of how this looks is where, for example, all sheets in column B referenced the amount of sales made in the month for each sales staff person and each sheet represents a month.

Check the Columns sheet to see how adding additional columns appears as a data set.

To perform an aggregate on this data set, you would just sum each column and would reference this like so in your SELECT statement:

=QUERY({Sheet1!B:B, Sheet2!B:B, Sheet3!B:B}, "SELECT SUM(Col1), SUM(Col2), SUM(Col3)")

The result of this formula is seen in the Agg sheet:

If you wanted to change the name of the aggregate columns , you could add the LABEL in the query statement and give more appropriate names for the headers in your result.

Concatenating Rows Approach

The other approach when aggregating data from multiple sheets is to concatenate the data from each sheet to create one long data set. This works well when each sheet has the same column headers , or if it doesn’t that your concatenation will be to common columns.

To do this your data parameter in the QUERY function needs to change to the following:

=QUERY({{Sheet1!A:B};{Sheet2!A:B};{Sheet3!A:B}}, "SELECT *")

The result of the function above is displayed in the sheet labelled Rows :

As you can see from the data displayed in the Rows sheet each data contained within a sheet is appended to the data set created using the set notation.

To perform aggregation you would use an element in each row to group the data by and then apply an aggregate function on the grouped data.

In the Agg sheet for this example the formula groups by the Employees and uses the aggregate formula SUM() to total their amount of sales, this formula looks as follows:

=QUERY({{Sheet1!A:B};{Sheet2!A:B};{Sheet3!A:B}},"SELECT Col1, SUM(Col2) GROUP BY Col1")

Notice with this formula that the same data is used in the data parameter of the QUERY function, but the SELECT statement uses ColX notation to reference the actual columns in the data set. The GROUP BY clause helps to perform the aggregation on each Employee .

You can see this result in the sheet labelled Agg1 below:

Notice in the Agg1 sheet that we have a little problem: the headers from the other sheets are somewhat being added and therefore used in the aggregate result. How can you exclude this?

There are two ways: amend the ranges in the data set to start from row 2, making the formula like this:

=QUERY({{Sheet1!A2:B};{Sheet2!A2:B};{Sheet3!A2:B}},"SELECT Col1, SUM(Col2) GROUP BY Col1")

The problem with this approach is that you don’t have any names for the column headers and would mean using labels for the headers , making the query statement a little longer with:

=QUERY({{Sheet1!A2:B};{Sheet2!A2:B};{Sheet3!A2:B}},"SELECT Col1, SUM(Col2) GROUP BY Col1 LABEL Col1 'Employee', SUM(Col2) 'Total Sales'")

This produces the result as displayed in the Agg2 sheet:

Another alternative is to use a WHERE clause on the data set . To remove the header from being inserted into the new data set, you could exclude it by using a WHERE condition.

One example could be to write a WHERE condition where the field is not empty:

=QUERY({{Sheet1!A:B};{Sheet2!A:B};{Sheet3!A:B}},"SELECT Col1, SUM(Col2) WHERE Col2 IS NOT NULL GROUP BY Col1 LABEL Col1 'Employees', SUM(Col2) 'Total Sales'")

As you can see from the above code the WHERE Col2 IS NOT NULL is used to eliminate from the data set where column headers are imported, and this produces the result as seen in the Agg3 sheet below:

Either approach when using the concatenation technique works well, you just need to ensure you know what you’re excluding with the WHERE clause and how to exclude properly.

Concatenating Rows Doesn’t Work

If you find when creating your new concatenated rows data set and see what appears to be only one sheet visible (the first sheet in your data set notation), be aware that if your data sets span entire columns, the empty rows are also added into the new data set.

For example, the data set capture of QUERY({{Sheet1!A:B, Sheet2!A:B... fetches all the data in Sheet1 and specifically all of the content in columns A & B including the empty cells ! Therefore, when you’re looking at your concatenated data set, you might only see the data from Sheet1 but not the data from Sheet2 .

It is there, you just need to scroll down.

An approach to removing this potential confusion from happening is to add a WHERE clause like WHERE Col1 IS NOT NULL . This will help to filter out the empty cells from your new concatenated data set.

Or an alternative is to just remove the rows of empty cells from each sheet (which is what I did with the example above so as to not need extra code in the query statement).

Summary

Aggregating data from multiple sheets using the QUERY function is relatively simple and straightforward. Once you know how to create your own data set using the set notation {} it then becomes a matter of how you would like to cobble your data together: creating new columns for each sheet, or concatenating rows of each sheet.

Each approach will have its own unique method for performing the aggregate functions needed on your data sets. Remember, though, that each approach requires the reference of columns in the query to use the ColX syntax, where X is an index number of the column you wish to use in your query (starting at 1).

Photo of author
Ryan Sheehy
Ryan has been dabbling in code since the late '90s when he cut his teeth exploring VBA in Excel. Having his eyes opened with the potential of automating repetitive tasks, he expanded to Python and then moved over to scripting languages such as HTML, CSS, Javascript and PHP.