Pivot Tables

Tables that show the average or total for different groups are often very helpful. Suppose some coworkers at a local company aren’t sure if the hours for which they are each scheduled have been assigned fairly. A table showing each employee, the hours they had been assigned, and how often they opened or closed the business could be used to see if there were big differences in assignments, and decide if those assignments were fair.

A screenshot of pivot table with example employee hours.

A pivot table is a table that allows you to pull relevant summary statistics from a dataset. You could use AVERAGEIF, SUMIF and COUNTIF to make the table, but sheets has a built in tool to create a pivot table that automates this process without having to type out all the formulas.

Start by selecting the data you’d like to use. Then navigate to the “Data” menu, and select “Pivot table”. You can place the new pivot table in a new sheet or an existing sheet.

A screenshot of how to select pivot table from Data meu.

The pivot table editor, which opens on the right, has sections for adding rows, columns, values, and filters.

A screenshot of adding a row in pivot table editor. A screenshot of adding a filter in pivot table editor.

The resulting pivot table would have the total hours, opening and closing shift for only those three days.

A screenshot of adding a variable in pivot table editor.

Q-3: What evidence from the table would support the employees’ claim that the assignments have not been assigned fairly?

A screenshot of a subset of original work hour pivot table.

Within a pivot table, double clicking on any value in the table will create a new sheet with the subset of data from that entry in the table. For example, double clicking on the value 4 in cell C4 in the table above creates the sheet below. You can use this subset to look at Jan’s assignments, and ask or answer questions about Jan specifically. This is also a great way to investigate interesting values or patterns in a pivot table.

A screenshot of a pivot table with a Jan filter.

Example: National Center for Health Statistics

For this example, consider a non-profit organization that works to improve the life expectancy of Americans. You have access to data from The National Center for Health Statistics (NCHS), which is a branch of the Center for Disease Control, that provides statistical information about the health of American people. The dataset below presents the number of deaths for the ten leading causes of death in the USA for each state beginning in 1999.

A screenshot of a sheet with leading cause of death data.

You are working on a project for your nonprofit to try to find the leading causes of death in the USA, in order to target possible areas of improvement for healthcare and death prevention. This can be done given the NCHS dataset above.

First, construct a pivot table with rows from the variable “Cause Name”.

A screenshot of creating a pivot table with rows from "Cause Name".

Then add “Deaths” as a value to the pivot table.

A screenshot of adding value "Deaths" in value in pivot table editor.

Summarize by SUM. (There are many options under summarize. Sum is the most useful for this context, but average, median, and count are also commonly used statistics in pivot tables.)

A screenshot of summarizing by sum for value "Deaths" in pivot table editor.

Make sure to have “Grand totals” enabled, so you can see the total number of deaths.

A screenshot of enabling grand totals in sheets.

This is what the completed table should look like:

A screenshot of complete pivot table with leading causes of death.

The cause of death responsible for the most deaths in the USA is heart disease. But what percentage of deaths is this? To calculate the percentage, you can add a column next to the pivot table that divides the deaths for each cause by the grand total.(This is an opportunity to use absolute references to make your life simpler.)

A screenshot of how to use absolute reference to add a coloumn for percentages in pivot table.

This shows that 33.8% (or over one third!) of deaths in the USA are due to heart disease. This is astonishingly high, and shows that efforts towards reducing heart disease or ameliorating symptoms due to heart disease is the highest priority for the nonprofit.

In order to present this information to your teammates, it might be easier to display this information as a chart, rather than a table. A bar chart, constructed from this pivot table, should make the information significantly easier to interpret, compared to the raw pivot table.

To do this, first select the first two coloumns with the relevant data and select “Insert > Chart”.

A screenshot of inserting a chart.

Then in the chart editor select the chart type to be “Column Chart”.

A screenshot of selecting "Coloumn Chart" in Chart Editor.

Make sure that the “X-Axis” is set to the cells with the disease names and the “Series” is set to the sum of deaths for each disease.

A screenshot of setting "X-Axis" and "Series" values in Chart Editor.

You should now have the chart below.

A screenshot of complete bar chart for leading cause of death data.

This chart makes it visually clear that heart disease and cancer are the highest causes of death by a substantial amount.

When you present this graph to your teammates, one of them asks how these percentages have changed over time. To look into this, add the variable “Year” as a column. (You’ll have to move or delete the percentage column, or construct a new pivot table.)

A screenshot of a how to add a column for "Year".

Now you should have the table below.

A screenshot of complete pivot table with a column for "Year".

This table is too large to be interpretable. Visualizing this data in a chart is much more helpful. Select the range A2:S12 (the pivot table excluding the first and last rows) and then, under the “Insert” menu, select “Chart”.

Sheets automatically selects a line chart for this data, with “Year” along the horizontal axis and a line for each cause of death, showing how each has varied over time. Line charts display how one or more quantitative variables change ver time. To construct a line chart your dataset must have a time variable. (In this dataset, it is the “Year” column.)

A screenshot of line chart created from pivot table for leading cause of death.

This graph is certainly more interpretable than the table, but it’s still difficult to distinguish the lines towards the bottom. Another issue is that there are several colors, many of which are hard to differentiate. Also, if a viewer were colorblind, this graph would be essentially unreadable. Before presenting this to your teammates, you need to address these issues. Consider reducing the number of causes displayed (perhaps to just the most “interesting” causes), and changing the colors used.

It’s difficult to see in the graph above, but deaths due to Alzheimer’s disease have been steadily increasing. This change is much easier to see if Alzheimer’s is the only cause of death displayed. Pivot tables allow for filtering, so you can restrict the table to Alzheimer’s related deaths only.

In the pivot table editor, the last option is “Filter”. Add a filter to “Cause Name”.

A screenshot of adding a filter for "Cause Name" in Chart Editor.

Then under the “Filter by values” option, select only “Alzheimer’s disease”. The pivot table and graph will automatically update and show only Alzheimer’s deaths.

A screenshot of adding a filter for Alzheimer’s.

While the raw number of deaths is significantly greater for heart disease and cancer, the growth of Alzheimer’s disease deaths is also very worrying to your nonprofit. Your manager asks you to investigate why the deaths are on the rise so dramatically, so you investigate that more in the next section.

Filtering also works on other values. For example, you can add an additional filter to only use data from California. First, add a filter to “State”.

A screenshot of adding a filter for "State" in Chart Editor.

Then under the “Filter by values” option, select only “California”.

A screenshot of adding a filter for California.

Below are two graphs for Alzheimer’s deaths: on the top just for California, on the bottom for the entire country.

A screenshot of Alzheimer's chart for California. A screenshot of Alzheimer's chart for all states.
You have attempted of activities on this page