Minimum and Maximum¶

The minimum and maximum of a dataset can be very useful statistics, and are relatively simple to calculate. These statistics only apply to quantitative variables.

Minimum Definition

The minimum value is the smallest value in the dataset, or the value that all other values in the dataset are greater than or equal to.

Maximum Definition

The maximum value is the largest value in the dataset, or the value that all other values in the dataset are less than or equal to.

The minimum or maximum value is sometimes the only value you need to know. For example, suppose your university has organized a field trip for your class to a concert, but the event is at a 21+ venue so people under the age of 21 are not allowed in. In this case, knowing that the minimum age of the students in your class is 21 is sufficient, as that tells you that everyone in the class is at least 21 and that all members of the class can go on the field trip.

Example: Dice Roll¶

Consider rolling a standard dice.

• There are six faces.

• Each face is equally likely to land face up.

• The faces are labelled as follows: 1, 2, 3, 4, 5, 6.

It might seem unnecessary to use Sheets to calculate the minimum and maximum possible results of a dice roll, but when there are thousands of values instead of six, using Sheets or some other tool will be a necessity.

You can calculate the minimum and maximum value in Sheets using the MIN and MAX functions respectively.

Minimum and Maximum in Sheets

The MIN function returns the minimum value of a set of values. You can either input several values separated by a comma (e.g. =MIN(value1, value2, value3)), or you can input a range of cells of which you want to know the minimum (e.g. =MIN(A1:A10)).

The MAX function returns the maximum value of a set of values. You can either input several values separated by a comma (e.g. =MAX(value1, value2, value3)), or you can input a range of cells for which you want to know the maximum (e.g. =MAX(A1:A10)).

This example illustrates how to calculate the minimum value of a dice roll using MIN, but the exact same logic and syntax applies to calculating the maximum using MAX. As stated above, there are two ways to calculate the minimum value of a dice roll.

In the first way, each value is input into the MIN function, separated by a comma.

Alternately, you can specify all the values in different cells, and input the cell range into the MIN function.

In future examples, you will see that specifying a cell range is the more efficient way to use MIN, MAX, and other statistical functions.

Example: Weather¶

Suppose you want to know the minimum and maximum temperature that New York City (NYC) generally experiences in a year.

The weather dataset previously seen here has the field “actual_min_temp” which records the coldest temperature every day, and a field “actual_max_temp” which records the highest temperature every day. (For this example, only NYC weather is considered so the “city” column is removed, and the month is not relevant so the “month_text” column is removed.)

This dataset for twelve months contains just 365 data points. It would be time-consuming but not impossible to scan each column visually and find the minimum and maximum values. But imagine if this dataset covered every day for one-hundred years! Sheets would be able to find the minimum and maximum just as quickly as it did for twelve months. Doing this manually, however, is error-prone and would not be fun.

Optional: Match¶

Knowing how to find the minimum and maximum values in a spreadsheet is useful for many situations, but sometimes it can be even more useful to know which row the minimum or maximum came from.

Match Definition

MATCH returns the relative position of an item in a range that matches a specified value. We can use the MATCH function to find the row of the minimum or maximum.

The MATCH function has three inputs and looks like this: MATCH(search_key, range, [search_type]).

• search_key: The value to search for

• range: The values of the column that you want to search (ex. A1:A5)

• search_type: The manner in which to search

• 1 causes MATCH to assume that the range is sorted in ascending order and return the largest value less than or equal to search_key

• 0 indicates an exact match, and is required when the range is not sorted

• -1 causes MATCH to assume that the range is sorted in descending order and return the smallest value greater than or equal to search_key

To practice using MATCH, suppose a company called CandyData handed you the Halloween Candy dataset from FiveThirtyEight with information about various Halloween candies. Suppose they ask you to find out which of the candies is most expensive. You know that you need to find the row with the highest value in the Price Percent column, so you can use the MATCH function!

Now you must start filling in the inputs for MATCH. The first input is the value you’re searching for. You’re looking for the maximum value in the column, and you know that to find the maximum value in a column you can use the MAX function (MAX(C2:C86)). So now you can fill in the first part of the MATCH function: MATCH(MAX(C2:C86), something, something).

The second input is the range of the values of the column that you want to search. Since you want to find the value in the column called Price Percent, you fill in the next part of the MATCH function: MATCH(MAX(C2:C86), C1:C86, something).

Notice that if you use C2:C86 instead of C1:C86 instead, the row value returned by the function will be shifted up by one, so the answer will be 53 instead of 54. This is because the returned value is equal to how far down the value is in the range, so when you omit the first row in the range (C1), the returned value will be one less than the row number because it’s counting the rows starting at C2.

This is what that bug would look like if you were using a smaller dataset and trying to find the state with the largest population:

The last input is the manner in which you want to search. Since the values in Price Percent aren’t sorted, you use 0. The final function is =MATCH(MAX(C2:C86), C1:C86, 0). The returned value is 46, meaning the most expensive candy is in row 46. You can now go back CandyData and tell them that “Nik L Nip” is the most expensive candy on the dataset.

Practice using the MATCH, MAX, and MIN functions to answer the following questions: