Before you keep reading...
Runestone Academy can only continue if we get support from individuals like you. As a student you are well aware of the high cost of textbooks. Our mission is to provide great books to you for free, but we ask that you consider a $10 donation, more if you can or less if $10 is a burden.
Before you keep reading...
Making great stuff takes time and $$. If you appreciate the book you are reading now and want to keep quality materials free for other students please consider a donation to Runestone Academy. We ask that you consider a $10 donation, but if you can give more thats great, if $10 is too much for your budget we would be happy with whatever you can afford as a show of support.
18.1. Aggregating Data¶
Say we want data about how many rides lasted longer than an hour. At this point we
don’t care about where they started, or which bike they were on, we just want to
know how many such rides there were.
To ask for the total number of records a query finds, we can select
AS keyword is used after
COUNT(*) to give the data that is returned a name. If we
leave it off, the “column” of data will just be labeled
COUNT(*). Calling it
makes it much clearer what we are looking at.
How many trips started at station 31111?
Hint: Build your way to the final query. Start by selecting all the data (
*) for all the
trips. Then write a
WHERE to only select ones that start from 31111. Finally, instead of selecting
all of the columns, select just the count of the number of records.
COUNT is one of the aggregation functions provided by SQL. Aggregation is the process
of combining data and
COUNT combines all the records and tells us how many there are.
But there are other ways we can aggregate data with SQL:
COUNTcounts the rows
SUMadds the values of a numeric column
MINcalculates the minimum of a numeric column
MAXcalculates the maximum of a numeric column
AVGcalculates the mean of a numeric column
MEDIANcalculates the median (middle value) of a numeric column
MODEcalculates the mode (most common value) of a numeric column
For example, the query below calculates the minimum and maximum trip duration:
Write a query to calculate the mean duration (average) of all trips.
WHERE filtering with aggregating functions. The
query below calculates the longest trip duration just for trips made by Casual
member type users.
Write a query to get the total duration (sum) of all trips taken on bike W01274.
18.1.1. Counting Unique Values¶
It is often helpful to not only count the number of rows but also to count the number
of unique values of a column. You can do this using the
To count the distinct values of a column, you can simply use
DISTINCT. For example, the query below counts the number of bike
Write a query to get the number of unique start locations there are in the table.