Ifs and Cases¶
In SQL, just as in many languages, one of the most basic and useful concepts is
the if/else syntax. Just as in Sheets, you
IF to return different values based on whether a condition is true.
For example, going back to the bikeshare data, suppose you are interested in counting the number of trips that are shorter than one hour and longer than one hour. The duration column contains the duration of the trips in seconds, so you can use this to create a column that tells you whether the duration is over one hour (3600 seconds).
The syntax of the
IF statement is
IF(condition, value if condition is true, value if condition is false). This
should look familar, as it is exactly the same as the syntax for the
function in Sheets.
Note the use of the
FALSE keywords, which are also similar in
Sheets. You do not need to use these specific keywords, you can use any
indicator you like. For example, you could write
IF(duration > 3600, "over_one_hour", "not_over_one_hour") or
IF(duration > 3600, 1, 0). In this case, the column name
is_over_one_hour indicates what this field represents, so it is sufficient
to use the inbuilt keywords
SQL also provides an even more versatile set of keywords that allow you to have
as many “ifs” and “elses” as you like. This is done using the
structure. This structure comprises several keywords.
The structure is opened with the
CASEkeyword. This tells SQL to start looking at each statement and evaluating if it is true or not.
Each new case is identified with the
WHENkeyword, followed by a statement. For example,
WHEN duration > 3600.
Each case must have a
THENkeyword after the
WHENstatement. If the statement after the
WHENkeyword is true, then the field evaluates to the value after the
If none of the statements after the
WHENkeyword are met, the structure will look for an
ELSEkeyword. In this case, the field evaluates to whatever follows the
The structure is closed with the
ENDkeyword. This indicates there are no more
This is perhaps best illustrated with an example. The following query counts the
number of trips for different durations. The durations are grouped manually
CASE structure in the field
duration_grouped checks the statements
WHEN keyword. For the first statement that is met, the field
evaluates to whatever follows the corresponding
THEN keyword. For example,
suppose the duration of a trip is 2468 seconds.
The first statement is false. It is not true that
duration < 10 * 60.
The second statement is false. It is not true that
duration < 30 * 60.
The third statement is true, since
duration < 60 * 60.
As soon as a condition is met, the field is evaluated, so the column
duration_grouped would be
"30-60 minutes" for this trip.