17.3. Selecting

Instructions to retrieve information from a database are known as queries. To make a query using SQL, we generally write a SELECT statement (recall that a complete instruction in programming is known as a statement). The general format of a SELECT is:

SELECT columns FROM table;

In SQL keywords like SELECT are generally written using all upper case, while column names and table names use lower case. This is a convention to make statements easier for people to read - it is also valid to use lowercase for the keywords.

It is also common to break the statement up across multiple lines - as statements get more and more complex, this too makes them easier to read:

--Still all one statement:
SELECT
    columns
FROM
    table;

In Python, each statement goes on its own line and must be indented in a particular way. SQL does not care about indentation. You are free to indent lines however you like. And a statement does not automatically end at the end of a line. It ends at the ; symbol. A final difference is how we make a comment. In Python, we use # to make a comment in the code. In SQL, we use -- to tell the computer to ignore the rest of the line.

Let’s use a SELECT to get all the data from the trip_data table. The following SQL query is an example of how to do so. It asks the database to SELECT all columns (* is a shorthand that means “all columns”) FROM the table named trip_data.

As we do a SELECT, it might not make sense to ask for all of the columns in a table. If we want to answer a question about the length of rides, we probably do not need the information about station numbers or bike numbers. To simplify the data we get, and reduce the amount of information that must be sent from the database to our program, we would want to specify a list of columns we care about instead of using *. To select just the start_date, end_date, and duration columns, the query would look like the following:

The column names must match the names of the columns in the data table and be separated by commas. Because SQL does not care about line breaks, it would be fine to list all the columns on one line like: start_date, end_date, duration.

Notice that we can specify the columns in any order - the order we ask for them does not have to match the order in the original table. Try changing the query so that duration is the first row that is returned.

Write a query to select the start and end stations for all trips. Make sure that the start station is the first column retrieved and the end station is the second.

Hint: check the previous page to check the column names you need to use of first select * and look at the returned data to find the column names you care about.

You have attempted of activities on this page