9.3. Project

Project (the verb, pronounced /prəˈjekt/, stemming from the word projection) is the simplest operator. It takes as input one relation, and a list of columns. It forms a result relation with all of the input relation’s rows, but just the specified columns. That relation must contain all of the input relation’s identifying columns with some, none or all of its non-identifying columns, and, perhaps, some computed columns. One result relation row corresponds to one input relation row.

A Project’s result relation has these notable name and structure characteristics:

Name characteristics:

Structure characteristics:

Important

Masters know how each operator’s result relation base relates to the base of the input relation’s base. For Project the bases and identifiers are the same, because we must include the identifying columns for this operator.

9.3.1. Examples

Let’s start with the creature relation once again, which has these rows:

Creature

creatureId

creatureName

creatureType

reside_townId

idol_creatureId

1

Bannon

person

p

10

2

Myers

person

a

9

3

Neff

person

be

NULL

4

Neff

person

b

3

5

Mieska

person

d

10

6

Carlis

person

p

9

7

Kermit

frog

g

8

8

Godzilla

monster

t

6

9

Thor

superhero

as

NULL

10

Elastigirl

superhero

mv

13

11

David Beckham

person

le

9

12

Harry Kane

person

le

11

13

Megan Rapinoe

person

sw

10

Note that creatureId is the identifier of Creature.

9.3.1.1. Project some of the columns

English Query:

Find the identifier and name of each Creature.

The corresponding chart for this looks like this:


../_images/ProjectCreature.png

Note what is shown in the operator: the identifying column (can be columns), additional columns that are carried along in the result (written after the word carry), and in this case no additional computed columns are given. We will see how that is used shortly.

Corresponding SQL:

Explanation of the SQL query (first tab):

The columns we wish to keep in the result are in a comma-separated list after the keyword SELECT. The input relation, in this case creature, is shown after the keyword FROM.

9.3.1.2. Projecting computed columns

English Query:

For each achievement, find each achId, skillCode, proficiency, and total elapsed time since the achievement.

The corresponding chart for this looks like this:


../_images/ProjectAchievementCompute.png

Corresponding SQL:

Explanation of the SQL query (first tab):

The input relation, in this case achievement, is shown after the keyword FROM. The columns we wish to keep in the result are in a comma-separated list after the keyword SELECT, as is a new column that gets computed. In this case, in SQLite the function julianday() applied to each of two dates represented as text and used with the minus operator will compute the number of days between the two dates. As mentioned in the previous section, other databases will do this differently. Note the keyword AS following the date computation: this enables us to rename the column. You could experiment with removing the AS and the new column name to see what results.

Note

Remember that if you try changing the query above and re-run it, you can always get back to the original by re-loading this page in your browser.

Q-5: After running the above project SQL query, can you explain why there is no value for the total elapsed time since achieved for the achievement whose id is 11? (Hint: look at the data inserted for this achievement on line 61-63 of the SQL data tab.)

Q-6: What might you want to do next to follow on from this result to make the data more appealing for your users? You could even try some simple things in the SQL Query tab above. Hopefully you will see that handling dates in a useful way can be tricky.

Important

The scope of computing columns in Project is one row.

The above is a crucial fact. Its consequence is that the expression producing a computed value in a row can use only data values from that one row. You cannot use Project, for example, to compute the average Score achieved by a Creature, because those Scores are not all in one row. Often you will execute other operators to place some values in a row so that you can compute desired expressions.

You can also add a constant column that has the same value in every row, such as the current year, or some value that you may want to use in a computation in a further step.

What expressions can you compute with a one-row scope? In short, a lot. SQL provides a large set of arithmetic and text manipulation operators and built-in functions that you can use in forming expressions on the data in a row. You should study your DBMS manual to determine what is available.

9.3.2. Some notes about the project operator

You should expect to spend some time deciding on a consistent style for specifying required columns in the narrative, and for the column modifier in the result relation. A naming style that sounds all right for the small example database might sound awkward with your data.

Here are some things to consider about columns:

  • When talking about a query sometimes you will find it inconvenient to name all of the to-be-Projected columns, and will focus on the ones not in the result and will say “Project away these columns,” or “Project all but these columns.”

  • The above notion leads to this: you might choose to form the result relation name using a complementary style using, for example, “all but creatureType of Creature” as the result relation name. (You might prefer “all columns but creatureType of Creature.”)

  • You will choose among long and short ways of specifying columns, e.g., shorten creatureId to simply id.

When first starting out, it is better to be fairly specific about what is in your result so that you can easily see how the result can be used for the next operation that you want to perform.

Note

Before reading on, think about these questions about extremes: What are the minimum and maximum number of result relation columns and rows for a Project?

9.3.3. Exercises

Try charts for these as practice:

  1. Find the skillCode and skillDescription of each Skill.

  2. Find the creatureId, skillCode, and aspiredProficiency of Aspiration.

  3. Find the skillCode of each TeamSkill.

  4. Find each skillCode, skillDescription of Skill, and the possible number of different values of proficiency that can be applied.

For the last one, here is a suggestion: Knowing proficiency values are integers and have a minimum and maximum value, how can you compute the total number of possible integer values that could be used for each particular skill? Recall that the Skill relation is this (id is skillCode):

Skill

skillCode

skillDescription

maxProficiency

minProficiency

origin_townId

A

float

10

-1

b

B2

2-crew bobsledding

25

0

d

C2

2-person canoeing

12

1

t

D3

Australasia debating

10

1

NULL

E

swim

5

0

b

O

sink

10

-1

b

PK

soccer penalty kick

10

1

le

THR

three-legged race

10

0

g

TR4

4x100 meter track relay

100

0

be

U

walk on water

5

1

d

Z

gargle

5

1

a


You have attempted of activities on this page