Up to this point, you have been dealing with data entirely within Sheets. However, when you start tackling problems on real-world datasets, you will likely have to fetch the data yourself. Most of the data you will deal with in this section is stored in online repositories such as GitHub or Google Drive.
The file format most commonly used with spreadsheets is the delimited file, in which:
Each row is separated by a new line.
Each column is separated by a delimiter/separator, which can be any character.
The most common separators are listed below.
Comma: Files using “,” as the separator are called “comma-separated values” (CSV) files.
Tab: Files using a tab character as the separator are called “tab-separated values” (TSV) files.
Space: These exist but are significantly less common.
In principle, any character can serve as a delimiter. In practice, most files use a comma, tab, or space, although semicolons are also common in regions where the comma serves as the decimal mark.
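To make the row and column rules concrete, here is a minimal Python sketch that parses the same small table once as comma-separated text and once as tab-separated text. It uses Python's standard csv module; the helper name parse_delimited and the two sample rows are made up for illustration and are not taken from the cereal dataset.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Split delimited text into rows (one per line) of column values."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample data, not taken from the cereal dataset.
csv_text = "name,calories\nBran Flakes,90\nCorn Pops,110\n"
tsv_text = "name\tcalories\nBran Flakes\t90\nCorn Pops\t110\n"

csv_rows = parse_delimited(csv_text, delimiter=",")
tsv_rows = parse_delimited(tsv_text, delimiter="\t")

print(csv_rows)
# Both files describe the same table; only the separator differs.
print(csv_rows == tsv_rows)
```

Every value comes back as text, so numeric columns such as calories would still need converting before any arithmetic.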
Luckily, Sheets makes importing data relatively straightforward. Here is a dataset on 80 different brands of cereal that you can download to your computer and then work with in Sheets.
Import this data into Sheets using the following instructions.
In a new spreadsheet, click “File > Import”.
You can either select a file from Google Drive or navigate to “Upload”, where you can drag and drop a file or select a local file.
After selecting a file, you will be shown a list of import options. Sheets usually detects the correct separator on its own when it parses the input file. However, since you know this data is comma-separated, it is good practice to set “Separator type” to “Comma” explicitly. Choose the “Import location” that suits you best.
When you click “Import data”, you should see the entire dataset.
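The choice between automatic detection and an explicit separator also comes up when delimited files are read outside Sheets. The sketch below shows the idea in Python, assuming the standard csv module. It does not describe how Sheets detects separators internally; load_rows and the sample text are hypothetical names and data used only for illustration.

```python
import csv
import io


def load_rows(text, delimiter=None):
    """Parse delimited text, guessing the delimiter when none is given."""
    if delimiter is None:
        # csv.Sniffer is a heuristic and can guess wrong on unusual data,
        # which is why passing the delimiter explicitly is the safer habit.
        delimiter = csv.Sniffer().sniff(text, delimiters=",\t ").delimiter
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return delimiter, list(reader)


# Hypothetical sample data, not taken from the cereal dataset.
sample = "name,calories\nBran Flakes,90\nCorn Pops,110\n"

guessed, rows = load_rows(sample)                     # automatic detection
explicit, same_rows = load_rows(sample, delimiter=",")  # explicit separator

print(repr(guessed), len(rows))
print(rows[0]["name"])
```

When the separator is known in advance, as it is for the cereal file, passing it explicitly avoids relying on the guess.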