6.10. Glossary

6.10.1. Definitions

Aggregation: The compiling of information from databases with the intent to prepare combined datasets for data processing.

API: Stands for application programming interface, a software intermediary that allows applications to communicate with each other.

BeautifulSoup: A Python package that is used for parsing HTML documents.

Boolean Value: A data type with one of two possible values, either true or false.

CSV File: CSV stands for “comma-separated values.” The format stores tabular data as plain text, which makes data files easy to share.

Cascading Style Sheets (CSS): A language used for adding styles to web documents.

DataFrame: A commonly used pandas object that is two-dimensional, with labeled rows and columns. Each column can hold a different data type.

Dictionary: A data structure used for storing data as key-value pairs. A dictionary has a set of keys, each of which is associated with a value.
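
As a small illustration of the key-value idea, the sketch below builds and updates a Python dictionary. The variable name and the keys and values are invented for this example.

```python
# A dictionary maps each key to exactly one value.
# The keys and values here are illustrative only.
inventory = {"apples": 3, "pears": 5}

inventory["plums"] = 2        # add a new key-value pair
inventory["apples"] += 1      # update the value stored under an existing key

print(inventory["apples"])    # look up a value by its key
print("kiwis" in inventory)   # membership tests check the keys, not the values
```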

Histogram: A diagram that uses rectangles to represent the distribution of numerical data.

HTML: Hypertext Markup Language (HTML) is the standard markup language for documents designed to be displayed in a web browser.

JSON: JavaScript Object Notation is a syntax that uses human-readable text to store and transmit data objects.
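
To make the "store and transmit" idea concrete, here is a minimal sketch using Python's standard json module. The record and its field names are hypothetical.

```python
import json

# A hypothetical record; the field names are illustrative only.
record = {"name": "Ada", "languages": ["Python", "SQL"], "active": True}

text = json.dumps(record)      # Python object -> JSON text
restored = json.loads(text)    # JSON text -> Python object

print(text)
print(restored == record)      # the round trip preserves the data
```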

Mean: For a data set, the arithmetic mean is the sum of the values divided by the number of values.
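
The definition translates directly into code. The following sketch computes the mean by hand and compares it with the standard-library statistics.mean function; the list of values is made up for illustration.

```python
from statistics import mean

values = [2, 4, 6, 8]                     # illustrative data

manual_mean = sum(values) / len(values)   # sum of the values / number of values
print(manual_mean)                        # 5.0
print(mean(values) == manual_mean)        # the library function agrees
```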

Pandas: A Python library used for data analysis.

Pivot Method: The DataFrame.pivot(index, columns, values) method produces a pivot table based on columns of the DataFrame. It uses the unique values from index and columns as the row and column labels and fills the cells with the entries from values.
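
A minimal sketch of the pivot method follows. The DataFrame and its column names (city, year, sales) are hypothetical and chosen only to show how the three arguments map onto the result.

```python
import pandas as pd

# Hypothetical long-format data: one row per (city, year) pair.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 12, 7, 9],
})

# Row labels come from the unique "city" values, column labels from the
# unique "year" values, and each cell holds the matching "sales" value.
wide = df.pivot(index="city", columns="year", values="sales")
print(wide)
```

Note that pivot only reshapes the data. If two rows share the same index and columns pair, it raises a ValueError instead of combining them.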

Scatter Plot: A diagram that plots the values of two variables along two axes (x-axis and y-axis). The pattern of the points can reveal correlations present in the data.

Web Scraping (Screen Scraping): A technique for extracting large amounts of data from a website and saving it to a local file on your computer or to a database, typically in a tabular (spreadsheet) format.

6.10.2. Keywords

geoshape Altair provides the geoshape mark to visualize geographic data.

mark_geoshape Sets the chart mark to geoshape.

pivot_table A table that summarizes the data of a more extensive table. The idea behind a pivot table is to take the unique values from some columns and make them the titles of new columns, while summarizing the data for those columns from several rows. The following format is used to create a pivot table in pandas: DataFrame.pivot_table(index='', columns='', values='')
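
The summarizing step can be sketched as follows. The DataFrame and its column names are hypothetical. The example relies on pandas using the mean as the default aggregation when no aggfunc is given.

```python
import pandas as pd

# Hypothetical data in which two rows share the pair (Oslo, 2020).
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Oslo", "Lima"],
    "year": [2020, 2020, 2021, 2020],
    "sales": [10, 20, 12, 7],
})

# pivot_table combines rows that land in the same cell.
# With the default aggregation (mean), (Oslo, 2020) becomes 15.0.
summary = df.pivot_table(index="city", columns="year", values="sales")
print(summary)

# A different summary can be requested through aggfunc.
totals = df.pivot_table(index="city", columns="year", values="sales", aggfunc="sum")
print(totals)
```

Cells with no matching rows, such as (Lima, 2021) here, are left as missing values.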

