8.8. Glossary

8.8.1. Definitions

Adjacency matrix: is a square matrix that is used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not.
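
As a minimal sketch of this definition, the following Python builds the adjacency matrix of a small undirected graph using plain lists. The vertex labels and the edge list are invented for illustration.

```python
# Hypothetical example graph: four vertices and four undirected edges.
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

# Map each vertex label to its row/column position in the matrix.
index = {v: i for i, v in enumerate(vertices)}

# Start with an all-zero square matrix, one row and one column per vertex.
n = len(vertices)
adjacency = [[0] * n for _ in range(n)]

# Mark each edge in both directions, since the graph is undirected.
for u, v in edges:
    adjacency[index[u]][index[v]] = 1
    adjacency[index[v]][index[u]] = 1

for row in adjacency:
    print(row)
# [0, 1, 1, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
# [0, 0, 1, 0]
```

A 1 in row i, column j means vertices i and j are adjacent. For an undirected graph the matrix is symmetric.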

ASCII: stands for American Standard Code for Information Interchange. It is a 7-bit character encoding that can only encode 128 characters.

Data modeling: is the process by which an organization creates a data model to show how data will be stored in a database. The data model gives a conceptual representation of the data objects, the associations between them, and the rules that govern them.

Database design: is the process an organization follows to establish how data must be stored and how the data elements interrelate. It involves classifying data and identifying interrelationships.

Edge: is a connection between two nodes in a tree or a graph, usually drawn as a line segment.

Encoding: is, in a general sense, the conversion of data from one form to another. Character encoding, as used in this book, means converting each character (letters, digits, symbols, and so on) to binary code. ASCII and UTF-8 (an encoding of Unicode) are common character encodings.
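
To make "converting each character to binary code" concrete, here is a small sketch using Python's built-in `ord` and `format`. The sample string is arbitrary.

```python
# For each character, look up its numeric code and write it as 8 binary digits.
encoded = [(ch, ord(ch), format(ord(ch), "08b")) for ch in "Hi!"]

for ch, code, bits in encoded:
    print(ch, code, bits)
# H 72 01001000
# i 105 01101001
# ! 33 00100001
```

All three characters are within ASCII, so each fits in a single byte.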

Foreign key: is a column or group of columns in one table that refers to the primary key of another table, providing a link between the data in the two tables.

Heatmap: is a graphical representation that uses a system of color-coding to represent different values in a data set.

Many-to-many: is a relation in a database where several records in one table are associated with several records in another table.

ISO-8859-1: is a single-byte character encoding that can only represent the first 256 Unicode characters. The “-1” identifies the “Latin-1” character set, the first part of the ISO-8859 family; other parts of ISO-8859 cover other alphabets and languages.

Node: is a basic unit of a data structure such as a linked list, tree, or graph. A node contains data and may be connected to other nodes.

Observational unit: is the entity about which information is collected and for which statistics are compiled when gathering statistical data.

One-to-many: is a relation in a database where one record, usually called the parent, is associated with several other records, usually called child records, in another table.

One-to-one: is a relation in a database where one record of a table is associated with one and only one record in another table.

Primary key: is a column or group of columns whose value uniquely identifies each record in a table of a relational database.
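
The primary key, foreign key, and one-to-many entries in this glossary can be seen working together in a short sketch. It uses Python's built-in `sqlite3` module purely for illustration. The `author` and `book` tables and their contents are hypothetical and not taken from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# author_id is the primary key of author and a foreign key in book.
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE book ("
    " book_id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " author_id INTEGER REFERENCES author(author_id))"
)

conn.execute("INSERT INTO author VALUES (1, 'Author One')")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(10, "First Book", 1), (11, "Second Book", 1)],
)

# One author record is linked to several book records (one-to-many).
rows = conn.execute(
    "SELECT author.name, book.title FROM author"
    " JOIN book ON book.author_id = author.author_id"
    " ORDER BY book.book_id"
).fetchall()
print(rows)

# A book pointing at a nonexistent author violates the foreign key.
try:
    conn.execute("INSERT INTO book VALUES (12, 'Orphan Book', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The parent record (the author) appears once, while its key value is repeated in every child record (the books) that belongs to it.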

Text complexity: is the level of challenge a text presents to a reader, based on three criteria: its quantitative features, its qualitative features, and reader/text factors.

UTF-8: stands for Unicode Transformation Format, 8-bit. It is a variable-width character encoding that can encode all 1,112,064 valid Unicode code points, using one to four bytes for each.
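
The figure of 1,112,064 can be checked with a little arithmetic: Unicode code points run from U+0000 to U+10FFFF, and the surrogate range U+D800 to U+DFFF is excluded. The sketch below does that calculation and also shows how many bytes UTF-8 uses for a few sample characters, which are arbitrary choices written as escapes.

```python
# All code points from U+0000 to U+10FFFF, minus the surrogate range.
total_code_points = 0x10FFFF + 1
surrogates = 0xDFFF - 0xD800 + 1
encodable = total_code_points - surrogates
print(f"{encodable:,}")  # 1,112,064

# Sample characters: "A", e-acute, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
widths = [len(ch.encode("utf-8")) for ch in samples]
print(widths)  # [1, 2, 3, 4]
```

Characters in the ASCII range take a single byte, so plain ASCII text is already valid UTF-8.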

Unicode errors: Unicode is a character encoding standard that is used internationally. A Unicode error usually arises when you try to write a Unicode string to a file or device that cannot handle some of its characters. For instance, you will get an error if the output file or device only handles ASCII and the string contains a non-ASCII character.
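
A small sketch of how such an error can arise in Python, and two common ways around it. The sample string is arbitrary. Encoding to ASCII here stands in for writing to an ASCII-only file or device.

```python
text = "na\u00efve"  # contains one non-ASCII character, i with diaeresis

# Encoding to ASCII fails because that character has no ASCII code.
try:
    text.encode("ascii")
    failed = False
except UnicodeEncodeError as err:
    failed = True
    bad = err.object[err.start:err.end]  # the offending character
    print("could not encode", ascii(bad))

# Option 1: use an encoding that covers the character.
as_utf8 = text.encode("utf-8")

# Option 2: keep ASCII but say how to handle what does not fit.
replaced = text.encode("ascii", errors="replace")
print(replaced)  # b'na?ve'
```

Which option is appropriate depends on what the receiving file or device expects. Replacing characters loses information, so choosing a suitable encoding is usually preferable when it is possible.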

Vectorized string methods: are a set of string methods, much like the ones in Python, that pandas provides through the .str accessor to handle and manipulate all the strings in a Series at once.
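
For illustration, here is a minimal sketch of vectorized string methods applied through the pandas `.str` accessor. The sample names are made up.

```python
import pandas as pd

# Hypothetical messy data: stray whitespace and inconsistent capitalization.
names = pd.Series(["  alice", "BOB ", "Carol"])

# Each .str method is applied to every element, with no explicit loop.
cleaned = names.str.strip().str.title()
has_o = cleaned.str.contains("o")

print(cleaned.tolist())  # ['Alice', 'Bob', 'Carol']
print(has_o.tolist())    # [False, True, True]
```

The method names mirror Python's own `str.strip`, `str.title`, and so on, which makes them easy to remember.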

8.8.2. Keywords

graphviz is an open-source package used to represent and visualize graphs.

networkx is a Python library that is used for studying graphs.

pd.concat is a pandas function used to concatenate several smaller data frames into one larger data frame.
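
A minimal sketch of `pd.concat` stacking two small data frames. The column names and values are hypothetical.

```python
import pandas as pd

# Two small data frames with the same columns.
jan = pd.DataFrame({"month": ["Jan", "Jan"], "sales": [10, 12]})
feb = pd.DataFrame({"month": ["Feb"], "sales": [7]})

# Stack them row-wise; ignore_index=True renumbers the rows 0, 1, 2.
combined = pd.concat([jan, feb], ignore_index=True)
print(combined)
```

Without `ignore_index=True`, each piece keeps its original row labels, so the combined frame would contain the label 0 twice.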

Series is a one-dimensional labeled array in pandas that can hold any data type.
