Skip to main content

Section 3.1 Data Structures and Algorithms

Subsection 3.1.1 Introduction

To determine the number of cities with a population exceeding 250,000 within a 500-mile radius of Dallas, Texas; to identify the count of employees in my company earning over $100,000 per year; or to assess the feasibility of connecting all telephone customers with less than 1,000 miles of cable, it is insufficient to possess the necessary information alone. We must organize that information in a manner that enables us to promptly find the answers to meet our requirements.
Information representation lies at the core of computer science. Most computer programs are designed not primarily for performing calculations, but for storing and retrieving information—usually with optimal speed. Hence, the study of data structures and the algorithms that manipulate them forms the essence of computer science. This book aims to assist you in comprehending how to structure information to support efficient processing.
In any course on Data Structures and Algorithms, you will typically encounter three key aspects:
  1. Introduction to commonly used data structures and algorithms, which constitute a programmer’s fundamental toolkit. For many problems, a data structure or algorithm from this toolkit will provide a suitable solution. Our focus is on data structures and algorithms that have stood the test of time and proven to be most beneficial.
  2. Exploration of tradeoffs and reinforcement of the idea that every data structure or algorithm entails costs and benefits. This is achieved by describing the space and time requirements associated with typical operations for each data structure. Similarly, we examine the time needed for various input types in each algorithm.
  3. Instruction on measuring the effectiveness of a data structure or algorithm. Only through such evaluation can we determine which data structure from our toolkit is most suitable for a new problem. The techniques presented also enable us to assess the merits of new data structures that we or others might invent.
When approaching problem-solving, there are often multiple approaches available. How do we choose among them? At the core of computer program design lie two goals, which can sometimes conflict:
  1. Designing an algorithm that is easy to understand, code, and debug.
  2. Designing an algorithm that efficiently utilizes the computer’s resources.
Ideally, a resulting program embodies both of these goals. We could consider such a program to be "elegant." While the algorithms and code examples presented here strive for elegance in this sense, it is not the primary objective of this book to explicitly address issues related to goal. Such concerns primarily fall within the domain of Software Engineering. Instead, our focus primarily revolves around issues related to goal.
How do we measure efficiency? The method employed to evaluate the efficiency of an algorithm or computer program is known as asymptotic analysis. This analysis also allows us to define the inherent complexity of a problem. Throughout the book, we employ asymptotic analysis techniques to estimate the time cost for each algorithm presented. This enables you to compare the efficiency of different algorithms used to solve the same problem and understand how they fare against one another.

Subsection 3.1.2 Philosophy of Data Structures

You might assume that as computers become more powerful, the importance of program efficiency diminishes. After all, processor speed and memory capacity continue to improve. Can’t we rely on tomorrow’s hardware to solve today’s efficiency issues?
However, our history of computer development has shown that as computing power increases, we tend to tackle increasingly complex problems. This can take the form of more sophisticated user interfaces, larger problem sizes, or previously unsolvable computational challenges. With more complex problems, the demand for computation grows, amplifying the need for efficient programs. Unfortunately, as tasks become more intricate, they often deviate from our everyday experiences. Therefore, today’s computer scientists must be well-versed in the principles of efficient program design because their everyday intuitions may not directly apply in the realm of computer programming.
In a broad sense, a data structure encompasses any representation of data and the operations associated with it. Even a simple integer or floating-point number stored in a computer can be seen as a basic data structure. However, the term "data structure" commonly refers to the organization or structuring of a collection of data items. For instance, a sorted list of integers stored in an array exemplifies such a structuring. These concepts are further explored in the discussion of Abstract Data Types.
With sufficient space to store a collection of data items, it is always possible to search for specific items, process the items in any desired order, or modify the value of individual items. The most straightforward example is an unsorted array containing all the data items. It is possible to perform all necessary operations on such an unsorted array. However, employing the appropriate data structure can make a significant difference in program efficiency. For instance, searching for a specific record in a hash table is much faster than searching for it in an unsorted array.
An efficient solution is one that solves the problem while adhering to the required resource constraints. These constraints can include the available space to store data (which may involve separate constraints for main memory and disk space) and the allotted time for each subtask. Sometimes, a solution is deemed efficient if it utilizes fewer resources than known alternatives, regardless of meeting specific requirements. The cost of a solution refers to the amount of resources it consumes. Typically, cost is measured in terms of a key resource, such as time, assuming that the solution fulfills other resource constraints as well.

Subsection 3.1.3 Selecting a Data Structure

It is an essential reminder that people develop programs to solve problems, yet programmers sometimes overlook this fact. Therefore, it is crucial to always bear this truth in mind when choosing a data structure to tackle a specific problem. The first step towards selecting the right data structure is analyzing the problem and determining the performance objectives that must be met. Without this initial analysis, program designers often make the mistake of using a familiar but inappropriate data structure, resulting in a slow program. Conversely, there is no point in adopting a complex representation to "enhance" a program that can achieve its performance goals through a simpler design.
To select a data structure for problem-solving, follow these steps:
  1. Analyze the problem and identify the essential operations that the data structure must support. These operations may include inserting a data item into the structure, deleting a data item, and finding a specific data item.
  2. Quantify the resource limitations for each operation.
  3. Choose the data structure that best satisfies the requirements identified in the previous steps.
This three-step approach to data structure selection operationalizes a data-centric perspective in the design process. The primary focus is on the data and the operations performed on them. The next concern is finding the appropriate representation for the data, followed by implementing that representation.
Resource constraints, especially for critical operations like searching, inserting, and deleting data records, typically guide the selection of a data structure. Addressing questions regarding the relative significance of these operations can help in making informed decisions. Whenever you face the task of choosing a data structure, consider asking yourself the following three questions:
  1. Are all data items inserted into the structure at the beginning, or are insertions interspersed with other operations? Static applications, where data is loaded initially and remains unchanged, often benefit from simpler data structures for efficient implementation. In contrast, dynamic applications frequently require more intricate structures.
  2. Is it possible to delete data items? If so, this may introduce additional complexity into the implementation.
  3. Are all data items processed in a defined order, or does the search involve locating specific data items? "Random access" searches typically necessitate more complex data structures.
By carefully considering these factors and evaluating the requirements of your problem, you can make informed choices when selecting an appropriate data structure.
You have attempted of activities on this page.