
Section 11.2 Streams and Files

As was noted in Chapter 4, all input and output (I/O) in Java is accomplished through the use of input streams and output streams. You are already familiar with input and output streams because we have routinely used the System.out output stream and the System.in input stream (Figure 11.2.1) in this text's examples. Recall that System.out usually connects your program (source) to the screen (destination) and System.in usually connects the keyboard (source) to the running program (destination). What you have learned about streams will also be key to connecting files to a program.

Figure 11.2.1. The System.out output stream connects your program to the screen and the System.in input stream connects it to the keyboard.

Subsection 11.2.1 The Data Hierarchy

Data, or information, are the contents that flow through Java streams and are stored in files. All data are composed of binary digits, or bits. A bit is simply a 0 or a 1, each value corresponding to one of two electronic states. As we learned in Chapter 5, a bit is the smallest unit of data.

However, it would be tedious if a program had to work with data in units as small as bits. Therefore, most operations involve various-sized aggregates of data such as an 8-bit byte, a 16-bit short, a 16-bit char, a 32-bit int, a 64-bit long, a 32-bit float, or a 64-bit double. As we know, these are Java's primitive numeric types. In addition to these aggregates, we can group together a sequence of char to form a String.

It is also possible to group data of different types into objects. A record, which corresponds closely to a Java object, can have fields that contain different types of data. For example, a student record might contain fields for the student's name and address (Strings), expected year of graduation (int), and current grade point average (double). Collections of these records are typically grouped into files.

For example, your registrar's office may have a separate file for each of its graduating classes. These are typically organized into a collection of related files, which is called a database. Taken together, the different kinds of data that are processed by a computer or stored in a file can be organized into a data hierarchy (Figure 11.2.2).

Figure 11.2.2. A data hierarchy.

It's important to recognize that while we, the programmers, may group data into various types of abstract entities, the information flowing through an input or output stream is just a sequence of bits. There are no natural boundaries that mark where one byte (or one int or one record) ends and the next one begins. Therefore, it will be up to us to provide the boundaries as we process the data.
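As a minimal sketch of what "providing the boundaries" can mean, the following example writes two int values to an in-memory stream and then reads them back. The class and method names (BoundaryDemo, writeTwoInts, readAsInts) are hypothetical and chosen only for illustration. The stream itself holds 8 undifferentiated bytes; it is the reading code that decides to treat every 4 bytes as one int.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class BoundaryDemo {
    // Writes two ints to an in-memory stream and returns the raw bytes.
    static byte[] writeTwoInts(int a, int b) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(a);
            out.writeInt(b);
        }
        return bytes.toByteArray();
    }

    // Reads the bytes back, imposing 4-byte (int-sized) boundaries on them.
    static int[] readAsInts(byte[] raw) throws IOException {
        int[] result = new int[raw.length / 4];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            for (int i = 0; i < result.length; i++) {
                result[i] = in.readInt();
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = writeTwoInts(1, 2);
        System.out.println(raw.length + " bytes in the stream"); // 8 bytes in the stream
        System.out.println(Arrays.toString(readAsInts(raw)));    // [1, 2]
    }
}
```

Had the reader chosen to call readByte() or readLong() instead, the same 8 bytes would have been carved up differently, which is why the writer and the reader must agree on the layout.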

Subsection 11.2.2 Binary Files and Text Files

As we noted in Chapter 4, there are two types of files in Java: binary files and text files. Both kinds store data as a sequence of bits, that is, a sequence of 0's and 1's. Thus, the difference between the two types of files lies in the way they are interpreted by the programs that read and write them. A binary file is processed as a sequence of bytes, whereas a text file is processed as a sequence of characters.
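The difference in interpretation can be illustrated with a short sketch. Here the same four bytes are read once through an InputStream (as byte values) and once through a Reader (as characters). The class name BytesVersusChars and its helper methods are hypothetical, and the example assumes the bytes are ASCII codes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {
    // Binary view: each read() returns the next byte as a number from 0 to 255.
    static String asByteValues(byte[] data) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (InputStream in = new ByteArrayInputStream(data)) {
            int b;
            while ((b = in.read()) != -1) {
                sb.append(b).append(' ');
            }
        }
        return sb.toString().trim();
    }

    // Text view: each read() returns the next character, decoded here as ASCII.
    static String asCharacters(byte[] data) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.US_ASCII)) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {74, 97, 118, 97};
        System.out.println(asByteValues(data)); // 74 97 118 97
        System.out.println(asCharacters(data)); // Java
    }
}
```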

Text editors and other programs that process text files interpret the file's sequence of bits as a sequence of characters, that is, as a string. Your Java source programs (*.java) are text files, and so are the HTML files that populate the World Wide Web. The big advantage of text files is their portability. Because their data are represented in the ASCII code (Table 5.9.1), they can be read and written by just about any text-processing program. Thus, a text file created by a program on a Windows computer can be read by a Macintosh program.

In non-Java environments, data in binary files are stored as bytes, and the representation used varies from computer to computer. The manner in which a computer's memory stores binary data determines how it is represented in a file. Thus, binary data are not very portable. For example, a binary file of integers created on a Macintosh cannot typically be read by a Windows program.

One reason for the lack of portability is that each type of computer uses its own definition of an integer. On some systems an integer might be 16 bits, and on others it might be 32 bits, so even if you know that a Macintosh binary file contains integers, that still won't make it readable by Windows programs. Another problem is that even if two computers use the same number of bits to represent an integer, they might use different representation schemes. For example, some computers might use 10000101 as the 8-bit representation of the number 133, whereas other computers might use the reverse, 10100001, to represent 133.

The good news for us is that Java's designers have made its binary files platform independent by carefully defining the exact size and representation that must be used for integers and all other primitive types. Thus, binary files created by Java programs can be interpreted by Java programs on any platform.
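A small sketch can make this fixed representation visible. DataOutputStream.writeInt() always writes an int as 4 bytes, high-order byte first, regardless of the machine the program runs on. The class name FixedSizeDemo and the helper hexBytes are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class FixedSizeDemo {
    // Returns the bytes that writeInt() produces for value, shown in hexadecimal.
    static String hexBytes(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes.toByteArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 133 is 0x85, so the four bytes are 00 00 00 85 on every platform.
        System.out.println(hexBytes(133)); // 00 00 00 85
    }
}
```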

Subsection 11.2.3 Input and Output Streams

Java has a wide variety of streams for performing I/O. They are defined in the java.io package, which must be imported by any program that does I/O. They are generally organized into the hierarchy illustrated in Figure 11.2.4. We will cover only a small portion of the hierarchy in this text. Generally speaking, binary files are processed by subclasses of InputStream and OutputStream. Text files are processed by subclasses of Reader and Writer, both of which are streams, despite their names.

Figure 11.2.4. Java's stream hierarchy.

InputStream and OutputStream are abstract classes that serve as the root classes for reading and writing binary data. Their most commonly used subclasses are DataInputStream and DataOutputStream, which are used for processing String data and data of any of Java's primitive types: char, boolean, int, double, and so on. The analogues of these classes for processing text data are the Reader and Writer classes, which serve as the root classes for all text I/O.
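To illustrate how DataOutputStream and DataInputStream might be used together, the sketch below writes a student record like the one described earlier (a name, a graduation year, and a grade point average) to a binary file and reads it back. The class name StudentRecordFile, its methods, and the sample values are assumptions made for the example, and the file is a temporary one created just for the demonstration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class StudentRecordFile {
    // Writes one record: a String, then an int, then a double.
    static void write(File file, String name, int gradYear, double gpa) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeUTF(name);
            out.writeInt(gradYear);
            out.writeDouble(gpa);
        }
    }

    // Reads the fields back in exactly the order in which they were written.
    static String read(File file) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            String name = in.readUTF();
            int gradYear = in.readInt();
            double gpa = in.readDouble();
            return name + " " + gradYear + " " + gpa;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("student", ".dat");
        file.deleteOnExit();
        write(file, "Ada", 2027, 3.5);
        System.out.println(read(file)); // Ada 2027 3.5
    }
}
```

Note that the file records no field names or types. The reader must call readUTF(), readInt(), and readDouble() in the same order the writer used.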

The various subclasses of these root classes perform various specialized I/O operations. For example, FileInputStream and FileOutputStream are used for performing binary input and output on files. The PrintStream class contains methods for outputting various primitive data (integers, floats, and so forth) as text. The System.out stream, one of the most widely used output streams, is an object of this type. The PrintWriter class, which was introduced in JDK 1.1, contains the same methods as PrintStream, but its methods are designed to support platform independence and internationalized I/O, that is, I/O that works in different languages and alphabets.

Each of the methods defined in PrintWriter is designed to output a particular type of primitive data (Figure 11.2.6). As you would expect, there is both a print() and a println() method for each kind of data that the programmer wants to output.
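The following sketch shows a few of the print() and println() overloads in use. To keep it self-contained, the PrintWriter sends its output to a StringWriter (an in-memory destination) rather than to a file. The class name PrintWriterDemo and the method report are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PrintWriterDemo {
    // Uses the String, int, double, and boolean overloads of print() and println().
    static String report(int count, double average, boolean done) {
        StringWriter text = new StringWriter();
        try (PrintWriter out = new PrintWriter(text)) {
            out.print("count=");
            out.println(count);     // println(int)
            out.print("average=");
            out.println(average);   // println(double)
            out.print("done=");
            out.println(done);      // println(boolean)
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Prints three lines: count=3, average=3.5, and done=true.
        System.out.print(report(3, 3.5, true));
    }
}
```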

Figure 11.2.6. The PrintWriter class.

Table 11.2.7 briefly describes Java's most commonly used input and output streams. In addition to the ones we've already mentioned, you are already familiar with methods from the BufferedReader and File classes, which were used in Chapter 4.

Table 11.2.7. Description of some of Java's important stream classes.
Class                  Description
InputStream            Abstract root class of all binary input streams
FileInputStream        Provides methods for reading bytes from a binary file
FilterInputStream      Provides methods required to filter data
BufferedInputStream    Provides input data buffering for reading large files
ByteArrayInputStream   Provides methods for reading an array as if it were a stream
DataInputStream        Provides methods for reading Java's primitive data types
PipedInputStream       Provides methods for reading piped data from another thread
OutputStream           Abstract root class of all binary output streams
FileOutputStream       Provides methods for writing bytes to a binary file
FilterOutputStream     Provides methods required to filter data
BufferedOutputStream   Provides output data buffering for writing large files
ByteArrayOutputStream  Provides methods for writing an array as if it were a stream
DataOutputStream       Provides methods for writing Java's primitive data types
PipedOutputStream      Provides methods for writing piped data to another thread
PrintStream            Provides methods for writing primitive data as text
Reader                 Abstract root class for all text input streams
BufferedReader         Provides buffering for character input streams
CharArrayReader        Provides input operations on char arrays
FileReader             Provides methods for character input on files
FilterReader           Provides methods to filter character input
StringReader           Provides input operations on Strings
Writer                 Abstract root class for all text output streams
BufferedWriter         Provides buffering for character output streams
CharArrayWriter        Provides output operations to char arrays
FileWriter             Provides methods for output to text files
FilterWriter           Provides methods to filter character output
PrintWriter            Provides methods for printing binary data as characters
StringWriter           Provides output operations to Strings

Filtering refers to performing operations on data while the data are being input or output. Methods in the FilterInputStream and FilterReader classes can be used to filter binary and text data during input. Methods in the FilterOutputStream and FilterWriter classes can be used to filter output data. These classes serve as the root classes for various filtering subclasses. They can also be subclassed to perform customized data filtering.

One type of filtering is buffering, which is provided by several buffered streams, including BufferedInputStream and BufferedReader, for performing binary and text input, and BufferedOutputStream and BufferedWriter, for buffered output operations.

As was discussed in Chapter 4, a buffer is a relatively large region of memory used to temporarily store data while they are being input or output. When buffering is used, a program will transfer a large number of bytes into the buffer from the relatively slow input device and then transfer these to the program as each read operation is performed. The transfer from the buffer to the program's memory is very fast.

Similarly, when buffering is used during output, data are transferred directly to the buffer and then written to the disk when the buffer fills up or when the flush() method is called.
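The effect of flush() can be observed with a short sketch. Here a BufferedWriter wraps a StringWriter that stands in for the slow destination, so we can inspect what has actually arrived. The class name FlushDemo and the method snapshots are hypothetical, and the example assumes the text is shorter than the writer's internal buffer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

class FlushDemo {
    // Returns what the destination holds before and after flush() is called.
    // Assumes text is shorter than the BufferedWriter's internal buffer.
    static String[] snapshots(String text) throws IOException {
        StringWriter destination = new StringWriter();
        try (BufferedWriter out = new BufferedWriter(destination)) {
            out.write(text);                          // goes into the buffer only
            String before = destination.toString();
            out.flush();                              // buffer is pushed to the destination
            String after = destination.toString();
            return new String[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        String[] result = snapshots("hello");
        System.out.println("before flush: [" + result[0] + "]"); // before flush: []
        System.out.println("after flush: [" + result[1] + "]");  // after flush: [hello]
    }
}
```

Closing a buffered stream also flushes it, which is one reason forgetting to close an output file can leave it incomplete.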

You can also define your own data filtering subclasses to perform customized filtering. For example, suppose you want to add line numbers to a text editor's printed output. To perform this task, you could define a FilterWriter subclass and override its write() methods to perform the desired filtering operation. Similarly, to remove the line numbers from such a file during input, you could define a FilterReader subclass. In that case, you would override its read() methods to suit your goals for the program.
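One way the line-numbering filter described above might look is sketched here. The class LineNumberWriter is hypothetical, not part of java.io, and the "1: " prefix format is an arbitrary choice. It overrides all three write() methods of FilterWriter so that every character passes through the same numbering logic.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

class LineNumberWriter extends FilterWriter {
    private int lineNumber = 1;
    private boolean atLineStart = true;

    LineNumberWriter(Writer out) {
        super(out);
    }

    // Writes one character, inserting a line number at the start of each line.
    @Override
    public void write(int c) throws IOException {
        if (atLineStart) {
            out.write(lineNumber + ": ");
            lineNumber++;
            atLineStart = false;
        }
        out.write(c);
        if (c == '\n') {
            atLineStart = true;
        }
    }

    // The array and String versions funnel each character through write(int).
    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(cbuf[i]);
        }
    }

    @Override
    public void write(String str, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(str.charAt(i));
        }
    }

    // Convenience method for the demonstration: numbers the lines of a String.
    static String number(String text) throws IOException {
        StringWriter result = new StringWriter();
        try (LineNumberWriter out = new LineNumberWriter(result)) {
            out.write(text);
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        // Prints "1: first" and "2: second" on separate lines.
        System.out.print(number("first\nsecond\n"));
    }
}
```

A matching FilterReader subclass that strips the prefixes on input would override read() in a similar way.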

There are several classes that provide I/O-like operations on various internal memory structures. ByteArrayInputStream, ByteArrayOutputStream, CharArrayReader, and CharArrayWriter are four classes that take input from or send output to arrays in the program's memory. Methods in these classes can be useful for performing various operations on data during input or output. For example, suppose a program reads an entire line of integer data from a binary file into a byte array. It might then transform the data by, say, computing the remainder modulo N of each value. The program now can read these transformed data by treating the byte array as an input stream. A similar example would apply for some kind of output transformation.
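The modulo example can be sketched as follows. For simplicity, this version assumes each value fits in a single non-negative byte and that the byte array has already been filled (here it is supplied directly rather than read from a file). The class name ModuloDemo and the method reduceAndRead are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

class ModuloDemo {
    // Replaces each value with its remainder modulo n, then reads the
    // transformed array back through a stream interface.
    // Assumes every element of data is non-negative and n is positive.
    static int[] reduceAndRead(byte[] data, int n) {
        byte[] transformed = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            transformed[i] = (byte) (data[i] % n);
        }
        ByteArrayInputStream in = new ByteArrayInputStream(transformed);
        int[] result = new int[transformed.length];
        int value;
        int count = 0;
        while ((value = in.read()) != -1) {   // read() returns -1 at end of stream
            result[count++] = value;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 33};
        System.out.println(Arrays.toString(reduceAndRead(data, 7))); // [3, 6, 5]
    }
}
```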

The StringReader and StringWriter classes provide methods for treating Strings and StringBuffers as I/O streams. These methods can be useful for performing certain data conversions.
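As a small illustration of such a conversion, the sketch below uses a StringWriter to turn an array of integers into text, one per line, and a StringReader to turn that text back into numbers. The class name StringStreamDemo and its methods toText and sumLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

class StringStreamDemo {
    // Writes each value on its own line into an in-memory String.
    static String toText(int[] values) {
        StringWriter out = new StringWriter();
        for (int v : values) {
            out.write(v + "\n");
        }
        return out.toString();
    }

    // Treats the String as an input stream and sums the integer on each line.
    static int sumLines(String text) throws IOException {
        int sum = 0;
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                sum += Integer.parseInt(line.trim());
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        String text = toText(new int[] {12, 30});
        System.out.println(sumLines(text)); // 42
    }
}
```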
