
Section 11.5 Example: Reading and Writing Binary Files

Although text files are extremely useful and often employed, they can't and shouldn't be used for every data-processing application. For example, your college's administrative data system undoubtedly uses files to store student records. Because your student record contains a variety of different types of data (Strings, ints, doubles), it cannot be processed as text. Similarly, a company's inventory files, which also include data of a wide variety of types, cannot be processed as text. Files such as these must be processed as binary data.

Suppose you are asked to write an application that involves the use of a company's employee records. Recall that a record is a structure that combines different types of data into a single entity. It's like an object with no methods, just instance variables.

A binary file is a sequence of bytes. Unlike a text file, which is terminated by a special end-of-file marker, a binary file consists of nothing but data. A binary file doesn't have an end-of-file character because any such character would be indistinguishable from a binary datum.

Generally speaking, the steps involved in reading and writing binary files are the same as for text files: connect a stream to the file, read or write the data (usually in a loop), and close the stream.

The difference between text and binary file I/O resides in the Java streams that we use.
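To make the difference concrete, the following sketch writes the same int once through a text stream and once through a binary stream, using in-memory byte arrays in place of files. The class name TextVsBinary and its two helper methods are invented for this illustration and are not part of the chapter's program.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of the chapter's BinaryIO program.
class TextVsBinary {

    // Text output: the int is converted to decimal digits, one byte per digit.
    static byte[] asText(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(
            new OutputStreamWriter(bytes, StandardCharsets.US_ASCII));
        out.print(value);
        out.flush();                 // push the characters into the byte array
        return bytes.toByteArray();
    }

    // Binary output: the int is copied out as its 4 bytes, high byte first.
    static byte[] asBinary(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(asText(1234567).length);   // prints 7
        System.out.println(asBinary(1234567).length); // prints 4
    }
}
```

The text form grows with the number of digits, while the binary form of an int always occupies 4 bytes.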

Subsection 11.5.1 Writing Binary Data

Let's begin by designing a method that will output employee data to a binary file. As the developer of this program, one thing you'll have to do is build some sample data files. These can't easily be built by hand (remember, you can't use a text editor to create them), so you'll want to develop a method that can generate some random data of the sort your application will have to process.

The first thing we need to know is exactly what the data look like. Let's assume that each record contains three individual pieces of data: the employee's name, age, and pay rate. For example, the data in a file containing four records might look like this, once the data are interpreted:

Name0 24 15.06
Name1 25 5.09
Name2 40 11.45
Name3 52 9.25

As you can see, these data look as if they were randomly generated, but they resemble the real data in the important respects: They are of the right type (String, int, double) and have the right kind of values. Of course, when these data are stored in the file, or in the program's memory, they just look like one long string of 0's and 1's.

Our approach to designing this output method will be the same as the approach we used in designing methods for text I/O. That is, we start with two questions:

  • What stream classes should I use?

  • What methods can I use?

And we find the answers to these by searching through the java.io package (Figure 11.2.4 and Table 11.2.7).

Figure 11.5.4. The FileOutputStream class.

Because we are performing binary output, we need to use some subclass of OutputStream. Because we're outputting to a file, one likely candidate is FileOutputStream (Figure 11.5.4). This class has the right kind of constructors, but it only contains write() methods for writing int and byte data, and we need to be able to write String and double data as well.

These kinds of methods are found in DataOutputStream (Figure 11.5.5), which contains a write method for each different type of data. As you can see, there's one method for each primitive type. However, note that the writeChar() method takes an int parameter, which indicates that the character is written in binary format, as a 2-byte value, rather than as a character of text. Although you can't tell by just reading its method signature, the writeChars(String) method also writes its data in binary format, 2 bytes per character, rather than as a sequence of text characters. This is the main difference between these write methods and the ones defined in the Writer branch of Java's I/O hierarchy.
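One way to see the binary format at work is to count the bytes each write method produces. The sketch below does this with an in-memory stream. The WriteSizes class and its WriteAction interface are hypothetical helpers introduced only for this demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class for illustration only.
class WriteSizes {

    // A single write operation, supplied as a lambda in main() below.
    interface WriteAction {
        void apply(DataOutputStream out) throws IOException;
    }

    // Runs one write operation and reports how many bytes it produced.
    static int bytesWritten(WriteAction action) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            action.apply(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println(bytesWritten(out -> out.writeInt(24)));        // prints 4
        System.out.println(bytesWritten(out -> out.writeDouble(15.06)));  // prints 8
        System.out.println(bytesWritten(out -> out.writeChar('N')));      // prints 2
        System.out.println(bytesWritten(out -> out.writeChars("Name0"))); // prints 10
    }
}
```

The sizes depend only on the type being written, never on the particular value.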

Figure 11.5.5. The DataOutputStream class.

Now that we've found the appropriate classes and methods, we need to create a pipeline to write data to the file and develop an output algorithm. To construct a stream to use in writing employee records, we want to join together a DataOutputStream and a FileOutputStream. The DataOutputStream gives us the output methods we need, and the FileOutputStream lets us use the file's name to create the stream:

DataOutputStream outStream
  = new DataOutputStream(new FileOutputStream (fileName));

This enables the program to write data to the DataOutputStream, which will pass them through the FileOutputStream to the file itself. That settles the first question.

To develop the output algorithm, we need some kind of loop that involves calls to the appropriate methods. In this case, because we are generating random data, we can use a simple for loop to generate, say, five records of employee data. We need one write() statement for each of the elements in the employee record: The name (String), age (int), and pay rate (double):

for (int k = 0; k < 5; k++) {  // Output 5 data records
    outStream.writeUTF("Name" + k);                     // Name
    outStream.writeInt((int)(20 + Math.random() * 25)); // Age
    outStream.writeDouble(Math.random() * 500);         // Payrate
}

Within the loop body we have one output statement for each data element in the record. The names of the methods reflect the type of data they write. Thus, we use writeInt() to write an int and writeDouble() to write a double. But why do we use writeUTF to write the employee's name, a String?

Subsubsection 11.5.1.1 The Unicode Transformation Format (UTF)

There is no DataOutputStream.writeString() method. Instead, Strings are written using the writeUTF() method. UTF stands for Unicode Transformation Format, a family of coding schemes for the Unicode character set. Recall that Java uses the Unicode character set instead of the ASCII set, and that a Java char is a 16-bit value. Unicode includes the ASCII characters plus a wide variety of Asian and other international characters.

However, storing every character as a 16-bit value is not very efficient if you aren't writing an international program. If your program just uses the standard ASCII characters, which can be stored in 1 byte, you would be wasting 1 byte per character if you stored them as 16-bit values. Therefore, for efficiency purposes, writeUTF() uses a modified form of the UTF-8 encoding. This encoding can still represent all of the Unicode characters, but it stores the common ASCII characters in a single byte each. The method also writes a 2-byte length in front of the characters, so that readUTF() knows how many bytes belong to the String.
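The saving can be checked directly by looking at the bytes that writeUTF() and writeChars() produce for the same String. The following is a small sketch using in-memory streams. The class UtfLayout and its method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
class UtfLayout {

    // Returns the bytes that writeUTF() produces for the given String.
    static byte[] utfBytes(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns the bytes that writeChars() produces for the given String.
    static byte[] charsBytes(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeChars(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Two length bytes (0, 5) followed by one byte per ASCII character.
        System.out.println(Arrays.toString(utfBytes("Name0")));
        // prints [0, 5, 78, 97, 109, 101, 48]
        System.out.println(charsBytes("Name0").length); // prints 10
        // An accented e is outside ASCII and needs 2 bytes after the length.
        System.out.println(utfBytes("\u00e9").length);  // prints 4
    }
}
```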

It's now time to combine these separate elements into a single method (Figure 11.5.6). The writeRecords() method takes a single String parameter that specifies the name of the file. This is a void method. It will output data to a file, but it will not return anything to the calling method. The method follows the standard output algorithm: Create an output stream, write the data, close the stream. Note also that the method includes a try/catch block to handle any IOExceptions that might be thrown.

private void writeRecords( String fileName )  {
 try {
   DataOutputStream outStream   // Open stream
      = new DataOutputStream(new FileOutputStream(fileName));
   for (int k = 0; k < 5; k++) { // Output 5 data records
      String name = "Name" + k;  // of name, age, payrate
      outStream.writeUTF(name);
      outStream.writeInt((int)(20 + Math.random() * 25));
      outStream.writeDouble(5.00 + Math.random() * 10);
   } // for
   outStream.close();          // Close the stream
 } catch (IOException e) {
   display.setText("IOERROR: " + e.getMessage() + "\n");
 }
} // writeRecords()
Figure 11.5.6. A method to write a binary file consisting of five randomly constructed records.
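In the method above, close() is skipped if one of the write calls throws an exception partway through the loop. A possible alternative, assuming Java 7 or later, is the try-with-resources statement, which closes the stream on every exit path. The sketch below is a variant rather than the chapter's method: the RecordWriter class, the count parameter, the boolean return value, and the use of System.err in place of the display are all assumptions made so that the example can run on its own.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical variant of writeRecords(); names and return type are assumptions.
class RecordWriter {

    // Writes `count` random records and reports whether the write succeeded.
    static boolean writeRecords(String fileName, int count) {
        // The stream declared in the parentheses is closed automatically,
        // whether the loop finishes normally or a write throws.
        try (DataOutputStream outStream
                 = new DataOutputStream(new FileOutputStream(fileName))) {
            for (int k = 0; k < count; k++) {
                outStream.writeUTF("Name" + k);
                outStream.writeInt((int) (20 + Math.random() * 25));
                outStream.writeDouble(5.00 + Math.random() * 10);
            }
            return true;
        } catch (IOException e) {
            System.err.println("IOERROR: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"),
                             "employees-demo.dat");
        if (writeRecords(file.getPath(), 5)) {
            // Each record here: 7 bytes of name, 4 of int, 8 of double.
            System.out.println(file.length() + " bytes written");
        }
        file.delete(); // remove the demo file again
    }
}
```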

Subsection 11.5.2 Reading Binary Data

The steps involved in reading data from a binary file are the same as for reading data from a text file: Create an input stream and open the file, read the data, close the file. The main difference lies in the way you detect the end of the file, because a binary file has no end-of-file marker to check for.

Let's design a method to read the binary data that were output by the writeRecords() method. We'll call this method readRecords(). It, too, will take a single String parameter that provides the name of the file to be read, and it will be a void method. It will just display the data in the program's display area.

Figure 11.5.7. The FileInputStream class.

The next questions we need to address are: What stream classes should we use, and what methods should we use? For binary input, we need an InputStream subclass (Figure 11.2.4 and Table 11.2.7). As you've probably come to expect, the FileInputStream class contains constructors that let us create a stream from a file name (Table 11.2.7). However, it does not contain useful read() methods.

Figure 11.5.8. The DataInputStream class contains methods for reading all types of data.

Fortunately, the DataInputStream class contains the input counterparts of the methods we found in DataOutputStream (Figure 11.5.8). Therefore, our input stream for this method will be a combination of DataInputStream and FileInputStream:

DataInputStream inStream
   = new DataInputStream(new FileInputStream(fileName));

Now that we have identified the classes and methods we'll use to read the data, the most important remaining issue is designing a read loop that will terminate correctly. Unlike text files, binary files do not contain a special end-of-file marker. Therefore, the read methods can't see anything in the file that tells them they're at the end of the file.

Consequently, when a binary read method attempts to read past the end of the file, an end-of-file exception (EOFException) is thrown. Thus, the binary loop is coded as an infinite loop that's exited when the EOFException is raised:

try {
  while (true) {                        // Infinite loop
    String name = inStream.readUTF();   // Read a record
    int age = inStream.readInt();
    double pay = inStream.readDouble();
    display.append(name + " " + age + " " + pay + "\n");
  } // while
} catch (EOFException e) {} // Until EOF exception

The read loop is embedded within a try/catch statement. Note that the catch clause for the EOFException does nothing. Recall that when an exception is thrown in a try block, the block is exited for good, which is precisely the action we want to take. That's why we needn't do anything when we catch the EOFException. We have to catch the exception or else Java will catch it and terminate the program. This is an example of an expected exception.

Note also that the read statements within the loop mirror the write statements in the method that created the data. This will generally be true for binary I/O routines: The statements that read data from a file should "match" those that wrote the data in the first place.
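The matching of reads to writes can be tried out without touching the file system. The following sketch writes two records to an in-memory byte array and reads them back with the same EOFException-terminated loop. The RoundTrip class, its method names, and the sample values are chosen only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the sample values are made up.
class RoundTrip {

    // Writes two fixed records in the order name, age, pay rate.
    static byte[] writeSample() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF("Name0"); out.writeInt(24); out.writeDouble(15.06);
            out.writeUTF("Name1"); out.writeInt(25); out.writeDouble(5.09);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads records back in the same order until the data run out.
    static List<String> readAll(byte[] data) {
        List<String> records = new ArrayList<>();
        // An in-memory stream holds no system resource, so it is not closed here.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            while (true) {
                String name = in.readUTF();
                int age = in.readInt();
                double pay = in.readDouble();
                records.add(name + " " + age + " " + pay);
            }
        } catch (EOFException e) {
            // Expected: this is how the loop ends.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(readAll(writeSample()));
        // prints [Name0 24 15.06, Name1 25 5.09]
    }
}
```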

To complete the method, the only remaining task is to close() the stream after the data are read. The complete definition is shown in Listing 11.5.11.

private void readRecords( String fileName ) {
 try {
   DataInputStream inStream                 // Open stream
     = new DataInputStream(new FileInputStream(fileName));
   display.setText("Name   Age Pay\n");
   try {
     while (true) {                        // Infinite loop
        String name = inStream.readUTF();  // Read a record
        int age = inStream.readInt();
        double pay = inStream.readDouble();
        display.append(name + " " + age + " " + pay + "\n");
     } // while
   } catch (EOFException e) { // Until EOF exception
   } finally {
       inStream.close();             // Close the stream
   }
 } catch (FileNotFoundException e) {
   display.setText("IOERROR: "+ fileName + " NOT Found: \n");
 } catch (IOException e) {
   display.setText("IOERROR: " + e.getMessage() + "\n");
 }
} // readRecords()
Listing 11.5.11. A method for reading binary data.

It's important that a close() statement be placed after the catch EOFException clause. If it were placed in the try block, it would never get executed. Note also that the entire method is embedded in an outer try block that catches the IOException, thrown by the various read() methods, and the FileNotFoundException, thrown by the FileInputStream() constructor. These make the method a bit longer, but conceptually they belong in this method.
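On Java 7 or later, the nested try/finally can be replaced by a single try-with-resources statement, which closes the stream before any catch clause runs. The sketch below shows one way this might look. It is a variant and not the listing above: the RecordReader class, the List return value, the writeOne() helper, and the use of System.err in place of the display are assumptions made so that the example is self-contained.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of readRecords(); it returns the lines instead of
// appending them to the display, which is an assumption made for the demo.
class RecordReader {

    static List<String> readRecords(String fileName) {
        List<String> lines = new ArrayList<>();
        try (DataInputStream inStream
                 = new DataInputStream(new FileInputStream(fileName))) {
            while (true) {                        // Infinite loop
                String name = inStream.readUTF(); // Read a record
                int age = inStream.readInt();
                double pay = inStream.readDouble();
                lines.add(name + " " + age + " " + pay);
            }
        } catch (EOFException e) {
            // Expected end of data; the stream has already been closed.
        } catch (FileNotFoundException e) {
            System.err.println("IOERROR: File NOT Found: " + fileName);
        } catch (IOException e) {
            System.err.println("IOERROR: " + e.getMessage());
        }
        return lines;
    }

    // Helper for the demo: writes one fixed record to the given file.
    static void writeOne(String fileName) {
        try (DataOutputStream out
                 = new DataOutputStream(new FileOutputStream(fileName))) {
            out.writeUTF("Name0");
            out.writeInt(24);
            out.writeDouble(15.06);
        } catch (IOException e) {
            System.err.println("IOERROR: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        File file = new File(System.getProperty("java.io.tmpdir"),
                             "employees-read-demo.dat");
        writeOne(file.getPath());
        System.out.println(readRecords(file.getPath()));
        file.delete(); // remove the demo file again
    }
}
```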

Exercises Self-Study Exercise

1. Find the error.

    Which of the following method definitions would correctly read a binary file of int?

  • public void readIntegers(DataInputStream inStream) {
        try {
            while (true) {
                int num = inStream.readInt();
                System.out.println(num);
            }
            inStream.close();
        } catch (EOFException e) {
        } catch (IOException e) {
        }
    } // readIntegers
    
  • The inStream.close() statement is in the wrong place.

  • public void readIntegers(DataInputStream inStream) {
        try {
            while (true) {
                int num = inStream.readInt();
                System.out.println(num);
            }
        } catch (EOFException e) {
        } catch (IOException e) {
            inStream.close();
        }
    } // readIntegers
    
  • The inStream.close() statement is in the wrong place.

  • public void readIntegers(DataInputStream inStream) throws IOException {
        try {
            while (true) {
                int num = inStream.readInt();
                System.out.println(num);
            }
        } catch (EOFException e) {
        } catch (IOException e) {
        } finally {
            inStream.close();
        }
    } // readIntegers
    
  • Yes, the inStream.close() statement goes in the finally block.

Subsection 11.5.3 The BinaryIO Application

Figure 11.5.14. A program to read and write binary files.

Given the methods we wrote in the previous section, we can now specify the overall design of the BinaryIO class (Figure 11.5.14). The program sets up the same interface we used in the text file example (Figure 11.5.15). It allows the user to specify the name of a data file to read or write. One button allows the user to write random employee records to a binary file, and the other allows the user to display the contents of a file in a JTextArea. The BinaryIO program in Listing 11.5.16 incorporates both readRecords() and writeRecords() into a complete Java program.

Figure 11.5.15. User interface for BinaryIO program.

import javax.swing.*;         // Swing components
import java.awt.*;
import java.io.*;
import java.awt.event.*;
public class BinaryIO extends JFrame implements ActionListener{
    private JTextArea display = new JTextArea();
    private JButton read = new JButton("Read Records From File"),
                    write = new JButton("Generate Random Records");
    private JTextField nameField = new JTextField(10);
    private JLabel prompt = new JLabel("Filename:", JLabel.RIGHT);
    private JPanel commands = new JPanel();
    public BinaryIO() {
        super("BinaryIO Demo");                    // Set window title
        read.addActionListener(this);
        write.addActionListener(this);
        commands.setLayout(new GridLayout(2,2,1,1)); // Control panel
        commands.add(prompt);
        commands.add(nameField);
        commands.add(read);
        commands.add(write);
        display.setLineWrap(true);
        this.getContentPane().setLayout(new BorderLayout () );
        this.getContentPane().add("North", commands);
        // Put the text area inside the scroll pane and add the pane once
        this.getContentPane().add("Center", new JScrollPane(display));
    } // BinaryIO()

    private void readRecords( String fileName ) {
        try {
            DataInputStream inStream     // Open stream
               = new DataInputStream(new FileInputStream(fileName));
            display.setText("Name   Age Pay\n");
            try {
                while (true) {                // Infinite loop
                    String name = inStream.readUTF(); // Read a record
                    int age = inStream.readInt();
                    double pay = inStream.readDouble();
                    display.append(name + "   " + age + "   " + pay + "\n");
                } // while
            } catch (EOFException e) {  // Until EOF exception
            } finally {
                inStream.close();                  // Close the stream
            }
        } catch (FileNotFoundException e) {
            display.setText("IOERROR: File NOT Found: " + fileName + "\n");
        } catch (IOException e) {
            display.setText("IOERROR: " + e.getMessage() + "\n");
        }
    } // readRecords()

    private void writeRecords( String fileName )  {
        try {
            DataOutputStream outStream   // Open stream
               = new DataOutputStream(new FileOutputStream(fileName));
            for (int k = 0; k < 5; k++) { // Output 5 data records
                String name = "Name" + k;   // of name, age, payrate
                outStream.writeUTF(name);
                outStream.writeInt((int)(20 + Math.random() * 25));
                outStream.writeDouble(5.00 + Math.random() * 10);
            } // for
            outStream.close();             // Close the stream
        } catch (IOException e) {
            display.setText("IOERROR: " + e.getMessage() + "\n");
        }
    } // writeRecords()

    public void actionPerformed(ActionEvent evt) {
        String fileName = nameField.getText();
        if (evt.getSource() == read)
            readRecords(fileName);
        else
            writeRecords(fileName);
    } // actionPerformed()

    public static void main(String args[]) {
        BinaryIO bio = new BinaryIO();
        bio.setSize(400, 200);
        bio.setVisible(true);
        bio.addWindowListener(new WindowAdapter() { // Quit
            public void windowClosing(WindowEvent e) {
                System.exit(0);
            }
        });
    } // main()
} // BinaryIO
Listing 11.5.16. The BinaryIO class illustrates simple input and output from a binary file.

Subsection 11.5.4 Abstracting Data from Files

It's important to recognize that the read statements in the method that reads a binary file must exactly match the order of the write statements in the method that wrote the file. For example, if the file contains records that consist of a String followed by an int followed by a double, then they must be written by a sequence consisting of

writeUTF();
writeInt();
writeDouble();

And they must thereafter be read by the sequence of

readUTF();
readInt();
readDouble();

Attempting to do otherwise would make it impossible to interpret the data in the file.
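One way to keep the two sequences in step is to define both in a single class that represents the record. The sketch below illustrates the idea. The EmployeeRecord class, its field names, and its helper methods are hypothetical and do not appear in the BinaryIO program.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record class; its name and fields are assumptions for the demo.
final class EmployeeRecord {
    final String name;
    final int age;
    final double payRate;

    EmployeeRecord(String name, int age, double payRate) {
        this.name = name;
        this.age = age;
        this.payRate = payRate;
    }

    // The field order is defined here ...
    void writeTo(DataOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeInt(age);
        out.writeDouble(payRate);
    }

    // ... and mirrored here, right next to it, so the two are easy to compare.
    static EmployeeRecord readFrom(DataInputStream in) throws IOException {
        String name = in.readUTF();
        int age = in.readInt();
        double payRate = in.readDouble();
        return new EmployeeRecord(name, age, payRate);
    }

    // Convenience wrapper for the demo: one record to an in-memory byte array.
    static byte[] toBytes(EmployeeRecord r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            r.writeTo(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Convenience wrapper for the demo: one record back from a byte array.
    static EmployeeRecord fromBytes(byte[] data) {
        try (DataInputStream in
                 = new DataInputStream(new ByteArrayInputStream(data))) {
            return readFrom(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        EmployeeRecord copy = fromBytes(toBytes(new EmployeeRecord("Name2", 40, 11.45)));
        System.out.println(copy.name + " " + copy.age + " " + copy.payRate);
        // prints Name2 40 11.45
    }
}
```

With this design, a change to the record layout is made in one place, and any method that reads or writes a file calls writeTo() and readFrom() instead of repeating the sequence.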

This point should make it evident why (non-Java) binary files are not portable whereas text files are. With text files, each character consists of 8 bits, and each 8-bit chunk can be interpreted as an ASCII character. So even though a text file consists of a long sequence of 0's and 1's, we know how to find the boundaries between each character. That's why any text editor can read a text file, no matter what program created it.

On the other hand, binary files are also just a long sequence of 0's and 1's, but we can't tell where one data element begins and another one ends. For example, the 64-bit sequence

01010011001100100101010011001100
01010011001100101101010011001100

could represent two 32-bit ints or two 32-bit floats or one 64-bit double or four 16-bit chars or a single String of 8 ASCII characters. We can't tell what data we have unless we know exactly how the data were written.
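The same point can be demonstrated in code by taking one fixed group of 8 bytes and reading it back under several different assumptions. The sketch below uses java.nio.ByteBuffer, which is not one of the stream classes discussed in this section but which, by default, uses the same high-byte-first order as DataInputStream. The class name and the choice of the bytes for "ABCDEFGH" are arbitrary.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the 8 sample bytes are chosen arbitrarily.
class SameBitsManyMeanings {

    // The same 64 bits every time: the ASCII codes of A through H.
    static byte[] bits() {
        return "ABCDEFGH".getBytes(StandardCharsets.US_ASCII);
    }

    // Interpreted as two 32-bit ints.
    static int[] asInts() {
        ByteBuffer buf = ByteBuffer.wrap(bits());
        return new int[] { buf.getInt(), buf.getInt() };
    }

    // Interpreted as one 64-bit double.
    static double asDouble() {
        return ByteBuffer.wrap(bits()).getDouble();
    }

    // Interpreted as four 16-bit chars.
    static char[] asChars() {
        ByteBuffer buf = ByteBuffer.wrap(bits());
        return new char[] { buf.getChar(), buf.getChar(),
                            buf.getChar(), buf.getChar() };
    }

    // Interpreted as a String of 8 ASCII characters.
    static String asAscii() {
        return new String(bits(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asInts()));
        System.out.println(asDouble());
        System.out.println(asChars().length + " chars");
        System.out.println(asAscii());
    }
}
```

Each method returns a perfectly valid value, and nothing in the bytes themselves says which interpretation the writer intended.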
