Binary Files with C++

R.A. Ford
Department of Math. and Computer Science
Mount Allison University
Sackville, NB

Introduction

Using streams for file processing is certainly possible in C++, but most C++ textbooks do not include any information regarding the full functionality of streams. This document has been written to give those with a background in C++ and data structures a comprehensive description of the C++ stream library. The document is based on the GNU CPP library documentation which at times is not easy to read, especially without examples.

The insertion and extraction operators (i.e. << and >>) are meant to be used by programs for writing to and reading from text files; it is assumed that the programmer is familiar with the differences between these two file formats.

In reality there are dozens of extensions to ordinary text streams that have little documentation. An additional section will be added to this document at a later time.

Basics of File I/O

Accessing a binary file from a C++ program (without using the old C functions) requires first attaching a stream variable to the file. The usual stream classes ofstream (output file stream) and ifstream (input file stream) are still the types of streams to use. An additional type called fstream is provided which allows for files that can be both written to and read from, if this is a desirable property (in the design of database type programs, this is often the case).

Before any operation can take place on a file, it of course must be opened, and when you are finished with the file, it should be closed to avoid loss of data.

Opening a Stream

The ifstream and ofstream classes each have a member function named open which is used to attach the stream to a physical filename and open the file for either reading or writing. The open member function also provides for a couple of optional arguments that are not often described. The most general prototype of this function is
  void open(const char *filename[, int mode][, int prot]);
The format that I've used indicates that the mode and prot arguments are optional.

The first argument is always the name of the file on the disk that the stream will be attached to. The const modifier is included so that a programmer can write the name of the file (inside double quotes) in the function call. The only tricky part about using the open member function is under DOS based systems (includes Windows) in which directories are separated by a \; recall that the backslash character has a special meaning in C++ strings, so it must be written as \\ inside a string.

The prot parameter is used to specify the protection (permission) of the file under multiuser operating systems such as Unix. It allows you to specify which users are allowed to look at this file. Under DOS/Windows, this parameter is never used. The mode parameter is usually left out when dealing with text files, but there are some very useful situations with binary files for which this parameter must be set. There are a number of options that can be given for this parameter. If you need to specify more than one of them, simply place a vertical bar between them.

Example of opening a binary file:
int main()
{
  ifstream infile;
  infile.open("hello.dat", ios::binary | ios::in);
// rest of program

}
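
The points above about combining mode flags and doubling backslashes can be sketched with the standard <fstream> header found in current compilers. This is only an illustration under assumptions: the helper name open_binary_input, the constant kDosPath and the file names are made up here, and the standard open takes just a file name and a mode, so the prot argument is not used.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: attach a stream to a file for binary reading.
// Returns true when the file was opened successfully.
bool open_binary_input(std::ifstream& in, const std::string& path)
{
    // Flags are combined with a vertical bar.
    in.open(path.c_str(), std::ios::binary | std::ios::in);
    return in.is_open();
}

// Under DOS/Windows each backslash in a string literal must be doubled,
// because a single backslash starts an escape sequence.
// This literal spells the 17 characters C:\data\hello.dat
const char* const kDosPath = "C:\\data\\hello.dat";
```

A caller would declare an ifstream, pass it to open_binary_input together with a file name, and carry on only if the function returns true.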

Writing to a Binary File

I mentioned earlier that << is used to write data to a text file. If you had a variable x that contained the value 354 and you used the statement outfile << x; this would cause the character 3, the character 5, and the character 4 to be written (in ASCII form) to the file. This is not binary form, which would only require 16 bits (assuming a 16-bit integer). The ofstream class provides a member function named write that allows for information to be written in binary form to the stream. The prototype of the write function is
  ostream& write(void *buffer, streamsize n);
This function causes n bytes to be written from the memory location given by the buffer to the disk and moves the file pointer ahead n bytes.

The parameter types require a little bit of explanation. Even though the return type is ostream&, the return value is usually ignored by most programmers. The buffer pointer is of type void *; this allows the address of any type of variable to be used as the first parameter. You should not be writing functions with void parameters yourself, as this is a very tricky part of programming. The type streamsize is simply a positive integer.

It is rare that you will know exactly how many bytes a particular variable is. To obtain this information, C++ provides an operator (it looks like a function) named sizeof that takes exactly one parameter and returns the size of the parameter in terms of bytes required for storage.

Below is an example of using sizeof to obtain the size of a variable and writing the contents of a variable to disk. Notice the use of a structure rather than a class; you should not use this method for writing classes to binary files! See the section entitled Writing Classes to Binary Files for a description of how this should be done.

struct Person
{
  char name[50];
  int age;
  char phone[24];
};

int main()
{
  Person me = {"Robert", 28, "364-2534"};
  Person book[30];
  int x = 123;
  double fx = 34.54;
  ofstream outfile;
  outfile.open("junk.dat", ios::binary | ios::out);
  outfile.write(&x, sizeof(int)); // sizeof can take a type
  outfile.write(&fx, sizeof(fx)); // or it can take a variable name
  outfile.write(&me, sizeof(me));
  outfile.write(book, 30*sizeof(Person));
  outfile.close();
}

Reading from a Binary File

Reading data from a binary file is just like writing it except that the function is now called read instead of write. When reading data from a file there are a couple of new things to watch out for:

File Pointer

     
Whenever data is read from or written to a file, the data is put to or taken from a location inside the file described by the file pointer. In a sequential access file, information is always read from start to end and each time n bytes are read or written, the file pointer is moved n bytes ahead. In a random access file, we are allowed to move the file pointer to different locations to read data at various places within a file. Think of a database full of store items. When an item is scanned at the checkout, the barcode is used to look up a description and price of the item. If the file were sequential access, we would have to start searching at the beginning of the file, which is probably slower than we would like. This is not a course in file processing, but it suffices to say that if we could move the file pointer directly to the record containing the data, we would have to read from the file just once.

The tellp() member function has a prototype of the form

streampos tellp();

This function accepts no parameters, but returns the location, given in bytes from the beginning of the file, where the file pointer is currently sitting. The next read or write will take place from this location. Input streams such as ifstream provide the equivalent member function tellg().

The seekp() member function has a prototype of the form

void seekp(streampos location, int relative);

This causes the file pointer to move to another location in the file. The location specifies the number of bytes that will be used to determine the new position and the relative parameter indicates whether this is an absolute or a relative positioning request. Input streams such as ifstream provide the equivalent member function seekg(). Possible values for relative are:
 

  1. ios::beg This indicates that the location is the number of bytes from the beginning of the file.
  2. ios::cur This indicates that the location is the number of bytes from the current file pointer location. This allows for a relative positioning of the file pointer.
  3. ios::end This indicates that the location is the number of bytes from the end of the file.

We consider an example that uses both obtaining and setting the file pointer location:

int main()
{
  int x;
  streampos pos;
  ifstream infile;
  infile.open("silly.dat", ios::binary | ios::in);
  // an ifstream uses seekg/tellg, the input counterparts of seekp/tellp
  infile.seekg(243, ios::beg); // move 243 bytes into the file
  infile.read(&x, sizeof(x));
  pos = infile.tellg();
  cout << "The file pointer is now at location " << pos << endl;
  infile.seekg(0, ios::end); // seek to the end of the file
  infile.seekg(-10, ios::cur); // back up 10 bytes
  infile.close();
}
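
The store-item idea can be sketched as reading the n-th fixed-size record directly instead of scanning from the start. This uses the standard <fstream> spellings, where an input stream is positioned with seekg and queried with tellg. The Item structure, the function names and the file layout are all assumptions made for illustration.

```cpp
#include <fstream>

struct Item
{
    int   code;
    float price;
};

// Write count items one after another, starting at the beginning of the file.
bool write_items(const char* filename, const Item* items, int count)
{
    std::ofstream out(filename, std::ios::binary | std::ios::out);
    out.write(reinterpret_cast<const char*>(items),
              static_cast<std::streamsize>(count * sizeof(Item)));
    out.close();
    return !out.fail();
}

// Jump straight to record number index (counting from 0) and read it.
bool read_item(const char* filename, int index, Item& result)
{
    std::ifstream in(filename, std::ios::binary | std::ios::in);
    std::streamoff offset = static_cast<std::streamoff>(index) *
                            static_cast<std::streamoff>(sizeof(Item));
    in.seekg(offset, std::ios::beg);
    in.read(reinterpret_cast<char*>(&result), sizeof(Item));
    return !in.fail();
}

// Report the file size in bytes by seeking to the end and asking tellg.
std::streamoff file_size(const char* filename)
{
    std::ifstream in(filename, std::ios::binary | std::ios::in);
    in.seekg(0, std::ios::end);
    return in.tellg();
}
```

Because every record has the same size, the byte offset of record index is simply index times sizeof(Item), so a single seek followed by a single read is enough.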
 

Writing Classes to Binary Files

The easiest way to store records in files is to use a struct. If you are keeping track of records in memory structures using classes, then saving these classes to disk takes a little extra work. You cannot simply use the write member function and give the address of the object as the buffer. The reason for this is the presence of member functions. It would not make sense to save the member functions; for classes with virtual member functions, each object also holds hidden memory locations, and loading an object from disk with an old memory location could cause your program to crash. It is possible to write objects to disk but it requires that the object have a member function associated with it.

My usual approach is to insert a pair of member functions for reading and writing in each class (the example below calls them load and save). These functions should take a file stream as a parameter as the stream to save themselves to. Your program should then open the stream and call the member function with the appropriate stream. The member function should then go through each data field of the object, writing them out in a particular order. The read member function must retrieve the information from the disk in exactly the same order.

The example for this section is a little involved, so I've eliminated the non-file member functions.
#include <iostream.h>
#include <stdlib.h>
#include <fstream.h>

class Student
{
  private:
    int number;
    char name[50];
    float gpa;
  public:
    Student(int n, const char *s, float g);
    void save(ofstream& of);
    void load(ifstream& inf);
};

int main()
{
  Student me(11321, "Myself", 4.3);
  ofstream myfile;
  myfile.open("silly.dat", ios::binary | ios::out);
  me.save(myfile);
  myfile.close();
  return(0);
}

void Student::save(ofstream& of)
{
  of.write(&number, sizeof(number));
  of.write(name, sizeof(name));
  of.write(&gpa, sizeof(gpa));
}

void Student::load(ifstream& inf)
{
  inf.read(&number, sizeof(number));
  inf.read(name, sizeof(name));
  inf.read(&gpa, sizeof(gpa));
}
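
For readers using a compiler that follows the C++ standard, the same field-by-field idea can be sketched as below. This is not the listing above but a variant under assumptions: the constructor body and the three accessor functions are filled in here only so the sketch is complete, the casts are needed because standard write and read take char pointers, and save and load accept any ostream or istream rather than only file streams.

```cpp
#include <cstring>
#include <fstream>

class Student
{
  private:
    int number;
    char name[50];
    float gpa;
  public:
    Student(int n, const char* s, float g) : number(n), gpa(g)
    {
        // Copy at most 49 characters and always terminate the string.
        std::strncpy(name, s, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
    }
    int getNumber() const { return number; }
    const char* getName() const { return name; }
    float getGpa() const { return gpa; }

    // Write each data field in a fixed order.
    void save(std::ostream& os) const
    {
        os.write(reinterpret_cast<const char*>(&number), sizeof(number));
        os.write(name, sizeof(name));
        os.write(reinterpret_cast<const char*>(&gpa), sizeof(gpa));
    }

    // Read the fields back in exactly the same order.
    void load(std::istream& is)
    {
        is.read(reinterpret_cast<char*>(&number), sizeof(number));
        is.read(name, sizeof(name));
        is.read(reinterpret_cast<char*>(&gpa), sizeof(gpa));
    }
};
```

A program would open an ofstream in binary mode and call save on it, then later open an ifstream on the same file and call load on a second Student, checking fail() on the stream afterwards.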
 
 

What Went Wrong?

In this section, I will point out a couple of methods of determining if a file operation was successful and, if not, a couple of methods of determining roughly what went wrong. After every disk operation, a well written program will call the member function fail() to see if the operation completed successfully. It is up to the programmer to determine what should occur when a file operation goes bad. Essentially there are three possibilities:
 


An unfortunate situation arises when dealing with errors: they are generally physical things, which makes them operating system dependent. Next, I will list the ANSI (the standard) approach to dealing with errors and the DOS approach to dealing with errors. The ANSI approach is much more general and therefore the error messages will not be precise, but the ANSI approach will work no matter which C++ compiler you use. The DOS error handling eliminates some of the confusion about what happened but obviously is only good on DOS machines that support the library (Turbo C++, Borland C++, and GNU G++ support this library). To make things a little uglier, there appears to be no error handling built into streams other than the fail() function. To combat errors we have to rely on some existing C functions, which are no problem to use from C++ since C++ is simply an extension of C.
 

ANSI Errors

ANSI C supports a global variable (oh no, a global variable!) named errno which can be accessed by including errno.h. When errors occur the variable is set to a standard error code. There are too many error codes to bother listing in this document. Usually the best way to discover all the error codes is to look at the manual page or on-line help, searching on the keyword errno. The include file does define a set of constants that can be used to determine the type of error that occurred. For example, error code 22 indicates that the file you just tried to open did not exist. A better way to say 22 is to use the constant ENOENT, because the number behind each constant can differ from one system to another. There is a function in stdio.h named perror that takes one string as a parameter. When this function is called, the string is displayed on the screen followed by a colon and then by a message that describes the value in errno. This can be handy if you do not want to write error handlers and just want the program to halt. Below is a simple program that reads a filename from the user, opens the file and reports that the drive was not ready, that the file did not exist, or the standard error message.

int main()
{
    ifstream data;
    char filename[50];
    cout << "file to open> ";
    cin.getline(filename, 50);
    data.open(filename);
    if (data.fail())
    {
        switch (errno)
        {
            case EACCES:
                // this is set if the drive is not ready in DOS
                cout << "Drive not ready or permission denied" << endl;
                break;
            case ENOENT:
                cout << "Could not find this file" << endl;
                break;
            default:
                perror("opening info file");
        }
        exit(EXIT_FAILURE);
        // a real program would then loop back and ask the user to try again.
    }
    ...
}
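
The same errno check can be sketched as a small reusable function for a compiler that follows the C++ standard. The name describe_open_error and its messages are made up for illustration. One assumption deserves emphasis: the C++ standard does not promise that a failed open sets errno, although common implementations do, so the default branch may occasionally report an unhelpful message.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: try to open a file for binary input and describe
// the outcome. Returns an empty string when the open succeeded.
std::string describe_open_error(const char* filename)
{
    errno = 0;  // clear any stale error code first
    std::ifstream data(filename, std::ios::binary | std::ios::in);
    if (!data.fail())
        return "";

    switch (errno)
    {
        case EACCES:
            return "Permission denied";
        case ENOENT:
            return "Could not find this file";
        default:
            // strerror gives the same text that perror would print.
            return std::string("Open failed: ") + std::strerror(errno);
    }
}
```

Returning the message instead of printing it lets the caller decide whether to halt, as in the program above, or to loop back and ask the user for another file name.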

DOS Extended Errors

If you look at the errors given in the ANSI list you will notice that not many of them are really geared towards DOS; i.e. you don't know for sure if a sector was bad on a disk or the drive door was left open. This is because the ANSI standard was more or less defined on UNIX systems, where these types of errors are never seen by the users. Most DOS based compilers provide a couple of functions for accessing the DOS extended error, which usually provides a much more accurate description of the error.