Binary Files
In a sense, all files are "binary" in that they are just a collection of bytes stored in an operating system
construct called a file. However, when we talk about binary files, we are really referring to the way VB
opens and processes the file.
The other file types (sequential and random) have a definite structure, and there are mechanisms
built into the language to read and write these files based on that structure. For example, the Input #
statement reads a sequential comma-delimited file field-by-field, the Line Input statement reads a
sequential file line by line, etc.
On the other hand, it is necessary to process a file in binary mode when that file does not have a
simple line-based or record-based structure. For example, an Excel "xls" file contains a series of
complex data structures to manage worksheets, formulas, charts, etc. If you really wanted to process
an "xls" file at a very low level, you could open the file in binary mode and move to certain byte
locations within the file to access data contained in the various internal data structures.
Fortunately, in the case of Excel, Microsoft provides us with the Excel object model, which makes it a
relatively simple matter to process xls files in VB applications. But the concept should be clear: to
process a file that does not contain simple line-oriented or record-oriented data, the binary mode
needs to be used and you must traverse or parse through the file to get at the data that you need.
We have seen partial syntax for the Open statement in the first topic on sequential files. The full
syntax for the Open statement, taken from MSDN, is:

    Open pathname For mode [Access access] [lock] As [#]filenumber [Len=reclength]

Part        Description
pathname    Required. String expression that specifies a file name; may include
            directory or folder, and drive.
mode        Required. Keyword specifying the file mode: Append, Binary, Input,
            Output, or Random. If unspecified, the file is opened for Random access.
access      Optional. Keyword specifying the operations permitted on the open file:
            Read, Write, or Read Write.
lock        Optional. Keyword specifying the operations restricted on the open file
            by other processes: Shared, Lock Read, Lock Write, and Lock Read Write.
filenumber  Required. A valid file number in the range 1 to 511, inclusive. Use
            the FreeFile function to obtain the next available file number.
reclength   Optional. Number less than or equal to 32,767 (bytes). For files opened
            for random access, this value is the record length. For sequential
            files, this value is the number of characters buffered.
Remarks
You must open a file before any I/O operation can be performed on it. Open allocates a buffer for I/O
to the file and determines the mode of access to use with the buffer.
If the file specified by pathname doesn't exist, it is created when a file is opened
for Append, Binary, Output, or Random modes.
If the file is already opened by another process and the specified type of access is not allowed,
the Open operation fails and an error occurs.
Important: In Binary, Input, and Random modes, you can open a file using a different file number
without first closing the file. In Append and Output modes, you must close a file before opening it
with a different file number.
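The remarks above can be sketched as a small helper that obtains a file number with FreeFile and traps the error raised when the Open fails. The function name TryOpenBinary and its parameters are hypothetical, chosen only for illustration, and the Lock Write clause is just one possible choice of lock.

```vb
' Hypothetical helper: tries to open strPath in binary mode for reading.
' Returns True and sets intFile on success; returns False if Open raises an error
' (for example, because another process holds an incompatible lock).
Public Function TryOpenBinary(ByVal strPath As String, ByRef intFile As Integer) As Boolean
    On Error GoTo OpenFailed

    intFile = FreeFile                      ' next available file number
    Open strPath For Binary Access Read Lock Write As #intFile

    TryOpenBinary = True
    Exit Function

OpenFailed:
    intFile = 0                             ' no file is open
    TryOpenBinary = False
End Function
```

A caller would test the return value and, on success, Close #intFile when finished.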
Given the information above, we would not use the optional Len clause when opening a file in binary
mode, as it does not apply. In the sample programs to follow, the optional lock entry is not used
either.
Thus, in the sample programs to follow, the following syntax will be used to open a binary file for
input:

    Open pathname For Binary Access Read As #filenumber

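As a minimal sketch of that form in use, the routine below opens a file for binary input, reports its size, and closes it. The procedure name ShowFileSize and the variable names are assumptions for illustration. Because Binary mode can create a file that does not exist (see the Remarks above), the sketch checks for the file with Dir$ first.

```vb
' Hypothetical example: open a file for binary input and print its length.
Public Sub ShowFileSize(ByVal strPath As String)
    Dim intMyFile As Integer

    ' Avoid creating an empty file by accident: make sure the file exists first.
    If Len(strPath) = 0 Then Exit Sub
    If Len(Dir$(strPath)) = 0 Then
        Debug.Print "File not found: "; strPath
        Exit Sub
    End If

    intMyFile = FreeFile
    Open strPath For Binary Access Read As #intMyFile
    Debug.Print "Size in bytes: "; LOF(intMyFile)   ' LOF returns the file length
    Close #intMyFile
End Sub
```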
The Get statement is used to read data from a file opened in binary mode. The syntax, as it applies to
binary files, is:

    Get [#]filenumber, [byte position], varname

Byte position is the byte position within the file at which the reading begins. The byte position is
"one-based", meaning the first byte position in the file is 1, the second position is 2, and so on. You
can omit this entry, in which case reading begins at the byte immediately following the last byte
processed by the previous Get or Put statement. If you omit the byte position entry, you must still
include the delimiting commas in the Get statement, for example:

    Get #intMyFile, , strData

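To contrast the two forms, here is a short sketch that reads one piece of data at an explicit byte position and the next piece with the position omitted. The procedure name ReadTwoFields and the 4-byte field sizes are assumptions, and the file is assumed to be at least 8 bytes long. The buffers are pre-sized with String$ so that each Get reads exactly 4 bytes.

```vb
' Hypothetical example: explicit versus omitted byte position in Get.
' Assumes the file at strPath exists and holds at least 8 bytes.
Public Sub ReadTwoFields(ByVal strPath As String)
    Dim intMyFile As Integer
    Dim strFirst As String
    Dim strSecond As String

    intMyFile = FreeFile
    Open strPath For Binary Access Read As #intMyFile

    strFirst = String$(4, " ")          ' 4-character buffer
    strSecond = String$(4, " ")         ' 4-character buffer

    Get #intMyFile, 1, strFirst         ' explicit position: bytes 1 to 4
    Get #intMyFile, , strSecond         ' position omitted: continues at byte 5

    Close #intMyFile
    Debug.Print strFirst; "|"; strSecond
End Sub
```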
Varname is a string variable into which the data will be read. This string variable is often referred to as
a "buffer" when processing binary files. It is important to note that the length, or size, of this string
variable determines how many bytes of data from the file will be read. Thus, it is necessary to set the
length of the string variable prior to issuing the Get statement. This is commonly done by using the
String$ function to pad the string variable with a number of blank spaces equal to the number of bytes
you want to read at a given time.
For example, the following statement pads the string variable strData with 10,000 blank spaces:

    strData = String$(10000, " ")

Now that VB "knows" how big "strData" is, the following Get statement will read the first (or next)
10,000 bytes from file number "intMyFile" and overlay strData with that file data:

    Get #intMyFile, , strData

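Reading a file in fixed-size pieces can be sketched as a loop that repeats this pair of statements until no bytes remain. The function name CountBytesInChunks and the constant CHUNK_SIZE are hypothetical; the sketch uses the LOF function, which returns the length of an open file in bytes, to know when to stop and to shrink the final buffer.

```vb
' Hypothetical example: read a file in chunks of up to 10,000 bytes.
' Returns the total number of bytes read.
Public Function CountBytesInChunks(ByVal strPath As String) As Long
    Const CHUNK_SIZE As Long = 10000
    Dim intMyFile As Integer
    Dim strData As String
    Dim lngRemaining As Long
    Dim lngTotal As Long

    intMyFile = FreeFile
    Open strPath For Binary Access Read As #intMyFile
    lngRemaining = LOF(intMyFile)               ' bytes still to be read

    Do While lngRemaining > 0
        ' Size the buffer: a full chunk, or whatever is left for the last read.
        If lngRemaining < CHUNK_SIZE Then
            strData = String$(lngRemaining, " ")
        Else
            strData = String$(CHUNK_SIZE, " ")
        End If

        Get #intMyFile, , strData               ' reads Len(strData) bytes
        ' ... process strData here ...

        lngTotal = lngTotal + Len(strData)
        lngRemaining = lngRemaining - Len(strData)
    Loop

    Close #intMyFile
    CountBytesInChunks = lngTotal
End Function
```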
Because a variable-length VB string can hold approximately 2 billion characters (memory permitting), it
would not be unreasonable in most cases to read in the whole file in "one shot", as opposed to reading
it in "chunks" as described above. To do this, you can set the length of the "buffer" string variable to
the size of the file by using the LOF (length of file) function as the first argument of the String$
function. The LOF function takes the filenumber of the file to be processed as its argument, and returns
the length of the file in bytes. Thus, the following statement will fill the variable "strData" with a
number of blank spaces equal to the size of the file:

    strData = String$(LOF(intMyFile), " ")

Then, when the subsequent Get statement is executed, the entire contents of the file will be stored in
strData:

    Get #intMyFile, , strData

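Put together, the "one shot" approach might look like the following function. The name ReadWholeFile is hypothetical, and the sketch assumes the file exists and fits comfortably in memory.

```vb
' Hypothetical example: return the entire contents of a file as a string.
Public Function ReadWholeFile(ByVal strPath As String) As String
    Dim intMyFile As Integer
    Dim strData As String

    intMyFile = FreeFile
    Open strPath For Binary Access Read As #intMyFile

    If LOF(intMyFile) > 0 Then
        strData = String$(LOF(intMyFile), " ")  ' buffer as large as the file
        Get #intMyFile, , strData               ' fill the buffer from byte 1
    End If

    Close #intMyFile
    ReadWholeFile = strData
End Function
```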
The Input function (not to be confused with the Input # or Line Input statements) can be used as an
alternative to the Get statement. The syntax is:

    varname = Input(number, [#]filenumber)

where varname is the string variable into which the file data will be stored, number is the number of
characters to be read, and filenumber is a valid filenumber identifying the file from which you want to
read.
The following table contains examples that contrast the Get statement and Input function as ways of
reading data from a binary file:
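The contrast can also be sketched in code. The routine below reads the same file twice, once with Get and once with the Input function, and reports whether the two results match. The procedure name CompareGetAndInput is hypothetical, and the Seek statement is used here only to move back to byte 1 before the second read.

```vb
' Hypothetical example: read a whole file with Get, then again with Input.
Public Sub CompareGetAndInput(ByVal strPath As String)
    Dim intMyFile As Integer
    Dim strViaGet As String
    Dim strViaInput As String

    intMyFile = FreeFile
    Open strPath For Binary Access Read As #intMyFile

    ' Get: the buffer must be sized before the read.
    strViaGet = String$(LOF(intMyFile), " ")
    Get #intMyFile, 1, strViaGet

    ' Input: the number of characters is passed as an argument instead.
    Seek #intMyFile, 1                          ' return to the first byte
    strViaInput = Input(LOF(intMyFile), #intMyFile)

    Close #intMyFile
    Debug.Print "Same contents: "; (strViaGet = strViaInput)
End Sub
```

The practical difference is where the byte count lives: Get takes it from the current length of the buffer, while Input takes it as its first argument and returns a new string.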
The Put statement is used to write data to a file opened in binary mode. The syntax, as it applies to
binary files, is:

    Put [#]filenumber, [byte position], varname

Byte position is the byte position within the file at which the writing begins. The byte position is
"one-based", meaning the first byte position in the file is 1, the second position is 2, and so on. You
can omit this entry, in which case writing begins at the byte immediately following the last byte
processed by the previous Get or Put statement. If you omit the byte position entry, you must still
include the delimiting commas in the Put statement, for example:

    Put #intMyFile, , strData

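A brief sketch of both forms of Put follows. The procedure name WriteAndPatch and the sample text are assumptions for illustration. Since opening an existing file in Binary mode does not discard its current contents, the sketch deletes any old copy first so the result is predictable.

```vb
' Hypothetical example: sequential writes, then an overwrite at an explicit position.
Public Sub WriteAndPatch(ByVal strPath As String)
    Dim intMyFile As Integer
    Dim strData As String

    ' Start from a fresh file; Binary mode would otherwise keep the old bytes.
    If Len(Dir$(strPath)) > 0 Then Kill strPath

    intMyFile = FreeFile
    Open strPath For Binary Access Write As #intMyFile

    strData = "HELLO"
    Put #intMyFile, 1, strData      ' explicit position: bytes 1 to 5

    strData = "WORLD"
    Put #intMyFile, , strData       ' position omitted: bytes 6 to 10

    strData = "J"
    Put #intMyFile, 1, strData      ' overwrites byte 1 only

    Close #intMyFile                ' file now contains JELLOWORLD
End Sub
```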
Varname is a string variable from which the data will be written. This string variable is often referred to
as a "buffer" when processing binary files. It is important to note that the length, or size, of this string
variable determines how many bytes of data will be written to the file.
For example, the following statements cause 1 byte of data to be written to file number "intMyFile":
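One way to see single-byte writes in action is a routine that copies a file one byte at a time, using a one-character buffer for both Get and Put. This is only a sketch: the procedure name CopyByteByByte and its parameters are hypothetical, the source and destination are assumed to be different files, and the approach assumes a single-byte ANSI code page, since string buffers convert each byte to a character and back.

```vb
' Hypothetical example: copy strSource to strDest one byte at a time.
Public Sub CopyByteByByte(ByVal strSource As String, ByVal strDest As String)
    Dim intIn As Integer
    Dim intOut As Integer
    Dim strCharacter As String
    Dim lngPos As Long
    Dim lngSize As Long

    ' Remove any existing destination so no stale bytes are left behind.
    If Len(Dir$(strDest)) > 0 Then Kill strDest

    intIn = FreeFile
    Open strSource For Binary Access Read As #intIn
    intOut = FreeFile                       ' called after the first Open, so it differs
    Open strDest For Binary Access Write As #intOut

    lngSize = LOF(intIn)
    strCharacter = " "                      ' length 1, so each Get/Put moves 1 byte

    For lngPos = 1 To lngSize
        Get #intIn, lngPos, strCharacter    ' read the byte at lngPos
        Put #intOut, lngPos, strCharacter   ' write it at the same position
    Next lngPos

    Close #intIn
    Close #intOut
End Sub
```

Copying byte by byte is slow for large files; it is shown here because the one-character buffer makes the link between buffer length and bytes written easy to see.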