009 – Reading WAVE Files in SystemVerilog

In this post we will go over a SystemVerilog testbench that can read a WAVE file and use it as input when simulating our Audio Processor.

As our audio processing logic becomes more sophisticated, the need for a simulation environment to test our code before generating a bitstream becomes greater. In the digital design world there is a much stronger reliance on simulation than in software development. Simulating our design gives us a degree of controllability and observability that cannot be obtained otherwise, even with the evolving capabilities of integrated logic analyzers. Thus, we will create a SystemVerilog module that can read the audio data from a stereo 24-bit WAVE audio file and use it as input for our Audio Processor.

WAVE File Format

WAVE files can store uncompressed audio in pulse-code modulation (PCM) format. This representation of audio data is widely used by Digital Audio Workstations (DAWs) in the professional audio world.

Several descriptions of the WAVE file format are available online. I mainly used the one from soundfile++ as a reference; it is shown in the figures below.

The Canonical WAVE File Format. Source: soundfile.sapp.org

Detailed Description of the Fields in a WAVE File. Source: soundfile.sapp.org

Example of a 72-byte WAVE File. Source: soundfile.sapp.org

I used my DAW of choice, Cubase, to generate a 24-bit WAVE audio file that I could use as an input. While inspecting it with a hex viewer, I realized that there is an additional ‘JUNK’ section between the ‘WAVE’ and ‘fmt ’ identifiers, which is not described on the soundfile++ website. Take a look:

The ‘JUNK’ section seems to be optional, as I could confirm that other WAVE files on my PC don’t have it. Moreover, in WaveLab (an audio editor by Steinberg, the makers of Cubase) it is possible to create WAVE files without the JUNK section. Keep this in mind if you try to run the code from this post as-is: it expects the JUNK section to be present.

Another thing I noticed is that the WAVE file includes some metadata after the last audio sample. You can see this in the figure below, with the last audio sample highlighted in yellow, and the start of the end-of-file metadata highlighted in green.

Because we are only interested in the audio data, we will ignore those metadata sections. Just be aware that they might be there. Pro tip: when you are getting everything up and running for the first time, using a stereo file with a mono signal will help you identify the beginning and the end of the data section, because every value will appear twice.
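Since optional sections such as JUNK and the trailing metadata may or may not be present, it can help to list the chunks in a file before writing the reader. The following is a minimal sketch, not the approach used in the rest of this post. The module name wave_chunk_scan and the file name are placeholders, and it assumes your simulator implements $fseek as described in the SystemVerilog standard.

```systemverilog
module wave_chunk_scan;

    // Hypothetical file name; replace it with the path to your own file.
    string       file_name = "audio_in.wav";

    int          fd;
    int          status;
    logic [31:0] id;
    logic [31:0] size;

    initial begin
        fd = $fopen(file_name, "rb");
        if (fd == 0) begin
            $fatal(1, "Could not open %s", file_name);
        end

        // Skip the 12-byte RIFF header: 'RIFF', chunk size, 'WAVE'.
        status = $fseek(fd, 12, 0);

        // Every chunk starts with a 4-byte ID and a 4-byte little-endian size.
        // The loop ends when a full ID and size can no longer be read.
        while ($fread(id, fd) == 4 && $fread(size, fd) == 4) begin
            size = { << byte {size}};
            $display("Chunk '%s' (0x%h): %0d bytes", id, id, size);
            // Jump over the chunk body. RIFF pads odd-sized chunks with one byte.
            status = $fseek(fd, size + (size & 1), 1);
        end

        $fclose(fd);
        $finish;
    end

endmodule
```

Running this against your own file shows which chunks come before and after the audio data, so you know whether the JUNK handling in the testbench below applies to it.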

Reading a WAVE File in SystemVerilog

We begin our WAVE-file-reading testbench by declaring all the signals we will need to store the metadata that comes before the audio data. This is not always necessary if the audio samples are the only thing we are interested in (which is the case most of the time), but it is still good practice. Some fields do need to be stored however, as we will reuse them later. The signals we need for reading out the metadata are shown below.

logic           [31:0]  chunk_id;
logic           [31:0]  chunk_size;
logic           [31:0]  format;
logic           [31:0]  junk_chunk_id;
logic           [31:0]  junk_chunk_size;
logic           [7:0]   junk_chunk_data[];
logic           [31:0]  subchunk_1_id;
logic           [31:0]  subchunk_1_size;
logic           [15:0]  audio_format;
logic           [15:0]  num_channels;
logic           [31:0]  sample_rate;
logic           [31:0]  byte_rate;
logic           [15:0]  block_align;
logic           [15:0]  bits_per_sample;
logic           [31:0]  subchunk_2_id;
logic           [31:0]  subchunk_2_size;
logic signed    [23:0]  data_left, data_left_aux;
logic signed    [23:0]  data_right, data_right_aux;
logic           [7:0]   led;

We open the WAVE file for reading in binary mode and start reading out the fields described earlier. As an example, here’s what that looks like for the Chunk ID and the Chunk Size at the beginning of the file:

audio_in = $fopen("C:/rtlaudiolab/fpga_audio_processor/sim/led_meter/tc_01/audio_in.wav", "rb");
status = $fread(chunk_id, audio_in);
if (status != 0) begin
    $display ("Chunk ID: 0x%h. Expected: 0x52494646 ('RIFF')", chunk_id);
end
status = $fread(chunk_size, audio_in);
if (status != 0) begin
    chunk_size = { << byte {chunk_size}}; // Little-endian in the file: reverse the byte order
    $display ("Chunk Size: 0x%h bytes", chunk_size);
end
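The byte reversal is needed because $fread places the first byte it reads in the most significant bits of the variable, while the numeric fields of a WAVE file are stored least significant byte first. The sketch below illustrates this on constants, so it runs without a file. The module and function names (le_field_demo, le32, le16) are made up for this example and the byte values are only illustrative.

```systemverilog
module le_field_demo;

    // Reverses the byte order of a 32-bit value loaded in file order.
    function automatic logic [31:0] le32(input logic [31:0] raw);
        logic [31:0] value;
        value = { << byte {raw}};
        return value;
    endfunction

    // Same idea for the 16-bit fields (audio format, channels, and so on).
    function automatic logic [15:0] le16(input logic [15:0] raw);
        logic [15:0] value;
        value = { << byte {raw}};
        return value;
    endfunction

    initial begin
        // File bytes 24 08 00 00 arrive from $fread as 32'h2408_0000.
        $display("32-bit: 0x%h -> %0d", 32'h2408_0000, le32(32'h2408_0000));
        // File bytes 02 00 arrive from $fread as 16'h0200.
        $display("16-bit: 0x%h -> %0d", 16'h0200, le16(16'h0200));
        $finish;
    end

endmodule
```

The first line should report 2084 and the second 2. The ID fields (‘RIFF’, ‘WAVE’, ‘fmt ’, ‘data’) are character sequences rather than numbers, which is why they are compared without any reversal.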

Now we get to the most important part. We need to have stored at least the SubChunk 2 Size, the Number of Channels, and the Bits per Sample fields, as they are needed to calculate how many audio samples the file contains: the size of the data section in bits, divided by the number of channels times the bits per sample. Once we know how many samples are in the file, we can start reading them out. I’d suggest you read one sample for each period of the sample rate clock in your design (I’m using 44.1 kHz in this example). Loading an entire file’s worth of samples at the beginning of your simulation can make your simulator run significantly slower. This part of the testbench is shown below.
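As a quick sanity check of that arithmetic, here is a small sketch with assumed header values for a one-second stereo 24-bit file at 44.1 kHz. The numbers are illustrative and not taken from the file used in this post, and the module name is a placeholder.

```systemverilog
module sample_count_demo;

    // Assumed header values: 44100 frames * 2 channels * 3 bytes = 264600 bytes.
    localparam int unsigned SUBCHUNK_2_SIZE = 264_600;
    localparam int unsigned NUM_CHANNELS    = 2;
    localparam int unsigned BITS_PER_SAMPLE = 24;
    localparam int unsigned SAMPLE_RATE     = 44_100;

    int unsigned number_of_samples;
    real         sample_period_ns;

    initial begin
        // Same formula as in the testbench: data bits / bits per stereo frame.
        number_of_samples = (SUBCHUNK_2_SIZE * 8) / (NUM_CHANNELS * BITS_PER_SAMPLE);
        // Time between consecutive samples of one channel, in nanoseconds.
        sample_period_ns  = 1.0e9 / SAMPLE_RATE;
        $display("Samples per channel: %0d", number_of_samples);
        $display("Sample period: %.1f ns", sample_period_ns);
        $finish;
    end

endmodule
```

With these values the count comes out at 44100 samples per channel and the sample period at about 22675.7 ns, which is the order of magnitude of the delay used in the read loop.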

for (int i = 0; i < number_of_samples; i++) begin
    for (int j = 0; j < (bits_per_sample/8); j++) begin
        status = $fread(aux, audio_in);
        data_left_aux = {aux, data_left_aux[23:8]};
    end
    data_left_aux = data_left_aux * 5;     // Makes the waveform larger in the viewer, comment or remove for actual simulation
    for (int j = 0; j < (bits_per_sample/8); j++) begin
        status = $fread(aux, audio_in);
        data_right_aux = {aux, data_right_aux[23:8]};
    end
    data_right_aux = data_right_aux * 5;   // Makes the waveform larger in the viewer, comment or remove for actual simulation
    @(posedge clock);
    data_valid <= 1'b1;
    data_right <= data_right_aux;
    data_left <= data_left_aux;
    @(posedge clock);
    data_valid <= 1'b0;
    #22665ns;
end
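The inner loops rebuild each little-endian sample one byte at a time: every new byte enters at the top of the 24-bit register and the bytes read earlier move down by eight bits. The sketch below traces this for a single sample using a hard-coded byte array instead of a file. The module name and the byte values are chosen only for illustration.

```systemverilog
module sample_assembly_demo;

    logic signed [23:0] sample;
    // The value -2 as a 24-bit little-endian sample: least significant byte first.
    logic        [7:0]  file_bytes [3] = '{8'hFE, 8'hFF, 8'hFF};

    initial begin
        sample = '0;
        foreach (file_bytes[j]) begin
            // Newest byte goes into bits [23:16]; older bytes shift down.
            sample = {file_bytes[j], sample[23:8]};
            $display("After byte %0d (0x%h): 0x%h", j, file_bytes[j], sample);
        end
        $display("Signed value: %0d", sample);
        $finish;
    end

endmodule
```

After the third byte the register holds 0xfffffe, which reads as -2 in signed decimal. Note that this only lines up when the file uses three bytes per sample. With a 16-bit file the loop would run twice, leaving the sample in the upper 16 bits and a stale byte in the lower eight, so the signal widths would need to be adapted.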

Here’s our complete ‘wave_file_read’ module:

module wave_file_read();

    timeunit 1ns;
    timeprecision 1ps;

    logic clock;

    logic data_valid;
    int audio_in;
    int audio_out;

    int status;
    logic [7:0]    aux;
    int number_of_samples;

    logic           [31:0]  chunk_id;
    logic           [31:0]  chunk_size;
    logic           [31:0]  format;
    logic           [31:0]  junk_chunk_id;
    logic           [31:0]  junk_chunk_size;
    logic           [7:0]   junk_chunk_data[];
    logic           [31:0]  subchunk_1_id;
    logic           [31:0]  subchunk_1_size;
    logic           [15:0]  audio_format;
    logic           [15:0]  num_channels;
    logic           [31:0]  sample_rate;
    logic           [31:0]  byte_rate;
    logic           [15:0]  block_align;
    logic           [15:0]  bits_per_sample;
    logic           [31:0]  subchunk_2_id;
    logic           [31:0]  subchunk_2_size;
    logic signed    [23:0]  data_left, data_left_aux;
    logic signed    [23:0]  data_right, data_right_aux;

    initial begin
        clock = 1'b0;
        forever begin
            #5ns;
            clock = ~clock;
        end
    end

    initial begin
        $display("******************************");
        $display("******************************");
        $display("*      Simulation Start      *");
        $display("******************************");
        $display("******************************");
        data_valid <= 1'b0;
        data_left_aux <= 'b0;
        data_right_aux <= 'b0;
        audio_in = $fopen("C:/rtlaudiolab/fpga_audio_processor/sim/led_meter/tc_01/audio_in.wav", "rb");
        status = $fread(chunk_id, audio_in);
        if (status != 0) begin
            $display ("Chunk ID: 0x%h. Expected: 0x52494646 ('RIFF')", chunk_id);
        end
        status = $fread(chunk_size, audio_in);
        if (status != 0) begin
            chunk_size = { << byte {chunk_size}}; // Little-endian in the file: reverse the byte order
            $display ("Chunk Size: 0x%h bytes", chunk_size);
        end
        status = $fread(format, audio_in);
        if (status != 0) begin
            $display ("Format: 0x%h. Expected: 0x57415645 ('WAVE')", format);
        end
        status = $fread(junk_chunk_id, audio_in);
        if (status != 0) begin
            $display ("JUNK ID: 0x%h. Expected: 0x4a554e4b ('JUNK')", junk_chunk_id);
        end
        status = $fread(junk_chunk_size, audio_in);
        if (status != 0) begin
            junk_chunk_size = { << byte {junk_chunk_size}}; // Little-endian in the file: reverse the byte order
            $display ("Junk Chunk Size: %d bytes", junk_chunk_size);
        end
        // Junk Chunk Data array
        junk_chunk_data = new[junk_chunk_size];
        foreach (junk_chunk_data[i]) begin
            status = $fread(aux, audio_in);
            if (status != 0) begin
                junk_chunk_data[i] = aux;
            end
        end
        $display ("Junk data read");
        status = $fread(subchunk_1_id, audio_in);
        if (status != 0) begin
            $display ("Subchunk 1 ID read: 0x%h. Expected: 0x666d7420 ('fmt ')", subchunk_1_id);
        end
        status = $fread(subchunk_1_size, audio_in);
        subchunk_1_size = { << byte {subchunk_1_size}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Subchunk 1 Size: %d bytes. Expected: 16 (PCM)" , subchunk_1_size);
        end
        status = $fread(audio_format, audio_in);
        audio_format = { << byte {audio_format}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Audio Format: %d. Expected: 1 (other values indicate data compression)", audio_format);
        end
        status = $fread(num_channels, audio_in);
        num_channels = { << byte {num_channels}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Number of Channels: %d. Expected: 2 (stereo)", num_channels);
        end
        status = $fread(sample_rate, audio_in);
        sample_rate = { << byte {sample_rate}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Sample Rate: %d Hz", sample_rate);
        end
        status = $fread(byte_rate, audio_in);
        byte_rate = { << byte {byte_rate}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Byte Rate: %d Bps", byte_rate);
        end
        status = $fread(block_align, audio_in);
        block_align = { << byte {block_align}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Block Align: %d bytes", block_align);
        end
        status = $fread(bits_per_sample, audio_in);
        bits_per_sample = { << byte {bits_per_sample}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Bits per Sample: %d", bits_per_sample);
        end
        status = $fread(subchunk_2_id, audio_in);
        if (status != 0) begin
            $display ("Subchunk 2 ID: 0x%h. Expected: 0x64617461 ('data')", subchunk_2_id);
        end
        status = $fread(subchunk_2_size, audio_in);
        subchunk_2_size = { << byte {subchunk_2_size}}; // Little-endian in the file: reverse the byte order
        if (status != 0) begin
            $display ("Subchunk 2 Size: %d bytes", subchunk_2_size);
        end
        number_of_samples = (subchunk_2_size*8)/(num_channels*bits_per_sample);
        $display ("Calculated Number of Samples: %d", number_of_samples);
        $display ("Reading the audio data");
        #100ns;
        for (int i = 0; i < number_of_samples; i++) begin
            for (int j = 0; j < (bits_per_sample/8); j++) begin
                status = $fread(aux, audio_in);
                data_left_aux = {aux, data_left_aux[23:8]};
            end
            // data_left_aux = data_left_aux * 5;     // Makes the waveform larger in the viewer, comment or remove for actual simulation
            for (int j = 0; j < (bits_per_sample/8); j++) begin
                status = $fread(aux, audio_in);
                data_right_aux = {aux, data_right_aux[23:8]};
            end
            // data_right_aux = data_right_aux * 5;   // Makes the waveform larger in the viewer, comment or remove for actual simulation
            @(posedge clock);
            data_valid <= 1'b1;
            data_right <= data_right_aux;
            data_left <= data_left_aux;
            @(posedge clock);
            data_valid <= 1'b0;
            #22665ns;
        end
        $display ("Audio data read");
        $fclose(audio_in);
        $display("******************************");
        $display("******************************");
        $display("*       Simulation End       *");
        $display("******************************");
        $display("******************************");
        $stop();
    end

endmodule

Now we are ready to start our simulation. I’d suggest running just the WAVE-file-reading module with your desired input audio to make sure everything works as expected. It is helpful if you already know what the waveform should look like. Once the simulation is finished, we need to format the audio data signals so that they are displayed as typical audio waveforms in the simulator. This is shown in Figure 6 below for the Vivado Simulator.
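If your test file is a stereo file carrying a mono signal, as suggested earlier, the check can also be automated instead of relying only on the waveform viewer. The following is a minimal sketch of such a checker. The module name stereo_mono_check is hypothetical, and the port names are chosen to match the signals of the testbench above.

```systemverilog
module stereo_mono_check (
    input logic               clock,
    input logic               data_valid,
    input logic signed [23:0] data_left,
    input logic signed [23:0] data_right
);

    // Counts the samples seen so far, used only for reporting.
    int unsigned sample_count = 0;

    always @(posedge clock) begin
        if (data_valid) begin
            sample_count <= sample_count + 1;
            // For a mono signal in a stereo file both channels must be identical.
            // The !== operator also flags X or Z values as a mismatch.
            if (data_left !== data_right) begin
                $error("Sample %0d: left = %0d, right = %0d",
                       sample_count, data_left, data_right);
            end
        end
    end

endmodule
```

Because the port names match, the checker could be instantiated inside the testbench with implicit port connections, for example stereo_mono_check check_i (.*);. A mismatch on the very first samples usually points to a header that was parsed with the wrong layout, such as a file without the JUNK section.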

That’s it! You should now be able to see your audio data nicely drawn in your simulator’s waveform viewer. Remember to change the radix of the audio data signals to signed decimal so the waveform is displayed properly. If you used a mono signal, the left and right channels will show the same waveform. Mine look like this:

The next post will be the first of a series in which we implement a linear-to-dBFS converter using Vitis High Level Synthesis. See you then!

Cheers,

Isaac
