NN RGB FPGA Exercise
Bonn-Rhein-Sieg
Digital Design
NN_RGB_FPGA Exercises
Overview
GitHub
NN_RGB_FPGA
FPGA_plain corresponds to lecture 3
FPGA_generate corresponds to lecture 4
Task
Color detection should be improved
Red tail lights are detected as yellow
Green vegetation is detected as yellow
Light blue sky and dark blue signs are not distinguished
(Image: red tail light)
Approach
Modify training data
“False yellow”: add more red and light green samples
Label blue and yellow areas without black/white letters
Alternative Task
Train the network to detect green vegetation
Exercise: Octave Verification
Task
Octave uses the training image to visualize the training result
Use different image
Task
Octave uses floating-point accuracy for calculation of output image
Use fixed-point accuracy to match FPGA implementation
Exercise: VHDL Simulation with Testbench
Task
Perform VHDL simulation with a testbench
Template and explanations in video “FPGA FIR Filter: Verification with VHDL Testbench”
Output images of testbench use no compression
They can be used to compare different circuit implementations
Task
Extend to a self-checking testbench
Template and explanations in video “FPGA FIR Filter: Self-Checking Testbench”
Reference image for expected results
Bit-true Octave implementation
Compare different VHDL versions
Exercise: Increase Clock Frequency
Task
FPGA design has timing requirement for 720p video, i.e. 74.25 MHz
Modify design for higher throughput, e.g. 200 MHz
Investigate capability of design
Remote lab will not require this throughput
Approach
Set timing requirement in nn_rgb.sdc
create_clock … -period 13.47ns …: change the period to 5ns for 200 MHz
Check “Timing Analysis” in Quartus
Add pipeline stages in design
Increase effort of synthesis in Quartus: Compiler Settings “Performance”
Compare FPGA resources and power consumption in remote lab
Note: Additional pipeline stages require adjustment of “delay” for submodule control
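The period values used in the timing requirement follow directly from the clock frequency. A minimal sketch of the conversion, where the function name clock_period_ns is only an illustrative helper and not part of the project:

```python
def clock_period_ns(frequency_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / frequency_mhz


# The two frequencies named on the slide.
for f_mhz in (74.25, 200.0):
    print(f"{f_mhz} MHz -> {clock_period_ns(f_mhz):.2f} ns")
```

For 74.25 MHz this gives about 13.47 ns, and for 200 MHz it gives 5 ns, matching the values in nn_rgb.sdc.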
Sigmoid Function
Function of neuron
z is the sum of the input values multiplied by their weights, plus a bias
z = w1 * x1 + w2 * x2 + w3 * x3 + bias
Sigmoid: h = 1 / (1 + e^{-z})
(Plot: sigmoid curve over z from -4 to 4)
Implementation
Function table in ROM
Table covers input values between ±4
Input values outside this range are limited to ±4
Fixed-point implementation
Values have factor of 2^13 = 8K (8192)
Value ±4 corresponds to ±32K
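The scaling and the limitation can be tried out in software before looking at the VHDL. The following is a floating-point sketch only; the names SCALE, Z_LIMIT, to_fixed and sigmoid_limited are chosen here for illustration and do not come from the project.

```python
import math

SCALE = 2 ** 13   # fixed-point factor from the slide (8192)
Z_LIMIT = 4.0     # the function table covers -4 ... +4


def to_fixed(z: float) -> int:
    """Convert a real value to the fixed-point representation."""
    return round(z * SCALE)


def sigmoid_limited(z: float) -> float:
    """Sigmoid with the input limited to the range of the table."""
    z = max(-Z_LIMIT, min(Z_LIMIT, z))
    return 1.0 / (1.0 + math.exp(-z))


print(to_fixed(4.0), to_fixed(-4.0))                   # +/-4 in fixed point
print(sigmoid_limited(10.0) == sigmoid_limited(4.0))   # inputs beyond 4 saturate
```

With this scaling, ±4 maps to ±32768, and any input beyond the limit returns the same value as the limit itself.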
FPGA Implementation of Sigmoid Function
neuron.vhd in FPGA_plain
Sum is limited to ±4 and shifted to positive range
0 ≤ limit(sum+4) < 8
Factor of 8K
0 ≤ sumAdress < 64K
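The limit-and-shift step can be modelled with integers. This is a sketch of the arithmetic described above, not a transcription of neuron.vhd. In particular, the helper rom_address assumes that a smaller ROM is addressed with the upper bits of the 16-bit value, which is not stated on the slide.

```python
SCALE = 2 ** 13       # factor of 8K
LIMIT = 4 * SCALE     # 4 in fixed point: 32768


def sum_to_address(sum_fixed: int) -> int:
    """Limit the sum to [-4, 4) and shift it to the range 0 ... 65535."""
    limited = max(-LIMIT, min(LIMIT - 1, sum_fixed))
    return limited + LIMIT


def rom_address(sum_address: int, address_bits: int) -> int:
    """Keep the upper address_bits of the 16-bit value (assumption)."""
    return sum_address >> (16 - address_bits)


for s in (-100000, 0, 100000):
    a = sum_to_address(s)
    print(s, a, rom_address(a, 14), rom_address(a, 12))
```

A sum of 0 lands in the middle of the table (32768), and sums far outside ±4 end up at the first or last address.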
Definition of ROM Values
ROM implemented as IP (Intellectual Property)
Values defined in MIF (Memory Initialization File)
Definition of fixed-point values
Input values correspond to ±4 as discussed
Output values correspond to range 0 … 1 with 8 bit word width
Range of 0 … 255
Same accuracy as input values x1, x2, x3

sigmoid_14_bit.mif
    DEPTH = 16384;
    WIDTH = 8;
    ADDRESS_RADIX = DEC;
    DATA_RADIX = DEC;
    CONTENT
    BEGIN
    0 : 2;
    1 : 2;
    2 : 2;
    3 : 2;
    4 : 2;
    5 : 2;
    …
    8000 : 121;
    …

sigmoid_12_bit.mif
    DEPTH = 4096;
    WIDTH = 8;
    ADDRESS_RADIX = DEC;
    DATA_RADIX = DEC;
    CONTENT
    BEGIN
    0 : 2;
    1 : 2;
    2 : 2;
    …
    2000 : 121;
    …
Task
Implement ROMs with different input word width
Provided: 12, 13, 14, 16 Bit
Generate other word width, e.g. 8, 9, 10, 11 Bit
Compare FPGA resources, power consumption and quality of color detection in remote lab
Please check: ROMs smaller than the FPGA memory blocks do not save resources
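MIF files for other word widths can be generated with a short script. The sketch below follows the header layout of the provided files. The function name sigmoid_mif, the ±4 input mapping and the truncation to 0 … 255 are assumptions, so the entries are not expected to be identical to the provided files (those list 2 at address 0, which points to a different scaling or rounding).

```python
import math


def sigmoid_mif(address_bits: int, z_limit: float = 4.0) -> str:
    """Return the text of a MIF file with 2**address_bits sigmoid entries."""
    depth = 2 ** address_bits
    lines = [
        f"DEPTH = {depth};",
        "WIDTH = 8;",
        "ADDRESS_RADIX = DEC;",
        "DATA_RADIX = DEC;",
        "CONTENT",
        "BEGIN",
    ]
    for address in range(depth):
        # Map address 0 ... depth-1 to z in [-z_limit, +z_limit) (assumption).
        z = -z_limit + 2.0 * z_limit * address / depth
        # Scale 0 ... 1 to 0 ... 255 and truncate (assumption).
        value = min(255, int(255.0 / (1.0 + math.exp(-z))))
        lines.append(f"{address} : {value};")
    lines.append("END;")
    return "\n".join(lines) + "\n"


mif_text = sigmoid_mif(10)
print(mif_text.splitlines()[0])   # header line with the ROM depth
```

The returned text can be written to a file, for example a hypothetical sigmoid_10_bit.mif, and selected in the ROM IP settings.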
Task
Select word width of ROM with generic parameter
Select the ROM with the VHDL statement “if-generate”
Change the ROM to a case statement
Check functionality and resources for different word widths
Choose parameter at top-level and forward to neuron and sigmoid function
Exercise: Sigmoid Function (II)
Task
FPGA RAMs have two ports
All ROMs have identical content
Implement two sigmoid functions in one ROM
IP module “ROM: 2-PORT”
Approach
Design new submodule with two neurons
Exercise: Sigmoid Function (III)
Task
Color mapping for output processing does not require sigmoid function
FPGA_plain: check if value > 127
FPGA_generate: check if value > 127 and compare values for different colors (yellow, blue)
Approach
Output layer can use simplified function
output <= sumAdress(15 downto 8);
Can be selected by a generic parameter: output_layer (true/false)
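The simplification works because the upper 8 bits of sumAdress grow monotonically with the sum, so the threshold check and the comparison between colors keep their meaning. A small integer model, assuming the limit-and-shift scheme described for neuron.vhd; the function names are illustrative only:

```python
LIMIT = 4 * 2 ** 13   # 4 in fixed point: 32768


def sum_to_address(sum_fixed: int) -> int:
    """Limit the sum to [-4, 4) and shift it to the range 0 ... 65535."""
    limited = max(-LIMIT, min(LIMIT - 1, sum_fixed))
    return limited + LIMIT


def linear_output(sum_fixed: int) -> int:
    """Model of: output <= sumAdress(15 downto 8);"""
    return sum_to_address(sum_fixed) >> 8


for s in (-40000, -1, 0, 1, 40000):
    print(s, linear_output(s), linear_output(s) > 127)
```

In this model the check "value > 127" is true exactly when the sum is zero or positive, which is where the sigmoid reaches 0.5.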