Optimizing Image Analysis For Speed
General Rules
Rule #1: Do not compute what you do not need.
- Use an image resolution well fitted to the task. The higher the resolution, the slower the processing.
- Use the inRoi input of image processing filters to compute only the pixels that are needed in further processing steps.
- If several image processing operations occur in sequence in a confined region, it might be better to use CropImage first.
- Do not overuse images of types other than UInt8 (8-bit).
- Do not use multi-channel images when no color information is being processed.
- If some computations can be done only once, move them before the main program loop, or even to a separate program. A typical structure of the "Main" macrofilter implementing this advice consists of two macrofilters: the first one is responsible for once-only computations, and the second is a Task implementing the main program loop.
- Note that in the development environment, displaying data on the preview windows takes considerable time. Choose Program » Previews Update Mode » Disable Visualization to get performance closer to what you can expect in the runtime environment.
- In the runtime environment, use the VideoBox control for image display. It is highly optimized and can display hundreds of images per second.
- When using VideoBox controls, prefer the SizeMode setting Normal, especially if the displayed image is large. Also consider using DownsampleImage or ResizeImage.
- Prefer the Update Data Previews Once an Iteration option.
- Mind the Diagnostic Mode. Turn it off whenever you need to test speed.
- Pay attention to the information provided by the Statistics window. Before optimizing the program, make sure that you know what really needs optimizing.
Data flow programming allows you to create high-speed machine vision applications nearly as efficient as standard C++ programming, provided that high-level tools are used and image analysis is the main part of the application. For low-level programming tasks, however, such as using many simple filters to process large numbers of pixels, points or small blobs, any interpreted language will perform significantly slower than C++.
- Template Matching: Do not mark the entire object as the template region; mark only a small part having a unique shape.
- Template Matching: Prefer high pyramid levels, i.e. leave inMaxPyramidLevel set to Auto, or to a high value such as 4 to 6.
- Template Matching: Prefer inEdgePolarityMode set to a value other than Ignore, and inEdgeNoiseLevel set to Low.
- Template Matching: Set the inMinScore input as high as possible.
- Template Matching: If you process high-resolution images, consider setting inMinPyramidLevel to 1 or even 2.
- Template Matching: When creating template matching models, try to limit the range of angles with the inMinAngle and inMaxAngle inputs.
- Template Matching: Do not expect high speed when allowing rotations and scaling at the same time. Model creation can also take a long time, or even fail with an "out of memory" error.
- Template Matching: Consider limiting inSearchRegion. It can be set manually, but sometimes it also helps to use Region Analysis techniques before Template Matching.
- Template Matching: Decrease inEdgeCompleteness to achieve higher speed at the cost of lower reliability. This might be useful when the pyramid cannot be made higher due to loss of information.
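The benefit of high pyramid levels can be seen with a bit of arithmetic (a deliberately simplified model; the real matcher is more sophisticated than an exhaustive scan):

```python
# Each pyramid level halves both image dimensions, so a brute-force search
# at level L inspects roughly 4**L times fewer candidate positions.
width, height = 4096, 3072  # example high-resolution frame

def positions_at_level(level):
    # Number of candidate positions for an exhaustive scan at a pyramid level.
    return (width >> level) * (height >> level)

speedups = {level: positions_at_level(0) // positions_at_level(level)
            for level in range(5)}
# At level 4 the search space shrinks by a factor of about 256.
```

This is why leaving inMaxPyramidLevel at Auto, or setting it to 4 to 6, usually pays off as long as enough edge information survives the downsampling.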
- Do not use these filters in the main program loop: CreateEdgeModel1, CreateGrayModel, TrainOcr_MLP, TrainOcr_SVM.
- If you always transform images in the same way, consider filters from the Image Spatial Transforms Maps category instead of the ones from Image Spatial Transforms.
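The idea behind the Maps variants can be illustrated in NumPy (a sketch with illustrative names, not the Studio filters): the sampling map is computed once, and applying it to each frame is then a single cheap gather.

```python
import numpy as np

h, w = 480, 640

# Precompute the sampling maps ONCE (here: a horizontal mirror with
# nearest-neighbour sampling). In a real application the map could
# encode rotation, undistortion, polar unwrapping, etc.
map_y, map_x = np.meshgrid(np.arange(h), np.arange(w)[::-1], indexing="ij")

def remap(image):
    # Applying a precomputed map is one gather per frame; no per-frame
    # coordinate math is repeated.
    return image[map_y, map_x]

frame = np.random.default_rng(1).integers(0, 256, size=(h, w), dtype=np.uint8)
mirrored = remap(frame)
```

Computing the transform geometry inside the main loop, by contrast, repeats the same coordinate calculations on every iteration.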
- Do not use image local transforms with arbitrarily shaped kernels: DilateImage_AnyKernel, ErodeImage_AnyKernel, SmoothImage_Mean_AnyKernel. Consider the alternatives without the "_AnyKernel" suffix.
- SmoothImage_Median can be particularly slow. Use Gaussian or Mean smoothing instead, if possible.
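The speed difference has an algorithmic root. Mean (box) smoothing admits a running-sum trick, so its cost per pixel does not grow with kernel size; median filtering has no comparably cheap shortcut. A 1D NumPy sketch of the running-sum mean:

```python
import numpy as np

def box_mean_1d(a, k):
    # Running mean via cumulative sums: O(n) total work regardless of
    # kernel size k. Median filtering cannot be reduced this way, which
    # is one reason SmoothImage_Median can be much slower.
    c = np.cumsum(np.insert(a.astype(np.float64), 0, 0.0))
    return (c[k:] - c[:-k]) / k

smoothed = box_mean_1d(np.arange(10, dtype=np.float64), 5)
```

The 2D case follows because mean smoothing is separable: filter the rows first, then the columns.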
These issues result both from the simplified data-flow programming model and from the architecture of modern computers and operating systems. Some, but not all, of them can be solved with the use of Aurora Vision Library (see: When to use Aurora Vision Library?). There is, however, an idiom that can also be useful with Aurora Vision Studio. It is called "Application Warm-Up" and consists in performing one or a few iterations on recorded test images before the application switches to the operational stage. This can be achieved with the following "GrabImage" variant macrofilter:
An example "GrabImage" macrofilter designed for application warm-up.
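The warm-up idiom can be sketched in plain Python (hypothetical names such as grab_image and WARMUP_ITERATIONS are illustrative, not part of the Studio API): the first few iterations run on recorded frames so that buffer allocation and other first-run initialization happen before live operation.

```python
WARMUP_ITERATIONS = 3  # how many iterations to run on recorded images

def grab_image(iteration, recorded_frames, grab_live):
    # During warm-up, return a recorded test frame; afterwards, switch to
    # the live image source. The processing pipeline downstream is
    # identical in both phases, so all initialization happens early.
    if iteration < WARMUP_ITERATIONS:
        return recorded_frames[iteration % len(recorded_frames)]
    return grab_live()
```

The key point is that the downstream pipeline cannot tell the phases apart, so every lazily initialized component is exercised before the first real inspection.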
There is also a way to pre-allocate image memory before the first iteration of the program starts. For this purpose, use the InspectImageMemoryPools filter at the end of the program and, after the program is executed, copy its outPoolSizes value to the input of a ChargeImageMemoryPools filter executed at the beginning. In some cases this will improve the performance of the first iteration.
There is an overhead of about 0.004 ms on each filter execution in Studio. That value may seem very small, but if we consider an application which analyzes 50 blobs in each iteration and executes 20 filters for each blob, it may sum up to a total of 4 ms. This may no longer be negligible. If this is only a small part of a bigger application, then User Filters might be the right solution. If, however, this is how the entire application works, then the library should be used instead.
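The arithmetic behind this estimate is simple:

```python
overhead_ms = 0.004         # approximate per-filter execution overhead in Studio
blobs_per_iteration = 50    # blobs analyzed in each iteration
filters_per_blob = 20       # filters executed for each blob

# Total per-iteration cost spent purely on filter invocation overhead:
total_ms = overhead_ms * blobs_per_iteration * filters_per_blob
# about 4 ms per iteration before any actual pixel processing happens
```
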
Case 2: Memory re-use for big images
Each filter in Aurora Vision Studio keeps its output data on its output ports. Consecutive filters do not re-use this memory; instead, they create new data. This is very convenient for effective development of algorithms, as the user can see all intermediate results. However, if the application performs complex processing of very big images (e.g. from 10-megapixel or line-scan cameras), then the issue of memory re-use may become critical. Aurora Vision Library may then be useful, because only at the level of C++ programming can the user have full control over the memory buffers.
Aurora Vision Library also makes it possible to perform in-place data processing, i.e. modifying the input data directly instead of creating new objects. Many simple image processing operations can be performed in this way. In particular, the Image Drawing functions and image transformations in small regions of interest may get a significant performance boost.
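As an analogy (shown in NumPy, not the Aurora Vision Library API), in-place operations write into an existing buffer instead of allocating a new one on every call:

```python
import numpy as np

frame = np.full((1080, 1920), 100, dtype=np.uint8)  # ~2 MB grayscale frame

# Out-of-place: allocates a brand-new 2 MB buffer on every call.
brightened_copy = frame + 50

# In-place: writes the result directly into frame's existing buffer,
# so no allocation or copy happens per iteration.
np.add(frame, 50, out=frame)
# frame now holds 150 everywhere
```

For big images processed at high frame rates, avoiding that per-iteration allocation is exactly the kind of control C++ programming with the Library provides.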
Filters in Aurora Vision Studio get initialized in the first iteration. This is, for example, when image memory buffers are allocated, because before the first image is acquired the filters do not know how much memory they will need. Sometimes, however, the application can be optimized for specific conditions, and it is important that the first iteration is not any slower. At the level of C++ programming this can be achieved with preallocated memory buffers and with separate initialization of some filters (especially 1D Edge Detection and Shape Fitting filters, as well as image acquisition and I/O interfaces). See also: Application Warm-Up.
https://docs.adaptive-vision.com/current/studio/programming_tips/OptimizingImageAnalysis.html