Readme PDF
Abstract:
A deep learning model is optimized during training by specifying optimizers and loss
functions. The purpose of TensorRT is to further optimize the trained model for inference without
compromising the model's predictive performance. This process is carried out before the
deployment phase by converting a network built with common frameworks such as TensorFlow or
PyTorch into a TensorRT network. The conversion maps the layers/operations native to the
framework to the layers/operations of a TensorRT network. TensorRT currently
does not support all layers; we propose to create a custom plugin for one such
unsupported layer, thus optimizing the TensorRT model to the fullest extent.
TensorRT allows developers to import, calibrate, generate, and deploy optimized
networks. Deep learning networks can be imported directly from Caffe, or from other
frameworks via the UFF or ONNX formats. Users can also run custom layers through
TensorRT using the Plugin interface. The GraphSurgeon feature provides the capability
to map TensorFlow nodes to the corresponding custom layers in TensorRT, thus enabling
many TensorFlow networks to run with TensorRT.

The Builder interface allows the creation of an optimized engine from a network definition. It
allows the application to specify the maximum batch size and workspace size, the lowest
acceptable level of precision, timing iteration counts for autotuning, and an interface for
quantizing networks to run in 8-bit precision. It is possible to build multiple
engines from the same network definition, but with different builder configurations.
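The builder workflow described above can be sketched with TensorRT's Python API. This is a minimal illustration, not code from the project: the function name build_engine, the ONNX path argument, and the workspace/precision defaults are assumptions for the example, and running it requires an NVIDIA TensorRT installation.

```python
def build_engine(onnx_path, max_workspace_bytes=1 << 30, use_fp16=False):
    """Parse an ONNX model and build a serialized TensorRT engine (sketch)."""
    import tensorrt as trt  # requires NVIDIA TensorRT (tested API shape: 8.x)

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Network definition: explicit-batch mode, populated by the ONNX parser.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(f"ONNX parse failed: {parser.get_error(0)}")

    # Builder configuration: workspace size and acceptable precision,
    # as described in the text. Different configs over the same network
    # definition yield different engines.
    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE,
                                 max_workspace_bytes)
    if use_fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    return builder.build_serialized_network(network, config)
```

For INT8 (8-bit) precision, the application would additionally set `trt.BuilderFlag.INT8` and supply a calibrator on the config; that step is omitted here for brevity.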
References: