OpenVX Neural Network Extension Feedback

The OpenVX™ working group has released an extension that enables Convolutional Neural Network topologies to be represented as OpenVX graphs and mixed with traditional vision functions.
Neural network technology has made explosive recent progress on pattern-matching tasks in computer vision such as object recognition, face identification, image search, and image-to-text, and is also playing a key part in enabling driver assistance and autonomous driving systems. Convolutional Neural Networks (CNNs) are computationally expensive, so many companies are actively developing mobile and embedded processor architectures to accelerate neural network-based inferencing at high speed and low power. As a result of such rapid progress, the market for embedded neural network processing is in danger of fragmenting, creating barriers for developers seeking to configure and accelerate inferencing engines across multiple platforms.
The OpenVX Neural Network extension specifies an architecture for executing CNN-based inference in OpenVX graphs. The extension defines a multi-dimensional tensor object data structure which can be used to connect neural network layers, represented as OpenVX nodes, to create flexible CNN topologies. The OpenVX neural network layer types include convolution, pooling, fully-connected, normalization, soft-max, and activation, with nine different activation functions. The extension enables neural network inferencing to be mixed with traditional vision processing operations in the same OpenVX graph.
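As a rough illustration, here is a minimal sketch of what building a small piece of such a graph could look like with the provisional extension's C API: a convolution layer feeding a ReLU activation, connected by tensor objects. Function names and the parameter struct follow the provisional specification as published; exact signatures, struct fields, and supported data types may differ across implementations, and the dimensions below are arbitrary.

[code]
#include <VX/vx.h>
#include <VX/vx_khr_nn.h>   /* provisional neural network extension header */

/* Minimal, untested sketch: one convolution layer followed by a ReLU
 * activation, connected by vx_tensor objects inside an OpenVX graph. */
void build_conv_relu(vx_context context)
{
    vx_graph graph = vxCreateGraph(context);

    /* Tensors are described by a dimension array; Q7.8 fixed point here. */
    vx_size in_dims[3]  = {224, 224, 3};
    vx_size w_dims[4]   = {3, 3, 3, 64};      /* width, height, #IFM, #OFM */
    vx_size b_dims[1]   = {64};
    vx_size out_dims[3] = {224, 224, 64};

    vx_tensor input    = vxCreateTensor(context, 3, in_dims,  VX_TYPE_INT16, 8);
    vx_tensor weights  = vxCreateTensor(context, 4, w_dims,   VX_TYPE_INT16, 8);
    vx_tensor biases   = vxCreateTensor(context, 1, b_dims,   VX_TYPE_INT16, 8);
    vx_tensor conv_out = vxCreateTensor(context, 3, out_dims, VX_TYPE_INT16, 8);
    vx_tensor relu_out = vxCreateTensor(context, 3, out_dims, VX_TYPE_INT16, 8);

    /* Convolution parameters (other fields of the struct omitted for brevity). */
    vx_nn_convolution_params_t p = {0};
    p.padding_x       = 1;
    p.padding_y       = 1;
    p.overflow_policy = VX_CONVERT_POLICY_SATURATE;
    p.rounding_policy = VX_ROUND_POLICY_TO_NEAREST_EVEN;

    /* Each layer is an OpenVX node; tensors connect the layers. */
    vxConvolutionLayer(graph, input, weights, biases, &p, sizeof(p), conv_out);
    vxActivationLayer(graph, conv_out, VX_NN_ACTIVATION_RELU, 0, 0, relu_out);

    vxVerifyGraph(graph);
    vxProcessGraph(graph);
}
[/code]

Traditional vision nodes (for example color conversion or scaling of the input image) can be added to the same graph ahead of the input tensor.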
Today, the OpenVX working group has also released an Import/Export extension that complements the Neural Network extension by defining an API to import and export OpenVX objects such as traditional computer vision nodes, the data objects of a complete or partial graph, and CNN objects including network weights and biases or complete networks.
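As a sketch of how that might look from application code, based on my reading of the provisional export-and-import extension (the header name, function signature, and enum below are assumptions and may differ in the final specification):

[code]
#include <VX/vx.h>
#include <VX/vx_khr_ix.h>   /* export-and-import extension header (name assumed) */

/* Rough sketch: export a verified graph, including its weight/bias tensors,
 * to a memory blob that can later be re-imported on a deployment device. */
vx_status export_graph_blob(vx_context context, vx_graph graph,
                            const vx_uint8 **blob, vx_size *blob_size)
{
    vx_reference refs[1] = { (vx_reference)graph };
    vx_enum      uses[1] = { VX_IX_USE_EXPORT_VALUES };  /* export data values too (enum assumed) */
    return vxExportObjectsToMemory(context, 1, refs, uses, blob, blob_size);
}
[/code]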
The high-level abstraction of OpenVX enables implementers to accelerate a dataflow graph of vision functions across a diverse array of hardware and software acceleration platforms. The inclusion of neural network inferencing functionality in OpenVX enables the same portable, processor-independent expression of functionality, with significant freedom and flexibility in how that inferencing is actually accelerated. The OpenVX Neural Network extension is released in provisional form so that developers and implementers can provide feedback before finalization; industry feedback is welcomed at the OpenVX Forums.
Khronos is coordinating its neural network activities and expects that NNEF files will be able to represent all aspects of an OpenVX neural network graph, and that OpenVX will enable import of network topologies via NNEF files through the Import/Export extension once the NNEF format definition is complete.
Convolutional Neural Network topologies can be represented as OpenVX graphs

[ul]
[li]Layers are represented as OpenVX nodes[/li][li]Layers are connected by multi-dimensional tensor objects[/li][li]Layers include convolution, normalization, pooling, fully-connected, soft-max[/li][li]Also activation layers, with nine different activation functions[/li][li]CNN nodes can be mixed with traditional vision nodes[/li][/ul]
Import/Export Extension

[ul]
[li]Efficient handling of network weights/biases or complete networks[/li][/ul]
The specification is provisional

[ul]
[li]Seeking feedback from the deep learning community.[/li][/ul]

Hi, I’m an author of tiny-dnn, one of a number of deep learning frameworks.
I believe this standardized API will be of great benefit to the vision industry :smiley:

BTW, I think most modern vision models (Inception, ResNet, etc.) use at least one of the following three operations: batch normalization, dropout, and depth-concat.
It would be great if OpenVX added these operations to the standard.

Thank you for the feedback.
The problem is that currently we support only inference, and in inference batch normalization reduces to a multiplication and a subtraction, which are supported.
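For anyone curious, here is an illustrative scalar sketch of that folding (plain C, not extension API): the learned statistics are constants at inference time, so the whole normalization collapses to a per-channel multiply and subtract.

[code]
#include <math.h>

/* Illustrative only: fold the (constant) batch-norm statistics into a single
 * per-channel scale and offset, applied as one multiply and one subtract.
 * Assumes gamma != 0. */
static float folded_batchnorm(float x, float gamma, float beta,
                              float mean, float var, float eps)
{
    float scale  = gamma / sqrtf(var + eps);
    float offset = mean - beta / scale;      /* y = scale * (x - offset)        */
    return scale * (x - offset);             /* == gamma*(x-mean)/sqrt(var+eps)+beta */
}
[/code]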
Concat is implemented using the tensor view feature, which makes concat a static operation that does not require copying (see the sketch below).
Dropout is a training-only feature.
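For illustration, a rough sketch of depth-concat via tensor views: each branch writes directly into its own channel slice of one output tensor, so no copy node is needed. The vxCreateTensorFromView signature below follows OpenVX 1.2; the provisional extension's version may differ slightly, and the dimensions are arbitrary.

[code]
/* Sketch only: concatenate a 64-channel and a 32-channel branch along depth.
 * Assumes dimension order {width, height, channels}. */
vx_size out_dims[3] = {56, 56, 96};                 /* 64 + 32 channels */
vx_tensor concat = vxCreateTensor(context, 3, out_dims, VX_TYPE_INT16, 8);

vx_size a_start[3] = {0, 0, 0},  a_end[3] = {56, 56, 64};
vx_size b_start[3] = {0, 0, 64}, b_end[3] = {56, 56, 96};

vx_tensor branch_a = vxCreateTensorFromView(concat, 3, a_start, a_end);
vx_tensor branch_b = vxCreateTensorFromView(concat, 3, b_start, b_end);

/* branch_a and branch_b are then used as the output tensors of the two
 * branch sub-graphs; "concat" already holds the concatenated result. */
[/code]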

BTW, we implemented ResNet and Inception (GoogLeNet) with OpenVX.
Intel has a tool that converts Caffe models and weights to OpenVX, and those networks are covered.
See the Intel® Distribution of OpenVINO™ Toolkit.