QuickNet is a novel deep network architecture that is faster and more efficient than current “fast” deep network architectures such as SqueezeNet. It also uses fewer parameters than previous architectures. This is made possible by two pivotal modifications to the reference “Darknet” architecture:
a- Using depth-wise separable convolutions.
b- Using parametric rectified linear units.
The creator of QuickNet showed that, at test time, parametric rectified linear units are computationally equivalent to leaky rectified linear units. He also observed that separable convolutions can be viewed as a compressed Inception network. These two observations inspired QuickNet and led to a faster, more efficient architecture. QuickNet has at least four main advantages:
1- A smaller model, which runs more efficiently on a system with constrained memory resources.
2- A very fast network, which runs more efficiently on a system with constrained processing power.
3- An accuracy of 95.7% on the CIFAR-10 dataset. This outperforms all previously reported results except one. In addition, QuickNet represents a group of orthogonal approaches that can be combined, rather than used individually, to yield even better accuracy.
4- Orthogonality to previous network compression approaches, which allows further speed gains to be realized by combining them.
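The two modifications listed earlier can be illustrated with a little arithmetic. The following is a minimal sketch, not code from the QuickNet paper: the layer shape (3 x 3 kernel, 128 input channels, 256 output channels) and the slope value 0.25 are hypothetical and chosen only for illustration. It counts the weights of a standard convolution against a depth-wise separable one, and shows that a parametric rectified linear unit with a frozen slope computes the same thing as a leaky one.

```python
def leaky_relu(x, alpha):
    """Leaky ReLU with a fixed, hand-chosen negative slope."""
    return x if x > 0 else alpha * x


def prelu(x, learned_alpha):
    """PReLU: same formula, but the slope was learned during training."""
    return x if x > 0 else learned_alpha * x


def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """One k x k depth-wise filter per input channel, then a 1 x 1 point-wise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
K, C_IN, C_OUT = 3, 128, 256
standard = standard_conv_params(K, C_IN, C_OUT)
separable = separable_conv_params(K, C_IN, C_OUT)
print(standard, separable, round(standard / separable, 2))

# Once training is done the learned slope is a constant, so both
# activations do the same work per element.
frozen_alpha = 0.25  # hypothetical learned value
samples = [-2.0, -0.5, 0.0, 1.5]
print([prelu(x, frozen_alpha) == leaky_relu(x, frozen_alpha) for x in samples])
```

For a fixed kernel size, the saving from the separable form grows with the number of output channels, because the expensive spatial filtering is done once per input channel rather than once per input-output pair.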
The QuickNet Deep Network Architecture:
It is now widely agreed that memory access, not computation, is the primary power consumer in a deep neural network. Previous research has shown that, on a commercial 40 nm process, a DRAM access is about 200 times more energy intensive than a 32-bit multiply. Accordingly, a lightweight, energy-efficient neural network architecture must minimize its number of parameters. Parameters are not the only cost, however: intermediate layer activation maps also need a large amount of memory, even at inference time. The proposed QuickNet architecture reduces the number of parameters to 3.56 million (about 14.24 megabytes at 32 bits per parameter), and this can be reduced further to less than 1 megabyte using Deep Compression. The author observed that a Deep Compression rate of 15x is more reasonable to expect than the reported rate of 50x.
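The size figures above follow directly from the parameter count. As a quick check, here is a small sketch of the arithmetic; the function name `model_size_mb` is made up for this example, and a megabyte is taken as 10**6 bytes.

```python
BYTES_PER_PARAM = 4          # 32-bit weights
PARAMS = 3.56e6              # parameter count quoted above


def model_size_mb(params, bytes_per_param=BYTES_PER_PARAM, compression=1.0):
    """Model size in megabytes (10**6 bytes) after a given compression ratio."""
    return params * bytes_per_param / compression / 1e6


dense = model_size_mb(PARAMS)
at_15x = model_size_mb(PARAMS, compression=15)
at_50x = model_size_mb(PARAMS, compression=50)
print(f"{dense:.2f} MB dense, {at_15x:.2f} MB at 15x, {at_50x:.2f} MB at 50x")
```

Even at the more conservative 15x rate, the compressed model stays under 1 megabyte.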
QuickNet focuses on minimizing parameters because the Deep Compression pipeline already offers a simple yet effective way to reduce model size by 50x, combining pruning, quantization and Huffman coding without sacrificing accuracy. This combination is sufficient, by itself, to significantly reduce the size of the model. In contrast, work on taming the computational complexity of deep networks without sacrificing accuracy has lagged significantly behind. Accordingly, QuickNet aims to offer a network with low computational complexity that still maintains high accuracy.
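To make the three stages of that pipeline concrete, here is a toy sketch on synthetic weights. It is not the Deep Compression implementation: the weights are random numbers, the threshold and number of levels are arbitrary, uniform bins stand in for the k-means weight sharing used in Deep Compression, and the bit count ignores the codebook and sparse-index overhead. All function names are made up for this example.

```python
import heapq
import random
from collections import Counter


def prune(weights, threshold):
    """Magnitude pruning: zero out weights whose absolute value is small."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize(weights, levels, w_max):
    """Map each weight to one of `levels` evenly spaced bins in [-w_max, w_max].

    Assumption: uniform bins are swapped in for k-means clustering
    to keep the sketch short.
    """
    step = 2 * w_max / (levels - 1)
    return [round((max(-w_max, min(w_max, w)) + w_max) / step) for w in weights]


def huffman_bits(symbols):
    """Total bits needed to Huffman-code the symbol sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return len(symbols)  # a single symbol still needs one bit each
    heap = list(counts.values())
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each merge adds one bit to every symbol beneath it
        heapq.heappush(heap, merged)
    return total


rng = random.Random(0)  # fixed seed so the toy run is repeatable
weights = [rng.gauss(0.0, 0.1) for _ in range(10_000)]  # synthetic weights
codes = quantize(prune(weights, threshold=0.1), levels=15, w_max=0.4)
dense_bits = 32 * len(weights)
coded_bits = huffman_bits(codes)
print(f"compression on this toy layer: {dense_bits / coded_bits:.1f}x")
```

Pruning makes one symbol (zero) very frequent, quantization shrinks the alphabet, and Huffman coding then spends few bits on the frequent symbols, which is why the stages work well together.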
Results of Testing of QuickNet:
The author of the paper stated that he evaluated QuickNet on CIFAR-10 with data augmentation, using the Keras framework. Dropout with a rate of 0.5 was used along with batch normalization. Cross-entropy was used as the loss function. A validation/test set of 6,000 randomly chosen images, which were never presented to the network during training, was used to measure accuracy.
In testing, QuickNet reached an accuracy of 95.7%, which corresponds to an error rate of 4.3%. This is higher than the accuracy of all current deep network architectures except Fractional Max Pooling. The QuickNet approach is orthogonal to Fractional Max Pooling and the two can be combined, yet the author of the paper chose to forgo it for the following reasons:
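The hold-out split and the loss described above can be sketched in a few lines. This is an illustration only, not the author's training script: it assumes the 6,000 held-out images are drawn from all 60,000 CIFAR-10 images, the seed is arbitrary, and the helper names `split_indices` and `cross_entropy` are made up for this example.

```python
import math
import random

NUM_IMAGES = 60_000   # CIFAR-10 contains 60,000 images in total
HOLDOUT = 6_000       # size of the validation/test set described above


def split_indices(num_images, holdout, seed=0):
    """Randomly reserve `holdout` indices that training never sees."""
    rng = random.Random(seed)
    held_out = set(rng.sample(range(num_images), holdout))
    train = [i for i in range(num_images) if i not in held_out]
    return train, sorted(held_out)


def cross_entropy(probs, true_class):
    """Cross-entropy loss for one example given predicted class probabilities."""
    return -math.log(probs[true_class])


train_idx, test_idx = split_indices(NUM_IMAGES, HOLDOUT)
print(len(train_idx), len(test_idx))
print(cross_entropy([0.1, 0.7, 0.2], true_class=1))
```

Keeping the held-out indices fixed by a seed ensures that the same images stay unseen across training runs, so the reported accuracy is measured on data the network has never been trained on.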
1- To preserve computational tractability, as all current implementations of Fractional Max Pooling take much longer than conventional pooling methods (up to 15x longer in Lasagne) without producing any optimization benefit.
2- To preserve memory tractability, because memory access consumes the most energy, and this consumption can grow even further when Fractional Max Pooling is used.
One of the most interesting features of QuickNet is its fast convergence (80% accuracy within 70 epochs). This could open the door to updating and training networks locally in the near future.
Even better results may be obtainable with further hyperparameter search and with experimentation using adaptive optimizers such as RMSProp or Adam.