Convolutional Neural Networks for No-Reference Image Quality Assessment
L. Kang, P. Ye, Y. Li and D. Doermann, IEEE Conference on Computer Vision and Pattern Recognition, pp. 1733-1740, 2014.
As the title suggests, the goal of this paper is to predict the quality of an image with a CNN, without a reference image and without prior knowledge of the distortion type. The focus is on distortions that arise from image degradation, such as blur, compression and additive noise. A CNN has the advantage that it takes raw images as input and folds feature learning into the training process. Feature learning is done on small image patches rather than on the entire image.
The network has 5 layers: one convolutional layer, a pooling layer with max and min pooling, 2 fully connected layers and an output node. A simple local contrast normalization is first applied to the gray-scale input image. This normalization alleviates the saturation problem and makes the network robust to illumination and contrast variation. The normalized image patches are convolved with filters to generate feature maps (FMs), which are then reduced by max pooling and min pooling. Adding min pooling boosts performance by 2%. Linear neurons are used in the convolution and pooling layers instead of ReLUs, because a ReLU lets only non-negative signals pass through and would therefore block negative filter responses. Dropout and parameter tuning are used to obtain better performance. Performance is evaluated with the Linear Correlation Coefficient (LCC), which measures the linear dependence between 2 quantities, and the Spearman Rank Order Correlation Coefficient (SROCC), which measures how well one quantity can be described as a monotonic function of another.
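To make the normalization and pooling steps concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the window radius, the stabilizing constant c, and the choice to reduce each feature map to a single max and a single min value are assumptions, and the function names are hypothetical.

```python
import math

def local_contrast_normalize(img, radius=1, c=1.0):
    """Subtract the local mean and divide by the local standard deviation.

    img is a list of rows of gray-scale intensities. radius and c are
    illustrative choices, not values taken from the summary above.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Collect the neighbourhood, clipped at the image border.
            window = [img[y][x]
                      for y in range(max(0, i - radius), min(h, i + radius + 1))
                      for x in range(max(0, j - radius), min(w, j + radius + 1))]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            out[i][j] = (img[i][j] - mean) / (math.sqrt(var) + c)
    return out

def min_max_pool(feature_map):
    """Reduce one feature map to a (max, min) pair (assumed global pooling)."""
    flat = [v for row in feature_map for v in row]
    return max(flat), min(flat)

# A linear (non-rectified) feature map keeps its negative responses,
# so the min value carries information that a ReLU would have zeroed out.
fm = [[1.0, -2.0], [3.0, 0.0]]
print(min_max_pool(fm))
```

Because the local mean is subtracted, adding a constant brightness offset to the whole patch leaves the normalized output unchanged, which is one way to see the robustness to illumination described above.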
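The difference between the two evaluation measures can be illustrated with a short pure-Python sketch. The helper names (pearson, ranks, spearman) and the toy scores are hypothetical and chosen only for illustration. SROCC is computed here as the Pearson correlation of the rank-transformed values, with tied values sharing an average rank.

```python
import math

def pearson(x, y):
    """Linear Correlation Coefficient (LCC) between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result

def spearman(x, y):
    """SROCC: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data: predictions that are a monotonic but non-linear function
# of the subjective scores.
subjective = [1, 2, 3, 4, 5]
predicted = [1, 4, 9, 16, 25]
print("LCC  :", pearson(subjective, predicted))
print("SROCC:", spearman(subjective, predicted))
```

On this toy data the ordering is preserved perfectly, so SROCC is 1 up to floating-point rounding, while LCC falls below 1 because the relationship is not a straight line.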