Image Classification on Android using OpenCV

This tutorial uses the popular computer vision library OpenCV for building an image classifier that runs on Android devices.

The overall process looks like this. First, a color histogram of the hue channel in the HSV color space is extracted from each image in the dataset. Next, an artificial neural network (ANN) is built and trained on these features, then saved for later use in an Android app. An Android Studio project is then created, which imports the Android release of OpenCV. Once OpenCV is imported successfully, the saved ANN is loaded to make predictions.

This tutorial assumes 2 things from the reader. The first one is being able to use OpenCV from a Java IDE (the IDE used in this tutorial is NetBeans). Within this Java project, the ANN is built and trained. The second assumption is the ability to use OpenCV within Android Studio.

If neither NetBeans nor Android Studio is prepared to use OpenCV, do not worry. You can download a ready-to-use Android Studio project with OpenCV already imported:

The step-by-step guide for importing OpenCV within Android Studio is discussed in a previous tutorial titled “A Guide to Prepare OpenCV for Android”.

For NetBeans, there’s also a NetBeans project available on GitHub where OpenCV is already linked. You can simply download the project and import it within NetBeans.

After preparing OpenCV for use in NetBeans and Android Studio, you’re ready to start this tutorial. Note that the GitHub project for this tutorial is available here, where you can find both the NetBeans project and the Android Studio project.

The sections covered in this tutorial are as follows:

Let’s get started.

Preparing the Image Dataset

The dataset used within this tutorial is the Fruits360 dataset. It’s a large dataset with 60 classes, where each class has around 490 training images and around 165 testing images.

Once you’ve downloaded the dataset, you’ll find that it’s split into 2 main folders—Training and Test.

For simplicity, we’ll use 4 classes from the dataset: Apple Braeburn, Lemon Meyer, Mango, and Raspberry. The samples from each class are given in the next figure. By looking at these samples, we can easily infer that their colors are different. This is why we’ll use the color histogram later as a feature.

Each of these 4 classes has 2 folders, one for its training data and one for its test data. For simplicity, we can copy the folders of each class into a new directory. This way, we’ll just see these 4 classes and nothing more.

In the root directory of the NetBeans project, I created a folder named Dataset. This folder has 2 subfolders named Train and Test. The folders associated with the 4 classes are copied within these 2 folders. As a result, the contents of the Dataset folder will appear as shown in the next figure. Note that the names of the classes folders are changed to 1 word starting with a lowercase letter. For example, “Apple Braeburn” becomes apple.

Before extracting features within each image, we have to loop through the images within the dataset. The next Java class named TrainANN does this job. The first line of the main() method returns the current directory of the project in the currentDirectory String variable.

Within the second line of the main() method, the OpenCV native library (a DLL file on Windows) is loaded so that the OpenCV classes can be used.

The names of the classes are saved into an array named classesNames. A for loop goes through this array. The loop variable named classIdx refers to the class ID. It starts from 0. Because there are 4 classes, it will end at 3. As a result, the ID of the first class labeled apple is 0, the ID for the second class labeled lemon is 1, and so on.

The current class name is returned into the currClassName variable. By concatenating the project directory saved in the currentDirectory variable with the class name, we can return the directory of each class into the currClassDir variable. The current class directory is printed.

Note that this path points to the training data, which is why it goes through the Train folder. After training the ANN, we’ll come to the testing step, in which the word Train is replaced by Test.

package opencvapp;

import java.io.File;
import org.opencv.core.Core;

public class TrainANN {
    public static void main(String [] args){
        String currentDirectory = System.getProperty("user.dir");
        System.load(currentDirectory + "\\OpenCVDLL\\x64\\" + Core.NATIVE_LIBRARY_NAME + ".dll");

        String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
        
        for(int classIdx=0; classIdx<classesNames.length; classIdx++){
            String currClassName = classesNames[classIdx];
            String currClassDir = currentDirectory + "\\Dataset\\Train\\" + currClassName + "\\";
            System.out.println("Current Class Directory : " + currClassDir);
            
            File folder = new File(currClassDir);
            File[] listOfFiles = folder.listFiles();
            
            // Counter for the number of image being processed within the class.
            int imgCount = 0;

            for (File listOfFile : listOfFiles) {
                // Make sure we are working with a file and its extension is JPG
                if (listOfFile.isFile() && (currClassDir + listOfFile.getName()).endsWith(".jpg")) {
                    System.out.println("\nClass Index " + classIdx + "(" + currClassName + ")" + ", Image Index " + imgCount + "(" + listOfFile.getName() + ")");
                    
                    String currImgPath = currClassDir + listOfFile.getName();
                    System.out.println(currImgPath);

                    imgCount++;
                }
            }
        }
    }    
}

A File object is created by passing the class directory to its constructor. Using the listFiles() method, the entries within this directory are returned as an array of File objects named listOfFiles.

A variable named imgCount is initialized to 0 and refers to the current image number being processed within the current class. Its value is reset to 0 for each class being worked on.

A for loop then fetches each entry within the current class directory into the listOfFile variable. There’s no guarantee that listOfFile always refers to a file; it might be a directory. This is why an if statement inside the for loop uses the isFile() method to check that the entry is a file. The same if statement also checks whether the file is a JPG image, because the image files of the dataset have the .jpg extension.

Within the if statement, some details about the image are printed, including its class index, class name, image index within the class, and image file name. Also the absolute path of the image file is returned into the currImgPath variable by concatenating the current class directory within the currClassDir variable, with the image file name returned using the getName() method.
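The file filtering described above can also be sketched without any OpenCV dependency. The class below is only an illustration: the name JpgLister and its helper method are hypothetical and not part of the tutorial’s project. It assumes the same Dataset/Train/apple layout, builds the path with File.separator instead of hard-coded backslashes, and also accepts an uppercase .JPG extension, which goes slightly beyond the original check.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class JpgLister {
    // Returns the absolute paths of the regular files in dir whose names end in
    // .jpg (case-insensitive). Entries are sorted by pathname before filtering.
    static List<String> listJpgPaths(File dir) {
        List<String> paths = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is missing or not a directory
        if (entries == null) {
            return paths;
        }
        Arrays.sort(entries);
        for (File entry : entries) {
            String name = entry.getName().toLowerCase(Locale.ROOT);
            // Skip sub-directories and anything that is not a JPG file.
            if (entry.isFile() && name.endsWith(".jpg")) {
                paths.add(entry.getAbsolutePath());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Assumed layout: <project root>/Dataset/Train/apple
        File appleDir = new File(System.getProperty("user.dir"),
                "Dataset" + File.separator + "Train" + File.separator + "apple");
        for (String path : listJpgPaths(appleDir)) {
            System.out.println(path);
        }
    }
}
```

Returning an empty list when the directory is missing avoids the NullPointerException that a for loop over a null listFiles() result would otherwise throw.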

After retrieving the image absolute path, we can pass it to the OpenCV methods that extract the features used in this tutorial. This will be discussed later, but for now we can focus on just the currImgPath variable.

Finally, the imgCount variable is incremented by 1 for each image. The next figure shows a snippet of what’s printed in the console.

At the current time, the classes used from the dataset are prepared, and we’re able to loop through them and return the absolute path of each image. The next step is to extract the features that will be used for training the ANN.

Image Feature Extraction (Color Histogram)

The classes used from the dataset have different colors. This is why a good feature will be the color histogram. The modified code of the TrainANN class listed previously that’s used to extract the dataset’s histogram is listed below.

At the beginning of the main() method, 2 OpenCV Mat variables named datasetHist and datasetLabels are created. The datasetHist Mat holds the extracted features (color histogram) of all training images within the dataset. This is the input of the ANN.

The other Mat named datasetLabels holds the class labels of all training images within the dataset. This is the output of the ANN. By preparing the datasetHist and datasetLabels Mat variables, the training data of the ANN will be ready.

The necessary code for extracting the color histogram is added right after returning the absolute path of the images in the currImgPath variable. At first, the image is read using the imread() method within the Imgcodecs OpenCV class. This method accepts the absolute image path held in the currImgPath variable. The image is returned as a Mat into the imgBGR variable. Note that the color space of the image returned by imread() is BGR, not RGB.

When extracting the color histogram, BGR or RGB aren’t the preferred options because there is no single channel responsible for specifying the color, but rather a combination of the 3 channels. Thus, it’s preferred to convert the image from its current color space to another space that uses fewer channels to specify the color. One of these color spaces is HSV (Hue Saturation Value). The first channel, Hue, is responsible for holding the color.
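To make the role of the Hue channel concrete, here is a minimal plain-Java sketch of how a hue value can be derived from a single BGR pixel. It follows the standard HSV formula and assumes the 8-bit convention of storing half the hue angle so that it fits in the range 0 to 179. The class name HueSketch is hypothetical, and the result approximates what cvtColor() produces rather than reproducing its exact rounding.

```java
class HueSketch {
    // Approximate 8-bit hue for a BGR pixel: the hue angle in degrees (0..360)
    // is halved so that it fits in one byte (0..179).
    static int hue(int b, int g, int r) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        int delta = max - min;
        if (delta == 0) {
            return 0; // grey pixel: hue is undefined, 0 is used by convention
        }
        double degrees;
        if (max == r) {
            degrees = 60.0 * (g - b) / delta;
        } else if (max == g) {
            degrees = 120.0 + 60.0 * (b - r) / delta;
        } else {
            degrees = 240.0 + 60.0 * (r - g) / delta;
        }
        if (degrees < 0) {
            degrees += 360.0;
        }
        return (int) Math.round(degrees / 2.0) % 180;
    }

    public static void main(String[] args) {
        // Arguments are given in B, G, R order to mirror OpenCV's channel layout.
        System.out.println("red    -> " + hue(0, 0, 255));   // prints "red    -> 0"
        System.out.println("yellow -> " + hue(0, 255, 255)); // prints "yellow -> 30"
        System.out.println("green  -> " + hue(0, 255, 0));   // prints "green  -> 60"
        System.out.println("blue   -> " + hue(255, 0, 0));   // prints "blue   -> 120"
    }
}
```

A single number per pixel is enough to separate red, yellow, and green fruit, which is why the histogram of this one channel works as a feature here.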

In order to convert the image from its current color space (BGR) to HSV, the first step is to create an empty Mat that holds the result of conversion. This mat is named imgHSV. The color conversion is done using the cvtColor() method within the Imgproc class. It accepts 3 arguments:

  1. Mat src: Source Mat, which is imgBGR in this case.
  2. Mat dst: Destination Mat, in which the result of conversion will be saved. It’s imgHSV in this case.
  3. int code: Color space conversion code. For our case, the code is 40. We can also use the named constant Imgproc.COLOR_BGR2HSV, which has the value 40 and is easier to read.

package opencvapp;

import java.io.File;
import java.util.Arrays;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfInt;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class TrainANN {
    public static void main(String [] args){
        String currentDirectory = System.getProperty("user.dir");
        System.load(currentDirectory + "\\OpenCVDLL\\x64\\" + Core.NATIVE_LIBRARY_NAME + ".dll");

        Mat datasetHist = new Mat();
        Mat datasetLabels = new Mat();
        
        String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
        
        for(int classIdx=0; classIdx<classesNames.length; classIdx++){
            String currClassName = classesNames[classIdx];
            String currClassDir = currentDirectory + "\\Dataset\\Train\\" + currClassName + "\\";
            System.out.println("Current Class Directory : " + currClassDir);
            
            File folder = new File(currClassDir);
            File[] listOfFiles = folder.listFiles();
            
            // Counter for the number of image being processed within the class.
            int imgCount = 0;

            for (File listOfFile : listOfFiles) {
                // Make sure we are working with a file and its extension is JPG
                if (listOfFile.isFile() && (currClassDir + listOfFile.getName()).endsWith(".jpg")) {
                    System.out.println("Class Index " + classIdx + "(" + currClassName + ")" + ", Image Index " + imgCount + "(" + listOfFile.getName() + ")");
                    
                    String currImgPath = currClassDir + listOfFile.getName();
                    System.out.println(currImgPath);
                    Mat imgBGR = Imgcodecs.imread(currImgPath);

                    Mat imgHSV = new Mat();
                    Imgproc.cvtColor(imgBGR, imgHSV, Imgproc.COLOR_BGR2HSV);

                    // Preparing parameters of Imgproc.calcHist().
                    MatOfInt selectedChannels = new MatOfInt(0); // Selecting the first channel in HSV which is Hue.
                    Mat imgHist = new Mat();
                    MatOfInt histSize = new MatOfInt(180); // Number of bins in the histogram.
                    MatOfFloat ranges = new MatOfFloat(0f, 180f); // Uniform bin distribution. Start value of first bin is 0 and end value for last bin is 180.

                    // Doc: https://docs.opencv.org/3.1.0/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d
                    Imgproc.calcHist(Arrays.asList(imgHSV), selectedChannels, new Mat(), imgHist, histSize, ranges);

                    // Transposing the histogram Mat from being 1D column vector to be 1D row vector.
                    imgHist = imgHist.t();
                    System.out.println("Hue Channel Hist : " + imgHist.dump());
                    System.out.println("Image Hist Size : (" + imgHist.rows() + ", " + imgHist.cols()+ ")\n");

                    // Inserting the extracted histogram of the current image into the Mat collecting the histograms of all images.
                    datasetHist.push_back(imgHist);
                    datasetLabels.push_back(new MatOfInt(classIdx));
                    
                    imgCount++;
                }
            }
        }
        
        // Converting the type of the features & labels Mats into CV_32F because ANN accepts data of this type.
        datasetHist.convertTo(datasetHist, CvType.CV_32F);
        datasetLabels.convertTo(datasetLabels, CvType.CV_32F);
        
        System.out.println("Dataset Hist Size : (" + datasetHist.rows() + ", " + datasetHist.cols()+ ")");
        System.out.println("Dataset Label Size : (" + datasetLabels.rows() + ", " + datasetLabels.cols()+ ")");
        
    }    
}

After the image is converted to the HSV color space, we’re ready to extract the color histogram. The calcHist() method from the Imgproc class is used for this purpose. This method accepts 6 arguments, which are as follows:

  1. List<Mat> images: A list of Mat images to extract the color histogram. In our case, the histogram is extracted from just one image at once.
  2. MatOfInt channels: An integer Mat that specifies the indices of the channels from which the color histogram is extracted. In our code, the selectedChannels variable reflects that the channel within index 0 will be used (the Hue channel).
  3. Mat mask: Optional variable that’s unnecessary for our experiment. This is why it is set to an empty Mat in our code.
  4. Mat hist: The destination Mat in which the extracted histogram will be saved. In the above code, the destination Mat is an empty Mat named imgHist. It holds the histogram of the current image only.
  5. MatOfInt histSize: Number of bins within the histogram. According to the histSize variable, it’s set to 180.
  6. MatOfFloat ranges: The range of values for each bin. In our experiment, we’ll use uniform distribution by specifying 2 values within a Mat variable named ranges. The first value refers to the starting value of the first bin, which is 0. The second value refers to the end value of the last bin, which is selected to be 180.
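To illustrate what these arguments mean, the following plain-Java sketch counts hue values into uniform bins. It is not OpenCV’s implementation: the class name HueHistogramSketch is hypothetical, the input is a flat array of hue values rather than a Mat, and masks and multiple channels are left out. As with calcHist(), the upper end of the range is treated as exclusive.

```java
class HueHistogramSketch {
    // Counts how many values fall into each of `bins` equal-width bins covering
    // the range [low, high). Values outside that range are ignored.
    static float[] histogram(int[] values, int bins, float low, float high) {
        float[] hist = new float[bins];
        float binWidth = (high - low) / bins;
        for (int v : values) {
            if (v < low || v >= high) {
                continue; // the upper boundary is exclusive
            }
            int bin = (int) ((v - low) / binWidth);
            bin = Math.min(bin, bins - 1); // guard against floating-point rounding
            hist[bin] += 1f;
        }
        return hist;
    }

    public static void main(String[] args) {
        int[] hues = {0, 0, 10, 179, 180}; // 180 lies outside [0, 180) and is skipped
        float[] hist = histogram(hues, 180, 0f, 180f);
        System.out.println("bins = " + hist.length);   // prints "bins = 180"
        System.out.println("hist[0] = " + hist[0]);     // prints "hist[0] = 2.0"
        System.out.println("hist[179] = " + hist[179]); // prints "hist[179] = 1.0"
    }
}
```

With 180 bins over the range 0 to 180, each bin is exactly 1 hue unit wide, so the feature vector has one entry per possible hue value.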

After the calcHist() method extracts the histogram, the result will be returned in the destination Mat (imgHist). This Mat is a 1D column vector. I prefer converting it into a 1D row vector because a feature vector is usually represented as a row vector. This is done by transposing the vector using the t() method. However, you can leave it as a column vector if that works well for you.

Remember that imgHist represents the histogram of a single image. Previously, we created 2 Mat variables named datasetHist and datasetLabels for holding the features and labels of all images. In order to insert the histogram of the current image stored in imgHist inside the datasetHist Mat, the push_back() method is used. The same happens for the datasetLabels Mat to insert the current image’s class label.

After extracting the features from all training images within the dataset, we have to make sure that the type of the features and labels Mat variables is CV_32F (Float 32). We can do that using the convertTo() method.

After the above code executes successfully, the size of the datasetHist and datasetLabels Mat variables is printed according to the next figure. Note that the number of training images within the apple class is 492 and 490 for each of the other 3 classes for a total of 1,962 training images. This is indicated by the number of rows within these 2 Mat variables.

After building the training data inputs and outputs (i.e. datasetHist and datasetLabels), we’re ready to build and train the ANN using this data.

Building, Training, and Saving the ANN

The version of the TrainANN class in the previous section will be extended to build, train, and save the trained ANN. The new class is listed below.

OpenCV offers a class named ANN_MLP available in the org.opencv.ml package. This class stands for “Artificial Neural Networks- Multi Layer Perceptrons” and is used for building ANNs. For detailed instructions on building the ANN using OpenCV, read my previous tutorial titled “Running Artificial Neural Networks in Android using OpenCV”.

The first step towards building an ANN is to create an instance of the ANN_MLP class using the create() method. The instance is saved into a variable named ANN.

The architecture of the ANN (i.e. layers) is specified using the setLayerSizes() method. This method accepts a Mat that specifies both the number of layers and the number of neurons within each layer. The Mat is named layerSizes, and its length refers to the number of layers in the network. In this tutorial, this Mat has 4 rows referring to 4 different layers: 1 input layer, 2 hidden layers, and 1 output layer.

The value within the Mat refers to the number of neurons within each layer. For the input layer, it will have 180 neurons because the length of the feature vector extracted from each image is 180. The first hidden layer has 60 neurons, and the second hidden layer has 20 neurons. The output layer has only 1 output neuron.
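To get a feeling for the size of this architecture, the sketch below counts its trainable parameters. It assumes a fully connected network in which every neuron outside the input layer has one weight per neuron in the previous layer plus one bias. The class name LayerParameterCount is hypothetical, and the count is an illustration rather than a value reported by OpenCV.

```java
class LayerParameterCount {
    // Sums weights and biases over all layers after the input layer, assuming
    // full connectivity and one bias per neuron.
    static int countParameters(int[] layerSizes) {
        int total = 0;
        for (int i = 1; i < layerSizes.length; i++) {
            int weights = layerSizes[i - 1] * layerSizes[i];
            int biases = layerSizes[i];
            total += weights + biases;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] layerSizes = {180, 60, 20, 1}; // input, 2 hidden layers, output
        System.out.println(countParameters(layerSizes)); // prints 12101
    }
}
```

Most of these parameters (180 x 60 weights) sit between the input layer and the first hidden layer.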

Note that this problem is a multi-class problem, but we still used 1 neuron. Why is this the case?

There are 4 classes used in the experiment, where the ID of the first class labeled apple is 0, 1 for class lemon, 2 for mango, and 3 for the last class, raspberry. The result of the output neuron is rounded to the nearest class ID. For example, if its output is 2.1, then the predicted class ID will be 2, referring to the class label mango.
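The rounding step can be sketched in plain Java as follows. The class name LabelDecoder is hypothetical, and the clamping to the valid ID range is an extra safeguard that is not part of the tutorial’s code: it keeps a score such as 3.6 or -0.7 from producing an ID with no matching class.

```java
class LabelDecoder {
    static final String[] CLASS_NAMES = {"apple", "lemon", "mango", "raspberry"};

    // Rounds the raw network output to the nearest integer and clamps it to the
    // range of valid class IDs (0 .. CLASS_NAMES.length - 1).
    static int toClassId(double response) {
        long rounded = Math.round(response);
        return (int) Math.max(0, Math.min(CLASS_NAMES.length - 1, rounded));
    }

    static String toClassName(double response) {
        return CLASS_NAMES[toClassId(response)];
    }

    public static void main(String[] args) {
        System.out.println(toClassName(2.1));  // prints "mango"
        System.out.println(toClassName(0.2));  // prints "apple"
        System.out.println(toClassName(3.6));  // prints "raspberry" (4 is clamped to 3)
    }
}
```

Encoding the class as a single number is simple, but it implies an ordering between classes (for instance, that lemon lies between apple and mango) that has no real meaning for fruit.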

The activation function used in the network is specified using the setActivationFunction() method. Here, we’ll use the symmetric sigmoid function (ANN_MLP.SIGMOID_SYM). The setTrainMethod() method specifies the training algorithm. In this tutorial, the backpropagation algorithm (ANN_MLP.BACKPROP) is used.

The last step before training the ANN is to specify some parameters using an instance of the TermCriteria class, which stands for “termination criteria”. These parameters specify when the ANN stops training. The parameters are as follows:

  1. int type: Termination type, which is set to TermCriteria.EPS + TermCriteria.COUNT. This means the training stops when either the error reaches a desired value or after reaching a maximum number of iterations. The desired error is specified using the epsilon argument, and the number of iterations is specified using the maxCount argument.
  2. int maxCount: Maximum number of iterations.
  3. double epsilon: The maximum difference between the desired and predicted outputs. The lower the difference, the more accurate the predictions.

The instance of the TermCriteria class is fed to the ANN using the setTermCriteria() method.
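The way the two criteria combine can be illustrated with a toy loop. This is only a sketch under simple assumptions: the class name TerminationSketch is hypothetical, the "training step" just halves an error value, and the exact quantity that OpenCV compares against epsilon internally is not modeled here.

```java
class TerminationSketch {
    // Runs a toy training loop whose error halves on every iteration and returns
    // the number of iterations performed before either criterion stops it.
    static int iterationsUntilStop(double initialError, int maxCount, double epsilon) {
        double error = initialError;
        int iterations = 0;
        // Keep going only while BOTH conditions allow it: the iteration budget
        // is not exhausted (COUNT) and the error is still above epsilon (EPS).
        while (iterations < maxCount && error > epsilon) {
            error /= 2.0; // stand-in for one real training step
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Same maxCount and epsilon as in the tutorial: the error criterion fires first.
        System.out.println(iterationsUntilStop(1.0, 10000, 0.00000001)); // prints 27
        // With a small iteration budget, the count criterion fires first.
        System.out.println(iterationsUntilStop(1.0, 5, 0.00000001));     // prints 5
    }
}
```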

package opencvapp;

import java.io.File;
import java.util.Arrays;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfInt;
import org.opencv.core.TermCriteria;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import org.opencv.ml.ANN_MLP;
import org.opencv.ml.Ml;

public class TrainANN {
    public static void main(String [] args){
        String currentDirectory = System.getProperty("user.dir");
        System.load(currentDirectory + "\\OpenCVDLL\\x64\\" + Core.NATIVE_LIBRARY_NAME + ".dll");

        Mat datasetHist = new Mat();
        Mat datasetLabels = new Mat();
        
        String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
        
        for(int classIdx=0; classIdx<classesNames.length; classIdx++){
            String currClassName = classesNames[classIdx];
            String currClassDir = currentDirectory + "\\Dataset\\Train\\" + currClassName + "\\";
            System.out.println("Current Class Directory : " + currClassDir);
            
            File folder = new File(currClassDir);
            File[] listOfFiles = folder.listFiles();
            
            // Counter for the number of image being processed within the class.
            int imgCount = 0;

            for (File listOfFile : listOfFiles) {
                // Make sure we are working with a file and its extension is JPG
                if (listOfFile.isFile() && (currClassDir + listOfFile.getName()).endsWith(".jpg")) {
                    System.out.println("Class Index " + classIdx + "(" + currClassName + ")" + ", Image Index " + imgCount + "(" + listOfFile.getName() + ")");
                    
                    String currImgPath = currClassDir + listOfFile.getName();
                    System.out.println(currImgPath);
                    Mat imgBGR = Imgcodecs.imread(currImgPath);

                    Mat imgHSV = new Mat();
                    Imgproc.cvtColor(imgBGR, imgHSV, Imgproc.COLOR_BGR2HSV);

                    // Preparing parameters of Imgproc.calcHist().
                    MatOfInt selectedChannels = new MatOfInt(0); // Selecting the first channel in HSV which is Hue.
                    Mat imgHist = new Mat();
                    MatOfInt histSize = new MatOfInt(180); // Number of bins in the histogram.
                    MatOfFloat ranges = new MatOfFloat(0f, 180f); // Uniform bin distribution. Start value of first bin is 0 and end value for last bin is 180.

                    // Doc: https://docs.opencv.org/3.1.0/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d
                    Imgproc.calcHist(Arrays.asList(imgHSV), selectedChannels, new Mat(), imgHist, histSize, ranges);

                    // Transposing the histogram Mat from being 1D column vector to be 1D row vector.
                    imgHist = imgHist.t();
                    System.out.println("Hue Channel Hist : " + imgHist.dump());
                    System.out.println("Image Hist Size : (" + imgHist.rows() + ", " + imgHist.cols()+ ")\n");

                    // Inserting the extracted histogram of the current image into the Mat collecting the histograms of all images.
                    datasetHist.push_back(imgHist);
                    datasetLabels.push_back(new MatOfInt(classIdx));
                    
                    imgCount++;
                }
            }
        }
        
        // Converting the type of the features & labels Mats into CV_32F because ANN accepts data of this type.
        datasetHist.convertTo(datasetHist, CvType.CV_32F);
        datasetLabels.convertTo(datasetLabels, CvType.CV_32F);
        
        System.out.println("Dataset Hist Size : (" + datasetHist.rows() + ", " + datasetHist.cols()+ ")");
        System.out.println("Dataset Label Size : (" + datasetLabels.rows() + ", " + datasetLabels.cols()+ ")");
        
        ANN_MLP ANN = ANN_MLP.create();

        Mat layerSizes = new Mat(4, 1, CvType.CV_8U);
        layerSizes.put(0, 0, 180); // 180 input neurons in the input layer because the histogram length is 180.
        layerSizes.put(1, 0, 60); // 60 hidden neurons in the first hidden layer.
        layerSizes.put(2, 0, 20); // 20 hidden neurons in the second hidden layer.
        layerSizes.put(3, 0, 1); // One output neuron in the output layer.
        ANN.setLayerSizes(layerSizes);
        System.out.println("Layers Sizes : \n" + layerSizes.dump());

        ANN.setActivationFunction(ANN_MLP.SIGMOID_SYM);
        ANN.setTrainMethod(ANN_MLP.BACKPROP);

        TermCriteria criteria = new TermCriteria(TermCriteria.EPS + TermCriteria.COUNT, 10000, 0.00000001);
        ANN.setTermCriteria(criteria);

        ANN.train(datasetHist, Ml.ROW_SAMPLE, datasetLabels);
        
        try{
            ANN.save(currentDirectory + "\\OpenCV_ANN_Fruits.yml");
            System.out.println("Model Saved Successfully.");
        } catch(Exception ex) {
            System.err.println("Error Saving Model.");
        }        
    }    
}

We train the ANN using the train() method. This method accepts 3 arguments:

  1. InputArray samples: Training data inputs, which is the datasetHist Mat in this case.
  2. int layout: Sample layout. If each sample is represented as a row, its value is 0; the constant Ml.ROW_SAMPLE is defined as 0. If each sample is a column in the input Mat, its value is 1, or equivalently Ml.COL_SAMPLE. Remember that each histogram is transposed into a row vector before being pushed into datasetHist, so each sample is a row. Thus, we’ll use Ml.ROW_SAMPLE.
  3. InputArray responses: Training data outputs, which is the datasetLabels Mat in this case.

After training completes, the trained ANN will be saved into a YML file named OpenCV_ANN_Fruits.yml in the root of the NetBeans project. This is the end of the training step. Now let’s discuss testing the trained ANN.

Making Predictions by Loading the Saved ANN

For making predictions using the trained ANN, there’s a new class named PredictANN, and its code is listed below. The path specified in the currClassDir variable now uses the word Test rather than Train because we are using the test images.

This class prepares the testing data inside the datasetHist and datasetLabels Mat variables exactly the same as the TrainANN class. Thus, the code is just copied and pasted from the TrainANN class. What is new is the loading of the saved YML file and the process of making predictions.

The model is loaded using the load() method by specifying the YML file path. After that, a for loop goes through all samples within datasetHist and fetches sample by sample within the Mat variable named sample.

package opencvapp;

import java.io.File;
import java.util.Arrays;
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfInt;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import org.opencv.ml.ANN_MLP;

public class PredictANN {
    public static void main(String[] args) {
        String currentDirectory = System.getProperty("user.dir");
        System.load(currentDirectory + "\\OpenCVDLL\\x64\\" + Core.NATIVE_LIBRARY_NAME + ".dll");
        Mat datasetHist = new Mat();
        Mat datasetLabels = new Mat();
        
        String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
        
        for(int classIdx=0; classIdx<classesNames.length; classIdx++){
            String currClassName = classesNames[classIdx];
            String currClassDir = currentDirectory + "\\Dataset\\Test\\" + currClassName + "\\";
            System.out.println("Current Class Directory : " + currClassDir);
            
            File folder = new File(currClassDir);
            File[] listOfFiles = folder.listFiles();
            
            int imgCount = 0;

            for (File listOfFile : listOfFiles) {
                // Make sure we are working with a file and its extension is JPG
                if (listOfFile.isFile() && (currClassDir + listOfFile.getName()).endsWith(".jpg")) {
                    System.out.println("Class Index " + classIdx + "(" + currClassName + ")" + ", Image Index " + imgCount + "(" + listOfFile.getName() + ")");
                    
                    String currImgPath = currClassDir + listOfFile.getName();
                    System.out.println(currImgPath);
                    // imread() returns a BGR image, so the same BGR-to-HSV conversion as in TrainANN is used.
                    Mat imgBGR = Imgcodecs.imread(currImgPath);

                    Mat imgHSV = new Mat();
                    Imgproc.cvtColor(imgBGR, imgHSV, Imgproc.COLOR_BGR2HSV);

                    // Preparing parameters of Imgproc.calcHist().
                    MatOfInt selectedChannels = new MatOfInt(0);
                    Mat imgHist = new Mat();
                    MatOfInt histSize = new MatOfInt(180);
                    MatOfFloat ranges = new MatOfFloat(0f, 180f);

                    // Doc: https://docs.opencv.org/3.1.0/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d
                    Imgproc.calcHist(Arrays.asList(imgHSV), selectedChannels, new Mat(), imgHist, histSize, ranges);

                    // Transposing the histogram Mat from being 1D column vector to be 1D row vector.
                    imgHist = imgHist.t();
                    System.out.println("Hue Channel Hist : " + imgHist.dump());
                    System.out.println("Image Hist Size : (" + imgHist.rows() + ", " + imgHist.cols()+ ")\n");

                    // Inserting the extracted histogram of the current image into the Mat collecting the histograms of all images.
                    datasetHist.push_back(imgHist);
                    datasetLabels.push_back(new MatOfInt(classIdx));
                    
                    imgCount++;
                }
            }
        }
        
        // Converting the type of the features & labels Mats into CV_32F because ANN accepts data of this type.
        datasetHist.convertTo(datasetHist, CvType.CV_32F);
        datasetLabels.convertTo(datasetLabels, CvType.CV_32F);
        
        System.out.println("Dataset Hist Size : (" + datasetHist.rows() + ", " + datasetHist.cols()+ ")");
        System.out.println("Dataset Label Size : (" + datasetLabels.rows() + ", " + datasetLabels.cols()+ ")");

        ANN_MLP ANN = ANN_MLP.load(currentDirectory + "\\OpenCV_ANN_Fruits.yml");
        
        double num_correct_predictions = 0;
        for (int i = 0; i < datasetHist.rows(); i++) {
            Mat sample = datasetHist.row(i);
            double correct_label = datasetLabels.get(i, 0)[0];

            Mat results = new Mat();
            ANN.predict(sample, results, 0);

            double response = results.get(0, 0)[0];
            int predicted_label = (int) Math.round(response);
            
            System.out.println("Predicted Score : " + response + ", Predicted Label : " + predicted_label + ", Correct Label : " + correct_label);

            if (predicted_label == correct_label) {
                num_correct_predictions += 1;
            }
        }
        
        double accuracy = (num_correct_predictions / datasetHist.rows()) * 100;
        System.out.println("Accuracy : " + accuracy);

    }    
}

The sample Mat is fed to the predict() method in addition to another Mat named results, which holds the predicted class score. After rounding this score, the predicted class ID is returned into the predicted_label variable.

By comparing the predicted class with the correct class, we can count the number of correct predictions into the num_correct_predictions variable.

Finally, we calculate the accuracy by dividing the value in the num_correct_predictions variable by the total number of samples. According to the next figure, the accuracy for the test data is 100%. Not surprisingly, this is the best result we can reach.
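The accuracy calculation itself is independent of OpenCV and can be sketched as a small helper. The class name AccuracySketch is hypothetical, and the labels in main() are made-up values for illustration only, not results from the dataset.

```java
class AccuracySketch {
    // Returns the percentage of positions where the predicted label equals the
    // expected label.
    static double accuracyPercent(int[] predicted, int[] expected) {
        if (predicted.length != expected.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        if (predicted.length == 0) {
            return 0.0; // avoid dividing by zero when there are no samples
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == expected[i]) {
                correct++;
            }
        }
        return 100.0 * correct / predicted.length;
    }

    public static void main(String[] args) {
        int[] predicted = {0, 1, 2, 2}; // illustrative values only
        int[] expected = {0, 1, 2, 3};
        System.out.println("Accuracy : " + accuracyPercent(predicted, expected)); // prints "Accuracy : 75.0"
    }
}
```

Comparing int labels here avoids the int-to-double comparison used in PredictANN, where correct_label is read from the Mat as a double.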

After training the ANN and reaching the best possible accuracy, the next step is to send the YML file to the Android device. For example, you can share this file with the Android device by Bluetooth or transfer it using a USB cable.

Regardless, you have to know the path of the YML file in your Android device. After opening the details of the YML file, its absolute path is given in the next figure. This path will be used in the Android app to load the model. The next section builds such an app.

Building an Android App

This section assumes you already have an Android app in which OpenCV is imported. The OpenCV version used for Android is 3.4.4. The main activity XML layout is listed below. The root view is a LinearLayout in which 7 child views exist: 3 TextView, 2 EditText, and 2 Button views.

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    tools:context="com.example.dell.opencvandroid.MainActivity">

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="1. Enter the Path of the Trained ANN YML File" />

    <EditText
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="/storage/emulated/0/Download/OpenCV_ANN_Fruits.yml"
        android:id="@+id/modelPath" />

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="2. Select an Image File" />

    <Button
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Select a Fruit Image File"
        android:onClick="selectImage" />

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Or Enter the Image File Path" />

    <EditText
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="/storage/emulated/0/Download/apple.jpg"
        android:id="@+id/imgPath" />

    <Button
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Classify Image using ANN"
        android:onClick="predictANN" />

</LinearLayout>

The activity window is shown in the next figure:

The first EditText with ID modelPath accepts the trained ANN model path.

According to the previous figure, the path is /storage/emulated/0/Download/OpenCV_ANN_Fruits.yml. The second EditText, whose ID is imgPath, accepts the absolute path of the image to be classified.

Rather than typing the absolute path of the image, you can click the first button, whose text is Select a Fruit Image File. This button calls a callback method named selectImage() within the activity Java file. Its implementation is part of the complete MainActivity listing later in this tutorial. It creates a new Intent with the ACTION_GET_CONTENT action for selecting a file.

After clicking the button, the user is asked to select a file.

Once the image file is selected, the onActivityResult() callback method is called automatically. Its implementation is also part of the complete listing. It uses an if statement to make sure a file was selected and then retrieves the URI of the selected file.

To convert this URI into the absolute path of the selected file, a method named getPath() is called. Note that the implementation of getPath() and its helper methods is taken from Paul Burke's answer on Stack Overflow (https://stackoverflow.com/a/20559175/5426539).

The implementation of the getPath() method and all methods it needs are listed below. It accepts the application context in addition to the selected file URI and returns a String representing the absolute path of the selected file.

public static String getPath(final Context context, final Uri uri) {

    final boolean isKitKat = Build.VERSION.SDK_INT >= Build.VERSION_CODES.KITKAT;

    // DocumentProvider
    if (isKitKat && DocumentsContract.isDocumentUri(context, uri)) {
        // ExternalStorageProvider
        if (isExternalStorageDocument(uri)) {
            final String docId = DocumentsContract.getDocumentId(uri);
            final String[] split = docId.split(":");
            final String type = split[0];

            if ("primary".equalsIgnoreCase(type)) {
                return Environment.getExternalStorageDirectory() + "/" + split[1];
            }

            // TODO handle non-primary volumes
        }
        // DownloadsProvider
        else if (isDownloadsDocument(uri)) {

            final String id = DocumentsContract.getDocumentId(uri);
            final Uri contentUri = ContentUris.withAppendedId(
                    Uri.parse("content://downloads/public_downloads"), Long.valueOf(id));

            return getDataColumn(context, contentUri, null, null);
        }
        // MediaProvider
        else if (isMediaDocument(uri)) {
            final String docId = DocumentsContract.getDocumentId(uri);
            final String[] split = docId.split(":");
            final String type = split[0];

            Uri contentUri = null;
            if ("image".equals(type)) {
                contentUri = MediaStore.Images.Media.EXTERNAL_CONTENT_URI;
            } else if ("video".equals(type)) {
                contentUri = MediaStore.Video.Media.EXTERNAL_CONTENT_URI;
            } else if ("audio".equals(type)) {
                contentUri = MediaStore.Audio.Media.EXTERNAL_CONTENT_URI;
            }

            final String selection = "_id=?";
            final String[] selectionArgs = new String[] {
                    split[1]
            };

            return getDataColumn(context, contentUri, selection, selectionArgs);
        }
    }
    // MediaStore (and general)
    else if ("content".equalsIgnoreCase(uri.getScheme())) {
        return getDataColumn(context, uri, null, null);
    }
    // File
    else if ("file".equalsIgnoreCase(uri.getScheme())) {
        return uri.getPath();
    }

    return null;
}

public static String getDataColumn(Context context, Uri uri, String selection,
                                   String[] selectionArgs) {

    Cursor cursor = null;
    final String column = "_data";
    final String[] projection = {
            column
    };

    try {
        cursor = context.getContentResolver().query(uri, projection, selection, selectionArgs,
                null);
        if (cursor != null && cursor.moveToFirst()) {
            final int column_index = cursor.getColumnIndexOrThrow(column);
            return cursor.getString(column_index);
        }
    } finally {
        if (cursor != null)
            cursor.close();
    }
    return null;
}

public static boolean isExternalStorageDocument(Uri uri) {
    return "com.android.externalstorage.documents".equals(uri.getAuthority());
}

public static boolean isDownloadsDocument(Uri uri) {
    return "com.android.providers.downloads.documents".equals(uri.getAuthority());
}

public static boolean isMediaDocument(Uri uri) {
    return "com.android.providers.media.documents".equals(uri.getAuthority());
}
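The ExternalStorageProvider branch of getPath() splits the document ID at the colon and appends the second part to the external storage directory. The following plain-Java sketch isolates that step so it can be tried outside Android. The class name DocIdSketch, the method primaryPath(), and the example document ID are assumptions for illustration, and the storageRoot parameter stands in for the value returned by Environment.getExternalStorageDirectory().

```java
// Illustrative only: DocIdSketch is a hypothetical helper, not part of the tutorial project.
final class DocIdSketch {

    // Builds an absolute path from a document ID of the assumed form "primary:Download/apple.jpg".
    // Returns null when the ID has no colon or does not refer to the primary volume.
    static String primaryPath(String storageRoot, String docId) {
        String[] split = docId.split(":", 2);
        if (split.length < 2 || !"primary".equalsIgnoreCase(split[0])) {
            return null;
        }
        return storageRoot + "/" + split[1];
    }

    public static void main(String[] args) {
        System.out.println(primaryPath("/storage/emulated/0", "primary:Download/apple.jpg"));
    }
}
```

Unlike the listing above, this sketch also guards against a document ID without a colon, which would otherwise cause an ArrayIndexOutOfBoundsException when reading split[1].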

After the path is returned by the getPath() method, the onActivityResult() method sets the text of the imgPath EditText to that path and displays it in a Toast message, as shown in the next figure:

Predicting the Class Label of the Image in Android

Now we have the path of the trained ANN YML file and also the path of the image to be classified. The next step is to read the image, convert it to HSV, extract the Hue channel histogram, and predict the class label. By clicking on the last Button view defined inside the layout file, a callback method named predictANN() will be called. The implementation of this method is listed below:

public void predictANN(View v){
    EditText imgPath = findViewById(R.id.imgPath);
    Mat imgBGR = Imgcodecs.imread(imgPath.getText().toString());

    Mat imgHSV = new Mat();
    Imgproc.cvtColor(imgBGR, imgHSV, Imgproc.COLOR_BGR2HSV);

    // Preparing parameters of Imgproc.calcHist().
    MatOfInt selectedChannels = new MatOfInt(0);
    Mat imgHist = new Mat();
    MatOfInt histSize = new MatOfInt(180);
    MatOfFloat ranges = new MatOfFloat(0f, 180f);

    // Doc: https://docs.opencv.org/3.1.0/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d
    Imgproc.calcHist(Arrays.asList(imgHSV), selectedChannels, new Mat(), imgHist, histSize, ranges);

    imgHist = imgHist.t();

    imgHist.convertTo(imgHist, CvType.CV_32F);

    EditText modelPath = findViewById(R.id.modelPath);
    ANN_MLP ANN = ANN_MLP.load(modelPath.getText().toString());

    Mat results = new Mat();
    ANN.predict(imgHist, results, 0);

    double response = results.get(0, 0)[0];
    int predictedClassIdx = (int) Math.round(response);

    String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
    String predictedLabel = classesNames[predictedClassIdx];

    Toast.makeText(getApplicationContext(), "Predicted Score : " + response + "\nPredicted Class Idx : " + predictedClassIdx + "\nPredicted Label : " + predictedLabel, Toast.LENGTH_LONG).show();

}

The predictANN() method simply fetches the image path from the EditText view, whose ID is imgPath. This path is fed to the Imgcodecs.imread() method, where the image is read as a Mat. The image is saved into the imgBGR variable.

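A new empty Mat named imgHSV is created to hold the image after it is converted from BGR to the HSV color space. The conversion uses the Imgproc.cvtColor() method, as discussed previously. After that, the histogram of the hue channel (channel 0) is extracted using the Imgproc.calcHist() method with 180 bins over the range 0 to 180. The resulting histogram is then transposed into a single row and converted to the CV_32F type before being passed to the ANN.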
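To make the histogram step more concrete, the following plain-Java sketch shows what a 180-bin histogram over the range 0 to 180 amounts to: each bin counts the pixels with one particular hue value. It is not OpenCV's implementation. The class name HueHistogramSketch, the method hueHistogram(), and the sample hue values are hypothetical, and the sketch assumes hue values already lie in the 0 to 179 range that OpenCV uses for 8-bit HSV images.

```java
// Illustrative only: a plain-Java stand-in for the histogram step, not OpenCV's calcHist().
final class HueHistogramSketch {

    static final int NUM_BINS = 180;

    // Counts how many pixels fall into each of the 180 one-unit-wide hue bins.
    // Values outside [0, 180) are ignored.
    static float[] hueHistogram(int[] hueValues) {
        float[] hist = new float[NUM_BINS];
        for (int hue : hueValues) {
            if (hue >= 0 && hue < NUM_BINS) {
                hist[hue] += 1f;
            }
        }
        return hist;
    }

    public static void main(String[] args) {
        int[] hues = {0, 0, 30, 179}; // made-up hue values of four pixels
        float[] hist = hueHistogram(hues);
        System.out.println("bin 0 = " + hist[0] + ", bin 30 = " + hist[30] + ", bin 179 = " + hist[179]);
    }
}
```

The returned array plays the role of the single-row, 180-element feature vector that is fed to the ANN.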

Now we’re ready to load the model using the load() method. The path fed to this method is fetched from the EditText view, whose ID is modelPath. After predicting the class label of the image, a Toast message is shown. It prints the prediction score, the ID of the predicted class, and its text label.

In this example, the prediction score is -0.008229. This is the raw value that the neural network outputs. The score is rounded to the nearest integer, which is 0 in this case, and that integer is used as the index of the predicted class. Finally, the label at index 0 of the classesNames array is apple, as defined in the code.
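The mapping from score to label can be sketched in plain Java as follows. Note that predictANN() indexes the classesNames array directly, so a rounded score below 0 or above 3 would throw an ArrayIndexOutOfBoundsException. The sketch below adds a bounds check for that case. The class name LabelLookupSketch, the method labelForScore(), and the "unknown" fallback are hypothetical additions, not part of the tutorial project.

```java
// Illustrative only: LabelLookupSketch and the "unknown" fallback are hypothetical.
final class LabelLookupSketch {

    static final String[] CLASS_NAMES = {"apple", "lemon", "mango", "raspberry"};

    // Rounds the raw score to the nearest integer and returns the matching label,
    // or "unknown" when the score is NaN or the rounded index is out of range.
    static String labelForScore(double score) {
        if (Double.isNaN(score)) {
            return "unknown";
        }
        long idx = Math.round(score);
        if (idx < 0 || idx >= CLASS_NAMES.length) {
            return "unknown";
        }
        return CLASS_NAMES[(int) idx];
    }

    public static void main(String[] args) {
        System.out.println(labelForScore(-0.008229)); // score from the example above
    }
}
```

For the score -0.008229 from the example, the rounded index is 0 and the returned label is apple.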

The complete implementation of the main activity Java file that merges all the Java methods discussed above is listed below.

package com.example.dell.opencvandroid;

import android.content.ContentUris;
import android.content.Context;
import android.content.Intent;
import android.database.Cursor;
import android.net.Uri;
import android.os.Build;
import android.os.Environment;
import android.provider.DocumentsContract;
import android.provider.MediaStore;
import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;
import android.view.View;
import android.widget.EditText;
import android.widget.Toast;

import org.opencv.android.OpenCVLoader;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfInt;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import org.opencv.ml.ANN_MLP;
import java.util.Arrays;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        OpenCVLoader.initDebug();
    }

    public void selectImage(View v) {
        Intent intent = new Intent();
        intent.setType("*/*");
        intent.setAction(Intent.ACTION_GET_CONTENT);
        startActivityForResult(intent, 0);
    }

    @Override
    protected void onActivityResult(int reqCode, int resCode, Intent data) {
        if(resCode == RESULT_OK && data != null) {
            Uri uri = data.getData();

            String realPath = getPath(getApplicationContext(), uri);

            EditText imgPath = findViewById(R.id.imgPath);
            imgPath.setText(realPath);
            Toast.makeText(getApplicationContext(), realPath, Toast.LENGTH_LONG).show();
        }
    }

    // The implementation of getPath() and its helper methods is taken from Paul Burke's Stack Overflow answer: https://stackoverflow.com/a/20559175/5426539
    public static String getPath(final Context context, final Uri uri) {

        final boolean isKitKat = Build.VERSION.SDK_INT >= Build.VERSION_CODES.KITKAT;

        // DocumentProvider
        if (isKitKat && DocumentsContract.isDocumentUri(context, uri)) {
            // ExternalStorageProvider
            if (isExternalStorageDocument(uri)) {
                final String docId = DocumentsContract.getDocumentId(uri);
                final String[] split = docId.split(":");
                final String type = split[0];

                if ("primary".equalsIgnoreCase(type)) {
                    return Environment.getExternalStorageDirectory() + "/" + split[1];
                }

                // TODO handle non-primary volumes
            }
            // DownloadsProvider
            else if (isDownloadsDocument(uri)) {

                final String id = DocumentsContract.getDocumentId(uri);
                final Uri contentUri = ContentUris.withAppendedId(
                        Uri.parse("content://downloads/public_downloads"), Long.valueOf(id));

                return getDataColumn(context, contentUri, null, null);
            }
            // MediaProvider
            else if (isMediaDocument(uri)) {
                final String docId = DocumentsContract.getDocumentId(uri);
                final String[] split = docId.split(":");
                final String type = split[0];

                Uri contentUri = null;
                if ("image".equals(type)) {
                    contentUri = MediaStore.Images.Media.EXTERNAL_CONTENT_URI;
                } else if ("video".equals(type)) {
                    contentUri = MediaStore.Video.Media.EXTERNAL_CONTENT_URI;
                } else if ("audio".equals(type)) {
                    contentUri = MediaStore.Audio.Media.EXTERNAL_CONTENT_URI;
                }

                final String selection = "_id=?";
                final String[] selectionArgs = new String[] {
                        split[1]
                };

                return getDataColumn(context, contentUri, selection, selectionArgs);
            }
        }
        // MediaStore (and general)
        else if ("content".equalsIgnoreCase(uri.getScheme())) {
            return getDataColumn(context, uri, null, null);
        }
        // File
        else if ("file".equalsIgnoreCase(uri.getScheme())) {
            return uri.getPath();
        }

        return null;
    }

    public static String getDataColumn(Context context, Uri uri, String selection,
                                       String[] selectionArgs) {

        Cursor cursor = null;
        final String column = "_data";
        final String[] projection = {
                column
        };

        try {
            cursor = context.getContentResolver().query(uri, projection, selection, selectionArgs,
                    null);
            if (cursor != null && cursor.moveToFirst()) {
                final int column_index = cursor.getColumnIndexOrThrow(column);
                return cursor.getString(column_index);
            }
        } finally {
            if (cursor != null)
                cursor.close();
        }
        return null;
    }

    public static boolean isExternalStorageDocument(Uri uri) {
        return "com.android.externalstorage.documents".equals(uri.getAuthority());
    }

    public static boolean isDownloadsDocument(Uri uri) {
        return "com.android.providers.downloads.documents".equals(uri.getAuthority());
    }

    public static boolean isMediaDocument(Uri uri) {
        return "com.android.providers.media.documents".equals(uri.getAuthority());
    }

    public void predictANN(View v){
        EditText imgPath = findViewById(R.id.imgPath);
        Mat imgBGR = Imgcodecs.imread(imgPath.getText().toString());

        Mat imgHSV = new Mat();
        Imgproc.cvtColor(imgBGR, imgHSV, Imgproc.COLOR_BGR2HSV);

        // Preparing parameters of Imgproc.calcHist().
        MatOfInt selectedChannels = new MatOfInt(0);
        Mat imgHist = new Mat();
        MatOfInt histSize = new MatOfInt(180);
        MatOfFloat ranges = new MatOfFloat(0f, 180f);

        // Doc: https://docs.opencv.org/3.1.0/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d
        Imgproc.calcHist(Arrays.asList(imgHSV), selectedChannels, new Mat(), imgHist, histSize, ranges);

        imgHist = imgHist.t();

        imgHist.convertTo(imgHist, CvType.CV_32F);

        EditText modelPath = findViewById(R.id.modelPath);
        ANN_MLP ANN = ANN_MLP.load(modelPath.getText().toString());

        Mat results = new Mat();
        ANN.predict(imgHist, results, 0);

        double response = results.get(0, 0)[0];
        int predictedClassIdx = (int) Math.round(response);

        String [] classesNames = {"apple", "lemon", "mango", "raspberry"};
        String predictedLabel = classesNames[predictedClassIdx];

        Toast.makeText(getApplicationContext(), "Predicted Score : " + response + "\nPredicted Class Idx : " + predictedClassIdx + "\nPredicted Label : " + predictedLabel, Toast.LENGTH_LONG).show();

    }
}

Project on GitHub

The GitHub project of this tutorial is available here:

The project has 2 folders. The first one is named OpenCVApp, which is the NetBeans project. The second one is named OpenCVAndroid, which is the Android Studio project. You can download and import these projects and give them a try.
