Visual Recognition in Android Using IBM Watson

First things first—

What is Visual Recognition? And where can it be used?

In short, Visual Recognition helps us find meaning in visual content.
We can use Visual Recognition to develop smart applications that analyze the visual content of images or video frames to understand what is happening in a scene.

Some use cases where Visual Recognition can be applied:

  1. You’re an e-commerce giant that wants to entice customers into buying the things they want, using Visual Recognition.
    You advertise a new feature on your app and website: “Do you have a picture of the kinds of clothes and footwear you want? Send it to us, and we’ll take the headache out of your search.” Behind the scenes, you identify the color of the outfit, its fabric, and so on (all the classes and models were trained beforehand), and then offer search suggestions matching these criteria. Your customers will go crazy wondering how this works… 😉
  2. You’re running an online social media app. Users upload pictures and share them with their connections, but you want to keep explicit content off the platform. You can use Visual Recognition to identify the content of each upload, check whether it contains any kind of nudity, and allow the picture to be posted only when it passes this test with a sufficient score (see the sketch after this list). 😈
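To make that gating concrete, here’s a tiny sketch of the kind of score check involved (the class name and the 0.7 cutoff are purely illustrative assumptions, not anything defined by Watson):

```java
// Hypothetical moderation gate: publish an image only if the "explicit" score
// returned by an image-analysis service stays below a chosen cutoff.
public final class UploadModerator {

    private static final double EXPLICIT_SCORE_THRESHOLD = 0.7; // illustrative cutoff

    /** @param explicitScore confidence (0.0 to 1.0) that the image contains explicit content */
    public static boolean isAllowed(double explicitScore) {
        return explicitScore < EXPLICIT_SCORE_THRESHOLD;
    }
}
```

Where that cutoff sits is a product decision; a stricter platform simply lowers it.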

Now what if I told you that implementing Visual Recognition and using it isn’t difficult at all?

By the end of this article, you’ll know how it’s done! 😎

We’re going to work with the second use case in this blog post. We’ll be using IBM Watson for our demonstration.

A brief introduction to IBM Watson:

Watson is an AI platform offered by IBM that provides plenty of cool services, including Visual Recognition, Tone Analyzer, Conversation (smart chatbots), and more. Check out more here: Watson Services Portfolio.

Service Creation

  1. Create a free IBM Cloud account
  2. Go to your IBM Cloud Services Dashboard
  3. To create a Visual Recognition service, follow the instructions as shown below:

At the end of this you’ll have your Visual Recognition Service created on the IBM Cloud.

On the main page of the service, you’ll see the following tab:

Click on “SHOW” and copy the apiKey. It’s the authorization token that every request from our app will carry.

Also, explore the API Reference if you feel like digging deeper. 😉

Get the Sample App ready

We’re using a custom sample app, hosted on GitHub. I’ve kept this app simple, but it’s complex enough to cover what we’re trying to demonstrate — filtering media based on content.

The app’s main screen contains a single text box that takes an image URL as input. After entering a URL, press the Fetch Results button to have IBM Watson process it. The results are shown on the next screen.
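Under the hood, that flow can be as simple as reading the URL from the text box and handing it to the results screen. Here’s a rough sketch of what such a click handler might look like (the view IDs, the extra key, and the ResultsActivity name are assumptions for illustration, not necessarily what the sample uses):

```java
// Inside MainActivity.onCreate(), after setContentView(...).
final EditText urlInput = (EditText) findViewById(R.id.image_url_input);   // assumed view ID
Button fetchButton = (Button) findViewById(R.id.fetch_results_button);     // assumed view ID

fetchButton.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        String imageUrl = urlInput.getText().toString().trim();
        // Hand the URL to the results screen, which calls Watson and shows the classes.
        Intent intent = new Intent(MainActivity.this, ResultsActivity.class); // assumed Activity name
        intent.putExtra("IMAGE_URL", imageUrl);                               // assumed extra key
        startActivity(intent);
    }
});
```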

Once you get the basic idea of how things work with this sample, you can take it further and build something bigger. Now let’s get our hands dirty with CODE. 💻

  1. Download the sample app from my GitHub
  2. Open the VisualRecognitionSample in Android Studio
  3. Open MainActivity.java and PASTE your API KEY in the highlighted field (see the sketch after this list)
  4. Now run the app in an emulator or any attached Android device
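For reference, the API key usually ends up as a plain string constant in MainActivity.java, something like the line below (the field name is a placeholder; use whatever field the sample highlights):

```java
// Paste the apiKey copied from the service's credentials tab between the quotes.
// Don't commit a real key to a public repository or ship it in a production build.
private static final String API_KEY = "<your-visual-recognition-api-key>";
```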

Using the App

The app is fairly simple to use.

  1. Copy any image link from the web and paste it into the URL field.
  2. Click on the FETCH RESULTS button to go to the results page.
  3. Check out the following screenshots for a couple of examples:

Positive Case

Negative Case

We can see how accurately Watson has identified the contents of the image.

Some sample images to try out:

  1. Fruit-bowl: https://watson-developer-cloud.github.io/doc-tutorial-downloads/visual-recognition/fruitbowl.jpg
  2. Parrot: http://media1.santabanta.com/full6/Nature/Birds/birds-327a.jpg

You can try out any explicit media to verify the results.

Understanding what we just did:

The following piece of code does all the magic for us, but it’s helpful to take a deeper look at what it does behind the scenes. 😃
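In essence, it builds a Visual Recognition client with the API key, points it at the image URL, asks for the explicit classifier, and reads back the class names and scores. Here’s a minimal sketch of that flow, assuming the IBM Watson Java SDK for Visual Recognition v3; exact class and method names differ slightly between SDK versions, and API_KEY / imageUrl are placeholders for the values used in the app:

```java
import android.util.Log;

import com.ibm.watson.developer_cloud.visual_recognition.v3.VisualRecognition;
import com.ibm.watson.developer_cloud.visual_recognition.v3.model.ClassResult;
import com.ibm.watson.developer_cloud.visual_recognition.v3.model.ClassifiedImages;
import com.ibm.watson.developer_cloud.visual_recognition.v3.model.ClassifyOptions;

import java.util.Collections;

// Must run on a background thread (e.g. an AsyncTask): Android forbids
// network calls on the UI thread.
private ClassifiedImages classifyImage(String imageUrl) {
    VisualRecognition service = new VisualRecognition("2016-05-20"); // API version date
    service.setApiKey(API_KEY); // the apiKey copied from the service credentials

    ClassifyOptions options = new ClassifyOptions.Builder()
            .url(imageUrl)                                        // the URL entered in the app
            .classifierIds(Collections.singletonList("explicit")) // built-in explicit-content classifier
            .threshold(0.0f)                                      // report classes regardless of score
            .build();

    ClassifiedImages result = service.classify(options).execute();

    // The response nests images -> classifiers -> classes, each with a name and a score.
    for (ClassResult cls : result.getImages().get(0).getClassifiers().get(0).getClasses()) {
        Log.d("Watson", cls.getClassName() + " : " + cls.getScore());
    }
    return result;
}
```

With the threshold at 0.0f, Watson reports every class it finds along with its confidence, so the app itself decides what score counts as explicit enough to block.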

Please follow this link to understand what each field means.

We’re using the explicit classifier here. But suppose we aren’t interested in whether the content includes nudity, and instead want to know whether it involves food items. In that case, we use the food classifier. Try this out and see how well it works.
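Assuming the same ClassifyOptions builder from the sketch above, that swap is a one-line change:

```java
// "food" is another built-in classifier ID; it scores food-related classes
// instead of explicit content.
ClassifyOptions foodOptions = new ClassifyOptions.Builder()
        .url(imageUrl)
        .classifierIds(Collections.singletonList("food"))
        .build();
```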

You can also create your own classifier on the console by training your own model. Just launch the Custom tool from the IBM Cloud Visual Recognition Dashboard. You’ll be asked to create another service instance. Just follow the steps and you’re good to go.

A full guide to creating custom models is beyond the scope of this blog, but you can follow this DZone link for a more detailed explanation.

What have we learned so far?

In this blog, we’ve demonstrated what IBM Watson can do. We took a brief tour of the Visual Recognition service offered on the Watson platform.

We now have a working sample app that helps us identify whether the contents of an uploaded image are explicit (contains any form of nudity) or not.

Don’t forget to hit the ❤ button! Do share this article if you find it interesting…

