ML Kit Tutorial: How to recognize well-known landmarks in an image(Landmark Recognition)

If you use Google Photos, you have probably seen that the app automatically creates a neat library of your visits to well-known tourist places, complete with captions. The technology powering this feature is machine learning, which can recognize well-known landmarks and generate metadata for them. This technology is now available to mobile developers through ML Kit, in the form of the Landmark Recognition API.

When you pass an image to the Landmark Recognition API, you get back the landmarks that were recognized in it, each landmark's geographic coordinates, and the region of the image in which the landmark was found. Consider the following example:

Landmark Recognition example

For the above image, the Landmark Recognition API produces the following results:

Description: Brugge
Geographic coordinates: 51.207367, 3.226933
Knowledge Graph entity ID: /m/0drjd2
Bounding polygon: (20, 342), (651, 342), (651, 798), (20, 798)
Confidence score: 0.77150935

This information can then be used to automatically generate image metadata, create individualized experiences for users based on the content they share, and more.
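
To make this concrete, here is a minimal sketch in plain Java (no Firebase dependency; LandmarkResult is a hypothetical stand-in for the API's result type, not part of ML Kit) showing how such results might be filtered by confidence before being turned into metadata:

```java
import java.util.ArrayList;
import java.util.List;

public class LandmarkFilter {

    // Hypothetical stand-in for a recognized-landmark result.
    static class LandmarkResult {
        final String description;
        final float confidence;

        LandmarkResult(String description, float confidence) {
            this.description = description;
            this.confidence = confidence;
        }
    }

    // Keep only results whose confidence meets the threshold.
    static List<LandmarkResult> filterByConfidence(List<LandmarkResult> results,
                                                   float threshold) {
        List<LandmarkResult> kept = new ArrayList<>();
        for (LandmarkResult r : results) {
            if (r.confidence >= threshold) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<LandmarkResult> results = new ArrayList<>();
        results.add(new LandmarkResult("Brugge", 0.77150935f));
        results.add(new LandmarkResult("Unknown tower", 0.31f));

        // Only "Brugge" survives a 0.5 threshold.
        for (LandmarkResult r : filterByConfidence(results, 0.5f)) {
            System.out.println(r.description + " (" + r.confidence + ")");
        }
    }
}
```

The threshold itself is an application choice; a stricter cutoff trades recall for fewer false captions.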

In this lesson, we are going to learn how to use the ML Kit Landmark Recognition API to recognize well-known landmarks in an image. This tutorial does not require any prior knowledge of or experience with machine learning, but you should be familiar with Android Studio and its directory structure. If not, you may refer to the Android Studio Project Overview.

Before we start, have a look at what we are going to build in the end:

1. Follow steps 1 to 7 of ML Kit Tutorial: How to detect faces with ML Kit API and identify key facial features. For the Landmark Recognition API, we need only one ML Kit dependency in the app-level build.gradle file.

dependencies {
  // ...
  implementation 'com.google.firebase:firebase-ml-vision:17.0.0'
}

2. If you are going to use the cloud-based API, you need to upgrade your project to the Blaze plan and enable the Cloud Vision API. To do this, follow step 11 of ML Kit Tutorial: How to recognize and extract text in images.

3. Now we can move on to the action part:

Configuring the landmark detector

By default, the landmark detector uses the STABLE version of the model and returns up to 10 results. If you want, you can change these settings with a FirebaseVisionCloudDetectorOptions object as follows:

FirebaseVisionCloudDetectorOptions options =
	new FirebaseVisionCloudDetectorOptions.Builder()
	.setModelType(FirebaseVisionCloudDetectorOptions.LATEST_MODEL)
	.setMaxResults(15)
	.build();

Running the landmark detector

To recognize landmarks in an image, get an instance of FirebaseVisionCloudLandmarkDetector as follows:

FirebaseVisionCloudLandmarkDetector detector = FirebaseVision.getInstance()
	.getVisionCloudLandmarkDetector();
// Or, to change the default settings:
// FirebaseVisionCloudLandmarkDetector detector = FirebaseVision.getInstance()
//   .getVisionCloudLandmarkDetector(options);

Now you can pass the image (a FirebaseVisionImage, created for example with FirebaseVisionImage.fromBitmap(bitmap)) to the detectInImage method as follows:

Task<List<FirebaseVisionCloudLandmark>> result = detector.detectInImage(image)
	.addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionCloudLandmark>>() {
		@Override
		public void onSuccess(List<FirebaseVisionCloudLandmark> firebaseVisionCloudLandmarks) {
			// Task completed successfully
			// ...
		}
	})
	.addOnFailureListener(new OnFailureListener() {
		@Override
		public void onFailure(@NonNull Exception e) {
			// Task failed with an exception
			// ...
		}
	});

All these tasks are defined in CloudLandmarkRecognitionProcessor.java. You can include this file directly in the main Java package for a quick setup. For convenience, you may put it in a separate package folder (in my case, "cloudlandmarkrecognition").

If the cloud landmark recognition operation succeeds, the detector returns a list of FirebaseVisionCloudLandmark objects. Each FirebaseVisionCloudLandmark object represents a landmark that was detected in the image. For each landmark, you can get its bounding coordinates in the input image, the landmark's name, its latitude and longitude, its Knowledge Graph entity ID (if available), and the confidence score of the match. The following example illustrates this:

for (FirebaseVisionCloudLandmark landmark: firebaseVisionCloudLandmarks) {

    Rect bounds = landmark.getBoundingBox();
    String landmarkName = landmark.getLandmark();
    String entityId = landmark.getEntityId();
    float confidence = landmark.getConfidence();

    // Multiple locations are possible, e.g., the location of the depicted
    // landmark and the location the picture was taken.
    for (FirebaseVisionLatLng loc: landmark.getLocations()) {
        double latitude = loc.getLatitude();
        double longitude = loc.getLongitude();
    }
}
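
Since the Firebase classes above only compile inside an Android project, here is a hedged, self-contained sketch in plain Java of a typical next step: picking the highest-confidence landmark and computing the center of its bounding box, e.g. for positioning an overlay label. Box and Landmark are hypothetical stand-ins for Rect and FirebaseVisionCloudLandmark, not the real classes:

```java
import java.util.Arrays;
import java.util.List;

public class BestLandmark {

    // Hypothetical stand-in for android.graphics.Rect.
    static class Box {
        final int left, top, right, bottom;

        Box(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        int centerX() { return (left + right) / 2; }
        int centerY() { return (top + bottom) / 2; }
    }

    // Hypothetical stand-in for FirebaseVisionCloudLandmark.
    static class Landmark {
        final String name;
        final float confidence;
        final Box bounds;

        Landmark(String name, float confidence, Box bounds) {
            this.name = name;
            this.confidence = confidence;
            this.bounds = bounds;
        }
    }

    // Return the landmark with the highest confidence, or null if the list is empty.
    static Landmark best(List<Landmark> landmarks) {
        Landmark best = null;
        for (Landmark l : landmarks) {
            if (best == null || l.confidence > best.confidence) {
                best = l;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Landmark> detected = Arrays.asList(
                new Landmark("Brugge", 0.7715f, new Box(20, 342, 651, 798)),
                new Landmark("Some church", 0.42f, new Box(100, 100, 200, 200)));

        Landmark top = best(detected);
        // Prints "Brugge center: (335, 570)"
        System.out.println(top.name + " center: ("
                + top.bounds.centerX() + ", " + top.bounds.centerY() + ")");
    }
}
```

In the real app, the same selection logic would run inside onSuccess with the actual FirebaseVisionCloudLandmark list.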

This task, along with the methods for rendering landmark information in an associated graphic overlay view, is defined in CloudLandmarkGraphic.java. You can include this file in the same package folder as CloudLandmarkRecognitionProcessor.java.

4. Now we can use CloudLandmarkRecognitionProcessor.java in MainActivity.java like this. This is the full and final code of MainActivity.java.

5. Now run the project. You should see the app working exactly as shown in the video above.

For a quick setup, you may download the project directly from here, or you may refer to this repo for all the source code.

And that's it! You have just learned how to use ML Kit's Landmark Recognition API to detect popular landmarks in an image. This is the fifth tutorial of the ML Kit Tutorial Series. If you have any issues while running the project or setting it up, just leave a comment below.

Author:


Ratul Doley
Entrepreneur and AI researcher, currently learning and working on unsupervised learning and data clustering. Professional Android app developer and designer.

Updated Nov 15, 2018