Text Recognition on Images with Firebase Machine Learning Kit

Hi friends,

I haven't been able to share articles for a while, but today I'm back with a new topic 🙂

At the Google I/O 2019 event, artificial intelligence and machine learning were prominent themes, and the ML Kit SDK for Firebase was introduced for Android and iOS developers.

ML Kit is a mobile SDK that provides Google's machine learning expertise to Android and iOS applications in a powerful and easy-to-use package.

What is ML Kit?

Google announced ML Kit to make developers' jobs easier and mobile apps smarter. With this SDK, you can make your applications smarter even if you don't have professional-level ML or artificial-intelligence knowledge.

The Google technologies behind this SDK are generally as follows:

  • Google Cloud Vision API
  • TensorFlow Lite
  • Android Neural Networks API

What can we do with ML Kit?

ML Kit's main capabilities are as follows:

  • Text recognition
  • Face detection
  • Barcode scanning
  • Image labeling
  • Landmark recognition

Let's create our first app with ML Kit.

Adding Firebase manually to Project:

Configuring AndroidManifest.xml

You must modify your AndroidManifest.xml to enable on-device (offline) machine learning in your Android app.

Add the following meta-data tag inside your application element:

<meta-data
    android:name="com.google.firebase.ml.vision.DEPENDENCIES"
    android:value="ocr" />
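
If your app uses more than one on-device model, the Firebase documentation lets you list them as comma-separated values in the same meta-data entry, for example:

```xml
<!-- Downloads both the OCR and face-detection models at install time -->
<meta-data
    android:name="com.google.firebase.ml.vision.DEPENDENCIES"
    android:value="ocr,face" />
```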

In addition, we must request the camera permission in our application:

<uses-permission android:name="android.permission.CAMERA"/>

Add the following dependencies to your app-level build.gradle file:
implementation 'com.google.firebase:firebase-ml-vision:16.0.0'
implementation 'com.camerakit:camerakit:1.0.0-beta3.11'
implementation 'com.camerakit:jpegkit:0.1.0'
implementation 'org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.3.31'
implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-android:1.0.0'
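
Since we are adding Firebase manually, remember that the SDK also expects the google-services Gradle plugin and your google-services.json file. A typical setup (the plugin version shown here is an assumption from the same era as the dependencies above; adjust as needed) looks like this:

```groovy
// Project-level build.gradle
buildscript {
    dependencies {
        // google-services plugin processes google-services.json
        classpath 'com.google.gms:google-services:4.2.0'
    }
}

// Bottom of the app-level build.gradle
apply plugin: 'com.google.gms.google-services'
```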

Simple Interface Design

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <ImageView
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:background="#3949AB"
        android:contentDescription="select_image_for_text_recognition"
        android:scaleType="centerCrop" />

    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_centerInParent="true"
        android:gravity="center"
        android:padding="30dp"
        android:text="Choose an Image"
        android:textColor="@android:color/white"
        android:textSize="20sp"
        android:textStyle="bold" />

    <ImageView
        android:id="@+id/image_view"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:layout_above="@+id/button_cam"
        android:layout_margin="20dp" />

    <Button
        android:id="@+id/button_cam"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignParentBottom="true"
        android:layout_alignParentLeft="true"
        android:layout_alignParentStart="true"
        android:layout_margin="10dp"
        android:text="Select Picture"
        android:textSize="17sp" />

    <Button
        android:id="@+id/button_text"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_alignParentBottom="true"
        android:layout_alignParentEnd="true"
        android:layout_alignParentRight="true"
        android:layout_margin="10dp"
        android:text="Run"
        android:textSize="17sp" />

</RelativeLayout>

runTextRecognition: checks the supplied bitmap for text. If the image contains text, the resulting FirebaseVisionText object is passed to the processTextRecognitionResult method.

processTextRecognitionResult: parses the text detected by runTextRecognition and displays it on the screen.

runTextRecognition Method
private void runTextRecognition() {
    FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(myBitmap);
    FirebaseVisionTextDetector detector = FirebaseVision.getInstance()
            .getVisionTextDetector();
    mButton.setEnabled(false);
    detector.detectInImage(image)
            .addOnSuccessListener(
                    new OnSuccessListener<FirebaseVisionText>() {
                        @Override
                        public void onSuccess(FirebaseVisionText texts) {
                            mButton.setEnabled(true);
                            processTextRecognitionResult(texts);
                        }
                    })
            .addOnFailureListener(
                    new OnFailureListener() {
                        @Override
                        public void onFailure(@NonNull Exception e) {
                            mButton.setEnabled(true);
                            e.printStackTrace();
                        }
                    });
}

 

processTextRecognitionResult Method
private void processTextRecognitionResult(FirebaseVisionText texts) {

    StringBuilder t = new StringBuilder();

    List<FirebaseVisionText.Block> blocks = texts.getBlocks();
    if (blocks.size() == 0) {
        showToast("No text found");
        return;
    }

    for (int i = 0; i < blocks.size(); i++) {
        t.append(" ").append(blocks.get(i).getText());
    }

    showToast(t.toString());
}
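
The concatenation step above is plain Java. As a standalone sketch (with a List&lt;String&gt; standing in for the text of each FirebaseVisionText.Block, and joinBlockTexts as a hypothetical helper, not part of ML Kit):

```java
import java.util.Arrays;
import java.util.List;

public class JoinBlocks {

    // Mirrors the StringBuilder loop above: append each block's text,
    // separated by a single space. (Illustrative helper only.)
    static String joinBlockTexts(List<String> blockTexts) {
        StringBuilder t = new StringBuilder();
        for (int i = 0; i < blockTexts.size(); i++) {
            if (i > 0) {
                t.append(" ");
            }
            t.append(blockTexts.get(i));
        }
        return t.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinBlockTexts(Arrays.asList("Hello", "World")));
    }
}
```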

private void showToast(String message) {
    Toast.makeText(getApplicationContext(), message, Toast.LENGTH_SHORT).show();
}


As a result, we were able to read the text on the image and display it as a Toast message 🙂

For additional ML Kit features and uses, refer to the Firebase documentation. See you in new and different works 👋

I will share the GitHub link for this project soon.

 

Using these methods, we can perform text recognition with the on-device models.

NOTICE: Only Latin characters are detected by the on-device models. For example, Chinese, Japanese, or Arabic characters cannot be detected this way; you need to use the cloud-based models in an application that detects those scripts.
