OCR on Android

What is OCR?

Optical character recognition (OCR) is the process of automatically identifying, in an image, characters or symbols belonging to a specified alphabet. In this post we will focus on how to use OCR on Android.

Once the text in the image has been recognized, it can be used to:

    • Save it to storage.
    • Process or edit it.
    • Translate it into another language.

The popularity of smartphones, combined with ever-better cameras, has led to an increase in the use of this kind of recognition technique and to a new category of mobile apps that make use of it.

On device or in the cloud?

Before using an OCR library, it is necessary to decide where the OCR process should take place: on the smartphone or in the cloud.

Depending on the app's requirements, each approach has its advantages and disadvantages.

[Image: OCR on Android, cloud or local]

If the app requires, for example, performing character recognition without an internet connection, the OCR engine must run on the device itself. Running locally also avoids sending images to a server, which can be costly because the cameras on current devices take very large photos.

On the other hand, OCR libraries tend to take up a lot of space, since a separate data file must be downloaded for each language to recognize, as we will explain below.

What libraries can be used?

The following Wikipedia link contains a comparison table of OCR libraries, their supported platforms, the programming languages used in their development and other relevant information.

Link: http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software

In this post we are going to use the Tesseract library, which stands out above the rest: it is open source, has an SDK, was created by HP and is currently developed by Google.

OCR on Android using Tesseract Library

Although Tesseract can be run on a Linux server as a cloud service, in this post we will integrate the Tesseract library into an Android app, launching the OCR engine on the device itself.

The original Tesseract project for Android is called Tesseract Android Tools. It contains tools for compiling the Tesseract and Leptonica libraries for use on the Android platform, and a Java API for accessing these natively compiled libraries.

Link: https://github.com/rebbix/tesseract-android-tools/tree/master/tesseract-android-tools

For our example, we are going to use a fork of Tesseract Android Tools, which adds more functionality.

Link: https://github.com/rmtheis/tess-two

OCR Example on Android

Only a few simple steps are needed to perform OCR on Android:

  1. Create a new Android Studio project.
  2. Add the Tesseract library to the project by adding the following lines to build.gradle:
    dependencies {
        compile 'com.rmtheis:tess-two:6.0.0'
    }
  3. Create a class called TessOCR with the following code:
    import android.content.Context;
    import android.graphics.Bitmap;

    import com.googlecode.tesseract.android.TessBaseAPI;

    public class TessOCR {
        private final TessBaseAPI mTess;

        public TessOCR(Context context, String language) {
            mTess = new TessBaseAPI();
            // The data path must contain a "tessdata" folder with the
            // <language>.traineddata file for the requested language inside.
            String datapath = context.getFilesDir() + "/tesseract/";
            mTess.init(datapath, language);
        }

        // Runs the OCR engine on the given bitmap and returns the recognized text.
        public String getOCRResult(Bitmap bitmap) {
            mTess.setImage(bitmap);
            return mTess.getUTF8Text();
        }

        // Releases the native resources held by the OCR engine.
        public void onDestroy() {
            if (mTess != null) mTess.end();
        }
    }
    • The constructor needs a context (for example, the MainActivity context) and the language to recognize, which is used to start the OCR engine. The language must be an ISO 639-2/B code, for example spa (Spanish) or chi (Chinese).
    • Note:
      • To recognize each language, it is necessary to download a trained data file and save it on the device storage. In our case, it will be stored in the app data directory followed by /tesseract/ (a sketch of copying this file from the app assets is included after these steps).
      • The trained data files for each language can be found at the following link: https://github.com/tesseract-ocr/tessdata.
    • The getOCRResult method returns the recognized text from the image we pass as an argument.
  4. Import the TessOCR class created in the previous step into MainActivity and create a new recognition instance with the following line:
    mTessOCR = new TessOCR(this, language);
  5. Add to MainActivity the method that performs the character recognition:
    private void doOCR(final Bitmap bitmap) {
        // ocrView is the Activity that hosts the OCR screen (see the comments below).
        if (mProgressDialog == null) {
            mProgressDialog = ProgressDialog.show(ocrView, "Processing",
                    "Doing OCR...", true);
        } else {
            mProgressDialog.show();
        }
        // Run the recognition on a background thread so the UI does not freeze.
        new Thread(new Runnable() {
            public void run() {
                final String srcText = mTessOCR.getOCRResult(bitmap);
                // Post the result back to the UI thread.
                ocrView.runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        if (srcText != null && !srcText.equals("")) {
                            // srcText contains the recognized text
                        }
                        mTessOCR.onDestroy();
                        mProgressDialog.dismiss();
                    }
                });
            }
        }).start();
    }
    • First, this code shows a progress dialog indicating the recognition status. It then launches a new thread that performs the recognition by calling the getOCRResult method. When recognition finishes, the dialog is dismissed and, if everything has worked properly, the recognized text is available in srcText.
  6. Call the method from step 5 to start recognition:
    doOCR(bitmap);
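
The TessOCR constructor above assumes that the trained data file for the chosen language is already on the device, under the datapath, inside a tessdata folder. The post does not include the code for that step, so here is a minimal sketch of how it could be done, assuming the file (for example spa.traineddata, downloaded from the tessdata repository linked in step 3) is bundled in the app under assets/tessdata/. The class and method names (TessDataManager, prepareTrainedData) and the assets location are illustrative assumptions, not part of the original project:

    import android.content.Context;

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public class TessDataManager {

        // Copies <language>.traineddata from the APK assets to
        // <files dir>/tesseract/tessdata/ so that TessBaseAPI.init() can find it.
        // Returns the datapath used by the TessOCR class above.
        public static String prepareTrainedData(Context context, String language)
                throws IOException {
            String datapath = context.getFilesDir() + "/tesseract/";
            File tessdataDir = new File(datapath + "tessdata/");
            if (!tessdataDir.exists() && !tessdataDir.mkdirs()) {
                throw new IOException("Could not create " + tessdataDir);
            }

            File trainedData = new File(tessdataDir, language + ".traineddata");
            if (!trainedData.exists()) {
                // The file is assumed to be bundled at assets/tessdata/<language>.traineddata
                InputStream in = context.getAssets().open("tessdata/" + language + ".traineddata");
                OutputStream out = new FileOutputStream(trainedData);
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                out.close();
                in.close();
            }
            return datapath;
        }
    }

Calling this before creating the TessOCR instance (for example, TessDataManager.prepareTrainedData(this, "spa"); in MainActivity) helps avoid the "Data path does not exist!" error mentioned in the comments below.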

Considerations

  • Recognition quality may vary depending on the image lighting conditions, camera resolution, text font, text size and other factors.
  • To achieve the highest possible quality, it is very important to center the text in the image and to make sure the image is properly focused. A simple preprocessing sketch follows below.
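
As a simple illustration of these considerations, the sketch below (not part of the original post; OcrPreprocessing and toGrayscale are hypothetical names) converts the input bitmap to grayscale before passing it to doOCR. Tesseract binarizes the image internally in any case, so this is only an optional preprocessing step that can help with some images:

    import android.graphics.Bitmap;
    import android.graphics.Canvas;
    import android.graphics.ColorMatrix;
    import android.graphics.ColorMatrixColorFilter;
    import android.graphics.Paint;

    public class OcrPreprocessing {

        // Returns a grayscale copy of the input bitmap by drawing it
        // through a colour filter with zero saturation.
        public static Bitmap toGrayscale(Bitmap src) {
            Bitmap result = Bitmap.createBitmap(src.getWidth(), src.getHeight(),
                    Bitmap.Config.ARGB_8888);
            Canvas canvas = new Canvas(result);
            ColorMatrix colorMatrix = new ColorMatrix();
            colorMatrix.setSaturation(0f); // drop all colour information
            Paint paint = new Paint();
            paint.setColorFilter(new ColorMatrixColorFilter(colorMatrix));
            canvas.drawBitmap(src, 0, 0, paint);
            return result;
        }
    }

It would be used as doOCR(OcrPreprocessing.toGrayscale(bitmap)); instead of calling doOCR with the raw bitmap.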

Preview using OCR in a translator app

The following video shows part of the app I'm developing for my final degree project (TFG), where I use the OCR techniques described.

[Video: OCR in the translator app]


20 thoughts on “OCR on Android”

    • Hi Paul, thank you for your comment.

In the project in which I wrote this code, ocrView was an activity, because I used an MVP (Model-View-Presenter) pattern and the code included in steps 4 and 5 belongs to a Presenter, not to the MainActivity. I have to update it to avoid confusion.

      Answering your question, you have two options:

      1. Use the code from steps 4 and 5 in a presenter and declare ocrView like below:

      private final OCRActivity ocrView;

      public OCRViewPresenter(OCRActivity view) {
          ocrView = view;
      }

      2. Implement OCR in the activity itself, using this instead of ocrView in the first occurrence and YourActivityName.this in the second one (YourActivityName.this.runOnUiThread).

    • Hi manish, thank you for your comment.

      OCR techniques are likely to be inaccurate if:

      – Lighting conditions are poor.

      – Camera resolution is low.

      – Text font is not big enough.

      So try to achieve the highest possible quality: center the text in the image and focus the image properly. If you have any questions after that, please let me know.

  1. Very good work, congratulations!

    Do you have any material on creating this “mask” in the camera, to indicate where the person needs to align it?

    • Thank you very much Renan. Using a “mask” to help the person align the camera is a good idea, but I do not currently have any material about it. You can look for third-party Android libraries that may implement it 😉

  2. Hello, after extracting the text from the image with tesseract, I want to use the translate API to translate that text. How can I do that? Please help me, I need to hurry because I’m doing my final exam.

  3. Hi, sorry, I want to ask about your material: I don’t understand step 4. Can you explain a little more how to do the 4th step?

  4. Hi David! Thanks for this. May I ask if it is possible to display the translated text in augmented reality, and can you suggest tools that I can use? Thank you 🙂

  5. Hello Mr. David,

    I have stored the trained data inside the assets folder, followed by a tesseract folder,
    but I get the error: IllegalArgumentException: Data path does not exist!

    Thank you for your help in advance.

