What is OCR?
Optical character recognition (OCR) is the process of automatically identifying, in an image, the characters or symbols of a given alphabet. In this post we will focus on how to use OCR on Android.
Once the text in an image has been recognized, it can be used to:
- Save it to storage.
- Process or edit it.
- Translate it into another language.
The popularity of smartphones, combined with ever-better cameras, has led to increased use of these recognition techniques and to a new category of mobile apps that rely on them.
On device or in the cloud?
Before using an OCR library, it is necessary to decide where the OCR process should take place, on the smartphone or in the cloud.
Depending on the app's requirements, each approach has its advantages and disadvantages.
If the app must, for example, recognize characters without an internet connection, the OCR engine has to run on the device itself. This also avoids uploading images to a server, which matters because the cameras on current devices produce large photos.
On the other hand, OCR libraries tend to take up a lot of space, since a data file must be downloaded for each language to be recognized, as we will explain below.
What libraries can be used?
The following Wikipedia article contains a comparison table of OCR libraries, their supported platforms, the programming languages used in their development, and other relevant information.
Link: http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
In this post we are going to use the Tesseract library, which stands out above the rest: it is open source, has an SDK, was created by HP, and is currently developed by Google.
OCR on Android using Tesseract Library
Although Tesseract can run on a Linux server as a cloud service, in this post we will integrate the Tesseract library into an Android app, running the OCR engine on the device itself.
The original Tesseract project for Android is called Tesseract Android Tools. It contains tools for compiling the Tesseract and Leptonica libraries for use on the Android platform, and a Java API for accessing these natively compiled libraries.
Link: https://github.com/rebbix/tesseract-android-tools/tree/master/tesseract-android-tools
For our example, we are going to use a fork of Tesseract Android Tools, which adds more functionality.
Link: https://github.com/rmtheis/tess-two
OCR Example on Android
We need a few simple steps to perform OCR on Android:
- Create a new Android Studio project.
- Add the Tesseract library to the project by adding the following lines to build.gradle:

dependencies {
    compile 'com.rmtheis:tess-two:6.0.0'
}
- Create a class called TessOCR with the following code:
import android.content.Context;
import android.graphics.Bitmap;

import com.googlecode.tesseract.android.TessBaseAPI;

public class TessOCR {

    private final TessBaseAPI mTess;

    public TessOCR(Context context, String language) {
        mTess = new TessBaseAPI();
        // init() expects the datapath to contain a "tessdata" directory
        // holding <language>.traineddata.
        String datapath = context.getFilesDir() + "/tesseract/";
        mTess.init(datapath, language);
    }

    public String getOCRResult(Bitmap bitmap) {
        mTess.setImage(bitmap);
        return mTess.getUTF8Text();
    }

    public void onDestroy() {
        if (mTess != null) mTess.end();
    }
}
- The constructor needs a context (for example, the MainActivity context) and the language to recognize, which is used to start the OCR engine. The language code must be in ISO 639-2/B format, e.g. spa (Spanish) or chi (Chinese).
- Note:
- To recognize each language, a trained-data file must be downloaded and saved to device storage. In our case, it is stored in the app's data directory under /tesseract/.
- The files used to recognize each language can be found at the following link: https://github.com/tesseract-ocr/tessdata.
- The getOCRResult method returns the text recognized in the bitmap passed as an argument.
- Import the TessOCR class created in the previous step into MainActivity and create a new recognition instance with the following line:
mTessOCR = new TessOCR(this, language);
- Add to MainActivity the method that performs the character recognition:
private void doOCR(final Bitmap bitmap) {
    if (mProgressDialog == null) {
        mProgressDialog = ProgressDialog.show(ocrView, "Processing", "Doing OCR...", true);
    } else {
        mProgressDialog.show();
    }
    // Run recognition off the UI thread; getUTF8Text() can take a while.
    new Thread(new Runnable() {
        public void run() {
            final String srcText = mTessOCR.getOCRResult(bitmap);
            ocrView.runOnUiThread(new Runnable() {
                @Override
                public void run() {
                    if (srcText != null && !srcText.equals("")) {
                        // srcText contains the recognized text
                    }
                    mTessOCR.onDestroy();
                    mProgressDialog.dismiss();
                }
            });
        }
    }).start();
}
- First, this code shows a progress dialog indicating the recognition status. It then launches a new thread that performs the recognition by calling the getOCRResult method. When recognition finishes, the dialog is dismissed and, if everything worked properly, the recognized text is available in srcText.
- Call the method from step 5 to start recognition:
doOCR(bitmap);
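One pitfall worth noting: TessBaseAPI.init() fails if the trained-data file has not been copied to device storage first, since it expects the datapath to contain a tessdata directory holding <language>.traineddata. The helper below is a minimal sketch of that copy step, written with plain java.io so the logic is not Android-specific; the class and method names are my own.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class TrainedDataInstaller {

    // Copies a .traineddata stream into <datapath>/tessdata/<language>.traineddata,
    // which is the layout TessBaseAPI.init() expects. Returns the created file.
    public static File installTrainedData(InputStream in, File datapath, String language)
            throws IOException {
        File tessdata = new File(datapath, "tessdata");
        if (!tessdata.exists() && !tessdata.mkdirs()) {
            throw new IOException("Could not create " + tessdata);
        }
        File out = new File(tessdata, language + ".traineddata");
        if (out.exists()) {
            return out; // already installed, nothing to do
        }
        try (OutputStream os = new FileOutputStream(out)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                os.write(buffer, 0, read);
            }
        }
        return out;
    }
}
```

On Android you would call this before constructing TessOCR, for example with an InputStream from context.getAssets().open("tessdata/spa.traineddata") (the asset path is an assumption about your project layout) and new File(context.getFilesDir(), "tesseract") as the datapath.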
Considerations
- Recognition quality may vary depending on the image's lighting conditions, camera resolution, text font, text size, and other factors.
- To achieve the highest possible quality, it is very important to center the text in the image and to make sure the image is properly focused.
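As a sketch of the kind of preprocessing that can help under poor conditions, the helper below binarizes an image with a fixed luminance threshold, which often makes text stand out from a noisy background. It works on a plain int[] of ARGB pixels, so on Android you would move data in and out with Bitmap.getPixels() and Bitmap.setPixels() before calling setImage(); the class name and threshold value are illustrative, not part of Tesseract.

```java
public class OcrPreprocess {

    // Converts ARGB_8888 pixels (as returned by Bitmap.getPixels on Android)
    // to pure black or white using a fixed luminance threshold.
    public static int[] binarize(int[] argbPixels, int threshold) {
        int[] out = new int[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            int p = argbPixels[i];
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            // Standard luma approximation (ITU-R BT.601 weights).
            int luma = (299 * r + 587 * g + 114 * b) / 1000;
            out[i] = (luma >= threshold) ? 0xFFFFFFFF : 0xFF000000;
        }
        return out;
    }
}
```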
Preview using OCR in a translator app
The following video shows part of the app I'm developing for my final degree project (TFG), where I use the OCR techniques described above.
I have an error on ocrView. How do I declare the variable?
Hi Paul, thank you for your comment.
In the project in which I wrote this code, ocrView was an activity, because I used the MVP (Model-View-Presenter) pattern and the code included in steps 4 and 5 belongs to a Presenter, not to the MainActivity. I have to update it to avoid confusion.
Answering your question, you have two options:
1. Use the code from steps 4 and 5 in a presenter, declaring ocrView as below:
private final OCRActivity ocrView;
public OCRViewPresenter(OCRActivity view) {
ocrView = view;
}
2. Implement OCR in the activity itself, using this instead of ocrView in the first occurrence and YourActivityName.this in the second one (YourActivityName.this.runOnUiThread).
Scanning is not accurate, it's giving some other values.
Hi manish, thank you for your comment.
OCR techniques are likely not to be accurate if:
– Lighting conditions are poor.
– Camera resolution is low.
– Text font is not big enough.
So try to achieve the highest possible quality: center the text in the image and focus the image properly. If you have any questions after that, please let me know.
How can I make it work in the cloud?
I could not find any tutorial.
Hello Dusan, thank you for your comment. If you are interested in implementing OCR in a server, there are different alternatives. For example, you can have a look at https://github.com/tleyden/open-ocr, which uses Tesseract and includes a REST API to upload the image you want to scan.
Hope it will be useful.
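For readers wanting a starting point, here is a minimal sketch of calling such a server from Java. It assumes open-ocr exposes a POST /ocr endpoint accepting a JSON body with img_url and engine fields, as described in that project's README; verify the exact contract against the server version you deploy. The class and method names are my own.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class OpenOcrClient {

    // Builds the JSON body for open-ocr's POST /ocr endpoint. The field names
    // (img_url, engine) follow the open-ocr README; check your server version.
    public static String buildOcrRequest(String imageUrl, String engine) {
        return "{\"img_url\":\"" + imageUrl + "\",\"engine\":\"" + engine + "\"}";
    }

    // Posts the request and returns the server's response body (the recognized text).
    public static String scan(String serverBase, String imageUrl) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(serverBase + "/ocr").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(buildOcrRequest(imageUrl, "tesseract").getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream is = conn.getInputStream();
             Scanner s = new Scanner(is, "UTF-8").useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }
}
```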
Very good work, congratulations!
Do you have any material to create this “mask” in the camera? to indicate where the person needs to align the camera
Thank you very much Renan, using a “mask” to help the person to align the camera is a good idea but I do not currently have any material about it. You can look for some Android third-party libraries that may implement it 😉
Hello, after extracting the text from the image with Tesseract, I want to use a translate API to translate that text. How can I do that? Please help me, I need to hurry because I'm doing my final exam.
Hi Trần Hữu Nghị, once Tesseract has processed the image and obtained the text, you will need a translator API to translate it; there are several, such as the ones from Google or Microsoft, among others.
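As an illustration of that second step, the snippet below builds a request URL for Google's Cloud Translation v2 REST endpoint. The endpoint and the q, target, and key parameters follow Google's public documentation, but you still need a valid API key and an HTTP client to perform the call, and the class name is my own.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class TranslateRequest {

    // Builds a GET request URL for the Google Cloud Translation v2 REST API.
    // q = text to translate, target = target language code, key = API key.
    public static String buildUrl(String text, String targetLang, String apiKey)
            throws UnsupportedEncodingException {
        return "https://translation.googleapis.com/language/translate/v2"
                + "?q=" + URLEncoder.encode(text, "UTF-8")
                + "&target=" + URLEncoder.encode(targetLang, "UTF-8")
                + "&key=" + URLEncoder.encode(apiKey, "UTF-8");
    }
}
```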
Hi, sorry, I want to ask about this material: I don't understand step 4. Can you explain a little more how to do the 4th step?
Hi Putri, there was a problem with the code in step 4: it was split into 4 different lines when it should be just one. Have a look at it again.
hi David ! thanks for this , may I ask if is it possible to convert the translated texts to augmented reality , and can you suggest tools that I can use ? Thank you 🙂
Hi Franz, I think that bringing the translated texts into augmented reality could be possible, but I've never tried it before. There are some libraries from Google for working with AR; have a look at https://github.com/google-ar
Hey, Which API is used in this project?
Hi Kamya, do you mean the translator API? There are several ones such as Google and Microsoft translators.
hello david
Is the Arabic language available? And can it then be translated to other languages?
Hi Tri Rahmat, the Arabic language file is at https://github.com/tesseract-ocr/tessdata/blob/master/ara.traineddata, and to translate to other languages you can use any translation API.
Thanks a lot man, your code works, helps me a lot!
Hello Mr. David,
I have stored the trained data inside the assets folder, under a tesseract folder,
but I get the error: IllegalArgumentException: Data path does not exist!
Thank you for your help in advance.