Simple Android OCR source code

Description

This project is about Optical Character Recognition (OCR), the process of classifying optical patterns as alphanumeric or other characters. The OCR process includes segmentation, feature extraction and classification. It converts analog, text-based resources captured in an image into digital text, which can then be used in several ways, for example as searchable text in indexes that identify documents or images. The first stage of text capture is taking a scanned image of a page.

This scanned copy forms the basis for all later stages. The next stage applies OCR to convert the text content into a machine-readable format. OCR analysis takes a digital image of printed or handwritten text as input and converts it to machine-readable digital text. To do so, it breaks the image into smaller components, such as text, word or character blocks. The character blocks are then broken down further and compared with a dictionary of characters.
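
The final step above, comparing a character block with a dictionary of characters, can be illustrated with a toy sketch. This is not how Tesseract classifies characters (its classifier is considerably more sophisticated); the 3x3 templates, the GlyphMatcher class and the Hamming-distance comparison are all assumptions chosen only to show the idea of picking the closest dictionary entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy classifier: matches a 3x3 binary glyph against a dictionary of templates. */
public class GlyphMatcher {
    // Hypothetical 3x3 templates, written row by row; '#' is ink, '.' is background.
    private static final Map<Character, String> TEMPLATES = new LinkedHashMap<>();
    static {
        TEMPLATES.put('I', ".#..#..#.");
        TEMPLATES.put('L', "#..#..###");
        TEMPLATES.put('T', "###.#..#.");
    }

    /** Number of cells in which two equal-length glyphs differ (Hamming distance). */
    static int distance(String a, String b) {
        int d = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                d++;
            }
        }
        return d;
    }

    /** Returns the dictionary character whose template is closest to the glyph. */
    static char classify(String glyph) {
        char best = '?';
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<Character, String> entry : TEMPLATES.entrySet()) {
            int d = distance(glyph, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A noisy 'L': one background cell has been flipped to ink.
        System.out.println(classify("#..#.####"));
    }
}
```

Real input would first need segmentation into character blocks and normalization to the template size, which this sketch leaves out.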

In this Android OCR application there is only one entity, the user. The user registers with basic details and creates login credentials, and can then log in to the application. The application lets the user take a picture with the camera and then uses OCR to convert the text in the image into digital text. Once the text is finalized, the user can save and share the text and the image. They can be shared through Android's default share method or with other users registered in the system, and the text can also be converted to a PDF with password protection.

Modules

  1. Preparing Tesseract
  2. Adding tess-two to Android Studio Project
  3. Tesseract library usage
  4. Android implementation

1. Preparing Tesseract

  1. Download the Tesseract source code from our website.
  2. Extract its content into a tesseract folder.
  3. Make sure the target device runs Android 2.2 or higher.
  4. Download a v3.02 trained data file for a language (the English data, for example).
  5. On the mobile side, extract the data files to a subdirectory named tessdata.
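
As a rough sketch of the last step, the helper below copies a trained data stream into a tessdata subdirectory. The class name TessDataInstaller and its method are hypothetical and are not part of Tesseract or tess-two. On Android, the stream would typically come from the app's assets and the base directory from app storage; both are assumptions here.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper that places a trained data file under <baseDir>/tessdata/. */
public class TessDataInstaller {

    /** Writes the stream to <baseDir>/tessdata/<language>.traineddata and returns that file. */
    static File install(InputStream trainedData, File baseDir, String language) throws IOException {
        File tessdata = new File(baseDir, "tessdata");
        if (!tessdata.isDirectory() && !tessdata.mkdirs()) {
            throw new IOException("Cannot create directory " + tessdata);
        }
        File target = new File(tessdata, language + ".traineddata");
        OutputStream out = new FileOutputStream(target);
        try {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = trainedData.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } finally {
            out.close();
        }
        return target;
    }
}
```

The directory passed as baseDir, the parent of tessdata, is the one Tesseract later needs as its data path.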

To import Tesseract into your Android project, you must build it first:

You need the Android NDK; if you don't have it, install it first. After installing the Android NDK, add its install directory to the Path environment variable: go to Control Panel\System and Security\System, then Advanced system settings, then Environment Variables.
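
One way to confirm that the Path change took effect is to look for the NDK's build script in each Path entry. The class below is a hypothetical check written for this purpose, not part of the NDK. It relies only on the fact that the NDK ships an ndk-build script (ndk-build.cmd on Windows) in its install directory.

```java
import java.io.File;

/** Hypothetical check: is the Android NDK install directory on the search path? */
public class NdkPathCheck {

    /** Returns the first directory in pathValue that contains ndk-build, or null if none does. */
    static File findNdkBuild(String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            // "ndk-build" is the Unix script, "ndk-build.cmd" the Windows one.
            if (new File(dir, "ndk-build").isFile() || new File(dir, "ndk-build.cmd").isFile()) {
                return new File(dir);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File ndk = findNdkBuild(System.getenv("PATH"));
        System.out.println(ndk == null ? "ndk-build not found on Path" : "NDK found in " + ndk);
    }
}
```

Run it from a newly opened terminal, since terminals that were already open keep the old environment.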

2. Adding tess-two to Android Studio Project

After building the tess-two library project, we must import it into the Android application project in Android Studio.

  1. In your Android Studio project tree, add a new directory named "libraries", then add a subdirectory to it named "tess-two".
  2. In Windows Explorer, move the content of the built tess-two project into the tess-two directory under libraries in the Android Studio project.
  3. Add a new build.gradle file in the libraries\tess-two folder:
    - Make sure all build.gradle files in the application project use the same targetSdk version
    - Make sure that the tess-two library has a build.gradle file
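
A build.gradle for the library module might look like the sketch below. It assumes an Eclipse-style tess-two layout (src, res and libs at the module root) and an Android Gradle plugin already configured in the root project. The SDK and build-tools version numbers are placeholders to be matched to the application module, not values taken from this project.

```groovy
// libraries/tess-two/build.gradle (sketch; version numbers are placeholders)
apply plugin: 'com.android.library'

android {
    compileSdkVersion 19            // placeholder: match the application module
    buildToolsVersion "19.1.0"      // placeholder

    defaultConfig {
        minSdkVersion 8             // API level 8 is Android 2.2
        targetSdkVersion 19         // placeholder: same value as the app's build.gradle
    }

    // Assumes the Eclipse-style layout of the built tess-two project.
    sourceSets.main {
        manifest.srcFile 'AndroidManifest.xml'
        java.srcDirs = ['src']
        res.srcDirs = ['res']
        jniLibs.srcDirs = ['libs']  // native .so files produced by the NDK build
    }
}
```

With this directory layout, the module would also need to be listed in settings.gradle as include ':libraries:tess-two' and referenced from the application module's dependencies as project(':libraries:tess-two').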

3. Tesseract library usage

Add the Tesseract library to the file you created and load the trained data:

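import android.graphics.Bitmap;
import android.util.Log;
import com.googlecode.tesseract.android.TessBaseAPI;

public String detectText(Bitmap bitmap) {
    // TessDataManager is this project's helper that prepares the trained data on the device.
    TessDataManager.initTessTrainedData(context);
    TessBaseAPI tessBaseAPI = new TessBaseAPI();
    // init() expects the parent directory of "tessdata", not the .traineddata file itself.
    // Assumption: the helper placed eng.traineddata in <filesDir>/tessdata/.
    String path = context.getFilesDir().getAbsolutePath();
    tessBaseAPI.setDebug(true);
    tessBaseAPI.init(path, "eng"); // Initialize Tesseract with the English trained data
    // For example, to detect digits only
    tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "1234567890");
    tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "!@#$%^&*()_+=-qwertyuiop[]}{POIU" +
            "YTREWQasdASDfghFGHjklJKLl;L:'\"\\|~`xcvXCVbnmBNM,./<>?");
    tessBaseAPI.setImage(bitmap);
    String text = tessBaseAPI.getUTF8Text();
    Log.d(TAG, "Got data: " + text);
    tessBaseAPI.end(); // Release the native resources
    return text;
}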

 

4. Android implementation

  • We still have to take a photo with the camera, or load one from a file.
  • We will make a CameraEngine class that opens the camera hardware and shows the live preview on a SurfaceView.

Now in the MainActivity, we will have to:

  • Show the camera preview on a SurfaceView [onResume]
  • Stop the camera preview and release the camera resource so other apps can use it [onPause]
  • Add two buttons: one for taking a shot (middle), another to focus (right).
  • Add a custom FocusBoxView to crop the region of the camera preview from which text needs to be extracted.
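
The cropping step in the last bullet comes down to mapping the focus box from view coordinates to pixel coordinates in the captured image. The sketch below shows only that arithmetic. The class FocusBoxMapper is hypothetical, and it assumes the preview fills the view with the same orientation and aspect ratio as the captured bitmap, which a real camera setup may not guarantee.

```java
/** Hypothetical helper: maps a focus box from view coordinates to bitmap coordinates. */
public class FocusBoxMapper {

    /**
     * Scales the box (left, top, right, bottom in view pixels) to the bitmap size.
     * Returns {x, y, width, height}, clamped so the rectangle stays inside the bitmap.
     */
    static int[] toBitmapRect(int boxLeft, int boxTop, int boxRight, int boxBottom,
                              int viewWidth, int viewHeight,
                              int bitmapWidth, int bitmapHeight) {
        double scaleX = (double) bitmapWidth / viewWidth;
        double scaleY = (double) bitmapHeight / viewHeight;
        int left = clamp((int) Math.round(boxLeft * scaleX), 0, bitmapWidth - 1);
        int top = clamp((int) Math.round(boxTop * scaleY), 0, bitmapHeight - 1);
        // Keep the rectangle at least one pixel wide and one pixel high.
        int right = clamp((int) Math.round(boxRight * scaleX), left + 1, bitmapWidth);
        int bottom = clamp((int) Math.round(boxBottom * scaleY), top + 1, bitmapHeight);
        return new int[] {left, top, right - left, bottom - top};
    }

    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }
}
```

On Android, the four values can be passed to Bitmap.createBitmap(source, x, y, width, height) to obtain the cropped image that is handed to the OCR step.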

 
