sk-spell

Slovak language support in open source programs

Simple example of how to call the tesseract library

last modified: 8 April 2012



Tesseract provides a C++ library. Version 3.02 (currently available only in svn) will also bring an MSVC library for Windows (a MinGW library can be built too).

So it is time for a simple example that shows how to call tesseract-ocr from your own C++ program. The example can be compiled with gcc.

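Let’s create a file ‘test_simple.cpp’ with the simplest possible code:

#include <cstdio>
#include <cstdlib>
#include <baseapi.h>
#include <allheaders.h>

int main() {
  // [1] create the OCR engine object
  tesseract::TessBaseAPI *myOCR = new tesseract::TessBaseAPI();

  // [2] print library versions
  printf("Tesseract-ocr version: %s\n", myOCR->Version());
  printf("Leptonica version: %s\n", getLeptonicaVersion());

  // [3] initialize tesseract with the English language data
  if (myOCR->Init(NULL, "eng")) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    exit(1);
  }

  // [4] load the image and hand it to tesseract
  Pix *pix = pixRead("phototest.tif");
  if (pix == NULL) {
    fprintf(stderr, "Could not read phototest.tif.\n");
    exit(1);
  }
  myOCR->SetImage(pix);

  // [5] run OCR and print the result; the fixed "%s" format keeps any
  // '%' in the recognized text from being treated as a format specifier
  char *outText = myOCR->GetUTF8Text();
  printf("OCR output:\n\n");
  printf("%s", outText);

  // [6] release everything
  myOCR->Clear();
  myOCR->End();
  delete [] outText;
  pixDestroy(&pix);
  delete myOCR;
  return 0;
}
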
The code consists of these blocks:
  1. creates a tesseract::TessBaseAPI object and stores a pointer to it in myOCR
  2. prints the tesseract-ocr and leptonica versions. This is not required, but it is useful if you plan to file a bug report or ask for help ;-)
  3. initializes tesseract with English (eng) as the OCR language
  4. opens the image file (phototest.tif) and sets it as the image for OCR
  5. runs OCR and prints the output
  6. releases the OCR engine's resources, frees the text buffer and the image, and ends the program.

You can compile test_simple.cpp with this command (you may need to adjust the paths to the tesseract and leptonica header files):

$ g++ test_simple.cpp -o test_simple \
  -I/usr/include/leptonica \
  -I/usr/local/include/tesseract \
  -llept -ltesseract

This creates the executable ‘test_simple’, which you can then run.
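
Steps 5 and 6 of the listing can be made a little more robust. The sketch below is one possible approach, not part of tesseract itself: it wraps the buffer returned by myOCR->GetUTF8Text() in a std::unique_ptr<char[]> so that delete [] happens automatically, and it prints the text through a fixed "%s" format. The names fakeGetUTF8Text, ownText and printText are hypothetical; fakeGetUTF8Text only stands in for the real call so that the sketch compiles without tesseract installed.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <memory>

// Hypothetical stand-in for myOCR->GetUTF8Text(): like the call in the
// listing, it returns a new[]-allocated buffer that the caller must free.
char *fakeGetUTF8Text() {
  const char *sample = "100% sample text\n";
  char *buf = new char[std::strlen(sample) + 1];
  std::strcpy(buf, sample);
  return buf;
}

// Takes ownership of the raw buffer. delete[] runs automatically when the
// returned smart pointer goes out of scope, including on early returns.
std::unique_ptr<char[]> ownText(char *raw) {
  return std::unique_ptr<char[]>(raw);
}

// Prints the text through a fixed "%s" format, so '%' characters in the
// recognized text are never interpreted as format specifiers.
// Returns the number of characters written, as printf does.
int printText(const char *text) {
  assert(text != nullptr);
  return std::printf("%s", text);
}
```

In the real program the call would look like ownText(myOCR->GetUTF8Text()), followed by printText on the pointer returned by get(); the explicit delete [] outText in step 6 is then no longer needed.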

© sk-spell project
