building tesseract from source

July 15, 2013

Thanks to the prior work of Matt Christy at eMOP, I got started building Tesseract from source (on Mac OS X 10.8.4).

Here’s my slightly modified workflow:


svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
cd tesseract-ocr
./autogen.sh
mkdir build
cd build
../configure
make
make install
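
A minimal sketch of the same steps wrapped in one shell function, so that a failure in any step stops the rest. It assumes the checkout already exists; the function name build_out_of_tree is made up for illustration, and make install is left out on purpose so it stays a separate manual step.

```shell
# Sketch: out-of-tree build that stops at the first failing step.
# build_out_of_tree is a hypothetical name; its argument is the path
# to an existing checkout (for example, tesseract-ocr).
build_out_of_tree() {
    src_dir=$1
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$src_dir" &&
        ./autogen.sh &&
        mkdir -p build &&
        cd build &&
        ../configure &&
        make
    )
}
```

Calling it as build_out_of_tree tesseract-ocr runs the steps in order; because they are chained with &&, make never runs if configure fails.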

Recently a makefile changed, and I needed to regenerate the build files, starting at the source code root:


autoreconf --force --install
cd build
../configure
make 
make install

Making a “build” directory makes it easier to keep track of source code changes with svn, since the compiled output stays out of the source tree. I set up my global ignores to ignore the interim files and directories.


vi ~/.subversion/config

Then I uncommented this line and added everything after .DS_Store:


global-ignores = *.o *.lo *.la *.al .libs *.so *.so.[0-9]* *.a *.pyc *.pyo
   *.rej *~ #*# .#* .*.swp .DS_Store *.in build config configure *.cache
   aclocal.m4 m4
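
To check that the edit took effect, here is a rough sketch that prints each active ignore pattern on its own line. It assumes the layout shown above: an uncommented global-ignores line at the start of a line, with indented continuation lines. The helper name show_global_ignores is hypothetical.

```shell
# Sketch: list the active global-ignores patterns, one per line.
# show_global_ignores is a hypothetical helper name; it defaults to the
# per-user Subversion config file.
show_global_ignores() {
    config_file=${1:-"$HOME/.subversion/config"}
    awk '
        # The active (uncommented) setting: strip "global-ignores =".
        /^global-ignores[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            collecting = 1
            print
            next
        }
        # Indented lines right after it continue the value.
        collecting && /^[ \t]/ { print; next }
        # Anything else ends the value.
        { collecting = 0 }
    ' "$config_file" | tr -s ' \t' '\n\n' | sed '/^$/d'
}
```

If the line is still commented out, the function prints nothing, which is a quick hint that the defaults are still in use.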

Now I only see source code files that were added or modified when I check the status:


svn status
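
Unversioned files that don't match an ignore pattern still show up with a ? in the first column. As a small sketch, a filter like the one below keeps only the added (A) and modified (M) entries; the name only_added_or_modified is made up for illustration.

```shell
# Sketch: keep only entries whose first status column is A (added)
# or M (modified). Reads svn status output on standard input.
# only_added_or_modified is a hypothetical helper name.
only_added_or_modified() {
    grep '^[AM]'
}
```

It is meant to be used as svn status | only_added_or_modified. To see what the ignore patterns are hiding, svn status --no-ignore lists those items as well, marked with an I.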