A small C++ implementation of LSTM networks, focused on OCR.

clstm

CLSTM is an implementation of the LSTM recurrent neural network model in C++, using the Eigen library for numerical computations.

Status and scope

CLSTM is mainly in maintenance mode now. It was created at a time when there weren't a lot of good LSTM implementations around, but several good options have become available over the last year. Nevertheless, if you need a small library for text line recognition with few dependencies, CLSTM is still a good option.

Installation using Docker

You can train and run clstm without installing it on the local machine by using the Docker image, which is based on Ubuntu 16.04. This is the best option for running clstm on a Windows host.

You can either run the latest version of the clstm image from Docker Hub or build the Docker image from the repo (see ./docker/Dockerfile).

The command line syntax differs from a native installation:

docker run --rm -it -e [VARIABLES...] kbai/clstm BINARY [ARGS...]

is equivalent to

[VARIABLES...] BINARY [ARGS...]

For example:

docker run --rm -it -e ntrain=1000 kbai/clstm clstmocrtrain traininglist.txt

is equivalent to

ntrain=1000 clstmocrtrain traininglist.txt
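
The mapping between the two forms can be sketched in Python. This is only an illustration: docker_command is a hypothetical helper, not part of clstm, and it assumes each variable is passed with its own -e flag. Making host files such as traininglist.txt visible inside the container (for example with a Docker volume mount) is not covered here.

```python
def docker_command(variables, binary, args, image="kbai/clstm"):
    """Build the argv list for running a clstm binary inside the Docker image.

    `variables` maps parameter names to values; these are the settings that a
    native installation would read from the environment.
    """
    cmd = ["docker", "run", "--rm", "-it"]
    for name, value in variables.items():
        cmd += ["-e", f"{name}={value}"]  # one -e flag per variable
    cmd.append(image)
    cmd.append(binary)
    cmd += list(args)
    return cmd


example = docker_command({"ntrain": 1000}, "clstmocrtrain", ["traininglist.txt"])
print(" ".join(example))
```

The printed command line matches the Docker form of the example above.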

Installation from source

Prerequisites

  • scons, swig, Eigen
  • protocol buffer library and compiler
  • libpng
  • Optional: HDF5, ZMQ, Python

# Ubuntu 15.04, 16.04 / Debian 8, 9
sudo apt-get install scons libprotobuf-dev protobuf-compiler libpng-dev libeigen3-dev swig

# Ubuntu 14.04:
sudo apt-get install scons libprotobuf-dev protobuf-compiler libpng-dev swig

The Debian repositories jessie-backports and stretch include sufficiently new libeigen3-dev packages.

It is also possible to download Eigen with Tensor support (> v3.3-beta1) and copy the header files to an include path:

# with wget
wget 'https://github.com/RLovelett/eigen/archive/3.3-rc1.tar.gz'
tar xf 3.3-rc1.tar.gz
sudo rm -rf /usr/local/include/eigen3
sudo mv eigen-3.3-rc1 /usr/local/include/eigen3
# or with git:
sudo git clone --depth 1 --single-branch --branch 3.3-rc1 \
  "https://github.com/RLovelett/eigen" /usr/local/include/eigen3

To use the visual debugging methods, additionally:

# Ubuntu 15.04:
sudo apt-get install libzmq3-dev libzmq3 libzmqpp-dev libzmqpp3 libpng12-dev

For HDF5, additionally:

# Ubuntu 15.04:
sudo apt-get install hdf5-helpers libhdf5-8 libhdf5-cpp-8 libhdf5-dev python-h5py

# Ubuntu 14.04:
sudo apt-get install hdf5-helpers libhdf5-7 libhdf5-dev python-h5py

Building

To build a standalone C library, run

scons
sudo scons install

There are a bunch of options:

  • debug=1 build with debugging options, no optimization
  • display=1 build with display support for debugging (requires ZMQ, Python)
  • prefix=... install under a different prefix (untested)
  • eigen=... where to look for Eigen include files (should contain Eigen/Eigen)
  • openmp=... build with multi-processing support. Set the OMP_NUM_THREADS environment variable to the number of threads for Eigen to use.
  • hdf5lib=hdf5 what HDF5 library to use; enables HDF5 command line programs (may need hdf5_serial in some environments)

Running the tests

After building the executables, you can try two simple test runs:

  • run-cmu will train an English-to-IPA LSTM
  • run-uw3-500 will download a small OCR training/test set and train an OCR LSTM

There is a full set of tests in the current version of clstm; just run them with:

./run-tests

This will exercise:

  • the gradient checkers for layers and compute steps
  • training a simple model through the C++ API
  • training a simple model through the Python API
  • the command line training tools, including loading and saving

Python bindings

To build the Python extension, run

python setup.py build
sudo python setup.py install

(this is currently broken)

Documentation / Examples

You can find some documentation and examples in the form of IPython notebooks in the misc directory (these are version 3 notebooks and won't open in older versions of IPython).

You can view these notebooks online here: http://nbviewer.ipython.org/github/tmbdev/clstm/tree/master/misc/

C++ API

The clstm library operates on the Sequence type as its fundamental data type, representing variable-length sequences of fixed-length vectors. The underlying Sequence type is a rank-4 tensor with accessors for individual rank-2 tensors at different time steps.

Networks are built from objects implementing the INetwork interface. The INetwork interface contains:

struct INetwork {
    Sequence inputs, d_inputs;      // input sequence, input deltas
    Sequence outputs, d_outputs;    // output sequence, output deltas
    void forward();                 // propagate inputs to outputs
    void backward();                // propagate d_outputs to d_inputs
    void update();                  // update weights from the last backward() step
    void setLearningRate(Float,Float); // set learning rates
    ...
};
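
The calling order of these members can be illustrated with a toy layer in Python. This is a sketch, not clstm code: ToyScale is hypothetical, it holds a single scalar weight and plain lists instead of real Sequence data, and it assumes that d_outputs holds target minus output, so that update() adds the step.

```python
class ToyScale:
    """Hypothetical one-weight layer that mirrors the INetwork member names."""

    def __init__(self, weight=0.0, learning_rate=0.1):
        self.weight = weight
        self.learning_rate = learning_rate
        self.inputs, self.d_inputs = [], []
        self.outputs, self.d_outputs = [], []
        self._d_weight = 0.0

    def forward(self):
        # propagate inputs to outputs
        self.outputs = [self.weight * x for x in self.inputs]

    def backward(self):
        # propagate d_outputs to d_inputs and accumulate the weight delta
        self.d_inputs = [self.weight * d for d in self.d_outputs]
        self._d_weight = sum(x * d for x, d in zip(self.inputs, self.d_outputs))

    def update(self):
        # apply the delta computed by the last backward() call
        self.weight += self.learning_rate * self._d_weight


net = ToyScale(learning_rate=0.05)
xs = [1.0, 2.0, 3.0]
targets = [2.0 * x for x in xs]  # the layer should learn weight == 2
for _ in range(20):
    net.inputs = xs
    net.forward()
    net.d_outputs = [t - y for t, y in zip(targets, net.outputs)]
    net.backward()
    net.update()
print(round(net.weight, 6))
```

Each training step follows the same pattern: set inputs, call forward(), fill d_outputs, call backward(), then update().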

Network structures can be hierarchical, and some network implementations exist only to combine other networks into more complex structures.

struct INetwork {
    ...
    vector<shared_ptr<INetwork>> sub;
    void add(shared_ptr<INetwork> net);
    ...
};

At the lowest level, a layer is created in four steps:

  • create an instance of the layer with make_layer
  • set any parameters (including ninput and noutput) as attributes
  • add any sublayers to the sub vector
  • call initialize()

There are four different functions for constructing layers and networks:

  • make_layer(kind) looks up the constructor and gives you an uninitialized layer
  • layer(kind,ninput,noutput,args,sub) performs all initialization steps in sequence
  • make_net(kind,args) initializes a whole collection of layers at once
  • make_net_init(kind,params) is like make_net, but parameters are given in string form

The layer(kind,ninput,noutput,args,sub) function performs the creation steps listed above (make_layer, setting parameters, adding sublayers, initialize()) in sequence.

Layers and networks are usually passed around as shared_ptr<INetwork>; the typedef Network names this type.

This can be used to construct network architectures in C++ pretty easily. For example, the following creates a network that stacks a softmax output layer on top of a standard LSTM layer:

Network net = layer("Stacked", ninput, noutput, {}, {
    layer("LSTM", ninput, nhidden,{},{}),
    layer("SoftmaxLayer", nhidden, noutput,{},{})
});

Note that you need to make sure that the number of input and output units are consistent between layers.
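
The consistency requirement can be checked mechanically before building a network. The following Python sketch is independent of clstm: check_stack is a hypothetical helper, and describing each layer as a (kind, ninput, noutput) tuple is an assumption made for illustration only.

```python
def check_stack(ninput, noutput, layers):
    """Verify that sizes chain correctly through a stack of layers.

    `layers` is a list of (kind, ninput, noutput) tuples in stacking order.
    Raises ValueError on the first mismatch, returns True otherwise.
    """
    expected = ninput
    for kind, n_in, n_out in layers:
        if n_in != expected:
            raise ValueError(
                f"{kind}: expects {n_in} inputs, but the previous size is {expected}")
        expected = n_out
    if expected != noutput:
        raise ValueError(f"stack ends with {expected} outputs, expected {noutput}")
    return True


ninput, nhidden, noutput = 48, 100, 74
print(check_stack(ninput, noutput, [
    ("LSTM", ninput, nhidden),
    ("SoftmaxLayer", nhidden, noutput),
]))
```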

In addition to these basic functions, there is also a small implementation of CTC alignment.
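
To show what CTC-style outputs look like once a network has run, here is a minimal Python sketch of best-path decoding. It is not the alignment code in clstm; it is the common "collapse repeats, then drop blanks" rule, and treating class 0 as the blank is an assumption of this sketch.

```python
def ctc_best_path(frame_classes, blank=0):
    """Collapse a per-frame class sequence into a label sequence.

    Consecutive repeats of the same class are merged, and the blank class is
    dropped. A blank between two equal labels keeps them separate.
    """
    decoded = []
    previous = blank
    for c in frame_classes:
        if c != blank and c != previous:
            decoded.append(c)
        previous = c
    return decoded


# Per-frame argmax classes for eight time steps.
print(ctc_best_path([0, 1, 1, 0, 1, 2, 2, 0]))
```

In this example the two runs of class 1 are separated by a blank, so both are kept, while the repeated 2 is merged.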

The C++ code roughly follows the lstm.py implementation from the Python version of OCRopus. Gradients have been verified for the core LSTM implementation, although there may still be bugs in other parts of the code.

There is also a small multidimensional array class in multidim.h; that isn't used in the core LSTM implementation, but it is used in debugging and testing code, for plotting, and for HDF5 input/output. Unlike Eigen, it uses standard C/C++ row major element order, as libraries like HDF5 expect. (NB: This will be replaced with Eigen::Tensor.)
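
The difference between the two element orders is easiest to see from the flat offsets. This Python sketch is independent of multidim.h and Eigen; the function names are made up for illustration. Eigen's default storage order is column-major.

```python
def row_major_offset(i, j, rows, cols):
    """Flat offset of element (i, j) when rows are stored contiguously (C order)."""
    return i * cols + j


def col_major_offset(i, j, rows, cols):
    """Flat offset of element (i, j) when columns are stored contiguously."""
    return j * rows + i


matrix = [[1, 2, 3],
          [4, 5, 6]]
rows, cols = 2, 3
row_major = [0] * (rows * cols)
col_major = [0] * (rows * cols)
for i in range(rows):
    for j in range(cols):
        row_major[row_major_offset(i, j, rows, cols)] = matrix[i][j]
        col_major[col_major_offset(i, j, rows, cols)] = matrix[i][j]
print(row_major)
print(col_major)
```

The same 2x3 matrix yields two different flat buffers, which is why data has to be reordered (or the layout declared) when moving between the two conventions.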

LSTM models are stored in protocol buffer format (clstm.proto), although adding new formats is easy. There is an older HDF5-based storage format.

Python API

The clstm.i file implements a simple Python interface to clstm, plus a wrapper that makes an INetwork mostly usable as a replacement for the lstm.py implementation from ocropy.

Command Line Drivers

There are several command line drivers:

  • clstmfiltertrain training-data test-data learns text filters;
    • input files consist of lines of the form "input<tab>output<nl>"
  • clstmfilter applies learned text filters
  • clstmocrtrain training-images test-images learns OCR (or image-to-text) transformations;
    • input files are lists of text line images; the corresponding UTF-8 ground truth is expected in the corresponding .gt.txt file
  • clstmocr applies learned OCR models
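
Before starting a long clstmocrtrain run, it can help to confirm that every listed image has its ground truth file. The Python sketch below is not part of clstm. It assumes the ground truth name is obtained by cutting the image file name at its first dot (so 010001.bin.png pairs with 010001.gt.txt); check this against your data layout.

```python
import os


def ground_truth_path(image_path):
    """Return the assumed .gt.txt path for a text line image."""
    directory, filename = os.path.split(image_path)
    stem = filename.split(".", 1)[0]  # "010001.bin.png" -> "010001"
    return os.path.join(directory, stem + ".gt.txt")


def missing_ground_truth(image_paths):
    """Return the image paths whose assumed ground truth file does not exist."""
    return [p for p in image_paths if not os.path.exists(ground_truth_path(p))]


print(ground_truth_path("book/0001/010001.bin.png"))
```

Passing the lines of a training list to missing_ground_truth gives the entries that would need attention before training.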

In addition, you get the following HDF5-based commands:

  • clstmseq learns sequence-to-sequence mappings
  • clstmctc learns sequence-to-string mappings using CTC alignment
  • clstmtext learns string-to-string transformations

Note that most parameters are passed through the environment:

lrate=3e-5 clstmctc uw3-dew.h5

See the notebooks in the misc/ subdirectory for documentation on the parameters and examples of usage.

(You can find all parameters via grep 'get.env' *.cc.)
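
When the drivers are launched from a script, the same environment-based parameter passing can be done with Python's subprocess module. This is a sketch under assumptions: run_with_params is a hypothetical helper, and a small stand-in child process is used here instead of a real clstm binary so that the example runs anywhere.

```python
import os
import subprocess
import sys


def run_with_params(command, params):
    """Run `command` with `params` added to a copy of the current environment."""
    env = dict(os.environ)
    env.update({name: str(value) for name, value in params.items()})
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# Stand-in child that echoes one parameter back; a real call would use
# something like ["clstmctc", "uw3-dew.h5"] as the command instead.
child = [sys.executable, "-c", "import os; print(os.environ['lrate'])"]
result = run_with_params(child, {"lrate": "3e-5"})
print(result.stdout.strip())
```

Other environment variables mentioned in this document, such as OMP_NUM_THREADS, could be passed the same way.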

Comments
  • Other language trial report

    With 500-character subset of Japanese Kanji, clstm works fine. (hidden_nodes = 100, MacBook, gcc48 from homebrew) Your kid reads Japanese brilliantly.

    I am now trying a 2492-character subset. It seems it will take several weeks (hidden=200 this time). (No, nhidden = 200 seems to be hopeless: he/she seems to learn one character by forgetting another.)

    Now trying 3700 characters (a slightly bigger Tesseract jp dataset) with nhidden = 800 and nhidden = 1200. Unless my PC breaks, I will see the result next spring.

    opened by isaomatsunami 11
  • deprecation of genericLSTM

    I have seen that genericLSTM has been "softly" deprecated in the latest version. Could you please explain the reason for this choice?

    opened by apbard 10
  • clstmocr - error opening clstm file (trained model)

    I have a uw3 model that was trained for about 4000 iterations. When I run clstmocr, I get an error:

    ./clstmocr image.jpg 
    #: load = uw3-500-4000.clstm
    #: conf = 0
    #: output = text
    #: save_text = 1
    FATAL: error on open
    

    Any ideas?

    opened by lomograb 8
  • Training on top of an existing model

    Hi there, I am trying to train a new clstm model on more than 1000 lines, and the training process will take days. My plan is to train for a couple of hours a day and continue training the next day. I created an arabic-8000.clstm model for testing and added this to the script: load=arabic-8000.clstm start=8000

    But the problem is that clstmocrtrain starts from 0 all over again. Waiting for your reply

    opened by ghost 8
  • how to specify cpu cores to speed up training

    if i want to run the following tests

    
    #!/bin/bash
    set -x
    set -a
    test -d book || {
        wget -nd http://tmbdev.net/ocrdata/uw3-500.tgz
        tar -xzf uw3-500.tgz
    }
    find book -name '*.bin.png' | sort -r > uw3-all
    sed 1,50d uw3-all > uw3-train
    sed 50q uw3-all > uw3-test
    report_every=10
    save_every=1000
    ntrain=200000
    dewarp=center
    display_every=10
    test_every=10000
    display_every=100
    testset=uw3-test.h5
    hidden=800
    lrate=1e-4
    save_name=uw3-500
    report_time=1
    # gdb --ex run --args \
    ./clstmocrtrain uw3-train uw3-test
    

    Let's say I have 32 cores. How do I specify the number of CPU cores to speed up training?

    opened by wanghaisheng 8
  • Illegal instruction

    Hello, I loaded your code and it's amazing! I know little about it, so I have some questions. When I run ./test-lstm, it gives this error:

    test-lstm.cc:80:3: error: use of undeclared identifier 'unlink'; did you mean 'inline'?

    After some googling, I added the header #include <unistd.h>, which solved this problem. Then I ran ./test-filter.sh, which gives another error:

    ./test-filter.sh: line 7: 26632 Illegal instruction: 4 hidden=20 ntrain=1001 neps=0 report_every=200 save_every=1000 lrate=1e-2 save_name=_filter ./clstmfiltertrain _filter.txt
    clstmfilter FAILED

    What can I do about it?

    opened by morusu 8
  • load pretrained model error

    
    >>>>>>> ./test-lstm
    #: ntrain = 100000
    #: ntest = 1000
    #: gpu = -1
    training 1:4:2 network to learn delay
    .Stacked <<<0.0001 0.9 in 20 1 out 20 2>>>
    .Stacked    inputs 20 1 1 Seq:[0.000000|0.400000|1.000000:20][-0.030502|-0.001325|0.025830:20]
    .Stacked    outputs 20 2 1 Seq:[0.000045|0.500000|0.999955:40][-0.003537|0.000000|0.003537:40]
    .Stacked.NPLSTM <<<0.0001 0.9 in 20 1 out 20 4>>>
    .Stacked.NPLSTM    WCI 4 6 Bat:[-2.371926|-0.180369|1.806145:24][-0.042775|0.002258|0.046467:24]
    .Stacked.NPLSTM    WGF 4 6 Bat:[-1.157534|0.080894|1.083409:24][-0.006257|0.002911|0.012813:24]
    .Stacked.NPLSTM    WGI 4 6 Bat:[-0.212294|0.408031|2.245568:24][-0.002479|0.003996|0.014401:24]
    .Stacked.NPLSTM    WGO 4 6 Bat:[-2.919670|0.557016|3.321400:24][-0.018948|0.007477|0.029906:24]
    .Stacked.NPLSTM    inputs 20 1 1 Seq:[0.000000|0.400000|1.000000:20][-0.030502|-0.001325|0.025830:20]
    .Stacked.NPLSTM    outputs 20 4 1 Seq:[-0.953226|0.038960|0.739090:80][-0.043424|0.000437|0.023548:80]
    .Stacked.NPLSTM    ci 20 4 1 Seq:[-0.999453|-0.009550|0.961935:80][-0.009550|-0.000172|0.014207:80]
    .Stacked.NPLSTM    gf 20 4 1 Seq:[0.103049|0.455415|0.836064:80][-0.001099|-0.000045|0.001491:80]
    .Stacked.NPLSTM    gi 20 4 1 Seq:[0.691834|0.832398|0.977853:80][-0.000271|0.000066|0.000974:80]
    .Stacked.NPLSTM    go 20 4 1 Seq:[0.063462|0.769145|0.996185:80][-0.001031|0.000088|0.004340:80]
    .Stacked.NPLSTM    source 20 5 1 Seq:[-0.953226|0.096544|1.000000:100][-0.030502|-0.000562|0.027306:100]
    .Stacked.NPLSTM    state 20 4 1 Seq:[-1.978872|-0.030874|1.210067:80][-0.010696|-0.000545|0.015035:80]
    .Stacked.SoftmaxLayer <<<0.0001 0.9 in 20 4 out 20 2>>>
    .Stacked.SoftmaxLayer    W1 2 5 Bat:[-6.143187|-0.003997|6.134448:10][-0.061927|-0.000000|0.061927:10]
    .Stacked.SoftmaxLayer    inputs 20 4 1 Seq:[-0.953226|0.038960|0.739090:80][-0.043424|0.000437|0.023548:80]
    .Stacked.SoftmaxLayer    outputs 20 2 1 Seq:[0.000045|0.500000|0.999955:40][-0.003537|0.000000|0.003537:40]
    #: verbose = 0
    OK (pre-save) 0.00620409
    saving
    loading
    OK 0.00620409
    nparams 106
    OK (params) 0.00620409
    OK (hacked-params) 0.5
    OK (restored-params) 0.00620409
    
    real    0m11.372s
    user    0m11.368s
    sys     0m0.005s
    
    >>>>>>> ./test-xps-third.sh
    #: ntrain = 400000
    #: save_name = xps-total
    #: report_time = 0
    #: charsep =
    got 899 files, 50 tests
    #: load = xps-391100.clstm
    .Stacked: 0.0001 0.9 in 0 48 out 0 5702
    .Stacked.Parallel: 0.0001 0.9 in 0 48 out 0 200
    .Stacked.Parallel.NPLSTM: 0.0001 0.9 in 0 48 out 0 100
    .Stacked.Parallel.Reversed: 0.0001 0.9 in 0 48 out 0 100
    .Stacked.Parallel.Reversed.NPLSTM: 0.0001 0.9 in 0 48 out 0 100
    .Stacked.SoftmaxLayer: 0.0001 0.9 in 0 200 out 0 5702
    #: start = -1
    start 391101
    #: test_every = 1000
    #: save_every = 1000
    #: report_every = 1
    #: display_every = 1000
    clstmocrtrain: clstm.cc:231: void ocropus::Codec::encode(ocropus::Classes&, const wstring&): Assertion `encoder->count(c) > 0' failed.
    ./test-xps-third.sh: line 20:  1284 Aborted                 (core dumped) ./clstmocrtrain xps-train-total xps-test-total
    
    >>>>>>> echo TEST FAILED
    TEST FAILED
    
    
    
    opened by wanghaisheng 7
  • loading of previously trained models

    Hi, I have an issue concerning the master branch, where loading of trained models into clstmocrtrain is implemented but not working (for me). I have audited the code but cannot find what causes the following error when a load and/or start parameter is given:

    got 950 files, 50 tests
    .Stacked: 0.0001 0.9 in 0 48 out 0 74
    FATAL: missing parameter

    Best wishes

    opened by stexandev 7
  • clstm.i and clstm.h are not in sync

    clstm_wrap.cpp:11671:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
       if (arg1) (arg1)->d_inputs = *arg2;
                         ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11705:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
       result = (Sequence *)& ((arg1)->d_inputs);
                                       ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11823:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
       if (arg1) (arg1)->d_outputs = *arg2;
                         ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11857:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
       result = (Sequence *)& ((arg1)->d_outputs);
                                       ^
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_set(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11671:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
       if (arg1) (arg1)->d_inputs = *arg2;
                         ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11705:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
       result = (Sequence *)& ((arg1)->d_inputs);
                                       ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11823:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
       if (arg1) (arg1)->d_outputs = *arg2;
                         ^
    clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
    clstm_wrap.cpp:11857:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
       result = (Sequence *)& ((arg1)->d_outputs);
                                       ^
    
    opened by futurely 7
  • error: use of undeclared identifier 'environ'

    Hello,

    As I tried to install Kraken, I found a bug with the clstm dependency installation.

    I tried with Python 3 and Python 2, but the result remains the same. I use homebrewed python on macOS 10.13.3

    Here is the output I get while running pip install clstm

    Output
    Collecting clstm
      Using cached clstm-0.0.5.tar.gz
    Requirement already satisfied: numpy>=1.9.0 in /usr/local/lib/python3.6/site-packages (from clstm)
    Building wheels for collected packages: clstm
      Running setup.py bdist_wheel for clstm: started
      Running setup.py bdist_wheel for clstm: finished with status 'error'
      Complete output from command /usr/local/opt/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/35/8h9cj97x2sg8pxmk6l73fn8w0000gn/T/pip-build-rncsaq3b/clstm/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /var/folders/35/8h9cj97x2sg8pxmk6l73fn8w0000gn/T/tmpi5jp5zoypip-wheel- --python-tag cp36:
      making proto file
      clstm.proto: No such file or directory
      running bdist_wheel
      running build_ext
      building '_clstm' extension
      swigging clstm.i to clstm_wrap.cpp
      swig -python -c++ -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -o clstm_wrap.cpp clstm.i
      clstm.i:35: Warning 451: Setting a const char * variable may leak memory.
      creating build
      creating build/temp.macosx-10.13-x86_64-3.6
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm_wrap.cpp -o build/temp.macosx-10.13-x86_64-3.6/clstm_wrap.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm.cc -o build/temp.macosx-10.13-x86_64-3.6/clstm.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm_prefab.cc -o build/temp.macosx-10.13-x86_64-3.6/clstm_prefab.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c extras.cc -o build/temp.macosx-10.13-x86_64-3.6/extras.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
      extras.cc:679:17: error: use of undeclared identifier 'environ'
          char **ep = environ;
                      ^
      1 error generated.
      error: command 'clang' failed with exit status 1
      
      ----------------------------------------
      Running setup.py clean for clstm
    Failed to build clstm
    Installing collected packages: clstm
      Running setup.py install for clstm: started
        Running setup.py install for clstm: finished with status 'error'
        Complete output from command /usr/local/opt/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/35/8h9cj97x2sg8pxmk6l73fn8w0000gn/T/pip-build-rncsaq3b/clstm/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /var/folders/35/8h9cj97x2sg8pxmk6l73fn8w0000gn/T/pip-noa4v0f4-record/install-record.txt --single-version-externally-managed --compile:
        making proto file
        clstm.proto: No such file or directory
        running install
        running build
        running build_ext
        building '_clstm' extension
        swigging clstm.i to clstm_wrap.cpp
        swig -python -c++ -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -o clstm_wrap.cpp clstm.i
        clstm.i:35: Warning 451: Setting a const char * variable may leak memory.
        creating build
        creating build/temp.macosx-10.13-x86_64-3.6
        clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm_wrap.cpp -o build/temp.macosx-10.13-x86_64-3.6/clstm_wrap.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
        clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm.cc -o build/temp.macosx-10.13-x86_64-3.6/clstm.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
        clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c clstm_prefab.cc -o build/temp.macosx-10.13-x86_64-3.6/clstm_prefab.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
        clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/lib/python3.6/site-packages/numpy/core/include -c extras.cc -o build/temp.macosx-10.13-x86_64-3.6/extras.o -std=c++11 -w -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION="\"unknown\""
        extras.cc:679:17: error: use of undeclared identifier 'environ'
            char **ep = environ;
                        ^
        1 error generated.
        error: command 'clang' failed with exit status 1
        
        ----------------------------------------
    
    opened by loranger 6
  • import error in clstm python module

    I tried to import the clstm Python module, but the following error occurs:

    ImportError                               Traceback (most recent call last)
    <ipython-input-1-2e5fe7a7c3df> in <module>()
    ----> 1 import clstm
    
    /home/kendemu/clstm/clstm.py in <module>()
         30                 fp.close()
         31             return _mod
    ---> 32     _clstm = swig_import_helper()
         33     del swig_import_helper
         34 else:
    
    /home/kendemu/clstm/clstm.py in swig_import_helper()
         22             fp, pathname, description = imp.find_module('_clstm', [dirname(__file__)])
         23         except ImportError:
    ---> 24             import _clstm
         25             return _clstm
         26         if fp is not None:
    
    ImportError: No module named _clstm
    

    How can I solve this error? I ran scons and sudo scons install, and I saw in the README that the python setup.py build command is broken.

    opened by kendemu 5
  • How to make predictions using python code

    Hi, taking this example as a reference I wrote the following code:

    import clstm
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    import os
    from scipy.ndimage import filters
    
    def decode2(pred, codec, threshold = .5):
        eps = filters.gaussian_filter(pred[:,0,0],2,mode='nearest')
        loc = (np.roll(eps,-1)>eps) & (np.roll(eps,1)>eps) & (eps<threshold)
        classes = np.argmax(pred,axis=1)[:,0]
        codes = classes[loc]
        chars = [chr(codec[c]) for c in codes]
        return "".join(chars)    
    
    def decode1(pred, codec):
        classes = np.argmax(pred,axis=1)[:,0]
        print(classes)
        codes = classes[(classes!=0) & (np.roll(classes,1)==0)]
        #[print(int(c)) for c in codes]
        chars = [codec.decode(int(c)) for c in codes]
        return "".join(chars)
    
    img_name="new-22_mcrop8.png"
    
    img=cv2.imread(img_name, 0)
    h=img.shape[0]
    img = img.T.reshape(img.shape[0]*img.shape[1])
    print("img.shape", img.shape)
    
    net = clstm.load_net("model-180000.clstm")
    print(clstm.network_info(net))
    
    noutput=net.codec.size()
    ninput=h
    print("in, out: ",ninput,noutput)
    
    #plt.imshow(img.reshape(h,-1))
    #plt.show()
    
    print("img.shape:", img.shape)
    xs = np.array(img.reshape(-1,h,1),'f')
    
    #plt.imshow(xs.reshape(-1,h).T,cmap=plt.cm.gray)
    #plt.show()
    
    print("xs.shape", xs.shape)
    net.inputs.aset(xs)
    net.forward()
    pred = net.outputs.array()
    print("pred.shape", pred.shape)
    
    #plt.imshow(pred.reshape(-1,noutput).T, interpolation='none')
    #plt.show()
    
    codec = [net.codec.decode(i) for i in range(net.codec.size())]
    print("codec: ", codec)
    
    print(decode1(pred, net.codec))
    

    This is the image:

    new-22_mcrop8

    and the expected output is obviously TRENTO. This is the model:

    model-180000.clstm.zip

    The model was trained with clstmocrtrain and works perfectly when I use clstmocr.

    I had a look at the c++ code and I can see this:

    raw() = -raw() + Float(1.0);
    [...]
    normalizer->normalize(image, raw);
    

    before the forward call. Maybe this is the problem. Can I call these from Python (I doubt it)? Do I have to rewrite them in Python? Is there a simpler way that I missed, like a predict() method? Should I have a look at Kraken?

    Thanks for any suggestion.

    opened by lorenzob 0
  • question: clstmocrtrain on GPU

    Many thanks for all the superb work! I am trying to run clstmocrtrain on CUDA, but it doesn't seem to work. I compiled everything with "scons -j7 gpu=1" (I have 8 cores). No errors were produced (it took me a bit of time to get here, but compilation worked).

    Now, running gpu=1 clstmocrtrain plus the rest of the parameters seems to occupy the CPU only, and nothing shows up on the GPU.

    My question is how to validate that everything is set up correctly, and why clstmocrtrain is not running on the GPU. If you would recommend another GPU-enabled tool for generating a valid LSTM model for Ocropy, I'm very open to that. I will try Kraken, but wanted to try clstmocrtrain first.

    Environment: Mint 18.2 (ubuntu 16.04),

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 384.130                Driver Version: 384.130                   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Quadro M1200        Off  | 00000000:01:00.0 Off |                  N/A |
    | N/A   47C    P8    N/A /  N/A |      8MiB /  4044MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0     10592      G   /usr/lib/xorg/Xorg                             7MiB |
    +-----------------------------------------------------------------------------+

    top - 03:04:28 up 39 min,  8 users,  load average: 2,86, 3,37, 2,77
    Tasks: 320 total,   3 running, 316 sleeping,   0 stopped,   1 zombie
    %Cpu(s): 23,1 us,  1,5 sy,  0,0 ni, 74,7 id,  0,6 wa,  0,0 hi,  0,1 si,  0,0 st
    KiB Mem : 32787312 total, 27967764 free,  1509580 used,  3309968 buff/cache
    KiB Swap:        0 total,        0 free,        0 used. 29756676 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    10642 myself    20   0   40120  13536   5696 R 100,0  0,0  23:28.40 clstmocrtrain

    command: saventrain=201 hidden=50 lrate=1e-2 save_name=gpu-test ~/OCR/clstm/clstmocrtrain -v ./manifest
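
One way to check the situation described above (whether clstmocrtrain shows up as a GPU compute process at all) is to ask nvidia-smi for its list of compute applications and look for the binary's name. The following is only a sketch: the helper names `parse_compute_apps`, `uses_gpu`, and `query_gpu` are made up for illustration, and it assumes an nvidia-smi version that supports `--query-compute-apps`. It says nothing about why a gpu=1 build might still run on the CPU.

```python
import subprocess

# nvidia-smi query that lists compute (CUDA) processes as CSV rows.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into a list of dicts."""
    apps = []
    for line in csv_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # skip blank or unexpected lines
        apps.append({"pid": int(fields[0]), "name": fields[1], "memory": fields[2]})
    return apps


def uses_gpu(process_name, csv_text):
    """Return True if a listed compute process name ends with process_name."""
    return any(app["name"].endswith(process_name)
               for app in parse_compute_apps(csv_text))


def query_gpu():
    """Run nvidia-smi and return its CSV output (requires an NVIDIA driver)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
```

While training is running, `uses_gpu("clstmocrtrain", query_gpu())` would be expected to return False in a situation like the one above, where only Xorg is listed as a GPU process.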

    Thank you in advance for help.

    opened by bugsyb 0
  • Inaccurate training


    I prepared the test data the same way as for ocropy and ran training with this command:

    save_name=models/model test_every=1001 save_every=1000 nhidden=1 display_every=1 report_every=1 clstmocrtrain trainfiles.txt testfiles.txt

    However, the results are still inaccurate even after iteration 160000. Can anyone tell me what I'm doing wrong?
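
Since clstmocrtrain takes its parameters as environment variables placed in front of the command, a run like this can also be launched from a script, which makes it easier to vary one parameter at a time. A minimal sketch follows; `build_env` and `run_training` are hypothetical helper names, and the parameter names are simply the ones quoted in this report.

```python
import os
import subprocess


def build_env(params, base=None):
    """Return a copy of the environment with params added as strings."""
    env = dict(os.environ if base is None else base)
    env.update({key: str(value) for key, value in params.items()})
    return env


def run_training(params, file_lists, binary="clstmocrtrain"):
    """Launch the trainer, equivalent to `key=value ... clstmocrtrain files...`."""
    return subprocess.run([binary, *file_lists], env=build_env(params), check=True)


# Example call (not executed here), mirroring the command from this report:
# run_training(
#     {"save_name": "models/model", "test_every": 1001, "save_every": 1000,
#      "nhidden": 1, "display_every": 1, "report_every": 1},
#     ["trainfiles.txt", "testfiles.txt"],
# )
```

Keeping the parameters in a dictionary makes it straightforward to repeat the run with a different value, for example for nhidden, and compare the reported results.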

    opened by Programmer888 0
  • Segmentation Fault when running tests


    I downloaded and installed this project from the master-branch-pre-devices-merge release. I undertook these steps:

    sudo apt-get install scons libprotobuf-dev protobuf-compiler libpng-dev libeigen3-dev swig
    scons
    sudo scons install
    

    The program reports "Segmentation fault (core dumped)" when I run ./run-cmu.

    opened by Programmer888 1
  • How to optimally prepare the data


    Hi, I'd like to know what the recommended/optimal data preparation is for training (and for recognition, if different).

    For example:

    • is it better to use a grayscale image or a binary one?
    • is it better to leave some white margins (left/right, top/bottom) or to trim tight to the text(*)? In the first case, should "target_height" include the margin or not?
    • does it perform some kind of text straightening/dewarping, or should I do it myself?
    • any other things to consider?

    Thanks.

    (*) I'm asking this because when I started using it, the very first letter was often discarded, and adding some white margin seemed to fix it. But I was using very little data, so maybe it was just a coincidence. The uw3 samples also have a small border.
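
For reference, the white-margin workaround mentioned in the footnote can be sketched as below. This is only an illustration with a hypothetical helper `pad_line_image`; it assumes a grayscale line image stored as a list of pixel rows with 255 as white background, and it does not claim that clstm needs or expects such a margin.

```python
def pad_line_image(rows, margin, white=255):
    """Return a copy of a grayscale image (list of rows) with a white border."""
    if margin < 0:
        raise ValueError("margin must be non-negative")
    width = len(rows[0]) if rows else 0
    blank = [white] * (width + 2 * margin)   # one full-width white row
    side = [white] * margin                  # white pixels added left and right
    padded = [list(blank) for _ in range(margin)]
    padded += [side + list(row) + side for row in rows]
    padded += [list(blank) for _ in range(margin)]
    return padded
```

Note that a margin of m pixels increases the image height by 2 * m, which is why the question of whether "target_height" should include the margin matters.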

    opened by lorenzob 0
  • Error while using clstm models that I trained and tested


    @tmbdev Hi! When I use a default clstm model such as arabic-beirut-200.clstm, everything works and the conversion succeeds:

    [email protected] ~/Desktop/mags $ kraken -i images/tt1.jpg image.txt binarize segment ocr -m arabic-beirut-200.clstm
    Loading RNN default ✓
    Binarizing ✓
    Segmenting ✓
    Processing ✓
    Writing recognition results for /tmp/tmpq1EMeq ✓

    But when I use any of the clstm models that I trained and tested myself, I get:

    [email protected] ~/Desktop/mags $ kraken -i images/tt1.jpg image.txt binarize segment ocr -m persian-keyhan-5000.clstm
    Loading RNN default ✓
    Binarizing ✓
    Segmenting ✓
    Traceback (most recent call last):
      File "/usr/local/bin/kraken", line 10, in <module>
        sys.exit(cli())
      File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1093, in invoke
        return _process_result(rv)
      File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1031, in _process_result
        **ctx.params)
      File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/kraken/kraken.py", line 167, in process_pipeline
        task(base_image=base_image, input=input, output=output)
      File "/usr/local/lib/python2.7/dist-packages/kraken/kraken.py", line 125, in recognizer
        for pred in it:
      File "/usr/local/lib/python2.7/dist-packages/kraken/rpred.py", line 211, in mm_rpred
        pred = nets[script].predictString(line)
      File "/usr/local/lib/python2.7/dist-packages/kraken/lib/models.py", line 88, in predictString
        line = line.reshape(-1, self.rnn.ninput(), 1)
    ValueError: can only specify one unknown dimension

    I use kraken version 0.9.6:

    [email protected] ~/Desktop/mags $ kraken --version
    kraken, version 0.9.6.dev8

    and I compiled the separate-derivs branch to train my clstm model.
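
For what it's worth, the message at the bottom of the traceback is the one NumPy raises when more than one entry of the requested shape is -1. In `line.reshape(-1, self.rnn.ninput(), 1)` that would happen if `ninput()` returned -1 for the self-trained model. This is an assumption; the report does not confirm it. The pure-Python sketch below only imitates that shape rule, and the helper `resolve_shape` is hypothetical (it is not part of NumPy, kraken, or clstm).

```python
def resolve_shape(total, shape):
    """Fill in a single -1 entry of shape so that the product equals total."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        # Same wording as the error in the traceback above.
        raise ValueError("can only specify one unknown dimension")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if unknown:
        if known == 0 or total % known != 0:
            raise ValueError("cannot fit %d elements into shape %r" % (total, shape))
        return tuple(total // known if dim == -1 else dim for dim in shape)
    if known != total:
        raise ValueError("cannot fit %d elements into shape %r" % (total, shape))
    return tuple(shape)


# A 100-column line image with 48 input rows resolves cleanly:
#   resolve_shape(4800, (-1, 48, 1))  returns (100, 48, 1)
# If the middle entry were -1 as well, the call would fail:
#   resolve_shape(4800, (-1, -1, 1))  raises ValueError
```

If this is what is going on, printing the value of `ninput()` for the working and the failing model would be a quick way to confirm or rule it out.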

    opened by Hadi58 0
Releases(pre-devices-merge)