text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

Overview

text-detection-ctpn

Scene text detection based on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. The original paper can be found here, and the original Caffe repo can be found here. For more detail about the paper and code, see this blog. If you have any questions, check the issues first; if the problem persists, open a new issue.


NOTICE: Thanks to banjin-xjy, banjin and I have reconstructed this repo. The old repo was written based on Faster-RCNN and retained tons of useless code and dependencies, which made it hard to understand and maintain. Hence we reconstructed this repo. The old code is saved in branch master.


roadmap

  • reconstruct the repo
  • cython nms and bbox utils
  • loss function as referred in paper
  • oriented text connector
  • BLSTM

setup

nms and bbox utils are written in cython, hence you have to build the library first.

cd utils/bbox
chmod +x make.sh
./make.sh

It will generate nms.so and bbox.so in the current folder.
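For readers who want to see what the compiled utilities compute before building them, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is an illustration only, not the repo's Cython code: the function names `iou` and `nms`, the `[x1, y1, x2, y2, score]` box layout, and the continuous-coordinate area convention are assumptions made for this example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2, ...]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, thresh):
    """Greedy NMS: return indices of kept boxes, highest score first.

    dets   -- list of [x1, y1, x2, y2, score] (layout assumed for this sketch)
    thresh -- a box is dropped when its IoU with a kept box exceeds this value
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only the boxes that do not overlap the chosen one too much.
        order = [j for j in order if iou(dets[best], dets[j]) <= thresh]
    return keep


dets = [
    [0, 0, 10, 10, 0.9],
    [1, 1, 11, 11, 0.8],    # overlaps the first box heavily
    [20, 20, 30, 30, 0.7],  # far away from both
]
print(nms(dets, 0.5))  # prints [0, 2]
```

The compiled version exists for speed; a plain loop like this one is quadratic in the number of proposals and is only meant to make the behaviour easy to follow.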


demo

  • follow setup to build the library
  • download the ckpt file from google drive or baidu yun
  • put checkpoints_mlt/ in text-detection-ctpn/
  • put your images in data/demo, then run the demo from the root; the results will be saved in data/res
python ./main/demo.py

training

prepare data

  • First, download the pre-trained VGG net model and put it in data/vgg_16.ckpt. You can download it from tensorflow/models.
  • Second, download the dataset we prepared from google drive or baidu yun. Put the downloaded data in data/dataset/mlt, then start the training.
  • Alternatively, you can prepare your own dataset with the following steps.
  • Modify DATA_FOLDER and OUTPUT in utils/prepare/split_label.py according to your dataset, then run split_label.py from the root:
python ./utils/prepare/split_label.py
  • It will generate the prepared data in data/dataset/.
  • A demo of the input file format for split_label.py can be found in gt_img_859.txt, and the corresponding output file is img_859.txt. A demo image of the prepared data is shown below.
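CTPN, as described in the paper, predicts fixed-width (16-pixel) proposals, so label preparation generally means cutting each text-line box into narrow vertical strips. The sketch below illustrates that idea for an axis-aligned box. It is not the code in split_label.py: the function name `split_box`, the inclusive integer coordinates, the grid alignment and the clipping of the first and last strip to the box are all assumptions chosen for illustration.

```python
def split_box(x1, y1, x2, y2, stride=16):
    """Split an axis-aligned text box into vertical strips on a stride-aligned grid.

    Coordinates are inclusive integer pixel positions (an assumption of this
    sketch). Returns a list of (x1, y1, x2, y2) strips, left to right.
    """
    strips = []
    # Start at the grid column that contains the left edge of the box.
    start = (x1 // stride) * stride
    for grid_x in range(start, x2 + 1, stride):
        strip_x1 = max(grid_x, x1)               # clip first strip to the box
        strip_x2 = min(grid_x + stride - 1, x2)  # clip last strip to the box
        strips.append((strip_x1, y1, strip_x2, y2))
    return strips


for strip in split_box(10, 5, 40, 20):
    print(strip)
# prints:
# (10, 5, 15, 20)
# (16, 5, 31, 20)
# (32, 5, 40, 20)
```

Whether the edge strips are clipped or kept at full width is a design choice; compare against img_859.txt to see which convention the prepared data follows.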


train

Simply run

python ./main/train.py
  • The model provided in checkpoints_mlt was trained on a GTX1070 for 50k iterations. It takes about 0.25s per iteration, so 50k iterations take about 3.5 hours.
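The time estimate above is plain arithmetic, and the same calculation can be reused to budget a run on other hardware. A small sketch; the function name `training_hours` is made up for this example, and the per-iteration time is the only measured input.

```python
def training_hours(iterations, seconds_per_iter):
    """Estimated wall-clock training time in hours."""
    return iterations * seconds_per_iter / 3600.0


hours = training_hours(50_000, 0.25)
print("{:.2f} hours".format(hours))  # prints "3.47 hours"
```

To plan your own run, time a few hundred iterations on your GPU and plug that figure in, since the speed depends heavily on the hardware and image sizes.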

some results

NOTICE: all the photos used below are collected from the internet. If it affects you, please contact me to delete them.


oriented text connector

  • oriented text connector has been implemented; it's working, but still needs further improvement.
  • left figure is the result for DETECT_MODE H, right figure for DETECT_MODE O
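To make the difference between the two modes concrete, here is a rough sketch of how a chain of narrow proposals can be merged into either a horizontal box or an oriented quadrilateral. It assumes that H means an axis-aligned result and O an oriented one, and it uses a simple least-squares line fit through the top and bottom edges. This is one possible approach, not the repo's connector; all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; horizontal line if all xs coincide."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def horizontal_box(proposals):
    """Axis-aligned box enclosing all (x1, y1, x2, y2) proposals."""
    return (min(p[0] for p in proposals), min(p[1] for p in proposals),
            max(p[2] for p in proposals), max(p[3] for p in proposals))


def oriented_quad(proposals):
    """Quadrilateral whose top and bottom edges follow the fitted text slope.

    Returns corners in the order top-left, top-right, bottom-right, bottom-left.
    """
    centers = [(p[0] + p[2]) / 2.0 for p in proposals]
    top_a, top_b = fit_line(centers, [p[1] for p in proposals])
    bot_a, bot_b = fit_line(centers, [p[3] for p in proposals])
    left = min(p[0] for p in proposals)
    right = max(p[2] for p in proposals)
    return [(left, top_a * left + top_b), (right, top_a * right + top_b),
            (right, bot_a * right + bot_b), (left, bot_a * left + bot_b)]


# Three 16-pixel-wide proposals along a slightly descending text line.
chain = [(0, 0, 16, 10), (16, 2, 32, 12), (32, 4, 48, 14)]
print(horizontal_box(chain))  # prints (0, 0, 48, 14)
print(oriented_quad(chain))
```

For slanted text the horizontal box includes background above and below the line, while the quadrilateral hugs the text more tightly, which is the visual difference between the two figures.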

Comments
  • How to export model for Tensorflow Serving?

    Following some tutorials on exporting models for TensorFlow Serving, I came up with the attempt below:

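    import os
    import tensorflow as tf
    # Import paths below assume the lib/ layout of the old repo.
    from lib.fast_rcnn.config import cfg, cfg_from_file
    from lib.networks.factory import get_network

    cfg_from_file('ctpn/text.yml')
    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
            net = get_network("VGGnet_test")
    
            saver = tf.train.Saver()
            # get_checkpoint_state returns None when no checkpoint is found,
            # so check it before touching model_checkpoint_path.
            ckpt = tf.train.get_checkpoint_state(cfg.TEST.checkpoints_path)
            if ckpt is None or not ckpt.model_checkpoint_path:
                raise IOError('Missing pre-trained model in: {}'.format(cfg.TEST.checkpoints_path))
            saver.restore(sess, ckpt.model_checkpoint_path)
    
            # The main export trial is here
            #############################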
            export_path = os.path.join(
                tf.compat.as_bytes('/tmp/ctpn'),
                tf.compat.as_bytes(str(1)))
            builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    
            freezing_graph = sess.graph
            prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def(
                inputs={'input': freezing_graph.get_tensor_by_name('Placeholder:0')},
                outputs={'output': freezing_graph.get_tensor_by_name('Placeholder_1:0')}
            )
    
            builder.add_meta_graph_and_variables(
                sess,
                [tf.saved_model.tag_constants.SERVING],
                signature_def_map={
                    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature
                },
                clear_devices=True)
    
            builder.save()
            print('[INFO] Export SavedModel into {}'.format(export_path))
            #############################
    

    The output for freezing_graph is:

    Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
    Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
    Tensor("rpn_cls_score/Reshape_1:0", shape=(?, ?, ?, 20), dtype=float32)
    Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
    Tensor("Reshape_2:0", shape=(?, ?, ?, 20), dtype=float32)
    Tensor("rpn_bbox_pred/Reshape_1:0", shape=(?, ?, ?, 40), dtype=float32)
    Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
    

    After exporting the model to /tmp/ctpn/1, I tried to load it into the TensorFlow Serving server:

    tensorflow_model_server --port=9000 --model_name=ctpn --model_base_path=/tmp/ctpn
    

    But it raised an error:

    Loading servable: {name: detector version: 1} failed: Not found: Op type not registered 'PyFunc' in binary running on [...]. Make sure the Op and Kernel are registered in the binary running in this process.
    

    So there are 2 questions:

    • Am I right about the inputs (Placeholder:0) and the outputs (Placeholder_1:0) of prediction_signature?
    • Where am I missing PyFunc?
    opened by hiepph 20
  • Excuse me, do you have any tricks for training the model? I trained on the MLT dataset with the default parameters, and after 50000 iters it detects nothing. Why?

    Excuse me, do you have any tricks for training the model? I trained on the MLT dataset with the default parameters, and after 50000 iterations it detects nothing. Why?

    opened by cjt222 15
  • BiLSTM and Training Time

    Thanks for sharing your implementation with us. I have implemented CTPN with Caffe, but it failed to converge after adding the LSTM. First, I want to ask whether you have added the BiLSTM in your code. I am new to tensorflow; after looking at the code, I think you implemented only the LSTM, not the BiLSTM. Is that right? Second, how long did you train your model? I have run your training script on a GPU device, and it seems it would take 5-6 days to finish the first 180000 iterations.

    Thanks very much.

    opened by TaoDream 14
  • I tried to port the code from python2 to python3

    I tried to port the code from python2 to python3. After fixing all the errors I ran the code, and it raised the error below. I do not know where it comes from or how to resolve it. What can I do? Thank you so much.

    2017-10-26 14:07:03.163176: W tensorflow/core/framework/op_kernel.cc:1158] Unknown: KeyError: b'TEST' Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call return fn(*args) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn status, run_metadata) File "/usr/local/lib/python3.6/contextlib.py", line 88, in exit next(self.gen) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status pywrap_tensorflow.TF_GetCode(status)) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "ctpn/demo.py", line 95, in _, _ = test_ctpn(sess, net, im) File "/root/chengjuntao/text-detection-ctpn/lib/fast_rcnn/test.py", line 171, in test_ctpn rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run run_metadata_ptr) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run feed_dict_string, options, run_metadata) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run target_list, options, run_metadata) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    Caused by op 'rois/PyFunc', defined at: File "ctpn/demo.py", line 85, in net = get_network("VGGnet_test") File "/root/chengjuntao/text-detection-ctpn/lib/networks/factory.py", line 20, in get_network return VGGnet_test() File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 14, in init self.setup() File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 68, in setup .proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois')) File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 28, in layer_decorated layer_output = op(self, layer_input, *args, **kwargs) File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 241, in proposal_layer [tf.float32,tf.float32]) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 198, in py_func input=inp, token=token, Tout=Tout, name=name) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_script_ops.py", line 38, in _py_func name=name) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op op_def=op_def) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op original_op=self._default_original_op, op_def=op_def) File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in init self._traceback = _extract_stack()

    UnknownError (see above for traceback): KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]] [[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    opened by cjt222 14
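One plausible reading of this traceback: under Python 3, `tf.py_func` hands string arguments to the wrapped Python function as `bytes`, so a config lookup that expects the key `'TEST'` receives `b'TEST'` and fails with the `KeyError` shown. A common workaround is to decode the key before the lookup. The sketch below shows the idea on a toy dictionary; the helper name `lookup_cfg` and the dictionary contents are hypothetical, not the repo's code.

```python
def lookup_cfg(cfg, cfg_key):
    """Look up a config section, accepting either str or bytes keys."""
    if isinstance(cfg_key, bytes):
        # Strings passed through tf.py_func arrive as bytes in Python 3.
        cfg_key = cfg_key.decode('utf-8')
    return cfg[cfg_key]


# Toy config with made-up values, standing in for the real cfg object.
cfg = {'TRAIN': {'nms_thresh': 0.7}, 'TEST': {'nms_thresh': 0.7}}

print(lookup_cfg(cfg, 'TEST') is cfg['TEST'])   # prints True
print(lookup_cfg(cfg, b'TEST') is cfg['TEST'])  # prints True
```

In the repo the equivalent change would go wherever the proposal layer receives its `cfg_key` argument, before that key is used to index the config.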
  • ModuleNotFoundError: No module named 'lib'

    When I try to execute demo.py I receive this error:

    File "demo.py", line 9, in <module>
      from lib.networks.factory import get_network
    ModuleNotFoundError: No module named 'lib'

    Can someone please help me with this?

    opened by SuryaprakashNSM 12
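A likely cause is that demo.py was launched from inside its own folder, so the repository root (which contains lib/) is not on Python's import path. Running the script from the root is the simplest fix. Alternatively, a script can add the root to `sys.path` itself, as in the sketch below. It assumes the script lives one directory below the root, as ctpn/demo.py does in the tracebacks elsewhere in this thread; the helper name is hypothetical.

```python
import os
import sys


def add_repo_root_to_path(script_file):
    """Put the parent of the script's directory on sys.path and return it."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    root = os.path.dirname(script_dir)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Typical use at the top of a script such as ctpn/demo.py:
#     add_repo_root_to_path(__file__)
#     from lib.networks.factory import get_network
```

Setting the PYTHONPATH environment variable to the repository root before launching achieves the same result without editing the script.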
  • AttributeError: 'NoneType' object has no attribute 'model_checkpoint_path'

    File "/home/hjq/PycharmProjects/Tensorflow-OCR/text-detection-ctpn-master/ctpn/demo.py", line 103, in <module>
      raise 'Check your pretrained {:s}'.format(ckpt.model_checkpoint_path)
    AttributeError: 'NoneType' object has no attribute 'model_checkpoint_path'

    opened by seawater668 11
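`tf.train.get_checkpoint_state` returns `None` when the directory has no `checkpoint` state file, which is what this `AttributeError` points to: the error message itself tries to read `ckpt.model_checkpoint_path` on `None`. A stdlib-only sketch of a pre-flight check that fails with a clearer message follows. It assumes the standard TensorFlow 1.x layout, where the state file is literally named `checkpoint`; the function name is hypothetical.

```python
import os


def find_checkpoint_state(checkpoint_dir):
    """Return the path of the 'checkpoint' state file, or raise a clear error."""
    state_file = os.path.join(checkpoint_dir, 'checkpoint')
    if not os.path.isfile(state_file):
        raise FileNotFoundError(
            "No 'checkpoint' file in {!r}; download the pre-trained model "
            "and place it there before running the demo.".format(checkpoint_dir))
    return state_file
```

Calling this before restoring the model turns a confusing `AttributeError` into a message that says which directory is missing the files.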
  • After CTPN. What is your idea?

    Hello. eragonruan! I was very impressed with your code. One more time. Thank you so much.

    I have a problem. When the image passes through the CTPN, a green box is created. If I put this image in the OCR engine, how should I separate it (green box)? My idea is to use OpenCV. But the green box is on the text. so it hides the text. Can I draw a thin line to solve the problem?

    opened by rudebono 10
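One way to keep the drawn box from hiding the text is to crop each region from the original image using the detected coordinates, rather than from the copy that has the boxes drawn on it. The sketch below represents the image as a list of pixel rows to stay dependency-free; with NumPy or OpenCV arrays the same crop is the slice `image[y1:y2, x1:x2]`. The function name and the `(x1, y1, x2, y2)` box layout with exclusive right and bottom edges are assumptions for this example.

```python
def crop_region(image, box):
    """Crop box = (x1, y1, x2, y2) from an image stored as a list of rows.

    The box is clamped to the image bounds; x2 and y2 are exclusive.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box
    x1 = max(0, min(x1, width))
    x2 = max(0, min(x2, width))
    y1 = max(0, min(y1, height))
    y2 = max(0, min(y2, height))
    return [row[x1:x2] for row in image[y1:y2]]


# A 4x5 toy "image" whose pixel values encode their row and column.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
print(crop_region(image, (1, 1, 3, 3)))  # prints [[11, 12], [21, 22]]
```

Each cropped patch can then be passed to the OCR engine on its own, and the annotated image is only needed for visual inspection.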
  • win10 on cpu encounter KeyError: b'TEST' error

    @eragonruan My environment is win10 + cpu + tensorflow 1.3. I have spent a lot of time on this and I'm confused. Please help me in your spare time. Thanks!

    Here is the error: `2018-07-26 11:11:58.723047: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 2018-07-26 11:11:58.728604: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32) Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32) Tensor("rpn_cls_score/Reshape_1:0", shape=(?, ?, ?, 20), dtype=float32) Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32) Tensor("Reshape_2:0", shape=(?, ?, ?, 20), dtype=float32) Tensor("rpn_bbox_pred/Reshape_1:0", shape=(?, ?, ?, 40), dtype=float32) Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32) Loading network VGGnet_test... Restoring from checkpoints/VGGnet_fast_rcnn_iter_50000.ckpt... 
done 2018-07-26 11:12:09.006369: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\framework\op_kernel.cc:1192] Unknown: KeyError: b'TEST' Traceback (most recent call last): File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1327, in _do_call return fn(*args) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1306, in _run_fn status, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\contextlib.py", line 88, in exit next(self.gen) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status pywrap_tensorflow.TF_GetCode(status)) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "./ctpn/demo.py", line 97, in _, _ = test_ctpn(sess, net, im) File "D:\workspace\text-detection-ctpn-master\lib\fast_rcnn\test.py", line 51, in test_ctpn rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 895, in run run_metadata_ptr) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1124, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run options, run_metadata) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

    Caused by op 'rois/PyFunc', defined at: File "./ctpn/demo.py", line 82, in net = get_network("VGGnet_test") File "D:\workspace\text-detection-ctpn-master\lib\networks\factory.py", line 8, in get_network return VGGnet_test() File "D:\workspace\text-detection-ctpn-master\lib\networks\VGGnet_test.py", line 15, in init self.setup() File "D:\workspace\text-detection-ctpn-master\lib\networks\VGGnet_test.py", line 56, in setup .proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois')) File "D:\workspace\text-detection-ctpn-master\lib\networks\network.py", line 21, in layer_decorated layer_output = op(self, layer_input, *args, **kwargs) File "D:\workspace\text-detection-ctpn-master\lib\networks\network.py", line 215, in proposal_layer [tf.float32,tf.float32]) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\ops\script_ops.py", line 203, in py_func input=inp, token=token, Tout=Tout, name=name) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 36, in _py_func name=name) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op op_def=op_def) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op original_op=self._default_original_op, op_def=op_def) File "C:\Users\d00455280.CHINA\AppData\Local\Continuum\Anaconda3\envs\ocr\lib\site-packages\tensorflow\python\framework\ops.py", line 1204, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

    UnknownError (see above for traceback): KeyError: b'TEST' [[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, rpn_bbox_pred/Reshape_1, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]`

    opened by Crocodiles 9
  • UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

    When I try to train on my own dataset, I hit this core dump:

    Computing bounding-box regression targets...
    bbox target means:
    [[ 0. 0. 0. 0.] [ 0. 0. 0. 0.]]
    [ 0. 0. 0. 0.]
    bbox target stdevs:
    [[ 0.1 0.1 0.2 0.2] [ 0.1 0.1 0.2 0.2]]
    [ 0.1 0.1 0.2 0.2]
    Normalizing targets done
    Solving...
    /data/resys/var/python2.7.3/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
      "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
    Segmentation fault (core dumped)

    This fault is caused by the function train_model() in lib/fast_rcnn/train.py, when it runs train_op = opt.apply_gradients(list(zip(grads, tvars)), global_step=global_step).

    Have you ever faced this error?

    opened by louisly 8
  • WHERE ARE gt_img_1001.txt ... gt_img_6000.txt FILES?!

    I cloned the project, downloaded the pretrained model and the training data, and unzipped them to get a folder named TEXTVOC that contains other folders (Annotations, ImagesSets and JPEGImages). I placed TEXTVOC in the data folder and edited the paths in split_label.py as follows: path = '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/JPEGImages' gt_path = '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/Annotations'. But when I run it with python lib/prepare_training_data/split_label.py, I get this error:

    /home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/JPEGImages/img_1001.jpg
    Traceback (most recent call last):
      File "lib/prepare_training_data/split_label.py", line 34, in <module>
        with open(gt_file, 'r') as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/hani/Desktop/text-detection-ctpn/data/TEXTVOC/Annotations/gt_img_1001.txt'
    

    As shown, gt_img_1001.txt is missing, which means either the gt_path is wrong or those files do not exist. I looked into all folders and didn't find any file starting with gt_*. PS: I also ran this command: ln -s TEXTVOC VOCdevkit2007, so I didn't miss any required step.

    opened by maky-hnou 7
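The traceback shows that split_label.py looks for a label file named gt_<image name>.txt for each image. Before running it, a quick check like the sketch below can list which labels are absent, which helps tell a wrong path apart from a dataset that ships without those files. The function name and the list of image extensions are hypothetical choices for this example.

```python
import os


def missing_gt_files(image_dir, gt_dir, image_exts=('.jpg', '.jpeg', '.png')):
    """Return the names of gt_<stem>.txt files that are missing from gt_dir."""
    missing = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue  # skip anything that is not an image
        gt_name = 'gt_' + stem + '.txt'
        if not os.path.isfile(os.path.join(gt_dir, gt_name)):
            missing.append(gt_name)
    return missing


# Example call with the paths from the question above:
#     missing_gt_files('data/TEXTVOC/JPEGImages', 'data/TEXTVOC/Annotations')
```

If every label is reported missing, the annotations are probably stored in a different format or folder than split_label.py expects, rather than a handful of files having been lost.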
  • how to run the train scripts

    split_label.py has two paths, path = '/media/D/code/OCR/text-detection-ctpn/data/mlt_english+chinese/image' and gt_path = '/media/D/code/OCR/text-detection-ctpn/data/mlt_english+chinese/label', and the code reads the files under each of them. What kind of files should be placed in ....../label? Thanks.

    opened by jibadallz 7
  • Bump tensorflow-gpu from 1.4.0 to 2.9.3

    Bumps tensorflow-gpu from 1.4.0 to 2.9.3.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This release introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Bump numpy from 1.14.2 to 1.22.0

    Bumps numpy from 1.14.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Model Quantization

    The pretrained model is appealing because it detects text with high accuracy. However, inference on CPU takes a long time, which is unacceptable for production deployment. Is there a way to quantize the model?

    What I've tried is to convert the checkpoint model to saved_model format, then load the saved_model and quantize it with the TFLite converter. The code snippet is as follows:

    # Load checkpoint and convert to saved_model
    import tensorflow as tf

    trained_checkpoint_prefix = "checkpoints_mlt/ctpn_50000.ckpt"
    export_dir = "exported_model"
    
    graph = tf.Graph()
    with tf.compat.v1.Session(graph=graph) as sess:
        # Restore from checkpoint
        loader = tf.compat.v1.train.import_meta_graph(trained_checkpoint_prefix + ".meta")
        loader.restore(sess, trained_checkpoint_prefix)
    
        # Export checkpoint to SavedModel while the session is still open
        builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(sess,
                                             [tf.saved_model.TRAINING, tf.saved_model.SERVING],
                                             strip_default_attrs=True)
        builder.save()
    

    As a result, I got a .pb file and a variables folder containing checkpoint and index files. An error was then raised when I tried to perform quantization:

    converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8  # or tf.uint8
    converter.inference_output_type = tf.int8  # or tf.uint8
    tflite_quant_model = converter.convert()
    

    This is the error:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-4-03205673177f> in <module>
         11 converter.inference_input_type = tf.int8  # or tf.uint8
         12 converter.inference_output_type = tf.int8  # or tf.uint8
    ---> 13 tflite_quant_model = converter.convert()
    
    ~/virtualenvironment/tf2/lib/python3.6/site-packages/tensorflow/lite/python/lite.py in convert(self)
        450     # TODO(b/130297984): Add support for converting multiple function.
        451     if len(self._funcs) != 1:
    --> 452       raise ValueError("This converter can only convert a single "
        453                        "ConcreteFunction. Converting multiple functions is "
        454                        "under development.")
    
    ValueError: This converter can only convert a single ConcreteFunction. Converting multiple functions is under development.
    

    My understanding is that this error is raised because the model requires multiple inputs, input_image and input_im_info. I would appreciate it if anyone could help.
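
As background for the tf.int8 input and output types requested in the converter settings above, here is a minimal, dependency-free sketch of the affine mapping that TFLite documents for int8 tensors, real_value = scale * (int8_value - zero_point). The helper names (choose_params, quantize, dequantize) and the example ranges are assumptions made for illustration; they are not part of TensorFlow or of this repository, and the sketch does not address the ConcreteFunction error itself.

```python
# Affine int8 quantization: real_value ~= scale * (q - zero_point).
# All names below are hypothetical helpers, not TensorFlow APIs.

INT8_MIN, INT8_MAX = -128, 127


def choose_params(x_min, x_max):
    """Pick scale and zero_point so [x_min, x_max] spans the int8 range."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    if x_max == x_min:
        # Degenerate range (all zeros): any positive scale works.
        return 1.0, 0
    scale = (x_max - x_min) / (INT8_MAX - INT8_MIN)
    zero_point = round(INT8_MIN - x_min / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to int8, clamping values that fall outside the range."""
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return scale * (q - zero_point)


# Example: pixel intensities in [0, 255] (an assumed input range).
scale, zero_point = choose_params(0.0, 255.0)
print(scale, zero_point)
print(quantize(0.0, scale, zero_point))    # -128
print(quantize(255.0, scale, zero_point))  # 127
```

In an actual full-integer conversion, the converter estimates these ranges from calibration data supplied through converter.representative_dataset rather than from hand-picked bounds.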

    opened by leeshien 0
  • Bump ipython from 5.1.0 to 7.16.3

    Bump ipython from 5.1.0 to 7.16.3

    Bumps ipython from 5.1.0 to 7.16.3.

    dependencies 
    opened by dependabot[bot] 0
Releases(untagged-48d74c6337a71b6b5f87)
Owner
Shaohui Ruan
Interested in machine learning & computer vision
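Deep Learning Image Captcha 2

Deep-learning recognition of slider CAPTCHAs. This project uses the YOLOv3 deep-learning model to locate the gap in slider CAPTCHAs, and is modified from https://github.com/eriklindernoren/PyTorch-YOLOv3. Only a few hundred gap-annotated images are needed to train a high-accuracy recognition model. Sample recognition results: Clone the project. Run the command: git cl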

Python3WebSpider 117 Dec 28, 2022
Fun program to overlay a mask to yourself using a webcam

Superhero Mask Overlay Description Simple project made for fun. It consists of placing a mask (a PNG image with transparent background) on your face.

KB Kwan 10 Dec 01, 2022
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.

This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the

Elkin Javier Guerra Galeano 17 Nov 03, 2022
Super Mario Game With Python

Super_Mario Hello all this is a simple python program which tries to use our body as a controller for the super mario game Here I have used media pipe

Adarsh Badagala 219 Nov 25, 2022
Markup for note taking

Subtext: markup for note-taking Subtext is a text-based, block-oriented hypertext format. It is designed with note-taking in mind. It has a simple, pe

Gordon Brander 224 Jan 01, 2023
Creating of virtual elements of the graphical interface using opencv and mediapipe.

Virtual GUI Creating of virtual elements of the graphical interface using opencv and mediapipe. Element GUI Output Description Button By default the b

Aleksei 4 Jun 16, 2022
An application of high resolution GANs to dewarp images of perturbed documents

Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat

Thomas Huang 97 Dec 25, 2022
Image augmentation for machine learning experiments.

imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar

Alexander Jung 13.2k Jan 02, 2023
scantailor - Scan Tailor is an interactive post-processing tool for scanned pages.

Scan Tailor - scantailor.org This project is no longer maintained, and has not been maintained for a while. About Scan Tailor is an interactive post-p

1.5k Dec 28, 2022
Neural search engine for AI papers

Papers search Neural search engine for ML papers. Demo Usage is simple: input an abstract, get the matching papers. The following demo also showcases

Giancarlo Fissore 44 Dec 24, 2022
Distilling Knowledge via Knowledge Review, CVPR 2021

ReviewKD Distilling Knowledge via Knowledge Review Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia This project provides an implementation for the

DV Lab 194 Dec 28, 2022
Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

EAST_ICPR: EAST for ICPR MTWI 2018 CHALLENGE Introduction This is a repository forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE. Origin Reposi

Haozheng Li 157 Aug 23, 2022
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrieval.

Dual Encoding for Video Retrieval by Text Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding

81 Dec 01, 2022
Characterizing possible failure modes in physics-informed neural networks.

Characterizing possible failure modes in physics-informed neural networks This repository contains the PyTorch source code for the experiments in the

Aditi Krishnapriyan 55 Jan 02, 2023
pyntcloud is a Python library for working with 3D point clouds.

pyntcloud is a Python library for working with 3D point clouds.

David de la Iglesia Castro 1.2k Jan 07, 2023
Color Picker and Color Detection tool for METR4202

METR4202 Color Detection Help This is sample code that can be used for the METR4202 project demo. There are two files provided, both running on Python

Miguel Valencia 1 Oct 23, 2021
Application that instantly translates sign-language to letters.

Sign Language Translator Project Description The main purpose of project is translating sign-language to letters. In accordance with this purpose we d

3 Sep 29, 2022
Memory tests solver with using OpenCV

Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as

Bahadır Araz 24 Dec 27, 2022