Layout Parser is a deep learning based tool for document image layout analysis tasks.

Overview

Layout Parser is a deep learning based tool for document image layout analysis tasks.

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 for using DL Layout Detection Model
# Please make sure the PyTorch version is compatible with
# the installed Detectron2 version. 
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2' 

# Install the OCR components when necessary
# (the quotes keep shells such as zsh from expanding the brackets)
pip install "layoutparser[ocr]"

By default, this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.

DL Assisted Layout Prediction Example

Example Usage

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only 4 lines of code in layoutparser, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained on your own. Then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image) # You need to load the image somewhere else, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout,) # With extra configurations

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}
Comments
  • Apply detect() on readable PDF files

    Apply detect() on readable PDF files

    Hi there, from the docs I infer that detect() operates on, for example, PIL.Image objects. Is there a way to operate directly on already readable PDF files (which would also obviate the need to apply OCR)? Greetings

    enhancement 
    opened by simonschoe 12
  • AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Hi,

    Thank you for this awesome program! I successfully installed layout-parser and Detectron2 on my Windows 10 laptop. When I run the following code:

    import cv2
    import numpy as np
    import layoutparser as lp
    from pdf2image import convert_from_bytes

    # Raw string so that backslashes in the Windows path are not read as escapes
    images = convert_from_bytes(open(r'C:\temp\ConsigneeList\Doc 4 Distribution List.pdf', 'rb').read())

    model = lp.Detectron2LayoutModel(
        config_path='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config',  # In model catalog
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model label_map
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # Optional
    )

    # Loop through each page
    for image in images:
        ocr_agent = lp.ocr.TesseractAgent()
        image = np.array(image)
        layout = model.detect(image)
        text_blocks = lp.Layout([b for b in layout if b.type == 'Text'])

        # Loop through each text box on the page
        for block in text_blocks:
            segment_image = (block
                             .pad(left=5, right=5, top=5, bottom=5)
                             .crop_image(image))
            text = ocr_agent.detect(segment_image)
            block.set(text=text, inplace=True)

        # Append the recognized text of this page to the output file
        with open("OUTPUT FILE PATH/FILENAME.TXT", "a+") as my_file:
            for txt in text_blocks.get_texts():
                my_file.write(txt)


    I get the following errors:


    AttributeError                            Traceback (most recent call last)
    in
    ----> 1 model = lp.Detectron2LayoutModel(
          2     config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
          3     label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In modellabel_map
          4     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
          5 )

    C:\ProgramData\Anaconda3\lib\site-packages\layoutparser\file_utils.py in getattr(self, name)
        224             value = getattr(module, name)
        225         else:
    --> 226             raise AttributeError(f"module {self.name} has no attribute {name}")
        227
        228         setattr(self, name, value)

    AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Any ideas on what is wrong? Thank you!!

    Sincerely,

    tom

    bug 
    opened by theiman112860 10
  • 'GCVAgent' object has no attribute '_client'

    'GCVAgent' object has no attribute '_client'

    Hi, while running the "OCR tables and parse the output" tutorial, I tried to obtain the result:

    res = ocr_agent.detect(image, return_response=True)

    The response was

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 168, in detect
        res = self._detect(img_content)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 134, in _detect
        response = self._client.document_text_detection(
    AttributeError: 'GCVAgent' object has no attribute '_client'

    I searched online, and some sites say that the Client() class was removed in Client Library v0.25.1 and replaced with ImageAnnotatorClient().

    Could this be the cause? Thank you.

    bug 
    opened by junxi-liu 8
  • Error installing dependencies

    Error installing dependencies

    Hi Team, thank you for all the great work. It looks amazing. I tried running pip install layoutparser, but it threw the error below. Can you please let me know how to fix it?

    ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-awmfv0cr' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (22 lines): running bdist_wheel running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext cythoning pycocotools/_mask.pyx to pycocotools_mask.c C:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\pycocotools_mask.pyx tree = Parsing.p_module(s, pxd, full_module_name) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: 
command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2

    ERROR: Failed building wheel for pycocotools ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\pss.ch\AppData\Roaming\Python\Python38\Include\pycocotools' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (20 lines): running install running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext skipping 'pycocotools_mask.c' Cython extension (up-to-date) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2 ---------------------------------------- ERROR: Command errored out with exit status 1: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
'"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\AppData\Roaming\Python\Python38\Include\pycocotools' Check the logs for full command output.

    opened by sriprad 8
  • enforce_cpu not working

    enforce_cpu not working

    When enforce_cpu is set to True, the model still uses CUDA instead of the CPU. I think it is due to this line: https://github.com/Layout-Parser/layout-parser/blob/e035fc8f952addc620670e5b47864fe213db0e10/src/layoutparser/models/layoutmodel.py#L120

    Possible fix could be cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() and (not enforce_cpu) else "cpu"
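
The proposed condition can be checked in isolation. The sketch below makes no claim about layoutparser internals: `select_device` is a hypothetical helper, and the CUDA check is passed in as a plain boolean so the logic can be exercised without PyTorch installed.

```python
def select_device(cuda_available: bool, enforce_cpu: bool) -> str:
    """Return "cuda" only when a GPU is present and the CPU is not enforced."""
    return "cuda" if cuda_available and not enforce_cpu else "cpu"


# In real code the first argument would come from torch.cuda.is_available().
for cuda_available in (True, False):
    for enforce_cpu in (True, False):
        print(cuda_available, enforce_cpu, select_device(cuda_available, enforce_cpu))
```

With this condition, CUDA is chosen in exactly one of the four combinations: a GPU is available and enforce_cpu is False.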

    bug 
    opened by lkluo 5
  • Adding support for mathematical formula recognition

    Adding support for mathematical formula recognition

    Have you considered adding support for mathematical formula recognition? Identifying the position of mathematical formulas in documents has always been a problem.

    modeling 
    opened by SleepyCelery 5
  • draw_box draw only one box from layout

    draw_box draw only one box from layout

    Describe the bug I just installed everything according to the installation guide and launched the Jupyter notebook from the Deep Layout Parsing Example. After the first draw_box call, only one box is shown, but print(layout) lists all boxes. The same happens with the second draw_box call from the guide. I am not sure what I am doing wrong.

    To Reproduce Steps to reproduce the behavior:

    1. installation guide + detectron2 install also from your guide
    2. Run jupyter notebook

    Environment

    1. MacOS
    2. VS Code
    3. Here some stuff from pip:
    torch==1.11.0
    torchvision==0.12.0
    Pillow==9.1.0
    opencv-python==4.5.5.64
    layoutparser==0.3.3
    

    Error traceback No errors, but the behaviour differs from the guide and from other guides.

    Screenshots attached

    output1 output2

    bug 
    opened by Moo1234567 4
  • Gives wrong results when the code is run for some images in a loop

    Gives wrong results when the code is run for some images in a loop

    The code works when run on a single image. But when I run the same code in a loop over a few images from the PubLayNet dataset, cached results seem to be applied (i.e., the bounding boxes overlap, and the boxes from previous images are also drawn on the current image).

    opened by surajsubramanian 4
  • ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

    While using this code, I get an error from Pillow. I tried reinstalling Pillow, but the issue persists. Any help getting this code to run?

    import layoutparser as lp
    model = lp.Detectron2LayoutModel(
                config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
                label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
                extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
            )
    model.detect(image)
    

    Getting this error:

    ImportError                               Traceback (most recent call last)
    <ipython-input-6-59f0fb07b7e3> in <module>
          1 import layoutparser as lp
    ----> 2 model = lp.Detectron2LayoutModel(
          3             config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
          4             label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
          5             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
    
    31 frames
    /usr/local/lib/python3.7/dist-packages/PIL/ImageFont.py in <module>
         35 from . import Image
         36 from ._deprecate import deprecate
    ---> 37 from ._util import is_directory, is_path
         38 
         39 
    
    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)
    
    
    opened by arhamshah 3
  • TypeError: inner() got an unexpected keyword argument 'image_context'

    TypeError: inner() got an unexpected keyword argument 'image_context'

    Hello! Recently encountered an issue when trying to use Google's OCR when running ocr_agent.detect

    Running this:

    image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
    ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    res = ocr_agent.detect(image, return_response=True)
    

    Gives me the following error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-9-76614ef6a3e8> in <module>
          1 image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
          2 ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    ----> 3 res = ocr_agent.detect(image, return_response=True)
          4 
          5 #layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in detect(self, image, return_response, return_only_text, agg_output_level)
        222                 img_content = image_file.read()
        223 
    --> 224         res = self._detect(img_content)
        225 
        226         if return_response:
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in _detect(self, img_content)
        188     def _detect(self, img_content):
        189         img_content = self._vision.types.Image(content=img_content)
    --> 190         response = self._client.document_text_detection(
        191             image=img_content, image_context=self._context
        192         )
    
    TypeError: inner() got an unexpected keyword argument 'image_context'
    

    I'm not sure what causes it. It might be user error, but I haven't been able to find anything else about it, and I've tried everything I can think of (all the packages are up to date or, in Google Cloud Vision's case, downgraded to stay on the old API). Thanks!

    bug 
    opened by liz-goodwin 3
  • bad result detected

    bad result detected

    I got a bad result using layout-parser. Here is the image I used: 1

    Here is the code I ran in Python:

    image = cv2.imread("1.png")
    # Convert the image from BGR (cv2 default loading style)
    # to RGB
    image = image[..., ::-1]
    origin_image = image.copy()
    
    model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config', 
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
    # Load the deep layout model from the layoutparser API 
    # For all the supported model, please check the Model 
    # Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
    
    layout = model.detect(image)
    # print("layout : ", layout)
    # Detect the layout of the input image
    text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
    drawRectangleInImage(origin_image, text_blocks, (36,255,12))
    
    titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
    drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))
    
    figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
    drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))
    
    lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
    drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))
    
    tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
    drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))
    
    cv2.imshow('image', origin_image)
    cv2.waitKey()
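
The five per-type filter passes above could also be written as a single grouping pass. This is only a sketch: `group_by_type` is a hypothetical helper, it assumes each detected block exposes a `type` attribute as in the snippet above, and stand-in objects are used so that it runs without a model.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_type(blocks):
    """Group layout blocks by their `type` attribute in a single pass."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block.type].append(block)
    return dict(groups)


# Stand-in objects; real code would pass the result of model.detect(image).
fake_layout = [
    SimpleNamespace(type="Text", id=0),
    SimpleNamespace(type="Title", id=1),
    SimpleNamespace(type="Text", id=2),
]
groups = group_by_type(fake_layout)
print({name: len(members) for name, members in groups.items()})
```

Each group is a plain list and keeps the detection order, so it can be wrapped in lp.Layout in the same way as the list comprehensions in the original code.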
    

    Here is the result:

    Screenshot 2022-01-18 11 45 06

    By the way, some warnings were generated:

    /usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

    bug 
    opened by DamonsJ 3
  • Any idea about Detectron gets overlapping and sometimes misses some blocks

    Any idea about Detectron gets overlapping and sometimes misses some blocks

    The problem I am currently using layout-parser to detect the blocks on scanned book pages, and I am trying to extract each block separately from the page for further processing.

    Checklist

    To Reproduce

    import layoutparser as lp
    import cv2
    
    image = cv2.imread("/content/image_0.jpg")
    # Convert the image from BGR (cv2 default loading style) to RGB
    image = image[..., ::-1]
    
    model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                     label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
    
    
    # Detect the layout of the input image
    layout = model.detect(image)
    
    # Show the detected layout of the input image
    lp.draw_box(image, layout, box_width=3)
    

    Environment

    1. Platform [Linux] (on colab)
    2. Installation commands
    !sudo apt-get update
    !sudo apt-get install libleptonica-dev tesseract-ocr libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
    !pip install layoutparser	
    !pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"	
    !pip install "layoutparser[ocr]"	
    !pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit 
    

    Screenshots

    1- Overlapping |3|image_3| |---|---|

    2- Missing |7|image_7| |---|---|

    I know this may not be the right place to raise this issue, but I think you may have an idea about the problem.

    bug 
    opened by rrrokhtar 0
  • [Bug] has_torch_function_variadic error

    [Bug] has_torch_function_variadic error

    Describe the bug When attempting to initialise a model (I've tried with AutoLayoutModel and Detectron2LayoutModel), torch.jit throws a RuntimeError as below...

    RuntimeError: 
    undefined value has_torch_function_variadic:
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
             >>> loss.backward()
        """
        if has_torch_function_variadic(input, target, weight, pos_weight):
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return handle_torch_function(
                binary_cross_entropy_with_logits,
    'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
      File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34
        """
        p = torch.sigmoid(inputs)
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        p_t = p * targets + (1 - p) * (1 - targets)
        loss = ce_loss * ((1 - p_t) ** gamma)
    

    To Reproduce Steps to reproduce the behavior:

    1. Install layout-parser, OpenCV, Detectron2 as below
    %pip install opensearch-py opencv-python --quiet
    %pip install -U layoutparser[ocr] --quiet
    !python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.10/index.html
    
    2. Import layoutparser and attempt to init model with lp.models.Detectron2LayoutModel(...)
    3. Error appears

    Environment Linux with layoutparser latest

    bug 
    opened by lucafrost 0
  • cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)

    cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)

    Describe the bug When I tried the sample code:

    !pip install layoutparser
    !pip install 'git+https://github.com/facebookresearch/[email protected]#egg=detectron2'
    
    import layoutparser as lp
    import cv2
    import PIL
    
    image = cv2.imread("image.png")
    model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')
    layout = model.detect(image)
    

    Colab link(Python 3.8.16): https://colab.research.google.com/drive/1lb8_Pcw8_NNdeKPL80HOYca8gaCB0f-E?usp=sharing

    I got an error on this line:

    lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')

    The error message is:

    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.8/dist-packages/PIL/_util.py)

    I hope that I can get your help. Thanks!

    bug 
    opened by sudoghut 0
  • [Fix] reduce memory consumption and close pdf stream after usage

    [Fix] reduce memory consumption and close pdf stream after usage

    Flushes the pages and the PDF afterwards to reduce memory (RAM) consumption.

    Opens the PDF stream as a context manager so that the file is closed afterwards.

    opened by jakobnrmnn 0
  • Minor installation instruction error

    Minor installation instruction error

    On Mac, the command

    pip3 install -U layoutparser[ocr]
    

    doesn't work (returns "zsh: no matches found: layoutparser[ocr]"), you need to do

    pip3 install -U "layoutparser[ocr]"
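
The failure comes from zsh treating the square brackets as a glob pattern. When the install is scripted, the same quoting problem can be avoided entirely; the sketch below is one possible approach rather than an official recommendation, and it only builds the commands without running them.

```python
import shlex
import sys

requirement = "layoutparser[ocr]"

# Passing arguments as a list bypasses the shell, so no globbing can occur.
# The list could be handed to subprocess.run(command, check=True).
command = [sys.executable, "-m", "pip", "install", "-U", requirement]

# When a shell string is unavoidable, shlex.quote adds the needed quoting.
shell_line = "pip3 install -U " + shlex.quote(requirement)
print(shell_line)
```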
    
    bug 
    opened by bholtdwyer 0
Releases(v0.3.4)
  • v0.3.4(Apr 6, 2022)

    Bug fixes

    • fix one critical bug for visualization mentioned in #131 by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/132

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.3...v0.3.4

    Source code(tar.gz)
    Source code(zip)
  • v0.3.3(Apr 3, 2022)

    Functional Updates

    • Robust pdf loading for empty pages by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/115
    • fix to issue #94 -- avoiding TesseractAgent.detect() inferring any sequence of digit as float by @k-for-code in https://github.com/Layout-Parser/layout-parser/pull/95
    • Better layout comparison by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/128
    • Better visualization functions by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/129

    Example Updates

    • Minor update to Deep Learning Parser example notebook by @Jim-Salmons in https://github.com/Layout-Parser/layout-parser/pull/56
    • Set inplace to True in sorting function by @yusanshi in https://github.com/Layout-Parser/layout-parser/pull/104
    • Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/124

    New Contributors

    • @Jim-Salmons made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/56
    • @yusanshi made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/104
    • @k-for-code made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/95

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.2...v0.3.3

    Source code(tar.gz)
    Source code(zip)
  • v0.3.2(Sep 23, 2021)

    Important fixes for multibackend layout model support:

    • Resolves the issues mentioned in #78 with other fixes to improve the multibackend layout model support #79
    • Better tests for different backends #79 for preventing future related issues
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Sep 15, 2021)

    • Fixes for automatically setting label_map in Detectron2LayoutModel #75
    • Remove unnecessary class annotations (that might break Python 3.6 users) #75
    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Sep 13, 2021)

    We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.

    New Features

    • The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows for more flexible usage of the layoutparser library and makes it easier to implement customized layout models in the future. #54 #67
    • Additionally, the newly added AutoModel and improved model configuration parsing make it easier to load and use the layout detection models. #69
      • e.g., model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet").
    • To support this multi-backend framework, we implement a dynamic importing mechanism as well as better ways of installing layoutparser and the needed dependencies (see instructions). #65 #68
    • And now layoutparser supports directly loading PDF files as layout objects: #71
      import layoutparser as lp
      pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
      lp.draw_box(pdf_images[0], pdf_layout[0])
      
    • To support more flexible processing of the layout objects, a set of new toolkits are available: #72
      import layoutparser as lp
      page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
      pdf_lines = lp.simple_line_detection(page_layout)
      

    New Models

    • Add MFD model that can detect (display) equation regions within scientific documents #59
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Apr 12, 2021)

    Layout Parser v0.2.0 Release Notes

    New Features

    1. Support for loading and exporting the layout data in JSON and CSV; see #6
    2. Add support for union and intersect operations, see #20 and the detailed explanation
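
As a rough illustration of what union and intersect mean for two axis-aligned boxes, here is a minimal sketch in plain Python. The `Box` class and its field names are hypothetical and written only for this example; this is not layoutparser's implementation, and the library's own behavior (for instance on non-overlapping boxes) may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box; not layoutparser's own Rectangle class."""

    x_1: float
    y_1: float
    x_2: float
    y_2: float

    def union(self, other: "Box") -> "Box":
        # Smallest box that covers both inputs.
        return Box(
            min(self.x_1, other.x_1),
            min(self.y_1, other.y_1),
            max(self.x_2, other.x_2),
            max(self.y_2, other.y_2),
        )

    def intersect(self, other: "Box") -> "Box":
        # Overlapping region; collapses to zero width/height
        # when the two boxes do not overlap (an assumption of this sketch).
        x_1 = max(self.x_1, other.x_1)
        y_1 = max(self.y_1, other.y_1)
        x_2 = max(x_1, min(self.x_2, other.x_2))
        y_2 = max(y_1, min(self.y_2, other.y_2))
        return Box(x_1, y_1, x_2, y_2)


a = Box(0, 0, 10, 10)
b = Box(5, 5, 20, 15)
merged = a.union(b)       # covers both a and b
overlap = a.intersect(b)  # region shared by a and b
```

Union is useful for merging fragments of one logical block, while intersect helps when clipping a block to a region of interest.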

    Improvements

    1. Functional improvements:
      1. When loading Layout Parser official models, Detectron2LayoutModel can automatically detect the label_map. For example:

        model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
        model.label_map
        # {1: 'Page Frame', ... }
        
      2. Detectron2LayoutModel now supports the enforce_cpu flag that enforces using cpu even when CUDA devices are available.

      3. For visualization.draw_box, it now supports a show_element_type flag that shows the bbox category name on the top left corner of the layout objects.

    2. Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25

    New Models

    1. Add the table bank detection models that can identify table regions

    Fixes

    1. Fix the incorrect layout issue mentioned in #9 - Thanks to @remidbs.
    2. Fix some of the dependency issues mentioned in #11 and #13 by using iopath instead of fvcore. See #18. Thanks to @edisongustavo.
  • v0.1.3(Dec 21, 2020)

    Improvements:

    • Supports lazy loading for the Detectron2 module. Now the dependency for Detectron2 will be requested only when you explicitly create a Detectron2LayoutModel object. This might be helpful for using the plain layoutparser library without installing the Detectron2 module.
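
The general idea of lazy loading can be sketched as follows. This is a simplified illustration and not the code layoutparser uses: `LazyModule` is a hypothetical helper, and the standard-library `json` module stands in for a heavy optional dependency such as Detectron2.

```python
import importlib


class LazyModule:
    """Hypothetical helper: defers importing a module until it is first used."""

    def __init__(self, name: str):
        self._name = name
        self._module = None  # nothing is imported yet

    def _load(self):
        if self._module is None:
            # A missing package raises ImportError here, at first use,
            # rather than when the wrapper is created.
            self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr):
        # Only called for attributes not found on the wrapper itself,
        # so lookups are forwarded to the wrapped module.
        return getattr(self._load(), attr)


# json stands in for an optional, expensive-to-import dependency.
backend = LazyModule("json")
loaded_before = backend._module is not None  # False: not imported yet
encoded = backend.dumps({"block": "Text"})   # first use triggers the import
loaded_after = backend._module is not None   # True: imported on demand
```

With this pattern, creating the wrapper never fails; an uninstalled dependency only causes an error if code actually reaches for it.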

    New models:

    • Incorporated a pre-trained model based on the NewspaperNavigator dataset: lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config

    Fixes:

    • Corrected a bug in visualization that might overwrite the original image
  • v0.1.2(Oct 30, 2020)

    In this version, we released a new model for publaynet and made several improvements:

    1. We released the mask_rcnn_X_101_32x8d_FPN_3x model trained on the publaynet dataset. Note: it's been trained on the full training set (while others are only trained on the validation set), and you could expect a 15% performance improvement based on this new model.
    2. We improved the support for PIL images for both layout modeling and visualization
    3. We improved the Default Language Settings for the Tesseract OCR model
  • v0.1.1(Jul 16, 2020)

    Fixes

    • Fixed a bug that could cause errors in loading Prima Models

    Updates

    • Updated the Prima Mask R-CNN model for higher accuracy and listed detailed evaluation reports.
  • v0.1.0(Jun 24, 2020)

    layoutparser now supports the following functionalities:

    • Coordinate system:

      • Supports the 3 basic coordinate systems and their geometric relationships
      • Supports the TextBlock and Layout system for convenient coordinate and text processing
    • OCR System:

      • Supports OCR based on Google Cloud Vision and Tesseract API.
    • Layout Modeling:

      • Supports using pre-trained deep learning models for layout object detection with Detectron2
    • Visualization:

      • Supports highly-customizable presentation of the box coordinates and text in the detected layout