A common, beautiful interface to tabular data, no matter the format

Overview

rows

Join the chat at https://gitter.im/turicas/rows. License: LGPLv3.

No matter which format your tabular data is in, rows will import it, automatically detect types and give you high-level Python objects so you can start working with the data instead of trying to parse it. It is also locale- and Unicode-aware. :)

Want to learn more? Read the documentation (or build and browse the docs locally by running make docs-serve after installing requirements-development.txt).

Installation

The easiest way to get your hands dirty is to install rows using pip.

PyPI

pip install rows

For other ways to install, refer to the Installation section of the documentation.

Contribution start guide

The preferred way to start contributing to the project is to create a virtualenv (you can do so using virtualenv, virtualenvwrapper, pyenv or whatever tool you'd like).

Create the virtualenv:

mkvirtualenv rows

Install all plugins' dependencies:

pip install --editable .[all]

Install development dependencies:

pip install -r requirements-development.txt
Comments
  • OverflowError

    After installing the dependencies required by the socios-brasil package, I get the error below when trying to decompress as instructed:

    Traceback (most recent call last):
     File "extract_dump.py", line 27, in <module> 
        import rows
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\__init__.py", line 22, in <module>
        import rows.plugins as plugins
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\__init__.py", line 20, in <module>
        from . import plugin_csv as csv # NOQA
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\rows\plugins\plugin_csv.py", line 34, in <module>
        unicodecsv.field_size_limit(sys.maxsize) 
    OverflowError: Python int too large to convert to C long
    

    Running on Windows 7, 64-bit Anaconda, Python 3.6. Thanks, Marcel Milcent
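The traceback above fails because `sys.maxsize` does not fit in a C long on Windows, where a C long is 32 bits even under 64-bit Python. A common workaround, sketched here with the standard `csv` module rather than `unicodecsv`, is to lower the limit until the call succeeds. The helper name `set_max_field_size_limit` is an assumption made for illustration and is not necessarily the fix the project adopted.

```python
import csv
import sys


def set_max_field_size_limit():
    """Set the largest CSV field size limit the platform accepts.

    Hypothetical helper: starts at sys.maxsize and halves the value
    whenever the C layer rejects it with OverflowError.
    """
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform.
            limit //= 2


applied_limit = set_max_field_size_limit()
```

Calling `csv.field_size_limit()` with no argument afterwards returns the limit that was actually applied.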

    opened by milcent 13
  • PDF Plugin

    Create an algorithm to automatically extract tables from PDFs (available in text format). We could use pdftables, but its code is not up to date, does not work with Python 3, etc.

    enhancement plugin 
    opened by turicas 7
  • Converter PDF x TXT

    Good morning, I am trying to convert a scanned PDF file to text (the PDF contains tables). I managed to install the rows library and the dependencies rows[pdf] and rows[cli]. When I run the following in the command prompt: rows pdf-to-text teste.pdf result.txt I get the following error: [image]

    Any idea what the problem might be?

    opened by Danielydsm 6
  • Autodetect delimiter in CSV files

    Currently the import_from_csv method has a 'delimiter' parameter that defaults to ',', but sometimes we don't know the delimiter and need it to be autodetected. This is especially useful for CSV files generated by MS Excel, which uses ';' as the delimiter.

    A quick and dirty way to make this work is to count how many times ',', ';' and tab appear in the file and assume the most frequent one is the delimiter.
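The counting heuristic described above could look roughly like the sketch below. The function name `guess_delimiter` and its parameters are assumptions made for illustration and are not part of the rows API.

```python
from collections import Counter

# Candidate delimiters named in the proposal: comma, semicolon and tab.
CANDIDATES = (",", ";", "\t")


def guess_delimiter(sample, candidates=CANDIDATES, default=","):
    """Return the candidate character that occurs most often in `sample`.

    Hypothetical helper: falls back to `default` when none of the
    candidates appears in the sample at all.
    """
    counts = Counter(char for char in sample if char in candidates)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


excel_style = "name;city\nAna;Recife\n"  # made-up sample data
delimiter = guess_delimiter(excel_style)
```

Raw counting can be fooled by delimiters that appear inside quoted fields. The standard library's `csv.Sniffer().sniff(sample, delimiters=",;\t")` is an existing alternative that looks at the structure of the sample rather than raw frequencies.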

    enhancement help wanted plugin 
    opened by jeanferri 6
  • OverflowError: Python int too large to convert to C long

    Good morning!

    I am learning Python, so this may be a very simple error to fix, but even so I have no idea what can be done:

    When I try to import rows, the message in the title appears.

    duplicate 
    opened by tbmpereira 5
  • Text plugin is not working on `rows convert`

    The file cha-de-bebe.txt is not being read correctly on the command line (try rows print cha-de-bebe.txt or rows convert cha-de-bebe.txt cha-de-bebe.csv) -- but it was generated correctly using rows print http://some-url/ > cha-de-bebe.txt.

    @jsbueno could you please help check it? I think this bug started after your PR #270.

    bug 
    opened by turicas 5
  • locale.Error: unsupported locale setting

    ======================================================================
    ERROR: test_DecimalField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 203, in test_DecimalField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_FloatField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 171, in test_FloatField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_IntegerField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 144, in test_IntegerField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_PercentField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 250, in test_PercentField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_locale_context (tests.tests_localization.LocalizationTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_localization.py", line 41, in test_locale_context
        with locale_context(name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    opened by ignatenkobrain 5
  • Porting rows to Python3

    This is a work in progress.

    I managed to make all tests pass on Python 3, but 3 are broken on Python 2 because of something I haven't found yet in the type identification system.

    This PR is just to share it with you. Maybe your familiarity with the code can help fix the tests.

    []'s!

    opened by henriquebastos 5
  • UserWarning: Call to deprecated function or class get_active_sheet

    Hi, when I build the package for Debian, the debhelper tools run pybuild, which shows these warnings [1]. I am using the latest source: git20151115.837b41.

    Is something wrong here, or does anyone else have the same problem? Thanks.

    [1] pybuild --test --test-nose -i python{version} -p 2.7 --dir .
    I: pybuild base:184: cd /pkgs/pkg-rows/rows-0.1.1+git20151115.837b41/.pybuild/pythonX.Y_2.7/build; python2.7 -m nose tests
    ...................................................................................................../usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    /usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):
    ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property).
      def get_active_sheet(self):

    ..........................

    Ran 129 tests in 1.936s

    OK

    opened by kretcheu 5
  • Add sphinx documentation

    Hello dear reviewer,

    I basically did three things:

    • Add Sphinx to requirements-development.txt
    • Create basic documentation, based on the README, with a few improvements I've made.
    • Move some basic project information (intro and architecture) to the __init__.py of the rows module

    I think the Sphinx docs can also be used as a website, and maybe can be hosted on GitHub Pages.

    []'s I hope this will be useful! :)

    opened by raphapassini 5
  • Could not find import_from_pdf function

    I need to import data from pdf and found this example: https://gist.github.com/turicas/6b9ca83dcd531a6cd4fd87ced2a28c70

    But I was unable to run it, since the import_from_pdf is not available to me.

    I have already run the command: pip install rows[all]

    Is pdf format no longer supported?

    opened by marcellalves 4
  • New release on pypi

    I started using the "rows" lib today, and I lost several hours of work because of a bug with empty cells in ODS input. Here is my story.

    I was learning the "rows" lib with an ODS file and came across a strange behavior. Of course, I thought it was because I wasn't using the lib properly, so I tried all possible options, searched the Internet, etc. After several hours, I eventually tried the same code with an equivalent XLSX file and found that the behavior was different! So I realized that I had found a bug on my first day using the rows lib!

    I decided that I should report the bug, and took the time to write a script to illustrate my bug report. I was using rows 0.4.1 from PyPI but, before creating the bug report on GitHub, I thought I should check whether the bug was still present in the "develop" branch... and my script shows that the bug is fixed in the "develop" branch!

    Release 0.4.1 is dated Feb 14, 2019... almost 4 years old! There have been 210 commits since 0.4.1; among these 210 commits, I counted about 45 fixes. While going through the commit messages that mention a fix, I found the commit that fixes my bug: issue #320, fixed on March 27, 2019 in this commit: https://github.com/turicas/rows/commit/c569f9415f2c76b2f6e9afbe1d748946e759711f

    So, in December 2022, some users are wasting hours because of a bug that was found and fixed 3.5 years ago :-( No comment!

    So, please, push a new release to PyPI!

    opened by alexis-via 2
  • Replace unicodecsv by standard csv module

    unicodecsv has not been maintained for a while now [1]. It was preferred over the standard csv module because of its Unicode support. Now that the Python 3 csv module [2] supports Unicode, let's use it.

    For more context, we hit issues while rebuilding unicodecsv during the Fedora Python 3.11 mass rebuild [3][4].

    [1] https://github.com/jdunck/python-unicodecsv
    [2] https://docs.python.org/3/library/csv.html
    [3] https://copr.fedorainfracloud.org/coprs/g/python/python3.11/package/python-unicodecsv/
    [4] https://bugzilla.redhat.com/show_bug.cgi?id=2021938
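As a rough illustration of why the replacement is feasible, the sketch below reads non-ASCII text with the Python 3 standard `csv` module alone. The helper name `read_csv_rows` and the sample data are made up for this example. It does not show how rows itself would be changed.

```python
import csv
import io


def read_csv_rows(fobj, delimiter=","):
    """Read a text-mode file object into a list of dict-like rows.

    Hypothetical helper: in Python 3 the csv module works on str,
    so no separate Unicode-aware wrapper is required.
    """
    reader = csv.DictReader(fobj, delimiter=delimiter)
    return list(reader)


# Made-up sample containing accented characters.
sample = io.StringIO("nome,cidade\nÁlvaro,São Paulo\n")
rows_read = read_csv_rows(sample)
```

For files on disk, the csv documentation recommends opening them with `newline=""` and an explicit encoding, for example `open(path, encoding="utf-8", newline="")`.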

    opened by jcapiitao 1
  • NameError: name 'obj' is not defined

    This error came up when I tried to use the closest_same_column method in rows.plugins.pdf [image]

    Apparently the code here is missing the part where we fetch the object that has the value passed as a parameter so we can work with it (and apparently the same happens with the other method, closest_same_line).

    opened by dehatanes 0
  • Python 3.10: cannot import name 'Iterator' from 'collections'

    File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/rows/plugins/utils.py", line 20, in <module> 
    from collections import Iterator, OrderedDict            
    ImportError: cannot import name 'Iterator' from 'collections'
    

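    Maybe this will fix it:

    try:
        # Python 3.3+: the abstract base classes live in collections.abc
        from collections.abc import Iterator
    except ImportError:
        # Fallback for older Python versions
        from collections import Iterator
    from collections import OrderedDict  # still importable from collections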
    
    opened by fagci 0
  • [pgimport] Option to do not store values as NULL

    NULL values can be confusing when analyzing data and there will be some cases where we prefer to add empty values as empty strings instead of NULL. The function pgimport (and the CLI equivalent) should have an option to deal with this scenario.
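Until such an option exists, one possible approach is to preprocess the rows before they are handed to the import step. The sketch below is only an illustration: `nulls_to_empty` is a hypothetical helper, it is not part of rows, and it makes no assumption about the signature of pgimport.

```python
def nulls_to_empty(rows_iter):
    """Yield copies of dict rows with None values replaced by ''.

    Hypothetical helper, independent of rows and pgimport.
    """
    for row in rows_iter:
        yield {key: "" if value is None else value for key, value in row.items()}


# Made-up sample rows.
raw_rows = [{"name": "Ana", "phone": None}, {"name": None, "phone": "555"}]
cleaned = list(nulls_to_empty(raw_rows))
```

Empty strings and NULL are not interchangeable for numeric or date columns, so a real option would probably need to apply only to text fields.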

    enhancement cli plugin utils 
    opened by turicas 0
Releases(v0.4.1)
Owner
Álvaro Justen
Free/libre software hacker, hypnotist, remote worker, teacher, coffee lover/roaster