create custom test databases that are populated with fake data

About

Generate databases filled with fake but valid data for test purposes, using the most popular patterns (AFAIK). Currently supported: sqlite, mysql, postgresql, mongodb, redis, couchdb.

Installation

Installing from PyPI pulls in 'fake-factory' as the main dependency.

pip install fake2db

Optional requirements

PostgreSQL
pip install psycopg2

psycopg2 needs pg_config on your system in order to install.

On Mac, the solution is to install postgresql:

brew install postgresql

On CentOS, the solution is to install postgresql-devel:

sudo yum install postgresql-devel

MongoDB

pip install pymongo

Redis

pip install redis

MySQL

MySQL Connector/Python is needed for MySQL database generation:

http://dev.mysql.com/downloads/connector/python/

CouchDB

pip install couchdb

Usage

--rows argument is the number of rows to generate (integer).

--db argument takes one of 6 options: sqlite, mysql, postgresql, mongodb, redis, couchdb

--name argument is OPTIONAL. When it is absent, fake2db names the database randomly.

--host argument is OPTIONAL. Hostname to use for database connection. Not used for sqlite.

--port argument is OPTIONAL. Port to use for database connection. Not used for sqlite.

--username argument is OPTIONAL. Username for the database user.

--password argument is OPTIONAL. Password for database user. Only supported for mysql & postgresql.

--locale argument is OPTIONAL. The localization of data to be generated ('en_US' as default).

--seed argument is OPTIONAL. Integer for seeding random generator to produce the same data set between runs. Note: uuid4 values still generated randomly.
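The note about uuid4 can be illustrated with the standard library alone. The sketch below is not fake2db code; it assumes that the uuid4 column is produced by something like Python's `uuid.uuid4()`, which reads from the operating system's entropy source and therefore ignores any seed given to `random`. `sample_rows` is a hypothetical helper.

```python
import random
import uuid


def sample_rows(seed, count=3):
    """Return `count` pseudo-random integers from an explicitly seeded generator."""
    rng = random.Random(seed)  # independent generator, seeded explicitly
    return [rng.randint(0, 9999) for _ in range(count)]


# The same seed reproduces the same values on every run.
print(sample_rows(1337) == sample_rows(1337))

# uuid.uuid4() draws from the OS entropy source, so re-seeding `random`
# before each call does not make the result repeatable.
random.seed(1337)
first = uuid.uuid4()
random.seed(1337)
second = uuid.uuid4()
print(first != second)
```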

fake2db --rows 200 --db sqlite

fake2db --rows 1500 --db postgresql --name test_database_postgre

fake2db --db postgresql --rows 2500 --host container.local --password password --username docker

fake2db --rows 200 --db sqlite --locale cs_CZ --seed 1337
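If fake2db is driven from a Python test harness, the command line can be assembled programmatically. This is a minimal sketch, not part of fake2db: `build_fake2db_command` and `run_fake2db` are hypothetical helpers, and they use only the flags documented in this README. Running the command assumes the `fake2db` executable is on PATH.

```python
import subprocess


def build_fake2db_command(rows, db, name=None, host=None, port=None,
                          username=None, password=None, locale=None,
                          seed=None, custom=None):
    """Assemble a fake2db argument list from the flags this README documents."""
    command = ["fake2db", "--rows", str(rows), "--db", db]
    optional = [("--name", name), ("--host", host), ("--port", port),
                ("--username", username), ("--password", password),
                ("--locale", locale), ("--seed", seed)]
    for flag, value in optional:
        if value is not None:
            command += [flag, str(value)]
    if custom:
        # --custom takes one or more column items
        command += ["--custom"] + list(custom)
    return command


def run_fake2db(**options):
    """Run fake2db and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_fake2db_command(**options), check=True)


print(build_fake2db_command(rows=200, db="sqlite", locale="cs_CZ", seed=1337))
# → ['fake2db', '--rows', '200', '--db', 'sqlite', '--locale', 'cs_CZ', '--seed', '1337']
```

Passing the arguments as a list avoids shell quoting problems with passwords or database names.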

In addition to the databases supported in the db argument, you can also run fake2db with FoundationDB SQL Layer. Once SQL Layer is installed, simply use the postgresql generator and specify the SQL Layer port. For example:

fake2db --rows 200 --db postgresql --port 15432

Custom Database Generation

If you want to create a custom db/table, provide the --custom parameter followed by the column items you want. All the column items you can currently use are mapped here:

https://github.com/emirozer/fake2db/blob/master/fake2db/custom.py

Feed any keys you want to the custom flag:

fake2db --rows 250 --db mysql --username mysql --password somepassword --custom name date country

fake2db --rows 1500 --db mysql --password randompassword --custom currency_code credit_card_full credit_card_provider

fake2db --rows 20 --db mongodb --custom name date country

Sample output - SQLite

Screenshot
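A generated SQLite file can also be inspected from Python instead of a GUI. The sketch below uses only the standard `sqlite3` module and makes no assumption about which tables fake2db creates; `table_row_counts` is a hypothetical helper that reads the table names from `sqlite_master`.

```python
import sqlite3


def table_row_counts(db_path):
    """Map each table name in the SQLite file to its row count."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"{}"'.format(name.replace('"', '""'))
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM " + quoted).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Call it with the path of the file fake2db produced. Note that `sqlite3.connect` silently creates an empty database if the path does not exist, so a wrong path yields an empty dictionary rather than an error.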

Comments
  • Error when running cli

    I tried with and without a virtualenv for dependency isolation, using the following command:

    fake2db --db postgresql --host 0.0.0.0 --port 5632 --password password --username user --rows 10 --name mydb

    Which gave me:

    File "/usr/local/bin/fake2db", line 7, in <module>
        from fake2db.fake2db import main
      File "/usr/local/lib/python3.5/site-packages/fake2db/fake2db.py", line 6, in <module>
        from custom import faker_options_container
    ImportError: No module named 'custom'
    

    But the custom.py file exists, it's just not finding it. Installed using pip 9.0.1 and python 2.7.10.

    sudo seems to work to bypass the error. Not sure why it needs it, but this is probably unrelated to the specific package, so I'll close this.

    opened by christabor 5
  • Allow faster data loading.

    I would like to create a table with a million rows or more to stress test a service I am evaluating.

    The target is a server hosted on AWS (RDS MySQL 5.6). My MacBook's CPU is essentially idle during the load, so it is definitely network bound.

    I'm able to load about 3000 rows per 5 minutes, which is really slow when you want to generate a very large table (e.g. many millions of rows).

    Some possible suggestions:

    1. bulk insert
    2. save to a csv, and then call a sql command to load from csv
    3. create a pool of connections and insert through each connection
    opened by thomasdziedzic 5
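The first suggestion above (bulk insert) can be sketched as follows. This is not how fake2db inserts rows; it is a generic illustration that swaps in an in-memory SQLite database for MySQL so that it runs anywhere. The `customer` table, its columns and `insert_in_batches` are hypothetical. With MySQL Connector/Python the placeholder style would be `%s` rather than `?`.

```python
import sqlite3


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert (name, country) tuples with one executemany call and commit per batch."""
    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(
            "INSERT INTO customer (name, country) VALUES (?, ?)", batch)
        conn.commit()  # one commit per batch instead of one per row
        inserted += len(batch)
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, country TEXT)")
rows = [("name_%d" % i, "country_%d" % (i % 10)) for i in range(2500)]
print(insert_in_batches(conn, rows))  # → 2500
```

Over a high-latency link the gain comes from fewer round trips, so the batch size is the main knob to tune.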
  • Allow custom schemata

    Let us specify a custom schema (via a schema.json file or something similar). Something compatible with https://github.com/topliceanu/mongoose-gen would be great.

    opened by Californian 5
  • Missing faker_options_container

    SUMMARY: Followed the README and attempted to run fake2db.py, but encountered what looks like a missing dependency. The steps below replicate what I did, including confirmation of the dependencies:

    ERROR:

    Traceback (most recent call last):
      File "fake2db.py", line 6, in <module>
        from .custom import faker_options_container
    ValueError: Attempted relative import in non-package

    STEPS:

    # pip install fake2db
    Requirement already satisfied: fake2db in /Users/jason/Documents/Scripts/fake2db
    Requirement already satisfied: fake-factory>=0.5.3 in /usr/local/lib/python2.7/site-packages (from fake2db)

    # pip install psycopg2
    Requirement already satisfied: psycopg2 in /usr/local/lib/python2.7/site-packages

    # cd /Users/jason/Documents/Scripts/fake2db
    # ls
    LICENSE docs fake2db.egg-info setup.py README.md fake2db requirements.txt

    # pip install -r requirements.txt
    Obtaining file:///Users/jason/Documents/Scripts/fake2db (from -r requirements.txt (line 3))
    Requirement already satisfied: fake-factory>=0.5.3 in /usr/local/lib/python2.7/site-packages (from fake2db==0.5.2->-r requirements.txt (line 3))
    Installing collected packages: fake2db
    Found existing installation: fake2db 0.5.2
    Uninstalling fake2db-0.5.2:
    Successfully uninstalled fake2db-0.5.2
    Running setup.py develop for fake2db
    Successfully installed fake2db

    # cd fake2db
    # ls
    __init__.py custom.py mongodb_handler.py redis_handler.py base_handler.py fake2db.py mysql_handler.py sqlite_handler.py couchdb_handler.py helpers.py postgresql_handler.py

    # python fake2db.py --db postgresql --host 127.0.0.1 --port 5432 --password fakepassword --username postgres --name TESTDB --rows 100 --custom address

    Traceback (most recent call last):
      File "fake2db.py", line 6, in <module>
        from .custom import faker_options_container
    ValueError: Attempted relative import in non-package

    opened by JasonBrannon 4
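The error in this report is a general Python behaviour rather than a missing dependency: a file that uses a relative import (`from .custom import ...`) cannot be run directly as a script, because Python then does not know which package it belongs to. The sketch below reproduces this with a throwaway package; `demo_pkg` and its files are hypothetical and have nothing to do with fake2db's own layout. Whether `python -m fake2db.fake2db` works for fake2db itself is not confirmed by the thread; the installed `fake2db` command is the documented entry point.

```python
import os
import subprocess
import sys
import tempfile


def build_demo_package(root):
    """Create root/demo_pkg containing a module that uses a relative import."""
    pkg = os.path.join(root, "demo_pkg")
    os.makedirs(pkg)
    files = {
        "__init__.py": "",
        "custom.py": "VALUE = 'imported'\n",
        "main.py": "from .custom import VALUE\nprint(VALUE)\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(body)
    return pkg


def run(args, cwd):
    """Run the current interpreter with args; return (exit code, stdout)."""
    proc = subprocess.run([sys.executable] + args, cwd=cwd,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stdout.strip()


root = tempfile.mkdtemp()
pkg = build_demo_package(root)

# Run the file directly from inside the package directory: the import fails.
as_script = run(["main.py"], cwd=pkg)
# Run it as a module from the parent directory: the relative import resolves.
as_module = run(["-m", "demo_pkg.main"], cwd=root)
print(as_script[0] != 0, as_module)
```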
  • name not taken in consideration for mysql

    The name argument is not taken into consideration when creating a mysql db. None should be replaced by args.name here: https://github.com/emirozer/fake2db/blob/master/fake2db/fake2db.py#L161

    opened by zaher-mh 3
  • Python socket package throws error in fake2db_logger under helpers.py

    $ fake2db --help
    Traceback (most recent call last):
      File "/usr/local/bin/fake2db", line 9, in <module>
        load_entry_point('fake2db==0.2.2', 'console_scripts', 'fake2db')()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 357, in load_entry_point
        return get_distribution(dist).load_entry_point(group, name)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2394, in load_entry_point
        return ep.load()
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2108, in load
        entry = __import__(self.module_name, globals(),globals(), ['__name__'])
      File "/Library/Python/2.7/site-packages/fake2db/fake2db.py", line 7, in <module>
        logger, extra_information = fake2db_logger()
      File "/Library/Python/2.7/site-packages/fake2db/helpers.py", line 13, in fake2db_logger
        local_ip = socket.gethostbyname(socket.gethostname())
    socket.gaierror: [Errno 8] nodename nor servname provided, or not known
    

    I don't think it's an internet issue, as I am connecting to the IP of a local VM. What are your thoughts?

    opened by aneeshvaidya 3
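The traceback above ends in `socket.gethostbyname(socket.gethostname())`, which fails when the machine's own hostname cannot be resolved (no DNS or hosts-file entry), regardless of which database host is targeted. A defensive variant could look like the sketch below. It is not the project's actual fix; `local_ip` and its `default` parameter are hypothetical.

```python
import socket


def local_ip(default="127.0.0.1"):
    """Resolve this machine's hostname to an IP, or fall back to `default`."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except (socket.gaierror, socket.herror):
        # The hostname has no DNS or hosts-file entry; use the fallback.
        return default


print(local_ip())
```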
  • Removed duplicity of dependency packages listing

    This is a good technique to avoid typing the same package names and versions in different places (setup.py and requirements files). It also makes it possible to test different versions in development while keeping setup.py functional.

    opened by mauricioabreu 3
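One common way to keep a single list of dependencies is to have setup.py read the requirements file. The sketch below is an assumption about the general technique, not the code of this pull request; `parse_requirements` is a hypothetical helper whose result would be passed to `install_requires`.

```python
def parse_requirements(text):
    """Return requirement specifiers, skipping blanks, comments and pip options."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        # Lines starting with "-" are pip options such as "-e ." or "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


sample = "# core\nfake-factory>=0.5.3\n\n-e .\n"
print(parse_requirements(sample))  # → ['fake-factory>=0.5.3']
```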
  • AttributeError: 'NoneType' object has no attribute 'execute'

    Cool idea; wanted to play with this.

    OSX 10.9.5, MySQL 5.5.38, Python 2.7.5

    Installed http://dev.mysql.com/downloads/connector/python/

    Opened new terminal (iTerm2) session

    Ran $ fake2db --rows 200 --db mysql

    Got:

    Traceback (most recent call last):
      File "/usr/local/bin/fake2db", line 8, in <module>
        load_entry_point('fake2db==0.1.5', 'console_scripts', 'fake2db')()
      File "/Library/Python/2.7/site-packages/fake2db/fake2db.py", line 106, in main
        fake_mysql_handler.fake2db_mysql_initiator(host, port, int(args.rows))
      File "/Library/Python/2.7/site-packages/fake2db/mysql_handler.py", line 42, in fake2db_mysql_initiator
        cursor.execute(tables[key])
    AttributeError: 'NoneType' object has no attribute 'execute'
    

    Any thoughts? Thanks!

    opened by ericdorsey 3
  • Clean the database initialize process, and fix the bug that --name is…

    When I use the following command:

    fake2db --db sqlite --name test.db --rows 100
    

    Instead of using test.db as the database file name, fake2db uses a generated filename such as sqlite_JZMYNRKM.db.

    I hope I have fixed the problem, but it is not fully tested for the other engines (postgresql, mysql, ...).

    Thanks a lot.

    opened by huyx 2
  • Not sure how to start with.

    Hi, I've installed fake2db successfully on Ubuntu 14.04, but I'm still unable to run the CLI commands as shown in the README.

    Python 2.7.6 (default, Mar 22 2014, 22:59:56)
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> fake2db --rows 100 --db postgresql --name testdb
      File "<stdin>", line 1
        fake2db --rows 100 --db postgresql --name testdb
        ^
    SyntaxError: invalid syntax

    opened by emprit1988 2
  • Fix how the Postgres connector could only practically connect to localhost

    It's not perfect, but fake2db actually works for me now.

    I tried to make this a minimal change, but there were some things I could not overlook, so there are some extra changes not related to the ability to pass in username/password.

    Tested locally with:

    $ fake2db --db postgresql --rows 1337 --host elk_postgres.local --password password --user docker
    2015-09-02 17:19:25,854 crccheck      Rows argument : 1337
    2015-09-02 17:19:36,184 crccheck      Database created and opened succesfully: postgresql_vsgjqwju
    2015-09-02 17:19:38,134 crccheck      simple_registration Commits are successful after write job!
    2015-09-02 17:19:39,864 crccheck      detailed_registration Commits are successful after write job!
    2015-09-02 17:19:41,488 crccheck      companies Commits are successful after write job!
    2015-09-02 17:19:42,958 crccheck      user_agent Commits are successful after write job!
    2015-09-02 17:19:45,376 crccheck      customer Commits are successful after write job!
    
    opened by crccheck 2
  • Allow multiple columns with same faker key OR allow column naming

    I found myself needing to do the following to replicate a table structure:

    fake2db --rows 1500 --db mysql --name=test_speed --username root --password secret --custom date_time_this_year random_digit_not_null random_digit_not_null uuid4 word boolean boolean boolean random_number random_number word last_name word word word word last_name year
    

    Then I found out that I couldn't use duplicate faker keys for my columns, so using random_digit_not_null twice is not possible.

    I wrote some code in the mysql handler to append keys to the columns in order to allow duplicates (see this commit https://github.com/denitsa-cm/fake2db/commit/73df91998d6c3929e90d5c211f61b765c560510d)

    I do think the mysql_handler is probably not the best place for this: the unique columns should somehow be formatted further up and passed to all handlers. But it's the first time I'm touching Python, so I just hacked it a bit for my purposes.

    Then the command above would result in a table structure like this:

    [image: resulting table structure]

    Would be nice if something like this was supported out of the box for all database engines. Or perhaps even better, the option to name the columns in addition to providing the faker keys. This would alleviate the issue altogether.

    opened by denitsa-md 0
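The idea of making repeated faker keys usable as column names can be sketched independently of any database handler. This is not the code from the linked commit; `unique_column_names` is a hypothetical helper, and the `_2`, `_3` suffix scheme is an assumption.

```python
from collections import Counter


def unique_column_names(keys):
    """Suffix repeated keys so that every column name is distinct."""
    seen = Counter()
    names = []
    for key in keys:
        seen[key] += 1
        # First occurrence keeps the bare key; later ones get _2, _3, ...
        names.append(key if seen[key] == 1 else "%s_%d" % (key, seen[key]))
    return names


print(unique_column_names(["word", "boolean", "boolean", "word", "word"]))
# → ['word', 'boolean', 'boolean_2', 'word_2', 'word_3']
```

Doing this once, before the keys reach the handlers, would give every database engine the same column names.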
  • database already exists

    I inserted 10000 rows into the db named fake. Then I wanted to insert another 10000, but an error occurred.

    fake2db --rows 10000 --db postgresql --name fake --username postgres
    2019-03-18 09:55:19,907 www Rows argument : 20000
    2019-03-18 09:55:20,036 www database "fake" already exists

    Traceback (most recent call last):
      File "/home/www/pyenv/aws/bin/fake2db", line 11, in <module>
        sys.exit(main())
      File "/home/www/pyenv/aws/local/lib/python2.7/site-packages/fake2db/fake2db.py", line 167, in main
        number_of_rows=args.rows, name=args.name, custom=custom)
      File "/home/www/pyenv/aws/local/lib/python2.7/site-packages/fake2db/postgresql_handler.py", line 18, in fake2db_initiator
        cursor, conn = self.database_caller_creator(number_of_rows, **connection_kwargs)
      File "/home/www/pyenv/aws/local/lib/python2.7/site-packages/fake2db/postgresql_handler.py", line 46, in database_caller_creator
        cur.execute('CREATE DATABASE %s;' % dbname)
    psycopg2.ProgrammingError: database "fake" already exists

    Thanks for your great work.

    opened by jjuu 1
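The traceback shows `CREATE DATABASE` being executed unconditionally, which fails on a second run against the same name. One way to avoid that is to check the `pg_database` catalog first. The sketch below is not fake2db code; `ensure_database` is a hypothetical helper that expects a DB-API cursor such as one from psycopg2, on a connection in autocommit mode.

```python
def ensure_database(cursor, dbname):
    """Create `dbname` if it is missing; return True when it was created.

    PostgreSQL refuses CREATE DATABASE inside a transaction block, so the
    cursor's connection is assumed to be in autocommit mode.
    """
    cursor.execute("SELECT 1 FROM pg_database WHERE datname = %s", (dbname,))
    if cursor.fetchone() is not None:
        return False  # already there, nothing to do
    # Identifiers cannot be passed as query parameters, so quote by hand.
    quoted = '"%s"' % dbname.replace('"', '""')
    cursor.execute("CREATE DATABASE %s" % quoted)
    return True
```

The table-creation step would still need its own handling before rows could be appended to an existing database.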
  • How to get auto-increment ID and specify custom column names?

    Hi,

    I just ran this command:

    $ fake2db --rows 3 --db sqlite --custom name date country
    

    And the result that I got:

    [image: resulting db table]

    I have 2 questions here:

    1. How do I get an auto-increment ID? For example, Francis gets id 1 and Robert gets id 3?
    2. How do I rename the custom column names? For example, full_name instead of name and birth_date instead of date.

    Thank you in advance for your help. This project is very cool and helpful. 🙂

    opened by zulhfreelancer 0
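The thread does not say whether fake2db supports either request. For reference, this is what an auto-increment id and custom column names look like in plain SQLite, using only the standard `sqlite3` module. The `person` table and the sample values are made up for illustration (only the names Francis and Robert come from the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # SQLite assigns 1, 2, 3, ...
    " full_name TEXT,"
    " birth_date TEXT,"
    " country TEXT)"
)
people = [
    ("Francis", "1990-01-01", "Canada"),
    ("Dana", "1985-06-15", "Chile"),
    ("Robert", "1978-11-30", "Kenya"),
]
# The id column is omitted from the insert, so SQLite fills it in.
conn.executemany(
    "INSERT INTO person (full_name, birth_date, country) VALUES (?, ?, ?)",
    people,
)
rows = conn.execute("SELECT id, full_name FROM person ORDER BY id").fetchall()
print(rows)  # → [(1, 'Francis'), (2, 'Dana'), (3, 'Robert')]
```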
  • Populate an existing database + schema?

    Hey there, is there any way we could run this against an existing database (with a flag or something is fine) and not get errors? We'd like to add this to our migrations in our dev environments, but we end up having to create a new database, then import it into the primary db.

    We'd be happy to make adjustments in a PR if you want to outline the easiest / best way to do so. Thanks :)

    opened by dwelch2344 1
Releases(1.0.0)
Owner
Emir Ozer
polyglot programmer & distributed systems engineer.