Picka: A Python module for data generation and randomization.

Last update: Nov 30, 2021

Author:	Anthony Long
Version:	1.0.1 - Fixed the broken image stuff. Whoops

What is Picka?

Picka generates randomized data for testing.

Data comes either from an included database of known-good values or from string formatting behind the scenes, which produces realistic, valid data.

Picka has a function for any field you would need filled in. With Selenium, something like the following would populate the "field-name-here" box 100 times with random names.

for x in range(100):
        self.selenium.type('field-name-here', picka.male_name())

But this is just the beginning. Other ways to implement this include using dicts:

user_information = {
        "first_name": picka.male_name(),
        "last_name": picka.last_name(),
        "email_address": picka.email(10, extension='example.org'),
        "password": picka.password_numerical(6),
}

This would provide:

{
        "first_name": "Jack",
        "last_name": "Logan",
        "email_address": "[email protected]",
        "password": "485444"
}

Don't forget: since all of the data is considered "clean", or valid, you can also use it to fill selects and other form fields that only accept pre-defined values. For example, picka.state() might return "Alabama", and you can use that result to select a state directly in an address drop-down box.

Examples:

Selenium

def search_for_garbage():
        selenium.open('http://yahoo.com')
        selenium.type('id=search_box', picka.random_string(10))
        selenium.submit()

def test_search_for_garbage_results():
        search_for_garbage()
        selenium.wait_for_page_to_load('30000')
        # get_xpath_count expects an XPath expression, not an "id=" locator.
        assert int(selenium.get_xpath_count("//*[@id='results']")) == 0

Webdriver

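from selenium import webdriver
import picka

driver = webdriver.Firefox()
driver.get("http://somesite.com")
# Map each field to its CSS selector and a generated value.
x = {
        "name": ["#name", picka.name()]
}
selector, value = x["name"]
# Selenium 4 spells this driver.find_element(By.CSS_SELECTOR, selector).
driver.find_element_by_css_selector(selector).send_keys(value)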

Funcargs / pytest

def pytest_generate_tests(metafunc):
        if "test_string" in metafunc.funcargnames:
                for i in range(10):
                        # The funcarg key must match the test's argument name.
                        metafunc.addcall(funcargs=dict(test_string=picka.random_string(20)))

def test_func(test_string):
        assert test_string.isalpha()
        assert len(test_string) == 20

MySQL / SQLite

first, last, age = picka.first_name(), picka.last_name(), picka.age()
cursor.execute(
   "insert into user_data (first_name, last_name, age) VALUES (?, ?, ?)",
   (first, last, age)
)
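
The insert above can be tried end to end against an in-memory SQLite database. This is a minimal sketch, not part of Picka: fake_user() and the two name lists are hypothetical stand-ins for picka.first_name(), picka.last_name() and picka.age(), so that the example runs without Picka installed, and the table layout is assumed from the column names in the statement above.

```python
import random
import sqlite3

# Hypothetical stand-ins for picka.first_name(), picka.last_name() and
# picka.age(); swap in the real calls when Picka is installed.
FIRST_NAMES = ["Jack", "Anna", "Maria"]
LAST_NAMES = ["Logan", "Smith", "Jones"]


def fake_user():
    """Return one (first_name, last_name, age) row of generated data."""
    return (
        random.choice(FIRST_NAMES),
        random.choice(LAST_NAMES),
        random.randint(18, 90),
    )


def seed_users(connection, count):
    """Create user_data if needed and insert `count` generated rows."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS user_data "
        "(first_name TEXT, last_name TEXT, age INTEGER)"
    )
    # Parameterized placeholders keep generated strings out of the SQL text.
    connection.executemany(
        "INSERT INTO user_data (first_name, last_name, age) VALUES (?, ?, ?)",
        [fake_user() for _ in range(count)],
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
seed_users(connection, 25)
row_count = connection.execute("SELECT COUNT(*) FROM user_data").fetchone()[0]
```

The "?" placeholder is the style sqlite3 uses; common MySQL drivers expect "%s" instead, so the statement may need adjusting there.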

HTTP

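import httplib  # Python 2 name; on Python 3 the module is http.client

def post(host, path, data):
        # Connect to the host only; the path belongs in the request line.
        conn = httplib.HTTPConnection(host)
        conn.request("POST", path, data)
        return conn.getresponse()

def test_post_result():
        post("www.spam.egg", "/bacon.htm", picka.random_string(10))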

Comments

No test suite

Slightly ironic: a test data generation toolkit that doesn't have a test suite.

Also, setup.py doesn't declare Python 3 support, hence the need for a test suite to validate that it works correctly.

opened by jayvdb 1
Additional Functionality for Testers to Add Their Own Data

Picka provides general-purpose data for testing. Building on that work, testers could also supply their own custom data, so that test data is not limited to the preconfigured values. Custom data could then be accessed sequentially, randomly, or all at once.
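
One way the requested feature could look is sketched below. Nothing here exists in Picka: the CustomData class and its method names are hypothetical, chosen only to illustrate the three access modes the request mentions.

```python
import random


class CustomData(object):
    """Hypothetical holder for tester-supplied values (not part of Picka)."""

    def __init__(self, values):
        self._values = list(values)
        if not self._values:
            raise ValueError("at least one value is required")
        self._position = 0

    def sequential(self):
        """Return the next value in order, wrapping around at the end."""
        value = self._values[self._position]
        self._position = (self._position + 1) % len(self._values)
        return value

    def pick(self):
        """Return one value chosen at random."""
        return random.choice(self._values)

    def all(self):
        """Return a copy of every value."""
        return list(self._values)


plans = CustomData(["free", "pro", "enterprise"])
first_three = [plans.sequential() for _ in range(3)]
fourth = plans.sequential()
```

Returning a copy from all() keeps callers from changing the stored values by accident.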

opened by bkuehlhorn 1
Fixed test file, added alternative sentence maker
Fixed usage of number in tests (it takes one arg, not two)

Added sentence_actual, which returns an actual sentence from the Sherlock text.

Added _picka._Book class to hold the text and split sentences read from Sherlock. Users can call sentence() without reading the entire file again and again.

Added test of sentence_actual to picka.tests

The sentence_actual function has some nice features:

You're much less likely to get a sentence fragment

You can specify a minimum and maximum number of words

It should be relatively efficient, because the split sentences are cached by the _Book class.

The sentences aren't always perfect, but I think that has to do with the source. A book other than Sherlock Holmes, preferably one with less dialog, would give more "normal" sentences.
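
The caching idea described above can be sketched roughly as follows. This is not the _Book implementation from the change itself: the SentenceBook class, its regular expression and its parameter names are assumptions made for illustration, and the text is passed in as a string rather than read from the Sherlock file.

```python
import random
import re


class SentenceBook(object):
    """Hypothetical stand-in for a class that caches split sentences."""

    def __init__(self, text):
        self._text = text
        self._sentences = None  # filled in on first use

    def sentences(self):
        """Split the text once and reuse the result on later calls."""
        if self._sentences is None:
            # Split on whitespace that follows ., ! or ? (a rough heuristic).
            parts = re.split(r"(?<=[.!?])\s+", self._text.strip())
            self._sentences = [part for part in parts if part]
        return self._sentences

    def sentence(self, min_words=1, max_words=None):
        """Return a random sentence whose word count is within the bounds."""
        candidates = []
        for candidate in self.sentences():
            words = len(candidate.split())
            if words >= min_words and (max_words is None or words <= max_words):
                candidates.append(candidate)
        if not candidates:
            raise ValueError("no sentence matches the requested length")
        return random.choice(candidates)


book = SentenceBook("Hello there. How are you today? Fine!")
long_sentence = book.sentence(min_words=3)
```

Dialog-heavy text tends to defeat a punctuation-based split like this one, which matches the caveat about the source book.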
opened by TadLeonard 1
Library does not take locale into account
The library assumes an English locale is used (e.g., English-language hardcoded month names). Ideally the library would use locale-dependent constants so that computations are done correctly (e.g., the duration of a month in month_and_day):

>>> locale.setlocale(locale.LC_ALL, 'it_IT')
'it_IT'
>>> picka.month()
'Marzo'
>>> picka.month_and_day()
'Maggio 2'
opened by svisser 0
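
A locale-aware version could lean on the standard library's calendar module, whose month_name array follows the current locale and whose monthrange function reports the length of a month. The sketch below is an assumption about how that might look, not Picka's code; the function shares a name with picka.month_and_day only for readability, and the year parameter is hypothetical.

```python
import calendar
import random


def month_and_day(year=2021):
    """Return a 'Month day' string using locale-dependent month names."""
    month = random.randint(1, 12)
    # monthrange returns (weekday of first day, number of days in month).
    _, days_in_month = calendar.monthrange(year, month)
    day = random.randint(1, days_in_month)
    # calendar.month_name reflects the locale set via locale.setlocale().
    return "%s %d" % (calendar.month_name[month], day)


sample = month_and_day()
```

Because the day is drawn from the real length of the chosen month, the result never names a date such as February 30.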
picka.age will return ages outside of the bounds

If I call picka.age(1, 1) repeatedly, I get both 1 and 2 as results. I would have expected it to always return 1. This situation can occur when variables are passed to picka.age; I don't expect people to write the call this way in their own code.

I can also get ages outside of the bounds when I call picka.age(0, 1), which falls back to the default values and can therefore return any age within the default range.
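
The behavior expected here can be written down in a few lines. This is a sketch of the expectation, not a patch to Picka: bounded_age is a hypothetical name, and the default bounds of 1 and 99 are assumptions.

```python
import random


def bounded_age(min_age=1, max_age=99):
    """Return an age with min_age <= age <= max_age (both inclusive)."""
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    # randint includes both endpoints, so (1, 1) can only yield 1,
    # and an explicit 0 is honored rather than replaced by a default.
    return random.randint(min_age, max_age)


same_bounds = [bounded_age(1, 1) for _ in range(100)]
zero_lower = [bounded_age(0, 1) for _ in range(100)]
```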

opened by svisser 0
Module name means "cunt"

I'm not sure if this is a real issue, but when I look at this module I cannot do so with a straight face. "Picka" is "cunt" in Serbian, Macedonian, Bosnian, Croatian, and I'm unsure as to whether there are other languages where this holds.

While not grounds for any specific action, I find this largely amusing and just wanted to share.

opened by geomaster 2

Releases(v0.96)

v0.96(Jan 17, 2014)

hex, rbg, image and more.
Source code(tar.gz)
Source code(zip)
picka-0.9.6.tar.gz(8.13 MB)
picka-0.9.6.zip(8.18 MB)

Owner

Anthony

GitHub Repository http://antlong.com
