Last update: Jan 07, 2023

Hamilton

The micro-framework to create dataframes from functions.

Specifically, Hamilton is a framework that allows for delayed executions of functions in a Directed Acyclic Graph (DAG). This is meant to solve the problem of creating complex data pipelines. Core to the design of Hamilton is a clear mapping of function name to implementation. That is, Hamilton forces a certain paradigm with writing functions, and aims for DAG clarity, easy modifications, unit testing, and documentation.

For the backstory on how Hamilton came about, see our blog post.

Getting Started

Here's a quick getting started guide to get you up and running in less than 15 minutes.

Installation

Requirements:

Python 3.6 or 3.7

To get started, you first need to install Hamilton. It is published to PyPI under sf-hamilton:

pip install sf-hamilton

While it is installing we encourage you to start on the next section.

Note: the content (i.e. names, function bodies) of our example code snippets is for illustrative purposes only, and doesn't reflect what we actually do internally.

Hamilton in 15 minutes

Hamilton is a new paradigm when it comes to creating dataframes. Rather than thinking about manipulating a central dataframe, you instead think about the column(s) you want to create, and what inputs are required. There is no need for you to think about maintaining this dataframe, meaning you do not need to think about any "glue" code; this is all taken care of by the Hamilton framework.

For example, rather than writing the following to manipulate a central dataframe object df:

df['col_c'] = df['col_a'] + df['col_b']

you write

def col_c(col_a: pd.Series, col_b: pd.Series) -> pd.Series:
    """Creating column c from summing column a and column b."""
    return col_a + col_b

The Hamilton framework will then be able to build a DAG from this function definition.
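
To make the idea concrete, here is a minimal sketch (not Hamilton's actual implementation) of how a function's name and parameter names are enough to describe one node of a DAG and its incoming edges. It uses only the standard library's inspect module. Plain lists stand in for pd.Series, and the helper name dependencies_of is hypothetical.

```python
import inspect


def col_c(col_a: list, col_b: list) -> list:
    """Creating column c from summing column a and column b."""
    return [a + b for a, b in zip(col_a, col_b)]


def dependencies_of(fn) -> dict:
    """Map the node name (function name) to the node names it depends on (its parameters)."""
    return {fn.__name__: list(inspect.signature(fn).parameters)}


edges = dependencies_of(col_c)
print(edges)  # {'col_c': ['col_a', 'col_b']}
```

The function name becomes the output column and each parameter name becomes an upstream dependency, which is why naming matters so much in this paradigm.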

So let's create a "Hello World" and start using Hamilton!

Your first hello world.

By now, you should have installed Hamilton, so let's write some code.

Create a file my_functions.py and add the following functions:

import pandas as pd

def avg_3wk_spend(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend."""
    return spend.rolling(3).mean()

def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups

The astute observer will notice we have not defined spend or signups as functions. That is okay; it just means they need to be provided as inputs when we actually create a dataframe.
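
Because each function is plain Python, you can check it without Hamilton at all. The sketch below is an illustration that assumes pandas is installed. It redefines the two functions from my_functions.py so the block is self-contained, and calls them directly with the same sample data used in the script later in this guide.

```python
import pandas as pd


def avg_3wk_spend(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend."""
    return spend.rolling(3).mean()


def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups


spend = pd.Series([10, 10, 20, 40, 40, 50])
signups = pd.Series([1, 10, 50, 100, 200, 400])

avg = avg_3wk_spend(spend)
ratio = spend_per_signup(spend, signups)

# The first two rows have fewer than 3 observations, so the rolling mean is NaN.
assert avg.isna().tolist() == [True, True, False, False, False, False]
assert ratio.tolist() == [10.0, 1.0, 0.4, 0.4, 0.2, 0.125]
```

Checks like these can live in an ordinary test file, since nothing in the functions depends on the framework.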

Create a file my_script.py, which is where the code that tells Hamilton what to do will live:

import importlib
import logging
import sys

import pandas as pd
from hamilton import driver

logger = logging.getLogger(__name__)
logging.basicConfig(stream=sys.stdout)
initial_columns = {  # load from actuals or wherever -- this is our initial data we use as input.
    'signups': pd.Series([1, 10, 50, 100, 200, 400]),
    'spend': pd.Series([10, 10, 20, 40, 40, 50]),
}
# we need to tell hamilton where to load function definitions from
module_name = 'my_functions'
module = importlib.import_module(module_name)
dr = driver.Driver(initial_columns, module)  # can pass in multiple modules
# we need to specify what we want in the final dataframe.
output_columns = [
    'spend',
    'signups',
    'avg_3wk_spend',
    'spend_per_signup',
]
# let's create the dataframe!
df = dr.execute(output_columns, display_graph=True)
print(df)

Run my_script.py

python my_script.py

You should see the following output:

   spend  signups  avg_3wk_spend  spend_per_signup
0     10        1            NaN            10.000
1     10       10            NaN             1.000
2     20       50      13.333333             0.400
3     40      100      23.333333             0.400
4     40      200      33.333333             0.200
5     50      400      43.333333             0.125

Congratulations - you just created your first dataframe with Hamilton!
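
For intuition about what happens when outputs are requested, here is a toy executor. It is a sketch under simplifying assumptions, not how Hamilton's driver is implemented: plain lists replace pd.Series, the result is a dict rather than a dataframe, and the execute function shown here is a hypothetical stand-in that only shares its name with the driver method.

```python
import inspect


def avg_3wk_spend(spend: list) -> list:
    """Rolling 3 week average spend (None until 3 values are available)."""
    return [
        None if i < 2 else sum(spend[i - 2 : i + 1]) / 3
        for i in range(len(spend))
    ]


def spend_per_signup(spend: list, signups: list) -> list:
    """The cost per signup in relation to spend."""
    return [s / n for s, n in zip(spend, signups)]


def execute(functions, inputs, outputs):
    """Toy executor: resolve each requested name from inputs or by calling its function."""
    by_name = {fn.__name__: fn for fn in functions}
    computed = dict(inputs)  # memoize so each node is computed at most once

    def resolve(name):
        if name not in computed:
            if name not in by_name:
                raise KeyError(f"'{name}' is neither an input nor a defined function")
            fn = by_name[name]
            kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
            computed[name] = fn(**kwargs)
        return computed[name]

    return {name: resolve(name) for name in outputs}


result = execute(
    [avg_3wk_spend, spend_per_signup],
    inputs={"signups": [1, 10, 50, 100, 200, 400], "spend": [10, 10, 20, 40, 40, 50]},
    outputs=["spend", "avg_3wk_spend", "spend_per_signup"],
)
print(result["spend_per_signup"])  # [10.0, 1.0, 0.4, 0.4, 0.2, 0.125]
```

Only the nodes needed for the requested outputs are computed, which is the "delayed execution" idea in miniature.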

License

Hamilton is released under the BSD 3-Clause Clear License. If you need to get in touch about something, contact us at algorithms-opensource (at) stitchfix.com.

Contributing

We take contributions, large and small. We operate via a Code of Conduct and expect anyone contributing to do the same.

To see how you can contribute, please read our contributing guidelines and then our developer setup guide.

Prescribed Development Workflow

In general we prescribe the following:

Ensure you understand Hamilton Basics.
Familiarize yourself with some of the Hamilton decorators. They will help keep your code DRY.
Start creating Hamilton Functions that represent your work. We suggest grouping them in modules where it makes sense.
Write a simple script so that you can easily run things end to end.
Join our discord community to chat/ask Qs/etc.

For the backstory on Hamilton, we invite you to watch the ~9 minute lightning talk on it that we gave at the apply conference: video, slides.

PyCharm Tips

If you're using Hamilton, it's likely that you'll need to migrate some code. Here are some useful tricks we found to speed up that process.

Live templates

Live templates are a cool feature and allow you to type in a name which expands into some code.

For example, we wrote one to make it quick to stub out Hamilton functions: typing graphfunc expands into:

def _(_: pd.Series) -> pd.Series:
   """_"""
   return _

The blanks are where you can tab with the cursor and fill things in. See your PyCharm preferences for setting this up.

Multiple Cursors

If you are doing a lot of repetitive work, you might consider multiple cursors. Multiple cursors allow you to do things on multiple lines at once.

To use it, hit Option + mouse click to create multiple cursors. Press Esc to return to normal mode.

Contributors

Stefan Krawczyk (@skrawcz)
Elijah ben Izzy (@elijahbenizzy)
Danielle Quinn (@danfisher-sf)
Rachel Insoft (@rinsoft-sf)
Shelly Jang (@shellyjang)
Vincent Chu (@vslchusf)

Comments

Decorator to specify value and dag node inputs

Is your feature request related to a problem? Please describe.

I am looking for a decorator to specify both value and dag node inputs. parameterized allows for value inputs and parameterized_inputs allows for dag node inputs, but there are no decorators that allow you to do both.

Describe the solution you'd like

A decorator that allows for both value and dag node inputs.

Describe alternatives you've considered

I tried just decorating a function with both decorators, but that is not supported.
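
One possible shape for such a feature is sketched below. Everything in it is hypothetical and for discussion only: Value, Source, and expand are illustrative names, not part of Hamilton's API. The idea is that a single specification can mix literal values with references to upstream nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A literal bound at definition time (hypothetical marker type)."""
    literal: object


@dataclass(frozen=True)
class Source:
    """The name of an upstream DAG node to read at execution time (hypothetical marker type)."""
    node: str


def expand(fn, **variants):
    """Build one callable per output name from a spec that mixes Value and Source."""
    expanded = {}
    for output_name, spec in variants.items():
        # Bind spec as a default argument so each closure keeps its own copy.
        def node(available, _spec=spec):
            kwargs = {
                param: (ref.literal if isinstance(ref, Value) else available[ref.node])
                for param, ref in _spec.items()
            }
            return fn(**kwargs)
        expanded[output_name] = node
    return expanded


def scaled(series: list, factor: float) -> list:
    """Multiply every element of series by factor."""
    return [x * factor for x in series]


nodes = expand(
    scaled,
    spend_doubled={"series": Source("spend"), "factor": Value(2)},
    signups_halved={"series": Source("signups"), "factor": Value(0.5)},
)
available = {"spend": [10, 20], "signups": [4, 8]}
print(nodes["spend_doubled"](available))   # [20, 40]
print(nodes["signups_halved"](available))  # [2.0, 4.0]
```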

Additional context
enhancement

opened by wangkev 17
[idea] can we make hamilton run in a distributed manner?
What's the idea?

Hamilton runs locally on data that fits in memory on a single core.

Can we improve speed and data scale by making hamilton run in a parallel and distributed manner?

Why do we think it should be possible?

Hamilton ultimately creates a DAG before executing. The idea would be to distribute various parts of this DAG.

We'd have to figure out initial data loading, but other than that, it seems like we're solving something that other systems have already solved for us. Can we harness them? Perhaps by going from a Hamilton DAG and "compiling" it to the target system of choice?
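
As a rough illustration of the idea, the sketch below runs independent nodes of a small hand-written DAG concurrently using only the standard library (graphlib requires Python 3.9+). It is an assumption-laden toy: the dag dictionary and run_parallel are hypothetical, threads stand in for a real distributed backend, and nothing here reflects how Hamilton itself is built.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Node name -> (function, names of upstream nodes). A hypothetical hand-written DAG.
dag = {
    "spend": (lambda: [10, 10, 20], []),
    "signups": (lambda: [1, 10, 50], []),
    "spend_per_signup": (
        lambda spend, signups: [s / n for s, n in zip(spend, signups)],
        ["spend", "signups"],
    ),
    "total_spend": (lambda spend: sum(spend), ["spend"]),
}


def run_parallel(dag, max_workers=4):
    """Run every node whose upstream nodes are finished, one batch at a time."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in dag.items()})
    sorter.prepare()
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()  # nodes with all dependencies satisfied
            futures = {
                name: pool.submit(dag[name][0], *(results[d] for d in dag[name][1]))
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


results = run_parallel(dag)
print(results["total_spend"])  # 40
```

A system such as ray, dask, or spark would take the place of the thread pool; the scheduling question (which nodes are ready to run) stays the same.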

Ideas to explore

[x] ray

[x] spark

[x] dask

[ ] your idea here!

enhancement product idea
opened by skrawcz 14
Add docker container(s) to help run examples
Is your feature request related to a problem? Please describe.

The friction to getting the examples up and running is installing the dependencies. A docker container with them already provided would reduce friction for people to get started with Hamilton.

Describe the solution you'd like

A docker container, that has different python virtual environments, that has the dependencies to run the examples.

The container has the hamilton repository checked out -- so it has the examples folder.

Then using it would be:

docker pull image

docker start image

activate python virtual environment

run example

Describe alternatives you've considered

Not doing this.

Additional context

This was a request from a Hamilton talk.
documentation good first issue help wanted
opened by skrawcz 13
Better error handling for errors in the execute() stage

Is your feature request related to a problem? Please describe.

When my code runs into errors in the execute() stage, the traceback just goes to the execute() function and doesn't provide any useful feedback about the provenance of the error. For example, if my DAG feeds a series into another series and they have incompatible indexes, I get an index error but I can't tell which series caused it.

Describe the solution you'd like

Just passing back the names of the relevant series with the error would be helpful.

Describe alternatives you've considered

More involved type checking when constructing the series functions and then when building the DAG could potentially eliminate some of the errors (probably not all, though).
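
A minimal sketch of the requested behavior, assuming each node is executed through a small wrapper. NodeExecutionError and run_node are hypothetical names used for illustration and are not part of Hamilton.

```python
class NodeExecutionError(Exception):
    """Hypothetical error type that records which node failed and with which inputs."""


def run_node(fn, **kwargs):
    """Call fn and, on failure, re-raise with the node name and input names attached."""
    try:
        return fn(**kwargs)
    except Exception as e:
        raise NodeExecutionError(
            f"Node '{fn.__name__}' failed with inputs {sorted(kwargs)}: {e!r}"
        ) from e


def spend_per_signup(spend: list, signups: list) -> list:
    """The cost per signup in relation to spend."""
    return [s / n for s, n in zip(spend, signups)]


try:
    run_node(spend_per_signup, spend=[10, 20], signups=[1, 0])
except NodeExecutionError as err:
    message = str(err)
    cause = err.__cause__  # the original exception is preserved by "raise ... from"
    print(message)
```

Chaining with "raise ... from" keeps the original traceback available while putting the node name at the top of the report.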
enhancement

opened by ropeladder 11

ValueError with typing.Union type signature

hamilton raises a ValueError when using typing.Union, even if the type signature has the input type.

Current behavior

Error when using typing.Union types.

Stack Traces

ValueError: 1 errors encountered:
  Error: Type requirement mismatch. Expected x:typing.Union[int, pandas.core.series.Series] got 1 instead.

Steps to replicate behavior

run.py

from hamilton.driver import Driver
from hamilton.base import SimplePythonGraphAdapter, DictResult
import lib

initial_columns = {'x': 1}
adapter = SimplePythonGraphAdapter(DictResult)
dr = Driver(initial_columns, lib, adapter=adapter)
print(dr.execute(['foo_union']))

lib.py

import typing as t
import pandas as pd


def foo_union(x: t.Union[int, pd.Series]) -> t.Union[int, pd.Series]:
    '''foo union'''
    return x + 1

Library & System Information

python 3.6, hamilton 1.9.0, tested on mac and linux.

Expected behavior

To be able to run the DAG.

Additional context

Everything works if there are functions that have typing.Any as type signature. For example, the above will run if we add foo_any to lib.py:

import typing as t
import pandas as pd


def foo_union(x: t.Union[int, pd.Series]) -> t.Union[int, pd.Series]:
    '''foo union'''
    return x + 1


def foo_any(x: t.Any) -> t.Any:
    '''foo any'''
    return x + 1
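
For reference, a Union-aware input check can be sketched with the standard typing helpers (typing.get_origin and typing.get_args need Python 3.8+). This is an illustration only: matches is a hypothetical helper, it is not the check Hamilton performs, and it handles typing.Any, typing.Union, and plain classes but not other generics such as typing.List[int].

```python
import typing


def matches(value, annotation) -> bool:
    """Return True if value satisfies annotation (Any, Union, and plain classes only)."""
    if annotation is typing.Any:
        return True
    if typing.get_origin(annotation) is typing.Union:
        # A Union is satisfied if any one of its member types is satisfied.
        return any(matches(value, arg) for arg in typing.get_args(annotation))
    return isinstance(value, annotation)


IntOrList = typing.Union[int, list]
print(matches(1, IntOrList))       # True
print(matches([1, 2], IntOrList))  # True
print(matches("1", IntOrList))     # False
```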

enhancement

opened by wangkev 11

Data quality
[Short description explaining the high-level reason for the pull request]

Changes

Testing

Notes

Checklist

[x] PR has an informative and human-readable title (this will be pulled into the release notes)

[x] Changes are limited to a single goal (no scope creep)

[x] Code can be automatically merged (no conflicts)

[x] Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.

[x] Passes all existing automated tests

[x] Any change in functionality is tested

[x] New functions are documented (with a description, list of inputs, and expected output)

[x] Placeholder code is flagged / future TODOs are captured in comments

[x] Project documentation has been updated if adding/changing functionality.

[x] Reviewers requested with the Reviewers tool :arrow_right:

Testing checklist

Python - local testing

[ ] python 3.6

[ ] python 3.7
opened by elijahbenizzy 11
[RFC] consolidate CI code into shell scripts
Is your feature request related to a problem? Please describe.

All of this project's CI configuration currently lives in a YAML file, https://github.com/stitchfix/hamilton/blob/main/.circleci/config.yml.

This introduces some friction to development in the following ways:

adding a new CI job involves adding a new block of YAML that is mostly the same as the others (see, for example, https://github.com/stitchfix/hamilton/commit/e2ad1367488f73d9d74fb9fd4f0c337f6713063b)

running tests locally involves copying and pasting commands from that YAML file

duplication of code across jobs makes it a bit more difficult to understand what is different between them

Describe the solution you'd like

I'd like to propose the following refactoring of this project's CI jobs:

put CI code into one or more shell scripts in a directory like .ci/

use environment variables to handle the fact that some jobs are slightly different from others (e.g. the dask jobs don't require installing ray)

change .circleci/config.yml so that it runs those shell scripts

document in CONTRIBUTING.md how to run the tests in Docker locally, with commands that can just directly be copied and run by contributors, like this:

docker run \
    --rm \
    -v $(pwd):/app \
    --workdir /app \
    --env BUILD_TYPE="dask" \
    -it circleci/python:3.7 \
    bash .ci/test.sh

Describe alternatives you've considered

Using a Makefile instead of shell scripts could also work for this purpose, but in my experience shell scripts are understood by a broader range of people and have fewer surprises around quoting, interpolation, and exit codes.

Additional context

A similar pattern to the one I'm proposing has been very very useful for us in LightGBM.

Consider, for example, how many different job configurations the CI for that project's R package uses (https://github.com/microsoft/LightGBM/blob/3ad26a499614cf0af075ce4ea93b880bcc69b6bb/.github/workflows/r_package.yml) and how little repetition there is across jobs.

If you all are interested in trying this out, I'd be happy to propose some PRs.

Thanks for considering it!
opened by jameslamb 9
Make graphviz optional
Makes graphviz an optional dependency.

Fixes #26.

Additions

Helpful error message if graphviz isn't installed and the user tries to plot

Removals

Dependency on graphviz

Testing

Not really sure how to test this.

Surely there is a way to mock existence or non-existence of a package? Cursory googling did not reveal one.
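
One way to simulate a missing package in a test is to set its sys.modules entry to None, which makes any import of that name raise ImportError. The sketch below shows that technique with unittest.mock.patch.dict. The require helper is a hypothetical stand-in for the import guard, not the code in this PR.

```python
import importlib
import sys
from unittest import mock


def require(module_name: str):
    """Import an optional dependency, or raise an ImportError that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"'{module_name}' is required for this feature. Try: pip install {module_name}"
        ) from e


# Setting a sys.modules entry to None makes any import of that name fail,
# which simulates the package being absent even if it is installed.
with mock.patch.dict(sys.modules, {"graphviz": None}):
    try:
        require("graphviz")
        message = None
    except ImportError as err:
        message = str(err)

print(message)
```

patch.dict restores sys.modules when the with block exits, so other tests are unaffected.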

Todos

Test maybe? If we can figure out how

Checklist

I'm not sure I can do all of these things:

[x] PR has an informative and human-readable title

[x] Changes are limited to a single goal (no scope creep)

[x] Code can be automatically merged (no conflicts)

[ ] Code follows the standards laid out in the TODO link to standards

[x] Passes all existing automated tests

[ ] Any change in functionality is tested

[x] New functions are documented (with a description, list of inputs, and expected output)

[x] Placeholder code is flagged / future todos are captured in comments

[ ] Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)

[ ] Reviewers requested with the Reviewers tool :arrow_right:

Testing checklist

Python

[ ] python 3.6

[ ] python 3.7
opened by ivirshup 9

PandasDataFrameResult: Convert non-list values to single row frame

When trying to run an intermediate node which produces a scalar in the hello_world example, Hamilton throws an error:

WARNING:hamilton.base:It appears no Pandas index type was detected. This will likely break when trying to create a DataFrame. E.g. are you requesting all scalar values? Use a different result builder or return at least one Pandas object with an index. Ignore this warning if you're using DASK for now.
ERROR:hamilton.driver:-------------------------------------------------------------------
Oh no an error! Need help with Hamilton?
Join our slack and ask for help! https://join.slack.com/t/hamilton-opensource/shared_invite/zt-1bjs72asx-wcUTgH7q7QX1igiQ5bbdcg
-------------------------------------------------------------------

Traceback (most recent call last):
  File "my_script.py", line 29, in <module>
    df = dr.execute(output_columns)
  File "/Users/ian.hoffman/src/hamilton/examples/hello_world/.venv/lib/python3.8/site-packages/hamilton/driver.py", line 142, in execute
    raise e
  File "/Users/ian.hoffman/src/hamilton/examples/hello_world/.venv/lib/python3.8/site-packages/hamilton/driver.py", line 139, in execute
    return self.adapter.build_result(**outputs)
  File "/Users/ian.hoffman/src/hamilton/examples/hello_world/.venv/lib/python3.8/site-packages/hamilton/base.py", line 171, in build_result
    raise ValueError(f"Cannot build result. Cannot handle type {value}.")
ValueError: Cannot build result. Cannot handle type 28.333333333333332.

If we can run an entire DAG, it seems like we should be able to run any sub-DAG of the DAG.

Changes

Updates PandasDataFrameResult.build_result() to convert scalar values into dataframes.

How I tested this

Updated unit tests.

Checklist

[X] PR has an informative and human-readable title (this will be pulled into the release notes)
[X] Changes are limited to a single goal (no scope creep)
[X] Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
[X] Any change in functionality is tested
[X] New functions are documented (with a description, list of inputs, and expected output)
[X] Placeholder code is flagged / future TODOs are captured in comments
[X] Project documentation has been updated if adding/changing functionality.

opened by ianhoffman 8

Prototype Data Quality Feature
Is your feature request related to a problem? Please describe. When creating pipelines, data issues can silently wreak havoc; your code didn't change but the data did and now things are wonky...

To combat that, there are projects like pandera that allow you to annotate functions with expectations, and at runtime, have those expectations checked and appropriately exposed.

Hamilton should have some support for runtime data quality checks, so that we can support not only clean code bases but also clean data.

Describe the solution you'd like We should prototype an approach where there is:

A way to set expectations on the output of a function, i.e. what the data should look like.

Use those expectations either right after function execution, or on conclusion of a Hamilton DAG, or some other way.

A way to specify what should happen when an expectation is not met -- e.g. log warnings, surface warnings, or stop execution.

Thinking of a way to bootstrap these expectations from a dataset -- so that users can update/change expectations easily as time goes on.

Directionally https://pandera.readthedocs.io/en/stable/ seems like a good first approach to try, i.e. via decorators.
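
To illustrate the decorator direction, here is a small sketch covering the first three points: an expectation on a function's output, checked right after execution, with a configurable reaction. The expect_range decorator is a hypothetical example, not pandera's API and not a Hamilton feature, and plain lists stand in for pd.Series.

```python
import functools
import warnings


def expect_range(low, high, on_fail="warn"):
    """Hypothetical decorator: check every output value lies in [low, high]."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            bad = [v for v in result if not (low <= v <= high)]
            if bad:
                message = f"{fn.__name__}: {len(bad)} value(s) outside [{low}, {high}]"
                if on_fail == "fail":
                    raise ValueError(message)  # stop execution
                warnings.warn(message)  # surface the problem but keep going
            return result
        return wrapper
    return decorator


@expect_range(0, 100, on_fail="fail")
def spend_per_signup(spend: list, signups: list) -> list:
    """The cost per signup in relation to spend."""
    return [s / n for s, n in zip(spend, signups)]


print(spend_per_signup([10, 20], [1, 50]))  # [10.0, 0.4]
```

Keeping the expectation next to the function definition means the check travels with the code that produces the data.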

Describe alternatives you've considered This is something that as the prototype is being built out, we're thinking about alternatives considered too.

Additional context Some ideas on approaches:

https://pandera.readthedocs.io/en/stable/

https://github.com/awslabs/deequ

https://greatexpectations.io/

https://github.com/whylabs/whylogs

enhancement product idea
opened by skrawcz 8

[ci] add flake8

Is your feature request related to a problem? Please describe.

The project does not currently have any automatic protections against some classes of issue that can be detected by flake8, including:

unused imports
unused variables
duplicated test names (which leads to pytest skipping tests)
f-strings used on strings with no templating

Describe the solution you'd like

I believe this project should use flake8 for linting, and should store the configuration for that tool in a setup.cfg file (https://flake8.pycqa.org/en/latest/user/configuration.html).

Adding this tool to the project's testing setup would reduce the effort required for pull request authors and reviewers to detect such issues, and would reduce the risk of code changes with such issues making it to main.

Describe alternatives you've considered

N/A

Additional context

Here is the result of running flake8 (ignoring style-only issues) on the current latest commit on main.

flake8 \
    --ignore=E126,E128,E203,E241,E261,E302,E303,E402,E501,W503,W504,W605 \
    .

./hamilton/function_modifiers.py:50:13: F402 import 'node' from line 11 shadowed by loop variable
./hamilton/function_modifiers.py:169:30: F541 f-string is missing placeholders
./hamilton/function_modifiers.py:874:21: F541 f-string is missing placeholders
./hamilton/function_modifiers.py:878:34: F541 f-string is missing placeholders
./hamilton/graph.py:172:53: F821 undefined name 'graphviz'
./hamilton/graph.py:195:92: F821 undefined name 'networkx'
./hamilton/graph.py:316:13: F401 'graphviz' imported but unused
./hamilton/graph.py:451:17: F841 local variable 'e' is assigned to but never used
./hamilton/node.py:3:1: F401 'typing.Collection' imported but unused
./hamilton/driver.py:24:5: F811 redefinition of unused 'node' from line 13
./hamilton/data_quality/default_validators.py:227:30: F541 f-string is missing placeholders
./hamilton/data_quality/default_validators.py:394:9: F401 'pandera' imported but unused
./hamilton/data_quality/default_validators.py:396:21: F541 f-string is missing placeholders
./hamilton/data_quality/pandera_validators.py:1:1: F401 'typing.Any' imported but unused
./hamilton/data_quality/pandera_validators.py:21:16: F541 f-string is missing placeholders
./tests/test_function_modifiers.py:620:1: F811 redefinition of unused 'test_tags_invalid_value' from line 607
./tests/resources/bad_functions.py:5:1: F401 'tests.resources.only_import_me' imported but unused
./tests/resources/layered_decorators.py:1:1: F401 'hamilton.function_modifiers.tag' imported but unused
./tests/resources/cyclic_functions.py:5:1: F401 'tests.resources.only_import_me' imported but unused
./tests/resources/data_quality.py:1:1: F401 'numpy as np' imported but unused
./examples/ray/hello_world/run_rayworkflow.py:1:1: F401 'importlib' imported but unused
./graph_adapter_tests/h_ray/test_h_ray_workflow.py:1:1: F401 'tempfile' imported but unused

If maintainers are interested, I'd be happy to put up a pull request proposing this.

Thanks for your time and consideration.

opened by jameslamb 7

Remove 3.6 Support
Python 3.6 reached End of Life (EOL) in December 2021.

We should make moves to stop supporting it.

Tasks

[ ] Remove 3.6 test support.

[ ] Remove 3.6 specific code.

[ ] Update setup.py to not list 3.6 and set minimum python version to 3.7.

[ ] Update any documentation & markdown files to remove reference to 3.6.

[ ] Create pull request with all the changes.

[ ] Determine timeline for a merge. With telemetry, we should be able to confirm whether anyone is using Hamilton with python 3.6. If no one is, it should be easy to remove; if someone is, we'll need to understand what's stopping them from moving to 3.7+. (See @skrawcz to know whether this is the case or not).

good first issue repo hygiene
opened by skrawcz 0
Polars example
Polars is gaining traction, so we should have an example. This PR adds an example that matches our hello world. It requires one adjustment to the extract_columns decorator to function. Other than that, the user currently has to create their own build result function, which seems fine for now, but it's something we could house in an sf-hamilton-polars package.

Changes

prototypes change to extract_columns to handle providing the dataframe types.

adds example polars code.

How I tested this

Tested this locally.

Notes

Checklist

[ ] PR has an informative and human-readable title (this will be pulled into the release notes)

[ ] Changes are limited to a single goal (no scope creep)

[ ] Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.

[ ] Any change in functionality is tested

[ ] New functions are documented (with a description, list of inputs, and expected output)

[ ] Placeholder code is flagged / future TODOs are captured in comments

[ ] Project documentation has been updated if adding/changing functionality.
opened by skrawcz 1
Parameterized extract - WIP, needs more testing
[Short description explaining the high-level reason for the pull request]

Changes

How I tested this

Notes

Checklist

[ ] PR has an informative and human-readable title (this will be pulled into the release notes)

[ ] Changes are limited to a single goal (no scope creep)

[ ] Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.

[ ] Any change in functionality is tested

[ ] New functions are documented (with a description, list of inputs, and expected output)

[ ] Placeholder code is flagged / future TODOs are captured in comments

[ ] Project documentation has been updated if adding/changing functionality.
opened by elijahbenizzy 0
Clarify behavior of decorator ordering
We need to make clear our philosophy and resolution method for functions such as:

@extract_fields({'out_value1': int, 'out_value2': str})
@tag(test_key="test-value")
@check_output(data_type=dict, importance="fail")
@does(_dummy)
def uber_decorated_function(in_value1: int, in_value2: str) -> dict:
    pass

Right now it is not clear, nor obvious.

Current behavior

This is what the graph looks like:

So it would be unexpected to see check_output over the output of extract_fields.

Steps to replicate behavior

Function code:

def _dummy(**values) -> dict:
    return {f"out_{k.split('_')[1]}": v for k, v in values.items()}


@extract_fields({'out_value1': int, 'out_value2': str})
@tag(test_key="test-value")
@check_output(data_type=dict, importance="fail")
@does(_dummy)
def uber_decorated_function(in_value1: int, in_value2: str) -> dict:
    pass

Expected behavior

check_output should probably operate over what's directly underneath it.

tag similarly should apply to all? Or just what's underneath?

does should apply to uber_decorated_function.

extract_fields is the last thing that's applied?
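
As background for this discussion, plain Python applies stacked decorators from the bottom up: the one closest to the def runs first. The sketch below demonstrates only that language rule, using a hypothetical record decorator. How Hamilton should turn these decorators into graph nodes is the open question and is not shown.

```python
applied = []


def record(label):
    """Decorator factory that logs when it is applied and returns the function unchanged."""
    def decorator(fn):
        applied.append(label)
        return fn
    return decorator


@record("extract_fields")
@record("tag")
@record("check_output")
@record("does")
def uber_decorated_function():
    pass


print(applied)  # ['does', 'check_output', 'tag', 'extract_fields']
```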

Additional context

Thoughts: can we create a linter that reorders decorators?
triage
opened by skrawcz 0

hamilton --init to get started

Is your feature request related to a problem? Please describe. New folks might want to get started in an existing repo. New DS/college students could use hamilton to get started on a simple modeling project...

Describe the solution you'd like

hamilton init
# Creates a basic project structure with some functions + hamilton files

hamilton init --project=hello_world
# Creates the hello_world example

hamilton init --project=recommendations_stack
# Creates the scaffolding for a rec-stack example

hamilton init --project=web-service
# Creates the scaffolding for a flask app

hamilton init kaggle --kaggle-competition=...
# Maybe we could create a template from a kaggle competition?

Additional context Messing around with dbt and they have this

enhancement onboarding

opened by elijahbenizzy 0

Releases(sf-hamilton-1.13.0)

sf-hamilton-1.13.0(Jan 2, 2023)
What's Changed

Updates bug hunters by @skrawcz in https://github.com/stitchfix/hamilton/pull/261

Fixes reusing_functions example by @skrawcz in https://github.com/stitchfix/hamilton/pull/262

Adds telemetry by @skrawcz in https://github.com/stitchfix/hamilton/pull/255

Bumps version to 1.13.0 by @skrawcz in https://github.com/stitchfix/hamilton/pull/264

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.12.0...sf-hamilton-1.13.0
sf-hamilton-1.12.0(Dec 27, 2022)
What's Changed

[ci] remove 'test_suite' in setup.py by @jameslamb in https://github.com/stitchfix/hamilton/pull/257

subdag modifier + tag_outputs + refactor by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/237

Updates hamilton version to 1.12.0 by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/258

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.11.1...sf-hamilton-1.12.0
sf-hamilton-1.11.1(Dec 20, 2022)
What's Changed

Adds Alaa Abedrabbo to bug hunters list by @skrawcz in https://github.com/stitchfix/hamilton/pull/215

Makes the fastapi async example more robust by @skrawcz in https://github.com/stitchfix/hamilton/pull/211

remove unnecessary code in tests by @jameslamb in https://github.com/stitchfix/hamilton/pull/221

fix a few flake8 warnings by @jameslamb in https://github.com/stitchfix/hamilton/pull/220

remove unused imports by @jameslamb in https://github.com/stitchfix/hamilton/pull/218

[ci] enforce flake8 (fixes #161) by @jameslamb in https://github.com/stitchfix/hamilton/pull/222

simplify pull request template by @jameslamb in https://github.com/stitchfix/hamilton/pull/219

[docs] restructure 'how to contribute' in developer guide by @jameslamb in https://github.com/stitchfix/hamilton/pull/223

[ci] move CI logic into shell scripts (fixes #114) by @jameslamb in https://github.com/stitchfix/hamilton/pull/225

Fix whitespace in readme for CI by @skrawcz in https://github.com/stitchfix/hamilton/pull/232

DBT + Hamilton Example by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/236

Fixes dbt integration to be much cleaner using FAL integration by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/239

dbt Hamilton example update requirements and readme by @datarshreya in https://github.com/stitchfix/hamilton/pull/241

PandasDataFrameResult: Convert non-list values to single row frame by @ianhoffman in https://github.com/stitchfix/hamilton/pull/243

Alternate fix for does by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/247

Adjusts index type check warnings by @skrawcz in https://github.com/stitchfix/hamilton/pull/246

Bumps version to 1.11.1 by @skrawcz in https://github.com/stitchfix/hamilton/pull/250

New Contributors

@datarshreya made their first contribution in https://github.com/stitchfix/hamilton/pull/241

@ianhoffman made their first contribution in https://github.com/stitchfix/hamilton/pull/243

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.11.0...sf-hamilton-1.11.1
sf-hamilton-1.11.0(Oct 21, 2022)
What's Changed

Black formatting by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/184

[ci] remove unnecessary f-strings by @jameslamb in https://github.com/stitchfix/hamilton/pull/190

Fixes ray workflow adapter to work with Ray 2.0 by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/189

Fixes for @does by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/187

fixes inconsistent parameterize documentation to use correct helper f… by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/192

Very minor spelling fix in README by @AAbedrabbo in https://github.com/stitchfix/hamilton/pull/202

Uses fixed instead of relative import for async example by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/204

Adding container image for running examples by @bovem in https://github.com/stitchfix/hamilton/pull/209

Adds more instructions for using and running docker by @skrawcz in https://github.com/stitchfix/hamilton/pull/210

Add pandas index checks by @skrawcz in https://github.com/stitchfix/hamilton/pull/200

Tweaks warning message from pandas index check by @skrawcz in https://github.com/stitchfix/hamilton/pull/213

Bump version to 1.11.0 by @skrawcz in https://github.com/stitchfix/hamilton/pull/212

New Contributors

@AAbedrabbo made their first contribution in https://github.com/stitchfix/hamilton/pull/202

@bovem made their first contribution in https://github.com/stitchfix/hamilton/pull/209

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.10.0...sf-hamilton-1.11.0
sf-hamilton-1.10.0(Aug 20, 2022)
What's Changed

Fixes DAG construction slowness by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/169

Async implementation of driver/adapter by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/171

Fixes mistaken superclass call by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/177

Full parametrized decorator by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/163

Adds Nullable validator by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/176

Adds instructions on how to push to Anaconda by @skrawcz in https://github.com/stitchfix/hamilton/pull/157

Adds union support check when passing in inputs by @skrawcz in https://github.com/stitchfix/hamilton/pull/173

Release for 1.10.0 by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/182

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.9.0...sf-hamilton-1.10.0
sf-hamilton-1.9.0(Jul 14, 2022)
What's Changed

Data quality by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/115

Fixes whitespace that was failing circleci by @skrawcz in https://github.com/stitchfix/hamilton/pull/152

Bumps version to 1.9.0 by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/154

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.8.0...sf-hamilton-1.9.0
sf-hamilton-1.8.0(Jul 3, 2022)
What's Changed

Adds the ability to pass functions not defined in modules - fixes #134 by @skrawcz in https://github.com/stitchfix/hamilton/pull/145

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.7.1...sf-hamilton-1.8.0
sf-hamilton-1.7.1(Jun 27, 2022)
What's Changed

Fixes SimplePythonDataFrameGraphAdapter.check_input_type by @skrawcz in https://github.com/stitchfix/hamilton/pull/136

Fixes case where optional user inputs broke computation by @skrawcz in https://github.com/stitchfix/hamilton/pull/133

Switches documentation to point to Slack instead of Discord by @skrawcz in https://github.com/stitchfix/hamilton/pull/141

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.7.0...sf-hamilton-1.7.1
sf-hamilton-1.7.0(May 2, 2022)
What's Changed

Adds Ray Workflow Graph Adapter - implements #67 by @skrawcz in https://github.com/stitchfix/hamilton/pull/108

Implements tags by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/83

Exposes passing kwargs to graphviz object by @skrawcz in https://github.com/stitchfix/hamilton/pull/125

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.6.0...sf-hamilton-1.7.0
sf-hamilton-1.6.0(Apr 12, 2022)
What's Changed

remove unused loggers by @jameslamb in https://github.com/stitchfix/hamilton/pull/103

fix minor formatting and grammar issues in docs by @jameslamb in https://github.com/stitchfix/hamilton/pull/102

Adds parameterized_inputs by @skrawcz in https://github.com/stitchfix/hamilton/pull/104

Fixes parameterized_inputs typo by @skrawcz in https://github.com/stitchfix/hamilton/pull/109

Bumps version to 1.6.0 by @skrawcz in https://github.com/stitchfix/hamilton/pull/110

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.5.1...sf-hamilton-1.6.0
sf-hamilton-1.5.1(Apr 11, 2022)
What's Changed

Adds our awesome bug-finders to the contributors list by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/99

Adds message about discord channel for help by @skrawcz in https://github.com/stitchfix/hamilton/pull/100

remove duplicate functools import by @jameslamb in https://github.com/stitchfix/hamilton/pull/101

Fixes whitespace in contributing.md by @skrawcz in https://github.com/stitchfix/hamilton/pull/105

Bumps version from 1.5.0 to 1.5.1 by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/106

This release also includes support for optionals, because it was cut in a non-standard way (from a local build).

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.5.0...sf-hamilton-1.5.1
sf-hamilton-1.5.0(Mar 25, 2022)
What's Changed

Adds READMEs to examples by @skrawcz in https://github.com/stitchfix/hamilton/pull/73

Adds more links to the Discord channel by @skrawcz in https://github.com/stitchfix/hamilton/pull/74

Adds Christopher Prohm as a Contributor by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/71

remove unused imports by @jameslamb in https://github.com/stitchfix/hamilton/pull/76

Adds James Lamb as a contributor by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/78

fix typos in docs and comments by @jameslamb in https://github.com/stitchfix/hamilton/pull/80

remove dependency on pytest-runner by @jameslamb in https://github.com/stitchfix/hamilton/pull/81

remove dependency on pytest-assume by @jameslamb in https://github.com/stitchfix/hamilton/pull/82

Here's a numpy example from a numpy tutorial on doing AQI by @skrawcz in https://github.com/stitchfix/hamilton/pull/79

simplify pull request template by @jameslamb in https://github.com/stitchfix/hamilton/pull/90

remove support for 'python setup.py test' by @jameslamb in https://github.com/stitchfix/hamilton/pull/89

Extract-columns double-execution bug by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/93

Handle TypeVar types and enables most common generics support by @skrawcz in https://github.com/stitchfix/hamilton/pull/94

Fixes has_cycles behavior by @skrawcz in https://github.com/stitchfix/hamilton/pull/95

Bumps version to 1.5.0 by @skrawcz in https://github.com/stitchfix/hamilton/pull/97

New Contributors

@jameslamb made their first contribution in https://github.com/stitchfix/hamilton/pull/76

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.4.0...sf-hamilton-1.5.0
sf-hamilton-1.4.0(Feb 10, 2022)
What's Changed -- including 1.3.0 for posterity

Adds release methodology instructions by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/30

Adds documentation showing scalar creation & input by @skrawcz in https://github.com/stitchfix/hamilton/pull/31

Brings distributed execution and optional freedom from pandas by @skrawcz in https://github.com/stitchfix/hamilton/pull/47

Implements opinionated decorator lifecycle by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/28

Changes the way drivers handle parameters by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/56

Stefan/refactor visualization by @skrawcz in https://github.com/stitchfix/hamilton/pull/58

Renames columns to outputs in driver & ResultMixin build_result function by @skrawcz in https://github.com/stitchfix/hamilton/pull/61

1.3.0 release by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/62

Replaces executor with adapter in places we missed by @skrawcz in https://github.com/stitchfix/hamilton/pull/63

Adds simple case to help motivate @extract_fields by @skrawcz in https://github.com/stitchfix/hamilton/pull/66

Bumps version 1.4.0 by @skrawcz in https://github.com/stitchfix/hamilton/pull/70

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.2.0...sf-hamilton-1.4.0
sf-hamilton-1.2.0(Dec 14, 2021)
What's Changed

Fixes setup.py to enable pushing to pypi by @skrawcz in https://github.com/stitchfix/hamilton/pull/14

Add release docs by @skrawcz in https://github.com/stitchfix/hamilton/pull/15

Adds link to blog post for hamilton by @elijahbenizzy in https://github.com/stitchfix/hamilton/pull/16

Adds issue templates by @skrawcz in https://github.com/stitchfix/hamilton/pull/20

Make graphviz optional by @ivirshup in https://github.com/stitchfix/hamilton/pull/27

New Contributors

@ivirshup made their first contribution in https://github.com/stitchfix/hamilton/pull/27

Full Changelog: https://github.com/stitchfix/hamilton/compare/sf-hamilton-1.1.1...sf-hamilton-1.2.0
sf-hamilton-1.1.1(Oct 13, 2021)
Patches

Touches up the package settings: stitchfix/hamilton#13

Open Source: stitchfix/hamilton#8

Parametrized inputs: stitchfix/hamilton#5

