Functional tensors for probabilistic programming

Overview


Funsor

Funsor is a tensor-like library for functions and distributions.

See the paper Functional tensors for probabilistic programming (arXiv:1910.10775) for a system description.

Installing

Funsor supports Python 3.6+.

Install using pip:

pip install funsor

Install from source:

git clone git@github.com:pyro-ppl/funsor.git
cd funsor
git checkout master
pip install .

Using funsor

Funsor can be used through a number of interfaces:

  • Funsors can be used directly for probabilistic computations, using PyTorch optimizers in a standard training loop. Start with these examples: discrete_hmm, eeg_slds, kalman_filter, pcfg, sensor, slds, and vae.
  • Funsors can be used to implement custom inference algorithms within Pyro, using custom elbo implementations in standard pyro.infer.SVI training. See these examples: mixed_hmm and bart forecasting.
  • funsor.pyro provides a number of Pyro-compatible (and PyTorch-compatible) distribution classes that use funsors under the hood, as well as utilities to convert between funsors and distributions.
  • funsor.minipyro provides a limited alternate backend for the Pyro probabilistic programming language, and can perform some ELBO computations exactly.

Design

See design doc.

The goal of this library is to generalize Pyro's delayed inference algorithms from discrete to continuous variables, and to create machinery to enable partially delayed sampling compatible with universality. To achieve this goal, this library makes three orthogonal design choices:

  1. Open terms are objects. Funsors generalize the tensor interface to also cover arbitrary functions of multiple variables ("inputs"), where variables may be integers, real numbers, or real tensors. Function evaluation / substitution is the basic operation, generalizing tensor indexing. This allows probability distributions to be first-class Funsors and to make use of existing tensor machinery; for example, we can generalize tensor contraction to computing analytic integrals in conjugate probabilistic models. (See the sketch after this list.)

  2. Support nonstandard interpretation. Funsors support user-defined interpretations, including eager, lazy, mixed eager+lazy, memoized (like opt_einsum's sharing), and approximate interpretations like Monte Carlo approximations of integration operations (e.g. .sum() over a funsor dimension).

  3. Named dimensions. Substitution is the most basic operation of Funsors. To avoid the difficulties of broadcasting and advanced indexing in positionally-indexed tensor libraries, all Funsor dimensions are named. Indexing uses the .__call__() method and can be interpreted as substitution (with well-understood semantics). Funsors are viewed as algebraic expressions with one algebraic free variable per dimension. Each dimension is either covariant (an output) or contravariant (an input).
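
As a small illustration of these design choices, here is a minimal sketch (assuming the torch backend; exact printed reprs may vary across funsor versions):

import funsor
from funsor import Real, Variable
from funsor.terms import lazy

funsor.set_backend("torch")

# An open term: a function of one real free variable "x".
x = Variable("x", Real)
f = x * x + 2.0
print(f.inputs)   # OrderedDict([('x', Real)])

# Substitution generalizes indexing: evaluate at x = 3.
print(f(x=3.0))   # 11.0

# Nonstandard interpretation: the same term stays unevaluated under lazy.
with lazy:
    g = x * x + 2.0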

Using funsor we can easily implement Pyro-style delayed sampling, roughly:

import funsor
from funsor import Funsor

lazy = True          # hypothetical flag: whether to delay sampling
trace_log_prob = 0.  # accumulates log-density funsors

def pyro_sample(name, dist, obs=None):
    global trace_log_prob
    assert isinstance(dist, Funsor)
    if obs is not None:
        value = obs
    elif lazy:
        # delayed sampling (like Pyro's parallel enumeration)
        value = funsor.Variable(name, dist.support)
    else:
        value = dist.sample('value')[0]['value']

    # save the log-density in the trace
    trace_log_prob += dist(value)

    return value

# ...later during inference...
loss = -trace_log_prob.reduce(logaddexp)  # collapses delayed variables

See funsor/minipyro.py for a complete implementation.

Citation

If you use Funsor, please consider citing:

@article{obermeyer2019functional,
  author = {Obermeyer, Fritz and Bingham, Eli and Jankowiak, Martin and
            Phan, Du and Chen, Jonathan P},
  title = {{Functional Tensors for Probabilistic Programming}},
  journal = {arXiv preprint arXiv:1910.10775},
  year = {2019}
}
Comments
  • Profiling GaussianHMM with NumPyro backend

    Profiling GaussianHMM with NumPyro backend

    I profiled funsor.pyro.hmm.GaussianHMM against pyro.distributions.GaussianHMM and have some observations:

    • funsor is slow compared to pyro, especially for small hidden/obs dims. With batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2, pyro takes 5ms while funsor takes 35ms to evaluate log_prob.
    • The overhead in funsor seems to be constant. I increased obs_dim, hidden_dim to 30, 20 and verified that the overhead stays roughly the same.
    • torch.cat takes a large amount of time in funsor (e.g. with batch_dim, time_dim = 50, 60, this op takes 7ms of the total 70ms). Similarly, torch.pad takes a portion of time in pyro (but less than torch.cat in funsor). I think the reason is that we replace pad by new_zeros + cat in funsor (see the microbenchmark sketch after the profiling code below).
    • Time for numerical calculation (except cat, pad) seems to be similar between the two versions. It seems to me that at some points funsor switches dimensions. I guess this is because in funsor we store dimensions in a set. For example, I looked at the args of triangular_solve and got (this is probably not important):
    funsor torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([1, 5, 2, 4]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([1, 5, 2, 1]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 287 ms, sys: 30.9 ms, total: 318 ms
    Wall time: 28.3 ms
    pyro torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 70.4 ms, sys: 1.17 ms, total: 71.6 ms
    Wall time: 5.84 ms
    
    Profiling code
    import pyro.distributions as dist
    import torch
    from pyro.distributions.util import broadcast_shape
    
    from funsor.pyro.hmm import GaussianHMM
    from funsor.testing import random_mvn
    
    batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2
    
    init_shape = (batch_dim,)
    trans_mat_shape = trans_mvn_shape = obs_mat_shape = obs_mvn_shape = (batch_dim, time_dim)
    init_dist = random_mvn(init_shape, hidden_dim)
    trans_mat = torch.randn(trans_mat_shape + (hidden_dim, hidden_dim))
    trans_dist = random_mvn(trans_mvn_shape, hidden_dim)
    obs_mat = torch.randn(obs_mat_shape + (hidden_dim, obs_dim))
    obs_dist = random_mvn(obs_mvn_shape, obs_dim)
    
    actual_dist = GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    expected_dist = dist.GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    assert actual_dist.batch_shape == expected_dist.batch_shape
    assert actual_dist.event_shape == expected_dist.event_shape
    
    shape = broadcast_shape(init_shape + (1,),
                            trans_mat_shape, trans_mvn_shape,
                            obs_mat_shape, obs_mvn_shape)
    data = obs_dist.expand(shape).sample()
    assert data.shape == actual_dist.shape()
    
    %time actual_log_prob = actual_dist.log_prob(data)
    %time expected_log_prob = expected_dist.log_prob(data)
    

    With the above observations, I think that using jax.jit will resolve this "constant" overhead. So I would like to go ahead and implement GaussianHMM in NumPyro. This requires modifications of the current funsor.pyro to make it work with NumPyro. All changes in this PR are just for profiling, not to merge. Hopefully, jax.jit will work well with the lazy evaluation mechanisms in funsor.
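
    As a side note on the pad-vs-cat point above, here is a quick standalone microbenchmark sketch (sizes are illustrative, not taken from the profile):

    import timeit

    import torch
    import torch.nn.functional as F

    x = torch.randn(50, 60, 20, 20)

    def with_pad():
        # pad the last dim with one trailing zero column
        return F.pad(x, (0, 1))

    def with_cat():
        # emulate pad via new_zeros + cat, as funsor currently does
        zeros = x.new_zeros(x.shape[:-1] + (1,))
        return torch.cat([x, zeros], dim=-1)

    assert torch.equal(with_pad(), with_cat())
    print("pad:", timeit.timeit(with_pad, number=1000))
    print("cat:", timeit.timeit(with_cat, number=1000))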

    Tasks

    • [x] todo
    discussion 
    opened by fehiepsi 14
  • Support global backend

    Support global backend

    Addresses #207

    Following up on the discussion in #317, this PR supports a global backend in funsor.

    To make reviewing easier, I list the important changes in this PR here:

    • Separate out the jax and numpy implementations. I also took this chance to reorder those ops by name (to make it easier to compare different backends or to find a specific implementation).
    • Use numpy for the default implementation of numeric ops. funsor.torch or funsor.jax is lazily imported, depending on the environment variable FUNSOR_BACKEND or a call to funsor.set_backend(...).
    • We can change the backend using the utility set_backend (see the sketch after this list). There is one problem: once we have changed the backend to JAX, we can't revert to the numpy backend.
    • Port the Pyro tensordot implementation to funsor. The Pyro implementation has some x.dim() statements, which are not available for ndarray.
    • Although numpy will be the default backend, I still use PyTorch names (except for amax, amin) for those ops, as on the current master branch. I guess Pyro devs would prefer PyTorch names over NumPy names for numeric ops.
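
    For reference, backend selection as described above looks roughly like this (a sketch; note the one-way JAX caveat above):

    import funsor

    funsor.set_backend("torch")   # lazily imports funsor.torch
    print(funsor.get_backend())   # "torch"

    # or select the backend before import via an environment variable:
    #   FUNSOR_BACKEND=jax python script.py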

    Some small changes:

    • replace torch._C._get_tracing_state by funsor.util.get_tracing_state
    • add funsor.util.is_nn_module
    • add ops.is_numeric_array to check if a variable is a numeric array. We can rename it to ops.is_tensor to match torch.is_tensor if preferred.
    • add ops.detach to detach gradients. This is a no-op for the numpy backend.
    • fix an issue where ops.log emits warnings at 0 values.
    • support both the tuple and varargs calling patterns in the testing utilities randn, zeros, ... (because many places in the test files use the pattern torch.zeros(1, 2, 3) instead of torch.zeros((1, 2, 3))).
    • the dispatch for recursion_reinterpret.register(torch.Tensor) is implemented in torch.py instead of interpreter.py (similarly for children, to_funsor, allclose).
    • the rest are cleanup and reordering changes.

    Tests pass with all backends

    • [x] test_affine
    • [x] test_alpha_conversion
    • [x] test_cnf
    • [x] test_delta
    • [x] test_gaussian
    • [x] test_import
    • [x] test_integrate
    • [x] test_joint
    • [x] test_sum_product
    • [x] test_tensor
    • [x] test_terms

    Tests only work with torch/pyro

    • test/pyro/
    • test/test_minipyro.py
    • test/test_adjoint.py: requires einsum.adjoint.require_backward
    • test/test_distributions.py: refactoring
    • test/test_einsum.py
    • test/test_memoize.py: requires some einsum functionality
    • test/test_optimizer.py: requires distributions and einsum

    TODO:

    • [x] separate numpy and jax backend.
    • [x] revise DICE logic
    awaiting review 
    opened by fehiepsi 12
  • Add a single normal form for sum-product contractions

    Add a single normal form for sum-product contractions

    Resolves #156

    As outlined in #156 I'm using this as a staging branch for the refactoring PRs associated with the proposed changes there:

    • [x] #157 Original PR (diff)
    • [x] #173 Refactor Affine into Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/e20978978283998341f5ed75a755cebc76fa2bab)
    • [x] #165 Refactor optimizer to use Contraction and normalize (https://github.com/pyro-ppl/funsor/pull/157/commits/a4b2ec29b10f6649d86cf9fee24d56e574b279c8)
    • [x] #169 Refactor Joint into multivariate Delta and Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/d2c79b0c8cdab99b370ca160de53dc8d62469dd9)
    • [x] Separate unfolding from normalization (https://github.com/pyro-ppl/funsor/pull/157/commits/ffdd38176567242fde59ab1f070488ab98cce72e)
    • [x] Troubleshoot performance issues (mostly caused by repeated unfolding)
    • [x] Use #212 for pattern matching
    • [x] Remove obsolete patterns (diff)
    • [x] Update MultiDelta to keep separate log_densitys for each point (https://github.com/pyro-ppl/funsor/pull/157/commits/30cbcb1c13806991cdeb70e5618799b60cf844b9)
    • [x] Fix regressions in adjoint and einsum
    • [x] Add missing patterns for minipyro
    • [x] Update bound inference tests for Number

    Original description: This PR takes a first step towards the goals in #156 by adding two things:

    1. A Contraction term representing finitary sum-product operations that takes advantage of multipledispatch.Variadic for pattern matching. This is also necessary for #72.
    2. A normalize interpretation that uses associative and distributive properties to put linear expressions into something like DNF, grouping operations into single Contraction terms where possible.

    As a demonstration, I have included patterns and tests illustrating how Joint is equivalent to interleaving eager and normalize.

    Tested:

    • I copied smoke tests and test cases verbatim from test_joint.py, test_gaussian.py, test_affine.py and test_einsum.py
    awaiting review refactor 
    opened by eb8680 10
  • DiCE factors from importance sampling?

    DiCE factors from importance sampling?

    In Tensor.unscaled_sample, a DiCE factor is added to the sample. In Gaussian.unscaled_sample, this isn't necessary because of the parameterization of the delta function.

    So far, including DiCE factors with the samples doesn't sit quite right conceptually. Could it make sense to instead approach this from importance sampling? The default interpretation of adjoint(p) @ f for continuous p might be to use importance sampling with some default proposal q. To maintain differentiability, the importance weights would be p / q.detach(). If samples can be drawn from p, then p can be used as the proposal, and the importance weights would be p / p.detach(), which in this context are the DiCE factors.

    Am I missing some benefits of the DiCE formalism?
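
    For context, the p / p.detach() trick can be sketched in plain PyTorch (illustrative, not funsor code): the weight equals 1 in value but carries the score-function gradient, exactly like a DiCE factor.

    import torch

    theta = torch.tensor(0.5, requires_grad=True)
    p = torch.distributions.Normal(theta, 1.0)
    x = p.sample()
    log_p = p.log_prob(x)

    # DiCE-style factor: value is 1.0, but grad(w) = grad(log p).
    w = (log_p - log_p.detach()).exp()
    (w * x).backward()
    print(theta.grad)  # score-function gradient term x * d(log p)/d(theta)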

    discussion 
    opened by mattwescott 10
  • Generalize partial sum product to Markov models

    Generalize partial sum product to Markov models

    This attempts to generalize partial_sum_product to support elimination of markov variables, following the discussion with @eb8680 and @fritzo at pyro-ppl/pyro#2687.

    TODO

    • [x] modified_partial_sum_product that eliminates first-order markov variables using a parallel-scan algorithm.
    • [x] Add more tests
    • [ ] Rename modified_partial_sum_product or replace partial_sum_product by it?

    Tests

    • [x] _expected_modified_partial_sum_product unrolls one markov dim at a time
    • [x] test_modified_partial_sum_product_0
    • [x] test_modified_partial_sum_product_1
    • [x] test_modified_partial_sum_product_2
    • [x] test_modified_partial_sum_product_3
    • [x] test_modified_partial_sum_product_4
    • [x] test_modified_partial_sum_product_5
    • [x] test_modified_partial_sum_product_6
    • [x] test_modified_partial_sum_product_7: xfail x and y markov vars with different ordinals
    • [x] test_modified_partial_sum_product_8
    • [x] test_modified_partial_sum_product_9
    • [x] test_modified_partial_sum_product_10
    • [x] test_modified_partial_sum_product_11
    • [x] test_modified_partial_sum_product_12: xfail w_curr in time plate but lower ordinal than y and x markov vars
    • [x] test_modified_partial_sum_product_13
    • [x] test_modified_partial_sum_product_14
    • [x] test_modified_partial_sum_product_15: xfail y depends on two time dims time and tones
    • [x] test_modified_partial_sum_product_16: x_prev->y_curr and y_prev->x_curr

    Note

    • Markov sites cannot be inside local plates (see tests 2 and 10)

    Update

    • Added unrolling of markov vars in local plates to handle tests 2 and 10

    Useful links

    • pyro-ppl/funsor#167
    • pyro-ppl/funsor#204
    • pyro-ppl/funsor#293
    • pyro-ppl/numpyro#686
    • pyro-ppl/funsor#399
    enhancement WIP 
    opened by ordabayevy 9
  • Make funsor.distributions backend agnostic

    Make funsor.distributions backend agnostic

    Blocked by https://github.com/pyro-ppl/pyro-api/pull/19

    This is similar to #326 but doesn't include the funsor.numpyro stuff. I have resolved the has_rsample issue and made the whole test_distributions suite backend agnostic.

    awaiting review 
    opened by fehiepsi 9
  • refactor Gaussian to use Tensor api instead of torch.Tensor

    refactor Gaussian to use Tensor api instead of torch.Tensor

    This PR follows Approach 2 in #207 to make Gaussian backend-agnostic. My plan is to

    • check if info_vec is torch.Tensor, then set self.backend = TorchTensor.
    • make backend-specific ops such as triangular_solve, cholesky static methods of their corresponding TorchTensor type. For example,
    x.triangular_solve(y)
    

    will be

    self.backend.triangular_solve(x, y)
    

    It seems that there is a lot of work to do, but I hope that future refactoring will be easier after this.
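
    A rough sketch of this dispatch pattern (TorchTensor and the op names follow the plan above and are not a final API):

    import torch

    class TorchTensor:
        @staticmethod
        def cholesky(x):
            return torch.linalg.cholesky(x)

        @staticmethod
        def triangular_solve(x, y, upper=False):
            return torch.triangular_solve(x, y, upper=upper).solution

    class Gaussian:
        def __init__(self, info_vec, precision, inputs):
            if isinstance(info_vec, torch.Tensor):
                # the backend is chosen from the data type of the parameters
                self.backend = TorchTensor
            self.info_vec, self.precision, self.inputs = info_vec, precision, inputs

        def _solve(self, x, y):
            # backend-agnostic replacement for x.triangular_solve(y)
            return self.backend.triangular_solve(x, y)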

    Tasks

    • ~[] subclass Tensor to TorchTensor~ this is not necessary for now
    • [x] collect and implement all required ~static methods~ ops for TorchTensor
    • [x] use the above ~static methods~ ops in Gaussian
    • [x] do the same for other classes/functions in gaussian.py
    • [x] make sure all tests pass
    • [x] make align_tensors and materialize backend agnostic
    • [x] make numpy backend pass for gaussian smoke test

    Future PRs

    • deal with torch._C._get_tracing_state() statements
    • make sure that numpy backend will work
    awaiting review 
    opened by fehiepsi 9
  • `NotImplementedError` when using `apply_optimizer`

    `NotImplementedError` when using `apply_optimizer`

    The term below comes up in test_pyroapi_funsor:

    from funsor.cnf import Contraction
    from funsor.tensor import Tensor
    import torch
    import funsor.ops as ops
    from funsor import Bint, Real
    from funsor.terms import Unary, Binary, Variable, Number, lazy, to_data
    from funsor.constant import Constant
    from funsor.delta import Delta
    import funsor
    funsor.set_backend("torch")
    
    with lazy:
        term = Contraction(ops.add, ops.mul,
         frozenset({Variable('plate_inner__BOUND_78', Bint[2]), Variable('x__BOUND_77', Real), Variable('plate_outer__BOUND_79', Bint[3])}),  # noqa
         (Unary(ops.exp,
           Contraction(ops.null, ops.add,
            frozenset(),
            (Delta(
              (('x__BOUND_77',
                (Tensor(
                  torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
                  (('plate_outer__BOUND_79',
                    Bint[3],),),
                  'real'),
                 Number(0.0),),),)),
             Constant(
              (('plate_inner__BOUND_78',
                Bint[2],),),
              Tensor(
               torch.tensor([0.0, 0.0, 0.0], dtype=torch.float64),
               (('plate_outer__BOUND_79',
                 Bint[3],),),
               'real')),))),
          Unary(ops.all,
           Binary(ops.eq,
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'),
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'))),
          Tensor(
            torch.tensor([[-3.304130052277938, -0.9255234395261538, -1.5122103473560844], [-3.217490519312117, -1.2663745664889694, -1.2900109994655682]], dtype=torch.float64),  # noqa
           (('plate_inner__BOUND_78',
             Bint[2],),
            ('plate_outer__BOUND_79',
             Bint[3],),),
           'real'),))
    
    x = to_data(funsor.optimizer.apply_optimizer(term))
    

    Error message:

    Traceback (most recent call last):
      File "hehe.py", line 54, in <module>
        x = to_data(funsor.optimizer.apply_optimizer(term))
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 169, in apply_optimizer
        return interpreter.reinterpret(expr)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 255, in reinterpret
        return recursion_reinterpret(x)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 236, in recursion_reinterpret
        return _STACK[-1].interpret(type(x), *map(recursion_reinterpret, children(x)))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 87, in optimize_contraction_variadic
        return optimize.interpret(Contraction, r, b, v, tuple(ts))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 145, in optimize_contract_finitary_funsor
        path_end = Contraction(
      File "/home/ordabayev/repos/funsor/funsor/terms.py", line 210, in __call__
        return interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/cnf.py", line 323, in eager_contraction_tensor
        raise NotImplementedError("TODO")
    NotImplementedError: TODO
    
    enhancement 
    opened by ordabayevy 8
  • Fix nan gradients in TraceEnum_ELBO when evaluated eagerly

    Fix nan gradients in TraceEnum_ELBO when evaluated eagerly

    Fixes nan gradients in TraceEnum_ELBO when evaluated eagerly (mentioned in https://github.com/pyro-ppl/funsor/issues/493).

    The problem seems to be that reflect.interpret in line 62 leads to substitutions that have adjoint as a base_interpretation, which in turn leads to some extra expressions being added to the tape. Here I fix it by changing interpretation to _old_interpretation.

    This also solves test_adjoint.py::test_sequential_sum_product_adjoint: xfail_param(MarkovProduct, reason="mysteriously doubles adjoint values?") (blocked by #545).

    awaiting review 
    opened by ordabayevy 7
  • Avoid creating distributions when inferring value domain

    Avoid creating distributions when inferring value domain

    Addresses #412 pair coded with @eb8680

    This refactors logic in Distribution._infer_value_domain() to avoid creating a temporary dummy distribution if backend distributions implement a .infer_shapes() method.
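
    For illustration, the static hook avoids allocating any tensors; with Pyro distributions this looks roughly like the following (assuming the .infer_shapes() classmethod mentioned above):

    import pyro.distributions as dist

    # infer batch/event shapes from argument shapes alone,
    # without instantiating a dummy distribution:
    batch_shape, event_shape = dist.MultivariateNormal.infer_shapes(
        loc=(5, 3), covariance_matrix=(5, 3, 3)
    )
    print(batch_shape, event_shape)  # torch.Size([5]) torch.Size([3])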

    awaiting review refactor 
    opened by fritzo 7
  • Switch to sqrt(precision) representation in Gaussian

    Switch to sqrt(precision) representation in Gaussian

    Resolves #567 Adapts @fehiepsi's https://github.com/pyro-ppl/pyro/pull/2019

    This switches the internal Gaussian representation to a numerically stable and space-efficient representation:

    - Gaussian(info_vec, precision, inputs)
    + Gaussian(white_vec, prec_sqrt, inputs)
    

    In the new parametrization, Gaussians represent the log-density function

    Gaussian(white_vec, prec_sqrt, OrderedDict(x=Reals[n]))
      = -1/2 || x @ prec_sqrt - white_vec ||^2
    

    These two parameters are shaped to efficiently represent low-rank data:

    assert white_vec.shape == batch_shape + (rank,)
    assert prec_sqrt.shape == batch_shape + (dim, rank)
    

    reducing space complexity from O(dim(dim+1)) to O(rank(dim+1)). In my real-world example rank=1, dim=2369, and batch_shape=(1343,), so the space reduction is 30GB → 13MB.

    Computation is cheap in this representation: addition amounts to concatenation, and plate-reduction amounts to transpose and reshape. Some ops are supported only on full-rank Gaussians, and I've added checks based on the new property .is_full_rank. This partial support is ok because the Gaussian funsors that arise in Bayesian models are all full rank due to priors (notwithstanding numerical loss of rank).

    Because the Gaussian funsor is internal, the interface change should not cause breakage of most user code, since most user code uses to_funsor() and to_data() with backend-specific distributions. One broken piece of user code is Pyro's AutoGaussianFunsor which will need an update (and which will be sped up).

    As suggested by @eb8680, I've added some optional kwarg parametrizations and properties to support conversion to other Gaussian representations, e.g. g = Gaussian(mean=..., covariance=..., inputs=...) and g._mean, g._covariance. This allows more Gaussian math to live in gaussian.py.
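
    As a numerical sanity check on the quoted identity, the new parametrization matches the old info_vec/precision form up to a data-independent constant (a sketch with illustrative shapes):

    import torch

    batch, dim, rank = (2,), 3, 3
    prec_sqrt = torch.randn(batch + (dim, rank))
    white_vec = torch.randn(batch + (rank,))
    x = torch.randn(batch + (dim,))

    # new form: -1/2 || x @ prec_sqrt - white_vec ||^2
    new = -0.5 * ((x.unsqueeze(-2) @ prec_sqrt).squeeze(-2) - white_vec).pow(2).sum(-1)

    # old form: info_vec . x - 1/2 x^T precision x
    precision = prec_sqrt @ prec_sqrt.transpose(-1, -2)
    info_vec = (prec_sqrt @ white_vec.unsqueeze(-1)).squeeze(-1)
    old = (info_vec * x).sum(-1) - 0.5 * (
        x.unsqueeze(-2) @ precision @ x.unsqueeze(-1)
    ).reshape(batch)

    # they differ only by the constant -1/2 ||white_vec||^2
    assert torch.allclose(new, old - 0.5 * white_vec.pow(2).sum(-1))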

    Tested

    • [x] added new tests
    • [x] pass existing funsor tests (linear algebra)
    • [x] pass existing funsor tests (patterns)
    • [x] pass NumPyro tests (conjugacy)
    • [x] pass Pyro tests on a branch (https://github.com/pyro-ppl/pyro/pull/2943)
    • [x] check computational cost on the pyro-cov model (results: under 12GB memory)
    awaiting review refactor 
    opened by fritzo 6
  • Feature Request: Add Support for TransformedDistribution with non-trivial event dimensions

    Feature Request: Add Support for TransformedDistribution with non-trivial event dimensions

    https://github.com/pyro-ppl/funsor/blob/1d07af18c21894dd56e2f4f877c7845430c3b729/funsor/distribution.py#L531

    Opening a feature request as per this comment by @fehiepsi

    Also note that the same issue happens when using TruncatedNormal distribution, see this comment.

    enhancement 
    opened by pankajb64 0
  • FR rules for (Gaussian + UniformPolygon).reduce(logaddexp)

    FR rules for (Gaussian + UniformPolygon).reduce(logaddexp)

    It would be great to have reduction rules for uniform likelihoods wrt Gaussian latent variables:

    • [ ] (MyContinuousDist + Uniform).reduce(logaddexp) implemented via .cdf()
    • [ ] (Gaussian + UniformPolygon).reduce(logaddexp) implemented via quadrature (MC or QMC)

    This could be used for Pyro and NumPyro probabilistic programs with constraint-like likelihoods, something like

    def model(data):
        z = pyro.sample("z", Normal(...))
        # Observe a constraint like lb < z < ub.
        pyro.sample("x", Uniform(...), obs=z)
    

    Here we can be lazy and compute the marginal likelihood via Normal(...).cdf([lb, ub]); see the sketch below.
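
    A sketch of that lazy computation in plain torch (hypothetical bounds; the uniform density contributes the 1/(ub - lb) factor):

    import torch
    from torch.distributions import Normal

    lb, ub = torch.tensor(-1.0), torch.tensor(2.0)
    z_prior = Normal(0.0, 1.0)

    # int Normal(z) * Uniform(z; lb, ub) dz = (CDF(ub) - CDF(lb)) / (ub - lb)
    marginal = (z_prior.cdf(ub) - z_prior.cdf(lb)) / (ub - lb)
    print(marginal.log())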

    enhancement 
    opened by fritzo 0
  • Delta integrate pattern

    Delta integrate pattern

    (Separating this from #593 PR)

    This proposes the following logic for the Delta Integrate pattern:

    • If reduced_var is in integrand.inputs then apply substitution to delta and integrand:
    delta = Delta("x",  point, log_density)
    integrand = Variable("x")
    Integrate(delta, integrand, reduced_vars="x")
      => delta(x=point).exp() * integrand(x=point)
      => log_density.exp() * point
    

    where log_density can be a DiCE factor or, more generally, an importance weight.

    • If reduced_var is not in integrand.inputs then just reduce delta:
    delta = Delta("x",  point, log_density)
    integrand = Number(3.0)
    Integrate(delta, integrand, reduced_vars="x")
      => delta.reduce(logaddexp, "x").exp() * integrand
      => 1 * 3.0
      => 3.0
    

    where delta is normalized.
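
    A hypothetical concrete version of the first rule (constructor and reduced_vars conventions vary across funsor versions, so treat this as illustrative):

    import funsor
    from funsor import Integrate, Number, Real, Variable
    from funsor.delta import Delta

    funsor.set_backend("torch")

    point, log_density = Number(2.0), Number(0.0)
    delta = Delta("x", point, log_density)
    integrand = Variable("x", Real)

    # reduced_var "x" is in integrand.inputs, so both get substituted:
    result = Integrate(delta, integrand, frozenset(["x"]))
    print(result)  # expected: log_density.exp() * point = 2.0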

    awaiting review 
    opened by ordabayevy 0
  • Provenance funsor

    Provenance funsor

    Provenance tracking implementation in funsor. Here the provenance is the set of (name, point) tuples of RV samples that were in the history of the computation of the tracked term. Substitution results in multiplication by the density:

    >>> logp_x = Provenance(logp_value, {("x", point)})
    >>> logp_x(x=point)
    logp_value
    

    The Provenance funsor is intended to be used in https://github.com/pyro-ppl/pyro/pull/2893 as a wrapper for ProvenanceTensor.

    enhancement awaiting review 
    opened by ordabayevy 1
  • FR Thompson sampling example

    FR Thompson sampling example

    I came across this cute example of nested semiring dynamic programming in the context of Bayesian optimization. Thompson sampling first performs Bayesian regression, fitting a posterior p(θ|Xs,ys) over parameters θ given a fully-supervised dataset of (X,y) pairs, then repeatedly samples parameters θ and optimizes expected reward E[y|X,θ] over X. That is, at each step

    X_optimal <- arg max_X int_y int_θ y p(y|θ,X)  p(θ) prod_i p(yi|θ,Xi)
    # new choice                       --reward-- prior ---likelihood---
    

    For example, in Pyroed, p(θ) prod_i p(yi|θ,Xi) is approximated variationally, and E[y|θ,X] is just a tensor contraction (although there it is optimized via annealed Gibbs sampling rather than via dynamic programming; as an aside, we could add an annealed Gibbs sampling interpretation to Funsor for sum-product contractions).

    What would be a good example here, where we might leverage Funsor's dynamic programming to implement both the integral and argmax operators?

    examples 
    opened by fritzo 0
Releases(0.4.4)
  • 0.4.4(Dec 28, 2022)

  • 0.4.3(Mar 21, 2022)

    What's Changed

    • Update black by @fritzo in https://github.com/pyro-ppl/funsor/pull/583
    • Extend partial_sum_product() to cases tractable for Gaussians by @fritzo in https://github.com/pyro-ppl/funsor/pull/584
    • update classifiers by @anukaal in https://github.com/pyro-ppl/funsor/pull/586
    • Fix provenance tensor to work with PyTorch 1.11 by @fritzo in https://github.com/pyro-ppl/funsor/pull/588
    • Update to PyTorch 1.11.0 by @fritzo in https://github.com/pyro-ppl/funsor/pull/589
    • Bump to version 0.4.3 by @fritzo in https://github.com/pyro-ppl/funsor/pull/590

    New Contributors

    • @anukaal made their first contribution in https://github.com/pyro-ppl/funsor/pull/586

    Full Changelog: https://github.com/pyro-ppl/funsor/compare/0.4.2...0.4.3

  • 0.4.2(Dec 13, 2021)

    Enhancements

    • #543 Added ProvenanceTensor for torch backend
    • #543 Added Constant funsor
    • #549 Added funsor.recipes.forward_filter_backward_rsample()
    • #553 Added a Precondition interpretation for Gaussians, plus funsor.recipes.forward_filter_backward_precondition()
    • #572 Support Funsors in pprint.pprint() and pdb

    Breaking changes

    • #568 Switch to sqrt(precision) representation for Gaussian
  • 0.4.1(Jul 2, 2021)

  • 0.4.0(Jan 24, 2021)

    • Added new sum product algorithms (contributed by @ordabayevy)
    • Added many new conjugacy patterns
    • Automated op registration and made adding new ops easier
    • #419 Support ExtendedDistribution
    • #432 Support IndependentConstraint
    • #397 Allow .reduce(op, vars_not_in_inputs)
    • #430 Add Tuple Funsor
    • #409 Change sarkka_bilmes_product interface
  • 0.3.0(Oct 15, 2020)

    • #380 Drop support for Python 3.5 and PyTorch <=1.5
    • #352 New syntax for funsor.domain.Domains: bint(n) -> Bint[n], reals(*shape) -> Reals[shape]
    • #369 New StatefulInterpretation pattern to simplify context-dependent interpretations
  • 0.2.0(Jul 27, 2020)

    • Supports multiple backends: numpy (default), torch, and jax. To change the backend, use funsor.set_backend("torch") or funsor.set_backend("jax").
  • 0.1.2(Oct 24, 2019)

  • 0.1.1(Oct 24, 2019)

  • 0.1.0(Oct 23, 2019)

Owner
Pyro - Deep Universal Probabilistic Programming