A library for pattern matching on symbolic expressions in Python.

Overview

MatchPy

MatchPy is a library for pattern matching on symbolic expressions in Python.

Work in progress

Latest version released on PyPi Latest version released via conda-forge Test coverage Build status of the master branch Documentation Status The Journal of Open Source Software Digital Object Identifier

Installation

MatchPy is available via PyPI, and for Conda via conda-forge. It can be installed with pip install matchpy or conda install -c conda-forge matchpy.

Overview

This package implements pattern matching in Python. Pattern matching is a powerful tool for symbolic computations, operating on symbolic expressions. Given a pattern and an expression (which is usually called subject), the goal of pattern matching is to find a substitution for all the variables in the pattern such that the pattern becomes the subject. As an example, consider the pattern f(x), where f is a function and x is a variable, and the subject f(a), where a is a constant symbol. Then the substitution that replaces x with a is a match. MatchPy supports associative and/or commutative function symbols, as well as sequence variables, similar to pattern matching in Mathematica.

A detailed example of how to use MatchPy can be found here.

MatchPy supports both one-to-one and many-to-one pattern matching. The latter makes use of similarities between patterns to efficiently find matches for multiple patterns at the same time.

A list of publications about MatchPy can be found below.

Expressions

Expressions are tree-like data structures, consisting of operations (functions, internal nodes) and symbols (constants, leaves):

>>> from matchpy import Operation, Symbol, Arity
>>> f = Operation.new('f', Arity.binary)
>>> a = Symbol('a')
>>> print(f(a, a))
f(a, a)

Patterns are expressions which may contain wildcards (variables):

>>> from matchpy import Pattern, Wildcard
>>> x = Wildcard.dot('x')
>>> print(Pattern(f(a, x)))
f(a, x_)

In the previous example, x is the name of the variable. However, it is also possible to use wildcards without names:

>>> w = Wildcard.dot()
>>> print(Pattern(f(w, w)))
f(_, _)

It is also possible to assign variable names to entire subexpressions:

>>> print(Pattern(f(w, a, variable_name='y')))
y: f(_, a)

Pattern Matching

Given a pattern and an expression (which is usually called subject), the idea of pattern matching is to find a substitution that maps wildcards to expressions such that the pattern becomes the subject. In MatchPy, a substitution is a dict that maps variable names to expressions.

>>> from matchpy import match
>>> y = Wildcard.dot('y')
>>> b = Symbol('b')
>>> subject = f(a, b)
>>> pattern = Pattern(f(x, y))
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{x ↦ a, y ↦ b}

Applying the substitution to the pattern results in the original expression.

>>> from matchpy import substitute
>>> print(substitute(pattern, substitution))
f(a, b)

Sequence Wildcards

Sequence wildcards are wildcards that can match a sequence of expressions instead of just a single expression:

>>> z = Wildcard.plus('z')
>>> pattern = Pattern(f(z))
>>> subject = f(a, b)
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{z ↦ (a, b)}

Associativity and Commutativity

MatchPy natively supports associative and/or commutative operations. Nested associative operators are automatically flattened, the operands in commutative operations are sorted:

>>> g = Operation.new('g', Arity.polyadic, associative=True, commutative=True)
>>> print(g(a, g(b, a)))
g(a, a, b)

Associativity and commutativity is also considered for pattern matching:

>>> pattern = Pattern(g(b, x))
>>> subject = g(a, a, b)
>>> print(next(match(subject, pattern)))
{x ↦ g(a, a)}
>>> h = Operation.new('h', Arity.polyadic)
>>> pattern = Pattern(h(b, x))
>>> subject = h(a, a, b)
>>> list(match(subject, pattern))
[]

Many-to-One Matching

When a fixed set of patterns is matched repeatedly against different subjects, matching can be sped up significantly by using many-to-one matching. The idea of many-to-one matching is to construct a so called discrimination net, a data structure similar to a decision tree or a finite automaton that exploits similarities between patterns. In MatchPy, there are two such data structures, implemented as classes: DiscriminationNet and ManyToOneMatcher. The DiscriminationNet class only supports syntactic pattern matching, that is, operations are neither associative nor commutative. Sequence variables are not supported either. The ManyToOneMatcher class supports associative and/or commutative matching with sequence variables. For syntactic pattern matching, the DiscriminationNet should be used, as it is usually faster.

>>> pattern1 = Pattern(f(a, x))
>>> pattern2 = Pattern(f(y, b))
>>> matcher = ManyToOneMatcher(pattern1, pattern2)
>>> subject = f(a, b)
>>> matches = matcher.match(subject)
>>> for matched_pattern, substitution in sorted(map(lambda m: (str(m[0]), str(m[1])), matches)):
...     print('{} matched with {}'.format(matched_pattern, substitution))
f(a, x_) matched with {x ↦ b}
f(y_, b) matched with {y ↦ a}

Roadmap

Besides the existing features, we plan on adding the following to MatchPy:

  • Support for Mathematica's Alternatives: For example f(a | b) would match either f(a) or f(b).
  • Support for Mathematica's Repeated: For example f(a..) would match f(a), f(a, a), f(a, a, a), etc.
  • Support pattern sequences (PatternSequence in Mathematica). These are mainly useful in combination with Alternatives or Repeated, e.g. f(a | (b, c)) would match either f(a) or f(b, c). f((a a)..) would match any f with an even number of a arguments.
  • All these additional pattern features need to be supported in the ManyToOneMatcher as well.
  • Better integration with existing types such as dict.
  • Code generation for both one-to-one and many-to-one matching. There is already an experimental implementation, but it still has some dependencies on MatchPy which can probably be removed.
  • Improving the documentation with more examples.
  • Better test coverage with more randomized tests.
  • Implementation of the matching algorithms in a lower-level language, for example C, both for performance and to make MatchPy's functionality available in other languages.

Contributing

If you have some issue or want to contribute, please feel free to open an issue or create a pull request. Help is always appreciated!

The Makefile has several tasks to help development:

  • To install all needed packages, you can use make init .
  • To run the tests you can use make test. The tests use pytest.
  • To generate the documentation you can use make docs .
  • To run the style checker (pylint) you can use make check .

If you have any questions or need help with setting things up, please open an issue and we will try the best to assist you.

Publications

Manuel Krebber and Henrik Barthels
Journal of Open Source Software, Volume 3(26), pp. 2, June 2018.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing, November 2017.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 15th Python in Science Conference, July 2017.

Manuel Krebber
Master Thesis, RWTH Aachen University, May 2017

If you want to cite MatchPy, please reference the JOSS paper:

@article{krebber2018,
    author    = {Manuel Krebber and Henrik Barthels},
    title     = {{M}atch{P}y: {P}attern {M}atching in {P}ython},
    journal   = {Journal of Open Source Software},
    year      = 2018,
    pages     = 2,
    month     = jun,
    volume    = {3},
    number    = {26},
    doi       = "10.21105/joss.00670",
    web       = "http://joss.theoj.org/papers/10.21105/joss.00670",
}
Comments
  • `ManyToOneReplacer` slow for small expressions

    `ManyToOneReplacer` slow for small expressions

    I have added ~1700 ReplacementRules in ManyToOneReplacer. The .replace() has become very slow even for very small expressions(like 2*x).

    File: https://github.com/parsoyaarihant/sympy/blob/502fd311d64155560383a5f4f3b2710c318dc410/sympy/integrals/rubi/patterns.py

    opened by parsoyaarihant 38
  • Added internal implementation of the Hopcroft-Karp algorithm

    Added internal implementation of the Hopcroft-Karp algorithm

    Added Hopcroft-Karp algorithm in order to free MatchPy of its GPL dependency. Readapted from the C++ implementation in SymEngine.

    Translating the C++ implementation in SymEngine back to Python.

    I'm the author of the C++ implementation so no copyright notice mentioning SymEngine is necessary.

    opened by Upabjojr 23
  • `ManyToOneReplacer` for Rubi

    `ManyToOneReplacer` for Rubi

    Hi, I implemented ~80 rules using ManyToOneReplacer for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.

    It would be great if you could advice us on how to solve this issue.

    Here are the files for our project:

    Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi

    opened by parsoyaarihant 21
  • Optional arguments in machpy

    Optional arguments in machpy

    I wanted to know if matchpy supports optional arguments during match.

    Mathematica: https://reference.wolfram.com/language/ref/Optional.html

    Example:

    >>> x = Symbol('x')
    >>> a_ = Wildcard.dot('a', optional=1)
    >>> a_free = CustomConstraint(lambda a, x: str(x) not in a.symbols)
    >>> x_ = Wildcard.dot('x')
    >>> pattern = Pattern(Mul(a_, x_), a_free)
    >>> next(match(x, pattern))
    {a: 1}
    
    opened by parsoyaarihant 20
  • Fixing the code generator

    Fixing the code generator

    The code generator enforces the reading of all variables whenever a constraint has been encountered. In this PR, the constraint is check only if all variables have been read.

    Furthermore, indentation has been changes to four spaces and a related bug forcing to always use bugs has been fixed as well.

    opened by Upabjojr 15
  • Repeated definition of the same CustomConstraint

    Repeated definition of the same CustomConstraint

    In the current RUBI rules patterns in SymPy, the same constraint is defined many times. For example: CustomConstraint(lambda a, x: FreeQ(a, x)) has its definition repeated more and more, like in:

    https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L22

    and in the next rule the same constraint is redefined: https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L26

    Python defines a new object for every new lambda definition, even if the functions return the same identical value.

    Does this practice have some potential negative consequences, like:

    • the size of the generated decision tree?
    • the overall speed of MatchPy?
    opened by Upabjojr 13
  • using `int` as Wildcard

    using `int` as Wildcard

    Hi, I am trying to define class ConstantWild (similar to ConstantSymbol you suggested earlier).

    Here is the code:

    from matchpy import Wildcard
    
    class ConstantWild(Wildcard):
        def __init__(self, value):
            super(self.__class__, self).__init__(min_count=1, fixed_size=True, variable_name=str(value))
            self.value = value
    
        def __str__(self):
            return str(self.value)
    

    The above code have same attributes as Wildcard.dot but I don't understand why it is giving errors.

    opened by parsoyaarihant 9
  • Custom sorting key for substitute( )

    Custom sorting key for substitute( )

    Substitute now accepts custom sorting key to specify custom sorting for elements in Multiset.

    This should solve the compatibility problem with SymPy which does not allow comparison operators (>, >=, <, <=) to be evaluated to booleans if the truth of the expression cannot be easily determined.

    opened by Upabjojr 8
  • Need Help in code generation.

    Need Help in code generation.

    I could not find any documentation related to code generation. Last year @wheerd had written a code generation script for rubi in sympy. https://gist.github.com/wheerd/13ac5a0e9b560db6201d96136fa0bbcc

    But code generated was huge so, It cannot be used. This year I have removed the repeated definition of constraints. So code generation should work probably.

    However, I am unable to write the code generation script. Here is the new structure of rubi in sympy. Can someone update the gist accordingly?

    rules = [] #list to keep track which rules has been applied
    def cons_f1(m, x):
        return FreeQ(m, x)
    cons1 = CustomConstraint(cons_f1)
    
    def cons_f2(m):
        return NonzeroQ(m + S(1))
    cons2 = CustomConstraint(cons_f2)
    
    pattern1 = Pattern(Integral(x_**WC('m', S(1)), x_), cons1, cons2)
    def replacement1(m, x):
        rules.append(1)
        return Simp(x**(m + S(1))/(m + S(1)), x)
    rule1 = ReplacementRule(pattern1, replacement1)
    
    help wanted question 
    opened by ashishkg0022 7
  • Alternate way to define constraint

    Alternate way to define constraint

    I wanted to know if there is alternatative way to define constraint in matchpy. Example:

    
    def FreeQ(vars, x): # returns True of any `var` contains `x`
    	if any(i.contains(x) for i in vars):
    		return False
    	return True
    
    def NonzeroQ(e): # returns True if `e` is not zero.
    	return e != 0
    
    pattern = Pattern(Int(Pow(Add(a_, Mul(b_, x_)), m_), x_), FreeQ(list(a, b, m), x), NonzeroQ(Add(m, 1)))
    

    Thanks.

    opened by parsoyaarihant 7
  • GPL'd dependency: hopcroftkarp library

    GPL'd dependency: hopcroftkarp library

    The hopcroftkarp library upon which this project depends is GPL'd.

    I have re-implemented the Hopcroft-Karp algorithm in C++ as part of an attempt to make MatchPy usable in SymEngine (that is, SymPy's core rewritten in C++). I didn't perform extensive testing, so there may be bugs. That code can be translated into Python.

    enhancement 
    opened by Upabjojr 6
  • FreeQ equivalent in MatchPy?

    FreeQ equivalent in MatchPy?

    Mathematica has FreeQ to test whether an expression contains a symbol. This is very useful in pattern matching for equations as you can specify that, for example, in a * x + b == 0 the variables a and b should not contain the variable x.

    In SymPy we are currently using things like CustomConstraint(lambda a, x: not a.has(x)), where a.has(x) is a SymPy expression that tells you if x is contained in the expression tree of a.

    Would it make sense to add an optimized FreeQ-like tester that checks whether the variable is contained in the expression during the matching iteration of MatchPy?

    opened by Upabjojr 1
  • Optional wildcards to match the identity element of the current node

    Optional wildcards to match the identity element of the current node

    When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

    In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

    It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

    enhancement 
    opened by Upabjojr 7
  • substitute not working with SymPy

    substitute not working with SymPy

    Calling sorted( ) in https://github.com/HPAC/matchpy/blob/5cae3f275e3a1f725518516bf3c700dc3be03a56/matchpy/functions.py#L87 fails with SymPy, as SymPy objects are not comparable with operators (<, <=, >, >=). Operators are overloaded in SymPy to create instances of inequality objects.

    SymPy has a function called default_sort_key meant to deal with this problem:

    from sympy import symbols
    x, y, z = symbols("x y z")
    from sympy.core.compatibility import default_sort_key
    sorted([z, x, y], key=default_sort_key)
    

    The question is, is it possible to modify MatchPy to specify a custom sorting key so that SymPy can specify the way its objects should be sorted?

    Or is it an issue with Multiset?

    opened by Upabjojr 11
  • One-to-one matching: symbols with variable_name do not work in commutative functions

    One-to-one matching: symbols with variable_name do not work in commutative functions

    In one-to-one matching, symbols that have a variable_name do not work in commutative functions. They seem to work correctly in many-to-one matching. Example:

    from matchpy import Operation, Symbol, Arity, Pattern, match, ManyToOneMatcher
    
    f = Operation.new('f', Arity.binary, commutative=True)
    a_x = Symbol('a', variable_name='x')
    a = Symbol('a')
    b = Symbol('b')
    
    subject = f(a, b)
    pattern = Pattern(f(a_x, b))
    
    print(list(match(subject, pattern)))
    
    matcher = ManyToOneMatcher(pattern)
    print(list(matcher.match(subject)))
    

    This results in

    []
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    but it should result in

    [{'x': Symbol('a')}]
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    So far, we don't have any tests that verify this functionality. Once it is fixed, we should add some.

    bug 
    opened by hbarthels 0
  • Match arbitrary function

    Match arbitrary function

    Is there any way to match a function with a certain number of arguments? Suppose I've defined a binary function sum, is there a way to create a pattern f(a, b) that matches sum(x, y)?

    enhancement 
    opened by sidmani 4
  • Improve checking correctness of input

    Improve checking correctness of input

    To avoid issues such as #60, we should check that objects are created and functions are used correctly, at least in those cases where so far, incorrect input does not produce any errors.

    enhancement 
    opened by hbarthels 0
Releases(0.5.5)
Owner
High-Performance and Automatic Computing
Umeå University & RWTH Aachen
High-Performance and Automatic Computing
Ergonomic option parser on top of dataclasses, inspired by structopt.

oppapī Ergonomic option parser on top of dataclasses, inspired by structopt. Usage from typing import Optional from oppapi import from_args, oppapi @

yukinarit 4 Jul 19, 2022
A simple but fully functional calculator that will take multiple operations.

Functional-Calculator A simple but fully functional calculator that will take multiple operations. Usage Run the following command through terminal: p

Uzziel Ariel 1 Dec 22, 2022
The best way to learn Python is by practicing examples. The repository contains examples of basic concepts of Python. You are advised to take the references from these examples and try them on your own.

90_Python_Exercises_and_Challenges The best way to learn Python is by practicing examples. This repository contains the examples on basic and advance

Milaan Parmar / Милан пармар / _米兰 帕尔马 205 Jan 06, 2023
Practice in Oxford_AI&ML class

Practice in Oxford_AI&ML class

St3ve Lee 2 Feb 04, 2022
Repository, with small useful and functional applications

Repositorio,com pequenos aplicativos uteis e funcionais A ideia e ir deselvolvendo pequenos aplicativos funcionais e adicionar a este repositorio List

GabrielDuke 6 Dec 06, 2021
rTorrent Crash Prevention

rTorrent Disk Checker This program is capable of the following when: - a torrent is added by any program (autodl-irssi, RSS Downloader et

16 Dec 14, 2022
A Python package that provides physical constants.

PhysConsts A Python package that provides physical constants. The code is being developed by Marc van der Sluys of the department of Astrophysics at t

Marc van der Sluys 1 Jan 05, 2022
My Solutions to 120 commonly asked data science interview questions.

Data_Science_Interview_Questions Introduction 👋 Here are the answers to 120 Data Science Interview Questions The above answer some is modified based

Milaan Parmar / Милан пармар / _米兰 帕尔马 181 Dec 31, 2022
An html wrapper for python

MessySoup What is it? MessySoup is a python wrapper for html elements. While still a ways away, the main goal is to be able to build a wesbite straigh

4 Jan 05, 2022
Write Streamlit apps using Notion! (Prototype)

Streamlit + Notion test app Write Streamlit apps using Notion! ☠️ IMPORTANT: This is just a little prototype I made to play with some ideas. Not meant

Thiago Teixeira 22 Sep 08, 2022
Herramienta para poder automatizar reuniones en Zoom.

Crear Reunión Zoom con Python Herramienta para poder automatizar reuniones en Zoom. Librerías Requeridas Nombre Comando PyAutoGui pip install pyautogu

JkDev 3 Nov 12, 2022
An async API wrapper for Dress To Impress written in Python.

dti.py An async API wrapper for Dress To Impress written in Python. Some notes: For the time being, there are no front-facing docs for this beyond doc

Steve C 1 Dec 14, 2022
The learning agent learns firstly approaching to the football and then kicking the football to the target position

Football Court This project utilized Pytorch and Tensorflow so that the learning agent learns firstly approaching to the football and then kicking the

1 Nov 19, 2021
Projeto de Jogo de dados em Python 3 onde é definido o lado a ser apostado e número de jogadas, pontuando os acertos e exibindo se ganhou ou perdeu.

Jogo de DadoX Projeto de script que simula um Jogo de dados em Python 3 onde é definido o lado a ser apostado (1, 2, 3, 4, 5 e 6) ou se vai ser um núm

Estênio Mariano 1 Jul 10, 2021
A Klipper plugin for accurate Z homing

Stable Z Homing for Klipper A Klipper plugin for accurate Z homing This plugin provides a new G-code command, STABLE_Z_HOME, which homes Z repeatedly

Matthew Lloyd 24 Dec 28, 2022
Slientruss3d : Python for stable truss analysis tool

slientruss3d : Python for stable truss analysis tool Desciption slientruss3d is a python package which can solve the resistances, internal forces and

3 Dec 26, 2022
WATTS provides a set of Python classes that can manage simulation workflows for multiple codes where information is exchanged at a coarse level

WATTS (Workflow and Template Toolkit for Simulation) provides a set of Python classes that can manage simulation workflows for multiple codes where information is exchanged at a coarse level.

13 Dec 23, 2022
Frappe app for authentication, can be used with FrappeVue-AdminLTE

Frappeauth App Frappe app for authentication, can be used with FrappeVue-AdminLTE

Anthony C. Emmanuel 9 Apr 13, 2022
Basit bir sunucu - istemci örneği

basitSunucuistemci Aşağıdaki adresteki uygulamadaki process kapanmama sorununun çözülmesi ile oluşturulmuş yeni depo https://github.com/pricheal/pytho

Ali Orhun Akkirman 10 Dec 27, 2022
1. 네이버 카페 댓글을 빨리 다는 기능

naver_autoprogram 기능 설명 네이버 카페 댓글을 빨리 다는 기능 네이버 카페 자동 출석 체크 기능 동작 방식 카페 댓글 기능 기본 동작은 주기적인 스케쥴 동작으로 해당 카페 ID 와 특정 API 주소로 대상이 새글을 작성했는지 체크. 해당 대상이 새글 등

1 Dec 22, 2021