Find graph motifs using intuitive notation

Last update: Jan 02, 2023

Overview

d o t m o t i f

DotMotif is a library that identifies subgraphs or motifs in a large graph. It looks like this:

# Look for all motifs of the form,

# Neuron A excites B:
A -> B [type = "excitatory"]
# ...and B inhibits C:
B -> C [type = "inhibitory"]

Or like this:

TwitterInfluencer(person) {
    # An influencer has more than a million
    # followers and is verified.
    person.followers > 1000000
    person.verified = true
}

InfluencerAwkward(person1, person2) {
    # Two people who are both influencers...
    TwitterInfluencer(person1)
    TwitterInfluencer(person2)
    # ...where one follows the other, but...
    person1 -> person2
    # ...the other doesn't follow back
    person2 !> person1
}

# Search for all awkward twitter influencer
# relationships in the dataset:
InfluencerAwkward(X, Y)
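For readers who think more easily in Python, the two macros above can be sketched as plain predicates over a toy dataset. This is only an illustration of what the query asks for, not how DotMotif evaluates it; the `people` dictionary, the `follows` set, and both helper functions are hypothetical.

```python
def is_influencer(person):
    """Mirror of TwitterInfluencer: over a million followers and verified."""
    return person["followers"] > 1_000_000 and person["verified"]


def awkward_pairs(people, follows):
    """Mirror of InfluencerAwkward.

    `follows` is a set of (follower, followed) name pairs.
    """
    pairs = []
    for x in people:
        for y in people:
            both_influencers = is_influencer(people[x]) and is_influencer(people[y])
            # x follows y, but y does not follow back.
            one_way = (x, y) in follows and (y, x) not in follows
            if both_influencers and one_way:
                pairs.append({"X": x, "Y": y})
    return pairs


# Hypothetical toy data.
people = {
    "alice": {"followers": 2_000_000, "verified": True},
    "bob": {"followers": 5_000_000, "verified": True},
    "carol": {"followers": 10, "verified": False},
}
follows = {("alice", "bob"), ("carol", "alice"), ("bob", "carol")}

awkward = awkward_pairs(people, follows)
```

Only the alice-to-bob relationship qualifies here: both are influencers, alice follows bob, and bob does not follow back.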

Get Started

To follow along in an interactive Binder without installing anything, launch a Jupyter Notebook here:

If you have DotMotif, a NetworkX graph, and a curious mind, you already have everything you need to start using DotMotif:

from dotmotif import Motif, GrandIsoExecutor

executor = GrandIsoExecutor(graph=my_networkx_graph)

triangle = Motif("""
A -> B
B -> C
C -> A
""")

results = executor.find(triangle)
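To make the idea of a match concrete, here is a minimal pure-Python sketch of what a triangle search computes. It is a brute-force illustration, not the algorithm DotMotif or GrandIso uses; the `find_triangles` helper and the toy edge list are hypothetical, and the dict-per-match shape is an assumption about how results are reported.

```python
from itertools import permutations


def find_triangles(edges):
    """Brute-force search for the motif A -> B, B -> C, C -> A."""
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    matches = []
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (c, a) in edge_set:
            # One match maps each motif node name to a graph node.
            matches.append({"A": a, "B": b, "C": c})
    return matches


# Hypothetical toy graph: one directed triangle plus a dangling edge.
toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
triangle_matches = find_triangles(toy_edges)
```

The single directed triangle is found three times, once per rotation of the node assignment.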

Parameters

You can also pass optional parameters into the constructor for the dotmotif object. Those arguments are:

Argument	Type, Default	Behavior
`ignore_direction`	`bool`: `False`	Whether to disregard direction when generating the database query
`limit`	`int`: `None`	A limit (if any) to impose on the query results
`enforce_inequality`	`bool`: `False`	Whether to require that distinct motif nodes map to distinct graph nodes. When `False`, two motif nodes may be aliases for the same graph node. For example, in `A->B->C`, set to `True` if `A` and `C` must not be the same node
`exclude_automorphisms`	`bool`: `False`	Whether to return only a single example for each detected automorphism. See more in the documentation
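One way to picture `exclude_automorphisms` is as a de-duplication step over the matches: two matches count as the same example if they cover exactly the same set of graph edges. The sketch below assumes that interpretation; the library may implement the option differently, and `drop_automorphic_duplicates` is a hypothetical helper.

```python
def drop_automorphic_duplicates(motif_edges, matches):
    """Keep the first match for each distinct set of matched graph edges."""
    seen = set()
    unique = []
    for match in matches:
        # The image of the motif's edges under this node assignment.
        image = frozenset((match[u], match[v]) for u, v in motif_edges)
        if image not in seen:
            seen.add(image)
            unique.append(match)
    return unique


triangle_edges = [("A", "B"), ("B", "C"), ("C", "A")]
# Three rotations of the same directed triangle 1 -> 2 -> 3 -> 1.
rotations = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 3, "C": 1},
    {"A": 3, "B": 1, "C": 2},
]
unique_matches = drop_automorphic_duplicates(triangle_edges, rotations)
```

All three rotations cover the same three edges, so only the first is kept.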

For more details on how to write a query, see Getting Started.

Citing

If this tool is helpful to your research, please consider citing it with:

# https://doi.org/10.1038/s41598-021-91025-5
@article{Matelsky_Motifs_2021, 
    title={{DotMotif: an open-source tool for connectome subgraph isomorphism search and graph queries}},
    volume={11}, 
    ISSN={2045-2322}, 
    url={http://dx.doi.org/10.1038/s41598-021-91025-5}, 
    DOI={10.1038/s41598-021-91025-5}, 
    number={1}, 
    journal={Scientific Reports}, 
    publisher={Springer Science and Business Media LLC}, 
    author={Matelsky, Jordan K. and Reilly, Elizabeth P. and Johnson, Erik C. and Stiso, Jennifer and Bassett, Danielle S. and Wester, Brock A. and Gray-Roncal, William},
    year={2021}, 
    month={Jun}
}

Comments

Neuprint Executor - Labeling Edges by ROI
Hi Jordan,

Do you see an easy way to assign ROI labels to edges in the neuprint executor? Let's say I want to query something like this:

A -> B [weight > 20, ROI == "CX"]
A -> B [weight > 30, ROI == "CRE(L)"]

So basically, there are two things here—multigraphs, which you address already in the docs, and encoding edge ROIs. I wonder if that's rather a hard thing to do or not. The data should be there as neuprint-python fetch_synapse_connections returns something like this

   bodyId_pre  bodyId_post  roi_pre  roi_post  x_pre  y_pre  z_pre  x_post  y_post  z_post  confidence_pre  confidence_post
0  792368888   754547386    PED(R)   PED(R)    14013  27747  19307  13992   27720   19313   0.996           0.401035
1  792368888   612742248    PED(R)   PED(R)    14049  27681  19417  14044   27662   19408   0.921           0.881487
2  792368888   5901225361   PED(R)   PED(R)    14049  27681  19417  14055   27653   19420   ...

According to this issue it looks like it's possible. My observation is that the physical location of a connection between two neurons is an important feature of a motif. Looking forward to hearing what you say.

EDIT: Maybe an indirect way to support multiple edges between two nodes is to group edge attributes. Does something like this seem plausible? You are already doing something similar in the multigraph docs: A -> B [synapse_count > 2]. But what exactly is synapse_count?

A -> B [[weight >= 20, ROI == "CX"], [weight > 30, ROI == "CRE(L)"]]

Best, Jakob
enhancement cypher Neo4jExecutor NeuPrintExecutor
opened by jakobtroidl 9

Error on first query

Tried to run the query from the tutorial:

motif = Motif("""
# My Awesome Motif

Nose_Cell -> Brain_Cell
Brain_Cell -> Arm_Cell
""")

But got this error:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-3a88159c0a0c> in <module>
----> 1 import dotmotif
      2 import networkx
      3 
      4 motif = Motif("""
      5 # My Awesome Motif

~\anaconda3\lib\site-packages\dotmotif\__init__.py in <module>
     24 from networkx.algorithms import isomorphism
     25 
---> 26 from .parsers.v2 import ParserV2
     27 from .validators import DisagreeingEdgesValidator
     28 

~\anaconda3\lib\site-packages\dotmotif\parsers\v2\__init__.py in <module>
     11 
     12 
---> 13 dm_parser = Lark(open(os.path.join(os.path.dirname(__file__), "grammar.lark"), "r"))
     14 
     15 

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\xxxx\\anaconda3\\lib\\site-packages\\dotmotif\\parsers\\v2\\grammar.lark'

bug parser install

opened by lix2k3 9

Filtering By Properties w/ Invalid Characters in the Name
Hey There, I'm using dotmotif to query the neuPrint dataset and have found some of the neurons have properties that aren't accepted in the query string format e.g. 'AVLP(R)': True,

Is there a way to still query w/ these params? I tried adding directly to the _node_constraints but that doesn't seem to work either e.g.

motif._node_constraints['A']['AVLP(R)'] = {}
motif._node_constraints['A']['AVLP(R)']['='] = [True]

Variable `R` not defined (line 2, column 83 (offset: 156))
" WHERE B.status = "Traced" AND A.status = "Orphan" AND A.INP = True AND A.AVLP(R) = True"
parser cypher
opened by simonwarchol 7
fix: weight edge attribute doesn't throw errors anymore (#127)

The edge attribute in the neuprint executor threw an error with the new JSON feature implementation. I also made the neuprint executor tests more rigorous.

opened by jakobtroidl 3
Upgrade grandiso version to use limits and iterable

In grandiso v1.1.0 and above, there is an optional limit argument to the find_motifs call which short-circuits motif counting if a certain number of valid mappings are found.

Right now, NetworkX and GrandIso executors implement the dotmotif limit parameter by finding all motifs and then downselecting, which is super inefficient and lame. We could pretty substantially improve performance by supporting the GrandIso limit arg.

A notable challenge: We perform an additional downselect after running grandiso (to double-check attribute filters). So we may need to store a list of mappings temporarily in order to backfill the results list if candidate mappings are filtered out.
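One possible way around the backfill problem is to filter candidates lazily and count only the mappings that survive the attribute check. The sketch below assumes the matcher can yield candidates one at a time; `find_with_limit`, `candidate_stream`, and the filter are hypothetical stand-ins, not GrandIso or DotMotif functions.

```python
from itertools import islice


def find_with_limit(candidates, passes_filters, limit=None):
    """Lazily filter candidate mappings and stop once `limit` have passed."""
    accepted = (mapping for mapping in candidates if passes_filters(mapping))
    # islice stops pulling from the generator as soon as `limit` is reached;
    # limit=None means "return everything".
    return list(islice(accepted, limit))


consumed = []


def candidate_stream():
    """Stand-in for a matcher that yields candidate mappings on demand."""
    for i in range(1000):
        consumed.append(i)
        yield {"A": i, "B": i + 1}


first_two = find_with_limit(
    candidate_stream(),
    passes_filters=lambda mapping: mapping["A"] % 3 == 0,
    limit=2,
)
```

Here only four of the thousand candidates are ever generated, because the search stops as soon as two of them pass the filter.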
enhancement GrandIsoExecutor

opened by j6k4m8 2
Non-string ids not supported by Neo4jExecutor

Ingesting a NetworkX graph with integer ids results in an error: ValueError: Could not export graph: unsupported operand type(s) for +: 'int' and 'str'. It should be straightforward to handle integers, though; per the NetworkX documentation, "a node can be any hashable Python object except None." Maybe just cast with repr.
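A small sketch of the casting idea, assuming edges are available as plain (source, target) pairs. `stringify_edges` is a hypothetical helper, not part of the Neo4jExecutor. It also shows one reason to prefer `repr` over `str`: `repr` keeps the integer 2 and the string "2" distinct.

```python
def stringify_edges(edges, cast=repr):
    """Cast both endpoints of every edge to a string id."""
    return [(cast(u), cast(v)) for u, v in edges]


# Hypothetical graph mixing an integer id and a string id that look alike.
mixed_edges = [(1, 2), (2, "2")]

as_repr = stringify_edges(mixed_edges)           # 2 -> "2", "2" -> "'2'"
as_str = stringify_edges(mixed_edges, cast=str)  # 2 and "2" both become "2"

# String concatenation, which failed for raw integers, now works.
label = "node_" + as_repr[0][0]
```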
question Neo4jExecutor

opened by jtpdowns 2
Support n constraints on each edge value-operator pair
Currently, the parser overwrites an earlier constraint when the same operator is used again on the same value:

A -> B [value<=5, value<=2]

...will yield a constraint operator of

{ "value": { "lte": 2.0 } }

(i.e. overwriting the first rule).
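A sketch of one possible fix: store a list of values per operator instead of a single value, and require every entry to hold. The data layout, the operator symbols used as keys, and both helpers are assumptions for illustration, not the parser's actual internals.

```python
import operator

# Hypothetical operator table keyed by the symbol used in the query.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def add_constraint(constraints, key, op, value):
    """Append a constraint instead of overwriting an earlier one."""
    constraints.setdefault(key, {}).setdefault(op, []).append(value)


def satisfies(attributes, constraints):
    """True only if every stored constraint holds for the attributes."""
    return all(
        OPERATORS[op](attributes[key], value)
        for key, ops in constraints.items()
        for op, values in ops.items()
        for value in values
    )


edge_constraints = {}
add_constraint(edge_constraints, "value", "<=", 5.0)
add_constraint(edge_constraints, "value", "<=", 2.0)
```

With this layout both rules from `A -> B [value<=5, value<=2]` are retained, so an edge with `value` 3 is rejected and an edge with `value` 1 is accepted.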
bug parser
opened by j6k4m8 2
Node- and edge-attribute support in DSL
Proposed syntax concepts:

Nodes

Inline maplike:

Node1 { type="GABA", z<12 } -> Node2

Pros:

Succinct

Cons:

Possible duplication or conflicting attributes if map is included on multiple lines for the same node

Postfix where-like:

Node1 -> Node2 | Node1.type = "GABA", Node1.z < 12

Pros:

Succinct

Cons:

Possible duplication or conflicting attributes if attrs are included on multiple lines for the same node

Footnote constraints

Node1 -> Node2
Node1.type = "GABA"
Node1.z < 12

Pros:

Reduces possibility of conflicting constraints

Clear syntax; can be standalone in its own macro

Cons:

Linecount verbose

Decouples attributes from connectivity clauses

Edges

Inline maplike:

A ->{type: "excitatory", neurotransmitter: "ACh"} B

Pros:

Inline

Cons:

Reduces clarity of language

Postfix where-like:

A -> B | [type: "excitatory", neurotransmitter: "ACh"]

Pros:

Inline

Cons:

Reduces clarity of language

Infix maplike:

A -[type: "excitatory", neurotransmitter: "ACh"]> B

Pros:

Inline

Cons:

Reduces clarity of language

enhancement DSL
opened by j6k4m8 2
Add macro edge aliases
This adds support for complex edge constraints in macros:

decreasing_edge_weights(a, b, c) {
    a -> b as ab
    b -> c as bc
    ab.weight > bc.weight
}
...

In increasing levels of challengingness:

[x] Add support for simple (edge-value) edge constraints in macros

[x] Add support for dynamic (edge-edge) edge constraints in macros

[x] Extend support for recursive calls to macros with simple constraints

[x] Extend support for recursive calls to macros with dynamic constraints

[x] Update documentation

This fixes #110 and finishes work started in #119.
enhancement DSL parser
opened by j6k4m8 1
Add edge aliasing and edge constraints
This PR adds support for edge aliases (first described in #110) and comparisons between edge attributes with values and with other edges.

This enables syntax like this:

A -> B as ab
B -> A as ba
ab.weight > ba.weight
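In plain Python, the edge-to-edge comparison in this query amounts to looking up both directed edges and comparing their attributes. The sketch below is a hand-written illustration over a toy weight table, not executor code; `stronger_forward_pairs` and `toy_weights` are hypothetical.

```python
def stronger_forward_pairs(weights):
    """Find A, B where A -> B and B -> A both exist and ab.weight > ba.weight.

    `weights` maps (source, target) pairs to edge weights.
    """
    matches = []
    for (a, b), ab_weight in weights.items():
        ba_weight = weights.get((b, a))
        if ba_weight is not None and ab_weight > ba_weight:
            matches.append({"A": a, "B": b})
    return matches


# Hypothetical toy graph: x and y are reciprocally connected, y -> z is not.
toy_weights = {("x", "y"): 10, ("y", "x"): 3, ("y", "z"): 7}
stronger = stronger_forward_pairs(toy_weights)
```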

[x] Add support in the DSL

[x] Add support in the parser + transformer

[x] Add support in the executors:

[x] GrandIso

[x] NetworkX

[x] NeuPrint

[x] Neo4j

I am going to push macro support in a separate PR, since this one is getting pretty lengthy already!
enhancement DSL parser cypher Neo4jExecutor NetworkXExecutor NeuPrintExecutor GrandIsoExecutor
opened by j6k4m8 1
Add node attribute bracket syntax
Adds support for "bracket" syntax for node attributes. An attribute like XYZ(ABC) or ABC DEF used to be disallowed because of illegal characters in the attribute name, particularly when using the "dot-attribute" notation:

# broken:
A -> B
A.ABC DEF > 10

The new syntax uses bracket-attribute notation to "escape" these names:

# working:
import dotmotif
from dotmotif.executors.NeuPrintExecutor import NeuPrintExecutor

HOSTNAME = "neuprint.janelia.org"
DATASET = "hemibrain:v1.2.1"
TOKEN = "[YOUR TOKEN HERE]"

motif = dotmotif.Motif("""
A -> B
A['AVLP(R)'] = True
""")

E = NeuPrintExecutor(HOSTNAME, DATASET, TOKEN)
E.find(motif, limit=2)

Fixes #111.
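On the query-generation side, Cypher accepts backtick-quoted identifiers, which is one way a property name such as `AVLP(R)` can be referenced safely. The sketch below shows that quoting rule in isolation; `cypher_property` is a hypothetical helper and is not claimed to be what the executors emit.

```python
import re

# Names made only of letters, digits, and underscores need no quoting.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def cypher_property(node, attribute):
    """Render node.attribute, backtick-quoting names that are not plain."""
    if _PLAIN_NAME.fullmatch(attribute):
        return f"{node}.{attribute}"
    # A literal backtick inside a quoted name is written as two backticks.
    escaped = attribute.replace("`", "``")
    return f"{node}.`{escaped}`"


plain = cypher_property("A", "status")
quoted = cypher_property("A", "AVLP(R)")
spaced = cypher_property("A", "ABC DEF")
```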
parser cypher Neo4jExecutor NeuPrintExecutor
opened by j6k4m8 1
Add Impossible Constraints validator
We should be able to automatically catch things like this:

A.type = 4
A.type != 4

Right now, we'll catch them in certain instances, but not when constraints are inherited from automorphisms (see #118). Getting smarter about this will likely improve runtime considerably.
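A minimal sketch of such a check, covering only the equality-versus-inequality case shown above. The tuple representation of constraints and the `find_contradictions` helper are assumptions for illustration; a real validator would also need to handle ranges and constraints inherited from automorphisms.

```python
def find_contradictions(constraints):
    """Report `=` constraints that clash with a `!=` or with each other.

    `constraints` is a list of (node, attribute, operator, value) tuples.
    """
    equal = {}
    not_equal = set()
    for node, attr, op, value in constraints:
        if op == "=":
            equal.setdefault((node, attr), set()).add(value)
        elif op == "!=":
            not_equal.add((node, attr, value))

    problems = []
    for (node, attr), values in equal.items():
        if len(values) > 1:
            problems.append(f"{node}.{attr} is required to equal several values")
        for value in sorted(values, key=repr):
            if (node, attr, value) in not_equal:
                problems.append(
                    f"{node}.{attr} = {value!r} contradicts {node}.{attr} != {value!r}"
                )
    return problems


clash = find_contradictions([("A", "type", "=", 4), ("A", "type", "!=", 4)])
no_clash = find_contradictions([("A", "type", "=", 4), ("B", "type", "!=", 4)])
```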
enhancement validator
opened by j6k4m8 0
Anonymous motif participants
Anonymous motif participants:

A -> _hidden
_hidden -> B

Anonymous node participants in macros:

two_hop(A, B) {
    A -> _i
    _i -> B
}
two_hop(neuron1, neuron2)
enhancement DSL parser
opened by j6k4m8 0

Releases(v0.13.0)

v0.13.0(Oct 11, 2022)

v0.12.0(Jun 7, 2022)

v0.10.0(Feb 14, 2022)
What's Changed

Update documentation https://github.com/aplbrain/dotmotif/pull/105

Add initial working multigraph edge validator plus tests by @j6k4m8 in https://github.com/aplbrain/dotmotif/pull/107

Full Changelog: https://github.com/aplbrain/dotmotif/compare/v0.9.2...v0.10.0
v0.9.1(May 6, 2021)

v0.9.0(Mar 23, 2021)

v0.4.0(Feb 26, 2019)

Semi-stable first release to begin syncing with the Changelog. Use at your own risk!

Owner

APL BRAIN

GitHub Repository https://bossdb.org/tools/dotmotif
