Manage large and heterogeneous data spaces on the file system.

Last update: Dec 14, 2022

Overview

signac - simple data management

The signac framework helps users manage and scale file-based workflows, facilitating data reuse, sharing, and reproducibility.

It provides a simple and robust data model to create a well-defined, indexable storage layout for data and metadata. This makes it easier to operate on large data spaces, streamlines post-processing and analysis, and makes data collectively accessible.

Resources

Framework documentation: Examples, tutorials, topic guides, and package Python APIs.
Package documentation: API reference for the signac package.
Slack Chat Support: Get help and ask questions on the signac Slack workspace.
signac website: Framework overview and news.

Installation

The recommended installation method for signac is through conda or pip. The software is tested for Python 3.6+ and is built for all major platforms.

To install signac via the conda-forge channel, execute:

conda install -c conda-forge signac

To install signac via pip, execute:

pip install signac

Detailed information about alternative installation methods can be found in the documentation.

Quickstart

The framework facilitates a project-based workflow. Set up a new project:

$ mkdir my_project
$ cd my_project
$ signac init MyProject

and access the project handle:

>>> import signac
>>> project = signac.get_project()

Testing

You can test this package by executing:

$ python -m pytest tests/

Acknowledgment

When using signac as part of your work towards a publication, we would really appreciate it if you acknowledged signac appropriately. We have prepared examples on how to do that here. Thank you very much!

The signac framework is a NumFOCUS Affiliated Project.

Comments

Added buffering to SyncedCollection
Description

Added buffering feature for SyncedCollection. The buffering will be provided by signac.buffered and SyncedCollection.buffered.
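To illustrate what buffering means for a file-backed collection, here is a minimal sketch. It is not signac's SyncedCollection: the class name BufferedJSONDict, the writes counter, and the file name doc.json are assumptions made for this example. In buffered mode, changes are kept in memory and written to disk once when the context exits.

```python
import json
import os
import tempfile
from contextlib import contextmanager


class BufferedJSONDict:
    """Hypothetical dict-like store; NOT signac's SyncedCollection."""

    def __init__(self, filename):
        self._filename = filename
        self._data = {}
        self._buffered = False
        self.writes = 0  # counts disk writes, for illustration only

    def _save(self):
        with open(self._filename, "w") as f:
            json.dump(self._data, f)
        self.writes += 1

    def __setitem__(self, key, value):
        self._data[key] = value
        if not self._buffered:
            self._save()  # unbuffered: every change goes straight to disk

    def __getitem__(self, key):
        return self._data[key]

    @contextmanager
    def buffered(self):
        """Defer all writes until the context exits, then flush once."""
        self._buffered = True
        try:
            yield self
        finally:
            self._buffered = False
            self._save()


path = os.path.join(tempfile.mkdtemp(), "doc.json")
doc = BufferedJSONDict(path)
with doc.buffered():
    for i in range(100):
        doc[f"key{i}"] = i  # no disk access inside the context
```

Without the context manager, the loop would trigger one write per assignment; with it, the file is written a single time.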

Motivation and Context

Related to #249. This is a continuation of work in PR

Types of Changes

[ ] Documentation update

[ ] Bug fix

[x] New feature

[ ] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[x] My code follows the code style guideline of this project.

[ ] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[ ] I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

Example for a changelog entry: Fix issue with launching rockets to the moon (#101, #212).
GSoC
opened by vishav1771 30
Refactor path function handling
This PR switched from a bug fix to a refactor. See #666 for the bug fix only.

Original Title: Don't generate views with underspecified path provided by user

Signac has unexpected behavior when generating a view if the user specifies a custom path that doesn't uniquely specify jobs.

I would expect signac to link to all jobs matching the description in the view folder.

What @Nipuli-Gunaratne and I found was that signac just picks one job.

Description

We check that the mapping of source -> link is 1-1 in other parts of import_export.py but don't check the user's provided path function.

Two possible solutions I see are:

Generate an error and exit. Suggest the fix in the error message.

Try to fix the problem by adding jobs ids to the path.

In this draft PR, I identify two places where the error-checking code might fit: import_export.py::_make_path_function and linked_view.py::create_linked_view.

I think it works best in _make_path_function.
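The kind of check being discussed can be sketched in plain Python. This is not the code in import_export.py: the function names find_path_collisions and check_path_is_unique are hypothetical, and the state points and ids are the four jobs from the test project in this PR.

```python
from collections import defaultdict


def find_path_collisions(statepoints, path_template):
    """Return {rendered_path: [job ids]} for paths shared by several jobs.

    statepoints maps a job id to its state point dict (hypothetical layout).
    """
    targets = defaultdict(list)
    for job_id, sp in statepoints.items():
        targets[path_template.format(**sp)].append(job_id)
    return {path: ids for path, ids in targets.items() if len(ids) > 1}


def check_path_is_unique(statepoints, path_template):
    """Raise an error that suggests a fix if the template is underspecified."""
    collisions = find_path_collisions(statepoints, path_template)
    if collisions:
        raise ValueError(
            f"The path '{path_template}' maps several jobs to the same link: "
            f"{sorted(collisions)}. Consider adding the job id to the path."
        )


statepoints = {
    "386b19932c82f3f9749dd6611e846293": {"a": 1, "b": 1},
    "8aacdb17187e6acf2b175d4aa08d7213": {"a": 1, "b": 2},
    "5e4d14d82c320bafb2f1286fe486d1f8": {"a": 2, "b": 1},
    "d48f81ad571306570e2eb9fe7920cd3c": {"a": 2, "b": 2},
}

collisions = find_path_collisions(statepoints, "a/{a}")
```

Here "a/{a}" collides for both values of a, while a template that also includes b would pass the check.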

Motivation and Context

If you make this test project

# init.py
import signac

project = signac.init_project('Test-view')
jobs = [dict(a=1, b=1), dict(a=1, b=2), dict(a=2, b=1), dict(a=2, b=2)]
for j in jobs:
    job = project.open_job(j)
    job.init()
    print(j, job.id)

and generate a view with a user-specified custom path that does not uniquely identify jobs

signac view test_error "a/{a}"

Signac just picks one of the jobs to link. a=1 gets job 8aacdb17187e6acf2b175d4aa08d7213 (b=2) and not 386b19932c82f3f9749dd6611e846293 (b=1)
a=2 gets job 5e4d14d82c320bafb2f1286fe486d1f8 (b=1) and not d48f81ad571306570e2eb9fe7920cd3c (b=2)

Fix that we'd suggest to users OR try to do automatically:

Remake the path specification as "a/{a}/id/{id}"
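The automatic variant of this fix could look roughly like the sketch below. The helper names make_unique_template and render_path are hypothetical and not part of signac; the sketch also assumes that no state point has a key named id.

```python
def make_unique_template(path_template):
    """Append the job id to a template unless it already contains it."""
    if "{id}" in path_template:
        return path_template
    return path_template.rstrip("/") + "/id/{id}"


def render_path(path_template, job_id, statepoint):
    """Fill in the template; assumes the state point has no 'id' key."""
    return path_template.format(id=job_id, **statepoint)


template = make_unique_template("a/{a}")
paths = {
    render_path(template, job_id, sp)
    for job_id, sp in {
        "386b19932c82f3f9749dd6611e846293": {"a": 1, "b": 1},
        "8aacdb17187e6acf2b175d4aa08d7213": {"a": 1, "b": 2},
    }.items()
}
```

Both jobs share a=1, but the rendered paths differ because each ends in its own id.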

Types of Changes

[ ] Documentation update

[x] Bug fix

[ ] New feature

[ ] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[ ] My code follows the code style guideline of this project.

[ ] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[ ] I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

refactor
opened by cbkerr 24
Proposal: Unify dict classes and improve buffering and synchronization
Tl;dr: We need to improve synchronization and caching logic, and I think the first step is to combine the _SyncedDict, SyncedAttrDict, and JSONDict classes.

I apologize in advance for the lengthy nature of this issue. This issue will serve as a pseudo-signac Enhancement Proposal, I'll try and document very thoroughly and it can be a test case for the utility of such proposals :)

In view of our recent push for deprecations and our discussion of reorganizing namespaces and subpackages to prepare for signac 2.0, I'd like to also revisit discussion of the different dict classes. We have various open bugs and features (#234, #196, #239, #238, #198) that are related to improving our synchronization and caching processes. Our synchronization clearly has some holes in it, and in the process of making #239 @bdice has raised concerns about inconsistencies with respect to cache correctness and cache coherence, e.g. the fact that a job that exists and is cached will still exist in the cache after it is deleted (Bradley, feel free to add more information).

Fixing all of these is a complex problem, in part due to fragmentation in our implementation of various parts of the logic. I'd like to use this issue to broadly discuss the various problems that we need to fix, and we can spawn off related issues as needed once we have more of a plan of attack to address our problems. Planning this development more thoroughly is critical since the bugs that may arise touch on pretty much all critical code paths in signac. I think that a good first step is looking into simplifying the logic associated with our various dictionary classes. That change should make it easier to improve #198 since synchronization will be in one place. After that, I think it will be easier to consider the various levels of caching and properly define the invariants we want to preserve.

With respect to the various dictionary classes, I think we need to reassess and simplify our hierarchy:

The core class is the _SyncedDict class in core/synceddict.py, and I think it should exist in more or less its current form.

I understand the logic of separating out the SyncedAttrDict in core/attrdict.py since attribute-based access to a dictionary is technically a feature unrelated to synchronization. However, the class is very minimal, and I think that the benefits of maintaining this level of purity in distinction are outweighed by the increased difficulty users and newer developers have in finding code in the code base. I would like to merge these classes.

The JSONDict class in core/jsondict.py is, in my opinion, harder to justify separating from _SyncedDict on a conceptual level. Although in principle one could argue for different types of file backends, in practice we're very tied to JSON. The bigger problem, though, is that in my understanding (please correct me if I'm wrong here) the primary distinction between the two classes is less about the file backend and more about buffering. We always set the parent of SyncedAttrDict, which is what we use for job statepoints, and this ensures that statepoint changes are immediately synced to disk. Conversely, job documents are JSONDict objects, which use buffering. The fact that the _SyncedDict has _load and _save methods that essentially must be implemented by a child class when parent is not set, and that the JSONDict is the only example we have of such a class, suggests that this is a level of abstraction that isn't very helpful and mainly complicates management of the code. At least for now, I would prefer to unify the JSONDict with _SyncedDict; the logic for when we buffer is already governed by the parent, but the logic for how we buffer is governed by the various other functions in jsondict.py. Afterwards, if we see a benefit to separating the choice of file backend, we could recreate JSONDict where the new version of the class would really only implement JSON-specific logic. This change would have the added benefit of unifying statepoints and documents: I don't think it is intuitive design to have the document and the statepoint be two different classes for reasons of buffering, and it makes it substantially more difficult to follow the logic of how the _sp_save_hook works and why it's necessary. Longer term, I would like to refactor the logic for persistence vs. buffering so that the roles of Job and _SyncedDict are more disjoint, but I recognize that there may not be a way to completely decouple the bidirectional link.
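To make the proposed merge concrete, here is a rough sketch of a single class that combines JSON persistence with attribute-based access. It is only an illustration under simplifying assumptions: the name UnifiedSyncedDict is hypothetical, there is no buffering or parent link, and every access reloads the file.

```python
import json
import os
import tempfile


class UnifiedSyncedDict:
    """Hypothetical merged class: JSON persistence plus attribute access."""

    def __init__(self, filename):
        self._filename = filename
        self._data = {}
        self._load()

    def _load(self):
        # Re-read the file so that changes made elsewhere become visible.
        if os.path.exists(self._filename):
            with open(self._filename) as f:
                self._data = json.load(f)

    def _save(self):
        with open(self._filename, "w") as f:
            json.dump(self._data, f)

    def __getitem__(self, key):
        self._load()
        return self._data[key]

    def __setitem__(self, key, value):
        self._load()
        self._data[key] = value
        self._save()  # synchronize immediately after every change

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name.startswith("_"):
            object.__setattr__(self, name, value)  # internal state
        else:
            self[name] = value  # public names are stored as keys


path = os.path.join(tempfile.mkdtemp(), "doc.json")
doc = UnifiedSyncedDict(path)
doc.temperature = 300      # attribute-style write
doc["pressure"] = 1.0      # item-style write
other = UnifiedSyncedDict(path)  # a second handle on the same file
```

Keeping synchronization in one class means there is a single place where _load and _save are called, which is the property the proposal is after.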

@csadorf @bdice @mikemhenry any commentary on this is welcome, also please tag any other devs who might have enough knowledge of these topics to provide useful feedback.
enhancement proposal GSoC refactor
opened by vyasr 24
Lazy statepoint loading
Description

Changes behavior of Job to load its statepoint lazily, when opened by id.

Motivation and Context

Implementation of #238.
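The idea of lazy loading can be sketched as follows. This is not the Job implementation from this PR: the class LazyJob, the file name statepoint.json, and the loads counter are assumptions chosen for illustration.

```python
import json
import os
import tempfile


class LazyJob:
    """Hypothetical stand-in for a job that is opened by id."""

    def __init__(self, workspace, job_id):
        self.id = job_id
        self._path = os.path.join(workspace, job_id, "statepoint.json")
        self._statepoint = None  # nothing is read from disk yet
        self.loads = 0           # counts disk reads, for illustration only

    @property
    def statepoint(self):
        if self._statepoint is None:
            with open(self._path) as f:
                self._statepoint = json.load(f)
            self.loads += 1
        return self._statepoint


# Prepare a tiny workspace with one job directory.
workspace = tempfile.mkdtemp()
job_id = "abc123"
os.makedirs(os.path.join(workspace, job_id))
with open(os.path.join(workspace, job_id, "statepoint.json"), "w") as f:
    json.dump({"a": 1}, f)

job = LazyJob(workspace, job_id)
loads_before = job.loads      # constructing the job reads nothing
sp = job.statepoint           # first access reads the file
sp_again = job.statepoint     # later accesses reuse the cached value
```

Opening many jobs by id then costs no file I/O until a state point is actually needed.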

Types of Changes

[ ] Documentation update

[ ] Bug fix

[x] New feature

[x] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[x] My code follows the code style guideline of this project.

[x] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[x] I have updated the changelog.

enhancement
opened by bdice 24
Add id property for jobs
Add id property to Job

Description

Adds id as a property of the Job class. I also added a test to ensure that it produces the correct string.

Motivation and Context

Just follows the python trend of using properties instead of getters and setters.
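As a small illustration of the property style, the sketch below shows a getter and the equivalent read-only property side by side. The class is a hypothetical minimal Job, and the hashing scheme is an assumption for the example that may not reproduce signac's actual ids.

```python
import hashlib
import json


class Job:
    """Hypothetical minimal job; not signac's Job class."""

    def __init__(self, statepoint):
        self._statepoint = dict(statepoint)
        # Assumed id scheme: hash of the sorted JSON encoding.
        encoded = json.dumps(self._statepoint, sort_keys=True).encode()
        self._id = hashlib.md5(encoded).hexdigest()

    def get_id(self):
        """Getter style."""
        return self._id

    @property
    def id(self):
        """Property style: read like an attribute, but cannot be assigned."""
        return self._id


job = Job({"a": 1, "b": 2})
job_id = job.id
```

Because no setter is defined, assigning to job.id raises an AttributeError, so the id stays tied to the state point.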

Types of Changes

[ ] Documentation update

[ ] Bug fix

[x] New feature

[ ] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[x] My code follows the code style guideline of this project.

[x] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[ ] I have updated the changelog.

enhancement
opened by b-butler 23
Improve job.data, project.data (H5Store) examples.
Description

I created this PR to start addressing this issue: https://github.com/glotzerlab/signac-docs/issues/50

@klywang Do you have specific suggestions on how to improve this? (I don't think I've "solved" the issue yet, since I haven't provided explicit examples for job.data as requested.) Feel free to edit this PR directly.

Motivation and Context

https://github.com/glotzerlab/signac-docs/issues/50

Types of Changes

[x] Documentation update

[ ] Bug fix

[ ] New feature

[ ] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[ ] My name is on the list of contributors.

[ ] My code follows the code style guideline of this project.

[ ] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[ ] I have updated the changelog.
opened by bdice 22
Improve Sync Data Structures
Description

This PR is related to #249. In this PR, we are implementing SyncedCollection, SyncedAttrDict, SyncedList, JSONCollection, JSONDict, JSONList.

Motivation and Context

This refactor is to provide support for multiple backends and resolve #196.
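A rough sketch of how one base class can support several backends is shown below. The class names are hypothetical and the design is simplified compared to the classes listed in this PR: the base class owns the dict interface, and each backend only implements _load and _save.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class SyncedDictSketch(ABC):
    """Hypothetical base class: dict interface on top of _load/_save."""

    @abstractmethod
    def _load(self):
        """Return the stored data as a dict (empty if nothing is stored)."""

    @abstractmethod
    def _save(self, data):
        """Persist the given dict."""

    def __getitem__(self, key):
        return self._load()[key]

    def __setitem__(self, key, value):
        data = self._load()
        data[key] = value
        self._save(data)

    def __len__(self):
        return len(self._load())


class JSONDictSketch(SyncedDictSketch):
    """Backend that stores the data in a JSON file."""

    def __init__(self, filename):
        self._filename = filename

    def _load(self):
        if not os.path.exists(self._filename):
            return {}
        with open(self._filename) as f:
            return json.load(f)

    def _save(self, data):
        with open(self._filename, "w") as f:
            json.dump(data, f)


class MemoryDictSketch(SyncedDictSketch):
    """Backend with a shared in-memory store, standing in for a service."""

    _store = {}

    def __init__(self, name):
        self._name = name

    def _load(self):
        return dict(self._store.get(self._name, {}))

    def _save(self, data):
        self._store[self._name] = dict(data)


json_doc = JSONDictSketch(os.path.join(tempfile.mkdtemp(), "doc.json"))
memory_doc = MemoryDictSketch("doc")
for backend in (json_doc, memory_doc):
    backend["x"] = 1  # identical calling code for both backends
```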

Types of Changes

[ ] Documentation update

[ ] Bug fix

[x] New feature

[x] Breaking change¹

¹The change breaks (or has the potential to break) existing functionality.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[x] My code follows the code style guideline of this project.

[x] The changes introduced by this pull request are covered by existing or newly introduced tests.

If necessary:

[ ] I have updated the API documentation as part of the package doc-strings.

[ ] I have created a separate pull request to update the framework documentation on signac-docs and linked it here.

[ ] I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

Example for a changelog entry: Fix issue with launching rockets to the moon (#101, #212).
GSoC
opened by vishav1771 20
Optimize `Collection` for internal use in `Project`

A possible optimization for signac 2.0 (or later) would be to reduce the amount of code we use in the Collection class when calling Project.find_jobs. See: https://github.com/glotzerlab/signac/blob/b29e0485c7998ba0c6d041e9ec15b533334d9b64/signac/contrib/project.py#L717

We copy a large amount of data used from the Project's internal caches during the construction of a Collection and calling its find method. At the very least, we never use the Collection's ability to interact with data on disk or its ability to automatically generate ids (primary keys) for new records in the context of a signac Project, so we could eliminate some logic there in a cut-down class.

Originally posted by @bdice in https://github.com/glotzerlab/signac/issues/652#issuecomment-1002303783

I attempted an optimization in 13f2a8fb205f65d49421f4f4009ee3d78d00f9bf but it was unclear if copy/reference semantics would be correct in the resulting indices outside the context of a signac Project's limited usage of Collection. A smaller class that is designed for the actual use case of signac's Project.find_jobs could act as an internal cache with only the necessary logic (e.g. no file I/O or id generation).
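For illustration, a cut-down lookup class of the kind described here might look like the sketch below. The name StatepointIndex and its interface are assumptions; it supports only exact-match queries and deliberately has no file I/O and no id generation.

```python
class StatepointIndex:
    """Hypothetical read-only index over an existing {id: statepoint} cache."""

    def __init__(self, statepoints):
        # Keep a reference to the cache instead of copying it.
        self._statepoints = statepoints

    def find(self, query=None):
        """Return the ids whose state point matches every key in query."""
        query = query or {}
        return [
            job_id
            for job_id, sp in self._statepoints.items()
            if all(key in sp and sp[key] == value for key, value in query.items())
        ]


cache = {
    "job-1": {"a": 1, "b": 1},
    "job-2": {"a": 1, "b": 2},
    "job-3": {"a": 2, "b": 1},
}
index = StatepointIndex(cache)
matches = index.find({"a": 1})
```

Whether reference semantics are safe depends on who else mutates the cache, which is the open question raised above.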
enhancement refactor

opened by bdice 19
Convert all docstrings to numpy style

As part of our overall docs overhaul (see glotzerlab/signac-docs#64), we want to convert our docstrings to numpy style (as decided in glotzerlab/signac-docs#74). The best automated tool I'm familiar with for this task is pyment. In addition to converting docstrings, it will also generate docstrings for functions, classes, etc. that are missing docstrings entirely. However, the conversion will require significant manual review to ensure that all docstrings are converted correctly.
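For reference, a numpy-style docstring has the shape shown below. The function shorten_id is a made-up helper used only to carry the docstring; it is not part of signac.

```python
def shorten_id(job_id, length=8):
    """Return the leading characters of a job id.

    Parameters
    ----------
    job_id : str
        The full job id.
    length : int, optional
        Number of characters to keep (Default value = 8).

    Returns
    -------
    str
        The first ``length`` characters of ``job_id``.

    Raises
    ------
    ValueError
        If ``length`` is smaller than 1.

    Examples
    --------
    >>> shorten_id("386b19932c82f3f9749dd6611e846293")
    '386b1993'

    """
    if length < 1:
        raise ValueError("length must be at least 1.")
    return job_id[:length]


short = shorten_id("386b19932c82f3f9749dd6611e846293")
```

Each section header is underlined with dashes of the same length, and parameters are written as "name : type" with the description indented on the next line.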
enhancement documentation

opened by vyasr 19

Assigning to nested keys in a job document

Original report by Bradley Dice (Bitbucket: bdice, GitHub: bdice).

I would like to use nested keys in a job document. Presently, this does not work as one would expect from "normal" dictionaries. See code snippet below.

#!python
>>> job.document # We start from an empty job document
{}
>>> job.document['a']['b'] = 'c' # This will error as expected, since job.document['a'] is unassigned
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bdice/env/glotzer-software/signac/0.8.5/lib/python/signac-0.8.5-py3.5.egg/signac/core/jsondict.py", line 48, in __getitem__
KeyError: 'a'
>>> job.document['a'] = dict() # We create an empty dictionary as the value of key 'a'
>>> job.document['a']['b'] = 'c' # We attempt to assign a nested key
>>> job.document # This nested key does not appear to be set
{'a': {}}
>>> job.document['a'] = dict(b = 'c') # However, it is possible to set the value to be another dictionary, and this works
>>> job.document
{'a': {'b': 'c'}}

After looking through the source code of the JSONDict with @vyasr, it is not immediately clear how to fix this in the source. It would probably involve some customization of the __getitem__ or __setitem__ functions. However, this workaround will function:

#!python
>>> temp_dict = job.document['a']
>>> temp_dict['b'] = 'c'
>>> job.document['a'] = temp_dict
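One possible direction for a fix, sketched here under the assumption that the nested value is returned without any link back to the document, is to have __getitem__ wrap nested dictionaries in an object that triggers a save on the root. The class SyncedNode and the saved list (which stands in for the file on disk) are hypothetical.

```python
import json


class SyncedNode:
    """Hypothetical wrapper that notifies the root on every change."""

    def __init__(self, data, on_change):
        self._data = data
        self._on_change = on_change

    def __getitem__(self, key):
        value = self._data[key]
        if isinstance(value, dict):
            # Wrap nested dicts so that writes to them are also reported.
            return SyncedNode(value, self._on_change)
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._on_change()


saved = []       # each entry represents one write of the whole document
root_data = {}
document = SyncedNode(root_data, lambda: saved.append(json.dumps(root_data)))

document["a"] = {}
document["a"]["b"] = "c"  # the nested assignment now reaches the root
```

With this wrapper, the last saved state contains the nested key without needing a temporary dictionary.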

Version:

$ signac --version
signac 0.8.5

bug

opened by csadorf 19

Use more compact schema for root directory files
Based on feature usage, the project root directory currently contains a number of special files. Some of them are hidden, some are not.

signac_project_document.json

signac.rc

.signac_sp_cache.json.gz

.signac_history.txt

As suggested by @vyasr we may want to switch to a more compact storage format, for example by bundling all files within a .signac folder.
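A migration toward such a layout could be sketched as below. The function bundle_into_signac_dir is hypothetical, and keeping the original file names inside .signac is an assumption made for this example rather than a decided schema.

```python
import os
import shutil
import tempfile

# The special files listed in this issue.
LEGACY_FILES = [
    "signac_project_document.json",
    "signac.rc",
    ".signac_sp_cache.json.gz",
    ".signac_history.txt",
]


def bundle_into_signac_dir(root):
    """Move the special files found in root into root/.signac."""
    target = os.path.join(root, ".signac")
    os.makedirs(target, exist_ok=True)
    moved = []
    for name in LEGACY_FILES:
        source = os.path.join(root, name)
        if os.path.exists(source):
            shutil.move(source, os.path.join(target, name))
            moved.append(name)
    return moved


# Demonstration on a temporary directory with two of the files present.
root = tempfile.mkdtemp()
for name in ("signac.rc", "signac_project_document.json"):
    with open(os.path.join(root, name), "w") as f:
        f.write("{}")

moved = bundle_into_signac_dir(root)
```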
enhancement pinned
opened by csadorf 18
Bump coverage from 6.5.0 to 7.0.1
Bumps coverage from 6.5.0 to 7.0.1.

Changelog

Sourced from coverage's changelog.

Version 7.0.1 — 2022-12-23

When checking if a file mapping resolved to a file that exists, we weren't considering files in .whl files. This is now fixed, closing issue 1511_.

File pattern rules were too strict, forbidding plus signs and curly braces in directory and file names. This is now fixed, closing issue 1513_.

Unusual Unicode or control characters in source files could prevent reporting. This is now fixed, closing issue 1512_.

The PyPy wheel now installs on PyPy 3.7, 3.8, and 3.9, closing issue 1510_.

.. _issue 1510: nedbat/coveragepy#1510
.. _issue 1511: nedbat/coveragepy#1511
.. _issue 1512: nedbat/coveragepy#1512
.. _issue 1513: nedbat/coveragepy#1513

.. _changes_7-0-0:

Version 7.0.0 — 2022-12-18

Nothing new beyond 7.0.0b1.

.. _changes_7-0-0b1:

Version 7.0.0b1 — 2022-12-03

A number of changes have been made to file path handling, including pattern matching and path remapping with the [paths] setting (see :ref:config_paths). These changes might affect you, and require you to update your settings.

(This release includes the changes from 6.6.0b1 <changes_6-6-0b1_>_, since 6.6.0 was never released.)

Changes to file pattern matching, which might require updating your configuration:

Previously, * would incorrectly match directory separators, making precise matching difficult. This is now fixed, closing issue 1407_.

Now ** matches any number of nested directories, including none.

Improvements to combining data files when using the

... (truncated)

Commits

c5cda3a docs: releases take a little bit longer now

9d4226e docs: latest sample HTML report

8c77758 docs: prep for 7.0.1

da1b282 fix: also look into .whl files for source

d327a70 fix: more information when mapping rules aren't working right.

35e249f fix: certain strange characters caused reporting to fail. #1512

152cdc7 fix: don't forbid plus signs in file names. #1513

31513b4 chore: make upgrade

873b059 test: don't run tests on Windows PyPy-3.9

5c5caa2 build: PyPy wheel now installs on 3.7, 3.8, and 3.9. #1510

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependencies
opened by dependabot[bot] 1
Bump pytest-xdist from 3.0.2 to 3.1.0
Bumps pytest-xdist from 3.0.2 to 3.1.0.

Changelog

Sourced from pytest-xdist's changelog.

pytest-xdist 3.1.0 (2022-12-01)

Features

#789 (https://github.com/pytest-dev/pytest-xdist/issues/789): Users can now set a default distribution mode in their configuration file:

.. code-block:: ini

    [pytest]
    addopts = --dist loadscope

#842 (https://github.com/pytest-dev/pytest-xdist/issues/842): Python 3.11 is now officially supported.

Removals

#842 (https://github.com/pytest-dev/pytest-xdist/issues/842): Python 3.6 is no longer supported.

Commits

92a76bb Release 3.1.0

6226965 Merge pull request #851 from nicoddemus/789-default-dist-mode

7a0bc4c Let users configure dist mode in the configuration file

c6bcd20 [pre-commit.ci] pre-commit autoupdate (#849)

99c80c3 Fix typo psutils -> psutil (#848)

e14895a [pre-commit.ci] pre-commit autoupdate (#846)

bb27210 Merge pull request #844 from pytest-dev/pre-commit-ci-update-config

4a33933 Use ternary operator to remove mypy error

41620d2 [pre-commit.ci] pre-commit autoupdate

6b6f133 Merge pull request #842 from nicoddemus/drop-py36-add-py311

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 1
Bump redis from 4.3.5 to 4.4.0
Bumps redis from 4.3.5 to 4.4.0.

Release notes

Sourced from redis's releases.

Version 4.4.0

Changes

4.4.0rc4 release notes

4.4.0rc3 release notes

4.4.0rc2 release notes

4.4.0rc1 release notes

🚀 New Features (since 4.4.0rc4)

Async clusters: Support creating locks inside async functions (#2471)

🐛 Bug Fixes (since 4.4.0rc4)

Async: added 'blocking' argument to call lock method (#2454)

Added a replacement for the default cluster node in the event of failure. (#2463)

Fixed geosearch: Wrong number of arguments for geosearch command (#2464)

🧰 Maintenance (since 4.4.0rc4)

Updating dev dependencies (#2475)

Removing deprecated LGTM (#2473)

Added an explicit index name in RediSearch example (#2466)

Adding connection step to bloom filter examples (#2478)

Contributors (since 4.4.0rc4)

We'd like to thank all the contributors who worked on this release!

@Sibuken, @barshaul, @chayim, @dvora-h, @nermiller, @uglide and @utkarshgupta137

4.4.0rc4

Changes

🚀 New Features

CredentialsProvider class added to support password rotation (#2261)

Enable AsyncIO cluster mode lock (#2446)

🐛 Bug Fixes

Failover handling improvements for RedisCluster and Async RedisCluster (#2377)

Improved response parsing options handler for special cases (#2302)

Contributors

We'd like to thank all the contributors who worked on this release!

@KMilhan, @barshaul, @dvora-h and @fadida

4.4.0rc3

Changes

... (truncated)

Commits

6fa6cfc Version 4.4.0 (#2485)

938ba6d Updating dev dependencies (#2475)

f32c835 Removing Deprecated LGTM (#2473)

37b961c Use explicit index name in RediSearch example (#2466)

a114f26 Async clusters: Support creating locks inside async functions (#2471)

c48dc83 Async: added 'blocking' argument to call lock method (#2454)

2c12155 Added a replacement for the default cluster node in the event of failure. (#2...

f4d07dd Wrong number of arguments for geosearch command (#2464)

dfe2152 Adding connection step to bloom filter examples (#2478)

f492f85 Install package deps in readthedocs build (#2465)

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 1
Bump numpy from 1.23.5 to 1.24.1
Bumps numpy from 1.23.5 to 1.24.1.

Release notes

Sourced from numpy's releases.

v1.24.1

NumPy 1.24.1 Release Notes

NumPy 1.24.1 is a maintenance release that fixes bugs and regressions discovered after the 1.24.0 release. The Python versions supported by this release are 3.8-3.11.

Contributors

A total of 12 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

Andrew Nelson

Ben Greiner +

Charles Harris

Clément Robert

Matteo Raso

Matti Picus

Melissa Weber Mendonça

Miles Cranmer

Ralf Gommers

Rohit Goswami

Sayed Adel

Sebastian Berg

Pull requests merged

A total of 18 pull requests were merged for this release.

#22820: BLD: add workaround in setup.py for newer setuptools

#22830: BLD: CIRRUS_TAG redux

#22831: DOC: fix a couple typos in 1.23 notes

#22832: BUG: Fix refcounting errors found using pytest-leaks

#22834: BUG, SIMD: Fix invalid value encountered in several ufuncs

#22837: TST: ignore more np.distutils.log imports

#22839: BUG: Do not use getdata() in np.ma.masked_invalid

#22847: BUG: Ensure correct behavior for rows ending in delimiter in...

#22848: BUG, SIMD: Fix the bitmask of the boolean comparison

#22857: BLD: Help raspian arm + clang 13 about __builtin_mul_overflow

#22858: API: Ensure a full mask is returned for masked_invalid

#22866: BUG: Polynomials now copy properly (#22669)

#22867: BUG, SIMD: Fix memory overlap in ufunc comparison loops

#22868: BUG: Fortify string casts against floating point warnings

#22875: TST: Ignore nan-warnings in randomized out tests

#22883: MAINT: restore npymath implementations needed for freebsd

#22884: BUG: Fix integer overflow in in1d for mixed integer dtypes #22877

#22887: BUG: Use whole file for encoding checks with charset_normalizer.

Checksums

... (truncated)

Commits

a28f4f2 Merge pull request #22888 from charris/prepare-1.24.1-release

f8fea39 REL: Prepare for the NumPY 1.24.1 release.

6f491e0 Merge pull request #22887 from charris/backport-22872

48f5fe4 BUG: Use whole file for encoding checks with charset_normalizer [f2py] (#22...

0f3484a Merge pull request #22883 from charris/backport-22882

002c60d Merge pull request #22884 from charris/backport-22878

38ef9ce BUG: Fix integer overflow in in1d for mixed integer dtypes #22877 (#22878)

bb00c68 MAINT: restore npymath implementations needed for freebsd

64e09c3 Merge pull request #22875 from charris/backport-22869

dc7bac6 TST: Ignore nan-warnings in randomized out tests

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 1
Bump tables from 3.7.0 to 3.8.0
Bumps tables from 3.7.0 to 3.8.0.

Release notes

Sourced from tables's releases.

Release v3.8.0

Changes from 3.7.0 to 3.8.0

Improvements

Support for Python 3.11 has been added (PR #962).

Support for Python 3.6 and Python 3.7 has been dropped (PR #966).

Added a new (registered) HDF5 filter for Blosc2 compressor (PR #969).

Added optimized paths for Blosc2 reading and writing in tables. This bypasses the HDF5 filter pipeline by building the Blosc2 CFrames and sending them to the HDF5 direct chunking machinery (PR #969).

Internal C-Blosc sources updated to 1.21.2.

Thanks to Oscar Guiñon and Francesc Alted for implementing the Blosc2 support, and to NumFOCUS for providing a grant for that work.

Other changes

Starting from this release, C source files generated by Cython are no longer included in the source distribution package.

Pre-built HTML documentation is no longer included in the source package.

Changelog

Sourced from tables's changelog.

Changes from 3.8.0 to 3.9.0

XXX version-specific blurb XXX

Commits

e34d1f7 Update copyright year

f1e9fc3 Add a performance comparison with pandas

ce2c32d Getting ready for release 3.8.0

0f28388 Prevent PyPy builds on linux too

53eaed0 Prevent building macos pypy wheels.

e241459 Merge pull request #979 from PyTables/tables-3.8.0

0f14177 Continue silencing warnings for recent NumPy (1.24)

5d02ad6 Merge branch 'master' into tables-3.8.0

a0fccf2 Silence more warnings for recent NumPy (1.24)

8ea4b98 Silence a warning for recent NumPy

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 1
Move to a fully pyproject.toml based build
Description

This PR removes setup.py and setup.cfg entirely, migrating all project and build configuration into pyproject.toml. In the process, all linter configs have also been moved into pyproject.toml. The exception is flake8, which does not (and will not) support pyproject.toml, so the flake8 configuration is now stored in the .flake8 file, which is specific to this linter. bump2version does not support pyproject.toml either (although, unlike flake8, the proposal has not been entirely rejected, so it may eventually), so that configuration has been moved to a project-specific .bumpversion file.

Motivation and Context

Various changes to Python packaging over the last 6 or 7 years have moved towards more static packaging and towards storing data in a backend-agnostic format. These changes allow the use of setuptools alternatives (like flit) as well as more reproducible builds based on build isolation in virtual environments that provide all necessary build dependencies. Direct invocation of setup.py has been deprecated in the process. The changes in this PR modernize signac's build system for compatibility with these new approaches.

Checklist:

[x] I am familiar with the Contributing Guidelines.

[x] I agree with the terms of the Contributor Agreement.

[x] My name is on the list of contributors.

[x] The changes introduced by this pull request are covered by existing or newly introduced tests.

[x] The package documentation and framework documentation in signac-docs are up to date with these changes.

[x] I have updated the changelog and added any related issue and pull request numbers for future reference.
opened by vyasr 2

Releases(v1.8.0)

v1.8.0(Oct 5, 2022)
[1.8.0] -- 2022-10-05

Added

Official support for Python 3.10 (#631).

Benchmarks can be run using the asv (airspeed velocity) tool (#629).

Continuous integration tests run in parallel with pytest-xdist (#705).

The Project.path and Job.path properties (#685).

Changed

Schema migration is now performed on directories rather than signac projects and supports a wider range of schemas (#654).

Deprecated features now use FutureWarning instead of DeprecationWarning, which is hidden by default (#687, #691, #692).

Project names have a default in anticipation of removing names entirely. Project names will be removed in signac 2.0 (#644).

Project.workspace is now a property, not a method (#685).

Continuous integration uses GitHub Actions instead of CircleCI (#776, #788).

Raise errors in testing when a DeprecationWarning or FutureWarning is raised (#713).

Change GitHub PR to check for uncompleted tasks (i.e. unchecked checkboxes) (#686).
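
The switch from DeprecationWarning to FutureWarning matters because Python's default warning filters hide DeprecationWarning unless it is triggered directly by code in __main__, whereas FutureWarning is shown to end users. A minimal sketch of the pattern follows; old_api is a hypothetical function used for illustration, not part of signac.

```python
import warnings


def old_api():
    """Hypothetical deprecated function (not part of signac)."""
    warnings.warn(
        "old_api is deprecated and will be removed in a future release.",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not to old_api
    )
    return 42


# Record warnings so that the emitted category can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

categories = [w.category for w in caught]
```

Here categories holds the classes of all recorded warnings, which makes it easy to check in a test suite that deprecated code paths emit the intended category.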

Deprecated

Project methods read_statepoints, write_statepoints, and dump_statepoints are deprecated (#579, #197).

Project.index method is deprecated (#591, #588).

JobSearchIndex class is deprecated (#600).

index argument is deprecated in Project methods (#602, #588).

signac.cite module is deprecated (#611, #592).

The config module and all its methods are deprecated (#675, #753, #814).

Accessing Project.workspace as a method; it should be accessed as a property (#685).

Project.num_jobs (#685).

ProjectSchema.__call__, ProjectSchema.detect (#685).

Fixed

H5Store.mode returns the file mode (#607).

User-provided path functions now raise an error if not unique (#666).

Collection class no longer raises an error when searching by a primary key that does not exist (#676).

Relative paths on Windows are not used if the current directory has no common prefix (#777).

get_project() now raises an error if provided a root directory that does not exist (#779, #792).

Catch internally raised warnings on use of deprecated password cache (#754).

Catch KeyError from multithreading error (#710).

Tests now properly show raised warnings (#603).

Removed

Removed upper bound of Python 4 on python_requires (#780, #781).

Dropped support for Python 3.6 and Python 3.7 (#715) following the recommended support schedules of NEP 29.

Dropped dependency on deprecation package (#687, #718).

Removed unused _extract utility function to avoid CVE-2007-4559 (#829).

v1.7.0(Jun 8, 2021)
This release adds SyncedCollections, a new, performant, and flexible approach to syncing job state points and documents with an underlying resource. Thanks to all who contributed! 🎨

Added

New SyncedCollection class and subclasses to replace JSONDict with more general support for different types of resources (such as MongoDB collections or Redis databases) and more complete support for different data types synchronized with files (#196, #234, #249, #316, #383, #397, #465, #484, #529, #530). This change introduces a minor backwards-incompatible change: for users making direct use of signac buffering, the force_write parameter is no longer respected. If the argument is passed, a warning will now be raised to indicate that it is ignored and will be removed in signac 2.0.

Unified querying for state point and document filters using 'sp' and 'doc' as prefixes (#332, #514). This change introduces a minor backwards-incompatible change to the Collection index schema ('statepoint'->'sp'), but this does not affect any APIs, only indexes saved to file using a previous version of signac. Indexing APIs will be removed in signac 2.0.
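
The idea behind namespaced filters can be sketched with plain dictionaries: a single query mixes state point and document criteria, and the 'sp' or 'doc' prefix of each key says where to look. The matches helper and the job-like dictionaries below are hypothetical and only illustrate the prefix convention; signac's own matching logic is more general.

```python
def matches(job, query):
    """Return True if a job-like dict satisfies a namespaced query.

    Hypothetical helper for illustration; not signac's implementation.
    Keys are prefixed with 'sp.' (state point) or 'doc.' (document).
    """
    for key, expected in query.items():
        namespace, _, field = key.partition(".")
        if namespace not in ("sp", "doc"):
            raise ValueError(f"unknown namespace in key {key!r}")
        if job[namespace].get(field) != expected:
            return False
    return True


jobs = [
    {"sp": {"a": 1}, "doc": {"done": True}},
    {"sp": {"a": 1}, "doc": {"done": False}},
    {"sp": {"a": 2}, "doc": {"done": True}},
]

# One query can combine state point and document criteria.
selected = [job for job in jobs if matches(job, {"sp.a": 1, "doc.done": True})]
```

Only the first entry satisfies both criteria, so selected contains that single job.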

Changed

Optimized internal path joins to speed up project iteration (#515).

Deprecated

doc_filter arguments, which are replaced by namespaced filters. Due to their long history, doc_filter arguments will still be accepted in signac 2.0 and will only be removed in 3.0 (#516).

The modules signac.core.attrdict, signac.core.json, signac.core.jsondict, and signac.core.synceddict are deprecated in favor of the new SyncedCollection classes and will be removed in signac 2.0 (#483).

Fixed

Corrected docstrings for Job.update_statepoint and Project.update_statepoint (#506, #563).

v1.6.0(Jan 25, 2021)
This release focuses on performance improvements and better docs. Large projects should see massive speedups (4-7x on an SSD) for iterating over the project and working with signac-flow. Now you can scale up your science! 🎨

Added

Implemented JobsCursor.__contains__ check (#449).

Added documentation for JobsCursor class (#475).

Changed

Optimized job hash and equality checks (#442, #455).

Optimized H5Store initialization (#443).

State points are loaded lazily when Job is opened by id (#238, #239).

Optimized Job and Project classes to cache internal properties and initialize on access (#451).

Python 3.6 is only tested with oldest dependencies (#474).

Improved documentation for updating and resetting state points (#444).
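
The lazy loading and caching described above follow a common pattern: defer the expensive read until the value is first accessed, then reuse the result. A minimal sketch under stated assumptions: LazyJob, its reads counter, and the in-memory storage dictionary are hypothetical stand-ins and do not reflect how signac's Job class is implemented.

```python
import json
from functools import cached_property


class LazyJob:
    """Hypothetical stand-in for a job opened by id (not signac's Job class)."""

    def __init__(self, job_id, read_state_point):
        self.id = job_id
        self._read_state_point = read_state_point  # callable doing the expensive read
        self.reads = 0  # counts how often the underlying read actually runs

    @cached_property
    def statepoint(self):
        # Runs only on first access; the result is then cached on the instance.
        self.reads += 1
        return self._read_state_point(self.id)


# Simulated on-disk storage: job id -> JSON-encoded state point.
storage = {"abc123": json.dumps({"a": 1})}

job = LazyJob("abc123", lambda job_id: json.loads(storage[job_id]))
reads_before = job.reads  # nothing has been read yet
first = job.statepoint    # triggers the read
second = job.statepoint   # served from the cache
```

Constructing the object costs nothing beyond storing the id, which is what makes iterating over many jobs cheap when their state points are never inspected.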

Deprecated

Deprecate syncutil.copytree method (#439).

Fixed

Zero-dimensional NumPy arrays can be used in state points and documents (#449).

v1.5.1(Dec 20, 2020)
Added

Support for h5py version 3 (#411).

Added pyupgrade to pre-commit hooks (#413).

Code is formatted with black and isort pre-commit hooks (#415).

Added macOS to CircleCI testing pipeline (#281, #414).

Official support for Python 3.9 (#417).

Changed

Optimized internal function _mkdir_p (#421).

Optimized performance of job initialization (#422).

Optimized performance of buffer storage (#428).

Optimized performance of creating/loading synced data structures (#429).

v1.5.0(Sep 21, 2020)
Added

Type annotations are validated during continuous integration (#313).

Added _repr_html_ method in ProjectSchema class (#314, #324).

Allow grouping by variables that are not present in all jobs in the project in JobsCursor.groupby (#321, #323).

Added parameters usecols and flatten to allow selection of columns and flattening of nested data when converting signac data into a pandas DataFrame (#327, #330).

Added support for pre-commit hooks (#355, #358).

Expanded CLI documentation (#187, #359, #377).

Changed

Docstrings are now written in numpydoc style.

Fixed

Fix the signac config verify command (previously broken) (#301, #302).

Warnings now appear when raised by the signac CLI (#317, #308).

Fix dots in synchronization error messages (#375, #376).

Deprecated

Deprecate the create_access_modules method in Project, to be removed in 2.0 (#303, #308).

The MainCrawler class has replaced the MasterCrawler class. Both classes are deprecated (#342).

Removed

Dropped support for Python 3.5 (#340). The signac project will follow the NEP 29 deprecation policy going forward.

Removed dependency on pytest-subtests (#379).

v1.4.0(Feb 29, 2020)
Added

Added Windows to platforms tested with continuous integration (#264, #266).

Add command line option -m/--merge for signac sync (#280, #230).

Changed

Workspace directory is created when Project is initialized (#267, #271).

Changed testing framework from unittest to pytest (#212, #275).

Refactored internal use of deprecated get_statepoint function (#227, #282).

Fixed

Fixed issues on Windows with H5Store, project import/export, and operations that move files (#264, #266).

Calling items or values on _SyncedDict objects does not mutate nested dictionaries (#234, #269).

Fixed issue with project.data access from separate instances of H5StoreManager (#274, #278).

Fixed error when launching signac shell if permissions are denied for .signac_shell_history (#279).

Removed

Removed vendored tqdm module and replaced it with a requirement (#289).

Removed support for rapidjson as an alternative JSON library (#285, #287).

Removed tuple of keys implementation of nested dictionaries (#272, #296).

v1.3.0(Dec 19, 2019)
Added

Official support for Python 3.8 (#258).

Add properties Project.id and Job.id (#250).

Add signac.diff_jobs function to compare two or more state points (#248, #247).

Add function to initialize a sample data space for testing purposes (#215).

Add schema version to ensure compatibility and enable migrations in future package versions (#165, #253).

Changed

Implemented Project.__contains__ check in constant time (#231).

Fixed

Attempting to create a linked view for a Project on Windows now raises an informative error message (#214, #236).

Project configuration is initialized using ConfigObj, allowing the configuration to include commas and special characters (#251, #252).

Deprecated

Deprecate the get_id method in Project and Job classes in favor of the id property, to be removed in 2.0 (#250).

In-memory modification of the project configuration, to be removed in 2.0 (#246).

Removed

Dropped support for Python 2.7 (#232).

v1.2.0(Jul 22, 2019)
Added

Keep signac shell command history on a per-project basis (#134, #194).

Add read_json() and to_json() methods to Collection class (#104, #200).

Fixed

Fix issue where shallow copies of instances of Job would behave incorrectly (#153, #207).

Fix issue causing a failure of the automatic conversion of valid key types (#168, #205).

Improve the "dots in keys" error message to make it easier to fix related issues (#170, #205).

Update the __repr__ and _repr_html_ implementations of the Project, Job, and JobsCursor classes (#193).

Reduce the logging verbosity about a missing default host key in the configuration (#201).

Fix issue with incorrect detection of dict-like files managed with the DictManager class (e.g. job.stores) (#203).

Fix issue with generating views from the command line for projects with only one job (#208, #211).

Fix issue with heterogeneous types in state point values that are lists (#209, #210).

v1.1.0(May 19, 2019)
Added

Add command line options --sp and --doc for signac find that allow users to display key-value pairs of the state point and document in combination with the job id (#97, #146).

Improve the representation (return value of repr()) of instances of H5Group and SyncedAttrDict.

Fixed

Fix: Searches for whole numbers will match all numerically matching integers regardless of whether they are stored as decimals or whole numbers (#169).

Fix: Passing an instance of dict to H5Store.setdefault() will return an instance of H5Group instead of a dict (#180).

Fix error with storing numpy arrays and scalars in a synced dictionary (e.g. job.statepoint, job.document) (#184).

Fix issue with ResourceWarning originating from unclosed instance of Collection (#186).

Fix issue with using the get_project() function with a relative path and search=False (#191).

Removed

Support for Python version 3.4 (no longer tested).

v1.0.0(Feb 28, 2019)
Highlights

Native integration of HDF5 files with the H5Store and H5StoreManager, which are exposed as the job.data, job.stores, project.data, and project.stores properties respectively.

The newly added signac.get_job() function makes it easier to obtain instances of Job by calling the function from within a job's workspace directory or by directly providing the path to the job's workspace directory. This is especially useful for interactive work or when accessing jobs which are outside of the current project.

Simplified export of project and job data to pandas dataframes via the to_dataframe() function.

Projects and job search results are displayed nicely in Jupyter Notebooks.

Support for compressed Collection files.

Added

Official support for Python 3.7.

The H5Store and H5StoreManager classes, which are useful for storing (numerical) array-like data with an HDF5-backend. These classes are exposed within the root namespace.

The job.data and project.data properties which present an instance of H5Store to access numerical data within the job workspace and project root directory.

The job.stores and project.stores properties, which present an instance of H5StoreManager to manage multiple instances of H5Store to store numerical array-like data within the job workspace and project root directory.

The signac.get_job() and the signac.Project.get_job() functions that allow users to get a job handle by switching into or providing the job's workspace directory.

The job variable is automatically set when opening a signac shell from within a job's workspace directory.

Add the signac shell -c option which allows the direct specification of Python commands to be executed within the shell.

Automatic cast of numpy arrays to lists when storing them within a JSONDict, e.g., a job.statepoint or job.document.

Enable Collection class to manage collections stored in compressed files (gzip, zip, etc.).

Enable deleting of JSONDict keys through the attribute interface, e.g., del job.doc.foo.

Pretty HTML representation of instances of Project and JobsCursor targeted at Jupyter Notebooks (requires pandas, automatically enabled when installed).

The to_dataframe() function to export the job state point and document data of a Project or a JobsCursor, e.g., the result of Project.find_jobs(), as a pandas.DataFrame (requires pandas).

Changed

Dots (.) in keys are no longer allowed for JSONDict and Collection keys (previously deprecated).

The JSONDict module is exposed in the root namespace, which is useful for storing text-serializable data with a JSON-backend similar to the job.statepoint or job.document, etc.

The Job.init() method returns the job to allow one-line job creation and initialization.

The search argument was added to the signac.get_project() function, which when True (the default), will cause signac to search for a project within and above a specified root directory, not only within the root directory. The behavior without any arguments remains unchanged.
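
The effect of the search argument can be illustrated with a small standard-library sketch of an upward directory search. The find_project_root helper and the MARKER file name are assumptions made for illustration; they are not signac's implementation or a documented part of its API.

```python
import tempfile
from pathlib import Path

MARKER = "signac.rc"  # assumed marker file name, for illustration only


def find_project_root(root, search=True):
    """Return the directory containing MARKER, or None if not found.

    Hypothetical helper: with search=True, the parent directories of
    `root` are checked as well; with search=False, only `root` is checked.
    """
    root = Path(root).resolve()
    candidates = [root, *root.parents] if search else [root]
    for directory in candidates:
        if (directory / MARKER).is_file():
            return directory
    return None


with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp).resolve()
    (project_dir / MARKER).touch()
    nested = project_dir / "workspace" / "subdir"
    nested.mkdir(parents=True)

    # Searching upward from a nested directory finds the project directory.
    found_with_search = find_project_root(nested, search=True) == project_dir
    # Without searching, only the nested directory itself is inspected.
    found_without_search = find_project_root(nested, search=False)
```

In this sketch, disabling the search means a project is found only when the given directory itself contains the marker file.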

Fixed

Fix Collection.update() behavior such that existing documents with identical primary key are updated. Previously, a KeyError would be raised.

Fix issue where the Job.move() would trigger a confusing DestinationExists exception when trying to move jobs across devices / file systems.

Fix issue that caused failures when the python-rapidjson package is installed. The python-rapidjson package is used as the primary JSON-backend when installed.

Fix issue where schema with multiple keys would subset incorrectly if the list of jobs or statepoints was provided as an iterator rather than a sequence.

Removed

Removes the obsolete and deprecated core.search_engine module.

The previously deprecated Project.find_statepoints() and Project.find_job_documents() functions have been removed.

The Project.find_jobs() no longer accepts the obsolete index argument.

v0.9.5(Jan 31, 2019)
Fixed

Ensure that the next() function can be called for a JobsIterator, e.g., project.find().

Pickling issue that occurs when a _SyncedDict (job.statepoint, job.document, etc.) contains a list.

Issue with the readline module that would cause signac shell to fail on Windows operating systems.

