A mutable set that remembers the order of its entries. One of Python's missing data types.

Overview

Travis Codecov Pypi

An OrderedSet is a mutable data structure that is a hybrid of a list and a set. It remembers the order of its entries, and every entry has an index number that can be looked up.

Usage examples

An OrderedSet is created and used like a set:

>>> from ordered_set import OrderedSet

>>> letters = OrderedSet('abracadabra')

>>> letters
OrderedSet(['a', 'b', 'r', 'c', 'd'])

>>> 'r' in letters
True

It is efficient to find the index of an entry in an OrderedSet, or find an entry by its index. To help with this use case, the .add() method returns the index of the added item, whether it was already in the set or not.

>>> letters.index('r')
2

>>> letters[2]
'r'

>>> letters.add('r')
2

>>> letters.add('x')
5

OrderedSets implement the union (|), intersection (&), and difference (-) operators like sets do.

>>> letters |= OrderedSet('shazam')

>>> letters
OrderedSet(['a', 'b', 'r', 'c', 'd', 'x', 's', 'h', 'z', 'm'])

>>> letters & set('aeiou')
OrderedSet(['a'])

>>> letters -= 'abcd'

>>> letters
OrderedSet(['r', 'x', 's', 'h', 'z', 'm'])

The __getitem__() and index() methods have been extended to accept any iterable except a string, returning a list, to perform NumPy-like "fancy indexing".

>>> letters = OrderedSet('abracadabra')

>>> letters[[0, 2, 3]]
['a', 'r', 'c']

>>> letters.index(['a', 'r', 'c'])
[0, 2, 3]

OrderedSet implements __getstate__ and __setstate__ so it can be pickled, and implements the abstract base classes collections.MutableSet and collections.Sequence.

OrderedSet can be used as a generic collection type, similar to the collections in the typing module like List, Dict, and Set. For example, you can annotate a variable as having the type OrderedSet[str] or OrderedSet[Tuple[int, str]].

OrderedSet in data science applications

An OrderedSet can be used as a bi-directional mapping between a sparse vocabulary and dense index numbers. As of version 3.1, it accepts NumPy arrays of index numbers as well as lists.

This combination of features makes OrderedSet a simple implementation of many of the things that pandas.Index is used for, and many of its operations are faster than the equivalent pandas operations.

For further compatibility with pandas.Index, get_loc (the pandas method for looking up a single index) and get_indexer (the pandas method for fancy indexing in reverse) are both aliases for index (which handles both cases in OrderedSet).

Authors

OrderedSet was implemented by Robyn Speer. Jon Crall contributed changes and tests to make it fit the Python set API. Roman Inflianskas added the original type annotations.

Comparisons

The original implementation of OrderedSet was a recipe posted to ActiveState Recipes by Raymond Hettiger, released under the MIT license.

Hettiger's implementation kept its content in a doubly-linked list referenced by a dict. As a result, looking up an item by its index was an O(N) operation, while deletion was O(1).

This version makes different trade-offs for the sake of efficient lookups. Its content is a standard Python list instead of a doubly-linked list. This provides O(1) lookups by index at the expense of O(N) deletion, as well as slightly faster iteration.

In Python 3.6 and later, the built-in dict type is inherently ordered. If you ignore the dictionary values, that also gives you a simple ordered set, with fast O(1) insertion, deletion, iteration and membership testing. However, dict does not provide the list-like random access features of OrderedSet. You would have to convert it to a list in O(N) to look up the index of an entry or look up an entry by its index.

Owner
Elia Robyn Lake (Robyn Speer)
Knowledge graph sorceress. She/her. Developer of ConceptNet, which provides the common-sense background knowledge for some state-of-the-art NLP.
Elia Robyn Lake (Robyn Speer)
IADS 2021-22 Algorithm and Data structure collection

A collection of algorithms and datastructures introduced during UoE's Introduction to Datastructures and Algorithms class.

Artemis Livingstone 20 Nov 07, 2022
A collection of data structures and algorithms I'm writing while learning

Data Structures and Algorithms: This is a collection of data structures and algorithms that I write while learning the subject Stack: stack.py A stack

Dhravya Shah 1 Jan 09, 2022
Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures

Svector Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures. Easily chain your methods confidently with tons of add

James Chua 5 Dec 09, 2022
Data Structures and algorithms package implementation

Documentation Simple and Easy Package --This is package for enabling basic linear and non-linear data structures and algos-- Data Structures Array Sta

1 Oct 30, 2021
A Python dictionary implementation designed to act as an in-memory cache for FaaS environments

faas-cache-dict A Python dictionary implementation designed to act as an in-memory cache for FaaS environments. Formally you would describe this a mem

Juan 3 Dec 13, 2022
Map single-cell transcriptomes to copy number evolutionary trees.

Map single-cell transcriptomes to copy number evolutionary trees. Check out the tutorial for more information. Installation $ pip install scatrex SCA

Computational Biology Group (CBG) 12 Jan 01, 2023
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.

A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.

Peter F 93 Dec 01, 2022
This Repository consists of my solutions in Python 3 to various problems in Data Structures and Algorithms

Problems and it's solutions. Problem solving, a great Speed comes with a good Accuracy. The more Accurate you can write code, the more Speed you will

SAMIR PAUL 1.3k Jan 01, 2023
Persistent dict, backed by sqlite3 and pickle, multithread-safe.

sqlitedict -- persistent dict, backed-up by SQLite and pickle A lightweight wrapper around Python's sqlite3 database with a simple, Pythonic dict-like

RARE Technologies 954 Dec 23, 2022
A mutable set that remembers the order of its entries. One of Python's missing data types.

An OrderedSet is a mutable data structure that is a hybrid of a list and a set. It remembers the order of its entries, and every entry has an index number that can be looked up.

Elia Robyn Lake (Robyn Speer) 173 Nov 28, 2022
My notes on Data structure and Algos in golang implementation and python

My notes on DS and Algo Table of Contents Arrays LinkedList Trees Types of trees: Tree/Graph Traversal Algorithms Heap Priorty Queue Trie Graphs Graph

Chia Yong Kang 0 Feb 13, 2022
A DSA repository but everything is in python.

DSA Status Contents A: Mathematics B: Bit Magic C: Recursion D: Arrays E: Searching F: Sorting G: Matrix H: Hashing I: String J: Linked List K: Stack

Shubhashish Dixit 63 Dec 23, 2022
Google, Facebook, Amazon, Microsoft, Netflix tech interview questions

Algorithm and Data Structures Interview Questions HackerRank | Practice, Tutorials & Interview Preparation Solutions This repository consists of solut

Quan Le 8 Oct 04, 2022
Chemical Structure Generator

CSG: Chemical Structure Generator A simple Chemical Structure Generator. Requirements Python 3 (= v3.8) PyQt5 (optional; = v5.15.0 required for grap

JP&K 5 Oct 22, 2022
My solutions to the competitive programming problems on LeetCode, USACO, LintCode, etc.

This repository holds my solutions to the competitive programming problems on LeetCode, USACO, LintCode, CCC, UVa, SPOJ, and Codeforces. The LeetCode

Yu Shen 32 Sep 17, 2022
🔬 Fixed struct serialization system, using Python 3.9 annotated type hints

py-struct Fixed-size struct serialization, using Python 3.9 annotated type hints This was originally uploaded as a Gist because it's not intended as a

Alba Mendez 4 Jan 14, 2022
Array is a functional mutable sequence inheriting from Python's built-in list.

funct.Array Array is a functional mutable sequence inheriting from Python's built-in list. Array provides 100+ higher-order methods and more functiona

182 Nov 21, 2022
Data Structure With Python

Data-Structure-With-Python- Python programs also include in this repo Stack A stack is a linear data structure that stores items in a Last-In/First-Ou

Sumit Nautiyal 2 Jan 09, 2022
Simple spill-to-disk dictionary

Chest A dictionary that spills to disk. Chest acts likes a dictionary but it can write its contents to disk. This is useful in the following two occas

Blaze 59 Dec 19, 2022
A mutable set that remembers the order of its entries. One of Python's missing data types.

An OrderedSet is a mutable data structure that is a hybrid of a list and a set. It remembers the order of its entries, and every entry has an index nu

Elia Robyn Lake (Robyn Speer) 173 Nov 28, 2022