streamdf

Streamdf is a lightweight data frame library built on top of the dictionary of numpy array, developed for Kaggle's time-series code competition.

Key Features

Fast and robust insertion
- The insertion of row can be performed with amortized constant time (much faster than np.append)
- Automatically falls back to the default value when an abnormal value is inserted
Time-travel
- Get the past state of the data as a slice of the original dataframe without copying
Null/empty-safe aggregations
- Provides a set of aggregation methods that can be safely called when an element has nan or is empty.
Columnar layout
- Internal data is stored in a simple columnar format, which is easier to use for analysis than numpy's structured array

Example

import pandas as pd
from streamdf import StreamDf

df = pd.read_csv('test.csv')
sdf = StreamDf.from_pandas(df)

# extend
sdf.extend({
    'x': 1,
    'y': 2
})

assert len(sdf) == len(df) + 1

# access
print(sdf['x'])

# aggregate
sdf.last_value('x')

import numpy as np
from streamdf import StreamDf

sdf = StreamDf.empty({'x': np.int32, 'time': 'datetime64[D]'}, 'time')

sdf.extend({'x': 1, 'time': np.datetime64('2018-01-01')})
sdf.extend({'x': 5, 'time': np.datetime64('2018-02-01')})
sdf.extend({'x': 3, 'time': np.datetime64('2018-02-03')})

assert len(sdf) == 3

# Time travel (zero copy)
sliced = sdf.slice_until(np.datetime64('2018-02-02'))

assert len(sliced) == 2

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
streamdf		streamdf
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

streamdf

streamdf

tests

tests

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

requirements.txt

requirements.txt

Repository files navigation

streamdf

Key Features

Example

About

Releases

Packages

Languages

License

nyanp/streamdf

Folders and files

Latest commit

History

Repository files navigation

streamdf

Key Features

Example

About

Resources

License

Stars

Watchers

Forks

Languages