Yet Another Time Series Model

Last update: Sep 13, 2022

Overview

Yet Another Timeseries Model (YATSM)


About

Yet Another Timeseries Model (YATSM) is a Python package that provides a collection of timeseries algorithms and methods designed to monitor the land surface using remotely sensed imagery.

The "Yet Another..." part of the package name is a reference to the algorithms implemented:

Continuous Change Detection and Classification (CCDC)
- Citation: Zhu and Woodcock, 2014; Zhu, Woodcock, Holden, and Yang 2015
- Note: Unvalidated, non-reference implementation
Long term mean phenology fitting using Landsat data
- Citation: Melaas, Friedl, and Zhu 2013
- Note: validated against Melaas et al's code, but not a reference implementation
Commission detection via a p-of-F test (e.g., Chow test) similar to what is used in LandTrendr (Kennedy et al., 2010)
...
More to come! Please reach out if you would like to help contribute

Note that the algorithms implemented within YATSM are not to be considered "reference" implementations unless otherwise noted.

The objective of making many methods of analyzing remote sensing timeseries available in one package is to leverage the strengths of multiple methods to overcome the weaknesses in any one approach. The opening of the Landsat archive in 2008 made timeseries analysis of Landsat data finally possible and kickstarted a "big bang" of methods that have evolved and proliferated since then. Over the years, it has become obvious that each individual algorithm is designed to monitor slightly different processes or leverages different aspects of the same datasets. Recent comparative analysis studies (Healey, Cohen, et al, forthcoming) strongly suggest that an ensemble of such algorithms is more accurate and informative than any one result alone. A suite of weak learners combined intelligently does indeed create a more powerful ensemble learner. By using a common set of vocabulary and making these algorithms available in one place, the YATSM package hopes to make such an ensemble possible.

Please consider citing as:

Christopher E. Holden. (2017). Yet Another Time Series Model (YATSM). Zenodo. http://doi.org/10.5281/zenodo.251125

Comments

GEO: glmnet-python version change

Was running YATSM for p012r031, everything was going fine until the 67th job, when I suddenly got the following error message:

09:26:06:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_LassoCV" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl)
Traceback (most recent call last):
  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.5', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
    cfg = parse_config_file(config)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
    return convert_config(cfg)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
    cfg[pred_method]['pickle'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
    reg = joblib.load(pickle)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named elastic_net_cv

I tried changing to GLMNET_Lasso20, then got this message:

09:47:50:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_Lasso20" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_Lasso20.pkl)
Traceback (most recent call last):
  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.5', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
    cfg = parse_config_file(config)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
    return convert_config(cfg)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
    cfg[pred_method]['pickle'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
    reg = joblib.load(pickle)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: function.__new__(X): X is not a type object (function)

Oddly enough, when I tried the sklearn Lasso20, things seem to be running alright, though I do get a warning about convergence:

10:03:31:DEBUG:66:config_parser.convert_config:Predicting using "Lasso20" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/sklearn_Lasso20.pkl)
10:03:31:DEBUG:93:cache.test_cache:Attempt reading in from cache directory?: True
10:03:31:DEBUG:95:cache.test_cache:Attempt writing to cache directory?: True
10:03:31:INFO:81:line.line:Job 0 of 5 - using config file /projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml
10:03:31:DEBUG:96:line.line:Responsible for lines: [ 0 5 10 ..., 7140 7145 7150]
10:03:31:DEBUG:125:line.line:Already processed line 0
10:03:31:DEBUG:125:line.line:Already processed line 5
10:03:31:DEBUG:125:line.line:Already processed line 10
10:03:31:DEBUG:125:line.line:Already processed line 15
10:03:31:DEBUG:125:line.line:Already processed line 20
10:03:31:DEBUG:125:line.line:Already processed line 25
10:03:31:DEBUG:125:line.line:Already processed line 30
10:03:31:DEBUG:125:line.line:Already processed line 35
10:03:31:DEBUG:125:line.line:Already processed line 40
10:03:31:DEBUG:125:line.line:Already processed line 45
10:03:31:DEBUG:125:line.line:Already processed line 50
10:03:31:DEBUG:125:line.line:Already processed line 55
10:03:31:DEBUG:125:line.line:Already processed line 60
10:03:31:DEBUG:125:line.line:Already processed line 65
10:03:31:DEBUG:128:line.line:Running line 70
10:03:32:DEBUG:158:reader.read_line:Read in Y from cache file
/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.py:444: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations
  ConvergenceWarning)
10:07:41:DEBUG:192:line.line: Saving YATSM output to /projectnb/landsat/projects/Massachusetts/p012r031/images/YATSM/yatsm_r70.npz
10:07:41:DEBUG:199:line.line:Line 70 took 249.888620138s to run
10:07:41:DEBUG:128:line.line:Running line 75
10:07:42:DEBUG:158:reader.read_line:Read in Y from cache file

Did something change with the pickles? I do find it really strange that my first 60+ jobs ran fine before I started getting errors, which makes me think this is not something to do with my copy of YATSM (since I didn't do a pull or anything that should change those files). Since the last lines of the errors have to do with File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py" I'm wondering if this maybe has something to do with site packages?

Any insight would be much appreciated; I was hoping to run all 5 MA scenes this week.

PS - My log files are a mess, but the first few runs are in /projectnb/landsat/projects/Massachusetts/p012r031/images/, more recent runs in /projectnb/landsat/projects/Massachusetts/p012r031.
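Both tracebacks end inside pickle's class lookup, which would be consistent with the pickles referring to a module layout that is no longer importable. A pickle stores only the module path and name of each class, not the class itself, so a file that loaded earlier can fail later if an installed package changes underneath it. The sketch below reproduces that failure mode with a throwaway module; `fake_glmnet` and `ElasticNet` are hypothetical stand-ins, not the real glmnet-python layout, and this is not a confirmed diagnosis of the error above.

```python
import pickle
import sys
import types

# Throwaway module standing in for a third-party estimator package
# (hypothetical name; not the real glmnet-python module layout).
mod = types.ModuleType("fake_glmnet")
ElasticNet = type("ElasticNet", (object,), {"__module__": "fake_glmnet"})
mod.ElasticNet = ElasticNet
sys.modules["fake_glmnet"] = mod

est = ElasticNet()
est.alpha = 1.0

# The pickle records "fake_glmnet" + "ElasticNet" by name, not the class body.
payload = pickle.dumps(est)

# Simulate a package change after which the module can no longer be imported.
del sys.modules["fake_glmnet"]

try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    load_error = exc

print(type(load_error).__name__, load_error)
```

If this is what happened, rebuilding the pickles against the currently installed libraries would be one thing to try.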

opened by valpasq 17

WIP: Pipeline fixes

WIP to address some of the design flaws, limitations, and unimplemented functionality. Before this is merged, I'd like to have some progress toward including the data reading and preprocessing explicitly.

@wkearn - I have some thoughts I'd love to get your feedback on, mostly having to do with how to represent tasks in the graph that have side-effects. I'll try to write via the code review process...

opened by ceholden 14

Cache error: TypeError: long() argument must be a string or a number, not 'NoneType'

When I run the cache function in the Ubuntu terminal I get this error: TypeError: long() argument must be a string or a number, not 'NoneType'. I do not know how to fix it. I have tried different options, such as changing the job number, the total number of jobs, the locations and directories where I have the .csv, the input images (Landsat 7, ENVI BSQ), the output folder, and the parameter called prediction, but nothing has worked. Thanks in advance for your help.

p007r059yaml.txt

(yatsm_venv)[email protected]:~$ yatsm cache /home/ideam/yatsm/7_59/p007r059/p007r059.yaml 1 8
/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
  File "/home/ideam/yatsm_venv/bin/yatsm", line 9, in <module>
    load_entry_point('yatsm==0.6.1', 'console_scripts', 'yatsm')()
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/home/ideam/yatsm_venv/local/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/ideam/yatsm/yatsm/cli/cache.py", line 108, in cache
    Y = io.gdal_reader.read_row(df['filename'], job_line)
  File "/home/ideam/yatsm/yatsm/io/stack_line_readers.py", line 155, in read_row
    return self._read_row(row)
  File "/home/ideam/yatsm/yatsm/io/stack_line_readers.py", line 138, in _read_row
    data[n_b, i, :] = band.ReadAsArray(0, row, self.n_col, 1)
TypeError: long() argument must be a string or a number, not 'NoneType'

opened by johanna-bernal 12

Error when phenology fitting is enabled

Error occurs when attempting to run phenology fitting:

  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 190, in line
    ltm = pheno.LongTermMeanPhenology(yatsm, **cfg['phenology'])
TypeError: __init__() got an unexpected keyword argument 'year_interval'

Config location: /projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml
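A `TypeError` like this generally means the `phenology` section of the config contains a key that the constructor does not accept. A minimal sketch of catching that mismatch before the estimator is built follows; `PhenologyEstimator` and its parameters are a hypothetical stand-in, not the real `LongTermMeanPhenology` signature.

```python
import inspect


class PhenologyEstimator(object):
    """Hypothetical stand-in; the parameters are illustrative only."""

    def __init__(self, model, q_min=10, q_max=90):
        self.model = model
        self.q_min = q_min
        self.q_max = q_max


def unexpected_keys(cls, cfg):
    """Return the config keys that ``cls.__init__`` does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any key, so nothing would be unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(key for key in cfg if key not in params)


cfg = {"q_min": 10, "q_max": 90, "year_interval": 3}
bad = unexpected_keys(PhenologyEstimator, cfg)
print(bad)
```

Reporting the offending keys by name points directly at the config entry that needs to change.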

bug

opened by valpasq 12

CCDCesque critical value specification

The "test statistic" used in CCDC is the square root of the sum of squared scaled residuals for all bands tested. This "test statistic" isn't normalized by how many bands you're adding, so the critical value needs to depend on how many test indices you have. If you use more indices to test with, then you'll need to increase the critical value by some amount.

Zhu derives the "test statistic" critical value for p=0.01 and k=len(test_indices) using the inverse survival function of the scipy.stats.chi distribution since we're probably summing squares of normally distributed variables (the scaled residuals).

Questions of independence, normality, statistical soundness, etc. aside, my biggest concern is that we don't really care about finding change according to some null hypothesis testing framework value. CCDC is, at best, vaguely statistical and we've never analytically or numerically explored what the distribution of the "test statistic" is under the null hypothesis of no change. However, using the scipy.stats.chi.isf does convey the important message that the critical value depends on how many bands are being tested.

So, the proposed solution is one or more of the following:

- Better documentation explaining this!
- Convert the threshold parameter to p_value and retrieve the threshold using scipy.stats.chi.isf for a given number of test_indices
- Accept both threshold and p_value, with threshold being the default input that overrides p_value if both are specified
  - If the user specifies p_value, then we back out threshold against the chi distribution
  - If threshold is specified, then there's no difference versus current behavior
  - If the user specifies both parameters, we keep threshold as authoritative and warn the user
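The "accept both" option could be sketched roughly as follows. The function name `resolve_threshold` and its arguments are hypothetical and not part of YATSM; the only library call used is `scipy.stats.chi.isf`, with the number of test indices as the degrees of freedom.

```python
import warnings

from scipy.stats import chi


def resolve_threshold(n_test_indices, threshold=None, p_value=None):
    """Return the critical value, treating ``threshold`` as authoritative."""
    if threshold is not None:
        if p_value is not None:
            warnings.warn("Both threshold and p_value given; using threshold")
        return float(threshold)
    if p_value is None:
        raise ValueError("Specify either threshold or p_value")
    # Inverse survival function of the chi distribution with k degrees of
    # freedom, where k is the number of bands/indices being tested.
    return float(chi.isf(p_value, n_test_indices))


# The critical value grows as more test indices are used.
for k in (1, 2, 4, 6):
    print(k, round(resolve_threshold(k, p_value=0.01), 3))
```

Keeping `threshold` authoritative means existing configuration files would behave exactly as before.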

Thoughts, @bullocke, @valpasq, @parevalo, and @xjtang?

opened by ceholden 6

Issue in no data zone in line.py line 196

https://github.com/ceholden/yatsm/blob/master/yatsm/cli/line.py#L196

  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 196, in line
    output.extend(yatsm.record)
TypeError: 'NoneType' object is not iterable

Fix (tested): Add if statement to check if record exists:

if yatsm.record is not None:
    output.extend(yatsm.record)

bug duplicate

opened by bullocke 6

Error specifying commission_alpha - TypeError: 'NoneType' object is not iterable

Tried to run YATSM with commission_alpha=0.05 using config file /projectnb/landsat/projects/Massachusetts/p013r030/p013r030_config.yaml

YATSM:
    algorithm: "CCDCesque"
    prediction: "GLMNET_Lasso20"
    design_matrix: "1 + x + harm(x, 1) + harm(x, 2)"
    reverse: False
    commission_alpha: 0.05

But got the following error:

  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.6-beta', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 178, in line
    yatsm, cfg['YATSM']['commission_alpha'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/algorithms/postprocess.py", line 127, in commission_test
    for i_m, _m in enumerate(_models):
TypeError: 'NoneType' object is not iterable

bug

opened by valpasq 5

Phenology fit error

Pulled updated repository yesterday (currently up to date). Tried running YATSM with phenology enabled. A good number of lines process with no issue, but eventually the following error occurs:

  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 309, in fit
    self.q_min, self.q_max)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 241, in _fit_record
    pad_start = np.arange(1, yeardoy[:, 1].min() + 1)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

yatsm_*.o* logfiles are located in /projectnb/landsat/projects/Massachusetts/. Verbose was turned on to help identify which lines caused the error.
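The `ValueError` is what NumPy raises when `.min()` is called on an array with no elements, which suggests that some records reach `_fit_record` with no usable observations. One possible guard is sketched below; the helper name `pad_start_days` is hypothetical, `yeardoy` is assumed to be an (n, 2) array of year and day of year, and skipping the record is an assumption rather than the project's actual fix.

```python
import numpy as np


def pad_start_days(yeardoy):
    """Day-of-year padding up to the first observation, or None if empty.

    ``yeardoy`` is assumed to be an (n, 2) integer array of (year, doy).
    """
    if yeardoy.shape[0] == 0:
        # No usable observations: skip instead of reducing an empty array.
        return None
    return np.arange(1, yeardoy[:, 1].min() + 1)


empty = np.empty((0, 2), dtype=int)
obs = np.array([[2000, 5], [2000, 120], [2001, 33]])
print(pad_start_days(empty))
print(pad_start_days(obs))
```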

bug

opened by ceholden 5

Require `config` in the schema

The recent updates to the config schema changed the reader section to have a name and a config section where before it was name and $name (like GDAL). Should config be a required entry in the schema?

This just screwed me up temporarily as I was using my old (i.e. Friday's) config file and I got an error about the input_file field missing that I had to trace down to this name change, which I just missed as I scanned the new example config. Obviously it's in the new example, so new users who copy that won't hit this.

It seems that if we're going to throw an error when something in the config section (or the section itself) is missing, we should throw it in the config validation step rather than a few steps later when we start trying to load data.
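A minimal sketch of failing during validation rather than at load time is shown below. It uses a hand-written check instead of the project's actual schema, and the section name `reader` and the file name `images.csv` are assumptions for illustration. In a schema-based validator, the equivalent would presumably be listing `config` among the required keys.

```python
def validate_reader(cfg):
    """Raise early if the reader section lacks ``name`` or ``config``."""
    reader = cfg.get("reader")
    if not isinstance(reader, dict):
        raise KeyError("Missing required section: 'reader'")
    missing = [key for key in ("name", "config") if key not in reader]
    if missing:
        raise KeyError("reader section is missing required keys: %s"
                       % ", ".join(missing))
    return reader


# Old layout: name plus a section named after the reader (e.g. GDAL).
old_style = {"reader": {"name": "GDAL", "GDAL": {"input_file": "images.csv"}}}
# New layout: name plus a generic config section.
new_style = {"reader": {"name": "GDAL", "config": {"input_file": "images.csv"}}}

validate_reader(new_style)  # passes silently
try:
    validate_reader(old_style)
except KeyError as exc:
    print(exc)
```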

opened by wkearn 4

added seasonal symbology option for TS plot

Added simple seasonal symbology option to TS (sequential date) plot for 0.6.x-maintenance version of yatsm pixel.

Current symbology puts emphasis on growing season observations (100% opaque) vs. other seasons (50% transparent).

Seasons defined by month and most applicable to temperate ecoregions in the Northern Hemisphere.

Winter (leaf-off) = November, December, January, February, March

Spring = April, May

Summer (leaf-on) = June, July, August

Fall = September, October

Dates loosely correspond to this visual:

Source: https://nccwsc.usgs.gov/content/ecological-drought-northeast-united-states-anticipating-changes-iconic-species-landscapes
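The month-based season definitions above can be written as a small lookup table. This is a sketch rather than the code in `yatsm pixel`; it assumes that "growing season" means the summer (leaf-on) months and that the emphasis is expressed as an alpha of 1.0 versus 0.5.

```python
SEASON_MONTHS = {
    "winter": (11, 12, 1, 2, 3),  # leaf-off
    "spring": (4, 5),
    "summer": (6, 7, 8),          # leaf-on
    "fall": (9, 10),
}
# Invert to a month -> season lookup covering all 12 months.
MONTH_TO_SEASON = {
    month: season
    for season, months in SEASON_MONTHS.items()
    for month in months
}


def season_alpha(month, emphasized="summer"):
    """Return (season, alpha) for a calendar month (1-12)."""
    season = MONTH_TO_SEASON[month]
    alpha = 1.0 if season == emphasized else 0.5
    return season, alpha


print(season_alpha(7))
print(season_alpha(12))
```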
opened by valpasq 4

glmnet/fortran error when running yatsm line

I'm trying to run yatsm line and getting the following error:

15:15:19:INFO:51:line.line:Job 1 of 400 - using config file /projectnb/landsat/projects/Colombia/images/005057/Results/FIT1/557_FIT1.yaml
0-th dimension must be fixed to 4 but got 6

Traceback (most recent call last):
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/bin/yatsm", line 9, in <module>
    load_entry_point('yatsm==0.6.0.dev0', 'console_scripts', 'yatsm')()
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/yatsm/cli/line.py", line 156, in line
    yatsm.fit(_X, _Y, _dates, **algo_cfg.get('fit', {}))
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/yatsm/algorithms/ccdc.py", line 222, in fit
    self.train()
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/yatsm/algorithms/ccdc.py", line 326, in train
    bands=self.test_indices)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/yatsm/algorithms/yatsm.py", line 195, in fit_models
    model.fit(X, y, **self.estimator_fit)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/glmnet/elastic_net.py", line 86, in fit
    enet_path(X, y, alpha=self.alpha, **kwargs)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/glmnet/elastic_net.py", line 349, in enet_path
    = elastic_net(X, y, alpha, **kwargs)
  File "/usr3/graduate/parevalo/miniconda2/envs/conda_env/lib/python2.7/site-packages/glmnet/glmnet.py", line 91, in elastic_net
    nlam=nlam, isd=standardize)
_glmnet.error: failed in converting 6th argument `vp' of _glmnet.elnet to C/Fortran array

I'm using a conda venv and packages were installed during its creation, using the environment.yaml file and following the instructions provided in the documentation.
I'm using yatsm 0.6.0dev (very latest version)
Config files in: /projectnb/landsat/projects/Colombia/images/005057/Results/FIT1
The script used is Yatsm.sh

I'm not even sure how to try to debug it myself or trace the cause of the problem. Also, I checked your glmnet-python repository, where you suggested using the fork by the user "shuras". I tried to install that to see if it fixed the problem, but the installation was unsuccessful.

opened by parevalo 4

Installing yatsm error

Hello, I am getting the error message below when trying to install yatsm on Ubuntu 17.04 with Anaconda3. What does it mean? Thanks

Command "/home/user/anaconda3/envs/yatsm/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-kred0d4t-build/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-qkvpe3vz-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-kred0d4t-build/

CondaValueError: pip returned an error.

P.S. If this is not the right place to post such a request, please let me know.

opened by mlateb 2

Postprocessing: sieve on records

Resolve changes occurring in models that are not near-simultaneously co-occurring within pixels in a neighborhood region.

Depends a bit on #14 because finding a pixel's neighbors in results will require quite a lot of np.where to index. Using a file format that could store variables in some indexing scheme would speed up the searches substantially (using pytables, h5py, etc.)

enhancement

opened by ceholden 0

Result file IO abstractions

Motivation

Right now we're using NumPy saved files that store structured arrays for the results, but this might change in the future (see #14), especially to accommodate some visualization utilities that would benefit from having all the results for an image in one container indexed intelligently.

Another annoyance is that each CLI utility uses duplicate code to open/inspect/read/write from/to the result files. Ideally this should be refactored into some common set of functions.

Proposal

Implement a "driver" for each format (so just NumPy for now) that contains the logic for inspecting, reading, and writing each format. Eventually this will necessitate updating the configuration files to specify which result storage driver should be used.

The currently implemented iter_records, for example, would still iterate over result records, but would do so in a way that makes sense for the format. For the current NumPy saved files, we'd yield one row worth of records at a time. If we used something that stores results in blocks, maybe it would be chunks of data regardless of the row:

driver = drivers.register(result_format)
for rec in driver.iter_records(config):
    # do stuff

We usually want to perform a query on the records based on the segment dates, so there could be some higher level API access that would perform a query optimized for the format (NumPy files would just use simple np.where against them, but we could use in-kernel searches if using pytables):

driver = drivers.register(result_format)
for matching_rec in driver.query_records(config, start='2000-01-01', end='2001-01-01'):
    # do more stuff
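To make the proposed interface more concrete, here is a self-contained sketch of what a driver and a small registry might look like. Everything in it is hypothetical: `InMemoryDriver`, `DRIVERS`, and `get_driver` are illustration names, records are plain dictionaries rather than NumPy structured arrays, and "matching" is taken to mean that a segment overlaps the query window.

```python
from datetime import date


class InMemoryDriver(object):
    """Hypothetical driver; a real one would wrap the saved result files."""

    def __init__(self, rows):
        # rows: list of rows, each a list of dicts with 'start'/'end' dates
        self.rows = rows

    def iter_records(self):
        """Yield one row worth of records at a time."""
        for row in self.rows:
            yield row

    def query_records(self, start, end):
        """Yield records whose segment overlaps the [start, end] window."""
        for row in self.iter_records():
            for rec in row:
                if rec["start"] <= end and rec["end"] >= start:
                    yield rec


# Registry mapping a result format name to its driver class.
DRIVERS = {"memory": InMemoryDriver}


def get_driver(result_format, *args):
    """Look up a driver class by format name and instantiate it."""
    return DRIVERS[result_format](*args)


rows = [
    [{"px": 0, "start": date(1999, 1, 1), "end": date(2000, 6, 1)}],
    [{"px": 1, "start": date(2002, 1, 1), "end": date(2005, 1, 1)}],
]
driver = get_driver("memory", rows)
matches = list(driver.query_records(date(2000, 1, 1), date(2001, 1, 1)))
print(matches)
```

A format with native indexing could override `query_records` with its own optimized search while callers keep the same loop.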

Justification

If we refactor out all of the result IO from the CLI scripts, we'll make testing much easier and probably reduce the overall amount of code. Refactoring out just the NumPy format probably won't take too much time and would set us up to easily transition to a better file format.

backward_incompatible YEP: YATSM Enhancement Proposal

opened by ceholden 0

Add multiple classification results to saved files

Feature request: it would be very useful to be able to save the results of different classifications in the same result files for each of the scenes. For example, after modifying the training data iteratively and running three subsequent classifications, each of the resulting classes (and the probability, if enabled) would be appended to the result file.

enhancement

opened by parevalo 0

Releases(v0.6.2)

v0.6.2(Jan 18, 2017)
v0.6.2

Bug fix release v0.6.2 from v0.6.x-maintenance branch.

Fixed

Fix missing predictions on pixel plotter (commit 9c07f6cbe436bc5063d930b9e9139036a437a94b)

Fix issue with synthetic image predictions in yatsm map (commit c33ea1c5fbbe835c4bacd6ecea334886442d1af3)

Clean up output result files of any Python objects (commit e2e61542689ff7626681c4dfff8da511eab46127)

With @valpasq, add "seasonal" symbology option to yatsm pixel (commit e594ecdb52a54b1664c5d062b362e0f05ac7bc23)

CCDCesque: Fixed case in which a model refit would be attempted despite n < p (commit 5c27bad3f394e35166ae94e3663692ecd7bcfe43)

v0.6.1(May 12, 2016)
Bug fix release and beginning of v0.6.x maintenance branch.

Milestone v0.6.1

v0.6.1 - 2016-05-12

Version v0.6.x will be backward patched for any bug fixes (for an undetermined amount of time) as version v0.7.0 will introduce backwards incompatible changes in order to enable incorporation of data from multiple sensors and to better link time series models together in a cohesive pipeline.

Fixed

CCDCesque: Fixed case in which bands not used as "test indices" would not have time series models estimated (i.e., no coef or rmse) if the time series ends immediately after training #88

RLM: Fixed divide by zero error when n == p (number of observations equals number of parameters estimated) #88

v0.6.0(Apr 22, 2016)
v0.6.0 - 2016-04-22

Milestone v0.6.0

Changed

CCDCesque: Optimize algorithm implementation. Performance estimates show 2x speed gain #70

CLI: Improve yatsm pixel by enabling the plotting of multiple refit model estimates on the same graph (commit)

CLI: Improve yatsm pixel --embed option (commit)

CLI: Add --verbose-yatsm to main yatsm command so it works with all programs running a YATSM algorithm (commit)

Use setuptools entry points to point YATSM to available time series algorithms (commit)

Added

Expose stay_regularized for segment refitting steps #74

Add capability to specify fit section for statistical estimators that are passed to the fit method of the estimator #61

CCDCesque: allow specification of min_rmse per band using an array or just one value for all bands #75

Add submodule yatsm.regression.diagnostics for regression diagnostics, including RMSE (commit)

Add new module yatsm.accel with decorator (try_jit) that applies numba.jit to functions only if numba is available #70

Apply yatsm.accel.try_jit to calculation of yatsm.regression.diagnostics.rmse, yatsm.regression.robust_fit.RLM, and others #70

Benchmark algorithm performance across project history using Airspeed Velocity #71

Improve clean target in package's setup.py so it deletes built estimator pickles and .c/.so built with Cython (commit)

Increase test coverage from ~20% to ~80%

Added documentation to Read the Docs
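The optional-acceleration pattern behind `try_jit` in the list above can be sketched as follows. This is an illustration of the general idea rather than the code in `yatsm.accel`; the decorator-factory form and the `rmse` helper operating on tuples are assumptions.

```python
import math

try:
    import numba
    HAS_NUMBA = True
except ImportError:
    HAS_NUMBA = False


def try_jit(*args, **kwargs):
    """Apply numba.jit when numba is importable, else return func unchanged."""
    def decorator(func):
        if HAS_NUMBA:
            return numba.jit(*args, **kwargs)(func)
        return func
    return decorator


@try_jit(nopython=True)
def rmse(y, yhat):
    # Explicit loop: compiles under numba and still runs as plain Python.
    total = 0.0
    for i in range(len(y)):
        total += (y[i] - yhat[i]) ** 2
    return math.sqrt(total / len(y))


print(rmse((1.0, 2.0, 3.0), (1.0, 2.0, 5.0)))
```

Because the fallback returns the original function, the package keeps working, just more slowly, on systems without numba.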

Fixed

CCDCesque: Fix bug in calculation of end attribute for last timeseries record #72

CCDCesque: Fix bug in parsing of test_indices if user doesn't supply any #73

"Packaged" estimator pickles are built on installation of YATSM so they will work with user versions of libraries (commit)

Fix DeprecationWarnings with scikit-learn>=0.17.0 (commit)

yatsm.regression.robust_fit.RLM: Fix a bug caused by dividing by zero. This bug only occurs when the number of observations in a time series segment is approximately equal to the number of parameters (n ~= k) #86

Fix NumPy deprecation warnings and improve yatsm changemap num performance #83

v0.5.5(Nov 25, 2015)
v0.5.5 - 2015-11-24

Milestone v0.5.5

Added

Abort if config file 'n_bands' looks incorrect (commit)

Changed

Reorganize long term mean phenology code into generic phenology related submodule.

Reorganize changemap and map logic to separate module #60

Fixed

Fix bug with spline EVI prediction in LTM phenology module when data include last day in leap year (366) #56

Fix bug with phenology half-max calculation that created erroneous transition dates #58

Fix bug with phenology calculation for 100% masked data pixels #54

Fix yatsm pixel to correctly plot designs that include categorical variables (commit)

Fix passing of a list of dataset min/max values within config files instead of 1 number #59

Add missing phenology module to setup.py (commit)

v0.5.4(Oct 28, 2015)
Milestone v0.5.4

Fixed

Fix multiple bugs encountered when running phenology estimates #49

Changed

Metadata from yatsm line runs are now stored in metadata sub-file of NumPy compressed saved files #53

Algorithm configurations must now declare subsections that match estimator methods (e.g., init and fit) #52

Refactored yatsm.phenology to make LongTermMeanPhenology estimator follow scikit-learn API #50

Added

Add --num_threads option to yatsm CLI. This argument sets various environment variables (e.g., OPENBLAS_NUM_THREADS or MKL_NUM_THREADS) before beginning computation to set or limit multithreaded linear algebra calculations within NumPy #51

Add a changelog!
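The effect of `--num_threads` described above can be approximated with a few lines of standard-library code. The helper name `set_num_threads` is hypothetical, and only the two environment variables named in the changelog entry are set. BLAS libraries typically read these variables when they are loaded, so the call would need to happen before NumPy is imported.

```python
import os

# Variables named in the changelog entry; other BLAS builds may use others.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def set_num_threads(num_threads, environ=os.environ):
    """Set thread-limit variables; intended to run before importing NumPy."""
    for name in THREAD_ENV_VARS:
        environ[name] = str(num_threads)
    return {name: environ[name] for name in THREAD_ENV_VARS}


# Use a plain dict here so the example does not modify the real environment.
env = {}
print(set_num_threads(1, environ=env))
```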

v0.5.3(Oct 16, 2015)
Bug fixes from v0.5.0 and onward:

Changes: v0.5.3

Fix bug when running on real datasets with 100% missing data in timeseries (e.g., in corners) #47 #48

Fix yatsm train and yatsm classify for v0.5.0+ releases

Update config file parsing for sklearn classifiers to use YAML. Delete the intermediate 'helper' classes that were used to type-check INI config files

v0.5.2:

Catch TSLengthException so yatsm line can continue running #43

Allow refit methods to be from distributed pickles #44

Fix references to old variable names in yatsm.algorithms.postprocess #45

v0.5.1:

Use environment variables in configuration files #42

Pre-package a set of pickled regressions using package_data from setuptools #41

Breaks:

Need to update classifier YAML configuration files. See example RandomForest configuration file here
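
As a rough illustration of the "environment variables in configuration files" item from v0.5.1 above, the sketch below expands `$VAR` and `${VAR}` references in every string value of an already-parsed configuration. It is an assumption about how such a feature could work, not YATSM's implementation: the function name `expand_env`, the `YATSM_DATA_ROOT` variable, and the `input_file` key are all hypothetical.

```python
import os


def expand_env(config):
    """Recursively expand environment variables in all string values.

    `config` is assumed to be the nested dict/list structure produced
    by a YAML parser; non-string leaves are returned unchanged.
    """
    if isinstance(config, dict):
        return {key: expand_env(value) for key, value in config.items()}
    if isinstance(config, list):
        return [expand_env(item) for item in config]
    if isinstance(config, str):
        return os.path.expandvars(config)
    return config


# Hypothetical variable and keys, for illustration only
os.environ["YATSM_DATA_ROOT"] = "/data/p013r030"
raw = {"dataset": {"input_file": "$YATSM_DATA_ROOT/images.csv", "n_bands": 8}}
expanded = expand_env(raw)
```

Note that `os.path.expandvars` leaves references to undefined variables untouched rather than raising an error, so a stricter tool might want to check for leftover `$` characters.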

v0.5.0(Sep 14, 2015)
Very backwards incompatible release required to redefine project objectives and use better technology (click & YAML) for command line interface. See milestone v0.5.0 for individual tickets.

Highlights include:

CLI conversion to use click #28

All sub-commands listed with yatsm command for better visibility

Redefine YATSM as baseclass & add CCDCesque implementation #29

Specify prediction algorithm using pickles to set hyperparameters #26

Configuration file using YAML for easier organization & more sustainable parsing #30

Refactored robust fit into more generalized refit step. User can refit using specified prediction algorithms #33

Addressed requirement file organization and documentation, including adding conda install instructions #32

Tests now use py.test fixtures for better code reuse; a decrease in test coverage is an unfortunate side effect

Inevitable bugs will be fixed in v0.5.1.
v0.4.1(Aug 9, 2015)
CCDC style model now includes a "slope test" for stability of training period (see #22).

Enable the "slope test" by adding the following to your model configuration file:

    [YATSM]
    ...
    slope_test = True

A True boolean value will enable the "slope test" as shown in Zhu and Woodcock, 2014. Specifying a float value instead will enable the "slope test" but use the specified float value as the test threshold instead of the default value specified by threshold.
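
The boolean-or-float behavior of slope_test can be sketched as below. This is a minimal illustration under stated assumptions: the helper `parse_slope_test` is hypothetical, the accepted boolean spellings are a guess, and how YATSM itself parses the option is not shown in these notes. Only the `[YATSM]` section and the `slope_test` and `threshold` option names come from the text above.

```python
import configparser


def parse_slope_test(raw):
    """Hypothetical parser: return (enabled, threshold_override).

    A boolean-like string toggles the test; any other value is read as a
    float that both enables the test and overrides the default threshold.
    """
    text = raw.strip().lower()
    if text in ("true", "yes", "on"):
        return True, None
    if text in ("false", "no", "off"):
        return False, None
    return True, float(text)


parser = configparser.ConfigParser()
parser.read_string("[YATSM]\nthreshold = 3.0\nslope_test = 1.5\n")

enabled, override = parse_slope_test(parser["YATSM"]["slope_test"])
# Fall back to the model's `threshold` when no float was given
slope_threshold = (
    override if override is not None else parser["YATSM"].getfloat("threshold")
)
```

With `slope_test = True` in the same file, the override would be `None` and the value of `threshold` would be used instead.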
v0.4.0(Apr 23, 2015)
Model specification improvement and dataset caching improvements. Tasks include:

Updating cache files with new data

"cache_yatsm.py" cache dataset updater

Model design matrix specification using Patsy syntax

QA/QC for yatsm_map.py

The move to Patsy-style model specification (see #25) makes previous results incompatible with this release.
v0.3.1(Mar 19, 2015)
Bug fixes and features not included in v0.3.0. Includes:

Faster image IO by keeping file references open until all reading concludes 1e30d277a561cd048e745a397752aa52e8686857

Bug fix for clobbering training data ROI mask values f882a00c2552855599020c3e9e34fd71b22bc0b6

v0.3.0(Mar 13, 2015)
Incremental update if not interested in phenology.

Removes the statsmodels-based calculation of robust linear models for multi-temporal cloud masking in favor of a custom, less flexible version. Speeds up calculation by 3-4x.

Implement basic elements of Eli Melaas' phenology algorithm for calculation of long term mean start of spring, peak EVI, end of growing season, and length of growing season.

v0.2.0(Feb 17, 2015)
Algorithm is stable and utility scripts are in place.

New features to algorithm:

Delete observations during monitoring if they're likely to be noise (remove_noise)

Dynamic RMSE calculation (dynamic_rmse)

Implemented commission (false positive) test based on the Chow Test (commission_alpha)

Implement a clone of statsmodels.robust.robust_linear_model.RLM with fewer bells and whistles to make it 3-4x faster
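
For readers unfamiliar with the Chow test mentioned in the commission test item above, the following is a minimal, dependency-free sketch of its F statistic for a simple linear model (intercept plus slope, so k = 2). It is not YATSM's implementation, which may use a different design matrix and decision rule; the function names and example data are hypothetical. The statistic compares one pooled fit against separate fits on either side of a candidate break.

```python
def ols_rss(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def chow_f(x1, y1, x2, y2, k=2):
    """Chow test F statistic for two segments with k parameters each."""
    rss_pooled = ols_rss(x1 + x2, y1 + y2)
    rss_split = ols_rss(x1, y1) + ols_rss(x2, y2)
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)


# Made-up example data: the second segment either continues the trend
# or jumps upward by about five units.
x1, y1 = [0, 1, 2, 3, 4], [0.1, 0.9, 2.1, 2.9, 4.0]
x2 = [5, 6, 7, 8, 9]
f_same = chow_f(x1, y1, x2, [5.1, 5.9, 7.0, 8.1, 8.9])
f_break = chow_f(x1, y1, x2, [10.1, 10.9, 12.0, 13.1, 13.9])
```

Under the null hypothesis of no structural change, the statistic follows an F distribution with k and n1 + n2 - 2k degrees of freedom. A detected break whose statistic is not significant at the chosen alpha would be a candidate commission error (false positive); the exact rule tied to commission_alpha is an assumption here.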

v0.1.0(Nov 21, 2014)

Algorithm is stable and command line utilities are in place for running models and then visualizing results. Achieves cherry-picked feature parity with CCDC, with some work done to extend it

Owner

Chris Holden

Geospatial data scientist with PhD in time series analysis of remote sensing data. he/him

GitHub Repository https://yatsm.readthedocs.org/en/latest/

Stitch image tiles into larger composite TIFs

untiler Utility to take a directory of {z}/{x}/{y}.(jpg|png) tiles, and stitch into a scenetiff (tif w/ exact merc tile bounds). Future versions will

38 Dec 16, 2022

Pandas Network Analysis: fast accessibility metrics and shortest paths, using contraction hierarchies :world_map:

Pandana Pandana is a Python library for network analysis that uses contraction hierarchies to calculate super-fast travel accessibility metrics and sh

321 Jan 05, 2023

Asynchronous Client for the worlds fastest in-memory geo-database Tile38

This is an asynchronous Python client for Tile38 that allows for fast and easy interaction with the world's fastest in-memory geodatabase Tile38.

53 Dec 29, 2022

A Python framework for building geospatial web-applications

Hey there, this is Greppo... A Python framework for building geospatial web-applications. Greppo is an open-source Python framework that makes it easy

304 Dec 27, 2022

Geospatial Image Processing for Python

GIPPY Gippy is a Python library for image processing of geospatial raster data. The core of the library is implemented as a C++ library, libgip, with

83 Aug 19, 2022

Track International space station with python

NASA-ISS-tracker Track International space station with python Modules import json import turtle import urllib.request import time import webbrowser i

8 Aug 12, 2021

A package built to support working with spatial data using open source python

EarthPy EarthPy makes it easier to plot and manipulate spatial data in Python. Why EarthPy? Python is a generic programming language designed to suppo

414 Dec 23, 2022

Evaluation of file formats in the context of geo-referenced 3D geometries.

Geo-referenced Geometry File Formats Classic geometry file formats as .obj, .off, .ply, .stl or .dae do not support the utilization of coordinate syst

11 Mar 02, 2022

A Python interface between Earth Engine and xarray

eexarray A Python interface between Earth Engine and xarray Description eexarray was built to make processing gridded, mesoscale time series data quic

159 Dec 23, 2022

Python Data. Leaflet.js Maps.

folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

6k Jan 02, 2023

pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef enu eci

Python 3-D coordinate conversions Pure Python (no prerequistes beyond Python itself) 3-D geographic coordinate conversions and geodesy. API similar to

292 Dec 29, 2022

Get Landsat surface reflectance time-series from google earth engine

geextract Google Earth Engine data extraction tool. Quickly obtain Landsat multispectral time-series for exploratory analysis and algorithm testing On

50 Dec 15, 2022

Open GeoJSON data on geojson.io

geojsonio.py Open GeoJSON data on geojson.io from Python. geojsonio.py also contains a command line utility that is a Python port of geojsonio-cli. Us

114 Dec 21, 2022

Code and coordinates for Matt's 2021 xmas tree

xmastree2021 Code and coordinates for Matt's 2021 xmas tree This repository contains the code and coordinates used for Matt's 2021 Christmas tree, as

117 Jan 01, 2023

Using SQLAlchemy with spatial databases

GeoAlchemy GIS Support for SQLAlchemy. Introduction GeoAlchemy is an extension of SQLAlchemy. It provides support for Geospatial data types at the ORM

109 Dec 01, 2022

Using Global fishing watch's data to build a machine learning model that can identify illegal fishing and poaching activities through satellite and geo-location data.

3 May 06, 2022
