Find big moving stocks before they move using machine learning and anomaly detection

Overview

Surpriver - Find High Moving Stocks before they Move

Find high moving stocks before they move using anomaly detection and machine learning. Surpriver uses machine learning to examine volume and price action together and to infer unusual patterns that can precede big moves in stocks.

Files Description

Path Description
surpriver Main folder.
└  dictionaries Folder to save data dictionaries for later use.
└  figures Figures for this GitHub repository.
└  stocks List of all the stocks that you want to analyze.
data_loader.py Module for loading data from Yahoo Finance.
detection_engine.py Main module for running anomaly detection on data and finding stocks with most unusual price and volume patterns.
feature_generator.py Generates price and volume return features as well as plenty of technical indicators.

Usage

Packages

You will need to install the packages listed in requirements.txt to train and test the models. You can install all of them using the following command. Please note that the script was written for Python 3.

pip install -r requirements.txt

Running with Docker

You can also run the tool with Docker if you are familiar with it. Here are the steps.

  • First you must build the container: docker build . -t surpriver
  • Then you need to copy the contents of docker-compose.yml.template to a new file called docker-compose.yml
  • Replace <C:\\path\\to\\this\\dir> with the directory you are working in.
  • Run the container by executing docker-compose up -d
  • Execute any of the commands below by prepending docker exec -it surpriver to your command line.

Predictions for Today

If you want to go ahead and directly get the most anomalous stocks for today, you can simply run the following command to get the stocks with the most unusual patterns. We will dive deeper into the command in the following sections.

Get Most Anomalous Stocks for Today

Use this command when you are running the tool for the first time and do not yet have a saved data dictionary.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 1 --is_test 0 --future_bars 0

This command will give you the top 25 stocks that had the highest anomaly score in the last 14 bars of 60 minute candles. It will also store all the data that it used to make predictions in the file dictionaries/data_dict.npy. Below is a more detailed explanation of each parameter.

  • top_n: The total number of most anomalous stocks you want to see.
  • min_volume: Filter for volume. Any stock that has an average of volume lower than this value will be ignored.
  • data_granularity_minutes: Data granularity to use for analysis, given in minutes. The available options are 1, 5, 15, 30, and 60.
  • history_to_use: Historical bars to use to analyze the unusual and anomalous patterns.
  • is_save_dictionary: Whether to save the stock data that is used for analysis in a dictionary or not. Enabling this would save you time if you want to do some further analysis on the data.
  • data_dictionary_path: Dictionary path where data would be stored.
  • is_load_from_dictionary: Whether to load the data from dictionary or download it from yahoo finance directly. You can use the dictionary you saved above here for multiple runs.
  • is_test: You can actually test the predictions by leaving some of the recent data as future data and analyzing whether the most anomalous stocks moved the most after their predictions. If this value is 1, the value of future_bars should be greater than 5.
  • future_bars: This number of bars will be held out from the most recent history for testing purposes.
  • output_format: The format for results. If you pass CLI, the results will be printed to the console. If you pass JSON, a JSON file will be created with results for today's date. The default is CLI.

When you have the data dictionary saved, you can just run the following command.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 1 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 0 --is_test 0 --future_bars 0 --output_format 'CLI'

Notice the change in is_save_dictionary and is_load_from_dictionary.
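
The switch between the two commands can be automated. The sketch below is a hypothetical wrapper and not part of Surpriver: build_command is an invented helper that only assembles the flags documented above and picks the load/save values depending on whether the dictionary file already exists.

```python
import os


def build_command(dictionary_path="dictionaries/data_dict.npy",
                  top_n=25, min_volume=5000, granularity_minutes=60,
                  history_to_use=14, is_test=0, future_bars=0):
    """Assemble a detection_engine.py call (hypothetical helper)."""
    # Reuse the saved dictionary when it exists; otherwise download and save it.
    have_dictionary = os.path.exists(dictionary_path)
    return [
        "python", "detection_engine.py",
        "--top_n", str(top_n),
        "--min_volume", str(min_volume),
        "--data_granularity_minutes", str(granularity_minutes),
        "--history_to_use", str(history_to_use),
        "--is_load_from_dictionary", "1" if have_dictionary else "0",
        "--data_dictionary_path", dictionary_path,
        "--is_save_dictionary", "0" if have_dictionary else "1",
        "--is_test", str(is_test),
        "--future_bars", str(future_bars),
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(...) to execute it.
    print(" ".join(build_command()))
```

On a first run this prints the command with is_load_from_dictionary 0 and is_save_dictionary 1; once the dictionary file exists, the two values flip.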

Here is what a single prediction looks like. Please note that negative scores indicate more anomalous and unusual patterns, while positive scores indicate normal patterns. The lower the score, the more anomalous the stock.

Last Bar Time: 2020-08-25 11:30:00-04:00
Symbol: SPI
Anomaly Score: -0.029
Today Volume (Today = Date Above): 313.94K
Average Volume 5d: 206.53K
Average Volume 20d: 334.14K
Volatility 5bars: 0.013
Volatility 20bars: 0.038
Future Absolute Sum Price Changes: 72.87
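
To make the score convention concrete, here is a minimal sketch of how top_n and min_volume could be applied to a set of scores: drop symbols whose average volume is below the threshold, then sort by ascending score. The symbols and numbers are invented for illustration, and rank_anomalies is a hypothetical helper, not the tool's internal code.

```python
def rank_anomalies(scores, average_volumes, top_n, min_volume):
    """Return the top_n most anomalous (symbol, score) pairs.

    Lower scores are more anomalous, so the list is sorted in ascending order.
    """
    eligible = [
        (symbol, score)
        for symbol, score in scores.items()
        if average_volumes[symbol] >= min_volume  # ignore thinly traded symbols
    ]
    return sorted(eligible, key=lambda pair: pair[1])[:top_n]


# Made-up example data (not real results).
scores = {"AAA": -0.029, "BBB": 0.041, "CCC": -0.105, "DDD": -0.200}
average_volumes = {"AAA": 206_530, "BBB": 90_000, "CCC": 12_000, "DDD": 3_000}

top = rank_anomalies(scores, average_volumes, top_n=2, min_volume=5000)
print(top)  # [('CCC', -0.105), ('AAA', -0.029)]
```

DDD has the lowest score but is filtered out because its average volume is below min_volume.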

Test on Historical Data

If you are suspicious of the use of Machine Learning and Artificial Intelligence in trading, you can test the predictions from this tool on historical data. The two most important command line arguments for testing are is_test and future_bars. If the former is set to 1 and the latter is set to anything more than 5, the tool will hold out that many of the most recent bars and use only the data prior to them to make its anomaly predictions. Next, it will look at the held-out data to see how well the predictions did. The scatter plot referred to below was generated with the following command.

Find Anomalous Stocks and Test them on Historical Data

python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 1 --is_test 1 --future_bars 25

If you have already generated the data dictionary, you can use the following command where we set is_load_from_dictionary to 1 and is_save_dictionary to 0.

python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 1 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 0 --is_test 1 --future_bars 25
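
The hold-out logic behind is_test and future_bars can be sketched as follows. This is an illustration under assumptions, not the tool's code: split_history_and_future and future_absolute_change are hypothetical helpers, the closing prices are made up, and the metric is assumed to be the sum of absolute bar-to-bar percentage changes, which this README does not define.

```python
def split_history_and_future(bars, history_to_use, future_bars):
    """Hold out the last future_bars bars; keep the history_to_use bars before them."""
    if future_bars <= 0:
        return bars[-history_to_use:], []
    if future_bars >= len(bars):
        raise ValueError("future_bars must be smaller than the number of bars")
    cutoff = len(bars) - future_bars
    return bars[max(0, cutoff - history_to_use):cutoff], bars[cutoff:]


def future_absolute_change(last_known_price, future_prices):
    """Sum of absolute percentage changes over the held-out bars (assumed metric)."""
    total = 0.0
    previous = last_known_price
    for price in future_prices:
        total += abs(price - previous) / previous * 100.0
        previous = price
    return total


closes = [10.0, 10.0, 10.0, 10.0, 11.0, 9.9]  # made-up closing prices
history, future = split_history_and_future(closes, history_to_use=3, future_bars=2)
print(history)  # [10.0, 10.0, 10.0]
print(future)   # [11.0, 9.9]
print(round(future_absolute_change(history[-1], future), 2))  # 20.0
```

The detector would only ever see history; future is used afterwards to measure how much the stock actually moved.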

In the scatter plot produced by the test commands above, the anomalous stocks (score < 0) usually have a higher absolute change in the future on average. This suggests that the tool flags stocks that move more than average over the next few hours or days. One question arises here: what if the tool is just picking the highest volatility stocks, since those would naturally yield a high future absolute change? To check that this is not the case, here is a more detailed description of the stats you get from the above command.

--> Future Performance
Correlation between future absolute change vs anomalous score (lower is better, range = (-1, 1)): **-0.23**
Total absolute change in future for Anomalous Stocks: **89.660**
Total absolute change in future for Normal Stocks: **43.000**
Average future volatility of Anomalous Stocks: **0.332**
Average future volatility of Normal Stocks: **0.585**
Historical volatility for Anomalous Stocks: **2.528**
Historical volatility for Normal Stocks: **2.076**

You can see that the historical volatility of normal and anomalous stocks is not that different. However, the total absolute future change of anomalous stocks is roughly double that of normal stocks.
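
Statistics of this kind can be computed from per-stock results in a few lines. The sketch below assumes each stock yields an anomaly score and a future absolute change, computes their Pearson correlation, and totals the change separately for anomalous (score < 0) and normal stocks. The input numbers are invented and pearson is a hypothetical helper; the tool's own calculation may differ.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up per-stock results (not real output of the tool).
scores = [-0.2, -0.1, 0.1, 0.2]
future_changes = [8.0, 6.0, 4.0, 2.0]

corr = pearson(scores, future_changes)
anomalous_total = sum(c for s, c in zip(scores, future_changes) if s < 0)
normal_total = sum(c for s, c in zip(scores, future_changes) if s >= 0)

print(round(corr, 2))                 # -0.99
print(anomalous_total, normal_total)  # 14.0 6.0
```

A negative correlation means that lower (more anomalous) scores go together with larger future moves, which is why the stats above describe lower as better.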

Support for Crypto Currencies

You can now specify which data source you would like to use along with which stocks list you would like to use.

python detection_engine.py --top_n 25 --min_volume 500 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/feature_dict.npy' --is_save_dictionary 1 --is_test 0 --future_bars 0 --data_source binance --stock_list cryptos.txt

  • data_source: Specifies where to get data from. The currently supported options are binance and yahoo_finance (default).
  • stock_list: Which file in the stocks directory contains the list of tickers to analyze. Default is stocks.txt.

Results

We will try to post the top 25 results for a single set of parameters every week.

August 31, 2020 to September 05, 2020: https://pastebin.com/L5T2BYUx

Limitations

The tool only finds stocks that show unusual behavior in their price and volume action combined. It does not predict which direction the stock is going to move. That might be a feature that I'll implement in the future, but for now you'll need to look at the charts and do your own due diligence to figure that out.

License

License: GPL v3

A product by Tradytics

Copyright (c) 2020-present, Tradytics.com
