Creating a statistical model to predict 10 year treasury yields

Overview

Predicting 10-Year Treasury Yields

Intitially, I wanted to see if the volatility in the stock market, represented by the VIX index (data source), had a tangible impact on 10-Year Treasury yields (data source). Below are the results of my exploration of the VIX's effect on 10Y yields:

Line Graph Comparing VIX Price and Yield over the last 31 years

VIX and Yield TS

As can be seen in the above graph, there doesn't seem to be much correlation off the bat, simply looking at their annual trends. Overall, yields seem to have dropped quite dramatically over the last 31 years, with not much reaction to major changes in volatility. Meanwhile, VIX has had a more dramatic journey, with plenty of large ups and downs. Although it doesn't seem like much of a correlation from this view, it would be more beneficial to look at a scatter plot and create a regression line to be sure.

VIX vs. Yield Scatter Plot

VIX vs. Yield

The red line in the scatter plot is the regression line obtained. The regression line seems to be slanted downward, indicating a negative effect. This means that when the volatility in the stock market goes up, 10Y Treasury yields go down. The regression equation: 10-Year Treasury Yield = 4.71 + -0.02(VIX Price) indicates that an increase of $1 US in the VIX price would cause the yield to go down by 0.02 percentage points. Since the VIX price will never be $0, it does not make sense to interpret the y-intercept of 4.71. Thus, based on this scatter plot, and the fact that there is a slope to regression line, there may be a significant impact on yield by the price of VIX. However, to check if it is statistically significant, the t-statistic is needed.

Stata Analysis

Thus, I decided to run some statistical analysis in stata, contained here. The first regression I ran was between VIX Price and 10Y yields to see if there was any statistically significant effect of stock volatility on yields. When checking for statistical significance in the 5% size, the t-statistic of the coefficient must be either above 1.96 or below -1.96 to be considered significant. In this case, the t-statistic was -1.46, which meant that the stock volatility was not statistically significant.

...Not so fast. One issue with trying to simplify trends in this way is that omitted variables could play a big part in the statistical significance of present variables. Thus, I decided to use 4 more key macroeconomical datasets: unemployment rate, interest rate, change in CPI, and inflationary expectations. With these 4 key parts of the economy accounted for, I ran another regression, including all of the variables against the yield.

The new data was quite interesting. I had expected the change in CPI and inflationary expectations to be really important factors, but it turns out they are statistically insignificant. The t-statistic for change in CPI was 0.12 and for inflationary expectations was -1.71, short of the 1.96 and -1.96 thresholds required respectively. On the other hand, the t-statistic for the VIX Price dropped to -3.49, meaning that some of the variables that were added to the model were in fact invisibly impacting the effects of the volatility. The unemployment rate and interest rate were both statistically significant, with t-statistics of 10.99 and 37.20 respectively. Overall, 80.19% of the variation in the 10-Year Treasury yield could be explained by my model.

Interest Rate vs. 10-Year Treasury Yield Graph

ir vs. yield

Having seen the graph of a statistically insignificant variable (pre-multiple regression), I wanted to plot a scatter plot of an extremely significant variable to see the contrast. It is clear that there is a clear positive relationship between interest rate and the 10-Year Treasury yield. The regression line: 10-Year Treasury Yield = 2.31 + 0.73(Interest Rate) indicates that an increase in interest rate of 1 percentage point leads to a 0.73 percentage point increase in the yield. It is possible for rates to come down to 0, so the y-intercept indicates that the 10Y Treasury Note yields 2.31% when the interest rate hits 0. The constrast between the two red regression lines, as well as the distribution of the dots shown in the two scatter plots is quite clear, indicating how statistically significant the two variables are comparitavely.

Project instructions

10Y Treasury data citation:

OECD, "Main Economic Indicators - complete database", Main Economic Indicators (database),http://dx.doi.org/10.1787/data-00052-en (October 23, 2021) Copyright, 2016, OECD. Reprinted with permission.

Change in CPI data citation:

OECD, "Main Economic Indicators - complete database", Main Economic Indicators (database),http://dx.doi.org/10.1787/data-00052-en (October 23, 2021) Copyright, 2016, OECD. Reprinted with permission.

Inflation Expectation data citation:

Surveys of Consumers, University of Michigan, University of Michigan: Inflation Expectation© [MICH], retrieved from FRED, Federal Reserve Bank of St. Louis https://fred.stlouisfed.org/series/MICH/, (October 23, 2021)

Office365 (Microsoft365) audit log analysis tool

Office365 (Microsoft365) audit log analysis tool The header describes it all WHY?? The first line of code was written long time before other colleague

Anatoly 1 Jul 27, 2022
Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions.

About Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions. The tool provides rich data and a summary g

9 Nov 16, 2022
PrimaryBid - Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift

Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift This project is composed of two parts: Part1 and Part2

Emmanuel Boateng Sifah 1 Jan 19, 2022
A powerful data analysis package based on mathematical step functions. Strongly aligned with pandas.

The leading use-case for the staircase package is for the creation and analysis of step functions. Pretty exciting huh. But don't hit the close button

48 Dec 21, 2022
Data pipelines built with polars

valves Warning: the project is very much work in progress. Valves is a collection of functions for your data .pipe()-lines. This project aimes to host

14 Jan 03, 2023
This is a tool for speculation of ancestral allel, calculation of sfs and drawing its bar plot.

superSFS This is a tool for speculation of ancestral allel, calculation of sfs and drawing its bar plot. It is easy-to-use and runing fast. What you s

3 Dec 16, 2022
vartests is a Python library to perform some statistic tests to evaluate Value at Risk (VaR) Models

gg I wasn't satisfied with any of the other available Gemini clients, so I wrote my own. Requires Python 3.9 (maybe older, I haven't checked) and opti

RAFAEL RODRIGUES 5 Jan 03, 2023
Program that predicts the NBA mvp based on data from previous years.

NBA MVP Predictor A machine learning model using RandomForest Regression that predicts NBA MVP's using player data. Explore the docs » View Demo · Rep

Muhammad Rabee 1 Jan 21, 2022
Stitch together Nanopore tiled amplicon data without polishing a reference

Stitch together Nanopore tiled amplicon data using a reference guided approach Tiled amplicon data, like those produced from primers designed with pri

Amanda Warr 14 Aug 30, 2022
This program analyzes a DNA sequence and outputs snippets of DNA that are likely to be protein-coding genes.

This program analyzes a DNA sequence and outputs snippets of DNA that are likely to be protein-coding genes.

1 Dec 28, 2021
Package for decomposing EMG signals into motor unit firings, as used in Formento et al 2021.

EMGDecomp Package for decomposing EMG signals into motor unit firings, created for Formento et al 2021. Based heavily on Negro et al, 2016. Supports G

13 Nov 01, 2022
Processo de ETL (extração, transformação, carregamento) realizado pela equipe no projeto final do curso da Soul Code Academy.

Processo de ETL (extração, transformação, carregamento) realizado pela equipe no projeto final do curso da Soul Code Academy.

Débora Mendes de Azevedo 1 Feb 03, 2022
A model checker for verifying properties in epistemic models

Epistemic Model Checker This is a model checker for verifying properties in epistemic models. The goal of the model checker is to check for Pluralisti

Thomas Träff 2 Dec 22, 2021
Common bioinformatics database construction

biodb Common bioinformatics database construction 1.taxonomy (Substance classification database) Download the database wget -c https://ftp.ncbi.nlm.ni

sy520 2 Jan 04, 2022
Used for data processing in machine learning, and help us to construct ML model more easily from scratch

Used for data processing in machine learning, and help us to construct ML model more easily from scratch. Can be used in linear model, logistic regression model, and decision tree.

ShawnWang 0 Jul 05, 2022
Single-Cell Analysis in Python. Scales to >1M cells.

Scanpy – Single-Cell Analysis in Python Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It inc

Theis Lab 1.4k Jan 05, 2023
Python script for transferring data between three drives in two separate stages

Waterlock Waterlock is a Python script meant for incrementally transferring data between three folder locations in two separate stages. It performs ha

David Swanlund 13 Nov 10, 2021
BasstatPL is a package for performing different tabulations and calculations for descriptive statistics.

BasstatPL is a package for performing different tabulations and calculations for descriptive statistics. It provides: Frequency table constr

Angel Chavez 1 Oct 31, 2021
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

Blue Collar Bioinformatics 917 Jan 03, 2023