A GitHub action that suggests type annotations for Python using machine learning.

Overview

Typilus: Suggest Python Type Annotations

A GitHub action that suggests type annotations for Python using machine learning.

This action makes suggestions within each pull request as suggested edits. You can then directly apply these suggestions to your code or ignore them.

Sample Suggestion Sample Suggestion

What are Python type annotations? Introduced in Python 3.5, type hints (more traditionally called type annotations) allow users to annotate their code with the expected types. These annotations are optionally checked by external tools, such as mypy and pyright, to prevent type errors; they also facilitate code comprehension and navigation. The typing module provides the core types.

Why use machine learning? Given the dynamic nature of Python, type inference is challenging, especially over partial contexts. To tackle this challenge, we use a graph neural network model that predicts types by probabilistically reasoning over a program’s structure, names, and patterns. This allows us to make suggestions with only a partial context, at the cost of suggesting some false positives.

Install Action in your Repository

To use the GitHub action, create a workflow file. For example,

name: Typilus Type Annotation Suggestions

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
  pull_request:
    branches: [ master ]

jobs:
  suggest:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so that typilus can access it.
    - uses: actions/[email protected]
    - uses: typilus/[email protected]
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        MODEL_PATH: path/to/model.pkl.gz   # Optional: provide the path of a custom model instead of the pre-trained model.
        SUGGESTION_CONFIDENCE_THRESHOLD: 0.8   # Configure this to limit the confidence of suggestions on un-annotated locations. A float in [0, 1]. Default 0.8
        DISAGREEMENT_CONFIDENCE_THRESHOLD: 0.95  # Configure this to limit the confidence of suggestions on annotated locations.  A float in [0, 1]. Default 0.95

The action uses the GITHUB_TOKEN to retrieve the diff of the pull request and to post comments on the analyzed pull request.

Technical Details & Internals

This GitHub action is a reimplementation of the Graph2Class model of Allamanis et al. PLDI 2020 using the ptgnn library. Internally, it uses a Graph Neural Network to predict likely type annotations for Python code.

This action uses a pre-trained neural network that has been trained on a corpus of open-source repositories that use Python's type annotations. At this point we do not support online adaptation of the model to each project.

Training your own model

You may wish to train your own model and use it in this action. To do so, please follow the steps in ptgnn. Then provide a path to the model in your GitHub action configuration, through the MODEL_PATH environment variable.

Contributing

We welcome external contributions and ideas. Please look at the issues in the repository for ideas and improvements.

You might also like...
 30 Days Of Machine Learning Using Pytorch
30 Days Of Machine Learning Using Pytorch

Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

customer churn prediction prevention in telecom industry using machine learning and survival analysis

Telco Customer Churn Prediction - Plotly Dash Application Description This dash application allows you to predict telco customer churn using machine l

using Machine Learning Algorithm to classification AppleStore application

AppleStore-classification-with-Machine-learning-Algo- using Machine Learning Algorithm to classification AppleStore application. the first step : 1: p

CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning applications.

SmartSim Example Zoo This repository contains CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning appl

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.
Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us

A machine learning web application for binary classification using streamlit
A machine learning web application for binary classification using streamlit

Machine Learning web App This is a machine learning web application for binary classification using streamlit options this application contains 3 clas

Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning

Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning My

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

Comments
  • IndexError: list index out of range

    IndexError: list index out of range

    Diff GET Status Code:  200
    Traceback (most recent call last):
      File "/usr/src/entrypoint.py", line 81, in <module>
        changed_files = get_changed_files(diff_rq.text)
      File "/usr/src/changeutils.py", line 38, in get_changed_files
        assert file_diff_lines[3].startswith("---")
    IndexError: list index out of range
    

    logs_302.zip

    opened by ZdenekM 1
  • Several small fixes

    Several small fixes

    Here are couple of things I noticed trying Typilus inference using GH Action:

    • gracefully handle patches that include a file renames (\wo any content modifications) by skipping such files
    • extractor stats reporting only processed files
    opened by bzz 0
  • Create a ptgnn-based Typilus model

    Create a ptgnn-based Typilus model

    Create and use the full Typilus model instead of graph2class.

    • [ ] Implement it in ptgnn
    • [ ] Use action cache to store intermediate result
    • [ ] Auto-update type space "once in a while"
    enhancement 
    opened by mallamanis 0
Releases(v0.9)
Python package for machine learning for healthcare using a OMOP common data model

This library was developed in order to facilitate rapid prototyping in Python of predictive machine-learning models using longitudinal medical data from an OMOP CDM-standard database.

Sontag Lab 75 Jan 03, 2023
Bodywork deploys machine learning projects developed in Python, to Kubernetes.

Bodywork deploys machine learning projects developed in Python, to Kubernetes. It helps you to: serve models as microservices execute batch jobs run r

Bodywork Machine Learning 409 Jan 01, 2023
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

pmdarima Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time se

alkaline-ml 1.3k Dec 22, 2022
Bayesian optimization in JAX

Bayesian optimization in JAX

Predictive Intelligence Lab 26 May 11, 2022
Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational model)

Sum-Square_Error-Business-Analytical-Tool- Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational m

om Podey 1 Dec 03, 2021
Implementations of Machine Learning models, Regularizers, Optimizers and different Cost functions.

Linear Models Implementations of LinearRegression, LassoRegression and RidgeRegression with appropriate Regularizers and Optimizers. Linear Regression

Keivan Ipchi Hagh 1 Nov 22, 2021
K-Means clusternig example with Python and Scikit-learn

Unsupervised-Machine-Learning Flat Clustering K-Means clusternig example with Python and Scikit-learn Flat clustering Clustering algorithms group a se

Emin 1 Dec 13, 2021
Customers Segmentation with RFM Scores and K-means

Customer Segmentation with RFM Scores and K-means RFM Segmentation table: K-Means Clustering: Business Problem Rule-based customer segmentation machin

5 Aug 10, 2022
Python Automated Machine Learning library for tabular data.

Simple but powerful Automated Machine Learning library for tabular data. It uses efficient in-memory SAP HANA algorithms to automate routine Data Scie

Daniel Khromov 47 Dec 17, 2022
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear

23.3k Dec 31, 2022
Reproducibility and Replicability of Web Measurement Studies

Reproducibility and Replicability of Web Measurement Studies This repository holds additional material to the paper "Reproducibility and Replicability

6 Dec 31, 2022
We have a dataset of user performances. The project is to develop a machine learning model that will predict the salaries of baseball players.

Salary-Prediction-with-Machine-Learning 1. Business Problem Can a machine learning project be implemented to estimate the salaries of baseball players

Ayşe Nur Türkaslan 9 Oct 14, 2022
Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.

Time series analysis today is an important cornerstone of quantitative science in many disciplines, including natural and life sciences as well as eco

Christoph Mark 129 Dec 24, 2022
MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine Learning work with thousands of other users.

The collaboration platform for Machine Learning MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine

MLReef 1.4k Dec 27, 2022
Mesh TensorFlow: Model Parallelism Made Easier

Mesh TensorFlow - Model Parallelism Made Easier Introduction Mesh TensorFlow (mtf) is a language for distributed deep learning, capable of specifying

1.3k Dec 26, 2022
TensorFlow implementation of an arbitrary order Factorization Machine

This is a TensorFlow implementation of an arbitrary order (=2) Factorization Machine based on paper Factorization Machines with libFM. It supports: d

Mikhail Trofimov 785 Dec 21, 2022
Official code for HH-VAEM

HH-VAEM This repository contains the official Pytorch implementation of the Hierarchical Hamiltonian VAE for Mixed-type Data (HH-VAEM) model and the s

Ignacio Peis 8 Nov 30, 2022
A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Demand-Forecasting Business Problem A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Ayşe Nur Türkaslan 3 Mar 06, 2022
Uber Open Source 1.6k Dec 31, 2022
Stats, linear algebra and einops for xarray

xarray-einstats Stats, linear algebra and einops for xarray ⚠️ Caution: This project is still in a very early development stage Installation To instal

ArviZ 30 Dec 28, 2022