A proof-of-concept jupyter extension which converts english queries into relevant python code

Overview

Text2Code for Jupyter notebook

A proof-of-concept jupyter extension which converts english queries into relevant python code.

Blog post with more details:

Data analysis made easy: Text2Code for Jupyter notebook

Demo Video:

Text2Code for Jupyter notebook

Supported Operating Systems:

  • Ubuntu
  • macOS

Installation

NOTE: We have renamed the plugin from mopp to jupyter-text2code. Uninstall mopp before installing new jupyter-text2code version.

pip uninstall mopp

CPU-only install:

For Mac and other Ubuntu installations not having a nvidia GPU, we need to explicitly set an environment variable at time of install.

export JUPYTER_TEXT2CODE_MODE="cpu"

GPU install dependencies:

sudo apt-get install libopenblas-dev libomp-dev

Installation commands:

git clone https://github.com/deepklarity/jupyter-text2code.git
cd jupyter-text2code
pip install .
jupyter nbextension enable jupyter-text2code/main

Uninstallation:

pip uninstall jupyter-text2code

Usage Instructions:

  • Start Jupyter notebook server by running the following command: jupyter notebook
  • If you don't see Nbextensions tab in Jupyter notebook run the following command:jupyter contrib nbextension install --user
  • You can open the sample notebooks/ctds.ipynb notebook for testing
  • If installation happened successfully, then for the first time, Universal Sentence Encoder model will be downloaded from tensorflow_hub
  • Click on the Terminal Icon which appears on the menu (to activate the extension)
  • Type "help" to see a list of currently supported commands in the repo
  • Watch Demo video for some examples

Docker containers for jupyter-text2code

We have published CPU and GPU images to docker hub with all dependencies pre-installed.

Visit https://hub.docker.com/r/deepklarity/jupyter-text2code/ to download the images and usage instructions.
CPU image size: 1.51 GB
GPU image size: 2.56 GB

Model training:

Generate training data:

From a list of templates present at jupyter_text2code/jupyter_text2code_serverextension/data/ner_templates.csv, generate training data by running the following command:

cd scripts && python generate_training_data.py

This command will generate data for intent matching and NER(Named Entity Recognition).

Create intent index faiss

Use the generated data to create a intent-matcher using faiss.

cd scripts && python create_intent_index.py

Train NER model

cd scripts && python train_spacy_ner.py

Steps to add more intents:

  • Add more templates in ner_templates with a new intent_id
  • Generate training data. Modify generate_training_data.py if different generation techniques are needed or if introducing a new entity.
  • Train intent index
  • Train NER model
  • modify jupyter_text2code/jupyter_text2code_serverextension/__init__.py with new intent's condition and add actual code for the intent
  • Reinstall plugin by running: pip install .

TODO:

  • Publish Docker image
  • Refactor code and make it mode modular, remove duplicate code, etc
  • Add support for Windows
  • Add support for more commands
  • Improve intent detection and NER
  • Explore sentence Paraphrasing to generate higher-quality training data
  • Gather real-world variable names, library names as opposed to randomly generating them
  • Try NER with a transformer-based model
  • With enough data, train a language model to directly do English->code like GPT-3 does, instead of having separate stages in the pipeline
  • Create a survey to collect linguistic data
  • Add Speech2Code support

Authored By:

Owner
DeepKlarity
DeepKlarity
EOReader is a multi-satellite reader allowing you to open optical and SAR data.

Remote-sensing opensource python library reading optical and SAR sensors, loading and stacking bands, clouds, DEM and index.

ICube-SERTIT 152 Dec 30, 2022
iNaturalist observations along hiking trails

This tool reads the route of a hike and generates a table of iNaturalist observations along the trails. It also shows the observations and the route of the hike on a map. Moreover, it saves waypoints

7 Nov 11, 2022
geemap - A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and ipywidgets.

A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and folium

Qiusheng Wu 2.4k Dec 30, 2022
Implemented a Google Maps prototype that provides the shortest route in terms of distance

Implemented a Google Maps prototype that provides the shortest route in terms of distance, the fastest route, the route with the fewest turns, and a scenic route that avoids roads when provided a sou

1 Dec 26, 2021
Implementation of Trajectory classes and functions built on top of GeoPandas

MovingPandas MovingPandas implements a Trajectory class and corresponding methods based on GeoPandas. Visit movingpandas.org for details! You can run

Anita Graser 897 Jan 01, 2023
FDTD simulator that generates s-parameters from OFF geometry files using a GPU

Emport Overview This repo provides a FDTD (Finite Differences Time Domain) simulator called emport for solving RF circuits. Emport outputs its simulat

4 Dec 15, 2022
Rasterio reads and writes geospatial raster datasets

Rasterio Rasterio reads and writes geospatial raster data. Geographic information systems use GeoTIFF and other formats to organize and store gridded,

Mapbox 1.9k Jan 07, 2023
peartree: A library for converting transit data into a directed graph for sketch network analysis.

peartree 🍐 🌳 peartree is a library for converting GTFS feed schedules into a representative directed network graph. The tool uses Partridge to conve

Kuan Butts 183 Dec 29, 2022
geobeam - adds GIS capabilities to your Apache Beam and Dataflow pipelines.

geobeam adds GIS capabilities to your Apache Beam pipelines. What does geobeam do? geobeam enables you to ingest and analyze massive amounts of geospa

Google Cloud Platform 61 Nov 08, 2022
Wraps GEOS geometry functions in numpy ufuncs.

PyGEOS PyGEOS is a C/Python library with vectorized geometry functions. The geometry operations are done in the open-source geometry library GEOS. PyG

362 Dec 23, 2022
Platform for building statistical models of cities and regions

UrbanSim UrbanSim is a platform for building statistical models of cities and regions. These models help forecast long-range patterns in real estate d

Urban Data Science Toolkit 419 Dec 30, 2022
This GUI app was created to show the detailed information about the weather in any city selected by user

WeatherApp Content Brief description Tools Features Hotkeys How it works Screenshots Ways to improve the project Installation Brief description This G

TheBugYouCantFix 5 Dec 30, 2022
QLUSTER is a relative orbit design tool for formation flying satellite missions and space rendezvous scenarios

QLUSTER is a relative orbit design tool for formation flying satellite missions and space rendezvous scenarios, that I wrote in Python 3 for my own research and visualisation. It is currently unfinis

Samuel Low 9 Aug 23, 2022
Tool to display your current position and angle above your radar

🛠 Tool to display your current position and angle above your radar. As a response to the CS:GO Update on 1.2.2022, which makes cl_showpos a cheat-pro

Miko 6 Jan 04, 2023
Fiona reads and writes geographic data files

Fiona Fiona reads and writes geographic data files and thereby helps Python programmers integrate geographic information systems with other computer s

987 Jan 04, 2023
prettymaps - A minimal Python library to draw customized maps from OpenStreetMap data.

A small set of Python functions to draw pretty maps from OpenStreetMap data. Based on osmnx, matplotlib and shapely libraries.

Marcelo de Oliveira Rosa Prates 9k Jan 08, 2023
Build, deploy and extract satellite public constellations with one command line.

SatExtractor Build, deploy and extract satellite public constellations with one command line. Table of Contents About The Project Getting Started Stru

Frontier Development Lab 70 Nov 18, 2022
Python bindings to libpostal for fast international address parsing/normalization

pypostal These are the official Python bindings to https://github.com/openvenues/libpostal, a fast statistical parser/normalizer for street addresses

openvenues 651 Dec 16, 2022
A package built to support working with spatial data using open source python

EarthPy EarthPy makes it easier to plot and manipulate spatial data in Python. Why EarthPy? Python is a generic programming language designed to suppo

Earth Lab 414 Dec 23, 2022
Obtain a GNSS position fix from an 11-millisecond raw GNSS signal snapshot

Obtain a GNSS position fix from an 11-millisecond raw GNSS signal snapshot without any prior knowledge about the position of the receiver and only coarse knowledge about the time.

Jonas Beuchert 2 Nov 17, 2022