A proof-of-concept jupyter extension which converts english queries into relevant python code

Overview

Text2Code for Jupyter notebook

A proof-of-concept jupyter extension which converts english queries into relevant python code.

Blog post with more details:

Data analysis made easy: Text2Code for Jupyter notebook

Demo Video:

Text2Code for Jupyter notebook

Supported Operating Systems:

  • Ubuntu
  • macOS

Installation

NOTE: We have renamed the plugin from mopp to jupyter-text2code. Uninstall mopp before installing new jupyter-text2code version.

pip uninstall mopp

CPU-only install:

For Mac and other Ubuntu installations not having a nvidia GPU, we need to explicitly set an environment variable at time of install.

export JUPYTER_TEXT2CODE_MODE="cpu"

GPU install dependencies:

sudo apt-get install libopenblas-dev libomp-dev

Installation commands:

git clone https://github.com/deepklarity/jupyter-text2code.git
cd jupyter-text2code
pip install .
jupyter nbextension enable jupyter-text2code/main

Uninstallation:

pip uninstall jupyter-text2code

Usage Instructions:

  • Start Jupyter notebook server by running the following command: jupyter notebook
  • If you don't see Nbextensions tab in Jupyter notebook run the following command:jupyter contrib nbextension install --user
  • You can open the sample notebooks/ctds.ipynb notebook for testing
  • If installation happened successfully, then for the first time, Universal Sentence Encoder model will be downloaded from tensorflow_hub
  • Click on the Terminal Icon which appears on the menu (to activate the extension)
  • Type "help" to see a list of currently supported commands in the repo
  • Watch Demo video for some examples

Docker containers for jupyter-text2code

We have published CPU and GPU images to docker hub with all dependencies pre-installed.

Visit https://hub.docker.com/r/deepklarity/jupyter-text2code/ to download the images and usage instructions.
CPU image size: 1.51 GB
GPU image size: 2.56 GB

Model training:

Generate training data:

From a list of templates present at jupyter_text2code/jupyter_text2code_serverextension/data/ner_templates.csv, generate training data by running the following command:

cd scripts && python generate_training_data.py

This command will generate data for intent matching and NER(Named Entity Recognition).

Create intent index faiss

Use the generated data to create a intent-matcher using faiss.

cd scripts && python create_intent_index.py

Train NER model

cd scripts && python train_spacy_ner.py

Steps to add more intents:

  • Add more templates in ner_templates with a new intent_id
  • Generate training data. Modify generate_training_data.py if different generation techniques are needed or if introducing a new entity.
  • Train intent index
  • Train NER model
  • modify jupyter_text2code/jupyter_text2code_serverextension/__init__.py with new intent's condition and add actual code for the intent
  • Reinstall plugin by running: pip install .

TODO:

  • Publish Docker image
  • Refactor code and make it mode modular, remove duplicate code, etc
  • Add support for Windows
  • Add support for more commands
  • Improve intent detection and NER
  • Explore sentence Paraphrasing to generate higher-quality training data
  • Gather real-world variable names, library names as opposed to randomly generating them
  • Try NER with a transformer-based model
  • With enough data, train a language model to directly do English->code like GPT-3 does, instead of having separate stages in the pipeline
  • Create a survey to collect linguistic data
  • Add Speech2Code support

Authored By:

Owner
DeepKlarity
DeepKlarity
Search and download Copernicus Sentinel satellite images

sentinelsat Sentinelsat makes searching, downloading and retrieving the metadata of Sentinel satellite images from the Copernicus Open Access Hub easy

837 Dec 28, 2022
Specification for storing geospatial vector data (point, line, polygon) in Parquet

GeoParquet About This repository defines how to store geospatial vector data (point, lines, polygons) in Apache Parquet, a popular columnar storage fo

Open Geospatial Consortium 449 Dec 27, 2022
Client library for interfacing with USGS datasets

USGS API USGS is a python module for interfacing with the US Geological Survey's API. It provides submodules to interact with various endpoints, and c

Amit Kapadia 104 Dec 30, 2022
Code and coordinates for Matt's 2021 xmas tree

xmastree2021 Code and coordinates for Matt's 2021 xmas tree This repository contains the code and coordinates used for Matt's 2021 Christmas tree, as

Stand-up Maths 117 Jan 01, 2023
Geospatial web application developed uisng earthengine, geemap, and streamlit.

geospatial-streamlit Geospatial web applications developed uisng earthengine, geemap, and streamlit. App 1 - Land Surface Temperature A simple, code-f

13 Nov 27, 2022
Tool to display your current position and angle above your radar

🛠 Tool to display your current position and angle above your radar. As a response to the CS:GO Update on 1.2.2022, which makes cl_showpos a cheat-pro

Miko 6 Jan 04, 2023
Implementation of Trajectory classes and functions built on top of GeoPandas

MovingPandas MovingPandas implements a Trajectory class and corresponding methods based on GeoPandas. Visit movingpandas.org for details! You can run

Anita Graser 897 Jan 01, 2023
A library to access OpenStreetMap related services

OSMPythonTools The python package OSMPythonTools provides easy access to OpenStreetMap (OSM) related services, among them an Overpass endpoint, Nomina

Franz-Benjamin Mocnik 342 Dec 31, 2022
Blender addons to make the bridge between Blender and geographic data

Blender GIS Blender minimal version : 2.8 Mac users warning : currently the addon does not work on Mac with Blender 2.80 to 2.82. Please do not report

5.9k Jan 02, 2023
Constraint-based geometry sketcher for blender

Geometry Sketcher Constraint-based sketcher addon for Blender that allows to create precise 2d shapes by defining a set of geometric constraints like

1.7k Jan 02, 2023
Starlite-tile38 - Showcase using Tile38 via pyle38 in a Starlite application

Starlite-Tile38 Showcase using Tile38 via pyle38 in a Starlite application. Repo

Ben 8 Aug 07, 2022
iNaturalist observations along hiking trails

This tool reads the route of a hike and generates a table of iNaturalist observations along the trails. It also shows the observations and the route of the hike on a map. Moreover, it saves waypoints

7 Nov 11, 2022
User friendly Rasterio plugin to read raster datasets.

rio-tiler User friendly Rasterio plugin to read raster datasets. Documentation: https://cogeotiff.github.io/rio-tiler/ Source Code: https://github.com

372 Dec 23, 2022
Manipulation and analysis of geometric objects

Shapely Manipulation and analysis of geometric objects in the Cartesian plane. Shapely is a BSD-licensed Python package for manipulation and analysis

3.1k Jan 03, 2023
Imports VZD (Latvian State Land Service) open data into postgis enabled database

Python script main.py downloads and imports Latvian addresses into PostgreSQL database. Data contains parishes, counties, cities, towns, and streets.

Kaspars Foigts 7 Oct 26, 2022
A utility to search, download and process Landsat 8 satellite imagery

Landsat-util Landsat-util is a command line utility that makes it easy to search, download, and process Landsat imagery. Docs For full documentation v

Development Seed 681 Dec 07, 2022
PySAL: Python Spatial Analysis Library Meta-Package

Python Spatial Analysis Library PySAL, the Python spatial analysis library, is an open source cross-platform library for geospatial data science with

Python Spatial Analysis Library 1.1k Dec 18, 2022
peartree: A library for converting transit data into a directed graph for sketch network analysis.

peartree 🍐 🌳 peartree is a library for converting GTFS feed schedules into a representative directed network graph. The tool uses Partridge to conve

Kuan Butts 183 Dec 29, 2022
r.cfdtools 7 Dec 28, 2022
Software for Advanced Spatial Econometrics

GeoDaSpace Software for Advanced Spatial Econometrics GeoDaSpace current version 1.0 (32-bit) Development environment: Mac OSX 10.5.x (32-bit) wxPytho

GeoDa Center 38 Jan 03, 2023