A proof-of-concept jupyter extension which converts english queries into relevant python code

Overview

Text2Code for Jupyter notebook

A proof-of-concept jupyter extension which converts english queries into relevant python code.

Blog post with more details:

Data analysis made easy: Text2Code for Jupyter notebook

Demo Video:

Text2Code for Jupyter notebook

Supported Operating Systems:

  • Ubuntu
  • macOS

Installation

NOTE: We have renamed the plugin from mopp to jupyter-text2code. Uninstall mopp before installing new jupyter-text2code version.

pip uninstall mopp

CPU-only install:

For Mac and other Ubuntu installations not having a nvidia GPU, we need to explicitly set an environment variable at time of install.

export JUPYTER_TEXT2CODE_MODE="cpu"

GPU install dependencies:

sudo apt-get install libopenblas-dev libomp-dev

Installation commands:

git clone https://github.com/deepklarity/jupyter-text2code.git
cd jupyter-text2code
pip install .
jupyter nbextension enable jupyter-text2code/main

Uninstallation:

pip uninstall jupyter-text2code

Usage Instructions:

  • Start Jupyter notebook server by running the following command: jupyter notebook
  • If you don't see Nbextensions tab in Jupyter notebook run the following command:jupyter contrib nbextension install --user
  • You can open the sample notebooks/ctds.ipynb notebook for testing
  • If installation happened successfully, then for the first time, Universal Sentence Encoder model will be downloaded from tensorflow_hub
  • Click on the Terminal Icon which appears on the menu (to activate the extension)
  • Type "help" to see a list of currently supported commands in the repo
  • Watch Demo video for some examples

Docker containers for jupyter-text2code

We have published CPU and GPU images to docker hub with all dependencies pre-installed.

Visit https://hub.docker.com/r/deepklarity/jupyter-text2code/ to download the images and usage instructions.
CPU image size: 1.51 GB
GPU image size: 2.56 GB

Model training:

Generate training data:

From a list of templates present at jupyter_text2code/jupyter_text2code_serverextension/data/ner_templates.csv, generate training data by running the following command:

cd scripts && python generate_training_data.py

This command will generate data for intent matching and NER(Named Entity Recognition).

Create intent index faiss

Use the generated data to create a intent-matcher using faiss.

cd scripts && python create_intent_index.py

Train NER model

cd scripts && python train_spacy_ner.py

Steps to add more intents:

  • Add more templates in ner_templates with a new intent_id
  • Generate training data. Modify generate_training_data.py if different generation techniques are needed or if introducing a new entity.
  • Train intent index
  • Train NER model
  • modify jupyter_text2code/jupyter_text2code_serverextension/__init__.py with new intent's condition and add actual code for the intent
  • Reinstall plugin by running: pip install .

TODO:

  • Publish Docker image
  • Refactor code and make it mode modular, remove duplicate code, etc
  • Add support for Windows
  • Add support for more commands
  • Improve intent detection and NER
  • Explore sentence Paraphrasing to generate higher-quality training data
  • Gather real-world variable names, library names as opposed to randomly generating them
  • Try NER with a transformer-based model
  • With enough data, train a language model to directly do English->code like GPT-3 does, instead of having separate stages in the pipeline
  • Create a survey to collect linguistic data
  • Add Speech2Code support

Authored By:

Owner
DeepKlarity
DeepKlarity
Manage your XYZ Hub or HERE Data Hub spaces from Python.

XYZ Spaces for Python Manage your XYZ Hub or HERE Data Hub spaces and Interactive Map Layer from Python. FEATURED IN: Online Python Machine Learning C

HERE Technologies 30 Oct 18, 2022
A library to access OpenStreetMap related services

OSMPythonTools The python package OSMPythonTools provides easy access to OpenStreetMap (OSM) related services, among them an Overpass endpoint, Nomina

Franz-Benjamin Mocnik 342 Dec 31, 2022
๐ŸŒ Local tile server for viewing geospatial raster files with ipyleaflet or folium

๐ŸŒ Local Tile Server for Geospatial Rasters Need to visualize a rather large (gigabytes) raster you have locally? This is for you. A Flask application

Bane Sullivan 192 Jan 04, 2023
gjf: A tool for fixing invalid GeoJSON objects

gjf: A tool for fixing invalid GeoJSON objects The goal of this tool is to make it as easy as possible to fix invalid GeoJSON objects through Python o

Yazeed Almuqwishi 91 Dec 06, 2022
The geospatial toolkit for redistricting data.

maup maup is the geospatial toolkit for redistricting data. The package streamlines the basic workflows that arise when working with blocks, precincts

Metric Geometry and Gerrymandering Group 60 Dec 05, 2022
๐ŸŒ Local tile server for viewing geospatial raster files with ipyleaflet

๐ŸŒ Local Tile Server for Geospatial Rasters Need to visualize a rather large raster (gigabytes) you have locally? This is for you. A Flask application

Bane Sullivan 192 Jan 04, 2023
Construct and use map tile grids in different projection.

Morecantile +-------------+-------------+ ymax | | | | x: 0 | x: 1 | | y: 0 | y: 0

Development Seed 67 Dec 23, 2022
List of Land Cover datasets in the GEE Catalog

List of Land Cover datasets in the GEE Catalog A list of all the Land Cover (or discrete) datasets in Google Earth Engine. Values, Colors and Descript

David Montero Loaiza 5 Aug 24, 2022
python toolbox for visualizing geographical data and making maps

geoplotlib is a python toolbox for visualizing geographical data and making maps data = read_csv('data/bus.csv') geoplotlib.dot(data) geoplotlib.show(

Andrea Cuttone 976 Dec 11, 2022
This is the antenna performance plotted from tinyGS reception data.

tinyGS-antenna-map This is the antenna performance plotted from tinyGS reception data. See their repository. The code produces a plot that provides Az

Martin J. Levy 14 Aug 21, 2022
peartree: A library for converting transit data into a directed graph for sketch network analysis.

peartree ๐Ÿ ๐ŸŒณ peartree is a library for converting GTFS feed schedules into a representative directed network graph. The tool uses Partridge to conve

Kuan Butts 183 Dec 29, 2022
A proof-of-concept jupyter extension which converts english queries into relevant python code

Text2Code for Jupyter notebook A proof-of-concept jupyter extension which converts english queries into relevant python code. Blog post with more deta

DeepKlarity 2.1k Dec 29, 2022
leafmap - A Python package for geospatial analysis and interactive mapping in a Jupyter environment.

A Python package for geospatial analysis and interactive mapping with minimal coding in a Jupyter environment

Qiusheng Wu 1.4k Jan 02, 2023
Python Data. Leaflet.js Maps.

folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

6k Jan 02, 2023
Asynchronous Client for the worlds fastest in-memory geo-database Tile38

This is an asynchonous Python client for Tile38 that allows for fast and easy interaction with the worlds fastest in-memory geodatabase Tile38.

Ben 53 Dec 29, 2022
ArcGIS Python Toolbox for WhiteboxTools

WhiteboxTools-ArcGIS ArcGIS Python Toolbox for WhiteboxTools. This repository is related to the ArcGIS Python Toolbox for WhiteboxTools, which is an A

Qiusheng Wu 190 Dec 30, 2022
a Geolocator made in python

Geolocator A Geolocator made in python โœจ Features locates ur location using ur ip thats it! ๐Ÿ’โ€โ™€๏ธ How to use first download the locator.py file instal

Portgas D Ace 1 Oct 27, 2021
LicenseLocation - License Location With Python

LicenseLocation Hi,everyone! โค ๐Ÿงก ๐Ÿ’› ๐Ÿ’š ๐Ÿ’™ ๐Ÿ’œ This is my first project! โœ” Actual

The Bin 1 Jan 25, 2022
A multi-page streamlit app for the geospatial community.

A multi-page streamlit app for the geospatial community.

Qiusheng Wu 522 Dec 30, 2022
Earthengine-py-notebooks - A collection of 360+ Jupyter Python notebook examples for using Google Earth Engine with interactive mapping

earthengine-py-notebooks A collection of 360+ Jupyter Python notebook examples for using Google Earth Engine with interactive mapping Contact: Qiushen

Qiusheng Wu 1.1k Dec 29, 2022