Think Big, Teach Small: Do Language Models Distil Occam’s Razor?

Last update: Dec 07, 2021

Overview

Software related to the paper "Think Big, Teach Small: Do Language Models Distil Occam’s Razor?"

Authors: Gonzalo Jaimovitch-López, David Castellano-Falcón, Cèsar Ferri, José Hernández-Orallo

Experiments

GPT-2

The entire experiment runs in a single notebook.

Open the notebook and run the code sections in order. A file with the experiment results is provided, and the results are printed in the corresponding section.

GPT-3

Several notebooks post-process the outputs returned by GPT-3 in the experiment.

You can find two folders: main (for the experiments presented in the main paper) and additional (for the experiments included in the supplementary material).

Using GPT-3 requires an API key, which cannot be provided with the code. However, the prompts used in the experiment are included in the repository.

To run the prompt queries in GPT-3, visit OpenAI's API webpage. Make sure you adjust the temperature to match the experiment you want to test. Note also that results obtained through the API webpage and through the Python environment may differ because of differences in encoding.
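As a minimal sketch of keeping the sampling settings explicit for each experiment, the helper below only assembles the request parameters and never contacts the API. The function name `build_completion_request`, the `max_tokens` default, and the example prompt are assumptions made for illustration and are not taken from the repository. The parameter names `engine`, `prompt`, `temperature`, and `max_tokens` follow the legacy OpenAI completions interface, so check them against the current API documentation before use.

```python
from typing import Any, Dict

# Model names taken from the MODEL cell of the notebooks.
MODELS = ("ada", "babbage", "curie", "davinci")


def build_completion_request(prompt: str, model: str, temperature: float,
                             max_tokens: int = 16) -> Dict[str, Any]:
    """Assemble (but do not send) the parameters for one completion query.

    Hypothetical helper: max_tokens=16 is an arbitrary illustrative default.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {
        "engine": model,
        "prompt": prompt,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


# Illustrative prompt only; the real prompts ship with the repository.
request = build_completion_request("Input: aab\nOutput:", "davinci", temperature=0.0)
print(request)
```

Building the parameters in one place makes it easier to confirm that the temperature matches the experiment before any query is sent.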

Main experiments

Run the cells in order. Use the following cell at the top of each notebook to choose the model whose results you want to obtain.

#Choose between {'ada', 'babbage', 'curie', 'davinci'}
MODEL = 'davinci'

Additional experiments

Alternative alphabet (Apple, Banana)
Separator between characters in input / output
Concepts with loops
Many more concepts / Not using machine teaching

Run the cells in order. Use the following cell at the top of each notebook to choose the experiment whose results you want to obtain.

# Selects complete_EXPERIMENT.csv, where EXPERIMENT is one of {'ada', 'babbage', 'curie', 'davinci', 'EXP_A', 'EXP_B'}
EXPERIMENT = 'ada'
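The selection above can be sketched as a small validation step that turns the chosen value into a results file name. This is only an illustration under assumptions: the helper `results_path` and its `folder` parameter are hypothetical, and the notebooks may locate the file differently. The `complete_EXPERIMENT.csv` naming pattern and the list of valid values come from the comment in the cell.

```python
from pathlib import Path

# Valid values as listed in the notebook cell.
EXPERIMENTS = ("ada", "babbage", "curie", "davinci", "EXP_A", "EXP_B")


def results_path(experiment: str, folder: str = ".") -> Path:
    """Return the path of complete_<experiment>.csv inside `folder`.

    Hypothetical helper: `folder` is an assumption, not a repository setting.
    """
    if experiment not in EXPERIMENTS:
        raise ValueError(
            f"unknown experiment {experiment!r}; expected one of {EXPERIMENTS}"
        )
    return Path(folder) / f"complete_{experiment}.csv"


EXPERIMENT = "ada"
print(results_path(EXPERIMENT))
```

Rejecting an unknown value early gives a clearer error than a missing-file failure later in the notebook.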

Baselines

MagicHaskeller

MagicHaskeller must be installed beforehand.

To run the experiment, execute the Python script. The returned functions are written to the file given by the path set in the script.

From each list of returned functions (you can find the outputs in this folder), we take the first function at the top of the list and use it as the solution, evaluating the test examples in Haskell. A summary of the results can be found in MHResults.txt.
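The "take the first function" step can be sketched as follows. This is a minimal illustration, assuming the output file holds one candidate per line; the real MagicHaskeller output format may differ, and both the helper name `first_candidate` and the placeholder lines are hypothetical.

```python
from typing import Optional


def first_candidate(output_text: str) -> Optional[str]:
    """Return the first non-empty line of a candidate list, or None if empty."""
    for line in output_text.splitlines():
        candidate = line.strip()
        if candidate:
            return candidate
    return None


# Placeholder text standing in for a file of returned functions.
sample_output = "\n".join([
    "",
    "candidate_function_1",
    "candidate_function_2",
])
print(first_candidate(sample_output))
```

The selected candidate would then be evaluated on the test examples in Haskell, as described above.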

Louise

Louise must be installed beforehand.

First, run Louise and execute the dedicated script, inserting the examples for each concept where indicated (you can find them in pos_neg_ex.txt).

The test examples are then evaluated in the notebook, using the predicates returned by the system.

Humans

We provide a PDF of the questionnaire completed by the human participants in this experiment. The headings mark the start of each screen presented to the participants, since the screen breaks are not clearly reflected in the PDF version of the form. They can be seen by opening the HTML file stored in the source code folder.

Additional Material

A Python script is provided to test how P3 works.

Finally, the R scripts used to generate the plots in the paper are included.
