An Ensemble of CNN (Python 3.5.1 Tensorflow 1.3 numpy 1.13)

Overview

An Ensemble of CNN

Machine Learning project 2017

Read me

NOTE: All commands should be run inside the tensorflow environment

Dependencies

Python 3.5.1 Tensorflow 1.3 numpy 1.13

Although the model might work with previous version of above libraries, these version are what were used in development. Link to dataset: https://www.kaggle.com/xainano/handwrittenmathsymbols

Preparation

Do the following steps only if the folders MathExprJpeg/train_data and MathExprJpeg/train_data and files x_data.npy, y_data.npy and labels.npy are none existent, otherwise go directly to Running Main section.

In order to run the model the training and testing folders must be created. If test_data folder and train_data folder does not exist in the MathExprJpeg folder, these must be created. Create them by running the script create_train_test_data.py like this:

python create_train_test_data.py

Then the datafiles for the training data must be created in order to speed up training. Called x_data.npy, y_data.npy and labels.npy. If non existent create by running the script create_datafiles.py like this:

python create_datafiles.py
Running main

After the preparation steps are done, set the preferred modes in cnn_math_main.py file. This is done by changing the parameters at the top of the file:

# set if data shold be read in advance
fileread = True
ensemble_mode = False 

fileread = True means that the data will be read from the previously created x_data.npy and y_data.npy files. Setting this to True is highly recommended. The ensemble_mode = False means that the model will not be run in ensemble mode. This is recommended as the ensemble mode is performance heavy and can not be guaranteed to work in the latest releases.

It is recommended to change the name of the logging file for each run:

writer = tf.summary.FileWriter('./logs/cnn_math_logs_true_2ep_r1')
writer.add_graph(sess.graph)

Also set the preferred value to the epoch and batch_size:

training_epochs = 40
batch_size = 20

When all of the above has been done, the model can be run with the command:

python cnn_math_main.py

The model has been known to sometimes get errors while reading files. The source of which is unknown. If such an error is to occur, run the following commands:

rm -r MatchExprJpeg/train_data
rm -r MatchExprJpeg/test_data
rm x_data.npy
rm y_data.npy
rm labels.npy

python create_train_test_data.py
python create_datafiles.py

And then try to run the cnn_math_main.py again. start tensorboard with:

tensorboard --logdir=./logs

to see the graphs for the scalars, the image being processed etc.

The model is implemented in the cnn_model.py class. If this file is tempered with it is possible that the model breaks.

Happy predicting

MachineLearningProjectCNN

PIKA: a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi

PIKA: a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi PIKA is a lightweight speech processing toolkit based on Pytorch and (Py)

336 Nov 25, 2022
Package for working with hypernetworks in PyTorch.

Package for working with hypernetworks in PyTorch.

Christian Henning 71 Jan 05, 2023
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a

ImagePy 1.2k Dec 29, 2022
Caffe implementation for Hu et al. Segmentation for Natural Language Expressions

Segmentation from Natural Language Expressions This repository contains the Caffe reimplementation of the following paper: R. Hu, M. Rohrbach, T. Darr

10 Jul 27, 2021
This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool

OpenSurfaces Segmentation UI This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool.

Sean Bell 66 Jul 11, 2022
Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Şebnem 6 Jan 18, 2022
The official code for paper "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling".

R2D2 This is the official code for paper titled "R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Mode

Alipay 49 Dec 17, 2022
Pytorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"

About this repository This repo contains an Pytorch implementation for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Netwo

wxDai 7 Oct 14, 2022
Attentional Focus Modulates Automatic Finger‑tapping Movements

"Attentional Focus Modulates Automatic Finger‑tapping Movements", in Scientific Reports

Xingxun Jiang 1 Dec 02, 2021
Unsupervised Image to Image Translation with Generative Adversarial Networks

Unsupervised Image to Image Translation with Generative Adversarial Networks Paper: Unsupervised Image to Image Translation with Generative Adversaria

Hao 71 Oct 30, 2022
PyTorch code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised DA

PyTorch Code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation Viraj Prabhu, Shivam Khare, Deeks

Viraj Prabhu 46 Dec 24, 2022
An experimentation and research platform to investigate the interaction of automated agents in an abstract simulated network environments.

CyberBattleSim April 8th, 2021: See the announcement on the Microsoft Security Blog. CyberBattleSim is an experimentation research platform to investi

Microsoft 1.5k Dec 25, 2022
This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation.

ISL This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation, which is accepted

19 May 04, 2022
An off-line judger supporting distributed problem repositories

Thaw 中文 | English Thaw is an off-line judger supporting distributed problem repositories. Everyone can use Thaw release problems with license on GitHu

countercurrent_time 2 Jan 09, 2022
meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICML 2017)

meProp The codes were used for the paper meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting (ICML 2017) [pdf]

LancoPKU 107 Nov 18, 2022
Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec This repo

Building and Urban Data Science (BUDS) Group 5 Dec 02, 2022
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

BiDR Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval. Requirements torch==

Microsoft 11 Oct 20, 2022
Delta Conformity Sociopatterns Analysis - Delta Conformity Sociopatterns Analysis

Delta_Conformity_Sociopatterns_Analysis ∆-Conformity is a local homophily measur

2 Jan 09, 2022
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

Princeton Natural Language Processing 119 Sep 28, 2022
An implementation of the 1. Parallel, 2. Streaming, 3. Randomized SVD using MPI4Py

PYPARSVD This implementation allows for a singular value decomposition which is: Distributed using MPI4Py Streaming - data can be shown in batches to

Romit Maulik 44 Dec 31, 2022