Header-only library for using Keras models in C++.

frugally-deep

Use Keras models in C++ with ease

Introduction

Would you like to build/train a model using Keras/Python? And would you like to run the prediction (forward pass) on your model in C++ without linking your application against TensorFlow? Then frugally-deep is exactly what you need.

frugally-deep

  • is a small header-only library written in modern and pure C++.
  • is very easy to integrate and use.
  • depends only on FunctionalPlus, Eigen and json - also header-only libraries.
  • supports inference (model.predict) not only for sequential models but also for computational graphs with a more complex topology, created with the functional API.
  • re-implements a (small) subset of TensorFlow, i.e., the operations needed to support prediction.
  • results in a much smaller binary size than linking against TensorFlow.
  • works out of the box also when compiled into a 32-bit executable. (Of course, 64-bit is fine too.)
  • utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. ;-)
  • but is quite fast on one CPU core compared to TensorFlow, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline (see the sketch below).
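
A minimal sketch of that parallel usage with std::async - assuming, as described in the FAQ, that predict may be called concurrently on the same model (if in doubt, give each thread its own model instance); "fdeep_model.json" refers to the converted model from the usage example below:

// parallel_predict.cpp (sketch)
#include <fdeep/fdeep.hpp>
#include <future>
#include <iostream>
#include <vector>

int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const std::vector<std::vector<float>> inputs = {
        {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}};
    // One future per forward pass; the OS spreads them over the cores.
    std::vector<std::future<fdeep::tensors>> jobs;
    for (const auto& values : inputs)
        jobs.push_back(std::async(std::launch::async, [&model, values]() {
            return model.predict({fdeep::tensor(
                fdeep::tensor_shape(static_cast<std::size_t>(4)), values)});
        }));
    for (auto& job : jobs)
        std::cout << fdeep::show_tensors(job.get()) << std::endl;
}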

Supported layer types

Layer types typically used in image recognition/generation are supported, making many popular model architectures possible (see Performance section).

  • Add, Concatenate, Subtract, Multiply, Average, Maximum
  • AveragePooling1D/2D, GlobalAveragePooling1D/2D
  • Bidirectional, TimeDistributed, GRU, LSTM, CuDNNGRU, CuDNNLSTM
  • Conv1D/2D, SeparableConv2D, DepthwiseConv2D
  • Cropping1D/2D, ZeroPadding1D/2D
  • BatchNormalization, Dense, Flatten, Normalization
  • Dropout, AlphaDropout, GaussianDropout, GaussianNoise
  • SpatialDropout1D, SpatialDropout2D, SpatialDropout3D
  • RandomContrast, RandomFlip, RandomHeight
  • RandomRotation, RandomTranslation, RandomWidth, RandomZoom
  • MaxPooling1D/2D, GlobalMaxPooling1D/2D
  • ELU, LeakyReLU, ReLU, SeLU, PReLU
  • Sigmoid, Softmax, Softplus, Tanh
  • Exponential, GELU, Softsign
  • UpSampling1D/2D
  • Reshape, Permute, RepeatVector
  • Embedding

Also supported

  • multiple inputs and outputs (see the sketch after this list)
  • nested models
  • residual connections
  • shared layers
  • variable input shapes
  • arbitrary complex model architectures / computational graphs
  • custom layers (by passing custom factory functions to load_model)
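
For the multiple-inputs case, feeding a model is just a matter of passing one tensor per input to predict, in the order the inputs were defined. A hedged sketch ("two_input_model.json" and the shapes are made up for illustration):

// two_inputs.cpp (sketch)
#include <fdeep/fdeep.hpp>
#include <iostream>
#include <vector>

int main()
{
    const auto model = fdeep::load_model("two_input_model.json");
    const fdeep::tensor a(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4});
    const fdeep::tensor b(fdeep::tensor_shape(static_cast<std::size_t>(2)),
        std::vector<float>{5, 6});
    // predict takes one tensor per model input and
    // returns one tensor per model output.
    const auto results = model.predict({a, b});
    std::cout << fdeep::show_tensors(results) << std::endl;
}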

Currently not supported are the following:

ActivityRegularization, AveragePooling3D, Conv2DTranspose (why), Conv3D, ConvLSTM2D, Cropping3D, Dot, GRUCell, LocallyConnected1D, LocallyConnected2D, LSTMCell, Masking, MaxPooling3D, RNN, SimpleRNN, SimpleRNNCell, StackedRNNCells, ThresholdedReLU, Upsampling3D, temporal models

Usage

  1. Use Keras/Python to build (model.compile(...)), train (model.fit(...)) and test (model.evaluate(...)) your model as usual. Then save it to a single HDF5 file using model.save('....h5', include_optimizer=False). The image_data_format in your model must be channels_last, which is the default when using the TensorFlow backend. Models created with a different image_data_format and other backends are not supported.

  2. Now convert it to the frugally-deep file format with keras_export/convert_model.py.

  3. Finally load it in C++ (fdeep::load_model(...)) and use model.predict(...) to invoke a forward pass with your data.

The following minimal example shows the full workflow:

# create_model.py
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(4,))
x = Dense(5, activation='relu')(inputs)
predictions = Dense(3, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer='nadam')

model.fit(
    np.asarray([[1, 2, 3, 4], [2, 3, 4, 5]]),
    np.asarray([[1, 0, 0], [0, 0, 1]]), epochs=10)

model.save('keras_model.h5', include_optimizer=False)

Then convert the model on the command line:

python3 keras_export/convert_model.py keras_model.h5 fdeep_model.json
// main.cpp
#include <fdeep/fdeep.hpp>
#include <iostream>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const auto result = model.predict(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4})});
    std::cout << fdeep::show_tensors(result) << std::endl;
}
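
If you need the raw output values for further processing, the tensor content can be copied into a plain vector - a small follow-up sketch (assuming the example above and fdeep::tensor::to_vector as described in the FAQ; the argmax part needs <algorithm> and <iterator>):

// continues main() from above
const std::vector<float> values = result.front().to_vector();
const auto argmax = std::distance(values.begin(),
    std::max_element(values.begin(), values.end()));
std::cout << "most probable class: " << argmax << std::endl;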

When using convert_model.py, a test case (input and corresponding output values) is generated automatically and saved along with your model. fdeep::load_model runs this test to make sure that the results of a forward pass in frugally-deep are the same as in Keras.
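
If the load time of a big model matters more than this safety check, the verification can be skipped - a hedged sketch, assuming the second parameter of fdeep::load_model is the verify flag (as in current versions of the library):

// Skips the embedded test case on load; only do this for models you
// have already verified once, since mismatches then go unnoticed.
const auto model = fdeep::load_model("fdeep_model.json", false);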

For more integration examples please have a look at the FAQ.

Performance

Below you can find the average durations of multiple consecutive forward passes for some popular models, run on a single core of an Intel Core i5-6600 CPU @ 3.30 GHz. frugally-deep and TensorFlow were compiled (GCC 7.1) with g++ -O3 -march=native. The processes were started with CUDA_VISIBLE_DEVICES='' taskset --cpu-list 1 ... to disable the GPU and to allow the usage of only one CPU. (See the Dockerfile used.)
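
Such numbers can be reproduced with a simple loop around model.predict - a minimal timing sketch (not the benchmark script behind the table below; "fdeep_model.json" and the input shape are placeholders):

// benchmark.cpp (sketch)
#include <fdeep/fdeep.hpp>
#include <chrono>
#include <iostream>
#include <vector>

int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const fdeep::tensor input(
        fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4});
    const std::size_t runs = 100;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i)
        model.predict({input});
    const auto stop = std::chrono::steady_clock::now();
    const auto total_ms = std::chrono::duration_cast<
        std::chrono::milliseconds>(stop - start).count();
    std::cout << "average: " << static_cast<double>(total_ms) / runs
              << " ms per forward pass" << std::endl;
}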

Model          Keras + TF   frugally-deep
DenseNet121    0.12 s       0.25 s
DenseNet169    0.13 s       0.28 s
DenseNet201    0.16 s       0.39 s
InceptionV3    0.21 s       0.32 s
MobileNet      0.05 s       0.15 s
MobileNetV2    0.05 s       0.17 s
NASNetLarge    0.83 s       4.03 s
NASNetMobile   0.08 s       0.32 s
ResNet101      0.22 s       0.45 s
ResNet101V2    0.21 s       0.42 s
ResNet152      0.31 s       0.65 s
ResNet152V2    0.29 s       0.61 s
ResNet50       0.13 s       0.26 s
ResNet50V2     0.12 s       0.22 s
VGG16          0.40 s       0.56 s
VGG19          0.49 s       0.68 s
Xception       0.25 s       1.20 s

Requirements and Installation

  • A C++14-compatible compiler. Compilers from these versions on are fine: GCC 4.9, Clang 3.7 (libc++ 3.7), and Visual C++ 2015.
  • Python 3.7 or higher
  • TensorFlow and Keras 2.7.0 (these are the tested versions, but somewhat older ones might work too)

Guides for different ways to install frugally-deep can be found in INSTALL.md.

FAQ

See FAQ.md

Disclaimer


The API of this library might still change in the future. If you have any suggestions, find errors, or want to give general feedback/criticism, I'd love to hear from you. Of course, contributions are also very welcome.

License

Distributed under the MIT License. (See accompanying file LICENSE or at https://opensource.org/licenses/MIT)

Comments
  • Problem with results of siamese CNN using EfficientNet

    Hi there,

    First of all let me thank you for this fantastic library!

    Recently I got stuck on converting a siamese network that uses the functional model and the EfficientNetB0 architecture. I'm strictly following this repo for my development: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network. Since EfficientNetB0 uses FixedDropout and reduce layers that shrink the dimensionality (which requires multiplying tensors by a 1x1xDEPTH Conv), I had to implement them myself in the library. When I convert EfficientNetB0 on its own and load it in my C++ app, the output is EXACTLY as expected on both the Python and the C++ side - no problems there. However, when I try to create a siamese network out of them, as presented here: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network/blob/master/model.py, I get totally different results. In anticipation of your question - yes, I made super sure that the inputs to the network are EXACTLY the same on both sides, Python and C++. I've tried everything to fix this and concluded that there must be something wrong with either the way frugally-deep deals with functional models OR the converter itself. What I also noticed is that the tensors look completely different when they reach the two Flatten layers in the architecture. Any ideas why this may be happening? Please look at the screenshot below to better understand the problem.

    [Screenshot 2022-03-06 at 11:57:21]

    opened by pavel123 37
  • Hash value for json/net loaded?

    I think it would be handy to have a hash over a loaded model (so I could store, together with the results, some indication of how they were generated - particularly handy for encodings, which tend to be incompatible). I could simply calculate a hash over the file/string used to initialise the net, but since many files could potentially result in the same net, it would be nicer if the net itself could provide such a hash. Is such a function implemented or, if not, do you see an easy way to get such a hash?
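
    A minimal sketch of the workaround mentioned above - hashing the file contents before passing them to fdeep::load_model (std::hash is illustrative only; it is not stable across platforms, so a real digest such as SHA-256 would be needed if the hash is persisted):

    #include <fstream>
    #include <functional>
    #include <iostream>
    #include <sstream>
    #include <string>

    int main()
    {
        // Read the whole model file into a string ...
        std::ifstream file("fdeep_model.json");
        std::stringstream buffer;
        buffer << file.rdbuf();
        // ... and hash it; store this value alongside the results.
        const std::size_t hash = std::hash<std::string>{}(buffer.str());
        std::cout << "model file hash: " << hash << std::endl;
    }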

    Thanks

    Sven

    opened by utcke 36
  • How to convert model with "relu6" layer?

    My Keras model uses a "relu6" layer. How do I change convert_model.py to produce the JSON file? And are there any examples of adding a custom layer in fdeep::load_model?

    Thank you very much!

    opened by binlbl 32
  • Using Eigen Unsupported modules to improve convolutions

    I noticed that Eigen 3.3 has unsupported modules, including modules for Tensors and gemm operations.

    https://bitbucket.org/eigen/eigen/src/9b065de03d016d802a25366ff5f0055df6318121/unsupported/Eigen/CXX11/src/Tensor/README.md?at=default#markdown-header-convolutions

    I noticed you implement your own gemm operation in fdeep/convolution.hpp, in the function convolve_im2col. This could be improved by using gemm functions from the Eigen unsupported modules.

    I ran a test by running inference on the UNet model from pix2pix in frugally-deep. It took 18 s, compared to a model converted from ONNX and run in OpenCV, which took 3 s. I think this shows that convolutions in frugally-deep could be improved.
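
    For context, im2col lowers a convolution to a single matrix product, which is why the gemm quality dominates the runtime - a toy sketch with core Eigen (a 1D valid convolution; sizes made up for illustration):

    #include <Eigen/Dense>
    #include <iostream>

    int main()
    {
        // Convolve a length-5 signal with a length-3 kernel by stacking
        // the sliding windows into a matrix (im2col) ...
        Eigen::VectorXf signal(5);
        signal << 1, 2, 3, 4, 5;
        Eigen::Vector3f kernel;
        kernel << 1, 0, -1;
        Eigen::MatrixXf patches(3, 3); // one window per row
        for (int i = 0; i < 3; ++i)
            patches.row(i) = signal.segment(i, 3).transpose();
        // ... so that the whole convolution becomes one gemv call.
        const Eigen::Vector3f result = patches * kernel;
        std::cout << result.transpose() << std::endl;
    }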

    Thanks

    opened by pfeatherstone 32
  • Slow-ish run time on MSVC

    Hi!

    First of all, thank you for this great library! :-) I've got a fairly small model (18 layers) for real-time applications, mainly consisting of 5 blocks of Conv2D/ReLU/MaxPool2D, with input size 64x64x3. Unfortunately, I'm seeing some speed problems with fdeep. A forward pass takes around 11 ms in Keras, but 60 ms in fdeep. (I've measured by calling predict 100x in a for-loop and then averaging - a bit crude, but it should do the trick for this purpose.) I've compiled with the latest VS2017 15.5.5, Release mode, and default compiler flags (/O2). If I enable AVX2 and intrinsics, it goes down to 50 ms, but that is still way too slow. (I've tried without im2col, but it's even slower, around >10x.)

    I've run the VS profiler, but I'm not 100% sure I'm interpreting the results correctly. I think around 30%+5% of the total time is spent in Eigen's gebp and gemm functions, where we probably can't do much. Except maybe: I think I've seen you're using RowMajor storage for the Eigen matrices. Eigen is supposedly more optimised for its default, ColMajor storage. Would it be hard to change that in fdeep?

    Another 30% seems to be spent in convolve_im2col, but I'm not 100% sure where. I first thought it was the memcpy in eigen_mat_to_values, but eigen_mat_to_values itself contains very few profiler samples. There's also a lot of internal::transform and std::transform showing up in the profiler (internal::transform<ContainerOut>(reuse_t{}, f, std::forward<ContainerIn>(xs));), but I couldn't really figure out what the actual code is that this executes.

    I also saw that you pre-instantiate some convolution functions for common kernels. Most of my convolution kernels are 3x3, and it looks like you only instantiate n x m kernels for n and m equal to 1 and 2. Could it help to add 3x3 there? So yeah, I'm really not sure about all of it. If the majority of time is indeed spent in Eigen's functions, then the RowMajor thing could be a major problem.
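
    On the storage-order point: Eigen defaults to column-major, and row-major has to be requested per matrix type - a minimal illustration (just the declarations, not fdeep's actual types):

    #include <Eigen/Dense>

    // Elements of one column are contiguous in memory:
    using ColMajorMat = Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic>;
    // Elements of one row are contiguous in memory:
    using RowMajorMat =
        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;

    int main()
    {
        ColMajorMat a = ColMajorMat::Zero(2, 3);
        RowMajorMat b = RowMajorMat::Zero(2, 3);
        return static_cast<int>(a(0, 0) + b(0, 0));
    }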

    I'm happy to send you the model and an example input via email if you wanted to have a look.

    Here's some screenshots of the profiler: image image image

    Thank you very much!

    enhancement 
    opened by patrikhuber 32
  • Input to model

    If I have an RGB image and I want to pass it to the model, what should I do?

    What I've done is flatten the input image into a vector of floats; I appended the r, g, b values after each other to get one vector called "input_vector".

    And then this is the next step:

             typedef fplus::shared_ref<std::vector<float>> shared_float_vec;
             shared_float_vec x(fplus::make_shared_ref<vector<float>>(std::move(input_vector)));
             const auto result = decision_model.predict({fdeep::tensor3(fdeep::shape3(3,60,60),x)});
    

    The output is then incorrect. What should I do, or what have I done wrong?

    opened by rmmal 32
  • lambda layer using tf.image

    I am using a Lambda layer which includes this function to extract patches from an image:

    patch_one = tf.image.extract_glimpse(inputs[0], [26, 26], inputs[1][:, j, :], centered=False, normalized=False, noise='zero')

    Is it possible to implement this custom layer in your library and load the model?

    opened by katmatus 30
  • Stop at the 'Loading json ...'

    Hi Tobias, thanks for this great library! I trained a ResNet50 network using Keras. I was able to convert the .h5 model to a .json. However, when I run the program as follows:

    #include <fdeep/fdeep.hpp>
    #include <opencv2/opencv.hpp>
    
    int main()
    {
    	cv::Mat image = cv::imread("Image_1_2.jpg"); // non-const: cvtColor writes back into it
    	cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    	assert(image.isContinuous());
    	const auto model = fdeep::load_model("train7.json");
    	// Use the correct scaling, i.e., low and high.
    	const auto input = fdeep::tensor5_from_bytes(image.ptr(),
    		static_cast<std::size_t>(image.rows),
    		static_cast<std::size_t>(image.cols),
    		static_cast<std::size_t>(image.channels()),
    		0.0f, 1.0f);
    	const auto result = model.predict_class({ input });
    	std::cout << result << std::endl;
    	system("pause");
    }
    

    It follows the example in the FAQ ("How to use images loaded with OpenCV as input for a model?"), but it doesn't work with my Keras model. It takes about 236 s to load the JSON and then stops there. My CPU is a Core i5-3230M, which is not a good CPU. My model classifies 7 kinds of algae cells and uses transfer learning based on ResNet50.
    The Python program for training the model is as follows:

    import numpy as np
    import matplotlib.pyplot as plt
    import keras
    from keras.preprocessing import image
    from keras.preprocessing.image import ImageDataGenerator
    from keras.applications import ResNet50
    from keras.applications.resnet50 import preprocess_input
    from keras import Model, layers
    from keras.models import load_model
    
    input_path = "data/LvsRod/"
    
    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        preprocessing_function=preprocess_input)
    
    train_generator = train_datagen.flow_from_directory(
        input_path + 'train',
        batch_size=10,
        class_mode='binary',
        target_size=(224, 224))
    
    validation_datagen = ImageDataGenerator(
        rescale=1. / 255,
        preprocessing_function=preprocess_input)
    
    validation_generator = validation_datagen.flow_from_directory(
        input_path + 'validation',
        shuffle=False,
        class_mode='binary',
        target_size=(224, 224))
    
    conv_base = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
    
    for layer in conv_base.layers:
        layer.trainable = False
    
    x = conv_base.output
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    predictions = layers.Dense(7, activation='softmax')(x)
    model = Model(conv_base.input, predictions)
    
    optimizer = keras.optimizers.SGD(lr=1e-4, momentum=0.9)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    
    history = model.fit_generator(generator=train_generator,
                                  steps_per_epoch=10,  # added in Kaggle
                                  epochs=30,
                                  validation_data=validation_generator,
                                  validation_steps=10  # added in Kaggle
                                 )
    
    # save
    model.save('train7.h5')
    

    The h5 model can be downloaded from this URL: https://pan.baidu.com/s/1YkuBHBkjjUs2dcpc8XTLqA (extraction code: 1od1).

    Because the file is too big, I cannot upload it here. I really want to know how to solve this problem.

    opened by callmefish 28
  • Bad performance.

    Hi Tobias,

    I am getting bad performance when using frugally-deep, and I wanted to ask you for some advice. Of course, I've read the FAQ section about performance, so I have that covered.

    Here is what I've tested so far:

    | Environment        | Description                                                        |   Time |
    |--------------------|--------------------------------------------------------------------|-------:|
    | Python             | Default settings (GPU ON)                                          |  35 ms |
    | Python             | os.environ['CUDA_VISIBLE_DEVICES']='-1'                            |  45 ms |
    | Python             | No GPU and tf.config.threading.set_intra_op_parallelism_threads(1) |  75 ms |
    | Visual Studio 2017 | Default (Release -O2, whole program optimization)                  | 310 ms |
    | Visual Studio 2017 | Compiled with AVX2                                                 | 280 ms |

    It is quite interesting that a single switch (AVX2) gave me a 10% boost! But it is still far, very far from what you have advocated.

    I did run a benchmark and here is what I've got:

    [benchmark screenshot]

    Any ideas? Could I send you my model and example code? (Privately, as this is for work; I will be happy to support you if I get paid for the project. :))

    opened by TrueWodzu 27
  • Cannot load InceptionV3 model

    So, I successfully loaded some models and ran predictions with them.

    Yet when I try to load the InceptionV3 model, I get an error. There were no errors when I converted the model from .h5 to .json, but the code below does not work.

    [screenshot of the code]

    The error I got:

    [screenshot of the error]

    opened by Terminou 25
  • Frugally LSTM Encoder-Decoder results different from Keras/Tensorflow LSTM Encoder-Decoder (missing support for initial_state)

    Hi @Dobiasd

    I have been working on the Encoder-Decoder model for Vehicle Path Forecasting since you added support for returned_states and show_tensor5 on LSTM-based models. The workflow of the project was described in this past issue. After some experiments, the LSTM-based encoder and decoder models are not giving me any problems related to returned_states = True or show_tensor5, confirming that the frugally-deep fixes worked. However, I have been trying to replicate the results I obtained using the Keras/TensorFlow models, without success.

    The frugally-deep fdeep_encoder_model_NT is returning exactly the same encoder_hidden_state and encoder_cell_state as its TF + Keras counterpart using encoder_model.hdf5. However, fdeep_decoder_model_NT is not giving me the same decoder_hidden_state and decoder_cell_state output states (compared to the TF + Keras results) :(

    Specifically, I developed the decoder inference model using TF + Keras (please refer to past comments in this issue for the corresponding code) and then converted it from .hdf5 to .json, ready to be ported into the C++ application (same as with the encoder model). Validating the encoder states: (screenshot) However, both the frugally-deep decoder_hidden_state and decoder_cell_state differ from the corresponding Keras-based decoder_hidden_state and decoder_cell_state: (screenshot) This results, as expected, in a wrong bounding box prediction: (screenshot) which does not match the corresponding Keras results: (screenshot) I do not really know what is happening with fdeep_decoder_model_NT, so I have various options in mind:

    • I have trained another model using LSTM instead of CuDNNLSTM layers in order to check whether the problem is with the CuDNNLSTM layer implementation. However, the problem is still present when using other LSTM-based cells like CuDNNLSTM and LSTM. The fdeep_encoder_model works well, but fdeep_decoder_model is still making wrong predictions (both for the returned states and the next bbox prediction).
    • Now I am working on the main.cpp file. Maybe the problem is in my internal manipulation of fdeep::tensor5 and fdeep::tensor5s when feeding the data into the ported models. However, both models run without crashing at any step of the script execution; the decoder model just makes inaccurate predictions of future bounding boxes.
    • I am puzzled about the following fact: in main.cpp, the decoder's prediction is made with the following command: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(0), encoder_states.at(1)});, where encoder_states.at(0) and encoder_states.at(1) represent h_enc and c_enc respectively. However, I tried interchanging the encoder states at the input of the decoder prediction line like this: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)}); and obtained the exact same predicted_next_box (even though I interchanged the input order of the decoder states at the prediction function).
    • Finally, apart from the wrong values of h_dec and c_dec returned by fdeep_decoder_model, I noticed both h_dec hidden states (from frugally-deep AND Keras) are in the range [-1, 1], but that is not the case for the c_dec states. In Keras, c_dec has values in [-11, 11], but in frugally-deep, c_dec takes values in [-1, 1]. In addition, based on your suggestion about internal scaling causing this kind of issue, by inspecting fdeep_encoder_model.json I found some initializer parameters that use Variance_Scaling, which maybe is the cause of errors at inference time. I think this may be at the root of the problem, but I have no idea how to get the correct h_enc and c_enc, both within the same ranges used in Keras and with the correct values as well.

    Here is the main.cpp file I am running to test the results. Any comment or suggestion about the code would be welcome!

    #include <fdeep/fdeep.hpp>
    #include <vector>
    #include <fstream>
    #include <iostream>
    
    int main()
    {
    	// Loading the previously trained models
    	const auto encoder_model = fdeep::load_model("fdeep_encoder_model_NT.json");
    	std::cout << "Encoder Model Loaded!" << std::endl;
    	const auto decoder_model = fdeep::load_model("fdeep_decoder_model_NT.json");
    	std::cout << "Decoder Model Loaded!" << std::endl;
    	// Batch_size = 1, num_timesteps = 10 and num_features = 4
    	fdeep::shape5 in_traj_shape(1,1,1,10,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> src_traj  = {1728, 715, 191, 221,
    					1717, 710, 202, 215,
    					1706, 704, 206, 198,
    					1695, 700, 217, 196,
    					1687, 696, 228, 183,
    					1680, 689, 240, 181,
    					1668, 668, 240, 198,
    					1661, 668, 243, 194,
    					1650, 664, 251, 189,
    					1635, 660, 266, 181};
    	// Input trajectory from vector to tensor5 data structure
    	const fdeep::shared_float_vec shared_traj(fplus::make_shared_ref<fdeep::float_vec>(src_traj));
    	const fdeep::tensor5 encoder_inputs(in_traj_shape, shared_traj);
    	std::cout << "Trajectory #0!" << fdeep::show_tensor5(encoder_inputs) << std::endl;
    	// Using loaded encoder model to predict encoder output states
    	// Then encoder_states can be feed as input tensors into decoder_model
    	const auto encoder_states = encoder_model.predict({encoder_inputs});
    	// Printing for debugging purposes
    	std::cout << "h_enc: "<< fdeep::show_tensor5(encoder_states.at(0)) << std::endl;
    	std::cout << "c_enc: "<< fdeep::show_tensor5(encoder_states.at(1)) << std::endl;
    	// Creating a SOS input sequence token to signal decoder model to start making predictions
    	fdeep::shape5 bbox_shape(1,1,1,1,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> SOS_token  = {9999.0, 9999.0, 9999.0, 9999.0};
    	const fdeep::shared_float_vec shared_SOS_token(fplus::make_shared_ref<fdeep::float_vec>(SOS_token));
    	fdeep::tensor5 target_seq(bbox_shape, shared_SOS_token);
    	// In Python we have: Prediction, h, c = decoder_model.predict([target_seq] + state)
    	auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)});
    	// Printing for debugging purposes
    	std::cout << "h_dec: "<< fdeep::show_tensor5(decoder_outputs.at(1)) << std::endl;
    	std::cout << "c_dec: "<< fdeep::show_tensor5(decoder_outputs.at(2)) << std::endl;
    	std::cout << "Predicted next bounding box!" << fdeep::show_tensor5(decoder_outputs.at(0)) << std::endl;
    }
    

    The fdeep_encoder_model_NT.json model imported into the C++ application is available to download and inspect from this past comment. The fdeep_decoder_model_NT.json can be downloaded from the following link: https://drive.google.com/open?id=1hwrjcnNfWaqQI0o8TmJKtfsAwj6zd9aq. I would really appreciate any help with this issue. I am puzzled because the encoder model works perfectly but the decoder model does not; specifically, the results of the Keras vs. frugally-deep decoder models differ, giving me wrong output predictions that cannot be used at all.

    opened by MarlonCajamarca 25
  • `visualize_layers.py` uses `scipy.misc.imsave` which no longer exists

    The documentation suggests switching to imageio.imwrite instead: https://docs.scipy.org/doc/scipy-1.2.1/reference/generated/scipy.misc.imsave.html

    There's even a migration guide: https://imageio.readthedocs.io/en/v2.6.1/scipy.html

    Another alternative would be keras.preprocessing.image.save_img.

    opened by torokati44 0
  • Modify Unit Tests CmakeLists and INSTALL.md

    Modify the unit tests' CMakeLists.txt to let CMake detect Python for executing commands instead of using "python3 xxxx", because not all users can run Python scripts via the "python3" command; the command converting h5 to json may fail because of it. I added find_package to detect Python, and I check pip.exe, pip3.exe, etc. and run "pip show tensorflow" to make sure the user has TensorFlow installed. The requirements for Python and TensorFlow are documented in INSTALL.md.

    opened by sirius-william 4
  • Thanks!

    Many thanks to the project author! My graduate design project is a one-dimensional convolutional neural network. After training with Python's TensorFlow 2.10, I had been looking for ways to deploy the model in my Qt project. I tried to compile TensorFlow C++ (compilation always fails), the TensorFlow C API (TensorFlow 2.10 is not supported), TensorRT (the AMD graphics driver is not supported), and OpenVINO (the network architecture I chose is not supported). By chance, I found this library through Google. It is easy to use, does not require many dependencies, and only requires header files. It perfectly solves my project's needs. Thank you!

    PS. The Python script part of the tests is executed in CMakeLists.txt using python3 xxxx. However, not all users can run Python scripts through the command 'python3'. It is recommended to find Python in CMakeLists.txt or to let users specify the Python path. In addition, MinGW reports "Fatal error: can't write 286 bytes to section .text" when compiling the unit tests. It is recommended to add: target_compile_options(PROJECT_NAME PRIVATE $<$<CXX_COMPILER_ID:MSVC>:/bigobj> $<$<CXX_COMPILER_ID:GNU>:-Wa,-mbig-obj>). This problem also arises when the library is used in other projects.

    opened by sirius-william 2
  • Consider having different convolution implementations available and choosing the fastest one at runtime

    Different convolution implementations might perform differently depending on the convolution settings (input size/depth, kernel size/count) and depending on the hardware (mostly CPU/memory) used.

    Right now, for example, we have a special implementation used for 2D convolutions in case strides = (1, 1) (which is utilized not only by the Conv2D layer, but also by DepthwiseConv2D and SeparableConv2D).

    I wonder if it would make sense to provide a function to the user that, when called on a model, tries out different implementations and remembers which one performed best for future calls of model.predict. (Maybe in some settings, even a naive non-im2col convolution is the fastest one.)
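
    A sketch of the idea (not fdeep code; the candidate signature is a stand-in):

    #include <chrono>
    #include <functional>
    #include <vector>

    using conv_fn = std::function<void()>; // stand-in for a real conv signature

    // Run each candidate once on representative data and keep the fastest;
    // the result would be cached and reused for subsequent predict calls.
    conv_fn pick_fastest(const std::vector<conv_fn>& candidates)
    {
        conv_fn best;
        auto best_time = std::chrono::steady_clock::duration::max();
        for (const auto& candidate : candidates)
        {
            const auto start = std::chrono::steady_clock::now();
            candidate();
            const auto elapsed = std::chrono::steady_clock::now() - start;
            if (elapsed < best_time)
            {
                best_time = elapsed;
                best = candidate;
            }
        }
        return best;
    }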

    Pros:

    • potentially faster forward passes

    Cons:

    • increased code complexity
    • potentially wrong settings in case the background load on the user's machine varies too much during the evaluation
    opened by Dobiasd 0
  • Feature Suggestion: Support Transformer Models

    First off, I would like to say that this is a really great piece of work! I have been using it with LSTMs for time-series data and have found frugally-deep to be invaluable. I am starting to investigate Transformers to see how they stack up against LSTMs, and it would be wonderful if support for Transformer models could be added. I am in the early stages of working with Transformers, but the specific layers that I currently do not see supported are MultiHeadAttention and LayerNormalization.

    help wanted 
    opened by jonathan-lazzaro-nnl 11
  • Feature suggestion: Support ONNX models?

    How about supporting ONNX in frugally-deep? You could have a protobuf importer for ONNX models, or add a tool that converts ONNX to the JSON format you use. Just a thought. A header-only ONNX inference engine would be very, very useful.

    opened by pfeatherstone 24