Header-only library for using Keras models in C++.

Overview


frugally-deep

Use Keras models in C++ with ease

Introduction

Would you like to build/train a model using Keras/Python? And would you like to run the prediction (forward pass) on your model in C++ without linking your application against TensorFlow? Then frugally-deep is exactly for you.

frugally-deep

  • is a small header-only library written in modern and pure C++.
  • is very easy to integrate and use.
  • depends only on FunctionalPlus, Eigen and json - also header-only libraries.
  • supports inference (model.predict) not only for sequential models but also for computational graphs with a more complex topology, created with the functional API.
  • re-implements a (small) subset of TensorFlow, i.e., the operations needed to support prediction.
  • results in a much smaller binary size than linking against TensorFlow.
  • works out-of-the-box also when compiled into a 32-bit executable. (Of course, 64 bit is fine too.)
  • utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. ;-)
  • but is quite fast on one CPU core compared to TensorFlow, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline (see the sketch below).
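
Since a single forward pass uses only one core, the easiest way to scale throughput is to run several predictions concurrently. Here is a minimal sketch (assuming the fdeep_model.json produced in the usage example below); loading one model instance per thread sidesteps any questions about sharing state between threads:

#include <fdeep/fdeep.hpp>
#include <thread>
#include <vector>

void predict_worker()
{
    // Each thread loads its own model instance, so nothing is shared.
    const auto model = fdeep::load_model("fdeep_model.json");
    const auto result = model.predict(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4})});
}

int main()
{
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < 4; ++i)
        workers.emplace_back(predict_worker);
    for (auto& w : workers)
        w.join();
}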

Supported layer types

Layer types typically used in image recognition/generation are supported, making many popular model architectures possible (see Performance section).

  • Add, Concatenate, Subtract, Multiply, Average, Maximum
  • AveragePooling1D/2D, GlobalAveragePooling1D/2D
  • Bidirectional, TimeDistributed, GRU, LSTM, CuDNNGRU, CuDNNLSTM
  • Conv1D/2D, SeparableConv2D, DepthwiseConv2D
  • Cropping1D/2D, ZeroPadding1D/2D
  • BatchNormalization, Dense, Flatten, Normalization
  • Dropout, AlphaDropout, GaussianDropout, GaussianNoise
  • SpatialDropout1D, SpatialDropout2D, SpatialDropout3D
  • RandomContrast, RandomFlip, RandomHeight
  • RandomRotation, RandomTranslation, RandomWidth, RandomZoom
  • MaxPooling1D/2D, GlobalMaxPooling1D/2D
  • ELU, LeakyReLU, ReLU, SeLU, PReLU
  • Sigmoid, Softmax, Softplus, Tanh
  • Exponential, GELU, Softsign
  • UpSampling1D/2D
  • Reshape, Permute, RepeatVector
  • Embedding

Also supported

  • multiple inputs and outputs (see the sketch after this list)
  • nested models
  • residual connections
  • shared layers
  • variable input shapes
  • arbitrary complex model architectures / computational graphs
  • custom layers (by passing custom factory functions to load_model)
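
model.predict always takes a vector of input tensors and returns a vector of output tensors, so models with multiple inputs and outputs need no special treatment. A minimal sketch (the model file name and input shapes here are made up for illustration):

#include <fdeep/fdeep.hpp>
#include <iostream>
int main()
{
    // Hypothetical two-input model; the shapes must match the converted model.
    const auto model = fdeep::load_model("fdeep_two_input_model.json");
    const auto results = model.predict(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
            std::vector<float>{1, 2, 3, 4}),
        fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(2)),
            std::vector<float>{5, 6})});
    // One entry in results per model output.
    std::cout << fdeep::show_tensors(results) << std::endl;
}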

Currently not supported are the following:

ActivityRegularization, AveragePooling3D, Conv2DTranspose (why), Conv3D, ConvLSTM2D, Cropping3D, Dot, GRUCell, LocallyConnected1D, LocallyConnected2D, LSTMCell, Masking, MaxPooling3D, RNN, SimpleRNN, SimpleRNNCell, StackedRNNCells, ThresholdedReLU, UpSampling3D

Usage

  1. Use Keras/Python to build (model.compile(...)), train (model.fit(...)) and test (model.evaluate(...)) your model as usual. Then save it to a single HDF5 file using model.save('....h5', include_optimizer=False). The image_data_format in your model must be channels_last, which is the default when using the TensorFlow backend. Models created with a different image_data_format and other backends are not supported.

  2. Now convert it to the frugally-deep file format with keras_export/convert_model.py

  3. Finally load it in C++ (fdeep::load_model(...)) and use model.predict(...) to invoke a forward pass with your data.

The following minimal example shows the full workflow:

# create_model.py
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(4,))
x = Dense(5, activation='relu')(inputs)
predictions = Dense(3, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer='nadam')

model.fit(
    np.asarray([[1, 2, 3, 4], [2, 3, 4, 5]]),
    np.asarray([[1, 0, 0], [0, 0, 1]]), epochs=10)

model.save('keras_model.h5', include_optimizer=False)

python3 keras_export/convert_model.py keras_model.h5 fdeep_model.json

// main.cpp
#include <fdeep/fdeep.hpp>
#include <iostream>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const auto result = model.predict(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4})});
    std::cout << fdeep::show_tensors(result) << std::endl;
}
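
For softmax classifiers you often only need the index of the winning class instead of the full output tensor. model.predict_class (also used in one of the issue threads below) returns exactly that; a minimal sketch based on the model from the example above:

// classify.cpp
#include <fdeep/fdeep.hpp>
#include <iostream>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    // predict_class returns the index of the maximum value
    // in the single output tensor.
    const auto class_index = model.predict_class(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4})});
    std::cout << class_index << std::endl;
}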

When using convert_model.py, a test case (input and corresponding output values) is generated automatically and saved along with your model. fdeep::load_model runs this test to make sure the results of a forward pass in frugally-deep are the same as in Keras.

For more integration examples please have a look at the FAQ.

Performance

Below you can find the average durations of multiple consecutive forward passes for some popular models, run on a single core of an Intel Core i5-6600 CPU @ 3.30GHz. frugally-deep and TensorFlow were compiled (GCC ver. 7.1) with g++ -O3 -march=native. The processes were started with CUDA_VISIBLE_DEVICES='' taskset --cpu-list 1 ... to disable the GPU and to only allow usage of one CPU. (see the Dockerfile used)

Model          Keras + TF   frugally-deep
DenseNet121    0.12 s       0.25 s
DenseNet169    0.13 s       0.28 s
DenseNet201    0.16 s       0.39 s
InceptionV3    0.21 s       0.32 s
MobileNet      0.05 s       0.15 s
MobileNetV2    0.05 s       0.17 s
NASNetLarge    0.83 s       4.03 s
NASNetMobile   0.08 s       0.32 s
ResNet101      0.22 s       0.45 s
ResNet101V2    0.21 s       0.42 s
ResNet152      0.31 s       0.65 s
ResNet152V2    0.29 s       0.61 s
ResNet50       0.13 s       0.26 s
ResNet50V2     0.12 s       0.22 s
VGG16          0.40 s       0.56 s
VGG19          0.49 s       0.68 s
Xception       0.25 s       1.20 s
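
To obtain comparable numbers for your own model, it is enough to average many consecutive forward passes with std::chrono; a minimal sketch (reusing the model file and input shape from the usage example above):

#include <fdeep/fdeep.hpp>
#include <chrono>
#include <iostream>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const fdeep::tensor input(
        fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4});
    const std::size_t runs = 100;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i)
        model.predict({input});
    const auto stop = std::chrono::steady_clock::now();
    const auto total_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        stop - start).count();
    std::cout << "average: " << static_cast<double>(total_ms) / runs
        << " ms per forward pass" << std::endl;
}

To pin the process to a single core as in the table above, start it with taskset --cpu-list 1 on Linux.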

Requirements and Installation

  • A C++14-compatible compiler: GCC 4.9, Clang 3.7 (libc++ 3.7), Visual C++ 2015, and later versions are fine
  • Python 3.7 or higher
  • TensorFlow and Keras 2.7.0 (This is the tested version, but somewhat older ones might work too.)

Guides for different ways to install frugally-deep can be found in INSTALL.md.

FAQ

See FAQ.md

Disclaimer

The API of this library might still change in the future. If you have any suggestions, find errors, or want to give general feedback/criticism, I'd love to hear from you. Of course, contributions are also very welcome.

License

Distributed under the MIT License. (See accompanying file LICENSE or at https://opensource.org/licenses/MIT)

Comments
  • Problem with results of siamese CNN using EfficientNet

    Hi there,

    First of all let me thank you for this fantastic library!

    Recently I got stuck on converting a siamese network that uses the functional model and the EfficientNetB0 architecture. I'm strictly following this repo for my development: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network. Since EfficientNetB0 uses FixedDropout and reduce layers that shrink the dimensionality (which requires multiplying tensors by a 1x1xDEPTH Conv), I had to implement them myself in the library. When I convert EfficientNetB0 on its own and load it in my C++ app, the output is EXACTLY as expected on both the Python and C++ side - no problems there. However, when I try to create a siamese network out of them as presented here: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network/blob/master/model.py I get totally different results. In anticipation of your question - yes, I made super sure that the inputs to the network are EXACTLY the same on both sides - Python and C++. I've tried everything to fix this and concluded that there must be something wrong with either the way frugally-deep deals with functional models OR the converter itself. What I also noticed is that the tensors look completely different when they reach the two Flatten layers in the architecture. Any ideas why this may be happening?

    opened by pavel123 37
  • Hash value for json/net loaded?

    I think it would be handy for us to have a hash over a loaded model (so I could store, together with the results, some indication of how they were generated - particularly handy for encodings, which tend to be incompatible). I could simply calculate a hash over the file/string used to initialise the net, but since many files could potentially result in the same net it would be nicer if the net itself could provide such a hash. Is such a function implemented or, if not, do you see an easy way to get such a hash?

    Thanks

    Sven
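
    A minimal sketch of the file-hash workaround mentioned above (note that std::hash gives no stability guarantees across platforms or standard-library implementations; for values persisted alongside results, a proper digest such as SHA-256 would be the safer choice):

    #include <fstream>
    #include <functional>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Hash the raw JSON string used to initialize the net.
    std::size_t hash_model_file(const std::string& path)
    {
        std::ifstream file(path, std::ios::binary);
        std::ostringstream contents;
        contents << file.rdbuf();
        return std::hash<std::string>{}(contents.str());
    }

    int main()
    {
        std::cout << hash_model_file("fdeep_model.json") << std::endl;
    }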

    opened by utcke 36
  • How to convert model with "relu6" layer?

    My Keras model uses a "relu6" layer. How do I change convert_model.py to create the json file? And are there any examples of adding a custom layer in fdeep::load_model?

    Thank you very much!

    opened by binlbl 32
  • Using Eigen Unsupported modules to improve convolutions

    I noticed that Eigen 3.3 has unsupported modules, including modules for Tensors and gemm operations.

    https://bitbucket.org/eigen/eigen/src/9b065de03d016d802a25366ff5f0055df6318121/unsupported/Eigen/CXX11/src/Tensor/README.md?at=default#markdown-header-convolutions

    I noticed you implement your own gemm operation in fdeep/convolution.hpp in the function convolve_im2col. This could be improved by using gemm functions from the Eigen unsupported modules.

    I ran a test by inferring the UNet model from pix2pix in frugally-deep. It took 18s, compared to a model converted from ONNX and inferred in OpenCV, which took 3s. I think this shows that convolutions in frugally-deep could be improved.

    Thanks

    opened by pfeatherstone 32
  • Slow-ish run time on MSVC

    Hi!

    First of all thank you for this great library! :-) I've got a fairly small model (18 layers) for real-time applications, basically consisting mainly of 5 blocks of Conv2D/ReLU/MaxPool2D, with input size 64x64x3. Unfortunately, I'm seeing some speed problems with fdeep. A forward pass takes around 11ms in Keras, and it's taking 60ms in fdeep. (I've measured by calling predict 100x in a for-loop and then averaging - a bit crude, but it should do the trick for this purpose.) I've compiled with the latest VS2017 15.5.5, Release mode, and default compiler flags (/O2). If I enable AVX2 and intrinsics, it goes down to 50ms, but that's still way too slow. (I've tried without im2col, but it's even slower, around >10x.)

    I've run the VS profiler, but I'm not 100% sure I'm interpreting the results correctly. I think around 30%+5% of the total time is spent in Eigen's gebp and gemm functions, where we probably can't do much. Except maybe: I think I've seen you're using RowMajor storage for the Eigen matrices. Eigen is supposedly more optimised for its default, ColMajor storage. Would it be hard to change that in fdeep?

    Another 30% seems to be spent in convolve_im2col, but I'm not 100% sure where. I first thought it was the memcpy in eigen_mat_to_values, but eigen_mat_to_values itself contains only very few profiler samples. There's also a lot of internal::transform and std::transform showing up in the profiler (internal::transform<ContainerOut>(reuse_t{}, f, std::forward<ContainerIn>(xs));), but I couldn't really figure out what the actual code is that this executes.

    I also saw that you seem to pre-instantiate some convolution functions for common kernels. Most of my convolution kernels are 3x3, and it looks like you only instantiate n x m kernels for n and m equal to 1 and 2. Could it help to add 3x3 there?

    So yeah, I'm really not sure about all of it. If indeed the majority of the time is spent in Eigen's functions, then the RowMajor thing could indeed be a major problem.

    I'm happy to send you the model and an example input via email if you wanted to have a look.

    (Screenshots of the profiler were attached to the original issue.)

    Thank you very much!

    enhancement 
    opened by patrikhuber 32
  • Input to model

    If I have an RGB image and I want to pass it to the model, what should I do?

    What I've done is flatten the input image into a vector of floats; I appended the r, g, b values after each other to get a single vector called "input_vector".

    And then this is the next step:

    typedef fplus::shared_ref<std::vector<float>> shared_float_vec;
    shared_float_vec x(fplus::make_shared_ref<vector<float>>(std::move(input_vector)));
    const auto result = decision_model.predict({fdeep::tensor3(fdeep::shape3(3,60,60),x)});
    

    The output is incorrect. What should I do, or what have I done wrong?

    opened by rmmal 32
  • Lambda layer using tf.image

    I am using a Lambda layer which includes this function to extract patches from an image:

    patch_one = tf.image.extract_glimpse(inputs[0], [26, 26], inputs[1][:, j, :], centered=False, normalized=False, noise='zero')

    Is it possible to implement this custom layer in your library and load the model?

    opened by katmatus 30
  • Stop at the 'Loading json ...'

    Hi Tobias, thanks for this great library! I trained a ResNet50 network using Keras. I was able to convert the .h5 model to a .json. However, when I run the program as follows:

    #include <fdeep/fdeep.hpp>
    #include <opencv2/opencv.hpp>
    
    int main()
    {
    	cv::Mat image = cv::imread("Image_1_2.jpg");
    	cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    	assert(image.isContinuous());
    	const auto model = fdeep::load_model("train7.json");
    	// Use the correct scaling, i.e., low and high.
    	const auto input = fdeep::tensor5_from_bytes(image.ptr(),
    		static_cast<std::size_t>(image.rows),
    		static_cast<std::size_t>(image.cols),
    		static_cast<std::size_t>(image.channels()),
    		0.0f, 1.0f);
    	const auto result = model.predict_class({ input });
    	std::cout << result << std::endl;
    	system("pause");
    }
    

    It's like the example in the FAQ ("How to use images loaded with OpenCV as input for a model?"), but it doesn't work with my Keras model. It just spent about 236s loading the json and then stopped there. My CPU is a Core i5-3230M, which is not a good CPU. My model is used to classify 7 kinds of algae cells and uses transfer learning based on ResNet50.
    The Python program for training the model is as follows:

    import numpy as np
    import matplotlib.pyplot as plt
    import keras
    from keras.preprocessing import image
    from keras.preprocessing.image import ImageDataGenerator
    from keras.applications import ResNet50
    from keras.applications.resnet50 import preprocess_input
    from keras import Model, layers
    from keras.models import load_model
    
    input_path = "data/LvsRod/"
    
    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        preprocessing_function=preprocess_input)
    
    train_generator = train_datagen.flow_from_directory(
        input_path + 'train',
        batch_size=10,
        class_mode='binary',
        target_size=(224, 224))
    
    validation_datagen = ImageDataGenerator(
        rescale=1. / 255,
        preprocessing_function=preprocess_input)
    
    validation_generator = validation_datagen.flow_from_directory(
        input_path + 'validation',
        shuffle=False,
        class_mode='binary',
        target_size=(224, 224))
    
    conv_base = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
    
    for layer in conv_base.layers:
        layer.trainable = False
    
    x = conv_base.output
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    predictions = layers.Dense(7, activation='softmax')(x)
    model = Model(conv_base.input, predictions)
    
    optimizer = keras.optimizers.SGD(lr=1e-4, momentum=0.9)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    
    history = model.fit_generator(generator=train_generator,
                                  steps_per_epoch=10,  # added in Kaggle
                                  epochs=30,
                                  validation_data=validation_generator,
                                  validation_steps=10  # added in Kaggle
                                 )
    
    # save
    model.save('train7.h5')
    

    The h5 model can be downloaded from the following link:

    URL:https://pan.baidu.com/s/1YkuBHBkjjUs2dcpc8XTLqA
    Extraction code:1od1

    Because the file is too big, I cannot upload it here. I really want to know how to solve this problem.

    opened by callmefish 28
  • Bad performance.

    Hi Tobias,

    I am getting bad performance when using frugally-deep and I wanted to ask you for some advice. Of course, I've read the FAQ section about performance, so I've got that covered.

    Here is what I've tested so far:

    | Environment        | Description                                                        | Time  |
    |--------------------|--------------------------------------------------------------------|-------|
    | Python             | Default settings (GPU ON)                                          | 35ms  |
    | Python             | os.environ['CUDA_VISIBLE_DEVICES']='-1'                            | 45ms  |
    | Python             | NO GPU and tf.config.threading.set_intra_op_parallelism_threads(1) | 75ms  |
    | Visual Studio 2017 | Default (Release -O2, whole program optimization)                  | 310ms |
    | Visual Studio 2017 | Compiled with AVX2                                                 | 280ms |

    It is quite interesting that a single switch (AVX2) gave me a 10% boost! But it is still far, very far, from what you have advocated.

    I ran a benchmark; the results were attached as a screenshot to the original issue.

    Any ideas? Could I send you my model and example code? (Privately, as this is for a job; I will be happy to support you if I get paid for the project :))

    opened by TrueWodzu 27
  • Cannot load InceptionV3 model

    So far, I have successfully loaded some models and run predictions with them.

    Yet, when I tried to load the InceptionV3 model, I got an error. There were no errors when I converted the model from 'h5' to 'json', but the code below does not work.

    (The code and the resulting error message were attached as screenshots to the original issue.)

    opened by Terminou 25
  • Frugally LSTM Encoder-Decoder results different from Keras/Tensorflow LSTM Encoder-Decoder (missing support for initial_state)

    Hi @Dobiasd

    I have been working on the Encoder-Decoder model for Vehicle Path Forecasting since you added support for returned_states and show_tensor5 on LSTM-based models. The workflow of the project was described in this past issue. After some experiments, the LSTM-based encoder and decoder models are not giving me any problems related to returned_states = True or show_tensor5, confirming the frugally-deep fixes worked. However, I have been trying to replicate the results I obtained using the Keras/TensorFlow models, without success.

    The frugally-deep fdeep_encoder_model_NT is returning the exact same encoder_hidden_state and encoder_cell_state states compared to its TF + Keras counterpart using encoder_model.hdf5. However, the fdeep_decoder_model_NT is not giving me the same decoder_hidden_state and decoder_cell_state output states (compared to the results using the TF + Keras encoder_model.hdf5) :(

    Specifically, I developed the decoder inference model using TF + Keras (please refer to past comments in this issue for the corresponding code) and then converted it from .hdf5 to .json, ready to be ported into the C++ application (same as with the encoder model). Validating the encoder states works as expected. However, both the frugally-deep decoder_hidden_state and decoder_cell_state differ from the corresponding Keras-based decoder_hidden_state and decoder_cell_state, resulting, as expected, in a wrong bounding box prediction which does not match the corresponding Keras results. (Validation screenshots were attached to the original issue.) I do not really know what is happening with fdeep_decoder_model_NT, so I have various options in mind:

    • I have trained another model using LSTM instead of CuDNNLSTM layers in order to check if the problem is with the CuDNNLSTM layer implementation. However, the problem is still present when using other LSTM-based cells like CuDNNLSTM and LSTM. The fdeep_encoder_model works well, but fdeep_decoder_model is still making wrong predictions (both for the returned states and the next bbox prediction).
    • Now I am working on the main.cpp file. Maybe the problem is inside my internal manipulation of fdeep::tensor5 and fdeep::tensor5s when feeding the data into the ported models. However, both models are working well, except that the decoder model is making (inaccurate) predictions of future bounding boxes, and it did not crash in any step of the script execution.
    • I am puzzled about the following fact: In main.cpp the decoder's prediction is made with the following command: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(0), encoder_states.at(1)});, where encoder_states.at(0) and encoder_states.at(1) represent h_enc and c_enc respectively. However, I tried interchanging the encoder states at the input of the decoder prediction line like this: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)}); and obtained the exact same predicted_next_box (even though I interchanged the input order of the decoder states at the prediction function).
    • Finally, apart from the wrong values of h_dec and c_dec returned by fdeep_decoder_model, I noticed both h_dec hidden states (from frugally-deep AND Keras) are in the range [-1, 1], but that is not the case for the c_dec cell states. In Keras, c_dec has values from [-11, 11] but, in frugally-deep, c_dec takes values from [-1, 1]. In addition, based on your suggestion about internal scaling causing this kind of issue, by inspecting fdeep_encoder_model.json I found some initializer parameters using Variance_Scaling that may be the cause of errors at inference time. I think maybe this is at the root of the problem, but I have no idea how to get the correct h_enc and c_enc, both within the same ranges used in Keras and with the correct values as well.

    Here is the main.cpp file I am running to test the results. Any comment or suggestion about the code would be welcomed!

    #include <fdeep/fdeep.hpp>
    #include <vector>
    #include <fstream>
    #include <iostream>
    
    int main()
    {
    	// Loading the previously trained models
    	const auto encoder_model = fdeep::load_model("fdeep_encoder_model_NT.json");
    	std::cout << "Encoder Model Loaded!" << std::endl;
    	const auto decoder_model = fdeep::load_model("fdeep_decoder_model_NT.json");
    	std::cout << "Decoder Model Loaded!" << std::endl;
    	// Batch_size = 1, num_timesteps = 10 and num_features = 4
    	fdeep::shape5 in_traj_shape(1,1,1,10,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> src_traj  = {1728, 715, 191, 221,
    					1717, 710, 202, 215,
    					1706, 704, 206, 198,
    					1695, 700, 217, 196,
    					1687, 696, 228, 183,
    					1680, 689, 240, 181,
    					1668, 668, 240, 198,
    					1661, 668, 243, 194,
    					1650, 664, 251, 189,
    					1635, 660, 266, 181};
    	// Input trajectory from vector to tensor5 data structure
    	const fdeep::shared_float_vec shared_traj(fplus::make_shared_ref<fdeep::float_vec>(src_traj));
    	const fdeep::tensor5 encoder_inputs(in_traj_shape, shared_traj);
    	std::cout << "Trajectory #0!" << fdeep::show_tensor5(encoder_inputs) << std::endl;
    	// Using loaded encoder model to predict encoder output states
    	// Then encoder_states can be fed as input tensors into decoder_model
    	const auto encoder_states = encoder_model.predict({encoder_inputs});
    	// Printing for debugging purposes
    	std::cout << "h_enc: "<< fdeep::show_tensor5(encoder_states.at(0)) << std::endl;
    	std::cout << "c_enc: "<< fdeep::show_tensor5(encoder_states.at(1)) << std::endl;
    	// Creating a SOS input sequence token to signal decoder model to start making predictions
    	fdeep::shape5 bbox_shape(1,1,1,1,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> SOS_token  = {9999.0, 9999.0, 9999.0, 9999.0};
    	const fdeep::shared_float_vec shared_SOS_token(fplus::make_shared_ref<fdeep::float_vec>(SOS_token));
    	fdeep::tensor5 target_seq(bbox_shape, shared_SOS_token);
    	// In Python we have: Prediction, h, c = decoder_model.predict([target_seq] + state)
    	auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)});
    	// Printing for debugging purposes
    	std::cout << "h_dec: "<< fdeep::show_tensor5(decoder_outputs.at(1)) << std::endl;
    	std::cout << "c_dec: "<< fdeep::show_tensor5(decoder_outputs.at(2)) << std::endl;
    	std::cout << "Predicted next bounding box!" << fdeep::show_tensor5(decoder_outputs.at(0)) << std::endl;
    }
    

    The fdeep_encoder_model_NT.json model imported into the C++ application is available to download and inspect from this past comment. The fdeep_decoder_model_NT.json can be downloaded from the following link: Decoder model: https://drive.google.com/open?id=1hwrjcnNfWaqQI0o8TmJKtfsAwj6zd9aq I would really appreciate any help with this issue. I am puzzled because the encoder model is working perfectly but the decoder model is not; specifically, the results of the Keras and frugally-deep decoder models differ, giving me wrong output predictions that cannot be used at all.

    opened by MarlonCajamarca 25
  • `visualize_layers.py` uses `scipy.misc.imsave` which no longer exists

    The documentation suggests switching to imageio.imwrite instead: https://docs.scipy.org/doc/scipy-1.2.1/reference/generated/scipy.misc.imsave.html

    There's even a migration guide: https://imageio.readthedocs.io/en/v2.6.1/scipy.html

    Another alternative would be keras.preprocessing.image.save_img.

    opened by torokati44 0
  • Modify Unit Tests CmakeLists and INSTALL.md

    Modify the unit tests' CMakeLists.txt to let CMake detect Python for executing commands instead of using "python3 xxxx", because not all users can run Python scripts via "python3". The command that converts h5 to json may fail because of the command "python3". I added find_package to detect Python, check pip.exe, pip3.exe, etc., and verify via "pip show tensorflow" that the user has installed TensorFlow. The requirements for Python and TensorFlow are written down in INSTALL.md.

    opened by sirius-william 4
  • Thanks !

    Many thanks to the project author! My graduate design project is a one-dimensional convolutional neural network. After training with Python's TensorFlow 2.10, I had been looking for ways to deploy the model in my Qt project. I tried to compile TensorFlow C++ (compilation always fails), the TensorFlow C API (TensorFlow 2.10 is not supported), TensorRT (the AMD graphics driver is not supported), and OpenVINO (the network architecture I chose is not supported). By chance, I found this library on Google. It is easy to use and does not require many dependencies - it only requires header files. It perfectly solves my project needs. Thank you!

    PS. The Python script part of the tests is executed from CMakeLists.txt using "python3 xxxx". However, not all users can run Python scripts through the command 'python3'. It is recommended to find Python in CMakeLists.txt, or to let users specify the Python path. In addition, MinGW reports "Fatal error: can't write 286 bytes to section .text" when compiling the unit tests. It is recommended to add:

    target_compile_options(PROJECT_NAME PRIVATE $<$<CXX_COMPILER_ID:MSVC>:/bigobj> $<$<CXX_COMPILER_ID:GNU>:-Wa,-mbig-obj>)

    This problem also arises when the library is used in other projects.

    opened by sirius-william 2
  • Consider having different convolution implementations available and choosing the fastest one at runtime

    Different convolution implementations might perform differently depending on the convolution settings (input size/depth, kernel size/count) and depending on the hardware (mostly CPU/memory) used.

    Right now, for example, we have a special implementation used for 2D convolutions in case strides = (1, 1) (which is utilized not only by the Conv2D layer, but also by DepthwiseConv2D and SeparableConv2D).

    I wonder if it would make sense to provide a function to the user that, when called on a model, tries out different implementations and remembers which one performed best for future calls of model.predict. (Maybe in some settings, even a naive non-im2col convolution is the fastest one.)

    Pros:

    • potentially faster forward passes

    Cons:

    • increased code complexity
    • potentially wrong settings in case the background load on the user's machine varies too much during the evaluation
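
    A minimal sketch of the runtime-selection idea (all names are hypothetical; the candidates stand in for, e.g., im2col-based vs. naive convolution):

    #include <chrono>
    #include <functional>
    #include <vector>

    // Hypothetical: each candidate wraps one convolution implementation
    // applied to the same representative workload.
    using conv_impl = std::function<void()>;

    // Time every candidate once and return the index of the fastest,
    // to be cached and reused for future calls of model.predict.
    std::size_t pick_fastest(const std::vector<conv_impl>& candidates)
    {
        std::size_t best = 0;
        auto best_time = std::chrono::steady_clock::duration::max();
        for (std::size_t i = 0; i < candidates.size(); ++i)
        {
            const auto start = std::chrono::steady_clock::now();
            candidates[i]();
            const auto elapsed = std::chrono::steady_clock::now() - start;
            if (elapsed < best_time)
            {
                best_time = elapsed;
                best = i;
            }
        }
        return best;
    }
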
    opened by Dobiasd 0
  • Feature Suggestion: Support Transformer Models

    First off, I would like to say that this is a really great piece of work! I have been using it with LSTMs for time-series data and have found frugally-deep to be invaluable. I am starting to investigate Transformers in order to see how they stack up to LSTMs and it would be wonderful if support for Transformer models could be added. I am in the early stages of working with Transformers, but the specific layers that I currently do not see supported are: MultiHeadAttention and LayerNormalization.

    help wanted 
    opened by jonathan-lazzaro-nnl 11
  • Feature suggestion: Support ONNX models?

    How about supporting ONNX in frugally-deep? You could have a protobuf importer for ONNX models or add a tool that converts ONNX to the JSON format you use. Just a thought. A header-only ONNX inference engine would be very, very useful.

    opened by pfeatherstone 24