Deep Learning GPU Training System

Overview

DIGITS

DIGITS (the Deep Learning GPU Training System) is a webapp for training deep learning models. The currently supported frameworks are Caffe, Torch, and TensorFlow.

Feedback

In addition to submitting pull requests, feel free to submit and vote on feature requests via our ideas portal.

Documentation

The current and most up-to-date documentation is available at NVIDIA Accelerated Computing, Deep Learning Documentation, NVIDIA DIGITS.

Installation

| Installation method | Supported platform[s] | Available versions | Instructions |
| --- | --- | --- | --- |
| Source | Ubuntu 14.04, 16.04 | GitHub tags | docs/BuildDigits.md |

The official DIGITS container is available from nvcr.io via the docker pull command.

Usage

Once you have installed DIGITS, visit docs/GettingStarted.md for an introductory walkthrough.

Then, take a look at some of the other documentation at docs/ and examples/:

Get help

Installation issues

  • First, check out the instructions above
  • Then, ask questions on our user group

Usage questions

Bugs and feature requests

Notice on security

Users shall understand that DIGITS is not designed to be run as an exposed external web service.

Comments
  • Torch Data Augmentation

    Torch Data Augmentation

    Data augmentation needs little introduction, I reckon. It counters overfitting and makes your model generalize better, yielding better validation accuracies; or alternatively, allows you to use smaller datasets with similar performance.

    In the zoo that is the internet, I see many implementations of different augmentations, of which few are proper and nicely portable. Apart from DIGITS offering a great UI, ease of use, and a turn-key deep learning solution, I strongly feel we can expand on the functional side as well to make this a deep learning killer app.

    For torch, I have made an implementation during lua preprocessing from frontend to backend to enable Digits to do so. In #330 there was already an attempt for augmentation, which happened on the dataset-creation side; something I am strongly against. Resizing and cropping I would consider a transformation, while I consider augmenting the data in its container an augmentation. I think therefore it's fine to resize during dataset loading (and squashing/filling/etc), but I would probably leave it at that.

    Anyway, I set up a more dynamic structure to pass these options around on the Torch side; instead of adding a dozen arguments to each function, I am just adding a table.

    Implements the following (screenshot): image

    I have iterated through many augmentation types but these were the most useful. Almost done, now running elaborate tests.

    Progress

    The code is already functional, though see progress below. See code, shoot!

    Features

    • [x] Make UI data transforms only visible for the Torch framework (invisible for Caffe)
    • [x] ~~Implement UI option for normalization (scales the [0 255] to [0 1])~~
    • [x] Data Augmentation UI
    • [x] Flips (mirrors)
    • [x] Quadrilateral rotations
    • [x] Arbitrary rotations
    • [x] Arbitrary scales
    • [x] Augmenting in HSV space
    • [x] Augmenting with noise (Thoughts?)
    • [x] [Travis] Tests
    • [x] Use Data Augmentation Template: data_augmentation.html
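
As an illustration of the two simplest augmentations in the list above (flips and quadrilateral rotations), here is a minimal Python sketch that operates on an image stored as a nested list. It is not the Lua/Torch code from this pull request; the function names are hypothetical and chosen only for this example.

```python
def flip_horizontal(image):
    """Mirror every row left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in reversed(image)]


def rotate_quarter_turns(image, turns):
    """Rotate clockwise by turns * 90 degrees (a quadrilateral rotation)."""
    result = [list(row) for row in image]
    for _ in range(turns % 4):
        # Reversing the row order and transposing gives one clockwise quarter turn.
        result = [list(column) for column in zip(*result[::-1])]
    return result


image = [[1, 2, 3],
         [4, 5, 6]]

mirrored = flip_horizontal(image)
rotated = rotate_quarter_turns(image, 1)
```

Both operations only rearrange pixels, so they need no interpolation, unlike arbitrary rotations and scales.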

    Testing

    • [x] No augmentation
    • [x] Flips (mirrors)
    • [x] Quadrilateral rotations
    • [x] Arbitrary rotations
    • [x] Arbitrary scales
    • [x] Arbitrary rotations & arbitrary scales
    • [x] Augmenting in HSV space
    • [x] Augmenting with noise
    • [x] All Augmentations & benchmark speed; identify bottlenecks
    • [x] Verify that models show the expected trade-off: slower learning, but less overfitting and more generalization.
    enhancement torch 
    opened by TimZaman 46
  • running on multiple GPU is very slow

    running on multiple GPU is very slow

    I am trying to run a 50-layer residual network on 4 K40m GPUs and it is very slow (same batch_size of 16 as when running on a single GPU); it takes 6 hours for 1 epoch. However, if I run it on 1 GPU, the speed is normal.

    System: CentOS, digits v3, nvcaffe-0.14

    BTW, I tried using GoogLeNet and it was OK on 4 GPUs.

    Any suggestion or potential issue?

    duplicate 
    opened by 201power 37
  • ERROR: Expected caffe suffix

    ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?

    Hi,

    I'm running on Ubuntu 14.04 LTS.

    ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?

    [email protected]:~/digits$ pip install -r requirements.txt
    You are using pip version 7.0.3, however version 7.1.0 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    Requirement already satisfied (use --upgrade to upgrade): Pillow>=2.3.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 1))
    Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 2))
    Requirement already satisfied (use --upgrade to upgrade): scipy>=0.13.3 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 3))
    Collecting protobuf>=2.5.0 (from -r requirements.txt (line 4))
      Downloading protobuf-2.6.1.tar.gz (188kB)
        100% |████████████████████████████████| 188kB 2.3MB/s 
    Collecting pydot>=1.0.2 (from -r requirements.txt (line 5))
      Downloading pydot-1.0.2.tar.gz
    Requirement already satisfied (use --upgrade to upgrade): six>=1.5.2 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 6))
    Requirement already satisfied (use --upgrade to upgrade): requests>=2.2.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 7))
    Requirement already satisfied (use --upgrade to upgrade): gevent>=1.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 8))
    Requirement already satisfied (use --upgrade to upgrade): Flask>=0.10.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 9))
    Collecting Flask-WTF>=0.11 (from -r requirements.txt (line 10))
      Downloading Flask_WTF-0.12-py2-none-any.whl
    Collecting Flask-SocketIO (from -r requirements.txt (line 11))
      Downloading Flask-SocketIO-0.6.0.tar.gz
    Collecting lmdb (from -r requirements.txt (line 12))
      Downloading lmdb-0.86.tar.gz (144kB)
        100% |████████████████████████████████| 147kB 2.9MB/s 
    Requirement already satisfied (use --upgrade to upgrade): nose>=1.3.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 13))
    Requirement already satisfied (use --upgrade to upgrade): mock>=1.0.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 14))
    Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4>=4.2.1 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 15))
    Requirement already satisfied (use --upgrade to upgrade): selenium>=2.25.0 in /home/ubuntu/anaconda/lib/python2.7/site-packages (from -r requirements.txt (line 16))
    Collecting gunicorn (from -r requirements.txt (line 17))
      Downloading gunicorn-19.3.0-py2.py3-none-any.whl (110kB)
        100% |████████████████████████████████| 110kB 3.8MB/s 
    Requirement already satisfied (use --upgrade to upgrade): setuptools in /home/ubuntu/anaconda/lib/python2.7/site-packages/setuptools-17.1.1-py2.7.egg (from protobuf>=2.5.0->-r requirements.txt (line 4))
    Requirement already satisfied (use --upgrade to upgrade): pyparsing in /home/ubuntu/anaconda/lib/python2.7/site-packages (from pydot>=1.0.2->-r requirements.txt (line 5))
    Requirement already satisfied (use --upgrade to upgrade): Werkzeug in /home/ubuntu/anaconda/lib/python2.7/site-packages (from Flask-WTF>=0.11->-r requirements.txt (line 10))
    Collecting WTForms (from Flask-WTF>=0.11->-r requirements.txt (line 10))
      Downloading WTForms-2.0.2-py27-none-any.whl (128kB)
        100% |████████████████████████████████| 131kB 3.3MB/s 
    Collecting gevent-socketio>=0.3.6 (from Flask-SocketIO->-r requirements.txt (line 11))
      Downloading gevent_socketio-0.3.6-py27-none-any.whl
    Requirement already satisfied (use --upgrade to upgrade): gevent-websocket in /home/ubuntu/anaconda/lib/python2.7/site-packages (from gevent-socketio>=0.3.6->Flask-SocketIO->-r requirements.txt (line 11))
    Installing collected packages: protobuf, pydot, WTForms, Flask-WTF, gevent-socketio, Flask-SocketIO, lmdb, gunicorn
      Running setup.py install for protobuf
      Running setup.py install for pydot
      Running setup.py install for Flask-SocketIO
      Running setup.py install for lmdb
    Successfully installed Flask-SocketIO-0.6.0 Flask-WTF-0.12 WTForms-2.0.2 gevent-socketio-0.3.6 gunicorn-19.3.0 lmdb-0.86 protobuf-2.6.1 pydot-1.0.2
    [email protected]:~/digits$ sudo apt-get install graphviz
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    graphviz is already the newest version.
    The following packages were automatically installed and are no longer required:
      linux-headers-3.13.0-49 linux-headers-3.13.0-49-generic
      linux-image-3.13.0-49-generic linux-image-extra-3.13.0-49-generic
    Use 'apt-get autoremove' to remove them.
    0 upgraded, 0 newly installed, 0 to remove and 267 not upgraded.
    [email protected]:~/digits$ ./digits-devserver
      ___ ___ ___ ___ _____ ___
     |   \_ _/ __|_ _|_   _/ __|
     | |) | | (_ || |  | | \__ \
     |___/___\___|___| |_| |___/
    
    Welcome to the DIGITS config module.
    
    Where is caffe installed?
        (enter "SYS" if installed system-wide)
        [default is SYS]
    (q to quit) >>> SYS
    ERROR: Expected caffe suffix "-nv". libcaffe.so does not match. Are you building from the NVIDIA/caffe fork?
    
    (q to quit) >>> 
    
    caffe 
    opened by dbl001 35
  • Accuracy & confusion matrix

    Accuracy & confusion matrix

    See #17

    Adds a new kind of job for evaluating the performance of trained classifiers. It is now possible to visualize:

    • accuracy / recall curve
    • confusion matrix

    Accuracy and the confusion matrix are computed against a chosen snapshot of a training task, and against both the validation set and testing set (if it exists). An "evaluate performance" button has been added on the training view. This is currently the only way to run an evaluation job. The results are stored in the job directory in the form of two pickle files.

    button

    Accuracy / recall curve

    accuracy recall curve

    Confusion matrix

    I chose a very simple representation of the confusion matrix (not in the form of a matrix!), because it is better suited to datasets with lots of classes. For each class, the top 10 most represented classes are displayed, with their respective percentages.
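
The per-class summary described above could be computed along these lines. This is a minimal sketch, not the code from this pull request: the function name `top_confusions` and the toy labels are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_confusions(true_labels, predicted_labels, k=10):
    """For each true class, list the k most frequent predicted classes with their share in percent."""
    counts = defaultdict(Counter)
    for truth, predicted in zip(true_labels, predicted_labels):
        counts[truth][predicted] += 1

    summary = {}
    for truth, row in counts.items():
        total = sum(row.values())
        summary[truth] = [
            (label, 100.0 * n / total) for label, n in row.most_common(k)
        ]
    return summary


truth = ["cat", "cat", "cat", "cat", "dog", "dog"]
predicted = ["cat", "cat", "cat", "dog", "dog", "dog"]
summary = top_confusions(truth, predicted)
```

Keeping only the top k entries per class keeps the display readable when a full N x N matrix would be too large to show.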

    confusion matrix

    Related jobs

    I added a "Related jobs" section to each job's show view. It displays the jobs that depend on the current job, for example models trained on a specific dataset or evaluations run on a specific model.

    Related jobs

    Let me know what you think, critiques and comments are more than welcome.

    opened by groar 29
  • Windows Compatibility

    Windows Compatibility

    On my machine, image serving (e.g. of mean.jpg) does not work. The browser (tested with IE and Chrome) cannot interpret the image, probably because the content type is missing. The send_file function takes care of all of that.
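
To illustrate the kind of lookup involved, here is a small sketch that derives a content type from a file name using Python's standard `mimetypes` module. It is not the change made in this pull request and not Flask code; the helper name `content_type_for` and the fallback value are assumptions for this example.

```python
import mimetypes


def content_type_for(path, default="application/octet-stream"):
    """Guess the Content-Type header value for a file from its name."""
    guessed, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or default


mean_image_type = content_type_for("mean.jpg")
unknown_type = content_type_for("mean")
```

A response that carries an explicit image content type gives the browser what it needs to render the file instead of guessing.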

    windows 
    opened by crohkohl 27
  • Add support for HDF5 datasets

    Add support for HDF5 datasets

    Closes #224

    TODO before merge

    • [x] Create models from HDF5 datasets using HDF5Data layers
    • [x] Expose backend and compression information in REST API
    • [x] Shard HDF5 files into acceptable dataset sizes - https://github.com/BVLC/caffe/issues/2953#issuecomment-137274066

    TODO after merge

    • Allow non-image data (see #197)
    • Analyze prebuilt HDF5 datasets in "generic" path
    enhancement 
    opened by lukeyeager 26
  • Set map_size for LMDB

    Set map_size for LMDB

    @crohkohl, @danst18, I'm breaking the discussion in #203 out into a new issue.

    Here's the situation as I understand it. Please correct me if any of this is wrong.

    | map_size | Linux | OSX & Windows |
    | --- | --- | --- |
    | lower than size of dataset | LMDB runs out of memory | ? |
    | higher than system memory | No problem | LMDB can't allocate enough memory |

    On Linux, you can just set it as high as you like and never see a problem. But that strategy blows up on other platforms.

    Should [map_size] be made configurable? https://github.com/NVIDIA/DIGITS/pull/203#issuecomment-128859465

    This is a sufficient but lazy solution. I would like to understand whether this can be avoided programmatically somehow before making a decision. My googling skills are failing me.
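
One possible programmatic approach is sketched below, purely as a starting point for discussion. It assumes the dataset size can be estimated up front; the function name `choose_map_size`, the 1 TiB Linux value, and the headroom factor are all hypothetical. Whether a dataset-proportional value is safe on OSX and Windows is exactly the open "?" in the table, so treat that branch as an assumption.

```python
import sys

TIB = 1024 ** 4


def choose_map_size(estimated_dataset_bytes, platform=sys.platform, headroom=2.0):
    """Pick an LMDB map_size in bytes from an estimated dataset size."""
    needed = int(estimated_dataset_bytes * headroom)
    if platform.startswith("linux"):
        # Per the table above, over-reserving causes no problem on Linux.
        return max(needed, TIB)
    # Elsewhere, stay close to what the dataset needs (assumption, untested).
    return needed


ten_gib = 10 * 1024 ** 3
linux_size = choose_map_size(ten_gib, platform="linux")
windows_size = choose_map_size(ten_gib, platform="win32")
```

The result would then be passed as the `map_size` argument when opening the environment with `lmdb.open`.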

    question 
    opened by lukeyeager 26
  • can't find hdf5.h when build caffe

    can't find hdf5.h when build caffe

    I want to install DIGITS on Debian jessie.
    When I build Caffe (NVIDIA's fork), I get errors complaining that hdf5.h cannot be found.

    I'm sure I have installed libhdf5-serial-dev and libhdf5-dev, and I found the header file in /usr/include/hdf5/serial and its libraries in /usr/lib/x86_64-linux-gnu.

    So, what's wrong? Can someone help me?

    The build error message is shown below:

    (venv)➜  caffe  make all --jobs=4
    CXX src/caffe/layer_factory.cpp
    CXX src/caffe/util/insert_splits.cpp
    CXX src/caffe/util/db.cpp
    CXX src/caffe/util/upgrade_proto.cpp
    In file included from src/caffe/util/upgrade_proto.cpp:10:0:
    ./include/caffe/util/io.hpp:8:18: fatal error: hdf5.h: no such file or directory
     #include "hdf5.h"
                      ^
    compilation terminated.
    Makefile:512: recipe for target '.build_release/src/caffe/util/upgrade_proto.o' failed
    make: *** [.build_release/src/caffe/util/upgrade_proto.o] Error 1
    make: *** Waiting for unfinished jobs....
    In file included from ./include/caffe/common_layers.hpp:10:0,
                     from ./include/caffe/vision_layers.hpp:10,
                     from src/caffe/layer_factory.cpp:6:
    ./include/caffe/data_layers.hpp:9:18: fatal error: hdf5.h: no such file or directory
     #include "hdf5.h"
                      ^
    compilation terminated.
    Makefile:512: recipe for target '.build_release/src/caffe/layer_factory.o' failed
    make: *** [.build_release/src/caffe/layer_factory.o] Error 1
    
    question caffe platform 
    opened by tangshi 26
  • mAP always zero

    mAP always zero

    I can't figure out why my model training mAP (val) doesn't get above zero. I'm trying to use the same approach and the SpaceNet_DetectNet_Train_Val.prototxt from this article.

    My label files 000n.txt look like this: p 0.0 0 0.0 0 0 24 118 0 0 0 0 0 0 0 0
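
For reference, here is how the label line above reads if it is interpreted as a KITTI-style annotation (type, truncated, occluded, alpha, then the bounding box as left, top, right, bottom). That the file is meant to follow this layout is an assumption, as are the names `KittiLabel` and `parse_label_line`; the sketch only decodes the first eight fields.

```python
from typing import NamedTuple


class KittiLabel(NamedTuple):
    object_type: str
    truncated: float
    occluded: int
    alpha: float
    left: float
    top: float
    right: float
    bottom: float


def parse_label_line(line):
    """Decode the first eight whitespace-separated fields of a KITTI-style label line."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 fields, got %d" % len(fields))
    return KittiLabel(
        fields[0],
        float(fields[1]),
        int(fields[2]),
        float(fields[3]),
        *(float(value) for value in fields[4:8])
    )


label = parse_label_line("p 0.0 0 0.0 0 0 24 118 0 0 0 0 0 0 0 0")
box_width = label.right - label.left
box_height = label.bottom - label.top
```

Under this reading the box spans 24 x 118 pixels starting at the top-left corner of the image, which may be worth comparing against where the object actually sits in a 1280x1280 image.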

    My images are 1280x1280, and I'm using these custom classes: dontcare,p

    image

    Where am I going wrong?

    object-detection 
    opened by DarylWM 25
  • CUDNN_STATUS_BAD_PARAM

    CUDNN_STATUS_BAD_PARAM

    Ubuntu 14.04 LTS, clean install, NVIDIA dpkg install

    $ sudo apt-get install cuda
    $ sudo apt-get install digits
    
    $ gedit .bashrc
    Add the following at the end of the file:
    
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    
    $ sudo reboot
    
    $ nvidia-smi
    Tue May 31 13:32:37 2016       
    +------------------------------------------------------+                       
    | NVIDIA-SMI 352.93     Driver Version: 352.93         |                       
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 960     Off  | 0000:01:00.0      On |                  N/A |
    | 20%   37C    P8    10W / 160W |    289MiB /  4095MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   1  GeForce GTX 960     Off  | 0000:02:00.0     Off |                  N/A |
    | 20%   43C    P8     9W / 160W |     13MiB /  4095MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    
    $ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2015 NVIDIA Corporation
    Built on Tue_Aug_11_14:27:32_CDT_2015
    Cuda compilation tools, release 7.5, V7.5.17
    

    Then I ran DIGITS and created a dataset:

    MNIST, Image Size 28x28, Image Type GRAYSCALE

    Created an Image Classification Model

    Selected Caffe and LeNet

    Ran it, and the following error was raised:

    ERROR: Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

    bug 
    opened by shinfo001 25
  • Error: status == CUDNN_STATUS_SUCCESS (8 vs. 0)  CUDNN_STATUS_EXECUTION_FAILED

    Error: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED

    I am getting this error when trying to run training with my custom network.

    status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED

    I found this post that refers to this error: https://github.com/BVLC/caffe/issues/1700#issuecomment-133476490

    But it doesn't specify where or how to fix it. Also I am not sure if the issues are related or something completely different. Let me mention that this custom framework works perfectly fine when I run it in my local caffe install, and I can also see all the nodes if I hit the visualize button. It starts training and fails after the first epoch.

    pasted_image_at_2015_08_21_12_18_am

    bug 
    opened by alfredox10 24
  • Fix TypeError

    Fix TypeError

    File "/opt/digits/digits/extensions/data/imageSegmentation/data.py", line 225, in split_image_list
      random.shuffle(self.random_indices)
    File "/usr/lib/python3.8/random.py", line 307, in shuffle
      x[i], x[j] = x[j], x[i]
    TypeError: 'range' object does not support item assignment
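
The traceback above is what Python 3 produces when `random.shuffle` is given a `range` object, which is immutable. A minimal sketch of the usual remedy, materializing the range into a list first, follows; the helper name `shuffled_indices` is hypothetical and this is not necessarily the exact change in this pull request.

```python
import random


def shuffled_indices(count, seed=None):
    """Return the indices 0..count-1 in random order."""
    rng = random.Random(seed)
    # range() is an immutable sequence in Python 3, so copy it into a list
    # before shuffling in place.
    indices = list(range(count))
    rng.shuffle(indices)
    return indices


indices = shuffled_indices(10, seed=0)
```

In Python 2, `range` returned a list, which is why the original code worked there.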

    opened by vertexodessa 0
  • DIGITS DOCKER CONTAINER INSTALLING SUNNY PLUGIN

    DIGITS DOCKER CONTAINER INSTALLING SUNNY PLUGIN

    I'm sorry to bother you. I'm trying to install the Sunnybrook plugin for the segmentation example in the Docker container, as I want to run it on the TensorFlow backend (not Caffe). I tried to repeat the install procedure from inside the container by running docker exec -it XXXXX bash (XXXXX being the container ID), then downloading the plugin from https://github.com/NVIDIA/DIGITS/tree/master/plugins/data and following the install procedure, but it does not work. Is there an official way to do this? I ran pip install --ignore-installed setuptools (no error appeared):

    Installing collected packages: setuptools Successfully installed setuptools-44.1.1

    Then I ran git clone https://github.com/NVIDIA/DIGITS.git, changed to /DIGITS/plugins/data/sunnybrook with cd, and finally ran pip install . No error appeared, but after restarting Docker, creating a Sunnybrook dataset fails (I have posted the error separately in the following post, for clarity).

    Can you help please? Kind regards

    opened by crmuinos 1
  • I'm confused between which version of DIGITS to install

    I'm confused between which version of DIGITS to install

    Apologies in advance since I'm new to all this, but I'm confused about which version of DIGITS to install. I'm beginning a fresh install of the latest Ubuntu version and, after hours of scouring the internet, I have found DIGITS versions that work standalone and versions that work in Docker; then there's the official DIGITS GitHub page, which has DIGITS up to version 6, and on NGC there's DIGITS 20.03???

    What is going on? I'm so confused. I was excited to get DIGITS up and running on my local machine as soon as I had completed the NVIDIA DLI course, and now I'm stumped as to where to start. I would also like to know how DIGITS for TensorFlow differs from DIGITS for Caffe.

    Please help.

    opened by RazaZaidi2802 0
  • cannot see detectnet bounding boxes using Caffe model on Nano

    cannot see detectnet bounding boxes using Caffe model on Nano

    We have trained a custom Caffe DetectNet model in DIGITS and deployed it on the Nano. It works well when running inference in DIGITS, but it does not show bounding boxes when running on the Nano. Is there a patch for this issue?

    opened by eanmikale 0
  • Module Creation errors

    Module Creation errors

    I am trying to train with DIGITS as specified in Hello AI World, and then 4cd6b3f6e3058db2dfd91edaef62c9058f65ab8d

    This is the run output:

    inception_5b/relu_pool_proj ← inception_5b/pool_proj
    inception_5b/relu_pool_proj → inception_5b/pool_proj (in-place)
    Setting up inception_5b/relu_pool_proj
    TRAIN Top shape for layer 158 ‘inception_5b/relu_pool_proj’ 5 128 40 40 (1024000)
    Creating layer ‘inception_5b/output’ of type ‘Concat’
    Layer’s types are Ftype:FLOAT Btype:FLOAT Fmath:FLOAT Bmath:FLOAT
    Created Layer inception_5b/output (159)
    inception_5b/output ← inception_5b/1x1
    inception_5b/output ← inception_5b/3x3
    inception_5b/output ← inception_5b/5x5
    inception_5b/output ← inception_5b/pool_proj
    inception_5b/output → inception_5b/output
    Setting up inception_5b/output
    TRAIN Top shape for layer 159 ‘inception_5b/output’ 5 1024 40 40 (8192000)
    Creating layer ‘pool5/drop_s1’ of type ‘Dropout’
    Layer’s types are Ftype:FLOAT Btype:FLOAT Fmath:FLOAT Bmath:FLOAT
    Created Layer pool5/drop_s1 (160)
    pool5/drop_s1 ← inception_5b/output
    pool5/drop_s1 → pool5/drop_s1
    Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED, device 0

    I am using a 2070 Super.

    Server: 9dca63a42e15
    DIGITS version: 6.1.1
    Caffe version: 0.17.0
    Caffe flavor: NVIDIA

    My brain is soup at this point; please help me out. caffe_output.log

    I have not been able to create a single model yet.

    3f542d1f6aa28d3568d8dcf4a11558753180c8ff

    I am also unable to install DIGITS from source without crashing Ubuntu. Today is May 11 and I have been trying to get it to work since the 7th; please could you help me out? I am really excited about this tool.

    opened by cespedesk 0
Releases(v6.1.1)
  • v6.1.1(Apr 10, 2018)

    Since 6.1.0

    Bugfixes

    • Update for new TF API (#2014)
    • Update CI scripts to add some new deps to Caffe build (#1993)
    • Update import and API for pydicom 1.0
    • Fix label distribution and its view page (#1916)
  • v6.1.0(Dec 12, 2017)

    Since 6.0

    New Features

    • Added functionality to integrate DIGITS with S3 Endpoints (#1868)
    • Added publish to inference server on classification workflow (#1906)

    Bugfixes

    • Fix frozen graph issue (#1907)
    • Fix 404 error for /datasets/inference-form/... from #1888 (#1889)
    • Remove timeout assertion (#1859)

    Changes

    • Various documentation updates

    Known Issues

    • Out of memory error in the semantic-segmentation example when training the FCN AlexNet model on Tesla P100.
  • v6.0.0(Aug 30, 2017)

    See release notes for the 6.0 release candidate.

    Since 6.0 RC1

    New Features

    • Added support for URL prefix (#1803)

    Bugfixes

    • Fixed loading/saving tensorflow models (#1794)

    Changes

    • Various documentation updates

    Known Issues

    • Visualization for Caffe models does not currently work. (#1738)
  • v6.0.0-rc.1(Jul 25, 2017)

    New Features

    • Added a TensorFlow backend for DIGITS as an alternative to Caffe and Torch (#1714)
    • Added examples and support for GANs (#1714)
    • Added support for text classification (#1025)
    • Added more viewing options for image segmentation (#1188)

    Changes

    • HTML embedding now defaults to PNG (#1270)
    • Images that cause exceptions will now show the file name (#1636)

    Bugfixes

    • Fixed softmax visualization issue with scaled images (#1647)
    • Documentation was changed for model store with official pictures (#1650)
    • Fixed Caffe search path in Windows (#1244)
    • Fixed image file entry in Sunnybrook inference form (#1237)
    • Fixed bugs when visiting nested image folder (#1477)

    Known Issues

    • Visualization for Caffe models does not currently work. (#1738)
  • v5.0.0(Feb 2, 2017)

    See release notes for the 5.0 release candidate.

    New since 5.0 RC

    • Enable the DIGITS Model Store (https://github.com/NVIDIA/DIGITS/pull/1308)
    • Fix calculations related to batch accumulation for Caffe (https://github.com/NVIDIA/DIGITS/pull/1307)
    • Various documentation updates
  • v5.0.0-rc.1(Oct 15, 2016)

    279 commits since v4.0.0

    New Features

    • Import pretrained models from a model "store" (#896, #1077, #1161)
    • Support for image segmentation workflows (#830, #961, #1131)
    • Online data augmentation with Torch (#777)
    • Show CPU and system memory utilization during training (#800)
    • Improved bounding-box visualizations for object detection models (#869)
    • Create groups of jobs for easier display on the home page (#734)
    • Reuse data extensions for inference (#1024)
    • Support for plugin extensions (#1093, #927, #947)
    • Add documentation for the REST API (#964)

    Changes

    • Use environment variables for configuration instead of a file (#1091)
    • Remove digits-server and dependency on gunicorn (#1127)
    • digits-devserver is now just a small shell script instead of a Python script (#1121)
    • New design for Torch multi-GPU training (#828)
    • Add Ubuntu 16.04 support by updating dependency versions (#965)
    • Allow testing of only Caffe or only Torch with the testsuite (#1143)
    • Return more info when downloading a model tarball or json (#891)

    Bugfixes

    • Fix bug with Torch and CUDA_VISIBLE_DEVICES (#1130)
    • Fix issues with browsers returning incorrectly cached css and js files (#904)

    Known Issues

    • Training goes on longer than required when using batch accumulation (#1240)
  • v4.0.0(Jul 19, 2016)

    529 commits since v3.0.0

    New Features

    • Add support for object-detection networks like DetectNet (#735) with documentation (#803)
    • Parameter sweep over batch size and learning rate (#708)
    • Show accuracy confusion matrix for "Classify Many" (#608)
    • Test a model with an LMDB (#638)
    • Add basic login functionality (#463)

    Changes

    • Major revamp of home page (#728, #790)
    • Allow use of BVLC/caffe (#769)
    • Run inference jobs in separate processes (#573)

    Bugfixes

    • Made device_query compatible with CUDA 8.0 (#890)

    For more information, see the release notes for v3.1, v3.2, v3.3, and the 4.0 RC.

  • v4.0.0-rc.2(Jul 19, 2016)

    211 commits since v3.3.0

    New Features

    • Add support for object-detection networks like DetectNet (#735) with documentation (#803)
    • Parameter sweep over batch size and learning rate (#708)
    • Add plugin systems for data formats (#731) and inference visualizations (#756)
    • Expose Caffe's iter_size solver option (#744)
    • Add syntax highlighting when editing custom networks (#751)
    • View list of related jobs (#767)
    • Explore generic datasets (#822)
    • Add example for doing text classification with Torch (#684)

    Changes

    • Major revamp of home page (#728, #790)
    • Allow use of BVLC/caffe (#769)
    • New Torch multi-GPU programming model (#732)
    • Make small improvements to standard networks (#733, #749)
    • Set weight_decay to lr / 100 (#792)
    • Make major improvements to TravisCI build system (#766, #788)
  • v3.3.0(Apr 25, 2016)

    New Features

    • Show accuracy confusion matrix for "Classify Many" (#608)
    • Test a model with an LMDB (#638)
    • Use layer stages in network descriptions for full control over train/val/deploy networks (#628)
    • Option to limit number of images to use for "Classify/Test Many" (#592)
    • Better in-app documentation for Python layers (#651)

    Changes

    • Run inference jobs in separate processes (#573)
    • Path autocompletion returns sorted list (#621)

    Bugfixes

    • Fixed UI bugs when using Safari (#702)
    • Fixed file serving for files with absolute paths (#586)
    • Fixed some UI bugs related to permissions (#594, #596)
    • Various torch-related bugfixes (#661, #663, #681, #686, #699)
    • Windows compatibility fixes (#698)
  • v3.2.0(Feb 18, 2016)

    New Features

    • Add support for new solvers - RMSprop, AdaDelta and Adam (#564)
    • AlexNet for Torch now works for multiple GPUs (#539)
    • New documentation for installing CUDA toolkit, drivers, etc. (#558)

    Changes

    • Only look in one location for config files (#541)
    • Re-use weights when retraining a model on the same dataset (#538)
    • Functional improvements and documentation changes for examples/classification (#559, #557, #579, #582)
    • Better error-checking for caffe networks referencing invalid layer "bottoms" (#576)

    Bugfixes

    • Fixes for multistep learning rate (#549, #550)
  • v3.1.0(Jan 22, 2016)

    New Features

    • Enable multi-GPU for Torch (#480)
    • Add basic login functionality (#463)
    • Allow Torch to fine-tune pretrained models (#499)
    • Allow Caffe to fine-tune from multiple pretrained models (#498)
    • New tutorials
      • Fine-tuning (#500)
      • Siamese networks (#453)
      • Weight initialization (#522)
    • Allow optional specification of image folder during multiple inference (#526)

    Changes

    • Torch performance improvements (#368, #390, #441, #339)
    • Disable colormap for "Top N" feature (#481)
    • Better real-time updates for dataset creation (#473)
    • Better display for device_query tool (#497)
    • Display the job directory for all job types (#469)
    • Use Flask "Blueprints" to cleanup routing code (#507)
    • Cleanup and alphabetize imports throughout the project (#501)
    • Removed docs/API.md and docs/FlaskRoutes.md (a05356ebfe0fe462f20143625ec8c942847348de)

    Bugfixes

    • Enable importing of LMDBs created with Caffe's convert_imageset tool (#517)
  • v3.0.0(Jan 22, 2016)

    See release notes for v3.0 RC.

    New since 3.0 RC

    • Fix handling of unencoded LMDBs in Torch (#475)
    • Significant performance enhancement for creating datasets (#491)
    • Various documentation fixes / updates
  • v3.0.0-rc.3(Dec 10, 2015)

    New Features

    • Add Torch7 as an alternative backend to Caffe (#324, #345)
    • Make using python layers easier by [optionally] attaching a python file to each model (#329)
    • Add the ability to clone previous jobs with a click (#334)
    • Update the homepage to show job updates in real-time (#240)
    • Enable mean subtraction by subtracting the mean file as well as subtracting the mean pixel (#321)
    • Support NVcaffe v0.14 (#341, #336)
    • Display the job directory size for each DatasetJob and ModelJob (#309)
    • Add a backend badge (LMDB/HDF5) to DatasetJobs on the homepage (#323)
    • Explore images in LMDB datasets (#331)

    Changes

    • Use port 34448 for the digits-server instead of port 8080 (#392)
    • Remove digits-walkthrough (#352)
    • Enforce standard UI for file input fields across different browsers (#325)

    Bugfixes

    • Fix PicklingError issues on all platforms (#307)
    • Fix issue when running inference on many images at once (#361)

    Known Issues

    • Large inference requests (i.e. "Classify many") may cause timeouts or even crashes (#479)
    • Incorrect handling of unencoded LMDB in Torch wrapper (#477)
  • v2.2.1 (Sep 17, 2015)

  • v2.2.0 (Sep 16, 2015)

    New Features

    • Add [initial] support for HDF5 datasets (#226)
    • Zoom in on weight/activation visualizations (#267)
    • Add a new page for comparing training results (#195)
    • Add notes to jobs (#283)

    Changes

    • Open inference results in a new browser tab (#244)
    • Various improvements for using prebuilt LMDBs (#268)
    • Sort subfolders when parsing a folder of images (#296)
    • Use input_shape instead of input_dim for deploy network prototxt (#231)

    Known Issues

    • Using a snapshot from a previous network doesn't work unless the network is on the first page (#285)
    • Parameter counting fails for some layer types (like PReLU) (#317)
  • v2.1.0 (Sep 14, 2015)

    New Features

    • Add support for "Generic Inference" (i.e. non-classification) networks (#189)
    • Display number of learned parameters in a model (#221)
    • Show ground truth in "Classify Many" if provided (#110)
    • Zoom in on a selection of the loss/accuracy graph (#113)
    • Add autocomplete for server-side path input fields (#183)
    • Select max/min images per class when parsing a folder of images (#161)
    • Allow user to download log from CreateDb tasks (#221)
    • Show number of available GPUs on home page (#207)
    • Allow local file upload for image lists (#106)
    • Display DIGITS version in top right of page header (#153) and in the console output (c181797cdf3ce27bf65a22fd39fbc61b95ecaab6)

    Changes

    • Double the LMDB map_size when running out of memory instead of setting to 1TB (#209)
      • requires py-lmdb 0.87
    • Rename default GoogLeNet layers and tops (9ff246eed47ec04461956b133495260855168e2e)
    • Add pagination to Previous Networks list (c181797cdf3ce27bf65a22fd39fbc61b95ecaab6)
    • Various changes that help with Windows compatibility (#199)
    • Major refactoring of tests (#192)
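
The map_size change in this list replaces one very large up-front allocation with growth on demand. The retry loop behind that idea might look roughly like the sketch below. It is an illustration only: `FakeEnv` and `MapFull` are hypothetical in-memory stand-ins, not py-lmdb's API, and the 1 TB ceiling is an assumed safety limit.

```python
class MapFull(Exception):
    """Stand-in for the 'map is full' error a real store would raise."""


class FakeEnv:
    """Tiny in-memory stand-in for an LMDB-like environment (hypothetical)."""

    def __init__(self, map_size):
        self.map_size = map_size
        self.data = {}

    def used(self):
        # Bytes currently occupied by keys and values.
        return sum(len(k) + len(v) for k, v in self.data.items())

    def put(self, key, value):
        # Refuse the write if it would not fit in the current map.
        if self.used() + len(key) + len(value) > self.map_size:
            raise MapFull()
        self.data[key] = value

    def grow(self, new_size):
        self.map_size = new_size


def put_with_doubling(env, key, value, limit=1 << 40):
    """Write one record, doubling the map size each time the store is full.

    Returns the map size in effect once the write succeeds.
    """
    if env.map_size <= 0:
        raise ValueError("map_size must be positive")
    while True:
        try:
            env.put(key, value)
            return env.map_size
        except MapFull:
            new_size = env.map_size * 2
            if new_size > limit:
                raise  # give up rather than grow without bound
            env.grow(new_size)
```

Doubling keeps the number of retries logarithmic in the final size, so even a small starting map reaches a large dataset's size in a few dozen steps.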

    Known Issues

    • Parameter counting fails for some layer types (like PReLU) (#317)
  • v2.0.0 (Sep 3, 2015)

    New Features

    • Enabled support for multi-GPU Caffe (#92)
      • Select multiple and/or specific GPUs for training (#92, #104)
    • Created new routes for JSON REST API (#134, #136)
    • Started using GPU for inference (#66)
    • Added NVML info about GPU memory/utilization (#93)
    • Enabled ADAGRAD and NESTEROV as alternative solver types (@drozdvadym in #102)
    • Added scripts to download standard datasets MNIST and CIFAR
    • Added option to set server name (#111)
    • Added support for PPM images (#123)
    • Enabled path autocompletion while setting values in the configuration (#96)

    Changes

    • Added a python classification example (#147)
    • Subtract mean pixel during training (#169)
    • Added TravisCI integration to run tests (#28)
    • Added Coveralls integration for test coverage
    • Added Landscape integration to inspect code
    • Added auto-generated documentation of the webapp’s HTTP routes
    • Switched to loading config files from new, more logical locations (#96)
    • Started suppressing most of Caffe’s raw output (b382e99b8a143c9bbbf659ba74e67bf2ef12718e, 019bc6ca750601396a502ad0fd2b0d47b239f0d7)
    • Added a CLA

    Bugfixes

    • Fixed various OSX platform-specific issues (#32, @trivedigaurav in #94)

    Known Issues

    • Some motherboards cause P2P bandwidth issues (https://github.com/NVIDIA/caffe/issues/10)
  • v2.0.0-rc3 (Jul 31, 2015)

    See release notes for v2.0.0-preview.

    New since 2.0 Preview

    • Recommend NVIDIA/Caffe v0.13 (https://github.com/NVIDIA/DIGITS/commit/5dc0f8e646d28587c07ff6fe9bcd1990820b41c2)
      • Requires cuDNN v3
    • Subtract mean pixel during training (#169)
    • Fixes regarding deployment of digits-server (c9a9dce2fcf7bb12363e6cccc44a6dd0a26a8271, e7bbc63213a10bbea516ee51adc5ffcf160494e8)
  • v2.0.0-preview (Jul 7, 2015)

    New Features

    • Enabled support for multi-GPU Caffe (#92)
      • Select multiple and/or specific GPUs for training (#92, #104)
    • Created new routes for JSON REST API (#134, #136)
    • Started using GPU for inference (#66)
    • Added NVML info about GPU memory/utilization (#93)
    • Enabled ADAGRAD and NESTEROV as alternative solver types (@drozdvadym in #102)
    • Added scripts to download standard datasets MNIST and CIFAR
    • Added option to set server name (#111)
    • Added support for PPM images (#123)
    • Enabled path autocompletion while setting values in the configuration (#96)

    Changes

    Bugfixes

    • Fixed various OSX platform-specific issues (#32, @trivedigaurav in #94)

    Known Issues

    • Some motherboards cause P2P bandwidth issues (https://github.com/NVIDIA/caffe/issues/10)
  • v1.1.2 (Jun 26, 2015)

  • v1.1.0 (Apr 24, 2015)

    New Features

    • Add GoogLeNet as a default network (#11)
    • "Classify Many Images" shows classification results of many images at once (#61)
    • Show statistics (mean, standard deviation, histogram of values) for each layer of the network at inference time (#67)
    • Allow saving images in database with PNG encoding (#73)
    • Optionally turn off shuffling when creating a dataset (#72)
    • Optionally provide a random seed to Caffe (73fe257)

    Changes

    • Upgrade to NVIDIA/caffe version 0.11.0 (e2bcb27)
    • Update pip requirements list to match packages available on Ubuntu 14.04 where possible (4162db4, 133213d)
    • Use C3.js instead of Google Charts to enable DIGITS to run without an internet connection (#34)
    • Change default image resize mode from HALF_CROP to SQUASH (b4f3261)

    Bugfixes

    • Save images in BGR order instead of RGB because Caffe uses OpenCV to read encoded images (#59)
    • Scale the LeNet standard network by the standard deviation of MNIST (~80) during train, val and test phases (5a38aa5, 23c1a78)
    • Use a white background when removing transparency from images (#85)
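
Two of the fixes above concern how pixels are prepared before storage: transparency is flattened onto white, and channels are written in BGR order. A minimal sketch of both steps, assuming 8-bit RGBA pixels given as tuples, is shown below. The function names are hypothetical and this is not the DIGITS implementation.

```python
def flatten_on_white(pixel):
    """Composite one 8-bit RGBA pixel over a white background.

    Uses rounded integer arithmetic: out = (c * a + 255 * (255 - a)) / 255.
    """
    r, g, b, a = pixel
    return tuple((c * a + 255 * (255 - a) + 127) // 255 for c in (r, g, b))


def rgb_to_bgr(pixel):
    """Reverse the channel order of a 3-channel pixel."""
    r, g, b = pixel
    return (b, g, r)


def prepare_for_storage(rgba_image):
    """Flatten transparency onto white, then reorder channels to BGR."""
    return [
        [rgb_to_bgr(flatten_on_white(px)) for px in row]
        for row in rgba_image
    ]
```

A fully transparent pixel becomes white rather than black, and an opaque pixel keeps its colour with only the channel order reversed.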

    Known Issues

    • The GoogLeNet standard network is not behaving correctly when trained on the full ImageNet dataset (#82)
    • "Classify Many Images" may time out if too many images are uploaded and the server takes too long to respond (#70)