Dheeraj Peri c4f90be499 Add QAT instructions for RN50 5 years ago
.github 30425620be Update issue templates 6 years ago
CUDA-Optimized dd9f5c401c Create README.md 5 years ago
FasterTransformer fe2cef59f5 Update translate_sample.py 5 years ago
Kaldi 3419c93192 Fixing config file header 6 years ago
MxNet e470c2150a Updating RN50/MxNet 6 years ago
PyTorch f11884b38a minor readme updates 5 years ago
TensorFlow c4f90be499 Add QAT instructions for RN50 5 years ago
TensorFlow2 57b8a6ac3a Update losses.py 5 years ago
.gitignore 0663b67c1a Updating models 6 years ago
.gitmodules 784eb0d8ca [BERT/TF] trtis dependency fix (#373) 6 years ago
README.md fed7ba99cd [ConvNets/TF] Updating RN50, Adding ResNext and SE-ResNext 5 years ago
hubconf.py ff16b6c649 removing torchhub access through master 6 years ago

README.md

NVIDIA Deep Learning Examples for Tensor Cores

Introduction

This repository provides the latest deep learning example networks for training. These examples focus on achieving the best performance and convergence from NVIDIA Volta Tensor Cores.

NVIDIA GPU Cloud (NGC) Container Registry

These examples, along with our NVIDIA deep learning software stack, are provided in a monthly updated Docker container on the NGC container registry (https://ngc.nvidia.com). These containers include:

  • The latest NVIDIA examples from this repository
  • The latest NVIDIA contributions shared upstream to the respective framework
  • The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, and cuBLAS, all of which have been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance
  • Monthly release notes for each of the NVIDIA optimized containers
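As a rough illustration of how one of these monthly containers might be fetched, the sketch below composes an image reference and the matching `docker pull` command. The registry prefix `nvcr.io/nvidia`, the framework name, and the tag `20.06-py3` are assumptions chosen for illustration and do not come from this README; check the NGC catalog for the names and tags that actually exist.

```python
from typing import List


def ngc_image(framework: str, tag: str, registry: str = "nvcr.io/nvidia") -> str:
    """Build an image reference of the form <registry>/<framework>:<tag>.

    The default registry prefix is an assumption, not taken from this README.
    """
    if not framework or not tag:
        raise ValueError("framework and tag must be non-empty")
    return f"{registry}/{framework}:{tag}"


def docker_pull_command(image: str) -> List[str]:
    """Return the argument list for pulling the image with the Docker CLI."""
    return ["docker", "pull", image]


# Hypothetical example: a PyTorch container for one monthly release.
image = ngc_image("pytorch", "20.06-py3")
print(" ".join(docker_pull_command(image)))
```

The command is returned as an argument list so that it could be handed to `subprocess.run` without shell quoting concerns; the sketch only prints it and does not contact any registry.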

Directory structure

The examples are organized first by framework, such as TensorFlow or PyTorch, and second by use case, such as computer vision or natural language processing. We hope this structure enables you to quickly locate the example networks that best suit your needs. Here are the currently supported models:

Computer Vision

Natural Language Processing

Recommender Systems

Text to Speech

  • Tacotron2 & WaveGlow [PyTorch]
  • FastPitch (modified FastSpeech) [PyTorch]

Speech Recognition

CUDA Accelerated Applications

Jupyter Notebooks

| Models | TensorFlow | PyTorch | TensorRT | Triton |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| SSD | Inference | Inference | - | - |
| MaskRCNN | - | Training & Inference | - | - |
| Jasper | - | - | PyTorch Inference TensorRT Colab, PyTorch Inference TensorRT | PyTorch Inference TRTIS |
| Tacotron2 & WaveGlow | - | Training & Inference | - | PyTorch Inference TRTIS |
| BERT | Inference Movie Review Sentiment, Fine-Tuning SQuAD, Inference Colab, Inference | - | - | - |
| BioBERT | Inference | - | - | - |
| UNet Industrial | Export and Inference Colab, Inference | - | - | - |
| Automatic Mixed Precision | AMP Training | - | - | - |

Feature Matrix

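| Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | TF-TRT |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| ResNet-50 v1.5 | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
| ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
| SE-ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
| SSD300 v1.1 | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
| BERT | PyTorch | N/A | Yes | Yes | Yes | - | - | Yes | - |
| Transformer-XL | PyTorch | N/A | Yes | Yes | Yes | - | - | - | - |
| Neural Collaborative Filtering | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| DLRM | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| Mask R-CNN | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| Jasper | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - |
| Tacotron 2 And WaveGlow v1.10 | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - |
| FastPitch | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| GNMT v2 | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| Transformer | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
| ResNet-50 v1.5 | TensorFlow | Yes | Yes | Yes | - | - | - | - | - |
| ResNeXt101-32x4d | TensorFlow | Yes | Yes | Yes | - | - | - | - | - |
| SE-ResNeXt101-32x4d | TensorFlow | Yes | Yes | Yes | - | - | - | - | - |
| SSD320 v1.2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| BERT | TensorFlow | N/A | Yes | Yes | Yes | Yes | - | Yes | Yes |
| BioBERT | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| Transformer-XL | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| Neural Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| Variational Autoencoder Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| WideAndDeep | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| U-Net Industrial | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes |
| U-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes |
| V-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | Yes | - | Yes |
| Mask R-CNN | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| GNMT v2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
| Faster Transformer | TensorFlow | N/A | - | - | - | Yes | - | - | - |
| U-Net Medical | TensorFlow-2 | N/A | Yes | Yes | - | Yes | - | - | Yes |
| Mask R-CNN | TensorFlow-2 | N/A | Yes | Yes | - | - | - | - | - |
| ResNet-50 v1.5 | MXNet | Yes | Yes | Yes | - | - | - | - | - |
| HMM | Kaldi | N/A | - | Yes | - | - | - | Yes | - |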
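
To find the examples that combine particular features, the matrix can be treated as data and filtered. The following is a minimal sketch, not part of the repository: it copies a handful of rows from the matrix above into a Python list, and the helper names `build_matrix` and `supporting` are hypothetical.

```python
from typing import Dict, List, Tuple

# Feature columns, in the same order as the matrix above.
COLUMNS = ["DALI", "AMP", "Multi-GPU", "Multi-Node",
           "TensorRT", "ONNX", "Triton", "TF-TRT"]

# A few rows copied from the matrix: (model, framework, feature flags).
ROWS = [
    ("ResNet-50 v1.5", "PyTorch", "Yes Yes Yes - - - - -"),
    ("BERT", "PyTorch", "N/A Yes Yes Yes - - Yes -"),
    ("Jasper", "PyTorch", "N/A Yes Yes - Yes Yes Yes -"),
    ("BERT", "TensorFlow", "N/A Yes Yes Yes Yes - Yes Yes"),
    ("U-Net Medical", "TensorFlow-2", "N/A Yes Yes - Yes - - Yes"),
    ("HMM", "Kaldi", "N/A - Yes - - - Yes -"),
]


def build_matrix(rows: List[Tuple[str, str, str]]) -> List[Dict[str, str]]:
    """Turn (model, framework, flags) tuples into one dict per row."""
    matrix = []
    for model, framework, flags in rows:
        values = flags.split()
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} flags for {model}")
        entry = {"Models": model, "Framework": framework}
        entry.update(zip(COLUMNS, values))
        matrix.append(entry)
    return matrix


def supporting(matrix: List[Dict[str, str]], *features: str) -> List[Tuple[str, str]]:
    """Return (model, framework) pairs marked "Yes" for every given feature."""
    return [
        (entry["Models"], entry["Framework"])
        for entry in matrix
        if all(entry[feature] == "Yes" for feature in features)
    ]


MATRIX = build_matrix(ROWS)
# Examples in this subset that list both TensorRT and Triton support.
print(supporting(MATRIX, "TensorRT", "Triton"))
```

Only exact `Yes` entries count as support here; `N/A` and `-` are both treated as unsupported, which is a simplification of the matrix.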

NVIDIA support

In each of the network READMEs, we indicate the level of support that will be provided. This ranges from ongoing updates and improvements to a point-in-time release for thought leadership.

Feedback / Contributions

We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!

Known issues

In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.