@@ -29,8 +29,6 @@ This repository provides a script and recipe to train the Transformer model to a
* [Inference performance benchmark](#inference-performance-benchmark)
* [Results](#results)
* [Training accuracy results](#training-accuracy-results)
- * [NVIDIA DGX-1 (8x V100 16G)](#nvidia-dgx-1-(8x-v100-16G))
- * [Training stability test](#training-stability-test)
* [Training performance results](#training-performance-results)
* [NVIDIA DGX-1 (8x V100 16G)](#nvidia-dgx-1-8x-v100-16g)
  * [NVIDIA DGX-2 (16x V100 32G)](#nvidia-dgx-2-16x-v100-32g)
@@ -356,6 +354,8 @@ Running this code with the provided hyperparameters will allow you to achieve th
 
In some cases we can train further with the same setup to achieve slightly better results.
 
+#### Training performance results
+
##### NVIDIA DGX-1 (8x V100 16G)
 
 Our results were obtained by running the `run_training.sh` and `run_training_fp32.sh` training scripts in the PyTorch NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs. Performance numbers (in tokens per second) were averaged over an entire training epoch.