4 years ago · 78ec680207
--- a/PyTorch/LanguageModeling/BERT/README.md
+++ b/PyTorch/LanguageModeling/BERT/README.md
@@ -733,14 +733,14 @@ Our results were obtained by running the `scripts/run_squad.sh` and `scripts/run
 
 ##### Pre-training loss results: NVIDIA DGX A100 (8x A100 40GB)
 
-| DGX System | GPUs | Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - TF32 | Final Loss - mixed precision | Time to train(hours) - TF32 | Time to train(hours) - mixed precision | Time to train speedup (TF32 to mixed precision)
+| DGX System | GPUs / Node | Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - TF32 | Final Loss - mixed precision | Time to train(hours) - TF32 | Time to train(hours) - mixed precision | Time to train speedup (TF32 to mixed precision)
 |---|---|---|---|---|---|---|---|---
 |32 x DGX A100 |8|256 and 128|4 and 8|---|1.3415|---|2.3|---  
 |32 x DGX A100 |8|256 and 128|4 and 16|1.3415|---|3.7|---|--- 
 
 ##### Pre-training loss results: NVIDIA DGX-2H V100 (16x V100 32GB)
 
-| DGX System | GPUs | Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - FP32 | Final Loss - mixed precision | Time to train(hours) - FP32 | Time to train(hours) - mixed precision | Time to train speedup (FP32 to mixed precision)
+| DGX System | GPUs / Node | Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - FP32 | Final Loss - mixed precision | Time to train(hours) - FP32 | Time to train(hours) - mixed precision | Time to train speedup (FP32 to mixed precision)
 |---|---|---|---|---|---|---|---|---
 |32 x DGX-2H |16|128 and 64|2 and 8|---|1.3223|---|2.07|---  
 |32 x DGX-2H |16|128 and 64|4 and 16|1.3305|---|7.9|---|---  
@@ -749,7 +749,7 @@ Our results were obtained by running the `scripts/run_squad.sh` and `scripts/run
 
 Following results were obtained by running on pytorch:19.07-py3 NGC container.
 
-| DGX System | GPUs | Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - FP32 | Final Loss - mixed precision | Time to train(hours) - FP32 | Time to train(hours) - mixed precision | Time to train speedup (FP32 to mixed precision)
+| DGX System | GPUs / Node| Accumulated Batch size / GPU (Phase 1 and Phase 2) | Accumulation steps (Phase 1 and Phase 2) | Final Loss - FP32 | Final Loss - mixed precision | Time to train(hours) - FP32 | Time to train(hours) - mixed precision | Time to train speedup (FP32 to mixed precision)
 |---|---|---|---|---|---|---|---|---
 | 1 x NVIDIA DGX-1|8|8192 and 4096 |512 and 1024|-|1.36|-|153.16|-
 | 1 x NVIDIA DGX-2H|16|4096 and 2048 |64 and 256|-|1.35|-|58.4|-