Question 12
Domain 3: NVIDIA Tools, Performance, and Deployment
Which NVIDIA technology is specifically designed for inference optimization?
Correct answer: B
Explanation
TensorRT is NVIDIA’s inference optimization SDK. It takes a neural network that has already been trained and optimizes it for deployment on NVIDIA GPUs, with the goal of low-latency, high-throughput inference. Its optimizations include layer fusion (merging adjacent operations into one) and reduced-precision execution, such as FP16 or INT8 with precision calibration.
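To make "layer fusion" concrete, here is a minimal pure-Python sketch of one classic fusion: folding a batch-normalization step into the linear layer that precedes it, so that inference runs one operation instead of two. It illustrates the general idea only and is not TensorRT code or TensorRT's implementation; the function names (`linear`, `batchnorm`, `fold_batchnorm`) and all numeric values are hypothetical.

```python
import math

def linear(x, weights, bias):
    """Compute y[j] = sum_i weights[j][i] * x[i] + bias[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm using fixed (already learned) statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Return (fused_weights, fused_bias) such that a single linear layer
    reproduces linear followed by batchnorm."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)           # per-output-channel scale
        fused_w.append([scale * w for w in row])  # scale every weight in the row
        fused_b.append(scale * (b - m) + bt)      # shift absorbed into the bias
    return fused_w, fused_b

# Hypothetical example values, chosen only for illustration.
x = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.5]]
bias = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.0, 0.5]
mean, var = [0.3, -0.1], [0.9, 1.1]

# Two separate layers versus one fused layer.
unfused = batchnorm(linear(x, weights, bias), gamma, beta, mean, var)
fused_w, fused_b = fold_batchnorm(weights, bias, gamma, beta, mean, var)
fused = linear(x, fused_w, fused_b)

max_abs_diff = max(abs(a - b) for a, b in zip(unfused, fused))
print("largest difference between fused and unfused outputs:", max_abs_diff)
```

The fused layer gives the same result up to floating-point rounding while doing less work per input, which is the kind of saving an inference optimizer looks for after training is finished.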
Why each option is right or wrong
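A. NVIDIA Omniverse
NVIDIA Omniverse is a platform for 3D simulation, digital twins, and collaborative design workflows. It is not an inference optimization tool, so it does not answer the question.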
B. TensorRT
NVIDIA TensorRT is the inference-focused SDK in NVIDIA’s software stack, intended to optimize trained neural network models for deployment on GPUs. It applies runtime optimizations such as layer fusion and reduced-precision execution to improve latency and throughput, which is why it is the correct choice when the question asks specifically about inference optimization.
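C. NVIDIA DGX
NVIDIA DGX is a line of GPU-based systems built for AI workloads. It is hardware with a bundled software stack, not a technology that optimizes trained models for inference.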
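D. NVIDIA CUDA
NVIDIA CUDA is the general-purpose parallel computing platform and programming model for NVIDIA GPUs. TensorRT runs on top of CUDA, but CUDA itself is not specific to inference optimization.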
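The reduced-precision execution mentioned under option B can be illustrated with a small sketch of symmetric INT8 quantization, where a scale factor is "calibrated" from sample data. This is a simplified max-abs scheme assumed for illustration, not TensorRT's actual calibrators or API; the function names and sample values are hypothetical.

```python
def calibrate_scale(samples, num_bits=8):
    """Pick a scale so the largest observed magnitude maps to the largest
    representable integer (127 for signed 8-bit)."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in samples)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Round to the nearest integer step and clip to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(quantized, scale):
    """Map integers back to approximate floating-point values."""
    return [q * scale for q in quantized]

# Hypothetical calibration data standing in for representative activations.
calibration_samples = [-2.0, -0.5, 0.0, 0.75, 1.27, 2.54]
scale = calibrate_scale(calibration_samples)

# The last value lies outside the calibrated range and will be clipped.
activations = [0.1, -1.0, 2.54, 3.0]
q = quantize(activations, scale)
restored = dequantize(q, scale)

print("scale:", scale)
print("int8 values:", q)
print("restored:", restored)
```

Values inside the calibrated range come back with an error of at most half a quantization step, while out-of-range values are clipped. That trade-off is why the choice of calibration data matters when running inference at reduced precision.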