NVIDIA TensorRT Inference Server

NVIDIA's new Hopper H200 AI GPU tested: 3x faster GenAI with TensorRT-LLM in MLPerf 4.0 results

Using these new TensorRT-LLM optimizations, NVIDIA has achieved a 2.4x performance gain on its current H100 AI GPU between MLPerf Inference 3.1 and 4.0 in GPT-J tests using the offline scenario.

TechCrunch

Nvidia launches NIM to make it smoother to deploy AI models into production

At its GTC conference, Nvidia today announced Nvidia NIM, a new software platform designed to streamline the deployment of custom and pre-trained AI models into production environments. NIM takes the ...

BGR

NVIDIA Is Helping Apple Build A Faster And Better AI Experience

Apple and NVIDIA shared details of a collaboration to improve the performance of LLMs with a new text generation technique for AI. Cupertino writes: Accelerating LLM inference is an important ML ...

TechRepublic

NVIDIA GTC Keynote: Blackwell Architecture Will Accelerate AI Products in Late 2024

Developers can now take advantage of NVIDIA NIM packages to deploy enterprise generative AI, said NVIDIA CEO Jensen Huang. NVIDIA’s newest GPU platform is the Blackwell (Figure A), which companies ...

Infosecurity-magazine.com

Critical Vulnerabilities Found in NVIDIA's Triton Inference Server

A chain of critical vulnerabilities in NVIDIA's Triton Inference Server has been discovered by researchers, just two weeks after a Container Toolkit vulnerability was identified. The Triton Inference ...

The Next Platform

The First AI Benchmarks Pitting AMD Against Nvidia

Rated horsepower for a compute engine is an interesting intellectual exercise, but it is where the rubber hits the road that really matters. We finally have the first benchmarks from MLCommons, the ...
