TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
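As a rough illustration of that Python API, the sketch below uses the high-level `LLM` and `SamplingParams` classes from recent tensorrt_llm releases; the model identifier is a placeholder, and because engine building requires an NVIDIA GPU, the import is guarded so the sketch stays loadable anywhere. Treat it as an assumption-laden example, not the definitive usage.

```python
"""Hedged sketch of the TensorRT-LLM high-level Python API.

Assumptions: `LLM` and `SamplingParams` as exposed by recent tensorrt_llm
releases; the model id below is a placeholder. Building or loading a
TensorRT engine needs an NVIDIA GPU, so the import is guarded.
"""
try:
    from tensorrt_llm import LLM, SamplingParams
    HAVE_TRTLLM = True
except ImportError:  # tensorrt_llm not installed; keep the sketch importable
    HAVE_TRTLLM = False


def run_demo() -> None:
    # Builds (or loads a cached) TensorRT engine for the model, then
    # runs batched inference through the optimized runtime.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # placeholder model id
    params = SamplingParams(temperature=0.8, max_tokens=32)
    for output in llm.generate(["Hello, my name is"], params):
        print(output.outputs[0].text)


if __name__ == "__main__" and HAVE_TRTLLM:
    run_demo()
```

The engine-build step happens inside the `LLM` constructor; subsequent calls to `generate` execute the compiled engine rather than the original framework graph, which is where the inference speedup comes from.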
Building TensorRT engine for C:\Apps\stable-diffusion-webui-dev\stable-diffusion-webui\models\Unet-onnx\bunny4_f5a202a7.onnx: C:\Apps\stable-diffusion-webui-dev\stable-diffusion-webui\models\Unet-trt\bunny4_f5a202a7_cc89_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encod...