GPU memory utilisation error. I have seen the same error with 12GB...
mlc-llm/cpp/serve/threaded_engine.cc:283: Check failed: (output_res.IsOk()) is false: Insufficient GPU memory error: The available single GPU memory is 4762.535 MB, which is less than the sum of model weight siz