```
[conda] mkl-service               2.4.0            py38h7f8727e_0    defaults
[conda] mkl_fft                   1.3.1            py38hd3c417c_0    defaults
[conda] mkl_random                1.2.2            py38h51133e4_0    defaults
[conda] numpy                     1.23.5           py38h14f4228_0    defaults
[conda] numpy-base                1.23.5           py38h31eccc5_0    defaults
[conda] torch                     1.13.1           pypi_0            pypi
...
```
Describe the bug

I found that CPU memory keeps increasing when inference is repeated for a long time on an Intel Arc A770.

Reproduce code

Memory trend:

Related code:

```python
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer
...
```
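To track the memory trend across repeated inference calls without the full model setup, a stdlib-only sketch like the following can log heap usage per iteration. The `run_inference` function here is a hypothetical stand-in, not the actual model call from the report; substitute the real `model.generate(...)` loop to reproduce the issue.

```python
import tracemalloc


def run_inference(step):
    # Hypothetical stand-in for the real model.generate() call;
    # replace with the actual LLM inference when reproducing the bug.
    return [0] * 1000


def memory_trend(n_steps=5):
    """Record the traced Python heap size after each repeated inference."""
    tracemalloc.start()
    trend = []
    for step in range(n_steps):
        run_inference(step)
        current, _peak = tracemalloc.get_traced_memory()
        trend.append(current)
    tracemalloc.stop()
    return trend


if __name__ == "__main__":
    # A steadily growing trend across many iterations suggests a leak.
    print(memory_trend())
```

Note that `tracemalloc` only sees Python-level allocations; for native allocations (e.g. inside torch), sampling the process RSS (for example via `psutil.Process().memory_info().rss`) gives a closer match to the memory trend shown above.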