It seems that Ollama (non-Docker) models crash and restart while output is being processed when the context is set to 70k:

Jun 23 20:18:29 main ollama[7231]: llm_load_tensors: offloading 9 repeating layers to GPU
Jun 23 20:18:29 main ollama[7231]: llm_load_tensors: offloaded 9/28 layers...
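With only 9 of 28 layers offloaded to the GPU and a 70k context, memory pressure is a plausible culprit. A minimal sketch of how one might lower both the context window (`num_ctx`) and the number of GPU-offloaded layers (`num_gpu`) through Ollama's `/api/generate` options when testing this; the model name here is illustrative, not from the report:

```python
import json

# Hedged sketch: build an Ollama /api/generate request body that lowers
# num_ctx (context window) and num_gpu (layers offloaded to GPU), two
# options commonly tuned when a model crashes under a large context.
payload = {
    "model": "llama3",  # illustrative model name, not from the report
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {
        "num_ctx": 8192,  # much smaller context than 70k
        "num_gpu": 9,     # cap the repeating layers offloaded to GPU
    },
}
body = json.dumps(payload)
# One would then POST `body` to http://localhost:11434/api/generate
```

Whether this avoids the crash depends on the actual cause; it only rules memory pressure in or out.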
eliranwong opened this issue Mar 9, 2024 · 0 comments

eliranwong commented Mar 9, 2024: Hi, I would like to reopen the issue, as the suggestion does not work, thanks: #84
qwen2-72b starts to output gibberish like this at some point if I set num_ctx to 8192:

.5"5.F9(CB;6@FC9!DC:$B$D60G5",3B+2;1-*,@%=876E0;5*:.98G4!980+D

Normal output from the LLM was expected. The issue persists when using `ollama run` or when using the API (SillyTavern). qwen2-7...
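For reference, `num_ctx` can be pinned in a Modelfile so the setting applies consistently across `ollama run` and API clients; a minimal sketch, assuming the model tag `qwen2:72b`:

```
FROM qwen2:72b
PARAMETER num_ctx 8192
```

Building a model from this file (`ollama create qwen2-8k -f Modelfile`) makes it easier to confirm the gibberish is tied to the context-window setting rather than to how a particular client passes options.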
Merge pull request open-webui#166 from ollama-webui/dev … 201eaf1