    bool instruct   = false; // instruction mode (used for Alpaca models)
    bool ignore_eos = false; // do not stop generating after eos
    bool perplexity = false; // compute perplexity over the prompt
};

bool gpt_params_parse(int argc, char ** argv, gpt_params & params);
...
This adds an option to compute perplexity over the prompt input, similar to the approach described at https://huggingface.co/docs/transformers/perplexity. It splits the prompt into non-overlapping chunks, each the size of the context window. It then runs the forward pass and computes the softmax probability of the...
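The chunked computation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the change itself: `forward_fn`, `softmax_prob`, and `chunked_perplexity` are hypothetical names, the model's forward pass is replaced by a caller-supplied callback, and the sketch assumes that the probability being scored is that of the next token in the chunk and that a trailing partial chunk is skipped.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the model's forward pass: given a chunk of tokens,
// it returns one row of logits (one value per vocabulary entry) per position.
using forward_fn =
    std::function<std::vector<std::vector<float>>(const std::vector<int> &)>;

// Probability of `target` under softmax(logits), shifted by the max logit
// for numerical stability.
static double softmax_prob(const std::vector<float> & logits, int target) {
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    double sum = 0.0;
    for (float l : logits) {
        sum += std::exp(static_cast<double>(l - max_logit));
    }
    return std::exp(static_cast<double>(logits[target] - max_logit)) / sum;
}

// Perplexity over non-overlapping chunks of n_ctx tokens.
// Assumption: position j in a chunk is scored against token j + 1 of the same
// chunk, and a trailing chunk shorter than n_ctx is ignored.
static double chunked_perplexity(const std::vector<int> & tokens,
                                 std::size_t n_ctx,
                                 const forward_fn & forward) {
    double nll = 0.0;      // accumulated negative log-likelihood
    std::size_t count = 0; // number of scored predictions

    for (std::size_t start = 0; start + n_ctx <= tokens.size(); start += n_ctx) {
        const std::vector<int> chunk(tokens.begin() + start,
                                     tokens.begin() + start + n_ctx);
        const auto logits = forward(chunk);

        for (std::size_t j = 0; j + 1 < n_ctx; ++j) {
            nll += -std::log(softmax_prob(logits[j], chunk[j + 1]));
            ++count;
        }
    }

    // Return 0.0 when nothing could be scored (prompt shorter than one chunk).
    return count > 0 ? std::exp(nll / static_cast<double>(count)) : 0.0;
}
```

Because the chunks do not overlap, tokens near the start of each chunk are scored with very little preceding context, so this estimate tends to come out higher than a sliding-window evaluation of the same text.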
The current implementation builds a list of the entire file's contents in memory, only to discard (almost) all of it immediately. We found that this leads to high memory usage and can slow down the clust...