Running llama-2-7b-int4 with mlc-llm is expected to need more than 4 GB of memory. MIUI's own memory footprint is high: on a phone with 8 GB of RAM, only about 4 GB is typically free.
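The 4 GB figure follows from back-of-the-envelope arithmetic: a 7B-parameter model at int4 stores roughly 0.5 bytes per parameter, so the weights alone are about 3.26 GiB, and the KV cache plus runtime buffers come on top. A minimal sketch of the calculation:

```shell
# Weight size for a 7B model at int4 (0.5 bytes per parameter), in GiB.
# KV cache and runtime overhead are extra, hence the ">4 GB free" requirement.
awk 'BEGIN { printf "%.2f GiB\n", 7e9 * 0.5 / (1024 ^ 3) }'
# → 3.26 GiB
```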
Copy the previously compiled qwen1.5-1.8b-q4f16_1-android.tar into mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/, creating the directory first if it does not exist:

```shell
mkdir -p mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/
cp dist/prebuilt_libs/qwen1.5-1.8b-q4f16_1-android.tar mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/
```

Then enter mlc-llm/android...
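The copy step above can be rehearsed in a throwaway sandbox to confirm the `mkdir -p` / `cp` idiom before touching the real tree (the tar here is an empty stand-in, not the real artifact):

```shell
# Rehearse the copy in a temp directory; names mirror the real layout.
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p dist/prebuilt_libs
touch dist/prebuilt_libs/qwen1.5-1.8b-q4f16_1-android.tar   # stand-in for the real tar
mkdir -p mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/
cp dist/prebuilt_libs/qwen1.5-1.8b-q4f16_1-android.tar mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/
ls mlc-llm/dist/prebuilt/lib/qwen1.5-1.8b/
# → qwen1.5-1.8b-q4f16_1-android.tar
```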
A prebuilt apk: https://github.com/BBuf/run-rwkv-world-4-in-mlc-llm/releases/download/v1.0...
So far I have RWKV5 inference working on both the Metal and Android platforms (covering the 1.5B and 3B models), and I have also built a 3B int8 apk for Android users, available at https://github.com/BBuf/run-rwkv-world-4-in-mlc-llm/releases/download/v1.0.0/rwkv5-3b-int8.apk . You can download this apk to try out the latest RWKV-5-3B model.
I have also built an apk of the RWKV4 World 3B model with int4 weight quantization, available at https://github.com/BBuf/run-rwkv-world-4-in-mlc-llm/releases/download/v1.0.0/app-debug.apk . If you are interested, download it onto an Android phone and run it. Note that the weights are pulled from HuggingFace at runtime, so the phone must be able to reach HuggingFace, which may require a proxy...
Connect your phone to the computer over USB; the necessary drivers are usually installed automatically. When you run the program, a device-selection dialog will appear. Pick your phone, and the APK will be installed and launched automatically. A prebuilt apk: https://github.com/BBuf/run-rwkv-world-4-in-mlc-llm/releases/download/v1.0.0/app-debug.apk ...
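If Android Studio's device-selection dialog is not convenient, the same install can be done from the command line with `adb` (a hedged sketch: `app-debug.apk` is the file linked above, and the commands assume a connected device with USB debugging enabled):

```shell
# List attached devices, then (re)install the apk onto the connected phone.
adb devices
adb install -r app-debug.apk   # -r reinstalls, keeping existing app data
```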
MLC LLM is a machine learning compiler and high-performance deployment engine for large language models. The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on everyone's platforms. Supported hardware includes AMD GPU, NVIDIA GPU, Apple GPU, Intel GPU ...
BBuf/run-rwkv-world-4-in-mlc-llm#1 (Closed) — GameOverFlowChart commented Sep 5, 2023: The prebuilt apk only works with the two models which are listed in the app. On the Discord there is an unofficial apk version which only works with llama2-based models. Both work for me...
MLC-LLM is a machine learning compiler and high-performance deployment engine for large language models. The project's mission is to let everyone develop, optimize, and deploy AI models on their own platforms. InternLM 2.5 is the new-generation large language model released by the Shanghai AI Laboratory; compared with earlier versions, InternLM 2.5 supports million-token contexts and its reasoning ability leads among open-source models. This article is a hands-on walkthrough of using MLC-LLM with InternLM2.5-...
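The usual MLC-LLM flow for preparing such a model starts by quantizing the HuggingFace weights and generating a chat config. A sketch based on the general `mlc_llm` CLI — the InternLM paths are placeholders, the conversation-template name is an assumption to check against the model card, and exact flags may differ across mlc_llm versions:

```shell
# Quantize the HuggingFace checkpoint to q4f16_1 (paths are placeholders).
mlc_llm convert_weight ./internlm2_5-model/ --quantization q4f16_1 -o ./dist/internlm2_5-q4f16_1/

# Generate the runtime chat config; the template name here is an assumption.
mlc_llm gen_config ./internlm2_5-model/ --quantization q4f16_1 \
    --conv-template chatml -o ./dist/internlm2_5-q4f16_1/
```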