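Step 5: Look at the end of the profiling timeline to find the operator that reported the error.
Step 6: Dump the operator's inputs and outputs with torch.save, reproduce the problem with that single operator, and hand the device logs to the developers for confirmation (by filing an issue or posting a thread).
Step 7: The developers confirm that the input contains inf.
3.3 Debugging with a debug build
Case: after a coreDump or Segment fault, the stack shown by gdb contains "??" symbols:
Step 1: Build the debug version of the package: DEBUG=1 bash c...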
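Step 6 above (dumping an operator's inputs and outputs with torch.save for a single-operator reproduction) and the inf finding in step 7 can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `dump_op_io` and `find_nonfinite`, the tensor names, and the sample values are hypothetical, and the tensors are created on the CPU rather than captured from a real failing operator.

```python
import os
import tempfile

import torch


def dump_op_io(path, inputs, outputs):
    """Save an operator's inputs and outputs so the call can be replayed offline."""
    torch.save({"inputs": inputs, "outputs": outputs}, path)


def find_nonfinite(tensors):
    """Return the names of tensors that contain inf or NaN values."""
    return [name for name, t in tensors.items() if not torch.isfinite(t).all()]


# Hypothetical data standing in for the failing operator's real inputs.
inputs = {
    "x": torch.tensor([1.0, float("inf"), 3.0]),
    "weight": torch.tensor([0.5, 0.5, 0.5]),
}
outputs = {"y": inputs["x"] * inputs["weight"]}

# Write the dump to a temporary directory, then load it back as a reviewer would.
dump_path = os.path.join(tempfile.mkdtemp(), "op_io.pt")
dump_op_io(dump_path, inputs, outputs)

restored = torch.load(dump_path)
bad_inputs = find_nonfinite(restored["inputs"])
bad_outputs = find_nonfinite(restored["outputs"])
print("non-finite inputs:", bad_inputs)
print("non-finite outputs:", bad_outputs)
```

Checking the saved inputs before attaching them to an issue makes it easier to tell whether the operator itself misbehaves or whether it was simply fed inf by an earlier stage.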
LOG_INFO("start ai box.");
m_run = true;
m_thread = std::thread([&]() {
    // struct typedef
    typedef std::map<std::string, AnalysisPtr> ANALYSISES;
    typedef std::vector<DeviceInfo> DEVICES;
    ANALYSISES analysises;
    int patrol_num = Config::GetInstance().PATROL_NUM;
    while (m_run && !
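1. Problem description (with error-log context): When running inference with the transformers library on a model with a large parameter count, loading the model across multiple cards with device_map="auto" moves tensors between cards, and the following error appears: [ERROR] ASCENDCL(3534751,python):2024-06-06-15:04:20.381.859 [stream.cpp:151]3534751 aclrtSynchronizeStreamWithTimeout: [INIT][DEFAULT]synchro...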
 * GXDNN_RESULT_DEVICE_ERROR device error
 * @remark if devicePath is "/dev/gxnpu", open npu device
 *         devicePath is "/dev/gxsnpu", open snpu device
 */
GxDnnResult GxDnnOpenDevice(const char *devicePath, GxDnnDevice *device);
/*===*/
/**
 * @brief Close NPU device
 * @param [...
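optional: other parameters include config, feature_extractor, device, device_map, and so on.
pipeline text classification task
Find a text classifier on the HF Hub and download it; for how to download models, see "Ascend NPU 之 HuggingFace Transformers(一)". Taking the michellejieli/NSFW_text_classifier model as an example:
>>> import torch
>>> import torch_npu
>>> from transformers import pipeline # 读...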
In file included from /var/lib/dkms/davinci_ascend/1.0/build/vascend_drv/dma_pool_map.c:24:
./include/linux/dma-noncoherent.h:88:40: note: expected ‘phys_addr_t’ {aka ‘long long unsigned int’} but argument is of type ‘struct device *’
   88 | void arch_sync_dma_for_cpu...
DeviceMap                 ffffc08e24383d70
Owning Process            ffff8c0536112080       Image:         audiodg.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      945141         Ticks: 4159 (0:00:01:04.984)
Context Switch Count      192            IdealProcessor: 16
UserTime                  00:00:00.359
KernelTime                00:00:00.375
Win32 Start Address ntdll!TppWorkerThread ...
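├── special_tokens_map.json
├── tokenizer_config.json
├── tokenizer.json
└── vocab.txt
Because all data on the NPU is converted to FP16 precision for computation, while the original precision of the BGE Embedding model is FP32, some values that fall outside the range FP16 can represent will overflow during the FP32-to-FP16 conversion, for example a negative number with a very large magnitude. BGE Embedding also has this...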
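The overflow described above can be reproduced without an NPU. The sketch below uses only the Python standard library: `struct` format `"e"` is IEEE 754 half precision, and packing a value that is too large raises `OverflowError`, which the hypothetical helper `to_fp16` turns into a signed infinity to mimic what a hardware cast typically produces. The sample values are made up for illustration and are not taken from the BGE model.

```python
import math
import struct


def to_fp16(value):
    """Round a Python float to the nearest FP16 value; out-of-range values become +/-inf."""
    try:
        # Pack as half precision, then unpack to get the rounded value back.
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses values beyond the FP16 range; model the cast result as infinity.
        return math.copysign(math.inf, value)


# Illustrative FP32-style values: one large-magnitude negative, two in range.
values = [-1.0e5, 0.25, 65504.0]
converted = [to_fp16(v) for v in values]

# Values that were finite before the cast but are infinite after it.
overflowed = [v for v, c in zip(values, converted) if math.isinf(c) and not math.isinf(v)]

print("converted:", converted)
print("overflowed:", overflowed)
```

65504 is the largest finite FP16 value, so it survives the cast, while -1.0e5 does not. A check of this kind on dumped weights or activations is one way to locate which values are responsible before deciding how to handle them.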