This feature would be very helpful for a WIP PR that maps an entire ggml cgraph to the QNN graph. It seems to be bad news for my formal third PR, #12326, but that doesn't matter 🤗 and I'd like to see a similar PR from others in this great tech community succeed, although that ...
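For context, the core of the "map the whole cgraph" idea is a single pass over the graph's nodes that translates each ggml op into a backend op, falling back if anything is unsupported. Below is a minimal sketch, assuming direct access to `cgraph->nodes`/`cgraph->n_nodes` as in the snippet that follows; the `qnn_graph_*` helpers and the `qnn_graph` type are hypothetical placeholders for illustration, not the real QNN SDK API.

```c
#include <stdbool.h>
#include "ggml.h"

// Hypothetical backend-side API, for illustration only.
struct qnn_graph;
bool qnn_graph_add_mul_mat(struct qnn_graph * g,
                           const struct ggml_tensor * a,
                           const struct ggml_tensor * b,
                           const struct ggml_tensor * dst);
bool qnn_graph_finalize(struct qnn_graph * g);

// Walk every node of the ggml compute graph and translate it into one
// backend graph. Returns false on the first unsupported op so the caller
// can fall back to another backend for the whole graph.
static bool map_cgraph_to_qnn(const struct ggml_cgraph * cgraph,
                              struct qnn_graph * qgraph) {
    for (int i = 0; i < cgraph->n_nodes; i++) {
        struct ggml_tensor * node = cgraph->nodes[i];
        switch (node->op) {
            case GGML_OP_MUL_MAT:
                if (!qnn_graph_add_mul_mat(qgraph, node->src[0], node->src[1], node)) {
                    return false; // unsupported shape/type
                }
                break;
            // ... other ops would be handled here ...
            default:
                return false; // op not supported by this backend graph
        }
    }
    return qnn_graph_finalize(qgraph); // compile the composed graph once
}
```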
```c
void ggml_graph_add_node(struct ggml_cgraph * cgraph, struct ggml_tensor * tensor) {
    cgraph->nodes[cgraph->n_nodes] = tensor;
    cgraph->n_nodes++;
}
```

Collaborator slaren commented on Sep 11, 2024:

`GGML_ASSERT(cgraph->size > cgraph->n_nodes)` ...
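Applying slaren's suggested assertion gives a bounds-checked version; a minimal sketch, assuming the check goes right before the write (the exact upstream placement may differ):

```c
void ggml_graph_add_node(struct ggml_cgraph * cgraph, struct ggml_tensor * tensor) {
    // fail loudly instead of silently writing past the end of the nodes array
    GGML_ASSERT(cgraph->size > cgraph->n_nodes);
    cgraph->nodes[cgraph->n_nodes] = tensor;
    cgraph->n_nodes++;
}
```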
```
evaluate_and_capture_cuda_graph(ggml_backend_cuda_context*, ggml_cgraph*, std::vector<void*, std::allocator<void*> >&, bool&, bool&, bool&)
    in /home/diego/code/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:2655 [0x12994c]
=== in /home/diego/code/llama.cpp/build/bin/libggml...
```
```
... int, unsigned long, unsigned long) () from llama.cpp/build-rpc-cuda/bin/libggml-rpc.so
#6  0x00007fcc8309c0a5 in ggml_backend_rpc_start_server () from llama.cpp/build-rpc-cuda/bin/libggml-rpc.so
#7  0x000055c29a4d9ade in main ()
[Inferior 1 (process 225145) detached]
```
```
#5  0x0000559119933cff in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) ()
#6  0x0000559119abb4bb in ggml_backend_sched_graph_compute_async ()
#7  0x0000559119b0d7b0 in llama_decode ()
#8  0x0000559119bcd039 in llama_init_from_gpt_params(gpt_params&) ()
#9  0x...
```