
Backends

llama-node currently supports three backends: llm-rs, llama.cpp, and rwkv.cpp.

llm-rs can run multiple inference sessions concurrently.

llama.cpp and rwkv.cpp treat concurrent async inference calls as sequential requests: calls issued in parallel are queued and executed one at a time.
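
To make the difference concrete, here is a minimal sketch that issues two completions at once. It assumes the `LLM` wrapper, the `LLamaCpp` adapter import path, and the config/parameter field names shown in the project's getting-started examples (these may differ across versions); the model path is a placeholder.

```typescript
import { LLM } from "llama-node";
// Adapter import path as used in the getting-started examples (assumption).
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import path from "path";

const llama = new LLM(LLamaCpp);

const run = async () => {
  await llama.load({
    // Placeholder path: point this at a local GGML model file.
    modelPath: path.resolve(process.cwd(), "./your-ggml-model.bin"),
    enableLogging: false,
    nCtx: 1024,
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
    nGpuLayers: 0,
  });

  const completion = (prompt: string) =>
    llama.createCompletion(
      {
        prompt,
        nThreads: 4,
        nTokPredict: 64,
        topK: 40,
        topP: 0.1,
        temp: 0.2,
        repeatPenalty: 1,
      },
      (response: { token: string }) => process.stdout.write(response.token)
    );

  // Both completions are issued concurrently. On the llama.cpp and
  // rwkv.cpp backends they execute one after another; on llm-rs they
  // can run at the same time.
  await Promise.all([
    completion("One plus one equals"),
    completion("The capital of France is"),
  ]);
};

run();
```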


- To use the llama.cpp backend, run `npm install @llama-node/llama-cpp`
- To use the llm-rs backend, run `npm install @llama-node/core`
- To use the rwkv.cpp backend, run `npm install @llama-node/rwkv-cpp`
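
Whichever backend package you install, the JavaScript entry point stays the same: you pass the matching adapter class to the `LLM` wrapper. Below is a sketch of the pairing, assuming the adapter class names and import paths (`LLamaCpp`, `LLMRS`, `Rwkv`) used in the project's examples; verify them against your installed version.

```typescript
import { LLM } from "llama-node";
// Import exactly one adapter, matching the native package you installed:
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js"; // @llama-node/llama-cpp
// import { LLMRS } from "llama-node/dist/llm/llm-rs.js";    // @llama-node/core
// import { Rwkv } from "llama-node/dist/llm/rwkv-cpp.js";   // @llama-node/rwkv-cpp

// The wrapper API (load / createCompletion) is shared across backends.
const llama = new LLM(LLamaCpp);
```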