
Get Started

Large language models on Node.js.

This project is at an early stage; the Node.js API may change in the future, so use it with caution.


Prerequisites

  • Node.js version 16 or above

  • (Optional) TypeScript: if you want statically typed interfaces

  • (Optional) Python 3: if you need to convert a .pth model to GGML format

  • (Optional) Rust/C++ toolchains: if you need to compile from source

    • Rust for building the Rust Node API

    • CMake for building the llama.cpp project

    • a C++ compiler (Clang, GCC, or MSVC) for compiling the native C/C++ bindings


Compatibility

Currently supported models (all of which must be converted to GGML format):

Supported platforms:

  • darwin-x64
  • darwin-arm64
  • linux-x64-gnu (glibc >= 2.31)
  • linux-x64-musl
  • win32-x64-msvc

Node.js version: >= 16


Installation

  • Install the llama-node npm package

    npm install llama-node

  • Install at least one of the inference backends (each backend maps to an adapter import, as sketched after this list):

    • llama.cpp

      npm install @llama-node/llama-cpp

    • or llm-rs

      npm install @llama-node/core

    • or rwkv.cpp

      npm install @llama-node/rwkv-cpp
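
Each backend package pairs with an adapter class that you pass to the LLM constructor from llama-node. A minimal sketch of the corresponding imports follows; the llama.cpp path is taken from the first example later on this page, while the llm-rs and rwkv.cpp paths are assumptions following the same naming pattern, so verify them against your installed version.

// backends.mjs
import { LLM } from "llama-node";
// llama.cpp backend (@llama-node/llama-cpp) -- path confirmed by the example below
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
// llm-rs backend (@llama-node/core) -- assumed path, check your installation
// import { LLMRS } from "llama-node/dist/llm/llm-rs.js";
// rwkv.cpp backend (@llama-node/rwkv-cpp) -- assumed path, check your installation
// import { Rwkv } from "llama-node/dist/llm/rwkv-cpp.js";

// Pass the adapter class (not an instance) to the LLM wrapper
const llama = new LLM(LLamaCpp);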

Getting Model

  • For LLaMA and its derived models:

    llama-node uses llm-rs/llama.cpp under the hood and works with the model formats (GGML/GGMF/GGJT) derived from llama.cpp. Because the Meta-released model is licensed for research purposes only, this project does not provide model downloads. If you have obtained the original .pth model, please read the document and use the conversion tool provided by llama.cpp.

  • For RWKV models:

    RWKV is an open-source model developed by PENG Bo; all of its weights and training code are open source. Our RWKV backend uses the rwkv.cpp native bindings, which also use the GGML tensor format. You can download a GGML-quantized model from here or convert one yourself by following the document.


First example

This first example uses llama.cpp as the inference backend, so make sure you have installed the @llama-node/llama-cpp package.

// index.mjs
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import path from "path";

// Path to a GGML-format model file (adjust to wherever your model lives)
const model = path.resolve(process.cwd(), "../ggml-vic7b-q5_1.bin");
const llama = new LLM(LLamaCpp);

const config = {
    modelPath: model,
    enableLogging: true,
    nCtx: 1024,        // context window size in tokens
    seed: 0,
    f16Kv: false,
    logitsAll: false,
    vocabOnly: false,
    useMlock: false,
    embedding: false,
    useMmap: true,
    nGpuLayers: 0,     // number of layers to offload to the GPU
};

const template = `How are you?`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const run = async () => {
    await llama.load(config);

    await llama.createCompletion(
        {
            nThreads: 4,
            nTokPredict: 2048,  // maximum number of tokens to generate
            topK: 40,
            topP: 0.1,
            temp: 0.2,
            repeatPenalty: 1,
            prompt,
        },
        (response) => {
            // Called once per generated token; stream it to stdout
            process.stdout.write(response.token);
        }
    );
};

run();

To run this example:

node index.mjs
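
If you would rather collect the whole completion into a single string instead of streaming it to stdout, you can accumulate the tokens inside the callback. The sketch below reuses the llama instance, config, and prompt from the first example and relies only on the API shown there; the response object is assumed to carry each generated token in response.token, as above.

// collect.mjs -- sketch: accumulate streamed tokens into one string
const collect = async () => {
    await llama.load(config);

    let output = "";
    await llama.createCompletion(
        {
            nThreads: 4,
            nTokPredict: 256,
            topK: 40,
            topP: 0.1,
            temp: 0.2,
            repeatPenalty: 1,
            prompt,
        },
        (response) => {
            output += response.token; // each callback delivers one decoded token
        }
    );

    console.log(output);
};

collect();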

More examples

Visit our example folder here

Acknowledgments

This library is published under the MIT/Apache-2.0 license. However, we strongly recommend that you cite our work and the work of our dependencies if you wish to reuse code from this library.

Model and inference tool dependencies:

Some source code comes from: