# llama_cpp

C+ bindings for [llama.cpp](https://github.com/ggml-org/llama.cpp)'s C API, in
two layers:

- **Raw FFI** generated straight from the upstream headers with
  [`cpc-bindgen`](/docs/tooling) (`build.sh`), so the binding tracks the C API
  rather than reimplementing it.
- **A hand-written safe facade** — `Session`, with `load` / `generate` /
  `tokenize` / `decode` / `sample` — over that raw layer, so day-to-day use is
  ownership-safe C+ rather than raw pointers.

The package links against `libllama` and `libmtmd`; set `search-paths` in the
manifest's `[link]` table to a local llama.cpp build:

```toml
[link]
search-paths = ["/path/to/llama.cpp/build"]
```

See [Modules & packages](/docs/modules-and-packages) for the `[link]` table. The
`llama_cpp_smoke` recipe in the compiler repo shows greedy generation end to end.
This is the runnable counterpart to the [field journal on porting ggml to
C+](/blog/porting-llama-cpp-ggml-core-to-cplus).
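
For readers unfamiliar with the term, greedy generation means picking the highest-scoring token at every step and feeding it back in until an end-of-sequence token or a length limit is reached. The sketch below shows that loop in Python against a toy stand-in model. It is not the `llama_cpp_smoke` recipe and does not use the `Session` API; `greedy_generate`, `argmax`, and `toy_logits` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value; ties go to the lowest index."""
    if not values:
        raise ValueError("argmax of an empty sequence")
    best = 0
    for i in range(1, len(values)):
        if values[i] > values[best]:
            best = i
    return best


def greedy_generate(
    next_logits: Callable[[List[int]], Sequence[float]],
    prompt: Sequence[int],
    eos_id: int,
    max_new_tokens: int,
) -> List[int]:
    """Greedy decoding: repeatedly append the single most likely next token.

    `next_logits` is a hypothetical callback standing in for a model's
    decode step: given all tokens so far, it returns one score per
    vocabulary id.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = argmax(next_logits(tokens))
        if token == eos_id:
            break  # stop without emitting the end-of-sequence token
        tokens.append(token)
    # Return only the newly generated tokens, not the prompt.
    return tokens[len(prompt):]


def toy_logits(tokens: List[int]) -> List[float]:
    """Toy model (assumption): 4-token vocabulary, favours (last + 1) % 4."""
    target = (tokens[-1] + 1) % 4
    return [1.0 if i == target else 0.0 for i in range(4)]
```

With the toy model, `greedy_generate(toy_logits, [0], eos_id=3, max_new_tokens=8)` returns `[1, 2]`: the loop emits 1, then 2, and stops when the top-scoring token is the end-of-sequence id. A real decode step would replace `toy_logits` with a call into the model; sampling strategies other than greedy differ only in how the next token is chosen from the scores.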