Firebird API

LLM Inference Engine with BitNet Support.

Module: src/firebird/

CLI Commands

Chat Mode

```shell
./bin/firebird chat --model path/to/model.gguf
```

Server Mode

```shell
./bin/firebird serve --port 8080 --model model.gguf
```

HTTP API

POST /v1/chat/completions

OpenAI-compatible chat completions endpoint. Example request body:

```json
{
  "model": "bitnet-3b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7
}
```
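Because the endpoint is OpenAI-compatible, any standard HTTP client can call it. Below is a minimal Python sketch using only the standard library; it assumes the server from the `serve` example is listening on `http://localhost:8080` (the base URL, model name, and helper names here are illustrative, not part of the Firebird API itself).

```python
import json
import urllib.request


def build_chat_request(model, messages, temperature=0.7):
    """Build an OpenAI-compatible chat completion payload."""
    return {"model": model, "messages": messages, "temperature": temperature}


def chat(base_url, payload):
    """POST the payload to /v1/chat/completions and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_chat_request(
    "bitnet-3b", [{"role": "user", "content": "Hello!"}]
)
# chat("http://localhost:8080", payload)  # requires a running server
```

Existing OpenAI client libraries should also work if pointed at the server's base URL.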

Performance

| Model Size | Memory | Tokens/sec |
|------------|--------|------------|
| 1.5B       | ~1 GB  | 15-20      |
| 3B         | ~2 GB  | 8-12       |
| 7B         | ~4 GB  | 4-6        |
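As a rough sanity check on the memory column: BitNet-style models use ternary weights ({-1, 0, +1}), which need about log2(3) ≈ 1.58 bits each when packed. The sketch below estimates the packed weight footprint alone; the table's totals are presumably higher because they also cover embeddings, KV cache, and runtime buffers.

```python
import math

# Ternary weights {-1, 0, +1} carry ~1.58 bits of information each.
BITS_PER_WEIGHT = math.log2(3)


def weight_memory_gb(n_params):
    """Approximate memory (GB) for the packed ternary weights alone."""
    return n_params * BITS_PER_WEIGHT / 8 / 1e9


for n in (1.5e9, 3e9, 7e9):
    print(f"{n / 1e9:.1f}B params -> {weight_memory_gb(n):.2f} GB (weights only)")
```

For a 3B model this gives roughly 0.6 GB of packed weights, consistent with the ~2 GB total in the table once runtime overhead is included.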