# Run a local model with Ollama

Outcome: A single chat query completes against a model running on local Ollama — no cloud provider required.
Prerequisites:

- `gormes` installed.
- An Ollama server reachable at `http://localhost:11434/v1`.
- At least one model pulled (e.g. `ollama pull llama3.1`).
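If you are not sure whether a model has been pulled yet, the Ollama CLI can show what is already in the local store (this assumes the `ollama` binary is on your PATH):

```sh
# Show models already pulled into the local Ollama store
ollama list
```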
1. Confirm Ollama is up

   ```sh
   curl -s http://localhost:11434/v1/models | head
   ```

   You should see a JSON document listing local models.
2. Run a scripted chat query through Ollama

   The provider, endpoint, and model are invocation overrides set through environment variables (`gormes chat` itself only takes `-q`/`--query`):

   ```sh
   GORMES_INFERENCE_PROVIDER=ollama \
   GORMES_ENDPOINT=http://localhost:11434/v1 \
   GORMES_INFERENCE_MODEL=llama3.1 \
   gormes chat -q "test local model"
   ```

   Replace `llama3.1` with the exact tag returned by step 1 (the sketch after these steps shows one way to list the tags).
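If you plan to run several queries, the following sketch, assuming `jq` is installed and a POSIX-like shell, lists the exact model tags and exports the three overrides once so later `gormes` invocations pick them up:

```sh
# List the exact model tags the local server reports (OpenAI-compatible list shape)
curl -s http://localhost:11434/v1/models | jq -r '.data[].id'

# Export the overrides once for this shell session
export GORMES_INFERENCE_PROVIDER=ollama
export GORMES_ENDPOINT=http://localhost:11434/v1
export GORMES_INFERENCE_MODEL=llama3.1

# Subsequent calls no longer need the inline variable prefixes
gormes chat -q "test local model"
```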
## Verify

```sh
GORMES_INFERENCE_PROVIDER=ollama GORMES_ENDPOINT=http://localhost:11434/v1 GORMES_INFERENCE_MODEL=llama3.1 gormes chat -q "say hi"
```

Expected: a model-generated reply on stdout. The process exits with status 0.
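To make the exit-status expectation explicit, the same invocation can be wrapped in a small shell check (a sketch; it only inspects the exit code of the command above):

```sh
GORMES_INFERENCE_PROVIDER=ollama \
GORMES_ENDPOINT=http://localhost:11434/v1 \
GORMES_INFERENCE_MODEL=llama3.1 \
gormes chat -q "say hi" \
  && echo "verify: OK (exit 0)" \
  || echo "verify: FAILED (exit $?)"
```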
## Troubleshooting

- `Not Found: model 'xxx' not found` → The model tag does not exist locally. Pull it (`ollama pull <tag>`), then set `GORMES_INFERENCE_MODEL` to the exact tag.
- Connection refused / timeout → Ollama is not running on `localhost:11434`. Start it (`ollama serve`) or correct the `GORMES_ENDPOINT` URL.
- Want it as the default? → Persist with `gormes setup provider`, or set `hermes.provider`, `hermes.endpoint`, and `hermes.model` via `gormes config set` (a sketch follows this list).
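This page only names the command (`gormes config set`) and the keys, so the argument form below is an assumption; persisting Ollama as the default might look roughly like this, and `gormes setup provider` remains the interactive alternative:

```sh
# Hypothetical key/value form; confirm the exact syntax in the gormes docs
# (or use the interactive `gormes setup provider` instead)
gormes config set hermes.provider ollama
gormes config set hermes.endpoint http://localhost:11434/v1
gormes config set hermes.model llama3.1
```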
## See also

- Connect a provider and open chat
- Add a fallback provider chain — keep Ollama as a free fallback.