The Lightest Local Model Launcher with Live UI Code Preview.
* macOS / Linux users can run oQoRun by downloading the native llama.cpp binaries and starting the backend with `node server.js`.
Designed from the ground up for low-end hardware: it runs smoothly on 10-year-old CPUs with just 8 GB of RAM. No dedicated GPU required.
Instantly hosts a lightning-fast, OpenAI-compatible API at `http://127.0.0.1:8081/v1`. The perfect backend engine for VS Code Continue, Cline, or OpenClaw (see the smoke test below).
Zero bloatware. No heavy Electron framework, no multi-gigabyte UI installs. Just pure, unadulterated Node.js performance.
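Once a model is loaded, you can smoke-test the endpoint directly. The model name below is illustrative; use the exact ID shown in the oQoRun API window:

```sh
# Minimal chat request against the local OpenAI-compatible API.
curl http://127.0.0.1:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini-reasoning-Q4_K_M.gguf",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```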
Not sure which model to download? Your computer's RAM (Memory) determines how large of a "brain" it can run. Use this table as a beginner's rule of thumb:
| Your RAM | Max Model Size (Params) | Reference Models | Experience |
|---|---|---|---|
| 8 GB (older laptops / office PCs) | 1B–3B | Phi-4-mini, Qwen2.5-1.5B, Ministral-3B | Fast & smooth text generation. The sweet spot for oQoRun! |
| 16 GB (modern standard PCs) | 7B–9B | Llama-3-8B, Gemma-2-9B, Qwen2.5-7B | Excellent logic & coding, moderate generation speed on CPU. |
| 32 GB (high-end PCs) | 12B–14B | Qwen2.5-14B, DeepSeek-Coder-V2-Lite | Advanced coding & logic. (Keep browser tabs closed.) |
| 64 GB (workstations) | 32B–70B | Qwen2.5-32B, Llama-3-70B (low quant) | Expert reasoning. (Minimum hardware: 24 GB+ VRAM graphics card required.) |
| 128 GB+ (monster AI servers) | 100B+ | Command R+, Goliath-120B | The ultimate endgame. Enterprise-grade intelligence. (Minimum hardware: multiple high-end GPUs required.) |
* Always download models in `.gguf` format and look for "Q4" or "Q5" in the filename (meaning the weights are quantized to 4 or 5 bits to save RAM).
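As a rough sizing rule, a Q4 file weighs about 0.6 GB per billion parameters, so a 7B model at Q4_K_M is roughly 4–4.5 GB on disk. Below is a hedged download sketch; `ORG`, `REPO`, and `MODEL` are hypothetical placeholders you must replace with the actual Hugging Face path of the model you chose:

```sh
# Illustrative only: substitute ORG, REPO, and MODEL with a real Hugging Face path.
curl -L -o models/MODEL-Q4_K_M.gguf \
  "https://huggingface.co/ORG/REPO/resolve/main/MODEL-Q4_K_M.gguf"
```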
To keep oQoRun the lightest model launcher possible, minimizing file size and avoiding copyright issues, this software package does NOT include the AI engine or the AI models themselves. Before using it, make sure you have manually prepared the following three items:
1. **Node.js**: The oQoRun backend server requires Node.js to run. Run `node -v` to verify it is installed.
2. **llama.cpp engine**: Because hardware varies (NVIDIA GPU vs. standard CPU), you must download the appropriate engine build for your machine (e.g., `llama-b8101-bin-win-cpu-x64` for a standard Windows CPU). Place the extracted `llama-server.exe` file directly inside this folder. This is the "brain" of the AI.
3. **AI model (`.gguf` file)**: Download at least one model in `.gguf` format and place your `.gguf` files into this `models` folder (see the layout sketch below).
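Once all three items are in place, your folder should look roughly like the sketch below, based on the files this README mentions (the model file name is just an example):

```
oQoRun/
├── oQoRun.bat
├── server.js
├── llama-server.exe
└── models/
    └── Phi-4-mini-reasoning-Q4_K_M.gguf
```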
Once you have prepared the three items above, double-click the following file in your folder:
👉 `oQoRun.bat`
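macOS / Linux users (or anyone who prefers the terminal) can approximate what the batch file does. This is a sketch assuming the default setup:

```sh
npm install      # first run only: set up the web environment
node server.js   # start the backend, then open http://localhost:3000
```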
Upon first execution, the system will automatically perform the following for you:
1. Run `npm install` to automatically set up the required web environment.
2. Open `http://localhost:3000` in your default browser.
3. Scan for `.gguf` models inside your `models` folder.
4. Show the **Connected** light at the top of the interface once the engine is ready.

oQoRun provides a seamless, OpenAI-compatible local API for AI agents. Here is the exact setup flow to hook up OpenClaw:
Run `openclaw configure` in your terminal and select these options:
o Where will the Gateway run?
| Local (this machine)
o Select sections to configure
| Model
o Model/auth provider
| Custom Provider
o API Base URL
| http://127.0.0.1:8081/v1
o API Key (leave blank if not required)
| 33113 (or any placeholder you prefer)
o Endpoint compatibility
| Unknown (detect automatically)
o Model ID
| Phi-4-mini-reasoning-Q4_K_M.gguf (copy the exact ID from the oQoRun API window)
o Detected OpenAI-compatible endpoint.
o Endpoint ID
| custom-127-0-0-1-8081 (Or any custom name)
* Model alias (optional)
| oQoRun (Or any custom alias)
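To copy the exact Model ID, you can also query the running server directly. This assumes the oQoRun backend is already up on the default port:

```sh
# List the model IDs exposed by the local OpenAI-compatible endpoint.
curl http://127.0.0.1:8081/v1/models
```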
If you see the error `Model context window too small (4096 tokens). Minimum is 16000.`, you must manually bypass OpenClaw's strict context-window limits:

1. Press `Ctrl + C` in your terminal to completely shut down the OpenClaw server.
2. Open both config files: `C:\Users\YourUsername\.openclaw\openclaw.json` (main config) and `C:\Users\YourUsername\.openclaw\agents\main\agent\models.json` (cache config).
3. Press `Ctrl + F` and search for your chosen Endpoint ID (e.g., `custom-127-0-0-1-8081`).
4. In every matching entry, set `"contextWindow": 128000,` and `"maxTokens": 8192`.
5. Restart with `openclaw gateway`. Your terminal will now connect perfectly!
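For orientation, the relevant part of an edited entry might look like the sketch below. Everything except `contextWindow` and `maxTokens` is an assumption about OpenClaw's config shape and may differ in your version:

```json
{
  "custom-127-0-0-1-8081": {
    "contextWindow": 128000,
    "maxTokens": 8192
  }
}
```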