The Lightest Local Model Launcher with Live UI Code Preview.
* macOS / Linux users can run oQoRun by downloading the native llama.cpp binaries and starting the backend with `node server.js`.
Designed from the ground up for low-end hardware: it runs smoothly on 10-year-old CPUs with just 8 GB of RAM. No dedicated GPU required.
Instantly hosts a lightning-fast, OpenAI-compatible API at `http://127.0.0.1:8081/v1`. The perfect backend engine for VS Code Continue, Cline, or OpenClaw (see the smoke test below).
Zero bloatware. No heavy Electron framework, no multi-gigabyte UI installs. Just pure, unadulterated Node.js performance.
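Once a model is loaded, you can smoke-test the endpoint directly. The model name below is illustrative; use the exact ID shown in the oQoRun API window:

```sh
# Minimal chat request against the local OpenAI-compatible API.
curl http://127.0.0.1:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Phi-4-mini-reasoning-Q4_K_M.gguf",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```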
Not sure which model to download? Your computer's RAM (Memory) determines how large of a "brain" it can run. Use this table as a beginner's rule of thumb:
| Your RAM | Max Model Size (Params) | Reference Models | Experience |
|---|---|---|---|
| 8 GB (older laptops / office PCs) | 1B–3B | Phi-4-mini, Qwen2.5-1.5B, Ministral-3B | Fast & smooth text generation. The sweet spot for oQoRun! |
| 16 GB (modern standard PCs) | 7B–9B | Llama-3-8B, Gemma-2-9B, Qwen2.5-7B | Excellent logic & coding, moderate generation speed on CPU. |
| 32 GB (high-end PCs) | 12B–14B | Qwen2.5-14B, DeepSeek-Coder-V2-Lite | Advanced coding & logic. (Keep browser tabs closed.) |
| 64 GB (workstations) | 32B–70B | Qwen2.5-32B, Llama-3-70B (low quant) | Expert reasoning. (Minimum hardware: 24 GB+ VRAM graphics card required.) |
| 128 GB+ (monster AI servers) | 100B+ | Command R+, Goliath-120B | The ultimate endgame. Enterprise-grade intelligence. (Minimum hardware: multiple high-end GPUs required.) |
* Always download models in `.gguf` format and look for "Q4" or "Q5" in the filename (meaning the weights are quantized to 4 or 5 bits to save RAM).
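As a rough sizing rule, a Q4 file weighs about 0.6 GB per billion parameters, so a 7B model at Q4_K_M is roughly 4–4.5 GB on disk. Below is a hedged download sketch; `ORG`, `REPO`, and `MODEL` are hypothetical placeholders you must replace with the actual Hugging Face path of the model you chose:

```sh
# Illustrative only: substitute ORG, REPO, and MODEL with a real Hugging Face path.
curl -L -o models/MODEL-Q4_K_M.gguf \
  "https://huggingface.co/ORG/REPO/resolve/main/MODEL-Q4_K_M.gguf"
```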
To keep oQoRun the lightest model launcher possible, minimizing file size and avoiding copyright issues, this software package does NOT include the AI engine or the AI models themselves. Before using it, make sure you have manually prepared the following three items:
1. **Node.js**: The oQoRun backend server requires Node.js to run. Run `node -v` to verify it is installed.
2. **llama.cpp engine**: Because hardware varies (NVIDIA GPU vs. standard CPU), you must download the appropriate engine build for your machine (e.g., `llama-b8101-bin-win-cpu-x64` for a standard Windows CPU). Place the extracted `llama-server.exe` file directly inside this folder. This is the "brain" of the AI.
3. **AI model (`.gguf` file)**: Download at least one model in `.gguf` format and place your `.gguf` files into this `models` folder (see the layout sketch below).
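Once all three items are in place, your folder should look roughly like the sketch below, based on the files this README mentions (the model file name is just an example):

```
oQoRun/
├── oQoRun.bat
├── server.js
├── llama-server.exe
└── models/
    └── Phi-4-mini-reasoning-Q4_K_M.gguf
```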
Once you have prepared the three items above, double-click the following file in your folder:
👉 `oQoRun.bat`
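macOS / Linux users (or anyone who prefers the terminal) can approximate what the batch file does. This is a sketch assuming the default setup:

```sh
npm install      # first run only: set up the web environment
node server.js   # start the backend, then open http://localhost:3000
```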
Upon first execution, the system will automatically perform the following for you:
1. Run `npm install` to automatically set up the required web environment.
2. Open `http://localhost:3000` in your default browser.
3. Scan for `.gguf` models inside your `models` folder.
4. Show the **Connected** light at the top of the interface once the engine is ready.

oQoRun provides a seamless, OpenAI-compatible local API for AI agents. Here is the exact setup flow to hook up OpenClaw:
Run `openclaw configure` in your terminal and select these options:
o Where will the Gateway run?
| Local (this machine)
o Select sections to configure
| Model
o Model/auth provider
| Custom Provider
o API Base URL
| http://127.0.0.1:8081/v1
o API Key (leave blank if not required)
| 33113 (or any placeholder you prefer)
o Endpoint compatibility
| Unknown (detect automatically)
o Model ID
| Phi-4-mini-reasoning-Q4_K_M.gguf (copy the exact ID from the oQoRun API window)
o Detected OpenAI-compatible endpoint.
o Endpoint ID
| custom-127-0-0-1-8081 (Or any custom name)
* Model alias (optional)
| oQoRun (Or any custom alias)
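To copy the exact Model ID, you can also query the running server directly. This assumes the oQoRun backend is already up on the default port:

```sh
# List the model IDs exposed by the local OpenAI-compatible endpoint.
curl http://127.0.0.1:8081/v1/models
```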
If you see the error `Model context window too small (4096 tokens). Minimum is 16000.`, you must manually bypass OpenClaw's strict context-window limits:

1. Press `Ctrl + C` in your terminal to completely shut down the OpenClaw server.
2. Open both config files: `C:\Users\YourUsername\.openclaw\openclaw.json` (main config) and `C:\Users\YourUsername\.openclaw\agents\main\agent\models.json` (cache config).
3. Press `Ctrl + F` and search for your chosen Endpoint ID (e.g., `custom-127-0-0-1-8081`).
4. In every matching entry, set `"contextWindow": 128000,` and `"maxTokens": 8192`.
5. Restart with `openclaw gateway`. Your terminal will now connect perfectly!
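For orientation, the relevant part of an edited entry might look like the sketch below. Everything except `contextWindow` and `maxTokens` is an assumption about OpenClaw's config shape and may differ in your version:

```json
{
  "custom-127-0-0-1-8081": {
    "contextWindow": 128000,
    "maxTokens": 8192
  }
}
```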