How I cut my OpenClaw costs in half (Lumin)

Source: DEV Community
I've been spending a lot of time working with agentic workflows, especially OpenClaw-style loops where the model re-uses big chunks of context every turn. That gets expensive fast. Most of the cost was coming from repeated context, oversized prompts, and the same structure being sent back and forth on every turn. Often you're not paying for new reasoning; you're paying for the same prompt, tool context, and loop structure again and again.

This led me to build Lumin, a local proxy that sits in front of model calls and reduces cost before the request ever reaches the provider.

GitHub: https://github.com/ryancloto-dot/Lumin

The problem

Most agentic loops look like this:

- Large system prompts sent every turn
- Repeated workspace and tool context
- Loops re-sending almost identical request structures
- Agents paying full price even when most of the prompt hasn't changed

How Lumin works

Lumin sits between your agent and the model provider:

Your agent → Lumin → OpenAI / Anthropic / Google / Ollam
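To see why a proxy in that position has so much to work with, it helps to measure how much of each turn's request is byte-identical to the previous turn. The sketch below is not Lumin's implementation; it's a minimal, self-contained illustration (function names and the toy requests are my own) of the repeated-context problem a caching proxy can exploit.

```python
import hashlib

def split_blocks(request: dict) -> list[str]:
    """Flatten a chat-style request into its text blocks (system prompt + messages)."""
    blocks = [request.get("system", "")]
    blocks += [m["content"] for m in request.get("messages", [])]
    return blocks

def repeated_fraction(prev: dict, curr: dict) -> float:
    """Fraction of the current request's characters already sent on the previous turn.

    Blocks are compared by hash, so only byte-identical chunks count as repeated --
    a rough stand-in for what a deduplicating proxy could avoid re-paying for.
    """
    prev_hashes = {hashlib.sha256(b.encode()).hexdigest() for b in split_blocks(prev)}
    curr_blocks = split_blocks(curr)
    total = sum(len(b) for b in curr_blocks) or 1
    repeated = sum(
        len(b) for b in curr_blocks
        if hashlib.sha256(b.encode()).hexdigest() in prev_hashes
    )
    return repeated / total

# Two consecutive turns of a toy agent loop: the large system prompt and the
# earlier user message are resent verbatim on turn 2.
turn1 = {
    "system": "You are an agent." * 50,
    "messages": [{"role": "user", "content": "list files"}],
}
turn2 = {
    "system": "You are an agent." * 50,
    "messages": [
        {"role": "user", "content": "list files"},
        {"role": "assistant", "content": "ls -> a.py b.py"},
        {"role": "user", "content": "open a.py"},
    ],
}

print(f"{repeated_fraction(turn1, turn2):.0%} of turn 2 was already sent in turn 1")
```

Even in this tiny example, the overwhelming majority of turn 2's characters were already paid for in turn 1, which is exactly the slack a local proxy sitting in front of the provider can reclaim.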