Ollama Has a Free API — Run LLMs Locally with One Command

Source: DEV Community
**TL;DR:** Ollama lets you run large language models locally on your machine. One command to download and run Llama 3, Mistral, Gemma, Phi, and 100+ models — with an OpenAI-compatible API.

## What Is Ollama?

Ollama makes local AI simple:

- **One command** — `ollama run llama3` and you're chatting
- **100+ models** — Llama 3, Mistral, Gemma, Phi, CodeLlama, etc.
- **OpenAI-compatible API** — drop-in replacement at `localhost:11434`
- **GPU acceleration** — NVIDIA, AMD, Apple Silicon
- **Model customization** — Modelfiles for custom system prompts
- **Free** — MIT license, runs on your hardware

## Quick Start

```bash
# Install
curl -fsSL https://ollama.com/install.sh | sh
# Or: brew install ollama

# Run a model (auto-downloads)
ollama run llama3.1

# Run smaller models for faster responses
ollama run phi3       # 3.8B — fast, good for coding
ollama run mistral    # 7B — great general purpose
ollama run gemma2     # 9B — Google's model
ollama run codellama  # For code generation
```

## REST API

```bash
# Chat completion
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "stream": false
}'
```
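The Modelfiles mentioned under model customization are small config files that define a model variant. A minimal hypothetical example (the `code-helper` name and prompt are illustrative, not from the article):

```
FROM llama3.1
PARAMETER temperature 0.2
SYSTEM You are a concise coding assistant.
```

Save this as `Modelfile`, then build and run the variant with `ollama create code-helper -f Modelfile` and `ollama run code-helper`.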
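The same `/api/chat` endpoint can be called from any language, not just curl. Here is a minimal Python sketch using only the standard library — the `chat` helper and its docstrings are mine, not part of Ollama; actually calling `chat()` assumes `ollama serve` is running locally with `llama3.1` pulled:

```python
import json
import urllib.request

# Default Ollama endpoint (the localhost:11434 port mentioned above)
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_payload(model: str, prompt: str) -> dict:
    """Build the JSON body that Ollama's /api/chat endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete JSON response instead of a stream
    }

def chat(model: str, prompt: str) -> str:
    """Send a single chat turn to a locally running Ollama server."""
    req = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_chat_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The assistant's reply lives under message.content
        return json.loads(resp.read())["message"]["content"]
```

Usage would be `chat("llama3.1", "Why is the sky blue?")`; because the API is OpenAI-compatible, you could equally point an existing OpenAI client library at `http://localhost:11434/v1`.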