IBM Granite 4 Micro served from local GGUF server

Granite 4 Micro is an open-source LLM supporting a 1M context window. This demo uses only 2K context and max 1K output tokens. View Documentation

0 1
0 2
0 1
0 100
1 2000