agentfit

by MukundaKatta

io.github.MukundaKatta/agentfit

Token-aware message truncation: fit a chat history into your model's context budget.

agentfit-mcp

MCP server for @mukundakatta/agentfit. Lets Claude Desktop, Cursor, Cline, Windsurf, Zed, or any other MCP client estimate token counts and fit a chat history into a model's context budget on demand.

npx -y @mukundakatta/agentfit-mcp

Three tools:

  • count_tokens — estimate tokens in a string or chat-message array, with per-model estimator families (openai, anthropic, google, llama, default).
  • fit_messages — drop messages from a chat history until under a maxTokens budget. Supports drop-oldest, drop-middle, and priority strategies; honors preserveSystem, preserveFirstN, preserveLastN.
  • list_estimators — list the built-in estimator families.

Add to your client

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "agentfit": {
      "command": "npx",
      "args": ["-y", "@mukundakatta/agentfit-mcp"]
    }
  }
}

Cursor

~/.cursor/mcp.json:

{
  "mcpServers": {
    "agentfit": {
      "command": "npx",
      "args": ["-y", "@mukundakatta/agentfit-mcp"]
    }
  }
}

Cline / Windsurf / Zed

Same shape as above. The server speaks plain MCP over stdio, so any client that supports stdio MCP servers will work.

Tool examples

count_tokens:

{ "input": "hello world", "model": "claude-sonnet-4-6" }

Returns:

{ "tokens": 4, "model": "claude-sonnet-4-6" }
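
Most clients build the wire message for you, but it can help to see what a call looks like when talking to the server by hand. The sketch below wraps the count_tokens arguments above in a JSON-RPC 2.0 tools/call request, the method MCP uses to invoke a tool, and frames it as a single line for the stdio transport. The helper names buildToolCall and frame are hypothetical, and a real session must complete the MCP initialize handshake before any tools/call is accepted.

```javascript
// Hypothetical helper: wrap tool arguments in an MCP "tools/call" request.
// The envelope (jsonrpc, id, method, params.name, params.arguments) follows
// the MCP specification; the function name itself is illustrative.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical helper: the stdio transport sends one JSON message per line,
// so serialize without embedded newlines and terminate with "\n".
function frame(message) {
  return JSON.stringify(message) + "\n";
}

const request = buildToolCall(1, "count_tokens", {
  input: "hello world",
  model: "claude-sonnet-4-6",
});

// Print the framed request; in practice this line would be written to the
// server process's stdin after the initialize handshake.
process.stdout.write(frame(request));
```

The server's reply arrives as a JSON-RPC response with the same id. How this server encodes the result shown above inside that response is not specified here, so check the response payload rather than assuming a shape.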

fit_messages:

{
  "messages": [
    { "role": "system", "content": "You are precise." },
    { "role": "user", "content": "long context..." },
    { "role": "assistant", "content": "..." },
    { "role": "user", "content": "final question" }
  ],
  "maxTokens": 8000,
  "model": "claude-sonnet-4-6",
  "preserveSystem": true,
  "preserveLastN": 2,
  "strategy": "drop-oldest"
}

Returns:

{
  "messages": [...],
  "dropped": [...],
  "tokens": { "before": 12000, "after": 7800, "budget": 8000 },
  "fit": true
}

fit_messages always returns a structured result and never throws across the wire. If the budget is unreachable even after dropping every non-protected message, you get fit: false along with the partial result, so the caller can decide what to do.
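
To make the drop-oldest behavior and the fit: false case concrete, here is a minimal sketch of the idea. It is not the library's implementation: fitDropOldest and estimateTokens are hypothetical names, and the estimator is a stand-in that assumes roughly 4 characters per token. Only the option names and the shape of the result mirror what is documented above.

```javascript
// Hypothetical stand-in estimator: roughly 4 characters per token.
// The real estimator families in agentfit may count differently.
function estimateTokens(message) {
  return Math.ceil(message.content.length / 4);
}

// Sketch of a drop-oldest fit (assumed algorithm, not the library's code).
function fitDropOldest(messages, { maxTokens, preserveSystem = true, preserveLastN = 0 }) {
  const before = messages.reduce((sum, m) => sum + estimateTokens(m), 0);

  // A message is protected if it is a system prompt (when preserveSystem is
  // set) or if it falls within the last preserveLastN entries.
  const firstProtectedTail = messages.length - preserveLastN;
  const isProtected = (m, i) =>
    (preserveSystem && m.role === "system") || i >= firstProtectedTail;

  const dropFlags = messages.map(() => false);
  let after = before;

  // Walk from the oldest message forward, dropping unprotected messages
  // until the running total is within budget or nothing is left to drop.
  for (let i = 0; i < messages.length && after > maxTokens; i++) {
    if (isProtected(messages[i], i)) continue;
    dropFlags[i] = true;
    after -= estimateTokens(messages[i]);
  }

  // Always return a structured result; fit is false when the protected
  // messages alone exceed the budget.
  return {
    messages: messages.filter((_, i) => !dropFlags[i]),
    dropped: messages.filter((_, i) => dropFlags[i]),
    tokens: { before, after, budget: maxTokens },
    fit: after <= maxTokens,
  };
}

const history = [
  { role: "system", content: "You are precise." },
  { role: "user", content: "x".repeat(400) },
  { role: "assistant", content: "y".repeat(200) },
  { role: "user", content: "final question" },
];

// Reachable budget: only the oldest unprotected message is dropped.
console.log(fitDropOldest(history, { maxTokens: 60, preserveLastN: 2 }).fit);

// Unreachable budget: protected messages alone exceed it, so fit is false.
console.log(fitDropOldest(history, { maxTokens: 10, preserveLastN: 2 }).fit);
```

Returning the partial result instead of throwing lets the caller choose a fallback, such as summarizing the remaining messages or raising the budget.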

Why a separate MCP server

@mukundakatta/agentfit is a zero-dependency JavaScript library. This package wraps it as an MCP server so it's accessible from inside any MCP-aware AI assistant: ask Claude "how many tokens is this transcript?" or "trim this chat to 8k tokens preserving the system prompt and last 2 turns" and the assistant calls these tools directly.

Sibling MCP servers

Part of the agent-stack series, published as @mukundakatta/*-mcp packages.

License

MIT