kubis.ai USE AI NOW, ASK ME HOW

Posts

  • Blog (35)
    • AI Augmented Workflow 10
      • Claude Code Skills 5
      • Vscode Githubcopilot 5
    • Dgx_series 16
      • Benchmarks 4
    • Llms 5
      • Tokens In Logits Out 5
    • Markdown Et Al 3
  • Archive
    • 2026 21
      • April 2
      • March 19
    • 2025 14
      • December 8
      • August 5
      • June 1
  • Tags
    • AGENTS.md 1
    • AI 29
    • API 1
    • Agent SDK 1
    • BPE 1
    • Blackwell 1
    • CUTLASS 1
    • Claude Code 8
    • DGX Spark 16
    • DevOps 1
    • GPT-2 5
    • GPT-OSS-120B 2
    • Gemma4 2
    • Git 1
    • GitHub Copilot 4
    • Groq 1
    • Hugging Face 1
    • LLM 5
    • LiteLLM 4
    • Local AI 16
    • Marp 2
    • Mermaid 2
    • Nemotron 2
    • Ollama 1
    • Open WebUI 1
    • OpenAI 1
    • Ray 1
    • SentencePiece 1
    • VSCode 5
    • agents 1
    • ai-augmented-workflow 10
    • attention 1
    • automation 1
    • benchmarking 7
    • btop 1
    • cluster 2
    • diagrams 1
    • documentation 1
    • finance 1
    • introduction 1
    • llama-benchy 1
    • markdown 4
    • markdown-et-al 3
    • marp 1
    • memory bandwidth 1
    • mermaid 1
    • monitoring 1
    • no-code 1
    • plugins 1
    • presentations 2
    • proxy 1
    • quantization 1
    • recipes 1
    • reconciliation 1
    • sampling 1
    • skills 5
    • temperature 1
    • tokenization 1
    • top-k 1
    • top-p 1
    • transformer 2
    • tutorial 1
    • vLLM 11
    • visualization 1

Posts tagged with "proxy"

Found 1 post

March 13, 2026 · 6 min read

5/9 LiteLLM: The Translation Layer Between Claude Code and Local Models

Claude Code speaks Anthropic. gpt-oss-120b speaks OpenAI with Harmony-style tool calls. LiteLLM sits in the middle and translates — including a custom callback that patches the tool calls neither side gets right.

#DGX Spark #LiteLLM #Claude Code #vLLM #proxy #AI #Local AI
April 9, 2026 ·