
[Question]: Local Ollama models always time out and sometimes return null via LiteLLM-Proxy, while direct API calls work #108

@mikumiiku

Description

Do you need to ask a question?

  • I have searched the existing questions and discussions, and this question is not already answered.
  • I believe this is a legitimate question, not just a bug or feature request.

Your Question

I'm having persistent timeout issues when using local Ollama models through LiteLLM-Proxy in my RAG-Anything setup. Direct calls to the API endpoints work fine, but local model calls routed through the proxy always time out, despite various attempts to fix it.

Additional Context

Environment Setup:

  • Ollama Service: Running locally on port 11434
  • LiteLLM-Proxy: Running on port 9000, configured with base_url: http://ollama:11434/v1
  • RAG-Anything: Running on port 8801, using API base: http://litellm-proxy:9000/v1
  • Model: Testing with Qwen3-8B-Q6_K and smaller 4B models
  • Hardware: Ubuntu 22.04, RTX 4090 (48GB modified)
  • LiteLLM config:

```yaml
- model_name: Qwen3-8B-Q6_K
  litellm_params:
    model: openai/Qwen3-8B-Q6_K
    api_base: http://ollama:11434/v1
    api_key: dummy
    enable_thinking: false
    timeout: 600
```
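
For context, RAG-Anything talks to the proxy through a custom LLM function. Mine is wired roughly like the sketch below, built on LightRAG's OpenAI-compatible helper; the hostnames and model name are the ones above, and it is my assumption (not verified) that the extra `timeout` kwarg is forwarded to the underlying OpenAI client:

```python
# Rough sketch of my llm_model_func wiring. Assumption: extra kwargs such
# as `timeout` are forwarded by the helper to the OpenAI client's create call.
from lightrag.llm.openai import openai_complete_if_cache

def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    return openai_complete_if_cache(
        "Qwen3-8B-Q6_K",                          # model name as registered in LiteLLM
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key="dummy",                          # the proxy does not check the key
        base_url="http://litellm-proxy:9000/v1",  # through LiteLLM-Proxy, not Ollama directly
        timeout=600,
        **kwargs,
    )
```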

What I've Already Tried:

  • Increased timeout settings significantly
  • Limited max_workers to reduce load
  • Switched to smaller 4B parameter models
  • Verified API endpoints work independently
  • Confirmed Ollama service responds to direct calls (see the probe sketch after this list)
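
The last two items mean a probe along these lines (a minimal sketch; hostnames and the model name come from the environment above, and I assume the Ollama tag matches the name LiteLLM registers):

```python
# Send the same request to both hops to isolate where the timeout occurs:
# straight to Ollama's OpenAI-compatible endpoint, then through LiteLLM-Proxy.
from openai import OpenAI

def probe(label: str, base_url: str) -> None:
    client = OpenAI(base_url=base_url, api_key="dummy", timeout=600)
    try:
        resp = client.chat.completions.create(
            model="Qwen3-8B-Q6_K",
            messages=[{"role": "user", "content": "Reply with the single word: ok"}],
        )
        print(f"{label}: {resp.choices[0].message.content!r}")
    except Exception as exc:
        print(f"{label}: FAILED ({exc})")

probe("direct ollama", "http://ollama:11434/v1")
probe("via litellm-proxy", "http://litellm-proxy:9000/v1")
```

With this, the direct hop answers; the proxied hop is the one that hangs.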

Some of the console logs:

  • `app | Received empty content from OpenAI API`
  • `app | WARNING: limit_async: Worker timeout for task 140201160394224_2571.427 after 210s`
  • `app | TimeoutError: [LLM func] limit_async: Worker execution timeout after 210s`
  • `ollama | [GIN] 2025/09/10 - 12:32:42 | 500 | 10m0s | 172.18.0.2 | POST "/v1/chat/completions"`

For what it's worth, the Ollama-side 500 arrives after exactly 10m0s, which matches the `timeout: 600` in the LiteLLM config, while the app-side worker gives up much earlier, at 210s.
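
To see whether any tokens come back before those cutoffs, a streaming variant of the same call through the proxy (a sketch, same assumed hostnames and model name as above) can distinguish a stalled generation from a response that never starts:

```python
# Streaming probe through the proxy: print tokens as they arrive.
# If tokens trickle in and then stall, the model generates but something
# upstream drops the response; if nothing arrives, generation never starts.
from openai import OpenAI

client = OpenAI(base_url="http://litellm-proxy:9000/v1", api_key="dummy", timeout=600)
stream = client.chat.completions.create(
    model="Qwen3-8B-Q6_K",
    messages=[{"role": "user", "content": "Count from 1 to 5."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```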
