Do you need to ask a question?
- I have searched the existing questions and discussions, and this question is not already answered.
- I believe this is a legitimate question, not just a bug or feature request.
Your Question
I'm having persistent timeout issues when trying to use local Ollama models through LiteLLM-Proxy in my RAG-Anything setup. The API endpoints work fine when tested on their own, but local model calls always time out despite various attempts to fix it.
Additional Context
Environment Setup:
- Ollama Service: Running locally on port 11434
- LiteLLM-Proxy: Running on port 9000, configured with base_url: http://ollama:11434/v1
- RAG-Anything: Running on port 8801, using API base: http://litellm-proxy:9000/v1
- Model: Testing with Qwen3-8B-Q6_K and smaller 4B models
- Hardware: Ubuntu 22.04, RTX 4090 (modified to 48 GB VRAM)
- LiteLLM config:
  - model_name: Qwen3-8B-Q6_K
    litellm_params:
      model: openai/Qwen3-8B-Q6_K
      api_base: http://ollama:11434/v1
      api_key: dummy
      enable_thinking: false
      timeout: 600
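For reference, a minimal script along these lines could reproduce the timeout outside RAG-Anything by timing one short completion against each hop of the chain. This is a rough sketch: the hostnames ollama and litellm-proxy are assumed to resolve from wherever it runs (inside the same Docker network; substitute localhost and the mapped ports otherwise), and Qwen3-8B-Q6_K is assumed to also be the tag Ollama serves on its /v1 endpoint.

```python
# Minimal check outside RAG-Anything: send the same short prompt directly to
# Ollama's OpenAI-compatible endpoint and through the LiteLLM proxy, timing both.
# Hostnames, ports, and the model tag follow the setup above and may need adjusting.
import time
from openai import OpenAI

ENDPOINTS = {
    "ollama-direct": "http://ollama:11434/v1",
    "litellm-proxy": "http://litellm-proxy:9000/v1",
}

for name, base_url in ENDPOINTS.items():
    client = OpenAI(base_url=base_url, api_key="dummy")
    start = time.time()
    try:
        resp = client.chat.completions.create(
            model="Qwen3-8B-Q6_K",
            messages=[{"role": "user", "content": "Reply with the single word: ok"}],
            timeout=120,  # fail fast instead of waiting the full 600 s
        )
        print(f"{name}: {time.time() - start:.1f}s -> {resp.choices[0].message.content!r}")
    except Exception as exc:
        print(f"{name}: failed after {time.time() - start:.1f}s ({exc})")
```

If the direct call returns quickly but the proxied one does not, the stall would be between LiteLLM and Ollama; if both hang, it would point at the Ollama/model side.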
What I've Already Tried:
- Increased timeout settings significantly
- Limited max_workers to reduce load
- Switched to smaller 4B parameter models
- Verified API endpoints work independently
- Confirmed Ollama service responds to direct calls
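Since the logs below show the worker giving up after 210 s while Ollama itself only returns a 500 after the full 10 minutes, a streaming call with a longer prompt could show whether any tokens are produced before the worker limit or nothing arrives at all. Again only a sketch; the endpoint, model tag, and prompt size are assumptions to adjust to the real workload.

```python
# Streaming time-to-first-token check through the proxy with a longer prompt,
# to distinguish slow generation (tokens trickle in) from a request that hangs
# outright. Endpoint and model tag follow the setup above; adjust as needed.
import time
from openai import OpenAI

client = OpenAI(base_url="http://litellm-proxy:9000/v1", api_key="dummy")
prompt = "Summarize the following text in one sentence:\n" + "lorem ipsum " * 500

start = time.time()
stream = client.chat.completions.create(
    model="Qwen3-8B-Q6_K",
    messages=[{"role": "user", "content": prompt}],
    stream=True,
    timeout=300,
)
first_token_at = None
n_chunks = 0
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.time() - start
        n_chunks += 1
print(f"first token after {first_token_at}s, {n_chunks} content chunks, "
      f"total {time.time() - start:.1f}s")
```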
Relevant console log excerpts:
- app |Received empty content from OpenAI API
- app |WARNING: limit_async: Worker timeout for task 140201160394224_2571.427 after 210s
- app |TimeoutError: [LLM func] limit_async: Worker execution timeout after 210s
- ollama | [GIN] 2025/09/10 - 12:32:42 | 500 | 10m0s | 172.18.0.2 | POST "/v1/chat/completions"