-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[ENH]: Client side retries #5419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 09-02-_enh_consolidate_retries
Are you sure you want to change the base?
[ENH]: Client side retries #5419
Conversation
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. This stack of pull requests is managed by Graphite. Learn more about stacking. |
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
Add and Standardize Client-Side Retries for Python and JavaScript Clients This PR implements configurable client-side retry logic for both the Python ( Key Changes• Introduced a retry configuration model ( Affected Areas• This summary was automatically generated by @propel-code-bot |
2 Jobs Failed: PR checks / all-required-pr-checks-passed failed on "Decide whether the needed jobs succeeded or failed"
PR checks / Rust tests / Integration test ci_k8s_integration_slow 1 failed on "Unknown Step"Summary: 1 successful workflow, 1 failed workflow
Last updated: 2025-09-25 18:41:50 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add the same retry logic for js as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree we should do this for JS as well
chromadb/api/fastapi.py
Outdated
retry=retry_if_exception_type(is_retryable_exception), | ||
before_sleep=before_sleep_log(logger, logging.INFO), | ||
reraise=True | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably ok for now but maybe good to make this configurable in the future?
b614a69
to
b381e79
Compare
7690f2c
to
ef8c70f
Compare
chromadb/api/fastapi.py
Outdated
def _request_with_retry(): | ||
# If the request has json in kwargs, use orjson to serialize it, | ||
# remove it from kwargs, and add it to the content parameter | ||
# This is because httpx uses a slower json serializer | ||
if "json" in kwargs: | ||
data = orjson.dumps(kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CriticalError]
Critical bug: The kwargs
dictionary is being mutated inside the retry function, which will cause issues on subsequent retry attempts. When kwargs.pop("json")
is called on the first attempt, the "json" key is removed from kwargs permanently. On retry attempts, the JSON data will be missing, leading to malformed requests.
Suggested Change
def _request_with_retry(): | |
# If the request has json in kwargs, use orjson to serialize it, | |
# remove it from kwargs, and add it to the content parameter | |
# This is because httpx uses a slower json serializer | |
if "json" in kwargs: | |
data = orjson.dumps(kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY) | |
def _request_with_retry(): | |
# Create a copy of kwargs to avoid mutation across retries | |
request_kwargs = kwargs.copy() | |
# If the request has json in kwargs, use orjson to serialize it, | |
# remove it from kwargs, and add it to the content parameter | |
# This is because httpx uses a slower json serializer | |
if "json" in request_kwargs: | |
data = orjson.dumps(request_kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY) | |
request_kwargs["content"] = data | |
# Unlike requests, httpx does not automatically escape the path | |
escaped_path = urllib.parse.quote(path, safe="/", encoding=None, errors=None) | |
url = self._api_url + escaped_path | |
response = self._session.request(method, url, **cast(Any, request_kwargs)) | |
BaseHTTPClient._raise_chroma_error(response) | |
return orjson.loads(response.text) |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
Context for Agents
[**CriticalError**]
Critical bug: The `kwargs` dictionary is being mutated inside the retry function, which will cause issues on subsequent retry attempts. When `kwargs.pop("json")` is called on the first attempt, the "json" key is removed from kwargs permanently. On retry attempts, the JSON data will be missing, leading to malformed requests.
<details>
<summary>Suggested Change</summary>
```suggestion
def _request_with_retry():
# Create a copy of kwargs to avoid mutation across retries
request_kwargs = kwargs.copy()
# If the request has json in kwargs, use orjson to serialize it,
# remove it from kwargs, and add it to the content parameter
# This is because httpx uses a slower json serializer
if "json" in request_kwargs:
data = orjson.dumps(request_kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY)
request_kwargs["content"] = data
# Unlike requests, httpx does not automatically escape the path
escaped_path = urllib.parse.quote(path, safe="/", encoding=None, errors=None)
url = self._api_url + escaped_path
response = self._session.request(method, url, **cast(Any, request_kwargs))
BaseHTTPClient._raise_chroma_error(response)
return orjson.loads(response.text)
```
⚡ **Committable suggestion**
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
</details>
File: chromadb/api/fastapi.py
Line: 141
def is_retryable_exception(exception: BaseException) -> bool: | ||
if isinstance(exception, ( | ||
httpx.ConnectError, | ||
httpx.ConnectTimeout, | ||
httpx.ReadTimeout, | ||
httpx.WriteTimeout, | ||
httpx.PoolTimeout, | ||
httpx.NetworkError, | ||
httpx.RemoteProtocolError, | ||
)): | ||
return True | ||
|
||
if isinstance(exception, httpx.HTTPStatusError): | ||
# Retry on server errors that might be temporary | ||
return exception.response.status_code in [502, 503, 504] | ||
|
||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
To make this new retry functionality more robust and flexible, I have a couple of suggestions:
-
Testing: It would be beneficial to add unit tests for
is_retryable_exception
to cover the various exception types and status codes. This would help prevent regressions in the retry logic. -
Configurability: The retry parameters (e.g., number of attempts, backoff factor) are currently hardcoded. It might be beneficial to make these configurable via
Settings
. This would allow users to tune the retry behavior for their specific network conditions.
Context for Agents
[**BestPractice**]
To make this new retry functionality more robust and flexible, I have a couple of suggestions:
1. **Testing**: It would be beneficial to add unit tests for `is_retryable_exception` to cover the various exception types and status codes. This would help prevent regressions in the retry logic.
2. **Configurability**: The retry parameters (e.g., number of attempts, backoff factor) are currently hardcoded. It might be beneficial to make these configurable via `Settings`. This would allow users to tune the retry behavior for their specific network conditions.
File: chromadb/api/fastapi.py
Line: 83
ef8c70f
to
65487ad
Compare
b381e79
to
128ad9d
Compare
65487ad
to
93023f0
Compare
def _make_request(self, method: str, path: str, **kwargs: Dict[str, Any]) -> Any: | ||
# If the request has json in kwargs, use orjson to serialize it, | ||
# remove it from kwargs, and add it to the content parameter | ||
# This is because httpx uses a slower json serializer | ||
if "json" in kwargs: | ||
data = orjson.dumps(kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY) | ||
kwargs["content"] = data | ||
|
||
# Unlike requests, httpx does not automatically escape the path | ||
escaped_path = urllib.parse.quote(path, safe="/", encoding=None, errors=None) | ||
url = self._api_url + escaped_path | ||
|
||
response = self._session.request(method, url, **cast(Any, kwargs)) | ||
BaseHTTPClient._raise_chroma_error(response) | ||
return orjson.loads(response.text) | ||
def _send_request() -> Any: | ||
# If the request has json in kwargs, use orjson to serialize it, | ||
# remove it from kwargs, and add it to the content parameter | ||
# This is because httpx uses a slower json serializer | ||
if "json" in kwargs: | ||
data = orjson.dumps( | ||
kwargs.pop("json"), option=orjson.OPT_SERIALIZE_NUMPY | ||
) | ||
kwargs["content"] = data | ||
|
||
# Unlike requests, httpx does not automatically escape the path | ||
escaped_path = urllib.parse.quote( | ||
path, safe="/", encoding=None, errors=None | ||
) | ||
url = self._api_url + escaped_path | ||
|
||
response = self._session.request(method, url, **cast(Any, kwargs)) | ||
BaseHTTPClient._raise_chroma_error(response) | ||
return orjson.loads(response.text) | ||
|
||
retry_config = self._settings.retry_config | ||
|
||
if retry_config is None: | ||
return _send_request() | ||
|
||
min_delay = max(float(retry_config.min_delay), 0.0) | ||
max_delay = max(float(retry_config.max_delay), min_delay) | ||
multiplier = max(min_delay, 1e-3) | ||
exp_base = retry_config.factor if retry_config.factor > 0 else 2.0 | ||
|
||
wait_args = { | ||
"multiplier": multiplier, | ||
"min": min_delay, | ||
"max": max_delay, | ||
"exp_base": exp_base, | ||
} | ||
|
||
wait_strategy = ( | ||
wait_random_exponential(**wait_args) | ||
if retry_config.jitter | ||
else wait_exponential(**wait_args) | ||
) | ||
|
||
retrying = Retrying( | ||
stop=stop_after_attempt(retry_config.max_attempts), | ||
wait=wait_strategy, | ||
retry=retry_if_exception(is_retryable_exception), | ||
before_sleep=before_sleep_log(logger, logging.INFO), | ||
reraise=True, | ||
) | ||
|
||
try: | ||
return retrying(_send_request) | ||
except RetryError as e: | ||
# Re-raise the last exception that caused the retry to fail | ||
raise e.last_attempt.exception() from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
[CodeDuplication] The retry logic implemented in _make_request
is nearly identical to the logic in chromadb/api/async_fastapi.py
's _make_request
method (lines 153-209). This introduces code duplication, making future maintenance harder.
Consider refactoring the common retry configuration and execution logic into a shared helper function. This would centralize the retry strategy and reduce redundancy. For example, a helper could build the Retrying
or AsyncRetrying
object based on a flag.
Context for Agents
[**BestPractice**]
[CodeDuplication] The retry logic implemented in `_make_request` is nearly identical to the logic in `chromadb/api/async_fastapi.py`'s `_make_request` method (lines 153-209). This introduces code duplication, making future maintenance harder.
Consider refactoring the common retry configuration and execution logic into a shared helper function. This would centralize the retry strategy and reduce redundancy. For example, a helper could build the `Retrying` or `AsyncRetrying` object based on a flag.
File: chromadb/api/fastapi.py
Line: 186
93023f0
to
7df8178
Compare
7df8178
to
e80460b
Compare
128ad9d
to
e24519c
Compare
min_delay: int = 1 | ||
max_delay: int = 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
For consistency with the Javascript client and to allow for more granular control over backoff timing, consider changing min_delay
and max_delay
to float
type. The implementation already casts these values to float, so this change would make the model's type hint more accurate.
min_delay: int = 1 | |
max_delay: int = 5 | |
min_delay: float = 1.0 | |
max_delay: float = 5.0 |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
Context for Agents
[**BestPractice**]
For consistency with the Javascript client and to allow for more granular control over backoff timing, consider changing `min_delay` and `max_delay` to `float` type. The implementation already casts these values to float, so this change would make the model's type hint more accurate.
```suggestion
min_delay: float = 1.0
max_delay: float = 5.0
```
⚡ **Committable suggestion**
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
File: chromadb/config.py
Line: 104
Description of changes
Test plan
tested in tilt
pytest
for python,yarn test
for js,cargo test
for rustMigration plan
None. Need to send out a notice to upgrade clients
Observability plan
In tilt and staging
Documentation Changes
None