Conversation

jiayin-nvidia
Contributor

@jiayin-nvidia jiayin-nvidia commented Oct 14, 2025

What does this PR do?

Previously, the NVIDIA inference provider implemented a custom openai_embeddings method with a hardcoded input_type="query" parameter, which NVIDIA asymmetric embedding models require (#3205).
An extra_body parameter was recently added to the embeddings API (#3794), so this PR updates the NVIDIA inference provider to use the base OpenAIMixin.openai_embeddings method and to pass input_type through extra_body for asymmetric embedding models.
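
As a rough illustration of the caller-side change described above, the sketch below assembles the keyword arguments for an OpenAI-compatible embeddings call and places input_type in extra_body. The helper name build_embedding_kwargs is hypothetical and is not part of this PR; the sketch only builds the arguments and does not contact any server.

```python
from typing import Any, Dict, List, Optional


def build_embedding_kwargs(
    model: str, texts: List[str], input_type: Optional[str] = None
) -> Dict[str, Any]:
    """Assemble keyword arguments for an OpenAI-compatible embeddings call.

    Hypothetical helper for illustration only; it is not code from this PR.
    """
    kwargs: Dict[str, Any] = {"model": model, "input": texts}
    if input_type is not None:
        # Provider-specific fields such as input_type go in extra_body
        # rather than being hardcoded inside the provider.
        kwargs["extra_body"] = {"input_type": input_type}
    return kwargs


kwargs = build_embedding_kwargs(
    "nvidia/nv-embedqa-e5-v5",
    ["What is the capital of France?"],
    input_type="query",
)
# With an OpenAI-compatible client, this would be sent as:
#   client.embeddings.create(**kwargs)
```

Keeping input_type in extra_body leaves the choice with the caller, which appears to be what allows the provider to fall back to the shared base method.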

Test Plan

Run the following command once for each embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2, nvidia/nv-embedqa-e5-v5, nvidia/nv-embedqa-mistral-7b-v2, and snowflake/arctic-embed-l.

pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model={embedding_model} --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com" --inference-mode=record

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 14, 2025
@jiayin-nvidia jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch 2 times, most recently from 98c19d1 to f4f203a Compare October 14, 2025 05:51
@jiayin-nvidia jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from f4f203a to 1d4d263 Compare October 14, 2025 16:26
"model": embedding_model_id,
"input": [],
}
if is_asymmetric_model(client_with_models, embedding_model_id):
Contributor

You can always pass extra_body unconditionally. If it is null, it is null and no harm is done, so you don't need the double check.

Contributor Author

@jiayin-nvidia jiayin-nvidia Oct 14, 2025

Good point! I also consolidated the verification and retrieval of extra_body for asymmetric models into a single helper function that's used consistently across all openai_embeddings test cases.

For other models, return None.
"""
is_asymmetric = is_asymmetric_model(client_with_models, model_id)
if is_asymmetric:
Contributor

@ashwinb ashwinb Oct 14, 2025

nah I don't think we need to be that defensive. i'd just simplify this a bunch more. simply make the callsites be:

client.embeddings.create(..., extra_body=get_extra_body())

that's it!

@jiayin-nvidia jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from c8d54e3 to 562ad43 Compare October 14, 2025 20:31
return providers[provider_id]


def is_asymmetric_model(client_with_models, model_id):
Contributor

where is this used now?

Contributor Author

Updated: is_asymmetric_model is now called inside get_extra_body_for_model, which returns None for extra_body when the model is not asymmetric.
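
A minimal sketch of the pattern discussed in this thread, assuming a simplified membership check in place of the real is_asymmetric_model(client_with_models, model_id). The set ASYMMETRIC_MODEL_IDS, its single entry, and the embed wrapper are illustrative assumptions, not code from the PR.

```python
from typing import Any, Callable, Dict, List, Optional, Set

# Hypothetical stand-in for the PR's is_asymmetric_model check; the real
# function inspects client_with_models, which is not modeled here.
ASYMMETRIC_MODEL_IDS: Set[str] = {"nvidia/nv-embedqa-e5-v5"}  # illustrative entry


def get_extra_body_for_model(
    model_id: str, asymmetric_ids: Set[str] = ASYMMETRIC_MODEL_IDS
) -> Optional[Dict[str, Any]]:
    """Return the extra_body for asymmetric models; for other models, return None."""
    if model_id in asymmetric_ids:
        return {"input_type": "query"}
    return None


def embed(create_fn: Callable[..., Any], model_id: str, texts: List[str]) -> Any:
    # The call site passes extra_body unconditionally; None simply means
    # no extra fields are attached to the request.
    return create_fn(
        model=model_id,
        input=texts,
        extra_body=get_extra_body_for_model(model_id),
    )
```

In a test, create_fn would be client.embeddings.create, so every call site keeps the same shape regardless of whether the model is asymmetric.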

@jiayin-nvidia jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch 2 times, most recently from f651ece to 81e04c7 Compare October 14, 2025 20:40
@jiayin-nvidia jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from 81e04c7 to 33190c1 Compare October 14, 2025 20:42
@ashwinb ashwinb merged commit d875e42 into llamastack:main Oct 14, 2025
23 checks passed
Labels

CLA Signed This label is managed by the Meta Open Source bot.

2 participants