refactor: use `extra_body` to pass in `input_type` params for asymmetric embedding models for NVIDIA Inference Provider #3804

jiayin-nvidia · 2025-10-14T05:25:49Z

What does this PR do?

Previously, the NVIDIA inference provider implemented a custom openai_embeddings method with a hardcoded input_type="query" parameter, which NVIDIA asymmetric embedding models require (#3205).
The extra_body parameter was recently added to the embeddings API (#3794), so this PR updates the NVIDIA inference provider to use the base OpenAIMixin.openai_embeddings method instead and to pass input_type through extra_body for asymmetric embedding models.
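A minimal sketch of the request shape this change relies on, assuming an OpenAI-compatible client whose embeddings.create accepts extra_body (as the OpenAI Python SDK does). The helper name build_embedding_kwargs is hypothetical and not part of this PR; it only assembles keyword arguments, so it runs without network access.

```python
from typing import Any, Dict, List, Optional


def build_embedding_kwargs(
    model: str, texts: List[str], input_type: Optional[str] = None
) -> Dict[str, Any]:
    """Assemble keyword arguments for an OpenAI-style embeddings call.

    Hypothetical helper for illustration only; not part of the PR.
    """
    kwargs: Dict[str, Any] = {"model": model, "input": texts}
    if input_type is not None:
        # Provider-specific fields travel in extra_body, which the client
        # merges into the JSON request body.
        kwargs["extra_body"] = {"input_type": input_type}
    return kwargs


kwargs = build_embedding_kwargs(
    "nvidia/nv-embedqa-e5-v5",
    ["What is the capital of France?"],
    input_type="query",
)
# With a configured client, the request would be sent as:
#   client.embeddings.create(**kwargs)
```

Keeping input_type out of the provider and in the request body means the caller, not the provider, decides whether a text is embedded as a query or as something else.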

Test Plan

Run the following command once for each embedding_model: nvidia/llama-3.2-nv-embedqa-1b-v2, nvidia/nv-embedqa-e5-v5, nvidia/nv-embedqa-mistral-7b-v2, and snowflake/arctic-embed-l.

pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model={embedding_model} --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com" --inference-mode=record
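To avoid retyping the command for each model, the invocations can be generated in a loop. This is a sketch, not part of the PR: build_pytest_command is a hypothetical helper, and it assumes the API key is available in the NVIDIA_API_KEY environment variable (a placeholder is used otherwise).

```python
import os
from typing import List

# Model IDs listed in the test plan.
EMBEDDING_MODELS = [
    "nvidia/llama-3.2-nv-embedqa-1b-v2",
    "nvidia/nv-embedqa-e5-v5",
    "nvidia/nv-embedqa-mistral-7b-v2",
    "snowflake/arctic-embed-l",
]


def build_pytest_command(embedding_model: str, api_key: str) -> List[str]:
    """Return the test-plan command as an argument list for one model."""
    return [
        "pytest", "-s", "-v",
        "tests/integration/inference/test_openai_embeddings.py",
        "--stack-config=inference=nvidia",
        f"--embedding-model={embedding_model}",
        "--env", f"NVIDIA_API_KEY={api_key}",
        "--env", "NVIDIA_BASE_URL=https://integrate.api.nvidia.com",
        "--inference-mode=record",
    ]


# Assumption: the key comes from the environment; the fallback is a placeholder.
api_key = os.environ.get("NVIDIA_API_KEY", "<nvidia_api_key>")
commands = [build_pytest_command(model, api_key) for model in EMBEDDING_MODELS]

# To execute them one after another:
#   import subprocess
#   for cmd in commands:
#       subprocess.run(cmd, check=True)
```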


ashwinb · 2025-10-14T16:41:25Z

tests/integration/inference/test_openai_embeddings.py

+        "model": embedding_model_id,
+        "input": [],
+    }
+    if is_asymmetric_model(client_with_models, embedding_model_id):


you can always add extra_body unconditionally? if it is null, it is null, no harm done. you don't need the double check?

Good point! I also consolidated the verification and retrieval of extra_body for asymmetric models into a single helper function that's used consistently across all openai_embeddings test cases.

ashwinb · 2025-10-14T20:23:16Z

tests/integration/inference/test_openai_embeddings.py

+    For other models, return None.
+    """
+    is_asymmetric = is_asymmetric_model(client_with_models, model_id)
+    if is_asymmetric:


nah I don't think we need to be that defensive. i'd just simplify this a bunch more. simply make the callsites be:

client.embeddings.create(..., extra_body=get_extra_body())

that's it!

ashwinb · 2025-10-14T20:32:28Z

tests/integration/inference/test_openai_embeddings.py

    return providers[provider_id]


+def is_asymmetric_model(client_with_models, model_id):


where is this used now?

Updated to put it in get_extra_body_for_model: if the model is not asymmetric, it returns None for the extra_body.
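The shape the review converged on can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the model set is copied from the test plan, the helper takes only a model ID here, and the real test module may decide asymmetry differently (for example per provider).

```python
from typing import Dict, Optional

# Assumption: these IDs come from the PR's test plan; the actual test module
# may keep a different or provider-keyed list.
ASYMMETRIC_EMBEDDING_MODELS = {
    "nvidia/llama-3.2-nv-embedqa-1b-v2",
    "nvidia/nv-embedqa-e5-v5",
    "nvidia/nv-embedqa-mistral-7b-v2",
    "snowflake/arctic-embed-l",
}


def get_extra_body_for_model(model_id: str) -> Optional[Dict[str, str]]:
    """Return the extra_body for asymmetric models, or None for all others."""
    if model_id in ASYMMETRIC_EMBEDDING_MODELS:
        return {"input_type": "query"}
    return None


# Call sites then pass the result unconditionally, since extra_body=None
# is accepted by the OpenAI Python SDK:
#   client.embeddings.create(
#       model=model_id,
#       input=texts,
#       extra_body=get_extra_body_for_model(model_id),
#   )
```

Because None is a valid value for extra_body, the tests need no separate asymmetry check at each call site.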

jiayin-nvidia requested review from ashwinb, bbrowning, ehhuang, franciscojavierarceo, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, slekkala1, terrytangyuan and yanxi0830 as code owners October 14, 2025 05:25

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 14, 2025

jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch 2 times, most recently from 98c19d1 to f4f203a Compare October 14, 2025 05:51

fix: use extra_body for passing input_type params for asymmetric embedding models for NVIDIA Inference Provider

1d4d263

jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from f4f203a to 1d4d263 Compare October 14, 2025 16:26

ashwinb reviewed Oct 14, 2025


jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from c8d54e3 to 562ad43 Compare October 14, 2025 20:31

ashwinb reviewed Oct 14, 2025


ashwinb approved these changes Oct 14, 2025


jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch 2 times, most recently from f651ece to 81e04c7 Compare October 14, 2025 20:40

Refactor test_openai_embeddings

33190c1

jiayin-nvidia force-pushed the use_extra_body_for_nvidia_openai_embeddings branch from 81e04c7 to 33190c1 Compare October 14, 2025 20:42

ashwinb merged commit d875e42 into llamastack:main Oct 14, 2025
23 checks passed
