fix(llmobs): ensure langchain azure openai spans are not duplicate llm marked [backport 3.16] #14990
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 242 ± 3 ms.
The average import time from base is: 243 ± 2 ms.
The import time difference between this PR and base is: -0.5 ± 0.1 ms.
Import time breakdown
The following import paths have shrunk:
…m marked (#14939)

[MLOB-4230]

This PR does 3 things:
1. (non-user-facing) Updates our docker-compose and services.yml files to upgrade to the latest testagent version, and adds a `VCR_PROVIDER_MAP` env var value to the testagent configs.
2. (user-facing) Fixes the langchain integration so that azure openai calls are not marked as duplicate LLM spans (if the openai integration is enabled) and are instead marked as generic workflow spans.
3. (non-user-facing) Adds langchain tests for calling Azure OpenAI. These require the testagent upgrade and the `VCR_PROVIDER_MAP` env var to allow the testagent vcr proxy to call the azure openai endpoint.

We have logic in our langchain integration to mark specific LLM calls as generic workflow spans (instead of the default llm span) if we detect that the corresponding integration for the given provider (e.g. `openai/anthropic`) is also enabled and will result in a downstream LLM span. Our product experience breaks if multiple spans represent the same LLM call, and we were previously missing support for azure openai.

[MLOB-4230]: https://datadoghq.atlassian.net/browse/MLOB-4230?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

(cherry picked from commit 9f7d187)
39f86b7 to 821a80f
Performance SLOsComparing candidate backport-14939-to-3.16 (58b6eff) with baseline 3.16 (39b3ba8) 📈 Performance Regressions (4 suites)📈 iast_aspects - 40/40✅ re_expand_aspectTime: ✅ 31.761µs (SLO: <40.000µs 📉 -20.6%) vs baseline: -0.9% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ re_expand_noaspectTime: ✅ 29.869µs (SLO: <40.000µs 📉 -25.3%) vs baseline: +4.8% Memory: ✅ 37.591MB (SLO: <39.000MB -3.6%) vs baseline: +4.6% ✅ re_findall_aspectTime: ✅ 2.920µs (SLO: <10.000µs 📉 -70.8%) vs baseline: +0.4% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8% ✅ re_findall_noaspectTime: ✅ 1.417µs (SLO: <10.000µs 📉 -85.8%) vs baseline: -0.2% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ re_finditer_aspectTime: ✅ 4.410µs (SLO: <10.000µs 📉 -55.9%) vs baseline: -0.5% Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% ✅ re_finditer_noaspectTime: ✅ 1.406µs (SLO: <10.000µs 📉 -85.9%) vs baseline: ~same Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0% ✅ re_fullmatch_aspectTime: ✅ 2.947µs (SLO: <10.000µs 📉 -70.5%) vs baseline: 📈 +10.5% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8% ✅ re_fullmatch_noaspectTime: ✅ 1.296µs (SLO: <10.000µs 📉 -87.0%) vs baseline: +0.5% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.5% ✅ re_group_aspectTime: ✅ 2.960µs (SLO: <10.000µs 📉 -70.4%) vs baseline: +0.3% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% ✅ re_group_noaspectTime: ✅ 1.604µs (SLO: <10.000µs 📉 -84.0%) vs baseline: +0.5% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ re_groups_aspectTime: ✅ 3.074µs (SLO: <10.000µs 📉 -69.3%) vs baseline: -0.4% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +5.0% ✅ re_groups_noaspectTime: ✅ 1.685µs (SLO: <10.000µs 📉 -83.2%) vs baseline: +0.4% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.1% ✅ re_match_aspectTime: ✅ 2.713µs (SLO: <10.000µs 📉 -72.9%) vs baseline: -1.1% Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: 
+5.0% ✅ re_match_noaspectTime: ✅ 1.290µs (SLO: <10.000µs 📉 -87.1%) vs baseline: -1.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ re_search_aspectTime: ✅ 2.557µs (SLO: <10.000µs 📉 -74.4%) vs baseline: -0.2% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8% ✅ re_search_noaspectTime: ✅ 1.197µs (SLO: <10.000µs 📉 -88.0%) vs baseline: -0.4% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8% ✅ re_sub_aspectTime: ✅ 3.401µs (SLO: <10.000µs 📉 -66.0%) vs baseline: -0.6% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ re_sub_noaspectTime: ✅ 1.532µs (SLO: <10.000µs 📉 -84.7%) vs baseline: -0.6% Memory: ✅ 37.591MB (SLO: <39.000MB -3.6%) vs baseline: +4.5% ✅ re_subn_aspectTime: ✅ 3.660µs (SLO: <10.000µs 📉 -63.4%) vs baseline: -1.2% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.9% ✅ re_subn_noaspectTime: ✅ 1.608µs (SLO: <10.000µs 📉 -83.9%) vs baseline: -0.7% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% 📈 iastaspects - 118/118✅ add_aspectTime: ✅ 0.405µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -0.6% Memory: ✅ 38.004MB (SLO: <39.000MB -2.6%) vs baseline: +4.8% ✅ add_inplace_aspectTime: ✅ 0.408µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.8% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +4.1% ✅ add_inplace_noaspectTime: ✅ 0.315µs (SLO: <10.000µs 📉 -96.9%) vs baseline: -1.2% Memory: ✅ 38.122MB (SLO: <39.000MB -2.3%) vs baseline: +5.0% ✅ add_noaspectTime: ✅ 0.277µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.3% Memory: ✅ 38.063MB (SLO: <39.000MB -2.4%) vs baseline: +5.1% ✅ bytearray_aspectTime: ✅ 1.364µs (SLO: <10.000µs 📉 -86.4%) vs baseline: +3.2% Memory: ✅ 37.985MB (SLO: <39.000MB -2.6%) vs baseline: +4.6% ✅ bytearray_extend_aspectTime: ✅ 1.512µs (SLO: <10.000µs 📉 -84.9%) vs baseline: -1.1% Memory: ✅ 37.906MB (SLO: <39.000MB -2.8%) vs baseline: +4.3% ✅ bytearray_extend_noaspectTime: ✅ 0.610µs (SLO: <10.000µs 📉 -93.9%) vs baseline: -0.9% Memory: ✅ 37.945MB (SLO: <39.000MB -2.7%) vs baseline: +4.5% ✅ 
bytearray_noaspectTime: ✅ 0.483µs (SLO: <10.000µs 📉 -95.2%) vs baseline: +0.2% Memory: ✅ 38.044MB (SLO: <39.000MB -2.5%) vs baseline: +4.8% ✅ bytes_aspectTime: ✅ 1.291µs (SLO: <10.000µs 📉 -87.1%) vs baseline: -0.2% Memory: ✅ 38.044MB (SLO: <39.000MB -2.5%) vs baseline: +4.7% ✅ bytes_noaspectTime: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.7% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +4.0% ✅ bytesio_aspectTime: ✅ 1.357µs (SLO: <10.000µs 📉 -86.4%) vs baseline: -0.1% Memory: ✅ 37.985MB (SLO: <39.000MB -2.6%) vs baseline: +4.7% ✅ bytesio_noaspectTime: ✅ 0.499µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +1.1% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +3.9% ✅ capitalize_aspectTime: ✅ 0.734µs (SLO: <10.000µs 📉 -92.7%) vs baseline: -0.4% Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +5.4% ✅ capitalize_noaspectTime: ✅ 0.439µs (SLO: <10.000µs 📉 -95.6%) vs baseline: +0.7% Memory: ✅ 38.083MB (SLO: <39.000MB -2.4%) vs baseline: +4.9% ✅ casefold_aspectTime: ✅ 0.737µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +0.3% Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +4.6% ✅ casefold_noaspectTime: ✅ 0.375µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +2.7% Memory: ✅ 37.827MB (SLO: <39.000MB -3.0%) vs baseline: +4.0% ✅ decode_aspectTime: ✅ 0.739µs (SLO: <10.000µs 📉 -92.6%) vs baseline: +2.1% Memory: ✅ 38.004MB (SLO: <39.000MB -2.6%) vs baseline: +4.6% ✅ decode_noaspectTime: ✅ 0.418µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.5% Memory: ✅ 37.768MB (SLO: <39.000MB -3.2%) vs baseline: +4.1% ✅ encode_aspectTime: ✅ 0.707µs (SLO: <10.000µs 📉 -92.9%) vs baseline: -0.2% Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +4.9% ✅ encode_noaspectTime: ✅ 0.406µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +1.2% Memory: ✅ 37.867MB (SLO: <39.000MB -2.9%) vs baseline: +4.2% ✅ format_aspectTime: ✅ 3.379µs (SLO: <10.000µs 📉 -66.2%) vs baseline: -0.6% Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +4.7% ✅ format_map_aspectTime: ✅ 4.100µs (SLO: <10.000µs 📉 
-59.0%) vs baseline: 📈 +12.6% Memory: ✅ 37.788MB (SLO: <39.000MB -3.1%) vs baseline: +4.0% ✅ format_map_noaspectTime: ✅ 0.776µs (SLO: <10.000µs 📉 -92.2%) vs baseline: +0.5% Memory: ✅ 38.063MB (SLO: <39.000MB -2.4%) vs baseline: +4.9% ✅ format_noaspectTime: ✅ 0.597µs (SLO: <10.000µs 📉 -94.0%) vs baseline: -0.7% Memory: ✅ 37.808MB (SLO: <39.000MB -3.1%) vs baseline: +4.2% ✅ index_aspectTime: ✅ 0.359µs (SLO: <10.000µs 📉 -96.4%) vs baseline: +0.6% Memory: ✅ 38.142MB (SLO: <39.000MB -2.2%) vs baseline: +5.1% ✅ index_noaspectTime: ✅ 0.275µs (SLO: <10.000µs 📉 -97.3%) vs baseline: -0.7% Memory: ✅ 37.808MB (SLO: <39.000MB -3.1%) vs baseline: +4.3% ✅ join_aspectTime: ✅ 1.380µs (SLO: <10.000µs 📉 -86.2%) vs baseline: +1.6% Memory: ✅ 38.103MB (SLO: <39.000MB -2.3%) vs baseline: +5.0% ✅ join_noaspectTime: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.5% Memory: ✅ 38.103MB (SLO: <39.000MB -2.3%) vs baseline: +5.1% ✅ ljust_aspectTime: ✅ 2.559µs (SLO: <20.000µs 📉 -87.2%) vs baseline: -1.1% Memory: ✅ 37.965MB (SLO: <39.000MB -2.7%) vs baseline: +4.6% ✅ ljust_noaspectTime: ✅ 0.406µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.3% Memory: ✅ 37.788MB (SLO: <39.000MB -3.1%) vs baseline: +4.0% ✅ lower_aspectTime: ✅ 2.332µs (SLO: <10.000µs 📉 -76.7%) vs baseline: +5.6% Memory: ✅ 38.083MB (SLO: <39.000MB -2.4%) vs baseline: +4.9% ✅ lower_noaspectTime: ✅ 0.371µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.4% Memory: ✅ 37.768MB (SLO: <39.000MB -3.2%) vs baseline: +3.9% ✅ lstrip_aspectTime: ✅ 2.501µs (SLO: <20.000µs 📉 -87.5%) vs baseline: 📈 +13.6% Memory: ✅ 38.083MB (SLO: <39.000MB -2.4%) vs baseline: +4.9% ✅ lstrip_noaspectTime: ✅ 0.379µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -2.5% Memory: ✅ 37.867MB (SLO: <39.000MB -2.9%) vs baseline: +4.2% ✅ modulo_aspectTime: ✅ 0.998µs (SLO: <10.000µs 📉 -90.0%) vs baseline: +0.6% Memory: ✅ 38.044MB (SLO: <39.000MB -2.5%) vs baseline: +5.1% ✅ modulo_aspect_for_bytearray_bytearrayTime: ✅ 1.612µs (SLO: <10.000µs 📉 -83.9%) vs baseline: +4.6% Memory: ✅ 
38.083MB (SLO: <39.000MB -2.4%) vs baseline: +4.9% ✅ modulo_aspect_for_bytesTime: ✅ 0.978µs (SLO: <10.000µs 📉 -90.2%) vs baseline: -0.4% Memory: ✅ 38.004MB (SLO: <39.000MB -2.6%) vs baseline: +4.7% ✅ modulo_aspect_for_bytes_bytearrayTime: ✅ 1.241µs (SLO: <10.000µs 📉 -87.6%) vs baseline: ~same Memory: ✅ 37.985MB (SLO: <39.000MB -2.6%) vs baseline: +4.5% ✅ modulo_noaspectTime: ✅ 0.626µs (SLO: <10.000µs 📉 -93.7%) vs baseline: -0.4% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +4.0% ✅ replace_aspectTime: ✅ 5.427µs (SLO: <10.000µs 📉 -45.7%) vs baseline: 📈 +11.8% Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +3.8% ✅ replace_noaspectTime: ✅ 0.460µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -1.0% Memory: ✅ 37.768MB (SLO: <39.000MB -3.2%) vs baseline: +4.0% ✅ repr_aspectTime: ✅ 0.906µs (SLO: <10.000µs 📉 -90.9%) vs baseline: -0.6% Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +4.6% ✅ repr_noaspectTime: ✅ 0.419µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.5% Memory: ✅ 38.122MB (SLO: <39.000MB -2.3%) vs baseline: +5.0% ✅ rstrip_aspectTime: ✅ 1.934µs (SLO: <20.000µs 📉 -90.3%) vs baseline: +1.1% Memory: ✅ 38.122MB (SLO: <39.000MB -2.3%) vs baseline: +4.9% ✅ rstrip_noaspectTime: ✅ 0.383µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +0.6% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +3.8% ✅ slice_aspectTime: ✅ 0.497µs (SLO: <10.000µs 📉 -95.0%) vs baseline: ~same Memory: ✅ 38.024MB (SLO: <39.000MB -2.5%) vs baseline: +4.6% ✅ slice_noaspectTime: ✅ 0.445µs (SLO: <10.000µs 📉 -95.5%) vs baseline: -0.3% Memory: ✅ 37.768MB (SLO: <39.000MB -3.2%) vs baseline: +4.1% ✅ stringio_aspectTime: ✅ 1.759µs (SLO: <10.000µs 📉 -82.4%) vs baseline: 📈 +12.2% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +3.8% ✅ stringio_noaspectTime: ✅ 0.715µs (SLO: <10.000µs 📉 -92.8%) vs baseline: -1.3% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +4.0% ✅ strip_aspectTime: ✅ 2.194µs (SLO: <20.000µs 📉 -89.0%) vs baseline: -0.4% Memory: ✅ 38.044MB (SLO: <39.000MB -2.5%) vs 
baseline: +5.1% ✅ strip_noaspectTime: ✅ 0.386µs (SLO: <10.000µs 📉 -96.1%) vs baseline: -0.5% Memory: ✅ 37.788MB (SLO: <39.000MB -3.1%) vs baseline: +4.2% ✅ swapcase_aspectTime: ✅ 2.534µs (SLO: <10.000µs 📉 -74.7%) vs baseline: +5.2% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +3.8% ✅ swapcase_noaspectTime: ✅ 0.538µs (SLO: <10.000µs 📉 -94.6%) vs baseline: +0.2% Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +3.9% ✅ title_aspectTime: ✅ 2.418µs (SLO: <10.000µs 📉 -75.8%) vs baseline: +2.7% Memory: ✅ 38.083MB (SLO: <39.000MB -2.4%) vs baseline: +5.0% ✅ title_noaspectTime: ✅ 0.502µs (SLO: <10.000µs 📉 -95.0%) vs baseline: -1.6% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +3.6% ✅ translate_aspectTime: ✅ 3.225µs (SLO: <10.000µs 📉 -67.7%) vs baseline: -0.3% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +3.8% ✅ translate_noaspectTime: ✅ 1.055µs (SLO: <10.000µs 📉 -89.4%) vs baseline: +0.5% Memory: ✅ 38.083MB (SLO: <39.000MB -2.4%) vs baseline: +4.8% ✅ upper_aspectTime: ✅ 2.231µs (SLO: <10.000µs 📉 -77.7%) vs baseline: +0.3% Memory: ✅ 38.063MB (SLO: <39.000MB -2.4%) vs baseline: +4.7% ✅ upper_noaspectTime: ✅ 0.366µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -1.6% Memory: ✅ 37.827MB (SLO: <39.000MB -3.0%) vs baseline: +4.3% 📈 iastaspectsospath - 24/24✅ ospathbasename_aspectTime: ✅ 4.151µs (SLO: <10.000µs 📉 -58.5%) vs baseline: -1.1% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.7% ✅ ospathbasename_noaspectTime: ✅ 1.080µs (SLO: <10.000µs 📉 -89.2%) vs baseline: +0.2% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.1% ✅ ospathjoin_aspectTime: ✅ 6.968µs (SLO: <10.000µs 📉 -30.3%) vs baseline: 📈 +14.5% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ ospathjoin_noaspectTime: ✅ 2.292µs (SLO: <10.000µs 📉 -77.1%) vs baseline: -0.4% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0% ✅ ospathnormcase_aspectTime: ✅ 3.499µs (SLO: <10.000µs 📉 -65.0%) vs baseline: +1.8% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs 
baseline: +4.7% ✅ ospathnormcase_noaspectTime: ✅ 0.571µs (SLO: <10.000µs 📉 -94.3%) vs baseline: +0.9% Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8% ✅ ospathsplit_aspectTime: ✅ 4.731µs (SLO: <10.000µs 📉 -52.7%) vs baseline: +0.3% Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9% ✅ ospathsplit_noaspectTime: ✅ 1.592µs (SLO: <10.000µs 📉 -84.1%) vs baseline: +0.4% Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.8% ✅ ospathsplitdrive_aspectTime: ✅ 3.664µs (SLO: <10.000µs 📉 -63.4%) vs baseline: -0.2% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7% ✅ ospathsplitdrive_noaspectTime: ✅ 0.696µs (SLO: <10.000µs 📉 -93.0%) vs baseline: -0.7% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ ospathsplitext_aspectTime: ✅ 5.102µs (SLO: <10.000µs 📉 -49.0%) vs baseline: 📈 +13.2% Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9% ✅ ospathsplitext_noaspectTime: ✅ 1.384µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.1% Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.6% 📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.358µs (SLO: <20.000µs 📉 -83.2%) vs baseline: +8.1% Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +5.1% ✅ 1-count-metrics-100-timesTime: ✅ 213.203µs (SLO: <250.000µs 📉 -14.7%) vs baseline: +0.8% Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +5.0% ✅ 1-distribution-metric-1-timesTime: ✅ 3.117µs (SLO: <20.000µs 📉 -84.4%) vs baseline: +7.8% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.0% ✅ 1-distribution-metrics-100-timesTime: ✅ 193.393µs (SLO: <220.000µs 📉 -12.1%) vs baseline: +0.9% Memory: ✅ 32.067MB (SLO: <34.000MB -5.7%) vs baseline: +4.6% ✅ 1-gauge-metric-1-timesTime: ✅ 2.330µs (SLO: <20.000µs 📉 -88.3%) vs baseline: 📈 +12.7% Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.6% ✅ 1-gauge-metrics-100-timesTime: ✅ 126.649µs (SLO: <150.000µs 📉 -15.6%) vs baseline: +0.7% Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.8% ✅ 
1-rate-metric-1-timesTime: ✅ 3.457µs (SLO: <20.000µs 📉 -82.7%) vs baseline: 📈 +11.0% Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.7% ✅ 1-rate-metrics-100-timesTime: ✅ 215.179µs (SLO: <250.000µs 📉 -13.9%) vs baseline: +1.7% Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.6% ✅ 100-count-metrics-100-timesTime: ✅ 21.400ms (SLO: <23.500ms -8.9%) vs baseline: -0.8% Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.9% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.027ms (SLO: <2.250ms -9.9%) vs baseline: +1.9% Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.8% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.288ms (SLO: <1.550ms 📉 -16.9%) vs baseline: -1.1% Memory: ✅ 32.165MB (SLO: <34.000MB -5.4%) vs baseline: +4.9% ✅ 100-rate-metrics-100-timesTime: ✅ 2.214ms (SLO: <2.550ms 📉 -13.2%) vs baseline: +1.1% Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.9% ✅ flush-1-metricTime: ✅ 4.143µs (SLO: <20.000µs 📉 -79.3%) vs baseline: -0.3% Memory: ✅ 32.047MB (SLO: <34.000MB -5.7%) vs baseline: +4.7% ✅ flush-100-metricsTime: ✅ 182.323µs (SLO: <250.000µs 📉 -27.1%) vs baseline: -0.3% Memory: ✅ 32.185MB (SLO: <34.000MB -5.3%) vs baseline: +5.2% ✅ flush-1000-metricsTime: ✅ 2.210ms (SLO: <2.500ms 📉 -11.6%) vs baseline: +0.7% Memory: ✅ 32.932MB (SLO: <34.500MB -4.5%) vs baseline: +5.1% 🟡 Near SLO Breach (5 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 20.445ms (SLO: <22.300ms -8.3%) vs baseline: -0.2% Memory: ✅ 65.568MB (SLO: <67.000MB -2.1%) vs baseline: +4.8% ✅ exception-replay-enabledTime: ✅ 1.346ms (SLO: <1.450ms -7.2%) vs baseline: +0.4% Memory: ✅ 64.423MB (SLO: <67.000MB -3.8%) vs baseline: +4.7% ✅ iastTime: ✅ 20.454ms (SLO: <22.250ms -8.1%) vs baseline: +0.1% Memory: ✅ 65.557MB (SLO: <67.000MB -2.2%) vs baseline: +5.0% ✅ profilerTime: ✅ 15.245ms (SLO: <16.550ms -7.9%) vs baseline: ~same Memory: ✅ 53.733MB (SLO: <54.500MB 🟡 -1.4%) vs baseline: +4.7% ✅ resource-renamingTime: ✅ 20.599ms (SLO: <21.750ms -5.3%) vs baseline: +0.2% Memory: ✅ 65.417MB 
(SLO: <67.000MB -2.4%) vs baseline: +5.0% ✅ span-code-originTime: ✅ 26.195ms (SLO: <28.200ms -7.1%) vs baseline: -0.3% Memory: ✅ 67.754MB (SLO: <69.500MB -2.5%) vs baseline: +4.9% ✅ tracerTime: ✅ 20.497ms (SLO: <21.750ms -5.8%) vs baseline: -0.2% Memory: ✅ 65.544MB (SLO: <67.000MB -2.2%) vs baseline: +4.9% ✅ tracer-and-profilerTime: ✅ 22.014ms (SLO: <23.500ms -6.3%) vs baseline: -0.1% Memory: ✅ 66.618MB (SLO: <67.500MB 🟡 -1.3%) vs baseline: +5.0% ✅ tracer-dont-create-db-spansTime: ✅ 19.270ms (SLO: <21.500ms 📉 -10.4%) vs baseline: -0.5% Memory: ✅ 65.447MB (SLO: <66.000MB 🟡 -0.8%) vs baseline: +4.8% ✅ tracer-minimalTime: ✅ 16.627ms (SLO: <17.500ms -5.0%) vs baseline: -0.3% Memory: ✅ 65.567MB (SLO: <66.000MB 🟡 -0.7%) vs baseline: +5.0% ✅ tracer-nativeTime: ✅ 20.454ms (SLO: <21.750ms -6.0%) vs baseline: ~same Memory: ✅ 71.332MB (SLO: <72.500MB 🟡 -1.6%) vs baseline: +4.8% ✅ tracer-no-cachesTime: ✅ 18.449ms (SLO: <19.650ms -6.1%) vs baseline: +0.2% Memory: ✅ 65.521MB (SLO: <67.000MB -2.2%) vs baseline: +4.9% ✅ tracer-no-databasesTime: ✅ 18.763ms (SLO: <20.100ms -6.7%) vs baseline: -0.2% Memory: ✅ 65.430MB (SLO: <67.000MB -2.3%) vs baseline: +5.0% ✅ tracer-no-middlewareTime: ✅ 20.072ms (SLO: <21.500ms -6.6%) vs baseline: -0.3% Memory: ✅ 65.494MB (SLO: <67.000MB -2.2%) vs baseline: +4.9% ✅ tracer-no-templatesTime: ✅ 20.305ms (SLO: <22.000ms -7.7%) vs baseline: ~same Memory: ✅ 65.545MB (SLO: <67.000MB -2.2%) vs baseline: +5.1% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 18.252ms (SLO: <19.850ms -8.1%) vs baseline: +0.9% Memory: ✅ 65.352MB (SLO: <66.500MB 🟡 -1.7%) vs baseline: +4.9% ✅ errortracking-enabled-userTime: ✅ 18.264ms (SLO: <19.400ms -5.9%) vs baseline: +1.0% Memory: ✅ 65.323MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.9% ✅ tracer-enabledTime: ✅ 18.119ms (SLO: <19.450ms -6.8%) vs baseline: ~same Memory: ✅ 65.274MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.9% 🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.577ms (SLO: <4.750ms -3.7%) vs baseline: 
+0.6% Memory: ✅ 61.912MB (SLO: <65.000MB -4.8%) vs baseline: +4.7% ✅ appsec-postTime: ✅ 6.558ms (SLO: <6.750ms -2.8%) vs baseline: ~same Memory: ✅ 61.912MB (SLO: <65.000MB -4.8%) vs baseline: +4.8% ✅ appsec-telemetryTime: ✅ 4.561ms (SLO: <4.750ms -4.0%) vs baseline: -0.1% Memory: ✅ 61.971MB (SLO: <65.000MB -4.7%) vs baseline: +4.8% ✅ debuggerTime: ✅ 1.855ms (SLO: <2.000ms -7.2%) vs baseline: -0.2% Memory: ✅ 45.416MB (SLO: <47.000MB -3.4%) vs baseline: +4.8% ✅ iast-getTime: ✅ 1.870ms (SLO: <2.000ms -6.5%) vs baseline: +0.5% Memory: ✅ 42.369MB (SLO: <49.000MB 📉 -13.5%) vs baseline: +4.9% ✅ profilerTime: ✅ 1.910ms (SLO: <2.100ms -9.0%) vs baseline: ~same Memory: ✅ 46.360MB (SLO: <47.000MB 🟡 -1.4%) vs baseline: +4.5% ✅ resource-renamingTime: ✅ 3.379ms (SLO: <3.650ms -7.4%) vs baseline: +0.3% Memory: ✅ 52.204MB (SLO: <53.500MB -2.4%) vs baseline: +4.7% ✅ tracerTime: ✅ 3.373ms (SLO: <3.650ms -7.6%) vs baseline: +0.1% Memory: ✅ 52.199MB (SLO: <53.500MB -2.4%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 3.371ms (SLO: <3.650ms -7.6%) vs baseline: +0.3% Memory: ✅ 58.138MB (SLO: <60.000MB -3.1%) vs baseline: +4.7% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 43.386ms (SLO: <47.150ms -8.0%) vs baseline: +2.5% Memory: ✅ 44.503MB (SLO: <47.000MB -5.3%) vs baseline: +4.8% ✅ add-metricsTime: ✅ 316.119ms (SLO: <344.800ms -8.3%) vs baseline: +0.4% Memory: ✅ 595.419MB (SLO: <600.000MB 🟡 -0.8%) vs baseline: +4.8% ✅ add-tagsTime: ✅ 289.779ms (SLO: <314.000ms -7.7%) vs baseline: +1.5% Memory: ✅ 597.471MB (SLO: <600.000MB 🟡 -0.4%) vs baseline: +5.1% ✅ get-contextTime: ✅ 80.327ms (SLO: <92.350ms 📉 -13.0%) vs baseline: ~same Memory: ✅ 40.071MB (SLO: <46.500MB 📉 -13.8%) vs baseline: +5.2% ✅ is-recordingTime: ✅ 39.761ms (SLO: <44.500ms 📉 -10.6%) vs baseline: +1.9% Memory: ✅ 43.970MB (SLO: <47.500MB -7.4%) vs baseline: +4.7% ✅ record-exceptionTime: ✅ 58.903ms (SLO: <67.650ms 📉 -12.9%) vs baseline: +0.3% Memory: ✅ 40.314MB (SLO: <47.000MB 📉 -14.2%) vs baseline: +5.1% ✅ set-statusTime: ✅ 45.585ms (SLO: 
<50.400ms -9.6%) vs baseline: +2.0% Memory: ✅ 43.963MB (SLO: <47.000MB -6.5%) vs baseline: +4.8% ✅ startTime: ✅ 38.350ms (SLO: <43.450ms 📉 -11.7%) vs baseline: +0.2% Memory: ✅ 43.981MB (SLO: <47.000MB -6.4%) vs baseline: +4.9% ✅ start-finishTime: ✅ 83.997ms (SLO: <88.000ms -4.5%) vs baseline: +1.7% Memory: ✅ 34.564MB (SLO: <46.500MB 📉 -25.7%) vs baseline: +4.7% ✅ start-finish-telemetryTime: ✅ 84.143ms (SLO: <89.000ms -5.5%) vs baseline: +0.2% Memory: ✅ 34.603MB (SLO: <46.500MB 📉 -25.6%) vs baseline: +4.9% ✅ update-nameTime: ✅ 40.243ms (SLO: <45.150ms 📉 -10.9%) vs baseline: -0.1% Memory: ✅ 44.115MB (SLO: <47.000MB -6.1%) vs baseline: +4.4% 🟡 span - 26/26✅ add-eventTime: ✅ 20.500ms (SLO: <22.500ms -8.9%) vs baseline: -0.2% Memory: ✅ 50.295MB (SLO: <53.000MB -5.1%) vs baseline: +4.8% ✅ add-metricsTime: ✅ 91.370ms (SLO: <93.500ms -2.3%) vs baseline: +0.5% Memory: ✅ 661.242MB (SLO: <961.000MB 📉 -31.2%) vs baseline: +4.9% ✅ add-tagsTime: ✅ 148.988ms (SLO: <155.000ms -3.9%) vs baseline: +0.6% Memory: ✅ 661.551MB (SLO: <962.500MB 📉 -31.3%) vs baseline: +4.8% ✅ get-contextTime: ✅ 19.952ms (SLO: <20.500ms -2.7%) vs baseline: +2.8% Memory: ✅ 49.133MB (SLO: <53.000MB -7.3%) vs baseline: +4.9% ✅ is-recordingTime: ✅ 19.694ms (SLO: <20.500ms -3.9%) vs baseline: +0.8% Memory: ✅ 49.085MB (SLO: <53.000MB -7.4%) vs baseline: +4.8% ✅ record-exceptionTime: ✅ 38.552ms (SLO: <40.000ms -3.6%) vs baseline: +1.1% Memory: ✅ 42.726MB (SLO: <53.000MB 📉 -19.4%) vs baseline: +5.0% ✅ set-statusTime: ✅ 21.161ms (SLO: <22.000ms -3.8%) vs baseline: +0.3% Memory: ✅ 49.145MB (SLO: <53.000MB -7.3%) vs baseline: +4.9% ✅ startTime: ✅ 19.635ms (SLO: <20.500ms -4.2%) vs baseline: +2.7% Memory: ✅ 49.054MB (SLO: <53.000MB -7.4%) vs baseline: +4.8% ✅ start-finishTime: ✅ 51.520ms (SLO: <52.500ms 🟡 -1.9%) vs baseline: ~same Memory: ✅ 32.067MB (SLO: <34.000MB -5.7%) vs baseline: +4.6% ✅ start-finish-telemetryTime: ✅ 52.668ms (SLO: <54.500ms -3.4%) vs baseline: ~same Memory: ✅ 32.027MB (SLO: <34.000MB -5.8%) vs 
baseline: +4.9% ✅ start-finish-traceid128Time: ✅ 54.766ms (SLO: <57.000ms -3.9%) vs baseline: -0.3% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.0% ✅ start-traceid128Time: ✅ 19.731ms (SLO: <22.500ms 📉 -12.3%) vs baseline: +0.1% Memory: ✅ 49.105MB (SLO: <53.000MB -7.3%) vs baseline: +5.0% ✅ update-nameTime: ✅ 20.090ms (SLO: <22.000ms -8.7%) vs baseline: -0.7% Memory: ✅ 49.741MB (SLO: <53.000MB -6.1%) vs baseline: +4.8%
Backport 9f7d187 from #14939 to 3.16.
MLOB-4230
Description
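This PR does 3 things:
1. (non-user-facing) Updates our docker-compose and services.yml files to upgrade to the latest testagent version, and adds a `VCR_PROVIDER_MAP` env var value to the testagent configs.
2. (user-facing) Fixes the langchain integration so that azure openai calls are not marked as duplicate LLM spans (if the openai integration is enabled) and are instead marked as generic workflow spans.
3. (non-user-facing) Adds langchain tests for calling Azure OpenAI. These require the testagent upgrade and the `VCR_PROVIDER_MAP` env var to allow the testagent vcr proxy to call the azure openai endpoint.

We have logic in our langchain integration to mark specific LLM calls as generic workflow spans (instead of the default llm span) if we detect that the corresponding integration for the given provider (e.g. `openai/anthropic`) is also enabled and will result in a downstream LLM span. Our product experience breaks if multiple spans represent the same LLM call, and we were previously missing support for azure openai.
Testing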
Risks
Additional Notes
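For readers unfamiliar with the span-kind selection described in the Description, here is a minimal sketch of the idea. It is not the actual ddtrace implementation: `PROVIDER_TO_INTEGRATION`, `span_kind_for`, and the provider strings are hypothetical names chosen for illustration. The only behavior taken from this PR is that an LLM call becomes a generic workflow span when the matching provider integration is enabled, and that azure openai is now treated as covered by the openai integration.

```python
from typing import Dict, Set

# Hypothetical mapping from a langchain provider name to the integration
# that would emit its own downstream LLM span for that provider.
PROVIDER_TO_INTEGRATION: Dict[str, str] = {
    "openai": "openai",
    "azure_openai": "openai",  # assumption: the case this PR adds
    "anthropic": "anthropic",
}


def span_kind_for(provider: str, enabled_integrations: Set[str]) -> str:
    """Return the span kind the langchain layer should use for an LLM call.

    If the provider's own integration is enabled, it will produce the LLM
    span downstream, so the langchain span is demoted to "workflow" to
    avoid representing the same call twice.
    """
    integration = PROVIDER_TO_INTEGRATION.get(provider.lower())
    if integration is not None and integration in enabled_integrations:
        return "workflow"
    return "llm"


if __name__ == "__main__":
    # With the openai integration enabled, the azure openai call is a workflow span.
    print(span_kind_for("azure_openai", {"openai"}))
    # Without it, the langchain span remains the only LLM span.
    print(span_kind_for("azure_openai", set()))
```

Keeping the provider lookup in a single table means supporting another provider is a one-line change, which is roughly the shape of the fix described here.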