Skip to content

Integer Overflow Vulnerability in sync_queries() Leading to Buffer Overflow #4717

@Asuk4

Description

@Asuk4

Describe the bug

Hello VowpalWabbit team!

As part of our ongoing audit of this project, we've identified a critical integer overflow vulnerability in the sync_queries() function that could lead to a buffer overflow. This vulnerability occurs when processing user-controlled data in the all-reduce operation, specifically when the all_reduce_type is set to SOCKET.

Mechanism of Vulnerability

  1. The sync_queries() function calculates total_sum by summing up sizes[i] values without proper overflow checks.
  2. The sizes array values originate from user-controlled unflushed_bytes_count data.
  3. An attacker could manipulate the input to cause total_sum to overflow, resulting in a small wrapped value.
  4. This wrapped value is then used to allocate a buffer via calloc_or_throw().
  5. Subsequent operations attempt to read more data than the allocated buffer can hold, leading to a buffer overflow.

Relevant Code Snippets

kernel_svm.cc - sync_queries()

size_t* sizes = VW::details::calloc_or_throw<size_t>(all.runtime_state.all_reduce->total);
sizes[all.runtime_state.all_reduce->node] = b->unflushed_bytes_count();
VW::details::all_reduce<size_t, add_size_t>(all, sizes, all.runtime_state.all_reduce->total);

size_t prev_sum = 0, total_sum = 0;
for (size_t i = 0; i < all.runtime_state.all_reduce->total; i++)
{
    if (i <= (all.runtime_state.all_reduce->node - 1)) { prev_sum += sizes[i]; }
    total_sum += sizes[i];  // Integer overflow possible here
}

if (total_sum > 0)
{
    queries = VW::details::calloc_or_throw<char>(total_sum);  // Buffer allocation with overflowed size
    // ... subsequent operations using the under-allocated buffer
}

Required Conditions

  • all.runtime_config.selected_all_reduce_type must be set to SOCKET
  • Attacker must be able to control the unflushed_bytes_count values

Suggested Action

The code should be modified to:

  1. Add overflow checks when calculating total_sum
  2. Implement upper bounds validation for the allocation size
  3. Consider using a safer integer type or implementing proper bounds checking
  4. Add validation to ensure the allocated buffer size is sufficient for the actual data size

Example fix:

size_t total_sum = 0;
for (size_t i = 0; i < all.runtime_state.all_reduce->total; i++)
{
    if (i <= (all.runtime_state.all_reduce->node - 1)) { prev_sum += sizes[i]; }
    if (total_sum > SIZE_MAX - sizes[i]) {
        THROW("Integer overflow detected in total_sum calculation");
    }
    total_sum += sizes[i];
}

How to reproduce

Version

9.10.0

OS

Linux

Language

C

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugBug in learning semantics, critical by default

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions