Conversation

StanFromIreland
Member

@StanFromIreland StanFromIreland commented Oct 4, 2025

Benchmark script
import timeit

def bench(code, setup="r=range(1000)", n=500000):
    time = timeit.timeit(code, setup=setup, number=n)
    print(f"{code}: {time/n*1000000:.2f} us/op")

bench("r[:]")
bench("r[1:]")
bench("r[:-1]")
bench("r[1:10]")
bench("r[::2]")
bench("r[1:50:3]")

A quick little patch. It optimises the r[:] case nicely (~5x), with negligible impact on the other cases (I assume the differences are just noise):

Case        Current (us)   PR (us)
r[:]        0.11           0.02
r[1:]       0.13           0.13
r[:-1]      0.18           0.19
r[1:10]     0.15           0.14
r[::2]      0.13           0.13
r[1:50:3]   0.16           0.16

I agree with Benedikt that optimising this in general has little benefit, though I think one special case is acceptable. Covering all cases would require comparing the objects, resulting in a ~4x performance penalty and/or some convoluted code.
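For context, this is the behaviour the patch targets, observable from Python (on an interpreter without this optimisation, a whole-range slice allocates a new, equal range object rather than returning the original):

```python
r = range(1000)
s = r[:]

# The slice is value-equal to the original range...
print(s == r)
# ...but without the patch it is a freshly allocated object,
# not the same object (which the patch would make it).
print(s is r)
```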

@picnixz
Member

picnixz commented Oct 4, 2025

I would instead have a look at how we optimize tuple slicing: strings and tuples check whether the resulting slice would be equivalent to [:] and, if so, return the object unchanged. We also want x[0:len(x)] to be x for strings, for instance (this is achieved in unicode_subscript):

        } else if (start == 0 && step == 1 &&
                   slicelength == PyUnicode_GET_LENGTH(self)) {
            return unicode_result_unchanged(self);

If you were to make this change with this improved version, then we should also test non-trivial steps and decreasing sequences, as well as empty ones.

Otherwise, please add a comment saying that we only optimize r[:] but not r[0:len(r)].
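The str/tuple behaviour described above is observable from Python: on CPython, identity slices of these immutable types return the object itself, and the s[0:len(s)] case is exactly what the quoted unicode_subscript branch handles:

```python
s = "hello"
t = (1, 2, 3)

# Full slices of immutable str/tuple return the object unchanged.
print(s[:] is s)
# The explicit-bounds form is caught by the unicode_subscript
# fast path quoted above (start == 0, step == 1, full length).
print(s[0:len(s)] is s)
# Tuples take the equivalent shortcut.
print(t[:] is t)
```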

@StanFromIreland
Member Author

StanFromIreland commented Oct 4, 2025

How about this slightly convoluted version, which covers several more cases:

    if ((slice->start == Py_None || (PyLong_Check(slice->start) && PyLong_AsLong(slice->start) == 0))
        && (slice->stop == Py_None || (PyLong_Check(slice->stop) && PyObject_RichCompareBool(slice->stop, r->length, Py_EQ) == 1))
        && (slice->step == Py_None || (PyLong_Check(slice->step) && PyLong_AsLong(slice->step) == 1)))
    {
        return Py_NewRef(r);
    }
Benchmarks (negligible differences for non-optimised cases; mean of one million runs):

Case            Current (us/op)   PR (us/op)
r[:]            0.12              0.02
r[0:]           0.13              0.03
r[::]           0.11              0.02
r[:len(r)]      0.18              0.08
r[0:len(r):1]   0.21              0.09
r[:len(r):10]   0.19              0.19
r[1:]           0.13              0.13
r[:-1]          0.20              0.19
r[1:10]         0.14              0.14
r[::2]          0.12              0.12
r[1:50:3]       0.15              0.15
r[0:50:3]       0.16              0.17

@picnixz
Member

picnixz commented Oct 4, 2025

Why complicate this? Can't you use _PySlice_GetLongIndices to get the values and then use them directly?
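The Python-level analogue of that normalisation is slice.indices(), which resolves None and negative bounds against a given length; once the indices are normalised, the identity-slice check collapses to a plain comparison. A sketch of the idea (not the C code under review):

```python
def is_identity_slice(s, length):
    # slice.indices() resolves None/negative bounds against the
    # sequence length, much as _PySlice_GetLongIndices does in C.
    start, stop, step = s.indices(length)
    return start == 0 and stop == length and step == 1

print(is_identity_slice(slice(None, None, None), 1000))  # r[:]
print(is_identity_slice(slice(0, None, 1), 1000))        # r[0::1]
print(is_identity_slice(slice(1, None, None), 1000))     # r[1:]
```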

@StanFromIreland
Member Author

Same situation with the benchmarks, albeit slightly slower now for the special cases, down to ~3x faster.

Member

@picnixz picnixz left a comment


I'm still unsure whether this is really needed.

Comment on lines +416 to +421
    if (start == _PyLong_GetZero()
        && step == _PyLong_GetOne()
        && (slice->stop == Py_None
            || PyObject_RichCompareBool(stop, r->length, Py_EQ) == 1))
    {
        return Py_NewRef(r);
    }
Member

@picnixz picnixz Oct 5, 2025


PyObject_RichCompareBool can raise an error, and handling it would make the code more complex. So I'm rather against that change, as it's not so trivial.

Also, start == _PyLong_GetZero() is not correct, because it's not guaranteed that 0 will always be immortal (though it generally is, and I see there is existing code doing the same for 1). I had actually thought we stored the indices as C integers rather than Python ones, but it appears that's not the case.
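The fragility of the pointer comparison can be illustrated from Python: CPython caches small ints (-5 through 256), so independently computed zeros usually share identity, but this is an implementation detail rather than a language guarantee, which is why comparing against _PyLong_GetZero() by pointer is risky in principle:

```python
# Two independently computed zeros. On CPython these are the same
# cached object, but nothing in the language requires this, so code
# relying on object identity of 0 depends on an implementation detail.
a = 500 - 500
b = 7 - 7
print(a is b)  # True on CPython due to the small-int cache
```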

@StanFromIreland
Member Author

I'm still unsure whether this is really needed.

I have the same sentiment; I think the two best options are closing this or just implementing the initial small patch.

@picnixz
Member

picnixz commented Oct 5, 2025

I will close this one as not planned for now. If there is a real reason for optimizing this path, we'll revisit later but the issue rationale doesn't seem to be backed by any practical considerations.

@picnixz picnixz closed this Oct 5, 2025
@StanFromIreland StanFromIreland deleted the optimize-range branch October 5, 2025 12:51
