6 Oct 2010 21:21
improvements to slicing
On 10/06/2010 08:58 AM, Nick Coghlan wrote:
> If I was going to ask for a change to anything in Python's
> indexing semantics, it would be for negative step values to create
> ranges that were half-open at the beginning rather than the end, such
> that reversing a slice just involved swapping the start value with the
> stop value and negating the step value.
Yes, negative slices are very tricky to get right. They could use some
attention I think.
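A short check of how reversing a slice works under the current rules. The values of `start` and `stop` here are only illustrative.

```python
s = "abcdefg"
start, stop = 2, 5

forward = s[start:stop]                # 'cde'

# Under the current rules, reversing means swapping the bounds,
# subtracting one from each, and negating the step.
backward = s[stop - 1:start - 1:-1]    # 'edc'
assert backward == forward[::-1]

# Caveat: when start is 0, start - 1 is -1, which means "last item",
# so the formula yields an empty slice and None is needed instead.
assert s[4:-1:-1] == ""
assert s[4:None:-1] == "edcba"
```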
> As it is, you also have to
> subtract one from both the start and stop value to get the original
> range of values back. However, just like the idea of ranges starting
> from 1 rather than 0, the idea of negative slices giving ranges
> half-open at the start rather than the end is also doomed by
> significant problems with backwards compatibility. For a new language,
> you might be able to make the argument that the alternative behaviour
> is a better design choice. For an existing one like Python, any
> possible benefits are so nebulous as to not be worth the inevitable
> hassle involved in changing the semantics)
We don't need to change the current range function/generator to add
inclusive or closed ranges. Just add a closed_range() function to the
itertools or math module.
[n for n in closed_range(-5, 5, 2)] --> [-5, -3, -1, 1, 3, 5]
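A minimal sketch of what such a function might look like. `closed_range` does not exist in itertools or math; the name, signature, and edge-case handling below are assumptions for illustration only.

```python
def closed_range(start, stop, step=1):
    """Hypothetical helper: like range(), but includes stop when it is reachable."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Push the half-open end one unit further in the step direction
    # so that stop itself falls inside the range.
    return range(start, stop + (1 if step > 0 else -1), step)


print(list(closed_range(-5, 5, 2)))   # [-5, -3, -1, 1, 3, 5]
print(list(closed_range(5, -5, -2)))  # [5, 3, 1, -1, -3, -5]
```

If stop is not reachable from start in whole steps, the last value is simply the closest one before it, as with range().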
I just noticed the __getslice__ method is no longer on sequences. (?)
My preference is for slicing to be based more on practical terms for
manipulating sequences rather than be defined in a purely mathematical way.
1. Have the direction determined by the start and stop values rather than
by the step value, so that the following is true.
"abcdefg"[start:stop:step] == "abcdefg"[start:stop][::step]
Reversing the slice can be done by simply swapping the start and stop.
Negating the step as well would give you ...
"abcdefg"[start:stop:step] == "abcdefg"[stop:start:-step]
Negating the step would not always give you the reverse sequence for steps
larger than 1, because the result may not contain the same values.
>>> 'abcd'[::2]
'ac'
>>> 'abcd'[::-2]
'db'
This is the current behavior and wouldn't change.
A positive step value would step from the left, and a negative step value
would step from the right of the slice determined by start and stop. This
already works if you don't give stop and start values.
>>> "abcdefg"[::2]
'aceg'
>>> "abcdefg"[::-2]
'geca'
And these can be used in for loops or list comps.
>>> [c for c in "abcdefg"[::2]]
['a', 'c', 'e', 'g']
If we could add a width value to slices we would be able to do this.
>>> "abcdefg"[::2:2]
'abcdefg'
As unimpressive as that looked, when used in a for loop or list comp it
would give us an easy and useful way to step through data.
[cc for cc in "abcdefg"[::2:2]] --> ['ab', 'cd', 'ef', 'g']
You could also spell that as...
list("abcdefg"[::2:2]) --> ['ab', 'cd', 'ef', 'g']
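The `[::step:width]` syntax is hypothetical, but the stepping behavior can be approximated today with a small generator. The name `windows` and its parameters are assumptions, not an existing API.

```python
def windows(seq, step, width):
    """Yield pieces of seq that start every `step` items and are up to `width` long."""
    for i in range(0, len(seq), step):
        yield seq[i:i + width]


print(list(windows("abcdefg", 2, 2)))  # ['ab', 'cd', 'ef', 'g']
```

With step equal to width this gives non-overlapping chunks; a smaller step gives overlapping windows.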
The problems start when you try to use actual index values to specify
start and stop ranges.
You can't include the last element with an explicit negative stop value.
>>> "abcdefg"[0:-1]
'abcdef'
>>> "abcdefg"[0:-0]
''
But we can use None, which is awkward and requires testing the stop value
when the index is supplied by a variable.
>>> 'abcdefg'[:None]
'abcdefg'
I'm not sure how to fix this one. We've been living with this for a long
time so it's not like we need to fix it all at once.
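One way to do that test today is the `or None` idiom, shown here as a small sketch. The helper name `drop_last` is made up for illustration.

```python
def drop_last(seq, n):
    """Return seq without its last n items; n == 0 returns all of seq."""
    # -0 == 0, so seq[:-n] would be empty when n is 0.
    # "-n or None" maps that one case to None, meaning "to the end".
    return seq[:-n or None]


print(drop_last("abcdefg", 1))  # abcdef
print(drop_last("abcdefg", 0))  # abcdefg
```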
Negative indexes can be confusing.
>>> "abcdefg"[-5:5]
'cde' # OK
>>> "abcdefg"[5:-5]
'' # Expected 'edc' here, not ''.
>>> "abcdefg"[5:-5:-1]
'fed' # Expected reverse of '' here,
# or 'cde', not 'fed'.
With the suggested change we get...
>>> "abcdefg"[-5:5]
'cde' # Stays the same.
>>> "abcdefg"[5:-5]
'edc' # Swapping start and stop reverses it.
>>> "abcdefg"[5:-5:-1]
'cde' # Negating the step, reverses it again.
I think these are easier to use than the current behavior. It doesn't
change slices using positive indexes and steps so maybe it's not so
backward incompatible to sneak in.
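The suggested rule can be emulated with a small function under the current semantics. This is a sketch of one reading of the proposal, not a reference implementation; the name `proposed_slice` and the clamping of out-of-range indexes are assumptions.

```python
def proposed_slice(seq, start=None, stop=None, step=None):
    """Emulate the suggestion: start/stop choose the span and its direction,
    and step then strides within that span."""
    n = len(seq)

    def norm(i, default):
        # Resolve None and negative indexes, then clamp to 0..n.
        if i is None:
            return default
        if i < 0:
            i += n
        return max(0, min(n, i))

    lo, hi = norm(start, 0), norm(stop, n)
    if lo <= hi:
        span = seq[lo:hi]          # forward span
    else:
        span = seq[hi:lo][::-1]    # start > stop: same items, reversed
    return span[::step]


print(proposed_slice("abcdefg", -5, 5))      # cde
print(proposed_slice("abcdefg", 5, -5))      # edc
print(proposed_slice("abcdefg", 5, -5, -1))  # cde
```

Under this reading, a slice with start greater than stop, such as [5:2], would return 'edc' instead of the empty string it returns today.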
Ron