[2.7] bpo-31530: Stop crashes when iterating over a file on multiple threads. by serhiy-storchaka · Pull Request #3672 · python/cpython
| self.f.writelines('') | ||
| self._test_close_open_io(io_func) | ||
|
|
||
| def test_iteration_torture(self): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment with a least "bpo-31530".
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| pass | ||
| self._run_workers(iterate, 10) | ||
|
|
||
| def test_iteration_seek(self): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment with a least "bpo-31530".
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I would prefer to see reviews of other core developers before seeing this change merged. Especially because @benjaminp proposed PR #3670 with a different approach.
| if (f->unlocked_count > 0) { | ||
| PyErr_SetString(PyExc_IOError, | ||
| "seek() called during concurrent " | ||
| "operation on the same file object."); |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the final ".", it's uncommon in Python error messages.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. I looked quickly at fileobject.c error messages, but I didn't notice that the error message already existed in close(). That's a good reason to generalize the check for concurrent operation :-)
In this case, it's fine, keep the final dot :-)
| @@ -0,0 +1 @@ | |||
| Fixed crashes when iterating over a file on multiple threads. | |||
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to complete the description:
seek() and next() methods of file objects now raise an exception during concurrent operation on the same file object. A lock can be used to prevent the error.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and I would want to see reviews of @benjaminp.
| if (f->unlocked_count > 0) { | ||
| PyErr_SetString(PyExc_IOError, | ||
| "seek() called during concurrent " | ||
| "operation on the same file object."); |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?
Oh, you removed the final dot. I'm fine with that too. It's just a minor thing, I don't care much :-) (But I like the idea of being consistent in error messages.)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
(But again, I would like to see a review of at least another core dev, especially @benjaminp.)
| self.f.writelines('') | ||
| self._test_close_open_io(io_func) | ||
|
|
||
| def test_iteration_torture(self): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| pass | ||
| self._run_workers(iterate, 10) | ||
|
|
||
| def test_iteration_seek(self): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| @@ -0,0 +1 @@ | |||
| Fixed crashes when iterating over a file on multiple threads. | |||
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I wrote my patch is that it seems weird that concurrent f.read() works but not iteration. This is probably fine, though, since no one can have been relying on the current behavior.
drop_readahead could also use a assert(f->unlocked_count == 0); assert
| for l in f: | ||
| pass | ||
| except IOError: | ||
| pass |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably a better stress test to continue looping even when you get an exception.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the test tooooooo slow. I have killed it after 8 minutes.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decrease the size of the file then?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2**12 lines is not enough for provoking a crash, 2**13 to 2**14 lines provoke random crashes, and 2**15 lines is needed for stable crashing. After patching the test with 2**13 lines takes about 40 sec, and 2**15 lines need 3 minutes to finish. This is still too long.
| for i in range(100): | ||
| f.seek(i*100, 0) | ||
| except IOError: | ||
| pass |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly here
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Please put an assert in drop_readahead, though.
drop_readaheadcould also use aassert(f->unlocked_count == 0);assert
This looks unsafe. f->unlocked_count can be non-zero if other thread performs non-read operation like tell(). And more, it seems to me that the check should be moved inside readahead_get_line_skip() because it can be called recursively, after releasing/acquiring GIL that allows other thread to increase f->unlocked_count.
benjaminp added a commit that referenced this pull request
Dec 31, 2017… file on multiple threads. (#3672)"
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters