[2.7] bpo-31530: Stop crashes when iterating over a file on multiple threads. by serhiy-storchaka · Pull Request #3672 · python/cpython

@serhiy-storchaka

@serhiy-storchaka

vstinner

self.f.writelines('')
self._test_close_open_io(io_func)

def test_iteration_torture(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment with a least "bpo-31530".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pass
self._run_workers(iterate, 10)

def test_iteration_seek(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment with a least "bpo-31530".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@serhiy-storchaka

vstinner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I would prefer to see reviews of other core developers before seeing this change merged. Especially because @benjaminp proposed PR #3670 with a different approach.

if (f->unlocked_count > 0) {
PyErr_SetString(PyExc_IOError,
"seek() called during concurrent "
"operation on the same file object.");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the final ".", it's uncommon in Python error messages.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I looked quickly at fileobject.c error messages, but I didn't notice that the error message already existed in close(). That's a good reason to generalize the check for concurrent operation :-)

In this case, it's fine, keep the final dot :-)

vstinner

@@ -0,0 +1 @@
Fixed crashes when iterating over a file on multiple threads.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to complete the description:

seek() and next() methods of file objects now raise an exception during concurrent operation on the same file object. A lock can be used to prevent the error.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

serhiy-storchaka

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and I would want to see reviews of @benjaminp.

if (f->unlocked_count > 0) {
PyErr_SetString(PyExc_IOError,
"seek() called during concurrent "
"operation on the same file object.");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?

@serhiy-storchaka

@vstinner

Oh, you removed the final dot. I'm fine with that too. It's just a minor thing, I don't care much :-) (But I like the idea of being consistent in error messages.)

vstinner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

(But again, I would like to see a review of at least another core dev, especially @benjaminp.)

self.f.writelines('')
self._test_close_open_io(io_func)

def test_iteration_torture(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pass
self._run_workers(iterate, 10)

def test_iteration_seek(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -0,0 +1 @@
Fixed crashes when iterating over a file on multiple threads.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

benjaminp

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I wrote my patch is that it seems weird that concurrent f.read() works but not iteration. This is probably fine, though, since no one can have been relying on the current behavior.

drop_readahead could also use a assert(f->unlocked_count == 0); assert

for l in f:
pass
except IOError:
pass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably a better stress test to continue looping even when you get an exception.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the test tooooooo slow. I have killed it after 8 minutes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decrease the size of the file then?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2**12 lines is not enough for provoking a crash, 2**13 to 2**14 lines provoke random crashes, and 2**15 lines is needed for stable crashing. After patching the test with 2**13 lines takes about 40 sec, and 2**15 lines need 3 minutes to finish. This is still too long.

for i in range(100):
f.seek(i*100, 0)
except IOError:
pass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here

benjaminp

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Please put an assert in drop_readahead, though.

@serhiy-storchaka

drop_readahead could also use a assert(f->unlocked_count == 0); assert

This looks unsafe. f->unlocked_count can be non-zero if other thread performs non-read operation like tell(). And more, it seems to me that the check should be moved inside readahead_get_line_skip() because it can be called recursively, after releasing/acquiring GIL that allows other thread to increase f->unlocked_count.

@serhiy-storchaka

benjaminp added a commit that referenced this pull request

Dec 31, 2017
… file on multiple threads. (#3672)"