Issue33639
Created on 2018-05-24 20:15 by giampaolo.rodola, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 7100 | closed | giampaolo.rodola, 2018-05-24 20:18 | |
| Messages (8) | |||
|---|---|---|---|
| msg317611 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2018-05-24 20:15 | |
This is a follow up of #25063 and similar to socket.sendfile() (#17552). It provides a 20/25% speedup when copying files with shutil.copyfile(), shutil.copy() and shutil.copy2(). Differently from #25063 this is used for filesystem files only and copyfileobj() is left alone. Unmerged #26826 is also related to this. I applied #26826 patch and built a wrapper around copy_file_range() and the speedup is basically the same. Nevertheless, even when #26826 gets merged it probably makes sense to rely on sendfile() in case copy_file_range() is not available (it was introduced in 2016) or the UNIX platform supports file-to-file copy via sendfile(2) (even though I'm not aware of any, so this would basically be Linux-only). Some benchmarks: $ dd if=/dev/urandom of=/tmp/f1 bs=1K count=128 $ time ./python -m timeit -s 'import shutil; p1 = "/tmp/f1"; p2 = "/tmp/f2"' 'shutil.copyfile(p1, p2)' 128K copy ========= --- without patch: 2000 loops, best of 5: 160 usec per loop real 0m2.353s user 0m0.454s sys 0m1.435s --- with patch: 2000 loops, best of 5: 187 usec per loop real 0m2.724s user 0m0.627s sys 0m1.634s 8MB copy ======== $ dd if=/dev/urandom of=/tmp/f1 bs=1M count=8 --- without patch: 50 loops, best of 5: 9.51 msec per loop real 0m3.392s user 0m0.343s sys 0m2.478s --- with patch: 50 loops, best of 5: 7.75 msec per loop real 0m2.878s user 0m0.105s sys 0m2.187s 512MB copy ========== --- without patch: 1 loop, best of 5: 872 msec per loop real 0m5.574s user 0m0.402s sys 0m3.115s --- with patch: 1 loop, best of 5: 646 msec per loop real 0m5.475s user 0m0.037s sys 0m2.959s |
|||
| msg317622 - (view) | Author: desbma (desbma) * | Date: 2018-05-24 20:46 | |
Duplicate of https://bugs.python.org/issue25156 ? |
|||
| msg317627 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2018-05-24 22:01 | |
Oh! I got confused by the fact that #25063 was rejected due to concerns about copyfileobj() otherwise I would have commented on your patch which I totally missed. Yes, this overlaps with #25156 patch but it uses the same logic of socket.sendfile() which IMO is more robust. If you agree I'd close #25156 and continue in here, and I'm more than happy to co-author the contribution if that's OK with you. |
|||
| msg317630 - (view) | Author: desbma (desbma) * | Date: 2018-05-24 22:15 | |
Honestly, whatever gets this thing moving forward is good with me. I'm a bit depressed by the number of issues here that have a good patch waiting to be merged, or even read, and that languish for years. I haven't read your patch in detail, but if others agree that it is better than mine and can be merged, let's do it. |
|||
| msg317634 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-05-24 22:26 | |
> I'm a bit depressed by the number of issues here that have a good patch waiting to be merged, or even read, and that languish for years. I'm really sorry about that. We, core developers, are doing our best, but we are 34 active core developers who merged 5000 pull requests in 1 year. We don't scale well :-( We are working on getting more people onboard. |
|||
| msg317738 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2018-05-26 11:33 | |
I updated the code to rely on sendfile(2) on Linux only, which apparently is the only one supporting copy between regular files and added a check to fail immediately in case the filesystem is full. Can somebody review the patch? |
|||
| msg317882 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2018-05-28 16:28 | |
Closing as duplicate of #33671. |
|||
| msg317883 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2018-05-28 16:29 | |
Closing as duplicate of #33671. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:00 | admin | set | github: 77820 |
| 2018-05-28 16:29:08 | giampaolo.rodola | set | messages:
+ msg317883 stage: resolved -> patch review |
| 2018-05-28 16:28:30 | giampaolo.rodola | set | status: open -> closed superseder: Efficient zero-copy for shutil.copy* functions (Linux, OSX and Win) messages: + msg317882 resolution: duplicate |
| 2018-05-26 11:33:38 | giampaolo.rodola | set | messages: + msg317738 |
| 2018-05-24 22:26:38 | vstinner | set | messages: + msg317634 |
| 2018-05-24 22:15:58 | desbma | set | messages: + msg317630 |
| 2018-05-24 22:01:34 | giampaolo.rodola | set | messages: + msg317627 |
| 2018-05-24 20:46:51 | desbma | set | messages: + msg317622 |
| 2018-05-24 20:18:00 | giampaolo.rodola | set | keywords:
+ patch pull_requests: + pull_request6737 |
| 2018-05-24 20:15:29 | giampaolo.rodola | create | |
