Issue 32453
Created on 2017-12-30 05:15 by nh2, last changed 2022-04-11 14:58 by admin.
**Files**

| File name | Uploaded | Description |
|---|---|---|
| bench_rmtree.py | serhiy.storchaka, 2017-12-30 12:06 | |
**Messages (12)**

**msg309217** - Author: Niklas Hambüchen (nh2), Date: 2017-12-30 05:15
See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a for the explanation and equivalent fix in coreutils. The gist is that deleting entries in inode order can improve deletion performance dramatically.

To obtain inode numbers and sort by them, one needs to `getdents()` all the entries ahead of time, but rmtree() already gets all dirents ahead of the deletion. https://bugs.python.org/issue28564 recently improved shutil.rmtree() performance by using scandir(), but the returned generator is nevertheless list()ed, so we already have all the necessary information in memory and would just have to perform an O(n) integer sort.

I propose we check whether the improvements made to `rm -r` in coreutils should be ported to shutil.rmtree().
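The proposed change can be sketched as follows. This is a hypothetical helper, not the stdlib implementation: it deletes the entries of each directory in ascending inode order, as the patched coreutils `rm` does, to reduce seek time on spinning disks. Error handling and symlink-attack hardening from the real rmtree() are omitted.

```python
import os

def rmtree_inode_order(path):
    """Sketch: recursively delete *path*, unlinking each directory's
    entries in ascending inode order (hypothetical helper)."""
    with os.scandir(path) as it:
        # rmtree() already materializes the scandir() results, so
        # sorting by inode number only adds an in-memory sort.
        entries = sorted(it, key=lambda e: e.inode())
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            rmtree_inode_order(entry.path)
        else:
            os.unlink(entry.path)
    os.rmdir(path)
```

Note that `os.DirEntry.inode()` consults the data already returned by `getdents()`, so no extra `stat()` calls are needed for the sort.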
**msg309227** - Author: Antoine Pitrou (pitrou), Date: 2017-12-30 11:22

I don't think filesystem-specific optimizations belong in the Python stdlib. Python is compatible with multiple operating systems, including Windows, macOS, Android, and many POSIX variants. It would be much better if this were fixed in the kernel...
**msg309230** - Author: Serhiy Storchaka (serhiy.storchaka), Date: 2017-12-30 12:06

I have tested shutil.rmtree() with a large number of files, using a modified benchmark from issue28564. For 400000 files it takes less than 5 seconds.

From the comment in the coreutils benchmark (http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a):

    # Using rm -rf to remove a 400k-entry directory takes:
    # - 9 seconds with the patch, on a 2-yr-old system
    # - 350 seconds without the patch, on a high-end system (disk 20-30% faster)
    threshold_seconds=60

Running the coreutils benchmark gives a result of 3 seconds on my computer. It seems to me that this issue has been fixed in the kernel.
**msg309300** - Author: Niklas Hambüchen (nh2), Date: 2017-12-31 19:33
Serhiy, did you run your benchmark on an SSD or a spinning disk?
The coreutils bug mentions that the problem is seek times.
My tests on a spinning disk with 400k files suggest that indeed rmtree() is ~30x slower than `rm -r`:
# time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)
real 0m0.722s
user 0m0.032s
sys 0m0.680s
# time rm -rf dirtest/
real 0m0.519s
user 0m0.074s
sys 0m0.437s
# time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)
real 0m0.693s
user 0m0.039s
sys 0m0.659s
# time python -c 'import shutil; shutil.rmtree("dirtest")'
real 0m0.756s
user 0m0.225s
sys 0m0.499s
# time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)
real 0m0.685s
user 0m0.032s
sys 0m0.658s
# time python3 -c 'import shutil; shutil.rmtree("dirtest")'
real 0m0.965s
user 0m0.424s
sys 0m0.528s
# time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)
real 0m4.249s
user 0m0.098s
sys 0m2.804s
# time rm -rf dirtest/
real 0m10.782s
user 0m0.265s
sys 0m2.213s
# time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)
real 0m5.236s
user 0m0.107s
sys 0m2.832s
# time python -c 'import shutil; shutil.rmtree("dirtest")'
real 3m8.006s
user 0m1.323s
sys 0m3.929s
# time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)
real 0m4.671s
user 0m0.097s
sys 0m2.832s
# time python3 -c 'import shutil; shutil.rmtree("dirtest")'
real 2m49.476s
user 0m2.196s
sys 0m3.695s
The tests were done with coreutils rm 8.28, Python 2.7.14, Python 3.6.3, on ext4 (rw,relatime,data=ordered), on a dmraid RAID1 across 2 WDC_WD4000FYYZ disks (WD 4 TB Enterprise).
Also note how deleting 100k files takes ~0.5 seconds with `rm -r` and the Pythons, but deleting 4x more files takes 20x longer with `rm -r` and ~300x longer with the Pythons.
There is clearly some boundary below which we are hitting some nice cached behaviour.
**msg309303** - Author: Niklas Hambüchen (nh2), Date: 2017-12-31 20:11
It turns out I was wrong when saying that there's some cache we're hitting.
In fact, `rm -r` is still precisely O(n^2), even with the coreutils patch I linked.
Quick overview table of the benchmark:
| nfiles | real | user | sys |
|---|---|---|---|
| 100000 | 0.51s | 0.07s | 0.43s |
| 200000 | 2.46s | 0.15s | 0.89s |
| 400000 | 10.78s | 0.26s | 2.21s |
| 800000 | 44.72s | 0.58s | 6.03s |
| 1600000 | 180.37s | 1.06s | 10.70s |
Each 2x increase of number of files results in 4x increased deletion time.
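The quadratic claim can be checked numerically from the real-time column of the table above, by fitting the slope between consecutive measurements on a log-log scale (a slope near 2.0 means quadratic scaling):

```python
import math

# Wall-clock timings from the table above: (number of files, seconds).
timings = [
    (100_000, 0.51),
    (200_000, 2.46),
    (400_000, 10.78),
    (800_000, 44.72),
    (1_600_000, 180.37),
]

# Slope of log(time) vs. log(n) between consecutive measurements.
exponents = [
    math.log(t2 / t1) / math.log(n2 / n1)
    for (n1, t1), (n2, t2) in zip(timings, timings[1:])
]
for (n1, _), (n2, _), e in zip(timings, timings[1:], exponents):
    print(f"{n1} -> {n2}: scaling exponent ~ {e:.2f}")
```

The fitted exponents hover around 2, approaching it from above as n grows, consistent with O(n^2) behaviour.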
Full benchmark output:
# time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)
real 0m0.722s
user 0m0.032s
sys 0m0.680s
# time rm -rf dirtest/
real 0m0.519s
user 0m0.074s
sys 0m0.437s
# time (mkdir dirtest && cd dirtest && seq 1 200000 | xargs touch)
real 0m1.576s
user 0m0.044s
sys 0m1.275s
# time rm -r dirtest/
real 0m2.469s
user 0m0.150s
sys 0m0.890s
# time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)
real 0m4.249s
user 0m0.098s
sys 0m2.804s
# time rm -rf dirtest/
real 0m10.782s
user 0m0.265s
sys 0m2.213s
# time (mkdir dirtest && cd dirtest && seq 1 800000 | xargs touch)
real 0m10.533s
user 0m0.204s
sys 0m5.758s
# time rm -rf dirtest/
real 0m44.725s
user 0m0.589s
sys 0m6.037s
# time (mkdir dirtest && cd dirtest && seq 1 1600000 | xargs touch)
real 0m34.480s
user 0m0.382s
sys 0m12.057s
# time rm -r dirtest/
real 3m0.371s
user 0m1.069s
sys 0m10.704s
**msg309305** - Author: Antoine Pitrou (pitrou), Date: 2017-12-31 20:28

Did you try to sync and flush caches before running `rm -r`?
**msg309308** - Author: Niklas Hambüchen (nh2), Date: 2017-12-31 21:14

> Did you try to sync and flush caches before running `rm -r`?

Yes, it doesn't make a difference for me; I still see the same O(n²) behaviour in `rm -r`.

I've sent an email "O(n^2) performance of rm -r" to bug-coreutils@gnu.org just now; unfortunately I can't link it yet because the mailman archive doesn't show it yet. It should appear soon on http://lists.gnu.org/archive/html/bug-coreutils/2017-12/threads.html.
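For reference, the cache flush before each run can be sketched like this. It is Linux-specific and an assumption about the setup used here, not something the thread spells out; writing to `drop_caches` requires root, so the sketch degrades to a plain sync otherwise:

```python
import os

# Flush dirty pages to disk, then (as root, Linux only) drop the
# page cache plus dentry/inode caches so the next run starts cold.
os.sync()
if os.geteuid() == 0 and os.path.exists("/proc/sys/vm/drop_caches"):
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")
```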
**msg309317** - Author: Niklas Hambüchen (nh2), Date: 2018-01-01 02:44

OK, my coreutils email is at http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00054.html
**msg309353** - Author: Niklas Hambüchen (nh2), Date: 2018-01-01 23:29

A better location to view the whole coreutils thread is https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921
**msg309359** - Author: Serhiy Storchaka (serhiy.storchaka), Date: 2018-01-02 08:22

Thanks for your investigations, Niklas. I ran my benchmark on a spinning disk, but significant nonlinearity is observed only for directory sizes around 1000000 and more. And yes, sorting by inode number helps.
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.01 100000
3.80 200000
3.64 300000
4.89 400000
8.72 600000
11.86 800000
56.80 1200000
209.82 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.97 100000
2.42 200000
3.84 300000
4.48 400000
7.07 600000
10.01 800000
15.53 1200000
23.24 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.68 100000
1.34 200000
2.10 300000
3.95 400000
5.95 600000
10.28 800000
47.66 1200000
89.32 1600000
On an SSD:
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.00 100000
1.93 200000
2.90 300000
4.98 400000
7.05 600000
9.87 800000
21.45 1200000
36.19 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.96 100000
1.99 200000
3.09 300000
4.85 400000
7.55 600000
9.44 800000
16.03 1200000
21.28 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.67 100000
1.38 200000
2.41 300000
2.82 400000
5.24 600000
7.02 800000
18.60 1200000
30.58 1600000
Interestingly, the patched Python is faster than `rm` (GNU coreutils 8.26) for large numbers.
**msg309367** - Author: Antoine Pitrou (pitrou), Date: 2018-01-02 11:26

Yes, so `rm -rf` is quadratic on my SSD too...
**msg405367** - Author: Niklas Hambüchen (nh2), Date: 2021-10-30 12:33
A small update / summary so far:
From here this developed into a coreutils discussion:
#29921 O(n^2) performance of rm -r
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921
and finally a `linux-fsdevel` discussion:
O(n^2) deletion performance
https://lore.kernel.org/linux-fsdevel/5ca3808d-4eea-afec-75a6-2cc41f44b868@nh2.me/t/#u
Dave Chinner (XFS dev) suggests that on XFS there is no quadratic behaviour once the problem is bound by the seek time of the spinning disk.
Somebody should try to confirm that it becomes linear in even larger tests, e.g. with deletion times well beyond 21 minutes.
**History**

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:58:56 | admin | set | github: 76634 |
| 2021-10-30 12:33:10 | nh2 | set | messages: + msg405367 |
| 2018-06-12 09:35:52 | giampaolo.rodola | set | nosy: + giampaolo.rodola |
| 2018-01-02 11:26:15 | pitrou | set | messages: + msg309367 |
| 2018-01-02 08:22:14 | serhiy.storchaka | set | messages: + msg309359 |
| 2018-01-01 23:29:27 | nh2 | set | messages: + msg309353 |
| 2018-01-01 02:44:53 | nh2 | set | messages: + msg309317 |
| 2017-12-31 21:14:38 | nh2 | set | messages: + msg309308 |
| 2017-12-31 20:28:29 | pitrou | set | messages: + msg309305 |
| 2017-12-31 20:11:40 | nh2 | set | messages: + msg309303 |
| 2017-12-31 19:33:57 | nh2 | set | messages: + msg309300 |
| 2017-12-30 12:06:33 | serhiy.storchaka | set | files: + bench_rmtree.py |
| 2017-12-30 12:06:12 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg309230 |
| 2017-12-30 11:22:40 | pitrou | set | nosy: + pitrou; messages: + msg309227 |
| 2017-12-30 05:15:09 | nh2 | create | |