PATCH: ld/2442: ia64 ld slow with many local relocs (O(N^2) in get_dyn_sym_info)
James E Wilson
wilson@specifixinc.com
Tue Apr 4 19:02:00 GMT 2006
More information about the Binutils mailing list
Tue Apr 4 19:02:00 GMT 2006
- Previous message (by thread): PATCH: ld/2442: ia64 ld slow with many local relocs (O(N^2) in get_dyn_sym_info)
- Next message (by thread): PATCH: ld/2442: ia64 ld slow with many local relocs (O(N^2) in get_dyn_sym_info)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Sun, 2006-04-02 at 11:33, H. J. Lu wrote: > This patch is very similar to the last one. The main difference is > it frees the unused portion of elfNN_ia64_dyn_sym_info array. The > speed improvement is the same 1300 X. It may use less memory than > the unmodified linker. I've looked through the patch. It looks pretty reasonable to me. You have created four new functions, but failed to add comments explaining what they do. This needs to be fixed before the patch is checked in. I had trouble understanding how the patch works. I had to spend a bit of time reading and rereading it. It really needs some better comments to explain what the new fields count, sorted_count and size mean and how they are used. In elfNN_ia64_check_relocs, you have a one line comment + /* We scan relocations first to create dynamic relocation arrays. */ that doesn't adequately explain what is going on. One reason that it is hard to understand is that it is followed by over a hundred lines of code, and it isn't clear what this is referring to. Also, it would have helped if you had provided an explanation of what the patch does and how it works. This is something that should always be provided with a non-trivial patch. The missing explanation here is that we place all get_dyn_sym_info (... TRUE) calls before all get_dyn_sym_info (... FALSE) calls. We don't sort when inserting. Also, we don't check for duplicates, except in the previously sorted section if one exists, and against the last inserted entry. This allows insertions to be fast. When we see a ...FALSE call, we sort and eliminate duplicates if there is an unsorted section. Typically, this will only happen once, because we do all insertions before lookups. We then use bsearch to do a lookup. This also allows lookups to be fast. So we have fast insertion (O(log N) due to duplicate check), fast lookup (O(log N)) and one sort (O(N log N) expected time). Previously, all lookups were O(N) because of the use of the linked list. Also, all insertions were O(N) because of the check for duplicates. There are some complications here because the array size grows occasionally, which may add an O(N) factor, but this should be rare. Also, you free the excess array allocation, which requires a copy which is O(N), but this only happens once. I see that bsearch is already used by the xtensa port, so there should be no portability problem there. bsearch is in ISO C99, but not in ISO C90. It is also in BSD4.3 and SVID 3, so probably every host we care about has it. -- Jim Wilson, GNU Tools Support, http://www.specifix.com
- Previous message (by thread): PATCH: ld/2442: ia64 ld slow with many local relocs (O(N^2) in get_dyn_sym_info)
- Next message (by thread): PATCH: ld/2442: ia64 ld slow with many local relocs (O(N^2) in get_dyn_sym_info)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Binutils mailing list