"Memoryless" archive processing of ld
Fangrui Song
i@maskray.me
Fri Sep 4 06:34:24 GMT 2020
Many people are aware that archive member (element) fetching does not allow
backward references, i.e. ld def.a ref.o will fail with "undefined reference to".
However, it is said that VMS (now OpenVMS), Mach-O ld64 and Windows link.exe
chose a different strategy in which every archive symbol is remembered, and
thus such a backward reference is allowed.

Do folks know the pros and cons of GNU ld's strategy? (Is it emulation of
ancient Unix ELF linkers' behavior?)

People can quickly give one advantage: the "memoryless" archive processing
saves memory. This was probably important in the old days, and probably more so
before the archive symbol table was invented, but it is probably less relevant
nowadays.

https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
briefly describes 'lorder' (which still exists on a modern FreeBSD) and says

> This whole procedure has been obsolete since about 1980, because Unix archives
> now contain a symbol table (traditionally built by ranlib, now generally built
> by ar itself), and the Unix linker uses the symbol table to effectively make
> multiple passes over an archive file.
>
> Anyhow, that’s where tsort came from. To solve an old problem with the way the
> linker handled archive files, which has since been solved in different ways.

Some disadvantages:

* --start-group is needed to resolve circular dependencies among archives.
  People are probably used to the ugly -lgcc -lgcc_eh or -lgcc_s on both sides
  of -lc.
* Poor diagnostics: "undefined reference to" tells you the symbol name and the
  source file, but not the destination file. It usually takes some effort to
  figure out the problem (the ordering problem is usually not obvious).
* An external program 'lorder' or a build system's integrated topological
  sorting feature is needed to order archives. The ordering sacrifices
  commutativity. The loss of commutativity can make the build brittle, i.e. a
  minor ordering tweak can cause subtle behavior changes (symbol resolution).
* When providing an interceptor library (a library providing overriding
  definitions), you usually want to make it optional, i.e. the intercepted
  library does not have a dependency on the interceptor. However, due to the
  memoryless archive processing, you have to make sure the interceptor comes
  after the intercepted library, which usually requires some special plumbing
  in the build system. An alternative is --whole-archive, which unfortunately
  loses the nice lazy property.

I have mentioned enough disadvantages :) As one additional advantage, the
memoryless nature enforces a (weak) layering of archives. The layering is one
particular topological sort of the dependency graph. It is weak because some
dependency edges can still be missing, e.g. if a->b, a->c, b->d, c->d and there
is an unspecified dependency b->c, then ld .. -la -lb -lc -ld will succeed but
ld .. -la -lc -lb -ld may fail.

I hope the few points above can prompt more thoughts.
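
To make the ordering effects above concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not how any real linker is implemented: Obj, Archive, link and the symbol c_helper are hypothetical names, a member is pulled only when it defines a currently undefined symbol, and weak symbols, shared libraries and --start-group are ignored. The remember=True mode is a rough stand-in for the "every archive symbol is remembered" strategy mentioned at the top.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """A relocatable object: symbols it defines and symbols it references."""
    name: str
    defs: set = field(default_factory=set)
    refs: set = field(default_factory=set)


@dataclass
class Archive:
    """A static archive: a list of members that are only loaded on demand."""
    name: str
    members: list


def link(inputs, remember=False):
    """Return the symbols still undefined after a toy left-to-right link.

    remember=False: an archive is consulted only at its command-line position.
    remember=True:  archives seen earlier are consulted again whenever a later
                    input adds undefined symbols (hypothetical simplification).
    """
    defined, undefined, loaded, seen = set(), set(), set(), []

    def load(obj):
        loaded.add(id(obj))
        defined.update(obj.defs)
        undefined.difference_update(obj.defs)
        undefined.update(obj.refs - defined)

    def scan(archive):
        # Keep rescanning this one archive until no more members are pulled.
        pulled_any, progress = False, True
        while progress:
            progress = False
            for m in archive.members:
                if id(m) not in loaded and m.defs & undefined:
                    load(m)
                    progress = pulled_any = True
        return pulled_any

    for item in inputs:
        if isinstance(item, Archive):
            seen.append(item)
            scan(item)
        else:
            load(item)
        if remember:
            while any([scan(a) for a in seen]):
                pass
    return undefined


# Backward reference: ld def.a ref.o
ref_o = Obj("ref.o", refs={"foo"})
def_a = Archive("def.a", [Obj("def.o", defs={"foo"})])

# Weak layering: a->b, a->c, b->d, c->d, plus an unspecified b->c edge that
# goes through a second member of libc.a (the hypothetical c_helper.o).
main_o = Obj("main.o", refs={"a"})
la = Archive("liba.a", [Obj("a.o", {"a"}, {"b", "c"})])
lb = Archive("libb.a", [Obj("b.o", {"b"}, {"d", "c_helper"})])
lc = Archive("libc.a", [Obj("c.o", {"c"}, {"d"}), Obj("c_helper.o", {"c_helper"})])
ld_ = Archive("libd.a", [Obj("d.o", {"d"})])

print(link([def_a, ref_o]))                 # memoryless: foo stays undefined
print(link([def_a, ref_o], remember=True))  # remembered: nothing undefined
print(link([main_o, la, lb, lc, ld_]))      # -la -lb -lc -ld: nothing undefined
print(link([main_o, la, lc, lb, ld_]))      # -la -lc -lb -ld: c_helper undefined
```

In this model the -la -lc -lb -ld order fails only because the symbol b needs lives in a member of libc.a that had no reason to be pulled when libc.a was scanned; if b referenced a symbol in the already loaded c.o, the same order would succeed, which is why the failure is a "may" rather than a "will".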