[PATCH] PR32705 opcodes: fix RISC-V $x mapping symbol
Andrew Oates
andrew@andrewoates.com
Mon Feb 17 18:24:55 GMT 2025
More information about the Binutils mailing list
Mon Feb 17 18:24:55 GMT 2025
- Previous message (by thread): [PATCH] PR32705 opcodes: fix RISC-V $x mapping symbol
- Next message (by thread): [PATCH] PR32705 opcodes: fix RISC-V $x mapping symbol
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Feb 17, 2025 at 1:18 PM Andrew Oates <andrew@andrewoates.com> wrote: > > > On Sun, Feb 16, 2025 at 11:41 PM Andrew Oates <andrew@andrewoates.com> > wrote: > >> (+ binutils@sourceware.org which I accidentally dropped in my earlier >> reply) >> >> On Sun, Feb 16, 2025, 23:02 Nelson Chu <nelson@rivosinc.com> wrote: >> >>> Well if I remember correctly, assembler generates $x means it is the >>> same as the previous $x<isa>, and that conflicts with what psABI defined. >>> If we are using a new objdump which follows the current psABI, then using >>> it to dump the old packages will be wrong, and that's the problem is. >>> >> >> Ah, I see what you mean. I didn't realize there was an ABI change. >> >> I haven't observed this behavior in practice but looking at the code I >> think you're right, it could emit $x when the isa hasn't changed but isn't >> the same as the ELF default. >> >> I'll definitely defer to this group on the right approach :) >> >> Any suggestions on how to write a test case to trigger this scenario? I >> played around with the gas and binutils test cases quite a bit but >> struggled to get gas to emit a $x in any interesting way. >> > > Would fixing this part of gas be as simple as plumbing the default arch > string through, then replacing "strcmp (riscv_rps_as.subset_list->arch_str, > ..." with "strcmp (riscv_rps_as.default_arch, ..." in > riscv_mapping_state()? > > I'm not at all familiar with the gas codebase so any pointers would be > very helpful. > Actually, thinking about this more, the linker has to be updated as well, right? If $x resets to the default architecture of the ELF file, that could change when multiple object files with different defaults are linked together, so the linker would need to go and update symbols in sections that use the "wrong" default to be explicit. I have not even started looking at the ld code :) This is probably not a huge issue in practice today, and gas emits explicit $xrv... symbols at the start of each section, and I imagine that changing architectures within a section is...unusual (what is the use case for that?). Let me know if we should take this discussion to a riscv ABI forum rather than binutils (and if so, which one). My tactical goal is just to be able to disassemble my current binaries compiled by LLVM without half the instructions being ".insn", but there may be some bigger questions here to resolve. > > >>> cc other riscv maintainers and the chair of psABI >>> >>> Nelson >>> >>> On Mon, Feb 17, 2025 at 11:45 AM Andrew Oates <andrew@andrewoates.com> >>> wrote: >>> >>>> >>>> >>>> On Sun, Feb 16, 2025, 22:34 Nelson Chu <nelson@rivosinc.com> wrote: >>>> >>>>> I think we should fix the behavior of assembler too in the same patch, >>>>> so that is a complete GNU feature, and don't need the binary testcase. >>>>> >>>>> There is a compatibility issue here since all objects, which were >>>>> built for three years, will need to be rebuilt when the behavior changes, >>>>> which will cause troubles. The implementation was committed first, but the >>>>> spec merged the $x rule to be the same as elf attr later after the >>>>> implementation, and it seems no one figured out the rule conflicts with the >>>>> existing behavior at that time... I think maybe It's too late to change >>>>> the psabi, so the whole GNU behavior, which means assembler and >>>>> dis-assembler, should be fixed at the same time. >>>>> >>>> >>>> I'll defer to you as the expert, but in my opinion the assembler is >>>> behaving correctly, if conservatively. In contrast to objdump which is >>>> (currently, at least) incorrect. >>>> >>>> We could also update the assembler to generate $x rather than the full >>>> arch string, but I don't think they need to be coupled, as the assembler is >>>> spec/ABI compliant still. >>>> >>>> >>>> >>>>> Nelson >>>>> >>>>> On Mon, Feb 17, 2025 at 3:15 AM <andrew@andrewoates.com> wrote: >>>>> >>>>>> From: Andrew Oates <andrew@andrewoates.com> >>>>>> >>>>>> The mapping symbol "$x" without an ISA string "means using ISA >>>>>> configuration from ELF attribute."[1]. Currently the code does not >>>>>> reset the subset_list. This means that a previous mapping symbol that >>>>>> overrides the ISA string will continue to be used, rather than the >>>>>> default string set in the ELF file's .riscv.attributes section. This >>>>>> can cause incorrect or failed instruction decodings. >>>>>> >>>>>> In practice, this causes problems when disassembling code generated by >>>>>> LLVM, which (unlike gas) does not emit explicit mapping symbols at the >>>>>> start of each section. >>>>>> >>>>>> This change stores the default architecture string seen at the >>>>>> beginning >>>>>> of disassembly in the global parse data struct, and restores that to >>>>>> subset_list whenever a bare "$x" symbol is seen. >>>>>> >>>>>> Test case object file generated per repro instructions in PR32705. >>>>>> >>>>>> * opcodes/riscv-dis.c >>>>>> (riscv_init_disasm_info): store default_arch in global >>>>>> riscv_private_data struct >>>>>> (riscv_update_map_state): used stored default when a "$x" >>>>>> symbol is seen >>>>>> >>>>>> binutils/testsuite/binutils-all/riscv/ >>>>>> * riscv.exp: new test case >>>>>> * pr32705.o.bz2: new file (test input) >>>>>> * pr32705.o.dump: new file (test output) >>>>>> >>>>>> [1] >>>>>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#mapping-symbol >>>>>> --- >>>>>> .../binutils-all/riscv/pr32705.o.bz2 | Bin 0 -> 630 bytes >>>>>> .../binutils-all/riscv/pr32705.o.dump | 13 ++++++++++++ >>>>>> .../testsuite/binutils-all/riscv/riscv.exp | 20 >>>>>> ++++++++++++++++++ >>>>>> opcodes/riscv-dis.c | 18 ++++++++++------ >>>>>> 4 files changed, 45 insertions(+), 6 deletions(-) >>>>>> create mode 100755 >>>>>> binutils/testsuite/binutils-all/riscv/pr32705.o.bz2 >>>>>> create mode 100644 >>>>>> binutils/testsuite/binutils-all/riscv/pr32705.o.dump >>>>>> >>>>>> diff --git a/binutils/testsuite/binutils-all/riscv/pr32705.o.bz2 >>>>>> b/binutils/testsuite/binutils-all/riscv/pr32705.o.bz2 >>>>>> new file mode 100755 >>>>>> index >>>>>> 0000000000000000000000000000000000000000..9b7fb4d67394b25908db2ffef51e14967dc51b4b >>>>>> GIT binary patch >>>>>> literal 630 >>>>>> zcmV-+0*U=XT4*^jL0KkKStLa_B>(}`f9L=H_+D=F|MYjIEpos2|8md(KmY(h0ssO; >>>>>> zXaK+g+z7FTieybkiKHHrCWojE02w_%8&RMDX|)4G)P7Vn$Qr5Xh)ttS5S~p>Qy>O_ >>>>>> zVgSMb42&QEG<rbbXblF9115k2Kmm!Nrh%q_8Z>ACXwYIrn^RLw9-?MVO)ws)**#Ml >>>>>> z2sFTGjExK$38oWGpoL1I;6{XUrUU<i_*sfOl+blFK>W@+^q(UANr>tE#k=>))e?0H >>>>>> zdNO9L0HBG($eqL&%+tA+I}j@(NZr=cwZYH*&W8dtg7#`Tzye1K1J)4nuu@^>{TyIk >>>>>> zM`8i!l#VXb5(G-lYY<4Fuemy%)+z={INK2pSbZDRt;u3BX$qi<!ulthyO!=))e%V; >>>>>> z)Q}j9pxW`gT__-sEQt+Wzi1yGb|TdqFnA`LQusWIO3J-hNNuS?AOVBQ!4?761eZ`# >>>>>> z<_1j`PVttis}@v+woe~45zSnZ<<zsL<ltmHWW5N6X5#_VlKFx1wE5(1*r_%y^dF+5 >>>>>> z(xHNOg*nK8=?C7SPaN^5z1sduWu6o_NeW?Br54h=0YJjVC)#buLoRDX{B_!pz#&CT >>>>>> zexOBX;W#r_s}_D83hqAk$b<7H(8)dTWj6HJbJ5!?(imz@3-c-S)NOCJ;U!jeNh)nv >>>>>> z(yf_hdjz<l8PnHTRt1gMTnEMv1|5BABq>n?WT815icmfvhABZsk{V7Rr*Ni8$7Pcq >>>>>> zdP6G5T_*V>R9I$XNi+m6>kxXOr^5ln5gAV;BvOb$m>_ltxbEHAN&Ij_pkOwHAQ#zz >>>>>> Q{9oekNT&)C1c;{Opf{8mTL1t6 >>>>>> >>>>>> literal 0 >>>>>> HcmV?d00001 >>>>>> >>>>>> diff --git a/binutils/testsuite/binutils-all/riscv/pr32705.o.dump >>>>>> b/binutils/testsuite/binutils-all/riscv/pr32705.o.dump >>>>>> new file mode 100644 >>>>>> index 00000000000..6204da92aa3 >>>>>> --- /dev/null >>>>>> +++ b/binutils/testsuite/binutils-all/riscv/pr32705.o.dump >>>>>> @@ -0,0 +1,13 @@ >>>>>> +#objdump: -d >>>>>> +# Test handling of $x mapping symbols to ensure they reset the >>>>>> current >>>>>> +# architecture correctly. >>>>>> + >>>>>> +#... >>>>>> +Disassembly of section \.text: >>>>>> + >>>>>> +00000000000100e8 <func-0x4>: >>>>>> + 100e8: 00100513 li a0,1 >>>>>> + >>>>>> +00000000000100ec <func>: >>>>>> + 100ec: 2505 addiw a0,a0,1 >>>>>> + 100ee: 8082 ret >>>>>> diff --git a/binutils/testsuite/binutils-all/riscv/riscv.exp >>>>>> b/binutils/testsuite/binutils-all/riscv/riscv.exp >>>>>> index 8847af8ac9d..66caf7f532a 100644 >>>>>> --- a/binutils/testsuite/binutils-all/riscv/riscv.exp >>>>>> +++ b/binutils/testsuite/binutils-all/riscv/riscv.exp >>>>>> @@ -27,3 +27,23 @@ foreach t $test_list { >>>>>> verbose [file rootname $t] >>>>>> run_dump_test [file rootname $t] >>>>>> } >>>>>> + >>>>>> +set test $srcdir/$subdir/pr32705.o.bz2 >>>>>> +# We need to strip the ".bz2", but can leave the dirname. >>>>>> +set t $subdir/[file tail $test] >>>>>> +set testname [file rootname $t] >>>>>> +set tempfile tmpdir/pr32705.o >>>>>> +set dumpfile tmpdir/pr32705.out >>>>>> +set objfile [file rootname $test] >>>>>> +if {[catch "system \"bzip2 -dc $test > $tempfile\""] != 0} { >>>>>> + untested "bzip2 -dc ($testname)" >>>>>> +} else { >>>>>> + verbose [file rootname $t] >>>>>> + if {[catch "system \"$OBJDUMP -d $tempfile > $dumpfile\""] != 0} >>>>>> { >>>>>> + fail $testname >>>>>> + } else { >>>>>> + if {[ regexp_diff $dumpfile "${objfile}.dump" ]} { >>>>>> + fail $testname >>>>>> + } >>>>>> + } >>>>>> +} >>>>>> diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c >>>>>> index 84c6deef7b6..7065fdecc02 100644 >>>>>> --- a/opcodes/riscv-dis.c >>>>>> +++ b/opcodes/riscv-dis.c >>>>>> @@ -52,6 +52,8 @@ struct riscv_private_data >>>>>> enum riscv_spec_class default_priv_spec; >>>>>> /* Used for architecture parser. */ >>>>>> riscv_parse_subset_t riscv_rps_dis; >>>>>> + /* Default architecture string for the object file. */ >>>>>> + const char* default_arch; >>>>>> /* Used for mapping symbols. */ >>>>>> int last_map_symbol; >>>>>> bfd_vma last_stop_offset; >>>>>> @@ -1065,10 +1067,14 @@ riscv_update_map_state (int n, >>>>>> return; >>>>>> >>>>>> name = bfd_asymbol_name(info->symtab[n]); >>>>>> - if (strcmp (name, "$x") == 0) >>>>>> - *state = MAP_INSN; >>>>>> - else if (strcmp (name, "$d") == 0) >>>>>> + if (strcmp (name, "$d") == 0) >>>>>> *state = MAP_DATA; >>>>>> + else if (strcmp (name, "$x") == 0) >>>>>> + { >>>>>> + *state = MAP_INSN; >>>>>> + riscv_release_subset_list (pd->riscv_rps_dis.subset_list); >>>>>> + riscv_parse_subset (&pd->riscv_rps_dis, pd->default_arch); >>>>>> + } >>>>>> else if (strncmp (name, "$xrv", 4) == 0) >>>>>> { >>>>>> *state = MAP_INSN; >>>>>> @@ -1372,7 +1378,7 @@ riscv_init_disasm_info (struct disassemble_info >>>>>> *info) >>>>>> pd->riscv_rps_dis.xlen = &pd->xlen; >>>>>> pd->riscv_rps_dis.isa_spec = &pd->default_isa_spec; >>>>>> pd->riscv_rps_dis.check_unknown_prefixed_ext = false; >>>>>> - const char *default_arch = "rv64gc"; >>>>>> + pd->default_arch = "rv64gc"; >>>>>> if (info->section != NULL) >>>>>> { >>>>>> bfd *abfd = info->section->owner; >>>>>> @@ -1390,12 +1396,12 @@ riscv_init_disasm_info (struct >>>>>> disassemble_info *info) >>>>>> attr[Tag_b].i, >>>>>> attr[Tag_c].i, >>>>>> >>>>>> &pd->default_priv_spec); >>>>>> - default_arch = attr[Tag_RISCV_arch].s; >>>>>> + pd->default_arch = attr[Tag_RISCV_arch].s; >>>>>> } >>>>>> } >>>>>> } >>>>>> riscv_release_subset_list (pd->riscv_rps_dis.subset_list); >>>>>> - riscv_parse_subset (&pd->riscv_rps_dis, default_arch); >>>>>> + riscv_parse_subset (&pd->riscv_rps_dis, pd->default_arch); >>>>>> >>>>>> pd->last_map_symbol = -1; >>>>>> pd->last_stop_offset = 0; >>>>>> -- >>>>>> 2.47.0 >>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://sourceware.org/pipermail/binutils/attachments/20250217/6dda291d/attachment-0001.htm>
- Previous message (by thread): [PATCH] PR32705 opcodes: fix RISC-V $x mapping symbol
- Next message (by thread): [PATCH] PR32705 opcodes: fix RISC-V $x mapping symbol
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Binutils mailing list