[PATCH v10 06/28] gas: implement parsing of object attributes v2
Jan Beulich
jbeulich@suse.com
Mon Dec 22 08:19:20 GMT 2025
More information about the Binutils mailing list
Mon Dec 22 08:19:20 GMT 2025
- Previous message (by thread): [PATCH v10 06/28] gas: implement parsing of object attributes v2
- Next message (by thread): [PATCH v10 06/28] gas: implement parsing of object attributes v2
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 19.12.2025 15:31, Matthieu Longo wrote:
> On 08/12/2025 07:43, Jan Beulich wrote:
>> On 05.12.2025 17:27, Matthieu Longo wrote:
>>> On 02/12/2025 08:19, Jan Beulich wrote:
>>>> On 01.12.2025 17:28, Matthieu Longo wrote:
>>>>> On 28/11/2025 14:29, Jan Beulich wrote:
>>>>>> On 20.11.2025 18:58, Matthieu Longo wrote:
>>>>>>> +#if (TC_OBJ_ATTR_v2)
>>>>>>> +/* Look up attribute tags defined in the backend (object attribute v2). */
>>>>>>> +static bool
>>>>>>> +obj_attr_v2_lookup_known_attr_tag_symbol (const char *identifier,
>>>>>>> + arg_token_t token_type,
>>>>>>> + arg_t *val_out)
>>>>>>> +{
>>>>>>> + obj_attr_subsection_v2_t *subsec = elf_obj_attr_subsections (stdoutput).last;
>>>>>>> + /* If there is no current subsection, this function was called wrongly before
>>>>>>> + setting one (usually via the subsection directive). */
>>>>>>> + gas_assert (subsec != NULL);
>>>>>>
>>>>>> The comment (in particular what is being said in parentheses) suggests to me that
>>>>>> here you're asserting correctness of user input. That would want to be an as_bad()
>>>>>> then, though.
>>>>>>
>>>>>
>>>>> In my understanding, as you stated, as_bad() is used to emit an error due to a user input.
>>>>> However, this comment refers to a logic error. The function is not supposed to be called if there is no current subsection.
>>>>> It happened to me at some point during the development, and got a mysterious segfault instead of an explicit message pointing to my mistake. I decided to add this comment and assert just in case it happens to someone else.
>>>>
>>>> May I then ask that you re-word the comment such that it's unambiguously talking
>>>> about internal state, not input we're processing? To me "If there is no current
>>>> subsection" can mean either.
>>>
>>> What about this ? Is it clearer ?
>>>
>>> /* This function is called when setting a value for an attribute, and it
>>> requires an active subsection. If 'subsec' is NULL, the caller wrongly
>>> invoked this function before selecting a current subsection (normally via
>>> the subsection directive in assembly). This is a logic error, most likely
>>> caused by a misvalidation of a user input. */
>>
>> Sadly not. What is said in parentheses again refers to user input. The
>> "misvalidation" part then recovers some, but it still leaves things
>> ambiguous imo.
>>
>
> Ok what about this ?
>
> /* This function is called when setting a value for an attribute, and it
> requires an active subsection. Calling this function without setting
> 'subsec' is a logical error. Invalid user code, where the subsection
> has not been selected must be handled by the caller. We require 'subsec'
> to be non-NULL here. */
Yes, better. Thanks.
>>>>>>> +/* Parse an argument, and set its type accordingly depending on the input
>>>>>>> + value, and the constraints on the expected argument. */
>>>>>>> +static bool
>>>>>>> +obj_attr_parse_arg (arg_token_t expected_ttype,
>>>>>>> + bool (*match_identifier) (char c),
>>>>>>> + bool resolve_identifier,
>>>>>>> + bool optional,
>>>>>>> + arg_t *arg_out)
>>>>>>> +{
>>>>>>> + const arg_token_t low_ttype = (expected_ttype & LT_MASK);
>>>>>>> +
>>>>>>> + if (optional && is_end_of_stmt (*input_line_pointer))
>>>>>>> + {
>>>>>>> + arg_out->vtype = VALUE_OPTIONAL_ABSENT;
>>>>>>> + return true;
>>>>>>> + }
>>>>>>> +
>>>>>>> + /* Check whether this looks like a string literal
>>>>>>> + Note: symbol look-up for string literals is not available. */
>>>>>>> + if (*input_line_pointer == '"')
>>>>>>> + {
>>>>>>> + bool status = extract_string_literal (arg_out);
>>>>>>> + if (status && (low_ttype & STRING))
>>>>>>> + return true;
>>>>>>> +
>>>>>>> + if (status)
>>>>>>> + {
>>>>>>> + char sbuffer[100];
>>>>>>> + as_bad (_("unexpected `string' \"%s\", expected %s instead"),
>>>>>>> + arg_out->val.string,
>>>>>>> + expectations_to_string (low_ttype, sbuffer, sizeof(sbuffer)));
>>>>>>> + free ((char *) arg_out->val.string);
>>>>>>> + arg_out->val.string = NULL;
>>>>>>> + arg_out->vtype = VALUE_UNDEFINED;
>>>>>>> + }
>>>>>>> + return false;
>>>>>>> + }
>>>>>>> + /* Check whether this looks like an identifier. */
>>>>>>> + else if (ISALPHA (*input_line_pointer) || *input_line_pointer == '_')
>>>>>>
>>>>>> Aren't you open-coding is_tag_identifier() here? Oh, wait - it's is ALNUM()
>>>>>> there and ISALPHA() here. Why the difference? A symbol (identifier) can't
>>>>>> start with a number, can it (seeing that is_tag_identifier() is only an
>>>>>> alias of is_symbol())?
>>>>>
>>>>> Exactly what you said, i.e. a symbol cannot start with a number, hence the call to ISALPHA() instead of ALNUM().
>>>>
>>>> Yet then exactly as I said: is_tag_identifier() being an alias of is_symbol(),
>>>> why is ISALNUM() used there? I.e. my original question remains: Are you perhaps
>>>> open-coding is_tag_identifier() here? (That's the only meaning of "identifier"
>>>> in the comment that I can derive. I can't exclude though that there are [again]
>>>> multiple different meanings of "identifier" throughout this code.)
>>>
>>> Sorry to repeat myself but I don't understand what you mean in this comment.
>>>
>>> Are we agreeing on the following statements ?
>>> 1. An identifier can start with [a-zA-Z_] but not [0-9], so the first character's check is using ISALPHA().
>>> 2. Any following characters can use [a-zA-Z0-9_] hence ALNUM().
>>
>> Yes (leaving aside the question of whether further characters may be legitimate to
>> use).
>>
>>> Is using "ISALPHA (*input_line_pointer) || *input_line_pointer == '_'" in the condition the issue you try to point at ?
>>> Do you want me to move it to a function/macro and call it "is_identifier_start()" ?
>>
>> Not exactly. First, as said, that expression feels like it's open-coding something.
>> That "something" may be is_tag_identifier(). Which aliases is_symbol(). Which uses
>> ISALNUM(), irrespective of the character being looked at being first in a symbol
>> name.
>>
>> IOW is_symbol() violates 1. in your reply. Why? If that was changed, in turn
>> is_tag_identifier() could be used here.
>>
>> is_symbol() of course is further confusing to see: Why would you open-code (and
>> further constrain) is_name_beginner() / is_part_of_name() / is_name_ender() here?
>> The only possible explanation I can find is that "symbol" means something else
>> here than in the rest of the assembler. Which wouldn't be very helpful.
>
> I tried to use is_name_beginner() along get_symbol_name().
> The parsing falls apart, and I get a lot of failures in the tests.
Which may point to a more general problem with the parsing (here and/or what's
available as general machinery in gas). Without you giving details, I of course
can't even guess what the problem may be.
> I don't have the courage to debug what is going on anymore. FYI I am morally exhausted with this patch particularly.
> If we want to be able to ship this feature for the next release of binutils, maybe we should stop changing everything.
Hmm, I'm really of two minds here. On one hand I can understand your frustration
as well as your goal to "just get this in". Otoh other kinds of open-coded
parsing have been causing issues elsewhere in the assembler. I'm therefore
rather hesitant to accept anything else going such a route.
Jan
- Previous message (by thread): [PATCH v10 06/28] gas: implement parsing of object attributes v2
- Next message (by thread): [PATCH v10 06/28] gas: implement parsing of object attributes v2
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Binutils mailing list