[PATCH v8 12/19] Merge of Object Attributes v2 during linkage (generic logic)

Matthieu Longo matthieu.longo@arm.com
Fri Aug 29 16:37:56 GMT 2025
On 2025-08-22 10:50, Jan Beulich wrote:
> On 15.07.2025 13:39, Matthieu Longo wrote:
>> This patch adds all the generic logic to the linker to process OAv2.
>> The linker is an "advanced" consumer of OAv2. After parsing, it
>> deduplicates them, merge them, detect any compatibility issues,
>> and finally translate them to GNU properties.
>>
>> ** Overall design
>>
>> The OAv2 processing pipeline follows a map-reduce pattern. Obviously,
>> the actual processing in GNU ld is not multi-threaded, and the
>> operations are not necessarily executed directly one after another.
>>
>> * Phase 1, map: successive per-file operations applied on the list of
>>    compatible input objects.
>>    1. Parsing of the OAv2 section's data (also used by objcopy).
>>    2. Translation of relevant GNU properties to OAv2. This is required
>>       for the backward-compatibility with input objects only marked
>>       using GNU properties.
>>    3. Sorting of the subsections and object attributes. Further
>>       operations rely on the ordering to perform some optimization in
>>       the processing of the data.
>>    4. Deduplication of subsections and object attributes, and detection
>>       of any conflict between duplicated subsections or tags.
>>    5. Translation of relevant OAv2 to GNU properties for a
>>       forward-compatibility with the GNU properties merge.
>>
>> * Phase 2, reduce: OAv2 in input objects are merged together.
>>    1. Gathering of "frozen" values (=coming from the command-line
>>       arguments) into a virtual read-only list of subsections and
>>       attributes.
>>    2. Merging of OAv2 from an input file and the frozen input.
>>    3. Merging of the results of step 2 together. Since the OAv2 merge
>>       is commutative and associative, it can be implemented as a reduce.
>>       However, GNU ld implements it as an accumulate because it does
>>       not support multithreading.
>>    Notes: the two merge phases also perform a marking of unsupported/
>>    invalid subsections and attributes. This marking can be used for
>>    debugging, and also more practically to drop unsupported optional
>>    subsections from the output.
>>
>> * Phase 3, finalization of the output.
>>    1. Pruning of the unsupported/invalid subsections and attributes.
> 
> May a linker legitimately remove anything? (Instead I would have hoped
> for some forward compatibility.)
> 

Addressed in a separate reply.

>> @@ -411,6 +513,98 @@ bfd_elf_set_obj_attr_contents (bfd *abfd, bfd_byte *buffer, bfd_vma size)
>>       abort ();
>>   }
>>   
>> +/* Structure storing the result of a search in the list of input BFDs.
>> +   - the pointer to the BFD.
>> +   - the pointer to the section containing the object attributes.  */
>> +typedef struct
>> +{
>> +  bfd *pbfd;
>> +  bool has_build_attributes;
>> +  asection *sec;
>> +} bfd_search_result_t;
>> +
>> +/* Checks whether a BFD contains object attributes, and if so search for the
>> +   relevant section storing them.  */
>> +static bool
>> +bfd_has_build_attributes (bfd *abfd, bfd_search_result_t *res)
>> +{
>> +  if (elf_obj_attr_subsections (abfd).size == 0)
>> +    return false;
>> +  res->has_build_attributes = true;
>> +
>> +  const char *sec_name = get_elf_backend_data (abfd)->obj_attrs_section;
>> +  if ((res->sec = bfd_get_section_by_name (abfd, sec_name)) == NULL)
>> +    return false;
>> +  return true;
>> +}
>> +
>> +/* Returns True if the given BFD is an ELF object with the current backend
> 
> What's "current"?
> 

The "target" backend.
Fixed in the next revision.

>> +   machine code, non-dynamic (i.e. not a shared library), and has sections.
> 
> But an executable is okay? I.e. aren't after relocatable objects here?
> 

You're right, it is missing.
I took this condition from _bfd_elf_link_setup_gnu_properties, and 
EXEC_P is missing.

However, this raises one question in my head.
Why would the GNU property code or object attribute one would have to 
deal with executable ELF as an input file ? Shouldn't ld filter such an 
executable before reaching this stage ?
Does it even make sense to add an executable as an input object for the 
linker ? Shouldn't the linker raise an error earlier ?

>> +   False otherwise.
>> +   Note: this function is a convenient encapsulation of the predicate used to
>> +   search for objects containing object attributes in the list of BFDs.  */
>> +static bool
>> +bfd_is_non_dynamic_elf_object (struct bfd_link_info *info,
>> +			       bfd *abfd)
>> +{
>> +  const struct elf_backend_data *output_bfd
>> +    = get_elf_backend_data (info->output_bfd);
> 
> "output_bed" may be a better name for this variable. It's hard to see though
> why you have ...
> 

Fixed in the next revision.

>> +  unsigned int elfclass = output_bfd->s->elfclass;
>> +  int elf_machine_code = output_bfd->elf_machine_code;
> 
> ... three variables for one side ...
> 

The line is too long otherwise, that's why I splitted it.

>> +  return (bfd_get_flavour (abfd) == bfd_target_elf_flavour
>> +	  && abfd->section_count != 0
>> +	  && (abfd->flags & DYNAMIC) == 0
>> +	  && elf_machine_code == get_elf_backend_data (abfd)->elf_machine_code
>> +	  && elfclass == get_elf_backend_data (abfd)->s->elfclass);
> 
> ... of the comparisons and none for the other. I would suggest to have "bed"
> and "output_bed", and no further ones.
> 

Fixed in the next revision.

> As this function isn't generic, I'd recommend dropping (or replacing) its
> bfd_ prefix. (This likely applies elsewhere as well.)
> 

The issue, if I manually inlining this code inside 
bfd_linear_search_one_with_build_attributes, is that the filtering 
condition is really big and makes the purpose of this loop obscure.

I renamed it to elf_may_contain_obj_attrs ().

Then this loop is relatively readable :)

for (bfd *abfd = info->input_bfds; abfd != NULL; abfd = abfd->link.next)
     if (elf_may_contain_obj_attrs (info, abfd))
       {
	res.pbfd = abfd;
	if (bfd_has_build_attributes (abfd, &res))
	  break;
       }

>> +/* Search for the first input object file containing object attributes.  */
>> +static bfd_search_result_t
>> +bfd_linear_search_one_with_build_attributes (struct bfd_link_info *info)
> 
> s/search_one/find_first/?
> 

Fixed in the next revision.

>> +/* Create a build attributes section for the given bfd input.  */
>> +static asection *
>> +create_build_attributes_section (struct bfd_link_info *info,
>> +				 bfd *ebfd)
>> +{
>> +  asection *sec;
>> +  const char *sec_name = get_elf_backend_data (ebfd)->obj_attrs_section;
>> +  sec = bfd_make_section_with_flags (ebfd,
>> +				     sec_name,
>> +				     (SEC_READONLY
>> +				      | SEC_HAS_CONTENTS
>> +				      | SEC_DATA));
>> +  if (sec == NULL)
>> +    info->callbacks->einfo (
>> +      _("%F%P: failed to create %s section\n"), sec_name);
>> +
>> +  unsigned align = (bfd_get_mach (ebfd) & bfd_mach_aarch64_ilp32) ? 2 : 3;
> 
> Wait - we're in generic code here? How can there be an Arm64 specific check?
> Did you perhaps mean to use struct bfd_arch_info's bits_per_address?
> 

This is a copy-paste from the AArch64 backend in 
_bfd_aarch64_elf_create_gnu_property_section.

We should also use the ELFCLASS instead of this bfd_mach_aarch64_ilp32.

Replaced by:

   unsigned align
     = (get_elf_backend_data (info->output_bfd)->s->elfclass == ELFCLASS64
        ? 3
        : 2);

in the next revision.

>> +  if (!bfd_set_section_alignment (sec, align))
>> +    info->callbacks->einfo (_("%F%pA: failed to align section\n"), sec);
> 
> What is %F? I understand there are very few other uses, but I can't help the
> impression that they're all stale (at best). If I try to use it, all I get is
> "%F" in the output. Ah, here we go - commit 8d97c1a53f3d. You mean to use
> ->fatal() instead (also a few lines up, and perhaps elsewhere).
> 

Oh I hadn't noticed this recent change.
The tests don't exert this path in the code, so I hadn't detected it.
Fixed in the next revision.

>> @@ -454,8 +648,8 @@ static known_subsection_v2 obj_attr_v2_known_gnu_subsections[] =
>>     /* Note for the future: GNU subsections can be added here below.  */
>>   };
>>   
>> -/* Return True if the given subsection name is part of the reserved "gnu-testing"
>> -   namespace.  */
>> +/* Return True if the given subsection name is part of the reserved testing
>> +   namespace, i.e. SUBSEC_NAME begins with "gnu-testing".  */
>>   static bool
>>   gnu_testing_namespace (const char *subsec_name)
>>   {
> 
> Was this adjustment meant to be done in patch 04?
> 

Yes, I missed this one.
Fixed in the next revision.

>> @@ -555,6 +749,1285 @@ oav2_encoding_to_string (obj_attr_encoding_v2 encoding)
>>     return (encoding == OA_ENC_ULEB128) ? "ULEB128" : "NTBS";
>>   }
>>   
>> +/* Initialize the given ATTR with its default value coming from the known tag
>> +   registry.  */
>> +static void
>> +oav2_attr_overwrite_with_default (struct bfd_link_info *info,
>> +				  obj_attr_subsection_v2 *subsec,
>> +				  obj_attr_v2 *attr)
>> +{
>> +  const struct elf_backend_data *be = get_elf_backend_data (info->output_bfd);
> 
> The commonly used name for this is "bed", I think.
> 

Fixed in the next revision.

>> +  const obj_attr_info_t *tag_info
>> +    = known_obj_attr_v2_find_by_tag (be, subsec->name, attr->tag);
>> +  if (tag_info == NULL)
>> +    {
>> +      attr->status = obj_attr_v2_unknown;
>> +      if (subsec->encoding == OA_ENC_ULEB128)
>> +	attr->vals.uint_val = 0;
>> +      else
>> +	attr->vals.string_val = NULL;
>> +      return;
>> +    }
>> +
>> +  if (be->obj_attr_v2_default_value != NULL
>> +      && be->obj_attr_v2_default_value (info, tag_info, subsec, attr))
>> +    {}
>> +  else if (subsec->encoding == OA_ENC_NTBS)
>> +    {
>> +      if (tag_info->default_value.val.string != NULL)
>> +	{
>> +	  if (attr->vals.string_val != NULL)
>> +	    free ((void *) attr->vals.string_val);
>> +	  attr->vals.string_val = strdup (tag_info->default_value.val.string);
> 
> xstrdup() ?
> 

Fixed in the next revision.

>> +	}
>> +      else
>> +	attr->vals.string_val = NULL;
> 
> Leaking the earlier string in this case?
> 

It was not leaking given how I was calling the function, but you are 
right, if this function was called with an already initialized non-null 
string attribute, it would leak.

Changed to the below in the next revision.

   else if (subsec->encoding == OA_ENC_NTBS)
     {
       if (attr->val.string_val != NULL)
	{
	  free ((void *) attr->val.string_val);
	  attr->val.string_val = NULL;
	}
       if (attr_info->default_value.string_val != NULL)
	attr->val.string_val = xstrdup (attr_info->default_value.string_val);
     }

>> +/* Merge policy Integer-AND: apply bitwise AND between REF and RHS.  */
>> +obj_attr_v2_merge_result
>> +obj_attr_v2_tag_merge_AND (struct bfd_link_info *info ATTRIBUTE_UNUSED,
>> +			   bfd *abfd ATTRIBUTE_UNUSED,
>> +			   obj_attr_subsection_v2 *subsec,
>> +			   obj_attr_v2 *ref, obj_attr_v2 *rhs,
>> +			   obj_attr_v2 *frozen ATTRIBUTE_UNUSED)
>> +{
>> +  BFD_ASSERT (subsec->encoding == OA_ENC_ULEB128);
>> +
>> +  obj_attr_v2_merge_result res = {
>> +    .merge = true,
>> +    .vals.uint_val = 0,
>> +    .reason = MERGE_OK,
>> +  };
>> +
>> +  uint32_t original_value = ref->vals.uint_val;
>> +  res.vals.uint_val = (ref->vals.uint_val & rhs->vals.uint_val);
>> +  res.merge = (res.vals.uint_val != original_value);
>> +  if (res.vals.uint_val == original_value)
> 
> Doing twice in a row the effectively same comparison?
> 

Replaced by the if condition by "!res.merge" in the next revision.

>> +/* Merge policy String-ADD: concatenates strings from REF and RHS together
>> +   adding a '+' character in-between.  */
>> +static obj_attr_v2_merge_result
>> +obj_attr_v2_tag_merge_ADD (struct bfd_link_info *info ATTRIBUTE_UNUSED,
>> +			   bfd *abfd ATTRIBUTE_UNUSED,
>> +			   obj_attr_subsection_v2 *subsec,
>> +			   obj_attr_v2 *ref, obj_attr_v2 *rhs,
>> +			   obj_attr_v2 *frozen)
>> +{
>> +  BFD_ASSERT (subsec->encoding == OA_ENC_NTBS);
>> +
>> +  size_t frozen_s_size = 0;
>> +  if (frozen && frozen->vals.string_val)
>> +    frozen_s_size = strlen (frozen->vals.string_val);
>> +
>> +  obj_attr_v2_merge_result res = {
>> +    .merge = false,
>> +    .vals.uint_val = 0,
>> +    .reason = MERGE_OK,
>> +  };
>> +
>> +  if (ref->vals.string_val && rhs->vals.string_val)
>> +    {
>> +      res.merge = true;
>> +      size_t ref_s_size = strlen (ref->vals.string_val);
>> +      size_t rhs_s_size = strlen (rhs->vals.string_val);
>> +      char *buffer = malloc (ref_s_size + 1 + rhs_s_size + 1);
> 
> xmalloc() (Hopefully you've already gone through and changed all the
> allocations throughout the series.)
> 

I could not find more, so hopefully everything is addressed.

$ git diff master..HEAD | grep malloc
+      char *buffer = xmalloc (ref_s_size + 1 + rhs_s_size + 1);
+         char *buffer = xmalloc (frozen_s_size + 1 + rhs_s_size + 1);

$ git diff master..HEAD | grep strdup
+       attr->val.string_val = xstrdup 
(attr_info->default_value.string_val);
+    (xstrdup (subsec->name), subsec->scope, subsec->optional, 
subsec->encoding);
@@ -344,7 +2145,7 @@ _bfd_elf_attr_strdup (bfd *abfd, const char *s)
+  *s = xstrdup ((const char *) cursor);
+        ? xstrdup (other->val.string_val)
+    = _bfd_elf_obj_attr_subsection_v2_init (xstrdup (other->name), 
other->scope,
  extern char *_bfd_elf_attr_strdup (bfd *, const char *);
+    = _bfd_elf_obj_attr_subsection_v2_init (xstrdup 
("aeabi_feature_and_bits"),
+           (xstrdup ("aeabi_feature_and_bits"), OA_SUBSEC_PUBLIC, true,
+      arg_out->val.string = xstrdup (obstack_buf);

Fixed in the next revision.

>> +      res.vals.string_val = buffer;
>> +      memcpy (buffer, ref->vals.string_val, ref_s_size);
>> +      buffer += ref_s_size;
>> +      *buffer = '+';
>> +      ++buffer;
>> +      memcpy (buffer, rhs->vals.string_val, rhs_s_size + 1);
>> +    }
>> +  else if (ref->vals.string_val)
>> +    {
>> +      /* Nothing to do, frozen (if not NULL) should already be merged with
>> +	 it.  */
>> +      res.reason = SAME_VALUE_AS_REF;
>> +    }
>> +  else if (rhs->vals.string_val)
>> +    {
>> +      res.merge = true;
>> +      if (frozen_s_size == 0)
> 
> This is the first use of the variable - why is it declared and set at the
> top of the function?
> 

No good reason to do it at the top, that's a good point.
I moved the declaration and initialization down right before using it.
Fixed in the next revision.

>> +/* Return the merge result between attributes LHS, RHS and FROZEN.  */
>> +static obj_attr_v2_merge_result
>> +oav2_attr_merge (struct bfd_link_info *info,
>> +		 bfd *abfd,
>> +		 obj_attr_subsection_v2 *subsec,
>> +		 obj_attr_v2 *lhs, obj_attr_v2 *rhs,
>> +		 obj_attr_v2 *frozen, bool frozen_as_abfd)
>> +{
>> +  obj_attr_v2_merge_result res = {
>> +    .merge = false,
>> +    .vals.uint_val = 0,
>> +    .reason = MERGE_OK,
>> +  };
>> +
>> +  gnu_testing_merge_policy policy;
>> +
>> +  if (get_elf_backend_data (abfd)->obj_attr_v2_tag_merge != NULL)
>> +    {
>> +      if (frozen_as_abfd)
>> +	{
>> +	  obj_attr_v2 *tmp = lhs;
>> +	  lhs = rhs;
>> +	  rhs = tmp;
>> +	}
>> +      res = get_elf_backend_data (abfd)->obj_attr_v2_tag_merge (info, abfd,
>> +	subsec, lhs, rhs, frozen);
>> +    }
>> +
>> +  /* Note for the future: the merge of generic object attributes should be
>> +     added here, between the architecture-specific merge, and the reserved GNU
>> +     testing namespace.  */
>> +
>> +  /* GNU testing merge policies are looked up last.  If MERGE_OK is detected,
>> +     the subsection is considered unmergeable.  */
>> +  if (! res.merge && (res.reason == UNSUPPORTED || res.reason == MERGE_OK))
> 
> I can't bring comment and code together.
> 

I cannot understand it too, it looks like a remaining from an earlier 
revision. Sorry for the confusion :/

I restructure the code by adding earlier return, and removed the 
comment. I find the logic clearer this way. See below.

/* Return the merge result between attributes LHS, RHS and FROZEN.  */
static obj_attr_v2_merge_result
oav2_attr_merge (struct bfd_link_info *info,
		 bfd *abfd,
		 obj_attr_subsection_v2 *subsec,
		 obj_attr_v2 *lhs, obj_attr_v2 *rhs,
		 obj_attr_v2 *frozen, bool frozen_as_abfd)
{
   obj_attr_v2_merge_result res = {
     .merge = false,
     .val.uint_val = 0,
     .reason = MERGE_OK,
   };

   gnu_testing_merge_policy policy;

   if (get_elf_backend_data (abfd)->obj_attr_v2_tag_merge != NULL)
     {
       if (frozen_as_abfd)
	{
	  obj_attr_v2 *tmp = lhs;
	  lhs = rhs;
	  rhs = tmp;
	}
       res = get_elf_backend_data (abfd)->obj_attr_v2_tag_merge (info, abfd,
	subsec, lhs, rhs, frozen);
       if (res.merge || res.reason == SAME_VALUE_AS_REF)
	return res;
     }

   /* Note for the future: the merge of generic object attributes should be
      added here, between the architecture-specific merge, and the 
reserved GNU
      testing namespace.  */

   /* GNU testing merge policies are looked up last.  */
   if ((res.reason == UNSUPPORTED || res.reason == MERGE_OK)
       && (policy = gnu_testing_merge_subsection (subsec->name))
	  != SUBSECTION_TESTING_MERGE_UNSUPPORTED)
     {
       /* Only the first two attributes can be merged, others won't and will
	 be discarded.  */
       if (lhs->tag <= 1)
	{
	  if (policy == SUBSECTION_TESTING_MERGE_AND_POLICY)
	    res = obj_attr_v2_merge_policy_AND
	      (info, abfd, subsec, lhs, rhs, frozen);
	  else if (policy == SUBSECTION_TESTING_MERGE_OR_POLICY)
	    res = obj_attr_v2_merge_policy_OR
	      (info, abfd, subsec, lhs, rhs, frozen);
	  else if (policy == SUBSECTION_TESTING_MERGE_ADD_POLICY)
	    res = obj_attr_v2_merge_policy_ADD
	      (info, abfd, subsec, lhs, rhs, frozen);

	  return res;
	}
     }

   /* Nothing matched, thus the merge of this tag is unsupported.  */
   res.reason = UNSUPPORTED;
   return res;
}

>> +    {
>> +      if ((policy = gnu_testing_merge_subsection (subsec->name))
>> +	   != SUBSECTION_TESTING_MERGE_UNSUPPORTED)
>> +	{
>> +	  /* Only the first two attributes can be merged, others won't and will
>> +	     be discarded.  */
>> +	  if (lhs->tag <= 1)
> 
> By inverting this and handling ...
> 
>> +	    {
>> +	      if (policy == SUBSECTION_TESTING_MERGE_AND_POLICY)
>> +		res = obj_attr_v2_tag_merge_AND (info, abfd, subsec, lhs, rhs,
>> +		  frozen);
>> +	      else if (policy == SUBSECTION_TESTING_MERGE_OR_POLICY)
>> +		res = obj_attr_v2_tag_merge_OR (info, abfd, subsec, lhs, rhs,
>> +		  frozen);
>> +	      else if (policy == SUBSECTION_TESTING_MERGE_ADD_POLICY)
>> +		res = obj_attr_v2_tag_merge_ADD (info, abfd, subsec, lhs, rhs,
>> +		  frozen);
>> +	    }
>> +	  else
>> +	    res.reason = UNSUPPORTED;
> 
> ... this first, you can save a level of indentation, improving readability.
> 

Thanks for the suggestion. I did so. See my answer to the previous comment.

>> +	}
>> +    }
>> +
>> +  return res;
>> +}
>> +
>> +/* Append a new default-initialized attribute with the same key as AREF to the
>> +   given subsection.  */
>> +static void
>> +oav2_subsection_append_attr_default (struct bfd_link_info *info,
>> +				     obj_attr_subsection_v2 *s_abfd_missing,
>> +				     obj_attr_v2 *a_ref)
>> +{
>> +  obj_attr_v2 *new_attr = oav2_attr_default (info, s_abfd_missing, a_ref);
>> +  LINKED_LIST_APPEND(obj_attr_v2) (s_abfd_missing, new_attr);
>> +}
>> +
>> +/* Return a new default-initialized subsection with the same parameters as
>> +   SUBSEC.  */
>> +static obj_attr_subsection_v2 *
>> +oav2_subsection_default_new (struct bfd_link_info *info,
>> +			     obj_attr_subsection_v2 *subsec)
>> +{
>> +  obj_attr_subsection_v2 *new_subsec
>> +    = _bfd_elf_obj_attr_subsection_v2_init (subsec->name, subsec->scope,
>> +      subsec->optional, subsec->encoding);
> 
> Nit: For the pending open parenthesis this line wants indenting yet another
> level.
> 

Fixed in the next revision.

>> +/* Report required attribute A_ABFD mismatching with A_REF.  */
>> +static void
>> +report_mismatching_required_obj_attr (struct bfd_link_info *info,
>> +				      bfd *ref_bfd,
>> +				      bfd *abfd,
>> +				      obj_attr_subsection_v2 *s_ref,
>> +				      obj_attr_v2 *a_ref,
>> +				      obj_attr_v2 *a_abfd)
>> +{
>> +  const struct elf_backend_data *be = get_elf_backend_data (abfd);
>> +  const char* tag_s = obj_attr_v2_tag_to_string (be, s_ref->name, a_ref->tag);
>> +  if (s_ref->encoding == OA_ENC_ULEB128)
>> +    {
>> +      if (tag_s)
>> +	info->callbacks->einfo (
>> +	  _("%X%pB, %pB: error: mismatching values 0x%x and 0x%x for "
> 
> In case I didn't say so elsewhere already: Please prefer %#x over 0x%x.
> 

If I change it to %#x, it crashes due to a SEGFAULT with the stack below:

(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76
#1  0x00007ffff7c8626a in __GI__IO_fputs (str=0x0, fp=0x7ffff7e044e0 
<_IO_2_1_stderr_>) at ./libio/iofputs.c:33
#2  0x00005555555baf34 in vfinfo (fp=0x7ffff7e044e0 <_IO_2_1_stderr_>, 
fmt=0x55555596d243 "' in subsection '%s'\n", ap=0x7fffffffd630, 
is_warning=true) at ../../ld/ldmisc.c:527
#3  0x00005555555bb36e in einfo (fmt=0x55555596d1f0 "%X%pB, %pB: error: 
mismatching values %#x and %#x for required object attribute '%s' in 
subsection '%s'\n") at ../../ld/ldmisc.c:608
#4  0x000055555566dfc8 in report_mismatching_required_obj_attr 
(info=0x5555559d7ce0 <link_info>, ref_bfd=0x555555a14dc0, 
abfd=0x555555a17160, s_ref=0x555555a19590, a_ref=0x555555a19610, 
a_abfd=0x555555a1fdf0)
     at ../../bfd/elf-attrs.c:1077
#5  0x000055555566e20c in oav2_subsection_perfect_match 
(info=0x5555559d7ce0 <link_info>, ref_bfd=0x555555a14dc0, 
abfd=0x555555a17160, s_ref=0x555555a19590, s_abfd=0x555555a1fd70) at 
../../bfd/elf-attrs.c:1139

It does not seem to me that the support for "#" was implemented in vfinfo().
>> +/* Merge case 3: S_ABFD does not have a S_REF equivalent.
>> +   1. Create a new default-initialized S_REF subsection.
>> +   2. Merge S_ABFD into S_REF.
>> +   3. Insert S_REF into REF.  */
>> +static bool
>> +handle_subsection_additional (struct bfd_link_info *info,
>> +			      bfd *ref_bfd, bfd *abfd,
>> +			      obj_attr_subsection_v2 *s_ref_next,
>> +			      obj_attr_subsection_v2 *s_abfd,
>> +			      obj_attr_subsection_v2 *s_frozen)
>> +{
>> +  if (! s_abfd->optional)
>> +    {
>> +      info->callbacks->einfo
>> +	(_("%X%pB: error: missing required object attributes subsection %s\n"),
>> +	 ref_bfd, s_abfd->name);
>> +      return false;
>> +    }
>> +
>> +  /* Compute default values of the missing attributes in REF, but present in
>> +     ABFD, and merge REF's generated subsection with the one of ABFD.  */
>> +  obj_attr_subsection_v2 *s_ref = oav2_subsection_default_new (info, s_abfd);
>> +  obj_attr_subsection_v2 *s_merged
>> +    = handle_subsection_merge (info, ref_bfd, abfd, s_ref, s_abfd, s_frozen);
>> +  BFD_ASSERT (s_merged == s_ref); // FIXME: I am not sure whether that it is true or false. If true, eliminate next free.
> 
> Such wants sorting before this can go in.
> 

I removed the assertion, s_merged is always equal to s_ref, unless NULL 
is returned in case of an issue happen inside handle_subsection_merge.

Changed the code to the following:

   /* Compute default values of the missing attributes in REF, but 
present in
      ABFD, and merge REF's generated subsection with the one of ABFD.  */
   obj_attr_subsection_v2 *s_ref = oav2_subsection_default_new (info, 
s_abfd);
   obj_attr_subsection_v2 *s_merged
     = handle_subsection_merge (info, ref_bfd, abfd, s_ref, s_abfd, 
s_frozen);
   if (s_merged != NULL)
     {
       LINKED_LIST_INSERT_BEFORE(obj_attr_subsection_v2) (
	&elf_obj_attr_subsections (ref_bfd), s_merged, s_ref_next);
     }
   else
     /* An issue occurred during the merge. s_ref won't be saved so it 
needs to
        be freed.  */
     _bfd_elf_obj_attr_subsection_v2_free (s_ref);
   return (s_merged != NULL);

>> +/* Check for mismatch between the parameters of subsections S1 and S2.
>> +   Note: F1 can be null when comparing FROZEN and the first object file used to
>> +   store the merge result.  If an error is reported, it means that one of the
>> +   definition of S1 or S2 is corrupted.  Most likely S2 because it is a user
>> +   input, or S1 if it is a programmation error of FROZEN.  In the second case,
>> +   please raise a bug to binutils bug tracker.  */
>> +static bool
>> +oav2_subsection_mismatching_params (struct bfd_link_info *info,
>> +				    bfd *f1, bfd *f2,
>> +				    obj_attr_subsection_v2 *s1,
>> +				    obj_attr_subsection_v2 *s2)
>> +{
>> +  if (! gnu_testing_namespace (s1->name))
>> +    {
>> +      /* Check whether the subsection is known, and if so, match against the
>> +	 expected properties.
>> +	 Note: this piece of code must be guarded against gnu-testing
>> +	 subsections, as the backend method looks up at the known subsections.
>> +	 Since the "fictive" entry for gnu-testing known subsection has random
>> +	 values for its encoding and optionality, it won't be able to detect
>> +	 mismatching parameters correctly.  */
>> +      bool match_known = true;
>> +      if (get_elf_backend_data (f2)->obj_attr_subsection_v2_match_known != NULL)
>> +	match_known = get_elf_backend_data (f2)
>> +	  ->obj_attr_subsection_v2_match_known (info, f2, s2);
>> +      if (! match_known)
>> +	return true;
>> +    }
>> +
>> +  bool mismatch = (s1->encoding != s2->encoding
>> +		   || s1->optional != s2->optional);
>> +
>> +  if (mismatch)
>> +    {
>> +      if (f1 != NULL)
>> +	info->callbacks->einfo (_("%X%pB, %pB: error: parameters of subsection"
>> +	  " '%s' are mismatching. (%s, %s) VS (%s, %s)\n"), f1, f2, s1->name,
> 
> I'm not a native speaker, but I think it wants to be "mismatched" here, much
> like you have ...
> 

I am not a native speaker as well, but the choices are the following 
from my perspective:
- a present participle as adjective: "mismatching". A present participle 
is less tied to the notion of time so used to discribe something more 
permanent.
- a past participle as adjective: "mismatched". This is true now, but 
might not be true in the future. (or potentially a passive form, but it 
seems weird in this context because of the nature of the mismatching). 
The mismatch is not going to change between this run and a next one with 
the same input.
- a verb to replace "are mistmatching" by the preterit "mismatched" to 
mean that the thing mismatched during a matching process, the action is 
done.

"mismatching" seems the most appropriate to me, but not sure whether it 
is the most idiomatical way of saying it.

>> +	  oav2_comprehension_to_string (s1->optional),
>> +	  oav2_encoding_to_string (s1->encoding),
>> +	  oav2_comprehension_to_string (s2->optional),
>> +	  oav2_encoding_to_string (s2->encoding));
>> +      else
>> +	info->callbacks->einfo (_("%X%pB: error: parameters of subsection"
>> +	  " '%s' are corrupted. (%s, %s) VS (%s, %s)\n"), f2, s1->name,
> 
> ... "corrupted" here.
> 

 From my perspective, this is the passive form: to be + verb past 
participle + [by: something or someone].

Again, I will let a native speaker enlighten us on the most idiomatic way.

>> +/* Merge object attributes from FROZEN into the object file REF_BFD.
>> +   Note: this function is called only once before starting the merge process
>> +   between the object files.  REF_BFD is used to store the result of the merge,
>> +   but REF_BFD is also an input file, so any mismatch against FROZEN should be
>> +   raised before the values of REF_BFD be modified.  */
>> +static bool
>> +oav2_subsections_merge_frozen (struct bfd_link_info *info,
>> +			       bfd *abfd,
>> +			       obj_attr_subsection_v2 *s_frozen)
>> +{
>> +  if (s_frozen == NULL)
>> +    return true;
>> +
>> +  bool success = true;
>> +
>> +  obj_attr_subsection_v2 *s_abfd = elf_obj_attr_subsections (abfd).first;
>> +  while (s_frozen != NULL && s_abfd != NULL)
>> +    {
>> +      int cmp = strcmp (s_abfd->name, s_frozen->name);
>> +      if (cmp < 0) /* ABFD has a subsection that FROZEN doesn't have.  */
>> +	{
>> +	  /* No need to try to merge anything here.  */
>> +	  s_abfd = s_abfd->next;
>> +	}
>> +      else if (cmp > 0) /* FROZEN has a subsection that ABFD doesn't have.  */
>> +	{
>> +	  success &= handle_subsection_additional (info, abfd, abfd,
>> +	    s_abfd, s_frozen, s_frozen);
>> +	  s_frozen = s_frozen->next;
>> +	}
>> +      else /* Both ABFD and frozen have the subsection.  */
>> +	{
>> +	  bool mismatch = oav2_subsection_mismatching_params (info, NULL, abfd,
>> +	    s_frozen, s_abfd);
>> +	  success &= ! mismatch;
>> +	  if (mismatch)
>> +	    /* FROZEN cannot be corrupted as it is generated from the command
>> +	       line arguments.  If it is corrupted, it is a bug.  */
>> +	    s_abfd->status = obj_attr_subsection_v2_corrupted;
>> +	  else
>> +	    success &= (handle_subsection_merge (info, abfd, abfd,
>> +	      s_abfd, s_frozen, s_frozen) != NULL);
>> +
>> +	  s_abfd = s_abfd->next;
>> +	  s_frozen = s_frozen->next;
>> +	}
>> +    }
>> +
>> +  /* No need to go through the remaining sections of ABFD, only mismatch against
>> +     FROZEN are interesting.  */
> 
> Nit (grammmar): Either "mismatches" or "is".
> 

"mismatches" indeed.
Fixed in the next revision.

>> +  for (; s_frozen != NULL; s_frozen = s_frozen->next)
>> +    success &= handle_subsection_additional (info, abfd, abfd,
>> +      elf_obj_attr_subsections (abfd).last, s_frozen, s_frozen);
>> +
>> +  return success;
>> +}
>> +
>> +/* Merge object attributes from object file ABFD into REF_BFD.  */
>> +static bool
>> +oav2_subsections_merge (struct bfd_link_info *info, bfd *ref_bfd, bfd *abfd)
>> +{
>> +  bool success = true;
>> +  obj_attr_subsection_list *out_frozen_subsecs
>> +    = &elf_obj_attr_subsections (info->output_bfd);
>> +  obj_attr_subsection_list *abfd_subsecs = &elf_obj_attr_subsections (abfd);
>> +  obj_attr_subsection_list *ref_subsecs = &elf_obj_attr_subsections (ref_bfd);
>> +
>> +  obj_attr_subsection_v2 *s_frozen_first = out_frozen_subsecs->first;
>> +  obj_attr_subsection_v2 *s_abfd = abfd_subsecs->first;
>> +  obj_attr_subsection_v2 *s_ref = ref_subsecs->first;
>> +
>> +  /* Translate object attributes from abfd to GNU properties if they have an
>> +     equivalence.  */
>> +  _bfd_elf_translate_relevant_obj_attrs_to_gnu_props (abfd);
> 
> How does this call fit here? Below you're only dealing with attributes,
> afaict.
> 

ABFD is an input file. It might have either GNU properties only, object 
attributes only, or both.

Before the merge of object attributes occur, we need to make sure that 
the GNU properties are translated to their object attributes equivalents 
(if they exists).

In the same way, before the merge of GNU properties, we need to make 
sure that the object attributes are translated to their GNU properties 
equivalents (if they exists).

In this implementation, the merge of object attributes always occurs 
before the merge of GNU properties.

Ideally we should not care about the order, and there should be a 
translation of both GNU properties and object attributes to an internal 
representation with an abstraction level flexible enough to represent 
the data without loss for both. Any incompatibility with previously 
translated value would be raised. Then the merge of the generic internal 
representation would happen, and finally a OAv2 serializer and GNU 
properties serializer would translate the data from the generic model.

This would be nice, but it is more complicated to implement, and out of 
scope for today.

So the code finds itself in a "bastard" state where I find myself 
needing to do this translation in different place.

This specific line that you pointed out translates the object attributes 
from all the input files (except the one used as REF) to GNU properties 
equivalents (if they exist), so that the GNU properties merge will later 
work even if the GNU properties were not set on the object.

Please feel free to ask more details if you need, I know that this is a 
bit confusing.

>> +  while (s_abfd != NULL && s_ref != NULL)
>> +    {
>> +      int cmp = strcmp (s_ref->name, s_abfd->name);
>> +
>> +      if (cmp < 0) /* REF has a subsection that ABFD doesn't have.  */
>> +	{
>> +	  if (s_ref->status != obj_attr_subsection_v2_ok)
>> +	    {
>> +	      s_ref = s_ref->next;
>> +	      continue;
>> +	    }
>> +
>> +	  obj_attr_subsection_v2 *s_frozen
>> +	    = obj_attr_subsection_v2_find_by_name (s_frozen_first, s_ref->name,
>> +						   true);
>> +
>> +	  /* Mismatching between REF and FROZEN already done in
>> +	     oav2_subsections_merge_frozen.  */
>> +	  success &= handle_subsection_missing (info, ref_bfd, abfd, s_ref,
>> +	    s_frozen);
>> +
>> +	  if (s_frozen != NULL)
>> +	    s_frozen_first = s_frozen->next;
>> +	  s_ref = s_ref->next;
>> +	}
>> +      else if (cmp > 0) /* ABFD has a subsection that REF doesn't have.  */
>> +	{
>> +	  if (s_abfd->status != obj_attr_subsection_v2_ok)
>> +	    {
>> +	      s_abfd = s_abfd->next;
>> +	      continue;
>> +	    }
>> +
>> +	  obj_attr_subsection_v2 *s_frozen
>> +	    = obj_attr_subsection_v2_find_by_name (s_frozen_first, s_abfd->name,
>> +						   true);
>> +	  if (s_frozen != NULL)
>> +	    {
>> +	      /* Check any mismatch against ABFD and FROZEN.  */
>> +	      bool mismatch = oav2_subsection_mismatching_params (info, NULL,
>> +		abfd, s_frozen, s_abfd);
>> +	      success &= ! mismatch;
>> +	      if (mismatch)
>> +		s_abfd->status = obj_attr_subsection_v2_corrupted;
>> +	      else
>> +		success &= handle_subsection_additional (info, ref_bfd, abfd,
>> +		  s_ref, s_abfd, s_frozen);
> 
> This and ...
> 
>> +	    }
>> +	  else
>> +	    success &= handle_subsection_additional (info, ref_bfd, abfd, s_ref,
>> +	      s_abfd, NULL);
> 
> ... this look to be easy to fold into just a single call. (Same pattern again
> at least once further down.)
> 

Yes, it makes sense.
Fixed in the next revision.

>> +/* Compact duplicated tag declarations in a same subsection.
>> +   Return True on success, False if any issue is found during the compaction,
>> +   i.e. conflicting values for the same tag.  */
>> +static bool
>> +oav2_compact_tags (bfd *abfd, obj_attr_subsection_v2 *subsec)
>> +{
>> +  bool success = true;
>> +
>> +
>> +  for (obj_attr_v2 *a = subsec->first;
>> +       a != NULL && a->next != NULL;)
>> +    {
>> +      if (a->tag != a->next->tag)
>> +	{
>> +	  a = a->next;
>> +	  continue;
>> +	}
>> +
>> +      if (subsec->encoding == OA_ENC_ULEB128)
>> +	{
>> +	  if (a->vals.uint_val != a->next->vals.uint_val)
>> +	    {
>> +	      success = false;
>> +	      _bfd_error_handler (_("%pB: error: found duplicated attributes "
>> +		"'Tag_unknown_%u' with conflicting values (0x%x vs 0x%x) in "
> 
> What does it follow from that the tag is unknown?
> 

obj_attr_v2_tag_to_string() was too limited, and I changed the 
implementation in the next revision (see below). It allowed me to 
eliminate some code duplication, and the solution is simpler.

/* To-string function for the pair <SUBSEC, TAG>.
    Returns the attribute information associated to TAG if it is found,
    or "Tag_unknown_<N>" otherwise.  */
const char *
obj_attr_v2_tag_to_string (const struct elf_backend_data *bed,
			   const char *subsec_name,
			   obj_attr_tag_t tag)
{
   const obj_attr_info_t *attr_info
     = obj_attr_v2_find_known_by_tag (bed, subsec_name, tag);
   if (attr_info != NULL)
     return attr_info->tag.name;

   /* 23 because "Tag_unknown_" + max(uint32_t), i.e 4294967295 + '\0'.  */
   static char tag_s[23] = {'\0'};
   snprintf (tag_s, sizeof(tag_s), "Tag_unknown_%u", tag);
   return tag_s;
}

>> +		"subsection %s"), abfd, a->tag, a->vals.uint_val,
>> +		a->next->vals.uint_val, subsec->name);
>> +	    }
>> +	  else
>> +	    LINKED_LIST_REMOVE(obj_attr_v2) (subsec, a->next);
>> +	}
>> +      else /* (subsec->encoding == NTBS)  */
>> +	{
>> +	  if (strcmp (a->vals.string_val, a->next->vals.string_val) != 0)
> 
> Code in obj_attr_v2_tag_merge_ADD() suggests that vals.string_val can be NULL,
> in which case it would be invalid to pass to strcmp().
> 

oav2_file_scope_merge_subsections(), which calls oav2_compact_tags(), is 
called before oav2_merge_attrs().
It compacts the content of subsections before the merge process starts, 
because the merge process has for the following requirements on an input 
object file:
- the subsections have to be sorted in alphabetical order.
- the attributes inside a subsection have to be sorted in ascending order.
- the subsections and their attributes have to be unique.

The value of a string attribute can only be NULL if this attribute was 
created by the merge process.
Since this code runs before the merge process, no worry about the string 
being NULL and strcmp() can work assuming that the string will never be 
NULL.

>> +/* Merge two subsections together (object attributes v2 only).
>> +   The result is stored into subsec1.  subsec2 is destroyed.
>> +   Return true if the merge was successful, false otherwise.
>> +   Note: subsec1 and subsec2 are expected to be sorted before the call to this
>> +   function.  */
>> +static bool
>> +oav2_subsection_destructive_merge (bfd *abfd,
>> +				   obj_attr_subsection_v2 *subsec1,
>> +				   obj_attr_subsection_v2 *subsec2)
>> +{
>> +  BFD_ASSERT (subsec1->encoding == subsec2->encoding
>> +	   && subsec1->optional == subsec2->optional);

Fixed in the next revision.

>> +
>> +  bool success = true;
>> +
>> +  success &= oav2_compact_tags (abfd, subsec1);
>> +  success &= oav2_compact_tags (abfd, subsec2);
>> +
>> +  obj_attr_v2 *a1 = subsec1->first;
>> +  obj_attr_v2 *a2 = subsec2->first;
>> +  while (a1 != NULL && a2 != NULL)
>> +    {
>> +      if (a1->tag < a2->tag)
>> +	{} /* Nothing to do, a1 is already in subsec1.  */
>> +      else if (a1->tag > a2->tag)
>> +	{
>> +	  /* a2 is missing in subsec1, add it.  */
>> +	  obj_attr_v2 *previous = LINKED_LIST_REMOVE(obj_attr_v2) (subsec2, a2);
>> +	  LINKED_LIST_INSERT_BEFORE(obj_attr_v2) (subsec1, a2, a1);
>> +	  a2 = previous;
>> +	}
>> +      else
>> +	{
>> +	  if (subsec1->encoding == OA_ENC_ULEB128
>> +	   && a1->vals.uint_val != a2->vals.uint_val)
> 
> Nit: Indentation.
> 

Fixed in the next revision.

>> +	    {
>> +	      success = false;
>> +	      _bfd_error_handler
>> +		(_("%pB: error: found 2 subsections with the same name '%s' "
>> +		   "and found conflicting values (%#x vs %#x) for object "
>> +		   "attribute 'Tag_unknown_%u'"),
>> +		   abfd, subsec1->name, a1->vals.uint_val, a2->vals.uint_val,
>> +		   a1->tag);
>> +	    }
>> +	  else if (subsec1->encoding == OA_ENC_NTBS
>> +	    && strcmp (a1->vals.string_val, a2->vals.string_val) != 0)
> 
> Again.
> 

Fixed in the next revision.

>> @@ -1146,9 +2619,9 @@ oav2_parse_subsection (bfd *abfd,
>>       ? OA_SUBSEC_PUBLIC
>>       : OA_SUBSEC_PRIVATE;
>>   
>> -  obj_attr_subsection_v2 *subsec =
>> -    _bfd_elf_obj_attr_subsection_v2_init (subsection_name, scope, optional_raw,
>> -					  attr_type_raw);
>> +  obj_attr_subsection_v2 *subsec
>> +    = _bfd_elf_obj_attr_subsection_v2_init (subsection_name, scope,
>> +					    optional_raw, attr_type_raw);
>>     while (cursor < end)
>>       {
>>         BufferReadOp_t op_ = oav2_parse_attr (abfd, cursor, end, attr_type_raw);
> 
> Misplaced format adjustment?
> 

The format looks good, the issue is probably triggered by the diff.
I changed the formatting to the below, maybe it will be better.

   *subsec = _bfd_elf_obj_attr_subsection_v2_init
     (subsection_name, scope, comprehension_raw, value_encoding);

> Once again an overly large patch, close to impossible to sensibly review.
> 
> Jan

This is a big patch indeed.
How would you have splitted it in smaller pieces that still make sense ?
I am interested in your suggestions. Such a situation, where I introduce 
a lot of code, might occur again. Who knows ? :P

Matthieu


More information about the Binutils mailing list