[PATCH] x86: Expand Broadcast to 3 bits

H.J. Lu hjl.tools@gmail.com
Thu Jul 26 15:57:00 GMT 2018
On Thu, Jul 26, 2018 at 8:47 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 26.07.18 at 17:02, <hjl.tools@gmail.com> wrote:
>> On Thu, Jul 26, 2018 at 7:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 26.07.18 at 00:05, <hongjiu.lu@intel.com> wrote:
>>>> @@ -5008,6 +5010,22 @@ optimize_disp (void)
>>>>        }
>>>>  }
>>>>
>>>> +/* Return 1 if there is a match in broadcast bytes between operand
>>>> +   GIVEN and instruction template T.   */
>>>> +
>>>> +static INLINE int
>>>> +match_broadcast_size (const insn_template *t, unsigned int given)
>>>> +{
>>>> +  return ((t->opcode_modifier.broadcast == BYTE_BROADCAST
>>>> +        && i.types[given].bitfield.byte)
>>>> +       || (t->opcode_modifier.broadcast == WORD_BROADCAST
>>>> +           && i.types[given].bitfield.word)
>>>> +       || (t->opcode_modifier.broadcast == DWORD_BROADCAST
>>>> +           && i.types[given].bitfield.dword)
>>>> +       || (t->opcode_modifier.broadcast == QWORD_BROADCAST
>>>> +           && i.types[given].bitfield.qword));
>>>> +}
>>>> +
>>>>  /* Check if operands are valid for the instruction.  */
>>>>
>>>>  static int
>>>> @@ -5126,23 +5144,29 @@ check_VecOperands (const insn_template *t)
>>>>        i386_operand_type type, overlap;
>>>>
>>>>        /* Check if specified broadcast is supported in this instruction,
>>>> -      and it's applied to memory operand of DWORD or QWORD type.  */
>>>> +      and its broadcast bytes match the memory operand.  */
>>>>        op = i.broadcast->operand;
>>>>        if (!t->opcode_modifier.broadcast
>>>>         || !i.types[op].bitfield.mem
>>>>         || (!i.types[op].bitfield.unspecified
>>>> -           && (t->operand_types[op].bitfield.dword
>>>> -               ? !i.types[op].bitfield.dword
>>>> -               : !i.types[op].bitfield.qword)))
>>>> +           && !match_broadcast_size (t, op)))
>>>>       {
>>>>       bad_broadcast:
>>>>         i.error = unsupported_broadcast;
>>>>         return 1;
>>>>       }
>>>>
>>>> +      i.broadcast->bytes = ((1 << (t->opcode_modifier.broadcast - 1))
>>>> +                         * i.broadcast->type);
>>>
>>> So if you moved this up ahead of the earlier if(), and if you used
>>> i.broadcast->bytes in place of t->opcode_modifier.broadcast in
>>> match_broadcast_size(), I think you could get away without the
>>> extension to 3 bits in the templates.
>>
>> i.broadcast->bytes is set from t->opcode_modifier.broadcast.
>> I'd like to avoid checking byte, word, dword, qword to compute
>> i.broadcast->bytes.
>
> And this is because of what? This is exactly the kind of redundancy
> I'm talking about. Or are there going to be cases where the
> broadcast element size is not the smallest among multiple possible
> ones for a single template (but then your logic in i386-gen would
> be wrong too)?

By definition, the broadcast element size is the smallest non-vector
size.
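
For illustration only, here is a minimal standalone sketch of the arithmetic
under discussion; it is not the gas source. It assumes the template encodings
are numbered 1..4 for byte..qword (inferred from the
1 << (t->opcode_modifier.broadcast - 1) expression in the patch). The SIZE_*
flags and both helper functions are hypothetical names that stand in for the
operand-type bitfields and for the template-generation logic.

```c
#include <assert.h>

/* Hypothetical stand-ins for the template broadcast encodings.  The
   numbering is an assumption inferred from the shift in the patch.  */
enum broadcast_kind
{
  NO_BROADCAST = 0,
  BYTE_BROADCAST,	/* 1 -> 1-byte element.  */
  WORD_BROADCAST,	/* 2 -> 2-byte element.  */
  DWORD_BROADCAST,	/* 3 -> 4-byte element.  */
  QWORD_BROADCAST	/* 4 -> 8-byte element.  */
};

/* Hypothetical flags for the non-vector sizes an operand may take.  */
#define SIZE_BYTE  (1u << 0)
#define SIZE_WORD  (1u << 1)
#define SIZE_DWORD (1u << 2)
#define SIZE_QWORD (1u << 3)

/* Total bytes covered by a broadcast: element size times the N in
   {1toN}.  Mirrors the i.broadcast->bytes computation in the patch.  */
static unsigned int
broadcast_bytes (enum broadcast_kind kind, unsigned int n)
{
  assert (kind <= QWORD_BROADCAST);
  if (kind == NO_BROADCAST)
    return 0;
  return (1u << (kind - 1)) * n;
}

/* Pick the broadcast kind as the smallest non-vector size permitted
   by SIZE_MASK, following the "smallest non-vector size" rule.  */
static enum broadcast_kind
smallest_broadcast (unsigned int size_mask)
{
  if (size_mask & SIZE_BYTE)
    return BYTE_BROADCAST;
  if (size_mask & SIZE_WORD)
    return WORD_BROADCAST;
  if (size_mask & SIZE_DWORD)
    return DWORD_BROADCAST;
  if (size_mask & SIZE_QWORD)
    return QWORD_BROADCAST;
  return NO_BROADCAST;
}
```

Under these assumptions, a dword element with {1to16} and a qword element
with {1to8} both cover 64 bytes, and an operand that allows both dword and
qword resolves to the dword encoding.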

-- 
H.J.


