New .nops directive, to aid Linux alternatives patching?
Andrew Cooper
andrew.cooper3@citrix.com
Thu Feb 8 23:47:00 GMT 2018
More information about the Binutils mailing list
On 08/02/2018 20:36, H.J. Lu wrote:
> On Thu, Feb 8, 2018 at 12:33 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 08/02/2018 20:28, H.J. Lu wrote:
>>> On Thu, Feb 8, 2018 at 12:27 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Feb 8, 2018 at 12:18 PM, Andrew Cooper
>>>> <andrew.cooper3@citrix.com> wrote:
>>>>> On 08/02/2018 20:10, H.J. Lu wrote:
>>>>>> On Thu, Feb 8, 2018 at 11:26 AM, Andrew Cooper
>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I realise this is a little bit niche, but how feasible would it be to
>>>>>>> introduce a new .nops directive which takes a size parameter, and
>>>>>>> outputs long nops covering the number of specified bytes?
>>>>>>>
>>>>>> Sounds to me you want a pseudo NOP instruction:
>>>>>>
>>>>>> pseudo-NOP N
>>>>>>
>>>>>> which generates a long NOP with N byte. Is that correct. If yes,
>>>>>> what is the range of N?
>>>>> Currently 255 based on other implementation limits, and I expect that
>>>>> ought to be long enough for anyone. There is one existing user for
>>>>> N=43, and I expect that to grow a bit.
>>>>>
>>>>> The real answer properly depends at what point it is more efficient to
>>>>> jmp rather than wasting decode bandwidth decoding nops, and I don't know
>>>>> the answer, but expect that it isn't larger than 255.
>>>>>
>>>> How about
>>>>
>>>> {nop} N
>>>>
>>>> If N is less than 15 bytes, it generates a long nop. Otherwise, we use a jump
>>>> instruction over nops. Does it work for you?
>>> N will be limited to 255.
>> Do you mean up to 255 bytes of adjacent long nops, or still a jump if
>> over 15 bytes? For alternatives in the range of 15-30, a jmp is almost
>> certainly slower than executing through the nops. The ORM isn't clear
>> where the split lies, and I expect it is very uarch specific.
> How about this
>
> {nop} N, L
> {nop} N
>
> N is <= 255. If L is missing, L is 15.
>
> If N < L then
>   Long NOPs up to N bytes
> else
>   jmp + long nops up to N bytes.
> fi

I'm afraid that I don't think that will be very helpful in that form.
Are there technical reasons why you don't want to emit more than a
single 15-byte long nop?

First of all, 9-byte long nops are the longest you can use without
suffering decode stalls on most processors due to excess segment
prefixes, which is why both Linux and Xen top out there when
dynamically adding new nops.

Secondly, I don't understand why you want the jmp. I think it would be
entirely reasonable to make it the programmer's problem to work out when
a jmp is more efficient. If the patchsites really do get stupidly long,
we could make a boot-time u-arch calculation to decide whether the jmp
or the nops are better, but shorter patchsites are better so I don't
expect such a feature to get any production use where using a jmp would
be beneficial.

Ideally, such an implementation would just emit as many long nops as
would fill up the space requested. One trick to consider, however, is
that if you've got N+1 bytes remaining while emitting N-sized long nops
(where N is most likely 9, so 10 bytes remaining), then emitting an
(N-1)-byte long nop and a 2-byte long nop is more efficient to execute
than an N-byte nop and a single-byte nop, as the single-byte nop can't
be optimised during execution.

~Andrew
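
The fill strategy described in this message can be sketched as follows.
This is only an illustration in Python, not binutils, Linux or Xen code:
the function name split_nops and its max_len parameter are hypothetical,
and the default of 9 is simply the "most likely" nop size suggested in
the message. It computes the lengths of the nops to emit, not their
encodings.

```python
def split_nops(total, max_len=9):
    """Return the lengths of the long nops used to pad `total` bytes.

    Assumption: max_len=9 follows the message's suggestion that 9-byte
    nops are the longest that avoid decode stalls on most processors.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if max_len < 3:
        raise ValueError("max_len must be at least 3 for the split to work")

    sizes = []
    remaining = total

    # Emit full-size nops while more than max_len + 1 bytes remain.
    while remaining > max_len + 1:
        sizes.append(max_len)
        remaining -= max_len

    if remaining == max_len + 1:
        # Avoid a trailing single-byte nop: use (max_len - 1) + 2
        # instead of max_len + 1.
        sizes.extend([max_len - 1, 2])
    elif remaining > 0:
        sizes.append(remaining)

    return sizes


if __name__ == "__main__":
    for n in (10, 19, 43):
        print(n, split_nops(n))
```

With these assumptions, a single-byte nop is only produced when exactly
one byte of padding is requested; every other size is covered by nops of
between 2 and max_len bytes.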