New .nops directive, to aid Linux alternatives patching?
H.J. Lu
hjl.tools@gmail.com
Sun Feb 11 00:59:00 GMT 2018
More information about the Binutils mailing list
On Sat, Feb 10, 2018 at 9:22 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 10/02/18 15:44, H.J. Lu wrote:
>> On Fri, Feb 9, 2018 at 5:29 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 09/02/18 11:55, H.J. Lu wrote:
>>>> On Fri, Feb 9, 2018 at 3:35 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>> On 09/02/18 02:22, H.J. Lu wrote:
>>>>>> On Thu, Feb 8, 2018 at 5:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>> On Thu, Feb 8, 2018 at 4:45 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>>>>> On 09/02/2018 00:24, H.J. Lu wrote:
>>>>>>>>> On Thu, Feb 8, 2018 at 3:47 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>> On 08/02/2018 20:36, H.J. Lu wrote:
>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:33 PM, Andrew Cooper
>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>> On 08/02/2018 20:28, H.J. Lu wrote:
>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:27 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:18 PM, Andrew Cooper
>>>>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>>>>> On 08/02/2018 20:10, H.J. Lu wrote:
>>>>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 11:26 AM, Andrew Cooper
>>>>>>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I realise this is a little bit niche, but how feasible would it be to
>>>>>>>>>>>>>>>>> introduce a new .nops directive which takes a size parameter, and
>>>>>>>>>>>>>>>>> outputs long nops covering the number of specified bytes?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds to me you want a pseudo NOP instruction:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> pseudo-NOP N
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> which generates a long NOP with N byte. Is that correct. If yes,
>>>>>>>>>>>>>>>> what is the range of N?
>>>>>>>>>>>>>>> Currently 255 based on other implementation limits, and I expect that
>>>>>>>>>>>>>>> ought to be long enough for anyone. There is one existing user for
>>>>>>>>>>>>>>> N=43, and I expect that to grow a bit.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The real answer properly depends at what point it is more efficient to
>>>>>>>>>>>>>>> jmp rather than wasting decode bandwidth decoding nops, and I don't know
>>>>>>>>>>>>>>> the answer, but expect that it isn't larger than 255.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How about
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> {nop} N
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If N is less than 15 bytes, it generates a long nop. Otherwise, we use a jump
>>>>>>>>>>>>>> instruction over nops. Does it work for you?
>>>>>>>>>>>>> N will be limited to 255.
>>>>>>>>>>>> Do you mean up to 255 bytes of adjacent long nops, or still a jump if
>>>>>>>>>>>> over 15 bytes? For alternatives in the range of 15-30, a jmp is almost
>>>>>>>>>>>> certainly slower than executing through the nops. The ORM isn't clear
>>>>>>>>>>>> where the split lies, and I expect it is very uarch specific.
>>>>>>>>>>> How about this
>>>>>>>>>>>
>>>>>>>>>>> {nop} N, L
>>>>>>>>>>> {nop} N
>>>>>>>>>>>
>>>>>>>>>>> N is <= 255. If L is missing, L is 15.
>>>>>>>>>>>
>>>>>>>>>>> If N < L then
>>>>>>>>>>>   Long NOPs up to N bytes
>>>>>>>>>>> else
>>>>>>>>>>>   jmp + long nops up to N bytes.
>>>>>>>>>>> fi
>>>>>>>>>> I'm afraid that I don't think that will be very helpful in that form.
>>>>>>>>>> Are there technical reasons why you don't want to emit more than a
>>>>>>>>>> single 15byte long nop?
>>>>>>>>>>
>>>>>>>>> Doesn't
>>>>>>>>>
>>>>>>>>> {nop} 28, 40
>>>>>>>>>
>>>>>>>>> generate 2 x 14-byte nops?
>>>>>>>> By the above logic, yes. I still don't see the value in the L
>>>>>>>> parameter, because I don't expect an average programmer to know how to
>>>>>>>> choose it sensibly. Then again, a compiler generating code for a
>>>>>>>> specified uarch probably could have some idea of what value to feed in.
>>>>>>>>
>>>>>>>> If the semantics were a little more like:
>>>>>>>>
>>>>>>>> {nop} N => N bytes of nops with no jumps
>>>>>>>> {nop} N, L => as above
>>>>>>>>
>>>>>>>> Then this might be more useful.
>>>>>>>>
>>>>>>>> I expect N will typically be an expression rather than an absolute
>>>>>>>> number, because the usecase I've proposed is for filling in a specific,
>>>>>>>> calculated number of bytes. (In particular, what commonly happens is
>>>>>>>> that memory references in alternatives are the thing which cause the
>>>>>>>> exact length to fluctuate.) When there is a sensible uarch value for L,
>>>>>>>> that can be fed in, but shouldn't be mandatory. In particular, if it
>>>>>>>> unknown, 15 is almost certainly the wrong default for it.
>>>>>>> So, you want
>>>>>>>
>>>>>>> .nop SIZE
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> .jump SIZE
>>>>>>>
>>>>>>> which are similar to '.skip SIZE , FILL'. But they fill SIZE with nops or
>>>>>>> jmp + nops.
>>>>>>>
>>>>>> Or
>>>>>>
>>>>>> .nop SIZE, JUMP_SIZE
>>>>>>
>>>>>> If SIZE < JUMP_SIZE then
>>>>>>   SIZE of nops.
>>>>>> else
>>>>>>   SIZE of jmp + nops.
>>>>>> fi
>>>>> I'm still not sure why you want the jump functionality in the first
>>>>> place, but yes - this latest option would work.
>>>>>
>>>>> FWIW, jumping over code with alternatives is typically done like:
>>>>>
>>>>> ALTERNATIVE "jmp .L\@_skip", "", FEATURE_X
>>>>> ...
>>>>> .L\@_skip:
>>>>>
>>>>> At which point it is only the two or 5 byte jmp which is being
>>>>> dynamically modified. The converse case is where we begin with 2 or 5
>>>>> bytes of nops, and dynamically insert the jmp.
>>>>>
>>>>> If we're in the line for other related feature requests, how about being
>>>>> able to optionally specify the maximum length of individual nops? e.g.
>>>>>
>>>>> .nop SIZE [, MAX_NOP = 9 [, JUMP_SIZE = -1]]
>>>> OK, let go with
>>>>
>>>> .nop SIZE [, MAX_NOP = 9]
>>>>
>>>> It is easier to implement with 2 arguments. MAX_NOP must be a constant.
>>> Sounds good to me.
>> Please try users/hjl/nop branch:
>>
>> https://github.com/hjl-tools/binutils-gdb/tree/users/hjl/nop
>
> Oh - thankyou! I was about to ask if there were any pointers to get
> started hacking on binutils.
>
> As for the functionality, there are unfortunately some issues. Given
> this source:
>
>     .text
> single:
>     nop
>
> pseudo_1:
>     .nop 1
>
> pseudo_8:
>     .nop 8
>
> pseudo_8_4:
>     .nop 8, 4
>
> pseudo_20:
>     .nop 20
>
> I get the following disassembly:
>
> 0000000000000000 <single>:
>    0:   90                      nop
>
> 0000000000000001 <pseudo_1>:
>    1:   66 90                   xchg   %ax,%ax
>
> 0000000000000003 <pseudo_8>:
>    3:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>    a:   00 00
>
> 000000000000000c <pseudo_8_4>:
>    c:   90                      nop
>    d:   0f 1f 40 00             nopl   0x0(%rax)
>   11:   0f 1f 40 00             nopl   0x0(%rax)
>
> 0000000000000015 <pseudo_20>:
>   15:   90                      nop
>   16:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>   1d:   00 00 00
>   20:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>   27:   00 00 00
>
> The MAX_NOP part looks to be working as intended (including reducing
> below the default of 10), but there appears to be an off-by-one
> somewhere, as one too many nops are emitted in the block.
>
> Furthermore, attempting to use .nop 30 yields:
>
> /tmp/ccI2Eakp.s: Assembler messages:
> /tmp/ccI2Eakp.s: Fatal error: can't write 145268933551616 bytes to
> section .text of nops.o: 'Bad value'

Please try my branch again. It should be fixed.

--
H.J.
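
For readers following the thread, here is a minimal sketch of the size accounting that `.nop SIZE [, MAX_NOP = 9]` is expected to satisfy: the emitted nops should total exactly SIZE bytes, and no single nop should be longer than MAX_NOP. The disassembly quoted above covers one byte too many in each case (for example, `.nop 8, 4` produced 9 bytes). The function name `split_nops` and the choice to emit full-length nops first and the remainder last are assumptions made for illustration; this is not the code in the users/hjl/nop branch.

```python
def split_nops(size, max_nop=9):
    """Return the byte lengths of the individual nops covering `size` bytes.

    Hypothetical helper: models only the length bookkeeping, not the
    actual instruction encodings chosen by the assembler.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if max_nop < 1:
        raise ValueError("max_nop must be at least 1")
    # As many full-length nops as fit, then one shorter nop for the rest.
    # (The ordering of the short nop is an assumption of this sketch.)
    full, rest = divmod(size, max_nop)
    chunks = [max_nop] * full
    if rest:
        chunks.append(rest)
    return chunks


if __name__ == "__main__":
    # The cases from the test source quoted in the thread.
    for size, max_nop in [(1, 9), (8, 9), (8, 4), (20, 10)]:
        chunks = split_nops(size, max_nop)
        assert sum(chunks) == size  # no off-by-one: exactly `size` bytes
        print(size, max_nop, chunks)
```

Under this model, `.nop 8, 4` corresponds to two 4-byte nops and `.nop 1` to a single 1-byte nop, which is the behaviour the report above was expecting.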