ELF octets_per_byte

Dan dgisselq@verizon.net
Thu Feb 25 02:41:00 GMT 2016
Maciej,

>  Please also note that the ELF gABI[1] is very explicit about a byte being 
> 8-bits wide:
> 
> "As described here, the object file format supports various processors 
> with 8-bit bytes and either 32-bit or 64-bit architectures.  
> Nevertheless, it is intended to be extensible to larger (or smaller) 
> architectures.  Object files therefore represent some control data with a 
> machine-independent format, making it possible to identify object files 
> and interpret their contents in a common way.  Remaining data in an object 
> file use the encoding of the target processor, regardless of the machine 
> on which the file was created."
> 
> so whenever it refers to a "byte" I think it really means an octet, 
> although I do see an ambiguity here as sometimes it uses the term to mean 
> a target byte.
> 

Sigh.  You don't need to convince me that "bytes" versus "octets" makes
for rather confusing nomenclature.  They wouldn't be my first choice
of terms, and I am open to a better one.  In my own experience, as
in the above citation, "byte" means 8 bits.  Within binutils, though,
and gas in particular, "byte" appears to be the term used for
the minimum addressable unit.

> > I also propose that the following values are in units of target address
> > space "bytes":
> > 
> > ELF header "entry" address
> > section header address
> > symbol value
> > symbol size
> > relocation offset
> > relocation addend
> 
>  These express target addresses or are directly related to them (e.g. 
> offsets) and therefore I'm sure they're best expressed in whatever format 
> your target uses.  These IMHO certainly qualify as "remaining data" 
> referred to in the gABI citation included above.
> 
>  So with the entry point for example I'd expect whatever representation a 
> function pointer stored in memory would have on your target if the 
> function pointed was the intended entry point.  Likewise with VMAs and 
> LMAs used in program headers, section headers, symbol tables, etc.
> 
>  These do not necessarily have to be "proper" memory addresses even, for 
> example the MIPS processor encodes the execution mode in bit #0 of code 
> addresses, so in certain cases the entry point in MIPS ELF binaries will 
> have bit #0 set even though the memory location referred will have this 
> bit clear.  So it's really up to you to decide whatever encoding is the 
> most appropriate for your architecture.
> 
>  As to the symbol size I think it needs to be set to whatever the 
> C-language's `sizeof' operator would return for a unit of storage of the 
> same size.
> 

As I mentioned earlier this evening, the Zip CPU features:

sizeof(char)=sizeof(short)=sizeof(int)=sizeof(void *)=1 // each is 32 bits

It's a ... unique architectural feature.  :)
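
To make the consequence concrete, here is a minimal sketch in C of the
unit conversion this implies, assuming one addressable unit is 32 bits
(4 octets).  The macro and function names are hypothetical and are not
taken from binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only: one addressable unit (a "byte" in
   gas terminology) is 32 bits wide, i.e. 4 octets. */
#define OCTETS_PER_BYTE 4u

/* Hypothetical helper: convert a count of target addressable units
   (e.g. a symbol size) into a count of octets. */
uint64_t target_units_to_octets(uint64_t units)
{
    return units * OCTETS_PER_BYTE;
}

/* Hypothetical helper: convert a count of octets into target
   addressable units.  The octet count must be a whole number of
   addressable units. */
uint64_t octets_to_target_units(uint64_t octets)
{
    assert(octets % OCTETS_PER_BYTE == 0);
    return octets / OCTETS_PER_BYTE;
}
```

Under that assumption, an object for which sizeof gives 1 takes up
4 octets, and 32 octets of section contents span 8 target addresses.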

Dan

> References:
> 
> [1] "System V Application Binary Interface" - DRAFT - 10 June 2013, 
>     Section "Data Representation"
> <http://www.sco.com/developers/gabi/latest/ch4.intro.html#data_representation>
> 
>  HTH,
> 
>   Maciej


