Although mainly focused on recording security information, the system can be used to record any kind of data, even data not related to the application. One of the main goals of the system, however, is the ability to specify the address range over which a given piece of information is valid. For example, it is possible to specify that all of a program was compiled with the -O2 option except for one special function which was compiled with -O0 instead.
The range information is useful because it allows third parties to examine the binary and find out if its construction was consistent, i.e. that there are no gaps in the recorded information, and no special cases where a required feature was not active.
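The kind of consistency check described above can be sketched in a few lines of Python. This is only an illustration, not how any existing tool does it: the function name find_gaps is hypothetical, and the sketch assumes that each recorded range is a (start, end) pair with an exclusive end address.

```python
from typing import List, Tuple

# (start, end) address pair; treating `end` as exclusive is an assumption
# of this sketch, not something the notes themselves guarantee.
Range = Tuple[int, int]


def find_gaps(covered: List[Range], start: int, end: int) -> List[Range]:
    """Return the parts of [start, end) that no range in `covered` spans."""
    gaps = []
    cursor = start  # lowest address not yet known to be covered
    for lo, hi in sorted(covered):
        if hi <= cursor:
            continue  # this range lies entirely below the cursor
        if lo >= end:
            break  # this range and all later ones lie above the region
        if lo > cursor:
            gaps.append((cursor, lo))  # nothing recorded between cursor and lo
        cursor = max(cursor, hi)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps
```

For instance, if notes were recorded for 0x8a0 to 0x8b0 and for 0x8b8 to 0x8c6, then find_gaps([(0x8a0, 0x8b0), (0x8b8, 0x8c6)], 0x8a0, 0x8c6) reports the uncovered region (0x8b0, 0x8b8).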
The system works by adding a special section to the application containing individual pieces of information along with an address range for which the information is valid. (Some effort has gone into storing this information in a reasonably compact format).
The information is generated by a plugin that is attached to the compiler (either "gcc" or "clang"). The plugin is called "annobin" and it extracts information from the internals of the compiler and records it in the object file(s) being produced.
Note - the plugin method is just one way of generating the information. Any interested party can create and add information to the object file, providing that they follow the Watermark specification.
The information can be extracted from files via the use of tools like "readelf" and "objdump". The "annobin" package itself includes a program called annocheck which can also examine this information. Details on this program can be found elsewhere in this documentation.
Normally the recording of binary annotation notes is enabled automatically by the build system, so no user intervention is required. On Fedora and RHEL based systems this is handled by the redhat-rpm-config package.
Currently the binary annotations are generated by a plugin to the "GCC" and "clang" compilers. This does mean that files that are not compiled with either of these compilers will not gain any binary annotations, although there is an optional assembler switch to add some basic notes if none are present in the input files.
If the build system being used does not automatically enable the annobin plugin then it can be specifically added to the compiler command line by adding the -fplugin=annobin option. It may also be necessary to tell the compiler where to find the plugin by adding the -iplugindir= option, although this should only be necessary if the plugin is installed in an unusual place.
If it is desired to disable the recording of binary annotations then the -fplugin-arg-annobin-disable option (for "gcc") or the -Xclang -plugin-arg-annobin-disable options (for "clang") can be used. Note - these options must be placed after the -fplugin=annobin option.
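The ordering rule above can be illustrated with a small Python helper that a build script might use to assemble the flags. This is a sketch only: the function annobin_flags and its parameters are hypothetical, and the option strings are simply those named in the text.

```python
from typing import List, Optional


def annobin_flags(compiler: str = "gcc",
                  plugin_dir: Optional[str] = None,
                  disable: bool = False) -> List[str]:
    """Build the annobin-related compiler flags in a valid order.

    Hypothetical helper; only the option strings come from the text.
    """
    flags = []
    if plugin_dir is not None:
        # Only needed when the plugin is installed in an unusual place.
        flags.append(f"-iplugindir={plugin_dir}")
    flags.append("-fplugin=annobin")
    if disable:
        # The disable option must come after -fplugin=annobin.
        if compiler == "gcc":
            flags.append("-fplugin-arg-annobin-disable")
        elif compiler == "clang":
            flags += ["-Xclang", "-plugin-arg-annobin-disable"]
        else:
            raise ValueError(f"unsupported compiler: {compiler}")
    return flags
```

A caller could then run something like subprocess.run(["gcc", *annobin_flags(), "-c", "hello.c"]), where hello.c stands in for whatever source file is being compiled.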
On Fedora and RHEL systems the plugin can be disabled entirely for all compilations in a package by adding %undefine _annotated_build to the spec file.
The information is stored in the ELF Note format in a special section called ".gnu.build.attributes". The "readelf" program from the "binutils" package can extract and display these notes when the --notes option is provided. (Adding the --wide option is also helpful). Here is an example of the output:
Displaying notes found in: .gnu.build.attributes
  Owner                        Data size   Description
  GA$<version>3p3              0x00000010  OPEN  Applies to region from 0x8a0 to 0x8c6 (hello.c)
  GA$<tool>gcc 7.2.1 20170915  0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*GOW:0x452b                0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*<stack prot>strong        0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*GOW:0x412b                0x00000010  func  Applies to region from 0x8c0 to 0x8c6 (baz)
This shows various different pieces of information, including the fact that the notes were produced using version 3 of the specification, and version 3 of the plugin. The binary was built by gcc version 7.2.1 and the -fstack-protector-strong option was enabled on the command line. The program was compiled with -O2 enabled except the baz() function which was compiled with -O0 instead.
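Output of this shape can be turned into structured records with a short script. The following Python sketch assumes the column layout of the example above (the exact spacing may differ between "readelf" versions); the names Note, parse_notes and SAMPLE are illustrative and are not part of any tool.

```python
import re
from typing import List, NamedTuple, Optional


class Note(NamedTuple):
    owner: str            # decoded owner field as printed by readelf
    kind: str             # "OPEN" or "func"
    start: int            # first address of the region
    end: int              # last address printed for the region
    name: Optional[str]   # trailing "(...)" annotation, if any


# Assumes the layout shown in the example output; not an official grammar.
_LINE = re.compile(
    r"^\s*(?P<owner>GA.+?)\s+0x[0-9a-f]{8}\s+(?P<kind>OPEN|func)\s+"
    r"Applies to region from (?P<start>0x[0-9a-f]+) to (?P<end>0x[0-9a-f]+)"
    r"(?:\s+\((?P<name>[^)]*)\))?\s*$"
)


def parse_notes(text: str) -> List[Note]:
    """Extract one Note per matching line; other lines are ignored."""
    notes = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            notes.append(Note(m["owner"], m["kind"],
                              int(m["start"], 16), int(m["end"], 16),
                              m["name"]))
    return notes


SAMPLE = """\
Displaying notes found in: .gnu.build.attributes
  Owner                        Data size   Description
  GA$<version>3p3              0x00000010  OPEN  Applies to region from 0x8a0 to 0x8c6 (hello.c)
  GA$<tool>gcc 7.2.1 20170915  0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*GOW:0x452b                0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*<stack prot>strong        0x00000000  OPEN  Applies to region from 0x8a0 to 0x8c6
  GA*GOW:0x412b                0x00000010  func  Applies to region from 0x8c0 to 0x8c6 (baz)
"""
```

Calling parse_notes(SAMPLE) yields five records; the last one has kind "func", name "baz" and the region 0x8c0 to 0x8c6.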
The most complicated part of the notes is the owner field. This is used to encode the type of note as well as its value and possibly extra data as well. The format of the field is explained in detail in the Watermark specification, but it basically consists of the letters G and A followed by an encoding character (one of *$!+) and then a type character and finally the value.
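As a rough illustration of that layout, the Python sketch below splits a raw owner field into its parts. It follows only the simplified description given here, so the Watermark specification remains the authority. The meanings attached to the four encoding characters, and the use of byte 0x01 as the type character of a version note, are assumptions of this sketch; parse_owner and Owner are hypothetical names.

```python
from typing import NamedTuple

# Assumed meanings of the encoding characters; check the Watermark
# specification before relying on them.
ENCODINGS = {
    "$": "string",
    "*": "numeric",
    "+": "bool true",
    "!": "bool false",
}


class Owner(NamedTuple):
    encoding: str    # decoded meaning of the encoding character
    type_char: str   # the single type character
    value: bytes     # everything after the type character, undecoded


def parse_owner(raw: bytes) -> Owner:
    """Split an owner field into encoding, type character and value."""
    if len(raw) < 4 or raw[:2] != b"GA":
        raise ValueError("not a build attribute owner field")
    enc = chr(raw[2])
    if enc not in ENCODINGS:
        raise ValueError(f"unknown encoding character: {enc!r}")
    return Owner(ENCODINGS[enc], chr(raw[3]), raw[4:])
```

With these assumptions, parse_owner(b"GA$\x013p3\x00") gives a string-encoded note whose value is the NUL-terminated text "3p3".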
The notes are always four-byte aligned, even on 64-bit systems. This does mean that consumers of the notes may have to read 8-byte wide values from 4-byte aligned addresses, and that producers of the notes may have to generate unaligned relocs when creating them.
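The alignment point can be seen in a small Python sketch that reads a single note from a byte buffer. It assumes the standard ELF note layout (three 4-byte words for name size, description size and type, followed by the name and the description, each padded to four bytes), little-endian data, and 0x100 as the type number of an OPEN note; read_note and the sample buffer are illustrative only.

```python
import struct


def align4(n: int) -> int:
    """Round n up to the next multiple of four."""
    return (n + 3) & ~3


def read_note(buf: bytes, offset: int = 0):
    """Read one note; return (name, type, (start, end), next_offset)."""
    namesz, descsz, ntype = struct.unpack_from("<III", buf, offset)
    name_off = offset + 12
    desc_off = name_off + align4(namesz)
    name = buf[name_off:name_off + namesz]
    if descsz == 16:
        # Two 8-byte addresses. desc_off is only guaranteed to be
        # 4-byte aligned; "<" tells struct not to assume more than that.
        region = struct.unpack_from("<QQ", buf, desc_off)
    elif descsz == 8:
        region = struct.unpack_from("<II", buf, desc_off)  # 32-bit addresses
    else:
        region = (None, None)  # e.g. size 0: no range stored in this note
    return name, ntype, region, desc_off + align4(descsz)


# Hand-built sample: an 8-byte name puts the description at offset 20,
# which is 4-byte aligned but not 8-byte aligned.
SAMPLE_NOTE = (struct.pack("<III", 8, 16, 0x100)
               + b"GA$\x013p3\x00"
               + struct.pack("<QQ", 0x8a0, 0x8c6))
```

Here read_note(SAMPLE_NOTE) recovers the region (0x8a0, 0x8c6) even though the two 8-byte values start at offset 20. A C consumer would typically need memcpy or an equivalent to do the same read safely.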
The default is to enable attach, but the inverse option is available in case the host assembler does not support the .attach_to_group pseudo-op. If this feature is disabled then note generation for function sections will not work properly.
Note - this option is currently under development, and is not yet fully functional.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".