Section: RPM Development Tools (1)
Updated: 2020-01-31

annobin - Annobin  




Binary Annotation is a method for recording information about an application inside the application itself. It is an implementation of the "Watermark" specification defined here: <>

Although mainly focused on recording security information, the system can be used to record any kind of data, even data not related to the application. One of the main goals of the system, however, is the ability to specify the address range over which a given piece of information is valid. For example, it is possible to specify that all of a program was compiled with the -O2 option except for one special function, which was compiled with -O0 instead.
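
The idea of range-scoped information can be sketched in a few lines of Python. This is only an illustrative model, not the Watermark format itself: the RangedNote class, the lookup function, the "optimisation" key and the rule that the narrowest covering range wins are all assumptions made for the example, as is treating the end address as inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RangedNote:
    """One piece of information plus the address range it covers (hypothetical model)."""
    name: str
    value: str
    start: int  # first address covered
    end: int    # last address covered (treated as inclusive in this sketch)


def lookup(notes: List[RangedNote], name: str, address: int) -> Optional[str]:
    """Return the value of `name` at `address`, preferring the narrowest covering range."""
    covering = [n for n in notes if n.name == name and n.start <= address <= n.end]
    if not covering:
        return None
    # A function-specific note covers a smaller range than the file-wide note,
    # so the narrowest range is assumed to be the most specific one.
    return min(covering, key=lambda n: n.end - n.start).value


notes = [
    RangedNote("optimisation", "-O2", 0x8A0, 0x8C6),  # the whole program
    RangedNote("optimisation", "-O0", 0x8C0, 0x8C6),  # one special function
]

print(lookup(notes, "optimisation", 0x8A4))  # address outside the special function
print(lookup(notes, "optimisation", 0x8C2))  # address inside the special function
```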

The range information is useful because it allows third parties to examine the binary and find out whether its construction was consistent. That is, they can check that there are no gaps in the recorded information and no special cases where a required feature was not active.
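
A consistency check of this kind might look roughly like the sketch below. It assumes the caller already knows the address range of the code and the list of ranges covered by notes; the find_gaps function and the addresses used are hypothetical, and end addresses are treated as exclusive purely for simplicity.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end), with end treated as exclusive in this sketch


def find_gaps(code: Range, covered: List[Range]) -> List[Range]:
    """Return the parts of `code` that no entry in `covered` accounts for."""
    gaps = []
    cursor, code_end = code
    for start, end in sorted(covered):
        if start > cursor:
            # Nothing covers the addresses between cursor and this range.
            gaps.append((cursor, min(start, code_end)))
        cursor = max(cursor, end)
        if cursor >= code_end:
            break
    if cursor < code_end:
        gaps.append((cursor, code_end))
    return gaps


# Hypothetical layout: notes cover two regions, leaving a hole in the middle.
for start, end in find_gaps((0x1000, 0x2000), [(0x1000, 0x1400), (0x1800, 0x2000)]):
    print(f"no notes for {start:#x}..{end:#x}")
```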

The system works by adding a special section to the application containing individual pieces of information along with an address range for which the information is valid. (Some effort has gone into storing this information in a reasonably compact format.)

The information is generated by a plugin that is attached to the compiler (either "gcc" or "clang"). The plugin is called "annobin". It extracts information from the internals of the compiler and records it in the object file(s) being produced.

Note - the plugin method is just one way of generating the information. Any interested party can create and add information to the object file, provided that they follow the Watermark specification.

The information can be extracted from files with tools like "readelf" and "objdump". The "annobin" package itself includes a program called annocheck which can also examine this information. Details on this program can be found elsewhere in this documentation.

Normally the recording of binary annotation notes is enabled automatically by the build system, so no user intervention is required. On Fedora and RHEL based systems this is handled by the redhat-rpm-config package.

Currently the binary annotations are generated by a plugin to the "GCC" and "clang" compilers. This means that files that are not compiled with either of these compilers will not gain any binary annotations, although there is an optional assembler switch to add some basic notes if none are present in the input files.

If the build system being used does not automatically enable the annobin plugin then it can be added explicitly to the compiler command line with the -fplugin=annobin option. It may also be necessary to tell the compiler where to find the plugin by adding the -iplugindir= option, although this should only be necessary if the plugin is installed in an unusual place.

To disable the recording of binary annotations, use the -fplugin-arg-annobin-disable option (for "gcc") or the -Xclang -plugin-arg-annobin-disable option (for "clang"). Note - these options must be placed after the -fplugin=annobin option.
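
For build scripts that assemble the compiler command line themselves, the ordering rule can be captured in a small helper. The following is a sketch only: the annobin_flags function and the hello.c file name are hypothetical, and the flag spellings are the ones quoted in the text above.

```python
from typing import List


def annobin_flags(compiler: str, options: List[str]) -> List[str]:
    """Build the annobin-related flags for "gcc" or "clang".

    The plugin itself is named first and the per-option flags follow it,
    because plugin options must come after -fplugin=annobin.
    """
    if compiler not in ("gcc", "clang"):
        raise ValueError(f"unsupported compiler: {compiler!r}")
    flags = ["-fplugin=annobin"]
    for option in options:
        if compiler == "gcc":
            flags.append(f"-fplugin-arg-annobin-{option}")
        else:
            # clang needs each plugin argument introduced by -Xclang
            flags += ["-Xclang", f"-plugin-arg-annobin-{option}"]
    return flags


print(" ".join(["gcc"] + annobin_flags("gcc", ["disable"]) + ["-c", "hello.c"]))
print(" ".join(["clang"] + annobin_flags("clang", ["disable"]) + ["-c", "hello.c"]))
```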

On Fedora and RHEL systems the plugin can be disabled entirely for all compilations in a package by adding %undefine _annotated_build to the spec file.

The information is stored in the ELF Note format in a special section called "". The "readelf" program from the "binutils" package can extract and display these notes when the --notes option is provided. (Adding the --wide option is also helpful.) Here is an example of the output:

        Displaying notes found in:
          Owner                        Data size        Description
          GA$<version>3p3              0x00000010       OPEN        Applies to region from 0x8a0 to 0x8c6 (hello.c)
          GA$<tool>gcc 7.2.1 20170915  0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*GOW:0x452b                0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*<stack prot>strong        0x00000000       OPEN        Applies to region from 0x8a0 to 0x8c6
          GA*GOW:0x412b                0x00000010       func        Applies to region from 0x8c0 to 0x8c6 (baz)

This shows several pieces of information, including the fact that the notes were produced using version 3 of the specification and version 3 of the plugin. The binary was built by gcc version 7.2.1 and the -fstack-protector-strong option was enabled on the command line. The program was compiled with -O2 enabled, except for the baz() function, which was compiled with -O0 instead.
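
Output in this shape is regular enough to be processed by a script. The sketch below parses lines like the ones in the example into dictionaries. It is based solely on the sample output shown here, so the regular expression, the parse_notes function and the dictionary keys are assumptions and may need adjusting for other readelf versions.

```python
import re
from typing import Dict, List

# One note per line: owner, data size, note kind, address range, optional symbol.
NOTE_LINE = re.compile(
    r"^\s*(?P<owner>GA\S.*?)\s+"
    r"(?P<size>0x[0-9a-fA-F]{8})\s+"
    r"(?P<kind>\S+)\s+"
    r"Applies to region from (?P<start>0x[0-9a-fA-F]+) to (?P<end>0x[0-9a-fA-F]+)"
    r"(?:\s+\((?P<symbol>[^)]*)\))?\s*$"
)


def parse_notes(text: str) -> List[Dict[str, object]]:
    """Extract the annobin notes from `readelf --notes --wide` style text."""
    notes = []
    for line in text.splitlines():
        m = NOTE_LINE.match(line)
        if m is None:
            continue  # header lines and anything unrecognised are skipped
        notes.append({
            "owner": m["owner"],
            "data_size": int(m["size"], 16),
            "kind": m["kind"],
            "start": int(m["start"], 16),
            "end": int(m["end"], 16),
            "symbol": m["symbol"],  # None when no name is shown in parentheses
        })
    return notes


SAMPLE = """\
  Owner                        Data size        Description
  GA$<version>3p3              0x00000010       OPEN        Applies to region from 0x8a0 to 0x8c6 (hello.c)
  GA*GOW:0x412b                0x00000010       func        Applies to region from 0x8c0 to 0x8c6 (baz)
"""

for note in parse_notes(SAMPLE):
    print(note["owner"], hex(note["start"]), hex(note["end"]), note["symbol"])
```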

The most complicated part of the notes is the owner field. This is used to encode the type of note as well as its value and possibly extra data as well. The format of the field is explained in detail in the Watermark specification, but it basically consists of the letters G and A followed by an encoding character (one of *$!+) and then a type character and finally the value.
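
As an illustration, the owner strings printed by readelf in the example above can be split into their parts as follows. This sketch works on the displayed text, not on the raw bytes in the file, and the split_owner function is hypothetical. The labels given to the four encoding characters are an assumption based on a reading of the Watermark specification and should be checked against it.

```python
from typing import NamedTuple

# Assumed meanings of the encoding characters; verify against the specification.
ENCODINGS = {"*": "numeric", "$": "string", "+": "bool true", "!": "bool false"}


class Owner(NamedTuple):
    encoding: str
    note_type: str
    value: str


def split_owner(owner: str) -> Owner:
    """Split an owner string, as displayed by readelf, into encoding, type and value."""
    if len(owner) < 4 or not owner.startswith("GA"):
        raise ValueError(f"not a binary annotation owner field: {owner!r}")
    encoding_char, rest = owner[2], owner[3:]
    if encoding_char not in ENCODINGS:
        raise ValueError(f"unknown encoding character: {encoding_char!r}")
    if rest.startswith("<") and ">" in rest:
        # In the sample output some types are shown as <name>.
        note_type, _, value = rest[1:].partition(">")
    elif ":" in rest:
        # Other types are shown as NAME:value.
        note_type, _, value = rest.partition(":")
    else:
        # Fall back to a single type character followed by the value.
        note_type, value = rest[0], rest[1:]
    return Owner(ENCODINGS[encoding_char], note_type, value)


for text in ("GA$<version>3p3", "GA*GOW:0x452b", "GA*<stack prot>strong"):
    print(split_owner(text))
```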

The notes are always four byte aligned, even on 64-bit systems. This means that consumers of the notes may have to read 8-byte wide values from 4-byte aligned addresses, and that producers of the notes may have to generate unaligned relocs when creating them.
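
The effect of the four byte alignment on a consumer can be shown with a short sketch that reads a note in the standard ELF note layout (three 4-byte words for name size, description size and type, followed by the padded name and description). Several things here are assumptions made for the example: the data is little-endian, the 16-byte description holds a start and an end address as two 64-bit values, and the note name and type value are placeholders.

```python
import struct
from typing import Tuple


def align4(n: int) -> int:
    """Round n up to the next multiple of four."""
    return (n + 3) & ~3


def read_note(buf: bytes, offset: int = 0) -> Tuple[bytes, int, bytes, int]:
    """Read one ELF note at `offset`; return (name, type, desc, next_offset)."""
    namesz, descsz, note_type = struct.unpack_from("<III", buf, offset)
    name_start = offset + 12
    desc_start = name_start + align4(namesz)  # name is padded to 4 bytes, not 8
    name = buf[name_start:name_start + namesz].rstrip(b"\0")
    desc = buf[desc_start:desc_start + descsz]
    return name, note_type, desc, desc_start + align4(descsz)


# Build a placeholder note whose description starts at offset 20, which is
# 4-byte aligned but not 8-byte aligned.
NOTE_TYPE = 0x100                       # placeholder type value for this sketch
name = b"GA*abc\0"                      # 7 bytes, so one byte of padding follows
desc = struct.pack("<QQ", 0x8A0, 0x8C6)  # assumed layout: start and end address
raw = (struct.pack("<III", len(name), len(desc), NOTE_TYPE)
       + name + b"\0" * (align4(len(name)) - len(name))
       + desc)

parsed_name, parsed_type, parsed_desc, next_offset = read_note(raw)
# "<QQ" uses standard sizes with no alignment requirement, so the two
# 8-byte values can be read regardless of where the description starts.
start, end = struct.unpack("<QQ", parsed_desc)
print(parsed_name, hex(start), hex(end))
```

In a language such as C the same read would typically go through memcpy or a similar mechanism instead of a direct 8-byte load, since the address is not guaranteed to be 8-byte aligned.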


The plugin accepts a small selection of command line arguments, all accessed by passing -fplugin-arg-annobin-<option> (for "gcc") or -Xclang -plugin-arg-annobin-<option> (for "clang") on the command line. These options must be placed on the command line after the plugin itself is mentioned. The options are:
Either disable or enable the plugin. The default is for the plugin to be enabled.
Display a list of supported options on the standard output. This is in addition to whatever else the plugin has been instructed to do.
Display the version of the plugin on the standard output. This is in addition to whatever else the plugin has been instructed to do.
Report the actions that the plugin is taking. If invoked for a second time on the command line the plugin will be very verbose.
Report the generation of function specific notes. This indicates that the named function was compiled with different options from those that were globally enabled.
Do not, or do, record information for the dynamic loader. The default is to record this information.
Do not, or do, record information for static analysis. The default is to record this information.
Do, or do not, record information about the stack requirements of functions in the executable. This feature is disabled by default as these notes can take up a lot of extra room if the executable contains a lot of functions.
If stack size requirements are being recorded then this option sets the minimum value to record. Functions which require less than "N" bytes of static stack space will not have their requirements recorded. If not set, then "N" defaults to 1024.
If enabled the global-file-syms option will create globally visible, unique symbols to mark the start and end of the compiled code. This can be desirable if a program consists of multiple source files with the same name, or if it links to a library that was built with source files of the same name as the program itself. The disadvantage of this feature however is that the unique names are based upon the time of the build, so repeated builds of the same source will have different symbol names inside it. This breaks the functionality of the build-id system which is meant to identify similar builds created at different times. This feature is disabled by default, and if enabled can be disabled again via the no-global-file-syms option.
When gcc compiles code with the -ffunction-sections option active it will place each function into its own section. When the annobin attach option is active the plugin will attempt to attach the function section to a group containing the notes and relocations for the function. In that way, if the linker decides to discard the function, it will also know that it should discard the notes and relocations as well.

The default is to enable attach, but the inverse option is available in case the host assembler does not support the .attach_to_group pseudo-op. If this feature is disabled then note generation for function sections will not work properly.

Adds an extra prefix to the symbol names generated by the "annobin" plugin. This allows the plugin to be run twice on the same executable, which can be useful for debugging and build testing.
The active-checks option enables compile time checking by the annobin plugin. The plugin will actively examine the gcc command line and generate errors if required security options are missing or have the wrong value. The default is not to perform these checks.

Note - this option is currently under development, and is not yet fully functional.



Copyright (c) 2018 - 2020 Red Hat.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".