SPO600 2024 Fall Project
Goal
The goal of this project is to write a functioning proof-of-concept prototype of the function-pruning component of an Automatic Function Multi-Versioning (AFMV) capability for the GNU Compiler Collection (GCC) on AArch64 systems, building on previous work.
The Problem
We're attempting to add automatic function multi-versioning (AFMV) to the GCC compiler. This builds upon other capabilities within the compiler:
1. IFUNC - The ifunc capability allows a program to provide multiple implementations of a function, and to use a “resolver function” which is run once at program initialization to determine which implementation will be used. The resolver function returns a pointer to the selected function, which is used from that point on for the life of the process. This capability is very flexible but requires the programmer to create:
- multiple implementations of the desired function with different names
- the resolver function (which can select between the implementations based on any criteria, but usually selects based on the hardware capabilities of the runtime system)
- a prototype that ties together the desired target name and the resolver function
2. FMV - The GCC compiler includes a function multiversioning capability for x86_32, x86_64, powerpc, and aarch64 architectures (with slightly different implementations). FMV is similar to ifunc, and can be used in two different ways:
- With manually-written functions:
- each function has the same name, and an attribute which specifies which architectural variant that function version is to be used on
- the resolver function is provided automatically by the compiler
- With function cloning:
- only one version of the function is provided, and an attribute specifies the list of architectural variants. The function is automatically cloned by the compiler with one function clone for each architectural variant, and each clone is optimized for a specific variant.
- the resolver function is provided automatically by the compiler
This requires fewer code changes than ifunc, but still requires that the programmer state the architectural variants that will be targeted. The programmer also needs to know (or guess!) which functions would benefit from multiversioning.
3. AFMV - This “automatic function multi-versioning” capability does not exist yet, and is what we're working towards building. AFMV should work like FMV function cloning, but without any source code changes; instead, a compiler option will be used to specify the architectural variants of interest, and any function that would benefit from function multi-versioning will be automatically cloned. It is proposed that AFMV operate in this fashion:
- all functions will be cloned and subject to the compiler's optimization process
- any function clones which are fundamentally the same after optimization will be pruned back to a single implementation
Our project this semester is to determine how to detect when two different versions of a function are “fundamentally the same”.
We're going to do this work within the GCC compiler.
Project Stage 1: Preparation
Note: the Project Stage 1 requirements were changed on Oct 31 due to issues with the available server systems.
1. Become familiar with the GCC build process. Build the current development version of GCC on x86_64 and aarch64 platforms. Get to know how long a full build takes, how to change the build options, and how to install a local (non-system, personal) copy of GCC. Make sure that you are very comfortable with the build process.
Note: you do not have to produce a full bootstrap build.
1a. Become familiar with producing and reviewing dumps produced during the compilation passes (see the -fdump-tree-all and -fdump-tree-pass compiler options and the associated code, as well as -fdump-rtl-all and -fdump-rtl-pass). Important: you do not (and should not) enable these options during the compilation of the GCC compiler itself, but you should test out using them to compile other (smaller!) programs.
2. Learn how to navigate the GCC codebase. Specifically, find out what code implements these aspects of the compiler and how to add to or change the code:
a. Find the code that controls the compilation passes, and how passes can be added. Test out adding a pass (even if just a dummy pass that produces a confirmation that it's running).
b. Find out how to access the intermediate representation (IR) of the program being compiled. Ideally, experiment with using the accessor macros (see the GCC documentation) to access and perhaps iterate through the IR. You could, for example, create a pass that performs a scan for all of the function names, and identifies all of the clones of a function.
c. Find out how dumps are produced during the compilation passes (see the -fdump-tree-all and -fdump-tree-pass compiler options and the associated code, as well as -fdump-rtl-all and -fdump-rtl-pass). Become familiar with producing and reviewing these dumps. Consider creating a dummy pass that produces a useful diagnostic dump, or add a dump to an IR scanning pass.
Submitting your Project Stage 1
Blog your results:
- Include detailed results for the items above. Be specific and conclusive in your reporting, and include detail such as build options and build time, specific files and directories identified as you learned to navigate the code, and the specific code used in your experimentation.
Enable replication of your results. For example, you could provide links to specific content in a Git repository of your experiments. Avoid presenting code as screenshots whenever possible, because screenshots are not searchable, indexable, testable, nor accessible.
- Add your reflections on the experience - what you learned, what you found interesting, what you found challenging, and what gaps you have identified in your knowledge and how you can address those gaps.
- I recommend that you blog about your work in multiple sections - blog as you go rather than waiting and writing one massive blog post at the end of each stage.
Resources
- Building GCC guide page on this wiki
- See especially:
- Take particular note of the section on Contributing to GCC.
- Installing GCC, the GCC project's guide to compiling and installing the GCC compiler from source code
- Note that GCC is mirrored onto GitHub, but the main activity is conducted in the GCC git repository documented in the link above. The GitHub mirror is provided for the convenience of GitHub users, but the GitHub workflow (including approaches such as forking and pull requests) is not used by the GCC project; instead, they discuss and review patches through an email-based workflow.
Specifics: IFUNC
GCC IFUNC documentation:
Specifics: FMV
Current documentation:
- Mentions that FMV is only implemented for i386 (aka x86_32) - now false as it's also implemented for x86_64, Power (PPC64), and aarch64
- Does not mention target_clones syntax
- Does not talk about the current state of implementation
- Mentions that FMV may be disabled at compile time by a compiler flag, but this flag is not documented and does not appear to be implemented
- The macro __HAVE_FEATURE_MULTI_VERSIONING (or __FEATURE_FUNCTION_MULTI_VERSIONING or __ARM_FEATURE_FUNCTION_MULTIVERSIONING) does not appear to be defined (as of GCC 14.2.1)
Implementation in GCC
- Implemented and tested in (at least) x86_64, PowerPC64, and AArch64
- I did not test the PowerPC version
- Testing performed with GCC 14.0.1 20240223 with limited testing on 14.2.1 20240912.
- On x86:
- Syntax to manually specify a function target: __attribute__((target("nnn")))
- where nnn may take the form of “default”, or “feature” e.g., “sse4.2”, or “feature,feature” e.g., “sse4.2,avx2”, or it may take the form “arch=archlevel” e.g., “arch=x86-64-v3” or “arch=atom”
- target_version is not accepted as an alternative to the target attribute
- Syntax to manually specify cloning: __attribute__((target_clones("nnn1", "nnn2" [...])))
- Works in both the C and C++ frontends
- On AArch64:
- Current support landed Dec 16, 2023; see commit 0cfde688e21 (and the commit messages) in the GCC Git repository (https://gcc.gnu.org/git.html) or GitHub read-only mirror of that repository (https://github.com/gcc-mirror/gcc), or the corresponding discussion on the gcc-patches mailing list. There have been several updates and enhancements since it first landed.
- Syntax to manually specify a function target: __attribute__((target_version("nnn")))
- where nnn may take the form of “default”, or “feature” e.g., “sve”, or “feature+feature” e.g., “sve+sve2” (Note: in some earlier versions of GCC, a plus-sign was required at the start of the feature list, e.g., “+sve” instead of “sve”. This was changed in GCC 14). Note the use of the attribute target_version, which is compliant with the ACLE specification, as opposed to target (as used on x86); it appears possible to use target in some versions of the GCC compiler (apparently with the plus-sign at the start of the feature list?). Note that the “arch=nnn” format is not supported (and probably should be).
- Syntax to manually specify cloning: __attribute__((target_clones("nnn", "nnn" [...])))
- note that contrary to the ACLE documentation, there is no automatic “default” argument - the first argument supplied should be “default”
- Manually specified function target (initially) works in the C++ frontend only, but automatic cloning appears to work in both C and C++. Note that most C code can be compiled with the C++ frontend, except for some recent C enhancements not understood by C++ as well as some C++ keywords that are not reserved in C
Due Date
- Stage 1 is due with the second batch of blog posts on November 23, 2024.