
SPO600 2024 Fall Project

Goal

The goal of this project is to write a functioning proof-of-concept prototype for the function-pruning component of an Automatic Function Multi-Versioning (AFMV) capability for the GNU Compiler Collection (GCC) on AArch64 systems, building on previous work.

The Problem

We're attempting to add automatic function multi-versioning (AFMV) to the GCC compiler. This builds upon other capabilities within the compiler:

1. IFUNC - The ifunc capability allows a program to provide multiple implementations of a function, and to use a “resolver function” which is run once at program initialization to determine which implementation will be used. The resolver function returns a pointer to the selected function, which is used from that point on for the life of the process. This capability is very flexible but requires the programmer to create:

2. FMV - The GCC compiler includes a function multiversioning capability for x86_32, x86_64, powerpc, and aarch64 architectures (with slightly different implementations). FMV is similar to ifunc, and can be used in two different ways:

This requires fewer code changes than ifunc, but still requires that the programmer state the architectural variants that will be targeted. The programmer also needs to know (or guess!) which functions would benefit from multiversioning.

3. AFMV - This “automatic function multi-versioning” capability does not exist yet, and is what we're working towards building. AFMV should work like FMV function cloning, but without any source code changes; instead, a compiler option will be used to specify the architectural variants of interest, and any function that would benefit from function multi-versioning will be automatically cloned. It is proposed that AFMV operate in this fashion:

Our project this semester is to determine how to detect when two different versions of a function are “fundamentally the same”.

We're going to do this work within the GCC compiler.
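To make the resolver mechanism described under IFUNC above more concrete, here is a minimal, portable C++ sketch of the idea. It does not use GCC's actual ifunc attribute; it only imitates the behaviour with a function pointer that is chosen once during static initialization and then used for the life of the process. The names sum, sum_generic, sum_unrolled, resolve_sum, and cpu_has_fast_path are all hypothetical, and the feature check is a stand-in that always returns false.

```cpp
#include <cassert>

// Two hypothetical implementations of the same function.
int sum_generic(const int* v, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += v[i];
    return s;
}

int sum_unrolled(const int* v, int n) {
    int s0 = 0, s1 = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {  // two elements per iteration
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n) s0 += v[i];       // leftover element when n is odd
    return s0 + s1;
}

using sum_fn = int (*)(const int*, int);

// Assumption: stand-in for a real CPU feature test; always false here.
bool cpu_has_fast_path() { return false; }

// The "resolver": returns a pointer to the selected implementation.
sum_fn resolve_sum() {
    return cpu_has_fast_path() ? sum_unrolled : sum_generic;
}

// Resolved once, before main() runs, and never changed afterwards.
static const sum_fn sum_impl = resolve_sum();

// Callers use this one name and never see the selection step.
int sum(const int* v, int n) {
    assert(n >= 0);
    return sum_impl(v, n);
}
```

With real ifunc support the indirection is handled by the dynamic linker rather than by a wrapper function, but the division of labour (several implementations plus one resolver) is the same.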

Project Stage 1: Preparation

Note: the Project Stage 1 requirements were changed on Oct 31 due to issues with the available server systems.

1. Become familiar with the GCC build process. Build the current development version of GCC on x86_64 and aarch64 platforms. Get to know how long a full build takes, how to change the build options, and how to install a local (non-system, personal) copy of GCC. Make sure that you are very comfortable with the build process.

Note: you do not have to produce a full bootstrap build.

1a. Become familiar with producing and reviewing dumps produced during the compilation passes (see the -fdump-tree-all and -fdump-tree-pass compiler options and the associated code, as well as -fdump-rtl-all and -fdump-rtl-pass). Important: you do not (and should not) enable these options during the compilation of the GCC compiler itself, but you should test out using them to compile other (smaller!) programs.

2. Learn how to navigate the GCC codebase. Specifically, find out what code implements these aspects of the compiler and how to add to or change the code:

  a. Find the code that controls the compilation passes, and how passes can be added. Test out adding a pass (even if just a dummy pass that produces a confirmation that it's running).

  b. Find out how to access the intermediate representation (IR) of the program being compiled. Ideally, experiment with using the accessor macros (see the GCC documentation) to access and perhaps iterate through the IR. You could, for example, create a pass that performs a scan for all of the function names, and identifies all of the clones of a function.

  c. Find out how dumps are produced during the compilation passes (see the -fdump-tree-all and -fdump-tree-pass compiler options and the associated code, as well as -fdump-rtl-all and -fdump-rtl-pass). Become familiar with producing and reviewing these dumps. Consider creating a dummy pass that produces a useful diagnostic dump, or add a dump to an IR scanning pass.

Submitting your Project Stage 1

Blog your results:

Resources

Specifics: IFUNC

GCC IFUNC documentation:

Specifics: FMV

Current documentation:

1. GCC documentation

2. ARM ACLE documentation

Implementation in GCC

Due Date

Project Stage 2: Clone-Pruning Analysis Pass

Create a pass for the GCC compiler which analyzes the program being compiled and: (a) Identifies one or more functions which have been cloned; (b) Examines the cloned functions to determine if they are substantially the same or different; (c) Emits a message in the GCC diagnostic dump for the pass that indicates if the functions should be pruned (in the case that they're substantially the same) or not pruned (if they are different).

It is recommended that you proceed in steps:

To limit complexity, you may make these assumptions:

  1. There is only one cloned function in a program
  2. There are only two versions (clones) of that function (ignoring the function resolver)

It is important that you position your compiler pass late in the compilation/optimization process so that any significant optimizations, such as vectorization, are performed before your analysis. Ideally, it should be one of the last “tree” (gimple) passes performed.

Note that the gimple code for two identical functions may have slight variations. For example, the names of temporary variables will probably be different (because they are sequentially numbered), and generated labels in the code will probably be different (for the same reason). However, these variations by themselves should not be considered to make the function clones different.
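One way to keep such numbering differences from affecting the comparison is to rename temporaries and labels canonically, in order of first appearance, before comparing. The sketch below does this on statement text purely for illustration; a real pass would more likely work on the IR directly. It assumes that temporaries are printed like _17 and generated labels like <D.2041>, and the function name normalize and the #N placeholder format are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Rewrites each numbered temporary (assumed form: _17) and generated label
// (assumed form: <D.2041>) to "#N", where N is the order of first appearance
// within the function. Named variables such as x_3 are left untouched.
std::vector<std::string> normalize(const std::vector<std::string>& stmts) {
    static const std::regex token(R"(<D\.\d+>|\b_\d+\b)");
    std::map<std::string, std::string> names;  // original token -> canonical name
    std::vector<std::string> out;

    for (const std::string& s : stmts) {
        std::string result;
        std::size_t last = 0;
        for (auto it = std::sregex_iterator(s.begin(), s.end(), token);
             it != std::sregex_iterator(); ++it) {
            const std::smatch& m = *it;
            const std::size_t pos = static_cast<std::size_t>(m.position());
            result.append(s, last, pos - last);  // text before the token
            // emplace keeps the existing entry if the token was seen before.
            auto entry = names.emplace(m.str(), "#" + std::to_string(names.size()));
            result += entry.first->second;
            last = pos + static_cast<std::size_t>(m.length());
        }
        result.append(s, last, std::string::npos);  // text after the last token
        out.push_back(result);
    }
    return out;
}
```

Two statement lists that differ only in the numbers assigned to temporaries and labels normalize to the same text, while a change to an operand or operator still shows up as a difference.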

Two possible approaches to this problem are (1) to iterate through the statements in each function, comparing them statement-by-statement; or (2) to generate some type of hash or signature that uniquely identifies the implementation of the function and which can be compared to the hash/signature of a clone to see if they are different.
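The two approaches can be sketched side by side as follows. This is only an illustration under simplifying assumptions: each function is represented as a list of statement strings that have already had temporary and label numbering differences removed, and the names Statements, first_difference, and signature are hypothetical. The hash used here is 64-bit FNV-1a, chosen only because it is short to write out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Statements = std::vector<std::string>;

// Approach 1: walk both statement lists in step.
// Returns the index of the first differing statement, or -1 if identical.
std::ptrdiff_t first_difference(const Statements& a, const Statements& b) {
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) return static_cast<std::ptrdiff_t>(i);
    }
    // One list is a strict prefix of the other: they differ where it ends.
    if (a.size() != b.size()) return static_cast<std::ptrdiff_t>(common);
    return -1;
}

// Approach 2: reduce the whole function body to one 64-bit signature
// (FNV-1a), which can be stored and compared against a clone's signature.
std::uint64_t signature(const Statements& stmts) {
    std::uint64_t h = 0xcbf29ce484222325ULL;      // FNV-1a offset basis
    const std::uint64_t prime = 0x100000001b3ULL; // FNV-1a prime
    for (const std::string& s : stmts) {
        for (unsigned char c : s) {
            h ^= c;
            h *= prime;
        }
        h ^= static_cast<unsigned char>('\n');    // mark the statement boundary
        h *= prime;
    }
    return h;
}
```

Statement-by-statement comparison can report where the clones diverge, which is useful in a diagnostic dump. A signature is cheaper to compare when there are many clones, but different bodies can in principle collide, so a matching signature could be confirmed with a direct comparison before deciding to prune.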

Please use these specific strings in your dump file:

Where name of base function is the original name of the function that should (or should not) be pruned.

Your solution should build and execute successfully on both x86_64 and aarch64 systems, and should take into account the differences between the FMV implementations on those two architectures (for example, the munging algorithm used to create the suffixes for the cloned functions is different).
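One way to avoid depending on the exact suffix format is to treat everything after the first '.' in a clone's name as an opaque variant suffix. The sketch below assumes that clones are named as the base name, a '.', and a variant suffix, and that the resolver uses the suffix resolver. The example names (foo.avx2, foo._Msve2) are illustrative assumptions rather than a specification, so check the names that actually appear in your dumps on each architecture. CloneName, split_clone_name, and is_resolver are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A clone's name split into the base function name and the variant suffix.
struct CloneName {
    std::string base;
    std::string suffix;  // empty if the name has no '.' (not a clone)
};

// Assumption: the variant suffix starts after the first '.' in the name.
CloneName split_clone_name(const std::string& name) {
    const std::size_t dot = name.find('.');
    if (dot == std::string::npos) return {name, ""};
    return {name.substr(0, dot), name.substr(dot + 1)};
}

// Assumption: the resolver is emitted with the suffix "resolver".
bool is_resolver(const CloneName& c) {
    return c.suffix == "resolver";
}
```

Grouping functions by the base field then collects all variants of one function regardless of which architecture's suffix scheme produced them, and the resolver can be skipped when comparing.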

Demo Files for Creating a GCC Pass

Each of the SPO600 Servers has a file /public/spo600-gcc-pass-demo.tgz which is a tar archive containing modified versions of four files from the current (2024-11-20) GCC development head.

These files are all from the gcc subdirectory in the source tree:

Building GCC with these changes will result in a compiler that can output an additional dump, which can be triggered with -fdump-tree-ctyler (or -fdump-tree-all).

Test Cases for Pruning/No-Pruning

Each of the SPO600 Servers has a file /public/spo600-test-clone.tgz which is a tar archive containing code to build test cases on x86_64 or aarch64 systems. On each architecture, two binaries will be built, each containing one cloned function. Building these binaries with a copy of GCC that contains your analysis pass should result in a decision to prune (for the binary test-clone-arch-prune) or not to prune (for the binary test-clone-arch-noprune), where arch is either x86 or aarch64.

Refer to the README.txt file within the tgz file for more detail.

Recommendations for Building GCC in Stage 2

A reminder that the make utility will rebuild a codebase in as few steps as possible. It does this by comparing the timestamps of the dependencies (inputs) for each target (output) to determine which source files (or other inputs) have changed since the related targets were built, and then rebuilding only those targets.

This can effectively cut the build time for a complex project like GCC from hours to minutes. On my development system (a Ryzen 7735HS with 32 GB RAM), a null rebuild (no source changes - make is checking that everything is up-to-date) takes about 8.3 seconds, and a rebuild with edits to one pass source file takes 23-30 seconds. On the SPO600 Servers the rebuild times are similar.

To take advantage of this capability, do an initial full build of GCC in a separate build directory as usual, then make whatever edits are required to the source code in the source directory. Run make with appropriate options (including -j job values) in the build directory.

Remember to use screen (or a similar program such as tmux) when building on remote systems in case your network connection gets interrupted. It's also a good idea to time every build (prepend time to your make command) and to redirect both stdout and stderr to a log file: use time make … |& tee build.log if you also want to see the output on the terminal, or time make … &> build.log if you don't.

You can do your development work on either architecture, but remember to test your work on both architectures.

Submitting your Project Stage 2

Blog your results:

Due Date