====== SPO600 2024 Summer Project ======

===== Goal =====

The goal of this project is to add a functioning proof-of-concept prototype of Automatic Function Multi-Versioning (AFMV) capability to the GNU Compiler Collection (GCC) for AArch64 systems, building on previous work.

===== Project Stage 1: Preparation =====

1. **Become familiar with the GCC build process.** Build the current development version of GCC on AArch64 and x86 platforms. Get to know how long a full build takes, how to change the build options, and how to install a local (non-system, personal) copy of GCC.

2. **Learn how to navigate the GCC codebase.** Specifically, find out what code implements these aspects of the compiler and how to add to or change the code:

a. Find the code that controls the compilation passes, and how passes can be added.

b. Find the code that controls the argument parsing. Add a dummy argument and experiment with messages for the user that describe how and when that argument can be used.

c. Find out where argument information is stored and how it can be accessed in code.

d. Find out how dumps are produced during the compilation passes (in particular, during the tree passes). Become familiar with producing and reviewing these dumps. Create a dummy pass that produces a useful diagnostic dump.

==== Submitting your Project Stage 1 ====

Blog your results:
  * Include detailed results for the items above. Be specific and conclusive in your reporting, and include detail such as build options and build time, specific files and directories identified as you learned to navigate the code, and the specific code used in your experimentation.
  * Enable replication of your results. For example, you could provide links to specific content in a Git repository of your experiments. Avoid presenting code as screenshots whenever possible.
  * Add your reflections on the experience - what you learned, what you found interesting, what you found challenging, and what gaps you have identified in your knowledge and how you can address those gaps. Identify the types of tasks that are most attractive to you (and why) - some people love writing documentation, others like to perform testing, and others prefer to write and debug code (and have preferences about what type of code they like to work with).
  * I recommend that you blog about your work in multiple sections - blog as you go rather than waiting and writing one massive blog post at the end of each stage.

==== Resources ====

  * [[https://gcc.gnu.org/onlinedocs/|GCC Documentation]]
    * See especially:
      * The [[https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gccint/|GCC Internals Manual]]
        * Take particular note of the section on Contributing to GCC.
      * [[https://gcc.gnu.org/install/|Installing GCC]], the GCC project's guide to compiling and installing the GCC compiler from source code
  * [[https://gcc.gnu.org/git.html|GCC Git]]
    * Note that GCC is [[https://github.com/gcc-mirror/gcc|mirrored]] onto GitHub, but the main activity is conducted in the GCC git repository documented in the link above. The GitHub mirror is provided for the convenience of GitHub users, but the GitHub workflow (including approaches such as forking and pull requests) is not used by the GCC project; instead, patches are discussed and reviewed through an email-based workflow.

==== Due Date ====

  * <del>May 31</del> **June 6** for Stage 1

===== Project Stage 2: Implementation =====

== The Problem ==

There are multiple versions of processors of every architecture currently on the market. You can see this when you go into a computer store such as Canada Computers or Best Buy -- there are laptops and desktops with processors ranging from Atoms and Celerons to Ryzen 4/7/9 and Core i3/i5/i7/i9 processors, and workstations and servers with processors ranging up to Xeon and Epyc/Threadripper devices.
Similarly, cellphones range from devices with Cortex-A35 cores through Cortex-X3 cores. This wide range of devices supports a diverse range of processor features.

Software developers (and vendors) are caught between supporting only the latest hardware, which limits the market they can sell to, and harming the performance of their software by not taking advantage of recent processor improvements. Neither option is attractive for a software company wishing to be competitive.

== The Goal ==

To take good advantage of processor features with minimal effort by the software developers.

== Three Solutions ==

There are three solutions in various stages of preparation, each of which builds upon the previous solutions:

  - IFUNC - Indirect Functions - This is a solution provided by the development toolchain (compiler, linker, libraries) but which is largely manual for the software developer. The developer provides multiple alternate versions of performance-critical functions which are targeted at different micro-architectural levels, plus a resolver function that selects between the implementations at runtime. Note that IFUNC is the only solution which enables a resolver function that takes into account factors other than the micro-architectural level of the processor. For example, a resolver function could select between alternate functions based on available memory, storage performance, or the speed of the network connection.
  - FMV - Function Multi-Versioning - This is a solution that is also supported by the development toolchain but which involves slightly less manual work for the developer. There are two levels of FMV:
    - FMV with Manual Alternate Functions - The programmer provides the alternate functions and uses function attributes to specify the micro-architectural level at which each is targeted. The resolver function for each group of alternate functions is automatically generated.
    - FMV with Cloned Functions - The programmer provides one version of the function and uses function attributes to specify that clones of that function are to be built, and the micro-architectural targets for each clone. The resolver function for each group of cloned functions is automatically generated. The only difference between the cloned functions is the micro-architectural optimizations that are applied by the compiler. Note that there is nothing to ensure that the clones are actually any better than, or in fact different from, each other.
  - AFMV - Automatic Function Multi-Versioning - **This is what we're working on** - This is effectively FMV with Cloned Functions, but the cloning is controlled from the command line rather than using function attributes. This has the advantage that no source changes are required. Every function in the program is cloned, and after the various optimization passes have been applied, the cloned functions are analyzed. If the functions are different, they are kept, but if they are identical, they are removed, and only the default version of the function is used.

== Specifics: IFUNC ==

GCC IFUNC documentation:
  * [[https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-function-attribute-3095|GCC Manual]]
  * [[https://sourceware.org/glibc/wiki/GNU_IFUNC|glibc Wiki]]

== Specifics: FMV ==

Current documentation:

1. [[https://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html|GCC documentation]]
  * Mentions that FMV is only implemented for i386 (aka x86_32) - now false as it's also implemented for x86_64, Power (PPC64), and aarch64
  * Does not mention ''target_clones'' syntax

2.
[[https://arm-software.github.io/acle/main/acle.html#function-multi-versioning|ARM ACLE documentation]]
  * Does not talk about the current state of implementation
  * Mentions that FMV may be disabled at compile time by a compiler flag, but this flag is not documented
  * The macro ''__HAVE_FEATURE_MULTI_VERSIONING'' (or ''__FEATURE_FUNCTION_MULTI_VERSIONING'' or ''__ARM_FEATURE_FUNCTION_MULTIVERSIONING'') does not appear to be defined

Implementation in GCC
  * Implemented and tested in (at least) x86_64, PowerPC64, and AArch64
    * I did not test the PowerPC version
  * Testing performed with GCC 14.0.1 20240223
  * On x86:
    * Syntax to manually specify a function target: ''__attribute__((target("nnn")))'' - where ''nnn'' may take the form "default", or "feature", e.g., "sse4.2", or "feature,feature", e.g., "sse4.2,avx2", or "arch=archlevel", e.g., "arch=x86-64-v3" or "arch=atom"
    * ''target_version'' is not accepted as an alternative to the ''target'' attribute
    * Syntax to manually specify cloning: ''__attribute__((target_clones("nnn1", "nnn2" [...])))''
    * Works in both the C and C++ frontends
  * On AArch64:
    * Current support landed Dec 16, 2023; see commit 0cfde688e21 (and the commit messages) in the GCC Git repository (https://gcc.gnu.org/git.html) or GitHub read-only mirror of that repository (https://github.com/gcc-mirror/gcc), or the corresponding discussion on the [[https://gcc.gnu.org/pipermail/gcc-patches/|gcc-patches mailing list]]. There have been several updates and enhancements since it first landed.
    * Syntax to manually specify a function target: ''__attribute__((target_version("nnn")))'' - where ''nnn'' may take the form "default", or "feature", e.g., "sve", or "feature+feature", e.g., "sve+sve2" (Note: in some earlier versions of GCC, a plus-sign was required at the start of the feature list, e.g., "+sve" instead of "sve". This changed in GCC 14).
    * Note the use of the attribute ''target_version'', which is compliant with the ACLE specification, as opposed to ''target'' (as used on x86). Note that the "arch=nnn" format is not supported (and probably should be).
    * Syntax to manually specify cloning: ''__attribute__((target_clones("nnn", "nnn" [...])))'' - note that contrary to some of the documentation, there is no automatic "default" argument - the first argument supplied should be "default"
    * Manually specified function target (initially) works in the C++ frontend only, but automatic cloning appears to work in both C and C++. Note that most C code can be compiled with the C++ frontend, except for code that relies on recent C enhancements not understood by C++ or that uses C++ keywords (which are not reserved in C) as identifiers

==== Tasks ====

See also the [[SPO600 2024 Summer Participants]] page.

^ # ^ Name ^ Description ^ Notes ^ Lead ^
| 1 | Command-line Parsing | Parse the GCC command line to pick up AFMV options, process the version list to validate the architectural feature specification | ''.opt'' file writing | Marco Siu |
| 2 | arch= Arguments | The current GCC AArch64 FMV capability accepts versions that are identified by feature flags (such as "sve2") but does not accept "arch=" arguments such as "arch=armv9-a" (those types of arguments are accepted by the x86 FMV implementation). Add this functionality. |  | Connor Squires |
| 3 | Apply FMV cloning to functions automatically | When the appropriate command-line options are provided, the compiler should automatically clone //all// functions, as if the ''target_clones'' attribute was specified. |  | Yukti Manoj Mulani |
| 4 | Produce an error message if AFMV and FMV are used together | Produce an error if the compiler is invoked with AFMV command-line options //and// there are FMV attributes specified in the code. |  | Mara Perkons |
| 5 | Prune Cloned Functions (1) | Remove any AFMV-created clone functions that do not provide any significant benefit or differentiation - Task 1: determine which function(s) to prune. |  | Sangwoo Shin |
| 6 | Prune Cloned Functions (2) | Remove any AFMV-created clone functions that do not provide any significant benefit or differentiation - Task 2: perform function pruning. |  | Wai Hing William Tse |
| 7 | Diagnostic Output | Provide diagnostic output (when activated by -fdump-...-... command-line options). |  | Anatolii Hryhorzhevskyi |
| 8 | Git Wrangler | Manage the repository. | This task includes rebasing as upstream changes and managing code reviews. | Rigels Hasani |
| 9 | Update Documentation 1 | Update the existing GCC IFUNC and FMV documentation (all archs) | Technical writing. | Shubh Jani |
| 10 | Create AFMV Documentation | Create documentation for the AFMV feature. | Technical writing. | Humaira Shaikh |
| 11 | Create Tests | Create a suite of tests for the AFMV capability. (This is in addition to individual tests that the various task owners will prepare). | Requires understanding the existing test framework and writing tests. | Zijun Li |
| 12 | Test AFMV Implementation | Use the existing test suite(s) to verify that the aarch64 changes are not introducing regressions on aarch64 or x86-64. | Requires understanding and using the existing test framework; may involve scripting. |  |

==== What You Need to Do ====

Make as much progress on the task identified above as possible! The goal is to have a proof-of-concept implementation.

Isolate your task from the other tasks. For example, if your code depends on command-line arguments, __do not__ wait for the argument parsing task to be completed - instead, you can hard-code some assumed command-line arguments for testing purposes. We'll connect the various tasks together in the next Stage of the project.

Be sure to commit & push your code into a branch of the class GitHub repository.
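
As a rough illustration of hard-coding assumed arguments while the command-line parsing task is still in progress, a stand-in could look something like the sketch below. The names ''afmv_targets'' and ''afmv_cnt'' are borrowed from the class discussion of Task 1, but their types, the sample values, and the helper ''afmv_target_requested_p'' are assumptions made for this example, not the real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the values that the command-line parsing
   task is expected to provide later.  The names follow the class
   discussion; the types and sample values are assumptions.  */
static const char *const afmv_targets[] = { "default", "sve", "sve2" };
static const size_t afmv_cnt = sizeof afmv_targets / sizeof afmv_targets[0];

/* Hypothetical helper: return nonzero if NAME is one of the
   hard-coded AFMV targets, zero otherwise.  */
int
afmv_target_requested_p (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < afmv_cnt; i++)
    if (strcmp (afmv_targets[i], name) == 0)
      return 1;
  return 0;
}
```

Code that consumes the target list can call the helper during testing; once the real parsing code is available, only the two definitions at the top should need to be replaced.
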
Follow the GCC project's coding standards and practices.

==== Information about Writing Messages ====

If you are outputting information to the user (such as warnings or error messages), here is some additional information which may be useful:

=== gettext ===

Throughout the GCC code, you will see strings wrapped in a call to a function named with a single underscore, like this: ''_("message")''. For example, line 3730 of ''gcc/gcc.cc'':

  printf (_("Usage: %s [options] file...\n"), progname);

Or line 2004 of ''gcc/opts.cc'':

  description = _("The following options are target specific");

This unusual syntax is part of the gettext framework, which is a core part of the internationalization (i18n) code in GCC. That infrastructure enables localization (l10n) -- the ability to localize the software for use with varying languages (message text, language direction) and formats for numbers, dates, times, and currencies. For more information, see the [[https://www.gnu.org/software/gettext/manual/html_node/index.html|documentation for gettext]].

However, it is sufficient in most cases to simply ensure that every user-facing string literal is wrapped in the underscore function; the i18n tools will take care of most of the rest.

=== User Experience ===

For advice on writing good messages (diagnostics, warnings, errors), see the [[https://gcc.gnu.org/onlinedocs/gccint/User-Experience-Guidelines.html|User Experience Guidelines]] chapter of the [[https://gcc.gnu.org/onlinedocs/gccint|GCC Internals]] documentation.

==== Previous Work ====

The previous semester made some preliminary progress on some of these tasks. You can find the blog posts and the GitHub commits via these resources:
  * [[spo600_2024_winter_participants|SPO600 Winter 2024 Participants]]
  * [[https://github.com/seneca-cdot/gcc|GitHub repository - see individual branches]]

==== Submitting your Project Stage 2 ====

Blog your results:
  * Detail the results of your implementation. Include links to your code in the class GitHub repository.
  * Enable replication of your results. For example, you could provide links to specific content in a Git repository of your experiments. Avoid presenting code as screenshots whenever possible.
  * Add your reflections on the experience - what you learned, what you found interesting, what you found challenging, and what gaps you have identified in your knowledge and how you can address those gaps.
  * I recommend that you blog about your work in multiple sections - blog as you go rather than waiting and writing one massive blog post at the end of each stage.

==== Due Date ====

  * June 15 for Stage 2

===== Project Stage 3: Integrate, Tidy, & Wrap =====

==== What You Need to Do ====

Wrap up your project:
  * Deal with any loose ends from Stage 2
  * Integrate your code with any related code
  * Test and document what you've done

The goal is to get a working proof-of-concept of the GCC AFMV feature; please keep this in mind as you prioritize your work!
[[https://github.com/seneca-cdot/gcc|Class code repository]]:
  * Minimum: your branch must include your code
  * Target: your code is merged with the other compatible branches and is available in the master branch of the Git repository

This is a summary of the discussion that took place in the June 17th class regarding status and next steps for each task:

^ # ^ Name ^ Lead ^ Status ^ Consumer (tasks we get data from) ^ Provider (tasks we give data to) ^ Affects (tasks affected) ^ Next Steps - Development ^ Next Steps - Connecting Code / Integrating ^ Description ^
| 1 | Command-line Parsing | Marco Siu | Working for 4 values (sve, sve2, simd, neon) |  | 3, 4, 7?, 11 | 10 | Validate all ''target_clones'' options | Feed values to 3, 4, 7?, 11 - Initializes afmv_targets array and afmv_cnt? | Parse the GCC command line to pick up AFMV options, process the version list to validate the architectural feature specification |
| 2 | arch= Arguments | Connor Squires | Picking up the values but not validating |  |  | 1, 10 | Get working ''target_clones'' |  | The current GCC AArch64 FMV capability accepts versions that are identified by feature flags (such as "sve2") but does not accept "arch=" arguments such as "arch=armv9-a" (those types of arguments are accepted by the x86 FMV implementation). Add this functionality. |
| 3 | Apply FMV cloning to functions automatically | Yukti Manoj Mulani | Throwing error msgs re target value | 1 | 5 |  | Resolve error msgs | Accept data from 1 | When the appropriate command-line options are provided, the compiler should automatically clone all functions, as if the ''target_clones'' attribute was specified. |
| 4 | Error message: AFMV && FMV | Mara Perkons | Working with dummy afmv arg | 1 |  |  |  | Accept data from 1 | Produce an error if the compiler is invoked with AFMV command-line options and there are FMV attributes specified in the code. |
| 5 | Prune Cloned Functions (1) - detect | Sangwoo Shin | ??? | (1), (3) | 6 |  |  |  | Remove any AFMV-created clone functions that do not provide any significant benefit or differentiation - Task 1: determine which function(s) to prune. |
| 6 | Prune Cloned Functions (2) - perform prune | Wai Hing William Tse | Accepts array of function names & prunes, tests failed | 5 |  | 7 | Get tests passing |  | Remove any AFMV-created clone functions that do not provide any significant benefit or differentiation - Task 2: perform function pruning. |
| 7 | Diagnostic Output | Anatolii Hryhorzhevskyi | Not dumping | 6 |  |  | Get dumps working | Integrate into 6 | Provide diagnostic output (when activated by -fdump-...-... command-line options). |
| 8 | Git Wrangler | Rigels Hasani | 5 branches updated |  |  |  | Rebase | Merge | Manage the repository. |
| 9 | Update Documentation 1 | Shubh Janis | ??? | -2 |  |  | (Complete) |  | Update the existing GCC IFUNC and FMV documentation (all archs) |
| 10 | Create AFMV Documentation | Humaira Shaikh | ??? | (*, 1, 2, 4, 7) |  |  | Update with code changes |  | Create documentation for the AFMV feature. |
| 11 | Create Tests | Zijun Li | ??? | 1, (2?), 4, (5/6?), 7) |  |  | Update with code changes |  | Create a suite of tests for the AFMV capability. (This is in addition to individual tests that the various task owners will prepare). |
| 12 | Test AFMV Implementation |  |  |  |  |  |  |  | Use the existing test suite(s) to verify that the aarch64 changes are not introducing regressions on aarch64 or x86-64. |

=== Blogging ===

  * Provide an overview of the final state of your project code:
    * Location in class Git repository (branch)
    * Integration with other branches
    * What works, what limitations exist, what doesn't work, and what is not tested
  * Provide detailed reflections on the project work and the course

==== Due Date ====

  * June 21 for Stage 3