spo600:start

Differences

This shows you the differences between two versions of the page.

spo600:start [2025/02/08 04:31] – [Week 5 - Class II] chrisspo600:start [2025/03/14 10:30] (current) chris
Line 12: Line 12:
 |Week 5|February 3|[[#Week 5 - Class I|Compiler Optimizations & Compiler Internals]]|[[#Week 5 - Class II|Introduction to 64-Bit Systems]]|[[#Week 5 Deliverables|Complete labs 3 & 4]]| |Week 5|February 3|[[#Week 5 - Class I|Compiler Optimizations & Compiler Internals]]|[[#Week 5 - Class II|Introduction to 64-Bit Systems]]|[[#Week 5 Deliverables|Complete labs 3 & 4]]|
 |Week 6|February 10|[[#Week 6 - Class I|64 Bit Assembler Lab]]|[[#Week 6 - Class II|Project Stage 1]]|[[#Week 6 Deliverables|Lab 5, Project blogging]]| |Week 6|February 10|[[#Week 6 - Class I|64 Bit Assembler Lab]]|[[#Week 6 - Class II|Project Stage 1]]|[[#Week 6 Deliverables|Lab 5, Project blogging]]|
-|Week 7|February 17|[[#Week 7 - Class I|Indirect Functions (IFUNC) and Function Multi Versioning (FMV)]]|[[#Week 7 - Class II|Automatic Function Multi Versioning (AFMV)]]|[[#Week 7 Deliverables|Project bogging]]|+|Week 7|February 17|[[#Week 7 - Class I|Project Stage 1]]|[[#Week 7 - Class II|Indirect Functions (IFUNC), Function Multi Versioning (FMV), Automatic Function Multi Versioning (AFMV)]]|[[#Week 7 Deliverables|Project blogging]]|
 |Reading Week|February 24|Study Week||| |Reading Week|February 24|Study Week|||
-|Week 8|March 3|Project Discussion|Single Instruction Multiple Data (SIMD) and Scalable Vector Extensions (SVE and SVE2)|Project Stage 1, Blog posts group 2| +|Week 8|March 3|[[#Week 8 - Class I|Project Discussion; Single Instruction Multiple Data (SIMD)]]|[[#Week 8 - Class II|Single Instruction Multiple Data (SIMD) and Scalable Vector Extensions (SVE and SVE2)]]\\ **Part Async**|[[#Week 8 Deliverables|Project Stage 1, Blog posts group 2]]|
-|Week 9|March 10|Project Discussion|Profiling and Benchmarking|Project blogging|+|Week 9|March 10|[[#Week 9 - Class I|Project Discussion]]|[[#Week 9 - Class II|Project Stage II]]\\ **Async**|[[#Week 9 Deliverables|Project blogging]]|
 |Week 10|March 17|Project Discussion|Algorithm Selection|Project blogging| |Week 10|March 17|Project Discussion|Algorithm Selection|Project blogging|
 |Week 11|March 24|Project Discussion|Paged Memory|Project stage 2, Blog posts group 3| |Week 11|March 24|Project Discussion|Paged Memory|Project stage 2, Blog posts group 3|
Line 289: Line 289:
   * Finish up [[6502 Program Lab|lab 3]] & [[GCC Build Lab|lab 4]]   * Finish up [[6502 Program Lab|lab 3]] & [[GCC Build Lab|lab 4]]
  
 +===== Week 6 =====
 +
 +==== Week 6 - Class I ====
 +
 +=== Video ===
 +{{vimeo>1058406952?full}}
 +
 +=== 64-bit Class Servers ===
 +  * [[SPO600 Servers]]
 +
 +=== 64-bit Assembly Language ===
 +  *  [[Assembler Basics]] (includes instructions on how to use the GNU Assembler)
 +  *  [[Executable and Linkable Format]] (ELF) - current file format for binary files on Linux
 +  *  [[Syscalls]]
 +  *  [[x86_64 Register and Instruction Quick Start]]
 +  *  [[aarch64 Register and Instruction Quick Start]]
 +
 +=== Lab 5 ===
 +  * [[64-bit_assembly_language_lab|64-Bit Assembly Language Lab]] (Lab 5)
 +
 +==== Week 6 - Class II ====
 +
 +=== Video ===
 +{{vimeo>1058652710?full}}
 +
 +==== Week 6 Deliverables ====
 +  * Complete [[64-bit_assembly_language_lab|Lab 5]] and blog about it
 +
 +===== Week 7 =====
 +
 +==== Week 7 - Class I ====
 +
 +=== Video ===
 +{{vimeo>1058741484?full}}
 +
 +=== Project Stage 1 ===
 +  * [[2025 Winter Project#Project Stage 1: Create a Basic GCC Pass|SPO600 2025 Winter Project - Stage 1]]
 +
 +==== Week 7 - Class II ====
 +
 +=== Video ===
 +{{vimeo>1059089464?full}}
 +
 +=== Runtime Codepath Selection ===
 +
 +1. **IFUNC** - The ifunc capability allows a program to provide multiple implementations of a function, and to use a "resolver function" which is run once at program initialization to determine which implementation will be used. The resolver function returns a pointer to the selected function, which is used from that point on for the life of the process. This capability is very flexible but requires the programmer to create:
 +  * multiple implementations of the desired function with different names
 +  * the resolver function (which can select between the implementations based on any criteria, but usually selects based on the hardware capabilities of the runtime system)
 +  * a prototype that ties together the desired target name and the resolver function
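The three pieces listed above can be sketched in C roughly as follows. This is a minimal illustration and not code from the course materials: the names ''scale'', ''scale_generic'', ''scale_avx2'', and ''resolve_scale'' are hypothetical, and the sketch assumes GCC on a GNU/Linux (ELF, glibc) system. The AVX2 variant is only built on x86; on other architectures the resolver always returns the generic version.

```c
#include <stddef.h>
#include <stdint.h>

/* Function type shared by every implementation (hypothetical example). */
typedef void scale_fn(int16_t *buf, size_t n, int16_t factor);

/* Implementation 1: generic version, usable on any CPU. */
static void scale_generic(int16_t *buf, size_t n, int16_t factor)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (int16_t)(buf[i] * factor);
}

#if defined(__x86_64__) || defined(__i386__)
/* Implementation 2: same source, but the compiler may use AVX2 here. */
__attribute__((target("avx2")))
static void scale_avx2(int16_t *buf, size_t n, int16_t factor)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (int16_t)(buf[i] * factor);
}
#endif

/* Resolver: runs once, early in program startup, and returns a pointer
 * to the implementation that will be used for the life of the process. */
static scale_fn *resolve_scale(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();               /* required before cpu_supports in a resolver */
    if (__builtin_cpu_supports("avx2"))
        return scale_avx2;
#endif
    return scale_generic;
}

/* Prototype tying the public name "scale" to the resolver. */
void scale(int16_t *buf, size_t n, int16_t factor)
    __attribute__((ifunc("resolve_scale")));
```

Callers simply call ''scale(buf, n, factor)''; they never see the resolver or the individual implementations.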
 +
 +2. **FMV** - The GCC compiler includes a function multiversioning capability for x86_32, x86_64, powerpc, and aarch64 architectures (with slightly different implementations). FMV is similar to ifunc, and can be used in two different ways:
 +  * With manually-written functions:
 +    * each function has the same name, and an attribute which specifies which architectural variant that function version is to be used on
 +    * the resolver function is provided automatically by the compiler
 +
 +  * With function cloning:
 +    * only one version of the function is provided, and an attribute specifies the list of architectural variants. The function is automatically cloned by the compiler with one function clone for each architectural variant, and each clone is optimized for a specific variant.
 +    * the resolver function is provided automatically by the compiler
 +
 +This requires fewer code changes than ifunc, but still requires that the programmer state the architectural variants that will be targeted. The programmer also needs to know (or guess!) which functions would benefit from multiversioning.
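As a rough illustration of the function cloning approach, the sketch below marks a single function with ''target_clones'' so that the compiler generates the clones and the resolver itself. The function name ''sum_of_products'' and the macro ''CLONE_FOR_X86'' are hypothetical, and the choice of variants ("sse4.2", "avx2") is just an assumption for the example. The attribute is only applied on x86_64 GNU/Linux; elsewhere the macro expands to nothing and a single ordinary function is built.

```c
#include <stddef.h>

/* Apply the cloning attribute only where this sketch assumes it is supported. */
#if defined(__x86_64__) && defined(__gnu_linux__)
# define CLONE_FOR_X86 __attribute__((target_clones("default", "sse4.2", "avx2")))
#else
# define CLONE_FOR_X86
#endif

/* One source version; the compiler emits one clone per listed variant
 * plus a resolver that picks among them at program startup. */
CLONE_FOR_X86
long sum_of_products(const int *a, const int *b, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)a[i] * b[i];
    return total;
}
```

Compared with the ifunc approach, the only source change is the attribute on the function; the list of variants still has to be chosen by the programmer.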
 +
 +3. **AFMV** - This "automatic function multi-versioning" capability does not exist yet, and is what we're working towards building. AFMV should work like FMV function cloning, but without any source code changes; instead, a compiler option will be used to specify the architectural variants of interest, and any function that would benefit from function multi-versioning will be automatically cloned. It is proposed that AFMV operate in this fashion:
 +  * all functions will be cloned and subject to the compiler's optimization process
 +  * any function clones which are fundamentally the same after optimization will be pruned back to a single implementation
 +
 +=== Resources ===
 +
 +  * [[Building GCC]] guide page on this wiki
 +  * [[https://gcc.gnu.org/onlinedocs/|GCC Documentation]]
 +    * See especially:
 +      * The [[https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gccint/|GCC Internals Manual]]
 +        * Take particular note of the section on Contributing to GCC.
 +      * [[https://gcc.gnu.org/install/|Installing GCC]], the GCC project's guide to compiling and installing the GCC compiler from source code
 +  * [[https://gcc.gnu.org/git.html|GCC Git]]
 +    * Note that GCC is [[https://github.com/gcc-mirror/gcc|mirrored]] onto GitHub, but the main activity is conducted in the GCC git repository documented in the link above. The GitHub mirror is provided for the convenience of GitHub users, but the GitHub workflow (including approaches such as forking and pull requests) is not used by the GCC project; instead, they discuss and review patches through an email-based workflow.
 +
 +== Specifics: IFUNC ==
 +
 +GCC IFUNC documentation:
 +  * [[https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-function-attribute-3095|GCC Manual]]
 +  * [[https://sourceware.org/glibc/wiki/GNU_IFUNC|glibc Wiki]]
 +
 +== Specifics: FMV ==
 +
 +Current documentation:
 +
 +1. [[https://gcc.gnu.org/onlinedocs/gcc/Function-Multiversioning.html|GCC documentation]]
 +  * Mentions that FMV is only implemented for i386 (aka x86_32) - now false as it's also implemented for x86_64, Power (PPC64), and aarch64
 +  * Does not mention ''target_clones'' syntax
 +
 +2. [[https://arm-software.github.io/acle/main/acle.html#function-multi-versioning|ARM ACLE documentation]]
 +  * Does not talk about the current state of implementation
 +  * Mentions that FMV may be disabled at compile time by a compiler flag, but this flag is not documented and does not appear to be implemented
 +  * The macro ''<nowiki>__</nowiki>HAVE_FEATURE_MULTI_VERSIONING'' (or ''<nowiki>__</nowiki>FEATURE_FUNCTION_MULTI_VERSIONING'' or ''<nowiki>__</nowiki>ARM_FEATURE_FUNCTION_MULTIVERSIONING'') does not appear to be defined (as of GCC 14.2.1)
 +
 +Implementation in GCC
 +  * Implemented and tested in (at least) x86_64, PowerPC64, and AArch64
 +  * I did not test the PowerPC version
 +  * Testing performed with GCC 14.0.1 20240223 with limited testing on 14.2.1 20240912.
 +
 +  * On x86:
 +    * Syntax to manually specify a function target: ''<nowiki>__attribute__((target("nnn")))</nowiki>'' - where ''nnn'' may take the form of "default", or "feature" e.g., "sse4.2", or "feature,feature" e.g., "sse4.2,avx2", or it may take the form "arch=archlevel" e.g., "arch=x86-64-v3" or "arch=atom"
 +    * target_version is not accepted as an alternative to target attribute
 +    * Syntax to manually specify cloning: ''<nowiki>__attribute__((target_clones("nnn1", "nnn2" [...])))</nowiki>''
 +    * Works in both the C and C++ frontends
 +
 +  * On AArch64:
 +    * Current support landed Dec 16, 2023; see commit 0cfde688e21 (and the commit messages) in the GCC Git repository (https://gcc.gnu.org/git.html) or GitHub read-only mirror of that repository (https://github.com/gcc-mirror/gcc), or the corresponding discussion on the [[https://gcc.gnu.org/pipermail/gcc-patches/|gcc-patches mailing list]]. There have been several updates and enhancements since it first landed.
 +      * Syntax to manually specify a function target: ''<nowiki>__attribute__((target_version("nnn")))</nowiki>'' - where ''nnn'' may take the form of "default", or "feature" e.g., "sve", or "feature+feature" e.g., "sve+sve2" (Note: in some earlier versions of GCC, a plus-sign was required at the start of the feature list, e.g., "+sve" instead of "sve". This was changed by gcc 14). Note the use of the attribute ''target_version'' as opposed to ''target'' (as used on x86) which is compliant with the ACLE specification; it appears possible to use ''target'' in some versions of the GCC compiler (apparently with the plus-sign at the start of the feature-list?). Note that the "arch=nnn" format is not supported (and probably should be).
 +      * Syntax to manually specify cloning: ''<nowiki>__attribute__((target_clones("nnn", "nnn" [...])))</nowiki>'' - note that contrary to the ACLE documentation, there is no automatic "default" argument - the first argument supplied should be "default"
 +      * Manually specified function target (initially) works in the C++ frontend only, but automatic cloning appears to work in both C and C++. Note that most C code can be compiled with the C++ frontend, except for some recent C enhancements not understood by C++ as well as some C++ keywords that are not reserved in C
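The AArch64 cloning syntax described above might look like the following sketch. It follows the notes above in listing "default" explicitly as the first argument and in writing the feature names without a leading plus-sign, which assumes GCC 14 or later. The names ''add_arrays'' and ''CLONE_FOR_AARCH64'' are hypothetical, and the guard is an assumption made so that the same file still builds (without cloning) on other compilers and architectures.

```c
#include <stddef.h>
#include <stdint.h>

/* Only request clones on AArch64 with GCC 14+ (assumed minimum for this syntax). */
#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 14
# define CLONE_FOR_AARCH64 __attribute__((target_clones("default", "sve", "sve2")))
#else
# define CLONE_FOR_AARCH64
#endif

/* A simple vectorizable loop; each clone is optimized for its own variant. */
CLONE_FOR_AARCH64
void add_arrays(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Because this uses cloning rather than manually written ''target_version'' functions, it should be usable from the C frontend as well as C++, per the note above.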
 +
 +==== Week 7 Deliverables ====
 +  * Work on your Project Stage 1 and blog about it.
 +
 +===== Week 8 =====
 +
 +==== Week 8 - Class I ====
 +
 +=== Video ===
 +
 +**There are some technical issues with camera focus on the video for this week.** My apologies for the low quality!
 +
 +{{vimeo>1063430031?full}}
 +
 +=== SIMD Examples ===
 +
 +The sound volume scaling examples mentioned in the video may be found in the file ''/public/spo600-volume-examples.tgz'' on either of the [[SPO600 Servers]].
 +
 +==== Week 8 - Class II ====
 +
 +=== Video ===
 +
 +{{vimeo>1063482268?full}}
 +
 +=== SVE/SVE2 Examples ===
 +
 +For some SVE/SVE2 example code, see ''/public/spo600-sve-sve2-ifunc-examples.tgz'' on aarch64-001.spo600.cdot.systems. This archive contains:
 +
 +  * ''spo600/examples/sve2-test'' - Example SVE2 code, in vectorizable C, inline assembler, and C with [[Compiler intrinsics|intrinsics]]
 +    * ''spo600/examples/sve2-test/sve-width'' - Example inline assembler code and C intrinsic code for determining the width of the SVE/SVE2 vectors on a given system
 +  * ''spo600/examples/ifunc'' - Test/demo code using ifunc with 3 versions of a dummy function (advanced SIMD, SVE, and SVE2)
 +  * (ignore the directory ''spo600/examples/autoifunc'')
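For orientation, determining the SVE vector width with C intrinsics can be sketched as below. This is an independent illustration, not the contents of the archive: the function name ''sve_vector_bits'' is hypothetical, and the sketch assumes the ACLE header ''arm_sve.h'' and its ''svcntb()'' intrinsic, which are only available when the code is compiled with SVE enabled (for example with an ''-march'' value that includes ''+sve''). When SVE is not enabled at compile time, the function returns 0.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
# include <arm_sve.h>
#endif

/* Returns the SVE vector width in bits, or 0 if this file was
 * built without SVE support enabled. */
uint64_t sve_vector_bits(void)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntb() gives the number of bytes in one SVE vector. */
    return svcntb() * 8;
#else
    return 0;
#endif
}
```

Note that a binary built with SVE enabled must be run on a system that actually implements SVE; otherwise the SVE instruction will fault.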
 +
 +==== Week 8 Deliverables ====
 +
 +  * Complete your project stage 1, and blog posts group 2, by Sunday night (March 9 at 11:59 pm).
 +
 +===== Week 9 =====
 +
 +==== Week 9 - Class I ====
 +
 +=== Video ===
 +  * Edited summary video pending
 +
 +==== Week 9 - Class II ====
 +
 +=== Project Stage II ===
 +  * Refer to your email for Project Stage I feedback.
 +  * See the [[2025_winter_project#|Project Page]] for Stage II details.
 +
 +==== Week 9 Deliverables ====
 +  * Start on your Project Stage II and blog about your work.
  
spo600/start.1738989094.txt.gz · Last modified: 2025/02/08 04:31 by chris