====== SPO600 - 2024 Fall ======

This is the SPO600 course schedule. It's a live document and will be revised as the course proceeds. Each topic will be linked to notes at the end of this page as the course proceeds.

^Week^Week of ...^Class I (Tuesday - Synchronous on Zoom)^Class II (Thursday - Asynchronous)^Deliverables^
|Week 1|September 2|[[#Week 1 - Class I|Introduction, Course Setup (incl SSH Keys)]]|[[#Week 1 - Class II|Binary Representation of Data, Introduction to Computer Architecture]]|[[#Week 1 Deliverables|Setup Communication Tools]]|
|Week 2|September 9|[[#Week 2 - Class I|Introduction to 6502 Assembly Language]]|[[#Week 2 - Class II|6502 Math and Flow Control]]|[[#Week 2 Deliverables|Lab 1]]|
|Week 3|September 16|[[#Week 3 - Class I|6502 Math]]|[[#Week 3 - Class II|Compiler Internals and Compiler Flags]]|[[#Week 3 Deliverables|Lab 2]]|
|Week 4|September 23|[[#Week 4 - Class I|6502 Strings]]|[[#Week 4 - Class II|Compiler Optimizations]]|[[#Week 4 Deliverables|Lab 3]]|
|Week 5|September 30|64-Bit Assembler|SIMD, SVE, SVE2 & IFUNC, FMV, AFMV|Lab 4, Blog posts group 1|
|Week 6|October 7|Navigating the GCC Codebase|GCC IR Accessors|Project blogging|
|Week 7|October 14|Project Discussion|GCC Dump Infrastructure|Project blogging|
|Reading Week|October 21|Reading Week|||
|Week 8|October 28|Project Discussion|Profiling|Project Stage 1, Blog posts group 2|
|Week 9|November 4|Strategies for AFMV Paring|Paged Memory Concepts|Project blogging|
|Week 10|November 11|Project Discussion|Advanced Memory Concepts|Project blogging|
|Week 11|November 18|Project Discussion|Memory Access in Multicore Systems|Project Stage 2, Blog posts group 3|
|Week 12|November 25|Strategies for Landing AFMV|Project Recommendations|Project blogging|
|Week 13|December 2|Project Discussion|Project Recommendations|Project blogging|
|Week 14|December 9|Course Wrap-Up|//No class//|Project Stage 3, Blog posts group 4|

===== Current Participants =====

See the [[SPO600 2024 Fall Participants]] page.

===== Week 1 =====

==== Week 1 - Class I ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EZkHTibCuSJIqRD3jNameQoBnUjppnLMeXdl3LLeNKOPUA|Edited summary video]]

**Note:** these summary videos are **no substitute** for attending class in person! They do not include quizzes and quiz answer discussion, group exercises, or group discussion. It may take several days to process and edit a video before it is made available, and a video may not record properly and may not be made available at all. **Do not** rely only on the summary videos!

=== General Course Information ===

  * Course resources are linked from this wiki.
  * Coursework is submitted by blogging. The only exception to this is quizzes.
  * Quizzes will be short (equivalent to ~1 page) and will be held without announcement at the start of any synchronous (Tuesday) class. The quiz answers will be discussed immediately after submission. There is no opportunity to re-take a missed quiz, but your lowest three quiz scores will not be counted, so do not worry if you miss one or two.
    * Students with test accommodations: alternate quizzes can be made available via the Test Centre. Communicate with your professor for details.
  * Course marks (see Weekly Schedule for dates):
    * 60% - Project Deliverables in three phases (15%, 20%, 25%)
    * 20% - Communication (Blog writing, in four phases, 5% each)
    * 10% - Labs
    * 10% - Quizzes (lowest 3 quiz scores not counted)

=== About SPO600 Classes ===

  * Online Classes
    * Synchronous: Tuesday 1:30-3:15 pm. Attendance at the Tuesday classes is mandatory.
      * Summary video(s) may be posted on a best-effort basis; technical issues with the recording may prevent posting in some cases. The summary video will be edited and will not include all of the content covered in the synchronous session. The link(s) to the video(s) will be posted on this page under the corresponding date. //Do not// rely on the summary videos.
    * Asynchronous: Resources will be released by Thursday 1:30 pm. You are expected to review and learn this material before the following class (the next Tuesday).
  * Please attend the online sessions and take lots of notes.

=== Introduction to the Problems ===

== Porting and Portability ==

  * Most software is written in a **high-level language** which can be compiled into [[Machine Language|machine code]] for a specific computer architecture. In many cases, this code can be compiled or interpreted for execution on multiple computer architectures - this is called 'portable' code. However, a lot of existing code contains architecture-specific fragments that make assumptions about the architecture, resulting in architecture-specific high-level or [[Assembly Language]] code.
  * Reasons that code is architecture-specific:
    * System assumptions that don't hold true on other platforms
      * Variable or [[Word|word]] size
      * [[Endian|Endianness]]
      * Memory ordering
      * Specific machine details, such as memory page size, stack order, or edge-case floating-point behavior
    * Code that takes advantage of platform-specific features
  * Reasons for writing code in machine-specific Assembly Language include:
    * Performance
    * [[Atomic Operation|Atomic Operations]]
    * Direct access to hardware features, e.g., CPUID registers
  * Most of the historical reasons for using assembler are no longer valid. Modern compilers can out-perform most hand-optimized assembly code, atomic operations can be handled by libraries or [[Compiler Intrinsics|compiler intrinsics]], and most hardware access should be performed through the operating system or appropriate libraries.
  * A new architecture has appeared: [[aarch64_register_and_instruction_quick_start|AArch64]], which is a 64-bit execution state introduced as part of ARM architecture version 8 (ARMv8). This is the first new [[Computer Architecture|computer architecture]] to appear in several years (at least, the first mainstream computer architecture).
  * At this point, most key open source software (the software typically present in a Linux distribution such as Ubuntu or Fedora, for example) now runs on AArch64. However, it may not yet be as extensively optimized as on older architectures (such as x86_64).

== Optimization ==

Optimization is the process of evaluating different ways that software can be written or built and selecting the option(s) that have the best performance trade-offs for the situation at hand. Optimization may involve substituting software algorithms, altering the sequence of operations, using architecture-specific code, selecting data types, or altering the build process.

It is important to ensure that the optimized software produces correct results and does not cause an unacceptable performance regression for other use-cases, system configurations, operating systems, or architectures.

The definition of "performance" varies according to the target system and the operating goals. For example, in some contexts, low memory or storage usage is important; in other cases, fast operation; and in other cases, low CPU utilization or long battery life may be the most important factor. It is often necessary to trade off performance in one area for another; using a lookup table, for example, can reduce CPU utilization and improve battery life in some algorithms, in return for increased memory consumption.
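
The lookup-table trade-off mentioned above can be sketched in C. This is only an illustrative sketch, not course-supplied code: the task (counting the 1 bits in a value) and all of the function and table names are assumptions chosen for the example. The loop version uses no extra memory but does work for every bit; the table version spends 256 bytes of memory so that each byte costs a single memory read.

```c
#include <stdint.h>

/* Hypothetical example: names below are illustrative, not from the course. */

/* 256-entry table: entry i holds the number of 1 bits in the byte value i.
   This is the memory cost of the optimization (256 bytes). */
static uint8_t bit_count_table[256];

/* Baseline: count 1 bits by shifting through the value one bit at a time. */
unsigned count_bits_loop(uint8_t value)
{
    unsigned count = 0;
    while (value) {
        count += value & 1u;
        value >>= 1;
    }
    return count;
}

/* Fill the table once, using the slow routine to compute each entry. */
void init_bit_count_table(void)
{
    for (unsigned i = 0; i < 256; i++)
        bit_count_table[i] = (uint8_t)count_bits_loop((uint8_t)i);
}

/* Table version: one memory read replaces the per-bit loop. */
unsigned count_bits_table(uint8_t value)
{
    return bit_count_table[value];
}

/* Count the 1 bits in a 32-bit word using four table lookups. */
unsigned count_bits_32(uint32_t word)
{
    return bit_count_table[word & 0xFFu]
         + bit_count_table[(word >> 8) & 0xFFu]
         + bit_count_table[(word >> 16) & 0xFFu]
         + bit_count_table[(word >> 24) & 0xFFu];
}
```

init_bit_count_table() must be called once before any lookup. Whether the table version is actually faster depends on the target system: the table has to stay in cache to pay off, and many modern CPUs provide dedicated bit-counting instructions, so a change like this should be measured rather than assumed.
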
Virtually all compilers (and interpreters) perform some level of optimization, and the options selected for compilation can have a significant effect on the trade-offs made by the compiler, affecting memory usage, execution speed, executable size, power consumption, and debuggability. However, there are some types of optimization that cannot be applied by the compiler, and which must be applied by the programmer.

== Build Process ==

Building software is a complex task that many developers gloss over. The simple act of compiling a program invokes a process with five or more stages, including pre-processing, compiling, optimizing, assembling, and linking. However, a complex software system will have hundreds or even thousands of source files, as well as dozens or hundreds of build configuration options, auto-configuration scripts (cmake, autotools), build scripts (such as Makefiles) to coordinate the process, test suites, and more.

The build process varies significantly between software packages. Most software distribution projects (including Linux distributions such as Ubuntu and Fedora) use a packaging system that further wraps the build process in a standardized script format, so that different software packages can be built using a consistent process.

In order to get consistent and comparable benchmark results, you need to ensure that the software is being built in a consistent way. Altering the build process is one way of optimizing software.

Note that the build time for a complex package can range up to hours or even days!

== Benchmarking and Profiling ==

Benchmarking involves testing software performance under controlled conditions so that it can be compared with other software or with the same software running on other types of computers, or so that the impact of a change to the software can be gauged.

Profiling is the process of analyzing software performance on a finer scale, determining resource usage per program part (typically per function/method). This can identify software bottlenecks and potential targets for optimization. The resources studied may include memory, CPU cycles/time, or power.

=== Communication Tools Setup ===

Follow the instructions on the **[[SPO600 Communication Tools]]** page to set up a blog, create SSH keys, and send your blog URLs and public key to me.

I will use this information to:
  - Update the [[Current SPO600 Participants]] page with your information, and
  - Create an account for you on the [[SPO600 Servers]], if you didn't do that during class.

The updating is done in batches every few days -- allow some time!

=== Introduction to the 6502 Processor ===

The 6502 Processor is a simple 8-bit processor that powered a number of early microcomputers (and video games). We're going to use it to learn [[machine language]] and [[assembly language]] concepts before tackling modern processors (because the 6502 instruction set can be documented in one page rather than 7000 pages!).

  * [[6502]] - Basic information about the processor
  * We'll continue exploration of this processor in the next class...

==== Week 1 - Class II ====

=== Video ===

Due to a power + Internet outage, these videos are from a previous semester.

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EY6MVoZluMVMmghA0FBTsdABnvh9UR-cmKmZLNdIDL2H_w|Binary Representation of Data]]
  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/ESTfZj0CqAJAsm_dkGwcom0B4LkaL_k75F8fbOXgF9UEuQ?e=eNdaB5|Computer Architecture & the 6502 (Starter)]] - Ignore the comments at the end about the course format and schedule; they apply only to the previous semester.

==== Week 1 Deliverables ====

  * Set up your [[SPO600 Communication Tools]]

===== Week 2 =====

==== Week 2 - Class I ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EfFrwcimbyNDrezJIJpc1tMBnJhEjFzHHywOtIovGwk8cw?e=hF56U8|Edited Summary Video]]

=== 6502 Assembly Language Programming ===

  * Background knowledge
    * [[Computer Architecture]] basics
    * [[Assembly Language]] vs [[Machine Language]]
    * [[Assembler Basics]]
  * [[6502]] - Basic information about the processor
  * [[6502 Addressing Modes]]
  * [[6502 Instructions]]
  * [[6502 Emulator]]

=== Lab 1 ===

  * [[6502 Assembly Language Lab|Lab 1 - 6502 Assembly Language Lab]]

==== Week 2 - Class II ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EZrkwTFz3vZBq1u66y_OvYUBF5VU4-vfUnPlVisy-AvxSw?e=Vlaw1r|6502 Math]]
  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EY9YBU3WJTZAtb0dTRLh6JcBkgRB-XeFR2uWjyNoPOwoUg?e=oJKskY|6502 Flow Control (Jumps, Branches, and Subroutines)]]

=== Resources ===

  * [[6502 Jumps, Branches, and Procedures]]
  * [[6502 Math]]

==== Week 2 Deliverables ====

  * Complete and blog about [[6502 Assembly Language Lab|Lab 1]]

===== Week 3 =====

==== Week 3 - Class I ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/ETdh8aX-ORxEtL3Q-_ZwN00BJrTajKBcyrdK0A1L8_9RLA|6502 Math Lab]]

=== Resources ===

  * [[6502 Jumps, Branches, and Procedures]]
  * [[6502 Math]]

=== Lab 2 ===

  * [[6502 Math Lab|Lab 2 - 6502 Math Lab]]

==== Week 3 - Class II ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/ETBkqdjC90BNtrQLEyX_PpYBa_ZQIWESFty2ojEpmb030w?e=G1hfhR|Compiler Internals and Compiler Flags]]

=== Resources ===

  * [[Executable and Linkable Format]]
  * [[https://gcc.gnu.org/onlinedocs/|GCC Documentation]] - Particularly note the GCC manual for the latest version, and the GCC Internals documentation (link near the bottom of the page)

==== Week 3 Deliverables ====

  * Complete and blog about [[6502 Math Lab|Lab 2]]

===== Week 4 =====

==== Week 4 - Class I ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/ETxcLi8M9INPhWewNr74Z0kBYT5d67gDfowh0wMxpR3wGw?e=NiIWeq|Edited summary video]]
    * This video does not include the quiz answers or discussion of the blog posts.

=== Code ===

  * [[https://github.com/ctyler/6502js-code/blob/master/input5.6502|input5.6502]] - five-letter-word input routine very similar to the one discussed in the video, used in the [[https://github.com/ctyler/6502js-code/blob/master/wordle.6502|Wordle-like game]].

=== Lab 3 ===

  * [[6502 Program Lab|Lab 3 - 6502 Program Lab]]

==== Week 4 - Class II ====

=== Video ===

  * [[https://seneca-my.sharepoint.com/:v:/g/personal/chris_tyler_senecapolytechnic_ca/EaoYUNEOMhNMn70T_DOvQK4BdIkTWSbj4ikZZcMtPU5bKQ?e=gg5X9g|Compiler Optimizations]]

=== Resources ===

  * [[Compiler Optimizations]]
  * [[Link Time Optimization]]
  * [[Profile-Guided Optimization]]

==== Week 4 Deliverables ====

  * Complete and blog about [[6502 Program Lab|Lab 3]]
  * Reminder: the first group of blogs is due next week (Oct 6 11:59 PM)