Skip to content

Latest commit

 

History

History
180 lines (127 loc) · 5.99 KB

README.md

File metadata and controls

180 lines (127 loc) · 5.99 KB

Apple Silicon

This book is written to the Linux calling convention as stated early on. Unfortunately, this means that even if you own an Apple Silicon machine, which is AARCH64, you'd still need a Linux virtual machine.

This didn't sit well with some on reddit and rightfully so. We undertook to develop a way of writing assembly code once and having it work on both Mac OS and Linux to the degree possible.

Much of this chapter has been replaced here.

There are some things we cannot adapt, such as variadic functions (e.g. printf()) but we explain how code can be written to be compatible with both environments at the expense of some minor amount duplicated code.

Assembly language macros

An early innovation in assemblers was the introduction of a macro capability. Given what could be considered a certain amount of tedium in coding in asm, macros provide a simple form of meta programming where a series of statements can be encapsulated by a single macro. Think of a macro as an early form of C++ templated function (kinda but not really).

Here's an example of an assembly language macro:

.macro LLD_ADDR xreg, label 
        adrp    \xreg, \label@PAGE
        add     \xreg, \xreg, \label@PAGEOFF
.endm

Here's how it might be used:

        LLD_ADDR x0, fmt

This gets expanded to:

        adrp    x0, fmt@PAGE
        add     x0, x0, fmt@PAGEOFF

Reminder - there is documentation for these macros

The documentation for the macro suite has been moved here.

How to force the C pre-processor to run on assembly language

clang on Mac OS will run assembly language files through the C pre-processor. clang on Linux will not by default but can if you specify -x assembler-with-cpp.

gcc on Mac OS can be based on clang so on Mac OS it inherits clang's behavior. gcc on Linux does not run assembly language files through the C pre-processor if the asm file ends in .s but WILL if the file ends in .S

Differences between Apple and Linux

Getting the address of errno

errno is an externally defined int32_t used by many "system" provided APIs to report error conditions back to calling programs. The macro ERRNO_ADDR can be used to converge how Linux and Apple get the address of the variable (which is left in x0).

Variadic functions

This is important! Understand this section in order to be able to use printf().

Functions like printf() are variadic. These are functions that can take any number of parameters. The first argument contains information that tells the function how many parameters were actually given.

For example:

printf("%d is a number.\n", 9);

There is but one % place holder in this text. This tells printf() that in addition to the string there is but one more parameter to be expected.

Apple and Linux handle variadic differently.

Linux will use the scratch registers first up to the integer or floating point register 7. Then it will use the stack.

Apple will put the first parameter in the zero register and then shifts immediately to putting all other parameters onto the stack.

We overcome this difference by detecting which environment we are building in using #if after having first set up for the Linux version.

By setting up for the Linux version, the Apple version involves just pushing registers onto the stack.

Remember that %f always expects a double. This is hidden from you in C and C++ but is important in assembly language. Use fcvt to shift from single precision to double.

An example:

        LLD_ADDR    x0, fmt
        LLD_FLT     x1, s0, flt
        fcvt        d0, s0
#if defined(__APPLE__)
        PUSH_R      d0
        CRT         printf
        add         sp, sp, 16  // See discussion below.
#else
        CRT         printf
#endif

Reminder that this makes use of the C preprocessor to perform the conditional evaluation detecting the platform. You can ensure that the C preprocessor is used by naming your assembly language source code files ending in capital S.

Undoing Stack Pointer Changes

A small tip concerning undoing changes to the stack pointer. You might think that changes to the stack made by str or stp and their cousins must be undone with ldr or ldp and their cousins.

This depends.

If you need to get back the original contents of a register pushed onto the stack, then an ldr or ldp is appropriate. However, if you don't need to get the original contents of a register back, then it is faster to undo a change to the stack using addition.

Take for example the use of printf(). On Apple Silicon systems, you must send arguments to printf() by pushing them onto the stack. However, when printf() completes, you have no need for the values that you pushed. As shown above, simply add the right (multiple of 16) to the stack pointer. This is faster as the addition makes no reference to RAM (or caches) as the ldr would.

Frame pointer

Apple requires that x29 be kept as a valid stack frame pointer. The frame pointer should always start out as equal to the stack pointer. However, within the function, the stack pointer is free to change. The frame pointer must remain fixed so that debuggers always know how to find the initial stack frame.

To be Apple compatible, in addition to backing up x30 also back up x29 and then:

mov x29, sp

We will be converting all sample code to do this over time.

More?

As we discover more differences, they will be described here.

START_PROC and END_PROC

Again, for debugging purposes, you can insert frame checks into your code. These work the same on both Apple Silicon and Linux. If you want these, put START_PROC after the label introducing a function. Then, put END_PROC after the last statement of the function.

This helps the debuggers understand where a function begins and ends.

We will transition all sample code to use this over time.

A useful link

Here is an understandable version of gcc documentation.