Commit

corrected layout of floats for AARCH64
pkivolowitz committed Feb 15, 2024
1 parent d099c79 commit a7e8971
Showing 4 changed files with 7 additions and 9 deletions.
4 changes: 2 additions & 2 deletions section_1/kickstart.md
@@ -16,7 +16,7 @@ be on one chip and RAM on another set of chips.

The idea of registers were introduced a very long time ago as being
super fast storage that is implemented directly in the CPU. Because they
-are within the CPU, distance isn't really an issue. Similarly, because
+are within the CPU, distance isn'tv really an issue. Similarly, because
they are in the CPU, they operate as the speed of the CPU itself.

Registers don't have addresses because they are not in memory. Instead
@@ -232,7 +232,7 @@ This is like:
*ptr = var;
```

-The analogies are not exact but close.
+**The analogies are not exact but close.**

Pairs of registers can also be stored and loaded with the `stp` and
`ldp` op codes.
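
In the same spirit as the C analogies in this hunk, a pair store and a pair load can be sketched in C. This is a rough analogy only, assuming 64-bit `x` registers. The function names `store_pair` and `load_pair` are hypothetical, and the registers named in the comments are illustrative choices, not taken from the text.

```c
#include <stdint.h>

/* Hypothetical helper, a rough analogy for:  stp x0, x1, [x2]
   Two registers are written to two adjacent 8-byte slots in memory. */
void store_pair(int64_t *ptr, int64_t first, int64_t second) {
    ptr[0] = first;   /* x0 goes to the address held in x2 */
    ptr[1] = second;  /* x1 goes to that address plus 8 bytes */
}

/* Hypothetical helper, a rough analogy for:  ldp x0, x1, [x2]
   Two adjacent 8-byte slots are read back into two registers. */
void load_pair(const int64_t *ptr, int64_t *first, int64_t *second) {
    *first = ptr[0];
    *second = ptr[1];
}
```

As with the single-register case, the analogy is not exact: the real instructions move both values with one op code, where the C version spells out two assignments.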
Binary file added section_2/float/simdlanes.jpg
12 changes: 5 additions & 7 deletions section_2/float/working.md
@@ -29,15 +29,13 @@ For example, in the following image, note the overlap of two single
precision floats within a single double precision floating point
register.

-*NOTE NOTE NOTE* This must be fixed - the picture corresponds to the
-32 bit state - AARCH32!
+*NOTE NOTE NOTE* To keep to our promise of simplicity for now, consider
+only `B0`, `H0`, `S0` and `D0`. The remainder of the image ([from The
+Eclectic Light Company](https://eclecticlight.co/2021/08/23/code-in-arm-assembly-lanes-and-loads-in-neon/)) deals with SIMD, covered
+later.

-![regs](./regs.png)
+![regs](./simdlanes.jpg)

It is worth noting early and often that you should not mix dealing
with different precisions assuming that because of the overlaps in
space, you'll get a meaningful result.
-
-The above image does not show the corresponding layout of [half
-precision](./half.md) floating point registers. `H0` sits in the least
-significant bits of `S0` and so on.
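
The warning kept in this hunk, that overlapping registers do not make mixed precisions meaningful, can be illustrated with a small C sketch. It assumes IEEE 754 `float` and `double` (as on AARCH64) and that `S0` overlaps the least significant 32 bits of `D0`. The function names are hypothetical, and the `fcvt` mentioned in the comment is named only as the usual way to convert between precisions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the least significant 32 bits of a
   double's bit pattern as a float. This mimics reading S0 right after
   a double was placed in D0, with no conversion in between. */
float low_bits_as_float(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* raw 64-bit pattern of the double */
    uint32_t low = (uint32_t)bits;    /* the 32 bits that S0 would overlap */
    float f;
    memcpy(&f, &low, sizeof f);       /* reinterpret, do not convert */
    return f;
}

/* Hypothetical helper: a real conversion, analogous to fcvt s0, d0. */
float convert_to_float(double d) {
    return (float)d;
}
```

For the double `1.0`, whose bit pattern is `0x3FF0000000000000`, `low_bits_as_float` yields `0.0f` because the low 32 bits are all zero, while `convert_to_float` yields `1.0f`. Sharing space is not the same as sharing a value.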
Binary file modified section_2/float/working.pdf