I implemented a fast 64-bit × 64-bit → 128-bit multiplier in ARM assembly using only fundamental assembly instructions. The code, along with my solutions to the exercise questions, can be found on GitHub.
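
For readers unfamiliar with the idea, here is a minimal C sketch of one common way to build a 128-bit product from narrower multiplies: split each operand into 32-bit halves and combine the four partial products. This is an illustration under assumptions, not necessarily the approach taken in the assembly version; the names `u128_parts` and `mul64x64` are hypothetical and do not come from the repository.

```c
#include <stdint.h>

/* Hypothetical result type: the 128-bit product as two 64-bit halves. */
typedef struct {
    uint64_t hi; /* upper 64 bits of the product */
    uint64_t lo; /* lower 64 bits of the product */
} u128_parts;

/* Schoolbook 64 x 64 -> 128 bit multiply built from 32 x 32 -> 64 bit
 * partial products. Assumed technique for illustration only. */
static u128_parts mul64x64(uint64_t a, uint64_t b)
{
    /* Split both operands into 32-bit halves. */
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    /* Four partial products; each fits in 64 bits. */
    uint64_t p0 = a_lo * b_lo; /* weight 2^0  */
    uint64_t p1 = a_lo * b_hi; /* weight 2^32 */
    uint64_t p2 = a_hi * b_lo; /* weight 2^32 */
    uint64_t p3 = a_hi * b_hi; /* weight 2^64 */

    /* Sum the column at weight 2^32. Three values below 2^32 are added,
     * so the sum cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    u128_parts r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Calling `mul64x64(UINT64_MAX, UINT64_MAX)` gives `hi = 0xFFFFFFFFFFFFFFFE` and `lo = 1`, which matches (2^64 − 1)^2 = 2^128 − 2^65 + 1.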