<h1>Designing a RISC-V CPU, Part 2: Successfully executing (some) instructions</h1>
<p>Hannah McLaughlin, 2021-03-01</p>
<p>The <a href="https://mcla.ug/blog/risc-v-cpu-part-1.html">previous instalment</a> of this
series was "basically an explanation of what FPGAs are and a 'hello world'
nMigen example."<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup> In this post, I will be detailing the design of my CPU as
it currently stands, and going over the various mistakes I made along the way.
As with the previous post, I am primarily aiming this at software engineers who
are new to hardware design – but I hope it will be interesting to anyone
interested in nMigen, RISC-V, or CPU design generally.</p>
<p>I hope the mistakes I made along the way will be a useful frame for considering
these questions:</p>
<ul>
<li>
<p>What about digital logic design is fundamentally different from software
design?</p>
</li>
<li>
<p>What about digital logic design is similar to software design?</p>
</li>
</ul>
<p>You can see the code for my CPU at the time of writing
<a href="https://github.com/lochsh/riscy-boi/tree/33229b5fdcee90cfdf776ccf925f560f0fa5ce82">here</a>
or an up to date version <a href="https://github.com/lochsh/riscy-boi">here</a>.</p>
<h2>An introduction to RISC-V</h2>
<p>RISC-V ("risk five") is an open standard instruction set architecture (ISA).
"RISC" means "reduced instruction set computer", broadly meaning that the
ISA prioritises simple instructions. In contrast, CISC (complex instruction set
computer) ISAs are optimised to perform actions in as few instructions as
possible, hence their instructions are often more complex. ARM architectures
are RISC; x86-related architectures are CISC.</p>
<p>Usually in industry, ISAs are patented, so in order to implement the ISA you
need an (expensive) license from the vendor. Often, commercial ISAs are poorly
documented, with the motivation behind design details not always being
available even after such a license agreement.</p>
<p>From the RISC-V spec <sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>:</p>
<blockquote>
<p>Our goals in defining RISC-V include:</p>
<ul>
<li>A completely <em>open</em> ISA that is freely available to academia and industry</li>
<li>a <em>real</em> ISA suitable for direct native hardware implementation, not just
simulation or binary translation.</li>
<li>An ISA that avoids "over-architecting" for a particular microarchitecture
style, but which allows efficient implementation in any of these.</li>
</ul>
</blockquote>
<p>Additionally, a lot of commercial architectures are complex and burdened with
backwards-compatibility constraints. RISC-V offers a fresh start!</p>
<p>The ISA doesn't explain the details of how to design a CPU – it just
defines an abstract model of the CPU, mostly by defining what instructions the
CPU must support, including:</p>
<ul>
<li>The encoding of the instructions (i.e. how to construct the machine code that
the CPU will run)</li>
<li>The registers (the very fast, very small storage locations accessed directly
by the CPU)<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup></li>
</ul>
<p>How to actually create a CPU that realises these requirements is up to us \o/</p>
<h3>A quick note (feel free to skip if you don't have opinions about CPUs)</h3>
<p>I'm trying to design a single-stage CPU; that is, I'm trying to design a CPU
that retires one instruction per clock cycle with no pipelining. Usually CPUs
have pipelining in place to maximise efficiency. I am hoping that avoiding this
will result in a simpler design more suitable to my learning. Perhaps it will
complicate other aspects of the design (like my program counter having 3
outputs), but I'm certainly finding it easier to hold the design in my
head when I only have to think about this clock cycle and the next. I will
likely discuss this in more detail in the next blog post where I implement the
load instructions. I have a plan for how to make these work in a single cycle,
but they do pose a particular challenge and I may change my design as a
consequence.</p>
<h2>Designing a CPU</h2>
<p>RISC-V defines various ISA modules; I will be implementing RV32I, the base
32-bit integer instruction set.</p>
<p>To design my CPU, I first looked at the JAL (jump and link) and the ADDI (add
immediate) instructions, and drew a block diagram of what hardware
would be needed to decode those instructions. I think the ADDI instruction is
easier to grasp if you're not used to thinking about how machine code is
encoded, so we'll start with that. If this is all totally foreign to you, you
might enjoy my
<a href="https://mcla.ug/blog/how-to-flash-an-led.html">introduction to ARM assembly</a>.</p>
<h3>Decoding ADDI</h3>
<p><img alt="iri" class="callout" src="/blog/images/integer-register-immediate.png" title="Diagram of instruction encoding for ADDI from RISC-V spec"></p>
<blockquote>
<p>ADDI adds the sign-extended 12-bit immediate to register <em>rs1</em></p>
</blockquote>
<p>Let's break some of this down, imagining we're decoding such an instruction:</p>
<ul>
<li>An immediate is analogous to a literal value in source code; the value itself
is encoded in the instruction, rather than e.g. retrieved from a register.</li>
<li>The <em>opcode</em> field is the same value for all the instructions listed here. </li>
<li>Once we've read the <em>opcode</em>, we know that bits 12-14 contain the <em>funct3</em>
field (3 for 3 bits), which encodes whether this is an ADDI instruction (or
SLTI, ANDI, etc.)</li>
</ul>
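<p>To make the field layout concrete, here is a minimal Python sketch that slices
an I-type instruction word into its fields. The function name <code>decode_itype</code>
and the constant names are my own for illustration and are not taken from the CPU's
source; the bit positions follow the encoding diagram above.</p>

```python
def decode_itype(instr: int) -> dict:
    """Split a 32-bit I-type instruction word into its raw fields."""
    return {
        "opcode": instr & 0x7F,         # bits 0-6
        "rd": (instr >> 7) & 0x1F,      # bits 7-11
        "funct3": (instr >> 12) & 0x7,  # bits 12-14
        "rs1": (instr >> 15) & 0x1F,    # bits 15-19
        "imm": (instr >> 20) & 0xFFF,   # bits 20-31, not yet sign-extended
    }


OP_IMM = 0b0010011     # opcode shared by ADDI, SLTI, ANDI, ...
FUNCT3_ADDI = 0b000    # funct3 value that selects ADDI

# 0x00108093 is the machine code for "addi x1, x1, 1"
fields = decode_itype(0x00108093)
is_addi = fields["opcode"] == OP_IMM and fields["funct3"] == FUNCT3_ADDI
```

<p>In hardware each of these fields is just a fixed slice of the instruction wires,
so the "shifts" and "masks" cost nothing.</p>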
<p>So, somehow we will have to:</p>
<ul>
<li>Retrieve the value stored in register <em>rs1</em><ul>
<li>Therefore, our register file needs an input for selecting this register
to read, and an output for the data read from that register.</li>
</ul>
</li>
<li>Add that to the immediate value<ul>
<li>An arithmetic logic unit (ALU) will be helpful</li>
</ul>
</li>
<li>Store the result in register <em>rd</em><ul>
<li>Our register file also needs write select and write data inputs.</li>
</ul>
</li>
</ul>
<p>Sign-extension will be necessary too – this is just the process of
taking a number narrower than (in our case) 32 bits, and filling in the
remaining bits such that two's complement arithmetic will be performed
correctly. I won't include this in the diagram below.</p>
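<p>As a rough illustration of sign-extension, here is a small Python sketch. The
helper name <code>sign_extend</code> is hypothetical; it copies the top bit of the
narrow value into all the bits above it and returns the result as an unsigned
32-bit pattern.</p>

```python
def sign_extend(value: int, bits: int, width: int = 32) -> int:
    """Sign-extend a `bits`-wide two's complement value to `width` bits.

    The result is returned as an unsigned `width`-bit pattern.
    """
    value &= (1 << bits) - 1           # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    if value & sign_bit:
        # Negative: fill every bit above the original width with ones.
        value |= ((1 << width) - 1) & ~((1 << bits) - 1)
    return value


minus_one = sign_extend(0xFFF, 12)     # 12-bit pattern for -1
plus_one = sign_extend(0x001, 12)      # positive values are unchanged

# Adding the extended immediate with 32-bit wraparound gives 5 + (-1).
result = (5 + minus_one) & 0xFFFFFFFF
```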
<p>Implicit in all this is that we'll need some way to retrieve the instruction
itself and pass it to the instruction decoder. The <em>program counter</em> is
what tells the instruction memory which address to retrieve an instruction
from.</p>
<p><img alt="addi" src="/blog/images/addi.png" title="Block diagram showing the necessary connections between components to implement the ADDI instruction. The instruction decoder tells the register file to read from rs1 and write to rd. The data read from rs1 and the immediate are inputs to the ALU. The output of the ALU is written to rd. The program counter tells the instruction memory which address to retrieve an instruction from, and the instruction memory provides that instruction to the instruction decoder."></p>
<p>I'm using thick cyan lines for data flow and thin magenta lines for control
flow.</p>
<h3>JAL</h3>
<p>Now let's try doing the same for JAL.</p>
<blockquote>
<p>The jump and link (JAL) instruction [...] immediate encodes a signed offset
in multiples of 2 bytes. The offset is sign-extended and added to the <code>pc</code> to
form the jump target address. [...] JAL stores the address of the instruction
following the jump (<code>pc</code> + 4) into register <em>rd</em>.</p>
</blockquote>
<p>There's a lot more going on here, particularly if you aren't familiar with
machine code. We noted above that the program counter sets which instruction
will be executed next. Usually the program counter simply increments to the
next address each cycle – that's how we continue to execute our program!
However, say our program calls a subroutine stored elsewhere in memory; we'd
need a way to <em>jump</em> to the subroutine address. Or, say we wanted an infinite
loop (ubiquitous in embedded software!); we'd need a way to <em>jump</em><sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup> back to
the address at the start of the loop. JAL allows for these situations.</p>
<p>The "link" part of JAL is the part where the address of the next instruction
is stored in the destination register. This is convenient when the jump is used
to execute a subroutine: once the subroutine completes, we can jump back to
that stored instruction address.</p>
<p>Here's the encoding for JAL:</p>
<p><img alt="jal-enc" class="callout" src="/blog/images/jal-enc.png" title="Diagram of instruction encoding for JAL from RISC-V spec"></p>
<p>Note:</p>
<ul>
<li>The opcode uniquely defines the JAL instruction (no other instructions share
this opcode)</li>
<li>We need to unshuffle the immediate to get the offset</li>
<li>The LSB<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup> of the immediate is not encoded because it must be 0: only
offsets in multiples of 2 bytes are supported.</li>
</ul>
<p>The unshuffling caused me some pain in my code, so it's amusing to me that it
won't be part of the block diagram below. But it's part of the internals of the
instruction decoder, not its interface, which is what we are determining with
these diagrams. The shuffled bits might seem nonsensical, but they're chosen to
maximise overlap with other instruction formats in RISC-V.</p>
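<p>For reference, the unshuffling can be sketched in plain Python as below. This
is not the code from my instruction decoder; <code>jal_offset</code> is a
hypothetical helper that follows the J-type layout in the spec, where instruction
bits 31, 30:21, 20 and 19:12 hold immediate bits 20, 10:1, 11 and 19:12
respectively.</p>

```python
def jal_offset(instr: int) -> int:
    """Recover the signed byte offset from a 32-bit JAL instruction word."""
    imm20 = (instr >> 31) & 0x1       # instr[31]    -> imm[20] (sign bit)
    imm10_1 = (instr >> 21) & 0x3FF   # instr[30:21] -> imm[10:1]
    imm11 = (instr >> 20) & 0x1       # instr[20]    -> imm[11]
    imm19_12 = (instr >> 12) & 0xFF   # instr[19:12] -> imm[19:12]

    # Reassemble in the natural order; imm[0] is implicitly zero.
    offset = (imm20 << 20) | (imm19_12 << 12) | (imm11 << 11) | (imm10_1 << 1)

    # Interpret the 21-bit value as two's complement.
    if imm20:
        offset -= 1 << 21
    return offset


backwards = jal_offset(0xFFDFF06F)    # machine code for "jal x0, -4"
forwards = jal_offset(0x008000EF)     # machine code for "jal x1, 8"
```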
<p><img alt="jal" src="/blog/images/jal.png" title="Block diagram showing the necessary connections between components to implement the JAL instruction."></p>
<h3>Putting ADDI and JAL together</h3>
<p>The next challenge is to draw a block diagram that implements both ADDI and
JAL. The first obvious problem: the two instructions need different inputs to
the ALU, and the ALU output is wired differently too. We need some kind of logic
block that can pick between them depending on some control signal: a multiplexer (mux).</p>
<p>We also need a way to tell the program counter whether the next instruction
address should come from a set address or from incrementing the current
address.</p>
<p>Here's what my design looks like at the moment (excluding a few things I have
because I know I'll need them later, like two read select/data signals on the
register file):</p>
<p><img alt="mux" src="/blog/images/mux.png" title="Block diagram showing the necessary connections between components to implement the JAL and ADDI instructions."></p>
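<p>One way to read the muxes is as plain selections in software. The sketch below
is a simplified Python model, not the real design: the names <code>mux</code>,
<code>datapath</code> and <code>is_jump</code> are made up for illustration, and I'm
assuming a single control signal drives all three muxes, which holds only while
ADDI and JAL are the sole instructions.</p>

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


def mux(select: bool, if_low: int, if_high: int) -> int:
    """Two-input multiplexer: pass one input through based on `select`."""
    return if_high if select else if_low


@dataclass
class DatapathOutputs:
    rd_write_data: int  # value written to register rd
    pc_next: int        # address of the next instruction


def datapath(is_jump: bool, pc: int, rs1_data: int, imm: int) -> DatapathOutputs:
    """Evaluate the datapath for one ADDI (is_jump low) or JAL (is_jump high).

    `imm` is assumed to be already sign-extended to 32 bits.
    """
    pc_inc = (pc + 4) & MASK32

    # ALU input mux: ADDI adds to rs1, JAL adds to the current pc.
    alu_a = mux(is_jump, rs1_data, pc)
    alu_out = (alu_a + imm) & MASK32

    # Write-back mux: ADDI stores the sum, JAL stores the return address.
    rd_write_data = mux(is_jump, alu_out, pc_inc)

    # Program counter mux: JAL loads the computed target, ADDI increments.
    pc_next = mux(is_jump, pc_inc, alu_out)

    return DatapathOutputs(rd_write_data, pc_next)


addi = datapath(is_jump=False, pc=0, rs1_data=5, imm=1)
jal = datapath(is_jump=True, pc=4, rs1_data=0, imm=0xFFFFFFFC)  # offset -4
```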
<h2>Implementing the design</h2>
<p>As covered in my previous blog post, I'm using
<a href="https://github.com/nmigen/nmigen">nMigen</a> to implement my design. As I'm
currently designing a single-stage CPU, more of my design is
combinatorial, rather than synchronous, because I don't require the additional
state that pipelining necessitates. This most likely means my design is
unable to run quickly, but that's not a concern of mine.</p>
<p>I don't think it's helpful to post all the source code of my implementation in
this blog, but I will include some code here to illustrate mistakes that I made
and what I learned from them.</p>
<h3>Mistake #1: thinking about my logic circuit sequentially</h3>
<p>I initially implemented my program counter incorrectly after I got really
confused about when synchronous updates would take effect. I initially only had
the <code>pc</code> and <code>pc_inc</code> outputs because I didn't really understand the difference
between <code>pc</code> and <code>pc_next</code>. It's taken some getting used to thinking about the
whole logic circuit "happening at once"<sup id="fnref:7"><a class="footnote-ref" href="#fn:7">7</a></sup>, rather than thinking sequentially like
I would when writing software. This is what led to my confusion. Properly
conceptualising your circuit in this way is key, and a challenge if you're used
to writing software.</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="String">"""</span><span class="String">Program Counter</span><span class="String">"""</span>
<span class="PreProc">import</span> nmigen <span class="Statement">as</span> nm

INSTR_BYTES <span class="op_lv0">=</span> <span class="Number">4</span>


<span class="Statement">class</span> <span class="Function">ProgramCounter</span><span class="lv12c">(</span>nm<span class="op_lv12">.</span>Elaboratable<span class="lv12c">)</span><span class="op_lv0">:</span>
    <span class="String">"""</span>
<span class="String">    Program Counter</span>

<span class="String">    * load (in): low to increment, high to load an address</span>
<span class="String">    * input_address (in): the input used when loading an address</span>

<span class="String">    * pc (out): the address of the instruction being executed this clock cycle</span>
<span class="String">    * pc_next (out): the address of the instruction being executed next clock</span>
<span class="String">      cycle</span>
<span class="String">    * pc_inc (out): the address after that of the instruction being executed</span>
<span class="String">      this clock cycle</span>
<span class="String">    </span><span class="String">"""</span>

    <span class="Statement">def</span> <span class="Function">__init__</span><span class="lv12c">(</span>self<span class="op_lv12">,</span> width<span class="op_lv12">=</span><span class="Number">32</span><span class="lv12c">)</span><span class="op_lv0">:</span>
        self<span class="op_lv0">.</span>load <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">()</span>
        self<span class="op_lv0">.</span>input_address <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>
        self<span class="op_lv0">.</span>pc <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>
        self<span class="op_lv0">.</span>pc_next <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>
        self<span class="op_lv0">.</span>pc_inc <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>

    <span class="Statement">def</span> <span class="Function">elaborate</span><span class="lv12c">(</span>self<span class="op_lv12">,</span> _<span class="lv12c">)</span><span class="op_lv0">:</span>
        m <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Module<span class="lv12c">()</span>

        <span class="Comment"># pc_inc is combinatorial: it always equals pc + 4</span>
        m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>pc_inc<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>pc <span class="op_lv12">+</span> INSTR_BYTES<span class="lv12c">)</span>
        <span class="Comment"># pc is the only state: it takes the value of pc_next on the clock edge</span>
        m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>sync <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>pc<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>pc_next<span class="lv12c">)</span>

        <span class="Comment"># pc_next is a mux between the loaded address and pc_inc</span>
        <span class="Statement">with</span> m<span class="op_lv0">.</span>If<span class="lv12c">(</span>self<span class="op_lv12">.</span>load<span class="lv12c">)</span><span class="op_lv0">:</span>
            m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>pc_next<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>input_address<span class="lv12c">)</span>
        <span class="Statement">with</span> m<span class="op_lv0">.</span>Else<span class="lv12c">()</span><span class="op_lv0">:</span>
            m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>pc_next<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>pc_inc<span class="lv12c">)</span>

        <span class="Statement">return</span> m
</pre>
</div>
<p>When thinking about my implementation, I now want to answer these questions:</p>
<ul>
<li>What is my state?</li>
<li>How are the outputs computed?</li>
<li>How is the state updated?</li>
</ul>
<p>This sounds simple, but it has been clarifying. For this case:</p>
<ul>
<li><code>pc</code> is my state (you'll note it is assigned synchronously, so its value
is updated at the start of the next clock cycle).</li>
<li><code>pc</code>, <code>pc_inc</code>, and <code>pc_next</code> are the outputs:<ul>
<li>the <code>pc</code> output is the current state</li>
<li><code>pc_inc</code> is just <code>pc + 4</code></li>
<li><code>pc_next</code> swaps between <code>pc_inc</code> and the <code>input_address</code> based on <code>load</code></li>
</ul>
</li>
<li>Each clock cycle, <code>pc</code> takes on the value of <code>pc_next</code></li>
</ul>
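<p>The same three questions can be answered with a cycle-level software model. The
sketch below is plain Python rather than nMigen, and the class and method names
(<code>ProgramCounterModel</code>, <code>outputs</code>, <code>clock_edge</code>) are
hypothetical. It ignores 32-bit wraparound for brevity. The point is the
separation: outputs are pure functions of state and inputs, and state only
changes on the clock edge.</p>

```python
INSTR_BYTES = 4


class ProgramCounterModel:
    """Cycle-level software model of the program counter."""

    def __init__(self):
        self.pc = 0  # the only state

    def outputs(self, load: bool, input_address: int) -> dict:
        """Combinatorial outputs: pure functions of state and inputs."""
        pc_inc = self.pc + INSTR_BYTES
        pc_next = input_address if load else pc_inc
        return {"pc": self.pc, "pc_inc": pc_inc, "pc_next": pc_next}

    def clock_edge(self, load: bool, input_address: int) -> None:
        """Synchronous update: the state takes on the value of pc_next."""
        self.pc = self.outputs(load, input_address)["pc_next"]


counter = ProgramCounterModel()
trace = []
# Two increments, then a load of address 0x40, then another increment.
for load, address in [(False, 0), (False, 0), (True, 0x40), (False, 0)]:
    trace.append(counter.outputs(load, address)["pc"])
    counter.clock_edge(load, address)
trace.append(counter.pc)
```

<p>Calling <code>outputs</code> any number of times within a cycle gives the same
answer, which is roughly what "happening at once" means for the combinatorial
part.</p>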
<h3>Mistake #2: Creating a combinatorial loop</h3>
<p>This isn't really an additional mistake, as the incorrect program counter
implementation described above caused this. I felt this particular consequence
deserved its own subsection.</p>
<p>When I only had the <code>pc</code> and <code>pc_inc</code> outputs from my program counter, I
created the following combinatorial loop:</p>
<ul>
<li>the <code>pc</code> output was an input to the ALU</li>
<li>but the <code>pc</code> output was computed from the output of the ALU</li>
</ul>
<p>Both of these signals were in the combinatorial domain, so they were
effectively wired together in a loop.</p>
<p>The creation of this feedback loop resulted in my tests hanging as my
combinatorial simulations never settled. If I'd tried to synthesise my design,
the synthesis tools would have refused. If you were able to synthesise such a
design, the output would be constantly changing.</p>
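<p>To give a feel for why such a loop never settles, here is a toy Python model.
It is not how nMigen's simulator works internally; <code>settle</code> is a
hypothetical helper that re-evaluates a combinatorial expression until its value
stops changing, and gives up after a fixed number of attempts.</p>

```python
def settle(update, initial: int, max_iterations: int = 100) -> int:
    """Re-evaluate a combinatorial net until its value stops changing."""
    value = initial
    for _ in range(max_iterations):
        new_value = update(value)
        if new_value == value:
            return value
        value = new_value
    raise RuntimeError("combinatorial logic did not settle")


# No loop: pc is held in a register, so pc_inc depends only on a fixed value.
pc_state = 8
pc_inc = settle(lambda _current: pc_state + 4, initial=0)

# Loop: the signal feeds an adder whose output drives the signal itself,
# so every evaluation produces a new value and nothing ever settles.
try:
    settle(lambda pc: (pc + 4) & 0xFFFFFFFF, initial=0)
    looped = False
except RuntimeError:
    looped = True
```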
<h3>Mistake #3: Not creating the block diagram correctly</h3>
<p>True Fans™ of this blog might notice that the block diagram above has a
notable difference from the one shown at the end of the previous blog post:
previously I didn't mux the input of the ALU. I was wondering why JAL wasn't
working, and tried to trace it through my block diagram, like we did above. A
re-enactment:</p>
<p><em>Computer, show my original sketch of the block diagram.</em>
<img alt="sketch" class="callout" src="/blog/images/block-diagram-sktech.jpg" title="A screenshot of my first CPU design sketch"></p>
<p><em>Computer, enhance:</em>
<img alt="enhance" class="callout" src="/blog/images/enhance.jpg" title="Zoomed into a note on screenshot saying 'need to multiplex ALU input'"></p>
<p>Me:
<img alt="pikachu" class="callout" src="/blog/images/shocked-pikachu.png" title="Shocked pikachu face from Pokemon"></p>
<p>The above is a bit frivolous, but I'd say the lesson here is to think actively
about your block diagram (or whatever reference you are using while
implementing your design), making sure it actually makes sense to you and does
what you want it to. This applies to software design too when viewing any
requirements or other design documentation, of course.</p>
<h3>Miscellaneous mistakes</h3>
<p>I made a lot of other mistakes that aren't that interesting to cover –
most notably really messing up the unshuffling of the immediate in the JAL
instruction. But that was just a programming error without an interesting
lesson to be learnt from it.</p>
<h2>Running my first program!</h2>
<p>It was a very exciting moment when I figured out the last of the mistakes I had
made in my implementation, with the help of my tests and of gtkwave.</p>
<p>For convenience in tests, my instruction decoding code also has code for
assembling instructions. I use it here to create a simple program with two
instructions:</p>
<ul>
<li>
<p>An <code>ADDI</code> instruction where <code>rd</code> and <code>rs1</code> are the same register, and the
immediate is <code>1</code>. This means it will increment the value in <code>rs1</code>.</p>
</li>
<li>
<p>A <code>JAL</code> instruction with an offset of -4, meaning it jumps back to our <code>ADDI</code>
instruction. This creates an infinite loop where the value in a register is
incremented once per trip around the loop, i.e. every other clock cycle.</p>
</li>
</ul>
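<p>The behaviour of this two-instruction loop can be checked with a small software
model before touching the FPGA. The sketch below is plain Python and independent
of my nMigen code: <code>encode_addi</code>, <code>encode_jal</code> and
<code>step</code> are hypothetical helpers, registers x5 and x1 are arbitrary
choices, and only ADDI and JAL are modelled.</p>

```python
MASK32 = 0xFFFFFFFF
OP_IMM, OP_JAL = 0b0010011, 0b1101111


def encode_addi(rd: int, rs1: int, imm: int) -> int:
    """Assemble ADDI rd, rs1, imm (funct3 is 0b000)."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (rd << 7) | OP_IMM


def encode_jal(rd: int, offset: int) -> int:
    """Assemble JAL rd, offset, shuffling the 21-bit immediate."""
    imm = offset & 0x1FFFFF  # 21-bit two's complement
    return ((((imm >> 20) & 0x1) << 31) | (((imm >> 1) & 0x3FF) << 21)
            | (((imm >> 11) & 0x1) << 20) | (((imm >> 12) & 0xFF) << 12)
            | (rd << 7) | OP_JAL)


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def step(pc: int, regs: list, instr: int) -> int:
    """Execute one instruction (one clock cycle) and return the next pc."""
    opcode = instr & 0x7F
    rd = (instr >> 7) & 0x1F
    if opcode == OP_IMM:  # only ADDI is modelled
        rs1 = (instr >> 15) & 0x1F
        result = (regs[rs1] + sign_extend(instr >> 20, 12)) & MASK32
        next_pc = pc + 4
    elif opcode == OP_JAL:
        imm = ((((instr >> 31) & 0x1) << 20) | (((instr >> 12) & 0xFF) << 12)
               | (((instr >> 20) & 0x1) << 11) | (((instr >> 21) & 0x3FF) << 1))
        result = pc + 4  # the link address
        next_pc = (pc + sign_extend(imm, 21)) & MASK32
    else:
        raise ValueError(f"unsupported opcode {opcode:#09b}")
    if rd != 0:  # x0 is hard-wired to zero in RISC-V
        regs[rd] = result
    return next_pc


program = [encode_addi(rd=5, rs1=5, imm=1), encode_jal(rd=1, offset=-4)]
regs = [0] * 32
pc = 0
history = []
for _ in range(6):  # six clock cycles
    pc = step(pc, regs, program[pc // 4])
    history.append(regs[5])
```

<p>After six cycles <code>history</code> shows the counter register advancing on
every second cycle, since the cycles in between are spent on the jump.</p>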
<p>The LED code is specific to my board. I'm selecting 4 bits (one for each LED)
in the register and displaying them on the board's LEDs. I pick bits in the
middle so they don't change too quickly to see (as would be the case if I
picked the LSBs) or too slowly to look cool (as would be the case if I picked
the MSBs).</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="Comment"># ADDI with rd and rs1 both set to reg and an immediate of 1</span>
program <span class="op_lv0">=</span> <span class="lv12c">[</span>encoding<span class="op_lv12">.</span>IType<span class="op_lv12">.</span>encode<span class="lv11c">(</span>
                <span class="Number">1</span><span class="op_lv11">,</span>
                reg<span class="op_lv11">,</span>
                encoding<span class="op_lv11">.</span>IntRegImmFunct<span class="op_lv11">.</span>ADDI<span class="op_lv11">,</span>
                reg<span class="op_lv11">,</span>
                encoding<span class="op_lv11">.</span>Opcode<span class="op_lv11">.</span>OP_IMM<span class="lv11c">)</span><span class="op_lv12">,</span>
           <span class="Comment"># jump back to the previous instruction for infinite loop</span>
           <span class="Comment"># (0x1ffffc is -4 as a 21-bit two's complement offset)</span>
           encoding<span class="op_lv12">.</span>JType<span class="op_lv12">.</span>encode<span class="lv11c">(</span><span class="Number">0x1ffffc</span><span class="op_lv11">,</span> link_reg<span class="lv11c">)</span><span class="lv12c">]</span>

imem <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Memory<span class="lv12c">(</span>width<span class="op_lv12">=</span><span class="Number">32</span><span class="op_lv12">,</span> depth<span class="op_lv12">=</span><span class="Number">256</span><span class="op_lv12">,</span> init<span class="op_lv12">=</span>program<span class="lv12c">)</span>
imem_rp <span class="op_lv0">=</span> m<span class="op_lv0">.</span>submodules<span class="op_lv0">.</span>imem_rp <span class="op_lv0">=</span> imem<span class="op_lv0">.</span>read_port<span class="lv12c">()</span>
m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> <span class="lv12c">[</span>
    imem_rp<span class="op_lv12">.</span>addr<span class="op_lv12">.</span>eq<span class="lv11c">(</span>cpu_inst<span class="op_lv11">.</span>imem_addr<span class="lv11c">)</span><span class="op_lv12">,</span>
    cpu_inst<span class="op_lv12">.</span>imem_data<span class="op_lv12">.</span>eq<span class="lv11c">(</span>imem_rp<span class="op_lv11">.</span>data<span class="lv11c">)</span><span class="op_lv12">,</span>
<span class="lv12c">]</span>

<span class="Comment"># bits 13 to 16 of the debug output drive the four LEDs</span>
colours <span class="op_lv0">=</span> <span class="lv12c">[</span><span class="String">"</span><span class="String">b</span><span class="String">"</span><span class="op_lv12">,</span> <span class="String">"</span><span class="String">g</span><span class="String">"</span><span class="op_lv12">,</span> <span class="String">"</span><span class="String">o</span><span class="String">"</span><span class="op_lv12">,</span> <span class="String">"</span><span class="String">r</span><span class="String">"</span><span class="lv12c">]</span>
leds <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Cat<span class="lv12c">(</span>platform<span class="op_lv12">.</span>request<span class="lv11c">(</span>f<span class="String">"</span><span class="String">led_{c}</span><span class="String">"</span><span class="lv11c">)</span> <span class="Statement">for</span> c <span class="Statement">in</span> colours<span class="lv12c">)</span>
m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>sync <span class="op_lv0">+=</span> leds<span class="op_lv0">.</span>eq<span class="lv12c">(</span>cpu_inst<span class="op_lv12">.</span>debug_out<span class="lv11c">[</span><span class="Number">13</span><span class="op_lv11">:</span><span class="Number">17</span><span class="lv11c">]</span><span class="lv12c">)</span>
</pre>
</div>
<p>And here is a poor quality gif of the LEDs! Very exciting.
<img alt="leds" class="callout" src="/blog/images/cpu.gif" title="The LEDs flashing on the FPGA dev board"></p>
<p>The next stage will be to implement the rest of the RV32I instructions.
Probably I will start with the load instructions, as I think they will pose a
challenge in my current single-stage architecture. I have a plan for how to
address that. If you'd like to see future instalments of this blog series, you
can follow me on <a href="https://twitter.com/lochsh">Twitter</a> or subscribe to my
<a href="https://mcla.ug/blog/feeds/all.rss.xml">RSS feed</a>.</p>
<h2>Differences and commonalities between software design and digital logic design</h2>
<p>Now that I've completed a lot more of the design and implementation of my CPU,
I have some clearer thoughts on this topic. I thought there would be some
clear differences in kind between software design and digital logic design.
After much thought, I wonder if there are merely differences in degree.</p>
<h3>Concurrency</h3>
<p>First, let's consider concurrency. This is a topic that's sometimes
misunderstood, so let's be clear: concurrency happens whenever different
parts of your system might execute at different times or out of order<sup id="fnref:6"><a class="footnote-ref" href="#fn:6">6</a></sup>. A
system is concurrent if it can support having multiple tasks in progress at the
same time; this could be in parallel, or it could involve switching between the
tasks while only actually executing one at any given moment.</p>
<p>Arguably, the default state for software is to have no concurrency – it
often needs to be explicitly created (e.g. by creating a thread), or at the very
least, it needs to be explicitly handled (e.g. in an interrupt handler). I think any
software engineer who has worked on concurrent software would agree that
dealing with concurrency is a singular challenge that creates room for
difficult and subtle bugs. For these two reasons, software generally has
limited concurrency.</p>
<p>Now imagine concurrency is your default state, and that every single piece of
state is updating all at once every step of your program.</p>
<p>That is the default state of digital logic. While in software you generally
have to explicitly create (or at least explicitly handle) concurrency, in
hardware you have to deliberately make something sequential.</p>
<p>I said above that concurrency in software is challenging. If digital hardware
has more concurrency, is it more challenging to design? Perhaps! It's hard for
me to comment on this due to my lack of experience in hardware design.</p>
<p>Perhaps one reason that concurrency is so difficult in software is the annoying
habit of a lot of software<sup id="fnref:8"><a class="footnote-ref" href="#fn:8">8</a></sup> to be dependent on a colossal amount of state.
How many possible states does a mere kilobit of memory have? 2<sup>1024</sup> is too
big a number for me to conceptualise. And yet a kilobit is nothing by many
programs' standards.</p>
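<p>Python's arbitrary-precision integers make it easy to at least look at the size
of that number. A quick sketch, with variable names of my own choosing:</p>

```python
states = 2 ** 1024          # distinct values that 1024 bits can hold
digits = len(str(states))   # decimal digits needed to write it down
```

<p>Here <code>digits</code> comes out at 309, so the state count is a number with
more than three hundred decimal digits.</p>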
<p>In hardware, state is more precious. My FPGA is only so large. I have an
iCE40HX8K, which has roughly 8000 logic cells (7680, to be precise). Each cell
has one 4-input LUT and one flip-flop: so let's say 8000 bits of state.
Pathetic! Watch me create millions of bits of pure chaos with a single call to
<code>malloc</code>.</p>
<p>And of course, if you were designing for silicon, the size of your design would
be directly related to cost!</p>
<h3>Verification</h3>
<p>So, we've considered concurrency. Another thing I wondered is whether digital
hardware has some fundamental property that makes it easier to formally verify,
or at least to demonstrate equivalence with a model. When I worked on a system
that could kill people if it malfunctioned, in our risk assessments we had to
assume 100% chance of software failure. Practically this generally meant that
software mitigation alone was not enough. I found it strange that the FPGA
component of our system was considered much more trusted. Did hardware have
some fundamental property that made it "better"?</p>
<p>Once again, I don't think there <em>is</em> a fundamental difference here. After all,
software exists that has been formally verified. In general, functional
programming lends itself better to this due to the separation of state and
behaviour. It seems more likely that the difference is in the careful
management of state. The properties of digital hardware described above
encourage this, but it's not impossible to do in software.</p>
<p>The huge teams of verification engineers that work on silicon design might also
suggest some fundamental difference, given you might not have seen the same
applied to software. However, there <em>are</em> software projects that are given this
level of rigour! Most famously, much of NASA's software, like software for
space shuttles<sup id="fnref:9"><a class="footnote-ref" href="#fn:9">9</a></sup>. The sad truth is that most companies don't consider it
worth it to apply this rigour to their software (it's hugely expensive!). When
sufficient lives and/or money are on the line, software <em>can</em> be and <em>is</em> written
and tested with the same level of rigour as a lot of hardware.</p>
<p>If you have thoughts to share on what you think the differences and
commonalities between hardware and software design are, please share them via
<a href="mailto:h@mcla.ug">email</a> or in Hacker News comments if you have come from
there. It's been really interesting to ponder, and I'd be interested in
different perspectives.</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>Hacker News comment, 2021 <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p><a href="https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf">https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf</a> <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>Just a note: the word "register" is pretty overloaded. You may have heard
it used in the sense above (the small storage accessed directly by the CPU),
for memory-mapped IO in a microcontroller, or as a verb describing saving
state in hardware. <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>I think generally "branch" is used to describe conditional instructions,
whereas "jump" is used to describe unconditional instructions. Certainly this
is the case in RISC-V, but I think it very much depends on the particular
architecture, and sometimes they are used interchangeably. <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>Least Significant Bit <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
<li id="fn:6">
<p>wording from <a href="https://docs.rust-embedded.org/book/concurrency/">https://docs.rust-embedded.org/book/concurrency/</a> <a class="footnote-backref" href="#fnref:6" title="Jump back to footnote 6 in the text">↩</a></p>
</li>
<li id="fn:7">
<p>Of course in analogue reality things are not happening at exactly the
same time, but our abstraction is the discrete time period of our clock. <a class="footnote-backref" href="#fnref:7" title="Jump back to footnote 7 in the text">↩</a></p>
</li>
<li id="fn:8">
<p>functional programmers, we don't need your smugness here! <a class="footnote-backref" href="#fnref:8" title="Jump back to footnote 8 in the text">↩</a></p>
</li>
<li id="fn:9">
<p><a href="https://www.fastcompany.com/28121/they-write-right-stuff">https://www.fastcompany.com/28121/they-write-right-stuff</a> <a class="footnote-backref" href="#fnref:9" title="Jump back to footnote 9 in the text">↩</a></p>
</li>
</ol>
</div>Designing a RISC-V CPU, Part 1: Learning hardware design as a software engineer2021-02-16T00:00:00+00:002021-02-16T00:00:00+00:00Hannah McLaughlintag:mcla.ug,2021-02-16:/blog/risc-v-cpu-part-1.html<p>I have no experience in digital logic design. That is, I didn't until I
recently decided that I would like to try designing my own CPU and running it
on an FPGA! If you too are a software engineer with a vague interest in
hardware design, I hope this series of posts about what I've learnt will be
helpful and interesting. In this first installment, I hope to answer these
questions:</p>
<ul>
<li>
<p>What is digital logic design?</p>
</li>
<li>
<p>How do I get started, and what tools might I use?</p>
</li>
</ul>
<p>In future installments, I will go into more detail about my CPU design and the
RISC-V architecture, as well as hopefully answering these questions:</p>
<ul>
<li>
<p>What about digital logic design is fundamentally different from software
design?</p>
</li>
<li>
<p>What about digital logic design is similar to software design?</p>
</li>
</ul>
<p>You can see the code for my CPU at the time of writing
<a href="https://github.com/lochsh/riscy-boi/tree/47e94dc6e9665f73c871add002c34d1516fd5106">here</a>
or an up to date version <a href="https://github.com/lochsh/riscy-boi">here</a>.</p>
<h2>What is digital logic design?</h2>
<p>Digital logic design is designing logic circuits that operate on binary values.
The elementary components are logic gates: an AND gate, for example, has two
inputs and one output. The output is 1 iff<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup> both inputs are 1.</p>
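<p>As a small illustration in plain Python (not a hardware description), gates can be modelled as functions on the values 0 and 1 and then composed into something larger. The function names and the choice of a half adder as the example are assumptions made for this sketch.</p>

```python
from itertools import product


def and_gate(a, b):
    """Output is 1 only when both inputs are 1."""
    return a & b


def xor_gate(a, b):
    """Output is 1 when exactly one input is 1."""
    return a ^ b


def half_adder(a, b):
    """Compose two gates: XOR gives the sum bit, AND gives the carry bit."""
    return xor_gate(a, b), and_gate(a, b)


def truth_table(gate):
    """Evaluate a two-input gate for every combination of inputs."""
    return {(a, b): gate(a, b) for a, b in product((0, 1), repeat=2)}
```

<p>Calling <code>truth_table(and_gate)</code> lists all four input combinations, and only the <code>(1, 1)</code> entry maps to 1.</p>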
<p>Typically, we design synchronous circuits which use flip-flops to store state,
and thereby synchronise the operation of the circuit to a common clock.
Flip-flops are composed of logic gates.</p>
<p>Analogue circuit design is concerned with the electronic components that make
up logic gates, like transistors and diodes. This level of abstraction is often
needed for applications dealing directly with signals derived from analogue
sensors, like radio receivers. When designing a CPU, this level of abstraction
would not be feasible: modern CPUs can have billions of transistors!</p>
<p>Instead, we use tools that can translate our digital logic design into
different useful formats: the configuration of an FPGA (see below); a
simulation; silicon layout.</p>
<h2>What is an FPGA and why are they used?</h2>
<p>We noted above that the same digital logic design tools can be used whether we
are creating a custom ASIC to be made into silicon, or configuring an FPGA. A
Field-Programmable Gate Array is an integrated circuit containing an array of
programmable logic blocks. You could imagine it as a big array of logic
gates that can be connected together in various ways.</p>
<p>Making a custom chip generally costs millions, and of course once your chip is
manufactured it cannot be changed. Thus, generally FPGAs are used when:</p>
<ul>
<li>
<p>You cannot afford to create a custom ASIC due to lack of capital (e.g. if
you're just some hacker like me and not ARM or Intel)</p>
</li>
<li>
<p>You cannot afford to create a custom ASIC because your volume is too low to
make it worth the high one-off costs (e.g. if you are making a small quantity
of MRI machines with custom data acquisition hardware)</p>
</li>
<li>
<p>You need the flexibility</p>
</li>
</ul>
<p>The downsides? FPGAs have a much higher per-chip cost, and they are generally
much slower as a consequence of being able to connect logic blocks together in
very flexible ways. In contrast, a custom design can be reduced to the minimum
number of transistors, with no concern for flexibility.</p>
<p>I think it's helpful context to compare the custom ASIC design process against
that of an FPGA design:</p>
<ul>
<li>
<p><span style="color:#fc04a2">Logic design</span>: just like we'd do for an FPGA, the logic design of an ASIC is
done in a hardware description language.</p>
</li>
<li>
<p><span style="color:#fc04a2">Verification</span>: FPGA designs may well be verified, but you might expect the
process for an ASIC design to be more rigorous – after all, the design
can't be changed once manufactured! Often verification will involve formally
verifying<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup> parts of the design.</p>
</li>
<li>
<p><span style="color:#fc04a2">Synthesis</span>: This creates a <em>netlist</em>: a list of logic blocks and their
connections. The connections are called <em>nets</em>, and the blocks are called
<em>cells</em>. For both FPGAs and ASICs, the cells are vendor-specific.</p>
</li>
<li>
<p><span style="color:#fc04a2">Placement and routing</span> (P&R): for an FPGA, this involves mapping the logic
blocks described in the netlist to actual blocks in the FPGA. The resulting
binary is often called a <em>bitstream</em>. For an ASIC, this involves deciding
where to place the cells on the silicon, and how to connect them up. Both
applications generally use automated optimisation tools for this.</p>
</li>
</ul>
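<p>To give a feel for what a netlist contains, here is a toy representation in plain Python of a half adder (one XOR cell for the sum bit, one AND cell for the carry bit). It is only a sketch: real netlist formats carry much more information, and the <code>Cell</code> class, the cell kinds and the net names are all hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A logic block: a kind plus the nets attached to its ports."""
    name: str
    kind: str
    inputs: tuple
    output: str


# A half adder as a netlist. The cell kinds are generic placeholders; a real
# synthesis tool would emit vendor-specific cell names instead.
HALF_ADDER = [
    Cell(name="u0", kind="XOR", inputs=("a", "b"), output="sum"),
    Cell(name="u1", kind="AND", inputs=("a", "b"), output="carry"),
]


def nets(netlist):
    """Collect every net (connection) mentioned by any cell."""
    found = set()
    for cell in netlist:
        found.update(cell.inputs)
        found.add(cell.output)
    return sorted(found)


def fanout(netlist, net):
    """Names of the cells that read from the given net."""
    return [cell.name for cell in netlist if net in cell.inputs]
```

<p>Placement and routing then has to decide where each cell in such a list physically goes and how each net is wired between them.</p>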
<h2>What tools do I need?</h2>
<h3>A hardware description language: I am using <a href="https://github.com/nmigen/nmigen">nMigen</a><sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup></h3>
<p>You may have heard of Verilog or VHDL: both popular hardware description
languages (HDLs). I use "popular" here to mean widely used, not widely loved.</p>
<p>I won't pretend to know much about these tools: I only know that smarter people
than me with vast logic design experience have a lot of hate for them.
Due to the problems with Verilog and other similar tools, there have been
various attempts at making more useful and friendlier alternatives. nMigen is
one such project, which creates a domain-specific language in Python. In their
own words:</p>
<blockquote>
<p>Despite being faster than schematics entry, hardware design with Verilog and
VHDL remains tedious and inefficient for several reasons. The event-driven
model introduces issues and manual coding that are unnecessary for
synchronous circuits, which represent the lion's share of today's logic
designs. Counterintuitive arithmetic rules result in steeper learning curves
and provide a fertile ground for subtle bugs in designs. Finally, support for
procedural generation of logic (metaprogramming) through "generate"
statements is very limited and restricts the ways code can be made generic,
reused and organized.</p>
<p>To address those issues, we have developed the nMigen FHDL, a library that
replaces the event-driven paradigm with the notions of combinatorial and
synchronous statements, has arithmetic rules that make integers always behave
like mathematical integers, and most importantly allows the design's logic to
be constructed by a Python program. This last point enables hardware
designers to take advantage of the richness of the Python language—object
oriented programming, function parameters, generators, operator overloading,
libraries, etc.—to build well organized, reusable and elegant designs.</p>
</blockquote>
<p>If, like me, you've never used Verilog, then some of this will have only
abstract meaning to you. But it certainly sounds promising,
and I can attest that it has been very straightforward to get started with
logic design without the reportedly large barrier of grappling with Verilog. I
would recommend it, particularly if you are already familiar with Python!</p>
<p>The only downside I can think of is that nMigen is still in development, and
in particular the documentation is not complete. There is a helpful community
at #nmigen on <a href="chat.freenode.net">chat.freenode.net</a>.</p>
<h3>A wave viewer for inspecting simulations: I am using <a href="http://gtkwave.sourceforge.net/">GTKWave</a></h3>
<p>nMigen provides simulation tooling: I use it in my tests, written using
<code>pytest</code>. I record the signals during these tests and view them in a wave
viewer to help debug.</p>
<p><img alt="gtkwave" class="callout" src="/blog/images/gtkwave.png" title="A screenshot of GTKWave"></p>
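<p>Recorded signals are commonly stored as VCD (value change dump) files, a plain-text format that GTKWave can open. The sketch below writes a minimal VCD by hand for a single signal, just to show roughly what such a file contains. It is an illustration under assumptions, not what nMigen's simulator does internally: the function name <code>to_vcd</code>, the scope name <code>top</code> and the identifier <code>!</code> are arbitrary choices.</p>

```python
def to_vcd(name, width, samples, timescale="1 ns"):
    """Render (time, value) samples of one signal as VCD text."""
    ident = "!"  # short identifier code that stands in for the signal name
    lines = [
        f"$timescale {timescale} $end",
        "$scope module top $end",
        f"$var wire {width} {ident} {name} $end",
        "$upscope $end",
        "$enddefinitions $end",
    ]
    for time, value in samples:
        lines.append(f"#{time}")  # timestamp of the change
        if width == 1:
            lines.append(f"{value}{ident}")      # scalar change, e.g. 1!
        else:
            lines.append(f"b{value:b} {ident}")  # vector change, e.g. b101 !
    return "\n".join(lines) + "\n"
```

<p>For example, <code>to_vcd("count", 4, [(0, 0), (10, 5)])</code> describes a 4-bit signal that is 0 at time 0 and becomes 5 at time 10. Only changes are recorded, which is why the wave viewer can draw flat lines between them.</p>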
<h3>Optional: An FPGA dev board. I am using a myStorm BlackIce II</h3>
<p>You don't need an FPGA dev board to create your own CPU. You could do
everything in simulation! The fun of having a board to work with, for me, is
being able to flash LEDs and see my design in action.</p>
<p>Of course, if you were creating something more useful than my very basic CPU,
then you would probably want some hardware to run it on, and this would be less
"optional"!</p>
<h2>Getting started with nMigen</h2>
<p>Rather than immediately trying to design a CPU, I started by making an
Arithmetic Logic Unit (ALU) in nMigen. The ALU is a key piece of any CPU design
that I have seen: it performs arithmetic operations.</p>
<p>Why start with this? I knew I would need an ALU for my CPU; I knew I could make
a simple one; I knew that the feeling of making something is an important
motivator when starting a new project!</p>
<p>My design looked something like this:</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="String">"""</span><span class="String">Arithmetic Logic Unit</span><span class="String">"""</span>
<span class="PreProc">import</span> enum

<span class="PreProc">import</span> nmigen <span class="Statement">as</span> nm


<span class="Statement">class</span> <span class="Function">ALUOp</span><span class="lv12c">(</span>enum<span class="op_lv12">.</span>IntEnum<span class="lv12c">)</span><span class="op_lv0">:</span>
    <span class="String">"""</span><span class="String">Operations for the ALU</span><span class="String">"""</span>
    ADD <span class="op_lv0">=</span> <span class="Number">0</span>
    SUB <span class="op_lv0">=</span> <span class="Number">1</span>


<span class="Statement">class</span> <span class="Function">ALU</span><span class="lv12c">(</span>nm<span class="op_lv12">.</span>Elaboratable<span class="lv12c">)</span><span class="op_lv0">:</span>
    <span class="String">"""</span>
<span class="String">    Arithmetic Logic Unit</span>

<span class="String">    * op (in): the opcode</span>
<span class="String">    * a (in): the first operand</span>
<span class="String">    * b (in): the second operand</span>

<span class="String">    * o (out): the output</span>
<span class="String">    </span><span class="String">"""</span>

    <span class="Statement">def</span> <span class="Function">__init__</span><span class="lv12c">(</span>self<span class="op_lv12">,</span> width<span class="lv12c">)</span><span class="op_lv0">:</span>
        <span class="String">"""</span>
<span class="String">        Initialiser</span>

<span class="String">        Args:</span>
<span class="String">            width (int): data width</span>
<span class="String">        </span><span class="String">"""</span>
        self<span class="op_lv0">.</span>op <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">()</span>
        self<span class="op_lv0">.</span>a <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>
        self<span class="op_lv0">.</span>b <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>
        self<span class="op_lv0">.</span>o <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Signal<span class="lv12c">(</span>width<span class="lv12c">)</span>

    <span class="Statement">def</span> <span class="Function">elaborate</span><span class="lv12c">(</span>self<span class="op_lv12">,</span> _<span class="lv12c">)</span><span class="op_lv0">:</span>
        m <span class="op_lv0">=</span> nm<span class="op_lv0">.</span>Module<span class="lv12c">()</span>

        <span class="Statement">with</span> m<span class="op_lv0">.</span>Switch<span class="lv12c">(</span>self<span class="op_lv12">.</span>op<span class="lv12c">)</span><span class="op_lv0">:</span>
            <span class="Statement">with</span> m<span class="op_lv0">.</span>Case<span class="lv12c">(</span>ALUOp<span class="op_lv12">.</span>ADD<span class="lv12c">)</span><span class="op_lv0">:</span>
                m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>o<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>a <span class="op_lv12">+</span> self<span class="op_lv12">.</span>b<span class="lv12c">)</span>
            <span class="Statement">with</span> m<span class="op_lv0">.</span>Case<span class="lv12c">(</span>ALUOp<span class="op_lv12">.</span>SUB<span class="lv12c">)</span><span class="op_lv0">:</span>
                m<span class="op_lv0">.</span>d<span class="op_lv0">.</span>comb <span class="op_lv0">+=</span> self<span class="op_lv0">.</span>o<span class="op_lv0">.</span>eq<span class="lv12c">(</span>self<span class="op_lv12">.</span>a <span class="op_lv12">-</span> self<span class="op_lv12">.</span>b<span class="lv12c">)</span>
        <span class="Statement">return</span> m
</pre>
</div>
<p>As you can see, we've created a lot of nMigen <code>Signal</code> instances to represent,
well, the signals that define the interface to our ALU! But what is this
<code>elaborate</code> method? My understanding is that "elaboration" is the name for the
first step in synthesising the netlist (see above). The idea in the nMigen code
above is that we've created some <em>elaboratable</em> structure (by inheriting from
<code>nm.Elaboratable</code>), i.e. something that describes digital logic we want to
synthesise. The <code>elaborate</code> method describes that digital logic. It has to
return an nMigen <code>Module</code>.</p>
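<p>It can help to have a software model of what the ALU's output should be for given inputs. The following plain-Python sketch is an assumption about the intended behaviour rather than code from the repository: it treats the output as wrapping around at <code>width</code> bits, which is what assigning a wider result to a <code>Signal(width)</code> is expected to do. The function name <code>alu_model</code> is hypothetical.</p>

```python
import enum


class ALUOp(enum.IntEnum):
    """Same two operations as the nMigen ALU above."""
    ADD = 0
    SUB = 1


def alu_model(op, a, b, width=32):
    """What the ALU's output signal should hold for the given inputs."""
    mask = (1 << width) - 1  # keep only the low `width` bits
    if op == ALUOp.ADD:
        return (a + b) & mask
    if op == ALUOp.SUB:
        return (a - b) & mask
    raise ValueError(f"unknown ALU operation: {op!r}")
```

<p>Under this model, adding 1 to <code>2**32 - 1</code> gives 0 and subtracting 2 from 1 gives <code>2**32 - 1</code>, so overflow and underflow simply wrap.</p>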
<p>Let's have a closer look at the contents of the <code>elaborate</code> method. The
<code>Switch</code> will create some kind of decision logic in the synthesised design.
But what is <code>m.d.comb</code>? nMigen has the concept of synchronous (<code>m.d.sync</code>)
and combinatorial<sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup> (<code>m.d.comb</code>) control domains. From the nMigen
<a href="https://nmigen.info/nmigen/latest/lang.html#lang-domains">docs</a>:</p>
<blockquote>
<p>A control domain is a named group of signals that change their value in
identical conditions.</p>
<p>All designs have a single predefined <em>combinatorial domain</em>, containing all
signals that change immediately when any value used to compute them changes.
The name comb is reserved for the combinatorial domain.</p>
<p>A design can also have any amount of user-defined <em>synchronous domains</em>, also
called clock domains, containing signals that change when a specific edge
occurs on the domain’s clock signal or, for domains with asynchronous reset,
on the domain’s reset signal. Most modules only use a single synchronous
domain[...]</p>
<p>The behavior of assignments differs for signals in combinatorial and
synchronous domains. Collectively, signals in synchronous domains contain the
state of a design, whereas signals in the combinatorial domain cannot form
feedback loops or hold state.</p>
</blockquote>
<p>Let's think about a shift register as an example piece of logic that we wish to
design. Let's say our shift register has 8 bits, and every clock cycle the bit
values are shifted one bit along (with the left-most value coming from an input
signal). This is necessarily synchronous: you couldn't create this
functionality by simply wiring the bits together, which in nMigen is what
assigning them in the combinatorial domain would represent.</p>
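<p>The shift register described above can be sketched as a behavioural model in plain Python, where each call to <code>tick</code> stands in for one clock edge. This is not nMigen code; the class name, the method name and the choice that <code>bits[0]</code> is the left-most bit are assumptions made for illustration.</p>

```python
class ShiftRegister:
    """Behavioural model of a shift register clocked by calls to tick()."""

    def __init__(self, width=8):
        self.width = width
        self.bits = [0] * width  # bits[0] is the left-most bit

    def tick(self, serial_in):
        """One clock edge: every bit moves one place right, input enters left."""
        # The next state is built from the current state in a single step,
        # mirroring flip-flops that all sample their neighbour on the same edge.
        self.bits = [serial_in] + self.bits[:-1]
        return self.bits
```

<p>The important detail is that the new state depends on the <em>previous</em> state, so something has to hold the old values until the next clock edge. That stored state is what makes the design synchronous rather than combinatorial.</p>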
<p>In the next installment of this blog series, I'll have more detail on my CPU
design. As it stands at the moment, I'm trying to retire one instruction per
cycle with no pipelining – this is unusual, but my hope was that it would
make various aspects of the CPU simpler. A consequence of this is that much of
my logic is combinatorial, rather than synchronous, as I have very little state
to maintain between clock cycles. At the moment, something is wrong with my
register file design, and there's a chance I'll have to reassess my "no
pipelining" idea in order to fix it.</p>
<h2>Writing tests</h2>
<p>I like using <code>pytest</code> for Python tests, but you can of course use whatever
framework appeals to you. Here are my tests for the ALU code above:</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="String">"""</span><span class="String">ALU tests</span><span class="String">"""</span>
<span class="PreProc">import</span> nmigen<span class="op_lv0">.</span>sim
<span class="PreProc">import</span> pytest

<span class="PreProc">from</span> riscy_boi <span class="PreProc">import</span> alu


<span class="op_lv0">@</span>pytest<span class="op_lv0">.</span>mark<span class="op_lv0">.</span>parametrize<span class="lv12c">(</span>
    <span class="String">"</span><span class="String">op, a, b, o</span><span class="String">"</span><span class="op_lv12">,</span> <span class="lv11c">[</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">2</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">2</span><span class="op_lv10">,</span> <span class="Number">3</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">2</span><span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">3</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">258</span><span class="op_lv10">,</span> <span class="Number">203</span><span class="op_lv10">,</span> <span class="Number">461</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">5</span><span class="op_lv10">,</span> <span class="Number">0</span><span class="op_lv10">,</span> <span class="Number">5</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">0</span><span class="op_lv10">,</span> <span class="Number">5</span><span class="op_lv10">,</span> <span class="Number">5</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>ADD<span class="op_lv10">,</span> <span class="Number">2</span><span class="op_lv10">**</span><span class="Number">32</span> <span class="op_lv10">-</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">0</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>SUB<span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">0</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>SUB<span class="op_lv10">,</span> <span class="Number">4942</span><span class="op_lv10">,</span> <span class="Number">0</span><span class="op_lv10">,</span> <span class="Number">4942</span><span class="lv10c">)</span><span class="op_lv11">,</span>
        <span class="lv10c">(</span>alu<span class="op_lv10">.</span>ALUOp<span class="op_lv10">.</span>SUB<span class="op_lv10">,</span> <span class="Number">1</span><span class="op_lv10">,</span> <span class="Number">2</span><span class="op_lv10">,</span> <span class="Number">2</span><span class="op_lv10">**</span><span class="Number">32</span> <span class="op_lv10">-</span> <span class="Number">1</span><span class="lv10c">)</span><span class="lv11c">]</span><span class="lv12c">)</span>
<span class="Statement">def</span> <span class="Function">test_alu</span><span class="lv12c">(</span>comb_sim<span class="op_lv12">,</span> op<span class="op_lv12">,</span> a<span class="op_lv12">,</span> b<span class="op_lv12">,</span> o<span class="lv12c">)</span><span class="op_lv0">:</span>
    alu_inst <span class="op_lv0">=</span> alu<span class="op_lv0">.</span>ALU<span class="lv12c">(</span><span class="Number">32</span><span class="lv12c">)</span>

    <span class="Statement">def</span> <span class="Function">testbench</span><span class="lv12c">()</span><span class="op_lv0">:</span>
        <span class="Statement">yield</span> alu_inst<span class="op_lv0">.</span>op<span class="op_lv0">.</span>eq<span class="lv12c">(</span>op<span class="lv12c">)</span>
        <span class="Statement">yield</span> alu_inst<span class="op_lv0">.</span>a<span class="op_lv0">.</span>eq<span class="lv12c">(</span>a<span class="lv12c">)</span>
        <span class="Statement">yield</span> alu_inst<span class="op_lv0">.</span>b<span class="op_lv0">.</span>eq<span class="lv12c">(</span>b<span class="lv12c">)</span>
        <span class="Statement">yield</span> nmigen<span class="op_lv0">.</span>sim<span class="op_lv0">.</span>Settle<span class="lv12c">()</span>
        <span class="Statement">assert</span> <span class="lv12c">(</span><span class="Statement">yield</span> alu_inst<span class="op_lv12">.</span>o<span class="lv12c">)</span> <span class="op_lv0">==</span> o

    comb_sim<span class="lv12c">(</span>alu_inst<span class="op_lv12">,</span> testbench<span class="lv12c">)</span>
</pre>
</div>
<p>and my <code>conftest.py</code>:
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="String">"""</span><span class="String">Test configuration</span><span class="String">"""</span>
<span class="PreProc">import</span> os
<span class="PreProc">import</span> shutil

<span class="PreProc">import</span> nmigen<span class="op_lv0">.</span>sim
<span class="PreProc">import</span> pytest


VCD_TOP_DIR <span class="op_lv0">=</span> os<span class="op_lv0">.</span>path<span class="op_lv0">.</span>join<span class="lv12c">(</span>
    os<span class="op_lv12">.</span>path<span class="op_lv12">.</span>dirname<span class="lv11c">(</span>os<span class="op_lv11">.</span>path<span class="op_lv11">.</span>realpath<span class="lv10c">(</span>__file__<span class="lv10c">)</span><span class="lv11c">)</span><span class="op_lv12">,</span>
    <span class="String">"</span><span class="String">tests</span><span class="String">"</span><span class="op_lv12">,</span>
    <span class="String">"</span><span class="String">vcd</span><span class="String">"</span><span class="lv12c">)</span>


<span class="Statement">def</span> <span class="Function">vcd_path</span><span class="lv12c">(</span>node<span class="lv12c">)</span><span class="op_lv0">:</span>
    directory <span class="op_lv0">=</span> os<span class="op_lv0">.</span>path<span class="op_lv0">.</span>join<span class="lv12c">(</span>VCD_TOP_DIR<span class="op_lv12">,</span> node<span class="op_lv12">.</span>fspath<span class="op_lv12">.</span>basename<span class="op_lv12">.</span>split<span class="lv11c">(</span><span class="String">"</span><span class="String">.</span><span class="String">"</span><span class="lv11c">)[</span><span class="Number">0</span><span class="lv11c">]</span><span class="lv12c">)</span>
    os<span class="op_lv0">.</span>makedirs<span class="lv12c">(</span>directory<span class="op_lv12">,</span> exist_ok<span class="op_lv12">=</span><span class="Function">True</span><span class="lv12c">)</span>
    <span class="Statement">return</span> os<span class="op_lv0">.</span>path<span class="op_lv0">.</span>join<span class="lv12c">(</span>directory<span class="op_lv12">,</span> node<span class="op_lv12">.</span>name <span class="op_lv12">+</span> <span class="String">"</span><span class="String">.vcd</span><span class="String">"</span><span class="lv12c">)</span>


<span class="op_lv0">@</span>pytest<span class="op_lv0">.</span>fixture<span class="lv12c">(</span>scope<span class="op_lv12">=</span><span class="String">"</span><span class="String">session</span><span class="String">"</span><span class="op_lv12">,</span> autouse<span class="op_lv12">=</span><span class="Function">True</span><span class="lv12c">)</span>
<span class="Statement">def</span> <span class="Function">clear_vcd_directory</span><span class="lv12c">()</span><span class="op_lv0">:</span>
    shutil<span class="op_lv0">.</span>rmtree<span class="lv12c">(</span>VCD_TOP_DIR<span class="op_lv12">,</span> ignore_errors<span class="op_lv12">=</span><span class="Function">True</span><span class="lv12c">)</span>


<span class="op_lv0">@</span>pytest<span class="op_lv0">.</span>fixture
<span class="Statement">def</span> <span class="Function">comb_sim</span><span class="lv12c">(</span>request<span class="lv12c">)</span><span class="op_lv0">:</span>

    <span class="Statement">def</span> <span class="Function">run</span><span class="lv12c">(</span>fragment<span class="op_lv12">,</span> process<span class="lv12c">)</span><span class="op_lv0">:</span>
        sim <span class="op_lv0">=</span> nmigen<span class="op_lv0">.</span>sim<span class="op_lv0">.</span>Simulator<span class="lv12c">(</span>fragment<span class="lv12c">)</span>
        sim<span class="op_lv0">.</span>add_process<span class="lv12c">(</span>process<span class="lv12c">)</span>
        <span class="Statement">with</span> sim<span class="op_lv0">.</span>write_vcd<span class="lv12c">(</span>vcd_path<span class="lv11c">(</span>request<span class="op_lv11">.</span>node<span class="lv11c">)</span><span class="lv12c">)</span><span class="op_lv0">:</span>
            sim<span class="op_lv0">.</span>run_until<span class="lv12c">(</span><span class="Number">100e-6</span><span class="lv12c">)</span>

    <span class="Statement">return</span> run


<span class="op_lv0">@</span>pytest<span class="op_lv0">.</span>fixture
<span class="Statement">def</span> <span class="Function">sync_sim</span><span class="lv12c">(</span>request<span class="lv12c">)</span><span class="op_lv0">:</span>
<span id="L40" class="LineNr">40 </span>
<span id="L41" class="LineNr">41 </span> <span class="Statement">def</span> <span class="Function">run</span><span class="lv12c">(</span>fragment<span class="op_lv12">,</span> process<span class="lv12c">)</span><span class="op_lv0">:</span>
<span id="L42" class="LineNr">42 </span> sim <span class="op_lv0">=</span> nmigen<span class="op_lv0">.</span>sim<span class="op_lv0">.</span>Simulator<span class="lv12c">(</span>fragment<span class="lv12c">)</span>
<span id="L43" class="LineNr">43 </span> sim<span class="op_lv0">.</span>add_sync_process<span class="lv12c">(</span>process<span class="lv12c">)</span>
<span id="L44" class="LineNr">44 </span> sim<span class="op_lv0">.</span>add_clock<span class="lv12c">(</span><span class="Number">1</span> <span class="op_lv12">/</span> <span class="Number">10e6</span><span class="lv12c">)</span>
<span id="L45" class="LineNr">45 </span> <span class="Statement">with</span> sim<span class="op_lv0">.</span>write_vcd<span class="lv12c">(</span>vcd_path<span class="lv11c">(</span>request<span class="op_lv11">.</span>node<span class="lv11c">)</span><span class="lv12c">)</span><span class="op_lv0">:</span>
<span id="L46" class="LineNr">46 </span> sim<span class="op_lv0">.</span>run<span class="lv12c">()</span>
<span id="L47" class="LineNr">47 </span>
<span id="L48" class="LineNr">48 </span> <span class="Statement">return</span> run
</pre>
</div>
<p></html></p>
<p>Every test generates a <code>vcd</code> file for me to view in a wave viewer, like
GTKWave, for debugging purposes. You'll notice that the combinatorial
simulation fixture runs for an arbitrary, short time period, whereas the
synchronous simulation fixture runs for a defined number of clock cycles.</p>
<p>Yielding a signal in the test function requests its current value from the
simulator. For combinatorial logic, we yield <code>nmigen.sim.Settle()</code> to ask the
simulator to let the logic settle before we read any values.</p>
<p>For synchronous logic, you can also yield without an argument to start a new
clock cycle.</p>
<h2>Designing a CPU</h2>
<p>Once I'd gotten familiar with nMigen, I started trying to draw a block diagram
for my CPU. I will go into much more detail on this in the next installment of
this blog series, but I will briefly say that I started by drawing out the
logic required for one instruction, then drew out the logic for another
instruction, then figured out how to combine them. Here's the first messy
sketch:</p>
<p><img alt="sketch" class="callout" src="/blog/images/block-diagram-sktech.jpg" title="A screenshot of my first CPU design sketch"></p>
<p>This block diagram step was extremely valuable in figuring out what the
interfaces of different components needed to be, but I wouldn't have wanted to
do it before playing around in nMigen first and learning a bit about digital
logic design in the process. The jazzed up block diagram currently looks like
this:</p>
<p><img alt="block" class="callout" src="/blog/images/riscyboi.png" title="Block diagram of current CPU design"></p>
<p>Stay tuned for the next installment where I actually delve into RISC-V and CPU
design. I expect there to be a third installment of me reworking my design and
getting it working on the entire instruction set (RV32I) that I am implementing
:)</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>"if and only if" <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>If you're a software engineer who has worked on high assurance software,
such as that for a high risk medical device, you might think "formal
verification" refers to any formalised verification process. Here I use formal
verification to refer to mathematically proving correctness, see
<a href="https://en.wikipedia.org/wiki/Formal_verification">https://en.wikipedia.org/wiki/Formal_verification</a> <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>Note that you want the nmigen project under the nmigen organisation on
GitHub. You do not want the one under the m-labs organisation. The situation is
a bit unfortunate and complicated, but just know that the former is the only one
under active development. It's possible that the active project might change
name soon – I'll do my best to update this blog post should that happen. <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>also known as combinational <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
</ol>
</div>
<h1>Why can't I pass std::vector<Child*> as std::vector<Parent*>?</h1>
<p>Hannah McLaughlin, 2020-07-25</p>
<h2>Introduction</h2>
<p>This is a short post reflecting on some pretty basic C++ that bothered me
because I felt like I didn't fully understand it. I felt that the language
should be able to do more, and wanted to understand why it could not.</p>
<h2>Subtype polymorphism and templates</h2>
<p>In C++, subtype polymorphism allows pointers and references to child classes to
be treated as pointers and references to their parents<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup>:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">class</span><span class="Function"> Parent</span> <span class="lv12c">{}</span><span class="op_lv0">;</span>
<span id="L2" class="LineNr"> 2 </span><span class="Type">class</span><span class="Function"> Child</span> <span class="op_lv0">:</span> <span class="Statement">public</span><span class="Function"> Parent</span> <span class="lv12c">{}</span><span class="op_lv0">;</span>
<span id="L3" class="LineNr"> 3 </span>
<span id="L4" class="LineNr"> 4 </span><span class="Type">void</span> <span class="Function">foo</span><span class="lv12c">(</span>Parent<span class="op_lv12">*</span><span class="lv12c">)</span><span class="op_lv0">;</span>
<span id="L5" class="LineNr"> 5 </span>
<span id="L6" class="LineNr"> 6 </span><span class="Type">int</span> <span class="Function">main</span><span class="lv12c">()</span> <span class="lv12c">{</span>
<span id="L7" class="LineNr"> 7 </span> Child<span class="op_lv12">*</span> child <span class="op_lv12">=</span> <span class="Statement">new</span> <span class="Function">Child</span><span class="lv11c">()</span><span class="op_lv12">;</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Function">foo</span><span class="lv11c">(</span>child<span class="lv11c">)</span><span class="op_lv12">;</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Comment">// ...</span>
<span id="L10" class="LineNr">10 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>This is at the core of Object Oriented style programming in C++, and, like all
forms of polymorphism, allows us to write code that is only tied to an
interface, not its implementations. We can swap out implementations easily,
making the code more flexible, as well as more testable: test doubles that
implement the interface can be injected during tests.</p>
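<p>As a minimal sketch of the test-double idea, assume a hypothetical <code>Clock</code>
interface with a <code>FakeClock</code> implementation. None of these names come from
real code; they only illustrate how a function written against an interface can
be given a fake during a test.</p>

```cpp
#include <cassert>

// Hypothetical interface: the only thing callers depend on.
class Clock {
  public:
    virtual ~Clock() = default;
    virtual long now_ms() const = 0;
};

// Hypothetical test double: always reports a fixed, caller-chosen time.
class FakeClock : public Clock {
  public:
    explicit FakeClock(long fixed_ms) : fixed_ms_(fixed_ms) {}
    long now_ms() const override { return fixed_ms_; }

  private:
    long fixed_ms_;
};

// Code under test is written against Clock, not a concrete implementation.
bool has_expired(const Clock& clk, long deadline_ms) {
    return clk.now_ms() >= deadline_ms;
}

// Usage: inject the fake so the outcome is deterministic.
void test_has_expired() {
    FakeClock fake(1000);
    assert(has_expired(fake, 500));
    assert(!has_expired(fake, 1500));
}
```

<p>A production implementation of <code>Clock</code> would read a real time source;
<code>has_expired</code> would not need to change.</p>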
<p>C++ also provides another mechanism for polymorphism: templates. Templates
allow us to parameterise interfaces by type. The template <code>std::vector</code>
describes a growable list and is parameterised by the type the list contains.</p>
<p>Unfortunately, in C++, you cannot combine these two forms of polymorphism in a
way that you might like to.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Type">void</span> <span class="Function">bar</span><span class="lv12c">(</span><span class="Type">const</span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>Parent<span class="op_lv11">*</span><span class="lv11c">></span><span class="op_lv12">&</span><span class="lv12c">)</span><span class="op_lv0">;</span>
<span id="L2" class="LineNr">2 </span>
<span id="L3" class="LineNr">3 </span><span class="Type">int</span> <span class="Function">main</span><span class="lv12c">()</span> <span class="lv12c">{</span>
<span id="L4" class="LineNr">4 </span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>Child<span class="op_lv11">*</span><span class="lv11c">></span> children <span class="op_lv12">=</span> <span class="lv11c">{</span><span class="Statement">new</span> <span class="Function">Child</span><span class="lv10c">()</span><span class="lv11c">}</span><span class="op_lv12">;</span>
<span id="L5" class="LineNr">5 </span> <span class="Function">bar</span><span class="lv11c">(</span>children<span class="lv11c">)</span><span class="op_lv12">;</span> <span class="Comment">// Ill-formed (won't compile)</span>
<span id="L6" class="LineNr">6 </span> <span class="Comment">// ...</span>
<span id="L7" class="LineNr">7 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>It's easy to explain this by pointing out that <code>std::vector<Child*></code> does not
inherit from <code>std::vector<Parent*></code>, and so why would you expect to be able to
use them interchangeably?</p>
<p>This explanation makes sense intuitively, but I found it to be lacking. I
thought about this:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">void</span> <span class="Function">bar</span><span class="lv12c">(</span><span class="Type">const</span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>Parent<span class="op_lv11">*</span><span class="lv11c">></span><span class="op_lv12">&</span><span class="lv12c">)</span><span class="op_lv0">;</span>
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span><span class="Type">int</span> <span class="Function">main</span><span class="lv12c">()</span> <span class="lv12c">{</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>Child<span class="op_lv11">*</span><span class="lv11c">></span> children <span class="op_lv12">=</span> <span class="lv11c">{</span><span class="Statement">new</span> <span class="Function">Child</span><span class="lv10c">()</span><span class="lv11c">}</span><span class="op_lv12">;</span>
<span id="L5" class="LineNr"> 5 </span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>Parent<span class="op_lv11">*</span><span class="lv11c">></span> children_as_parents <span class="op_lv12">=</span>
<span id="L6" class="LineNr"> 6 </span> <span class="op_lv12">*</span><span class="Statement">reinterpret_cast</span><span class="lv11c"><</span><span class="Constant">std</span><span class="op_lv11">::</span><span class="Type">vector</span><span class="lv10c"><</span>Parent<span class="op_lv10">*</span><span class="lv10c">></span><span class="op_lv11">*</span><span class="lv11c">>(</span><span class="op_lv11">&</span>children<span class="lv11c">)</span><span class="op_lv12">;</span>
<span id="L7" class="LineNr"> 7 </span> <span class="Function">bar</span><span class="lv11c">(</span>children_as_parents<span class="lv11c">)</span><span class="op_lv12">;</span> <span class="Comment">// Well-formed (compiles)</span>
<span id="L8" class="LineNr"> 8 </span>
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">return</span> <span class="Number">0</span><span class="op_lv12">;</span>
<span id="L10" class="LineNr">10 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>I thought this would be "safe" (we'll see why not below): the memory layout
will still be valid, as the
vector contains pointers, so accessing elements of the vector would work as
expected; and of course, because <code>Child</code> inherits from <code>Parent</code>, accessing the
<code>Parent</code> members after the cast would also work as expected.</p>
<p>I thought: if I know that this is safe, why can't the compiler? If the compiler
can know that it's safe, why doesn't the language allow it without circumventing
the type system with <code>reinterpret_cast</code>?</p>
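<p>For contrast, here is a sketch of the conversion the language does allow
without any cast: building a new <code>std::vector<Parent*></code> element by element.
The helper name <code>to_parents</code> is made up for illustration. It relies only on
the implicit <code>Child*</code> to <code>Parent*</code> conversion and on the iterator-range
constructor of <code>std::vector</code>.</p>

```cpp
#include <cassert>
#include <vector>

class Parent {
  public:
    virtual ~Parent() = default;
};

class Child : public Parent {};

// Hypothetical helper: copies the pointers into a new vector, converting
// each Child* to Parent* with the ordinary implicit pointer conversion.
std::vector<Parent*> to_parents(const std::vector<Child*>& children) {
    return std::vector<Parent*>(children.begin(), children.end());
}

// Usage: the result is an independent copy of the pointers, so modifying
// one vector never changes the other.
void demo_to_parents() {
    Child a, b;
    std::vector<Child*> children = {&a, &b};
    std::vector<Parent*> parents = to_parents(children);
    assert(parents.size() == children.size());
    assert(parents[0] == &a);
}
```

<p>The price is a copy proportional to the number of elements, which is part of
why this feels unsatisfying compared with simply passing the original vector.</p>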
<h2>The first problem: mutability</h2>
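<p>The example above is not actually safe because the vector is mutable. If
<code>children_as_parents</code> aliased <code>children</code> (for example, a reference obtained
from the same cast, rather than the copy made above), we could add pointers to
other subtypes of <code>Parent</code> through it and then read them back through
<code>children</code> as if they were <code>Child*</code>, which would be undefined behaviour.</p>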
<p>This suggests we could never have <code>std::vector</code> behave the way I've described
with subtypes, and immutable containers would be needed instead.</p>
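<p>To make the "immutable container" idea concrete, here is a rough sketch of a
read-only snapshot type. <code>ReadOnlyView</code> is a hypothetical name, not a
standard library facility, and the <code>Parent</code> and <code>Child</code> classes are
stand-ins. Because it offers no way to insert elements, the mutability problem
described above cannot arise through it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

class Parent {
  public:
    virtual ~Parent() = default;
    virtual int foo() const { return 0; }
};

class Child : public Parent {
  public:
    int foo() const override { return 1; }
};

// Hypothetical read-only snapshot of a vector of pointers, viewed as Base.
// It exposes no way to add elements, so a pointer to some other subtype
// can never be inserted through it.
template <typename Base>
class ReadOnlyView {
  public:
    template <typename Derived>
    explicit ReadOnlyView(const std::vector<Derived*>& source)
        : items_(source.begin(), source.end()) {
        static_assert(std::is_base_of<Base, Derived>::value,
                      "Derived must inherit from Base");
    }

    std::size_t size() const { return items_.size(); }

    const Base& operator[](std::size_t i) const {
        assert(i < items_.size());
        return *items_[i];
    }

  private:
    std::vector<const Base*> items_;  // copied pointers, never handed out mutably
};
```

<p>Note that this copies the pointers when it is constructed, so it is a snapshot
rather than a live view: elements added to the source vector afterwards do not
appear in it.</p>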
<h2>The second problem: template specialization</h2>
<p>The <code>reinterpret_cast</code> above could only work because I know that the
implementations of <code>std::vector<Parent*></code> and <code>std::vector<Child*></code> are
identical.</p>
<p>However, that of course isn't the case in general for templates: part of their
function is that you can specialise them for different type arguments.<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">template</span> <span class="lv12c"><</span><span class="Type">typename</span> T<span class="lv12c">></span>
<span id="L2" class="LineNr"> 2 </span><span class="Type">class</span><span class="Function"> A</span> <span class="lv12c">{</span>
<span id="L3" class="LineNr"> 3 </span> <span class="op_lv12">...</span>
<span id="L4" class="LineNr"> 4 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L5" class="LineNr"> 5 </span>
<span id="L6" class="LineNr"> 6 </span><span class="Type">template</span> <span class="lv12c"><></span> <span class="Type">class</span><span class="Function"> A</span><span class="lv12c"><</span>Parent<span class="lv12c">></span> <span class="lv12c">{</span>
<span id="L7" class="LineNr"> 7 </span> <span class="Type">double</span> m_<span class="op_lv12">;</span>
<span id="L8" class="LineNr"> 8 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L9" class="LineNr"> 9 </span>
<span id="L10" class="LineNr">10 </span><span class="Type">template</span><span class="lv12c"><></span> <span class="Type">class</span><span class="Function"> A</span><span class="lv12c"><</span>Child<span class="lv12c">></span> <span class="lv12c">{</span>
<span id="L11" class="LineNr">11 </span> <span class="Type">int</span> bar_<span class="op_lv12">;</span>
<span id="L12" class="LineNr">12 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
</pre>
</div>
<p></html></p>
<p>If you were able to pass <code>A<Child></code> as if it were <code>A<Parent></code>, what could that
look like?</p>
<p>When a <code>Child</code> object is copied to a <code>Parent</code> object, it is <em>sliced</em>. This just
means that any data in the child object that isn't part of <code>Parent</code> is not
copied.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">class</span><span class="Function"> Parent</span> <span class="lv12c">{</span>
<span id="L2" class="LineNr"> 2 </span> <span class="Statement">public</span><span class="op_lv12">:</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Type">virtual</span> <span class="Type">int</span> <span class="Function">foo</span><span class="lv11c">()</span> <span class="Type">const</span> <span class="lv11c">{</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Statement">return</span> <span class="Number">0</span><span class="op_lv11">;</span>
<span id="L5" class="LineNr"> 5 </span> <span class="lv11c">}</span>
<span id="L6" class="LineNr"> 6 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Type">class</span><span class="Function"> Child</span> <span class="op_lv0">:</span> <span class="Statement">public</span><span class="Function"> Parent</span> <span class="lv12c">{</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">public</span><span class="op_lv12">:</span>
<span id="L10" class="LineNr">10 </span> <span class="Type">int</span> <span class="Function">foo</span><span class="lv11c">()</span> <span class="Type">const</span> <span class="Type">override</span> <span class="lv11c">{</span>
<span id="L11" class="LineNr">11 </span> <span class="Statement">return</span> foo_<span class="op_lv11">;</span>
<span id="L12" class="LineNr">12 </span> <span class="lv11c">}</span>
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span> <span class="Statement">private</span><span class="op_lv12">:</span>
<span id="L15" class="LineNr">15 </span> <span class="Type">const</span> <span class="Type">int</span> foo_ <span class="op_lv12">=</span> <span class="Number">1</span><span class="op_lv12">;</span>
<span id="L16" class="LineNr">16 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span><span class="Type">int</span> <span class="Function">main</span><span class="lv12c">()</span> <span class="lv12c">{</span>
<span id="L19" class="LineNr">19 </span> Child child<span class="op_lv12">;</span>
<span id="L20" class="LineNr">20 </span> Parent parent <span class="op_lv12">=</span> child<span class="op_lv12">;</span> <span class="Comment">// copy -- Child::foo_ is not copied</span>
<span id="L21" class="LineNr">21 </span> <span class="Statement">return</span> parent<span class="op_lv12">.</span><span class="Function">foo</span><span class="lv11c">()</span><span class="op_lv12">;</span> <span class="Comment">// returns 0, not 1</span>
<span id="L22" class="LineNr">22 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>There's no obvious way for this to work for our template example above:
<code>A<Child></code> does not share the members of <code>A<Parent></code>, so how could it be sliced
down?</p>
<p>Even if we were to take a reference (which means no slicing would occur), the
problem is essentially the same: <code>A<Parent></code> could have a method that
<code>A<Child></code> does not have.</p>
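<p>The point can be checked mechanically. The sketch below fills in the
specialisations of <code>A</code> from earlier with stand-in members (the method
<code>only_in_parent</code> is invented for illustration) and uses type traits to show
that the two instantiations are unrelated types, even though <code>Child</code>
inherits from <code>Parent</code>.</p>

```cpp
#include <cassert>
#include <type_traits>

class Parent {};
class Child : public Parent {};

// Primary template, present so that the specialisations below are legal.
template <typename T>
struct A {};

// Two full specialisations with unrelated members.
template <>
struct A<Parent> {
    double m_ = 0.0;
    double only_in_parent() const { return m_; }  // hypothetical method
};

template <>
struct A<Child> {
    int bar_ = 0;
};

// Child derives from Parent...
static_assert(std::is_base_of<Parent, Child>::value, "Child is-a Parent");
// ...but the two instantiations of A have no inheritance relationship.
static_assert(!std::is_base_of<A<Parent>, A<Child>>::value,
              "A<Child> does not derive from A<Parent>");
static_assert(!std::is_convertible<A<Child>*, A<Parent>*>::value,
              "no pointer conversion between the instantiations");

// Usage: each instantiation only offers its own members.
void demo_specialisations() {
    A<Parent> for_parent;
    A<Child> for_child;
    assert(for_parent.only_in_parent() == 0.0);
    assert(for_child.bar_ == 0);
    // for_child.only_in_parent();  // would not compile: no such member
}
```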
<h2>A deeply dissatisfying "solution"</h2>
<p>Of course, we can still write a function that accepts a vector of <code>Parent*</code>
or a vector of <code>Child*</code> by making it a template:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">class</span><span class="Function"> Parent</span> <span class="lv12c">{</span>
<span id="L2" class="LineNr"> 2 </span> <span class="Statement">public</span><span class="op_lv12">:</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Type">virtual</span> <span class="Type">int</span> <span class="Function">foo</span><span class="lv11c">()</span> <span class="Type">const</span> <span class="op_lv12">=</span> <span class="Number">0</span><span class="op_lv12">;</span>
<span id="L4" class="LineNr"> 4 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L5" class="LineNr"> 5 </span>
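<span id="L6" class="LineNr"> 6 </span><span class="Type">class</span><span class="Function"> Child</span> <span class="op_lv0">:</span> <span class="Statement">public</span><span class="Function"> Parent</span> <span class="lv12c">{</span>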
<span id="L7" class="LineNr"> 7 </span> <span class="Statement">public</span><span class="op_lv12">:</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Type">int</span> <span class="Function">foo</span><span class="lv11c">()</span> <span class="Type">const</span> <span class="Type">override</span> <span class="lv11c">{</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">return</span> <span class="Number">1</span><span class="op_lv11">;</span>
<span id="L10" class="LineNr">10 </span> <span class="lv11c">}</span>
<span id="L11" class="LineNr">11 </span><span class="lv12c">}</span><span class="op_lv0">;</span>
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span><span class="Type">template</span> <span class="lv12c"><</span><span class="Type">typename</span> T<span class="lv12c">></span>
<span id="L14" class="LineNr">14 </span><span class="Type">int</span> <span class="Function">func</span><span class="lv12c">(</span><span class="Type">const</span> <span class="Constant">std</span><span class="op_lv12">::</span><span class="Type">vector</span><span class="lv11c"><</span>T<span class="op_lv11">*</span><span class="lv11c">></span><span class="op_lv12">&</span> vec<span class="lv12c">)</span> <span class="lv12c">{</span>
<span id="L15" class="LineNr">15 </span> <span class="Statement">return</span> vec<span class="lv11c">[</span><span class="Number">0</span><span class="lv11c">]</span><span class="op_lv12">-></span><span class="Function">foo</span><span class="lv11c">()</span><span class="op_lv12">;</span>
<span id="L16" class="LineNr">16 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>But what I wanted was to be able to express that my function can take a vector
of any type that implements <code>Parent</code>. This doesn't express that. Before C++20
there is no first-class language feature for explicitly constraining template
type arguments. In C++20, concepts
provide a minimal way to do this<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup>, but:</p>
<ul>
<li>
<p>Concept constraints are implicitly met, not explicitly implemented</p>
</li>
<li>
<p>Concepts are orthogonal to the subtype polymorphism I want to ergonomically take advantage of.</p>
</li>
</ul>
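<p>One common way to approximate such a constraint without concepts is a
<code>static_assert</code> on <code>std::is_base_of</code>. The sketch below is a variation on
<code>func</code>, with <code>Child</code> deriving from <code>Parent</code>. It is a check bolted
onto the function body rather than part of the signature, so it does not really
change the picture painted above.</p>

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

class Parent {
  public:
    virtual ~Parent() = default;
    virtual int foo() const = 0;
};

class Child : public Parent {
  public:
    int foo() const override { return 1; }
};

// Sketch: reject any T that does not inherit from Parent with a readable
// message, instead of failing somewhere inside the function body.
template <typename T>
int func(const std::vector<T*>& vec) {
    static_assert(std::is_base_of<Parent, T>::value,
                  "func requires T to inherit from Parent");
    assert(!vec.empty());  // precondition: there is a first element
    return vec[0]->foo();
}
```

<p>Calling <code>func</code> with a <code>std::vector<int*></code> now fails with the message
above. In C++20 a similar intent can be written in the signature with a
<code>requires</code> clause and <code>std::derived_from</code>.</p>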
<h2>A summary</h2>
<p>Ultimately, the explanation of why you can't pass <code>std::vector<Child*></code> as
<code>std::vector<Parent*></code> is really the same explanation we had at the start: you
can't pass templates of subtypes as templates of parents because the template
itself is not a subtype.</p>
<p>But really, a more meta-explanation is that C++ lacks the ability to constrain
template type arguments. This makes templates inherently inexpressive, and
makes it difficult to harmoniously use templates with inherited types.</p>
<p>This is one place where Rust shines in comparison to C++.</p>
<h2>A better world</h2>
<p>I really like the way Rust approaches polymorphism. Traits are Rust's only
mechanism for polymorphism. They provide a unified way for specifying both
static and dynamic interfaces. They facilitate designs based on composition,
reducing coupling. They allow for explicit constraint of type arguments in
generics.<sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup></p>
<p>In Rust, we can express the equivalent of what we wanted above easily:<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Keyword">trait</span> <span class="Identifier">Mag</span> <span class="lv12c">{</span>
<span id="L2" class="LineNr"> 2 </span> <span class="Keyword">fn</span> <span class="Function">magnitude</span><span class="lv11c">(</span><span class="op_lv11">&</span><span class="Constant">self</span><span class="lv11c">)</span> <span class="op_lv12">-></span> <span class="Type">f32</span><span class="op_lv12">;</span>
<span id="L3" class="LineNr"> 3 </span><span class="lv12c">}</span>
<span id="L4" class="LineNr"> 4 </span>
<span id="L5" class="LineNr"> 5 </span><span class="Keyword">struct</span> <span class="Identifier">Point</span> <span class="lv12c">{</span>
<span id="L6" class="LineNr"> 6 </span> x<span class="op_lv12">:</span> <span class="Type">f32</span><span class="op_lv12">,</span>
<span id="L7" class="LineNr"> 7 </span> y<span class="op_lv12">:</span> <span class="Type">f32</span><span class="op_lv12">,</span>
<span id="L8" class="LineNr"> 8 </span><span class="lv12c">}</span>
<span id="L9" class="LineNr"> 9 </span>
<span id="L10" class="LineNr">10 </span><span class="Keyword">impl</span> Mag <span class="Keyword">for</span> Point <span class="lv12c">{</span>
<span id="L11" class="LineNr">11 </span> <span class="Keyword">fn</span> <span class="Function">magnitude</span><span class="lv11c">(</span><span class="op_lv11">&</span><span class="Constant">self</span><span class="lv11c">)</span> <span class="op_lv12">-></span> <span class="Type">f32</span> <span class="lv11c">{</span>
<span id="L12" class="LineNr">12 </span> <span class="Type">f32</span><span class="op_lv11">::</span><span class="Function">sqrt</span><span class="lv10c">(</span><span class="lv9c">(</span><span class="Constant">self</span><span class="op_lv9">.</span>x <span class="op_lv9">*</span> <span class="Constant">self</span><span class="op_lv9">.</span>x<span class="lv9c">)</span> <span class="op_lv10">+</span> <span class="lv9c">(</span><span class="Constant">self</span><span class="op_lv9">.</span>y <span class="op_lv9">*</span> <span class="Constant">self</span><span class="op_lv9">.</span>y<span class="lv9c">)</span><span class="lv10c">)</span>
<span id="L13" class="LineNr">13 </span> <span class="lv11c">}</span>
<span id="L14" class="LineNr">14 </span><span class="lv12c">}</span>
<span id="L15" class="LineNr">15 </span>
<span id="L16" class="LineNr">16 </span><span class="Keyword">fn</span> <span class="Function">magnitude_of_first</span><span class="lv12c"><</span>T<span class="lv12c">>(</span>vec<span class="op_lv12">:</span> <span class="op_lv12">&</span><span class="Type">Vec</span><span class="lv11c"><</span>T<span class="lv11c">></span><span class="lv12c">)</span> <span class="op_lv0">-></span> <span class="Type">f32</span>
<span id="L17" class="LineNr">17 </span><span class="Keyword">where</span>
<span id="L18" class="LineNr">18 </span> T<span class="op_lv0">:</span> Mag<span class="op_lv0">,</span>
<span id="L19" class="LineNr">19 </span><span class="lv12c">{</span>
<span id="L20" class="LineNr">20 </span> vec<span class="lv11c">[</span><span class="Number">0</span><span class="lv11c">]</span><span class="op_lv12">.</span><span class="Function">magnitude</span><span class="lv11c">()</span>
<span id="L21" class="LineNr">21 </span><span class="lv12c">}</span>
<span id="L22" class="LineNr">22 </span>
<span id="L23" class="LineNr">23 </span><span class="Keyword">fn</span> <span class="Function">main</span><span class="lv12c">()</span> <span class="lv12c">{</span>
<span id="L24" class="LineNr">24 </span> <span class="Comment">// The type hint is here for illustrative purposes</span>
<span id="L25" class="LineNr">25 </span> <span class="Keyword">let</span> v<span class="op_lv12">:</span> <span class="Type">Vec</span><span class="lv11c"><</span>Point<span class="lv11c">></span> <span class="op_lv12">=</span> <span class="PreProc">vec!</span><span class="lv11c">[</span>Point <span class="lv10c">{</span> x<span class="op_lv10">:</span> <span class="Number">0.0</span><span class="op_lv10">,</span> y<span class="op_lv10">:</span> <span class="Number">1.0</span> <span class="lv10c">}</span><span class="op_lv11">,</span> Point <span class="lv10c">{</span> x<span class="op_lv10">:</span> <span class="Number">2.0</span><span class="op_lv10">,</span> y<span class="op_lv10">:</span> <span class="Number">3.0</span> <span class="lv10c">}</span><span class="lv11c">]</span><span class="op_lv12">;</span>
<span id="L26" class="LineNr">26 </span> <span class="PreProc">dbg!</span><span class="lv11c">(</span><span class="Function">magnitude_of_first</span><span class="lv10c">(</span><span class="op_lv10">&</span>v<span class="lv10c">)</span><span class="lv11c">)</span><span class="op_lv12">;</span>
<span id="L27" class="LineNr">27 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>This is the power of a more expressive type system, which to me is such a
crucial part of a programming language. C++'s type system always falls flat for
me, for many reasons: the limited expressiveness shown above; the lack of a way
for the type system to track object lifetime, so we have a library of smart
pointers instead which can easily be abused; implicit type conversion that can
make code less clear and cause bugs; references that are taken implicitly; a
confusing value taxonomy.</p>
<p>I am definitely complaining here; but I also think part of developing a good
understanding of a system is understanding its limitations, being able to
understand why some things aren't possible, and knowing what a different system
would (or does!) look like. I hope this Rust example has helped with that.</p>
<p>As always, if you have any C++ insight to share I'd appreciate it: it's a
complex topic and I always have more to learn :)</p>
<h2>Update 26th July 2020: variance</h2>
<p><a href="https://twitter.com/smattrr">Matthew Fernandez</a> kindly sent me <a href="https://eli.thegreenplace.net/2018/covariance-and-contravariance-in-subtyping/">this blog
post</a>
about the concept of type variance, which gives us a more formal language to
talk about some of the ideas discussed in this blog post, namely that C++
templates are <em>invariant</em>. I wanted them to be <em>covariant</em>.</p>
<p>From the linked blog post, "variance refers to how subtyping between more
complex types relates to subtyping between their components". In Rust,
implementing a trait does not create a subtype of that trait, so there's no
concept of variance to apply to our example above.<sup id="fnref:6"><a class="footnote-ref" href="#fn:6">6</a></sup></p>
<p>The issue of variance in C++ is relevant due to the use of both templates and
subtyping; Rust's unified mechanism for polymorphism, with explicitly met
constraints on type arguments, avoids this issue.</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>The same would apply with STL smart pointers, which is what I would find
myself using in the wild. I did not use them here for simplicity, and because
they are not germane to the example. <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>Indeed, C++'s STL specification notoriously allows for optimisations on
<code>std::vector<bool></code> that mean it can behave differently from how one might
expect, in order to save space:
<a href="https://en.cppreference.com/w/cpp/container/vector_bool">https://en.cppreference.com/w/cpp/container/vector_bool</a> <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>I have a blog post about C++20 concepts, specifically about how they
differ from Rust traits,
<a href="https://mcla.ug/blog/cpp20-concepts-are-not-like-rust-traits.html">here</a>. <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>If you'd like to read more about Rust in general, I have a blog post
<a href="https://mcla.ug/blog/rust-a-future-for-real-time-and-safety-critical-software.html">here</a> <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>This is obviously a bit of a contrived example, but I hope it is helpful.
In reality it's rare to pass a vector like this, and instead a slice would be
passed. I wanted to use the <code>Vec</code> type for a closer analogue to the original
C++ example.
<a href="https://doc.rust-lang.org/book/ch04-03-slices.html">https://doc.rust-lang.org/book/ch04-03-slices.html</a> <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
<li id="fn:6">
<p>Subtyping and variance <em>are</em> relevant concepts in Rust, but they apply to
lifetimes. This isn't something my knowledge is robust on, but you can read
about it in the
<a href="https://doc.rust-lang.org/reference/subtyping.html">Rust Reference</a> <a class="footnote-backref" href="#fnref:6" title="Jump back to footnote 6 in the text">↩</a></p>
</li>
</ol>
</div>
<p>Emulating an STM32F4 in QEMU to test ARM assembly (Hannah McLaughlin, 2020-04-04)</p>
<p>I recently published a blog post titled <a href="https://mcla.ug/blog/how-to-flash-an-led.html" target="_blank">How to flash an
LED</a> about writing ARM assembly
for an STM32. I was running my code on a
<a href="https://1bitsy.org/" target="_blank">1bitsy</a>
development board, but I wanted anyone to be able to have a go at writing the
assembly and testing it – even if they didn't have the right hardware.
You can find <a href="https://github.com/lochsh/gpio-toggle-exercise" target="_blank">the repo on my
Github</a>.</p>
<p>Finding the specific QEMU tools I needed took a while, so I wanted to document
it in this blog post.</p>
<h2>The quest for a QEMU STM32F4 machine</h2>
<p><a href="https://www.qemu.org/" target="_blank">QEMU</a> is a "generic and open source machine emulator
and virtualizer". An example where emulation is useful: if you are writing
software for an embedded target, reliable automated tests can be a challenge.
Emulating your embedded target on your host computer allows for easier
testing, and for isolating problems to do with the real hardware from problems
to do with the software.</p>
<p>The 1bitsy has a STM32F415 microcontroller, so I was looking for emulation for
a development board with a similar MCU. QEMU comes with a set of built-in
machines, and you can write your own machines to emulate the hardware you
desire.</p>
<p>Unfortunately, none of the built-in machines suited my purposes. There were
multiple Cortex M3 machines, but none of them were STM32s, and there were no
Cortex M4 machines. I wanted to use an STM32F4 so as to avoid confusion for
people going from my assembly blog post to the emulation. I also gave the blog
post as a talk at work, and didn't want my coworkers to be confused by the
inconsistency. Creating my own QEMU machine was not remotely feasible on the
timescales I was working on for my work talk, and I believe it is a lot of
work! So, I searched the internet for an available STM32F4 QEMU machine.</p>
<h3>Enter GNU MCU Eclipse...maybe</h3>
<p>During this search, I came across many mentions of <a href="https://gnu-mcu-eclipse.github.io/" target="_blank">GNU MCU
Eclipse</a>, a suite of tools for the Eclipse
IDE which included a QEMU machine for the STM32F4 Discovery board.</p>
<p>I am a <a href="https://www.vim.org/" target="_blank">vim</a> die-hard with no desire to use Eclipse, so I
was hoping to find a way to extract these tools and use
them outside of Eclipse.</p>
<p>It turns out the fork of QEMU that this toolsuite is using is available
separately as <a href="https://xpack.github.io/qemu-arm/" target="_blank">xPack QEMU
ARM</a>.</p>
<h3>xPack QEMU ARM has an STM32F4 Discovery machine</h3>
<p>xPack <a href="https://xpack.github.io/qemu-arm/install/" target="_blank">recommend</a> using
<a href="https://www.npmjs.com/package/xpm" target="_blank">xpm</a> to install their QEMU ARM fork, but I
found the manual installation instructions worked fine on Ubuntu. This gives
us, among other things, the <code>qemu-system-gnuarmeclipse</code> binary.</p>
<h2>Connecting to the emulator via gdb</h2>
<p>We can run our target executable on the emulator like this, with the <code>gdb</code>
switch telling QEMU to wait for a connection from gdb before continuing the
program's execution. Here, we tell it to listen on TCP port 3333.
<html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span>qemu<span class="op_lv0">-</span>system<span class="op_lv0">-</span>gnuarmeclipse <span class="Statement">\</span>
<span id="L2" class="LineNr">2 </span> <span class="op_lv0">-</span>cpu cortex<span class="op_lv0">-</span>m4 <span class="Statement">\</span>
<span id="L3" class="LineNr">3 </span> <span class="op_lv0">-</span>machine STM32F4<span class="op_lv0">-</span>Discovery <span class="Statement">\</span>
<span id="L4" class="LineNr">4 </span> <span class="op_lv0">-</span>gdb tcp<span class="op_lv0">::</span><span class="Number">3333</span> <span class="Statement">\</span>
<span id="L5" class="LineNr">5 </span> <span class="op_lv0">-</span>nographic <span class="Statement">\</span>
<span id="L6" class="LineNr">6 </span> <span class="op_lv0">-</span>kernel <span class="Statement">"</span><span class="PreProc">${</span><span class="PreProc">TARGET</span><span class="PreProc">}</span><span class="Statement">"</span>
</pre>
</div>
<p></html></p>
<p>Then we can connect from gdb:
<html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span>gdb<span class="op_lv0">-</span>multiarch <span class="Statement">\</span>
<span id="L2" class="LineNr">2 </span> <span class="op_lv0">-</span>q <span class="Statement">"</span><span class="PreProc">${</span><span class="PreProc">TARGET</span><span class="PreProc">}</span><span class="Statement">"</span> <span class="Statement">\</span>
<span id="L3" class="LineNr">3 </span> <span class="op_lv0">-</span>ex <span class="Statement">"</span><span class="String">target remote :3333</span><span class="Statement">"</span>
</pre>
</div>
<p></html></p>
<h2>Automated testing of ARM assembly on the STM32F4 emulator</h2>
<p>I put all this together into a bash script so I could test that assembly for
toggling a GPIO pin was doing what it should, by reading register values from
gdb at breakpoints set in the assembly file:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Comment">#!/usr/bin/env bash</span>
<span id="L2" class="LineNr"> 2 </span><span class="Statement">set</span><span class="Identifier"> </span><span class="Special">-euo</span><span class="Identifier"> pipefail</span>
<span id="L3" class="LineNr"> 3 </span><span class="Identifier">TARGET</span>=<span class="PreProc">$1</span>
<span id="L4" class="LineNr"> 4 </span>
<span id="L5" class="LineNr"> 5 </span>qemu<span class="op_lv0">-</span>system<span class="op_lv0">-</span>gnuarmeclipse <span class="Statement">\</span>
<span id="L6" class="LineNr"> 6 </span> <span class="op_lv0">-</span>cpu cortex<span class="op_lv0">-</span>m4 <span class="Statement">\</span>
<span id="L7" class="LineNr"> 7 </span> <span class="op_lv0">-</span>machine STM32F4<span class="op_lv0">-</span>Discovery <span class="Statement">\</span>
<span id="L8" class="LineNr"> 8 </span> <span class="op_lv0">-</span>gdb tcp<span class="op_lv0">::</span>3333 <span class="Statement">\</span>
<span id="L9" class="LineNr"> 9 </span> <span class="op_lv0">-</span>nographic <span class="Statement">\</span>
<span id="L10" class="LineNr">10 </span> <span class="op_lv0">-</span>kernel <span class="Statement">"</span><span class="PreProc">${</span><span class="PreProc">TARGET</span><span class="PreProc">}</span><span class="Statement">"</span> <span class="op_lv0">></span> <span class="op_lv0">/</span>dev<span class="op_lv0">/</span>null <span class="op_lv0">&</span>
<span id="L11" class="LineNr">11 </span><span class="Identifier">QEMU_PID</span>=<span class="PreProc">$!</span>
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span><span class="Statement">if </span><span class="Statement">[</span> <span class="Statement">!</span> <span class="Statement">-d</span> <span class="Statement">"</span><span class="String">/proc/</span><span class="PreProc">${</span><span class="PreProc">QEMU_PID</span><span class="PreProc">}</span><span class="Statement">"</span> <span class="Statement">]</span>
<span id="L14" class="LineNr">14 </span><span class="Statement">then</span>
<span id="L15" class="LineNr">15 </span> <span class="Statement">echo</span><span class="String"> -ne </span><span class="Statement">"</span><span class="Special">\033</span><span class="String">[31m Failed to start QEMU</span><span class="Statement">"</span>
<span id="L16" class="LineNr">16 </span> <span class="Statement">echo</span><span class="String"> -e </span><span class="Statement">"</span><span class="Special">\033</span><span class="String">[0m</span><span class="Statement">"</span>
<span id="L17" class="LineNr">17 </span> <span class="Statement">exit</span> <span class="Number">1</span>
<span id="L18" class="LineNr">18 </span><span class="Statement">fi</span>
<span id="L19" class="LineNr">19 </span>
<span id="L20" class="LineNr">20 </span><span class="Function">function</span> <span class="Function">read_address() {</span>
<span id="L21" class="LineNr">21 </span> <span class="Statement">local</span><span class="Identifier"> ADDRESS=</span><span class="PreProc">$1</span>
<span id="L22" class="LineNr">22 </span> <span class="Identifier">VALUE</span>=<span class="PreProc">$(</span><span class="Special">gdb-multiarch \</span>
<span id="L23" class="LineNr">23 </span><span class="Special"> </span><span class="Special">-q</span><span class="Special"> </span><span class="Statement">"</span><span class="PreProc">${</span><span class="PreProc">TARGET</span><span class="PreProc">}</span><span class="Statement">"</span><span class="Special"> \</span>
<span id="L24" class="LineNr">24 </span><span class="Special"> </span><span class="Special">-ex</span><span class="Special"> </span><span class="Statement">"</span><span class="String">target remote :3333</span><span class="Statement">"</span><span class="Special"> \</span>
<span id="L25" class="LineNr">25 </span><span class="Special"> </span><span class="Special">-ex</span><span class="Special"> </span><span class="Statement">"</span><span class="String">x/1xw </span><span class="PreProc">${</span><span class="PreProc">ADDRESS</span><span class="PreProc">}</span><span class="Statement">"</span><span class="Special"> \</span>
<span id="L26" class="LineNr">26 </span><span class="Special"> </span><span class="Special">--batch</span><span class="Special"> </span><span class="Statement">|</span><span class="Special"> </span><span class="Statement">tail</span><span class="Special"> </span><span class="Number">-1</span><span class="Special"> </span><span class="Statement">|</span><span class="Special"> cut </span><span class="Special">-f</span><span class="Special"> </span><span class="Number">2</span><span class="PreProc">)</span>
<span id="L27" class="LineNr">27 </span><span class="Function">}</span>
<span id="L28" class="LineNr">28 </span>
<span id="L29" class="LineNr">29 </span><span class="Function">function</span> <span class="Function">test_address() {</span>
<span id="L30" class="LineNr">30 </span> <span class="Statement">local</span><span class="Identifier"> ADDRESS=</span><span class="PreProc">$1</span>
<span id="L31" class="LineNr">31 </span> <span class="Statement">local</span><span class="Identifier"> REGISTER_NAME=</span><span class="PreProc">$2</span>
<span id="L32" class="LineNr">32 </span> <span class="Statement">local</span><span class="Identifier"> EX_VALUE=</span><span class="PreProc">$3</span>
<span id="L33" class="LineNr">33 </span>
<span id="L34" class="LineNr">34 </span> read_address <span class="Statement">"</span><span class="PreProc">${</span><span class="PreProc">ADDRESS</span><span class="PreProc">}</span><span class="Statement">"</span>
<span id="L35" class="LineNr">35 </span> <span class="Statement">if </span><span class="Statement">[</span> <span class="Statement">"</span><span class="PreProc">$VALUE</span><span class="Statement">"</span> <span class="Statement">=</span> <span class="String">"</span><span class="PreProc">${</span><span class="PreProc">EX_VALUE</span><span class="PreProc">}</span><span class="String">"</span> <span class="Statement">]</span>
<span id="L36" class="LineNr">36 </span> <span class="Statement">then</span>
<span id="L37" class="LineNr">37 </span> <span class="Statement">echo</span><span class="String"> -ne </span><span class="Statement">"</span><span class="Special">\033</span><span class="String">[32m</span><span class="Special">✓</span><span class="String"> </span><span class="PreProc">${</span><span class="PreProc">REGISTER_NAME</span><span class="PreProc">}</span><span class="String"> correctly set to </span><span class="PreProc">${</span><span class="PreProc">EX_VALUE</span><span class="PreProc">}</span><span class="Statement">"</span>
<span id="L38" class="LineNr">38 </span> <span class="Statement">else</span>
<span id="L39" class="LineNr">39 </span> <span class="Statement">echo</span><span class="String"> -ne </span><span class="Statement">"</span><span class="Special">\033</span><span class="String">[31m</span><span class="Special">✘</span><span class="String"> </span><span class="PreProc">${</span><span class="PreProc">REGISTER_NAME</span><span class="PreProc">}</span><span class="String"> was </span><span class="PreProc">${</span><span class="PreProc">VALUE</span><span class="PreProc">}</span><span class="String">, want </span><span class="PreProc">${</span><span class="PreProc">EX_VALUE</span><span class="PreProc">}</span><span class="Statement">"</span>
<span id="L40" class="LineNr">40 </span> <span class="Statement">fi</span>
<span id="L41" class="LineNr">41 </span>
<span id="L42" class="LineNr">42 </span> <span class="Statement">echo</span><span class="String"> -e </span><span class="Statement">"</span><span class="Special">\033</span><span class="String">[0m</span><span class="Statement">"</span>
<span id="L43" class="LineNr">43 </span><span class="Function">}</span>
<span id="L44" class="LineNr">44 </span>
<span id="L45" class="LineNr">45 </span>test_address <span class="Statement">"</span><span class="String">0x40023830</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">AHB1ENR</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">0x00000001</span><span class="Statement">"</span>
<span id="L46" class="LineNr">46 </span>test_address <span class="Statement">"</span><span class="String">0x40020000</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">GPIOA_MODER</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">0xa8010000</span><span class="Statement">"</span>
<span id="L47" class="LineNr">47 </span>test_address <span class="Statement">"</span><span class="String">0x40020014</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">GPIOA_ODR</span><span class="Statement">"</span> <span class="Statement">"</span><span class="String">0x00000100</span><span class="Statement">"</span>
<span id="L48" class="LineNr">48 </span>
<span id="L49" class="LineNr">49 </span><span class="Statement">kill</span> <span class="op_lv0">$</span><span class="lv12c">{</span>QEMU_PID<span class="lv12c">}</span> <span class="op_lv0">&></span> <span class="op_lv0">/</span>dev<span class="op_lv0">/</span>null
</pre>
</div>
<p></html></p>
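<p>The <code>tail | cut</code> step at the end of <code>read_address</code> can be tried
without QEMU. This is a minimal sketch, assuming gdb prints the memory dump as the
last line of its output, in the form address, colon, tab, value. The sample lines
below are made up for illustration rather than captured from a real session.</p>

```shell
# Hypothetical stand-in for gdb's batch output: a connection banner line,
# then the memory dump line (address, colon, tab, value).
fake_gdb_output() {
    printf '0x08000000 in ?? ()\n'
    printf '0x40023830:\t0x00000001\n'
}

# Same idea as the pipeline in read_address: keep the last line, then take
# the second tab-separated field, which is the register value.
extract_value() {
    tail -n 1 | cut -f 2
}

fake_gdb_output | extract_value
```

<p>If your gdb separates the fields differently, the <code>cut</code> delimiter would
need adjusting to match.</p>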
<p>The output looks like this for correct assembly, which you can view
<a href="https://github.com/lochsh/gpio-toggle-exercise/blob/master/sneak-peek/working-gpio-toggle.s">here</a>.</p>
<p><img src="/blog/images/qemu-output.png" width="400" alt="QEMU script output" class=callout></p>
<p>This testing is limited, but I'm pleased with it! I put everything in a
Dockerfile in the Github repo to make life easier for people trying this out.</p>
<h2>Bonus 1337 screenshot</h2>
<p>This is a screenshot from when I finally got all this working after a couple
evenings of trials:</p>
<p><img src="/blog/images/qemu-all-screenshot.png" class=callout></p>
<p>Clockwise from left is the assembly, the bash script, a gdb session where I'd
been poking around, and me running the bash script.</p>
<p>I hope this post is helpful to anyone wanting to emulate an STM32F4 without
having to do it all themselves, and that the tools in my
<a href="https://github.com/lochsh/gpio-toggle-exercise" target="_blank">GPIO toggling
exercise</a> are
useful to anyone who wants to play with ARM assembly but doesn't have the
hardware.</p>
<p>How to flash an LED (Hannah McLaughlin, 2020-03-27)</p>
<p>Today we are going to be learning how to flash an LED on a microcontroller by
writing ARM assembly.</p>
<p>If you write software but are unfamiliar with basic electronics or embedded
software development, there will be explanations of some fundamentals – I
expect you will not feel left behind :).</p>
<p>If you'd like to skip the fundamentals and go straight to the assembly writing,
click <a href="#writingcode">here</a>.</p>
<p>We'll be using a <a href="https://1bitsy.org/" target="_blank">1bitsy</a>, which is an
<a href="https://www.st.com/en/microcontrollers-microprocessors/stm32-32-bit-arm-cortex-mcus.html" target="_blank">STM32</a> development
board. We'll be reading the STM32 reference manual to figure out what assembly
to write. If you want to try out the coding yourself, but don't have a 1bitsy
or anything similar, check out <a href="https://github.com/lochsh/gpio-toggle-exercise" target="_blank">this Github repo</a>
where you can run the code on an emulator in a Docker container.</p>
<h1>Some basics</h1>
<p>Let's get some basics out of the way. How <em>do</em> you flash an LED?</p>
<h2>Hardware</h2>
<p><img src="/blog/images/led-flash/led.svg" width="100" alt="LED circuit symbol" class=callout>
As you may know, an LED is a "light-emitting diode". LEDs are
increasingly popular in torches, bulbs and various other lighting due to their
ability to be very bright with relatively low power. What we need to know is
that they emit light when current passes through them. So: we need some
current.</p>
<p><img src="/blog/images/led-flash/ledcircuit.svg" width="100" alt="LED circuit" class=callout>
This is just a simple electronics circuit with current flowing through the LED.
We have low voltage at the bottom, and higher voltage at the top: that
difference causes current to flow, lighting up our LED. But this is no good for
us – we want to control the flow of current so that we can flash our LED on
and off.</p>
<p>In order to control the current in our LED, we can connect it to a GPIO pin on
our microcontroller:
<img src="/blog/images/led-flash/ledcircuitgpio.svg" width="300" alt="LED connected to a GPIO" class=callout></p>
<p>GPIO just stands for "general purpose input/output". It's a pin on a
microchip that you can configure at runtime: for example, you can say "I
want this pin to be an output, and I want to turn it on", or "I want this pin
to be an input" and then read data from it.</p>
<p>A microcontroller is a small computer used for embedded software – we'll learn
more about the specific microcontroller we're using later. Embedded software is
software that isn't written for a general purpose computer, but instead targets
specific hardware used in some physical device: for example, the software that
runs on an MRI scanner to control its operation, or the software in modern cars
that controls things like the anti-lock braking system.</p>
<h2>Software</h2>
<p>If GPIO pins are configurable at runtime, then we need to write
some code that will tell our little computer how to configure the
GPIO. This is how that would usually look:</p>
<p><img src="/blog/images/led-flash/ledsoftwarecompile.svg" width="100%" alt="embedded software development process" class=callout></p>
<p>Usually, you'd expect that code to be written in C. You need a language that
allows you control over memory the way that C does: when you have small limited
memory, as is generally the case on small embedded computers,
it's important to be able to understand how much memory is being used by
your program. Languages that rely on dynamic allocation and garbage collection
are a bad fit, partly for this reason.</p>
<p><a href="https://mcla.ug/blog/rust-a-future-for-real-time-and-safety-critical-software.html" target="_blank">Rust</a> and C++ are also used for embedded work. The C++ you'd write for embedded
would be quite different from what you might write for a desktop application
– you likely wouldn't use any of the STL containers as they all rely on
dynamic memory allocation. Eliminating dynamic memory allocation is safer: the
risk of memory allocation failing is much higher when there isn't much memory
to begin with. And a failure could be much more catastrophic: many embedded
systems are designed to run autonomously, without any human there to restart
them. Many control safety-critical physical systems.</p>
<p>Dynamic allocation is generally not needed anyway: a desktop application might
have to dynamically allocate resources to accommodate a user opening an unknown
number of tabs in a GUI; an embedded application will know at compile time how
many motors it has to control, or how many sensors it will read from. This
allows a lot of memory to be allocated statically.</p>
<p>So, for the sake of example let's say that you would write your code to flash
an LED in C. You'd probably use a hardware abstraction library (aka a HAL) to
abstract over memory addresses and such. This makes the code more portable as
well as more readable.</p>
<p>But today, we're going to do stuff a little differently from how you might
normally: we'll be writing all our code in assembly.</p>
<h2>What is assembly?</h2>
<p>When you compile a C program, say, you compile it to machine code. Machine code
is the lowest level of software – it's the binary code that the CPU
executes. This machine code consists of <em>instructions</em>. For example, you might
have one instruction that says "copy value 42 into register 0", and that is our
smallest unit of executable code.</p>
<p>Assembly is the next level of software up – it's a lot like writing human
readable machine code, where you write out each instruction in text form. This
is very different to writing a C program which is much further abstracted,
which means compiling C to machine code is a lot more complicated. When we
write assembly today, that's exactly what our CPU is going to be executing:
there's a very close mapping to the actual machine code.</p>
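<p>To make "instructions are just binary" concrete, here is a small sketch that
builds one possible encoding of "copy value 42 into register 0". It assumes the
16-bit Thumb form of <code>movs rD, #imm</code> (opcode bits <code>00100</code>, a
3-bit register number, an 8-bit immediate), which is my reading of the ARM
architecture manual; the function name is made up, and an assembler may pick a
different, 32-bit encoding for a plain <code>mov</code>.</p>

```shell
# Encode "movs rD, #imm" in the 16-bit Thumb form (an assumption, see above):
#   bits 15..11 = 00100 (opcode), bits 10..8 = rD, bits 7..0 = imm
encode_movs_imm() {
    printf '0x%04x\n' $(( (4 << 11) | ($1 << 8) | $2 ))
}

encode_movs_imm 0 42   # prints 0x202a
```

<p>The point is only that each field of the text instruction lands in a fixed place
in the number the CPU fetches.</p>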
<h2>Why write our code in assembly?</h2>
<p>The usual reasons: for fun and learning! Writing code in assembly means really
getting to know your target hardware. Plus, we'll know exactly what code is
running on our processor.</p>
<p>Although this isn't something you might usually do, understanding assembly code
is a big part of many developers' jobs: reading the assembly is often the only
way to debug optimised code, and it's crucial to reverse engineering and
exploit development. It's also key to compiler development, and used for making
specific optimisations to embedded code. It's often the only way to access
specialised CPU features, and to run special instructions like DSP
instructions.</p>
<h1>Getting to know our hardware</h1>
<p>Doing embedded development means really getting to know your target hardware.
So, what hardware are we using?</p>
<p><img alt="hardware" class="callout" src="/blog/images/led-flash/hardware.png" title="list of hardware and documentation"></p>
<p>We have an ARM development board called a 1bitsy. It has an STM32F4 on it,
which is our microcontroller unit, or MCU. This microcontroller is basically
the CPU plus about a megabyte of flash and 200 kilobytes of RAM, and what
are called peripherals: some of these are for communicating via various
protocols, and some are for general-purpose use, like the GPIOs we talked
about earlier. The MCU has everything you need to make the CPU actually be
a useful computer. Our STM32 contains a Cortex M4 CPU. The picture
above is of the die of the STM32, which is basically what's inside the black
plastic on the outside of the chip. The CPU is on the top right of the die,
with RAM top left, flash bottom left, and peripherals bottom right.</p>
<p>I've included lists of the documentation associated with these. Today we're
exclusively going to be looking at the schematic for the board and the
reference manual for the STM32.</p>
<p>To program the 1bitsy, we will also need a programmer board like the
<a href="https://github.com/blacksphere/blackmagic/wiki" target="_blank">Black Magic probe</a>.</p>
<h1>A brief introduction to assembly</h1>
<h2>What does assembly look like?</h2>
<p>Before we get onto writing some code, what does ARM assembly look like?</p>
<p>Here is an example instruction: <code>mov r0, #5</code>. This means move the literal value
5 into register 0. But what's a register? A register is the last key concept
we're going to need to know before we write any assembly.</p>
<p><img src="/blog/images/led-flash/registers.svg" width="250" alt="registers of Cortex M4" class=callout></p>
<p>Our ARM processor has a small number of very fast, very small storage
locations, and they're called registers. These are directly accessed by the
CPU, and they aren't accessible via main memory. Some are used for general
purpose storage, others have specific purposes, like the program counter
register (PC). The CPU is hardwired to execute whatever instruction is at the
memory location stored in the PC. The stack pointer is used to keep track of
the call stack.</p>
<p>On a separate memory bus, our STM32 also has about a thousand configuration and
status registers – also often called memory-mapped IO. These are basically
pre-defined structs that live somewhere in memory, and you read & write to them
in order to configure the hardware. In our case, we'll be writing to these to
configure a GPIO, which will be connected to our LED.</p>
<h2>RISC vs CISC</h2>
<p>I think it's important context to note that the assembly we'll be writing today
is a little different than what you would likely write for your PC. Broadly,
you can divide computer architectures into complex instruction set computers
(CISC) and reduced instruction set computers (RISC). CISC is what Intel
chips use, and it is optimised to perform actions in as few instructions as
possible – as a consequence each instruction itself can be very complex.
RISC, on the other hand, prioritises having simple instructions, and you'll
be glad to know that's what we'll be writing today.</p>
<p>I couldn't resist including a screenshot from Hackers, my favourite movie,
which is from 1995, a much more hopeful time in software.</p>
<p><img alt="risc" class="callout" src="/blog/images/led-flash/risc.png" title="screengrab from Hackers the movie"></p>
<p>Here the hacker Acid Burn is saying that RISC architecture is going to change
everything – and in many ways she's right! I don't know of any mobile phone,
Apple or Android, that doesn't use an ARM core, and mobile phones are
everywhere. Sadly, most laptops and desktops use Intel CISC processors. This
makes no difference to my life at all, but I like to pretend it matters to me
so I can feel like I'm as cool as Acid Burn.</p>
<h1 id="writingcode">Let's write some code!</h1>
<p>At last...it is time to get down to business. First we need to briefly
look at the <a href="https://github.com/1Bitsy/1bitsy-hardware/blob/51de093b188c909c3b7af41ad1a1134d68a42f0e/1bitsy/v1.0d/1bitsy_schematic.pdf" target="_blank">schematic</a>
for the 1bitsy, our development board. The schematic tells us what is on the
board, and how it is connected. We're interested in how the status LED is
connected.</p>
<p>Because the 1bitsy is quite simple, there is only one page to the schematic.
If we look at the top of the schematic, centre-right, we can see that there's a
status LED connected to GPIO port A, pin 8, which we'll call PA8 for short.</p>
<p><img src="/blog/images/led-flash/ledschematic.png" width="250" alt="LED in 1bitsy schematic" class=callout></p>
<p>There are three things we're going to need to do:</p>
<ol>
<li>Turn on the GPIOA clock</li>
<li>Set GPIOA8 to an output</li>
<li>Toggle GPIOA8</li>
</ol>
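<p>Because the LED is on pin 8, steps 2 and 3 come down to setting particular bits.
Here is a sketch of that arithmetic, assuming one bit per pin in the output
register and two mode bits per pin in the mode register, with <code>01</code>
meaning "output". Those layouts are assumptions from my reading of the ref man,
and the function names are made up for illustration.</p>

```shell
PIN=8

# One bit per pin: pin 8 lives at bit 8.
pin_mask() {
    printf '0x%08x\n' $(( 1 << $1 ))
}

# Two bits per pin: pin 8's mode field starts at bit 16; 01 selects output.
output_mode_bits() {
    printf '0x%08x\n' $(( 1 << ($1 * 2) ))
}

pin_mask "$PIN"          # prints 0x00000100
output_mode_bits "$PIN"  # prints 0x00010000
```

<p>The first value matches the <code>GPIOA_ODR</code> value checked by the emulator
test script, which is a reassuring sanity check on the pin numbering.</p>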
<h2>Turning on the clock</h2>
<p>Before we can do anything with this GPIO pin, we need to set up its clock.
Inside our chip, and inside the CPUs in our work laptops, there's an oscillator
providing a clock signal that is used to synchronise different parts of the
complicated integrated circuit that is our computer.</p>
<p>If we are going to use our GPIO pin, it needs to have its clock enabled,
otherwise it is effectively off, and won't respond to any reads or writes.
It defaults to being off because the peripheral consumes power when it's on.</p>
<p>To find out how to set up the GPIOA clock, we need to look at the STM32F415
<a href="https://tinyurl.com/f4-ref-man" target="_blank">reference manual</a>, or ref man for short.
We want to look at the memory map, to see what the start address is for the
Reset and Clock Control (RCC) registers.</p>
<p><img src="/blog/images/led-flash/memory-map-rcc.png" width="100%" alt="STM32 memory map" class=callout></p>
<p>We're going to need a bit more information in order to set the clock, but this
memory address is something we'll need in our code, so let's make a note of it
(0x40023800).</p>
<p>Let's go to the RCC register map next – this is how we're going to find
exactly which RCC register we need to write to in order to turn on the GPIOA
clock.</p>
<p><img src="/blog/images/led-flash/rcc-register-map.png" width="100%" alt="RCC register map" class=callout></p>
<p>The first column in this table shows the address offset from the base address
we noted earlier. The numbers from 31 to 0 show the bits of the 32-bit
registers.</p>
<p>If we look closely, we can see the field GPIOAEN for enabling GPIOA's clock
– so, we want to set bit 0 in the AHB1ENR register, which sits at offset 0x30
from the RCC base address. I realise that might seem very obscure; I think
there are two things to note: firstly, there's
actually a lot of additional documentation about this elsewhere in the ref man,
showing the different memory buses and the clock tree. It would be too dense to
show in this blog post.</p>
<p>Secondly, when you create a software API, a huge priority is making something
that is useable and clear to developers (I should hope it is, anyway). When
designing hardware, there are physical constraints, and the design <em>has</em>
to be cheap and simple to mass manufacture. Consequently, clarity for us chumps
cannot be a priority, and instead of a method call with helpfully named
arguments, we have dense manuals like this...</p>
<p>Reading this sort of documentation does get easier the more you get to know
your architecture, and the more experience you have reading similar manuals
– as with anything :)</p>
<h3>Actually writing code for real</h3>
<p>Now: we're finally going to write some actual code. I am sorry I said "let's
write some code" further up. We couldn't do it until we had this information
from the ref man!</p>
<p>Let's copy that RCC base address into register 0 (r0). Our registers are all
32 bits wide, but a single instruction can only carry a 16-bit immediate value,
otherwise we'd have no room for the rest of our instruction. So, we copy 0x3800
into the bottom half of the register using the <code>movw</code> instruction (which also
clears the top half), and then copy 0x4002 into the top half, hence the <code>t</code> in
<code>movt</code> below.</p>
<p>Then, we want to set the 0th bit in the AHB1ENR register. First, let's copy
0x01 into r1. Then, let's store the contents of r1 in the memory address
contained in r0, offset by 0x30 using the <code>str</code> instruction.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span> <span class="op_lv0">@</span> <span class="Function">Store</span> <span class="Function">RCC</span> <span class="Function">base</span> <span class="Function">address</span> <span class="Function">in</span> <span class="Type">r0</span>
<span id="L2" class="LineNr">2 </span> <span class="Keyword">movw</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x3800</span>
<span id="L3" class="LineNr">3 </span> <span class="Keyword">movt</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x4002</span>
<span id="L4" class="LineNr">4 </span>
<span id="L5" class="LineNr">5 </span> <span class="op_lv0">@</span> <span class="Function">Turn</span> <span class="Function">on</span> <span class="Function">GPIOA</span> <span class="Function">clock</span> <span class="Function">by</span> <span class="Function">setting</span> <span class="Function">bit</span> <span class="Constant">0</span> <span class="Function">in</span> <span class="Function">AHB1ENR</span> <span class="Function">register</span>
<span id="L6" class="LineNr">6 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x01</span>
<span id="L7" class="LineNr">7 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x30</span><span class="lv12c">]</span>
</pre>
</div>
<p></html></p>
<p>With these runes, we can enable the clock!</p>
<p>All the <code>mov</code> instructions are about moving data into registers. The <code>str</code>
instruction moves data from a register into memory.</p>
<p>You can read more detail about these instructions in the <a href="http://infocenter.arm.com/help/topic/com.arm.doc.dui0553b/DUI0553.pdf" target="_blank">User Guide</a> for our CPU.</p>
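<p>As a sanity check on the values above, here is a minimal Python sketch (not
part of the original exercise; the helper name <code>split_halves</code> is made up for
illustration). It splits a 32-bit constant into the two 16-bit immediates that
<code>movw</code> and <code>movt</code> expect, and works out the address the <code>str</code> ends up
writing to.</p>

```python
RCC_BASE = 0x40023800        # from the memory map in the ref man
AHB1ENR_OFFSET = 0x30        # offset used by the str instruction above


def split_halves(value):
    """Return (low, high) 16-bit halves of a 32-bit value, for movw/movt."""
    low = value & 0xFFFF           # movw loads this and zeroes the top half
    high = (value >> 16) & 0xFFFF  # movt then overwrites only the top half
    return low, high


low, high = split_halves(RCC_BASE)
print(hex(low), hex(high))             # 0x3800 0x4002
print(hex(RCC_BASE + AHB1ENR_OFFSET))  # 0x40023830
```

<p>Note that the <code>str</code> writes all 32 bits of the register, so every other bit
in AHB1ENR ends up as zero. That is fine for this exercise, but worth
remembering if other peripherals need their clocks too.</p>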
<h2>Setting GPIOA8 to an output</h2>
<p>Next on our list is configuring GPIOA8 to be an output. As before, we can look
up the base address of GPIOA registers in the ref man. It's 0x40020000. Then,
we can have a read of the GPIO registers to find out which one we need to write
to.</p>
<p><img src="/blog/images/led-flash/gpioen.png" width="100%" alt="GPIO enable register" class=callout></p>
<p>It looks like we want GPIOA_MODER, and you can see above that the reset value
is 0xA8000000 for GPIOA. I understand this is because some of the GPIOA pins
are used for the debug interface of the STM32, otherwise the reset value would
be all zeroes. We want to change the two-bit field MODER8 (bits 17 and 16) to be
01, which selects general purpose output mode, so we want to set the register
value to 0xA8010000. There is no offset this time as the mode register is the
first GPIO register.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span> <span class="op_lv0">@</span> <span class="Function">Store</span> <span class="Function">start</span> <span class="Function">address</span> <span class="Function">of</span> <span class="Function">GPIOA</span> <span class="Function">registers</span>
<span id="L2" class="LineNr">2 </span> <span class="Keyword">movw</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L3" class="LineNr">3 </span> <span class="Keyword">movt</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x4002</span>
<span id="L4" class="LineNr">4 </span>
<span id="L5" class="LineNr">5 </span> <span class="op_lv0">@</span> <span class="Function">Use</span> <span class="Function">GPIOA_MODER</span> <span class="Function">to</span> <span class="Function">make</span> <span class="Function">GPIOA8</span> <span class="Function">an</span> <span class="Function">output</span>
<span id="L6" class="LineNr">6 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L7" class="LineNr">7 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0xA801</span>
<span id="L8" class="LineNr">8 </span> <span class="Keyword">str </span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="lv12c">]</span>
</pre>
</div>
<p></html></p>
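<p>The value 0xA8010000 can be derived rather than worked out by hand. The
following Python sketch is only an illustration (the function name
<code>set_pin_mode</code> is hypothetical); it assumes each pin owns a two-bit field in
the mode register, as the register description shows.</p>

```python
GPIOA_MODER_RESET = 0xA8000000  # reset value quoted from the ref man
OUTPUT_MODE = 0b01              # general purpose output


def set_pin_mode(moder, pin, mode):
    """Return moder with the two-bit field for `pin` replaced by `mode`."""
    shift = pin * 2                      # each pin owns two bits
    cleared = moder & ~(0b11 << shift)   # wipe the old field
    return (cleared | (mode << shift)) & 0xFFFFFFFF


value = set_pin_mode(GPIOA_MODER_RESET, 8, OUTPUT_MODE)
print(hex(value))                             # 0xa8010000
print(hex(value >> 16), hex(value & 0xFFFF))  # 0xa801 0x0
```

<p>The two halves printed on the last line are the immediates loaded by
<code>movt</code> and <code>movw</code> in the listing above.</p>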
<h2>Toggling the GPIO</h2>
<p><img src="/blog/images/led-flash/gpiooutdata.png" width="100%" alt="GPIO output data register" class=callout>
If we look at the GPIO documentation, it tells us that there is an output data
register, but changing a single pin through it means reading the register,
modifying one bit and writing the whole value back, which isn't atomic. That's
not a big problem for us here as we don't have any concurrency, but maybe we
will later on! We can use the bit-set-reset register for atomic access instead.
This also allows us to set individual bits in the output data register, instead
of overwriting any values on other GPIO pins.</p>
<p><img src="/blog/images/led-flash/bsrr.png" width="100%" alt="GPIO bit set/reset register" class=callout></p>
<p>The direction our LED has been wired up means it's active low, so it will turn
on when the GPIO output is cleared, and off when it is set.</p>
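<p>So, to turn on our LED we want to set the BR8 field (bit 24), and to turn it
off, we want to set the BS8 field (bit 8). The bit-set-reset register sits at
offset 0x18 from the GPIOA base address, which is still in r0.</p>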
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BR8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">clear</span> <span class="Function">GPIOA8</span>
<span id="L2" class="LineNr">2 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L3" class="LineNr">3 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L4" class="LineNr">4 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L5" class="LineNr">5 </span>
<span id="L6" class="LineNr">6 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BS8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">set</span> <span class="Function">GPIOA8</span>
<span id="L7" class="LineNr">7 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L8" class="LineNr">8 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
</pre>
</div>
<p></html></p>
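<p>To see why a single write to the bit-set-reset register is enough, here is a
small Python model of its behaviour. It is a sketch based on my reading of the
register description, not a description of the silicon, and the function names
are invented for this example.</p>

```python
def bsrr_set(pin):
    """BSRR value that drives `pin` high (BSx lives in bits 0-15)."""
    return 1 << pin


def bsrr_reset(pin):
    """BSRR value that drives `pin` low (BRx lives in bits 16-31)."""
    return 1 << (pin + 16)


def write_bsrr(odr, value):
    """Model the effect of one BSRR write on the output data register."""
    set_bits = value & 0xFFFF
    reset_bits = (value >> 16) & 0xFFFF
    # Bits not mentioned in the write are untouched; set wins over reset.
    return ((odr & ~reset_bits) | set_bits) & 0xFFFF


odr = 0x0001                          # pretend pin 0 is already high
odr = write_bsrr(odr, bsrr_reset(8))  # LED on (active low)
print(hex(bsrr_reset(8)), hex(odr))   # 0x1000000 0x1
odr = write_bsrr(odr, bsrr_set(8))    # LED off
print(hex(bsrr_set(8)), hex(odr))     # 0x100 0x101
```

<p>Pin 0 keeps its value throughout, which is the point: no read of the output
data register is needed, so there is nothing for an interrupt to get in the
middle of.</p>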
<h3>Looping</h3>
<p>The last code snippet will just turn the LED on and off once. To create an
infinite loop instead, we simply create a label (let's call it <code>.loop</code>) and
then use the branch instruction to go back to that label!</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="op_lv0">.</span>loop<span class="op_lv0">:</span>
<span id="L2" class="LineNr"> 2 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BR8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">clear</span> <span class="Function">GPIOA8</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L5" class="LineNr"> 5 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BS8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">set</span> <span class="Function">GPIOA8</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L10" class="LineNr">10 </span>
<span id="L11" class="LineNr">11 </span> <span class="Keyword">b</span> <span class="op_lv0">.</span>loop
</pre>
</div>
<p></html></p>
<h3>Adding a delay</h3>
<p>Now for something that is hopefully a lot more interesting than just shoving
values into memory addresses. We want to do this in a loop, with a delay
between turning the LED off and on!</p>
<p>There are a few ways you could do this delay. If precise timing were important,
the timer peripherals of the STM32 could be used. We could also just repeat the
<code>nop</code> (no operation) instruction over and over again, but that doesn't feel
very sophisticated, and would give us a really large binary!</p>
<p>We're going to do this by putting a big number in a register and decrementing
it until it hits zero. So, we're creating another loop, but this time with an
exit condition.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="op_lv0">.</span>loop<span class="op_lv0">:</span>
<span id="L2" class="LineNr"> 2 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BR8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">clear</span> <span class="Function">GPIOA8</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L5" class="LineNr"> 5 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span> <span class="op_lv0">@</span> <span class="Function">Delay</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Keyword">movw</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x3500</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Keyword">movt</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x000c</span>
<span id="L10" class="LineNr">10 </span><span class="op_lv0">.</span>L<span class="Constant">1</span><span class="op_lv0">:</span>
<span id="L11" class="LineNr">11 </span> <span class="Keyword">subs</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x0001</span>
<span id="L12" class="LineNr">12 </span> <span class="Keyword">bne</span> <span class="op_lv0">.</span>L<span class="Constant">1</span>
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BS8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">set</span> <span class="Function">GPIOA8</span>
<span id="L15" class="LineNr">15 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L16" class="LineNr">16 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span> <span class="op_lv0">@</span> <span class="Function">Delay</span>
<span id="L19" class="LineNr">19 </span> <span class="Keyword">movw</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x3500</span>
<span id="L20" class="LineNr">20 </span> <span class="Keyword">movt</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x000c</span>
<span id="L21" class="LineNr">21 </span><span class="op_lv0">.</span>L<span class="Constant">2</span><span class="op_lv0">:</span>
<span id="L22" class="LineNr">22 </span> <span class="Keyword">subs</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x0001</span>
<span id="L23" class="LineNr">23 </span> <span class="Keyword">bne</span> <span class="op_lv0">.</span>L<span class="Constant">2</span>
<span id="L24" class="LineNr">24 </span>
<span id="L25" class="LineNr">25 </span> <span class="Keyword">b</span> <span class="op_lv0">.</span>loop
</pre>
</div>
<p></html></p>
<p>The <code>subs</code> instruction here is subtracting, and the <code>s</code> suffix means that the
condition flags in the Program Status Register are updated with the result; in
particular, the zero flag is set if the result of the operation is zero. The
<code>bne</code> instruction means "branch if not equal", which is taken whenever the zero
flag is clear, so we'll jump back to the start of our delay loop until the
counter reaches zero.</p>
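<p>If it helps to see the countdown in a higher-level language, this Python
sketch models the <code>subs</code>/<code>bne</code> pair. It is only an illustration: the functions
<code>subs</code> and <code>delay</code> are stand-ins I have named after the instructions, and the
model ignores how long each iteration takes on real hardware.</p>

```python
def subs(value, operand):
    """Model of `subs`: 32-bit subtract that also reports the Z flag."""
    result = (value - operand) & 0xFFFFFFFF
    zero_flag = result == 0
    return result, zero_flag


def delay(count):
    """Count how many times the subs/bne pair runs before falling through."""
    iterations = 0
    r2 = count
    while True:
        r2, zero_flag = subs(r2, 1)
        iterations += 1
        if zero_flag:        # bne is not taken once Z is set
            break
    return iterations


start = (0x000C << 16) | 0x3500   # the movt/movw pair from the listing
print(start, delay(start))        # 800000 800000
```

<p>One consequence of checking the flag after subtracting: loading zero into
the counter would not skip the delay. The subtraction would wrap around to
0xFFFFFFFF and the loop would run for a very long time.</p>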
<h2>Putting the pieces together</h2>
<p>We now have everything we need to flash our LED – almost.</p>
<p>There's some boilerplate that needs added to our assembly file. We need to give
a name to our entry point; let's call it <code>main</code>.</p>
<p>There are two instruction encodings for ARM: ARM and Thumb. The encoding
defines how the assembly is translated to machine code. It used to be that you
needed different syntax for each of these, until ARM brought out their unified
assembly language. Line 1 below is telling the assembler (the tool that turns
the assembly into machine code) which syntax we are using.</p>
<p>Then, line 3 is telling the assembler that we are using the Thumb encoding for
<code>main</code>, which is the only encoding our target (the STM32F4) supports. Then line
4 is exposing the symbol <code>main</code> to the linker.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="PreProc">.syntax</span> <span class="Function">unified</span>
<span id="L2" class="LineNr">2 </span>
<span id="L3" class="LineNr">3 </span><span class="PreProc">.thumb_func</span>
<span id="L4" class="LineNr">4 </span><span class="PreProc">.global</span> <span class="Function">main</span>
<span id="L5" class="LineNr">5 </span><span class="Function">main:</span>
</pre>
</div>
<p></html></p>
<p>Lastly, we need to make sure our program is what runs when our microcontroller
powers on. The reset vector is the location the CPU will go to find
the first instruction it will execute after being reset.</p>
<p>What we’re doing below is putting the address of <code>main</code> into the reset vector
so that when our board turns on, it will go to that address and start running
our code to flash the LED.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="PreProc">.section</span> <span class="op_lv0">.</span>vector_table<span class="op_lv0">.</span>reset_vector
<span id="L2" class="LineNr">2 </span><span class="PreProc">.word</span> <span class="Function">main</span>
</pre>
</div>
<p></html></p>
<p>We now have our final asm file:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">.syntax</span> <span class="Function">unified</span>
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span><span class="PreProc">.thumb_func</span>
<span id="L4" class="LineNr"> 4 </span><span class="PreProc">.global</span> <span class="Function">main</span>
<span id="L5" class="LineNr"> 5 </span><span class="Function">main:</span>
<span id="L6" class="LineNr"> 6 </span> <span class="op_lv0">@</span> <span class="Function">Store</span> <span class="Function">RCC</span> <span class="Function">base</span> <span class="Function">address</span> <span class="Function">in</span> <span class="Type">r0</span>
<span id="L7" class="LineNr"> 7 </span> <span class="Keyword">movw</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x3800</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Keyword">movt</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x4002</span>
<span id="L9" class="LineNr"> 9 </span>
<span id="L10" class="LineNr">10 </span> <span class="op_lv0">@</span> <span class="Function">Turn</span> <span class="Function">on</span> <span class="Function">GPIOA</span> <span class="Function">clock</span> <span class="Function">by</span> <span class="Function">setting</span> <span class="Function">bit</span> <span class="Constant">0</span> <span class="Function">in</span> <span class="Function">AHB1ENR</span> <span class="Function">register</span>
<span id="L11" class="LineNr">11 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x01</span>
<span id="L12" class="LineNr">12 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x30</span><span class="lv12c">]</span>
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span> <span class="op_lv0">@</span> <span class="Function">Store</span> <span class="Function">start</span> <span class="Function">address</span> <span class="Function">of</span> <span class="Function">GPIOA</span> <span class="Function">registers</span>
<span id="L15" class="LineNr">15 </span> <span class="Keyword">movw</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L16" class="LineNr">16 </span> <span class="Keyword">movt</span> <span class="Type">r0</span><span class="op_lv0">,</span> <span class="Constant">#0x4002</span>
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span> <span class="op_lv0">@</span> <span class="Function">Use</span> <span class="Function">GPIOA_MODER</span> <span class="Function">to</span> <span class="Function">make</span> <span class="Function">GPIOA8</span> <span class="Function">an</span> <span class="Function">output</span>
<span id="L19" class="LineNr">19 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L20" class="LineNr">20 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0xA801</span>
<span id="L21" class="LineNr">21 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="lv12c">]</span>
<span id="L22" class="LineNr">22 </span>
<span id="L23" class="LineNr">23 </span><span class="op_lv0">.</span>loop<span class="op_lv0">:</span>
<span id="L24" class="LineNr">24 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BR8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">clear</span> <span class="Function">GPIOA8</span>
<span id="L25" class="LineNr">25 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0000</span>
<span id="L26" class="LineNr">26 </span> <span class="Keyword">movt</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L27" class="LineNr">27 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L28" class="LineNr">28 </span>
<span id="L29" class="LineNr">29 </span> <span class="op_lv0">@</span> <span class="Function">Delay</span>
<span id="L30" class="LineNr">30 </span> <span class="Keyword">movw</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x3500</span>
<span id="L31" class="LineNr">31 </span> <span class="Keyword">movt</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x000c</span>
<span id="L32" class="LineNr">32 </span><span class="op_lv0">.</span>L<span class="Constant">1</span><span class="op_lv0">:</span>
<span id="L33" class="LineNr">33 </span> <span class="Keyword">subs</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x0001</span>
<span id="L34" class="LineNr">34 </span> <span class="Keyword">bne</span> <span class="op_lv0">.</span>L<span class="Constant">1</span>
<span id="L35" class="LineNr">35 </span>
<span id="L36" class="LineNr">36 </span> <span class="op_lv0">@</span> <span class="Function">Set</span> <span class="Function">BS8</span> <span class="Function">field</span> <span class="Function">in</span> <span class="Function">GPIOA_BSRR</span><span class="op_lv0">,</span> <span class="Function">to</span> <span class="Function">set</span> <span class="Function">GPIOA8</span>
<span id="L37" class="LineNr">37 </span> <span class="Keyword">movw</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="Constant">#0x0100</span>
<span id="L38" class="LineNr">38 </span> <span class="Keyword">str</span> <span class="Type">r1</span><span class="op_lv0">,</span> <span class="lv12c">[</span><span class="Type">r0</span><span class="op_lv12">,</span> <span class="Constant">#0x18</span><span class="lv12c">]</span>
<span id="L39" class="LineNr">39 </span>
<span id="L40" class="LineNr">40 </span> <span class="op_lv0">@</span> <span class="Function">Delay</span>
<span id="L41" class="LineNr">41 </span> <span class="Keyword">movw</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x3500</span>
<span id="L42" class="LineNr">42 </span> <span class="Keyword">movt</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x000c</span>
<span id="L43" class="LineNr">43 </span><span class="op_lv0">.</span>L<span class="Constant">2</span><span class="op_lv0">:</span>
<span id="L44" class="LineNr">44 </span> <span class="Keyword">subs</span> <span class="Type">r2</span><span class="op_lv0">,</span> <span class="Constant">#0x0001</span>
<span id="L45" class="LineNr">45 </span> <span class="Keyword">bne</span> <span class="op_lv0">.</span>L<span class="Constant">2</span>
<span id="L46" class="LineNr">46 </span>
<span id="L47" class="LineNr">47 </span> <span class="Keyword">b</span> <span class="op_lv0">.</span>loop
<span id="L48" class="LineNr">48 </span>
<span id="L49" class="LineNr">49 </span><span class="PreProc">.section</span> <span class="op_lv0">.</span>vector_table<span class="op_lv0">.</span>reset_vector
<span id="L50" class="LineNr">50 </span><span class="PreProc">.word</span> <span class="Function">main</span>
</pre>
</div>
<p></html></p>
<h1>Building our code and flashing our target</h1>
<p>We use an <em>assembler</em> to turn our assembly into an object file, e.g.</p>
<p><code>arm-none-eabi-as -mcpu=cortex-m4 toggle.s -c -o output/toggle.o</code></p>
<p>Then we use a linker to make an executable. We need a custom linker script to
tell the linker where RAM and flash start on our target. Here's what I used:
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Keyword">ENTRY</span><span class="lv12c">(</span>main<span class="lv12c">)</span>
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span><span class="PreProc">MEMORY</span> <span class="lv12c">{</span>
<span id="L4" class="LineNr"> 4 </span> FLASH <span class="op_lv12">:</span> <span class="Identifier">ORIGIN</span> <span class="op_lv12">=</span> <span class="Number">0x08000000</span><span class="op_lv12">,</span> <span class="Identifier">LENGTH</span> <span class="op_lv12">=</span> <span class="Number">128</span><span class="PreProc">K</span>
<span id="L5" class="LineNr"> 5 </span> RAM <span class="op_lv12">:</span> <span class="Identifier">ORIGIN</span> <span class="op_lv12">=</span> <span class="Number">0x20000000</span><span class="op_lv12">,</span> <span class="Identifier">LENGTH</span> <span class="op_lv12">=</span> <span class="Number">128</span><span class="PreProc">K</span>
<span id="L6" class="LineNr"> 6 </span><span class="lv12c">}</span>
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="PreProc">SECTIONS</span> <span class="lv12c">{</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Comment">/* Vector table is first thing in flash */</span>
<span id="L10" class="LineNr">10 </span> <span class="op_lv12">.</span>vector_table <span class="Identifier">ORIGIN</span><span class="lv11c">(</span>FLASH<span class="lv11c">)</span> <span class="op_lv12">:</span>
<span id="L11" class="LineNr">11 </span> <span class="lv11c">{</span>
<span id="L12" class="LineNr">12 </span> <span class="Comment">/* Initial stack pointer */</span>
<span id="L13" class="LineNr">13 </span> <span class="Type">LONG</span><span class="lv10c">(</span><span class="Identifier">ORIGIN</span><span class="lv9c">(</span>RAM<span class="lv9c">)</span> <span class="op_lv10">+</span> <span class="Identifier">LENGTH</span><span class="lv9c">(</span>RAM<span class="lv9c">)</span><span class="lv10c">)</span><span class="op_lv11">;</span>
<span id="L14" class="LineNr">14 </span>
<span id="L15" class="LineNr">15 </span> <span class="Comment">/* Rest of vector table */</span>
<span id="L16" class="LineNr">16 </span> <span class="Keyword">KEEP</span><span class="lv10c">(</span><span class="op_lv10">*</span><span class="lv9c">(</span><span class="op_lv9">.</span>vector_table<span class="lv9c">)</span><span class="lv10c">)</span><span class="op_lv11">;</span>
<span id="L17" class="LineNr">17 </span> <span class="lv11c">}</span> <span class="op_lv12">></span> FLASH
<span id="L18" class="LineNr">18 </span>
<span id="L19" class="LineNr">19 </span> <span class="Comment">/* text section contains executable code */</span>
<span id="L20" class="LineNr">20 </span> <span class="op_lv12">.</span>text <span class="Identifier">ADDR</span><span class="lv11c">(</span><span class="op_lv11">.</span>vector_table<span class="lv11c">)</span> <span class="op_lv12">+</span> <span class="Identifier">SIZEOF</span><span class="lv11c">(</span><span class="op_lv11">.</span>vector_table<span class="lv11c">)</span> <span class="op_lv12">:</span>
<span id="L21" class="LineNr">21 </span> <span class="lv11c">{</span>
<span id="L22" class="LineNr">22 </span> <span class="op_lv11">\*</span><span class="lv10c">(</span><span class="op_lv10">.</span>text <span class="op_lv10">.</span>text<span class="op_lv10">.\*</span><span class="lv10c">)</span><span class="op_lv11">;</span>
<span id="L23" class="LineNr">23 </span> <span class="lv11c">}</span> <span class="op_lv12">></span> FLASH
<span id="L24" class="LineNr">24 </span><span class="lv12c">}</span>
</pre>
</div>
<p></html></p>
<p>Then we can call the linker:
<code>arm-none-eabi-ld -T link.ld output/toggle.o -o output/toggle</code></p>
<p>I'm using a <a href="https://github.com/blacksphere/blackmagic/wiki" target="_blank">Black Magic probe</a>
to flash my 1bitsy. I can talk to the probe over gdb:</p>
<div class="highlight"><pre><span></span><code>gdb-multiarch -n --batch <span class="se">\</span>
-ex <span class="s1">'tar ext /dev/serial/by-id/example'</span> <span class="se">\</span>
-ex <span class="s1">'mon tpwr en'</span> <span class="se">\</span>
-ex <span class="s1">'mon swdp_scan'</span> <span class="se">\</span>
-ex <span class="s1">'att 1'</span> <span class="se">\</span>
-ex <span class="s1">'load'</span> <span class="se">\</span>
-ex <span class="s1">'start'</span> <span class="se">\</span>
-ex <span class="s1">'detach'</span> <span class="se">\</span>
output/toggle
</code></pre></div>
<p>Voila:
<img src="/blog/images/led-flash/blinky.gif" width="300" alt="A gif of the LED flashing" class=callout></p>
<h1>Resources</h1>
<ul>
<li>
<p>You can check out my
<a href="https://github.com/lochsh/gpio-toggle-exercise" target="_blank">GPIO toggling exercise</a>
which talks you through the assembly here in a little more detail, and provides
a Dockerfile for emulating the target chip. I'll have a (shorter) blog post
about the emulation soon.</p>
</li>
<li>
<p>Azeria Labs have an <a href="https://azeria-labs.com/writing-arm-assembly-part-1/" target="_blank">excellent guide</a> to ARM assembly that goes into more
detail than I have, though it assumes a bit more knowledge about computer
architecture.</p>
</li>
<li>
<p>The <a href="http://infocenter.arm.com/help/topic/com.arm.doc.dui0553b/DUI0553.pdf" target="_blank">Cortex M4 User Guide</a> is a good technical reference for the assembly written here.</p>
</li>
</ul>Let's break CPython together, for fun and mischief2019-10-13T00:00:00+01:002019-10-13T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2019-10-13:/blog/cpython-hackage.html<p>I promise that nothing we do here will be useful.</p>
<p>But I promise it will be entertaining, and (hopefully) educational:</p>
<ul>
<li>
<p>if you don't know anything about the CPython internals, you're about to
learn (a bit)</p>
</li>
<li>
<p>if you do know about the CPython internals, you're hopefully about to
learn some new ways to abuse them 😉</p>
</li>
</ul>
<h2>Clarification</h2>
<p>Before we proceed to hackage, let me make sure it's clear what I'm talking
about when I say "CPython internals". CPython is the reference implementation
of Python, and it's what most people use. It's what comes as standard on any
system I've ever used.</p>
<p>A Python implementation includes the interpreter, the built-in types and the
standard library. With CPython, apart from much of the standard library which
is in Python, this is all written in C. There are other implementations:</p>
<ul>
<li>PyPy is written in RPython (a restricted subset of Python) and has a JIT compiler, which often makes it really fast</li>
<li>Jython runs on the JVM</li>
<li>IronPython runs on .NET, the Microsoft framework</li>
</ul>
<p>Everything we do here is exploiting the specific implementation details
of CPython.</p>
<h3>YMMV</h3>
<p>Please bear in mind that Python was not designed to do the things we're going
to do, and some of the fun things that worked with the version of Python I used
here, my operating system, etc., might end up segfaulting for you. Running stuff
in IPython rather than the standard REPL will also likely end up with more
issues occurring when things are hacked.</p>
<h2>To whet your appetite</h2>
<p>Let's have a look at the Python language reference. The first two sentences of
the <a href="https://docs.python.org/3/reference/datamodel.html">data model</a> say this:</p>
<blockquote>
<p>Objects are Python’s abstraction for data. All data in a Python program is
represented by objects or by relations between objects.</p>
</blockquote>
<p>In CPython a Python object is defined in the
<a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Include/object.h#L104"><code>PyObject</code></a>
struct:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Type">typedef</span> <span
class="Type">struct</span> _object {
<span id="L2" class="LineNr">2 </span> _PyObject_HEAD_EXTRA
<span id="L3" class="LineNr">3 </span> Py_ssize_t ob_refcnt;
<span id="L4" class="LineNr">4 </span> <span class="Type">struct</span> _typeobject *ob_type;
<span id="L5" class="LineNr">5 </span>} PyObject;
</pre>
</div>
<p></html></p>
<p>(The first bit here, <code>_PyObject_HEAD_EXTRA</code>, is only present when Python is
compiled with a special reference-tracing debug feature, so don't worry about it.)</p>
<p>We have the reference count <code>ob_refcnt</code>, which is used for memory management
and tells us how many references to this object currently exist. When the
reference count of an object drops to zero, CPython can free its memory and
resources straight away; the separate cyclic garbage collector only exists to
clean up reference cycles.</p>
<p>We also have the type information, <code>ob_type</code>, which tells us how to interact
with the object, what its behaviour is, what data it contains.</p>
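<p>As a rough sketch of what those two fields look like from the Python side, we
can read them straight out of an object's memory with ctypes. This leans on the
CPython detail that <code>id(x)</code> is the address of <code>x</code>, and it assumes a
standard (non-debug, non-free-threaded) build where the header is exactly the
two fields above. The helper name <code>header_fields</code> is made up for
illustration.</p>

```python
import ctypes


def header_fields(obj):
    """Read ob_refcnt and ob_type directly from obj's memory.

    Assumption: a standard CPython build, where the object header is
    a Py_ssize_t reference count followed by a pointer to the type.
    """
    address = id(obj)  # CPython detail: id() is the object's address
    refcnt = ctypes.c_ssize_t.from_address(address).value
    type_ptr = ctypes.c_void_p.from_address(
        address + ctypes.sizeof(ctypes.c_ssize_t)).value
    return refcnt, type_ptr


thing = [1, 2, 3]
refcnt, type_ptr = header_fields(thing)
print(refcnt)                       # the raw reference count field
print(type_ptr == id(type(thing)))  # does ob_type point at the list type?
```

<p>Nothing is modified here, so this is about as safe as poking at object
memory gets.</p>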
<p>Going back to the data model:</p>
<blockquote>
<p>Every object has an identity, a type and a value. An object’s identity never
changes once it has been created; you may think of it as the object’s address
in memory. The ‘is’ operator compares the identity of two objects; the id()
function returns an integer representing its identity.</p>
<p>CPython implementation detail: For CPython, id(x) is the memory address where
x is stored.</p>
</blockquote>
<p>So what I'd expect CPython to do is dynamically allocate memory for a new
<code>PyObject</code> each time we create a new object.</p>
<p>Let's test this out with some integers:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> x = <span class="Number">500</span>
<span id="L2" class="LineNr">2 </span>>>> y = <span class="Number">500</span>
<span id="L3" class="LineNr">3 </span>>>> x <span class="Statement">is</span> y
<span id="L4" class="LineNr">4 </span><span class="Function">False</span>
</pre>
</div>
<p></html></p>
<p>That makes sense: a new <code>PyObject</code> has been allocated for each variable we've
made here, and so they are at different places in memory. But what if we use
smaller integers?</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> x = <span class="Number">5</span>
<span id="L2" class="LineNr">2 </span>>>> y = <span class="Number">5</span>
<span id="L3" class="LineNr">3 </span>>>> x <span class="Statement">is</span> y
<span id="L4" class="LineNr">4 </span><span class="Function">True</span>
</pre>
</div>
<p></html></p>
<p>How surprising! Let's have a look in the <a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Objects/longobject.c#L38">CPython
source</a>
to see why this might be:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">#ifndef NSMALLPOSINTS</span>
<span id="L2" class="LineNr"> 2 </span><span class="PreProc">#define NSMALLPOSINTS </span><span class="Number">257</span>
<span id="L3" class="LineNr"> 3 </span><span class="PreProc">#endif</span>
<span id="L4" class="LineNr"> 4 </span><span class="PreProc">#ifndef NSMALLNEGINTS</span>
<span id="L5" class="LineNr"> 5 </span><span class="PreProc">#define NSMALLNEGINTS </span><span class="Number">5</span>
<span id="L6" class="LineNr"> 6 </span><span class="PreProc">#endif</span>
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Comment">/*</span><span class="Comment"> Small integers are preallocated in this array so that they</span>
<span id="L9" class="LineNr"> 9 </span><span class="Comment"> can be shared.</span>
<span id="L10" class="LineNr">10 </span><span class="Comment"> The integers that are preallocated are those in the range</span>
<span id="L11" class="LineNr">11 </span><span class="Comment"> -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).</span>
<span id="L12" class="LineNr">12 </span><span class="Comment">*/</span>
<span id="L13" class="LineNr">13 </span><span class="Type">static</span> PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
</pre>
</div>
<p></html></p>
<p>So it seems integers between -5 and 256 inclusive are statically allocated in a
big old array! This is an optimisation that CPython has chosen to do -- the
idea is that these integers are going to be used a lot, and it would be time
consuming to allocate new memory every time.</p>
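<p>A small sketch of how you might probe this from Python, assuming CPython.
The helper <code>shares_object</code> is a made-up name; it builds both integers at
runtime so the compiler has no chance to merge two identical literals into one
constant.</p>

```python
def shares_object(n):
    """Return True if two separately built ints equal to n are one object."""
    # Parse from a string at runtime so neither value is a compile-time
    # constant that the compiler could share.
    a = int(str(n))
    b = int(str(n))
    return a is b


print(shares_object(5))        # small int: served from the preallocated array
print(shares_object(10 ** 6))  # large int: a fresh object each time
```

<p>The exact upper bound of the cached range is an implementation detail and
could differ between CPython versions, which is why the example stays well
away from the boundary.</p>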
<p>But...if that means some integers have a defined place in memory, can
we...corrupt that memory?</p>
<h2>import ctypes</h2>
<p><img alt="ctypes" class="callout" src="/blog/images/goosebumpsctypes.jpg" title="Someone whispering 'import ctypes' and goosebumps appearing on the arm of the listener"></p>
<p>Most good CPython shenanigans begin with importing ctypes, which is Python's
standard C foreign function interface. An FFI allows different languages to
interoperate. ctypes provides C compatible data types and allows calling
functions from shared libraries and such.</p>
<p>The ctypes docs tell us about the function
<a href="https://docs.python.org/3.7/library/ctypes.html#ctypes.memmove"><code>memmove</code></a>:</p>
<blockquote>
<p>ctypes.memmove(dst, src, count)</p>
<p>Same as the standard C memmove library function: copies count bytes from
src to dst. dst and src must be integers or ctypes instances that can be
converted to pointers.</p>
</blockquote>
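<p>Before pointing it at anything precious, here is a harmless sketch of
<code>memmove</code> copying between two ordinary ctypes buffers. The buffer
names are just for illustration.</p>

```python
import ctypes

src = ctypes.create_string_buffer(b"hello")  # 6 bytes: "hello" plus a NUL
dst = ctypes.create_string_buffer(len(src))  # 6 zeroed bytes

# Copy every byte of src into dst, exactly like C's memmove(dst, src, n).
ctypes.memmove(dst, src, len(src))
print(dst.value)
```

<p>The only difference in what follows is that the addresses come from
<code>id()</code> rather than from buffers we own.</p>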
<p>So what if we copied the memory where 6 is to where 5 is?</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> <span class="PreProc">import</span> ctypes
<span id="L2" class="LineNr">2 </span>>>> <span class="PreProc">import</span> sys
<span id="L3" class="LineNr">3 </span>>>> ctypes.memmove(<span class="Function">id</span>(<span class="Number">5</span>), <span class="Function">id</span>(<span class="Number">6</span>), sys.getsizeof(<span class="Number">5</span>))
<span id="L4" class="LineNr">4 </span>>>> <span class="Number">5</span> + <span class="Number">5</span>
<span id="L5" class="LineNr">5 </span><span class="Number">12</span>
</pre>
</div>
<p></html></p>
<p>What fun! But this is small fry stuff. We can do more. We have ambition.</p>
<h2>Ambition</h2>
<p>I don't want to change one integer. I want to change ALL the integers.</p>
<p>What if we changed what happens when you add integers together? What if we made
it subtract instead?</p>
<p><img alt="mischief" class="callout" src="/blog/images/thearm.gif" title="The Arm from Twin Peaks rubbing his hands together mischievously"></p>
<p>The way operator resolution works in Python is that the corresponding "magic
method" or "dunder method" (for double underscores) is called. For example <code>x +
y</code> will become <code>x.__add__(y)</code>. So the <code>int.__add__</code> method is going to be our
target for mischievous hackage.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Statement">def</span> <span class="Function">fake_add</span>(x, y):
<span id="L2" class="LineNr">2 </span> <span class="Statement">return</span> x - y
<span id="L3" class="LineNr">3 </span>>>> <span class="Function">int</span>.__add__ = fake_add
<span id="L4" class="LineNr">4 </span><span class="Type">TypeError</span>: can't set attributes of built-in/extension type <span class="String">'</span><span class="Function">int</span><span class="String">'</span>
</pre>
</div>
<p></html></p>
<p>Annoying, but unsurprising. Python is permissive in the sense that it doesn't
have access modifiers like C++ or Java – you can't really define private
attributes of a class. But you can't do just anything, and patching built-ins like
this is one of the things Python prevents us from doing – unless we try very
hard.</p>
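<p>For contrast, here is a sketch of the same patch applied to a class we wrote
ourselves, where Python is perfectly happy to let us do it. <code>Metres</code> and
<code>fake_metres_add</code> are made-up names for illustration.</p>

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Metres(self.value + other.value)


def fake_metres_add(x, y):
    return Metres(x.value - y.value)


before = (Metres(5) + Metres(3)).value
Metres.__add__ = fake_metres_add  # allowed: Metres is an ordinary Python class
after = (Metres(5) + Metres(3)).value
print(before, after)

try:
    int.__add__ = fake_metres_add  # refused: int is a built-in type
    int_patched = True
except TypeError:
    int_patched = False
print(int_patched)
```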
<p>So what can we try instead? All attribute resolution comes down to looking up
attribute names in an object's dictionary. For example, <code>x.y</code> would resolve to
<code>x.__dict__["y"]</code><sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup>. What if we try accessing <code>int.__add__</code> that way?</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> <span class="Function">int</span>.__dict__[<span class="String">"</span><span class="String">__add__</span><span class="String">"</span>] = fake_add
<span id="L2" class="LineNr">2 </span><span class="Type">TypeError</span>: <span class="String">'</span><span class="String">mappingproxy</span><span class="String">'</span> <span class="Function">object</span> does <span class="Statement">not</span> support item assignment
</pre>
</div>
<p></html></p>
<p>Tarnation. But of course, we knew it would not be as easy as this. Perhaps lesser
programmers would give up here. "It's not allowed," they might say. But we are
strong and we are determined.</p>
<p>What is this
<a href="https://docs.python.org/3/library/types.html#types.MappingProxyType">mappingproxy</a> the interpreter speaks of?</p>
<blockquote>
<p>Read-only proxy of a mapping.</p>
</blockquote>
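<p>We can make one of these ourselves with <code>types.MappingProxyType</code> to see
how it behaves. A small sketch, with made-up variable names: writes through the
proxy are refused, but writes to the real dictionary show through it.</p>

```python
import types

real = {"a": 1}
proxy = types.MappingProxyType(real)

try:
    proxy["b"] = 2  # item assignment is not supported on the proxy
    blocked = False
except TypeError:
    blocked = True

real["b"] = 2  # writing to the real dict works...
print(blocked, proxy["b"])  # ...and the proxy sees the new key
```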
<p>Ok, so this is just a read-only wrapper over the actual dictionary. If we can
get at that dictionary, we can assign to it. But converting the proxy with
Python's <code>dict()</code> is just creating a copy:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> <span
class="Function">dict</span>(<span class="Function">int</span>.__dict__)[<span class="String">"</span><span class="String">__add__</span><span class="String">"</span>] = fake_add
<span id="L2" class="LineNr">2 </span>>>> <span class="Number">1</span> + <span class="Number">5</span>
<span id="L3" class="LineNr">3 </span><span class="Number">6</span>
<span id="L4" class="LineNr">4 </span>>>> (<span class="Number">1</span>).__add__(<span class="Number">5</span>)
<span id="L5" class="LineNr">5 </span><span class="Number">6</span>
<span id="L6" class="LineNr">6 </span>>>> <span class="Function">int</span>.__add__ == fake_add
<span id="L7" class="LineNr">7 </span><span class="Function">False</span>
</pre>
</div>
<p></html></p>
<p>We need to go deeper. Let's look at the <a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Objects/descrobject.c#L954">CPython
source</a> for the mappingproxy type.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">typedef</span> <span class="Type">struct</span> {
<span id="L2" class="LineNr"> 2 </span> PyObject_HEAD
<span id="L3" class="LineNr"> 3 </span> PyObject *mapping;
<span id="L4" class="LineNr"> 4 </span>} mappingproxyobject;
<span id="L5" class="LineNr"> 5 </span>
<span id="L6" class="LineNr"> 6 </span><span class="Type">static</span> PyMappingMethods mappingproxy_as_mapping = {
<span id="L7" class="LineNr"> 7 </span> (lenfunc)mappingproxy_len, <span class="Comment">/*</span><span class="Comment"> mp_length </span><span class="Comment">*/</span>
<span id="L8" class="LineNr"> 8 </span> (binaryfunc)mappingproxy_getitem, <span class="Comment">/*</span><span class="Comment"> mp_subscript </span><span class="Comment">*/</span>
<span id="L9" class="LineNr"> 9 </span> <span class="Number">0</span>, <span class="Comment">/*</span><span class="Comment"> mp_ass_subscript </span><span class="Comment">*/</span>
<span id="L10" class="LineNr">10 </span>};
</pre>
</div>
<p></html></p>
<p>The <code>PyMappingMethods</code> of a type tell us how it behaves as a mapping: what does
<code>x[key]</code> do (<code>mp_subscript</code>)? What does <code>x[key] = y</code> do (<code>mp_ass_subscript</code>)?</p>
<p>What this is telling us is that the mapping proxy is basically a wrapper around
a normal dictionary with the function pointer to the subscript assignment
method set to NULL.</p>
<p>We can use ctypes to cast this and reveal the underlying dictionary.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">import</span> ctypes
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span>
<span id="L4" class="LineNr"> 4 </span><span class="Statement">class</span> <span class="Function">PyObject</span>(ctypes.Structure):
<span id="L5" class="LineNr"> 5 </span> <span class="Statement">pass</span>
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span>PyObject._fields_ = [
<span id="L9" class="LineNr"> 9 </span> (<span class="String">'</span><span class="String">ob_refcnt</span><span class="String">'</span>, ctypes.c_ssize_t),
<span id="L10" class="LineNr">10 </span> (<span class="String">'</span><span class="String">ob_type</span><span class="String">'</span>, ctypes.POINTER(PyObject))
<span id="L11" class="LineNr">11 </span>]
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span><span class="Statement">class</span> <span class="Function">MappingProxy</span>(PyObject):
<span id="L15" class="LineNr">15 </span> _fields_ = [(<span class="String">'</span><span class="String">dict</span><span class="String">'</span>, ctypes.POINTER(PyObject))]
</pre>
</div>
<p></html></p>
<p>The trouble is, once we have the dict as a PyObject pointer, how do we get it
back to being a plain old Python dict? It's no good doing this:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> MappingProxy.from_address(<span class="Function">id</span>(<span class="Function">int</span>.__dict__)).dict
<span id="L2" class="LineNr">2 </span><LP_PyObject at <span class="Number">0x7f6e98c8e7b8</span>>
</pre>
</div>
<p></html></p>
<p>if we have no way to interpret this as a dict. Instead, we can use this pleasing wee
trick from the CPython API, courtesy of <a href="http://lucumr.pocoo.org/about/">Armin
Ronacher</a><sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>: put the pointer as a value into another, existing
dictionary, where it will be interpreted the same as any other object, and then
extract it!</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Statement">def</span> <span class="Function">pyobj_cast</span>(obj):
<span id="L2" class="LineNr"> 2 </span> <span class="Statement">return</span> ctypes.cast(<span class="Function">id</span>(obj), ctypes.POINTER(PyObject))
<span id="L3" class="LineNr"> 3 </span>
<span id="L4" class="LineNr"> 4 </span>
<span id="L5" class="LineNr"> 5 </span><span class="Statement">def</span> <span class="Function">get_dict</span>(proxy):
<span id="L6" class="LineNr"> 6 </span> dict_as_pyobj = MappingProxy.from_address(<span class="Function">id</span>(proxy)).dict
<span id="L7" class="LineNr"> 7 </span> fence = {}
<span id="L8" class="LineNr"> 8 </span> ctypes.pythonapi.PyDict_SetItem(
<span id="L9" class="LineNr"> 9 </span> pyobj_cast(fence),
<span id="L10" class="LineNr">10 </span> pyobj_cast(<span class="String">"</span><span class="String">victory</span><span class="String">"</span>),
<span id="L11" class="LineNr">11 </span> dict_as_pyobj)
<span id="L12" class="LineNr">12 </span> <span class="Statement">return</span> fence[<span class="String">"</span><span class="String">victory</span><span class="String">"</span>]
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span>int_dict = get_dict(<span class="Function">int</span>.__dict__)
<span id="L15" class="LineNr">15 </span>int_dict[<span class="String">"</span><span class="String">__add__</span><span class="String">"</span>] = fake_add
</pre>
</div>
<p></html></p>
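<p>If you'd like to try the idea without touching <code>int</code>, here is a tamer
sketch that recovers the dictionary behind a proxy we created ourselves. It
swaps the <code>PyDict_SetItem</code> trick for a different, well-known ctypes idiom
(casting an address to <code>ctypes.py_object</code>), and it assumes the proxy's
<code>mapping</code> pointer sits directly after the standard object header, as in
the struct above. <code>underlying_mapping</code> is a made-up helper name.</p>

```python
import ctypes
import types


def underlying_mapping(proxy):
    """Return the real mapping hidden behind a mappingproxy (CPython only)."""
    # Assumption: the proxy's only field, `mapping`, comes straight after
    # the object header, whose size is object.__basicsize__.
    field_address = id(proxy) + object.__basicsize__
    mapping_address = ctypes.c_void_p.from_address(field_address).value
    # py_object treats the address as a PyObject* and hands back the object.
    return ctypes.cast(mapping_address, ctypes.py_object).value


real = {"a": 1}
proxy = types.MappingProxyType(real)
recovered = underlying_mapping(proxy)
recovered["b"] = 2
print(recovered is real, proxy["b"])
```

<p>Because the dictionary here is one we own, writing to it is no more
dangerous than writing to <code>real</code> directly.</p>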
<p>Have we done it???</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> <span class="Number">1</span> + <span class="Number">1</span>
<span id="L2" class="LineNr">2 </span><span class="Number">2</span>
</pre>
</div>
<p></html></p>
<p>D'oh! But wait a minute...</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span>>>> (<span class="Number">1</span>).__add__(<span class="Number">1</span>)
<span id="L2" class="LineNr">2 </span><span class="Number">0</span>
</pre>
</div>
<p></html></p>
<p>What! But the data model says:</p>
<blockquote>
<p>to evaluate the expression x + y, where x is an instance of a class that has
an __add__() method, x.__add__(y) is called</p>
</blockquote>
<p>We've been lied to...this is clearly not true! It seems CPython has some
shortcut in place.</p>
<p>To be fair, they probably didn't think we'd ever find out this "lie" by
performing these shenanigans. We need to go yet deeper still to fulfill our
pointless quest. We <em>will</em> have control of the builtins.</p>
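<p>There is a tamer, documented relative of this behaviour that you can see
without any hackage: operators look up special methods on the type, not on the
instance. A minimal sketch with a made-up class <code>Loud</code>:</p>

```python
class Loud:
    def __add__(self, other):
        return "from the class"


x = Loud()
# Shadow __add__ on the instance only; the class is untouched.
x.__add__ = lambda other: "from the instance"

via_operator = x + 1        # the operator consults type(x), not x itself
via_attribute = x.__add__(1)  # plain attribute lookup finds the instance one
print(via_operator)
print(via_attribute)
```

<p>So <code>x + y</code> and <code>x.__add__(y)</code> were never quite the same lookup,
and for built-in types the operator goes through C function pointers that our
dictionary edit did not touch.</p>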
<h2>Full type mappings</h2>
<p>Back to the CPython source. What is in this type information that's in the
PyObject struct we looked at earlier? The answer: lots of stuff that I am not going to put here, but the most
interesting parts for our purposes are the <a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Doc/includes/typestruct.h#L16">method
suites</a>:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Type">typedef</span> <span
class="Type">struct</span> _typeobject {
<span id="L2" class="LineNr">2 </span> ...
<span id="L3" class="LineNr">3 </span> PyNumberMethods *tp_as_number;
<span id="L4" class="LineNr">4 </span> PySequenceMethods *tp_as_sequence;
<span id="L5" class="LineNr">5 </span> PyMappingMethods *tp_as_mapping;
<span id="L6" class="LineNr">6 </span> ...
<span id="L7" class="LineNr">7 </span>} PyTypeObject;
</pre>
</div>
<p></html></p>
<p>This struct contains other structs defining the behaviour of the type
via function pointers. We're specifically interested in this
<a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Include/cpython/object.h#L95"><code>tp_as_number</code></a>
member. Its first member, <code>nb_add</code>, is the function pointer to the add
method. This is what we want to overwrite. This is our new target.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Type">typedef</span> <span class="Type">struct</span> {
<span id="L2" class="LineNr">2 </span> binaryfunc nb_add;
<span id="L3" class="LineNr">3 </span> binaryfunc nb_subtract;
<span id="L4" class="LineNr">4 </span> binaryfunc nb_multiply;
<span id="L5" class="LineNr">5 </span> ...
<span id="L6" class="LineNr">6 </span>} PyNumberMethods;
</pre>
</div>
<p></html></p>
<p>So, like we made the ctypes mappings before, I want to do it for this entire
<code>PyTypeObject</code> struct. Which is big...so I'm not putting it all here!</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">import</span> ctypes
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span>
<span id="L4" class="LineNr"> 4 </span><span class="Statement">class</span> <span class="Function">PyObject</span>(ctypes.Structure):
<span id="L5" class="LineNr"> 5 </span> <span class="Statement">pass</span>
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Statement">class</span> <span class="Function">PyTypeObject</span>(ctypes.Structure):
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">pass</span>
<span id="L10" class="LineNr">10 </span>
<span id="L11" class="LineNr">11 </span>
<span id="L12" class="LineNr">12 </span>Py_ssize_t = ctypes.c_ssize_t
<span id="L13" class="LineNr">13 </span>binaryfunc = ctypes.CFUNCTYPE(
<span id="L14" class="LineNr">14 </span> ctypes.POINTER(PyObject),
<span id="L15" class="LineNr">15 </span> ctypes.POINTER(PyObject),
<span id="L16" class="LineNr">16 </span> ctypes.POINTER(PyObject))
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span>
<span id="L19" class="LineNr">19 </span><span class="Statement">class</span> <span class="Function">PyNumberMethods</span>(ctypes.Structure):
<span id="L20" class="LineNr">20 </span> _fields_ = [
<span id="L21" class="LineNr">21 </span> (<span class="String">"</span><span class="String">nb_add</span><span class="String">"</span>, binaryfunc),
<span id="L22" class="LineNr">22 </span> (<span class="String">"</span><span class="String">nb_subtract</span><span class="String">"</span>, binaryfunc),
<span id="L23" class="LineNr">23 </span> (<span class="String">"</span><span class="String">nb_multiply</span><span class="String">"</span>, binaryfunc),
<span id="L24" class="LineNr">24 </span> ...
<span id="L25" class="LineNr">25 </span>
<span id="L26" class="LineNr">26 </span>PyTypeObject._fields_ = [
<span id="L27" class="LineNr">27 </span> ...
<span id="L28" class="LineNr">28 </span> (<span class="String">"</span><span class="String">tp_as_number</span><span class="String">"</span>, ctypes.POINTER(PyNumberMethods)),
<span id="L29" class="LineNr">29 </span> ...
<span id="L30" class="LineNr">30 </span>
<span id="L31" class="LineNr">31 </span>
<span id="L32" class="LineNr">32 </span>PyObject._fields_ = [
<span id="L33" class="LineNr">33 </span> (<span class="String">"</span><span class="String">ob_refcnt</span><span class="String">"</span>, Py_ssize_t),
<span id="L34" class="LineNr">34 </span> (<span class="String">"</span><span class="String">ob_type</span><span class="String">"</span>, ctypes.POINTER(PyTypeObject))]
</pre>
</div>
<p></html></p>
<p>So here we've basically made a Python mapping of the structs we have in C. If
we cast our Python <code>int</code> type to the equivalent type struct, we'll reveal the
secrets usually hidden from us.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> PyLong_Type = ctypes.cast(<span class="Function">id</span>(<span class="Function">int</span>), ctypes.POINTER(PyTypeObject)).contents
<span id="L2" class="LineNr">2 </span>>>> PyLong_Type.tp_as_number.contents.nb_add = PyLong_Type.tp_as_number.contents.nb_subtract
<span id="L3" class="LineNr">3 </span>>>> <span class="Number">10</span> + <span class="Number">4</span>
<span id="L4" class="LineNr">4 </span><span class="Number">6</span>
<span id="L5" class="LineNr">5 </span>>>> <span class="Number">1</span> + <span class="Number">1</span>
<span id="L6" class="LineNr">6 </span><span class="Number">0</span>
<span id="L7" class="LineNr">7 </span>>>> <span class="Number">1</span> + <span class="Number">3</span>
<span id="L8" class="LineNr">8 </span>-<span class="Number">2</span>
</pre>
</div>
<p></html></p>
<p>We did it!!! Incredible.</p>
<p>But now we know how to patch built-ins...what if we went further? What if we
added functionality that wasn't there before, rather than altering existing
functionality?</p>
<h2>Nice immutable string you got there. It would be a shame if something should...happen to it 😏</h2>
<p>In Python, strings are immutable. You can't go in and change one of the
characters – you have to create a new string object. When you add characters
to an existing string variable, a new string object is created.<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup></p>
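<p>A quick sketch of the normal, law-abiding behaviour, for reference. The
variable names are arbitrary.</p>

```python
s = "hello"
try:
    s[0] = "j"  # strings do not support item assignment
    mutated = True
except TypeError:
    mutated = False

t = s           # keep a second reference to the original string
s += " world"   # rebinds s to a new string object; t is left alone
print(mutated, t, s is t)
```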
<p>What if we made strings mutable?</p>
<p>Let's have a look at <code>PyUnicode_Type</code> in the <a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Objects/unicodeobject.c">CPython
source</a>...and
then swiftly look away from all 16k lines of it cos it's distressingly complex
as it has to handle unicode and all its complexities as well as ASCII: good
times. We want to find the
<a href="https://github.com/python/cpython/blob/10e5c66789a06dc9015a24968e96e77a75725a7a/Objects/unicodeobject.c#L14126"><code>tp_as_mapping</code></a> member of the <code>PyUnicode_Type</code>
struct:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Type">static</span> PyMappingMethods unicode_as_mapping = {
<span id="L2" class="LineNr">2 </span> (lenfunc)unicode_length, <span class="Comment">/*</span><span class="Comment"> mp_length </span><span class="Comment">*/</span>
<span id="L3" class="LineNr">3 </span> (binaryfunc)unicode_subscript, <span
class="Comment">/*</span><span class="Comment"> mp_subscript </span><span class="Comment">*/</span>
<span id="L4" class="LineNr">4 </span> (objobjargproc)<span
class="Number">0</span>, <span class="Comment">/*</span><span class="Comment"> mp_ass_subscript </span><span class="Comment">*/</span>
<span id="L5" class="LineNr">5 </span>};
</pre>
</div>
<p></html></p>
<p>We want to create a new function to point to from the <code>mp_ass_subscript</code>
member. Here's my extremely hacked one which wouldn't handle every case, not at
all. But I think it's going to allow us to do what we want.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Type">static</span> <span class="Type">int</span>
<span id="L2" class="LineNr"> 2 </span><span class="Function">unicode_ass_subscript</span>(PyUnicodeObject* self, PyObject* item, PyObject* value)
<span id="L3" class="LineNr"> 3 </span>{
<span id="L4" class="LineNr"> 4 </span> Py_ssize_t i = ((PyLongObject*)(item))->ob_digit[<span class="Number">0</span>];
<span id="L5" class="LineNr"> 5 </span> <span class="Type">unsigned</span> <span class="Type">int</span> kind = ((PyASCIIObject*)(self))->state.kind;
<span id="L6" class="LineNr"> 6 </span> <span class="Type">char</span>* data = ((<span class="Type">char</span>*)((PyASCIIObject*)(self) + <span class="Number">1</span>));
<span id="L7" class="LineNr"> 7 </span> <span class="Type">char</span>* new_data = ((<span class="Type">char</span>*)((PyASCIIObject*)(value) + <span class="Number">1</span>));
<span id="L8" class="LineNr"> 8 </span> *(data + kind * i) = *new_data;
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">return</span> <span class="Number">0</span>;
<span id="L10" class="LineNr">10 </span>}
<span id="L11" class="LineNr">11 </span>
<span id="L12" class="LineNr">12 </span><span class="Type">static</span> PyMappingMethods unicode_as_mapping = {
<span id="L13" class="LineNr">13 </span> (lenfunc)unicode_length, <span class="Comment">/*</span><span class="Comment"> mp_length </span><span class="Comment">*/</span>
<span id="L14" class="LineNr">14 </span> (binaryfunc)unicode_subscript, <span class="Comment">/*</span><span class="Comment"> mp_subscript </span><span class="Comment">*/</span>
<span id="L15" class="LineNr">15 </span> (objobjargproc)unicode_ass_subscript, <span class="Comment">/*</span><span class="Comment"> mp_ass_subscript </span><span class="Comment">*/</span>
<span id="L16" class="LineNr">16 </span>};
</pre>
</div>
<p></html></p>
<p>I don't want to just change the source code of the Python binary I'm using.
That is cheating. I want to break Python from the <em>inside</em>.</p>
<p>But I <em>can</em> use this to get the machine code that I want to replace the
subscript assignment function with. (This is now just stupid compared to what
we were doing before...and not at all portable. But we are doing it.)</p>
<p>First, I built CPython with this new source code in there. I can retrieve the
machine code generated by our new function using <code>objdump</code>:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span>objdump -d Objects/unicodeobject.o
</pre>
</div>
<p></html></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Number">00000000000000</span><span class="Identifier">f4</span> <<span class="Identifier">unicode_ass_subscript</span>>:
<span id="L2" class="LineNr"> 2 </span> <span class="Identifier">f4</span>: <span class="Number">8</span><span class="Identifier">b</span> <span class="Number">4</span><span class="Identifier">e</span> <span class="Number">18</span> <span class="Identifier">mov</span> <span class="Number">0x18</span>(%<span class="Identifier">rsi</span>),%<span class="Identifier">ecx</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Identifier">f7</span>: 0<span class="Identifier">f</span> <span class="Identifier">b6</span> <span class="Number">47</span> <span class="Number">20</span> <span class="Identifier">movzbl</span> <span class="Number">0x20</span>(%<span class="Identifier">rdi</span>),%<span class="Identifier">eax</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Identifier">fb</span>: <span class="Identifier">c0</span> <span class="Identifier">e8</span> <span class="Number">02 </span> <span class="Identifier">shr</span> $<span class="Number">0x2</span>,%<span class="Identifier">al</span>
<span id="L5" class="LineNr"> 5 </span> <span class="Identifier">fe</span>: <span class="Number">83</span> <span class="Identifier">e0</span> <span class="Number">07 </span> <span class="Identifier">and</span> $<span class="Number">0x7</span>,%<span class="Identifier">eax</span>
<span id="L6" class="LineNr"> 6 </span> <span class="Number">101</span>: <span class="Number">48</span> 0<span class="Identifier">f</span> <span class="Identifier">af</span> <span class="Identifier">c1</span> <span class="Identifier">imul</span> %<span class="Identifier">rcx</span>,%<span class="Identifier">rax</span>
<span id="L7" class="LineNr"> 7 </span> <span class="Number">105</span>: 0<span class="Identifier">f</span> <span class="Identifier">b6</span> <span class="Number">52</span> <span class="Number">30</span> <span class="Identifier">movzbl</span> <span class="Number">0x30</span>(%<span class="Identifier">rdx</span>),%<span class="Identifier">edx</span>
<span id="L8" class="LineNr"> 8 </span> <span class="Number">109</span>: <span class="Number">88</span> <span class="Number">54</span> <span class="Number">07 30</span> <span class="Identifier">mov</span> %<span class="Identifier">dl</span>,<span class="Number">0x30</span>(%<span class="Identifier">rdi</span>,%<span class="Identifier">rax</span>,<span class="Number">1</span>)
<span id="L9" class="LineNr"> 9 </span> <span class="Number">10</span><span class="Identifier">d</span>: <span class="Identifier">b8</span> <span class="Number">00 00 00 00 </span> <span class="Identifier">mov</span> $<span class="Number">0x0</span>,%<span class="Identifier">eax</span>
<span id="L10" class="LineNr">10 </span> <span class="Number">112</span>: <span class="Identifier">c3</span> <span class="Identifier">retq</span>
</pre>
</div>
<p></html></p>
<p>Then, in a standard un-patched Python session, let's copy the machine code
we've just compiled from the C, and make a new function pointer that points to
it:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">import</span> ctypes
<span id="L2" class="LineNr"> 2 </span><span class="PreProc">import</span> mmap
<span id="L3" class="LineNr"> 3 </span>
<span id="L4" class="LineNr"> 4 </span>PyUnicode_Type = ctypes.cast(<span class="Function">id</span>(<span class="Function">str</span>), ctypes.POINTER(PyTypeObject)).contents
<span id="L5" class="LineNr"> 5 </span>
<span id="L6" class="LineNr"> 6 </span>payload = (
<span id="L7" class="LineNr"> 7 </span> b<span class="String">"</span><span class="Special">\x8b\x4e\x18\x0f\xb6\x47\x20\xc0\xe8\x02\x83\xe0\x07\x48\x0f\xaf</span><span class="String">"</span>
<span id="L8" class="LineNr"> 8 </span> b<span class="String">"</span><span class="Special">\xc1\x0f\xb6\x52\x30\x88\x54\x07\x30\xb8\x00\x00\x00\x00\xc3</span><span class="String">"</span>)
<span id="L9" class="LineNr"> 9 </span>buf = mmap.mmap(
<span id="L10" class="LineNr">10 </span> -<span class="Number">1</span>,
<span id="L11" class="LineNr">11 </span> <span class="Function">len</span>(payload),
<span id="L12" class="LineNr">12 </span> prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
<span id="L13" class="LineNr">13 </span>buf.write(payload)
<span id="L14" class="LineNr">14 </span>fpointer = ctypes.c_void_p.from_buffer(buf)
<span id="L15" class="LineNr">15 </span>bad_boi = objobjargproc(ctypes.addressof(fpointer))
<span id="L16" class="LineNr">16 </span>PyUnicode_Type.tp_as_mapping.contents.mp_ass_subscript = bad_boi
</pre>
</div>
<p></html></p>
<p>Before we scored this righteous hack, this would happen:
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> x = <span class="String">"</span><span class="String">hack the planet</span><span class="String">"</span>
<span id="L2" class="LineNr">2 </span>>>> x[<span class="Number">1</span>] = <span class="String">"</span><span class="String">4</span><span class="String">"</span>
<span id="L3" class="LineNr">3 </span><span class="Type">TypeError</span>: <span class="String">'</span><span class="String">str</span><span class="String">'</span> <span class="Function">object</span> does <span class="Statement">not</span> support item assignment
</pre>
</div>
<p></html>
but now...
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span>>>> x = <span class="String">"</span><span class="String">hack the planet</span><span class="String">"</span>
<span id="L2" class="LineNr">2 </span>>>> x[<span class="Number">1</span>] = <span class="String">"</span><span class="String">4</span><span class="String">"</span>
<span id="L3" class="LineNr">3 </span>>>> <span class="Function">print</span>(x)
<span id="L4" class="LineNr">4 </span>h4ck the planet
</pre>
</div>
<p></html></p>
<p><img alt="floppy" class="callout" src="/blog/images/hackersfloppydraw.gif" title="Zero Cool in Hackers drawing floppy disks like they are guns...precious"></p>
<p>And so our pointless quest is over. I hope you had fun. Sometimes it is good to
remember that computers can be just for fun. 😊</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>More information about this is in the <a href="https://docs.python.org/3/reference/datamodel.html">Data
Model</a>. In short, there is
a defined resolution order for class hierarchies. So if <code>x.__dict__</code> didn't
have the key <code>"y"</code>, then we'll next look in base classes of <code>x</code>, &c. <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>I first saw Armin Ronacher do this on Twitter, so a lot of credit is due to
him for this great trick. Someone has uploaded his code
<a href="https://gist.github.com/mahmoudimus/295200">here</a> (I can't find the original
tweet). <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>There is a good explanation as to why
<a href="https://www.quora.com/Why-are-Python-strings-immutable/answer/Michael-Veksler">here</a> <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
</ol>
</div>C++ is not a superset of C2019-09-02T00:00:00+01:002019-09-02T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2019-09-02:/blog/cpp-is-not-a-superset-of-c.html<p>If you're not familiar with both languages, you might have heard people say
that C++ is a superset of C. If you're experienced in both languages, you'll
know that this is not true at all.</p>
<p>Of course, C++ has many features that C does not; but there are also a few
features that only C has. And, perhaps most importantly, there is code that
compiles in both languages but does different things.</p>
<p>There's a lot of information about the differences between the two languages
available, but a lot of it seems scattered. I wanted to have a go at creating a
concise guide for the details that are often overlooked, with excerpts from the
language standards to back these up.</p>
<h2>Notes</h2>
<p>This is primarily aimed at people who are familiar with at least one of C or
C++.</p>
<p>When I refer to C++, I mean C++11 onwards, though much of this will apply to
earlier standards. I'll be referencing the C++17 standard<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup>.</p>
<p>When I refer to C, I mean C99 onwards. I'll be referencing the C11
standard<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>.</p>
<p>It's worth noting that a lot of compilers aren't fully compliant, or have
extensions that aren't part of the standard. To me, this is part of what makes
it difficult to pick apart what is standard, what is non-compliant, and what is
implementation defined. I recommend <a href="https://godbolt.org">Compiler Explorer</a> if
you want to see what other compilers might output if you are experimenting with
any examples.</p>
<h2>Update</h2>
<p>I've made some updates after <a href="https://news.ycombinator.com/item?id=20869451">some helpful feedback</a>:</p>
<ul>
<li>
<p>fixing mistakes in the <code>const</code> section</p>
</li>
<li>
<p>clarifying the use of implicit int in the <code>auto</code> section</p>
</li>
</ul>
<p>The original post is on the <a href="https://web.archive.org/web/20190903193635/https://mcla.ug/blog/cpp-is-not-a-superset-of-c.html">Internet
Archive</a>.</p>
<h1>Code that compiles in both languages, but does different things in each</h1>
<p>This is the category of differences that I think is most important. Not
everything that C and C++ appear to share is as it seems.</p>
<h2>const</h2>
<h3>What can be a constant expression?</h3>
<p>The keyword <code>const</code> has a different semantic meaning in C++ than in C, but it's
more subtle than I originally thought when first writing this blog post.</p>
<p>The differences come down to what each language allows to be a <em>constant
expression</em>. A constant expression can be evaluated at compile time.
Compile-time evaluation is needed for e.g. the size of a static array, as in
the following example which will compile in C++, but whether it compiles in C
will be implementation defined:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Type">const</span> <span class="Type">size_t</span> buffer_size = <span class="Number">5</span>;
<span id="L2" class="LineNr">2 </span><span class="Type">int</span> buffer[buffer_size];
<span id="L3" class="LineNr">3 </span>
<span id="L4" class="LineNr">4 </span><span class="Comment">// int main() {</span>
<span id="L5" class="LineNr">5 </span> <span class="Comment">// ...</span>
<span id="L6" class="LineNr">6 </span><span class="Comment">// }</span>
</pre>
</div>
<p></html></p>
<p>We'll need to piece together a few different pieces of the C11 standard to
understand why this is implementation defined.</p>
<p>C11 6.6 paragraph 6 defines an <em>integer constant expression</em>:</p>
<blockquote>
<p>An integer constant expression shall have integer type and shall only have
operands that are integer constants, enumeration constants, character
constants, <code>sizeof</code> expressions whose results are integer constants, and
floating constants that are the immediate operands of casts. Cast operators
in an integer constant expression shall only convert arithmetic types to
integer types, except as part of an operand to the <code>sizeof</code> operator.</p>
</blockquote>
<p>But what is an "integer constant"? From 6.4.4, these are literal values, not
variables, e.g. <code>1</code>.</p>
<p>What this boils down to is that only expressions like <code>1</code> or <code>5 + 7</code> can be
constant expressions in C. Variables can't be constant expressions. As
expected, this example <a href="https://godbolt.org/z/oMq16u">doesn't compile with
gcc</a>. But <a href="https://godbolt.org/z/GgQU8X">it <em>does</em> compile with
Clang</a>: why?</p>
<p>The answer is one final piece of the puzzle, C11 6.6 paragraph 10:</p>
<blockquote>
<p>An implementation may accept other forms of constant expressions.</p>
</blockquote>
<p>A portable version of the code above in C would have to use a preprocessor
macro:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="PreProc">#define </span><span class="Function">BUFFER_SIZE </span><span class="PreProc">(</span><span class="Number">5</span><span class="PreProc">)</span>
<span id="L2" class="LineNr">2 </span><span class="Type">int</span> buffer[BUFFER_SIZE];
</pre>
</div>
<p></html></p>
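<p>The passage quoted from C11 6.6 paragraph 6 also lists enumeration constants,
so an <code>enum</code> is another way to get a portable array size in C without a
macro. A minimal sketch (the names <code>BUFFER_SIZE</code>, <code>buffer</code> and
<code>buffer_len</code> are only for illustration):</p>

```c
#include <stddef.h>

/* Enumeration constants are integer constant expressions in C (C11 6.6p6),
   so they can size a file-scope array without a macro. */
enum { BUFFER_SIZE = 5 };

static int buffer[BUFFER_SIZE];

/* Number of elements in buffer, computed at compile time. */
size_t buffer_len(void) {
    return sizeof buffer / sizeof buffer[0];
}
```

<p>The trade-off is that an enumeration constant always has type <code>int</code>,
so this only suits sizes that fit in an <code>int</code>.</p>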
<p>The keyword <code>const</code> was created for this very purpose by Bjarne Stroustrup<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup>:
to reduce the need for macros. C++ is much more permissive about what can be a
constant expression, making <code>const</code> variables more powerful.</p>
<p>It was a surprise to me to learn that <code>const</code> originated in what would become
C++, and was then adopted by C. I had assumed that <code>const</code> came from C, and C++
took the same concept and extended it in order to reduce the need for macros. I
understand macros are embraced by C, but it seems a shame to deliberately
reduce the usefulness of <code>const</code> when standardising C.</p>
<h3>Linkage</h3>
<p>Another difference is that file-scope <code>const</code> variables have internal linkage
by default in C++. This is so that you can make a <code>const</code> declaration in a
header without having multiple definition errors<sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup>.</p>
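<p>In C, a plain file-scope <code>const</code> object has external linkage, so the same
header trick needs an explicit <code>static</code>. A sketch, assuming a hypothetical
header <code>config.h</code> that several source files include (the names
<code>max_retries</code> and <code>retries_left</code> are made up for this example):</p>

```c
/* config.h (hypothetical header, included from several source files) */
#ifndef CONFIG_H
#define CONFIG_H

/* In C a plain file-scope `const int` has external linkage, so defining it
   in a header included by several files risks multiple-definition errors at
   link time. `static` gives it internal linkage, matching the C++ default. */
static const int max_retries = 3;

#endif /* CONFIG_H */

/* Example user of the constant. */
int retries_left(int attempts_made) {
    return max_retries - attempts_made;
}
```

<p>Written this way, the declaration means the same thing whether the header is
compiled as C or as C++.</p>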
<h3>Modifying const variables</h3>
<p>The following code is a constraint violation in C:</p>
<p><html></p>
<div class=code>
<pre id="vimCodeElement"><span id="L1" class="LineNr">1 </span><span class="Type">const</span> <span class="Type">int</span> foo = <span class="Number">1</span>;
<span id="L2" class="LineNr">2 </span><span class="Type">int</span>* bar = &foo;
<span id="L3" class="LineNr">3 </span>*bar = <span class="Number">2</span>;
</pre>
</div>
<p></html></p>
<p>C11 6.5.16.1 paragraph 1 lists some constraints, one of which must be true for
an assignment to be valid. The relevant constraint for our example:</p>
<blockquote>
<p>the left operand has atomic, qualified, or unqualified pointer type,
and (considering the type the left operand would have after lvalue
conversion) both operands are pointers to qualified or unqualified versions
of compatible types, and <em>the type pointed to by the left has all the
qualifiers of the type pointed to by the right</em></p>
</blockquote>
<p>To be conformant, the compiler must generate a diagnostic if there's a
constraint violation. This could be a warning or an error. I've found that it
is generally a warning, meaning this <a href="https://godbolt.org/z/DZdBRW">can often be compiled in
C</a>, though it would give undefined behaviour<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup>.</p>
<p><a href="https://godbolt.org/z/yMUvpg">This would not compile as C++</a>. I think this
is because in C++ <code>const T</code> is a distinct type from <code>T</code>, and the implicit
conversion is not allowed. In C, the <code>const</code> is just a qualifier. I could be
misunderstanding, however.</p>
<p>C++17 6.7.3:</p>
<blockquote>
<p>The cv-qualified or cv-unqualified versions of a type are distinct types</p>
</blockquote>
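<p>For comparison, here is a sketch of a version that satisfies the C11
constraint quoted earlier and is also accepted by a C++ compiler. The pointer
keeps the <code>const</code> qualifier, so the object can be read through it but not
written (<code>read_foo</code> is an illustrative name):</p>

```c
static const int foo = 1;

int read_foo(void) {
    /* The pointed-to type carries all of foo's qualifiers, so the
       initialisation satisfies the constraint. */
    const int *bar = &foo;

    /* Writing `*bar = 2;` here would be diagnosed in both languages,
       because *bar is not a modifiable lvalue. */
    return *bar;
}
```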
<h2>Function declarations with no arguments</h2>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Type">int</span> <span class="Function">func</span>();
</pre>
</div>
<p></html></p>
<p>In C++, this declares a function that takes no arguments. But in C, this
declares a function that could take any number of arguments of any type.</p>
<p>From the C11 standard 6.7.6.3 paragraphs 10 and 14:</p>
<blockquote>
<p>The special case of an unnamed parameter of type void as the only item in the
list specifies that the function has no parameters.</p>
<p>An empty list in a function declarator that is part of a definition of that
function specifies that the function has no parameters. The empty list in a
function declarator that is not part of a definition of that function
specifies that no information about the number or types of the parameters is
supplied.</p>
</blockquote>
<p>So the following would be legit C:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Comment">// func.h</span>
<span id="L2" class="LineNr">2 </span><span class="Type">int</span> <span class="Function">func</span>();
</pre>
</div>
<p></html></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Comment">// func.c</span>
<span id="L2" class="LineNr">2 </span><span class="Type">int</span> <span class="Function">func</span>(<span class="Type">int</span> foo, <span class="Type">int</span> bar) {
<span id="L3" class="LineNr">3 </span> <span class="Statement">return</span> foo + bar;
<span id="L4" class="LineNr">4 </span>}
</pre>
</div>
<p></html></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Comment">// main.c</span>
<span id="L2" class="LineNr">2 </span><span class="PreProc">#include </span><span class="String">"func.h"</span>
<span id="L3" class="LineNr">3 </span>
<span id="L4" class="LineNr">4 </span><span class="Type">int</span> <span class="Function">main</span>() {
<span id="L5" class="LineNr">5 </span> <span class="Statement">return</span> <span class="Function">func</span>(<span class="Number">5</span>, <span class="Number">6</span>);
<span id="L6" class="LineNr">6 </span>}
</pre>
</div>
<p></html></p>
<p>This would result in a compiler error in C++:</p>
<div class="highlight"><pre><span></span><code><span class="err"> main.c:5:12: error: no matching function for call to 'func'</span>
<span class="err"> return func(5, 6);</span>
<span class="err"> ^~~~</span>
<span class="err"> ./func.h:2:5: note: candidate function not viable:</span>
<span class="err"> requires 0 arguments, but 2 were provided</span>
</code></pre></div>
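<p>Going by the first paragraph quoted from the C11 standard, writing
<code>void</code> in the parameter list gives a declaration that means "no
parameters" in C as well as in C++. A small sketch, with an illustrative function
body:</p>

```c
/* func.h: (void) means "takes no arguments" in both C and C++ */
int func(void);

/* func.c */
int func(void) {
    return 11;
}

/* With this prototype in scope, a call such as func(5, 6) is diagnosed
   by a C compiler as well as by a C++ one. */
```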
<h3>The effect of name mangling</h3>
<p>There are some common implementation details that allow us to take this
further. On my Linux machine using Clang, the following C compiles and links
(though the result would of course be undefined):</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Comment">// func.h</span>
<span id="L2" class="LineNr">2 </span><span class="Type">int</span> <span class="Function">func</span>(<span class="Type">int</span> foo, <span class="Type">int</span> bar);
</pre>
</div>
<p></html></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="PreProc">#include </span><span class="String"><stdio.h></span>
<span id="L2" class="LineNr">2 </span>
<span id="L3" class="LineNr">3 </span><span class="Comment">// func.c</span>
<span id="L4" class="LineNr">4 </span><span class="Type">int</span> <span class="Function">func</span>(<span class="Type">float</span> foo, <span class="Type">float</span> bar) {
<span id="L5" class="LineNr">5 </span> <span class="Statement">return</span> <span class="Function">printf</span>(<span class="String">"</span><span class="Special">%f</span><span class="String">, </span><span class="Special">%f</span><span class="Special">\n</span><span class="String">"</span>, foo, bar);
<span id="L6" class="LineNr">6 </span>}
</pre>
</div>
<p></html></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Comment">// main.c</span>
<span id="L2" class="LineNr">2 </span><span class="PreProc">#include </span><span class="String">"func.h"</span>
<span id="L3" class="LineNr">3 </span>
<span id="L4" class="LineNr">4 </span><span class="Type">int</span> <span class="Function">main</span>() {
<span id="L5" class="LineNr">5 </span> <span class="Statement">return</span> <span class="Function">func</span>(<span class="Number">5</span>, <span class="Number">6</span>);
<span id="L6" class="LineNr">6 </span>}
</pre>
</div>
<p></html></p>
<p>This does not compile in C++. C++ compilers commonly use name mangling to
enable function overloading. They "mangle" the names of functions in order to
encode their arguments, e.g. by appending the argument types to the function
name. Generally, C compilers just store the function name as the symbol. We can
see this by comparing the symbol table of <code>func.o</code> when compiled as C and C++.</p>
<p>As C:</p>
<div class="highlight"><pre><span></span><code><span class="err">╰─λ</span> <span class="n">objdump</span> <span class="o">-</span><span class="n">t</span> <span class="n">func</span><span class="p">.</span><span class="n">o</span>
<span class="n">func</span><span class="p">.</span><span class="n">o</span><span class="p">:</span> <span class="n">file</span> <span class="n">format</span> <span class="n">elf64</span><span class="o">-</span><span class="n">x86</span><span class="o">-</span><span class="mi">64</span>
<span class="n">SYMBOL</span> <span class="k">TABLE</span><span class="p">:</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">df</span> <span class="o">*</span><span class="k">ABS</span><span class="o">*</span> <span class="mi">0000000000000000</span> <span class="n">foo</span><span class="p">.</span><span class="k">c</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">d</span> <span class="p">.</span><span class="nb">text</span> <span class="mi">0000000000000000</span> <span class="p">.</span><span class="nb">text</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">d</span> <span class="p">.</span><span class="n">rodata</span><span class="p">.</span><span class="n">str1</span><span class="p">.</span><span class="mi">1</span> <span class="mi">0000000000000000</span> <span class="p">.</span><span class="n">rodata</span><span class="p">.</span><span class="n">str1</span><span class="p">.</span><span class="mi">1</span>
<span class="mi">0000000000000000</span> <span class="k">g</span> <span class="n">F</span> <span class="p">.</span><span class="nb">text</span> <span class="mi">000000000000002</span><span class="n">e</span> <span class="n">func</span>
<span class="mi">0000000000000000</span> <span class="o">*</span><span class="n">UND</span><span class="o">*</span> <span class="mi">0000000000000000</span> <span class="n">printf</span>
</code></pre></div>
<p>As C++:</p>
<div class="highlight"><pre><span></span><code><span class="err">╰─λ</span> <span class="n">objdump</span> <span class="o">-</span><span class="n">t</span> <span class="n">func</span><span class="p">.</span><span class="n">o</span>
<span class="n">func</span><span class="p">.</span><span class="n">o</span><span class="p">:</span> <span class="n">file</span> <span class="n">format</span> <span class="n">elf64</span><span class="o">-</span><span class="n">x86</span><span class="o">-</span><span class="mi">64</span>
<span class="n">SYMBOL</span> <span class="k">TABLE</span><span class="p">:</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">df</span> <span class="o">*</span><span class="k">ABS</span><span class="o">*</span> <span class="mi">0000000000000000</span> <span class="n">foo</span><span class="p">.</span><span class="k">c</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">d</span> <span class="p">.</span><span class="nb">text</span> <span class="mi">0000000000000000</span> <span class="p">.</span><span class="nb">text</span>
<span class="mi">0000000000000000</span> <span class="n">l</span> <span class="n">d</span> <span class="p">.</span><span class="n">rodata</span><span class="p">.</span><span class="n">str1</span><span class="p">.</span><span class="mi">1</span> <span class="mi">0000000000000000</span> <span class="p">.</span><span class="n">rodata</span><span class="p">.</span><span class="n">str1</span><span class="p">.</span><span class="mi">1</span>
<span class="mi">0000000000000000</span> <span class="k">g</span> <span class="n">F</span> <span class="p">.</span><span class="nb">text</span> <span class="mi">000000000000003</span><span class="n">b</span> <span class="n">_Z4funcff</span>
<span class="mi">0000000000000000</span> <span class="o">*</span><span class="n">UND</span><span class="o">*</span> <span class="mi">0000000000000000</span> <span class="n">printf</span>
</code></pre></div>
<p>These implementation details are not part of the standards, but I'd be
surprised to see an implementation that did something wildly different.</p>
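<p>A widely used convention that follows from this (it is not something the
example above relies on) is to wrap the declarations in a C header in
<code>extern "C"</code> when the header may also be included from C++, so that the
C++ compiler refers to the unmangled symbol. A sketch of such a header, reusing
the name <code>func</code> with an illustrative body:</p>

```c
/* func.h: usable from both C and C++ translation units */
#ifdef __cplusplus
extern "C" {   /* only seen by C++ compilers: use the C symbol name */
#endif

int func(int foo, int bar);

#ifdef __cplusplus
}
#endif

/* func.c: compiled as C, so the symbol is plain `func` */
int func(int foo, int bar) {
    return foo + bar;
}
```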
<h2>auto</h2>
<p>I mostly include this for fun, as I think it's not as well known as it could
be. <code>auto</code> is used for type-inference in C++, but is also a C keyword, just one
that I've never actually seen used.</p>
<p><code>auto</code> is used to declare something with automatic storage class. It's rarely
seen because this is the default storage class for all variables declared
within a block.</p>
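<p>A sketch of <code>auto</code> used as a storage-class specifier in C, with the
type spelt out (<code>answer</code> is a made-up name; a C++11 or later compiler
rejects this spelling):</p>

```c
int answer(void) {
    auto int x = 42;  /* means exactly the same as: int x = 42; */
    return x;
}
```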
<p>The following C has a constraint violation, namely not specifying a type<sup id="fnref:6"><a class="footnote-ref" href="#fn:6">6</a></sup>.
This could error, but I've never found a compiler to give it anything but a
warning about implicit conversion:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Type">int</span> <span class="Function">main</span>() {
<span id="L2" class="LineNr">2 </span> <span class="Type">auto</span> x = <span class="String">"actually an int"</span>;
<span id="L3" class="LineNr">3 </span> <span class="Statement">return</span> x;
<span id="L4" class="LineNr">4 </span>}
</pre>
</div>
<p></html></p>
<p>Before C99, it was legal to have no type specifiers, and the type would be
assumed to be <code>int</code>. This is what happens when I compile this with
<a href="https://godbolt.org/z/ok1OSi">Clang</a> and <a href="https://godbolt.org/z/v5TTqP">gcc</a>,
and so we get a warning due to implicitly converting a <code>char</code> array to <code>int</code>.</p>
<p>In C++ this wouldn't compile, as the type of <code>x</code> is inferred to be <code>const char*</code>:</p>
<div class="highlight"><pre><span></span><code><span class="n">error</span><span class="o">:</span> <span class="n">cannot</span> <span class="n">initialize</span> <span class="k">return</span> <span class="n">object</span> <span class="n">of</span> <span class="n">type</span> <span class="s1">'int'</span> <span class="k">with</span> <span class="n">an</span> <span class="n">lvalue</span> <span class="n">of</span> <span class="n">type</span> <span class="s1">'const char *'</span>
<span class="k">return</span> <span class="n">x</span><span class="o">;</span>
</code></pre></div>
<h1>Features C has that C++ doesn't have</h1>
<p>Despite C being a very small language, and C++ being huge, there are a few
features that C has that C++ does not.</p>
<h2>Variable length arrays</h2>
<p>VLAs allow you to define an array of automatic storage duration with variable
length. E.g.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Type">void</span> <span class="Function">f</span>(<span class="Type">int</span> n) {
<span id="L2" class="LineNr">2 </span> <span class="Type">int</span> arr[n];
<span id="L3" class="LineNr">3 </span> <span class="Comment">// ......</span>
<span id="L4" class="LineNr">4 </span>}
</pre>
</div>
<p></html></p>
<p>VLAs were actually made optional in the C11 standard, which makes them not very
portable.</p>
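<p>C11 lets an implementation announce that it lacks VLAs by defining the macro
<code>__STDC_NO_VLA__</code>, so code can check for them before relying on one. A
sketch (the function name <code>vla_length</code> and the fallback of returning 0
are assumptions made for illustration):</p>

```c
#include <stddef.h>

/* Returns the element count of a VLA of length n, or 0 if the
   implementation does not support VLAs. n must be greater than zero. */
size_t vla_length(size_t n) {
#ifdef __STDC_NO_VLA__
    (void)n;
    return 0;
#else
    int arr[n];                         /* length chosen at run time */
    return sizeof arr / sizeof arr[0];  /* sizeof a VLA is evaluated at run time */
#endif
}
```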
<p>These aren't part of C++, probably in part because the C++ standard library
relies heavily on dynamic memory allocation to create containers like
<code>std::vector</code> that can be used similarly. There are reasons you might not want
this dynamic allocation, but then perhaps you would not be using C++.</p>
<h2>Restricted pointers</h2>
<p>C defines a third type qualifier (in addition to <code>const</code> and <code>volatile</code>):
<code>restrict</code><sup id="fnref:7"><a class="footnote-ref" href="#fn:7">7</a></sup>. This is only used with pointers. Making a pointer restricted is
telling the compiler "I will only access the underlying object via this pointer
for the scope of this pointer". Consequently it can't be aliased. If you break
this promise you will get undefined behaviour.</p>
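<p>This exists to aid optimisation. A classic example is <code>memcpy</code>, whose
pointer parameters are <code>restrict</code>-qualified to tell the compiler that the
<code>src</code> and <code>dst</code> do not overlap (unlike <code>memmove</code>, which
permits overlap).</p>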
<p>From C11 6.7.3 paragraph 8:</p>
<blockquote>
<p>An object that is accessed through a restrict-qualified pointer has a special
association with that pointer. This association, defined in 6.7.3.1 below,
requires that all accesses to that object use, directly or indirectly, the
value of that particular pointer. The intended use of the restrict
qualifier (like the register storage class) is to promote optimization, and
deleting all instances of the qualifier from all preprocessing translation
units composing a conforming program does not change its meaning (i.e.,
observable behavior).</p>
</blockquote>
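<p>A sketch of how <code>restrict</code> appears in a function signature
(<code>add_arrays</code> is an illustrative function, not from any library). The
qualifiers promise the compiler that <code>out</code> does not overlap <code>a</code>
or <code>b</code>, which leaves it free to reorder or vectorise the loads and
stores. Calling it with an <code>out</code> that overlaps an input would be
undefined behaviour:</p>

```c
#include <stddef.h>

/* Element-wise sum of a and b, written to out.
   The caller promises that out does not overlap a or b. */
void add_arrays(size_t n, float *restrict out,
                const float *restrict a, const float *restrict b) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```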
<p>Restricted pointers aren't part of the C++ standard but are actually supported
as extensions by many compilers<sup id="fnref:8"><a class="footnote-ref" href="#fn:8">8</a></sup>.</p>
<p>I'm suspicious of <code>restrict</code>. It seems like playing with fire, and anecdotally
it seems common to run into compiler optimisation bugs when using it because
it's exercised so little<sup id="fnref:9"><a class="footnote-ref" href="#fn:9">9</a></sup>. But it's easy to be suspicious of something I've
never actually used.</p>
<h2>Designated initialisers</h2>
<p>C99 brought in an incredibly useful way to initialise structs, and I do not
understand why it has not been adopted by C++.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr"> 1 </span><span class="Type">typedef</span> <span class="Type">struct</span> {
<span id="L2" class="LineNr"> 2 </span> <span class="Type">float</span> red;
<span id="L3" class="LineNr"> 3 </span> <span class="Type">float</span> green;
<span id="L4" class="LineNr"> 4 </span> <span class="Type">float</span> blue;
<span id="L5" class="LineNr"> 5 </span>} Colour;
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span><span class="Type">int</span> <span class="Function">main</span>() {
<span id="L8" class="LineNr"> 8 </span> Colour c = { .red = <span class="Number">0.1</span>, .green = <span class="Number">0.5</span>, .blue = <span class="Number">0.9</span> };
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">return</span> <span class="Number">0</span>;
<span id="L10" class="LineNr">10 </span>}
</pre>
</div>
<p></html></p>
<p>In C++ you would have to initialise like this: <code>Colour c = { 0.1, 0.5, 0.9 };</code>
which is harder to read and not robust to changes in the definition of
<code>Colour</code>. You could instead define a constructor but why should we have to do
this for a simple aggregate type? I hear designated initialisers are now coming
in C++20. It only took 21 years...</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>The closest working draft I could find for free online:
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf</a> <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p><a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf">http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf</a> <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p><a href="http://www.stroustrup.com/sibling_rivalry.pdf">Sibling Rivalry, 2002, Bjarne
Stroustrup</a> <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>C++11 standard appendix C.1.2 <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>From C11 6.7.3 paragraph 6: If an attempt is made to modify an object
defined with a const-qualified type through use of an lvalue with
non-const-qualified type, the behavior is undefined. <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
<li id="fn:6">
<p>C11 6.7.2 <a class="footnote-backref" href="#fnref:6" title="Jump back to footnote 6 in the text">↩</a></p>
</li>
<li id="fn:7">
<p>C11 also defines type qualifier <code>_Atomic</code> but I didn't include it here
for reasons of: conciseness; it's ugly (it's a shame it couldn't be <code>atomic</code>,
too much existing code uses that); I don't know how common it is as a lot of
people still use C99; C++ also has atomic types as part of the STL so it wasn't
an interesting example. <a class="footnote-backref" href="#fnref:7" title="Jump back to footnote 7 in the text">↩</a></p>
</li>
<li id="fn:8">
<p><a href="https://gcc.gnu.org/onlinedocs/gcc-6.4.0/gcc/Restricted-Pointers.html">https://gcc.gnu.org/onlinedocs/gcc-6.4.0/gcc/Restricted-Pointers.html</a> <a class="footnote-backref" href="#fnref:8" title="Jump back to footnote 8 in the text">↩</a></p>
</li>
<li id="fn:9">
<p><a href="https://software.intel.com/en-us/forums/intel-c-compiler/topic/474141">https://software.intel.com/en-us/forums/intel-c-compiler/topic/474141</a> <a class="footnote-backref" href="#fnref:9" title="Jump back to footnote 9 in the text">↩</a></p>
</li>
</ol>
</div>
<h1>C++20 concepts are not like Rust traits</h1>
<p>Hannah McLaughlin, 2019-08-21 (/blog/cpp20-concepts-are-not-like-rust-traits.html)</p>
<p>At the time of writing, <a href="https://en.wikipedia.org/w/index.php?title=Rust_(programming_language)&oldid=910954500">Rust's
Wikipedia article</a> says the following:</p>
<blockquote>
<p>Functions can be given generic parameters, which usually require the generic
type to implement a certain trait or traits. Within such a function, the
generic value can only be used through those traits. This means that a generic
function can be type-checked as soon as it is defined. This is in contrast to
C++ templates, which are fundamentally duck typed and cannot be checked until
instantiated with concrete types. C++ concepts address the same issue and are
expected to be part of C++20 (2020).</p>
</blockquote>
<p>But C++ concepts (as currently proposed) are very different from Rust traits, and do not allow a generic function to be type-checked once, at its definition.</p>
<p>Rust traits are implemented explicitly, whereas C++ concept constraints are
implicitly met. This means that concept-constrained templates can legally
invoke behaviour not defined by the concept. So, the following code compiles in
g++ v9.2 with flags <code>--std=c++2a -fconcepts</code>:<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup></p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">#include </span><span class="String"><string></span>
<span id="L2" class="LineNr"> 2 </span>
<span id="L3" class="LineNr"> 3 </span><span class="Type">template</span><<span class="Type">typename</span> T>
<span id="L4" class="LineNr"> 4 </span>concept <span class="Type">bool</span> Stringable = <span class="Function">requires</span>(T a) {
<span id="L5" class="LineNr"> 5 </span> {a.<span class="Function">stringify</span>()} -> <span class="Constant">std</span>::<span class="Type">string</span>;
<span id="L6" class="LineNr"> 6 </span>};
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Type">class</span> Cat {
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">public</span>:
<span id="L10" class="LineNr">10 </span> <span class="Constant">std</span>::<span class="Type">string</span> <span class="Function">stringify</span>() {
<span id="L11" class="LineNr">11 </span> <span class="Statement">return</span> <span class="String">"meow"</span>;
<span id="L12" class="LineNr">12 </span> }
<span id="L13" class="LineNr">13 </span>
<span id="L14" class="LineNr">14 </span> <span class="Type">void</span> <span class="Function">pet</span>() {
<span id="L15" class="LineNr">15 </span> }
<span id="L16" class="LineNr">16 </span>};
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span><span class="Type">template</span><Stringable T>
<span id="L19" class="LineNr">19 </span><span class="Type">void</span> <span class="Function">f</span>(T a) {
<span id="L20" class="LineNr">20 </span> a.<span class="Function">pet</span>();
<span id="L21" class="LineNr">21 </span>}
<span id="L22" class="LineNr">22 </span>
<span id="L23" class="LineNr">23 </span><span class="Type">int</span> <span class="Function">main</span>() {
<span id="L24" class="LineNr">24 </span> <span class="Function">f</span>(<span class="Function">Cat</span>());
<span id="L25" class="LineNr">25 </span> <span class="Statement">return</span> <span class="Number">0</span>;
<span id="L26" class="LineNr">26 </span>}
</pre>
</div>
<p></html></p>
<p>The Rust equivalent would not compile:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Keyword">trait</span> <span class="Identifier">Stringable</span> {
<span id="L2" class="LineNr"> 2 </span> <span class="Keyword">fn</span> <span class="Function">stringify</span>(&self) <span class="Statement">-></span> <span class="Type">String</span>;
<span id="L3" class="LineNr"> 3 </span>}
<span id="L4" class="LineNr"> 4 </span>
<span id="L5" class="LineNr"> 5 </span><span class="Keyword">struct</span> <span class="Identifier">Cat</span> {
<span id="L6" class="LineNr"> 6 </span>}
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Keyword">impl</span> Cat {
<span id="L9" class="LineNr"> 9 </span> <span class="Keyword">fn</span> <span class="Function">pet</span>(&self) {}
<span id="L10" class="LineNr">10 </span>}
<span id="L11" class="LineNr">11 </span>
<span id="L12" class="LineNr">12 </span><span class="Keyword">impl</span> Stringable <span class="Statement">for</span> Cat {
<span id="L13" class="LineNr">13 </span> <span class="Keyword">fn</span> <span class="Function">stringify</span>(&self) <span class="Statement">-></span> <span class="Type">String</span> {
<span id="L14" class="LineNr">14 </span> <span class="String">"meow"</span>.<span class="Function">to_string</span>()
<span id="L15" class="LineNr">15 </span> }
<span id="L16" class="LineNr">16 </span>}
<span id="L17" class="LineNr">17 </span>
<span id="L18" class="LineNr">18 </span><span class="Keyword">fn</span> <span class="Function">f</span><span class="Statement"><</span>T: Stringable<span class="Statement">></span>(a: T) {
<span id="L19" class="LineNr">19 </span> a.<span class="Function">pet</span>(); <span class="Comment">// error[E0599]: no method named `pet` found for type `T` in the current scope</span>
<span id="L20" class="LineNr">20 </span>}
<span id="L21" class="LineNr">21 </span>
<span id="L22" class="LineNr">22 </span><span class="Keyword">fn</span> <span class="Function">main</span>() {
<span id="L23" class="LineNr">23 </span> <span class="Keyword">let</span> cat <span class="Statement">=</span> Cat{};
<span id="L24" class="LineNr">24 </span> <span class="Function">f</span>(cat);
<span id="L25" class="LineNr">25 </span>}
</pre>
</div>
<p></html></p>
<p>C++ concept-constrained templates are still only type checked when concrete
instantiation is attempted – they just give better, sooner error messages
for types that don't comply with the constraint, rather than the long stream of
nonsense that failed template instantiations output in C++ without concepts.</p>
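<p>For contrast, here is a minimal sketch of one way the Rust version could be
made to compile: every capability the generic function uses has to be named in
its bounds. The <code>Pettable</code> trait and the returned strings are assumptions added
purely for illustration.</p>

```rust
trait Stringable {
    fn stringify(&self) -> String;
}

// Hypothetical second trait, introduced only for this sketch.
trait Pettable {
    fn pet(&self) -> String;
}

struct Cat;

impl Stringable for Cat {
    fn stringify(&self) -> String {
        "meow".to_string()
    }
}

impl Pettable for Cat {
    fn pet(&self) -> String {
        "purr".to_string()
    }
}

// The body is type-checked here, once, against the declared bounds only.
// Removing `+ Pettable` turns `a.pet()` into an error even if `f` is never called.
fn f<T: Stringable + Pettable>(a: &T) -> String {
    format!("{} {}", a.stringify(), a.pet())
}

fn main() {
    println!("{}", f(&Cat));
}
```

<p>The point of the sketch is that the bound list is the complete contract: the
function body cannot reach anything the bounds do not mention.</p>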
<p>I think it's quite common for people to think that concepts are the same as
traits. They look similar syntactically, and also the realities of them aren't
well known because they aren't yet in a standard. I hope this can clarify
things for anyone curious, and help anyone adjust expectations before the C++20
standard is released.</p>
<h1>Comparing Rust traits with other language constructs</h1>
<p>Although Rust traits are very different from C++20 concepts, they have
similarities to a lot of other language constructs for polymorphism:</p>
<ul>
<li>
<p>Haskell typeclasses: Rust traits are based on these, but Rust does not have
higher kinded types<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>, and Rust enforces global uniqueness on trait
implementations<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup><sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup>. This means that there is at most one implementation
of a trait for any given type. Haskell does not enforce this, but taking
advantage of that is discouraged.<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup></p>
</li>
<li>
<p>Java interfaces: when Rust traits are used dynamically, they are analogous to
Java interfaces, except without the <code>extends</code> functionality available in
Java.<sup id="fnref:6"><a class="footnote-ref" href="#fn:6">6</a></sup></p>
</li>
</ul>
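<p>As a rough illustration of the Java-interface comparison, here is a sketch of a
trait used dynamically through <code>dyn</code>. The <code>Dog</code> type and the
<code>stringify_all</code> helper are hypothetical names made up for this example.</p>

```rust
trait Stringable {
    fn stringify(&self) -> String;
}

struct Cat;
struct Dog; // hypothetical second implementor

impl Stringable for Cat {
    fn stringify(&self) -> String {
        "meow".to_string()
    }
}

impl Stringable for Dog {
    fn stringify(&self) -> String {
        "woof".to_string()
    }
}

// `dyn Stringable` is a trait object: the concrete type is chosen at run time
// and the method call is dispatched dynamically, much like an interface call.
fn stringify_all(items: &[Box<dyn Stringable>]) -> Vec<String> {
    items.iter().map(|item| item.stringify()).collect()
}

fn main() {
    let items: Vec<Box<dyn Stringable>> = vec![Box::new(Cat), Box::new(Dog)];
    for line in stringify_all(&items) {
        println!("{}", line);
    }
}
```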
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p><a href="https://godbolt.org/z/qA3hlL">https://godbolt.org/z/qA3hlL</a> <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p><a href="https://github.com/rust-lang/rfcs/issues/324">https://github.com/rust-lang/rfcs/issues/324</a> <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p><a href="http://aturon.github.io/tech/2017/02/06/specialization-and-coherence/">http://aturon.github.io/tech/2017/02/06/specialization-and-coherence/</a> <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p><a href="https://github.com/ixrec/rust-orphan-rules/blob/master/README.md">https://github.com/ixrec/rust-orphan-rules/blob/master/README.md</a> <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p><a href="http://blog.ezyang.com/2014/07/type-classes-confluence-coherence-global-uniqueness/">http://blog.ezyang.com/2014/07/type-classes-confluence-coherence-global-uniqueness/</a> <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
<li id="fn:6">
<p><a href="https://stevedonovan.github.io/rust-gentle-intro/object-orientation.html">https://stevedonovan.github.io/rust-gentle-intro/object-orientation.html</a> <a class="footnote-backref" href="#fnref:6" title="Jump back to footnote 6 in the text">↩</a></p>
</li>
</ol>
</div>
<h1>Rust: a future for real-time and safety-critical software without C or C++</h1>
<p>Hannah McLaughlin, 2019-08-20 (/blog/rust-a-future-for-real-time-and-safety-critical-software.html)</p>
<h1>Overview</h1>
<p>Rust is a fairly new programming language that I'm really excited about. I gave
a talk about it to my coworkers, primarily aimed at C++ programmers. This is
that talk translated to a blog post. I hope you will be excited about Rust too
by the end of this blog post!</p>
<p>A quick overview: Rust is syntactically similar to C++, but semantically very
different. It's heavily influenced by functional programming languages like
Haskell and OCaml. It's statically typed, with type inference inside function
bodies that goes well beyond the <code>auto</code>-style deduction C++ has. It can match
the speed of C and C++ while making guarantees about memory safety that are
impossible to make in those languages.</p>
<h1>house[-1]</h1>
<p>Before we look at Rust in any more detail, I want you to imagine yourself in a
scenario. Imagine that you are a builder setting up the gas supply in a new house. Your
boss tells you to connect the gas pipe in the basement to the gas main on the
pavement. You go downstairs, and find that there's a glitch: this house doesn't
have a basement!</p>
<p>So, what do you do? Perhaps you do nothing, or perhaps you decide to
whimsically interpret your instruction by attaching the gas main to some other
nearby fixture, like the air conditioning intake for the office next door.
Either way, suppose you report back to your boss that you're done.</p>
<p><img alt="KWABOOM" class="callout" src="/blog/images/rust-talk/explosion.png" title="A cartoon explosion, in red and yellow, with some shrapnel"></p>
<p>KWABOOM! When the dust settles from the explosion, you would be guilty of
criminal negligence.<sup id="fnref:1"><a class="footnote-ref" href="#fn:1">1</a></sup></p>
<p>Yet this is exactly what happens in some programming languages. In C you could
have an array, or in C++ you could have a vector, ask for the -1 index, and
anything could happen. The result could be different any time you run the
program, and you might not realise that something is wrong. This is called
undefined behaviour. The possibility of it can't be eliminated entirely,
because low-level hardware operations are inherently unsafe, but many languages
protect against it. C and C++ do not.</p>
<p>The lack of memory safety guarantees from these languages, and the ease with
which undefined behaviour can be invoked, is terrifying when you think of how
much of the world runs on software. Heartbleed, the famous SSL vulnerability,
was due to this lack of memory safety; Stagefright, a famous Android
vulnerability, was largely due to integer overflows in C++ code that led to
heap overflows.</p>
<p><html></p>
<div class=info_box>
Memory safety is crucial to both the correctness and reliability of a program.
</div>
<p></html></p>
<p>Vulnerabilities aren't the only concern. Memory safety is crucial to both the
correctness and reliability of a program. No one wants their program to crash
out of nowhere, no matter how minor the application: reliability matters. As
for correctness, I have a friend who used to work on rocket flight simulation
software. They found that passing in the same initialisation data under a
different filename gave a different result: some uninitialised memory was being
read, it happened to overlap the memory holding the program's arguments, and so
the simulation was seeded with garbage values based on the filename.
Arguably their entire business was a lie.</p>
<h1>So why not use memory safe languages like Python or Java?</h1>
<p>Languages like Python and Java use garbage collection to automatically protect
us from memory management errors, like</p>
<ul>
<li>
<p>use-after-frees (when you access memory that has been deallocated)</p>
</li>
<li>
<p>double frees (when you release memory that's already been released,
potentially corrupting the heap)</p>
</li>
<li>
<p>memory leaks (when memory that is no longer in use is never released; this
isn't necessarily dangerous, but it can degrade performance and eventually
crash your system)</p>
</li>
</ul>
<p>Languages like Python and Java protect from these situations automatically. A
garbage collector will run as part of the JVM or the Python interpreter, and
periodically check memory to find unused objects, releasing their associated resources and
memory.</p>
<p>But it does this at great cost. Garbage collection adds run-time overhead, it
uses extra memory, and crucially it means that at any point – you don't know
when – the program may halt – for how long, you don't know – to clean up the
garbage.</p>
<p><html></p>
<div class=info_box>
<strong>Python and Java</strong><br>memory safe at the cost of speed and determinism
<br><br>
<strong>C and C++</strong><br>fast and deterministic at the cost of memory safety
</div>
<p></html></p>
<p>This lack of predictability makes Python and Java a poor fit for hard
real-time applications, where you must guarantee that operations will complete
within a specified period of time. It's not about being as fast as possible,
it's about guaranteeing you will be fast enough every single time.</p>
<p>Of course, there are social reasons why C and C++ are popular: they're what
people know, and they've been around a long time. But they are also popular because
they are fast and deterministic.
Unfortunately, this comes at the cost of memory safety. Even worse, many
real-time applications are also safety critical, like control software in cars
and surgical robots. As a result, safety critical applications often use these
dangerous languages.</p>
<p>For a long time, this has been a fundamental trade-off. You either get speed
and predictability, or you get memory safety.</p>
<p>Rust completely overturns this, which is what makes it so exciting and notable.</p>
<h1>What this blog post will cover</h1>
<p>These are the questions I hope to answer in this post:</p>
<ul>
<li>
<p>What are Rust's design goals?</p>
</li>
<li>
<p>How does Rust achieve memory safety?</p>
</li>
<li>
<p>What does polymorphism look like in Rust?</p>
</li>
<li>
<p>What is Rust tooling like?</p>
</li>
</ul>
<h1>What are Rust's design goals?</h1>
<ul>
<li>
<p>Concurrency without data races</p>
<ul>
<li>
<p>Concurrency happens whenever different parts of your program might
execute at different times or out of order.</p>
</li>
<li>
<p>We'll discuss data races more later, but they're a common hazard when
writing concurrent programs, as many of you will know.</p>
</li>
</ul>
</li>
<li>
<p>Abstraction without overhead</p>
<ul>
<li>This just means that the conveniences and expressive power that the
language provides don't come at a run-time cost: using them doesn't slow
your program down.</li>
</ul>
</li>
<li>
<p>Memory safety without garbage collection</p>
<ul>
<li>We've just talked about what these two terms mean. Let's have a look at
how Rust achieves this previously contradictory pairing.</li>
</ul>
</li>
</ul>
<h1>Memory safety without garbage collection</h1>
<p>How Rust achieves memory safety is simultaneously really simple and really
complex.</p>
<p>It's simple because all it involves is enforcing a few simple rules, which are
really easy to understand.</p>
<p>In Rust, all objects have an <em>owner</em>, tracked by the compiler. There can only
be one owner at a time, which is very different to how things work in most
other programming languages. It ensures that there is exactly one binding to
any given resource.
This alone would be very restrictive, so of course we can also give out
references according to strict rules. Taking a reference is often called
"borrowing" in Rust and I'm going to use that language here.</p>
<p>The rules for borrowing are:</p>
<blockquote>
<p>Any borrow must last for a scope no greater than that of the owner.</p>
<p>You may have one or the other of these two kinds of borrows, but not both at
the same time:</p>
<p>one or more immutable references to a resource
OR
exactly one mutable reference.</p>
</blockquote>
<p>The first rule eliminates use-after-frees. The second rule eliminates data
races. A data race happens when:</p>
<ul>
<li>
<p>two or more pointers access the same memory location at the same time</p>
</li>
<li>
<p>at least one of them is writing</p>
</li>
<li>
<p>and the operations aren't synchronised.</p>
</li>
</ul>
<p>The memory is left in an unknown state.</p>
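<p>The borrowing rules above can be sketched in a few lines. This is only an
illustration: <code>total</code> and <code>push_double_of_last</code> are hypothetical helpers, and
the commented-out lines show the kind of code the compiler is expected to reject.</p>

```rust
// Hypothetical helper: reading only needs a shared (immutable) borrow.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: mutation needs the single exclusive (mutable) borrow.
fn push_double_of_last(values: &mut Vec<i32>) {
    let last = values.last().copied();
    if let Some(last) = last {
        values.push(last * 2);
    }
}

fn main() {
    let mut readings = vec![1, 2, 3]; // `readings` owns the vector

    // Any number of immutable borrows may coexist.
    let a = &readings;
    let b = &readings;
    println!("{}", total(a) + total(b));

    // `a` and `b` are not used again, so one mutable borrow is now allowed.
    push_double_of_last(&mut readings);
    println!("{:?}", readings);

    // Mixing the two kinds of borrow is rejected at compile time:
    // let first = &readings[0];
    // readings.push(4); // error[E0502]: cannot borrow `readings` as mutable
    // println!("{}", first);
}
```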
<p>We didn't have a heap when I worked as an embedded engineer, and we had a
hardware trap for null pointer dereferences. So a lot of common memory safety
issues weren't a major concern. Data races were the main type of bug I was
really scared of. Races can go undetected until you make a seemingly
insignificant change to the code, or there's a slight change in the external
conditions, and suddenly the winner of the race changes. A race condition in
the Therac-25 radiotherapy machine caused multiple deaths when patients were
given lethal doses of radiation during cancer treatment.</p>
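<p>To make the three conditions concrete, here is a sketch of several threads
updating one counter. Sharing a plain <code>&amp;mut u32</code> across the threads would not
compile; one common way to satisfy the compiler is to add the missing
synchronisation with <code>Arc</code> and <code>Mutex</code> from the standard library. The function
name <code>parallel_count</code> is an assumption for this example.</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example function: every worker adds to the same counter.
fn parallel_count(workers: usize, increments_per_worker: u32) -> u32 {
    // `Arc` gives shared ownership across threads; `Mutex` synchronises writes.
    let counter = Arc::new(Mutex::new(0_u32));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments_per_worker {
                    // The lock is held only for the duration of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```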
<p><html></p>
<div class=info_box>
Rust's key innovation is enforcing its memory safety rules at compile time.
</div>
<p></html></p>
<p>As I said, these rules are simple and shouldn't be surprising for anyone who's
ever had to deal with the possibility of data races before
– but when I said that they are also complex, I meant it is incredibly smart
that Rust is able to enforce these rules at compile time. This is Rust's key innovation.</p>
<p>There are some memory-safety checks that have to be runtime – like array
bounds checking. But if you are writing idiomatic Rust you'll almost never need
these – you wouldn't usually be directly indexing an array. Instead you'd be
using higher-order functions like fold, map and filter – you will be familiar
with this type of function if you've written Haskell/Scala or even Python.</p>
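<p>As a small sketch of that style, the function below walks a slice with
<code>filter</code>, <code>map</code> and <code>fold</code> rather than indexing it. The name
<code>sum_of_even_squares</code> is made up for this example.</p>

```rust
// Hypothetical example: no index appears anywhere, so there is nothing
// that could go out of bounds.
fn sum_of_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|&&v| v % 2 == 0)
        .map(|&v| v * v)
        .fold(0, |acc, square| acc + square)
}

fn main() {
    let values = vec![1, 2, 3, 4];
    println!("{}", sum_of_even_squares(&values));

    // When an index really is needed, `get` makes the bounds check explicit
    // by returning an `Option` instead of panicking.
    match values.get(10) {
        Some(v) => println!("found {}", v),
        None => println!("index 10 is out of bounds"),
    }
}
```

<p>Direct indexing such as <code>values[10]</code> is still memory safe: it is checked at run
time and panics rather than reading arbitrary memory.</p>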
<h2>unsafe Rust</h2>
<p>I mentioned earlier that the possibility of undefined behaviour isn't something
that can be eliminated entirely, due to the inherently unsafe nature of low
level operations. Rust allows you to do such operations within specific unsafe
blocks. I believe C# and Ada have similar constructs for disabling certain
safety checks. You'll often need this in Rust when doing embedded programming,
or low level systems programming. The ability to isolate the potentially unsafe
parts of your code is incredibly useful – those parts will be subject to a
higher level of scrutiny, and if you have a bug that looks like a memory
problem, those will be the only place that can be causing it, rather than
absolutely anywhere in your code.</p>
<p>The unsafe block doesn't disable the borrow checker, which is the part of the
compiler that enforces the borrowing rules we just talked about
– it just unlocks a small set of extra operations, such as dereferencing raw
pointers, calling unsafe functions, and accessing or modifying mutable static
variables. The benefits of the ownership system are still there.</p>
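<p>A minimal sketch of what this looks like in practice, using a hypothetical
<code>increment_via_raw_pointer</code> function: creating the raw pointer is ordinary safe
code, and only the dereference has to sit inside the <code>unsafe</code> block.</p>

```rust
// Hypothetical example function for illustration.
fn increment_via_raw_pointer(value: &mut u32) {
    // Creating a raw pointer is allowed in safe code.
    let ptr: *mut u32 = value;

    // Dereferencing it is only allowed inside `unsafe`. This is sound here
    // because `ptr` comes from a valid, exclusive reference that is still live.
    unsafe {
        *ptr += 1;
    }
}

fn main() {
    let mut n = 41;
    increment_via_raw_pointer(&mut n);
    println!("{}", n);
}
```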
<h1>Revisiting ownership</h1>
<p>Speaking of the ownership system, I want to compare it with ownership in C++.</p>
<p>Ownership semantics in C++ changed the language drastically when C++11 came
out. But the language paid such a high price for backwards compatibility.
Ownership, to me, feels unnaturally tacked onto C++. The previously simple value
taxonomy was butchered by it<sup id="fnref:2"><a class="footnote-ref" href="#fn:2">2</a></sup>. In many ways it was a great achievement to
massively modernise such a widely used language, but Rust shows us what a
language can look like when ownership is a core design concept from the
beginning.</p>
<p><html></p>
<div class=info_box>
C++ smart pointers are just a library on top of an outdated system, and as such
can be misused and abused in ways that Rust just does not allow.</div>
<p></html></p>
<p>C++'s type system does not model object lifetime at all. You can't check for
use-after-frees at compile time. Smart pointers are just a library on top of an
outdated system, and as such can be misused and abused in ways that Rust just
does not allow.</p>
<p>Let's take a look at some (simplified) C++ code I wrote at work, where this
misuse occurs. Then we can look at a Rust equivalent, which (rightly) doesn't compile.</p>
<h1>Abusing smart pointers in C++</h1>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="PreProc">#include </span><span class="String"><algorithm></span>
<span id="L2" class="LineNr"> 2 </span><span class="PreProc">#include </span><span class="String"><memory></span>
<span id="L3" class="LineNr"> 3 </span><span class="PreProc">#include </span><span class="String"><vector></span>
<span id="L4" class="LineNr"> 4 </span>
<span id="L5" class="LineNr"> 5 </span><span class="Constant">std</span>::<span class="Type">vector</span><DataValueCheck> <span
class="Function">createChecksFromStrings</span>(
<span id="L6" class="LineNr"> 6 </span> <span class="Constant">std</span>::<span class="Type">unique_ptr</span><Data> data,
<span id="L7" class="LineNr"> 7 </span> <span class="Constant">std</span>::<span class="Type">vector</span><<span class="Constant">std</span>::<span class="Type">string</span>> dataCheckStrs) {
<span id="L8" class="LineNr"> 8 </span>
<span id="L9" class="LineNr"> 9 </span> <span class="Type">auto</span> createCheck = [&](<span class="Constant">std</span>::<span
class="Type">string</span> checkStr) {
<span id="L10" class="LineNr">10 </span> <span class="Statement">return</span> <span class="Function">DataValueCheck</span>(checkStr, <span class="Constant">std</span>::<span class="Function">move</span>(data));
<span id="L11" class="LineNr">11 </span> };
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span> <span class="Constant">std</span>::<span class="Type">vector</span><DataValueCheck> checks;
<span id="L14" class="LineNr">14 </span> <span class="Constant">std</span>::<span class="Function">transform</span>(
<span id="L15" class="LineNr">15 </span> dataCheckStrs.<span class="Function">begin</span>(),
<span id="L16" class="LineNr">16 </span> dataCheckStrs.<span class="Function">end</span>(),
<span id="L17" class="LineNr">17 </span> <span class="Constant">std</span>::<span class="Function">back_inserter</span>(checks),
<span id="L18" class="LineNr">18 </span> createCheck);
<span id="L19" class="LineNr">19 </span>
<span id="L20" class="LineNr">20 </span> <span class="Statement">return</span> checks;
<span id="L21" class="LineNr">21 </span>}
</pre>
</div>
<p></html>
The idea of this code is that we take some strings defining some checks to be
performed on some data, e.g. is a value within a particular range. We
then create a vector of check objects by parsing these strings.</p>
<p>First, we create a lambda that captures by reference, hence the ampersand. The
unique pointer to the data is moved in this lambda, which was a mistake.</p>
<p>We then fill our vector with checks constructed from moved data. The problem is
that only the first move will be successful. Unique pointers are move-only.
So after the first iteration inside <code>std::transform</code>, the unique pointer is
null: the standard guarantees that a moved-from <code>std::unique_ptr</code> holds
<code>nullptr</code>, unlike most standard library types, which are only left in a
valid but unspecified state.</p>
<p>Using that null pointer later results in undefined behaviour! In my case, I got
a segmentation fault, which is what will happen on a nullptr deref on most
hosted systems, because the zero memory page is usually reserved. But this
behaviour certainly isn't guaranteed. A bug like this could in theory lie dormant for a while, and then
your application would crash out of nowhere.</p>
<p>The use of the lambda here is a large part of what makes this easy to miss. The
move is hidden inside a closure that <code>std::transform</code> calls repeatedly, and
nothing in C++'s type system records that <code>data</code> has already been moved
from, so the compiler has no grounds to reject it.</p>
<p>For context on understanding how this bug came about, originally, we were
using a shared_ptr to store the data, which would have made
this code fine. We wrongly thought we could store it in a unique_ptr instead,
and this bug came about when we made the change. It went unnoticed in part
because the compiler didn't complain.</p>
<p>I'm glad this happened because it means I can show you a real example of a C++
memory
safety bug that went unnoticed by both me and my code reviewer, until it later
showed up in a test<sup id="fnref:3"><a class="footnote-ref" href="#fn:3">3</a></sup>. It doesn't matter if you're an experienced programmer
–
these bugs happen! And the compiler can't save you. We must demand better
tools, for our sanity and for public safety. This is an ethical concern.</p>
<p>With that in mind, let's look at a Rust version.</p>
<h1>In Rust, that bad move is not allowed</h1>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Keyword">pub</span> <span class="Keyword">fn</span> <span class="Function">create_checks_from_strings</span>(
<span id="L2" class="LineNr">2 </span> data: <span class="Type">Box</span><span class="Statement"><</span>Data<span class="Statement">></span>,
<span id="L3" class="LineNr">3 </span> data_check_strs: <span class="Type">Vec</span><span class="Statement"><</span><span class="Type">String</span><span class="Statement">></span>)
<span id="L4" class="LineNr">4 </span> <span class="Statement">-></span> <span class="Type">Vec</span><span class="Statement"><</span>DataValueCheck<span class="Statement">></span>
<span id="L5" class="LineNr">5 </span>{
<span id="L6" class="LineNr">6 </span> <span class="Keyword">let</span> create_check <span class="Statement">=</span> <span class="Statement">|</span>check_str: <span class="Type">&</span><span class="Type">String</span><span class="Statement">|</span> <span class="PreProc">DataValueCheck</span><span class="Special">::</span><span class="Function">new</span>(check_str, data);
<span id="L7" class="LineNr">7 </span> data_check_strs.<span class="Function">iter</span>().<span class="Function">map</span>(create_check).<span class="Function">collect</span>()
<span id="L8" class="LineNr">8 </span>}
</pre>
</div>
<p></html>
This is our first look at some Rust code. Now is a good time for me to mention
that variables are immutable by default. For something to be modifiable, we
need to use the <code>mut</code> keyword – kind of like the opposite of <code>const</code> in C
and C++.</p>
<p>The Box type just means that we've allocated on the heap. I chose it here
because unique_ptrs are also heap allocated. We don't need anything else to be
analogous with unique_ptr because of Rust's rule about each object having only
one owner at a time.</p>
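<p>To make those two points concrete, here is a minimal sketch of immutability by
default and of a <code>Box</code> being moved. The <code>Data</code> struct, its <code>value</code> field, and
the <code>consume</code> and <code>demo</code> functions are all made up for illustration; they are not
from the real codebase.</p>

```rust
/// Stand-in for the `Data` type in the example above (hypothetical field).
#[derive(Debug, PartialEq)]
pub struct Data {
    pub value: i32,
}

/// Takes ownership of the boxed data; the caller can no longer use it.
pub fn consume(data: Box<Data>) -> i32 {
    data.value
}

pub fn demo() -> i32 {
    let fixed = 1; // immutable: `fixed += 1` would not compile
    let mut counter = 0; // `mut` opts in to mutation
    counter += fixed;

    let data = Box::new(Data { value: 41 }); // heap allocation, single owner
    let total = consume(data) + counter; // `data` is moved into `consume`
    // println!("{:?}", data); // error[E0382]: borrow of moved value: `data`
    total
}
```

<p>Uncommenting the <code>println!</code> line is a quick way to see the one owner rule
enforced at compile time.</p>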
<p>We're then creating a closure, and then using the higher order function <code>map</code>
to apply it to the strings. It's very similar to the C++ version, but less
verbose.</p>
<p>But! This doesn't compile, and here's the error message.
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span>error[E0525]: expected a closure that
implements the `<span class="Type">FnMut</span>` <span
class="Keyword">trait</span>, but this closure only implements `<span
class="Type">FnOnce</span>`
<span id="L2" class="LineNr"> 2 </span> <span class="Statement">-</span><span class="Statement">-></span> bad_move.rs:<span class="Number">1</span>:<span class="Number">8</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Statement">|</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Number">6</span> <span class="Statement">|</span> <span class="Keyword">let</span> create_check <span class="Statement">=</span> <span class="Statement">|</span>check_str: <span class="Type">&</span><span class="Type">String</span><span class="Statement">|</span> <span class="PreProc">DataValueCheck</span><span class="Special">::</span><span class="Function">new</span>(check_str, data);
<span id="L5" class="LineNr"> 5 </span> <span class="Statement">|</span> <span class="Statement">^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^----^</span>
<span id="L6" class="LineNr"> 6 </span> <span class="Statement">|</span> <span class="Statement">|</span> <span class="Statement">|</span>
<span id="L7" class="LineNr"> 7 </span> <span class="Statement">|</span> <span class="Statement">|</span> closure is `<span class="Type">FnOnce</span>` because it moves
<span id="L8" class="LineNr"> 8 </span> <span class="Statement">|</span> <span class="Statement">|</span> the variable `data` out of its environment
<span id="L9" class="LineNr"> 9 </span> <span class="Statement">|</span> this closure implements `<span class="Type">FnOnce</span>`, not `<span class="Type">FnMut</span>`
<span id="L10" class="LineNr">10 </span> <span class="Number">7</span> <span class="Statement">|</span> data_check_strs.<span class="Function">iter</span>().<span class="Function">map</span>(create_check).<span class="Function">collect</span>()
<span id="L11" class="LineNr">11 </span> <span class="Statement">|</span> <span class="Statement">---</span> the requirement to implement `<span class="Type">FnMut</span>` derives from here
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span>error: aborting due to previous error
<span id="L14" class="LineNr">14 </span>
<span id="L15" class="LineNr">15 </span>For more information about this error, try `rustc <span class="Statement">--</span>explain E0525`.
</pre>
</div>
<p></html></p>
<p>One really great thing about the Rust community is there's a really strong
focus on making sure there's lots of resources for people to learn, and on
having readable error messages. You can even ask the compiler for more
information about the error
message, and it will show you a minimal example with explanations.</p>
<p>When we create our closure, the data variable is moved inside it because of the
one owner rule. The compiler
then infers that the closure can only be run once: further calls are illegal as
we no longer own the variable. Then, the function <code>map</code> requires a callable
that can be called repeatedly and mutate state, so the compilation fails.</p>
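<p>One possible way to make this compile, sketched under the assumption that each
check only needs to read the data while it is being constructed, is to have the
closure borrow <code>data</code> instead of moving it. The struct fields and the signature
of <code>DataValueCheck::new</code> here are hypothetical, since the real ones aren't shown in
this post.</p>

```rust
/// Hypothetical stand-ins for the types in the example above.
pub struct Data {
    pub limit: i32,
}

pub struct DataValueCheck {
    pub name: String,
    pub limit: i32,
}

impl DataValueCheck {
    /// Borrows `data` rather than taking ownership of it (an assumption).
    pub fn new(check_str: &str, data: &Data) -> DataValueCheck {
        DataValueCheck {
            name: check_str.to_string(),
            limit: data.limit,
        }
    }
}

pub fn create_checks_from_strings(
    data: Box<Data>,
    data_check_strs: Vec<String>,
) -> Vec<DataValueCheck> {
    // The closure only borrows `data`, so it can be called many times
    // (it implements `Fn`, and therefore `FnMut`), which is what `map` needs.
    let create_check = |check_str: &String| DataValueCheck::new(check_str, &data);
    data_check_strs.iter().map(create_check).collect()
}
```

<p>If every check genuinely needed to share ownership of the data, a reference
counted pointer such as <code>Rc</code> would be closer in spirit to the original
shared_ptr design.</p>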
<p>I think this snippet shows how powerful the type system is in Rust compared to
C++, and how different it is to program in a language where the compiler tracks
object lifetime.</p>
<p>You'll notice that the error message here mentions traits: "expected a closure
that implements FnMut trait", for example. Traits are a language feature that
tell the compiler what functionality a type must provide. Traits are Rust's
mechanism for polymorphism.</p>
<h1>Polymorphism</h1>
<p>In C++ there's a lot of different ways of doing polymorphism, which I think
contributes to how bloated the language can feel. There's templates,
function & operator overloading for static polymorphism, and subtyping for
dynamic polymorphism. These can have major downsides: subtyping can lead to
very high coupling, and templates can be unpleasant to use due to their lack of
parameterisation.</p>
<p>In Rust, traits provide a unified way of specifying both static and dynamic
interfaces. They are Rust's sole notion of interface. They only support the
"implements" relationship, not the "extends" relationship. This encourages
designs that are based on composition, not implementation inheritance, leading to less
coupling.</p>
<p>Let's have a look at an example.
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Keyword">trait</span> <span class="Identifier">Rateable</span> {
<span id="L2" class="LineNr"> 2 </span> <span class="Special">/// Rate fluff out of 10</span>
<span id="L3" class="LineNr"> 3 </span> <span class="Special">/// Ratings above 10 for exceptionally soft bois</span>
<span id="L4" class="LineNr"> 4 </span> <span class="Keyword">fn</span> <span class="Function">fluff_rating</span>(<span class="Type">&</span><span class="Constant">self</span>) <span class="Statement">-></span> <span class="Type">f32</span>;
<span id="L5" class="LineNr"> 5 </span>}
<span id="L6" class="LineNr"> 6 </span>
<span id="L7" class="LineNr"> 7 </span><span class="Keyword">struct</span> <span class="Identifier">Alpaca</span> {
<span id="L8" class="LineNr"> 8 </span> days_since_shearing: <span class="Type">f32</span>,
<span id="L9" class="LineNr"> 9 </span> age: <span class="Type">f32</span>
<span id="L10" class="LineNr">10 </span>}
<span id="L11" class="LineNr">11 </span>
<span id="L12" class="LineNr">12 </span><span class="Keyword">impl</span> Rateable <span class="Statement">for</span> Alpaca {
<span id="L13" class="LineNr">13 </span> <span class="Keyword">fn</span> <span class="Function">fluff_rating</span>(<span class="Type">&</span><span class="Constant">self</span>) <span class="Statement">-></span> <span class="Type">f32</span> {
<span id="L14" class="LineNr">14 </span> <span class="Number">10.0</span> <span class="Statement">*</span> <span class="Number">365.0</span> <span class="Statement">/</span> <span class="Constant">self</span>.days_since_shearing
<span id="L15" class="LineNr">15 </span> }
<span id="L16" class="LineNr">16 </span>}
</pre>
</div>
<p></html></p>
<p>There's nothing complicated going on here; I decided a simple but fun example
was best. First, we're defining a trait called <code>Rateable</code>. For a type to be
<code>Rateable</code>, it has to implement a function called <code>fluff_rating</code> that returns a
float.</p>
<p>Then we define a type called <code>Alpaca</code> and implement this interface for it. We
could do the same for another type, say <code>Cat</code>!
<html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr"> 1 </span><span class="Keyword">enum</span> <span class="Identifier">Coat</span> {
<span id="L2" class="LineNr"> 2 </span> Hairless,
<span id="L3" class="LineNr"> 3 </span> Short,
<span id="L4" class="LineNr"> 4 </span> Medium,
<span id="L5" class="LineNr"> 5 </span> Long
<span id="L6" class="LineNr"> 6 </span>}
<span id="L7" class="LineNr"> 7 </span>
<span id="L8" class="LineNr"> 8 </span><span class="Keyword">struct</span> <span class="Identifier">Cat</span> {
<span id="L9" class="LineNr"> 9 </span> coat: Coat,
<span id="L10" class="LineNr">10 </span> age: <span class="Type">f32</span>
<span id="L11" class="LineNr">11 </span>}
<span id="L12" class="LineNr">12 </span>
<span id="L13" class="LineNr">13 </span><span class="Keyword">impl</span> Rateable <span class="Statement">for</span> Cat {
<span id="L14" class="LineNr">14 </span> <span class="Keyword">fn</span> <span class="Function">fluff_rating</span>(<span class="Type">&</span><span class="Constant">self</span>) <span class="Statement">-></span> <span class="Type">f32</span> {
<span id="L15" class="LineNr">15 </span> <span class="Statement">match</span> <span class="Constant">self</span>.coat {
<span id="L16" class="LineNr">16 </span> <span class="PreProc">Coat</span><span class="Special">::</span>Hairless <span class="Statement">=></span> <span class="Number">0.0</span>,
<span id="L17" class="LineNr">17 </span> <span class="PreProc">Coat</span><span class="Special">::</span>Short <span class="Statement">=></span> <span class="Number">5.0</span>,
<span id="L18" class="LineNr">18 </span> <span class="PreProc">Coat</span><span class="Special">::</span>Medium <span class="Statement">=></span> <span class="Number">7.5</span>,
<span id="L19" class="LineNr">19 </span> <span class="PreProc">Coat</span><span class="Special">::</span>Long <span class="Statement">=></span> <span class="Number">10.0</span>
<span id="L20" class="LineNr">20 </span> }
<span id="L21" class="LineNr">21 </span> }
<span id="L22" class="LineNr">22 </span>}
</pre>
</div>
<p></html></p>
<p>Here you can see me using pattern matching, another Rust feature. It's similar
in usage to a switch statement in C but semantically very different. Cases in
switch blocks are just gotos; pattern matching must be exhaustive, so you have
to cover every case for it to compile. Plus you can
match on ranges and other constructs, which makes it a lot more flexible.</p>
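<p>Here is a small sketch of both properties: an exhaustive match over an enum,
and a match on integer ranges. The <code>coat_rating</code> and <code>life_stage</code> functions and
the age bands are invented for illustration.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Coat {
    Hairless,
    Short,
    Medium,
    Long,
}

/// Every variant must be handled: deleting any arm below makes this fail to
/// compile with error E0004 (non-exhaustive patterns).
pub fn coat_rating(coat: Coat) -> f32 {
    match coat {
        Coat::Hairless => 0.0,
        Coat::Short => 5.0,
        Coat::Medium => 7.5,
        Coat::Long => 10.0,
    }
}

/// Matching on integer ranges; the `_` arm covers everything else.
/// The age bands here are made up for illustration.
pub fn life_stage(age_in_years: u32) -> &'static str {
    match age_in_years {
        0 => "kitten",
        1..=10 => "adult",
        _ => "senior",
    }
}
```

<p>Unlike a C switch, there is no fallthrough between arms, and the compiler
rejects the function outright if a case is missing.</p>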
<p>So, now that we've implemented this trait for these two types, we can have a
generic function<sup id="fnref:4"><a class="footnote-ref" href="#fn:4">4</a></sup>.</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span id="L1" class="LineNr">1 </span><span class="Keyword">fn</span> <span class="Function">pet</span><span class="Statement"><</span>T: Rateable<span class="Statement">></span>(boi: T) <span class="Statement">-></span> <span class="Type">&</span><span class="Type">str</span> {
<span id="L2" class="LineNr">2 </span> <span class="Statement">match</span> boi.<span class="Function">fluff_rating</span>() {
<span id="L3" class="LineNr">3 </span> <span class="Number">0.0</span>...<span class="Number">3.5</span> <span class="Statement">=></span> <span class="String">"naked alien boi...but precious nonetheless"</span>,
<span id="L4" class="LineNr">4 </span> <span class="Number">3.5</span>...<span class="Number">6.5</span> <span class="Statement">=></span> <span class="String">"increased floof...increased joy"</span>,
<span id="L5" class="LineNr">5 </span> <span class="Number">6.5</span>...<span class="Number">8.5</span> <span class="Statement">=></span> <span class="String">"approaching maximum fluff"</span>,
<span id="L6" class="LineNr">6 </span> _ <span class="Statement">=></span> <span class="String">"sublime. the softest boi!"</span>
<span id="L7" class="LineNr">7 </span> }
<span id="L8" class="LineNr">8 </span>}
</pre>
</div>
<p></html></p>
<p>Like in C++, the stuff inside the angle brackets is our type arguments. But
unlike C++ templates, we are able to parametrise the function. We're able to
say "this function is only for types that are Rateable". That's not something
you can do in C++!<sup id="fnref:5"><a class="footnote-ref" href="#fn:5">5</a></sup> This has consequences beyond readability. Trait bounds
on type arguments mean the Rust compiler can type check the function once,
rather than having to check each concrete instantiation separately. This means
faster compilation and clearer compiler error messages.</p>
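<p>For anyone who wants to try this out, here is a self-contained sketch of a
trait-bounded generic function that should compile on current Rust. It uses a
<code>&'static str</code> return type, and it swaps the float range patterns for match
guards; both are choices I've made for this sketch rather than part of the code
above. The <code>Alpaca</code> struct is trimmed down to the one field the rating uses.</p>

```rust
pub trait Rateable {
    /// Rate fluff out of 10.
    fn fluff_rating(&self) -> f32;
}

pub struct Alpaca {
    pub days_since_shearing: f32,
}

impl Rateable for Alpaca {
    fn fluff_rating(&self) -> f32 {
        10.0 * 365.0 / self.days_since_shearing
    }
}

/// Only types that implement `Rateable` are accepted here.
pub fn pet<T: Rateable>(boi: T) -> &'static str {
    // Guards stand in for the float range patterns used in the post;
    // the first arm whose guard is true wins.
    match boi.fluff_rating() {
        r if r <= 3.5 => "naked alien boi...but precious nonetheless",
        r if r <= 6.5 => "increased floof...increased joy",
        r if r <= 8.5 => "approaching maximum fluff",
        _ => "sublime. the softest boi!",
    }
}
```

<p>Calling <code>pet(42)</code> is rejected at the call site with error E0277, because
integers don't implement <code>Rateable</code>. The body of <code>pet</code> never has to be
re-checked for that type.</p>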
<p>You can also use traits dynamically, which isn't preferred as it has a runtime
penalty, but is sometimes necessary. I decided it was best not to cover that in
this post.</p>
<p>One other big part of traits is the interoperability that comes from standard
traits, like <code>Add</code> and <code>Display</code>. Implementing <code>Add</code> means you can add two
values of a type together with the + operator; implementing <code>Display</code> means you can print it.</p>
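<p>As a rough sketch of what that looks like, here are both standard traits
implemented for a made-up <code>Fluff</code> type. The type and its output format are my
own invention for this example.</p>

```rust
use std::fmt;
use std::ops::Add;

/// A made-up type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Fluff {
    pub rating: f32,
}

impl Add for Fluff {
    type Output = Fluff;

    /// Lets us write `a + b` for two `Fluff` values.
    fn add(self, other: Fluff) -> Fluff {
        Fluff {
            rating: self.rating + other.rating,
        }
    }
}

impl fmt::Display for Fluff {
    /// Lets us use `{}` in `println!` and `format!`, and call `to_string()`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1}/10 fluff", self.rating)
    }
}
```

<p>With these in place, <code>Fluff { rating: 5.0 } + Fluff { rating: 2.5 }</code> works like
any built-in addition, and the result can be passed straight to <code>println!</code>.</p>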
<h1>Rust tools</h1>
<p>C and C++ don't have a standard way to manage dependencies. There are a few
different tools for doing this, but I haven't heard great things about any of them.
Using plain Makefiles for your build system is very flexible, but can be
rubbish to maintain. CMake reduces the maintenance burden but is less flexible
which can be frustrating.</p>
<p>Rust really shines in this respect. Cargo is the one and only tool used in the
Rust community for dependency management, packaging and for building and
running your code. It's similar in many ways to Pipenv and Poetry in Python.
There's an official package repository to go along with it. I don't have a lot
more to say about this! It's really nice to use and it makes me sad that C and
C++ don't have the same thing.</p>
<p><img alt="cargo" class="callout" src="/blog/images/rust-talk/cargo.png" title="cargo usage screenshot"></p>
<h1>Should we all use Rust?</h1>
<p>There's no universal answer for this. It depends on your application, as with
any programming language. Rust is already being used very successfully in many
different places. Microsoft use it for Azure IoT stuff, Mozilla sponsor Rust
and use it for parts of the Firefox web browser, and many smaller companies are
using it too.</p>
<p>So depending on your application, it's very much production ready.</p>
<p><html></p>
<div class=info_box>
Rust is already being used in production successfully
<br><br>
But for some applications you might find support immature or lacking.
</div>
<p></html></p>
<h3>Embedded</h3>
<p>In the embedded world, how ready Rust is depends on what you are doing. There
are mature resources for Cortex-M that are used in production, and there's a
developing but not yet mature RISC-V toolchain.</p>
<p>For x86 and ARMv8 bare metal the story is also good, like for Raspberry Pis.
For more vintage architectures like PIC and AVR there isn't great support but I
don't think for most new projects that should be a big issue.</p>
<p>Cross compilation support is good for all LLVM targets because the Rust
compiler uses LLVM as its backend.</p>
<p>One area where embedded Rust is lacking is that there are no production grade
RTOSs, and HALs
are less developed. This isn't an insurmountable issue for many projects, but
it would certainly hamper many too. I expect this area to keep improving over the
next couple of years.</p>
<h3>Async</h3>
<p>One thing that definitely isn't ready is language async support which is still
in development. They're still deciding what the async/await syntax should look
like.</p>
<h3>Interoperability</h3>
<p>As for interoperability with other languages, there's a good C FFI in Rust, but
you have to go through that if you want to call Rust from C++ or vice versa.
That's very common in many languages, and I don't expect it to change. I
mention it because it would make it a bit of a pain to incorporate Rust into an
existing C++ project: you'd need a C layer between the Rust and C++ and that
would potentially be adding a lot of complexity.</p>
<h3>Final thoughts</h3>
<p>Were I starting from scratch on a new project at work I would definitely vouch
for Rust. I'm really hopeful that it
represents a better future for software – one that's more reliable, more
secure and more enjoyable to write.</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:1">
<p>from Ian Barland's <a href="http://www.radford.edu/ibarland/Manifestoes/whyC++isBad.shtml">"Why C and C++ are Awful Programming
Languages"</a> <a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text">↩</a></p>
</li>
<li id="fn:2">
<p>The value categories used to just be lvalues (which have an identifiable
place in memory) and rvalues (which do not, like literals). It is now <a href="https://en.cppreference.com/w/cpp/language/value_category">much
more complicated</a>. <a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text">↩</a></p>
</li>
<li id="fn:3">
<p>Yes...I should have had a test already. But for various unjustified
reasons I didn't write a test that covered this particular code until later. <a class="footnote-backref" href="#fnref:3" title="Jump back to footnote 3 in the text">↩</a></p>
</li>
<li id="fn:4">
<p>Note: cheekily, I included this code that would not actually compile.
You’d need the return type to be <code>&'static str</code>. This is “lifetime annotation”
and was outside the scope of my talk, and this blog post. Read about it
<a href="https://doc.rust-lang.org/1.9.0/book/lifetimes.html">here</a>. <a class="footnote-backref" href="#fnref:4" title="Jump back to footnote 4 in the text">↩</a></p>
</li>
<li id="fn:5">
<p>I hear a lot of people say that Concepts in C++20 are analogous to
traits, but this isn't true. I explain why
<a href="https://mcla.ug/blog/cpp20-concepts-are-not-like-rust-traits.html">here</a>. <a class="footnote-backref" href="#fnref:5" title="Jump back to footnote 5 in the text">↩</a></p>
</li>
</ol>
</div>Making things at the Cambridge Makespace2018-12-09T00:00:00+00:002018-12-09T00:00:00+00:00Hannah McLaughlintag:mcla.ug,2018-12-09:/blog/making-things-at-the-cambridge-makespace.html<p>I joined the <a href="http://makespace.org/" target="_blank">Cambridge
Makespace</a>, my local community workshop, earlier this year. I've made a few
nice simple things on a laser cutter since, and wanted to share a couple of my
favourites.</p>
<p>I've been trained on a 3D printer and the CNC router and look forward to using
those skills soon too!</p>
<h3>Iridescent chip necklace</h3>
<p>I ordered <a href="https://www.sketchlasercutting.co.uk/products/3mm-reflections-radiant-iridescent-acrylic-sheet" target="_blank"> this amazing acrylic</a> with an iridescent film over it.</p>
<p>I love iridescent things, and made a necklace in a shape resembling a microchip
with PCB tracks coming from it. It's a bit fragile, and I wonder if layering
some reflecting acrylic behind it would make it look better, but it's pretty
cool!</p>
<p><img alt="Chip necklace GIF" class="callout" src="/blog/images/chip-necklace.gif" title="A gif showing the iridescence of the necklace"></p>
<p><img alt="Chip necklace on laser cutter" class="callout" src="/blog/images/necklace-on-laser-cutter.jpg" title="The necklace being cut on the laser cutter"></p>
<p>Here is me dressing up by myself at home with the necklace. I call this look
"Goth Hacker".</p>
<p><img alt="Goth Hacker" class="callout" src="/blog/images/goth-hacker.jpg" title="Me wearing the necklace, with dark lipstick on, bathed in purple light"></p>
<h3>Alpaca bois</h3>
<p>I love alpacas and made these simple bois that have since been used as
keyrings, a mount for an RFID tag, a lapel pin, and as a little friend stuck to
the bezel of my monitor.</p>
<p><img alt="Bois" class="callout" src="/blog/images/wooden-alpacas.jpg" title="Wooden laser cut alpacas"></p>
<h3>Floppy disk pin</h3>
<p>I went to a party the other week where someone younger than me complimented my
"save" pin.💀</p>
<p><img alt="Pins" class="callout" src="/blog/images/pins.jpg" title="Lapel pins, one is a floppy disk and one an alpaca"></p>
<p>I've made some small gifts for loved ones also. I hope to find more time to
make stuff in the new year!</p>LED Hackers jacket for EMF camp2018-09-23T00:00:00+01:002018-09-23T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2018-09-23:/blog/led-hackers-jacket-for-emf-camp.html<h3>Hack the Planet!</h3>
<p>The 1995 film Hackers is probably my favourite film. It is so much fun: I love
the outrageous fashion, the rollerblading, the unabashed cheesiness. It
represents a more hopeful time I wish to embody. <a href="https://waypoint.vice.com/en_us/article/bj3v3d/hackers-90s-movies" target="_blank">This write up</a> of it says all I could hope to, and says it
better.</p>
<p><img alt="Hackers" class="callout" src="/blog/images/hackers.jpg" title="The cast from Hackers (1995)"></p>
<p>Last month I went to <a href="http://www.emfcamp.org" target="_blank">EMF Camp</a>,
and there was a showing of Hackers with a competition for best costume. My
moment had come: finally an excuse to look like the ridiculous and shiny hacker
I desperately want to be.</p>
<h3>Heinous amounts of PVC</h3>
<p>My first step was to spend a stupid amount of money on PVC from China, which
seems to be the only place selling the iridescent, translucent PVC I wanted in
sizes suitable for making clothes. I wish always to be shiny and opalescent,
and would accept nothing less than this surely very comfortable "fabric":</p>
<p><img alt="PVC" class="callout" src="/blog/images/pvc.jpg" title="The PVC purchased for the jacket"></p>
<p>Then I waited for the PVC to arrive which took weeks. Once it arrived, I put
off doing anything with it until two weeks before EMF, because I like to
suffer. Then, I frantically ordered a sewing pattern, and when it arrived I
got to work.</p>
<h3>Sewing the "fabric"</h3>
<p>Some aspects of sewing the PVC were easier than using actual fabric, and some
were harder. Easier: it's translucent and wipeable, so I could trace the
pattern through the PVC with a felt tip pen, rather than having to cut out the
pattern and pin it on top. Harder: the PVC I got is less flexible than I would
like, and the more pieces I sewed together, the harder it was to fit under my
sewing machine.</p>
<p>I should mention now that I am not an experienced seamster and have made
exactly <a href="https://mcla.ug/dog_tshirt.jpg" target="_blank">one (1)
garment</a>
before, in a beginners' night class.</p>
<p><img alt="Cutting the pieces" class="callout" src="/blog/images/cutting_pvc.jpg" title="Tracing the pattern and cutting the pieces out of the PVC"></p>
<p><img alt="Shiny" class="callout" src="/blog/images/shiny_under_sewing_machine.jpg" title="Shiny PVC under the sewing machine, catching the light"></p>
<h3>Realising it was all wrong</h3>
<p>After five evenings of desperate sewing, I had the terrible realisation that
the jacket was too inflexible for its length. I could wear it, but I wouldn't
be able to sit down as it wouldn't bend well at the waist. 🙃. It was too late
to shorten it – I'd already sewn on the zip and pockets.</p>
<h3>Starting again, in earnest</h3>
<p>Although I was sad to have to scrap what I'd been working on, I was confident I
could do a much better and faster job the second time. And I did! I did it in
two evenings, somehow.</p>
<p><img alt="Jacket complete, minus LEDs" class="callout" src="/blog/images/jacket_on_hanger.jpg" title="The finished jacket, minus LEDs, on a hanger">
<img alt="Me in jacket" class="callout" src="/blog/images/me_in_jacket_noled.jpg" title="Me wearing the finished jacket, minus LEDs">
<img alt="Me in jacket" class="callout" src="/blog/images/me_in_jacket_noled_sunglasses.jpg" title="Me wearing the finished jacket, minus LEDs, plus sunglasses"></p>
<h3>Flashback to December 2017</h3>
<p>Back when I was a whole 10 months younger and wasn't a hollowed out shell, I
still had the mental capacity to do programming outside of my job. I'd been
using an STM32 microcontroller at work and wanted to hack about on one at home,
both for fun and to help me learn. I got a
<a href="https://1bitsy.org/" target="_blank">1bitsy</a> and some LEDs and made
a simple driver for the LEDs with
<a href="http://libopencm3.org/" target="_blank">LibOpenCM3</a>. The one
pattern I ended up making was a gradient of the bi pride colours.</p>
<p><img alt="Engiqueering" class="callout" src="/blog/images/engiqueering.jpg" title="Colourful LED strip"></p>
<h3>Back to the present day: campsite soldering</h3>
<p>The first night at EMF I had to do a bit of campsite soldering (I'm trying to
sound cool here, but it was literally a wee bit of wire to a single pin) and
then used some tape to attach the LEDs to my jacket collar cos I didn't really
have anything else to use and it worked ok.</p>
<p><img alt="Campsite soldering complete" class="callout" src="/blog/images/campsite_soldering.jpg"></p>
<p>I went to the amazing
<a href="https://medium.com/@me_26124/emf-2018-cybar-and-nullsector-dc884233045d" target="_blank">Null Sector and Cybar</a>
and felt pretty cool. The next night was the Hackers viewing and I won the
costume competition! I almost didn't enter as I was shy. I won a very fancy
bottle of whisky and shook hands with the director of the best film ever made.
Here are some photos of me looking the coolest I can probably ever hope to
look.</p>
<p><img alt="Me at Null Sector" class="callout" src="/blog/images/me_closeup_nullsec0.jpg">
<img alt="Me at Null Sector" class="callout" src="/blog/images/me_closeup_nullsec1.jpg">
<img alt="Me at Null Sector" class="callout" src="/blog/images/me_standing_nullsec.jpg">
<img alt="Me at Null Sector" class="callout" src="/blog/images/me_nullsec_in_bg.jpg"></p>Property based testing2016-10-04T00:00:00+01:002016-10-04T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2016-10-04:/blog/property-based-testing.html<p>The following is a rough transcript of a talk I did for my colleagues at
PA on property based testing. You can find more materials here at the
<a href="https://github.com/lochsh/pbt-talk">Github repo</a>.</p>
<h2>Introduction</h2>
<p>As software engineers and programmers, we all know the value of writing
automated tests for our code. Many of us appreciate the advantages of
Test Driven Development. Today I want to talk about another technique
that can improve the usefulness of our tests.</p>
<p>Property based testing involves running a single test many times with
multiple randomly generated inputs. This allows you to test more with
less code. It makes it easier for you to write better tests, and reduces
the need for you to think up examples.</p>
<p>If we consider our tests as documentation, PBT improves the breadth and
generality of that documentation.</p>
<h2>Example based testing</h2>
<p>So, how is property based testing different from tests we've written
before? I can't speak for everyone, but when I have written tests in
the past, they have been largely example-based. I find myself having to
think up example scenarios and manually code them. For example, testing
a sorting algorithm:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="Statement">def</span> <span class="Function">test_sort_list_of_ints</span>():
<span id="L2" class="LineNr">2 </span>    ints = [<span class="Number">5</span>, <span class="Number">3</span>, <span class="Number">1</span>, <span class="Number">4</span>, <span class="Number">2</span>]
<span id="L3" class="LineNr">3 </span>    <span class="Statement">assert</span> sort(ints) == [<span class="Number">1</span>, <span class="Number">2</span>, <span class="Number">3</span>, <span class="Number">4</span>, <span class="Number">5</span>]
</pre>
</div>
<p></html></p>
<p>I want to ensure my function <code>sort</code> actually sorts a list of integers.
My test, however, only tests for one specific input. It's possible that
even if it passes, other values could cause problems. It also doesn't
document the desired behaviour of my code fully. Examples are helpful in
understanding how something works, but they aren't the whole story.
This test is just an example of how the code should work, rather than a
statement defining a more general property.</p>
<p>It's inherently difficult or lengthy to demonstrate generic properties
with example based testing.</p>
<p>With this in mind, here is a modified test:</p>
<p><html></p>
<div class=code>
<pre id='vimCodeElement'>
<span id="L1" class="LineNr">1 </span><span class="PreProc">import</span> random
<span id="L2" class="LineNr">2 </span>
<span id="L3" class="LineNr">3 </span>
<span id="L4" class="LineNr">4 </span><span class="Statement">def</span> <span class="Function">test_sort_list_of_ints</span>():
<span id="L5" class="LineNr">5 </span>    <span class="Statement">for</span> i <span class="Statement">in</span> <span class="Function">range</span>(<span class="Number">2</span>, <span class="Number">101</span>):
<span id="L6" class="LineNr">6 </span>        ints = [random.randint(-<span class="Number">1000</span>, <span class="Number">1000</span>) <span class="Statement">for</span> _ <span class="Statement">in</span> <span class="Function">range</span>(i)]
<span id="L7" class="LineNr">7 </span>        result = sort(ints)
<span id="L8" class="LineNr">8 </span>        <span class="Statement">assert</span> <span class="Function">all</span>(x <= y <span class="Statement">for</span> x, y <span class="Statement">in</span> <span class="Function">zip</span>(result, result[<span class="Number">1</span>:]))
</pre>
</div>
<p></html></p>
<p>This is better in some ways – we are now testing lists between
2 and 100 elements long, containing random integers in the range ±1000.
However, there are some difficulties here:</p>
<ul>
<li>
<p>This test may pass sometimes and fail others. Even though it may
have failed in the past, we don't have a record of the example that
caused it to fail.</p>
</li>
<li>
<p>There is no direction to our random search.</p>
</li>
</ul>
<p>A property based testing framework can help resolve these issues.</p>
<h2>Property based testing</h2>
<p>So, what can a property based testing framework add to this?</p>
<p>Here is the above example test modified to use the <code>hypothesis</code>
framework for Python, which I learnt about at EuroPython:</p>
<div class=code>
<pre id='vimCodeElement'>
<span class="PreProc">import</span> hypothesis
<span class="PreProc">from</span> hypothesis <span class="PreProc">import</span> strategies <span class="Statement">as</span> st


<span class="PreProc">@</span><span class="Function">hypothesis.given</span>(st.lists(st.integers(), min_size=<span class="Number">2</span>))
<span class="Statement">def</span> <span class="Function">test_sort_list_of_ints</span>(ints):
    result = sort(ints)
    <span class="Statement">assert</span> <span class="Function">all</span>(x &lt;= y <span class="Statement">for</span> x, y <span class="Statement">in</span> <span class="Function">zip</span>(result, result[<span class="Number">1</span>:]))
</pre>
</div>
<p>The key differences:</p>
<ul>
<li>
<p>When running the test, <code>hypothesis</code> actively seeks out falsifying
examples. Not only that, but each falsifying example is simplified until a
smaller example is found that still causes the failure. These
examples are then stored in a cache, so that a test that has failed once
will keep failing until the code is fixed.</p>
</li>
<li>
<p>From an ergonomic point of view, it's much easier to see straight
away what property we are testing. The decorator line specifies that
this test should pass for all lists of integers of length ≥2. We test
more, but with less code.</p>
</li>
</ul>
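<p>To make the "simplified until a smaller example is found" step concrete,
here is a toy sketch of the idea. It is not how <code>hypothesis</code> is
implemented; <code>buggy_sort</code>, <code>fails</code> and <code>shrink</code>
are hypothetical names invented for this illustration, and the shrinking
strategy (drop an element, then halve a value towards zero) is a deliberately
naive assumption.</p>

```python
def buggy_sort(xs):
    """Deliberately broken sort (hypothetical): a single bubble pass."""
    xs = list(xs)
    for i in range(len(xs) - 1):
        if xs[i] > xs[i + 1]:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs


def fails(xs):
    """True if the sortedness property is violated for this input."""
    result = buggy_sort(xs)
    return not all(x <= y for x, y in zip(result, result[1:]))


def shrink(failing, fails):
    """Greedily simplify a failing list while it keeps failing."""
    current = list(failing)
    improved = True
    while improved:
        improved = False
        # First try to make the list shorter by dropping one element.
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                current, improved = candidate, True
                break
        else:
            # No element can be dropped: try moving one value towards zero.
            for i, value in enumerate(current):
                if value == 0:
                    continue
                smaller = value // 2 if value > 0 else -(-value // 2)
                candidate = current[:i] + [smaller] + current[i + 1:]
                if fails(candidate):
                    current, improved = candidate, True
                    break
    return current


start = [9, -4, 7, 3, -8, 5]
minimal = shrink(start, fails)
print(start, "shrinks to", minimal)
```

<p>The shrunk list is much easier to reason about than the random one it
started from, which is the point of reporting a minimal falsifying example.</p>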
<h2>Use cases</h2>
<p>I think this kind of test is useful in any software project, but here
are some particularly motivating examples:</p>
<ul>
<li>
<p>Testing the parsing of user text input. It's infeasible to think of
every possible string a user could input to your GUI; property based
tests can give you more confidence in your sanitising and parsing.</p>
</li>
<li>
<p>Many mathematical calculations lend themselves well to being tested
this way. For example, the objective function in
Expectation-Maximisation should always decrease or plateau. If it
increases at any iteration, you have a problem.</p>
</li>
<li>
<p>The Fourier Transform of a pure sine wave should have constant
magnitude across time shifts.</p>
</li>
</ul>
<p>These are invariant properties that are poorly demonstrated with
examples alone. Property based testing allows your tests to function
better both as documentation and as evidence of the robustness of your
code. Each test is more concise, and each test goes further. For these
reasons, I hope you'll all consider giving property based testing a try
and using it in your work!</p>
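<p>As a rough sketch of the Fourier Transform property from the list above,
the check below uses only the standard library. The helper names
(<code>dft</code>, <code>sine</code>, <code>magnitudes</code>,
<code>shift_invariant</code>) are made up for this example, and it assumes a
sine with a whole number of cycles in the window. A plain loop over every
shift stands in for the inputs a property based framework would generate.</p>

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def sine(n, cycles, shift):
    """n samples of a sine with `cycles` periods per window, delayed by `shift` samples."""
    return [math.sin(2 * math.pi * cycles * (t + shift) / n) for t in range(n)]


def magnitudes(signal):
    return [abs(c) for c in dft(signal)]


def shift_invariant(n, cycles):
    """Property: the magnitude spectrum is the same for every time shift."""
    reference = magnitudes(sine(n, cycles, shift=0))
    return all(
        math.isclose(a, b, abs_tol=1e-9)
        for shift in range(n)
        for a, b in zip(reference, magnitudes(sine(n, cycles, shift)))
    )


N = 32
reference = magnitudes(sine(N, cycles=3, shift=0))
assert shift_invariant(N, cycles=3)
```

<p>A time shift only changes the phase of each DFT component, so the
magnitudes stay put; that is exactly the kind of statement that is awkward to
show with a handful of hand-picked examples.</p>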
<p>Some frameworks to read up on are:</p>
<ul>
<li>
<p><a href="https://hypothesis.works"><code>hypothesis</code></a> by David MacIver, which is
currently available for Python only. Java, C and C++ implementations
will hopefully come some time in the future.</p>
</li>
<li>
<p><a href="https://hackage.haskell.org/package/QuickCheck"><code>QuickCheck</code></a> is
the classic property based testing framework, released for Haskell
in 1999 and ported to Erlang, Scala and other functional languages.</p>
</li>
<li>
<p><a href="https://github.com/silentbicycle/theft"><code>theft</code></a> for C</p>
</li>
<li>
<p><a href="https://github.com/emil-e/rapidcheck"><code>RapidCheck</code></a> for C++</p>
</li>
<li>
<p><a href="https://github.com/fscheck/FsCheck"><code>FsCheck</code></a> for .NET languages
(C# etc.)</p>
</li>
</ul>
<p>Enjoy!</p>Making an automatic chord recogniser2016-09-13T00:00:00+01:002016-09-13T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2016-09-13:/blog/making-an-automatic-chord-recogniser.html<p>I've started working on an automatic chord recogniser for audio. It's
something I've wanted to try out for a while but hadn't found the time
until recently! It seems like a neat project :) I'm still in the
beginnings, but I'm going to talk about what I have so far.</p>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\\\(','\\\\)']],processEscapes: true}
});
</script>
<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML"></script>
<h2>What is a chromagram?</h2>
<p>The feature extraction technique I'm starting with comes from <a href="http://www.music.mcgill.ca/~jason/mumt621/papers5/fujishima_1999.pdf">a 1999
paper</a>
called "Realtime Chord Recognition of Musical Sound".
Nineteen-ninety-nine was a long time ago, but the features are really
intuitive and involve some fun simple musicology and DSP. I think it's
a really nice place to start – I might try some more exciting (read:
time-consuming and complicated) techniques later on, once I've got my
teeth into things more.</p>
<p>In Western tonal music, we divide frequencies into semitones which
follow a logarithmic scale. Every 12 semitones, the frequency doubles. A
group of 12 such semitones, or the distance it spans, is called an octave.</p>
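<p>As a small worked example of that logarithmic scale, assuming equal
temperament and a reference of 27.5 Hz for the lowest A (the function name
<code>semitone_frequency</code> is just for illustration):</p>

```python
def semitone_frequency(n, f_ref=27.5):
    """Frequency n semitones above f_ref, assuming equal temperament."""
    return f_ref * 2 ** (n / 12)


# One octave up (12 semitones) doubles the frequency,
# and four octaves up (48 semitones) multiplies it by 16.
octave_up = semitone_frequency(12)
four_octaves_up = semitone_frequency(48)
print(octave_up, four_octaves_up)
```

<p>Each semitone multiplies the frequency by the same factor, the twelfth
root of two, which is why the scale is logarithmic rather than linear.</p>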
<p><img alt="An octave on a
piano" class="callout" src="/blog/images/piano_keys.jpg" title="An octave on a piano"></p>
<p>The <em>chromagram</em> feature is a 12 dimensional vector, with each entry
giving the relative intensity of a <em>chroma</em>, e.g. A or C#. Chroma
simply refers to a note regardless of octave. It's a nice name!
Chromagrams are also known as Pitch Class Profiles, but I thought the
former was much cooler :D</p>
<h2>How do we calculate chromagrams?</h2>
<p>Time for some fun musicology and DSP. To calculate a chromagram, we want
to know which chroma are present in a time-sample, and their intensity.
This is a perfect job for a Fourier Transform. We take the DFT, then, in
short, put the DFT bins into chroma bins. We effectively sum the DFT
components that correspond to chroma frequencies (or come close).
Here's the maths:</p>
<p>$$ M_k = \operatorname{round}\left(12 \log_2\left(\frac{f_s
k}{Nf_{\textrm{ref}}}\right)\right) \bmod 12 $$ $$ C_c =
\sum_{M_k = c} \left|X_k\right|^2 $$</p>
<p>$M_k$ tells us the closest chroma to the DFT frequency bin:</p>
<ul>
<li>
<p>$f_{\textrm{ref}}$ is the reference frequency of the 0th chroma
(e.g. 27.5 Hz for the lowest A)</p>
</li>
<li>
<p>The frequency represented by the bin is easily found by the product
of the sampling frequency and the bin index, divided by the DFT
length: $f_{\textrm{bin}} = \frac{f_s k}{N}$</p>
</li>
<li>
<p>$\log_2(\frac{f_{\textrm{bin}}}{f_{\textrm{ref}}})$ gives
how many octaves the bin frequency is above the chroma reference
frequency</p>
</li>
<li>
<p>$ 12 \log_2(\frac{f_{\textrm{bin}}}{f_{\textrm{ref}}})$
gives how many semitones the bin frequency is above the chroma
reference frequency</p>
</li>
<li>
<p>Rounding this and taking the mod 12 gives the nearest chroma index</p>
</li>
</ul>
<p>$X_k$ is the DFT, and $C_c$ tells us the intensity of the chroma
$c$ by summing DFT components where the spectral bins land in the
chroma bins.</p>
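<p>The two formulas above can be sketched in a few lines of dependency-free
Python. This is an illustration under stated assumptions, not the code from
my project: the names <code>dft_bin</code>, <code>chroma_index</code>,
<code>chromagram</code> and <code>tone</code> are invented here, the DFT is
the naive definition rather than an FFT, and only bins below the Nyquist
frequency are summed (bin 0 is skipped because its logarithm is undefined).</p>

```python
import cmath
import math


def dft_bin(signal, k):
    """k-th DFT coefficient X_k, computed directly from the definition."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))


def chroma_index(k, n, fs, f_ref=27.5):
    """M_k: nearest chroma (0 = chroma of f_ref) for DFT bin k."""
    f_bin = fs * k / n
    return round(12 * math.log2(f_bin / f_ref)) % 12


def chromagram(signal, fs, f_ref=27.5):
    """C_c: summed squared DFT magnitudes for each of the 12 chroma."""
    n = len(signal)
    chroma = [0.0] * 12
    # Assumption: skip bin 0 and the mirrored upper half of the spectrum.
    for k in range(1, n // 2):
        chroma[chroma_index(k, n, fs, f_ref)] += abs(dft_bin(signal, k)) ** 2
    return chroma


def tone(freq, fs, n):
    """n samples of a pure sine at freq Hz, sampled at fs Hz."""
    return [math.sin(2 * math.pi * freq * t / fs) for t in range(n)]


fs, n = 8000, 400  # bin spacing of 20 Hz, so 440 Hz lands exactly on bin 22
chroma_a = chromagram(tone(440.0, fs, n), fs)
strongest = max(range(12), key=chroma_a.__getitem__)
print("strongest chroma index:", strongest)
```

<p>With the reference set to an A, index 0 is A, index 1 is A#, and so on
up to index 11 for G#.</p>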
<h2>My project</h2>
<p>I currently have a functioning chromagrammer in Python 3, along with
some generic audio processing, and a script for generating tones and
chords for testing. There are some issues, and I've had some silly bugs
along the way, but I'm enjoying it. Here is one of my favourite pieces
of code. All it does is produce overlapping frames of a signal, but
it's very neat :D</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="Statement">def</span> <span class="Function">overlapping_frames</span>(self):
    <span class="String">"""</span>
<span class="String">    Generates overlapping frames</span>

<span class="String">    Generator that yields a deque containing the current frame of audio</span>
<span class="String">    data. The deque contents are shifted by the frame size minus the</span>
<span class="String">    overlap on each iteration, to minimise computation.</span>
<span class="String">    """</span>
    hop = self.frame_size - self.overlap
    frame = collections.deque(maxlen=self.frame_size)
    frame.extend(self.data[:self.frame_size])
    <span class="Statement">yield</span> frame

    <span class="Statement">for</span> i <span class="Statement">in</span> <span class="Function">range</span>(<span class="Number">1</span>, self.num_frames):
        <span class="Comment"># Append only the hop new samples; maxlen drops the oldest ones</span>
        start = self.frame_size + (i - <span class="Number">1</span>) * hop
        frame.extend(self.data[start:start + hop])
        <span class="Statement">yield</span> frame
</pre>
</div>
<p>I'm using <a href="https://hypothesis.readthedocs.io">hypothesis</a> for property based
testing, which I'm really getting into. Here's an example test;
hypothesis actively searches for example inputs that will break it. My tests are so
much more useful with this!</p>
<div class=code>
<pre id='vimCodeElement' style="overflow-x: scroll; width: 100%; white-space: pre;">
<span class="Comment"># arrays comes from hypothesis.extra.numpy</span>
<span class="PreProc">@</span><span class="Function">hypothesis.given</span>(arrays(<span class="Function">float</span>, <span class="Number">100</span>))
<span class="Statement">def</span> <span class="Function">test_overlapping_frames_yields_correct_initial_frame</span>(self, data):
    self.ap.data = np.nan_to_num(data)
    self.ap.process_data()

    frames = self.ap.overlapping_frames()
    <span class="Statement">assert</span> (<span class="Function">next</span>(frames) == self.ap.data[:self.ap.frame_size]).all()
</pre>
</div>
<p>It's not a very exciting test, but I can be confident my code works for
a variety of values, and it saves me having to make up dummy data.</p>
<h2>Seaborn for pretty graphs!</h2>
<p>I've started using
<a href="https://stanford.edu/~mwaskom/software/seaborn/">seaborn</a> for plotting
my graphs, rather than plain old matplotlib. It's amazing! Everything
looks so pretty. Here's part of a <a href="http://www.mountain-goats.com/">Mountain
Goats</a> song's chromagram plotted for a
bunch of time samples. Nice!</p>
<p><img alt="Pretty chromagram ;o" src="/blog/images/tmg.png" title="Pretty seaborn heatmap"></p>
<p>I want to try and update my blog with how my project is going. I'm
really enjoying it and I'm excited about the different ways it can go.
There's a lot of scope for some cool Machine Learning and DSP! I really
like the neat intuitive features I'm using at the moment, but I'm
excited about what else I could use :D You can check out the code
<a href="https://github.com/lochsh/chordal">here</a> at the GitHub repo.
</p>All things wooden wonderful2016-04-02T00:00:00+01:002016-04-02T00:00:00+01:00Hannah McLaughlintag:mcla.ug,2016-04-02:/blog/all-things-wooden-wonderful.html<p>I haven't posted on my blog in ages! I just changed themes to the
lovely <a href="https://github.com/jvanz/pelican-hyde">Pelican Hyde</a>, and
thought it was time to update the content, too.</p>
<p>I made some exciting wooden things around Christmas time – or rather, I
embellished one and designed another. Both were presents for others, and
both recipients were delighted :D.</p>
<h2>The Space Box</h2>
<p><img alt="space box" class="callout" src="/blog/images/space_box_1.jpg"></p>
<p>Years ago, I bought a woodburning kit off Amazon for £20 or so. I then
proceeded to never use it...until late last year, that is!</p>
<p>I had the amazing idea of making a space-themed box for my partner. At
first, I was determined to learn to use a
<a href="https://en.wikipedia.org/wiki/Router_(woodworking)">router</a>, and inlay
a wooden box with mother of pearl. This was far too ambitious, it turned
out; I hope I get to try that some day.</p>
<p>I realised quickly that woodburning was my best bet. Burning patterns
into wood is also called
<a href="https://en.wikipedia.org/wiki/Pyrography">pyrography</a>, and I'm amazed
at the skill some pyrographers have! Do a quick Google image search, and
you will be too.</p>
<p>I enjoy drawing and I'm pretty ok at it – I foolishly expected
woodburning to not be that different! In some ways it isn't, but the
grain in wood can give a lot of resistance. Burning a solid line in the
direction you want is difficult! I don't understand how other people do
it! I think it's probably a combination of skill, selecting the right
wood, and maybe having a better quality kit.</p>
<p>Anyway, I decided that pointillism was the solution to my ineptness at
woodburning, and it worked a treat!</p>
<p><img alt="space
box" class="callout" src="/blog/images/space_box.jpg" title="so many hours of burning"></p>
<p>After hours of burning, I lined the inside of the box with some cool
spacey-fabric, and I was done! I'd originally planned to finish the
wood with something to make it last longer, but I lacked the patience in
the end. I wanted to present my gift!</p>
<h2>Laser cutting</h2>
<p><img alt="dhun na gall" class="callout" src="/blog/images/donegal.jpg"></p>
<p>I wanted to get my Granny a good Christmas present. My granny is from
Donegal, and I thought it would be funny to put some Donegal turf in a
"BREAK IN CASE OF EMERGENCY" case. My family did not appreciate my
genius and sadly this plan has not been carried out.</p>
<p><img alt="donegal" class="callout" src="/blog/images/donegal_collage.jpg"></p>
<p>I did, however, design this lovely piece, which my partner then laser
cut for me. Look at it! It's beautiful. I have resolved to gain access
to a laser cutter so I can make more such things.</p>
<p><img alt="framed!" class="callout" src="/blog/images/donegal_frame.jpg" title="So many hours finding a frame"></p>
<p>The flowers are honeysuckle, foxgloves and montbretia (though I didn't
know the name of that last one until I had to find a reference image to
draw from!). My granny was delighted, and so was I.</p>
<p><img alt="donegal
flowers" class="callout" src="/blog/images/donegal_flowers.jpg" title="Donegal flowers"></p>
<p><em><a href="http://roseblaneyphotography.ie">Montbretia</a>
<a href="http://www.irishnews.com/lifestyle/gardening/2016/01/16/news/plant-of-the-week---winter-honeysuckle-lonicera--380353/">Honeysuckle</a>
<a href="http://flickrhivemind.net/blackmagic.cgi?id=2586941715&url=http%3A%2F%2Fflickrhivemind.net%2FTags%2Fdonegal%252Cfoxglove%2FInteresting%3Fsearch_type%3DTags%3Btextinput%3Ddonegal%252Cfoxglove%3Bphoto_type%3D250%3Bmethod%3DGET%3Bnoform%3Dt%3Bsort%3DInterestingness%23pic2586941715&user=&flickrurl=http://www.flickr.com/photos/17487821@N00/2586941715">Foxgloves</a></em>
</p>The Desk2015-11-08T00:00:00+00:002015-11-08T00:00:00+00:00Hannah McLaughlintag:mcla.ug,2015-11-08:/blog/the-desk.html<p><img alt="The finished
desk" class="callout" src="/blog/images/desk/done_collage.jpg" title="The finished desk"></p>
<p>A few months ago I bought a desk. It was the end of a long and consuming
search to find one I liked that would fit in the 105cm wide alcove in my
living room.</p>
<p>A combination of things meant I decided it would be a good idea to
refinish the desk. I thought I could do it in a weekend. "Sanding
can't take that long," I thought, wild with naïveté.</p>
<p><img alt="The desk
before" class="callout" src="/blog/images/desk/before_drawer_open.jpg" title="Before photo of desk"></p>
<p><em>The desk before I began my long struggle to beautify it</em></p>
<p>Turns out sanding takes a very long time! Even with an electric sander.
Welp. This project became a bit of a pain – and I couldn't abandon it,
not once I had started to remove the finish from the wood! Also, my
sander was so loud I had to wear earplugs, meaning I couldn't sand any
time I wanted. I had neighbours and a flatmate to consider! This made
progress even slower than it might have been.</p>
<p><img alt="Slow
progress" class="callout" src="/blog/images/desk/leg_comparison.jpg" title="Slow progress"></p>
<p><em>Slow progress</em></p>
<p><img alt="Slow
progress" class="callout" src="/blog/images/desk/drawer_comparison.jpg" title="Slow progress"></p>
<p>The desk is quite a cool design. The legs screw on and off, and there's
a hollow in the desk where they can be stored. It makes it easy to
transport! I appreciate this.</p>
<p><img alt="So
portable" class="callout" src="/blog/images/desk/leg_compartment.jpg" title="Portability"></p>
<p>Eventually, after many weekends of sanding, I decided I was ready to
apply stain. My aim was to have it match the wee table and unit in my
living room.</p>
<p><img alt="Stain
test" class="callout" src="/blog/images/desk/stain_test.jpg" title="Testing the stain"></p>
<p>I decided the stain looked good enough. It was time to get down to
business!</p>
<p><img alt="Finished
sanding" class="callout" src="/blog/images/desk/all_sanded.jpg" title="Finished sanding"></p>
<p><img alt="Staining at
last" class="callout" src="/blog/images/desk/staining.jpg" title="Staining the wood"></p>
<p>Another thing that slowed me down was my lack of a proper workspace. One
goal for the next place I live is to have a garage or workshop! I had to
apply the stain on my balcony. There wasn't much space, and I had to
wait for a dry day. Despite applying it outside, the living room still
smelled strongly of solvents for a few hours. Much appreciation to my
flatmate for putting up with this.</p>
<p>The stain took really well to most of the wood – apart from the front!
I clearly hadn't sanded enough, and so some finish was left on the
wood. Reluctantly I sanded it again, and applied the stain once more.
I'm so glad I bothered to do so.</p>
<p>Unfortunately, after staining there was still a long way to go. Not a
lot of work, true; but a lot of time. I'd had oil finishes recommended
to me for their low-lustre, natural looking finish. I used Rustin's
Danish Oil. I'm not sure I would use it again. It was quite difficult
to apply correctly. I don't think I wiped enough of the excess off. On
the last coat I applied it with fine steel wool, and this made a huge
difference. The finish evened out, and felt so much smoother.</p>
<p>I applied three coats to the wood, and an extra one on the top surface.
Once they were all completely dry, I used Howard's Feed 'n' Wax to
polish the wood. I'm glad I bothered with this extra step. The wood
felt even better afterwards!</p>
<p>I took some time to admire the grain. It came out so well! However, I
had one task left before my desk was complete: lining the drawer.</p>
<p><img alt="Lining the
drawer" class="callout" src="/blog/images/desk/lining_collage.jpg" title="Lining the drawer"></p>
<p>I had some flock fabric I'd bought ages ago and never used. I cut
the fabric to size with a Stanley knife, fixed it in place with spray
adhesive, and voila! A luxurious drawer was created.</p>
<p><img alt="The lined
drawer" class="callout" src="/blog/images/desk/drawer.jpg" title="The lined drawer"></p>
<p>My desk was now finished, at long, long last! Here are some of the many
photos I took. Look how well it matches the other furniture! (It's not
perfect, but it's good enough for me.)</p>
<p><img alt="So beautiful" class="callout" src="/blog/images/desk/done1.jpg" title="DONE AT LAST"></p>
<p><img alt="So
beautiful" class="callout" src="/blog/images/desk/done_angle.jpg" title="DONE AT LAST"></p>
<p><img alt="So
beautiful" class="callout" src="/blog/images/desk/done_books.jpg" title="DONE AT LAST"></p>
<p><img alt="So
beautiful" class="callout" src="/blog/images/desk/done_chair.jpg" title="DONE AT LAST"></p>
<p><img alt="So
beautiful" class="callout" src="/blog/images/desk/matchy_matchy.jpg" title="DONE AT LAST">
</p>