<h1>Reflow Soldering Hotplate</h1>
<p>Jakob Stoklund Olesen, 2pi.dk (https://2pi.dk/), 2024-01-02</p>
<p>I got a hotplate so I can make my own printed circuit boards with surface
mount components using solder paste and reflow soldering.</p>
<figure id="__yafg-figure-1">
<img alt="Reflow soldering hotplate" class="image-process-photo" src="https://2pi.dk/2024/01/derivatives/photo/960w/Reflow soldering hotplate.jpeg" srcset="https://2pi.dk/2024/01/derivatives/photo/640w/Reflow soldering hotplate.jpeg 640w, https://2pi.dk/2024/01/derivatives/photo/960w/Reflow soldering hotplate.jpeg 960w, https://2pi.dk/2024/01/derivatives/photo/1280w/Reflow soldering hotplate.jpeg 1280w"/>
<figcaption>Hotplate for reflow soldering. It has a 200 × 200 mm aluminum top plate and a cheap industrial temperature controller.</figcaption>
</figure>
<p>I did the first burn-in run outside on the deck. The plate gets 400°C hot,
although it takes 15 minutes to get up to that temperature. I’m glad I did
the burn-in outside because it did smell quite a bit. After the first run,
the plate doesn’t smell at all when it gets hot.</p>
<h2><span class="caps">PID</span> tuning</h2>
<p>The <a href="https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller"><span class="caps">PID</span> controller</a> comes with some very conservative factory settings
that make the hotplate quite slow at reaching the set temperature. Out of the
box it takes 10 minutes to reach 300°C. You can tell the thermostat starts
backing off already at 250°C to avoid overshooting. This is not great for
soldering electronics—we want to minimize the time components are exposed to heat.</p>
<p>My controller is just labeled <em><span class="caps">RRKKCC</span></em>, and I couldn’t find a manual for it
online. It seems to be a copy of the Japanese <a href="https://www.mpja.com/download/rex-c100.pdf"><span class="caps">REX</span>-C100 controller</a> from <span class="caps">RKC</span> Instruments. I used that manual to piece together the
settings for my device.</p>
<p>I ran a bunch of experiments with different <span class="caps">PID</span> settings and eventually
arrived at this configuration:</p>
<table>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<g id="s4">
<path d="m10,2-1-1h-6l-1,1 1,1h6zl-1,1v6l1,1 1-1v-6zm0,16-1-1v-6l1-1 1,1v6zl-1-1h-6l-1,1 1,1h6zm-8-8-1,1v6l1,1 1-1v-6zl-1-1v-6l1-1 1,1v6zl1,1h6l1-1-1-1h-6z" fill="#EEE" id="s"></path>
<use href="#s" x="12"></use>
<use href="#s" x="-12"></use>
<use href="#s" x="-24"></use>
</g>
<g id="sA0">
<path d="m-2,2-1-1h-6l-1,1 1,1h6z" id="a0"></path>
<g id="bc0">
<path d="m-2,2 -1,1 v6 l1,1 1,-1 v-6z" id="b0"></path>
<use href="#b0" id="c0" y="8"></use>
</g>
<use href="#bc0" id="ef0" x="-8"></use>
<use href="#a0" id="g0" y="8"></use>
</g>
<g id="sL1">
<use href="#ef0" id="ef1" x="12"></use>
<use href="#g0" id="d1" x="12" y="8"></use>
</g>
<use href="#bc0" id="bc2" x="24"></use>
</svg>
</td>
<td><span class="caps">AL1</span><br/>30°C</td>
<td><em>Alarm 1.</em> This alarm temperature is not used in the
hotplate.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#sA0" id="sA1" x="12"></use>
<g id="sT2">
<use href="#ef1" id="ef2" x="12"></use>
<use href="#a0" id="a2" x="24"></use>
</g>
</svg>
</td>
<td><span class="caps">AT</span><br/>0</td>
<td><em>Auto-tune mode.</em> I tried enabling this, and it made a complete
mess of all the settings. Not recommended.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#ef2"></use>
<use href="#a2"></use>
<use href="#a2" id="g2" y="8"></use>
<use href="#b0" id="b2" x="24"></use>
</svg>
</td>
<td>P<br/>10°C</td>
<td><em>Proportional band.</em> This controls the gain of the <span class="caps">PID</span> controller
by setting the width of the error band where the controller output
transitions from 0% to 100% heating.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#ef2"></use>
</svg>
</td>
<td>I<br/>30 s</td>
<td><em>Integral time constant.</em> This controls the weight of the integral
term in the <span class="caps">PID</span> controller.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#bc0" id="bc2" x="24"></use>
<use href="#b2" id="e2" x="-8" y="8"></use>
<use href="#g2"></use>
<use href="#g2" id="d2" y="8"></use>
</svg>
</td>
<td>d<br/>6 s</td>
<td><em>Derivative time constant.</em> This controls the derivative term in
the <span class="caps">PID</span> controller. I lowered this a lot (from 60 s) since this is the term
that caused the controller to be so cautious about overshooting.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#sA1"></use>
<use href="#e2"></use>
<use href="#g2"></use>
</svg>
</td>
<td>Ar<br/>50%</td>
<td><em>Anti-reset windup.</em> This is a percentage of the proportional band.
I wasn’t able to figure out how this setting works.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#sT2"></use>
</svg>
</td>
<td>T<br/>2 s</td>
<td><em><span class="caps">PWM</span> period.</em> The controller drives the heater from 0% to 100%
using pulse width modulation. This just sets the period of the pulses. I left
it at the factory default.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#a2" id="a1" x="-12"></use>
<use href="#b0" id="f1" x="4"></use>
<use href="#g0" id="g1" x="12"></use>
<use href="#c0" id="c1" x="12"></use>
<use href="#d1"></use>
<g id="sC2">
<use href="#a2"></use>
<use href="#ef2"></use>
<use href="#d2"></use>
</g>
</svg>
</td>
<td><span class="caps">SC</span><br/>0°C</td>
<td><em>Sensor calibration.</em> This number is simply added to the
measurement from the K-type thermocouple sensor on the hotplate. Mine did not
need any calibration.</td>
</tr>
<tr>
<td>
<svg fill="#A00" height="40" stroke="#FFF" stroke-width=".25" viewbox="-24 0 48 20" width="96">
<use href="#s4"></use>
<use href="#sL1" id="sL0" x="-12"></use>
<use href="#sC2" id="sC1" x="-12"></use>
<use href="#ef2"></use>
<use href="#b2"></use>
<use href="#d2"></use>
<use href="#g2"></use>
</svg>
</td>
<td>LCk<br/>0000</td>
<td><em>Lock.</em> This parameter locks various settings so they can’t be
changed accidentally. I left it off.</td>
</tr>
</table>
<p>With these settings, the hotplate goes from room temperature to 300°C in less
than six minutes.</p>
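<p>To see why the derivative time matters so much, here is a minimal Python sketch of a textbook <span class="caps">PID</span> in proportional-band form. It is only an assumption about how controllers of this kind usually combine the three settings, not the actual algorithm inside the <em><span class="caps">RRKKCC</span></em> or the <span class="caps">REX</span>-C100. The function name, the 1 °C/s heating rate, and the use of the tuned 10°C band in both cases are made up for illustration, and anti-reset windup is not modeled.</p>

```python
def pid_output(error, integral, d_error_dt, p_band=10.0, t_i=30.0, t_d=6.0):
    """Heater duty in percent for a textbook PID (hypothetical helper).

    error       -- setpoint minus measured temperature, in °C
    integral    -- accumulated error, in °C·s
    d_error_dt  -- rate of change of the error, in °C/s
    """
    gain = 100.0 / p_band  # percent of heater output per °C of error
    raw = gain * (error + integral / t_i + t_d * d_error_dt)
    return max(0.0, min(100.0, raw))  # the heater can only do 0..100%


# Assumed situation: plate 20 °C below the setpoint and heating at 1 °C/s.
# The error shrinks as the plate heats, so d_error_dt is -1 °C/s.
print(pid_output(20.0, 0.0, -1.0, t_d=60.0))  # factory derivative time
print(pid_output(20.0, 0.0, -1.0, t_d=6.0))   # tuned derivative time
```

<p>With these assumed numbers, the 60 s derivative time already cuts the output to 0% while the plate is still 20°C short of the set temperature, whereas 6 s leaves the heater at 100%.</p>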
<h2>Reflow soldering</h2>
<p>The thermostat is measuring the temperature of the aluminum hotplate itself,
but the solder paste we’re melting is going to be on a <span class="caps">PCB</span> on top of the
hotplate. Since the thermal coupling may not be the best, I used a small <span class="caps">PCB</span>
and a thermocouple to measure the temperature on the <span class="caps">PCB</span> surface. I used a
bit of flux to ensure good coupling between the <span class="caps">PCB</span> and thermocouple:</p>
<figure id="__yafg-figure-2">
<img alt="PCB with thermocouple" class="image-process-photo" src="https://2pi.dk/2024/01/derivatives/photo/960w/PCB with thermocouple.jpeg" srcset="https://2pi.dk/2024/01/derivatives/photo/640w/PCB with thermocouple.jpeg 640w, https://2pi.dk/2024/01/derivatives/photo/960w/PCB with thermocouple.jpeg 960w, https://2pi.dk/2024/01/derivatives/photo/1280w/PCB with thermocouple.jpeg 1280w"/>
<figcaption>Small <span class="caps">PCB</span> on the hotplate with a thermocouple measuring the surface temperature.</figcaption>
</figure>
<p>A couple of experiments showed that the hotplate is about 20% hotter than the
<span class="caps">PCB</span> surface when measuring in °C. So to get a 250°C <span class="caps">PCB</span> you need a 300°C hotplate.</p>
<p>This is the temperature profile I get for the <span class="caps">PCB</span> surface when I simply set
the hotplate to 300°C:</p>
<figure id="__yafg-figure-3">
<img alt="Reflow profile" class="image-process-photo" src="https://2pi.dk/2024/01/derivatives/photo/960w/Reflow profile.png" srcset="https://2pi.dk/2024/01/derivatives/photo/640w/Reflow profile.png 640w, https://2pi.dk/2024/01/derivatives/photo/960w/Reflow profile.png 960w, https://2pi.dk/2024/01/derivatives/photo/1280w/Reflow profile.png 1280w"/>
<figcaption>Temperature profile of the top of the <span class="caps">PCB</span>. Cooled with a fan once it reached 250°C.</figcaption>
</figure>
<p>This profile meets all the requirements for reflow soldering using <span class="caps">SAC305</span>
lead-free solder paste:</p>
<ul>
<li>Preheat from 150°C to 200°C: 90 seconds.</li>
<li>Time above 217°C: less than 2 minutes.</li>
<li>Time from 25°C to peak temperature: less than 8 minutes.</li>
</ul>
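<p>The three figures in the list can be computed from a logged temperature trace. The sketch below is one way to do that in Python, assuming the log is a list of (seconds, °C) pairs. The sample data is invented so the arithmetic is easy to follow; it is not read off the plot above, and the helper names are hypothetical.</p>

```python
def crossing_time(samples, level):
    """First time (s) the temperature rises to `level`, linearly interpolated."""
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if c0 < level <= c1:
            return t0 + (t1 - t0) * (level - c0) / (c1 - c0)
    return None


def time_above(samples, level):
    """Total seconds spent above `level`, linearly interpolated."""
    total = 0.0
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if c0 >= level and c1 >= level:
            total += t1 - t0
        elif c0 < level < c1 or c1 < level < c0:
            # only the part of this segment above the level counts
            total += (t1 - t0) * (max(c0, c1) - level) / abs(c1 - c0)
    return total


# Invented trace: (seconds since start, PCB surface temperature in °C).
samples = [(0, 25), (100, 125), (150, 150), (240, 200),
           (300, 250), (340, 200), (400, 100)]

preheat = crossing_time(samples, 200) - crossing_time(samples, 150)
above_liquidus = time_above(samples, 217)
time_to_peak = max(samples, key=lambda s: s[1])[0] - samples[0][0]

print(f"preheat 150-200°C: {preheat:.0f} s")
print(f"time above 217°C: {above_liquidus:.0f} s (limit 120 s)")
print(f"time to peak: {time_to_peak} s (limit 480 s)")
```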
<p>When I tried it for real with the small purple <span class="caps">PCB</span>, I simply turned off the
hotplate and pointed a fan at the <span class="caps">PCB</span> as soon as the reflow happened. It
works great!</p>
<h1>Basic Block Arguments</h1>
<p>Jakob Stoklund Olesen, 2022-05-29</p>
<p><span class="caps">SSA</span> form (<a href="https://en.wikipedia.org/wiki/Static_single-assignment_form">Static Single Assignment</a>)
is popular in compiler intermediate representations. Since <span class="caps">SSA</span> has no mutable
variables, phi functions are an important feature.</p>
<p>Let’s take a look at an alternative to phi functions: <a href="https://en.wikipedia.org/wiki/Static_single-assignment_form#Block_arguments"><em>basic block arguments</em></a>.</p>
<h2>Phi functions</h2>
<p>Here’s a loop counting to 10 in a fictional <span class="caps">SSA</span> <span class="caps">IR</span>:</p>
<div class="highlight"><pre><span></span><code>entry:
    v1 = const 1
    jump loop
loop:
    v10 = phi(entry: v1, loop: v11)
    v11 = add v10, 1
    v12 = cmp le v11, 10
    branch v12, loop, exit
exit:
    return
</code></pre></div>
<p>The phi function binds the <code>v10</code> value in a control flow dependent way: When coming from the <code>entry</code> block, use <code>v1</code>. When coming from the <code>loop</code> block, use <code>v11</code>. There must be a phi operand for each predecessor block so the value is always defined.</p>
<h2>Basic block arguments</h2>
<p>Here is the same loop in a fictional <span class="caps">IR</span> that uses basic block arguments instead of phi functions:</p>
<div class="highlight"><pre><span></span><code>entry:
    v1 = const 1
    jump loop(v1)
loop(v10):
    v11 = add v10, 1
    v12 = cmp le v11, 10
    branch v12, loop(v11), exit
exit:
    return
</code></pre></div>
<p>The <code>v10</code> value is no longer defined by a phi function. Instead it appears as an
argument to the <code>loop</code> basic block. These block arguments work just like
function arguments: You must provide a value when jumping to the basic block.</p>
<p>The biggest advantage to using basic block arguments is that phi functions
disappear, and it is no longer necessary to update phi operand lists when making
changes to the control flow. Of course, compiler optimization passes must now
make sure to always provide the correct arguments when branching to a basic
block. While that is technically equivalent to updating phi operand lists, in
practice it turns out to be a lot easier.</p>
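<p>The equivalence between the two forms can be sketched in a few lines of Python. The dictionary layout below is invented for this example and does not correspond to any real compiler's data structures: each block lists its phi functions (destination mapped to one operand per predecessor) and its successor blocks.</p>

```python
def phis_to_block_args(blocks):
    """Rewrite phi functions as block parameters plus jump arguments.

    `blocks` maps a block name to {"phis": {dest: {pred: value}},
    "targets": [successor names]}. This layout is hypothetical.
    """
    result = {}
    for name, block in blocks.items():
        result[name] = {
            # every phi destination becomes a parameter of its block
            "params": list(block["phis"]),
            # every outgoing edge passes the operand each target phi had for us
            "targets": [
                (target, [ops[name] for ops in blocks[target]["phis"].values()])
                for target in block["targets"]
            ],
        }
    return result


# The counting loop from the phi example above.
loop_ir = {
    "entry": {"phis": {}, "targets": ["loop"]},
    "loop": {"phis": {"v10": {"entry": "v1", "loop": "v11"}},
             "targets": ["loop", "exit"]},
    "exit": {"phis": {}, "targets": []},
}

for name, block in phis_to_block_args(loop_ir).items():
    print(name, block)
```

<p>The result has <code>entry</code> jumping to <code>loop</code> with <code>v1</code>, and <code>loop</code> taking <code>v10</code> as a parameter and branching back to itself with <code>v11</code>, which matches the block-argument version of the loop.</p>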
<h2>Corner cases</h2>
<p>Block arguments are almost equivalent to phi functions, but there are some corner cases to consider. One case is terminator instructions that mention the same basic block more than once. Consider:</p>
<div class="highlight"><pre><span></span><code>entry:
    branch v12, block(v3), block(v4)
block(v20):
    ...
</code></pre></div>
<p>This <span class="caps">IR</span> has a perfectly well-defined meaning, but it can’t be represented with a
phi function without introducing an extra basic block. At some point, code
generation will also have to create a separate basic block to distinguish the
two cases. Depending on how high-level or low-level the <span class="caps">IR</span> is, you can choose
whether to allow this kind of thing.</p>
<p>Block arguments can also be used to handle function calls that can either return normally or throw an exception. Compare this to something like <span class="caps">LLVM</span>’s <a href="https://llvm.org/docs/LangRef.html#invoke-instruction"><code>invoke</code> instruction</a>:</p>
<div class="highlight"><pre><span></span><code>entry:
    invoke f() to normal unwind lpad
normal(v10):
    # Normal return value in v10
lpad(v20):
    # Exception v20 thrown
</code></pre></div>
<p><span class="caps">LLVM</span>’s <code>landingpad</code> instruction is a bit of a hack to deal with this.</p>
<h1>Preparing Sides</h1>
<p>Jakob Stoklund Olesen, 2021-02-22</p>
<p>I let my new white pine boards sit in the shop in the basement for six weeks to
let them acclimate. White pine is not supposed to move so much with changes in
humidity, but it is still best to make sure the boards are stable while working
with them. I take a long time working on my projects, so many days can pass
between flattening a board and cutting joints in it.</p>
<h2>Flattening boards</h2>
<p>I made each case side from two 12” boards glued together to get a 24” wide
panel. I cut the pieces to length, making sure there were no knots near the cut.
It would be difficult to make dovetails in a knot. The boards are not completely
flat because the wood moved after it was milled, so I planed each piece flat on
both sides, and made sure all the pieces ended up with the same thickness.</p>
<figure id="__yafg-figure-1">
<img alt="Flattening a 12&quot; board with a jointer plane" class="image-process-photo" src="https://2pi.dk/2021/02/derivatives/photo/960w/Flattening boards for sides.jpeg" srcset="https://2pi.dk/2021/02/derivatives/photo/640w/Flattening boards for sides.jpeg 640w, https://2pi.dk/2021/02/derivatives/photo/960w/Flattening boards for sides.jpeg 960w, https://2pi.dk/2021/02/derivatives/photo/1280w/Flattening boards for sides.jpeg 1280w"/>
<figcaption>Flattening a 12 inch board with a #8 jointer plane. It’s pretty easy when the plane is as long as the board.</figcaption>
</figure>
<p>I also made sure the edges were straight and exactly 90° to the faces. This is
extra important for the edges that are glued together so there are no gaps.</p>
<h2>Gluing up the panels</h2>
<p>I needed all my clamps to glue a single panel, so I had to do the four sides
separately over four days.</p>
<figure id="__yafg-figure-2">
<img alt="Gluing a panel for the side" class="image-process-photo" src="https://2pi.dk/2021/02/derivatives/photo/960w/Gluing up side panel.jpeg" srcset="https://2pi.dk/2021/02/derivatives/photo/640w/Gluing up side panel.jpeg 640w, https://2pi.dk/2021/02/derivatives/photo/960w/Gluing up side panel.jpeg 960w, https://2pi.dk/2021/02/derivatives/photo/1280w/Gluing up side panel.jpeg 1280w"/>
<figcaption>The panel for the rear side of the chest. The big knot doesn’t go through to the other side, so I’ll put it on the outside where it will be painted over.</figcaption>
</figure>
<p>I used four pipe clamps to apply pressure on the edge joint and F-clamps to
secure cauls in order to make the panel as straight as possible. Transparent
packing tape on the cauls makes sure that the glue doesn’t stick to them.</p>
<p>One of the edges wasn’t perfectly square, so the panel came out with a small
angle in the joint. It’s so small that I can push it out when joining the dovetails.</p>
<h2>Flattening the panels</h2>
<p>After the glue dried, I flattened the whole panel again to get rid of the
imperfections in the glue joint. The panels are wider than my workbench, so it
was a bit tricky to hold them securely.</p>
<figure id="__yafg-figure-3">
<img alt="Flattening a panel" class="image-process-photo" src="https://2pi.dk/2021/02/derivatives/photo/960w/Flattening panels.jpeg" srcset="https://2pi.dk/2021/02/derivatives/photo/640w/Flattening panels.jpeg 640w, https://2pi.dk/2021/02/derivatives/photo/960w/Flattening panels.jpeg 960w, https://2pi.dk/2021/02/derivatives/photo/1280w/Flattening panels.jpeg 1280w"/>
<figcaption>Flattening a panel after gluing it.</figcaption>
</figure>
<p>One of the panels had a bump because the two boards were not exactly the same
thickness when I glued them. Next time I should be more careful to get the
boards to the same thickness before gluing them up. I think that will make the
cauls work better too.</p>
<p>I also planed the edges of the panels to get them perfectly parallel and to get
all the panels to the exact same width.</p>
<figure id="__yafg-figure-4">
<img alt="Planing the edge of a panel" class="image-process-photo" src="https://2pi.dk/2021/02/derivatives/photo/960w/Straightening panels.jpeg" srcset="https://2pi.dk/2021/02/derivatives/photo/640w/Straightening panels.jpeg 640w, https://2pi.dk/2021/02/derivatives/photo/960w/Straightening panels.jpeg 960w, https://2pi.dk/2021/02/derivatives/photo/1280w/Straightening panels.jpeg 1280w"/>
<figcaption>Planing the edge of a glued up panel. The sliding deadman on my workbench works great for holding such a big panel with an F-clamp and a holdfast.</figcaption>
</figure>
<p>The corners of the chest are going to be dovetailed together, so it is
important that the ends of the panels are perfectly square. This is hard to do
with a saw alone, so I built an improvised shooting board to help with planing
the ends square.</p>
<figure id="__yafg-figure-5">
<img alt="Shooting the end of a panel" class="image-process-photo" src="https://2pi.dk/2021/02/derivatives/photo/960w/Improvised shooting board.jpeg" srcset="https://2pi.dk/2021/02/derivatives/photo/640w/Improvised shooting board.jpeg 640w, https://2pi.dk/2021/02/derivatives/photo/960w/Improvised shooting board.jpeg 960w, https://2pi.dk/2021/02/derivatives/photo/1280w/Improvised shooting board.jpeg 1280w"/>
<figcaption>Crude shooting board for squaring up the ends of the panels.</figcaption>
</figure>
<p>I worked from both sides towards the middle of the panel. I didn’t want to risk
splitting the wood near the edge by planing the whole width. A proper shooting
board would have a fence to prevent the wood from splitting, but this method
worked okay with such a wide panel.</p>
<p>The sides are now ready for dovetailing.</p>
<h1>Siding Replacement Started</h1>
<p>Jakob Stoklund Olesen, 2021-01-14</p>
<p>The <span class="caps">HOA</span> is replacing the siding on my building. Scaffolding went up this week.
The old <abbr title="Exterior insulation finishing system"><span class="caps">EIFS</span></abbr> siding is not suitable for Portland weather, and water is leaking
into the building.</p>
<figure id="__yafg-figure-1">
<img alt="NW corner of building" class="image-process-photo" src="https://2pi.dk/2021/01/derivatives/photo/960w/12th-IrvingSt.jpeg" srcset="https://2pi.dk/2021/01/derivatives/photo/640w/12th-IrvingSt.jpeg 640w, https://2pi.dk/2021/01/derivatives/photo/960w/12th-IrvingSt.jpeg 960w, https://2pi.dk/2021/01/derivatives/photo/1280w/12th-IrvingSt.jpeg 1280w"/>
<figcaption>Scaffolding going up on the North side facing Irving St.</figcaption>
</figure>
<p>The project will last most of 2021 in two phases: First the East and North walls,
then the West and South walls. Of course, I have a North West corner apartment,
so I get to enjoy both phases.</p>
<figure id="__yafg-figure-2">
<img alt="SW corner of building" class="image-process-photo" src="https://2pi.dk/2021/01/derivatives/photo/960w/Hoyt-IrvingSt.jpeg" srcset="https://2pi.dk/2021/01/derivatives/photo/640w/Hoyt-IrvingSt.jpeg 640w, https://2pi.dk/2021/01/derivatives/photo/960w/Hoyt-IrvingSt.jpeg 960w, https://2pi.dk/2021/01/derivatives/photo/1280w/Hoyt-IrvingSt.jpeg 1280w"/>
<figcaption>Scaffolding in the back alley begins on the second floor.</figcaption>
</figure>
<p>The new siding won’t be in the faux Art Deco style; it will be something more modern looking.</p>
<h1>Wood for the Tool Chest</h1>
<p>Jakob Stoklund Olesen, 2020-12-29</p>
<p>Some years ago I read <a href="https://lostartpress.com/products/the-anarchists-tool-chest"><em>“The Anarchist’s Tool Chest”</em> by Christopher
Schwarz</a>. It’s a wonderful book describing many of the hand tools used in
the English/American tradition of woodworking. The book inspired me to get into
woodworking with hand tools, and to actually build the tool chest described in
the back of the book.</p>
<figure id="__yafg-figure-1">
<img alt="Finished tool chest built by Christopher Schwarz" class="image-process-photo" src="https://2pi.dk/2020/12/derivatives/photo/960w/schwarz-toolchest.jpeg" srcset="https://2pi.dk/2020/12/derivatives/photo/640w/schwarz-toolchest.jpeg 640w, https://2pi.dk/2020/12/derivatives/photo/960w/schwarz-toolchest.jpeg 960w, https://2pi.dk/2020/12/derivatives/photo/1280w/schwarz-toolchest.jpeg 1280w"/>
<figcaption>One of the many tool chests built by Chris Schwarz.</figcaption>
</figure>
<p>A description of the tool chest was also <a href="https://www.popularwoodworking.com/article/12-rules-for-tool-chests-2/">published in Popular Woodworking
Magazine</a>.</p>
<h2>Buying boards</h2>
<p>I’m going to build the chest in the recommended <a href="https://en.wikipedia.org/wiki/Pinus_strobus">Eastern White Pine</a>
which is lightweight, fairly strong, and easy to work with. It is much lighter
than the <a href="https://en.wikipedia.org/wiki/Douglas_fir">Douglas Fir</a> construction lumber I used to build my
workbench. <a href="http://www.crosscuthardwoods.com/species.html">Crosscut Hardwoods</a> in Portland sells white pine and lots
of other species of wood.</p>
<p>I was expecting them to carry 8’ long 6” wide boards of clear grade pine, so
before going to the lumberyard I figured out how many of those boards I would need.
When I got there, their pine boards were a mix of 10’, 11’, and 12’ long, and
in various widths between 4” and 12”. They also didn’t have a lot of clear grade
boards, so I had to get some of the furniture grade boards too. (“Furniture grade”
must be for upholstered furniture, because it has quite a few knots.)</p>
<figure id="__yafg-figure-2">
<img alt="Furniture grade white pine" class="image-process-photo" src="https://2pi.dk/2020/12/derivatives/photo/960w/Furniture grade white pine.jpeg" srcset="https://2pi.dk/2020/12/derivatives/photo/640w/Furniture grade white pine.jpeg 640w, https://2pi.dk/2020/12/derivatives/photo/960w/Furniture grade white pine.jpeg 960w, https://2pi.dk/2020/12/derivatives/photo/1280w/Furniture grade white pine.jpeg 1280w"/>
<figcaption>12 inch board of furniture grade white pine. Lots of big knots.</figcaption>
</figure>
<p>I had to do a lot of figuring on the spot, and ended up buying these boards:</p>
<table>
<thead>
<tr>
<th>Dimensions (inches)</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td>4 × 133 ½</td>
<td><span class="caps">FUR</span>, 12” damaged one end</td>
</tr>
<tr>
<td>4 × 144 ½</td>
<td><span class="caps">FUR</span>, big knot at 24”</td>
</tr>
<tr>
<td>4 × 144 ½</td>
<td><span class="caps">FUR</span>, big knot at 24”</td>
</tr>
<tr>
<td>4 × 121 ½</td>
<td><span class="caps">CLR</span>, damaged end</td>
</tr>
<tr>
<td>4 ¼ × 121 ½</td>
<td><span class="caps">CLR</span></td>
</tr>
<tr>
<td>6 × 121 ½</td>
<td><span class="caps">CLR</span>, bad end 8”</td>
</tr>
<tr>
<td>6 ¼ × 121 ½</td>
<td><span class="caps">CLR</span>, good</td>
</tr>
<tr>
<td>6 ¼ × 120 ¾</td>
<td><span class="caps">CLR</span></td>
</tr>
<tr>
<td>6 × 121 ½</td>
<td><span class="caps">CLR</span></td>
</tr>
<tr>
<td>11 ¾ × 144 ½</td>
<td><span class="caps">FUR</span>, some knots</td>
</tr>
<tr>
<td>12 × 144 ½</td>
<td><span class="caps">FUR</span>, some knots</td>
</tr>
</tbody>
</table>
<h2>Planning the cuts</h2>
<p>My tool chest is going to be 40” × 24” × 24”, so here’s how I am going to use the boards:</p>
<dl>
<dt>Sides</dt>
<dd>I can glue two 12” boards together to make panels for the sides of the box.
After planing, that won’t add up to exactly 24”, but that is okay. The height is
not a critical dimension.</dd>
<dt>Skirt</dt>
<dd>A 6” wide skirt goes around the bottom of the chest. This needs two boards
of 25” and two of 41”. I can cut these from my 6” boards. They have a couple
of small knots which won’t do any harm on the skirt.</dd>
<dt>Bottom</dt>
<dd>The bottom of the chest is made from eleven 4” wide boards joined with
tongue-and-groove. I can cut these from the furniture grade 4” boards. It’s not
a problem if there are a few knots or other blemishes in the bottom boards.</dd>
<dt>Lid frame</dt>
<dd>The rails and stiles of the lid are about 4” wide, and they need to be very
straight grain with no knots. I’ll make them from the clear 4” boards along with
any leftover bottom boards that are clear of knots.</dd>
<dt>Lid panel</dt>
<dd>The underside of the lid is going to be the most visible part of the chest
since the outside will be painted. The panel needs to be just over 16” wide
between the lid rails, so it can be made from three 6” wide boards. I’ll pick out
the best wood for this.</dd>
<dt>Lid skirt</dt>
<dd>There’s a narrow skirt on the lid and a dust seal on the chest right under
it. These 2” wide parts can be ripped from the remaining 6” boards. The parts
may have to be slightly narrower than 2” so I can rip 3 usable parts out of one
6” board.</dd>
</dl>
<h1>IEEE 754-2008 minNum and maxNum</h1>
<p>Jakob Stoklund Olesen, 2016-05-02</p>
<p>The <span class="caps">IEEE</span> 754-2008 standard for floating-point arithmetic defines <code>minNum(x, y)</code>
and <code>maxNum(x, y)</code> functions that compute the minimum and maximum of two
floating-point numbers, respectively. This is simple as long as one number is
greater than the other, but the functions have some aggravating corner cases:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">maxNum</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">&gt;</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">&lt;</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span>
        <span class="c1"># What is maxNum(+0, -0)?</span>
        <span class="k">return</span> <span class="n">bitand</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="c1"># What about NaN?</span>
    <span class="o">...</span>
</code></pre></div>
<h2>Negative zero</h2>
<p>The first corner case is <code>maxNum(+0, -0)</code>; should the result be <code>+0</code> or
<code>-0</code>? The <span class="caps">IEEE</span> standard leaves this up to the implementation. The <span class="caps">ARM</span> and <span class="caps">MIPS</span>
architectures both choose to compare as if <code>-0 &lt; +0</code> for the purposes of min and
max. This can be implemented as the bitwise <code>and</code> of <code>x</code> and <code>y</code> when they compare
equal using the normal floating-point equality as shown above.</p>
<p>The Intel <span class="caps">SSE</span> <code>maxss</code> and <code>maxsd</code> instructions mimic the C expression
<code>x &gt; y ? x : y</code>, so they would return <code>y</code> in this case:</p>
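<p>The <code>bitand</code> helper in the snippet above is not a Python built-in. A minimal sketch of it, assuming 64-bit doubles and using the standard <code>struct</code> module to get at the bit patterns, could look like this:</p>

```python
import math
import struct


def bitand(x, y):
    """Bitwise AND of the IEEE 754 binary64 encodings of two floats."""
    (xb,) = struct.unpack("<Q", struct.pack("<d", x))
    (yb,) = struct.unpack("<Q", struct.pack("<d", y))
    return struct.unpack("<d", struct.pack("<Q", xb & yb))[0]


# +0 and -0 differ only in the sign bit, and that bit survives the AND
# only when both operands are negative zero.
for x, y in [(0.0, -0.0), (-0.0, 0.0), (-0.0, -0.0)]:
    print(x, y, "->", math.copysign(1.0, bitand(x, y)))
```

<p>Any other pair of operands that compare equal has identical bit patterns, so the <code>and</code> returns that value unchanged.</p>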
<div class="highlight"><pre><span></span><code><span class="o">&gt;&gt;&gt;</span> <span class="n">maxss</span><span class="p">(</span><span class="o">+</span><span class="mi">0</span><span class="p">,</span> <span class="o">-</span><span class="mi">0</span><span class="p">)</span>
<span class="o">-</span><span class="mi">0</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">maxss</span><span class="p">(</span><span class="o">-</span><span class="mi">0</span><span class="p">,</span> <span class="o">+</span><span class="mi">0</span><span class="p">)</span>
<span class="o">+</span><span class="mi">0</span>
</code></pre></div>
<p>These instructions are not commutative.</p>
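<p>The session above can be reproduced in plain Python by modeling <code>maxss</code> as the C expression it mimics. This is only a model of the comparison, not of the instruction itself, and the <code>sign</code> helper is made up for printing, since Python shows both zeros as equal.</p>

```python
import math


def maxss(x, y):
    """Model of the C expression `x > y ? x : y`."""
    return x if x > y else y


def sign(value):
    """Label a zero by its sign bit (hypothetical helper for display)."""
    return "-0" if math.copysign(1.0, value) < 0 else "+0"


print(sign(maxss(+0.0, -0.0)))  # the comparison is false, so y is returned
print(sign(maxss(-0.0, +0.0)))  # false again, so y is returned
```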
<h2>Not a number</h2>
<p>What if one of the operands is a NaN? Since many other <span class="caps">IEEE</span> primitives like
addition and subtraction always produce a NaN when one of the operands is a NaN,
it would make sense for min and max to do the same. Indeed, <span class="caps">ARM</span> has an <code>fmax</code>
instruction which does that:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">fmax</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">></span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">bitand</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">NaN</span>
</code></pre></div>
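<p>The pseudocode above can be made runnable in Python. This is only a sketch of the semantics described here, not a model of the actual hardware: <code>bitand</code> is assumed to operate on the raw 64-bit pattern of a double, and the <code>bits</code> and <code>from_bits</code> helpers are names made up for the example.</p>

```python
import math
import struct


def bits(v: float) -> int:
    # Reinterpret a double as its 64-bit pattern.
    return struct.unpack("<Q", struct.pack("<d", v))[0]


def from_bits(b: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def bitand(x: float, y: float) -> float:
    # For equal operands the patterns differ at most in the sign bit
    # (+0 vs -0), so the bitwise and clears the sign unless both are -0.
    return from_bits(bits(x) & bits(y))


def fmax(x: float, y: float) -> float:
    if x > y:
        return x
    if x < y:
        return y
    if x == y:
        return bitand(x, y)
    # All three comparisons are false only when an operand is a NaN.
    return math.nan


print(math.copysign(1.0, fmax(+0.0, -0.0)))
print(math.copysign(1.0, fmax(-0.0, +0.0)))
print(fmax(math.nan, 1.0))
```

<p>Both zero cases give <code>+0</code> regardless of operand order, and a NaN operand on either side produces a NaN.</p>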
<p><span class="caps">IEEE</span>, however, treats NaN as a missing value for the purpose of the <code>minNum</code> and
<code>maxNum</code> functions. They will suppress a single NaN operand and return the
number instead:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">maxNum</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">></span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">oneOf</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">isNaN</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">isNaN</span><span class="p">(</span><span class="n">y</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">return</span> <span class="n">NaN</span>
</code></pre></div>
<p>The rationale for this behavior is not completely clear to me. This is from
<a href="http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF">Prof. Kahan’s
notes</a>:</p>
<blockquote>
<p>Some familiar functions have yet to be defined for NaN. For instance <code>max{x,
y}</code> should deliver the same result as <code>max{y, x}</code> but almost no
implementations do that when <code>x</code> is NaN. There are good reasons to define
<code>max{NaN, 5} := max{5, NaN} := 5</code> though many would disagree.</p>
</blockquote>
<p>There is further discussion of this choice in the comments on a <a href="https://github.com/JuliaLang/julia/issues/7866">Julia language
issue</a>.</p>
<h2>Signaling NaNs</h2>
<p>There are two types of NaNs: <em>quiet</em> and <em>signaling</em> NaNs. Quiet NaNs can be
produced by invalid operations like <code>0 / 0</code> or <code>∞ - ∞</code>, but signaling NaNs are
not produced by normal arithmetic operations, although they can be propagated by
certain functions like <code>negate</code> and <code>copySign</code> that only manipulate the sign bit.</p>
<p>All other operations are invalid with a signaling NaN operand, causing them to
produce a quiet NaN result. This includes the <code>minNum</code> and <code>maxNum</code>
functions:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">maxNum</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">isSignaling</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="ow">or</span> <span class="n">isSignaling</span><span class="p">(</span><span class="n">y</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">NaN</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">></span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">if</span> <span class="n">x</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">oneOf</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">isNaN</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">x</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">isNaN</span><span class="p">(</span><span class="n">y</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">y</span>
    <span class="k">return</span> <span class="n">NaN</span>
</code></pre></div>
<p>This means that these functions suppress quiet NaNs while signaling NaNs are
converted to a quiet NaN:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">maxNum</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">NaN</span><span class="p">)</span>
<span class="mi">1</span>
<span class="o">>>></span> <span class="n">maxNum</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">sNaN</span><span class="p">)</span>
<span class="n">NaN</span>
</code></pre></div>
<p>The conversion of signaling NaNs to quiet NaNs as they are propagated has the
unfortunate effect of making these functions non-associative:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">maxNum</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">maxNum</span><span class="p">(</span><span class="n">sNaN</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>
<span class="mi">1</span>
<span class="o">>>></span> <span class="n">maxNum</span><span class="p">(</span><span class="n">maxNum</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">sNaN</span><span class="p">),</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">2</span>
</code></pre></div>
<p>It is quite unfortunate that computing the maximum of a set containing the value
2 can produce the result 1. It seems that the answer should be either 2 or NaN.
This bad behavior when <code>maxNum</code> functions are chained appears because</p>
<ol>
<li>Signaling NaNs are propagated as quiet NaNs.</li>
<li>Quiet NaNs are suppressed rather than propagated.</li>
</ol>
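<p>The two rules can be reproduced in a few lines of Python. Python floats cannot portably carry a signaling NaN, so this sketch assumes a hypothetical <code>SignalingNaN</code> sentinel class in its place; everything else follows the <code>maxNum</code> pseudocode above, with <code>oneOf</code> simplified to returning the first operand.</p>

```python
import math


class SignalingNaN:
    """Hypothetical stand-in for a signaling NaN operand."""

    def __repr__(self) -> str:
        return "sNaN"


sNaN = SignalingNaN()


def max_num(x, y):
    # Rule 1: a signaling NaN operand is propagated as a quiet NaN.
    if isinstance(x, SignalingNaN) or isinstance(y, SignalingNaN):
        return math.nan
    if x > y:
        return x
    if x < y:
        return y
    if x == y:
        return x
    # Rule 2: a single quiet NaN operand is suppressed.
    if not math.isnan(x):
        return x
    if not math.isnan(y):
        return y
    return math.nan


print(max_num(1.0, max_num(sNaN, 2.0)))  # inner call yields a quiet NaN, which the outer call drops
print(max_num(max_num(1.0, sNaN), 2.0))  # same operands, different grouping
```

<p>The first grouping loses the value 2 because it was paired with the signaling NaN, while the second grouping loses the value 1 instead.</p>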
<p>Signaling NaNs also latch an <em>invalid operation</em> floating-point exception, but
very few programs use the floating-point status flags. Trapping on
floating-point exceptions is almost always disabled.</p>
<p>The ARMv8 instruction set has an <code>fmaxnmv</code> instruction which computes the <span class="caps">IEEE</span>
<code>maxNum</code> function across the lanes of a <span class="caps">SIMD</span> vector:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">fmaxnmv</span><span class="p">(</span><span class="n">v</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">maxNum</span><span class="p">(</span><span class="n">maxNum</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]),</span> <span class="n">maxNum</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="n">v</span><span class="p">[</span><span class="mi">3</span><span class="p">]))</span>
</code></pre></div>
<p>Since the underlying function is non-associative, we can get different results
simply by permuting the lanes of the <span class="caps">SIMD</span> vector:</p>
<div class="highlight"><pre><span></span><code><span class="o">>>></span> <span class="n">fmaxnmv</span><span class="p">([</span><span class="n">sNaN</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span>
<span class="mi">3</span>
<span class="o">>>></span> <span class="n">fmaxnmv</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">sNaN</span><span class="p">])</span>
<span class="mi">2</span>
</code></pre></div>
<p>This sets the <em>invalid operation</em> bit in the <code>FPSR</code> floating-point status
register as the only indication that something might be wrong.</p>
<h2>Mitigation</h2>
<p>So how can we compute the min or max of a set of floating-point numbers without
losing our sanity? There are a few options:</p>
<ul>
<li>Use min and max functions that propagate all NaN operands, like the <span class="caps">ARM</span> <code>fmax</code>
instruction described above. If any member of the set is a NaN (quiet or
signaling), the result will be NaN.</li>
<li>Use min and max functions that suppress all NaN operands, quiet or signaling.
This is valid to do in a C implementation since the C11 specification leaves
the handling of signaling NaNs implementation-defined. On <span class="caps">OS</span> X, this is how
the <code>&lt;math.h&gt;</code> <code>fmin()</code> and <code>fmax()</code> functions behave.</li>
<li>Detect <em>invalid operation</em> floating-point exceptions. Signaling NaN operands
cause operations to latch an invalid operation exception and return NaN. If
such an exception was latched while computing the maximum element of a set,
the result can’t be trusted.</li>
</ul>
Global Instruction Selection in LLVM, 2013-08-08, Jakob Stoklund Olesen (tag:2pi.dk,2013-08-08:/llvm/global-isel)
<p>This document summarizes the design of a global instruction selector for <span class="caps">LLVM</span> as
<a href="http://lists.llvm.org/pipermail/llvm-dev/2013-August/064696.html">proposed on llvmdev</a>.</p>
<h2>Design goals</h2>
<p>The global instruction selector is intended to replace both SelectionDAG
and fast isel. These are some of the goals for a new instruction
selector architecture:</p>
<ul>
<li><em>We want a global instruction selector</em>.
Make it possible to match complex patterns across basic blocks to
take advantage of addressing modes. Legalization of switches,
selects, and atomics can happen as part of the instruction selection
process, even if new basic blocks are required. Clean up ext/trunc
instructions across basic blocks after legalization of small
integer types.</li>
<li><em>We want a faster instruction selector</em>.
By selecting directly to <span class="caps">MI</span>, we can avoid one <span class="caps">IR</span> translation phase.
By using a linearized <span class="caps">IR</span>, scheduling becomes optional.</li>
<li><em>We want a shared code path for fast and good instruction selection</em>.
The -O0 instruction selector should simply be a trimmed down version
of the full optimizing instruction selector.</li>
<li><em>We want an <span class="caps">IR</span> that represents <span class="caps">ISA</span> concepts well</em>.
The MachineInstr <span class="caps">IR</span> is used to represent functions throughout
instruction selection, and targets are allowed to insert
pre-selected instructions at any time. Virtual registers can either
have (<span class="caps">EVT</span>, Register bank) or register class labels. Custom opaque
types can be added to represent target-specific values.</li>
<li><em>We want a configurable instruction selector</em>.
Weird targets have weird requirements, and it should be possible for
targets to inject new passes into the instruction selection process.
Sometimes, it may even be required to replace a standard pass.</li>
</ul>
<h2>Overall design</h2>
<p>The global instruction selector is implemented as a series of machine
function passes. Targets can inject their own passes or replace standard
passes, using the existing mechanism for configuring code generation pipelines.</p>
<p>The <span class="caps">MI</span> intermediate representation is extended with a set of abstract
target-independent opcodes very similar to the <span class="caps">ISD</span> opcodes used by
SelectionDAG. Virtual registers can be typed so they have an <span class="caps">EVT</span> instead
of a register class. A given type can be mapped to more than one
register class, controlled by a register bank label.</p>
<p>The standard passes are:</p>
<ol>
<li><span class="caps">MI</span> builder.</li>
<li>Switch lowering.</li>
<li>Iterative legalization and cleanup.</li>
<li>Register bank selection.</li>
<li>Common subexpression elimination.</li>
<li>Instruction selection.</li>
</ol>
<p>I’ll describe these passes in more detail below.</p>
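<p>As a rough illustration of how a target might customize this sequence, the following Python sketch treats the pipeline as an ordered list of pass names. The pass names and the <code>build_pipeline</code> helper are hypothetical; <span class="caps">LLVM</span>’s real pass configuration mechanism is a C++ <span class="caps">API</span> and is not modelled here.</p>

```python
# Standard passes in the order listed above (names are illustrative).
STANDARD_PASSES = [
    "mi-builder",
    "switch-lowering",
    "legalize",
    "regbank-select",
    "cse",
    "isel",
]


def build_pipeline(replace=None, insert_after=None):
    """Return the pass list after applying target overrides.

    replace:      maps a standard pass name to the target's replacement.
    insert_after: maps a standard pass name to extra passes run after it.
    """
    replace = replace or {}
    insert_after = insert_after or {}
    pipeline = []
    for name in STANDARD_PASSES:
        pipeline.append(replace.get(name, name))
        pipeline.extend(insert_after.get(name, []))
    return pipeline


# A hypothetical target that swaps in its own switch lowering and adds a
# cleanup pass after legalization.
print(build_pipeline(
    replace={"switch-lowering": "arm-switch-lowering"},
    insert_after={"legalize": ["target-cleanup"]},
))
```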
<h2><span class="caps">MI</span> builder</h2>
<p>The <span class="caps">MI</span> builder pass translates <span class="caps">LLVM</span> <span class="caps">IR</span> to <span class="caps">MI</span> <span class="caps">IR</span>. This pass will:</p>
<ul>
<li>Expand <span class="caps">LLVM</span> <span class="caps">IR</span> first-class aggregate types into their constituent parts.
This also includes splitting load, store, phi, and select instructions.</li>
<li>Replace pointer types with target-specific integers.</li>
<li>Expand getelementptr instructions with integer adds and multiplies.</li>
<li>Map <span class="caps">LLVM</span> instructions to target-independent abstract <span class="caps">MI</span> opcodes.</li>
<li>Lower <span class="caps">ABI</span> boundaries with help from target hooks.</li>
<li>Create a 1-1 mapping of the <span class="caps">LLVM</span> <span class="caps">IR</span> <span class="caps">CFG</span>. No new basic blocks are created.</li>
<li>Preserve switch instructions.</li>
<li>Preserve i1 logic instructions.</li>
<li>Not use MERGE_VALUES instructions. We’ll use a value map that can handle aggregates.</li>
<li>The aggregate type expansion creates value types that can all be represented
by EVTs, and MachineRegisterInfo will be extended to allow virtual registers
to have an <span class="caps">EVT</span> instead of a register class. EVTs are all the integer, floating
point, and vector types from <span class="caps">LLVM</span> <span class="caps">IR</span>.</li>
</ul>
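<p>One of the steps above, expanding getelementptr into integer arithmetic, can be sketched in Python. The tuple encoding of instructions, the <code>MUL</code> and <code>ADD</code> opcode names, and the virtual register naming are assumptions made for the example, and the strides are taken as already known byte sizes.</p>

```python
from itertools import count


def expand_gep(base, indices, strides, new_vreg):
    """Emit abstract ops computing base + sum(index * stride).

    Returns (instructions, result_vreg). Each instruction is a tuple
    (opcode, destination, operand, operand).
    """
    insts = []
    addr = base
    for index, stride in zip(indices, strides):
        scaled = new_vreg()
        insts.append(("MUL", scaled, index, stride))
        nxt = new_vreg()
        insts.append(("ADD", nxt, addr, scaled))
        addr = nxt
    return insts, addr


_ids = count()


def new_vreg():
    return f"%v{next(_ids)}"


# Address of element [i][j] in an array of rows that are 40 bytes wide,
# holding 4-byte elements.
insts, result = expand_gep("%base", ["%i", "%j"], [40, 4], new_vreg)
for inst in insts:
    print(inst)
print("result:", result)
```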
<p>The <span class="caps">ABI</span> boundary lowering requires types to be broken down further into
‘legal types’ that can be mapped to registers. The secondary breakdown
is currently handled by TargetLowering::LowerCallTo() calling
getRegisterType() and getNumRegisters(). Most ABIs are defined in terms
of C types, not <span class="caps">LLVM</span> <span class="caps">IR</span> types, so there is a close connection between
the C frontend and the <span class="caps">ABI</span> lowering code in the instruction selector. It
would be a good idea to have the <span class="caps">ABI</span> lowering code work independently of
the type system used during instruction selection. This is made possible
by having <span class="caps">ABI</span> lowering be part of the <span class="caps">LLVM</span> <span class="caps">IR</span> to <span class="caps">MI</span> translation process.</p>
<h2>Switch lowering</h2>
<p>The switch lowering pass converts switch instructions into a combination
of branch trees and jump tables. A default target-independent
implementation is possible, but targets may want to override this pass.
For example, the <span class="caps">ARM</span> target could try to create jump tables that would
work well with the <span class="caps">TBB</span>/<span class="caps">TBH</span> instructions to reduce code size.</p>
<h2>Legalization</h2>
<p>We’ll define legal types precisely:</p>
<blockquote>
<p>A type is considered legal if it can be loaded into and stored from an
allocatable register class. (And all allocatable register classes must
support copy instructions.)</p>
</blockquote>
<p>We don’t require any supported operations other than load and store for a
type to be legal, and all types that can be loaded and stored are legal.
This means that most targets will have a much larger set of legal types
than they do today. Also note that we don’t require targets to designate
a register class to use for each legal type; in fact, TableGen can
compute the legal type set automatically. Register classes can be
inferred from the selected instructions;
MachineRegisterInfo::recomputeRegClass() already knows how to do that.</p>
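<p>The definition is simple enough that the legal type set falls out of a table of load and store instructions. The Python sketch below shows the idea on made-up data; the tuple format, the register class names, and the <code>legal_types</code> helper are hypothetical and do not reflect how TableGen records are actually structured.</p>

```python
def legal_types(memory_insts, allocatable_classes):
    """Types that can be both loaded into and stored from an allocatable class.

    memory_insts: iterable of (kind, value_type, register_class) tuples,
    where kind is "load" or "store".
    """
    loadable = {(ty, rc) for kind, ty, rc in memory_insts if kind == "load"}
    storable = {(ty, rc) for kind, ty, rc in memory_insts if kind == "store"}
    return {ty for ty, rc in loadable & storable if rc in allocatable_classes}


insts = [
    ("load", "i32", "GPR"), ("store", "i32", "GPR"),
    ("load", "f32", "GPR"), ("store", "f32", "GPR"),
    ("load", "v4f32", "VEC"), ("store", "v4f32", "VEC"),
    ("load", "i1", "FLAGS"), ("store", "i1", "FLAGS"),  # class is not allocatable
    ("load", "i64", "GPR"),                             # no matching store
]

print(sorted(legal_types(insts, {"GPR", "VEC"})))
```

<p>Note that <code>f32</code> is legal here even though the example target only has an integer register class that can hold it, which matches the point that no other operations are required.</p>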
<p>With a much larger set of legal types, a separate type legalization
phase becomes superfluous. The operation legalizer must be able to do
anything the type legalizer can do anyway, so the type legalizer isn’t
adding any value.</p>
<p>The legalizer will work bottom-up and iteratively. As illegal
instructions are visited, the legalizer will apply transformations that
make the current instruction ‘more legal’. Each transformation is
allowed to modify a whole chain of single-use instructions for
efficiency, but it is also allowed to only modify the current
instruction in hard cases. The instructions created don’t need to be
legal; the legalizer iterates until the current instruction is legal
before it moves on.</p>
<p>The set of new instructions created while legalizing a single
instruction is fed through an instruction simplifier that cleans up
redundancies. This replaces DAGCombine.</p>
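<p>The iterate-until-legal loop might look roughly like this in Python. It is a toy under strong assumptions: instructions are just <code>(operation, bit width)</code> pairs, the target is assumed to support only 8, 16, and 32-bit operations, wide operations are split in half without threading carries, and the simplifier step is left out.</p>

```python
LEGAL_WIDTHS = {8, 16, 32}  # hypothetical target


def is_legal(inst):
    _, width = inst
    return width in LEGAL_WIDTHS


def make_more_legal(inst):
    """Apply one transformation; the result need not be legal yet."""
    op, width = inst
    if width > 32:
        half = width // 2
        return [(op, half), (op, half)]  # a real split would also handle carries
    # Promote an odd narrow width to the next legal width.
    return [(op, min(w for w in LEGAL_WIDTHS if w >= width))]


def legalize(block):
    out = []
    for inst in reversed(block):  # bottom-up over the block
        work = [inst]
        while work:  # iterate until everything derived from inst is legal
            cur = work.pop()
            if is_legal(cur):
                out.append(cur)
            else:
                work.extend(make_more_legal(cur))
    out.reverse()
    return out


print(legalize([("add", 128), ("xor", 1)]))
```

<p>The 128-bit add takes two rounds: it is first split into two 64-bit adds, which are still illegal, and each of those is split again before the legalizer moves on.</p>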
<h2>Register bank selection</h2>
<p>Many instruction set architectures have multiple register banks. X86 has
three: integer, vector, and x87 floating-point registers. (Four if you
count <span class="caps">MMX</span> registers as a separate bank.) Blackfin and m68k have separate
pointer and data register banks, etc. It is also common to have the same
operations available in multiple register banks. For example, most ISAs
with a vector register bank support bitwise and/or/xor operations in
both the integer and vector register banks.</p>
<p>The global instruction selector will assign virtual registers to
register banks explicitly. The set of register banks is typically small
(2-3) and defined by the target. Modelling register banks explicitly
makes it possible to move operations between register banks to minimize
cross-bank copies which are often expensive. <span class="caps">SPARC</span> even requires
cross-bank copies to go through memory, as does x86 in some cases.</p>
<p>The register bank selection pass computes the optimal bank assignment
and inserts copy instructions when a value needs to cross banks.
Sometimes, it may be beneficial to have the same value available in two
register banks simultaneously; this can also be represented with
cross-bank copy instructions. The bank selection can also be affected by
register pressure concerns. On x86-64, for example, many i32 values
could be moved to the <span class="caps">SSE</span> registers, freeing up the integer registers.</p>
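<p>The copy insertion half of this pass is easy to sketch; computing an optimal assignment is not attempted here. In the Python below, bank assignments are assumed to be given, instructions are hypothetical <code>(opcode, bank, destination, sources)</code> tuples, and the copied register names are made up by appending the bank name.</p>

```python
def insert_cross_bank_copies(insts, live_in_banks):
    """Insert COPY instructions wherever an operand lives in another bank.

    insts:         list of (opcode, bank, dst, srcs) tuples in program order.
    live_in_banks: maps each live-in virtual register to its bank.
    """
    bank_of = dict(live_in_banks)
    copies = {}  # (vreg, bank) -> copy already available in that bank
    out = []
    for opcode, bank, dst, srcs in insts:
        new_srcs = []
        for src in srcs:
            if bank_of[src] != bank:
                key = (src, bank)
                if key not in copies:
                    copies[key] = f"{src}.{bank}"
                    out.append(("COPY", bank, copies[key], [src]))
                src = copies[key]
            new_srcs.append(src)
        out.append((opcode, bank, dst, new_srcs))
        bank_of[dst] = bank
    return out


program = [
    ("FADD", "vec", "%b", ["%a", "%a"]),
    ("ADD", "gpr", "%c", ["%a", "%b"]),
]
for inst in insert_cross_bank_copies(program, {"%a": "gpr"}):
    print(inst)
```

<p>After the pass, <code>%a</code> is available in both banks at once: the original in the integer bank and a single shared copy in the vector bank.</p>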
<h2>Instruction selection</h2>
<p>The instruction selection pass replaces most target-independent
instructions with real target opcodes, and it ensures that all virtual
registers have register classes instead of EVTs. Some target-independent
instructions, like <span class="caps">COPY</span>, are not lowered until after register allocation.</p>
<p>SelectionDAG instruction selection is controlled only by expression
types, and the selected instructions are expected to use the unique
register banks selected by the type:</p>
<div class="highlight"><pre><span></span><code>(operation, type) -> opcode
</code></pre></div>
<p>We’re representing register banks explicitly, and many operations are
available in multiple banks, so the register bank needs to be part of
the instruction selection:</p>
<div class="highlight"><pre><span></span><code>(operation, type, bank) -> opcode
</code></pre></div>
<p>On the other hand, when types are not
used to select register banks, it becomes really difficult to explain
the difference between load i32 and load f32. The hardware doesn’t care
either; it simply knows how to load 32 bits into a given register. We
can use a three-level hierarchical type system to better describe this:</p>
<ul>
<li>Bit width. Operations like load, store, select, and the bitwise
and/or/xor only depend on the number of bits in the operands. Their
effect is independent of the vector lane structure and whether lanes
are ints or floats.</li>
<li>Vector lanes + lane width. Operations like shufflevector and
insertelement depend on the vector topology of the operand types,
but they don’t care if vector lanes are floats or ints.</li>
<li>Full EVTs. Finally, arithmetic instructions actually depend on the
full type of their operands. It is worth noting, though, that the
int/float distinction is already redundantly encoded in
the operations. <span class="caps">LLVM</span> <span class="caps">IR</span> has separate add and fadd instructions,
for example.</li>
</ul>
<p>The instruction selection will only use the relevant parts of the
operand type, depending on the operation being matched. It will consider
load i32, load f32, and load v2i16 to be simply 32-bit wide loads. The
target is likely to have multiple 32-bit load instructions. The explicit
register bank is used to pick the right one. Note that some big-endian
targets (<span class="caps">ARM</span> and <span class="caps">MIPS</span> in <span class="caps">BE</span> mode) still number their vector lanes
right-to-left. This means that vector loads get shuffled differently
depending on the lane size being loaded, and we’ll need to make it
possible to use different load instructions for different vector lane
sizes. (On such targets, bitcasts become shuffles too.)</p>
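<p>A small Python sketch of the <code>(operation, type, bank) -> opcode</code> lookup with the three-level type hierarchy. The <code>Type</code> class, the operation groupings, the bank names, and the opcode names in the table are all hypothetical, and the big-endian lane numbering complication is ignored.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type:
    lanes: int
    lane_bits: int
    is_float: bool

    @property
    def bits(self) -> int:
        return self.lanes * self.lane_bits


WIDTH_ONLY = {"load", "store", "select", "and", "or", "xor"}
LANE_SHAPE = {"shufflevector", "insertelement"}


def type_key(op, ty):
    """Keep only the part of the type that the operation depends on."""
    if op in WIDTH_ONLY:
        return ("bits", ty.bits)
    if op in LANE_SHAPE:
        return ("lanes", ty.lanes, ty.lane_bits)
    return ("full", ty.lanes, ty.lane_bits, ty.is_float)


# Hypothetical target opcode table.
OPCODES = {
    ("load", ("bits", 32), "gpr"): "LOAD32_GPR",
    ("load", ("bits", 32), "vec"): "LOAD32_VEC",
    ("add", ("full", 1, 32, False), "gpr"): "ADD32_GPR",
}


def select(op, ty, bank):
    return OPCODES[(op, type_key(op, ty), bank)]


i32 = Type(lanes=1, lane_bits=32, is_float=False)
f32 = Type(lanes=1, lane_bits=32, is_float=True)
v2i16 = Type(lanes=2, lane_bits=16, is_float=False)

for ty in (i32, f32, v2i16):
    print(select("load", ty, "gpr"), select("load", ty, "vec"))
```

<p>All three types select the same load for a given bank; only the bank changes which 32-bit load is picked.</p>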
<p>Apart from the way operations and types are matched, the instruction
selection algorithm is a lot like the current SelectionDAG algorithm.
The process is less transactional than it is in SelectionDAG.
Specifically, targets are allowed to insert pre-selected instructions at
any time.</p>
<p>The instruction selection algorithm is global, which means it can look
upwards in the dominator tree when matching patterns. A cost model is
required to determine when it is a good idea to fold instructions
outside the current loop. The same applies to matching instructions with
multiple uses.</p>