1. MIPS simulator
QtMips simulator
Download, unzip, and run the desktop version. Or, you can use the web version, but be aware that the graphics may be glitchy.
Example code, for later: https://github.com/cvut/QtMips-Playground
2. Data hazard with lw
lw $x, 0xAAAA($y)
sw $x, 0xBBBB($z) ; <-- data hazard: $x is updated then used immediately
2.1. Single-cycle datapath
In QtMips, set up the most basic single-cycle datapath that we began with:
- File -> New Simulation
- Select "No pipeline no cache"
- Start empty

Load the built-in example:
- File -> Examples -> simple-lw-sw-ia.S

Study this example. What does it do? How does it do it?
⋯ (stop scrolling and take a look!)⋯
This assembler syntax is a subset of gas, the GNU Assembler:
- .globl _start declares _start as a global label (across all input files)
- .set noreorder prevents the assembler from adding nop or otherwise optimizing the code
- .text marks what follows as machine code
- loop: is a symbolic label on a memory address (otherwise beq $0, $0, loop needs to be hard-coded with the correct offset)
- .data marks what follows as NOT machine code, just variables and constants
- .org places what follows at a specific memory address (0x2000)
Notice that the variable at 0x2000 is set to 0x12345678 and the second variable is set to zero. This second variable (at 0x2004 since the previous entity was a 4-byte MIPS word) is what you want to watch when stepping through the code.
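To make the directive list concrete, here is a minimal sketch of a program with the same shape. It is not the verbatim built-in file: the label src_val, the trailing nop, and the exact operand forms are assumptions for illustration, so compare it against what File -> Examples actually loads.

```
# Sketch only: same shape as the example, not a copy of it.
.globl _start
.set noreorder

.text
_start:
loop:
    lw   $2, 0x2000($0)    # load the word stored at 0x2000 into $2
    sw   $2, 0x2004($0)    # store $2 to 0x2004, the location to watch
    beq  $0, $0, loop      # while(1): always branch back to loop
    nop                    # assumption: harmless filler after the branch

.data
.org 0x2000
src_val:                   # hypothetical label name
    .word 0x12345678       # source value at 0x2000
dst_val:
    .word 0                # destination at 0x2004, initially zero
```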
Assemble and load this code:
- image::data/qtmips-button-compile.png[]
- Machine -> Compile Source
- Ctrl-E
The #pragmas ensure that you are at least seeing the Registers and Memory windows.
The ending #pragma also conveniently ensures the Memory window is scrolled to the relevant location.
- Single-step through the program with the step button or Ctrl-T.
- Count how many steps until memory 0x2004 gets updated (you should get 3).
Switch from the simple-lw-sw-ia.S source code tab to the Core tab.
- Compile and load again (Ctrl-E)
- Watch the datapath image update values of various registers and wires.
- Repeat this cycle for a bit to get a handle on how the simulator displays the datapath and updates at each step. Your eye-brain system is good at finding changes in a scene.
A large screen is super helpful for watching the datapath! Be sure to notice the colored current instructions at the top and the information at the bottom (Cycles … Stalls).
2.2. Pipeline with no features
Set up a new datapath:
- File -> New Simulation
- Custom
  - Core:
    - Pipelined
    - (no) Hazard
  - Program cache:
    - Not enabled
  - Data cache:
    - Not enabled
- How many cycles does it take for 0x2004 to get updated? You should count 9 cycles.
¿But aren’t pipelines supposed to increase performance?
- Insert at least 4 nop instructions between the sw and the while(1) beq, then compile and run again.
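As a hedged sketch, the modified loop might look like the fragment below. It assumes the example's loop is a lw from 0x2000 followed by a sw to 0x2004 using register $2, and it shows only the .text part; the .data section stays as it is in the example.

```
loop:
    lw   $2, 0x2000($0)    # load the source word into $2
    sw   $2, 0x2004($0)    # still immediately after the lw: the hazard remains
    nop                    # four nops of padding between the sw and the branch
    nop
    nop
    nop
    beq  $0, $0, loop      # while(1)
```

The padding only delays the branch. It does nothing about the sw reading $2 right after the lw.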
The dst_val location does not get updated until the second time through the loop.
This is a programming BUG; there is no reason that this lw / sw combination needs to be executed twice.
Welcome to a pipeline data hazard.
How do you ensure that register $2 has the correct value by the time that the sw instruction reads the $2 value?
- On this pipeline, the destination register of a load instruction cannot be used as a source for at least 2 cycles after the load.
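One way to honor that rule in software, sketched under the same assumptions about the example (lw from 0x2000, sw to 0x2004, register $2), is to separate the two instructions with two nops. Treat this as an illustration to try in the simulator, not as the handout's official solution.

```
loop:
    lw   $2, 0x2000($0)    # $2 is not written back until later in the pipeline
    nop                    # spacer 1: does not touch $2
    nop                    # spacer 2: does not touch $2
    sw   $2, 0x2004($0)    # per the 2-cycle rule, $2 should now be up to date
    nop
    nop
    nop
    nop
    beq  $0, $0, loop      # while(1)
```

If the rule holds, 0x2004 should receive 0x12345678 on the first pass through the loop rather than the second.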
2.3. Pipeline with hazard detection and stall
Add the hazard detection unit:
- File -> New Simulation
- Custom
  - Core:
    - Pipelined
    - (YES) Hazard
      - Stall when hazard is detected
  - Program cache:
    - Not enabled
  - Data cache:
    - Not enabled
- Run the modified example code:
  - no nops between lw / sw
  - at least 4 nops before beq
How many cycles? (should be 7)
- Notice how the datapath inserts exactly two nops (bubbles) into the pipeline when the control unit (in the second stage, ID) sees the use of $2 right after the lw.
- If you haven’t already seen how the MUXes change to route signals … now you know. It’s interesting to watch the sequencing.
Having a hazard detector + stall built into the datapath removes the programmer’s responsibility for this bug.
It is A-OK if the programmer does add two nops, but the two-cycle delay happens either way, so there is no advantage.
What is better is the programmer knowing this behavior and placing two unrelated but useful instructions after the lw: no stall and higher performance.
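A sketch of that scheduling idea follows. The two addi instructions and registers $8 and $9 are hypothetical stand-ins for whatever useful work a real program has available; the only requirement is that neither instruction reads or writes $2.

```
loop:
    lw   $2, 0x2000($0)    # load into $2
    addi $8, $8, 1         # hypothetical useful work, independent of $2
    addi $9, $9, 4         # hypothetical useful work, independent of $2
    sw   $2, 0x2004($0)    # two instructions later, so no stall should be needed
    nop
    nop
    nop
    nop
    beq  $0, $0, loop      # while(1)
```

Watching the Stalls counter at the bottom of the Core tab is a quick way to check whether the hazard unit still inserts bubbles.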
2.4. Pipeline with forwarding
Now add forwarding: a shortcut that routes a value destined for a register to the instruction that needs it one or two cycles early, before it has been written back.
- File -> New Simulation
- Custom
  - Core:
    - Pipelined
    - (YES) Hazard
      - Stall or forward when hazard is detected
  - Program cache:
    - Not enabled
  - Data cache:
    - Not enabled
- Run the modified example code:
  - no nops between lw / sw
  - at least 4 nops before beq
How many cycles? (should now be 6)
Notice how there is still one stall? Even with forwarding, there is still one cycle of pause.
An assembly programmer who understands this will automatically look for another instruction, one that does not use that particular register, to place right after a lw.
There is still opportunity for the programmer to optimize performance.
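With forwarding enabled, a single independent instruction after the lw may be enough to hide the remaining stall. The fragment below is a sketch under that assumption; the addi and register $8 are hypothetical placeholders for real work.

```
loop:
    lw   $2, 0x2000($0)    # loaded value becomes available after the memory stage
    addi $8, $8, 1         # hypothetical independent instruction in the load-use slot
    sw   $2, 0x2004($0)    # $2 is expected to arrive via forwarding, without a stall
    nop
    nop
    nop
    nop
    beq  $0, $0, loop      # while(1)
```

Compare the Stalls count for this version against the version with the sw directly after the lw to confirm the effect in the simulator.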
The single-cycle datapath has no such limitation, so why do we pipeline, again?
(that would be a great test question!)