# Copyright (c) 2014 ARM Limited
# All rights reserved
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder.  You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# Authors: Andrew Bardsley

namespace Minor
{

/*!

\page minor Inside the Minor CPU model

\tableofcontents

This document contains a description of the structure and function of the
Minor gem5 in-order processor model.  It is recommended reading for anyone who
wants to understand Minor's internal organisation, design decisions, C++
implementation and Python configuration.  A familiarity with gem5 and some of
its internal structures is assumed.  This document is meant to be read
alongside the Minor source code and to explain its general structure without
being too slavish about naming every function and data type.

\section whatis What is Minor?

Minor is an in-order processor model with a fixed pipeline but configurable
data structures and execute behaviour.  It is intended to be used to model
processors with strict in-order execution behaviour and allows visualisation
of an instruction's position in the pipeline through the
MinorTrace/minorview.py format/tool.  The intention is to provide a framework
for micro-architecturally correlating the model with a particular, chosen
processor with similar capabilities.

\section philo Design philosophy

\subsection mt Multithreading

The model isn't currently capable of multithreading but there are THREAD
comments in key places where stage data needs to be arrayed to support
multithreading.

\subsection structs Data structures

Decorating data structures with large amounts of life-cycle information is
avoided.  Only instructions (MinorDynInst) contain a significant proportion of
their data content whose values are not set at construction.

All internal structures have fixed sizes on construction.  Data held in queues
and FIFOs (MinorBuffer, FUPipeline) should have a BubbleIF interface to
allow a distinct 'bubble'/no data value option for each type.

Inter-stage 'struct' data is packaged in structures which are passed by value.
Only MinorDynInst, the line data in ForwardLineData and the memory-interfacing
objects Fetch1::FetchRequest and LSQ::LSQRequest are '::new' allocated while
running the model.
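
The 'bubble' convention and the fixed-size rule can be illustrated with a
small sketch.  This is not Minor's code: ExampleData and FixedFifo are
hypothetical names, and the static bubble() factory is an assumption made for
the example.  Only the idea of an isBubble() predicate and a capacity fixed at
construction comes from the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical payload type: a default-constructed value is the bubble.
struct ExampleData
{
    bool valid = false;
    int payload = 0;

    bool isBubble() const { return !valid; }
    static ExampleData bubble() { return ExampleData(); }
};

// Hypothetical FIFO whose capacity is fixed at construction.  Empty slots
// read back as bubbles, so callers never see an uninitialised element.
template <typename T>
class FixedFifo
{
    std::vector<T> slots;
    std::size_t head = 0;
    std::size_t count = 0;

  public:
    explicit FixedFifo(std::size_t capacity) :
        slots(capacity, T::bubble())
    {}

    bool full() const { return count == slots.size(); }
    bool empty() const { return count == 0; }

    // Returns false, leaving the FIFO unchanged, if there is no free slot.
    bool push(const T &value)
    {
        if (full())
            return false;
        slots[(head + count) % slots.size()] = value;
        ++count;
        return true;
    }

    // Returns a bubble when the FIFO is empty.
    T pop()
    {
        if (empty())
            return T::bubble();
        T value = slots[head];
        slots[head] = T::bubble();
        head = (head + 1) % slots.size();
        --count;
        return value;
    }
};
```

Because an empty pop yields a bubble rather than an error, a consumer can
treat "nothing arrived this cycle" and "a bubble arrived" identically.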

\section model Model structure

Objects of class MinorCPU are provided by the model to gem5.  MinorCPU
implements the interfaces declared in cpu.hh and can provide data and
instruction interfaces for connection to a cache system.  The model is
configured in a similar way to other gem5 models through Python.  That
configuration is passed on to MinorCPU::pipeline (of class Pipeline) which
actually implements the processor pipeline.

The hierarchy of major unit ownership from MinorCPU down looks like this:

<ul>
<li>MinorCPU</li>
<ul>
    <li>Pipeline - container for the pipeline, owns the cyclic 'tick'
    event mechanism and the idling (cycle skipping) mechanism.</li>
    <ul>
        <li>Fetch1 - instruction fetch unit responsible for fetching cache
            lines (or parts of lines) from the I-cache interface</li>
        <ul>
            <li>Fetch1::IcachePort - interface to the I-cache from
                Fetch1</li>
        </ul>
        <li>Fetch2 - line to instruction decomposition</li>
        <li>Decode - instruction to micro-op decomposition</li>
        <li>Execute - instruction execution and data memory
            interface</li>
        <ul>
            <li>LSQ - load store queue for memory ref. instructions</li>
            <li>LSQ::DcachePort - interface to the D-cache from
                Execute</li>
        </ul>
    </ul>
</ul>
</ul>

\section keystruct Key data structures

\subsection ids Instruction and line identity: InstId (dyn_inst.hh)

An InstId contains the sequence numbers and thread numbers that describe the
life cycle and instruction stream affiliations of individual fetched cache
lines and instructions.

An InstId is printed in one of the following forms:

    - T/S.P/L - for fetched cache lines
    - T/S.P/L/F - for instructions before Decode
    - T/S.P/L/F.E - for instructions from Decode onwards

for example:

    - 0/10.12/5/6.7

InstId's fields are:

<table>
<tr>
    <td><b>Field</b></td>
    <td><b>Symbol</b></td>
    <td><b>Generated by</b></td>
    <td><b>Checked by</b></td>
    <td><b>Function</b></td>
</tr>

<tr>
    <td>InstId::threadId</td>
    <td>T</td>
    <td>Fetch1</td>
    <td>Everywhere the thread number is needed</td>
    <td>Thread number (currently always 0).</td>
</tr>

<tr>
    <td>InstId::streamSeqNum</td>
    <td>S</td>
    <td>Execute</td>
    <td>Fetch1, Fetch2, Execute (to discard lines/insts)</td>
    <td>Stream sequence number as chosen by Execute.  Stream
        sequence numbers change after changes of PC (branches, exceptions) in
        Execute and are used to separate pre and post branch instruction
        streams.</td>
</tr>

<tr>
    <td>InstId::predictionSeqNum</td>
    <td>P</td>
    <td>Fetch2</td>
    <td>Fetch2 (while discarding lines after prediction)</td>
    <td>Prediction sequence numbers represent branch prediction decisions.
        This is used by Fetch2 to mark lines/instructions according to the
        last followed branch prediction made by Fetch2.  Fetch2 can signal to
        Fetch1 that it should change its fetch address and mark lines with a
        new prediction sequence number (which it will only do if the stream
        sequence number Fetch1 expects matches that of the request).</td>
</tr>

<tr>
    <td>InstId::lineSeqNum</td>
    <td>L</td>
    <td>Fetch1</td>
    <td>(Just for debugging)</td>
    <td>Line fetch sequence number of this cache line or the line
        this instruction was extracted from.</td>
</tr>

<tr>
    <td>InstId::fetchSeqNum</td>
    <td>F</td>
    <td>Fetch2</td>
    <td>Fetch2 (as the inst. sequence number for branches)</td>
    <td>Instruction fetch order assigned by Fetch2 when lines
        are decomposed into instructions.</td>
</tr>

<tr>
    <td>InstId::execSeqNum</td>
    <td>E</td>
    <td>Decode</td>
    <td>Execute (to check instruction identity in queues/FUs/LSQ)</td>
    <td>Instruction order after micro-op decomposition.</td>
</tr>

</table>

The sequence number fields are all independent of each other.  Although, for
instance, InstId::execSeqNum for an instruction will always be >=
InstId::fetchSeqNum, the comparison is not useful.

The originating stage of each sequence number field keeps a counter for that
field which can be incremented in order to generate new, unique numbers.
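
The three printed forms can be sketched as follows.  ExampleInstId and
formatId are hypothetical names used only for illustration; the field names
follow the table above, and treating a sequence number of 0 as "not yet
assigned" is an assumption of this sketch rather than something stated in this
document.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for InstId; field names follow the table above.
// Assumption: a sequence number of 0 means "not yet assigned".
struct ExampleInstId
{
    int threadId = 0;                       // T
    std::uint64_t streamSeqNum = 0;         // S
    std::uint64_t predictionSeqNum = 0;     // P
    std::uint64_t lineSeqNum = 0;           // L
    std::uint64_t fetchSeqNum = 0;          // F
    std::uint64_t execSeqNum = 0;           // E
};

// Produce T/S.P/L, T/S.P/L/F or T/S.P/L/F.E depending on how far the
// line or instruction has progressed through the pipeline.
std::string
formatId(const ExampleInstId &id)
{
    std::ostringstream os;
    os << id.threadId << '/' << id.streamSeqNum << '.'
       << id.predictionSeqNum << '/' << id.lineSeqNum;
    if (id.fetchSeqNum != 0) {
        os << '/' << id.fetchSeqNum;
        if (id.execSeqNum != 0)
            os << '.' << id.execSeqNum;
    }
    return os.str();
}
```

With S=10, P=12, L=5, F=6 and E=7 on thread 0 this yields the example string
shown earlier, 0/10.12/5/6.7.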

\subsection insts Instructions: MinorDynInst (dyn_inst.hh)

MinorDynInst represents an instruction's progression through the pipeline.  An
instruction can be one of three things:

<table>
<tr>
    <td><b>Thing</b></td>
    <td><b>Predicate</b></td>
    <td><b>Explanation</b></td>
</tr>
<tr>
    <td>A bubble</td>
    <td>MinorDynInst::isBubble()</td>
    <td>no instruction at all, just a space-filler</td>
</tr>
<tr>
    <td>A fault</td>
    <td>MinorDynInst::isFault()</td>
    <td>a fault to pass down the pipeline in an instruction's clothing</td>
</tr>
<tr>
    <td>A decoded instruction</td>
    <td>MinorDynInst::isInst()</td>
    <td>instructions are actually passed to the gem5 decoder in Fetch2 and so
    are created fully decoded.  MinorDynInst::staticInst is the decoded
    instruction form.</td>
</tr>
</table>

Instructions are reference counted using the gem5 RefCountingPtr
(base/refcnt.hh) wrapper.  They therefore usually appear as MinorDynInstPtr in
code.  Note that as RefCountingPtr initialises as nullptr rather than an
object that supports BubbleIF::isBubble, passing raw MinorDynInstPtrs to
Queue%s and other similar structures from stage.hh without boxing is
dangerous.

\subsection fld ForwardLineData (pipe_data.hh)

ForwardLineData is used to pass cache lines from Fetch1 to Fetch2.  Like
MinorDynInst%s, they can be bubbles (ForwardLineData::isBubble()),
fault-carrying or can contain a line (partial line) fetched by Fetch1.  The
data carried by ForwardLineData is owned by a Packet object returned from
memory and is explicitly memory managed, and so must be deleted once processed
(by Fetch2 deleting the Packet).

\subsection fid ForwardInstData (pipe_data.hh)

ForwardInstData can contain up to ForwardInstData::width() instructions in its
ForwardInstData::insts vector.  This structure is used to carry instructions
between Fetch2, Decode and Execute and to store input buffer vectors in Decode
and Execute.

\subsection fr Fetch1::FetchRequest (fetch1.hh)

FetchRequests represent I-cache line fetch requests.  They are used in the
memory queues of Fetch1 and are pushed into/popped from Packet::senderState
while traversing the memory system.

FetchRequests contain a memory system Request (mem/request.hh) for that fetch
access, a packet (Packet, mem/packet.hh), if the request gets to memory, and a
fault field that can be populated with a TLB-sourced prefetch fault (if any).

\subsection lsqr LSQ::LSQRequest (execute.hh)

LSQRequests are similar to FetchRequests but for D-cache accesses.  They carry
the instruction associated with a memory access.

\section pipeline The pipeline

\verbatim
------------------------------------------------------------------------------
    Key:

    [] : inter-stage MinorBuffer
    ,--.
    |  | : pipeline stage
    `--'
    ---> : forward communication
    <--- : backward communication

    rv : reservation information for input buffers

                ,------.     ,------.     ,------.     ,-------.
 (from  --[]-v->|Fetch1|-[]->|Fetch2|-[]->|Decode|-[]->|Execute|--> (to Fetch1
 Execute)    |  |      |<-[]-|      |<-rv-|      |<-rv-|       |     & Fetch2)
             |  `------'<-rv-|      |     |      |     |       |
             `-------------->|      |     |      |     |       |
                             `------'     `------'     `-------'
------------------------------------------------------------------------------
\endverbatim

The four pipeline stages are connected together by MinorBuffer FIFO
(stage.hh, derived ultimately from TimeBuffer) structures which allow
inter-stage delays to be modelled.  There is a MinorBuffer between each pair
of adjacent stages in the forward direction (for example: passing lines from
Fetch1 to Fetch2) and, between Fetch2 and Fetch1, a buffer in the backwards
direction carrying branch predictions.

Stages Fetch2, Decode and Execute have input buffers which, each cycle, can
accept input data from the previous stage and can hold that data if the stage
is not ready to process it.  Input buffers store data in the same form as it
is received and so Decode and Execute's input buffers contain the output
instruction vector (ForwardInstData (pipe_data.hh)) from their previous stages
with the instructions and bubbles in the same positions as a single buffer
entry.

Stage input buffers provide a Reservable (stage.hh) interface to their
previous stages, to allow slots to be reserved in their input buffers, and
communicate their input buffer occupancy backwards to allow the previous stage
to plan whether it should make an output in a given cycle.
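
The reservation idea can be sketched as simple slot bookkeeping.  This is a
minimal illustration under stated assumptions: ExampleInputBuffer and all of
its method names are hypothetical and are not taken from the Reservable
interface in stage.hh.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for a fixed-size input buffer.  The method names
// are illustrative and are not taken from stage.hh.
class ExampleInputBuffer
{
    std::size_t capacity;
    std::size_t occupied = 0;   // slots holding data
    std::size_t reserved = 0;   // slots promised but not yet filled

  public:
    explicit ExampleInputBuffer(std::size_t capacity_) :
        capacity(capacity_)
    {}

    // Occupancy information reported backwards to the previous stage.
    std::size_t freeSlots() const { return capacity - occupied - reserved; }

    bool canReserve() const { return freeSlots() > 0; }

    // The previous stage claims a slot before starting work.
    bool reserve()
    {
        if (!canReserve())
            return false;
        ++reserved;
        return true;
    }

    // Data arrives for an earlier reservation.
    bool fill()
    {
        if (reserved == 0)
            return false;
        --reserved;
        ++occupied;
        return true;
    }

    // The owning stage has consumed one entry.
    bool consume()
    {
        if (occupied == 0)
            return false;
        --occupied;
        return true;
    }
};
```

A producer that checks canReserve() before starting work never has to drop
its output, because the slot it needs is already accounted for when the data
arrives.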

\subsection events Event handling: MinorActivityRecorder (activity.hh, pipeline.hh)

Minor is essentially a cycle-callable model with some ability to skip cycles
based on pipeline activity.  External events are mostly received by callbacks
(e.g. Fetch1::IcachePort::recvTimingResp) and cause the pipeline to be woken
up to service advancing request queues.

Ticked (sim/ticked.hh) is a base class bringing together an evaluate
member function and a provided SimObject.  It provides a Ticked::start/stop
interface to start and stop the periodic issue of clock events.
Pipeline is a derived class of Ticked.

During evaluate calls, stages can signal that they still have work to do in
the next cycle by calling either MinorCPU::activityRecorder->activity() (for
non-callback-related activity) or MinorCPU::wakeupOnEvent(<stageId>) (for
stage callback-related 'wakeup' activity).

Pipeline::evaluate contains calls to evaluate for each unit and a test for
pipeline idling which can turn off the clock tick if no unit has signalled
that it may become active next cycle.

Within Pipeline (pipeline.hh), the stages are evaluated in reverse order (and
so will ::evaluate in reverse order) and their backwards data can be
read immediately after being written in each cycle allowing output decisions
to be 'perfect' (allowing synchronous stalling of the whole pipeline).  Branch
predictions from Fetch2 to Fetch1 can also be transported in 0 cycles making
fetch1ToFetch2BackwardDelay the only configurable delay which can be set as
low as 0 cycles.
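
Reverse-order evaluation combined with idling can be sketched as below.  This
is an illustrative model only: ExampleStage and ExamplePipeline are
hypothetical, and the real Pipeline uses MinorActivityRecorder and Ticked
rather than the simple flags shown here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stage: records when it was evaluated and reports whether it
// still has work to do in the next cycle.
struct ExampleStage
{
    int id;
    bool moreWork;

    bool evaluate(std::vector<int> &order)
    {
        order.push_back(id);
        bool again = moreWork;
        moreWork = false;       // assume one cycle clears the pending work
        return again;
    }
};

// Hypothetical pipeline container (not the real Pipeline class).
class ExamplePipeline
{
    std::vector<ExampleStage> stages;   // stored in forward order
    bool ticking = true;

  public:
    std::vector<int> order;             // evaluation order, for inspection

    explicit ExamplePipeline(const std::vector<ExampleStage> &stages_) :
        stages(stages_)
    {}

    void evaluate()
    {
        bool active = false;
        // Last stage first, so that what a later stage writes backwards in
        // this cycle is visible to the earlier stages evaluated after it.
        for (std::size_t i = stages.size(); i-- > 0;)
            active = stages[i].evaluate(order) || active;
        // Nobody asked for another cycle: stop ticking until woken.
        if (!active)
            ticking = false;
    }

    void wakeup() { ticking = true; }
    bool isTicking() const { return ticking; }
};
```

In this sketch a callback such as a memory response would call wakeup() to
restart ticking, which mirrors the role the text gives to
MinorCPU::wakeupOnEvent.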

The MinorCPU::activateContext and MinorCPU::suspendContext interface can be
called to start and pause threads (threads in the MT sense) and to start and
pause the pipeline.  Executing instructions can call this interface
(indirectly through the ThreadContext) to idle the CPU/their threads.

\subsection stages Each pipeline stage

In general, the behaviour of a stage (each cycle) is:

\verbatim
    evaluate:
        push input to inputBuffer
        setup references to input/output data slots

        do 'every cycle' 'step' tasks

        if there is input and there is space in the next stage:
            process and generate a new output
            maybe re-activate the stage

        send backwards data

        if the stage generated output to the following FIFO:
            signal pipe activity

        if the stage has more processable input and space in the next stage:
            re-activate the stage for the next cycle

        commit the push to the inputBuffer if that data hasn't all been used
\endverbatim

The Execute stage differs from this model as its forward output (branch) data
is unconditionally sent to Fetch1 and Fetch2.  To allow this behaviour, Fetch1
and Fetch2 must be unconditionally receptive to that data.

\subsection fetch1 Fetch1 stage

Fetch1 is responsible for fetching cache lines or partial cache lines from the
I-cache and passing them on to Fetch2 to be decomposed into instructions.  It
can receive 'change of stream' indications from both Execute and Fetch2 to
signal that it should change its internal fetch address and tag newly fetched
lines with new stream or prediction sequence numbers.  When both Execute and
Fetch2 signal changes of stream at the same time, Fetch1 takes Execute's
change.

Every line issued by Fetch1 will bear a unique line sequence number which can
be used for debugging stream changes.

When fetching from the I-cache, Fetch1 will ask for data from the current
fetch address (Fetch1::pc) up to the end of the 'data snap' size set in the
parameter fetch1LineSnapWidth.  Subsequent autonomous line fetches will fetch
whole lines at a snap boundary and of size fetch1LineWidth.
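
The address arithmetic described above can be worked through in a short
sketch.  It follows the prose description rather than the source code, so the
helper names (FetchSpan, firstFetch, nextFetch) are hypothetical, and it
assumes a non-zero snap width and that the line width is a multiple of it.

```cpp
#include <cstdint>

// Hypothetical description of one fetch: start address and size in bytes.
struct FetchSpan
{
    std::uint64_t addr;
    std::uint64_t size;
};

// First fetch after a change of stream: from pc up to the next snap
// boundary.  Assumes snapWidth is non-zero.
FetchSpan
firstFetch(std::uint64_t pc, std::uint64_t snapWidth)
{
    std::uint64_t offset = pc % snapWidth;
    return FetchSpan{pc, snapWidth - offset};
}

// Subsequent autonomous fetch: a whole line starting at the boundary where
// the previous fetch ended.
FetchSpan
nextFetch(const FetchSpan &prev, std::uint64_t lineWidth)
{
    return FetchSpan{prev.addr + prev.size, lineWidth};
}
```

For example, with a 64 byte snap and line width, a fetch address of 0x1010
gives a first request of 48 bytes ending at 0x1040, followed by whole 64 byte
lines from 0x1040 onwards.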

Fetch1 will only initiate a memory fetch if it can reserve space in Fetch2%'s
input buffer.  That input buffer serves as the fetch queue/LFL for the system.

Fetch1 contains two queues, requests and transfers, to handle the stages of
translating the address of a line fetch (via the TLB) and accommodating the
request/response of fetches to/from memory.

Fetch requests from Fetch1 are pushed into the requests queue as newly
allocated FetchRequest objects once they have been sent to the ITLB with a
call to itb->translateTiming.

A response from the TLB moves the request from the requests queue to the
transfers queue.  If there is more than one entry in each queue, it is
possible to get a TLB response for a request which is not at the head of the
requests queue.  In that case, the TLB response is marked up as a state change
to Translated in the request object, and advancing the request to transfers
(and the memory system) is left to calls to Fetch1::stepQueues which is called
in the cycle following the receipt of any event.

Fetch1::tryToSendToTransfers is responsible for moving requests between the
two queues and issuing requests to memory.  Failed TLB lookups (prefetch
aborts) continue to occupy space in the queues until they are recovered at the
head of transfers.

Responses from memory change the request object state to Complete and
Fetch1::evaluate can pick up response data, package it in the ForwardLineData
object, and forward it to Fetch2%'s input buffer.

As space is always reserved in Fetch2::inputBuffer, setting the input buffer's
size to 1 results in non-prefetching behaviour.

When a change of stream occurs, translated requests queue members and
completed transfers queue members can be unconditionally discarded to make way
for new transfers.

\subsection fetch2 Fetch2 stage

Fetch2 receives a line from Fetch1 into its input buffer.  The data in the
head line in that buffer is iterated over and separated into individual
instructions which are packed into a vector of instructions which can be
passed to Decode.  Packing instructions can be aborted early if a fault is
found in either the input line as a whole or a decomposed instruction.

\subsubsection bp Branch prediction

Fetch2 contains the branch prediction mechanism.  This is a wrapper around the
branch predictor interface provided by gem5 (cpu/pred/...).

Branches are predicted for any control instructions found.  If prediction is
attempted for an instruction, the MinorDynInst::triedToPredict flag is set on
that instruction.

When a branch is predicted to be taken, the MinorDynInst::predictedTaken flag
is set and MinorDynInst::predictedTarget is set to the predicted target PC
value.  The predicted branch instruction is then packed into Fetch2%'s output
vector, the prediction sequence number is incremented, and the branch is
communicated to Fetch1.

After signalling a prediction, Fetch2 will discard its input buffer contents
and will reject any new lines which have the same stream sequence number as
that branch but have a different prediction sequence number.  This allows
following sequentially fetched lines to be rejected without ignoring new lines
generated by a change of stream indicated from a 'real' branch from Execute
(which will have a new stream sequence number).
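
The rejection rule can be expressed as a single comparison.  The following is
a sketch of that rule as described above, not Fetch2's actual code; LineId,
ExampleFetch2State and their members are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical subset of the identity carried by a fetched line.
struct LineId
{
    std::uint64_t streamSeqNum;
    std::uint64_t predictionSeqNum;
};

// Hypothetical view of the sequence numbers Fetch2 currently expects.
struct ExampleFetch2State
{
    std::uint64_t expectedStreamSeqNum = 0;
    std::uint64_t predictionSeqNum = 0;

    // Called when Fetch2 follows a predicted-taken branch.
    void notePrediction() { ++predictionSeqNum; }

    // A line is stale if it belongs to the current stream but was fetched
    // under a different prediction.  Lines from a new stream are kept.
    bool shouldDiscard(const LineId &line) const
    {
        return line.streamSeqNum == expectedStreamSeqNum &&
               line.predictionSeqNum != predictionSeqNum;
    }
};
```

Comparing the stream number first is what lets lines from a newly started
stream pass through even though their prediction sequence number does not
match the one Fetch2 last issued.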

The program counter value provided to Fetch2 by Fetch1 packets is only updated
when there is a change of stream.  Fetch2::havePC indicates whether the PC
will be picked up from the next processed input line.  Fetch2::havePC is
necessary to allow line-wrapping instructions to be tracked through decode.

Branches (and instructions predicted to branch) which are processed by Execute
will generate BranchData (pipe_data.hh) data explaining the outcome of the
branch which is sent forwards to Fetch1 and Fetch2.  Fetch1 uses this data to
change stream (and update its stream sequence number and address for new
lines).  Fetch2 uses it to update the branch predictor.  Minor does not
communicate branch data to the branch predictor for instructions which are
discarded on the way to commit.

BranchData::BranchReason (pipe_data.hh) encodes the possible branch scenarios:

<table>
<tr>
    <td><b>Branch enum val.</b></td>
    <td><b>In Execute</b></td>
    <td><b>Fetch1 reaction</b></td>
    <td><b>Fetch2 reaction</b></td>
</tr>
<tr>
    <td>NoBranch</td>
    <td>(output bubble data)</td>
    <td>-</td>
    <td>-</td>
</tr>
<tr>
    <td>CorrectlyPredictedBranch</td>
    <td>Predicted taken, and was taken</td>
    <td>-</td>
    <td>Update BP as taken branch</td>
</tr>
<tr>
    <td>UnpredictedBranch</td>
    <td>Not predicted taken, but was taken</td>
    <td>New stream</td>
    <td>Update BP as taken branch</td>
</tr>
<tr>
    <td>BadlyPredictedBranch</td>
    <td>Predicted taken, but was not taken</td>
    <td>New stream to restore to old inst. source</td>
    <td>Update BP as not taken branch</td>
</tr>
<tr>
    <td>BadlyPredictedBranchTarget</td>
    <td>Predicted taken, and was taken, but to a different target than the
        predicted one</td>
    <td>New stream</td>
    <td>Update BTB to new target</td>
</tr>
<tr>
    <td>SuspendThread</td>
    <td>Hint to suspend fetching</td>
    <td>Suspend fetch for this thread (branch to next inst. as wakeup
        fetch addr)</td>
    <td>-</td>
</tr>
<tr>
    <td>Interrupt</td>
    <td>Interrupt detected</td>
    <td>New stream</td>
    <td>-</td>
</tr>
</table>
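
The table can be read as two small decision functions.  The enumerator names
below are copied from the table, but ExampleBranchReason,
fetch1ChangesStream and fetch2UpdatesPredictor are hypothetical and only
restate the table; they are not the functions Minor uses.

```cpp
// Hypothetical mirror of the branch scenarios listed in the table above.
enum class ExampleBranchReason
{
    NoBranch,
    CorrectlyPredictedBranch,
    UnpredictedBranch,
    BadlyPredictedBranch,
    BadlyPredictedBranchTarget,
    SuspendThread,
    Interrupt
};

// True when the table lists a new stream as the Fetch1 reaction.
bool
fetch1ChangesStream(ExampleBranchReason reason)
{
    switch (reason) {
      case ExampleBranchReason::UnpredictedBranch:
      case ExampleBranchReason::BadlyPredictedBranch:
      case ExampleBranchReason::BadlyPredictedBranchTarget:
      case ExampleBranchReason::Interrupt:
        return true;
      default:
        return false;
    }
}

// True when the table lists a branch predictor or BTB update for Fetch2.
bool
fetch2UpdatesPredictor(ExampleBranchReason reason)
{
    switch (reason) {
      case ExampleBranchReason::CorrectlyPredictedBranch:
      case ExampleBranchReason::UnpredictedBranch:
      case ExampleBranchReason::BadlyPredictedBranch:
      case ExampleBranchReason::BadlyPredictedBranchTarget:
        return true;
      default:
        return false;
    }
}
```

SuspendThread is deliberately excluded from fetch1ChangesStream in this
sketch because the table gives it its own Fetch1 reaction (suspending fetch)
rather than a new stream.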

The parameter decodeInputWidth sets the number of instructions which can be
packed into the output per cycle.  If the parameter fetch2CycleInput is true,
Fetch2 can try to take input from more than one entry in its input
buffer per cycle.

\subsection decode Decode stage

Decode takes a vector of instructions from Fetch2 (via its input buffer) and
decomposes those instructions into micro-ops (if necessary) and packs them
into its output instruction vector.

The parameter executeInputWidth sets the number of instructions which can be
packed into the output per cycle.  If the parameter decodeCycleInput is true,
Decode can try to take instructions from more than one entry in its input
buffer per cycle.

\subsection execute Execute stage

Execute provides all the instruction execution and memory access mechanisms.
An instruction's passage through Execute can take multiple cycles with its
precise timing modelled by a functional unit pipeline FIFO.

A vector of instructions (possibly including fault 'instructions') is provided
to Execute by Decode and can be queued in the Execute input buffer before
being issued.  Setting the parameter executeCycleInput allows Execute to
examine more than one input buffer entry (more than one instruction vector).
The number of instructions in the input vector can be set with
executeInputWidth and the depth of the input buffer can be set with parameter
executeInputBufferSize.

\subsubsection fus Functional units

The Execute stage contains pipelines for each functional unit comprising the
computational core of the CPU.  Functional units are configured via the
executeFuncUnits parameter.  Each functional unit has a number of instruction
classes it supports, a stated delay between instruction issues, a delay
from instruction issue to (possible) commit, and an optional timing annotation
capable of more complicated timing.

Each active cycle, Execute::evaluate performs these actions:

\verbatim
    Execute::evaluate:
        push input to inputBuffer
        setup references to input/output data slots and branch output slot

        step D-cache interface queues (similar to Fetch1)

        if interrupt posted:
            take interrupt (signalling branch to Fetch1/Fetch2)
        else
            commit instructions
            issue new instructions

        advance functional unit pipelines

        reactivate Execute if the unit is still active

        commit the push to the inputBuffer if that data hasn't all been used
\endverbatim

\subsubsection fifos Functional unit FIFOs

Functional units are implemented as SelfStallingPipelines (stage.hh).  These
are TimeBuffer FIFOs with two distinct 'push' and 'pop' wires.  They respond
to SelfStallingPipeline::advance in the same way as TimeBuffers <b>unless</b>
there is data at the far, 'pop', end of the FIFO.  A 'stalled' flag is
provided for signalling stalling and to allow a stall to be cleared.  The
intention is to provide a pipeline for each functional unit which will never
advance an instruction out of that pipeline until it has been processed and
the pipeline is explicitly unstalled.

The actions 'issue', 'commit', and 'advance' act on the functional units.
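
The self-stalling behaviour described above can be sketched with a toy FIFO.
This is a simplified Python model written for illustration, not the C++
SelfStallingPipeline from stage.hh: the class name SelfStallingFifo and its
methods are hypothetical, and None is assumed to represent a bubble.

```python
class SelfStallingFifo:
    """Toy fixed-depth FIFO that will not advance data out of its pop end."""

    def __init__(self, depth):
        # slots[0] is the 'push' end, slots[-1] is the 'pop' end.
        self.slots = [None] * depth
        self.stalled = False

    def push(self, item):
        """Place an item on the push end for the next advance."""
        self.slots[0] = item

    def front(self):
        """Return the item at the pop end, or None for a bubble."""
        return self.slots[-1]

    def advance(self):
        """Shift every slot by one place unless data sits at the pop end."""
        if self.slots[-1] is not None:
            self.stalled = True
        if self.stalled:
            return
        self.slots = [None] + self.slots[:-1]
        # Stall as soon as real data arrives at the pop end.
        self.stalled = self.slots[-1] is not None

    def pop(self):
        """Consume the item at the pop end and explicitly clear the stall."""
        item = self.slots[-1]
        self.slots[-1] = None
        self.stalled = False
        return item
```

In this sketch an item pushed into a depth 3 FIFO reaches the pop end after
two calls to advance, and further calls leave the FIFO unchanged until pop is
called.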

\subsubsection issue Issue

Issuing instructions involves iterating over both the input buffer
instructions and the heads of the functional units to try and issue
instructions in order.  The number of instructions which can be issued each
cycle is limited by the parameter executeIssueLimit, how executeCycleInput is
set, the availability of pipeline space and the policy used to choose a
pipeline in which the instruction can be issued.

At present, the only issue policy is strict round-robin visiting of each
pipeline with the given instructions in sequence.  For greater flexibility,
better (and more specific) policies will need to be possible.

Memory operation instructions traverse their functional units to perform their
EA calculations.  On 'commit', the ExecContext::initiateAcc execution phase is
performed and any memory access is issued (via ExecContext::{read,write}Mem
calling LSQ::pushRequest) to the LSQ.

Note that faults are issued as if they are instructions and can (currently) be
issued to *any* functional unit.

Every issued instruction is also pushed into the Execute::inFlightInsts queue.
Memory ref. instructions are also pushed into the Execute::inFUMemInsts queue.

\subsubsection commit Commit

Instructions are committed by examining the head of the Execute::inFlightInsts
queue (which is decorated with the functional unit number to which the
instruction was issued).  Instructions which can then be found in their
functional units are executed and popped from Execute::inFlightInsts.

Memory operation instructions are committed into the memory queues (as
described above) and exit their functional unit pipeline but are not popped
from the Execute::inFlightInsts queue.  The Execute::inFUMemInsts queue
provides ordering to memory operations as they pass through the functional
units (maintaining issue order).  On entering the LSQ, instructions are popped
from Execute::inFUMemInsts.

If the parameter executeAllowEarlyMemoryIssue is set, memory operations can be
sent from their FU to the LSQ before reaching the head of
Execute::inFlightInsts but after their dependencies are met.
MinorDynInst::instToWaitFor is marked up with the latest dependent instruction
execSeqNum required to be committed for a memory operation to progress to the
LSQ.

Once a memory response is available (by testing the head of
Execute::inFlightInsts against LSQ::findResponse), commit will process that
response (ExecContext::completeAcc) and pop the instruction from
Execute::inFlightInsts.

Any branch, fault or interrupt will cause a stream sequence number change and
signal a branch to Fetch1/Fetch2.  Only instructions with the current stream
sequence number will be issued and/or committed.

\subsubsection advance Advance

All non-stalled pipelines are advanced and may, thereafter, become stalled.
Potential activity in the next cycle is signalled if there are any
instructions remaining in any pipeline.

\subsubsection sb Scoreboard

The scoreboard (Scoreboard) is used to control instruction issue.  It contains
a count of the number of in flight instructions which will write each general
purpose CPU integer or float register.  Instructions will only be issued where
the scoreboard contains a count of 0 instructions which will write to any of
the instruction's source registers.

Once an instruction is issued, the scoreboard counts for each of its
destination registers will be incremented.

The estimated delivery time of the instruction's result is marked up in the
scoreboard by adding the length of the issued-to FU to the current time.  The
timings parameter on each FU provides a list of additional rules for
calculating the delivery time.  These are documented in the parameter comments
in MinorCPU.py.

On commit (for memory operations, memory response commit), the scoreboard
counters for an instruction's destination registers are decremented.
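
The counting scheme can be sketched as follows.  This is a minimal Python
model for illustration, not the Scoreboard class itself: the name
ToyScoreboard and its methods are hypothetical, registers are identified by
plain strings, the delivery time estimate is left out, and the counters
decremented at commit are assumed to be the ones incremented at issue (those
of the destination registers).

```python
from collections import Counter


class ToyScoreboard:
    """Counts in-flight instructions that will write each register."""

    def __init__(self):
        # register name -> number of in-flight writers (missing means 0)
        self.writers = Counter()

    def can_issue(self, src_regs):
        """An instruction may issue only if no source has a pending writer."""
        return all(self.writers[reg] == 0 for reg in src_regs)

    def mark_issued(self, dest_regs):
        """On issue, count one more pending writer per destination."""
        for reg in dest_regs:
            self.writers[reg] += 1

    def clear_on_commit(self, dest_regs):
        """On commit, release the pending writes counted at issue."""
        for reg in dest_regs:
            if self.writers[reg] == 0:
                raise ValueError("no pending writer for " + reg)
            self.writers[reg] -= 1
```

For example, after mark_issued(["r0"]) an instruction reading r0 cannot issue
until clear_on_commit(["r0"]) has been called, while an instruction reading
only r1 is unaffected.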

\subsubsection ifi Execute::inFlightInsts

The Execute::inFlightInsts queue will always contain all instructions in
flight in Execute in the correct issue order.  Execute::issue is the only
process which will push an instruction into the queue.  Execute::commit is the
only process that can pop an instruction.

\subsubsection lsq LSQ

The LSQ can support multiple outstanding transactions to memory in a number of
conservative cases.

There are three queues to contain requests: requests, transfers and the store
buffer.  The requests and transfers queues operate in a similar manner to the
queues in Fetch1.  The store buffer is used to decouple the delay of
completing store operations from following loads.

Requests are issued to the DTLB as their instructions leave their functional
unit.  At the head of requests, cacheable load requests can be sent to memory
and on to the transfers queue.  Cacheable stores will be passed to transfers
unprocessed and progress that queue maintaining order with other transactions.

The conditions in LSQ::tryToSendToTransfers dictate when requests can
be sent to memory.

All uncacheable transactions, split transactions and locked transactions are
processed in order at the head of requests.  Additionally, store results
residing in the store buffer can have their data forwarded to cacheable loads
(removing the need to perform a read from memory) but no cacheable load can be
issued to the transfers queue until that queue's stores have drained into the
store buffer.

At the end of transfers, requests which are LSQ::LSQRequest::Complete (are
faulting, are cacheable stores, or have been sent to memory and received a
response) can be picked off by Execute and committed
(ExecContext::completeAcc) and, for stores, sent to the store buffer.

Barrier instructions do not prevent cacheable loads from progressing to memory
but do cause a stream change which will discard that load.  Stores will not be
committed to the store buffer if they are in the shadow of the barrier but
before the new instruction stream has arrived at Execute.  As all other memory
transactions are delayed at the end of the requests queue until they are at
the head of Execute::inFlightInsts, they will be discarded by any barrier
stream change.

After commit, LSQ::BarrierDataRequest requests are inserted into the
store buffer to track each barrier until all preceding memory transactions
have drained from the store buffer.  No further memory transactions will be
issued from the ends of FUs until after the barrier has drained.

\subsubsection drain Draining

Draining is mostly handled by the Execute stage.  When initiated by calling
MinorCPU::drain, Pipeline::evaluate checks the draining status of each unit
each cycle and keeps the pipeline active until draining is complete.  It is
Pipeline that signals the completion of draining.  Execute is triggered by
MinorCPU::drain and starts stepping through its Execute::DrainState state
machine, starting from state Execute::NotDraining, in this order:

<table>
<tr>
    <td><b>State</b></td>
    <td><b>Meaning</b></td>
</tr>
<tr>
    <td>Execute::NotDraining</td>
    <td>Not trying to drain, normal execution</td>
</tr>
<tr>
    <td>Execute::DrainCurrentInst</td>
    <td>Draining micro-ops to complete inst.</td>
</tr>
<tr>
    <td>Execute::DrainHaltFetch</td>
    <td>Halt fetching instructions</td>
</tr>
<tr>
    <td>Execute::DrainAllInsts</td>
    <td>Discarding all instructions presented</td>
</tr>
</table>

When complete, a drained Execute unit will be in the Execute::DrainAllInsts
state where it will continue to discard instructions but has no knowledge of
the drained state of the rest of the model.
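
The ordering of the drain states can be sketched as a small state machine.
This Python model is an illustration only: it captures the order given in the
table and the fact that Execute::DrainAllInsts is the final state, but it
ignores the conditions under which the real Execute stage moves between
states.  The function name next_drain_state and the numeric enum values are
hypothetical.

```python
from enum import Enum


class DrainState(Enum):
    """State names from the table above; the values are arbitrary."""
    NotDraining = 0
    DrainCurrentInst = 1
    DrainHaltFetch = 2
    DrainAllInsts = 3


# Enum iteration follows definition order, which matches the table.
_ORDER = list(DrainState)


def next_drain_state(state):
    """Return the state after 'state'; DrainAllInsts stays where it is."""
    index = _ORDER.index(state)
    return _ORDER[min(index + 1, len(_ORDER) - 1)]
```

Stepping three times from DrainState.NotDraining reaches
DrainState.DrainAllInsts, and any further step leaves the state unchanged.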

\section debug Debug options

The model provides a number of debug flags which can be passed to gem5 with
the --debug-flags option.

The available flags are:

<table>
<tr>
    <td><b>Debug flag</b></td>
    <td><b>Unit which will generate debugging output</b></td>
</tr>
<tr>
    <td>Activity</td>
    <td>Debug ActivityMonitor actions</td>
</tr>
<tr>
    <td>Branch</td>
    <td>Fetch2 and Execute branch prediction decisions</td>
</tr>
<tr>
    <td>MinorCPU</td>
    <td>CPU global actions such as wakeup/thread suspension</td>
</tr>
<tr>
    <td>Decode</td>
    <td>Decode</td>
</tr>
<tr>
    <td>MinorExec</td>
    <td>Execute behaviour</td>
</tr>
<tr>
    <td>Fetch</td>
    <td>Fetch1 and Fetch2</td>
</tr>
<tr>
    <td>MinorInterrupt</td>
    <td>Execute interrupt handling</td>
</tr>
<tr>
    <td>MinorMem</td>
    <td>Execute memory interactions</td>
</tr>
<tr>
    <td>MinorScoreboard</td>
    <td>Execute scoreboard activity</td>
</tr>
<tr>
    <td>MinorTrace</td>
    <td>Generate MinorTrace cyclic state trace output (see below)</td>
</tr>
<tr>
    <td>MinorTiming</td>
    <td>MinorTiming instruction timing modification operations</td>
</tr>
</table>

The group flag Minor enables all of the flags beginning with Minor.

\section trace MinorTrace and minorview.py

The debug flag MinorTrace causes cycle-by-cycle state data to be printed which
can then be processed and viewed by the minorview.py tool.  This output is
very verbose and so it is recommended it only be used for small examples.

\subsection traceformat MinorTrace format

There are three types of line output by MinorTrace:

\subsubsection state MinorTrace - Ticked unit cycle state

For example:

\verbatim
 110000: system.cpu.dcachePort: MinorTrace: state=MemoryRunning in_tlb_mem=0/0
\endverbatim

For each time step, the MinorTrace flag will cause one MinorTrace line to be
printed for every named element in the model.

\subsubsection traceunit MinorInst - summaries of instructions issued by \
    Decode

For example:

\verbatim
 140000: system.cpu.execute: MinorInst: id=0/1.1/1/1.1 addr=0x5c \
                             inst="  mov r0, #0" class=IntAlu
\endverbatim

MinorInst lines are currently only generated for instructions which are
committed.

\subsubsection tracefetch1 MinorLine - summaries of line fetches issued by \
    Fetch1

For example:

\verbatim
  92000: system.cpu.icachePort: MinorLine: id=0/1.1/1 size=36 \
                                vaddr=0x5c paddr=0x5c
\endverbatim
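
The three line types share a "tick: unit: type: key=value ..." shape, so a
small parser can split them apart.  The following Python sketch is inferred
only from the example lines above and is not taken from minorview.py; the
function name parse_minor_line is hypothetical, and it assumes each record is
on one physical line (the backslash continuations above are taken to be for
presentation only).

```python
import re

# tick, unit name, line type, then the remaining key=value text
LINE_RE = re.compile(r'^\s*(\d+):\s*(\S+):\s*(Minor(?:Trace|Inst|Line)):\s*(.*)$')
# a value is either a double-quoted string or a run of non-space characters
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_minor_line(line):
    """Split one trace line into tick, unit, kind and a dict of fields.

    Returns None for lines that are not MinorTrace/MinorInst/MinorLine.
    """
    match = LINE_RE.match(line)
    if match is None:
        return None
    tick, unit, kind, rest = match.groups()
    fields = {}
    for key, value in PAIR_RE.findall(rest):
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes
        fields[key] = value
    return {"tick": int(tick), "unit": unit, "kind": kind, "fields": fields}
```

All field values are kept as strings; interpreting them (for example
splitting an id field) is left to the caller.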

\subsection minorview minorview.py

Minorview (util/minorview.py) can be used to visualise the data created by
MinorTrace.

\verbatim
usage: minorview.py [-h] [--picture picture-file] [--prefix name]
                   [--start-time time] [--end-time time] [--mini-views]
                   event-file

Minor visualiser

positional arguments:
  event-file

optional arguments:
  -h, --help            show this help message and exit
  --picture picture-file
                        markup file containing blob information (default:
                        <minorview-path>/minor.pic)
  --prefix name         name prefix in trace for CPU to be visualised
                        (default: system.cpu)
  --start-time time     time of first event to load from file
  --end-time time       time of last event to load from file
  --mini-views          show tiny views of the next 10 time steps
\endverbatim

Raw debugging output can be passed to minorview.py as the event-file.  It will
pick out the MinorTrace lines.  Other lines in which units in the simulation
are named (such as system.cpu.dcachePort in the above example) will appear as
'comments' when those units are clicked on in the visualiser.

Clicking on a unit which contains instructions or lines will bring up a speech
bubble giving extra information derived from the MinorInst/MinorLine lines.

--start-time and --end-time allow only sections of debug files to be loaded.

--prefix allows the name prefix of the CPU to be inspected to be supplied.
This defaults to 'system.cpu'.

In the visualiser, the buttons Start, End, Back, Forward, Play and Stop can be
used to control the displayed simulation time.

The diagonally striped coloured blocks show the InstId of the
instruction or line they represent.  Note that lines in Fetch1 and f1ToF2.F
only show the id fields of a line and that instructions in Fetch2, f2ToD, and
decode.inputBuffer do not yet have execute sequence numbers.  The T/S.P/L/F.E
buttons can be used to toggle parts of InstId on and off to make it easier to
understand the display.  Useful combinations are:

<table>
<tr>
    <td><b>Combination</b></td>
    <td><b>Reason</b></td>
</tr>
<tr>
    <td>E</td>
    <td>just show the final execute sequence number</td>
</tr>
<tr>
    <td>F/E</td>
    <td>show the instruction-related numbers</td>
</tr>
<tr>
    <td>S/P</td>
    <td>show just the stream-related numbers (watch the stream sequence
        change with branches and not change with predicted branches)</td>
</tr>
<tr>
    <td>S/E</td>
    <td>show instructions and their stream</td>
</tr>
</table>
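
The T/S.P/L/F.E layout can be made concrete with a short parser for the id
fields seen in MinorInst and MinorLine output.  This Python sketch is an
illustration, not code from minorview.py: parse_inst_id and select_fields are
hypothetical names, the single-letter keys simply reuse the button labels,
and reading T as the thread, L as the line and F as the fetch number is an
assumption based on those labels.

```python
def parse_inst_id(text):
    """Split an id such as '0/1.1/1/1.1' into its T, S, P, L, F, E parts.

    Line ids such as '0/1.1/1' stop after L, and ids without an execute
    sequence number yet stop after F, so later keys may be absent.
    """
    parts = text.split("/")
    stream, prediction = parts[1].split(".")
    ident = {"T": int(parts[0]), "S": int(stream), "P": int(prediction)}
    if len(parts) > 2:
        ident["L"] = int(parts[2])
    if len(parts) > 3:
        fetch_exec = parts[3].split(".")
        ident["F"] = int(fetch_exec[0])
        if len(fetch_exec) > 1:
            ident["E"] = int(fetch_exec[1])
    return ident


def select_fields(ident, letters):
    """Keep only the parts named in 'letters', like toggling the buttons."""
    return {key: value for key, value in ident.items() if key in letters}
```

For instance, select_fields(parse_inst_id("0/2.3/4/5.6"), "SE") keeps only
the stream and execute sequence numbers, which corresponds to the S/E
combination in the table.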

The key to the right shows all the displayable colours (some of the colour
choices are quite bad!):

<table>
<tr>
    <td><b>Symbol</b></td>
    <td><b>Meaning</b></td>
</tr>
<tr>
    <td>U</td>
    <td>Unknown data</td>
</tr>
<tr>
    <td>B</td>
    <td>Blocked stage</td>
</tr>
<tr>
    <td>-</td>
    <td>Bubble</td>
</tr>
<tr>
    <td>E</td>
    <td>Empty queue slot</td>
</tr>
<tr>
    <td>R</td>
    <td>Reserved queue slot</td>
</tr>
<tr>
    <td>F</td>
    <td>Fault</td>
</tr>
<tr>
    <td>r</td>
    <td>Read (used as the leftmost stripe on data in the dcachePort)</td>
</tr>
<tr>
    <td>w</td>
    <td>Write (used as the leftmost stripe on data in the dcachePort)</td>
</tr>
<tr>
    <td>0 to 9</td>
    <td>last decimal digit of the corresponding data</td>
</tr>
</table>

\verbatim

    ,---------------.         .--------------.  *U
    | |=|->|=|->|=| |         ||=|||->||->|| |  *-  <- Fetch queues/LSQ
    `---------------'         `--------------'  *R
    === ======                                  *w  <- Activity/Stage activity
                              ,--------------.  *1
    ,--.      ,.      ,.      | ============ |  *3  <- Scoreboard
    |  |-\[]-\||-\[]-\||-\[]-\| ============ |  *5  <- Execute::inFlightInsts
    |  | :[] :||-/[]-/||-/[]-/| -. --------  |  *7
    |  |-/[]-/||  ^   ||      |  | --------- |  *9
    |  |      ||  |   ||      |  | ------    |
[]->|  |    ->||  |   ||      |  | ----      |
    |  |<-[]<-||<-+-<-||<-[]<-|  | ------    |->[] <- Execute to Fetch1,
    '--`      `'  ^   `'      | -' ------    |        Fetch2 branch data
             ---. |  ---.     `--------------'
             ---' |  ---'       ^       ^
                  |   ^         |       `------------ Execute
  MinorBuffer ----' input       `-------------------- Execute input buffer
                    buffer
\endverbatim

Stages show the colours of the instructions currently being
generated/processed.

Forward FIFOs between stages show the data being pushed into them at the
current tick (to the left), the data in transit, and the data available at
their outputs (to the right).

The backwards FIFO between Fetch2 and Fetch1 shows branch prediction data.

In general, all displayed data is correct at the end of a cycle's activity at
the time indicated but before the inter-stage FIFOs are ticked.  Each FIFO
therefore has an extra slot to show the asserted new input data, in addition
to all the data currently within the FIFO.

Input buffers for each stage are shown below the corresponding stage and show
the contents of those buffers as horizontal strips.  Strips marked as reserved
(cyan by default) are reserved to be filled by the previous stage.  An input
buffer with all reserved or occupied slots will, therefore, block the previous
stage from generating output.

Fetch queues and LSQ show the lines/instructions in the queues of each
interface and show the number of lines/instructions in TLB and memory in the
two striped colours at the top of their frames.

Inside Execute, the horizontal bars represent the individual FU pipelines.
The vertical bar to the left is the input buffer and the bar to the right, the
instructions committed this cycle.  The background of Execute shows
instructions which are being committed this cycle in their original FU
pipeline positions.

The strip at the top of the Execute block shows the current streamSeqNum that
Execute is committing.  A similar stripe at the top of Fetch1 shows that
stage's expected streamSeqNum and the stripe at the top of Fetch2 shows its
issuing predictionSeqNum.

The scoreboard shows the number of instructions in flight which will commit a
result to the register in the position shown.  The scoreboard contains slots
for each integer and floating point register.

The Execute::inFlightInsts queue shows all the instructions in flight in
Execute with the oldest instruction (the next instruction to be committed) to
the right.

'Stage activity' shows the signalled activity (as E/1) for each stage (with
CPU miscellaneous activity to the left).

'Activity' shows a count of stage and pipe activity.

\subsection picformat minor.pic format

The minor.pic file (src/minor/minor.pic) describes the layout of the
model's blocks on the visualiser.  Its format is described in the supplied
minor.pic file.

*/

}
