110259SAndrew.Bardsley@arm.com# Copyright (c) 2014 ARM Limited
210259SAndrew.Bardsley@arm.com# All rights reserved
310259SAndrew.Bardsley@arm.com#
410259SAndrew.Bardsley@arm.com# The license below extends only to copyright in the software and shall
510259SAndrew.Bardsley@arm.com# not be construed as granting a license to any other intellectual
610259SAndrew.Bardsley@arm.com# property including but not limited to intellectual property relating
710259SAndrew.Bardsley@arm.com# to a hardware implementation of the functionality of the software
810259SAndrew.Bardsley@arm.com# licensed hereunder.  You may use the software subject to the license
910259SAndrew.Bardsley@arm.com# terms below provided that you ensure that this notice is replicated
1010259SAndrew.Bardsley@arm.com# unmodified and in its entirety in all distributions of the software,
1110259SAndrew.Bardsley@arm.com# modified or unmodified, in source code or in binary form.
1210259SAndrew.Bardsley@arm.com#
1310259SAndrew.Bardsley@arm.com# Redistribution and use in source and binary forms, with or without
1410259SAndrew.Bardsley@arm.com# modification, are permitted provided that the following conditions are
1510259SAndrew.Bardsley@arm.com# met: redistributions of source code must retain the above copyright
1610259SAndrew.Bardsley@arm.com# notice, this list of conditions and the following disclaimer;
1710259SAndrew.Bardsley@arm.com# redistributions in binary form must reproduce the above copyright
1810259SAndrew.Bardsley@arm.com# notice, this list of conditions and the following disclaimer in the
1910259SAndrew.Bardsley@arm.com# documentation and/or other materials provided with the distribution;
2010259SAndrew.Bardsley@arm.com# neither the name of the copyright holders nor the names of its
2110259SAndrew.Bardsley@arm.com# contributors may be used to endorse or promote products derived from
2210259SAndrew.Bardsley@arm.com# this software without specific prior written permission.
2310259SAndrew.Bardsley@arm.com#
2410259SAndrew.Bardsley@arm.com# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
2510259SAndrew.Bardsley@arm.com# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
2610259SAndrew.Bardsley@arm.com# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
2710259SAndrew.Bardsley@arm.com# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
2810259SAndrew.Bardsley@arm.com# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
2910259SAndrew.Bardsley@arm.com# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
3010259SAndrew.Bardsley@arm.com# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
3110259SAndrew.Bardsley@arm.com# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
3210259SAndrew.Bardsley@arm.com# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
3310259SAndrew.Bardsley@arm.com# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
3410259SAndrew.Bardsley@arm.com# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
3510259SAndrew.Bardsley@arm.com#
3610259SAndrew.Bardsley@arm.com# Authors: Andrew Bardsley
3710259SAndrew.Bardsley@arm.com
3810259SAndrew.Bardsley@arm.comnamespace Minor
3910259SAndrew.Bardsley@arm.com{
4010259SAndrew.Bardsley@arm.com
4110259SAndrew.Bardsley@arm.com/*!
4210259SAndrew.Bardsley@arm.com
4310259SAndrew.Bardsley@arm.com\page minor Inside the Minor CPU model
4410259SAndrew.Bardsley@arm.com
4510259SAndrew.Bardsley@arm.com\tableofcontents
4610259SAndrew.Bardsley@arm.com
4710259SAndrew.Bardsley@arm.comThis document contains a description of the structure and function of the
4810259SAndrew.Bardsley@arm.comMinor gem5 in-order processor model.  It is recommended reading for anyone who
4910259SAndrew.Bardsley@arm.comwants to understand Minor's internal organisation, design decisions, C++
5010259SAndrew.Bardsley@arm.comimplementation and Python configuration.  A familiarity with gem5 and some of
5110259SAndrew.Bardsley@arm.comits internal structures is assumed.  This document is meant to be read
5210259SAndrew.Bardsley@arm.comalongside the Minor source code and to explain its general structure without
5310259SAndrew.Bardsley@arm.combeing too slavish about naming every function and data type.
5410259SAndrew.Bardsley@arm.com
5510259SAndrew.Bardsley@arm.com\section whatis What is Minor?
5610259SAndrew.Bardsley@arm.com
5710259SAndrew.Bardsley@arm.comMinor is an in-order processor model with a fixed pipeline but configurable
5810259SAndrew.Bardsley@arm.comdata structures and execute behaviour.  It is intended to be used to model
5910259SAndrew.Bardsley@arm.comprocessors with strict in-order execution behaviour and allows visualisation
6010259SAndrew.Bardsley@arm.comof an instruction's position in the pipeline through the
6110259SAndrew.Bardsley@arm.comMinorTrace/minorview.py format/tool.  The intention is to provide a framework
6210259SAndrew.Bardsley@arm.comfor micro-architecturally correlating the model with a particular, chosen
6310259SAndrew.Bardsley@arm.comprocessor with similar capabilities.
6410259SAndrew.Bardsley@arm.com
6510259SAndrew.Bardsley@arm.com\section philo Design philosophy
6610259SAndrew.Bardsley@arm.com
6710259SAndrew.Bardsley@arm.com\subsection mt Multithreading
6810259SAndrew.Bardsley@arm.com
6910259SAndrew.Bardsley@arm.comThe model isn't currently capable of multithreading but there are THREAD
7010259SAndrew.Bardsley@arm.comcomments in key places where stage data needs to be arrayed to support
7110259SAndrew.Bardsley@arm.commultithreading.
7210259SAndrew.Bardsley@arm.com
7310259SAndrew.Bardsley@arm.com\subsection structs Data structures
7410259SAndrew.Bardsley@arm.com
7510259SAndrew.Bardsley@arm.comDecorating data structures with large amounts of life-cycle information is
7610259SAndrew.Bardsley@arm.comavoided.  Only instructions (MinorDynInst) contain a significant proportion of
7710259SAndrew.Bardsley@arm.comtheir data content whose values are not set at construction.
7810259SAndrew.Bardsley@arm.com
7910259SAndrew.Bardsley@arm.comAll internal structures have fixed sizes on construction.  Data held in queues
8010259SAndrew.Bardsley@arm.comand FIFOs (MinorBuffer, FUPipeline) should have a BubbleIF interface to
8110259SAndrew.Bardsley@arm.comallow a distinct 'bubble'/no data value option for each type.
8210259SAndrew.Bardsley@arm.com
8310259SAndrew.Bardsley@arm.comInter-stage 'struct' data is packaged in structures which are passed by value.
8410259SAndrew.Bardsley@arm.comOnly MinorDynInst, the line data in ForwardLineData and the memory-interfacing
8510259SAndrew.Bardsley@arm.comobjects Fetch1::FetchRequest and LSQ::LSQRequest are '::new' allocated while
8610259SAndrew.Bardsley@arm.comrunning the model.
8710259SAndrew.Bardsley@arm.com
8810259SAndrew.Bardsley@arm.com\section model Model structure
8910259SAndrew.Bardsley@arm.com
9010259SAndrew.Bardsley@arm.comObjects of class MinorCPU are provided by the model to gem5.  MinorCPU
9110259SAndrew.Bardsley@arm.comimplements the interfaces of (cpu.hh) and can provide data and
9210259SAndrew.Bardsley@arm.cominstruction interfaces for connection to a cache system.  The model is
9310259SAndrew.Bardsley@arm.comconfigured in a similar way to other gem5 models through Python.  That
9410259SAndrew.Bardsley@arm.comconfiguration is passed on to MinorCPU::pipeline (of class Pipeline) which
9510259SAndrew.Bardsley@arm.comactually implements the processor pipeline.
9610259SAndrew.Bardsley@arm.com
9710259SAndrew.Bardsley@arm.comThe hierarchy of major unit ownership from MinorCPU down looks like this:
9810259SAndrew.Bardsley@arm.com
9910259SAndrew.Bardsley@arm.com<ul>
10010259SAndrew.Bardsley@arm.com<li>MinorCPU</li>
10110259SAndrew.Bardsley@arm.com<ul>
10210259SAndrew.Bardsley@arm.com    <li>Pipeline - container for the pipeline, owns the cyclic 'tick'
10310259SAndrew.Bardsley@arm.com    event mechanism and the idling (cycle skipping) mechanism.</li>
10410259SAndrew.Bardsley@arm.com    <ul>
10510259SAndrew.Bardsley@arm.com        <li>Fetch1 - instruction fetch unit responsible for fetching cache
10610259SAndrew.Bardsley@arm.com            lines (or parts of lines from the I-cache interface)</li>
10710259SAndrew.Bardsley@arm.com        <ul>
10810259SAndrew.Bardsley@arm.com            <li>Fetch1::IcachePort - interface to the I-cache from
10910259SAndrew.Bardsley@arm.com                Fetch1</li>
11010259SAndrew.Bardsley@arm.com            </ul>
11110259SAndrew.Bardsley@arm.com            <li>Fetch2 - line to instruction decomposition</li>
11210259SAndrew.Bardsley@arm.com            <li>Decode - instruction to micro-op decomposition</li>
11310259SAndrew.Bardsley@arm.com            <li>Execute - instruction execution and data memory
11410259SAndrew.Bardsley@arm.com                interface</li>
11510259SAndrew.Bardsley@arm.com            <ul>
11610259SAndrew.Bardsley@arm.com                <li>LSQ - load store queue for memory ref. instructions</li>
11710259SAndrew.Bardsley@arm.com                <li>LSQ::DcachePort - interface to the D-cache from
11810259SAndrew.Bardsley@arm.com                    Execute</li>
11910259SAndrew.Bardsley@arm.com            </ul>
12010259SAndrew.Bardsley@arm.com        </ul>
12110259SAndrew.Bardsley@arm.com    </ul>
12210259SAndrew.Bardsley@arm.com</ul>
12310259SAndrew.Bardsley@arm.com
12410259SAndrew.Bardsley@arm.com\section keystruct Key data structures
12510259SAndrew.Bardsley@arm.com
12610259SAndrew.Bardsley@arm.com\subsection ids Instruction and line identity: InstId (dyn_inst.hh)
12710259SAndrew.Bardsley@arm.com
12810259SAndrew.Bardsley@arm.comAn InstId contains the sequence numbers and thread numbers that describe the
12910259SAndrew.Bardsley@arm.comlife cycle and instruction stream affiliations of individual fetched cache
13010259SAndrew.Bardsley@arm.comlines and instructions.
13110259SAndrew.Bardsley@arm.com
13210259SAndrew.Bardsley@arm.comAn InstId is printed in one of the following forms:
13310259SAndrew.Bardsley@arm.com
13410259SAndrew.Bardsley@arm.com    - T/S.P/L - for fetched cache lines
13510259SAndrew.Bardsley@arm.com    - T/S.P/L/F - for instructions before Decode
13610259SAndrew.Bardsley@arm.com    - T/S.P/L/F.E - for instructions from Decode onwards
13710259SAndrew.Bardsley@arm.com
13810259SAndrew.Bardsley@arm.comfor example:
13910259SAndrew.Bardsley@arm.com
14010259SAndrew.Bardsley@arm.com    - 0/10.12/5/6.7
14110259SAndrew.Bardsley@arm.com
14210259SAndrew.Bardsley@arm.comInstId's fields are:
14310259SAndrew.Bardsley@arm.com
14410259SAndrew.Bardsley@arm.com<table>
14510259SAndrew.Bardsley@arm.com<tr>
14610259SAndrew.Bardsley@arm.com    <td><b>Field</b></td>
14710259SAndrew.Bardsley@arm.com    <td><b>Symbol</b></td>
14810259SAndrew.Bardsley@arm.com    <td><b>Generated by</b></td>
14910259SAndrew.Bardsley@arm.com    <td><b>Checked by</b></td>
15010259SAndrew.Bardsley@arm.com    <td><b>Function</b></td>
15110259SAndrew.Bardsley@arm.com</tr>
15210259SAndrew.Bardsley@arm.com
15310259SAndrew.Bardsley@arm.com<tr>
15410259SAndrew.Bardsley@arm.com    <td>InstId::threadId</td>
15510259SAndrew.Bardsley@arm.com    <td>T</td>
15610259SAndrew.Bardsley@arm.com    <td>Fetch1</td>
15710259SAndrew.Bardsley@arm.com    <td>Everywhere the thread number is needed</td>
15810259SAndrew.Bardsley@arm.com    <td>Thread number (currently always 0).</td>
15910259SAndrew.Bardsley@arm.com</tr>
16010259SAndrew.Bardsley@arm.com
16110259SAndrew.Bardsley@arm.com<tr>
16210259SAndrew.Bardsley@arm.com    <td>InstId::streamSeqNum</td>
16310259SAndrew.Bardsley@arm.com    <td>S</td>
16410259SAndrew.Bardsley@arm.com    <td>Execute</td>
16510259SAndrew.Bardsley@arm.com    <td>Fetch1, Fetch2, Execute (to discard lines/insts)</td>
16610259SAndrew.Bardsley@arm.com    <td>Stream sequence number as chosen by Execute.  Stream
16710259SAndrew.Bardsley@arm.com        sequence numbers change after changes of PC (branches, exceptions) in
16810259SAndrew.Bardsley@arm.com        Execute and are used to separate pre and post branch instruction
16910259SAndrew.Bardsley@arm.com        streams.</td>
17010259SAndrew.Bardsley@arm.com</tr>
17110259SAndrew.Bardsley@arm.com
17210259SAndrew.Bardsley@arm.com<tr>
17310259SAndrew.Bardsley@arm.com    <td>InstId::predictionSeqNum</td>
17410259SAndrew.Bardsley@arm.com    <td>P</td>
17510259SAndrew.Bardsley@arm.com    <td>Fetch2</td>
17610259SAndrew.Bardsley@arm.com    <td>Fetch2 (while discarding lines after prediction)</td>
17710259SAndrew.Bardsley@arm.com    <td>Prediction sequence numbers represent branch prediction decisions.
17810259SAndrew.Bardsley@arm.com    This is used by Fetch2 to mark lines/instructions according to the last
17910259SAndrew.Bardsley@arm.com    followed branch prediction made by Fetch2.  Fetch2 can signal to Fetch1
18010259SAndrew.Bardsley@arm.com    that it should change its fetch address and mark lines with a new
18110259SAndrew.Bardsley@arm.com    prediction sequence number (which it will only do if the stream sequence
18210259SAndrew.Bardsley@arm.com    number Fetch1 expects matches that of the request).  </td> </tr>
18310259SAndrew.Bardsley@arm.com
18410259SAndrew.Bardsley@arm.com<tr>
18510259SAndrew.Bardsley@arm.com<td>InstId::lineSeqNum</td>
18610259SAndrew.Bardsley@arm.com<td>L</td>
18710259SAndrew.Bardsley@arm.com<td>Fetch1</td>
18810259SAndrew.Bardsley@arm.com<td>(Just for debugging)</td>
18910259SAndrew.Bardsley@arm.com<td>Line fetch sequence number of this cache line or the line
19010259SAndrew.Bardsley@arm.com    this instruction was extracted from.
19110259SAndrew.Bardsley@arm.com    </td>
19210259SAndrew.Bardsley@arm.com</tr>
19310259SAndrew.Bardsley@arm.com
19410259SAndrew.Bardsley@arm.com<tr>
19510259SAndrew.Bardsley@arm.com<td>InstId::fetchSeqNum</td>
19610259SAndrew.Bardsley@arm.com<td>F</td>
19710259SAndrew.Bardsley@arm.com<td>Fetch2</td>
19810259SAndrew.Bardsley@arm.com<td>Fetch2 (as the inst. sequence number for branches)</td>
19910259SAndrew.Bardsley@arm.com<td>Instruction fetch order assigned by Fetch2 when lines
20010259SAndrew.Bardsley@arm.com    are decomposed into instructions.
20110259SAndrew.Bardsley@arm.com    </td>
20210259SAndrew.Bardsley@arm.com</tr>
20310259SAndrew.Bardsley@arm.com
20410259SAndrew.Bardsley@arm.com<tr>
20510259SAndrew.Bardsley@arm.com<td>InstId::execSeqNum</td>
20610259SAndrew.Bardsley@arm.com<td>E</td>
20710259SAndrew.Bardsley@arm.com<td>Decode</td>
20810259SAndrew.Bardsley@arm.com<td>Execute (to check instruction identity in queues/FUs/LSQ)</td>
20910259SAndrew.Bardsley@arm.com<td>Instruction order after micro-op decomposition.</td>
21010259SAndrew.Bardsley@arm.com</tr>
21110259SAndrew.Bardsley@arm.com
21210259SAndrew.Bardsley@arm.com</table>
21310259SAndrew.Bardsley@arm.com
21410259SAndrew.Bardsley@arm.comThe sequence number fields are all independent of each other and although, for
21510259SAndrew.Bardsley@arm.cominstance, InstId::execSeqNum for an instruction will always be >=
21610259SAndrew.Bardsley@arm.comInstId::fetchSeqNum, the comparison is not useful.
21710259SAndrew.Bardsley@arm.com
21810259SAndrew.Bardsley@arm.comThe originating stage of each sequence number field keeps a counter for that
21910259SAndrew.Bardsley@arm.comfield which can be incremented in order to generate new, unique numbers.
22010259SAndrew.Bardsley@arm.com
22110259SAndrew.Bardsley@arm.com\subsection insts Instructions: MinorDynInst (dyn_inst.hh)
22210259SAndrew.Bardsley@arm.com
22310259SAndrew.Bardsley@arm.comMinorDynInst represents an instruction's progression through the pipeline.  An
22410259SAndrew.Bardsley@arm.cominstruction can be three things:
22510259SAndrew.Bardsley@arm.com
22610259SAndrew.Bardsley@arm.com<table>
22710259SAndrew.Bardsley@arm.com<tr>
22810259SAndrew.Bardsley@arm.com    <td><b>Thing</b></td>
22910259SAndrew.Bardsley@arm.com    <td><b>Predicate</b></td>
23010259SAndrew.Bardsley@arm.com    <td><b>Explanation</b></td>
23110259SAndrew.Bardsley@arm.com</tr>
23210259SAndrew.Bardsley@arm.com<tr>
23310259SAndrew.Bardsley@arm.com    <td>A bubble</td>
23410259SAndrew.Bardsley@arm.com    <td>MinorDynInst::isBubble()</td>
23510259SAndrew.Bardsley@arm.com    <td>no instruction at all, just a space-filler</td>
23610259SAndrew.Bardsley@arm.com</tr>
23710259SAndrew.Bardsley@arm.com<tr>
23810259SAndrew.Bardsley@arm.com    <td>A fault</td>
23910259SAndrew.Bardsley@arm.com    <td>MinorDynInst::isFault()</td>
24010259SAndrew.Bardsley@arm.com    <td>a fault to pass down the pipeline in an instruction's clothing</td>
24110259SAndrew.Bardsley@arm.com</tr>
24210259SAndrew.Bardsley@arm.com<tr>
24310259SAndrew.Bardsley@arm.com    <td>A decoded instruction</td>
24410259SAndrew.Bardsley@arm.com    <td>MinorDynInst::isInst()</td>
24510259SAndrew.Bardsley@arm.com    <td>instructions are actually passed to the gem5 decoder in Fetch2 and so
24610259SAndrew.Bardsley@arm.com    are created fully decoded.  MinorDynInst::staticInst is the decoded
24710259SAndrew.Bardsley@arm.com    instruction form.</td>
24810259SAndrew.Bardsley@arm.com</tr>
24910259SAndrew.Bardsley@arm.com</table>
25010259SAndrew.Bardsley@arm.com
25110259SAndrew.Bardsley@arm.comInstructions are reference counted using the gem5 RefCountingPtr
25210259SAndrew.Bardsley@arm.com(base/refcnt.hh) wrapper.  They therefore usually appear as MinorDynInstPtr in
25310259SAndrew.Bardsley@arm.comcode.  Note that as RefCountingPtr initialises as nullptr rather than an
25410259SAndrew.Bardsley@arm.comobject that supports BubbleIF::isBubble, passing raw MinorDynInstPtrs to
25510259SAndrew.Bardsley@arm.comQueue%s and other similar structures from stage.hh without boxing is
25610259SAndrew.Bardsley@arm.comdangerous.
25710259SAndrew.Bardsley@arm.com
25810259SAndrew.Bardsley@arm.com\subsection fld ForwardLineData (pipe_data.hh)
25910259SAndrew.Bardsley@arm.com
26010259SAndrew.Bardsley@arm.comForwardLineData is used to pass cache lines from Fetch1 to Fetch2.  Like
26110259SAndrew.Bardsley@arm.comMinorDynInst%s, they can be bubbles (ForwardLineData::isBubble()),
26210259SAndrew.Bardsley@arm.comfault-carrying or can contain a line (partial line) fetched by Fetch1.  The
26310259SAndrew.Bardsley@arm.comdata carried by ForwardLineData is owned by a Packet object returned from
26410259SAndrew.Bardsley@arm.commemory and is explicitly memory managed and do must be deleted once processed
26510259SAndrew.Bardsley@arm.com(by Fetch2 deleting the Packet).
26610259SAndrew.Bardsley@arm.com
26710259SAndrew.Bardsley@arm.com\subsection fid ForwardInstData (pipe_data.hh)
26810259SAndrew.Bardsley@arm.com
26910259SAndrew.Bardsley@arm.comForwardInstData can contain up to ForwardInstData::width() instructions in its
27010259SAndrew.Bardsley@arm.comForwardInstData::insts vector.  This structure is used to carry instructions
27110259SAndrew.Bardsley@arm.combetween Fetch2, Decode and Execute and to store input buffer vectors in Decode
27210259SAndrew.Bardsley@arm.comand Execute.
27310259SAndrew.Bardsley@arm.com
27410259SAndrew.Bardsley@arm.com\subsection fr Fetch1::FetchRequest (fetch1.hh)
27510259SAndrew.Bardsley@arm.com
27610259SAndrew.Bardsley@arm.comFetchRequests represent I-cache line fetch requests.  The are used in the
27710259SAndrew.Bardsley@arm.commemory queues of Fetch1 and are pushed into/popped from Packet::senderState
27810259SAndrew.Bardsley@arm.comwhile traversing the memory system.
27910259SAndrew.Bardsley@arm.com
28010259SAndrew.Bardsley@arm.comFetchRequests contain a memory system Request (mem/request.hh) for that fetch
28110259SAndrew.Bardsley@arm.comaccess, a packet (Packet, mem/packet.hh), if the request gets to memory, and a
28210259SAndrew.Bardsley@arm.comfault field that can be populated with a TLB-sourced prefetch fault (if any).
28310259SAndrew.Bardsley@arm.com
28410259SAndrew.Bardsley@arm.com\subsection lsqr LSQ::LSQRequest (execute.hh)
28510259SAndrew.Bardsley@arm.com
28610259SAndrew.Bardsley@arm.comLSQRequests are similar to FetchRequests but for D-cache accesses.  They carry
28710259SAndrew.Bardsley@arm.comthe instruction associated with a memory access.
28810259SAndrew.Bardsley@arm.com
28910259SAndrew.Bardsley@arm.com\section pipeline The pipeline
29010259SAndrew.Bardsley@arm.com
29110259SAndrew.Bardsley@arm.com\verbatim
29210259SAndrew.Bardsley@arm.com------------------------------------------------------------------------------
29310259SAndrew.Bardsley@arm.com    Key:
29410259SAndrew.Bardsley@arm.com
29510259SAndrew.Bardsley@arm.com    [] : inter-stage BufferBuffer
29610259SAndrew.Bardsley@arm.com    ,--.
29710259SAndrew.Bardsley@arm.com    |  | : pipeline stage
29810259SAndrew.Bardsley@arm.com    `--'
29910259SAndrew.Bardsley@arm.com    ---> : forward communication
30010259SAndrew.Bardsley@arm.com    <--- : backward communication
30110259SAndrew.Bardsley@arm.com
30210259SAndrew.Bardsley@arm.com    rv : reservation information for input buffers
30310259SAndrew.Bardsley@arm.com
30410259SAndrew.Bardsley@arm.com                ,------.     ,------.     ,------.     ,-------.
30510259SAndrew.Bardsley@arm.com (from  --[]-v->|Fetch1|-[]->|Fetch2|-[]->|Decode|-[]->|Execute|--> (to Fetch1
30610259SAndrew.Bardsley@arm.com Execute)    |  |      |<-[]-|      |<-rv-|      |<-rv-|       |     & Fetch2)
30710259SAndrew.Bardsley@arm.com             |  `------'<-rv-|      |     |      |     |       |
30810259SAndrew.Bardsley@arm.com             `-------------->|      |     |      |     |       |
30910259SAndrew.Bardsley@arm.com                             `------'     `------'     `-------'
31010259SAndrew.Bardsley@arm.com------------------------------------------------------------------------------
31110259SAndrew.Bardsley@arm.com\endverbatim
31210259SAndrew.Bardsley@arm.com
31310259SAndrew.Bardsley@arm.comThe four pipeline stages are connected together by MinorBuffer FIFO
31410259SAndrew.Bardsley@arm.com(stage.hh, derived ultimately from TimeBuffer) structures which allow
31510259SAndrew.Bardsley@arm.cominter-stage delays to be modelled.  There is a MinorBuffer%s between adjacent
31610259SAndrew.Bardsley@arm.comstages in the forward direction (for example: passing lines from Fetch1 to
31710259SAndrew.Bardsley@arm.comFetch2) and, between Fetch2 and Fetch1, a buffer in the backwards direction
31810259SAndrew.Bardsley@arm.comcarrying branch predictions.
31910259SAndrew.Bardsley@arm.com
32010259SAndrew.Bardsley@arm.comStages Fetch2, Decode and Execute have input buffers which, each cycle, can
32110259SAndrew.Bardsley@arm.comaccept input data from the previous stage and can hold that data if the stage
32210259SAndrew.Bardsley@arm.comis not ready to process it.  Input buffers store data in the same form as it
32310259SAndrew.Bardsley@arm.comis received and so Decode and Execute's input buffers contain the output
32410259SAndrew.Bardsley@arm.cominstruction vector (ForwardInstData (pipe_data.hh)) from their previous stages
32510259SAndrew.Bardsley@arm.comwith the instructions and bubbles in the same positions as a single buffer
32610259SAndrew.Bardsley@arm.comentry.
32710259SAndrew.Bardsley@arm.com
32810259SAndrew.Bardsley@arm.comStage input buffers provide a Reservable (stage.hh) interface to their
32910259SAndrew.Bardsley@arm.comprevious stages, to allow slots to be reserved in their input buffers, and
33010259SAndrew.Bardsley@arm.comcommunicate their input buffer occupancy backwards to allow the previous stage
33110259SAndrew.Bardsley@arm.comto plan whether it should make an output in a given cycle.
33210259SAndrew.Bardsley@arm.com
33310259SAndrew.Bardsley@arm.com\subsection events Event handling: MinorActivityRecorder (activity.hh,
33410259SAndrew.Bardsley@arm.compipeline.hh)
33510259SAndrew.Bardsley@arm.com
33610259SAndrew.Bardsley@arm.comMinor is essentially a cycle-callable model with some ability to skip cycles
33710259SAndrew.Bardsley@arm.combased on pipeline activity.  External events are mostly received by callbacks
33810259SAndrew.Bardsley@arm.com(e.g. Fetch1::IcachePort::recvTimingResp) and cause the pipeline to be woken
33910259SAndrew.Bardsley@arm.comup to service advancing request queues.
34010259SAndrew.Bardsley@arm.com
34110259SAndrew.Bardsley@arm.comTicked (sim/ticked.hh) is a base class bringing together an evaluate
34210259SAndrew.Bardsley@arm.commember function and a provided SimObject.  It provides a Ticked::start/stop
34310259SAndrew.Bardsley@arm.cominterface to start and pause clock events from being periodically issued.
34410259SAndrew.Bardsley@arm.comPipeline is a derived class of Ticked.
34510259SAndrew.Bardsley@arm.com
34610259SAndrew.Bardsley@arm.comDuring evaluate calls, stages can signal that they still have work to do in
34710259SAndrew.Bardsley@arm.comthe next cycle by calling either MinorCPU::activityRecorder->activity() (for
34810259SAndrew.Bardsley@arm.comnon-callable related activity) or MinorCPU::wakeupOnEvent(<stageId>) (for
34910259SAndrew.Bardsley@arm.comstage callback-related 'wakeup' activity).
35010259SAndrew.Bardsley@arm.com
35110259SAndrew.Bardsley@arm.comPipeline::evaluate contains calls to evaluate for each unit and a test for
35210259SAndrew.Bardsley@arm.compipeline idling which can turns off the clock tick if no unit has signalled
35310259SAndrew.Bardsley@arm.comthat it may become active next cycle.
35410259SAndrew.Bardsley@arm.com
35510259SAndrew.Bardsley@arm.comWithin Pipeline (pipeline.hh), the stages are evaluated in reverse order (and
35610259SAndrew.Bardsley@arm.comso will ::evaluate in reverse order) and their backwards data can be
35710259SAndrew.Bardsley@arm.comread immediately after being written in each cycle allowing output decisions
35810259SAndrew.Bardsley@arm.comto be 'perfect' (allowing synchronous stalling of the whole pipeline).  Branch
35910259SAndrew.Bardsley@arm.compredictions from Fetch2 to Fetch1 can also be transported in 0 cycles making
36010259SAndrew.Bardsley@arm.comfetch1ToFetch2BackwardDelay the only configurable delay which can be set as
36110259SAndrew.Bardsley@arm.comlow as 0 cycles.
36210259SAndrew.Bardsley@arm.com
36310259SAndrew.Bardsley@arm.comThe MinorCPU::activateContext and MinorCPU::suspendContext interface can be
36410259SAndrew.Bardsley@arm.comcalled to start and pause threads (threads in the MT sense) and to start and
36510259SAndrew.Bardsley@arm.compause the pipeline.  Executing instructions can call this interface
36610259SAndrew.Bardsley@arm.com(indirectly through the ThreadContext) to idle the CPU/their threads.
36710259SAndrew.Bardsley@arm.com
36810259SAndrew.Bardsley@arm.com\subsection stages Each pipeline stage
36910259SAndrew.Bardsley@arm.com
37010259SAndrew.Bardsley@arm.comIn general, the behaviour of a stage (each cycle) is:
37110259SAndrew.Bardsley@arm.com
37210259SAndrew.Bardsley@arm.com\verbatim
37310259SAndrew.Bardsley@arm.com    evaluate:
37410259SAndrew.Bardsley@arm.com        push input to inputBuffer
37510259SAndrew.Bardsley@arm.com        setup references to input/output data slots
37610259SAndrew.Bardsley@arm.com
37710259SAndrew.Bardsley@arm.com        do 'every cycle' 'step' tasks
37810259SAndrew.Bardsley@arm.com
37910259SAndrew.Bardsley@arm.com        if there is input and there is space in the next stage:
38010259SAndrew.Bardsley@arm.com            process and generate a new output
38110259SAndrew.Bardsley@arm.com            maybe re-activate the stage
38210259SAndrew.Bardsley@arm.com
38310259SAndrew.Bardsley@arm.com        send backwards data
38410259SAndrew.Bardsley@arm.com
38510259SAndrew.Bardsley@arm.com        if the stage generated output to the following FIFO:
38610259SAndrew.Bardsley@arm.com            signal pipe activity
38710259SAndrew.Bardsley@arm.com
38810259SAndrew.Bardsley@arm.com        if the stage has more processable input and space in the next stage:
38910259SAndrew.Bardsley@arm.com            re-activate the stage for the next cycle
39010259SAndrew.Bardsley@arm.com
39110259SAndrew.Bardsley@arm.com        commit the push to the inputBuffer if that data hasn't all been used
39210259SAndrew.Bardsley@arm.com\endverbatim
39310259SAndrew.Bardsley@arm.com
39410259SAndrew.Bardsley@arm.comThe Execute stage differs from this model as its forward output (branch) data
39510259SAndrew.Bardsley@arm.comis unconditionally sent to Fetch1 and Fetch2.  To allow this behaviour, Fetch1
39610259SAndrew.Bardsley@arm.comand Fetch2 must be unconditionally receptive to that data.
39710259SAndrew.Bardsley@arm.com
39810259SAndrew.Bardsley@arm.com\subsection fetch1 Fetch1 stage
39910259SAndrew.Bardsley@arm.com
40010259SAndrew.Bardsley@arm.comFetch1 is responsible for fetching cache lines or partial cache lines from the
40110259SAndrew.Bardsley@arm.comI-cache and passing them on to Fetch2 to be decomposed into instructions.  It
40210259SAndrew.Bardsley@arm.comcan receive 'change of stream' indications from both Execute and Fetch2 to
40310259SAndrew.Bardsley@arm.comsignal that it should change its internal fetch address and tag newly fetched
40410259SAndrew.Bardsley@arm.comlines with new stream or prediction sequence numbers.  When both Execute and
40510259SAndrew.Bardsley@arm.comFetch2 signal changes of stream at the same time, Fetch1 takes Execute's
40610259SAndrew.Bardsley@arm.comchange.
40710259SAndrew.Bardsley@arm.com
40810259SAndrew.Bardsley@arm.comEvery line issued by Fetch1 will bear a unique line sequence number which can
40910259SAndrew.Bardsley@arm.combe used for debugging stream changes.
41010259SAndrew.Bardsley@arm.com
41110259SAndrew.Bardsley@arm.comWhen fetching from the I-cache, Fetch1 will ask for data from the current
41210259SAndrew.Bardsley@arm.comfetch address (Fetch1::pc) up to the end of the 'data snap' size set in the
41310259SAndrew.Bardsley@arm.comparameter fetch1LineSnapWidth.  Subsequent autonomous line fetches will fetch
41410259SAndrew.Bardsley@arm.comwhole lines at a snap boundary and of size fetch1LineWidth.
41510259SAndrew.Bardsley@arm.com
41610259SAndrew.Bardsley@arm.comFetch1 will only initiate a memory fetch if it can reserve space in Fetch2
41710259SAndrew.Bardsley@arm.cominput buffer.  That input buffer serves an the fetch queue/LFL for the system.
41810259SAndrew.Bardsley@arm.com
41910259SAndrew.Bardsley@arm.comFetch1 contains two queues: requests and transfers to handle the stages of
42010259SAndrew.Bardsley@arm.comtranslating the address of a line fetch (via the TLB) and accommodating the
42110259SAndrew.Bardsley@arm.comrequest/response of fetches to/from memory.
42210259SAndrew.Bardsley@arm.com
42310259SAndrew.Bardsley@arm.comFetch requests from Fetch1 are pushed into the requests queue as newly
42410259SAndrew.Bardsley@arm.comallocated FetchRequest objects once they have been sent to the ITLB with a
42510259SAndrew.Bardsley@arm.comcall to itb->translateTiming.
42610259SAndrew.Bardsley@arm.com
42710259SAndrew.Bardsley@arm.comA response from the TLB moves the request from the requests queue to the
42810259SAndrew.Bardsley@arm.comtransfers queue.  If there is more than one entry in each queue, it is
42910259SAndrew.Bardsley@arm.compossible to get a TLB response for request which is not at the head of the
43010259SAndrew.Bardsley@arm.comrequests queue.  In that case, the TLB response is marked up as a state change
43110259SAndrew.Bardsley@arm.comto Translated in the request object, and advancing the request to transfers
43210259SAndrew.Bardsley@arm.com(and the memory system) is left to calls to Fetch1::stepQueues which is called
43310259SAndrew.Bardsley@arm.comin the cycle following any event is received.
43410259SAndrew.Bardsley@arm.com
43510259SAndrew.Bardsley@arm.comFetch1::tryToSendToTransfers is responsible for moving requests between the
43610259SAndrew.Bardsley@arm.comtwo queues and issuing requests to memory.  Failed TLB lookups (prefetch
43710259SAndrew.Bardsley@arm.comaborts) continue to occupy space in the queues until they are recovered at the
43810259SAndrew.Bardsley@arm.comhead of transfers.
43910259SAndrew.Bardsley@arm.com
44010259SAndrew.Bardsley@arm.comResponses from memory change the request object state to Complete and
44110259SAndrew.Bardsley@arm.comFetch1::evaluate can pick up response data, package it in the ForwardLineData
44210259SAndrew.Bardsley@arm.comobject, and forward it to Fetch2%'s input buffer.
44310259SAndrew.Bardsley@arm.com
44410259SAndrew.Bardsley@arm.comAs space is always reserved in Fetch2::inputBuffer, setting the input buffer's
44510259SAndrew.Bardsley@arm.comsize to 1 results in non-prefetching behaviour.
44610259SAndrew.Bardsley@arm.com
44710259SAndrew.Bardsley@arm.comWhen a change of stream occurs, translated requests queue members and
44810259SAndrew.Bardsley@arm.comcompleted transfers queue members can be unconditionally discarded to make way
44910259SAndrew.Bardsley@arm.comfor new transfers.
45010259SAndrew.Bardsley@arm.com
45110259SAndrew.Bardsley@arm.com\subsection fetch2 Fetch2 stage
45210259SAndrew.Bardsley@arm.com
45310259SAndrew.Bardsley@arm.comFetch2 receives a line from Fetch1 into its input buffer.  The data in the
45410259SAndrew.Bardsley@arm.comhead line in that buffer is iterated over and separated into individual
45510259SAndrew.Bardsley@arm.cominstructions which are packed into a vector of instructions which can be
45610259SAndrew.Bardsley@arm.compassed to Decode.  Packing instructions can be aborted early if a fault is
45710259SAndrew.Bardsley@arm.comfound in either the input line as a whole or a decomposed instruction.
45810259SAndrew.Bardsley@arm.com
45910259SAndrew.Bardsley@arm.com\subsubsection bp Branch prediction
46010259SAndrew.Bardsley@arm.com
46110259SAndrew.Bardsley@arm.comFetch2 contains the branch prediction mechanism.  This is a wrapper around the
46210259SAndrew.Bardsley@arm.combranch predictor interface provided by gem5 (cpu/pred/...).
46310259SAndrew.Bardsley@arm.com
46410259SAndrew.Bardsley@arm.comBranches are predicted for any control instructions found.  If prediction is
46510259SAndrew.Bardsley@arm.comattempted for an instruction, the MinorDynInst::triedToPredict flag is set on
46610259SAndrew.Bardsley@arm.comthat instruction.
46710259SAndrew.Bardsley@arm.com
46810259SAndrew.Bardsley@arm.comWhen a branch is predicted to take, the MinorDynInst::predictedTaken flag is
46910259SAndrew.Bardsley@arm.comset and MinorDynInst::predictedTarget is set to the predicted target PC value.
47010259SAndrew.Bardsley@arm.comThe predicted branch instruction is then packed into Fetch2%'s output vector,
47110259SAndrew.Bardsley@arm.comthe prediction sequence number is incremented, and the branch is communicated
47210259SAndrew.Bardsley@arm.comto Fetch1.
47310259SAndrew.Bardsley@arm.com
47410259SAndrew.Bardsley@arm.comAfter signalling a prediction, Fetch2 will discard its input buffer contents
47510259SAndrew.Bardsley@arm.comand will reject any new lines which have the same stream sequence number as
47610259SAndrew.Bardsley@arm.comthat branch but have a different prediction sequence number.  This allows
47710259SAndrew.Bardsley@arm.comfollowing sequentially fetched lines to be rejected without ignoring new lines
47810259SAndrew.Bardsley@arm.comgenerated by a change of stream indicated from a 'real' branch from Execute
47910259SAndrew.Bardsley@arm.com(which will have a new stream sequence number).
48010259SAndrew.Bardsley@arm.com
48110259SAndrew.Bardsley@arm.comThe program counter value provided to Fetch2 by Fetch1 packets is only updated
48210259SAndrew.Bardsley@arm.comwhen there is a change of stream.  Fetch2::havePC indicates whether the PC
48310259SAndrew.Bardsley@arm.comwill be picked up from the next processed input line.  Fetch2::havePC is
48410259SAndrew.Bardsley@arm.comnecessary to allow line-wrapping instructions to be tracked through decode.
48510259SAndrew.Bardsley@arm.com
48610259SAndrew.Bardsley@arm.comBranches (and instructions predicted to branch) which are processed by Execute
48710259SAndrew.Bardsley@arm.comwill generate BranchData (pipe_data.hh) data explaining the outcome of the
48810259SAndrew.Bardsley@arm.combranch which is sent forwards to Fetch1 and Fetch2.  Fetch1 uses this data to
48910259SAndrew.Bardsley@arm.comchange stream (and update its stream sequence number and address for new
49010259SAndrew.Bardsley@arm.comlines).  Fetch2 uses it to update the branch predictor.   Minor does not
49110259SAndrew.Bardsley@arm.comcommunicate branch data to the branch predictor for instructions which are
49210259SAndrew.Bardsley@arm.comdiscarded on the way to commit.
49310259SAndrew.Bardsley@arm.com
49410259SAndrew.Bardsley@arm.comBranchData::BranchReason (pipe_data.hh) encodes the possible branch scenarios:
49510259SAndrew.Bardsley@arm.com
49610259SAndrew.Bardsley@arm.com<table>
49710259SAndrew.Bardsley@arm.com<tr>
49810259SAndrew.Bardsley@arm.com    <td>Branch enum val.</td>
49910259SAndrew.Bardsley@arm.com    <td>In Execute</td>
50010259SAndrew.Bardsley@arm.com    <td>Fetch1 reaction</td>
50110259SAndrew.Bardsley@arm.com    <td>Fetch2 reaction</td>
50210259SAndrew.Bardsley@arm.com</tr>
50310259SAndrew.Bardsley@arm.com<tr>
50410259SAndrew.Bardsley@arm.com    <td>NoBranch</td>
50510259SAndrew.Bardsley@arm.com    <td>(output bubble data)</td>
50610259SAndrew.Bardsley@arm.com    <td>-</td>
50710259SAndrew.Bardsley@arm.com    <td>-</td>
50810259SAndrew.Bardsley@arm.com</tr>
50910259SAndrew.Bardsley@arm.com<tr>
51010259SAndrew.Bardsley@arm.com    <td>CorrectlyPredictedBranch</td>
51110259SAndrew.Bardsley@arm.com    <td>Predicted, taken</td>
51210259SAndrew.Bardsley@arm.com    <td>-</td>
51310259SAndrew.Bardsley@arm.com    <td>Update BP as taken branch</td>
51410259SAndrew.Bardsley@arm.com</tr>
51510259SAndrew.Bardsley@arm.com<tr>
51610259SAndrew.Bardsley@arm.com    <td>UnpredictedBranch</td>
51710259SAndrew.Bardsley@arm.com    <td>Not predicted, taken and was taken</td>
51810259SAndrew.Bardsley@arm.com    <td>New stream</td>
51910259SAndrew.Bardsley@arm.com    <td>Update BP as taken branch</td>
52010259SAndrew.Bardsley@arm.com</tr>
52110259SAndrew.Bardsley@arm.com<tr>
52210259SAndrew.Bardsley@arm.com    <td>BadlyPredictedBranch</td>
52310259SAndrew.Bardsley@arm.com    <td>Predicted, not taken</td>
52410259SAndrew.Bardsley@arm.com    <td>New stream to restore to old inst. source</td>
52510259SAndrew.Bardsley@arm.com    <td>Update BP as not taken branch</td>
52610259SAndrew.Bardsley@arm.com</tr>
52710259SAndrew.Bardsley@arm.com<tr>
52810259SAndrew.Bardsley@arm.com    <td>BadlyPredictedBranchTarget</td>
52910259SAndrew.Bardsley@arm.com    <td>Predicted, taken, but to a different target than predicted one</td>
53010259SAndrew.Bardsley@arm.com    <td>New stream</td>
53110259SAndrew.Bardsley@arm.com    <td>Update BTB to new target</td>
53210259SAndrew.Bardsley@arm.com</tr>
53310259SAndrew.Bardsley@arm.com<tr>
53410259SAndrew.Bardsley@arm.com    <td>SuspendThread</td>
53510259SAndrew.Bardsley@arm.com    <td>Hint to suspend fetching</td>
53610259SAndrew.Bardsley@arm.com    <td>Suspend fetch for this thread (branch to next inst. as wakeup
53710259SAndrew.Bardsley@arm.com        fetch addr)</td>
53810259SAndrew.Bardsley@arm.com    <td>-</td>
53910259SAndrew.Bardsley@arm.com</tr>
54010259SAndrew.Bardsley@arm.com<tr>
54110259SAndrew.Bardsley@arm.com    <td>Interrupt</td>
54210259SAndrew.Bardsley@arm.com    <td>Interrupt detected</td>
54310259SAndrew.Bardsley@arm.com    <td>New stream</td>
54410259SAndrew.Bardsley@arm.com    <td>-</td>
54510259SAndrew.Bardsley@arm.com</tr>
54610259SAndrew.Bardsley@arm.com</table>
54710259SAndrew.Bardsley@arm.com
54810259SAndrew.Bardsley@arm.comThe parameter decodeInputWidth sets the number of instructions which can be
54910259SAndrew.Bardsley@arm.compacked into the output per cycle.  If the parameter fetch2CycleInput is true,
55010259SAndrew.Bardsley@arm.comDecode can try to take instructions from more than one entry in its input
55110259SAndrew.Bardsley@arm.combuffer per cycle.
55210259SAndrew.Bardsley@arm.com
55310259SAndrew.Bardsley@arm.com\subsection decode Decode stage
55410259SAndrew.Bardsley@arm.com
55510259SAndrew.Bardsley@arm.comDecode takes a vector of instructions from Fetch2 (via its input buffer) and
55610259SAndrew.Bardsley@arm.comdecomposes those instructions into micro-ops (if necessary) and packs them
55710259SAndrew.Bardsley@arm.cominto its output instruction vector.
55810259SAndrew.Bardsley@arm.com
55910259SAndrew.Bardsley@arm.comThe parameter executeInputWidth sets the number of instructions which can be
56010259SAndrew.Bardsley@arm.compacked into the output per cycle.  If the parameter decodeCycleInput is true,
56110259SAndrew.Bardsley@arm.comDecode can try to take instructions from more than one entry in its input
56210259SAndrew.Bardsley@arm.combuffer per cycle.
56310259SAndrew.Bardsley@arm.com
56410259SAndrew.Bardsley@arm.com\subsection execute Execute stage
56510259SAndrew.Bardsley@arm.com
56610259SAndrew.Bardsley@arm.comExecute provides all the instruction execution and memory access mechanisms.
56710259SAndrew.Bardsley@arm.comAn instructions passage through Execute can take multiple cycles with its
56810259SAndrew.Bardsley@arm.comprecise timing modelled by a functional unit pipeline FIFO.
56910259SAndrew.Bardsley@arm.com
57010259SAndrew.Bardsley@arm.comA vector of instructions (possibly including fault 'instructions') is provided
57110259SAndrew.Bardsley@arm.comto Execute by Decode and can be queued in the Execute input buffer before
57210259SAndrew.Bardsley@arm.combeing issued.  Setting the parameter executeCycleInput allows execute to
57310259SAndrew.Bardsley@arm.comexamine more than one input buffer entry (more than one instruction vector).
57410259SAndrew.Bardsley@arm.comThe number of instructions in the input vector can be set with
57510259SAndrew.Bardsley@arm.comexecuteInputWidth and the depth of the input buffer can be set with parameter
57610259SAndrew.Bardsley@arm.comexecuteInputBufferSize.
57710259SAndrew.Bardsley@arm.com
57810259SAndrew.Bardsley@arm.com\subsubsection fus Functional units
57910259SAndrew.Bardsley@arm.com
58010259SAndrew.Bardsley@arm.comThe Execute stage contains pipelines for each functional unit comprising the
58110259SAndrew.Bardsley@arm.comcomputational core of the CPU.  Functional units are configured via the
58210259SAndrew.Bardsley@arm.comexecuteFuncUnits parameter.  Each functional unit has a number of instruction
58310259SAndrew.Bardsley@arm.comclasses it supports, a stated delay between instruction issues, and a delay
58410259SAndrew.Bardsley@arm.comfrom instruction issue to (possible) commit and an optional timing annotation
58510259SAndrew.Bardsley@arm.comcapable of more complicated timing.
58610259SAndrew.Bardsley@arm.com
58710259SAndrew.Bardsley@arm.comEach active cycle, Execute::evaluate performs this action:
58810259SAndrew.Bardsley@arm.com
58910259SAndrew.Bardsley@arm.com\verbatim
59010259SAndrew.Bardsley@arm.com    Execute::evaluate:
59110259SAndrew.Bardsley@arm.com        push input to inputBuffer
59210259SAndrew.Bardsley@arm.com        setup references to input/output data slots and branch output slot
59310259SAndrew.Bardsley@arm.com
59410259SAndrew.Bardsley@arm.com        step D-cache interface queues (similar to Fetch1)
59510259SAndrew.Bardsley@arm.com
59610259SAndrew.Bardsley@arm.com        if interrupt posted:
59710259SAndrew.Bardsley@arm.com            take interrupt (signalling branch to Fetch1/Fetch2)
59810259SAndrew.Bardsley@arm.com        else
59910259SAndrew.Bardsley@arm.com            commit instructions
60010259SAndrew.Bardsley@arm.com            issue new instructions
60110259SAndrew.Bardsley@arm.com
60210259SAndrew.Bardsley@arm.com        advance functional unit pipelines
60310259SAndrew.Bardsley@arm.com
60410259SAndrew.Bardsley@arm.com        reactivate Execute if the unit is still active
60510259SAndrew.Bardsley@arm.com
60610259SAndrew.Bardsley@arm.com        commit the push to the inputBuffer if that data hasn't all been used
60710259SAndrew.Bardsley@arm.com\endverbatim
60810259SAndrew.Bardsley@arm.com
60910259SAndrew.Bardsley@arm.com\subsubsection fifos Functional unit FIFOs
61010259SAndrew.Bardsley@arm.com
61110259SAndrew.Bardsley@arm.comFunctional units are implemented as SelfStallingPipelines (stage.hh).  These
61210259SAndrew.Bardsley@arm.comare TimeBuffer FIFOs with two distinct 'push' and 'pop' wires.  They respond
61310259SAndrew.Bardsley@arm.comto SelfStallingPipeline::advance in the same way as TimeBuffers <b>unless</b>
61410259SAndrew.Bardsley@arm.comthere is data at the far, 'pop', end of the FIFO.  A 'stalled' flag is
61510259SAndrew.Bardsley@arm.comprovided for signalling stalling and to allow a stall to be cleared.  The
61610259SAndrew.Bardsley@arm.comintention is to provide a pipeline for each functional unit which will never
61710259SAndrew.Bardsley@arm.comadvance an instruction out of that pipeline until it has been processed and
61810259SAndrew.Bardsley@arm.comthe pipeline is explicitly unstalled.
61910259SAndrew.Bardsley@arm.com
62010259SAndrew.Bardsley@arm.comThe actions 'issue', 'commit', and 'advance' act on the functional units.
62110259SAndrew.Bardsley@arm.com
62210259SAndrew.Bardsley@arm.com\subsubsection issue Issue
62310259SAndrew.Bardsley@arm.com
62410259SAndrew.Bardsley@arm.comIssuing instructions involves iterating over both the input buffer
62510259SAndrew.Bardsley@arm.cominstructions and the heads of the functional units to try and issue
62610259SAndrew.Bardsley@arm.cominstructions in order.  The number of instructions which can be issued each
62710259SAndrew.Bardsley@arm.comcycle is limited by the parameter executeIssueLimit, how executeCycleInput is
62810259SAndrew.Bardsley@arm.comset, the availability of pipeline space and the policy used to choose a
62910259SAndrew.Bardsley@arm.compipeline in which the instruction can be issued.
63010259SAndrew.Bardsley@arm.com
63110259SAndrew.Bardsley@arm.comAt present, the only issue policy is strict round-robin visiting of each
63210259SAndrew.Bardsley@arm.compipeline with the given instructions in sequence.  For greater flexibility,
63310259SAndrew.Bardsley@arm.combetter (and more specific policies) will need to be possible.
63410259SAndrew.Bardsley@arm.com
63510259SAndrew.Bardsley@arm.comMemory operation instructions traverse their functional units to perform their
63610259SAndrew.Bardsley@arm.comEA calculations.  On 'commit', the ExecContext::initiateAcc execution phase is
63710259SAndrew.Bardsley@arm.comperformed and any memory access is issued (via. ExecContext::{read,write}Mem
63810259SAndrew.Bardsley@arm.comcalling LSQ::pushRequest) to the LSQ.
63910259SAndrew.Bardsley@arm.com
64010259SAndrew.Bardsley@arm.comNote that faults are issued as if they are instructions and can (currently) be
64110259SAndrew.Bardsley@arm.comissued to *any* functional unit.
64210259SAndrew.Bardsley@arm.com
64310259SAndrew.Bardsley@arm.comEvery issued instruction is also pushed into the Execute::inFlightInsts queue.
64410259SAndrew.Bardsley@arm.comMemory ref. instructions are pushing into Execute::inFUMemInsts queue.
64510259SAndrew.Bardsley@arm.com
64610259SAndrew.Bardsley@arm.com\subsubsection commit Commit
64710259SAndrew.Bardsley@arm.com
64810259SAndrew.Bardsley@arm.comInstructions are committed by examining the head of the Execute::inFlightInsts
64910259SAndrew.Bardsley@arm.comqueue (which is decorated with the functional unit number to which the
65010259SAndrew.Bardsley@arm.cominstruction was issued).  Instructions which can then be found in their
65110259SAndrew.Bardsley@arm.comfunctional units are executed and popped from Execute::inFlightInsts.
65210259SAndrew.Bardsley@arm.com
65310259SAndrew.Bardsley@arm.comMemory operation instructions are committed into the memory queues (as
65410259SAndrew.Bardsley@arm.comdescribed above) and exit their functional unit pipeline but are not popped
65510259SAndrew.Bardsley@arm.comfrom the Execute::inFlightInsts queue.  The Execute::inFUMemInsts queue
65610259SAndrew.Bardsley@arm.comprovides ordering to memory operations as they pass through the functional
65710259SAndrew.Bardsley@arm.comunits (maintaining issue order).  On entering the LSQ, instructions are popped
65810259SAndrew.Bardsley@arm.comfrom Execute::inFUMemInsts.
65910259SAndrew.Bardsley@arm.com
66010259SAndrew.Bardsley@arm.comIf the parameter executeAllowEarlyMemoryIssue is set, memory operations can be
66110259SAndrew.Bardsley@arm.comsent from their FU to the LSQ before reaching the head of
66210259SAndrew.Bardsley@arm.comExecute::inFlightInsts but after their dependencies are met.
66310259SAndrew.Bardsley@arm.comMinorDynInst::instToWaitFor is marked up with the latest dependent instruction
66410259SAndrew.Bardsley@arm.comexecSeqNum required to be committed for a memory operation to progress to the
66510259SAndrew.Bardsley@arm.comLSQ.
66610259SAndrew.Bardsley@arm.com
66710259SAndrew.Bardsley@arm.comOnce a memory response is available (by testing the head of
66810259SAndrew.Bardsley@arm.comExecute::inFlightInsts against LSQ::findResponse), commit will process that
66910259SAndrew.Bardsley@arm.comresponse (ExecContext::completeAcc) and pop the instruction from
67010259SAndrew.Bardsley@arm.comExecute::inFlightInsts.
67110259SAndrew.Bardsley@arm.com
67210259SAndrew.Bardsley@arm.comAny branch, fault or interrupt will cause a stream sequence number change and
67310259SAndrew.Bardsley@arm.comsignal a branch to Fetch1/Fetch2.  Only instructions with the current stream
67410259SAndrew.Bardsley@arm.comsequence number will be issued and/or committed.
67510259SAndrew.Bardsley@arm.com
67610259SAndrew.Bardsley@arm.com\subsubsection advance Advance
67710259SAndrew.Bardsley@arm.com
67810259SAndrew.Bardsley@arm.comAll non-stalled pipeline are advanced and may, thereafter, become stalled.
67910259SAndrew.Bardsley@arm.comPotential activity in the next cycle is signalled if there are any
68010259SAndrew.Bardsley@arm.cominstructions remaining in any pipeline.
68110259SAndrew.Bardsley@arm.com
68210259SAndrew.Bardsley@arm.com\subsubsection sb Scoreboard
68310259SAndrew.Bardsley@arm.com
68410259SAndrew.Bardsley@arm.comThe scoreboard (Scoreboard) is used to control instruction issue.  It contains
68510259SAndrew.Bardsley@arm.coma count of the number of in flight instructions which will write each general
68610259SAndrew.Bardsley@arm.compurpose CPU integer or float register.  Instructions will only be issued where
68710259SAndrew.Bardsley@arm.comthe scoreboard contains a count of 0 instructions which will write to one of
68810259SAndrew.Bardsley@arm.comthe instructions source registers.
68910259SAndrew.Bardsley@arm.com
69010259SAndrew.Bardsley@arm.comOnce an instruction is issued, the scoreboard counts for each destination
69110259SAndrew.Bardsley@arm.comregister for an instruction will be incremented.
69210259SAndrew.Bardsley@arm.com
69310259SAndrew.Bardsley@arm.comThe estimated delivery time of the instruction's result is marked up in the
69410259SAndrew.Bardsley@arm.comscoreboard by adding the length of the issued-to FU to the current time.  The
69510259SAndrew.Bardsley@arm.comtimings parameter on each FU provides a list of additional rules for
69610259SAndrew.Bardsley@arm.comcalculating the delivery time.  These are documented in the parameter comments
69710259SAndrew.Bardsley@arm.comin MinorCPU.py.
69810259SAndrew.Bardsley@arm.com
69910259SAndrew.Bardsley@arm.comOn commit, (for memory operations, memory response commit) the scoreboard
70010259SAndrew.Bardsley@arm.comcounters for an instruction's source registers are decremented.  will be
70110259SAndrew.Bardsley@arm.comdecremented.
70210259SAndrew.Bardsley@arm.com
70310259SAndrew.Bardsley@arm.com\subsubsection ifi Execute::inFlightInsts
70410259SAndrew.Bardsley@arm.com
70510259SAndrew.Bardsley@arm.comThe Execute::inFlightInsts queue will always contain all instructions in
70610259SAndrew.Bardsley@arm.comflight in Execute in the correct issue order.  Execute::issue is the only
70710259SAndrew.Bardsley@arm.comprocess which will push an instruction into the queue.  Execute::commit is the
70810259SAndrew.Bardsley@arm.comonly process that can pop an instruction.
70910259SAndrew.Bardsley@arm.com
71010259SAndrew.Bardsley@arm.com\subsubsection lsq LSQ
71110259SAndrew.Bardsley@arm.com
71210259SAndrew.Bardsley@arm.comThe LSQ can support multiple outstanding transactions to memory in a number of
71310259SAndrew.Bardsley@arm.comconservative cases.
71410259SAndrew.Bardsley@arm.com
71510259SAndrew.Bardsley@arm.comThere are three queues to contain requests: requests, transfers and the store
71610259SAndrew.Bardsley@arm.combuffer.  The requests and transfers queue operate in a similar manner to the
71710259SAndrew.Bardsley@arm.comqueues in Fetch1.  The store buffer is used to decouple the delay of
71810259SAndrew.Bardsley@arm.comcompleting store operations from following loads.
71910259SAndrew.Bardsley@arm.com
72010259SAndrew.Bardsley@arm.comRequests are issued to the DTLB as their instructions leave their functional
72110259SAndrew.Bardsley@arm.comunit.  At the head of requests, cacheable load requests can be sent to memory
72210259SAndrew.Bardsley@arm.comand on to the transfers queue.  Cacheable stores will be passed to transfers
72310259SAndrew.Bardsley@arm.comunprocessed and progress that queue maintaining order with other transactions.
72410259SAndrew.Bardsley@arm.com
72510259SAndrew.Bardsley@arm.comThe conditions in LSQ::tryToSendToTransfers dictate when requests can
72610259SAndrew.Bardsley@arm.combe sent to memory.
72710259SAndrew.Bardsley@arm.com
72810259SAndrew.Bardsley@arm.comAll uncacheable transactions, split transactions and locked transactions are
72910259SAndrew.Bardsley@arm.comprocessed in order at the head of requests.  Additionally, store results
73010259SAndrew.Bardsley@arm.comresiding in the store buffer can have their data forwarded to cacheable loads
73110259SAndrew.Bardsley@arm.com(removing the need to perform a read from memory) but no cacheable load can be
73210259SAndrew.Bardsley@arm.comissue to the transfers queue until that queue's stores have drained into the
73310259SAndrew.Bardsley@arm.comstore buffer.
73410259SAndrew.Bardsley@arm.com
73510259SAndrew.Bardsley@arm.comAt the end of transfers, requests which are LSQ::LSQRequest::Complete (are
73610259SAndrew.Bardsley@arm.comfaulting, are cacheable stores, or have been sent to memory and received a
73710259SAndrew.Bardsley@arm.comresponse) can be picked off by Execute and either committed
73810259SAndrew.Bardsley@arm.com(ExecContext::completeAcc) and, for stores, be sent to the store buffer.
73910259SAndrew.Bardsley@arm.com
74010259SAndrew.Bardsley@arm.comBarrier instructions do not prevent cacheable loads from progressing to memory
74110259SAndrew.Bardsley@arm.combut do cause a stream change which will discard that load.  Stores will not be
74210259SAndrew.Bardsley@arm.comcommitted to the store buffer if they are in the shadow of the barrier but
74310259SAndrew.Bardsley@arm.combefore the new instruction stream has arrived at Execute.  As all other memory
74410259SAndrew.Bardsley@arm.comtransactions are delayed at the end of the requests queue until they are at
74510259SAndrew.Bardsley@arm.comthe head of Execute::inFlightInsts, they will be discarded by any barrier
74610259SAndrew.Bardsley@arm.comstream change.
74710259SAndrew.Bardsley@arm.com
74810259SAndrew.Bardsley@arm.comAfter commit, LSQ::BarrierDataRequest requests are inserted into the
74910259SAndrew.Bardsley@arm.comstore buffer to track each barrier until all preceding memory transactions
75010259SAndrew.Bardsley@arm.comhave drained from the store buffer.  No further memory transactions will be
75110259SAndrew.Bardsley@arm.comissued from the ends of FUs until after the barrier has drained.
75210259SAndrew.Bardsley@arm.com
75310259SAndrew.Bardsley@arm.com\subsubsection drain Draining
75410259SAndrew.Bardsley@arm.com
75510259SAndrew.Bardsley@arm.comDraining is mostly handled by the Execute stage.  When initiated by calling
75610259SAndrew.Bardsley@arm.comMinorCPU::drain, Pipeline::evaluate checks the draining status of each unit
75710259SAndrew.Bardsley@arm.comeach cycle and keeps the pipeline active until draining is complete.  It is
75810259SAndrew.Bardsley@arm.comPipeline that signals the completion of draining.  Execute is triggered by
75910259SAndrew.Bardsley@arm.comMinorCPU::drain and starts stepping through its Execute::DrainState state
76010259SAndrew.Bardsley@arm.commachine, starting from state Execute::NotDraining, in this order:
76110259SAndrew.Bardsley@arm.com
76210259SAndrew.Bardsley@arm.com<table>
76310259SAndrew.Bardsley@arm.com<tr>
76410259SAndrew.Bardsley@arm.com    <td><b>State</b></td>
76510259SAndrew.Bardsley@arm.com    <td><b>Meaning</b></td>
76610259SAndrew.Bardsley@arm.com</tr>
76710259SAndrew.Bardsley@arm.com<tr>
76810259SAndrew.Bardsley@arm.com    <td>Execute::NotDraining</td>
76910259SAndrew.Bardsley@arm.com    <td>Not trying to drain, normal execution</td>
77010259SAndrew.Bardsley@arm.com</tr>
77110259SAndrew.Bardsley@arm.com<tr>
77210259SAndrew.Bardsley@arm.com    <td>Execute::DrainCurrentInst</td>
77310259SAndrew.Bardsley@arm.com    <td>Draining micro-ops to complete inst.</td>
77410259SAndrew.Bardsley@arm.com</tr>
77510259SAndrew.Bardsley@arm.com<tr>
77610259SAndrew.Bardsley@arm.com    <td>Execute::DrainHaltFetch</td>
77710259SAndrew.Bardsley@arm.com    <td>Halt fetching instructions</td>
77810259SAndrew.Bardsley@arm.com</tr>
77910259SAndrew.Bardsley@arm.com<tr>
78010259SAndrew.Bardsley@arm.com    <td>Execute::DrainAllInsts</td>
78110259SAndrew.Bardsley@arm.com    <td>Discarding all instructions presented</td>
78210259SAndrew.Bardsley@arm.com</tr>
78310259SAndrew.Bardsley@arm.com</table>
78410259SAndrew.Bardsley@arm.com
78510259SAndrew.Bardsley@arm.comWhen complete, a drained Execute unit will be in the Execute::DrainAllInsts
78610259SAndrew.Bardsley@arm.comstate where it will continue to discard instructions but has no knowledge of
78710259SAndrew.Bardsley@arm.comthe drained state of the rest of the model.
78810259SAndrew.Bardsley@arm.com
78910259SAndrew.Bardsley@arm.com\section debug Debug options
79010259SAndrew.Bardsley@arm.com
79110259SAndrew.Bardsley@arm.comThe model provides a number of debug flags which can be passed to gem5 with
79210259SAndrew.Bardsley@arm.comthe --debug-flags option.
79310259SAndrew.Bardsley@arm.com
79410259SAndrew.Bardsley@arm.comThe available flags are:
79510259SAndrew.Bardsley@arm.com
79610259SAndrew.Bardsley@arm.com<table>
79710259SAndrew.Bardsley@arm.com<tr>
79810259SAndrew.Bardsley@arm.com    <td><b>Debug flag</b></td>
79910259SAndrew.Bardsley@arm.com    <td><b>Unit which will generate debugging output</b></td>
80010259SAndrew.Bardsley@arm.com</tr>
80110259SAndrew.Bardsley@arm.com<tr>
80210259SAndrew.Bardsley@arm.com    <td>Activity</td>
80310259SAndrew.Bardsley@arm.com    <td>Debug ActivityMonitor actions</td>
80410259SAndrew.Bardsley@arm.com</tr>
80510259SAndrew.Bardsley@arm.com<tr>
80610259SAndrew.Bardsley@arm.com    <td>Branch</td>
80710259SAndrew.Bardsley@arm.com    <td>Fetch2 and Execute branch prediction decisions</td>
80810259SAndrew.Bardsley@arm.com</tr>
80910259SAndrew.Bardsley@arm.com<tr>
81010259SAndrew.Bardsley@arm.com    <td>MinorCPU</td>
81110259SAndrew.Bardsley@arm.com    <td>CPU global actions such as wakeup/thread suspension</td>
81210259SAndrew.Bardsley@arm.com</tr>
81310259SAndrew.Bardsley@arm.com<tr>
81410259SAndrew.Bardsley@arm.com    <td>Decode</td>
81510259SAndrew.Bardsley@arm.com    <td>Decode</td>
81610259SAndrew.Bardsley@arm.com</tr>
81710259SAndrew.Bardsley@arm.com<tr>
81810259SAndrew.Bardsley@arm.com    <td>MinorExec</td>
81910259SAndrew.Bardsley@arm.com    <td>Execute behaviour</td>
82010259SAndrew.Bardsley@arm.com</tr>
82110259SAndrew.Bardsley@arm.com<tr>
82210259SAndrew.Bardsley@arm.com    <td>Fetch</td>
82310259SAndrew.Bardsley@arm.com    <td>Fetch1 and Fetch2</td>
82410259SAndrew.Bardsley@arm.com</tr>
82510259SAndrew.Bardsley@arm.com<tr>
82610259SAndrew.Bardsley@arm.com    <td>MinorInterrupt</td>
82710259SAndrew.Bardsley@arm.com    <td>Execute interrupt handling</td>
82810259SAndrew.Bardsley@arm.com</tr>
82910259SAndrew.Bardsley@arm.com<tr>
83010259SAndrew.Bardsley@arm.com    <td>MinorMem</td>
83110259SAndrew.Bardsley@arm.com    <td>Execute memory interactions</td>
83210259SAndrew.Bardsley@arm.com</tr>
83310259SAndrew.Bardsley@arm.com<tr>
83410259SAndrew.Bardsley@arm.com    <td>MinorScoreboard</td>
83510259SAndrew.Bardsley@arm.com    <td>Execute scoreboard activity</td>
83610259SAndrew.Bardsley@arm.com</tr>
83710259SAndrew.Bardsley@arm.com<tr>
83810259SAndrew.Bardsley@arm.com    <td>MinorTrace</td>
83910259SAndrew.Bardsley@arm.com    <td>Generate MinorTrace cyclic state trace output (see below)</td>
84010259SAndrew.Bardsley@arm.com</tr>
84110259SAndrew.Bardsley@arm.com<tr>
84210259SAndrew.Bardsley@arm.com    <td>MinorTiming</td>
84310259SAndrew.Bardsley@arm.com    <td>MinorTiming instruction timing modification operations</td>
84410259SAndrew.Bardsley@arm.com</tr>
84510259SAndrew.Bardsley@arm.com</table>
84610259SAndrew.Bardsley@arm.com
84710259SAndrew.Bardsley@arm.comThe group flag Minor enables all of the flags beginning with Minor.
84810259SAndrew.Bardsley@arm.com
84910259SAndrew.Bardsley@arm.com\section trace MinorTrace and minorview.py
85010259SAndrew.Bardsley@arm.com
85110259SAndrew.Bardsley@arm.comThe debug flag MinorTrace causes cycle-by-cycle state data to be printed which
85210259SAndrew.Bardsley@arm.comcan then be processed and viewed by the minorview.py tool.  This output is
85310259SAndrew.Bardsley@arm.comvery verbose and so it is recommended it only be used for small examples.
85410259SAndrew.Bardsley@arm.com
85510259SAndrew.Bardsley@arm.com\subsection traceformat MinorTrace format
85610259SAndrew.Bardsley@arm.com
85710259SAndrew.Bardsley@arm.comThere are three types of line outputted by MinorTrace:
85810259SAndrew.Bardsley@arm.com
85910259SAndrew.Bardsley@arm.com\subsubsection state MinorTrace - Ticked unit cycle state
86010259SAndrew.Bardsley@arm.com
86110259SAndrew.Bardsley@arm.comFor example:
86210259SAndrew.Bardsley@arm.com
86310259SAndrew.Bardsley@arm.com\verbatim
86410259SAndrew.Bardsley@arm.com 110000: system.cpu.dcachePort: MinorTrace: state=MemoryRunning in_tlb_mem=0/0
86510259SAndrew.Bardsley@arm.com\endverbatim
86610259SAndrew.Bardsley@arm.com
86710259SAndrew.Bardsley@arm.comFor each time step, the MinorTrace flag will cause one MinorTrace line to be
86810259SAndrew.Bardsley@arm.comprinted for every named element in the model.
86910259SAndrew.Bardsley@arm.com
87010259SAndrew.Bardsley@arm.com\subsubsection traceunit MinorInst - summaries of instructions issued by \
87110259SAndrew.Bardsley@arm.com    Decode
87210259SAndrew.Bardsley@arm.com
87310259SAndrew.Bardsley@arm.comFor example:
87410259SAndrew.Bardsley@arm.com
87510259SAndrew.Bardsley@arm.com\verbatim
87610259SAndrew.Bardsley@arm.com 140000: system.cpu.execute: MinorInst: id=0/1.1/1/1.1 addr=0x5c \
87710259SAndrew.Bardsley@arm.com                             inst="  mov r0, #0" class=IntAlu
87810259SAndrew.Bardsley@arm.com\endverbatim
87910259SAndrew.Bardsley@arm.com
88010259SAndrew.Bardsley@arm.comMinorInst lines are currently only generated for instructions which are
88110259SAndrew.Bardsley@arm.comcommitted.
88210259SAndrew.Bardsley@arm.com
88310259SAndrew.Bardsley@arm.com\subsubsection tracefetch1 MinorLine - summaries of line fetches issued by \
88410259SAndrew.Bardsley@arm.com    Fetch1
88510259SAndrew.Bardsley@arm.com
88610259SAndrew.Bardsley@arm.comFor example:
88710259SAndrew.Bardsley@arm.com
88810259SAndrew.Bardsley@arm.com\verbatim
88910259SAndrew.Bardsley@arm.com  92000: system.cpu.icachePort: MinorLine: id=0/1.1/1 size=36 \
89010259SAndrew.Bardsley@arm.com                                vaddr=0x5c paddr=0x5c
89110259SAndrew.Bardsley@arm.com\endverbatim
89210259SAndrew.Bardsley@arm.com
89310259SAndrew.Bardsley@arm.com\subsection minorview minorview.py
89410259SAndrew.Bardsley@arm.com
89510259SAndrew.Bardsley@arm.comMinorview (util/minorview.py) can be used to visualise the data created by
89610259SAndrew.Bardsley@arm.comMinorTrace.
89710259SAndrew.Bardsley@arm.com
89810259SAndrew.Bardsley@arm.com\verbatim
89910259SAndrew.Bardsley@arm.comusage: minorview.py [-h] [--picture picture-file] [--prefix name]
90010259SAndrew.Bardsley@arm.com                   [--start-time time] [--end-time time] [--mini-views]
90110259SAndrew.Bardsley@arm.com                   event-file
90210259SAndrew.Bardsley@arm.com
90310259SAndrew.Bardsley@arm.comMinor visualiser
90410259SAndrew.Bardsley@arm.com
90510259SAndrew.Bardsley@arm.compositional arguments:
90610259SAndrew.Bardsley@arm.com  event-file
90710259SAndrew.Bardsley@arm.com
90810259SAndrew.Bardsley@arm.comoptional arguments:
90910259SAndrew.Bardsley@arm.com  -h, --help            show this help message and exit
91010259SAndrew.Bardsley@arm.com  --picture picture-file
91110259SAndrew.Bardsley@arm.com                        markup file containing blob information (default:
91210259SAndrew.Bardsley@arm.com                        <minorview-path>/minor.pic)
91310259SAndrew.Bardsley@arm.com  --prefix name         name prefix in trace for CPU to be visualised
91410259SAndrew.Bardsley@arm.com                        (default: system.cpu)
91510259SAndrew.Bardsley@arm.com  --start-time time     time of first event to load from file
91610259SAndrew.Bardsley@arm.com  --end-time time       time of last event to load from file
91710259SAndrew.Bardsley@arm.com  --mini-views          show tiny views of the next 10 time steps
91810259SAndrew.Bardsley@arm.com\endverbatim
91910259SAndrew.Bardsley@arm.com
92010259SAndrew.Bardsley@arm.comRaw debugging output can be passed to minorview.py as the event-file. It will
92110259SAndrew.Bardsley@arm.compick out the MinorTrace lines and use other lines where units in the
92210259SAndrew.Bardsley@arm.comsimulation are named (such as system.cpu.dcachePort in the above example) will
92310259SAndrew.Bardsley@arm.comappear as 'comments' when units are clicked on the visualiser.
92410259SAndrew.Bardsley@arm.com
92510259SAndrew.Bardsley@arm.comClicking on a unit which contains instructions or lines will bring up a speech
92610259SAndrew.Bardsley@arm.combubble giving extra information derived from the MinorInst/MinorLine lines.
92710259SAndrew.Bardsley@arm.com
92810259SAndrew.Bardsley@arm.com--start-time and --end-time allow only sections of debug files to be loaded.
92910259SAndrew.Bardsley@arm.com
93010259SAndrew.Bardsley@arm.com--prefix allows the name prefix of the CPU to be inspected to be supplied.
93110259SAndrew.Bardsley@arm.comThis defaults to 'system.cpu'.
93210259SAndrew.Bardsley@arm.com
93310259SAndrew.Bardsley@arm.comIn the visualiser, The buttons Start, End, Back, Forward, Play and Stop can be
93410259SAndrew.Bardsley@arm.comused to control the displayed simulation time.
93510259SAndrew.Bardsley@arm.com
93610259SAndrew.Bardsley@arm.comThe diagonally striped coloured blocks are showing the InstId of the
93710259SAndrew.Bardsley@arm.cominstruction or line they represent.  Note that lines in Fetch1 and f1ToF2.F
93810259SAndrew.Bardsley@arm.comonly show the id fields of a line and that instructions in Fetch2, f2ToD, and
93910259SAndrew.Bardsley@arm.comdecode.inputBuffer do not yet have execute sequence numbers.  The T/S.P/L/F.E
94010259SAndrew.Bardsley@arm.combuttons can be used to toggle parts of InstId on and off to make it easier to
94110259SAndrew.Bardsley@arm.comunderstand the display.  Useful combinations are:
94210259SAndrew.Bardsley@arm.com
94310259SAndrew.Bardsley@arm.com<table>
94410259SAndrew.Bardsley@arm.com<tr>
94510259SAndrew.Bardsley@arm.com    <td><b>Combination</b></td>
94610259SAndrew.Bardsley@arm.com    <td><b>Reason</b></td>
94710259SAndrew.Bardsley@arm.com</tr>
94810259SAndrew.Bardsley@arm.com<tr>
94910259SAndrew.Bardsley@arm.com    <td>E</td>
95010259SAndrew.Bardsley@arm.com    <td>just show the final execute sequence number</td>
95110259SAndrew.Bardsley@arm.com</tr>
95210259SAndrew.Bardsley@arm.com<tr>
95310259SAndrew.Bardsley@arm.com    <td>F/E</td>
95410259SAndrew.Bardsley@arm.com    <td>show the instruction-related numbers</td>
95510259SAndrew.Bardsley@arm.com</tr>
95610259SAndrew.Bardsley@arm.com<tr>
95710259SAndrew.Bardsley@arm.com    <td>S/P</td>
95810259SAndrew.Bardsley@arm.com    <td>show just the stream-related numbers (watch the stream sequence
95910259SAndrew.Bardsley@arm.com        change with branches and not change with predicted branches)</td>
96010259SAndrew.Bardsley@arm.com</tr>
96110259SAndrew.Bardsley@arm.com<tr>
96210259SAndrew.Bardsley@arm.com    <td>S/E</td>
96310259SAndrew.Bardsley@arm.com    <td>show instructions and their stream</td>
96410259SAndrew.Bardsley@arm.com</tr>
96510259SAndrew.Bardsley@arm.com</table>
96610259SAndrew.Bardsley@arm.com
96710259SAndrew.Bardsley@arm.comThe key to the right shows all the displayable colours (some of the colour
96810259SAndrew.Bardsley@arm.comchoices are quite bad!):
96910259SAndrew.Bardsley@arm.com
97010259SAndrew.Bardsley@arm.com<table>
97110259SAndrew.Bardsley@arm.com<tr>
97210259SAndrew.Bardsley@arm.com    <td><b>Symbol</b></td>
97310259SAndrew.Bardsley@arm.com    <td><b>Meaning</b></td>
97410259SAndrew.Bardsley@arm.com</tr>
97510259SAndrew.Bardsley@arm.com<tr>
97610259SAndrew.Bardsley@arm.com    <td>U</td>
97710259SAndrew.Bardsley@arm.com    <td>Unknown data</td>
97810259SAndrew.Bardsley@arm.com</tr>
97910259SAndrew.Bardsley@arm.com<tr>
98010259SAndrew.Bardsley@arm.com    <td>B</td>
98110259SAndrew.Bardsley@arm.com    <td>Blocked stage</td>
98210259SAndrew.Bardsley@arm.com</tr>
98310259SAndrew.Bardsley@arm.com<tr>
98410259SAndrew.Bardsley@arm.com    <td>-</td>
98510259SAndrew.Bardsley@arm.com    <td>Bubble</td>
98610259SAndrew.Bardsley@arm.com</tr>
98710259SAndrew.Bardsley@arm.com<tr>
98810259SAndrew.Bardsley@arm.com    <td>E</td>
98910259SAndrew.Bardsley@arm.com    <td>Empty queue slot</td>
99010259SAndrew.Bardsley@arm.com</tr>
99110259SAndrew.Bardsley@arm.com<tr>
99210259SAndrew.Bardsley@arm.com    <td>R</td>
99310259SAndrew.Bardsley@arm.com    <td>Reserved queue slot</td>
99410259SAndrew.Bardsley@arm.com</tr>
99510259SAndrew.Bardsley@arm.com<tr>
99610259SAndrew.Bardsley@arm.com    <td>F</td>
99710259SAndrew.Bardsley@arm.com    <td>Fault</td>
99810259SAndrew.Bardsley@arm.com</tr>
99910259SAndrew.Bardsley@arm.com<tr>
100010259SAndrew.Bardsley@arm.com    <td>r</td>
100110259SAndrew.Bardsley@arm.com    <td>Read (used as the leftmost stripe on data in the dcachePort)</td>
100210259SAndrew.Bardsley@arm.com</tr>
100310259SAndrew.Bardsley@arm.com<tr>
100410259SAndrew.Bardsley@arm.com    <td>w</td>
100510259SAndrew.Bardsley@arm.com    <td>Write " "</td>
100610259SAndrew.Bardsley@arm.com</tr>
100710259SAndrew.Bardsley@arm.com<tr>
100810259SAndrew.Bardsley@arm.com    <td>0 to 9</td>
100910259SAndrew.Bardsley@arm.com    <td>last decimal digit of the corresponding data</td>
101010259SAndrew.Bardsley@arm.com</tr>
101110259SAndrew.Bardsley@arm.com</table>
101210259SAndrew.Bardsley@arm.com
101310259SAndrew.Bardsley@arm.com\verbatim
101410259SAndrew.Bardsley@arm.com
101510259SAndrew.Bardsley@arm.com    ,---------------.         .--------------.  *U
101610259SAndrew.Bardsley@arm.com    | |=|->|=|->|=| |         ||=|||->||->|| |  *-  <- Fetch queues/LSQ
101710259SAndrew.Bardsley@arm.com    `---------------'         `--------------'  *R
101810259SAndrew.Bardsley@arm.com    === ======                                  *w  <- Activity/Stage activity
101910259SAndrew.Bardsley@arm.com                              ,--------------.  *1
102010259SAndrew.Bardsley@arm.com    ,--.      ,.      ,.      | ============ |  *3  <- Scoreboard
102110259SAndrew.Bardsley@arm.com    |  |-\[]-\||-\[]-\||-\[]-\| ============ |  *5  <- Execute::inFlightInsts
102210259SAndrew.Bardsley@arm.com    |  | :[] :||-/[]-/||-/[]-/| -. --------  |  *7
102310259SAndrew.Bardsley@arm.com    |  |-/[]-/||  ^   ||      |  | --------- |  *9
102410259SAndrew.Bardsley@arm.com    |  |      ||  |   ||      |  | ------    |
102510259SAndrew.Bardsley@arm.com[]->|  |    ->||  |   ||      |  | ----      |
102610259SAndrew.Bardsley@arm.com    |  |<-[]<-||<-+-<-||<-[]<-|  | ------    |->[] <- Execute to Fetch1,
102710259SAndrew.Bardsley@arm.com    '--`      `'  ^   `'      | -' ------    |        Fetch2 branch data
102810259SAndrew.Bardsley@arm.com             ---. |  ---.     `--------------'
102910259SAndrew.Bardsley@arm.com             ---' |  ---'       ^       ^
103010259SAndrew.Bardsley@arm.com                  |   ^         |       `------------ Execute
103110259SAndrew.Bardsley@arm.com  MinorBuffer ----' input       `-------------------- Execute input buffer
103210259SAndrew.Bardsley@arm.com                    buffer
103310259SAndrew.Bardsley@arm.com\endverbatim
103410259SAndrew.Bardsley@arm.com
103510259SAndrew.Bardsley@arm.comStages show the colours of the instructions currently being
103610259SAndrew.Bardsley@arm.comgenerated/processed.
103710259SAndrew.Bardsley@arm.com
103810259SAndrew.Bardsley@arm.comForward FIFOs between stages show the data being pushed into them at the
103910259SAndrew.Bardsley@arm.comcurrent tick (to the left), the data in transit, and the data available at
104010259SAndrew.Bardsley@arm.comtheir outputs (to the right).
104110259SAndrew.Bardsley@arm.com
104210259SAndrew.Bardsley@arm.comThe backwards FIFO between Fetch2 and Fetch1 shows branch prediction data.
104310259SAndrew.Bardsley@arm.com
104410259SAndrew.Bardsley@arm.comIn general, all displayed data is correct at the end of a cycle's activity at
104510259SAndrew.Bardsley@arm.comthe time indicated but before the inter-stage FIFOs are ticked.  Each FIFO
104610259SAndrew.Bardsley@arm.comhas, therefore an extra slot to show the asserted new input data, and all the
104710259SAndrew.Bardsley@arm.comdata currently within the FIFO.
104810259SAndrew.Bardsley@arm.com
104910259SAndrew.Bardsley@arm.comInput buffers for each stage are shown below the corresponding stage and show
105010259SAndrew.Bardsley@arm.comthe contents of those buffers as horizontal strips.  Strips marked as reserved
105110259SAndrew.Bardsley@arm.com(cyan by default) are reserved to be filled by the previous stage.  An input
105210259SAndrew.Bardsley@arm.combuffer with all reserved or occupied slots will, therefore, block the previous
105310259SAndrew.Bardsley@arm.comstage from generating output.
105410259SAndrew.Bardsley@arm.com
105510259SAndrew.Bardsley@arm.comFetch queues and LSQ show the lines/instructions in the queues of each
105610259SAndrew.Bardsley@arm.cominterface and show the number of lines/instructions in TLB and memory in the
105710259SAndrew.Bardsley@arm.comtwo striped colours of the top of their frames.
105810259SAndrew.Bardsley@arm.com
105910259SAndrew.Bardsley@arm.comInside Execute, the horizontal bars represent the individual FU pipelines.
106010259SAndrew.Bardsley@arm.comThe vertical bar to the left is the input buffer and the bar to the right, the
106110259SAndrew.Bardsley@arm.cominstructions committed this cycle.  The background of Execute shows
106210259SAndrew.Bardsley@arm.cominstructions which are being committed this cycle in their original FU
106310259SAndrew.Bardsley@arm.compipeline positions.
106410259SAndrew.Bardsley@arm.com
106510259SAndrew.Bardsley@arm.comThe strip at the top of the Execute block shows the current streamSeqNum that
106610259SAndrew.Bardsley@arm.comExecute is committing.  A similar stripe at the top of Fetch1 shows that
106710259SAndrew.Bardsley@arm.comstage's expected streamSeqNum and the stripe at the top of Fetch2 shows its
106810259SAndrew.Bardsley@arm.comissuing predictionSeqNum.
106910259SAndrew.Bardsley@arm.com
107010259SAndrew.Bardsley@arm.comThe scoreboard shows the number of instructions in flight which will commit a
107110259SAndrew.Bardsley@arm.comresult to the register in the position shown.  The scoreboard contains slots
107210259SAndrew.Bardsley@arm.comfor each integer and floating point register.
107310259SAndrew.Bardsley@arm.com
107410259SAndrew.Bardsley@arm.comThe Execute::inFlightInsts queue shows all the instructions in flight in
107510259SAndrew.Bardsley@arm.comExecute with the oldest instruction (the next instruction to be committed) to
107610259SAndrew.Bardsley@arm.comthe right.
107710259SAndrew.Bardsley@arm.com
107810259SAndrew.Bardsley@arm.com'Stage activity' shows the signalled activity (as E/1) for each stage (with
107910259SAndrew.Bardsley@arm.comCPU miscellaneous activity to the left)
108010259SAndrew.Bardsley@arm.com
108110259SAndrew.Bardsley@arm.com'Activity' show a count of stage and pipe activity.
108210259SAndrew.Bardsley@arm.com
108310259SAndrew.Bardsley@arm.com\subsection picformat minor.pic format
108410259SAndrew.Bardsley@arm.com
108510259SAndrew.Bardsley@arm.comThe minor.pic file (src/minor/minor.pic) describes the layout of the
108610259SAndrew.Bardsley@arm.commodels blocks on the visualiser.  Its format is described in the supplied
108710259SAndrew.Bardsley@arm.comminor.pic file.
108810259SAndrew.Bardsley@arm.com
108910259SAndrew.Bardsley@arm.com*/
109010259SAndrew.Bardsley@arm.com
109110259SAndrew.Bardsley@arm.com}
1092