inside-minor.doxygen revision 10259
110259SAndrew.Bardsley@arm.com# Copyright (c) 2014 ARM Limited 210259SAndrew.Bardsley@arm.com# All rights reserved 310259SAndrew.Bardsley@arm.com# 410259SAndrew.Bardsley@arm.com# The license below extends only to copyright in the software and shall 510259SAndrew.Bardsley@arm.com# not be construed as granting a license to any other intellectual 610259SAndrew.Bardsley@arm.com# property including but not limited to intellectual property relating 710259SAndrew.Bardsley@arm.com# to a hardware implementation of the functionality of the software 810259SAndrew.Bardsley@arm.com# licensed hereunder. You may use the software subject to the license 910259SAndrew.Bardsley@arm.com# terms below provided that you ensure that this notice is replicated 1010259SAndrew.Bardsley@arm.com# unmodified and in its entirety in all distributions of the software, 1110259SAndrew.Bardsley@arm.com# modified or unmodified, in source code or in binary form. 1210259SAndrew.Bardsley@arm.com# 1310259SAndrew.Bardsley@arm.com# Redistribution and use in source and binary forms, with or without 1410259SAndrew.Bardsley@arm.com# modification, are permitted provided that the following conditions are 1510259SAndrew.Bardsley@arm.com# met: redistributions of source code must retain the above copyright 1610259SAndrew.Bardsley@arm.com# notice, this list of conditions and the following disclaimer; 1710259SAndrew.Bardsley@arm.com# redistributions in binary form must reproduce the above copyright 1810259SAndrew.Bardsley@arm.com# notice, this list of conditions and the following disclaimer in the 1910259SAndrew.Bardsley@arm.com# documentation and/or other materials provided with the distribution; 2010259SAndrew.Bardsley@arm.com# neither the name of the copyright holders nor the names of its 2110259SAndrew.Bardsley@arm.com# contributors may be used to endorse or promote products derived from 2210259SAndrew.Bardsley@arm.com# this software without specific prior written permission. 2310259SAndrew.Bardsley@arm.com# 2410259SAndrew.Bardsley@arm.com# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 2510259SAndrew.Bardsley@arm.com# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 2610259SAndrew.Bardsley@arm.com# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 2710259SAndrew.Bardsley@arm.com# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 2810259SAndrew.Bardsley@arm.com# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 2910259SAndrew.Bardsley@arm.com# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 3010259SAndrew.Bardsley@arm.com# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 3110259SAndrew.Bardsley@arm.com# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 3210259SAndrew.Bardsley@arm.com# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 3310259SAndrew.Bardsley@arm.com# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 3410259SAndrew.Bardsley@arm.com# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 3510259SAndrew.Bardsley@arm.com# 3610259SAndrew.Bardsley@arm.com# Authors: Andrew Bardsley 3710259SAndrew.Bardsley@arm.com 3810259SAndrew.Bardsley@arm.comnamespace Minor 3910259SAndrew.Bardsley@arm.com{ 4010259SAndrew.Bardsley@arm.com 4110259SAndrew.Bardsley@arm.com/*! 4210259SAndrew.Bardsley@arm.com 4310259SAndrew.Bardsley@arm.com\page minor Inside the Minor CPU model 4410259SAndrew.Bardsley@arm.com 4510259SAndrew.Bardsley@arm.com\tableofcontents 4610259SAndrew.Bardsley@arm.com 4710259SAndrew.Bardsley@arm.comThis document contains a description of the structure and function of the 4810259SAndrew.Bardsley@arm.comMinor gem5 in-order processor model. It is recommended reading for anyone who 4910259SAndrew.Bardsley@arm.comwants to understand Minor's internal organisation, design decisions, C++ 5010259SAndrew.Bardsley@arm.comimplementation and Python configuration. A familiarity with gem5 and some of 5110259SAndrew.Bardsley@arm.comits internal structures is assumed. This document is meant to be read 5210259SAndrew.Bardsley@arm.comalongside the Minor source code and to explain its general structure without 5310259SAndrew.Bardsley@arm.combeing too slavish about naming every function and data type. 5410259SAndrew.Bardsley@arm.com 5510259SAndrew.Bardsley@arm.com\section whatis What is Minor? 5610259SAndrew.Bardsley@arm.com 5710259SAndrew.Bardsley@arm.comMinor is an in-order processor model with a fixed pipeline but configurable 5810259SAndrew.Bardsley@arm.comdata structures and execute behaviour. It is intended to be used to model 5910259SAndrew.Bardsley@arm.comprocessors with strict in-order execution behaviour and allows visualisation 6010259SAndrew.Bardsley@arm.comof an instruction's position in the pipeline through the 6110259SAndrew.Bardsley@arm.comMinorTrace/minorview.py format/tool. The intention is to provide a framework 6210259SAndrew.Bardsley@arm.comfor micro-architecturally correlating the model with a particular, chosen 6310259SAndrew.Bardsley@arm.comprocessor with similar capabilities. 6410259SAndrew.Bardsley@arm.com 6510259SAndrew.Bardsley@arm.com\section philo Design philosophy 6610259SAndrew.Bardsley@arm.com 6710259SAndrew.Bardsley@arm.com\subsection mt Multithreading 6810259SAndrew.Bardsley@arm.com 6910259SAndrew.Bardsley@arm.comThe model isn't currently capable of multithreading but there are THREAD 7010259SAndrew.Bardsley@arm.comcomments in key places where stage data needs to be arrayed to support 7110259SAndrew.Bardsley@arm.commultithreading. 7210259SAndrew.Bardsley@arm.com 7310259SAndrew.Bardsley@arm.com\subsection structs Data structures 7410259SAndrew.Bardsley@arm.com 7510259SAndrew.Bardsley@arm.comDecorating data structures with large amounts of life-cycle information is 7610259SAndrew.Bardsley@arm.comavoided. Only instructions (MinorDynInst) contain a significant proportion of 7710259SAndrew.Bardsley@arm.comtheir data content whose values are not set at construction. 7810259SAndrew.Bardsley@arm.com 7910259SAndrew.Bardsley@arm.comAll internal structures have fixed sizes on construction. Data held in queues 8010259SAndrew.Bardsley@arm.comand FIFOs (MinorBuffer, FUPipeline) should have a BubbleIF interface to 8110259SAndrew.Bardsley@arm.comallow a distinct 'bubble'/no data value option for each type. 8210259SAndrew.Bardsley@arm.com 8310259SAndrew.Bardsley@arm.comInter-stage 'struct' data is packaged in structures which are passed by value. 8410259SAndrew.Bardsley@arm.comOnly MinorDynInst, the line data in ForwardLineData and the memory-interfacing 8510259SAndrew.Bardsley@arm.comobjects Fetch1::FetchRequest and LSQ::LSQRequest are '::new' allocated while 8610259SAndrew.Bardsley@arm.comrunning the model. 8710259SAndrew.Bardsley@arm.com 8810259SAndrew.Bardsley@arm.com\section model Model structure 8910259SAndrew.Bardsley@arm.com 9010259SAndrew.Bardsley@arm.comObjects of class MinorCPU are provided by the model to gem5. MinorCPU 9110259SAndrew.Bardsley@arm.comimplements the interfaces of (cpu.hh) and can provide data and 9210259SAndrew.Bardsley@arm.cominstruction interfaces for connection to a cache system. The model is 9310259SAndrew.Bardsley@arm.comconfigured in a similar way to other gem5 models through Python. That 9410259SAndrew.Bardsley@arm.comconfiguration is passed on to MinorCPU::pipeline (of class Pipeline) which 9510259SAndrew.Bardsley@arm.comactually implements the processor pipeline. 9610259SAndrew.Bardsley@arm.com 9710259SAndrew.Bardsley@arm.comThe hierarchy of major unit ownership from MinorCPU down looks like this: 9810259SAndrew.Bardsley@arm.com 9910259SAndrew.Bardsley@arm.com<ul> 10010259SAndrew.Bardsley@arm.com<li>MinorCPU</li> 10110259SAndrew.Bardsley@arm.com<ul> 10210259SAndrew.Bardsley@arm.com <li>Pipeline - container for the pipeline, owns the cyclic 'tick' 10310259SAndrew.Bardsley@arm.com event mechanism and the idling (cycle skipping) mechanism.</li> 10410259SAndrew.Bardsley@arm.com <ul> 10510259SAndrew.Bardsley@arm.com <li>Fetch1 - instruction fetch unit responsible for fetching cache 10610259SAndrew.Bardsley@arm.com lines (or parts of lines from the I-cache interface)</li> 10710259SAndrew.Bardsley@arm.com <ul> 10810259SAndrew.Bardsley@arm.com <li>Fetch1::IcachePort - interface to the I-cache from 10910259SAndrew.Bardsley@arm.com Fetch1</li> 11010259SAndrew.Bardsley@arm.com </ul> 11110259SAndrew.Bardsley@arm.com <li>Fetch2 - line to instruction decomposition</li> 11210259SAndrew.Bardsley@arm.com <li>Decode - instruction to micro-op decomposition</li> 11310259SAndrew.Bardsley@arm.com <li>Execute - instruction execution and data memory 11410259SAndrew.Bardsley@arm.com interface</li> 11510259SAndrew.Bardsley@arm.com <ul> 11610259SAndrew.Bardsley@arm.com <li>LSQ - load store queue for memory ref. instructions</li> 11710259SAndrew.Bardsley@arm.com <li>LSQ::DcachePort - interface to the D-cache from 11810259SAndrew.Bardsley@arm.com Execute</li> 11910259SAndrew.Bardsley@arm.com </ul> 12010259SAndrew.Bardsley@arm.com </ul> 12110259SAndrew.Bardsley@arm.com </ul> 12210259SAndrew.Bardsley@arm.com</ul> 12310259SAndrew.Bardsley@arm.com 12410259SAndrew.Bardsley@arm.com\section keystruct Key data structures 12510259SAndrew.Bardsley@arm.com 12610259SAndrew.Bardsley@arm.com\subsection ids Instruction and line identity: InstId (dyn_inst.hh) 12710259SAndrew.Bardsley@arm.com 12810259SAndrew.Bardsley@arm.comAn InstId contains the sequence numbers and thread numbers that describe the 12910259SAndrew.Bardsley@arm.comlife cycle and instruction stream affiliations of individual fetched cache 13010259SAndrew.Bardsley@arm.comlines and instructions. 13110259SAndrew.Bardsley@arm.com 13210259SAndrew.Bardsley@arm.comAn InstId is printed in one of the following forms: 13310259SAndrew.Bardsley@arm.com 13410259SAndrew.Bardsley@arm.com - T/S.P/L - for fetched cache lines 13510259SAndrew.Bardsley@arm.com - T/S.P/L/F - for instructions before Decode 13610259SAndrew.Bardsley@arm.com - T/S.P/L/F.E - for instructions from Decode onwards 13710259SAndrew.Bardsley@arm.com 13810259SAndrew.Bardsley@arm.comfor example: 13910259SAndrew.Bardsley@arm.com 14010259SAndrew.Bardsley@arm.com - 0/10.12/5/6.7 14110259SAndrew.Bardsley@arm.com 14210259SAndrew.Bardsley@arm.comInstId's fields are: 14310259SAndrew.Bardsley@arm.com 14410259SAndrew.Bardsley@arm.com<table> 14510259SAndrew.Bardsley@arm.com<tr> 14610259SAndrew.Bardsley@arm.com <td><b>Field</b></td> 14710259SAndrew.Bardsley@arm.com <td><b>Symbol</b></td> 14810259SAndrew.Bardsley@arm.com <td><b>Generated by</b></td> 14910259SAndrew.Bardsley@arm.com <td><b>Checked by</b></td> 15010259SAndrew.Bardsley@arm.com <td><b>Function</b></td> 15110259SAndrew.Bardsley@arm.com</tr> 15210259SAndrew.Bardsley@arm.com 15310259SAndrew.Bardsley@arm.com<tr> 15410259SAndrew.Bardsley@arm.com <td>InstId::threadId</td> 15510259SAndrew.Bardsley@arm.com <td>T</td> 15610259SAndrew.Bardsley@arm.com <td>Fetch1</td> 15710259SAndrew.Bardsley@arm.com <td>Everywhere the thread number is needed</td> 15810259SAndrew.Bardsley@arm.com <td>Thread number (currently always 0).</td> 15910259SAndrew.Bardsley@arm.com</tr> 16010259SAndrew.Bardsley@arm.com 16110259SAndrew.Bardsley@arm.com<tr> 16210259SAndrew.Bardsley@arm.com <td>InstId::streamSeqNum</td> 16310259SAndrew.Bardsley@arm.com <td>S</td> 16410259SAndrew.Bardsley@arm.com <td>Execute</td> 16510259SAndrew.Bardsley@arm.com <td>Fetch1, Fetch2, Execute (to discard lines/insts)</td> 16610259SAndrew.Bardsley@arm.com <td>Stream sequence number as chosen by Execute. Stream 16710259SAndrew.Bardsley@arm.com sequence numbers change after changes of PC (branches, exceptions) in 16810259SAndrew.Bardsley@arm.com Execute and are used to separate pre and post branch instruction 16910259SAndrew.Bardsley@arm.com streams.</td> 17010259SAndrew.Bardsley@arm.com</tr> 17110259SAndrew.Bardsley@arm.com 17210259SAndrew.Bardsley@arm.com<tr> 17310259SAndrew.Bardsley@arm.com <td>InstId::predictionSeqNum</td> 17410259SAndrew.Bardsley@arm.com <td>P</td> 17510259SAndrew.Bardsley@arm.com <td>Fetch2</td> 17610259SAndrew.Bardsley@arm.com <td>Fetch2 (while discarding lines after prediction)</td> 17710259SAndrew.Bardsley@arm.com <td>Prediction sequence numbers represent branch prediction decisions. 17810259SAndrew.Bardsley@arm.com This is used by Fetch2 to mark lines/instructions according to the last 17910259SAndrew.Bardsley@arm.com followed branch prediction made by Fetch2. Fetch2 can signal to Fetch1 18010259SAndrew.Bardsley@arm.com that it should change its fetch address and mark lines with a new 18110259SAndrew.Bardsley@arm.com prediction sequence number (which it will only do if the stream sequence 18210259SAndrew.Bardsley@arm.com number Fetch1 expects matches that of the request). </td> </tr> 18310259SAndrew.Bardsley@arm.com 18410259SAndrew.Bardsley@arm.com<tr> 18510259SAndrew.Bardsley@arm.com<td>InstId::lineSeqNum</td> 18610259SAndrew.Bardsley@arm.com<td>L</td> 18710259SAndrew.Bardsley@arm.com<td>Fetch1</td> 18810259SAndrew.Bardsley@arm.com<td>(Just for debugging)</td> 18910259SAndrew.Bardsley@arm.com<td>Line fetch sequence number of this cache line or the line 19010259SAndrew.Bardsley@arm.com this instruction was extracted from. 19110259SAndrew.Bardsley@arm.com </td> 19210259SAndrew.Bardsley@arm.com</tr> 19310259SAndrew.Bardsley@arm.com 19410259SAndrew.Bardsley@arm.com<tr> 19510259SAndrew.Bardsley@arm.com<td>InstId::fetchSeqNum</td> 19610259SAndrew.Bardsley@arm.com<td>F</td> 19710259SAndrew.Bardsley@arm.com<td>Fetch2</td> 19810259SAndrew.Bardsley@arm.com<td>Fetch2 (as the inst. sequence number for branches)</td> 19910259SAndrew.Bardsley@arm.com<td>Instruction fetch order assigned by Fetch2 when lines 20010259SAndrew.Bardsley@arm.com are decomposed into instructions. 20110259SAndrew.Bardsley@arm.com </td> 20210259SAndrew.Bardsley@arm.com</tr> 20310259SAndrew.Bardsley@arm.com 20410259SAndrew.Bardsley@arm.com<tr> 20510259SAndrew.Bardsley@arm.com<td>InstId::execSeqNum</td> 20610259SAndrew.Bardsley@arm.com<td>E</td> 20710259SAndrew.Bardsley@arm.com<td>Decode</td> 20810259SAndrew.Bardsley@arm.com<td>Execute (to check instruction identity in queues/FUs/LSQ)</td> 20910259SAndrew.Bardsley@arm.com<td>Instruction order after micro-op decomposition.</td> 21010259SAndrew.Bardsley@arm.com</tr> 21110259SAndrew.Bardsley@arm.com 21210259SAndrew.Bardsley@arm.com</table> 21310259SAndrew.Bardsley@arm.com 21410259SAndrew.Bardsley@arm.comThe sequence number fields are all independent of each other and although, for 21510259SAndrew.Bardsley@arm.cominstance, InstId::execSeqNum for an instruction will always be >= 21610259SAndrew.Bardsley@arm.comInstId::fetchSeqNum, the comparison is not useful. 21710259SAndrew.Bardsley@arm.com 21810259SAndrew.Bardsley@arm.comThe originating stage of each sequence number field keeps a counter for that 21910259SAndrew.Bardsley@arm.comfield which can be incremented in order to generate new, unique numbers. 22010259SAndrew.Bardsley@arm.com 22110259SAndrew.Bardsley@arm.com\subsection insts Instructions: MinorDynInst (dyn_inst.hh) 22210259SAndrew.Bardsley@arm.com 22310259SAndrew.Bardsley@arm.comMinorDynInst represents an instruction's progression through the pipeline. An 22410259SAndrew.Bardsley@arm.cominstruction can be three things: 22510259SAndrew.Bardsley@arm.com 22610259SAndrew.Bardsley@arm.com<table> 22710259SAndrew.Bardsley@arm.com<tr> 22810259SAndrew.Bardsley@arm.com <td><b>Thing</b></td> 22910259SAndrew.Bardsley@arm.com <td><b>Predicate</b></td> 23010259SAndrew.Bardsley@arm.com <td><b>Explanation</b></td> 23110259SAndrew.Bardsley@arm.com</tr> 23210259SAndrew.Bardsley@arm.com<tr> 23310259SAndrew.Bardsley@arm.com <td>A bubble</td> 23410259SAndrew.Bardsley@arm.com <td>MinorDynInst::isBubble()</td> 23510259SAndrew.Bardsley@arm.com <td>no instruction at all, just a space-filler</td> 23610259SAndrew.Bardsley@arm.com</tr> 23710259SAndrew.Bardsley@arm.com<tr> 23810259SAndrew.Bardsley@arm.com <td>A fault</td> 23910259SAndrew.Bardsley@arm.com <td>MinorDynInst::isFault()</td> 24010259SAndrew.Bardsley@arm.com <td>a fault to pass down the pipeline in an instruction's clothing</td> 24110259SAndrew.Bardsley@arm.com</tr> 24210259SAndrew.Bardsley@arm.com<tr> 24310259SAndrew.Bardsley@arm.com <td>A decoded instruction</td> 24410259SAndrew.Bardsley@arm.com <td>MinorDynInst::isInst()</td> 24510259SAndrew.Bardsley@arm.com <td>instructions are actually passed to the gem5 decoder in Fetch2 and so 24610259SAndrew.Bardsley@arm.com are created fully decoded. MinorDynInst::staticInst is the decoded 24710259SAndrew.Bardsley@arm.com instruction form.</td> 24810259SAndrew.Bardsley@arm.com</tr> 24910259SAndrew.Bardsley@arm.com</table> 25010259SAndrew.Bardsley@arm.com 25110259SAndrew.Bardsley@arm.comInstructions are reference counted using the gem5 RefCountingPtr 25210259SAndrew.Bardsley@arm.com(base/refcnt.hh) wrapper. They therefore usually appear as MinorDynInstPtr in 25310259SAndrew.Bardsley@arm.comcode. Note that as RefCountingPtr initialises as nullptr rather than an 25410259SAndrew.Bardsley@arm.comobject that supports BubbleIF::isBubble, passing raw MinorDynInstPtrs to 25510259SAndrew.Bardsley@arm.comQueue%s and other similar structures from stage.hh without boxing is 25610259SAndrew.Bardsley@arm.comdangerous. 25710259SAndrew.Bardsley@arm.com 25810259SAndrew.Bardsley@arm.com\subsection fld ForwardLineData (pipe_data.hh) 25910259SAndrew.Bardsley@arm.com 26010259SAndrew.Bardsley@arm.comForwardLineData is used to pass cache lines from Fetch1 to Fetch2. Like 26110259SAndrew.Bardsley@arm.comMinorDynInst%s, they can be bubbles (ForwardLineData::isBubble()), 26210259SAndrew.Bardsley@arm.comfault-carrying or can contain a line (partial line) fetched by Fetch1. The 26310259SAndrew.Bardsley@arm.comdata carried by ForwardLineData is owned by a Packet object returned from 26410259SAndrew.Bardsley@arm.commemory and is explicitly memory managed and do must be deleted once processed 26510259SAndrew.Bardsley@arm.com(by Fetch2 deleting the Packet). 26610259SAndrew.Bardsley@arm.com 26710259SAndrew.Bardsley@arm.com\subsection fid ForwardInstData (pipe_data.hh) 26810259SAndrew.Bardsley@arm.com 26910259SAndrew.Bardsley@arm.comForwardInstData can contain up to ForwardInstData::width() instructions in its 27010259SAndrew.Bardsley@arm.comForwardInstData::insts vector. This structure is used to carry instructions 27110259SAndrew.Bardsley@arm.combetween Fetch2, Decode and Execute and to store input buffer vectors in Decode 27210259SAndrew.Bardsley@arm.comand Execute. 27310259SAndrew.Bardsley@arm.com 27410259SAndrew.Bardsley@arm.com\subsection fr Fetch1::FetchRequest (fetch1.hh) 27510259SAndrew.Bardsley@arm.com 27610259SAndrew.Bardsley@arm.comFetchRequests represent I-cache line fetch requests. The are used in the 27710259SAndrew.Bardsley@arm.commemory queues of Fetch1 and are pushed into/popped from Packet::senderState 27810259SAndrew.Bardsley@arm.comwhile traversing the memory system. 27910259SAndrew.Bardsley@arm.com 28010259SAndrew.Bardsley@arm.comFetchRequests contain a memory system Request (mem/request.hh) for that fetch 28110259SAndrew.Bardsley@arm.comaccess, a packet (Packet, mem/packet.hh), if the request gets to memory, and a 28210259SAndrew.Bardsley@arm.comfault field that can be populated with a TLB-sourced prefetch fault (if any). 28310259SAndrew.Bardsley@arm.com 28410259SAndrew.Bardsley@arm.com\subsection lsqr LSQ::LSQRequest (execute.hh) 28510259SAndrew.Bardsley@arm.com 28610259SAndrew.Bardsley@arm.comLSQRequests are similar to FetchRequests but for D-cache accesses. They carry 28710259SAndrew.Bardsley@arm.comthe instruction associated with a memory access. 28810259SAndrew.Bardsley@arm.com 28910259SAndrew.Bardsley@arm.com\section pipeline The pipeline 29010259SAndrew.Bardsley@arm.com 29110259SAndrew.Bardsley@arm.com\verbatim 29210259SAndrew.Bardsley@arm.com------------------------------------------------------------------------------ 29310259SAndrew.Bardsley@arm.com Key: 29410259SAndrew.Bardsley@arm.com 29510259SAndrew.Bardsley@arm.com [] : inter-stage BufferBuffer 29610259SAndrew.Bardsley@arm.com ,--. 29710259SAndrew.Bardsley@arm.com | | : pipeline stage 29810259SAndrew.Bardsley@arm.com `--' 29910259SAndrew.Bardsley@arm.com ---> : forward communication 30010259SAndrew.Bardsley@arm.com <--- : backward communication 30110259SAndrew.Bardsley@arm.com 30210259SAndrew.Bardsley@arm.com rv : reservation information for input buffers 30310259SAndrew.Bardsley@arm.com 30410259SAndrew.Bardsley@arm.com ,------. ,------. ,------. ,-------. 30510259SAndrew.Bardsley@arm.com (from --[]-v->|Fetch1|-[]->|Fetch2|-[]->|Decode|-[]->|Execute|--> (to Fetch1 30610259SAndrew.Bardsley@arm.com Execute) | | |<-[]-| |<-rv-| |<-rv-| | & Fetch2) 30710259SAndrew.Bardsley@arm.com | `------'<-rv-| | | | | | 30810259SAndrew.Bardsley@arm.com `-------------->| | | | | | 30910259SAndrew.Bardsley@arm.com `------' `------' `-------' 31010259SAndrew.Bardsley@arm.com------------------------------------------------------------------------------ 31110259SAndrew.Bardsley@arm.com\endverbatim 31210259SAndrew.Bardsley@arm.com 31310259SAndrew.Bardsley@arm.comThe four pipeline stages are connected together by MinorBuffer FIFO 31410259SAndrew.Bardsley@arm.com(stage.hh, derived ultimately from TimeBuffer) structures which allow 31510259SAndrew.Bardsley@arm.cominter-stage delays to be modelled. There is a MinorBuffer%s between adjacent 31610259SAndrew.Bardsley@arm.comstages in the forward direction (for example: passing lines from Fetch1 to 31710259SAndrew.Bardsley@arm.comFetch2) and, between Fetch2 and Fetch1, a buffer in the backwards direction 31810259SAndrew.Bardsley@arm.comcarrying branch predictions. 31910259SAndrew.Bardsley@arm.com 32010259SAndrew.Bardsley@arm.comStages Fetch2, Decode and Execute have input buffers which, each cycle, can 32110259SAndrew.Bardsley@arm.comaccept input data from the previous stage and can hold that data if the stage 32210259SAndrew.Bardsley@arm.comis not ready to process it. Input buffers store data in the same form as it 32310259SAndrew.Bardsley@arm.comis received and so Decode and Execute's input buffers contain the output 32410259SAndrew.Bardsley@arm.cominstruction vector (ForwardInstData (pipe_data.hh)) from their previous stages 32510259SAndrew.Bardsley@arm.comwith the instructions and bubbles in the same positions as a single buffer 32610259SAndrew.Bardsley@arm.comentry. 32710259SAndrew.Bardsley@arm.com 32810259SAndrew.Bardsley@arm.comStage input buffers provide a Reservable (stage.hh) interface to their 32910259SAndrew.Bardsley@arm.comprevious stages, to allow slots to be reserved in their input buffers, and 33010259SAndrew.Bardsley@arm.comcommunicate their input buffer occupancy backwards to allow the previous stage 33110259SAndrew.Bardsley@arm.comto plan whether it should make an output in a given cycle. 33210259SAndrew.Bardsley@arm.com 33310259SAndrew.Bardsley@arm.com\subsection events Event handling: MinorActivityRecorder (activity.hh, 33410259SAndrew.Bardsley@arm.compipeline.hh) 33510259SAndrew.Bardsley@arm.com 33610259SAndrew.Bardsley@arm.comMinor is essentially a cycle-callable model with some ability to skip cycles 33710259SAndrew.Bardsley@arm.combased on pipeline activity. External events are mostly received by callbacks 33810259SAndrew.Bardsley@arm.com(e.g. Fetch1::IcachePort::recvTimingResp) and cause the pipeline to be woken 33910259SAndrew.Bardsley@arm.comup to service advancing request queues. 34010259SAndrew.Bardsley@arm.com 34110259SAndrew.Bardsley@arm.comTicked (sim/ticked.hh) is a base class bringing together an evaluate 34210259SAndrew.Bardsley@arm.commember function and a provided SimObject. It provides a Ticked::start/stop 34310259SAndrew.Bardsley@arm.cominterface to start and pause clock events from being periodically issued. 34410259SAndrew.Bardsley@arm.comPipeline is a derived class of Ticked. 34510259SAndrew.Bardsley@arm.com 34610259SAndrew.Bardsley@arm.comDuring evaluate calls, stages can signal that they still have work to do in 34710259SAndrew.Bardsley@arm.comthe next cycle by calling either MinorCPU::activityRecorder->activity() (for 34810259SAndrew.Bardsley@arm.comnon-callable related activity) or MinorCPU::wakeupOnEvent(<stageId>) (for 34910259SAndrew.Bardsley@arm.comstage callback-related 'wakeup' activity). 35010259SAndrew.Bardsley@arm.com 35110259SAndrew.Bardsley@arm.comPipeline::evaluate contains calls to evaluate for each unit and a test for 35210259SAndrew.Bardsley@arm.compipeline idling which can turns off the clock tick if no unit has signalled 35310259SAndrew.Bardsley@arm.comthat it may become active next cycle. 35410259SAndrew.Bardsley@arm.com 35510259SAndrew.Bardsley@arm.comWithin Pipeline (pipeline.hh), the stages are evaluated in reverse order (and 35610259SAndrew.Bardsley@arm.comso will ::evaluate in reverse order) and their backwards data can be 35710259SAndrew.Bardsley@arm.comread immediately after being written in each cycle allowing output decisions 35810259SAndrew.Bardsley@arm.comto be 'perfect' (allowing synchronous stalling of the whole pipeline). Branch 35910259SAndrew.Bardsley@arm.compredictions from Fetch2 to Fetch1 can also be transported in 0 cycles making 36010259SAndrew.Bardsley@arm.comfetch1ToFetch2BackwardDelay the only configurable delay which can be set as 36110259SAndrew.Bardsley@arm.comlow as 0 cycles. 36210259SAndrew.Bardsley@arm.com 36310259SAndrew.Bardsley@arm.comThe MinorCPU::activateContext and MinorCPU::suspendContext interface can be 36410259SAndrew.Bardsley@arm.comcalled to start and pause threads (threads in the MT sense) and to start and 36510259SAndrew.Bardsley@arm.compause the pipeline. Executing instructions can call this interface 36610259SAndrew.Bardsley@arm.com(indirectly through the ThreadContext) to idle the CPU/their threads. 36710259SAndrew.Bardsley@arm.com 36810259SAndrew.Bardsley@arm.com\subsection stages Each pipeline stage 36910259SAndrew.Bardsley@arm.com 37010259SAndrew.Bardsley@arm.comIn general, the behaviour of a stage (each cycle) is: 37110259SAndrew.Bardsley@arm.com 37210259SAndrew.Bardsley@arm.com\verbatim 37310259SAndrew.Bardsley@arm.com evaluate: 37410259SAndrew.Bardsley@arm.com push input to inputBuffer 37510259SAndrew.Bardsley@arm.com setup references to input/output data slots 37610259SAndrew.Bardsley@arm.com 37710259SAndrew.Bardsley@arm.com do 'every cycle' 'step' tasks 37810259SAndrew.Bardsley@arm.com 37910259SAndrew.Bardsley@arm.com if there is input and there is space in the next stage: 38010259SAndrew.Bardsley@arm.com process and generate a new output 38110259SAndrew.Bardsley@arm.com maybe re-activate the stage 38210259SAndrew.Bardsley@arm.com 38310259SAndrew.Bardsley@arm.com send backwards data 38410259SAndrew.Bardsley@arm.com 38510259SAndrew.Bardsley@arm.com if the stage generated output to the following FIFO: 38610259SAndrew.Bardsley@arm.com signal pipe activity 38710259SAndrew.Bardsley@arm.com 38810259SAndrew.Bardsley@arm.com if the stage has more processable input and space in the next stage: 38910259SAndrew.Bardsley@arm.com re-activate the stage for the next cycle 39010259SAndrew.Bardsley@arm.com 39110259SAndrew.Bardsley@arm.com commit the push to the inputBuffer if that data hasn't all been used 39210259SAndrew.Bardsley@arm.com\endverbatim 39310259SAndrew.Bardsley@arm.com 39410259SAndrew.Bardsley@arm.comThe Execute stage differs from this model as its forward output (branch) data 39510259SAndrew.Bardsley@arm.comis unconditionally sent to Fetch1 and Fetch2. To allow this behaviour, Fetch1 39610259SAndrew.Bardsley@arm.comand Fetch2 must be unconditionally receptive to that data. 39710259SAndrew.Bardsley@arm.com 39810259SAndrew.Bardsley@arm.com\subsection fetch1 Fetch1 stage 39910259SAndrew.Bardsley@arm.com 40010259SAndrew.Bardsley@arm.comFetch1 is responsible for fetching cache lines or partial cache lines from the 40110259SAndrew.Bardsley@arm.comI-cache and passing them on to Fetch2 to be decomposed into instructions. It 40210259SAndrew.Bardsley@arm.comcan receive 'change of stream' indications from both Execute and Fetch2 to 40310259SAndrew.Bardsley@arm.comsignal that it should change its internal fetch address and tag newly fetched 40410259SAndrew.Bardsley@arm.comlines with new stream or prediction sequence numbers. When both Execute and 40510259SAndrew.Bardsley@arm.comFetch2 signal changes of stream at the same time, Fetch1 takes Execute's 40610259SAndrew.Bardsley@arm.comchange. 40710259SAndrew.Bardsley@arm.com 40810259SAndrew.Bardsley@arm.comEvery line issued by Fetch1 will bear a unique line sequence number which can 40910259SAndrew.Bardsley@arm.combe used for debugging stream changes. 41010259SAndrew.Bardsley@arm.com 41110259SAndrew.Bardsley@arm.comWhen fetching from the I-cache, Fetch1 will ask for data from the current 41210259SAndrew.Bardsley@arm.comfetch address (Fetch1::pc) up to the end of the 'data snap' size set in the 41310259SAndrew.Bardsley@arm.comparameter fetch1LineSnapWidth. Subsequent autonomous line fetches will fetch 41410259SAndrew.Bardsley@arm.comwhole lines at a snap boundary and of size fetch1LineWidth. 41510259SAndrew.Bardsley@arm.com 41610259SAndrew.Bardsley@arm.comFetch1 will only initiate a memory fetch if it can reserve space in Fetch2 41710259SAndrew.Bardsley@arm.cominput buffer. That input buffer serves an the fetch queue/LFL for the system. 41810259SAndrew.Bardsley@arm.com 41910259SAndrew.Bardsley@arm.comFetch1 contains two queues: requests and transfers to handle the stages of 42010259SAndrew.Bardsley@arm.comtranslating the address of a line fetch (via the TLB) and accommodating the 42110259SAndrew.Bardsley@arm.comrequest/response of fetches to/from memory. 42210259SAndrew.Bardsley@arm.com 42310259SAndrew.Bardsley@arm.comFetch requests from Fetch1 are pushed into the requests queue as newly 42410259SAndrew.Bardsley@arm.comallocated FetchRequest objects once they have been sent to the ITLB with a 42510259SAndrew.Bardsley@arm.comcall to itb->translateTiming. 42610259SAndrew.Bardsley@arm.com 42710259SAndrew.Bardsley@arm.comA response from the TLB moves the request from the requests queue to the 42810259SAndrew.Bardsley@arm.comtransfers queue. If there is more than one entry in each queue, it is 42910259SAndrew.Bardsley@arm.compossible to get a TLB response for request which is not at the head of the 43010259SAndrew.Bardsley@arm.comrequests queue. In that case, the TLB response is marked up as a state change 43110259SAndrew.Bardsley@arm.comto Translated in the request object, and advancing the request to transfers 43210259SAndrew.Bardsley@arm.com(and the memory system) is left to calls to Fetch1::stepQueues which is called 43310259SAndrew.Bardsley@arm.comin the cycle following any event is received. 43410259SAndrew.Bardsley@arm.com 43510259SAndrew.Bardsley@arm.comFetch1::tryToSendToTransfers is responsible for moving requests between the 43610259SAndrew.Bardsley@arm.comtwo queues and issuing requests to memory. Failed TLB lookups (prefetch 43710259SAndrew.Bardsley@arm.comaborts) continue to occupy space in the queues until they are recovered at the 43810259SAndrew.Bardsley@arm.comhead of transfers. 43910259SAndrew.Bardsley@arm.com 44010259SAndrew.Bardsley@arm.comResponses from memory change the request object state to Complete and 44110259SAndrew.Bardsley@arm.comFetch1::evaluate can pick up response data, package it in the ForwardLineData 44210259SAndrew.Bardsley@arm.comobject, and forward it to Fetch2%'s input buffer. 44310259SAndrew.Bardsley@arm.com 44410259SAndrew.Bardsley@arm.comAs space is always reserved in Fetch2::inputBuffer, setting the input buffer's 44510259SAndrew.Bardsley@arm.comsize to 1 results in non-prefetching behaviour. 44610259SAndrew.Bardsley@arm.com 44710259SAndrew.Bardsley@arm.comWhen a change of stream occurs, translated requests queue members and 44810259SAndrew.Bardsley@arm.comcompleted transfers queue members can be unconditionally discarded to make way 44910259SAndrew.Bardsley@arm.comfor new transfers. 45010259SAndrew.Bardsley@arm.com 45110259SAndrew.Bardsley@arm.com\subsection fetch2 Fetch2 stage 45210259SAndrew.Bardsley@arm.com 45310259SAndrew.Bardsley@arm.comFetch2 receives a line from Fetch1 into its input buffer. The data in the 45410259SAndrew.Bardsley@arm.comhead line in that buffer is iterated over and separated into individual 45510259SAndrew.Bardsley@arm.cominstructions which are packed into a vector of instructions which can be 45610259SAndrew.Bardsley@arm.compassed to Decode. Packing instructions can be aborted early if a fault is 45710259SAndrew.Bardsley@arm.comfound in either the input line as a whole or a decomposed instruction. 45810259SAndrew.Bardsley@arm.com 45910259SAndrew.Bardsley@arm.com\subsubsection bp Branch prediction 46010259SAndrew.Bardsley@arm.com 46110259SAndrew.Bardsley@arm.comFetch2 contains the branch prediction mechanism. This is a wrapper around the 46210259SAndrew.Bardsley@arm.combranch predictor interface provided by gem5 (cpu/pred/...). 46310259SAndrew.Bardsley@arm.com 46410259SAndrew.Bardsley@arm.comBranches are predicted for any control instructions found. If prediction is 46510259SAndrew.Bardsley@arm.comattempted for an instruction, the MinorDynInst::triedToPredict flag is set on 46610259SAndrew.Bardsley@arm.comthat instruction. 46710259SAndrew.Bardsley@arm.com 46810259SAndrew.Bardsley@arm.comWhen a branch is predicted to take, the MinorDynInst::predictedTaken flag is 46910259SAndrew.Bardsley@arm.comset and MinorDynInst::predictedTarget is set to the predicted target PC value. 47010259SAndrew.Bardsley@arm.comThe predicted branch instruction is then packed into Fetch2%'s output vector, 47110259SAndrew.Bardsley@arm.comthe prediction sequence number is incremented, and the branch is communicated 47210259SAndrew.Bardsley@arm.comto Fetch1. 47310259SAndrew.Bardsley@arm.com 47410259SAndrew.Bardsley@arm.comAfter signalling a prediction, Fetch2 will discard its input buffer contents 47510259SAndrew.Bardsley@arm.comand will reject any new lines which have the same stream sequence number as 47610259SAndrew.Bardsley@arm.comthat branch but have a different prediction sequence number. This allows 47710259SAndrew.Bardsley@arm.comfollowing sequentially fetched lines to be rejected without ignoring new lines 47810259SAndrew.Bardsley@arm.comgenerated by a change of stream indicated from a 'real' branch from Execute 47910259SAndrew.Bardsley@arm.com(which will have a new stream sequence number). 48010259SAndrew.Bardsley@arm.com 48110259SAndrew.Bardsley@arm.comThe program counter value provided to Fetch2 by Fetch1 packets is only updated 48210259SAndrew.Bardsley@arm.comwhen there is a change of stream. Fetch2::havePC indicates whether the PC 48310259SAndrew.Bardsley@arm.comwill be picked up from the next processed input line. Fetch2::havePC is 48410259SAndrew.Bardsley@arm.comnecessary to allow line-wrapping instructions to be tracked through decode. 48510259SAndrew.Bardsley@arm.com 48610259SAndrew.Bardsley@arm.comBranches (and instructions predicted to branch) which are processed by Execute 48710259SAndrew.Bardsley@arm.comwill generate BranchData (pipe_data.hh) data explaining the outcome of the 48810259SAndrew.Bardsley@arm.combranch which is sent forwards to Fetch1 and Fetch2. Fetch1 uses this data to 48910259SAndrew.Bardsley@arm.comchange stream (and update its stream sequence number and address for new 49010259SAndrew.Bardsley@arm.comlines). Fetch2 uses it to update the branch predictor. Minor does not 49110259SAndrew.Bardsley@arm.comcommunicate branch data to the branch predictor for instructions which are 49210259SAndrew.Bardsley@arm.comdiscarded on the way to commit. 49310259SAndrew.Bardsley@arm.com 49410259SAndrew.Bardsley@arm.comBranchData::BranchReason (pipe_data.hh) encodes the possible branch scenarios: 49510259SAndrew.Bardsley@arm.com 49610259SAndrew.Bardsley@arm.com<table> 49710259SAndrew.Bardsley@arm.com<tr> 49810259SAndrew.Bardsley@arm.com <td>Branch enum val.</td> 49910259SAndrew.Bardsley@arm.com <td>In Execute</td> 50010259SAndrew.Bardsley@arm.com <td>Fetch1 reaction</td> 50110259SAndrew.Bardsley@arm.com <td>Fetch2 reaction</td> 50210259SAndrew.Bardsley@arm.com</tr> 50310259SAndrew.Bardsley@arm.com<tr> 50410259SAndrew.Bardsley@arm.com <td>NoBranch</td> 50510259SAndrew.Bardsley@arm.com <td>(output bubble data)</td> 50610259SAndrew.Bardsley@arm.com <td>-</td> 50710259SAndrew.Bardsley@arm.com <td>-</td> 50810259SAndrew.Bardsley@arm.com</tr> 50910259SAndrew.Bardsley@arm.com<tr> 51010259SAndrew.Bardsley@arm.com <td>CorrectlyPredictedBranch</td> 51110259SAndrew.Bardsley@arm.com <td>Predicted, taken</td> 51210259SAndrew.Bardsley@arm.com <td>-</td> 51310259SAndrew.Bardsley@arm.com <td>Update BP as taken branch</td> 51410259SAndrew.Bardsley@arm.com</tr> 51510259SAndrew.Bardsley@arm.com<tr> 51610259SAndrew.Bardsley@arm.com <td>UnpredictedBranch</td> 51710259SAndrew.Bardsley@arm.com <td>Not predicted, taken and was taken</td> 51810259SAndrew.Bardsley@arm.com <td>New stream</td> 51910259SAndrew.Bardsley@arm.com <td>Update BP as taken branch</td> 52010259SAndrew.Bardsley@arm.com</tr> 52110259SAndrew.Bardsley@arm.com<tr> 52210259SAndrew.Bardsley@arm.com <td>BadlyPredictedBranch</td> 52310259SAndrew.Bardsley@arm.com <td>Predicted, not taken</td> 52410259SAndrew.Bardsley@arm.com <td>New stream to restore to old inst. source</td> 52510259SAndrew.Bardsley@arm.com <td>Update BP as not taken branch</td> 52610259SAndrew.Bardsley@arm.com</tr> 52710259SAndrew.Bardsley@arm.com<tr> 52810259SAndrew.Bardsley@arm.com <td>BadlyPredictedBranchTarget</td> 52910259SAndrew.Bardsley@arm.com <td>Predicted, taken, but to a different target than predicted one</td> 53010259SAndrew.Bardsley@arm.com <td>New stream</td> 53110259SAndrew.Bardsley@arm.com <td>Update BTB to new target</td> 53210259SAndrew.Bardsley@arm.com</tr> 53310259SAndrew.Bardsley@arm.com<tr> 53410259SAndrew.Bardsley@arm.com <td>SuspendThread</td> 53510259SAndrew.Bardsley@arm.com <td>Hint to suspend fetching</td> 53610259SAndrew.Bardsley@arm.com <td>Suspend fetch for this thread (branch to next inst. as wakeup 53710259SAndrew.Bardsley@arm.com fetch addr)</td> 53810259SAndrew.Bardsley@arm.com <td>-</td> 53910259SAndrew.Bardsley@arm.com</tr> 54010259SAndrew.Bardsley@arm.com<tr> 54110259SAndrew.Bardsley@arm.com <td>Interrupt</td> 54210259SAndrew.Bardsley@arm.com <td>Interrupt detected</td> 54310259SAndrew.Bardsley@arm.com <td>New stream</td> 54410259SAndrew.Bardsley@arm.com <td>-</td> 54510259SAndrew.Bardsley@arm.com</tr> 54610259SAndrew.Bardsley@arm.com</table> 54710259SAndrew.Bardsley@arm.com 54810259SAndrew.Bardsley@arm.comThe parameter decodeInputWidth sets the number of instructions which can be 54910259SAndrew.Bardsley@arm.compacked into the output per cycle. If the parameter fetch2CycleInput is true, 55010259SAndrew.Bardsley@arm.comDecode can try to take instructions from more than one entry in its input 55110259SAndrew.Bardsley@arm.combuffer per cycle. 55210259SAndrew.Bardsley@arm.com 55310259SAndrew.Bardsley@arm.com\subsection decode Decode stage 55410259SAndrew.Bardsley@arm.com 55510259SAndrew.Bardsley@arm.comDecode takes a vector of instructions from Fetch2 (via its input buffer) and 55610259SAndrew.Bardsley@arm.comdecomposes those instructions into micro-ops (if necessary) and packs them 55710259SAndrew.Bardsley@arm.cominto its output instruction vector. 55810259SAndrew.Bardsley@arm.com 55910259SAndrew.Bardsley@arm.comThe parameter executeInputWidth sets the number of instructions which can be 56010259SAndrew.Bardsley@arm.compacked into the output per cycle. If the parameter decodeCycleInput is true, 56110259SAndrew.Bardsley@arm.comDecode can try to take instructions from more than one entry in its input 56210259SAndrew.Bardsley@arm.combuffer per cycle. 56310259SAndrew.Bardsley@arm.com 56410259SAndrew.Bardsley@arm.com\subsection execute Execute stage 56510259SAndrew.Bardsley@arm.com 56610259SAndrew.Bardsley@arm.comExecute provides all the instruction execution and memory access mechanisms. 56710259SAndrew.Bardsley@arm.comAn instructions passage through Execute can take multiple cycles with its 56810259SAndrew.Bardsley@arm.comprecise timing modelled by a functional unit pipeline FIFO. 56910259SAndrew.Bardsley@arm.com 57010259SAndrew.Bardsley@arm.comA vector of instructions (possibly including fault 'instructions') is provided 57110259SAndrew.Bardsley@arm.comto Execute by Decode and can be queued in the Execute input buffer before 57210259SAndrew.Bardsley@arm.combeing issued. Setting the parameter executeCycleInput allows execute to 57310259SAndrew.Bardsley@arm.comexamine more than one input buffer entry (more than one instruction vector). 57410259SAndrew.Bardsley@arm.comThe number of instructions in the input vector can be set with 57510259SAndrew.Bardsley@arm.comexecuteInputWidth and the depth of the input buffer can be set with parameter 57610259SAndrew.Bardsley@arm.comexecuteInputBufferSize. 57710259SAndrew.Bardsley@arm.com 57810259SAndrew.Bardsley@arm.com\subsubsection fus Functional units 57910259SAndrew.Bardsley@arm.com 58010259SAndrew.Bardsley@arm.comThe Execute stage contains pipelines for each functional unit comprising the 58110259SAndrew.Bardsley@arm.comcomputational core of the CPU. Functional units are configured via the 58210259SAndrew.Bardsley@arm.comexecuteFuncUnits parameter. Each functional unit has a number of instruction 58310259SAndrew.Bardsley@arm.comclasses it supports, a stated delay between instruction issues, and a delay 58410259SAndrew.Bardsley@arm.comfrom instruction issue to (possible) commit and an optional timing annotation 58510259SAndrew.Bardsley@arm.comcapable of more complicated timing. 58610259SAndrew.Bardsley@arm.com 58710259SAndrew.Bardsley@arm.comEach active cycle, Execute::evaluate performs this action: 58810259SAndrew.Bardsley@arm.com 58910259SAndrew.Bardsley@arm.com\verbatim 59010259SAndrew.Bardsley@arm.com Execute::evaluate: 59110259SAndrew.Bardsley@arm.com push input to inputBuffer 59210259SAndrew.Bardsley@arm.com setup references to input/output data slots and branch output slot 59310259SAndrew.Bardsley@arm.com 59410259SAndrew.Bardsley@arm.com step D-cache interface queues (similar to Fetch1) 59510259SAndrew.Bardsley@arm.com 59610259SAndrew.Bardsley@arm.com if interrupt posted: 59710259SAndrew.Bardsley@arm.com take interrupt (signalling branch to Fetch1/Fetch2) 59810259SAndrew.Bardsley@arm.com else 59910259SAndrew.Bardsley@arm.com commit instructions 60010259SAndrew.Bardsley@arm.com issue new instructions 60110259SAndrew.Bardsley@arm.com 60210259SAndrew.Bardsley@arm.com advance functional unit pipelines 60310259SAndrew.Bardsley@arm.com 60410259SAndrew.Bardsley@arm.com reactivate Execute if the unit is still active 60510259SAndrew.Bardsley@arm.com 60610259SAndrew.Bardsley@arm.com commit the push to the inputBuffer if that data hasn't all been used 60710259SAndrew.Bardsley@arm.com\endverbatim 60810259SAndrew.Bardsley@arm.com 60910259SAndrew.Bardsley@arm.com\subsubsection fifos Functional unit FIFOs 61010259SAndrew.Bardsley@arm.com 61110259SAndrew.Bardsley@arm.comFunctional units are implemented as SelfStallingPipelines (stage.hh). These 61210259SAndrew.Bardsley@arm.comare TimeBuffer FIFOs with two distinct 'push' and 'pop' wires. They respond 61310259SAndrew.Bardsley@arm.comto SelfStallingPipeline::advance in the same way as TimeBuffers <b>unless</b> 61410259SAndrew.Bardsley@arm.comthere is data at the far, 'pop', end of the FIFO. A 'stalled' flag is 61510259SAndrew.Bardsley@arm.comprovided for signalling stalling and to allow a stall to be cleared. The 61610259SAndrew.Bardsley@arm.comintention is to provide a pipeline for each functional unit which will never 61710259SAndrew.Bardsley@arm.comadvance an instruction out of that pipeline until it has been processed and 61810259SAndrew.Bardsley@arm.comthe pipeline is explicitly unstalled. 61910259SAndrew.Bardsley@arm.com 62010259SAndrew.Bardsley@arm.comThe actions 'issue', 'commit', and 'advance' act on the functional units. 62110259SAndrew.Bardsley@arm.com 62210259SAndrew.Bardsley@arm.com\subsubsection issue Issue 62310259SAndrew.Bardsley@arm.com 62410259SAndrew.Bardsley@arm.comIssuing instructions involves iterating over both the input buffer 62510259SAndrew.Bardsley@arm.cominstructions and the heads of the functional units to try and issue 62610259SAndrew.Bardsley@arm.cominstructions in order. The number of instructions which can be issued each 62710259SAndrew.Bardsley@arm.comcycle is limited by the parameter executeIssueLimit, how executeCycleInput is 62810259SAndrew.Bardsley@arm.comset, the availability of pipeline space and the policy used to choose a 62910259SAndrew.Bardsley@arm.compipeline in which the instruction can be issued. 63010259SAndrew.Bardsley@arm.com 63110259SAndrew.Bardsley@arm.comAt present, the only issue policy is strict round-robin visiting of each 63210259SAndrew.Bardsley@arm.compipeline with the given instructions in sequence. For greater flexibility, 63310259SAndrew.Bardsley@arm.combetter (and more specific policies) will need to be possible. 63410259SAndrew.Bardsley@arm.com 63510259SAndrew.Bardsley@arm.comMemory operation instructions traverse their functional units to perform their 63610259SAndrew.Bardsley@arm.comEA calculations. On 'commit', the ExecContext::initiateAcc execution phase is 63710259SAndrew.Bardsley@arm.comperformed and any memory access is issued (via. ExecContext::{read,write}Mem 63810259SAndrew.Bardsley@arm.comcalling LSQ::pushRequest) to the LSQ. 63910259SAndrew.Bardsley@arm.com 64010259SAndrew.Bardsley@arm.comNote that faults are issued as if they are instructions and can (currently) be 64110259SAndrew.Bardsley@arm.comissued to *any* functional unit. 64210259SAndrew.Bardsley@arm.com 64310259SAndrew.Bardsley@arm.comEvery issued instruction is also pushed into the Execute::inFlightInsts queue. 64410259SAndrew.Bardsley@arm.comMemory ref. instructions are pushing into Execute::inFUMemInsts queue. 64510259SAndrew.Bardsley@arm.com 64610259SAndrew.Bardsley@arm.com\subsubsection commit Commit 64710259SAndrew.Bardsley@arm.com 64810259SAndrew.Bardsley@arm.comInstructions are committed by examining the head of the Execute::inFlightInsts 64910259SAndrew.Bardsley@arm.comqueue (which is decorated with the functional unit number to which the 65010259SAndrew.Bardsley@arm.cominstruction was issued). Instructions which can then be found in their 65110259SAndrew.Bardsley@arm.comfunctional units are executed and popped from Execute::inFlightInsts. 65210259SAndrew.Bardsley@arm.com 65310259SAndrew.Bardsley@arm.comMemory operation instructions are committed into the memory queues (as 65410259SAndrew.Bardsley@arm.comdescribed above) and exit their functional unit pipeline but are not popped 65510259SAndrew.Bardsley@arm.comfrom the Execute::inFlightInsts queue. The Execute::inFUMemInsts queue 65610259SAndrew.Bardsley@arm.comprovides ordering to memory operations as they pass through the functional 65710259SAndrew.Bardsley@arm.comunits (maintaining issue order). On entering the LSQ, instructions are popped 65810259SAndrew.Bardsley@arm.comfrom Execute::inFUMemInsts. 65910259SAndrew.Bardsley@arm.com 66010259SAndrew.Bardsley@arm.comIf the parameter executeAllowEarlyMemoryIssue is set, memory operations can be 66110259SAndrew.Bardsley@arm.comsent from their FU to the LSQ before reaching the head of 66210259SAndrew.Bardsley@arm.comExecute::inFlightInsts but after their dependencies are met. 66310259SAndrew.Bardsley@arm.comMinorDynInst::instToWaitFor is marked up with the latest dependent instruction 66410259SAndrew.Bardsley@arm.comexecSeqNum required to be committed for a memory operation to progress to the 66510259SAndrew.Bardsley@arm.comLSQ. 66610259SAndrew.Bardsley@arm.com 66710259SAndrew.Bardsley@arm.comOnce a memory response is available (by testing the head of 66810259SAndrew.Bardsley@arm.comExecute::inFlightInsts against LSQ::findResponse), commit will process that 66910259SAndrew.Bardsley@arm.comresponse (ExecContext::completeAcc) and pop the instruction from 67010259SAndrew.Bardsley@arm.comExecute::inFlightInsts. 67110259SAndrew.Bardsley@arm.com 67210259SAndrew.Bardsley@arm.comAny branch, fault or interrupt will cause a stream sequence number change and 67310259SAndrew.Bardsley@arm.comsignal a branch to Fetch1/Fetch2. Only instructions with the current stream 67410259SAndrew.Bardsley@arm.comsequence number will be issued and/or committed. 67510259SAndrew.Bardsley@arm.com 67610259SAndrew.Bardsley@arm.com\subsubsection advance Advance 67710259SAndrew.Bardsley@arm.com 67810259SAndrew.Bardsley@arm.comAll non-stalled pipeline are advanced and may, thereafter, become stalled. 67910259SAndrew.Bardsley@arm.comPotential activity in the next cycle is signalled if there are any 68010259SAndrew.Bardsley@arm.cominstructions remaining in any pipeline. 68110259SAndrew.Bardsley@arm.com 68210259SAndrew.Bardsley@arm.com\subsubsection sb Scoreboard 68310259SAndrew.Bardsley@arm.com 68410259SAndrew.Bardsley@arm.comThe scoreboard (Scoreboard) is used to control instruction issue. It contains 68510259SAndrew.Bardsley@arm.coma count of the number of in flight instructions which will write each general 68610259SAndrew.Bardsley@arm.compurpose CPU integer or float register. Instructions will only be issued where 68710259SAndrew.Bardsley@arm.comthe scoreboard contains a count of 0 instructions which will write to one of 68810259SAndrew.Bardsley@arm.comthe instructions source registers. 68910259SAndrew.Bardsley@arm.com 69010259SAndrew.Bardsley@arm.comOnce an instruction is issued, the scoreboard counts for each destination 69110259SAndrew.Bardsley@arm.comregister for an instruction will be incremented. 69210259SAndrew.Bardsley@arm.com 69310259SAndrew.Bardsley@arm.comThe estimated delivery time of the instruction's result is marked up in the 69410259SAndrew.Bardsley@arm.comscoreboard by adding the length of the issued-to FU to the current time. The 69510259SAndrew.Bardsley@arm.comtimings parameter on each FU provides a list of additional rules for 69610259SAndrew.Bardsley@arm.comcalculating the delivery time. These are documented in the parameter comments 69710259SAndrew.Bardsley@arm.comin MinorCPU.py. 69810259SAndrew.Bardsley@arm.com 69910259SAndrew.Bardsley@arm.comOn commit, (for memory operations, memory response commit) the scoreboard 70010259SAndrew.Bardsley@arm.comcounters for an instruction's source registers are decremented. will be 70110259SAndrew.Bardsley@arm.comdecremented. 70210259SAndrew.Bardsley@arm.com 70310259SAndrew.Bardsley@arm.com\subsubsection ifi Execute::inFlightInsts 70410259SAndrew.Bardsley@arm.com 70510259SAndrew.Bardsley@arm.comThe Execute::inFlightInsts queue will always contain all instructions in 70610259SAndrew.Bardsley@arm.comflight in Execute in the correct issue order. Execute::issue is the only 70710259SAndrew.Bardsley@arm.comprocess which will push an instruction into the queue. Execute::commit is the 70810259SAndrew.Bardsley@arm.comonly process that can pop an instruction. 70910259SAndrew.Bardsley@arm.com 71010259SAndrew.Bardsley@arm.com\subsubsection lsq LSQ 71110259SAndrew.Bardsley@arm.com 71210259SAndrew.Bardsley@arm.comThe LSQ can support multiple outstanding transactions to memory in a number of 71310259SAndrew.Bardsley@arm.comconservative cases. 71410259SAndrew.Bardsley@arm.com 71510259SAndrew.Bardsley@arm.comThere are three queues to contain requests: requests, transfers and the store 71610259SAndrew.Bardsley@arm.combuffer. The requests and transfers queue operate in a similar manner to the 71710259SAndrew.Bardsley@arm.comqueues in Fetch1. The store buffer is used to decouple the delay of 71810259SAndrew.Bardsley@arm.comcompleting store operations from following loads. 71910259SAndrew.Bardsley@arm.com 72010259SAndrew.Bardsley@arm.comRequests are issued to the DTLB as their instructions leave their functional 72110259SAndrew.Bardsley@arm.comunit. At the head of requests, cacheable load requests can be sent to memory 72210259SAndrew.Bardsley@arm.comand on to the transfers queue. Cacheable stores will be passed to transfers 72310259SAndrew.Bardsley@arm.comunprocessed and progress that queue maintaining order with other transactions. 72410259SAndrew.Bardsley@arm.com 72510259SAndrew.Bardsley@arm.comThe conditions in LSQ::tryToSendToTransfers dictate when requests can 72610259SAndrew.Bardsley@arm.combe sent to memory. 72710259SAndrew.Bardsley@arm.com 72810259SAndrew.Bardsley@arm.comAll uncacheable transactions, split transactions and locked transactions are 72910259SAndrew.Bardsley@arm.comprocessed in order at the head of requests. Additionally, store results 73010259SAndrew.Bardsley@arm.comresiding in the store buffer can have their data forwarded to cacheable loads 73110259SAndrew.Bardsley@arm.com(removing the need to perform a read from memory) but no cacheable load can be 73210259SAndrew.Bardsley@arm.comissue to the transfers queue until that queue's stores have drained into the 73310259SAndrew.Bardsley@arm.comstore buffer. 73410259SAndrew.Bardsley@arm.com 73510259SAndrew.Bardsley@arm.comAt the end of transfers, requests which are LSQ::LSQRequest::Complete (are 73610259SAndrew.Bardsley@arm.comfaulting, are cacheable stores, or have been sent to memory and received a 73710259SAndrew.Bardsley@arm.comresponse) can be picked off by Execute and either committed 73810259SAndrew.Bardsley@arm.com(ExecContext::completeAcc) and, for stores, be sent to the store buffer. 73910259SAndrew.Bardsley@arm.com 74010259SAndrew.Bardsley@arm.comBarrier instructions do not prevent cacheable loads from progressing to memory 74110259SAndrew.Bardsley@arm.combut do cause a stream change which will discard that load. Stores will not be 74210259SAndrew.Bardsley@arm.comcommitted to the store buffer if they are in the shadow of the barrier but 74310259SAndrew.Bardsley@arm.combefore the new instruction stream has arrived at Execute. As all other memory 74410259SAndrew.Bardsley@arm.comtransactions are delayed at the end of the requests queue until they are at 74510259SAndrew.Bardsley@arm.comthe head of Execute::inFlightInsts, they will be discarded by any barrier 74610259SAndrew.Bardsley@arm.comstream change. 74710259SAndrew.Bardsley@arm.com 74810259SAndrew.Bardsley@arm.comAfter commit, LSQ::BarrierDataRequest requests are inserted into the 74910259SAndrew.Bardsley@arm.comstore buffer to track each barrier until all preceding memory transactions 75010259SAndrew.Bardsley@arm.comhave drained from the store buffer. No further memory transactions will be 75110259SAndrew.Bardsley@arm.comissued from the ends of FUs until after the barrier has drained. 75210259SAndrew.Bardsley@arm.com 75310259SAndrew.Bardsley@arm.com\subsubsection drain Draining 75410259SAndrew.Bardsley@arm.com 75510259SAndrew.Bardsley@arm.comDraining is mostly handled by the Execute stage. When initiated by calling 75610259SAndrew.Bardsley@arm.comMinorCPU::drain, Pipeline::evaluate checks the draining status of each unit 75710259SAndrew.Bardsley@arm.comeach cycle and keeps the pipeline active until draining is complete. It is 75810259SAndrew.Bardsley@arm.comPipeline that signals the completion of draining. Execute is triggered by 75910259SAndrew.Bardsley@arm.comMinorCPU::drain and starts stepping through its Execute::DrainState state 76010259SAndrew.Bardsley@arm.commachine, starting from state Execute::NotDraining, in this order: 76110259SAndrew.Bardsley@arm.com 76210259SAndrew.Bardsley@arm.com<table> 76310259SAndrew.Bardsley@arm.com<tr> 76410259SAndrew.Bardsley@arm.com <td><b>State</b></td> 76510259SAndrew.Bardsley@arm.com <td><b>Meaning</b></td> 76610259SAndrew.Bardsley@arm.com</tr> 76710259SAndrew.Bardsley@arm.com<tr> 76810259SAndrew.Bardsley@arm.com <td>Execute::NotDraining</td> 76910259SAndrew.Bardsley@arm.com <td>Not trying to drain, normal execution</td> 77010259SAndrew.Bardsley@arm.com</tr> 77110259SAndrew.Bardsley@arm.com<tr> 77210259SAndrew.Bardsley@arm.com <td>Execute::DrainCurrentInst</td> 77310259SAndrew.Bardsley@arm.com <td>Draining micro-ops to complete inst.</td> 77410259SAndrew.Bardsley@arm.com</tr> 77510259SAndrew.Bardsley@arm.com<tr> 77610259SAndrew.Bardsley@arm.com <td>Execute::DrainHaltFetch</td> 77710259SAndrew.Bardsley@arm.com <td>Halt fetching instructions</td> 77810259SAndrew.Bardsley@arm.com</tr> 77910259SAndrew.Bardsley@arm.com<tr> 78010259SAndrew.Bardsley@arm.com <td>Execute::DrainAllInsts</td> 78110259SAndrew.Bardsley@arm.com <td>Discarding all instructions presented</td> 78210259SAndrew.Bardsley@arm.com</tr> 78310259SAndrew.Bardsley@arm.com</table> 78410259SAndrew.Bardsley@arm.com 78510259SAndrew.Bardsley@arm.comWhen complete, a drained Execute unit will be in the Execute::DrainAllInsts 78610259SAndrew.Bardsley@arm.comstate where it will continue to discard instructions but has no knowledge of 78710259SAndrew.Bardsley@arm.comthe drained state of the rest of the model. 78810259SAndrew.Bardsley@arm.com 78910259SAndrew.Bardsley@arm.com\section debug Debug options 79010259SAndrew.Bardsley@arm.com 79110259SAndrew.Bardsley@arm.comThe model provides a number of debug flags which can be passed to gem5 with 79210259SAndrew.Bardsley@arm.comthe --debug-flags option. 79310259SAndrew.Bardsley@arm.com 79410259SAndrew.Bardsley@arm.comThe available flags are: 79510259SAndrew.Bardsley@arm.com 79610259SAndrew.Bardsley@arm.com<table> 79710259SAndrew.Bardsley@arm.com<tr> 79810259SAndrew.Bardsley@arm.com <td><b>Debug flag</b></td> 79910259SAndrew.Bardsley@arm.com <td><b>Unit which will generate debugging output</b></td> 80010259SAndrew.Bardsley@arm.com</tr> 80110259SAndrew.Bardsley@arm.com<tr> 80210259SAndrew.Bardsley@arm.com <td>Activity</td> 80310259SAndrew.Bardsley@arm.com <td>Debug ActivityMonitor actions</td> 80410259SAndrew.Bardsley@arm.com</tr> 80510259SAndrew.Bardsley@arm.com<tr> 80610259SAndrew.Bardsley@arm.com <td>Branch</td> 80710259SAndrew.Bardsley@arm.com <td>Fetch2 and Execute branch prediction decisions</td> 80810259SAndrew.Bardsley@arm.com</tr> 80910259SAndrew.Bardsley@arm.com<tr> 81010259SAndrew.Bardsley@arm.com <td>MinorCPU</td> 81110259SAndrew.Bardsley@arm.com <td>CPU global actions such as wakeup/thread suspension</td> 81210259SAndrew.Bardsley@arm.com</tr> 81310259SAndrew.Bardsley@arm.com<tr> 81410259SAndrew.Bardsley@arm.com <td>Decode</td> 81510259SAndrew.Bardsley@arm.com <td>Decode</td> 81610259SAndrew.Bardsley@arm.com</tr> 81710259SAndrew.Bardsley@arm.com<tr> 81810259SAndrew.Bardsley@arm.com <td>MinorExec</td> 81910259SAndrew.Bardsley@arm.com <td>Execute behaviour</td> 82010259SAndrew.Bardsley@arm.com</tr> 82110259SAndrew.Bardsley@arm.com<tr> 82210259SAndrew.Bardsley@arm.com <td>Fetch</td> 82310259SAndrew.Bardsley@arm.com <td>Fetch1 and Fetch2</td> 82410259SAndrew.Bardsley@arm.com</tr> 82510259SAndrew.Bardsley@arm.com<tr> 82610259SAndrew.Bardsley@arm.com <td>MinorInterrupt</td> 82710259SAndrew.Bardsley@arm.com <td>Execute interrupt handling</td> 82810259SAndrew.Bardsley@arm.com</tr> 82910259SAndrew.Bardsley@arm.com<tr> 83010259SAndrew.Bardsley@arm.com <td>MinorMem</td> 83110259SAndrew.Bardsley@arm.com <td>Execute memory interactions</td> 83210259SAndrew.Bardsley@arm.com</tr> 83310259SAndrew.Bardsley@arm.com<tr> 83410259SAndrew.Bardsley@arm.com <td>MinorScoreboard</td> 83510259SAndrew.Bardsley@arm.com <td>Execute scoreboard activity</td> 83610259SAndrew.Bardsley@arm.com</tr> 83710259SAndrew.Bardsley@arm.com<tr> 83810259SAndrew.Bardsley@arm.com <td>MinorTrace</td> 83910259SAndrew.Bardsley@arm.com <td>Generate MinorTrace cyclic state trace output (see below)</td> 84010259SAndrew.Bardsley@arm.com</tr> 84110259SAndrew.Bardsley@arm.com<tr> 84210259SAndrew.Bardsley@arm.com <td>MinorTiming</td> 84310259SAndrew.Bardsley@arm.com <td>MinorTiming instruction timing modification operations</td> 84410259SAndrew.Bardsley@arm.com</tr> 84510259SAndrew.Bardsley@arm.com</table> 84610259SAndrew.Bardsley@arm.com 84710259SAndrew.Bardsley@arm.comThe group flag Minor enables all of the flags beginning with Minor. 84810259SAndrew.Bardsley@arm.com 84910259SAndrew.Bardsley@arm.com\section trace MinorTrace and minorview.py 85010259SAndrew.Bardsley@arm.com 85110259SAndrew.Bardsley@arm.comThe debug flag MinorTrace causes cycle-by-cycle state data to be printed which 85210259SAndrew.Bardsley@arm.comcan then be processed and viewed by the minorview.py tool. This output is 85310259SAndrew.Bardsley@arm.comvery verbose and so it is recommended it only be used for small examples. 85410259SAndrew.Bardsley@arm.com 85510259SAndrew.Bardsley@arm.com\subsection traceformat MinorTrace format 85610259SAndrew.Bardsley@arm.com 85710259SAndrew.Bardsley@arm.comThere are three types of line outputted by MinorTrace: 85810259SAndrew.Bardsley@arm.com 85910259SAndrew.Bardsley@arm.com\subsubsection state MinorTrace - Ticked unit cycle state 86010259SAndrew.Bardsley@arm.com 86110259SAndrew.Bardsley@arm.comFor example: 86210259SAndrew.Bardsley@arm.com 86310259SAndrew.Bardsley@arm.com\verbatim 86410259SAndrew.Bardsley@arm.com 110000: system.cpu.dcachePort: MinorTrace: state=MemoryRunning in_tlb_mem=0/0 86510259SAndrew.Bardsley@arm.com\endverbatim 86610259SAndrew.Bardsley@arm.com 86710259SAndrew.Bardsley@arm.comFor each time step, the MinorTrace flag will cause one MinorTrace line to be 86810259SAndrew.Bardsley@arm.comprinted for every named element in the model. 86910259SAndrew.Bardsley@arm.com 87010259SAndrew.Bardsley@arm.com\subsubsection traceunit MinorInst - summaries of instructions issued by \ 87110259SAndrew.Bardsley@arm.com Decode 87210259SAndrew.Bardsley@arm.com 87310259SAndrew.Bardsley@arm.comFor example: 87410259SAndrew.Bardsley@arm.com 87510259SAndrew.Bardsley@arm.com\verbatim 87610259SAndrew.Bardsley@arm.com 140000: system.cpu.execute: MinorInst: id=0/1.1/1/1.1 addr=0x5c \ 87710259SAndrew.Bardsley@arm.com inst=" mov r0, #0" class=IntAlu 87810259SAndrew.Bardsley@arm.com\endverbatim 87910259SAndrew.Bardsley@arm.com 88010259SAndrew.Bardsley@arm.comMinorInst lines are currently only generated for instructions which are 88110259SAndrew.Bardsley@arm.comcommitted. 88210259SAndrew.Bardsley@arm.com 88310259SAndrew.Bardsley@arm.com\subsubsection tracefetch1 MinorLine - summaries of line fetches issued by \ 88410259SAndrew.Bardsley@arm.com Fetch1 88510259SAndrew.Bardsley@arm.com 88610259SAndrew.Bardsley@arm.comFor example: 88710259SAndrew.Bardsley@arm.com 88810259SAndrew.Bardsley@arm.com\verbatim 88910259SAndrew.Bardsley@arm.com 92000: system.cpu.icachePort: MinorLine: id=0/1.1/1 size=36 \ 89010259SAndrew.Bardsley@arm.com vaddr=0x5c paddr=0x5c 89110259SAndrew.Bardsley@arm.com\endverbatim 89210259SAndrew.Bardsley@arm.com 89310259SAndrew.Bardsley@arm.com\subsection minorview minorview.py 89410259SAndrew.Bardsley@arm.com 89510259SAndrew.Bardsley@arm.comMinorview (util/minorview.py) can be used to visualise the data created by 89610259SAndrew.Bardsley@arm.comMinorTrace. 89710259SAndrew.Bardsley@arm.com 89810259SAndrew.Bardsley@arm.com\verbatim 89910259SAndrew.Bardsley@arm.comusage: minorview.py [-h] [--picture picture-file] [--prefix name] 90010259SAndrew.Bardsley@arm.com [--start-time time] [--end-time time] [--mini-views] 90110259SAndrew.Bardsley@arm.com event-file 90210259SAndrew.Bardsley@arm.com 90310259SAndrew.Bardsley@arm.comMinor visualiser 90410259SAndrew.Bardsley@arm.com 90510259SAndrew.Bardsley@arm.compositional arguments: 90610259SAndrew.Bardsley@arm.com event-file 90710259SAndrew.Bardsley@arm.com 90810259SAndrew.Bardsley@arm.comoptional arguments: 90910259SAndrew.Bardsley@arm.com -h, --help show this help message and exit 91010259SAndrew.Bardsley@arm.com --picture picture-file 91110259SAndrew.Bardsley@arm.com markup file containing blob information (default: 91210259SAndrew.Bardsley@arm.com <minorview-path>/minor.pic) 91310259SAndrew.Bardsley@arm.com --prefix name name prefix in trace for CPU to be visualised 91410259SAndrew.Bardsley@arm.com (default: system.cpu) 91510259SAndrew.Bardsley@arm.com --start-time time time of first event to load from file 91610259SAndrew.Bardsley@arm.com --end-time time time of last event to load from file 91710259SAndrew.Bardsley@arm.com --mini-views show tiny views of the next 10 time steps 91810259SAndrew.Bardsley@arm.com\endverbatim 91910259SAndrew.Bardsley@arm.com 92010259SAndrew.Bardsley@arm.comRaw debugging output can be passed to minorview.py as the event-file. It will 92110259SAndrew.Bardsley@arm.compick out the MinorTrace lines and use other lines where units in the 92210259SAndrew.Bardsley@arm.comsimulation are named (such as system.cpu.dcachePort in the above example) will 92310259SAndrew.Bardsley@arm.comappear as 'comments' when units are clicked on the visualiser. 92410259SAndrew.Bardsley@arm.com 92510259SAndrew.Bardsley@arm.comClicking on a unit which contains instructions or lines will bring up a speech 92610259SAndrew.Bardsley@arm.combubble giving extra information derived from the MinorInst/MinorLine lines. 92710259SAndrew.Bardsley@arm.com 92810259SAndrew.Bardsley@arm.com--start-time and --end-time allow only sections of debug files to be loaded. 92910259SAndrew.Bardsley@arm.com 93010259SAndrew.Bardsley@arm.com--prefix allows the name prefix of the CPU to be inspected to be supplied. 93110259SAndrew.Bardsley@arm.comThis defaults to 'system.cpu'. 93210259SAndrew.Bardsley@arm.com 93310259SAndrew.Bardsley@arm.comIn the visualiser, The buttons Start, End, Back, Forward, Play and Stop can be 93410259SAndrew.Bardsley@arm.comused to control the displayed simulation time. 93510259SAndrew.Bardsley@arm.com 93610259SAndrew.Bardsley@arm.comThe diagonally striped coloured blocks are showing the InstId of the 93710259SAndrew.Bardsley@arm.cominstruction or line they represent. Note that lines in Fetch1 and f1ToF2.F 93810259SAndrew.Bardsley@arm.comonly show the id fields of a line and that instructions in Fetch2, f2ToD, and 93910259SAndrew.Bardsley@arm.comdecode.inputBuffer do not yet have execute sequence numbers. The T/S.P/L/F.E 94010259SAndrew.Bardsley@arm.combuttons can be used to toggle parts of InstId on and off to make it easier to 94110259SAndrew.Bardsley@arm.comunderstand the display. Useful combinations are: 94210259SAndrew.Bardsley@arm.com 94310259SAndrew.Bardsley@arm.com<table> 94410259SAndrew.Bardsley@arm.com<tr> 94510259SAndrew.Bardsley@arm.com <td><b>Combination</b></td> 94610259SAndrew.Bardsley@arm.com <td><b>Reason</b></td> 94710259SAndrew.Bardsley@arm.com</tr> 94810259SAndrew.Bardsley@arm.com<tr> 94910259SAndrew.Bardsley@arm.com <td>E</td> 95010259SAndrew.Bardsley@arm.com <td>just show the final execute sequence number</td> 95110259SAndrew.Bardsley@arm.com</tr> 95210259SAndrew.Bardsley@arm.com<tr> 95310259SAndrew.Bardsley@arm.com <td>F/E</td> 95410259SAndrew.Bardsley@arm.com <td>show the instruction-related numbers</td> 95510259SAndrew.Bardsley@arm.com</tr> 95610259SAndrew.Bardsley@arm.com<tr> 95710259SAndrew.Bardsley@arm.com <td>S/P</td> 95810259SAndrew.Bardsley@arm.com <td>show just the stream-related numbers (watch the stream sequence 95910259SAndrew.Bardsley@arm.com change with branches and not change with predicted branches)</td> 96010259SAndrew.Bardsley@arm.com</tr> 96110259SAndrew.Bardsley@arm.com<tr> 96210259SAndrew.Bardsley@arm.com <td>S/E</td> 96310259SAndrew.Bardsley@arm.com <td>show instructions and their stream</td> 96410259SAndrew.Bardsley@arm.com</tr> 96510259SAndrew.Bardsley@arm.com</table> 96610259SAndrew.Bardsley@arm.com 96710259SAndrew.Bardsley@arm.comThe key to the right shows all the displayable colours (some of the colour 96810259SAndrew.Bardsley@arm.comchoices are quite bad!): 96910259SAndrew.Bardsley@arm.com 97010259SAndrew.Bardsley@arm.com<table> 97110259SAndrew.Bardsley@arm.com<tr> 97210259SAndrew.Bardsley@arm.com <td><b>Symbol</b></td> 97310259SAndrew.Bardsley@arm.com <td><b>Meaning</b></td> 97410259SAndrew.Bardsley@arm.com</tr> 97510259SAndrew.Bardsley@arm.com<tr> 97610259SAndrew.Bardsley@arm.com <td>U</td> 97710259SAndrew.Bardsley@arm.com <td>Unknown data</td> 97810259SAndrew.Bardsley@arm.com</tr> 97910259SAndrew.Bardsley@arm.com<tr> 98010259SAndrew.Bardsley@arm.com <td>B</td> 98110259SAndrew.Bardsley@arm.com <td>Blocked stage</td> 98210259SAndrew.Bardsley@arm.com</tr> 98310259SAndrew.Bardsley@arm.com<tr> 98410259SAndrew.Bardsley@arm.com <td>-</td> 98510259SAndrew.Bardsley@arm.com <td>Bubble</td> 98610259SAndrew.Bardsley@arm.com</tr> 98710259SAndrew.Bardsley@arm.com<tr> 98810259SAndrew.Bardsley@arm.com <td>E</td> 98910259SAndrew.Bardsley@arm.com <td>Empty queue slot</td> 99010259SAndrew.Bardsley@arm.com</tr> 99110259SAndrew.Bardsley@arm.com<tr> 99210259SAndrew.Bardsley@arm.com <td>R</td> 99310259SAndrew.Bardsley@arm.com <td>Reserved queue slot</td> 99410259SAndrew.Bardsley@arm.com</tr> 99510259SAndrew.Bardsley@arm.com<tr> 99610259SAndrew.Bardsley@arm.com <td>F</td> 99710259SAndrew.Bardsley@arm.com <td>Fault</td> 99810259SAndrew.Bardsley@arm.com</tr> 99910259SAndrew.Bardsley@arm.com<tr> 100010259SAndrew.Bardsley@arm.com <td>r</td> 100110259SAndrew.Bardsley@arm.com <td>Read (used as the leftmost stripe on data in the dcachePort)</td> 100210259SAndrew.Bardsley@arm.com</tr> 100310259SAndrew.Bardsley@arm.com<tr> 100410259SAndrew.Bardsley@arm.com <td>w</td> 100510259SAndrew.Bardsley@arm.com <td>Write " "</td> 100610259SAndrew.Bardsley@arm.com</tr> 100710259SAndrew.Bardsley@arm.com<tr> 100810259SAndrew.Bardsley@arm.com <td>0 to 9</td> 100910259SAndrew.Bardsley@arm.com <td>last decimal digit of the corresponding data</td> 101010259SAndrew.Bardsley@arm.com</tr> 101110259SAndrew.Bardsley@arm.com</table> 101210259SAndrew.Bardsley@arm.com 101310259SAndrew.Bardsley@arm.com\verbatim 101410259SAndrew.Bardsley@arm.com 101510259SAndrew.Bardsley@arm.com ,---------------. .--------------. *U 101610259SAndrew.Bardsley@arm.com | |=|->|=|->|=| | ||=|||->||->|| | *- <- Fetch queues/LSQ 101710259SAndrew.Bardsley@arm.com `---------------' `--------------' *R 101810259SAndrew.Bardsley@arm.com === ====== *w <- Activity/Stage activity 101910259SAndrew.Bardsley@arm.com ,--------------. *1 102010259SAndrew.Bardsley@arm.com ,--. ,. ,. | ============ | *3 <- Scoreboard 102110259SAndrew.Bardsley@arm.com | |-\[]-\||-\[]-\||-\[]-\| ============ | *5 <- Execute::inFlightInsts 102210259SAndrew.Bardsley@arm.com | | :[] :||-/[]-/||-/[]-/| -. -------- | *7 102310259SAndrew.Bardsley@arm.com | |-/[]-/|| ^ || | | --------- | *9 102410259SAndrew.Bardsley@arm.com | | || | || | | ------ | 102510259SAndrew.Bardsley@arm.com[]->| | ->|| | || | | ---- | 102610259SAndrew.Bardsley@arm.com | |<-[]<-||<-+-<-||<-[]<-| | ------ |->[] <- Execute to Fetch1, 102710259SAndrew.Bardsley@arm.com '--` `' ^ `' | -' ------ | Fetch2 branch data 102810259SAndrew.Bardsley@arm.com ---. | ---. `--------------' 102910259SAndrew.Bardsley@arm.com ---' | ---' ^ ^ 103010259SAndrew.Bardsley@arm.com | ^ | `------------ Execute 103110259SAndrew.Bardsley@arm.com MinorBuffer ----' input `-------------------- Execute input buffer 103210259SAndrew.Bardsley@arm.com buffer 103310259SAndrew.Bardsley@arm.com\endverbatim 103410259SAndrew.Bardsley@arm.com 103510259SAndrew.Bardsley@arm.comStages show the colours of the instructions currently being 103610259SAndrew.Bardsley@arm.comgenerated/processed. 103710259SAndrew.Bardsley@arm.com 103810259SAndrew.Bardsley@arm.comForward FIFOs between stages show the data being pushed into them at the 103910259SAndrew.Bardsley@arm.comcurrent tick (to the left), the data in transit, and the data available at 104010259SAndrew.Bardsley@arm.comtheir outputs (to the right). 104110259SAndrew.Bardsley@arm.com 104210259SAndrew.Bardsley@arm.comThe backwards FIFO between Fetch2 and Fetch1 shows branch prediction data. 104310259SAndrew.Bardsley@arm.com 104410259SAndrew.Bardsley@arm.comIn general, all displayed data is correct at the end of a cycle's activity at 104510259SAndrew.Bardsley@arm.comthe time indicated but before the inter-stage FIFOs are ticked. Each FIFO 104610259SAndrew.Bardsley@arm.comhas, therefore an extra slot to show the asserted new input data, and all the 104710259SAndrew.Bardsley@arm.comdata currently within the FIFO. 104810259SAndrew.Bardsley@arm.com 104910259SAndrew.Bardsley@arm.comInput buffers for each stage are shown below the corresponding stage and show 105010259SAndrew.Bardsley@arm.comthe contents of those buffers as horizontal strips. Strips marked as reserved 105110259SAndrew.Bardsley@arm.com(cyan by default) are reserved to be filled by the previous stage. An input 105210259SAndrew.Bardsley@arm.combuffer with all reserved or occupied slots will, therefore, block the previous 105310259SAndrew.Bardsley@arm.comstage from generating output. 105410259SAndrew.Bardsley@arm.com 105510259SAndrew.Bardsley@arm.comFetch queues and LSQ show the lines/instructions in the queues of each 105610259SAndrew.Bardsley@arm.cominterface and show the number of lines/instructions in TLB and memory in the 105710259SAndrew.Bardsley@arm.comtwo striped colours of the top of their frames. 105810259SAndrew.Bardsley@arm.com 105910259SAndrew.Bardsley@arm.comInside Execute, the horizontal bars represent the individual FU pipelines. 106010259SAndrew.Bardsley@arm.comThe vertical bar to the left is the input buffer and the bar to the right, the 106110259SAndrew.Bardsley@arm.cominstructions committed this cycle. The background of Execute shows 106210259SAndrew.Bardsley@arm.cominstructions which are being committed this cycle in their original FU 106310259SAndrew.Bardsley@arm.compipeline positions. 106410259SAndrew.Bardsley@arm.com 106510259SAndrew.Bardsley@arm.comThe strip at the top of the Execute block shows the current streamSeqNum that 106610259SAndrew.Bardsley@arm.comExecute is committing. A similar stripe at the top of Fetch1 shows that 106710259SAndrew.Bardsley@arm.comstage's expected streamSeqNum and the stripe at the top of Fetch2 shows its 106810259SAndrew.Bardsley@arm.comissuing predictionSeqNum. 106910259SAndrew.Bardsley@arm.com 107010259SAndrew.Bardsley@arm.comThe scoreboard shows the number of instructions in flight which will commit a 107110259SAndrew.Bardsley@arm.comresult to the register in the position shown. The scoreboard contains slots 107210259SAndrew.Bardsley@arm.comfor each integer and floating point register. 107310259SAndrew.Bardsley@arm.com 107410259SAndrew.Bardsley@arm.comThe Execute::inFlightInsts queue shows all the instructions in flight in 107510259SAndrew.Bardsley@arm.comExecute with the oldest instruction (the next instruction to be committed) to 107610259SAndrew.Bardsley@arm.comthe right. 107710259SAndrew.Bardsley@arm.com 107810259SAndrew.Bardsley@arm.com'Stage activity' shows the signalled activity (as E/1) for each stage (with 107910259SAndrew.Bardsley@arm.comCPU miscellaneous activity to the left) 108010259SAndrew.Bardsley@arm.com 108110259SAndrew.Bardsley@arm.com'Activity' show a count of stage and pipe activity. 108210259SAndrew.Bardsley@arm.com 108310259SAndrew.Bardsley@arm.com\subsection picformat minor.pic format 108410259SAndrew.Bardsley@arm.com 108510259SAndrew.Bardsley@arm.comThe minor.pic file (src/minor/minor.pic) describes the layout of the 108610259SAndrew.Bardsley@arm.commodels blocks on the visualiser. Its format is described in the supplied 108710259SAndrew.Bardsley@arm.comminor.pic file. 108810259SAndrew.Bardsley@arm.com 108910259SAndrew.Bardsley@arm.com*/ 109010259SAndrew.Bardsley@arm.com 109110259SAndrew.Bardsley@arm.com} 1092