# Copyright (c) 2012 ARM Limited
# All rights reserved
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# Author: Djordje Kovacevic

/*! \page gem5MemorySystem Memory System in gem5

  \tableofcontents

  This document describes the memory subsystem in gem5, focusing on the
  program flow during a CPU’s simple memory transactions (read or write).


  \section gem5_MS_MH MODEL HIERARCHY

  The model used in this document consists of two out-of-order (O3)
  ARM v7 CPUs with corresponding L1 data caches and Simple Memory. It is
  created by running gem5 with the following parameters:

      configs/example/fs.py --caches --cpu-type=arm_detailed --num-cpus=2

  gem5 uses objects derived from Simulation Object (SimObject) as the basic
  building blocks of the memory system. They are connected via ports with an
  established master/slave hierarchy. Data flow is initiated on the master
  port, while response messages and snoop queries appear on the slave port.
  The following figure shows the hierarchy of Simulation Objects used in this
  document:

  \image html "gem5_MS_Fig1.PNG" "Simulation Object hierarchy of the model" width=3cm

  \section gem5_CPU CPU

  Describing the O3 CPU model in detail is outside the scope of this document,
  so here are only a few relevant notes about the model:

  <b>Read access</b> is initiated by sending a message to the port towards the
  DCache object. If DCache rejects the message (because it is blocked or busy),
  the CPU will flush the pipeline and the access will be re-attempted later on.
  The access is completed upon receiving the reply message (ReadResp) from
  DCache.

  <b>Write access</b> is initiated by storing the request into the store
  buffer, whose contents are emptied and sent to DCache on every tick. DCache
  may also reject the request. Write access is completed when the write reply
  (WriteResp) message is received from DCache.

  Load & store buffers (for read and write access) don’t impose any
  restriction on the number of active memory accesses. Therefore, the maximum
  number of the CPU’s outstanding memory access requests is not limited by the
  CPU Simulation Object but by the underlying memory system model.

  <b>Split memory access</b> is implemented.

  The message sent by the CPU contains the memory type (Normal, Device,
  Strongly Ordered and cacheability) of the accessed region. However, this
  information is not used by the rest of the model, which takes a more
  simplified approach towards memory types.
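
  The write path described above can be pictured with a small toy model. This
  is only a sketch, not gem5 code: the class names ToyDCache and
  ToyStoreBuffer, the try_send() method and the rule that draining stops at the
  first rejected request are all assumptions made for illustration.

```python
from collections import deque


class ToyDCache:
    """Hypothetical stand-in for DCache: accepts a request unless blocked."""

    def __init__(self):
        self.blocked = False
        self.accepted = []

    def try_send(self, request):
        if self.blocked:
            return False              # rejected; the sender must retry later
        self.accepted.append(request)
        return True


class ToyStoreBuffer:
    """Hypothetical store buffer that is drained towards the cache each tick."""

    def __init__(self, dcache):
        self.dcache = dcache
        self.pending = deque()

    def write(self, addr, data):
        self.pending.append(("WriteReq", addr, data))

    def tick(self):
        # Assumption: writes are sent in order and draining stops at the first
        # rejection, so the rejected request is re-attempted on the next tick.
        while self.pending and self.dcache.try_send(self.pending[0]):
            self.pending.popleft()


dcache = ToyDCache()
store_buffer = ToyStoreBuffer(dcache)
store_buffer.write(0x100, 1)
store_buffer.write(0x140, 2)

dcache.blocked = True
store_buffer.tick()                   # both writes are rejected and kept
pending_while_blocked = len(store_buffer.pending)

dcache.blocked = False
store_buffer.tick()                   # both writes are accepted, in order
pending_after_retry = len(store_buffer.pending)
```

  In this sketch a rejected write simply stays in the buffer; the replies that
  complete the access (WriteResp) are left out for brevity.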

  \section gem5_DCache DATA CACHE OBJECT

  Data Cache object implements a standard cache structure:

  \image html "gem5_MS_Fig2.PNG" "DCache Simulation Object" width=3cm

  <b>Cached memory reads</b> that match a particular cache tag (with Valid &
  Read flags) will be completed (by sending ReadResp to CPU) after a
  configurable time. Otherwise, the request is forwarded to the Miss Status and
  Handling Register (MSHR) block.

  <b>Cached memory writes</b> that match a particular cache tag (with Valid,
  Read & Write flags) will be completed (by sending WriteResp to CPU) after the
  same configurable time. Otherwise, the request is forwarded to the Miss
  Status and Handling Register (MSHR) block.

  <b>Uncached memory reads</b> are forwarded to MSHR block.

  <b>Uncached memory writes</b> are forwarded to WriteBuffer block.

  <b>Evicted (& dirty) cache lines</b> are forwarded to WriteBuffer block.

  CPU’s access to Data Cache is blocked if any of the following is true:

  - MSHR block is full. (The size of MSHR’s buffer is configurable.)

  - Writeback block is full.
    (The size of the block’s buffer is configurable.)

  - The number of outstanding memory accesses against the same memory cache
    line has reached a configurable threshold value (see MSHR and Write Buffer
    for details).

  A Data Cache in the blocked state will reject any request from the slave
  port (from CPU) regardless of whether it would result in a cache hit or
  miss. Note that incoming messages on the master port (response messages and
  snoop requests) are never rejected.

  A cache hit on an uncacheable memory region (unpredictable behaviour
  according to ARM ARM) will invalidate the cache line and fetch data from
  memory.

  \subsection gem5_MS_TAndDBlock Tags & Data Block

  Cache lines (referred to as blocks in the source code) are organised into
  sets with configurable associativity and size. They have the following
  status flags:
  - <b>Valid.</b> It holds data. Address tag is valid.
  - <b>Read.</b> No read request will be accepted without this flag being set.
    For example, a cache line is valid and unreadable when it waits for the
    Write flag to complete a write access.
  - <b>Write.</b> It may accept writes. A cache line with the Write flag set
    is in the Unique state: no other cache memory holds a copy.
  - <b>Dirty.</b> It needs Writeback when evicted.

  Read access will hit a cache line if the address tags match and the Valid
  and Read flags are set. Write access will hit a cache line if the address
  tags match and the Valid, Read and Write flags are set.

  \subsection gem5_MS_Queues MSHR and Write Buffer Queues

  The Miss Status and Handling Register (MSHR) queue holds the list of the
  CPU’s outstanding memory requests that require read access to the lower
  memory level. They are:
  - Cached Read misses.
  - Cached Write misses.
  - Uncached reads.

  WriteBuffer queue holds the following memory requests:
  - Uncached writes.
  - Writeback from evicted (& dirty) cache lines.

  \image html "gem5_MS_Fig3.PNG" "MSHR and Write Buffer Blocks" width=6cm

  Each memory request is assigned to a corresponding MSHR object (READ or
  WRITE on the diagram above) that represents a particular block (cache line)
  of memory that has to be read or written in order to complete the
  command(s). As shown in the figure above, cached reads/writes against the
  same cache line share a common MSHR object and will be completed with a
  single memory access.
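
  The sharing of one MSHR object by cached accesses to the same cache line can
  be sketched as follows. This is a simplified illustration, not the gem5
  implementation: ToyMSHRQueue, its parameters num_entries and
  targets_per_entry, and the 64-byte line size are hypothetical names and
  values chosen for the example.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class ToyMSHRQueue:
    """Hypothetical MSHR queue keyed by cache-line address."""

    def __init__(self, num_entries, targets_per_entry):
        self.num_entries = num_entries
        self.targets_per_entry = targets_per_entry
        self.entries = {}      # line address -> list of (order, cmd, addr)
        self.next_order = 0    # order number given to the next request

    def allocate(self, cmd, addr):
        # Misses to the same line are attached to the same entry, so they
        # are served by a single access to the lower memory level.
        line = addr - (addr % LINE_SIZE)
        targets = self.entries.setdefault(line, [])
        targets.append((self.next_order, cmd, addr))
        self.next_order += 1
        return line

    def is_blocked(self):
        # Blocked when the queue is full or when one line has collected
        # the maximum number of outstanding accesses.
        if len(self.entries) >= self.num_entries:
            return True
        return any(len(targets) >= self.targets_per_entry
                   for targets in self.entries.values())


mshrs = ToyMSHRQueue(num_entries=2, targets_per_entry=4)
mshrs.allocate("ReadReq", 0x1000)
mshrs.allocate("WriteReq", 0x1008)    # same line as 0x1000: shares the entry
blocked_with_one_entry = mshrs.is_blocked()
mshrs.allocate("ReadReq", 0x2000)     # different line: new entry
blocked_with_two_entries = mshrs.is_blocked()
```

  Three misses produce only two entries here, and the toy queue reports itself
  blocked once both entries are in use.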

  The size of the block (and therefore the size of read/write access to lower
  memory) is:
  - The size of the cache line for cached access & writeback;
  - As specified in the CPU instruction for uncached access.

  In general, the Data Cache model distinguishes between just two memory types:
  - Normal Cached memory. It is always treated as write back, read and write
    allocate.
  - Normal uncached, Device and Strongly Ordered types are treated equally
    (as uncached memory).

  \subsection gem5_MS_Ordering Memory Access Ordering

  A unique order number is assigned to each CPU read/write request (as they
  appear on the slave port). Order numbers of MSHR objects are copied from the
  first assigned read/write.

  Memory reads/writes from each of these two queues are executed in order
  (according to the assigned order number). When both queues are non-empty,
  the model will execute a memory read from the MSHR block unless WriteBuffer
  is full. It will, however, always preserve the order of reads/writes on the
  same (or overlapping) memory cache line (block).
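
  One way to picture the selection rule in the previous paragraph is the
  sketch below. It is an assumption-laden toy, not the gem5 scheduler: the
  function pick_next(), the representation of each queue as a list of
  (order number, line address) pairs and the way a conflicting older write is
  detected are all hypothetical.

```python
def pick_next(mshr_queue, write_buffer, write_buffer_capacity):
    """Choose the next access to send to the lower memory level.

    Both queues are lists of (order_number, line_address) tuples with the
    oldest entry first. Returns ("read", entry), ("write", entry) or None.
    """
    if not mshr_queue:
        return ("write", write_buffer[0]) if write_buffer else None
    if not write_buffer:
        return ("read", mshr_queue[0])

    read_order, read_line = mshr_queue[0]

    # A full write buffer takes priority over reads.
    if len(write_buffer) >= write_buffer_capacity:
        return ("write", write_buffer[0])

    # An older write to the same line must go first; writes are issued in
    # order, so the head of the write buffer is sent.
    for write_order, write_line in write_buffer:
        if write_line == read_line and write_order < read_order:
            return ("write", write_buffer[0])

    # Otherwise reads from the MSHR queue are preferred.
    return ("read", mshr_queue[0])


# Write #6 and Read #8 target the same line; Write #13 targets another one.
first = pick_next([(8, 0x40)], [(6, 0x40), (13, 0x80)], 4)
second = pick_next([(8, 0x40)], [(13, 0x80)], 4)
when_full = pick_next([(2, 0x00)], [(6, 0x40), (13, 0x80)], 2)
```

  With these inputs the older write to the shared line is selected first, the
  read is then preferred over the unrelated write, and a full write buffer
  forces a write regardless of the pending read.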

  In summary:
  - Order of accesses to cached memory is not preserved unless they target
    the same cache line. For example, the accesses #1, #5 & #10 will
    complete simultaneously in the same tick (still in order). The access
    #5 will complete before #3.
  - Order of all uncached memory writes is preserved. Write#6 always
    completes before Write#13.
  - Order of all uncached memory reads is preserved. Read#2 always completes
    before Read#8.
  - The order of an uncached read and an uncached write is not necessarily
    preserved unless their access regions overlap. Therefore, Write#6
    always completes before Read#8 (they target the same memory block).
    However, Write#13 may complete before Read#8.


  \section gem5_MS_Bus COHERENT BUS OBJECT

  \image html "gem5_MS_Fig4.PNG" "Coherent Bus Object" width=3cm

  Coherent Bus object provides basic support for the snoop protocol:

  <b>All requests on the slave port</b> are forwarded to the appropriate
  master port. Requests for cached memory regions are also forwarded to other
  slave ports (as snoop requests).

  <b>Master port replies</b> are forwarded to the appropriate slave port.

  <b>Master port snoop requests</b> are forwarded to all slave ports.

  <b>Slave port snoop replies</b> are forwarded to the port that was the
  source of the request. (Note that the source of a snoop request can be
  either a slave or a master port.)

  The bus declares itself blocked for a configurable period of time after
  any of the following events:
  - A packet is sent (or failed to be sent) to a slave port.
  - A reply message is sent to a master port.
  - Snoop response from one slave port is sent to another slave port.

  The bus in blocked state rejects the following incoming messages:
  - Slave port requests.
  - Master port replies.
  - Master port snoop requests.

  \section gem5_MS_SimpleMemory SIMPLE MEMORY OBJECT

  It never blocks the access on the slave port.

  Memory read/write takes immediate effect. (Read or write is performed when
  the request is received.)

  Reply message is sent after a configurable period of time.
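
  The three properties of Simple Memory listed above can be illustrated with a
  toy model. This is a sketch under stated assumptions rather than the gem5
  class: ToySimpleMemory, its recv() and replies_due() methods and the latency
  value are hypothetical, and unwritten locations are assumed to read as 0.

```python
import heapq


class ToySimpleMemory:
    """Hypothetical memory: never blocks, acts at once, replies after a delay."""

    def __init__(self, latency):
        self.latency = latency
        self.data = {}
        self.replies = []     # heap of (ready_tick, sequence, reply)
        self.sequence = 0     # tie-breaker so replies are never compared

    def recv(self, now, cmd, addr, value=None):
        # The read or write is performed when the request is received.
        if cmd == "WriteReq":
            self.data[addr] = value
            reply = ("WriteResp", addr, None)
        else:
            reply = ("ReadResp", addr, self.data.get(addr, 0))
        # Only the reply is delayed by the configurable latency.
        heapq.heappush(self.replies,
                       (now + self.latency, self.sequence, reply))
        self.sequence += 1
        return True           # the request is always accepted

    def replies_due(self, now):
        due = []
        while self.replies and self.replies[0][0] <= now:
            due.append(heapq.heappop(self.replies)[2])
        return due


memory = ToySimpleMemory(latency=10)
memory.recv(0, "WriteReq", 0x40, 7)
memory.recv(1, "ReadReq", 0x40)       # already observes the value 7
at_tick_5 = memory.replies_due(5)
at_tick_10 = memory.replies_due(10)
at_tick_11 = memory.replies_due(11)
```

  The read issued at tick 1 returns the value written at tick 0 even though
  the write reply is not sent until tick 10.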

  \section gem5_MS_MessageFlow MESSAGE FLOW

  \subsection gem5_MS_Ordering Read Access

  The following diagram shows a read access that hits a Data Cache line with
  Valid and Read flags:

  \image html "gem5_MS_Fig5.PNG" "Read Hit (Read flag must be set in cache line)" width=3cm

  A read access that misses in the cache will generate the following sequence
  of messages:

  \image html "gem5_MS_Fig6.PNG" "Read Miss with snoop reply" width=3cm

  Note that the bus object never gets a response from both DCache2 and the
  Memory object. It sends the very same ReadReq packet (message) object to
  memory and data cache. When Data Cache wants to reply to a snoop request, it
  marks the message with the MEM_INHIBIT flag, which tells the Memory object
  not to process the message.
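
  The effect of the MEM_INHIBIT flag can be sketched with a few toy classes.
  This is an illustration only: ToyPacket, ToySnoopingCache, ToyMemory and
  bus_read() are hypothetical names, the snoop is assumed to be delivered
  before the memory sees the packet, and the toy cache replies whenever it
  holds the line, which simplifies the real conditions.

```python
class ToyPacket:
    """Hypothetical request packet shared by all receivers."""

    def __init__(self, cmd, addr):
        self.cmd = cmd
        self.addr = addr
        self.mem_inhibit = False   # stands in for the MEM_INHIBIT flag


class ToySnoopingCache:
    def __init__(self, lines):
        self.lines = dict(lines)   # address -> data

    def snoop(self, pkt):
        if pkt.addr in self.lines:
            pkt.mem_inhibit = True             # claim the reply
            return ("ReadResp", "cache", self.lines[pkt.addr])
        return None


class ToyMemory:
    def __init__(self, contents):
        self.contents = dict(contents)

    def recv(self, pkt):
        if pkt.mem_inhibit:
            return None                        # a cache will respond instead
        return ("ReadResp", "memory", self.contents.get(pkt.addr, 0))


def bus_read(addr, other_cache, memory):
    # The very same packet object is handed to the cache and to the memory.
    pkt = ToyPacket("ReadReq", addr)
    replies = [other_cache.snoop(pkt), memory.recv(pkt)]
    return [reply for reply in replies if reply is not None]


dcache2 = ToySnoopingCache({0x40: 99})
memory = ToyMemory({0x40: 1, 0x80: 2})
replies_cached = bus_read(0x40, dcache2, memory)
replies_uncached = bus_read(0x80, dcache2, memory)
```

  In both cases exactly one reply reaches the bus: from the cache when it
  holds the line, from memory otherwise.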

  \subsection gem5_MS_Ordering Write Access

  The following diagram shows a write access that hits a DCache1 cache line
  with Valid & Write flags:

  \image html "gem5_MS_Fig7.PNG" "Write Hit (with Write flag set in cache line)" width=3cm

  The next figure shows a write access that hits a DCache1 cache line with the
  Valid flag but no Write flag, which qualifies as a write miss. DCache1
  issues UpgradeReq to obtain write permission. DCache2::snoopTiming will
  invalidate the cache line that has been hit. Note that the UpgradeResp
  message doesn’t carry data.

  \image html "gem5_MS_Fig8.PNG" "Write Miss – matching tag with no Write flag" width=3cm

  The next diagram shows a write miss in DCache. ReadExReq invalidates the
  cache line in DCache2. ReadExResp carries the content of the memory cache
  line.

  \image html "gem5_MS_Fig9.PNG" "Miss - no matching tag" width=3cm

*/