14269:7e364bd625e1 |
01-Aug-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix BDI size calculation
The bitmask field indicates to which base a delta refers, and in the original paper it is fixed and proportional to the highest number of bases allowed in the compressed data.
Change-Id: I271bf2e19e0765de52b933eaf6d4fcc2ce25d185 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19748 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14211:acfef4916339 |
29-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use SatCounter for RRPV
Use SatCounter in RRIP's RRPV. As such, move validation functionality to a proper variable.
Change-Id: I142db2b7f6cd518ac3a2b68c9ed48005402b3464 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20452 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14193:7dd8a6df30e2 |
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Eliminate the Base(Slave|Master)Port classes.
The Port class has assumed all the duties of the less generic Base*Port classes, making them unnecessary. Since they don't add anything but make the code more complex, this change eliminates them.
Change-Id: Ibb9c56def04465f353362595c1f1c5ac5083e5e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20236 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
14131:f0529ae28f97 |
02-Aug-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix non-virtual base destructor of Repl Entry
ReplaceableEntry contains a virtual method, yet its destructor was not virtual, causing errors in some compilers.
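The effect of a virtual destructor on a polymorphic base can be illustrated with a minimal sketch. The `Entry` and `CountingEntry` types below are hypothetical stand-ins, not the gem5 class definitions.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical polymorphic base; stands in for a class with virtual methods.
struct Entry
{
    // Virtual destructor: deleting through an Entry* also runs the
    // destructor of the derived class.
    virtual ~Entry() = default;
    virtual int id() const { return 0; }
};

// Hypothetical derived class that records when its destructor runs.
struct CountingEntry : Entry
{
    explicit CountingEntry(int &counter) : counter(counter) {}
    ~CountingEntry() override { ++counter; }
    int id() const override { return 1; }
    int &counter;
};

// Destroys a derived object through a base-class pointer and returns
// how many derived destructors ran.
inline int
destroyThroughBase()
{
    int destroyed = 0;
    {
        std::unique_ptr<Entry> entry(new CountingEntry(destroyed));
        assert(entry->id() == 1);
    } // entry is deleted here via Entry*
    return destroyed;
}
```

Without the `virtual` keyword on the base destructor, deleting a derived object through the base pointer is undefined behavior, which is why some compilers reject or warn about it.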
Change-Id: I13deec843f4007d9deb924882a8d98ff6a89c84f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19808 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14118:3d2ee7721eb0 |
29-Jul-2019 |
Tiago Mück <tiago.muck@arm.com> |
mem-cache: mark block as dirty when handling SW prefetch
This addresses the issue described in 64687ee mem-cache: Mark block as dirty after a SWPrefetchEXResp.
Previous patch misses cases when the prefetch response is ReadExResp or UpgradeResp. Also, marking the block as dirty in serviceMSHRTargets instead of in handleFill covers cases when the prefetch is coalesced with other requests.
Change-Id: I2b377fdd240eb0f09e720b6bb284dee6545925ce Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19688 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14117:2f88285aaa8b |
30-Jul-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix set and way of sub-entries
Set and way of sub-entries were not being set previously. They must be set after the sub-blocks have been assigned to the main block.
Change-Id: I7b6921b8437b29c472d691cd78cf20f2bb6c7e07 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19669 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14089:fe1e5813d62c |
30-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create CPack compressor
Implementation of C-Pack, as described in "C-Pack: A High-Performance Microprocessor Cache Compression Algorithm".
C-Pack uses pattern matching schemes to detect and compress frequently appearing data patterns. As in the original paper, it divides the input in 32-bit words, and uses 6 patterns to match with its dictionary.
For the patterns, each letter represents a byte: Z is a null byte, M is a dictionary match, X is a new value. The patterns are ZZZZ, XXXX, MMMM, MMXX, ZZZX, MMMX.
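The pattern matching can be sketched roughly as follows. This is not the gem5 implementation: `classifyWord` is a hypothetical helper, and it assumes that pattern letters are read from the most significant to the least significant byte, that zero patterns are checked before the dictionary, and that the longest dictionary match wins.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Classify one 32-bit word against a dictionary of previously seen words.
// Returns one of the six pattern names listed above.
inline std::string
classifyWord(std::uint32_t word, const std::vector<std::uint32_t> &dictionary)
{
    // All four bytes are null.
    if (word == 0) {
        return "ZZZZ";
    }
    // Three upper bytes are null, the lowest byte is a new value.
    if ((word & 0xFFFFFF00u) == 0) {
        return "ZZZX";
    }

    std::string best = "XXXX"; // no match found yet
    for (std::uint32_t entry : dictionary) {
        if (entry == word) {
            return "MMMM"; // full match, cannot do better
        }
        if ((entry & 0xFFFFFF00u) == (word & 0xFFFFFF00u)) {
            best = "MMMX"; // three upper bytes match
        } else if (best == "XXXX" &&
                   (entry & 0xFFFF0000u) == (word & 0xFFFF0000u)) {
            best = "MMXX"; // two upper bytes match
        }
    }
    return best;
}
```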
Change-Id: I2efc9db2c862620dcc1155300e39be558f9017e0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11105 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14035:60068a2d56e0 |
31-May-2019 |
Daniel Carvalho <odanrc@yahoo.com.br> |
Revert "mem-cache: Remove writebacks packet list"
This reverts commit bf0a722acdd8247602e83720a5f81a0b69c76250.
Reason for revert: This patch introduces a bug:
The problem here is that the insertion of block A may cause the eviction of block B, which on the lower level may cause the eviction of block A. Since A is not marked as present yet, A is "safely" removed from the snoop filter.
However, reverting it breaks atomic mode when used with a Tags sub-class that can generate multiple evictions at once; this shall be fixed in a future patch.
Change-Id: I5b27e54b54ae5b50255588835c1a2ebf3015f002 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19088 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14015:e709cec78417 |
16-May-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Accuracy-based rate control for prefetchers
Added a mechanism to control the number of prefetches generated based on the effectiveness of the prefetches generated so far.
Change-Id: I33af82546f74a5b5ab372c28574b76dd9a1bd46a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18808 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14013:aeb3ca1762bb |
27-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Support for page crossing prefetches
Prefetchers can now issue hardware prefetch requests that go beyond the boundaries of the system page. Page crossing references will need to look up the TLBs to be able to compute the physical address to be prefetched.
Change-Id: Ib56374097e3b7dc87414139d210ea9272f96b06b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14620 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13991:102d94094d6b |
12-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem-cache: Add multi-prefetcher adaptor
This patch adds a meta-prefetcher that enables gem5's cache models to connect to multiple prefetchers. Sub-prefetchers still use the probes-based interface and training can be controlled independently. However, when the cache requests a prefetch packet, the adaptor traverses the priority list of prefetchers and uses the first prefetcher that is able to generate a prefetch.
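The priority traversal can be pictured with a small sketch. The `PrefetchRequest` and `SubPrefetcher` types and the `nextPrefetch` function are hypothetical and do not mirror the gem5 prefetcher interface.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical result of asking a sub-prefetcher for work.
struct PrefetchRequest
{
    bool valid;          // false if the prefetcher has nothing to issue
    std::uint64_t addr;  // address to prefetch when valid
};

// Hypothetical sub-prefetcher: a callable that may produce a request.
using SubPrefetcher = std::function<PrefetchRequest()>;

// Walk the list in priority order and use the first sub-prefetcher that
// is able to generate a prefetch.
inline PrefetchRequest
nextPrefetch(const std::vector<SubPrefetcher> &byPriority)
{
    for (const auto &prefetcher : byPriority) {
        PrefetchRequest request = prefetcher();
        if (request.valid) {
            return request;
        }
    }
    return PrefetchRequest{false, 0};
}
```

Lower-priority prefetchers are only consulted when every higher-priority one declines, so the order of the list determines which prefetcher dominates.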
Kudos to Mitch Hayenga for the original version of this patch.
Change-Id: I25569a834997e5404c7183ec995d212912c5dcdf Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18868 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13981:577196ddd040 |
02-May-2019 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code.
This code will be preserved through version control, but otherwise creates clutter and will rot in place since it's never compiled.
Change-Id: Id265f6deac445116843956ea5cf1210d8127274e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13963:94555f0223ba |
11-Apr-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Use SatCounter for prefetchers
Many prefetchers re-implement saturating counters with ints. Make them use SatCounters instead.
Added missing operators and constructors to SatCounter for that to be possible and their respective tests.
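For readers unfamiliar with the idea, a saturating counter can be sketched as below. `SaturatingCounter` is a hypothetical illustration and not the interface of gem5's SatCounter.

```cpp
#include <cassert>

// Hypothetical n-bit saturating counter: increments stop at the maximum
// value and decrements stop at zero instead of wrapping around.
class SaturatingCounter
{
  public:
    explicit SaturatingCounter(unsigned bits, unsigned initial = 0)
      : maxVal(maxFor(bits)), value(initial)
    {
        assert(initial <= maxVal);
    }

    SaturatingCounter &operator++()
    {
        if (value < maxVal) {
            ++value;
        }
        return *this;
    }

    SaturatingCounter &operator--()
    {
        if (value > 0) {
            --value;
        }
        return *this;
    }

    unsigned get() const { return value; }
    bool isSaturated() const { return value == maxVal; }

  private:
    // Largest value representable in the given number of bits.
    static unsigned maxFor(unsigned bits)
    {
        assert(bits > 0 && bits < 32);
        return (1u << bits) - 1;
    }

    unsigned maxVal;
    unsigned value;
};
```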
Change-Id: I36f10c89c27c9b3d1bf461e9ea546920f6ebb888 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17995 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13954:2f400a5f2627 |
07-Jul-2017 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors.
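A masked store can be sketched as follows. The `maskedStore` function and its per-byte enable vector are assumptions made for illustration and are not the request interface used by gem5.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Write data into memory starting at addr, skipping every byte whose
// enable flag is false. Disabled bytes keep their previous contents.
inline void
maskedStore(std::vector<std::uint8_t> &memory, std::size_t addr,
            const std::vector<std::uint8_t> &data,
            const std::vector<bool> &byteEnable)
{
    assert(data.size() == byteEnable.size());
    assert(addr + data.size() <= memory.size());

    for (std::size_t i = 0; i < data.size(); ++i) {
        if (byteEnable[i]) {
            memory[addr + i] = data[i];
        }
    }
}
```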
Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13948:f8666d4d5855 |
18-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks packet list
Previously all atomic writebacks concerned a single block, therefore, when a block was evicted, no other block would be pending eviction. With sector tags (and compression), however, a single replacement can generate many evictions.
This can cause problems, since a writeback that evicts a block may evict blocks in the lower cache. If one of these conflicts with one of the blocks pending eviction in the higher level, the snoop must inform the lower level of it. Since atomic mode does not have a writebuffer, this kind of conflict wouldn't be noticed.
Therefore, instead of evicting multiple blocks at once, we do it one by one.
Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13947:4cf8087cab09 |
08-Aug-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Handle data expansion
When a block in compressed form is overwritten, it may change its size. If the new compressed size is bigger, and the total size becomes bigger than the block size, one or more blocks will have to be evicted. This is called data expansion, or fat writes.
This change assumes that a first level cache cannot have a compressor, since otherwise data expansion should have been handled for atomic operations and writes. As such, data expansions should only be seen on writebacks. As writebacks are forwarded to the next level when failed, there should be no data expansions when servicing misses either.
This patch adds the functionality to handle data expansions by evicting the co-allocated blocks to make room for an expanded block.
Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13946:8e96e9be7f2c |
19-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add co-allocation function to compressed tags
Implement a co-allocation function in compressed tags, so that compressed blocks can be co-allocated in a superblock. Co-allocation is possible when compression ratio (CR) blocks that share a superblock tag can be compressed to up to (100/CR)% of their size.
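One way to read the size condition is sketched below. The `canCoAllocate` helper is hypothetical, and it assumes that each block must individually fit in a 1/CR share of the uncompressed block size.

```cpp
#include <cassert>
#include <cstddef>

// True if a block compressed to compressedBits fits in a 1/CR share of
// an uncompressed block of blkSizeBits, i.e. at most (100/CR)% of it.
inline bool
canCoAllocate(std::size_t compressedBits, std::size_t blkSizeBits,
              std::size_t compressionRatio)
{
    assert(compressionRatio > 0);
    return compressedBits <= blkSizeBits / compressionRatio;
}
```

For a 512-bit block and a compression ratio of 2, a block compressed to 256 bits or fewer would qualify under this reading.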
Change-Id: I937cc1fcbb488e70309cb5478c12db65f1b4b23f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11411 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13945:a573bed35a8b |
19-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression and decompression calls
Add a compressor to the base cache class and compress within block allocation and decompress on writebacks.
This change does not implement data expansion (fat writes) yet, nor does it add the compression latency to the block write time.
Change-Id: Ie36db65f7487c9b05ec4aedebc2c7651b4cb4821 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11410 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13944:5000533e6b81 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BDI Compressor
Implement Base-Delta-Immediate compression, as described in 'Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches'
Change-Id: I7980c340ab53a086b748f4b2108de4adc775fac8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11412 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13943:4046b0c547be |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression stats
Add compression statistics to the compressors. It tracks the number of blocks that can fit into a certain power of two size, and the number of decompressions.
For example, if a block is compressed to 100 bits, it will belong to the 128-bits compression size. Although it could also fit bigger sizes, they are not taken into account for the stats (i.e., the 100-bit compression will fit only the 128-bits size, not 256 or higher).
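The bucketing rule can be sketched as rounding the compressed size up to the nearest power of two. The `compressionSizeBucket` function is a hypothetical helper, not the code that maintains the gem5 statistics.

```cpp
#include <cassert>
#include <cstdint>

// Return the smallest power of two (in bits) that can hold sizeBits.
// A block is counted only in this bucket, not in any larger one.
inline std::uint64_t
compressionSizeBucket(std::uint64_t sizeBits)
{
    assert(sizeBits > 0);
    // Guard against overflow of the shift below.
    assert(sizeBits <= (std::uint64_t(1) << 63));

    std::uint64_t bucket = 1;
    while (bucket < sizeBits) {
        bucket <<= 1;
    }
    return bucket;
}
```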
We save stats for compressions that fail (i.e., compressed size is bigger than original cache line size).
Change-Id: Idab71a40a660e33259908ccd880e42a880b5ee06 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11103 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13942:e8b59b523af6 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create cache compressor
Create basic template for cache compressors. A basic compressor must implement a compression and a decompression method.
Change-Id: I83dc4d2b8d2bc5ed9f760c938edfa4ebdd6b8583 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11100 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13941:2c19da00ef9c |
15-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add block size to findVictim
Add block size to findVictim. For standard caches it will not be used. Compressed caches, however, need to know the size of the compressed block to decide whether a block is co-allocatable or not.
Change-Id: Id07f79763687b29f75d707c080fa9bd978a408aa Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11198 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Mohammad Seyedzadeh <sm.seyedzade@gmail.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13940:33cc30e2de52 |
30-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression data to CompressionBlk
Add a compression bit, decompression latency and compressed block size and their respective getters and setters.
Change-Id: Ia9d8656552d60e8d4e85fe5379dd75fc5adb0abe Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11102 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13939:c9e81d00a992 |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create CacheComp debug flag
Create a debug flag for cache compression.
Change-Id: Id4b8e86d658d3aa550906ee0f8da3b54f4cdab7d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11104 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13938:14f80b6b37c1 |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Stub compression framework
Create a stub of a compression framework where we can have multiple data blocks per tag entry. Only consecutive blocks can share a tag as of now.
For each tag entry there can be multiple data blocks. We have the same number of tags a conventional cache would have, but we instantiate the maximum number of data blocks (according to the compression ratio) per tag, to virtually implement compression without increasing the complexity of the simulator.
Change-Id: I549940c7afb2f744ab293ff8bb283967e7551a11 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/10763 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13932:24f825a9a080 |
07-Mar-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Mark block as dirty after a SWPrefetchEXResp
This is a workaround for a bug introduced from the change: 59e3585a8 arch-arm: We add PRFM PST instruction for arm which can cause deadlocks in the memory system.
The design of the classic memory system in gem5 makes the following two assumptions:
* A cache that fetches a block with an intention to modify it becomes the point of ordering and therefore commits to respond to any snoop requests [1].
* A cache that fetches an exclusive copy of the block does so with the intention to modify it [2]. Immediately after it receives the block, it will write to it and mark it as dirty. As the point of ordering, it responds to any outstanding snoops.
The current implementation of prefetch exclusive request breaks the second assumption. A cache can fetch an exclusive block without an immediate intention to modify it. If the block is not modified, it will not be marked as dirty. However, the cache has committed to respond to outstanding snoops, and if the block is clean it won't. This can result in deadlocks where a snoop gets stuck waiting for responses.
One solution (implemented by this patch) is to unconditionally mark the block dirty when filling due to a prefetch exclusive request. This makes the PrefetchExReq behave like a WriteReq. However, as it may mark a clean block as dirty, it creates the requirement for an unnecessary WritebackDirty in the future. In practice, this shouldn't be a big problem unless the application is unnecessarily using prefetch exclusive instructions.
Other solutions would require deeper changes to the design of the memory system to handle this properly.
[1]: When a cache commits to respond, it "informs" the xbar/PoC (point of coherence) and the other caches of its intention to respond. As a result the request will not be sent to the main memory.
[2]: In fact the assumption is that in the needsWritable MSHR there is at least one WriteReq before any snoops from other caches.
Change-Id: I378d3c0dadf25fc52e430b67102347b44d2f18ea Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17729 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Tested-by: kokoro <noreply+kokoro@google.com> |
13892:0182a0601f66 |
22-Apr-2019 |
Gabe Black <gabeblack@google.com> |
mem: Minimize the use of MemObject.
MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
13875:656d633621fa |
23-Apr-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
cpu,mem: missing override specifier
Change-Id: I731d3ef021596450ac307461f215760a148bb28a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18348 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13866:d0829f20374a |
22-Apr-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Fix fix of replacement count
Commit 7976b561de61b7523ca9a860154ad7ba701d12a7 tried fixing replacement update when a single location can be associated to multiple blocks.
Although the comment of the correct action was added, the proper validation check was forgotten. This change adds that check and moves doing the eviction to when there is a valid block.
Change-Id: I31d8bb914ccfd1849e9d97464d70a58a62f59533 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18210 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13863:f7391cb38ce7 |
18-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix increasing replacement count
Replacements should be increased when there is any evicted block, which does not necessarily have to be the victim.
For example, assume a superblock contains 4 blocks, and both A and C are stored compressed (belonging to SB_1). Then F, from SB_2, needs to make room by replacing SB_1. If F maps to location 2, the number of replacements should be increased, even though 2 had no valid blocks:
 Tag      Data           Tag      Data
|SB_1|--|A|X|C|X|  -->  |SB_2|  |X|F|X|X|
         1 2 3 4                 1 2 3 4
Change-Id: I7b3735d28a35faa8d8fa613a1555bb258da65859 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18208 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13862:9b6d6541244f |
11-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove blk_addr from Queue::trySatisfyFunctional
The blk_addr is pkt->getBlockAddr(), and therefore can be acquired internally, when needed, as long as the pkt is provided.
Change-Id: I2780445d2a0cb9e27257961efc4f438cc19550e5 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17537 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13861:7815aef6668f |
24-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add match functions to QueueEntry
Having the caller decide the matching logic is error-prone, and frequently ends up with the secure bit being forgotten. This change adds matching functions to the QueueEntry to avoid this problem.
As a side effect the signature of findPending has been changed.
Change-Id: I6e494a821c1e6e841ab103ec69632c0e1b269a08 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17530 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13860:8f8df5b68439 |
11-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Add packet matching functions
Add both block and non-block-aligned packet matching functions, so that both address and secure bits are checked when checking whether a packet matches a request.
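The idea of always comparing the address and the secure bit together can be sketched as below. The `PacketInfo` struct and the `sameAddr` and `sameBlock` helpers are hypothetical names chosen for illustration and are not the functions added by this change.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal view of a packet: its address and secure bit.
struct PacketInfo
{
    std::uint64_t addr;
    bool secure;
};

// Align an address down to the start of its block.
inline std::uint64_t
blockAlign(std::uint64_t addr, std::uint64_t blkSize)
{
    // Block size must be a non-zero power of two.
    assert(blkSize != 0 && (blkSize & (blkSize - 1)) == 0);
    return addr & ~(blkSize - 1);
}

// Exact match: same address and same secure bit.
inline bool
sameAddr(const PacketInfo &pkt, std::uint64_t addr, bool secure)
{
    return pkt.addr == addr && pkt.secure == secure;
}

// Block-aligned match: same block and same secure bit.
inline bool
sameBlock(const PacketInfo &pkt, std::uint64_t addr, bool secure,
          std::uint64_t blkSize)
{
    return blockAlign(pkt.addr, blkSize) == blockAlign(addr, blkSize) &&
        pkt.secure == secure;
}
```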
Change-Id: Id0069befb925d112e06f250741cb47d9dfa249cc Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17533 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13859:4156ac0c7257 |
30-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move Target to QueueEntry
WriteQueueEntry's target has 100% functionality overlap with MSHR's, therefore make it base to MSHR::Target.
Change-Id: I48614e78179d708bd91bbe75a752e5a05146e8eb Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17534 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13858:f01183becd57 |
24-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Assert Entry inherits from QueueEntry in Queue
Queue has several assumptions regarding its template parameter, so make sure they are fulfilled by forcing Entry to be derived from QueueEntry.
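A compile-time check of this kind can be sketched with `static_assert` and `std::is_base_of`. The `BaseEntry`, `EntryQueue` and `WriteEntry` types are hypothetical stand-ins for the gem5 classes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical base class that the queue relies on.
struct BaseEntry
{
    virtual ~BaseEntry() = default;
    bool inService = false;
};

// Hypothetical queue template that rejects unrelated entry types at
// compile time instead of failing later with obscure errors.
template <class Entry>
class EntryQueue
{
    static_assert(std::is_base_of<BaseEntry, Entry>::value,
                  "Entry must derive from BaseEntry");

  public:
    explicit EntryQueue(std::size_t numEntries) : entries(numEntries) {}

    Entry &at(std::size_t index)
    {
        assert(index < entries.size());
        return entries[index];
    }

    // Uses a member inherited from BaseEntry, which the assertion
    // above guarantees to exist.
    std::size_t countInService() const
    {
        std::size_t count = 0;
        for (const Entry &entry : entries) {
            if (entry.inService) {
                ++count;
            }
        }
        return count;
    }

  private:
    std::vector<Entry> entries;
};

// A valid entry type for the queue.
struct WriteEntry : BaseEntry {};
```

Instantiating `EntryQueue<int>` would fail to compile with the message given in the assertion.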
Change-Id: I0203a62aec00c04ac89e9674d86a44a07f9f13ab Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17529 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13849:858526a875ab |
09-Apr-2019 |
Anis Peysieux <anis.peysieux@inria.fr> |
mem-cache: Fix RRPV for RRIP
The RRPV values for the RRIP and NRU replacement policies were swapped: the long re-reference interval was used instead of the distant re-reference interval and vice versa. The btp value selects between the distant and long insertion ratios. A btp value of 0 forces the policy to always insert at a distant re-reference interval, and a btp value of 100 forces the policy to always insert at a long (intermediate) re-reference interval.
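The role of btp at insertion time can be sketched as follows. The `insertionRRPV` function is hypothetical; it assumes the RRIP convention that the distant interval is the maximum RRPV and the long interval is one below it, and it takes the random roll as a parameter so that the behavior is deterministic.

```cpp
#include <cassert>

// Choose the RRPV assigned to a newly inserted block.
//   numBits: width of the RRPV counter
//   btp:     percentage (0-100) of insertions that use the long interval
//   roll:    a random value in [0, 100), supplied by the caller
inline unsigned
insertionRRPV(unsigned numBits, unsigned btp, unsigned roll)
{
    assert(numBits >= 1 && numBits < 32);
    assert(btp <= 100);
    assert(roll < 100);

    const unsigned distant = (1u << numBits) - 1; // maximum RRPV
    const unsigned longInterval = distant - 1;    // one below the maximum

    // btp == 0 never takes the first branch (always distant);
    // btp == 100 always takes it (always long).
    return (roll < btp) ? longInterval : distant;
}
```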
Change-Id: I516098f73942b769dcc31fe0edfe07c3e9c3effd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17851 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13844:e409800a51c7 |
12-Feb-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix MSHR handling of cache clean requests
Previously satisfied clean requests would not snoop in-service MSHRs. This is a problem when a clean request is also invalidating, in which case we have to post-invalidate or post-downgrade outstanding requests. This change fixes this bug.
Change-Id: I31e42aa94dd3637b2818e00fbaae68c810145eaf Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17728 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13835:dff303952ba9 |
04-Apr-2019 |
Ryan Gambord <gambordr@oregonstate.edu> |
mem-cache: ambiguous use of abs function
std::abs doesn't accept unsigned long long, generating the error:
error: call to 'abs' is ambiguous
Use instead a compare-and-subtract idiom.
Also, changed the return type of distanceFromTrigger from unsigned int to Addr to prevent overflow problems.
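The compare-and-subtract idiom for unsigned values looks roughly like this. The `absDiff` name is hypothetical and the function is only a sketch of the idiom, not the code of this change.

```cpp
#include <cassert>
#include <cstdint>

// Absolute difference of two unsigned values without calling std::abs.
// Subtracting the smaller from the larger can never wrap around.
inline std::uint64_t
absDiff(std::uint64_t a, std::uint64_t b)
{
    return (a > b) ? (a - b) : (b - a);
}
```

Returning the same 64-bit type as the operands avoids truncating large distances, which matches the motivation for widening the return type.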
Change-Id: Ia7752c1c7a838f98e8c7ed6ade9f586f31bbcf7d Signed-off-by: Ryan Gambord <gambordr@oregonstate.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17788 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13832:79e439e69d9b |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: AMPM Prefetcher fails when restoring from a checkpoint
The periodic event triggers an assertion due to an incorrect tick value to schedule when restoring from a checkpoint.
Change-Id: I9454dd0c97d5a098f8a409886e63f7a7e990947c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17732 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13829:b623eae407f0 |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Fix PIF prefetcher compilation error with NULL ISA
Referencing BaseCPU is causing a compilation error when using the NULL ISA. This patch changes the reference to a SimObject, which fixes the problem.
Change-Id: I2530486cab65974f5b83e54a733c4b0e98730d26 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17731 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13828:73addeac3dd3 |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: ISB prefetcher was triggering an assertion
An assertion ignored the case when an entry of the SP table had been invalidated.
Change-Id: I5bf04e7a0979300b0f41f680c371f6397d4cbf3f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17734 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13827:2764e4b4de5d |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Fix panic in Indirect Memory prefetcher
Memory requests with a non-power-of-two size and less than 8 values were causing a panic, but these should be allowed and ignored by the prefetcher.
Change-Id: I86baa60058cc8a7f232d6ba5748d4c24a463c840 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17733 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13825:90e5b4dfeaff |
25-Feb-2019 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: Proactive Instruction Fetch Implementation
Ferdman, M., Kaynak, C., & Falsafi, B. (2011, December). Proactive instruction fetch. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 152-162). ACM.
Change-Id: I38c3ab30a94ab279f03e3d5936ce8ed118310c0e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16968 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13817:716bcdc780f9 |
27-Mar-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra cache header from AMAP
The cache header was being included in the AMAP, although not used, which resulted in slightly longer compilation time.
Change-Id: I3654bc719c6b5f558af116addae159301602a3cf Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17711 Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13786:860c780d9f30 |
07-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the STeMS prefetcher
Reference: Stephen Somogyi, Thomas F. Wenisch, Anastasia Ailamaki, and Babak Falsafi. 2009. Spatio-temporal memory streaming. In Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09). ACM, New York, NY, USA, 69-80.
Change-Id: I58cea1a7faa9391f8aa4469eb4973feabd31097a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16423 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13784:1941dc118243 |
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
arch, cpu, dev, gpu, mem, sim, python: start using getPort.
Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary.
Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13775:36b71cff789e |
15-Mar-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem-cache: tautological comparison of byteOrder
Error: build/X86/mem/cache/prefetch/indirect_memory.cc:56:24: error: result of comparison of constant -1 with expression of type 'const ByteOrder' is always false [-Werror,-Wtautological-constant-out-of-range-compare] fatal_if(byteOrder == -1, "This prefetcher requires a defined ISA\n"); ~~~~~~~~~ ^ ~~ build/X86/base/logging.hh:205:14: note: expanded from macro 'fatal_if' if ((cond)) { \ ^~~~ 1 error generated.
Fix: cast of constant (-1) used in comparison
Change-Id: I3deb154c2fe5b92c4ddf499176cb185c4ec7cf64 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17388 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13772:31b71dadc472 |
07-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Indirect Memory Prefetcher
Reference: Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, and Srinivas Devadas. 2015. IMP: indirect memory prefetcher. In Proceedings of the 48th International Symposium on Microarchitecture (MICRO-48). ACM, New York, NY, USA, 178-190. DOI: https://doi.org/10.1145/2830772.2830807
Change-Id: I52790f69c13ec55b8c1c8b9396ef9a1fb1be9797 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16223 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13765:7936e603ac0d |
13-Mar-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Fix write hit latency calculation order
Patch 6d8694a5fb5cfb905186249581cc6a3fde6cc38a changes the order at which the access latency is calculated for hits. This order is incorrect, since the calculations must use the blk's whenReady value before the access is satisfied.
Change-Id: I30dae5435f54200cc8fdf71fd0dbd2cf9c6f8b17 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17190 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13755:a1d7b56e3a64 |
10-Mar-2019 |
Ryan Gambord <gambordr@oregonstate.edu> |
mem-cache: Removed default arg from get() in prefetch/base.hh
commit b0d1643 caused building against NULL to break due to NULLIsa::GuestByteOrder not being defined.
Removal of default argument in src/mem/cache/prefetch/base.hh fixes this.
Change-Id: I99a4abb4be1418fadec145481164f7caa3334ca0 Signed-off-by: Ryan Gambord Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17070 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13752:135bb759ee9c |
08-Mar-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Revert "mem-cache: Remove Packet dependency in Tags"
Reverting patch due to polymorphism limitations.
This reverts commit 86a54d91936b524c0ef0f282959f0fc29bafe7eb.
Change-Id: Ie032dcc5176448c62118c89732b3cc6b8efd5a13 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17049 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13751:614d6e02a5fb |
21-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added extra information to PrefetchInfo
Added additional information to the PrefetchInfo data structure:
- Whether the event is triggered by a cache miss
- Whether the event is a write or a read
- Size of the data accessed
- Data accessed by the request
Change-Id: I070f3ffe837ea960a357388e7f2b8a61d7b2196c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16583 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13750:11dd302dfaa4 |
05-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add header delay to handleFill whenReady
A prefetch response will have a header delay, which was not being taken into account.
Change-Id: I66a071bc81ef41b8c0de37aa2df75171d1979a6f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14895 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13749:b2486662285d |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Allow tag-only accesses on latency calculation
Some accesses only need to search for a tag in the tag array, with no need to touch the data array. This is the case for CleanEvicts, evicts that don't find a corresponding block entry (since a write cannot be done in parallel with tag lookup), and maintenance operations.
Change-Id: I7365a915500b5d7ab636d49a9acc627072a7f58e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14878 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
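The latency rule described in this change can be pictured with a simplified Python model. This is only a sketch and not gem5's C++ code: the function name `access_latency`, its parameters, and the sequential/parallel handling of the data array are assumptions made for illustration.

```python
def access_latency(lookup_latency, data_latency, tag_only,
                   sequential_access=False):
    """Return the cycles charged for one cache access (illustrative model)."""
    if tag_only:
        # CleanEvicts, evicts without a matching block entry, and
        # maintenance operations only search the tag array.
        return lookup_latency
    if sequential_access:
        # Assumed: the data array is accessed only after the tag lookup.
        return lookup_latency + data_latency
    # Assumed: tag and data arrays are accessed in parallel.
    return max(lookup_latency, data_latency)
```

With a 2-cycle lookup and a 3-cycle data access, a tag-only access is charged 2 cycles in this model, while a regular parallel access is charged 3.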
13748:de3b813c4b90 |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add lookup latency to access' whenReady
When dealing with writebacks, as soon as the packet metadata arrives there will be a tag lookup, done sequentially because a write can't be done in parallel. While the tag lookup is being done, the payload will arrive. When both the payload is present and the correct block entry has been determined, the fill happens.
Change-Id: If1a0085d742458b675bfc012b6d908d9d9a25e32 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14877 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
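The timing described for writebacks can be sketched as a small Python model. It is an illustration under stated assumptions, not the gem5 implementation: the function and parameter names are hypothetical, and the payload delay is assumed to be measured from the arrival of the packet metadata.

```python
def writeback_fill_tick(now, header_delay, payload_delay, lookup_latency):
    """Tick at which a writeback can fill the block (illustrative model)."""
    # The tag lookup starts once the packet metadata (header) has arrived.
    metadata_arrival = now + header_delay
    lookup_done = metadata_arrival + lookup_latency
    # Assumed: the payload delay is counted from the metadata arrival.
    payload_arrival = metadata_arrival + payload_delay
    # The fill needs both the payload and the block entry.
    return max(lookup_done, payload_arrival)
```

Whichever of the two events finishes last, the tag lookup or the payload arrival, determines when the fill happens.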
13747:5c90d834a58c |
29-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix recvTimingReq doWritebacks tick
Before being sent to the writebuffer, the evicted blocks must be selected for replacement, and therefore the access latency must be applied. The forward latency is then applied on top of that delay.
Change-Id: I16a25a8bf6051f63eb7a02fe66acb6af26d434fc Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14736 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13746:723109f11d56 |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use header delay on latency calculation
Previously the bus delay was being ignored for the access latency calculation, and then applied on top of the access latency. This patch fixes the order, as first the packet must arrive before the access starts.
Change-Id: I6d55299a911d54625c147814dd423bfc63ef1b65 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14876 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13745:1cf82fb6c4ab |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove old todo about latency in hit function
The header and payload delay have already been accounted for and zeroed prior to calling this function. The probe is not allowed to modify the packet, therefore no extra delays are added, and it is safe to remove the todo note.
Change-Id: I8ddf7e189fbe609cdec34364f3c013427930daf7 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14875 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13735:52ab3bab4f28 |
13-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: Sandbox Based Optimal Offset Implementation
Brown, N. T., & Sendag, R. Sandbox Based Optimal Offset Estimation.
Change-Id: Ieb693b6b2c3d8bdfb6948389ca10e92c85454862 Reviewed-on: https://gem5-review.googlesource.com/c/15095 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13732:43e7199f511f |
22-Jan-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Copy over flags to forwarded response
A cache that forwards a request to the memory below does not fill, and forwards the response with the data to the cache above. This change ensures that the flags of the original response are also preserved.
Change-Id: I244b20b073c31b976358816c5b14bba413b8271f Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16182 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13721:a80fcb3e1322 |
25-Feb-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem-cache: added missing override specifier in BoP
Added missing specifier for various virtual functions.
Change-Id: I41aebb3b76bce6dd3bee21ac0e2b0e52cb90fc80 Reviewed-on: https://gem5-review.googlesource.com/c/16728 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13717:11e81e2a98bd |
03-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: A Best-Offset Prefetcher
Michaud, P. (2015, June). A best-offset prefetcher. In 2nd Data Prefetching Championship.
Change-Id: I61bb89ca5639356d54aeb04e856d5bf6e8805c22 Reviewed-on: https://gem5-review.googlesource.com/c/14820 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13707:5aab50651a66 |
21-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Add a mechanism to iterate all entries of an AssociativeSet
Added functions to obtain an iterator to access all entries of an AssociativeSet container.
Change-Id: I1ec555bd97d97e3edaced2b8f61287e922279c26 Reviewed-on: https://gem5-review.googlesource.com/c/16582 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13700:56fa28e6fab4 |
31-Jan-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Slim AMPM Prefetcher
Reference: Towards Bandwidth-Efficient Prefetching with Slim AMPM. Young, V., & Krishna, A. (2015). The 2nd Data Prefetching Championship.
Slim AMPM is composed of two prefetchers, the DPCT and the AMPM (both already in gem5).
Change-Id: I6e868faf216e3e75231cf181d59884ed6f0d382a Reviewed-on: https://gem5-review.googlesource.com/c/16383 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13669:24ef552b4d6d |
05-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Irregular Stream Buffer Prefetcher
Based on the description of the following publication: Akanksha Jain and Calvin Lin. 2013. Linearizing irregular memory accesses for improved correlated prefetching. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46). ACM, New York, NY, USA, 247-259.
Change-Id: Ibeb6abc93ca40ad634df6ed5cf8becb0a49d1165 Reviewed-on: https://gem5-review.googlesource.com/c/15215 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13667:e3ae3619b9ab |
05-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Delta Correlating Prediction Tables Prefetcher
Reference: Multi-level hardware prefetching using low complexity delta correlating prediction tables with partial matching. Marius Grannaes, Magnus Jahre, and Lasse Natvig. 2010. In Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers (HiPEAC'10)
Change-Id: I7b5d7ede9284862a427cfd5693a47652a69ed49d Reviewed-on: https://gem5-review.googlesource.com/c/16062 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13665:9c7fe3811b88 |
25-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Don't assume SimObjects live in the global namespace
The importer in Python 3 doesn't like the way we import SimObjects from the global namespace. Convert the existing SimObject declarations to import from m5.objects. As a side-effect, this makes these files consistent with configuration files.
Change-Id: I11153502b430822130722839e1fa767b82a027aa Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15981 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> |
13624:3d8220c2d41d |
13-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Updated version of the Signature Path Prefetcher
This implementation is based on the description available in: Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, and Zeshan Chishti. 2016. Path confidence based lookahead prefetching. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9 Reviewed-on: https://gem5-review.googlesource.com/c/14819 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13564:9bbd53a77887 |
27-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Determine if a packet queue forces ordering at construction
A packet queue is typically used to hold on to packets that are scheduled to be sent in the future or when they need to queue behind younger packets that have not been sent out yet. Due to memory ordering requirements, some MemObjects need to maintain the order of packets (mostly responses) that reference the same cache block.
Prior to this patch the ordering requirements were determined when the packet was scheduled to be sent. This patch moves the parameter to the constructor.
Change-Id: Ieb4d94e86bc7514f5036b313ec23ea47dd653164 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15555 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13554:f16adb9b35cc |
12-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Access Map Pattern Matching Prefetcher
Implementation of the Access Map Pattern Matching prefetcher, based on the description of the following paper: Access map pattern matching for high performance data cache prefetch. Ishii, Y., Inaba, M., & Hiraki, K. (2011). Journal of Instruction-Level Parallelism, 13, 1-24.
Change-Id: I0d4b7f7afc2ab4938bdd8755bfed26e26a28530c Reviewed-on: https://gem5-review.googlesource.com/c/15096 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13553:047def1fa787 |
29-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Signature Path Prefetcher
Related paper: Lookahead Prefetching with Signature Path. J Kim, PV Gratz, ALN Reddy. The 2nd Data Prefetching Championship (DPC2), 2015.
Change-Id: I2319be2fa409f955f65e1bf1e1bb2d6d9a4fea11 Reviewed-on: https://gem5-review.googlesource.com/c/14737 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13552:86c9a15aa4ef |
29-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: allow prefetchers to emit page crossing references
QueuedPrefetcher takes the responsibility to check for page crossing references.
Change-Id: I0ae6bf8be465118990d9ea1cac0da8f70e69aeb1 Reviewed-on: https://gem5-review.googlesource.com/c/14735 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
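A page-crossing check of the kind mentioned in this change can be sketched in Python as follows. This is a hypothetical helper, not the QueuedPrefetcher code: the function name, the 4 KiB default page size, and the comparison against the triggering address are all assumptions made for illustration.

```python
def crosses_page(trigger_addr, prefetch_addr, page_bytes=4096):
    """Return True if the prefetch address lies in a different page
    than the address that triggered it (illustrative model)."""
    return (trigger_addr // page_bytes) != (prefetch_addr // page_bytes)
```

Centralizing such a check in the queue lets individual prefetchers emit candidate addresses without filtering them first.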
13551:f352df8e2863 |
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: virtual address support for prefetchers
Prefetchers can be configured to operate with virtual or physical addresses. The option can be configured through the "use_virtual_addresses" parameter of the Prefetcher object.
Change-Id: I4f8c3687988afecc8a91c3c5b2d44cc0580f72aa Reviewed-on: https://gem5-review.googlesource.com/c/14416 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13485:12e16073f6a7 |
07-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Workaround for setWhenReady assertion
Change 174da8e2da6a896d2e97bc264f9c827a0f4c35ac added an assert that is not satisfiable with current implementation, breaking some regression tests.
Change-Id: Ibafaf0c51906384364f0b2a4b931f8ec6126d858 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14955 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13478:59414c401cd9 |
05-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks parameter from serviceMSHRTargets
Change 8ba77ae8fc98a355082da2bd9fdc6ecf4928f725 introduced the writebacks parameter, but it was never used.
Change-Id: I225e5b399de42d77c72fc0012d3dc93ef39b8853 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14896 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13477:044307c0d0b8 |
28-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add getter and setter to CacheBlk::whenReady
Add a getter and a setter function to access CacheBlk::whenReady to encapsulate the variable and allow error checking. This error checking consists of verifying that writes to a block after it has been inserted follow a chronological order.
As a side effect, tickInserted retains its value until updated, that is, it is not reset in invalidate().
Change-Id: Idc3c5a99c3f002ee9acc2424f00e554877fd3a69 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14715 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
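The encapsulation described in this change can be illustrated with a minimal Python model. It is a sketch, not the C++ CacheBlk class: the method names, the use of `None` for "not set", and the exception types are assumptions.

```python
class CacheBlk:
    """Illustrative model of a block whose ready tick is encapsulated."""

    def __init__(self):
        self.tick_inserted = 0
        self._when_ready = None  # assumed marker for "not set yet"

    def insert(self, tick):
        self.tick_inserted = tick

    def get_when_ready(self):
        if self._when_ready is None:
            raise RuntimeError("whenReady read before it was set")
        return self._when_ready

    def set_when_ready(self, tick):
        # Writes after insertion must follow chronological order.
        if tick < self.tick_inserted:
            raise ValueError("whenReady earlier than insertion tick")
        self._when_ready = tick

    def invalidate(self):
        # tick_inserted is deliberately not reset here.
        self._when_ready = None
```

Routing every write through the setter gives a single place where the chronological check can live.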
13473:ba87e4c95508 |
25-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Optimize sector valid and secure check
Previously a loop was being done to check whether the block was valid/secure or not. Variables have been added to skip this loop and save and update sector block state when sub-blocks are validated, invalidated and secured.
Change-Id: Ie1734f7dfda9698c7bf22a1fcbfc47ffb9239cea Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14363 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
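The optimization can be pictured with a small Python model that keeps counters up to date instead of looping over the sub-blocks on every query. The class and method names are hypothetical, and the rule that a sector counts as valid (or secure) when at least one sub-block is valid (or secure) is an assumption of this sketch.

```python
class SectorBlk:
    """Illustrative sector block that tracks state with counters."""

    def __init__(self, num_sub_blocks):
        self.valid = [False] * num_sub_blocks
        self.secure = [False] * num_sub_blocks
        self._num_valid = 0
        self._num_secure = 0

    def validate_sub_blk(self, i):
        if not self.valid[i]:
            self.valid[i] = True
            self._num_valid += 1

    def secure_sub_blk(self, i):
        if not self.secure[i]:
            self.secure[i] = True
            self._num_secure += 1

    def invalidate_sub_blk(self, i):
        if self.valid[i]:
            self.valid[i] = False
            self._num_valid -= 1
        if self.secure[i]:
            self.secure[i] = False
            self._num_secure -= 1

    def is_valid(self):
        # Constant-time check, no loop over sub-blocks.
        return self._num_valid > 0

    def is_secure(self):
        return self._num_secure > 0
```

The cost moves from each query to each state change, which pays off when queries are far more frequent than validations and invalidations.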
13449:2f7efa89c58b |
26-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, gpu, mem: Replace assert(0 or false with panic.
Neither assert(0) nor assert(false) give any hint as to why control getting to them is bad, and their more descriptive versions, assert(0 && "description") and assert(false && "description"), jury rig assert to add an error message when the utility function panic() already does that directly with better formatting options.
This change replaces that flavor of call to assert with panic, except in the actual code which processes the formatting that panic uses (to avoid infinitely recurring error handling), and in some *.sm files since I don't know what rules those have to follow and don't want to accidentally break them.
Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0 Reviewed-on: https://gem5-review.googlesource.com/c/14636 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13445:070fc4d948c0 |
25-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add setters to validate and secure block
In order to allow polymorphism of the block these two functions have been added, and all direct status assignments to these bits have been substituted.
We also assert that the block has been invalidated before insertion. Then the block is validated in the insertion.
Change-Id: Ie7be42408721ad4c2c9dc880f82a62cb594f8668 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14362 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13434:99807b35a66c |
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: a missing cast was truncating addresses
High bits were truncated when computing the block address
Change-Id: Iab2a4c6063ece2d1d4c24ce5686045a6d6d35434 Reviewed-on: https://gem5-review.googlesource.com/c/14415 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13428:ceddb3964aea |
15-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: fix invalid iterator access
An iterator was assigned end() and then it was used to access its corresponding element.
Change-Id: I87246cf56cbc694dd6b4e2cabbe84a08429d2ac3 Reviewed-on: https://gem5-review.googlesource.com/c/14361 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13427:72a3afac3e78 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Make StridePrefetcher use Replacement Policies
Previously StridePrefetcher was only able to use random replacement policy. This change allows all replacement policies to be applied to the pc table.
Change-Id: I8714e71a6a4c9c31fbca49a07a456dcacd3e402c Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14360 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13426:d2b0e9ec67f1 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Add invalidation function to StrideEntry
Add invalidation function to StrideEntry so that every entry can be invalidated appropriately.
Change-Id: I38c42b7d7c93d839f797d116f1d2c88572123c0e Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14359 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13425:00abf35b2f7e |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Make PCTable context independent
Move the unordered_map outside of the PCTable, as it belongs to the StridePrefetcher. By doing so we are moving towards a table that resembles the ones of the Tags classes.
Some functions have been moved from the prefetcher to the PCTable, as they didn't belong there. As such, they have been renamed to remove the unnecessary prefix.
Change-Id: I3e54bc7dee65e1f78d96b0d548ac8345b7bd4364 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14358 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13424:1744211c9a65 |
13-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Vectorize StridePrefetcher's entries.
Turn StridePrefetcher::PCTable::entries into a vector of vectors.
Change-Id: I2a4589a76eb205910c43723638b7989eddd5ca24 Reviewed-on: https://gem5-review.googlesource.com/c/14357 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13423:a414d6fccc4e |
13-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Return entry in StridePrefetcher::pcTableHit()
Return a pointer to the entry instead of returning a boolean and passing a pointer reference. As a side effect, change the name of the function to be more descriptive of the functionality.
Change-Id: Iad44979e98031754c1d0857b1790c0eaf77e9765 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14356 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
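The interface change, returning the entry itself rather than a boolean plus an output argument, can be sketched in Python. The names `StrideEntry` fields, `find_entry`, and the dictionary-backed table are hypothetical and only illustrate the calling convention.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StrideEntry:
    last_addr: int
    stride: int


def find_entry(table: Dict[int, StrideEntry], pc: int) -> Optional[StrideEntry]:
    """Return the entry for this PC, or None on a miss (illustrative)."""
    return table.get(pc)
```

A caller then tests the returned value directly, so the hit/miss result and the entry cannot get out of sync.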
13422:4ec52da74cd5 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Cleanup prefetchers
Prefetcher code had extra variables, dependencies that could be removed, code duplication, and missing overrides.
Change-Id: I6e9fbf67a0bdab7eb591893039e088261f52d31a Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14355 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13419:aaadcfae091a |
13-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove Cache dependency from Tags
Tags do not need to be aware of caches.
Change-Id: Ib6a082b74dcd9b2f10852651634b59512732fb2a Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14296 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13418:08101e89101e |
18-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move access latency calculation to Cache
Access latency was not being calculated properly, as it was always assuming that for hits reads take as long as writes, and that parallel accesses would produce the same latency for read and write misses.
By moving the calculation to the Cache we can use the write/read information, reduce latency variables duplication and remove Cache dependency from Tags.
The tag lookup latency is still calculated by the Tags.
Change-Id: I71bc68fb5c3515b372c3bf002d61b6f048a45540 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13697 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13416:d90887d0c889 |
09-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: implement a probe-based interface
The HW Prefetcher of a cache can now listen to events from its associated CPUs and from its own cache.
Change-Id: I28aecd8faf8ed44be94464d84485bd1cea2efae3 Reviewed-on: https://gem5-review.googlesource.com/c/14155 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13412:bc5b08f44e6d |
06-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Align how we handle requests in atomic with timing
Requests for which a cache has already committed to respond do not perform any lookups. Previously in atomic mode the packet would pay the lookup latency while in timing it wouldn't. This patch aligns recvAtomic with recvTimingReq and removes the lookup latency from the handling of such requests.
Change-Id: I50a0631f8058e5086d94d55af0e1788a60e2883f Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/14175 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13378:038ea95fd793 |
02-Nov-2018 |
Gabe Black <gabeblack@google.com> |
mem-cache: Rename the tag class init function to tagsInit.
Since the tag classes are subclasses of SimObject, they inherit an init function which does generic initialization at simulation startup and which doesn't take any parameters. A new function was added which does take a parameter, and which is just for doing tag specific initialization as triggered by the base cache. These two names clashed, and clang complained that the tag local name was hiding the SimObject name (which it was).
Change-Id: I399775aceaf8f1a8e2646d434facef22e6d3e7d0 Reviewed-on: https://gem5-review.googlesource.com/c/13875 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13377:2e04ce7d3fd4 |
15-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Use Packet writing functions instead of memcpy
Classes were using memcpy instead of the Packet functions created for writing to/from the packet. This allows these writes to be better checked and tracked.
This also fixes a bug in MemCheckerMonitor, which was using the incorrect type for the packet pointer.
Change-Id: I5bbc8a24e59464e8219bb6d54af8209e6d4ee1af Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13695 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13376:2165f3f012ed |
26-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix double block invalidation
Block was being invalidated twice when not a tempBlock. Make explicit that the else case is only to be applied when handling the tempBlock, as otherwise the Tags should be taking care of the invalidation.
Change-Id: Ie7603fdbe156c54e94bbdc83541b55e66f8d250f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13895 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13367:dc06baae4275 |
19-Oct-2018 |
yuetsu.kodama <yuetsu.kodama@riken.jp> |
arch-arm: We add PRFM PST instruction for arm
Note that the current PRFM supports only PLD, but PST (prefetch for store) is also important for latency hiding. We also fix a bug in the disassembler to display prfop correctly.
Change-Id: I9144e7233900aa2d555e1c1a6a2c2e41d837aa13 Signed-off-by: Yuetsu Kodama <yuetsu.kodama@riken.jp> Reviewed-on: https://gem5-review.googlesource.com/c/13675 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13358:5e1605b47a21 |
19-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move evictBlock(CacheBlk*, PacketList&) to base
Move evictBlock(CacheBlk*, PacketList&) to base cache, as both sub-classes' implementations are equal.
Change-Id: I80fbd16813bfcc4938fb01ed76abe29b3f8b3018 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13656 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13353:63f4073c1fc7 |
18-Oct-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix unused variable warning in FALRU:invalidate()
Change-Id: I3b902045433ca56b3e62c251158e784b5fa9e4d7 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13600 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13352:75647326f19b |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add write coalescing and write-no-allocate to the caches
Enable the cache to detect contiguous writes and hold on to the MSHR long enough to allow the entire line to be written. If the whole line is written, the MSHR will be sent out as an invalidation request, as it is part of a whole-line write, i.e. no-fetch-on-write.
The cache is also able to switch to a write-no-allocate policy on the actual completion of the writes, and instead use the tempBlock and turn the write operation into a writeback.
These policies are all well-known, and described in works such as Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993.
Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae Reviewed-on: https://gem5-review.googlesource.com/c/12907 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13351:1d456a63bfbc |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Delay servicing an MSHR after its allocation
An MSHR is allocated and the computed latency determines when the MSHR will be ready and can be serviced by the cache. This patch adds a function that allows changing the time that an MSHR is ready and adjusts the queue such that other MSHRs can be serviced first if they are ready.
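The reordering described above can be sketched as follows. This is a minimal illustration, not gem5's MSHR queue: `Entry`, `ReadyList`, `add()` and `delay()` are hypothetical names, and the list is simply kept sorted by ready time.

```cpp
#include <cstdint>
#include <list>

// Hypothetical stand-in for an MSHR; only the fields needed here.
struct Entry
{
    int id;
    std::uint64_t readyTime;
};

// Hypothetical ready list, kept sorted by ascending readyTime.
class ReadyList
{
  public:
    using Iter = std::list<Entry>::iterator;

    // Insert an entry at the position that keeps the list sorted.
    Iter
    add(int id, std::uint64_t ready_time)
    {
        auto pos = entries.begin();
        while (pos != entries.end() && pos->readyTime <= ready_time)
            ++pos;
        return entries.insert(pos, Entry{id, ready_time});
    }

    // Change an entry's ready time and move it to its new position,
    // so that entries that become ready earlier are serviced first.
    Iter
    delay(Iter it, std::uint64_t new_ready_time)
    {
        const int id = it->id;
        entries.erase(it);
        return add(id, new_ready_time);
    }

    // Entry to service next (precondition: the list is not empty).
    const Entry &next() const { return entries.front(); }

  private:
    std::list<Entry> entries;
};
```

For instance, delaying the entry at the head of the list past the ready time of the second entry makes the second entry the next one to be serviced.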
Change-Id: Ie908191fcb3c2d84d4c6f855c8b1e41ca5881bff Reviewed-on: https://gem5-review.googlesource.com/c/12906 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13350:247e4108a5e8 |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Restructure whole-line writes to simplify write merging
This patch changes how we deal with whole-line writes and their responses. With these changes, we use the MSHR tracking to determine if a whole line is written, and on a fill we simply handle the invalidation response, with the actual writes taking place as part of satisfying the CPU-side hit.
Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b Reviewed-on: https://gem5-review.googlesource.com/c/12905 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13349:20890038e8a0 |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Determine if an MSHR does a whole-line write
This patch adds support for determining whether the targets in an MSHR are 1) only writes and 2) whether these writes are effectively a whole-line write. This patch adds the necessary functions in the MSHR to allow for write coalescing in the cache.
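A minimal sketch of the two checks, assuming a hypothetical `WriteTracker` that records which bytes of a line have been written. The class, its method names and the 64-byte line size are illustrative assumptions, not the functions this patch adds to the MSHR.

```cpp
#include <bitset>
#include <cstddef>

// Assumed cache line size in bytes (illustrative only).
constexpr std::size_t kLineSize = 64;

// Hypothetical helper, not gem5's MSHR interface.
class WriteTracker
{
  public:
    // Record a target; offset and size are relative to the line start.
    void
    addTarget(bool is_write, std::size_t offset, std::size_t size)
    {
        if (!is_write) {
            // A single non-write target disqualifies the whole MSHR.
            onlyWrites = false;
            return;
        }
        for (std::size_t i = offset; i < offset + size && i < kLineSize; ++i)
            written.set(i);
    }

    // True only if every target was a write and all bytes are covered.
    bool isWholeLineWrite() const { return onlyWrites && written.all(); }

  private:
    bool onlyWrites = true;
    std::bitset<kLineSize> written;
};
```

Two 32-byte writes at offsets 0 and 32 make `isWholeLineWrite()` return true; adding any read target afterwards makes it return false again.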
Change-Id: I2c9a9a83d2d9b506a491ba5b0b9ac1054bdb31b4 Reviewed-on: https://gem5-review.googlesource.com/c/12904 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13236:8ea2f58940b0 |
12-Oct-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Add missing includes in TreePLRU
Add missing includes to TreePLRU files.
Change-Id: Ia1e7b2aa91eec8a30b6dccf513cca37a3058b350 Reviewed-on: https://gem5-review.googlesource.com/c/13477 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13225:8d1621fc586e |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Factor ReplaceableEntry out
ReplaceableEntry is referenced by many classes that do not necessarily need access to the replacement policies. Therefore, in order to allow better compilation units, we factor it out to a new file.
Change-Id: I0823567bf1ca336ffcdf783682ef473e8878d7fd Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13418 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13224:1e74ea6ffe51 |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move sector_blks to tags folder
Move sector_blks.hh and sector_blks.cc to the tags folder, as its usage scope is restricted to the tags, and caches should not be aware of them.
Change-Id: Ia7a71f51ec251d827872daf108c87da543a0ba57 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13417 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13223:081299f403fe |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Rename blk.cc/hh to cache_blk.cc/hh
Rename the files blk.cc and blk.hh to cache_blk.cc and cache_blk.hh to comply with the usual file-class naming rules.
Change-Id: I8af45df3e4b8dd934fd9929ec914fb230cb2cb09 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13416 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13222:0dbcc7d7d66f |
10-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Virtualize block print
Encapsulate and virtualize block print, so that relevant information can be easily printed anywhere.
Change-Id: I91109c29c126755183a0fd2b4446f5335e64076b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13415 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13221:48bce2835200 |
05-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Tree-PLRU replacement policy
Implementation of a Tree-PLRU replacement policy. It is based on the assumption that a set associative cache is used.
Change-Id: I74b227e88fd6c93aab5bb2cd0e8730376db28f52 Reviewed-on: https://gem5-review.googlesource.com/c/11106 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13220:78a8391a0f95 |
12-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove CacheSet.hh
Replacement policies aren't aware of cache sets and do not organize blocks based on replacement data. Block search is independent of block placement.
Besides, indexing policies have their own way of addressing the sets, therefore there is no need to use this class anymore.
BlkType has been removed, as it wasn't being used.
Change-Id: Ia79c2a491e59f295c8d60a0466c317eb0e2bdab9 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/9782 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13219:454ecc63338d |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Split Tags for indexing policies
Split indexing functionality from tags, so that code duplication is reduced when adding new classes that use different indexing policies, such as set associative, skewed associative or other hash-based policies.
An indexing policy defines the mapping between an address' set and its physical location. For example, a conventional set assoc cache maps an address to all ways in a set using an immutable function, that is, a set x is always mapped to set x. However, skewed assoc caches map an address to a different set for each way, using a skewing function.
FALRU has been left unmodified as it is a specialization with its own complexity.
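The difference between the two mappings can be sketched as below. The interface and the toy skewing function are assumptions made for illustration; they are not the classes introduced by this change, and the skewing function is not one of those from Seznec's paper.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical, simplified indexing-policy interface.
class IndexingPolicy
{
  public:
    IndexingPolicy(unsigned num_sets, unsigned blk_shift)
        : numSets(num_sets), setShift(blk_shift) {}
    virtual ~IndexingPolicy() = default;

    // Set in which `addr` may reside when placed in `way`.
    virtual unsigned extractSet(Addr addr, unsigned way) const = 0;

  protected:
    const unsigned numSets;   // number of sets in the cache
    const unsigned setShift;  // log2 of the block size in bytes
};

// Conventional placement: the same set for every way.
class SetAssociative : public IndexingPolicy
{
  public:
    using IndexingPolicy::IndexingPolicy;

    unsigned
    extractSet(Addr addr, unsigned /* way */) const override
    {
        return static_cast<unsigned>((addr >> setShift) % numSets);
    }
};

// Skewed placement: the set depends on the way. Way 0 behaves like
// the conventional mapping; other ways are offset by the tag bits.
class SkewedAssociative : public IndexingPolicy
{
  public:
    using IndexingPolicy::IndexingPolicy;

    unsigned
    extractSet(Addr addr, unsigned way) const override
    {
        const Addr index = addr >> setShift;
        const Addr tag = index / numSets;
        return static_cast<unsigned>((index + tag * way) % numSets);
    }
};
```

With 4 sets and 64-byte blocks, addresses 0x040 and 0x140 collide in every way of the set associative policy, but map to different sets in way 1 of the skewed policy.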
Change-Id: I0838b41663f21eba0aeab7aeb7839e3703ca3324 Reviewed-on: https://gem5-review.googlesource.com/c/8885 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13218:5e7df60c6cab |
07-Sep-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use set and way for ReplaceableEntry
Replaceable entries belong to table-like structures, and therefore they should be indexable by combining a row and a column. These, using conventional cache nomenclature, translate to sets and ways.
Make these entries aware of their sets and ways. The idea is to make indexing policies usable by other table-like structures. In order to do so we move sets and ways to ReplaceableEntry, which will be the common base among table entries.
Change-Id: If0e3dacf9ea2f523af9cface067469ccecf82648 Reviewed-on: https://gem5-review.googlesource.com/c/12764 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13217:725b1701b4ee |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use possible locations to find block
Use possible locations to find block to make it placement policy independent.
Change-Id: I4c9d9e1e1ff91ce12e85ca1970f927d8f4f5a93b Reviewed-on: https://gem5-review.googlesource.com/c/8884 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13216:6ae030076b29 |
21-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create tags initialization function
Having the blocks initialized in the constructor makes it harder to apply inheritance in the tags classes. This patch decouples the block initialization functionality from the constructor by using an init() function. It also sets the parent cache.
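One reason two-phase initialization helps inheritance in C++ is that virtual calls made from a base constructor do not reach the derived class. The sketch below illustrates the pattern with hypothetical `BaseTags` and `DerivedTags` classes; only the use of an `init()` function comes from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical base class; gem5's tags hierarchy is richer than this.
class BaseTags
{
  public:
    explicit BaseTags(unsigned num_blocks) : numBlocks(num_blocks) {}
    virtual ~BaseTags() = default;

    // Called once after construction. The most-derived object exists
    // by then, so the virtual call below dispatches to the subclass.
    void
    init()
    {
        for (unsigned i = 0; i < numBlocks; ++i)
            blocks.push_back(makeBlockName(i));
    }

    const std::vector<std::string> &getBlocks() const { return blocks; }

  protected:
    virtual std::string
    makeBlockName(unsigned i) const
    {
        return "blk" + std::to_string(i);
    }

  private:
    unsigned numBlocks;
    std::vector<std::string> blocks;
};

// Hypothetical subclass that customizes how blocks are set up.
class DerivedTags : public BaseTags
{
  public:
    using BaseTags::BaseTags;

  protected:
    std::string
    makeBlockName(unsigned i) const override
    {
        return "sector_blk" + std::to_string(i);
    }
};
```

Had the loop been placed in the `BaseTags` constructor, `DerivedTags` objects would have been populated with the base class's block names.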
Change-Id: I0da7fdaae492b1177c7cc3bda8639f79921fbbeb Reviewed-on: https://gem5-review.googlesource.com/c/11509 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13215:82cdb8db4643 |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove Packet dependency in Tags
Decouple Tags from Packets, only extracting the necessary functionality for block insertion. As a side effect, create a new function to update common insertion statistics.
Change-Id: I5c58f7c17de3255beee531f72a3fd25a30d74c90 Reviewed-on: https://gem5-review.googlesource.com/c/11098 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13165:d52afbf4cdfe |
04-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU hash invalidation
The block was being invalidated before the hash could erase its entry, therefore it was using invalid values (tag was being assigned MaxAddr and the secure bit was reset).
This change reorders the calls, so that the appropriate hash entry is erased.
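The ordering issue can be sketched with a simplified tag store. `Blk`, `TagStore` and their members are hypothetical names; only the idea of erasing the hash entry before the block's tag is reset comes from the description above.

```cpp
#include <cstdint>
#include <limits>
#include <unordered_map>

using Addr = std::uint64_t;
constexpr Addr MaxAddr = std::numeric_limits<Addr>::max();

// Hypothetical block: invalidation resets the tag to MaxAddr.
struct Blk
{
    Addr tag = MaxAddr;
    bool valid = false;

    void
    invalidate()
    {
        tag = MaxAddr;
        valid = false;
    }
};

// Hypothetical tag store with a hash from tag to block.
class TagStore
{
  public:
    void
    insert(Blk &blk, Addr tag)
    {
        blk.tag = tag;
        blk.valid = true;
        hash[tag] = &blk;
    }

    void
    invalidate(Blk &blk)
    {
        // Erase the hash entry while the tag still holds its value...
        hash.erase(blk.tag);
        // ...and only then reset the block's fields.
        blk.invalidate();
    }

    bool contains(Addr tag) const { return hash.count(tag) != 0; }

  private:
    std::unordered_map<Addr, Blk *> hash;
};
```

With the two calls in `invalidate()` swapped, `hash.erase()` would look up `MaxAddr` and leave a stale entry for the old tag behind.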
Change-Id: I161463df0f8f5220179bc68d7be12051e5390d01 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13210 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13164:da6240a1ccfb |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make checking function const in FALRU
The checking function should modify neither the head and tail pointers nor its class.
Change-Id: I2ad495f0c8c6b778d48512143e94b4c9a353f22e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13209 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13163:55923cb33a7e |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make boundaries in FALRU an STL container
Turn the dynamically allocated array of pointers "boundaries" into an STL vector.
Change-Id: I3409898473b155f69b4c6e038eba2dffb5b09380 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13208 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13162:b6a5d452d52d |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU inCachesMask initialization
inCachesMask is not being initialized, which triggers an assertion on insertion. Fix this by implementing a default constructor for the FALRUBlk.
Change-Id: I587cf5e0191c4587d938e6ab6036ec1b32f37793 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13207 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13026:c8380b98c0ef |
19-Sep-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix non-bijective function in Skewed caches
The hash() function must be bijective for the skewed caches to work, however when the hashing is done on top of a one-bit address, the MSB and LSB refer to the same bit, and therefore their xor will always be zero.
This patch adds a fatal error to not allow the user to set an invalid value for the number of sets that would generate that bug.
As a side note, the missing header for the bitfields functions has been added.
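The one-bit corner case can be checked exhaustively. The sketch below assumes a hash of the shift-and-xor form (one-bit right shift, with the new MSB set to the xor of the old MSB and LSB); `skewHash`, `isBijective` and the `num_bits` parameter are hypothetical names and may differ from the actual implementation.

```cpp
#include <cstdint>
#include <set>

using Addr = std::uint64_t;

// One-bit right shift; the new MSB is (old MSB xor old LSB).
// `num_bits` is the width of the value being hashed.
Addr
skewHash(Addr x, unsigned num_bits)
{
    const Addr lsb = x & 1;
    const Addr msb = (x >> (num_bits - 1)) & 1;
    return (x >> 1) | ((msb ^ lsb) << (num_bits - 1));
}

// Exhaustively check that skewHash maps the num_bits-bit values
// onto themselves without collisions.
bool
isBijective(unsigned num_bits)
{
    const Addr count = Addr(1) << num_bits;
    std::set<Addr> outputs;
    for (Addr x = 0; x < count; ++x)
        outputs.insert(skewHash(x, num_bits));
    return outputs.size() == count;
}
```

For a width of one bit both inputs hash to zero, so the check fails; for widths of two bits or more every input has a distinct output.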
Change-Id: I35a03ac5fdc4debb091f7f2db5db33568d0b0021 Reviewed-on: https://gem5-review.googlesource.com/12724 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13017:a620da03ab10 |
01-Sep-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix bug in handleAtomicReqMiss
"4976ff5 mem-cache: Refactor the recvAtomic function" introduced a bug where an atomic request that fills in using the tempBlock does not evict it when it finishes handling the request, as it should. This triggers an assertion. This change fixes this bug.
Change-Id: I73c808a7e15237eddb36b5448ef6728f7bcf7fd9 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12644 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12964:0315ef861b8a |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Skewed Assoc placement policy
Create a class that implements the skewed associative placement policy. It uses the hash function and expansions of the skewing functions described in "Skewed-Associative caches", by Seznec.
Only 8 skewing functions are implemented, and therefore if more are needed a hash function will be recursively applied on top of the output of one of these functions to generate different values. This is not optimal, and if more functions are needed it might be more effective to implement them.
Change-Id: Ibc77edffd8128114a8b200cec5d8deedfb5105cb Reviewed-on: https://gem5-review.googlesource.com/8886 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12843:d2ab5af49985 |
13-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem-cache: TempCacheBlk allocates and destroys its own data
This change is because I want to make CacheBlk::data private, so that I can track all the places which write to it. But to keep that commit smaller (it is pretty big, because of all the places which might change it), I have split this into a commit of its own.
Change-Id: I15a2fc1752085ff3681f5c74ec90be3828a559ea Reviewed-on: https://gem5-review.googlesource.com/11829 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12823:ba630bc7a36d |
19-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem: Rename Packet::checkFunctional to trySatisfyFunctional
Packet::checkFunctional also wrote data to or from the packet, depending on whether it was a read or a write, respectively, which the 'check' in the name does not suggest. This renames it to doFunctional, which is more suggestive. It also renames any function called checkFunctional which calls Packet::checkFunctional. These are:
 - Bridge::BridgeMasterPort::checkFunctional
     - calls Packet::checkFunctional
 - MSHR::checkFunctional
     - calls Packet::checkFunctional
 - MSHR::TargetList::checkFunctional
     - calls Packet::checkFunctional
 - Queue<>::checkFunctional (of src/mem/cache/queue.hh, not src/cpu/minor/buffers.h)
     - Instantiated with Queue<WriteQueueEntry> and Queue<MSHR>
 - WriteQueueEntry
     - calls Packet::checkFunctional
 - WriteQueueEntry::TargetList
     - calls Packet::checkFunctional
 - MemDelay::checkFunctional
     - calls QueuedSlavePort/QueuedMasterPort::checkFunctional
 - Packet::checkFunctional
 - PacketQueue::checkFunctional
     - calls Packet::checkFunctional
 - QueuedSlavePort::checkFunctional
     - calls PacketQueue::doFunctional
 - QueuedMasterPort::checkFunctional
     - calls PacketQueue::doFunctional
 - SerialLink::SerialLinkMasterPort::checkFunctional
     - calls Packet::doFunctional
Change-Id: Ieca2579c020c329040da053ba8e25820801b62c5 Reviewed-on: https://gem5-review.googlesource.com/11810 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12820:5d66b60a2c47 |
13-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem-cache: Typo in comment: 'proceed' -> 'precede'
The writebacks happen before anything below, not after.
Change-Id: I7eaefbbf33aa17c496255dedd964a56118a28741 Reviewed-on: https://gem5-review.googlesource.com/11749 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12794:ba78a382b0f6 |
18-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Promote deferred targets on cache clean responses
While a cache clean operation is pending, all requests to the corresponding block get deferred. When the response of a cache clean operation is received, if the block is present and the response is not invalidating, we can service all deferred targets that didn't require writable. This change implements this functionality.
Change-Id: Ief47e74d07749a6a9736ab450eb46eefa53464a2 Reviewed-on: https://gem5-review.googlesource.com/11018 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12793:dda6af979353 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Promote targets that don't require writable
Until now, all deferred targets of an MSHR would be promoted together as soon as the targets were serviced. Due to the way we handle cache clean operations we might need to promote only deferred targets that don't require writable, leaving some targets as deferred. This change adds support for this selective promotion.
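One way to picture selective promotion is to move only the leading run of deferred targets that satisfy a predicate, preserving their order. The sketch below is an assumption about the general shape of such a routine; `Target`, `TargetList` and `promoteIf` are simplified, hypothetical stand-ins rather than the code added by this change.

```cpp
#include <algorithm>
#include <functional>
#include <list>

// Hypothetical, simplified target record.
struct Target
{
    int id;
    bool needsWritable;
};

using TargetList = std::list<Target>;

// Promote deferred targets, in order, until the first one that
// fails the predicate; that target and all later ones stay deferred.
void
promoteIf(TargetList &targets, TargetList &deferred,
          const std::function<bool(const Target &)> &pred)
{
    auto last = std::find_if_not(deferred.begin(), deferred.end(), pred);
    targets.splice(targets.end(), deferred, deferred.begin(), last);
}
```

Stopping at the first target that cannot be promoted, rather than skipping over it, keeps the relative order of requests to the block intact.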
Change-Id: I502e523dc9adbaf394955cbacea8286ab6a9b6bc Reviewed-on: https://gem5-review.googlesource.com/11017 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12792:9af3470e24e7 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix promoting of targets that need writable
There are cases where a request which does not need a writable copy gets an upgraded response and fills in a writable copy. When this happens, we promote deferred MSHR targets that were deferred because they needed a writable copy to service them immediately.
Previously, we would unconditionally promote deferred targets. Since the deferred targets might contain a cache invalidation operation, we have to make sure that any targets following the cache invalidation are not promoted.
Change-Id: I1f7b28f7d35f84329e065c8f63117db21852365a Reviewed-on: https://gem5-review.googlesource.com/11016 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12791:8f27b3c23a91 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Selectively clear downstream pending
Until now, all deferred targets of an MSHR would be promoted together as soon as the targets were serviced. When we promote deferred targets we also clear the downstreamPending flag.
Due to the way we handle cache clean operations we might need to promote only deferred targets that don't require writable, leaving some targets as deferred. To allow for partial target promotion, this change adds support for clearing the downstreamPending flag only for a subset of a TargetList.
Change-Id: Id06953643ba9a975ebacc76ac10215441e264e74 Reviewed-on: https://gem5-review.googlesource.com/11015 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12782:558fb870aefe |
18-Jun-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: Fix TempCacheBlock insert
TempCacheBlock insert() had a different signature than the parent class which caused an error on clang. This matches the signature with default zero values.
Change-Id: Ic096914497f3d17e88295c9e65a04d76fdddf365 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/11349 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12775:84d56bc8cd8b |
21-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix support for secure blocks in the FALRU cache
Fully associative caches use an unordered map to enable efficient lookups of existing blocks. Previously this map was indexed using the tag of the block. Security extensions allow secure and non-secure versions of a block with the same tag to co-exist in the cache. This patch amends the block map to allow correct lookups for FALRU caches.
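A minimal sketch of a map keyed on both the tag and the secure bit. The key type, the hash functor and `findBlock` are hypothetical; the actual patch may combine the two fields differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Addr = std::uint64_t;

// Hypothetical key: (tag, is_secure).
using TagKey = std::pair<Addr, bool>;

struct TagKeyHash
{
    std::size_t
    operator()(const TagKey &key) const
    {
        // Simple combination; equality on the pair resolves collisions.
        return std::hash<Addr>()(key.first) ^
               (std::hash<bool>()(key.second) << 1);
    }
};

struct Blk
{
    int id;
};

using TagHash = std::unordered_map<TagKey, Blk *, TagKeyHash>;

// Look up a block by tag and security state; nullptr on a miss.
Blk *
findBlock(const TagHash &hash, Addr tag, bool is_secure)
{
    auto it = hash.find(std::make_pair(tag, is_secure));
    return it == hash.end() ? nullptr : it->second;
}
```

With a map keyed on the tag alone, inserting the secure and non-secure versions of the same tag would overwrite one entry with the other.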
Change-Id: Iccf07464deab56d1d270bae14bb3b154047e3556 Reviewed-on: https://gem5-review.googlesource.com/11309 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12774:b7948f858593 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Initialize CacheBlk data pointer
Initialize CacheBlk's data pointer as a nullptr.
Change-Id: Ice85b4b11495cad4b0a160ccb9efe1be673e57e2 Reviewed-on: https://gem5-review.googlesource.com/11097 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12773:387fa9e5c9ff |
07-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Forward declare ReplaceableEntry
Forward declare ReplaceableEntry in classes where pointers to it are used.
Change-Id: I49c08d36442a563d7a6b4c9bcd7eba3591d29b60 Reviewed-on: https://gem5-review.googlesource.com/11096 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12766:1c347e60c7fd |
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
base,mem: Support AtomicOpFunctor in the classic memory system
AtomicOpFunctor can be used to implement atomic memory operations. AtomicOpFunctor is captured inside a memory request and executed directly in the memory hierarchy in a single step.
This patch enables AtomicOpFunctor pointers to be included in a memory request and executed in a single step in the classic cache system.
This patch also makes the copy constructor of Request class do a deep copy of AtomicOpFunctor object. This prevents a copy of a Request object from accessing a deleted AtomicOpFunctor object.
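The deep copy can be sketched as follows. The classes are reduced to the members needed for the example, and the `clone()` method and `AtomicAdd` functor are assumptions made for illustration.

```cpp
#include <cstdint>
#include <memory>

// Simplified stand-in; assumes a clone() method for polymorphic copy.
struct AtomicOpFunctor
{
    virtual ~AtomicOpFunctor() = default;
    virtual void operator()(std::uint8_t *p) = 0;
    virtual AtomicOpFunctor *clone() const = 0;
};

// Hypothetical functor that adds a constant to the byte at p.
struct AtomicAdd : AtomicOpFunctor
{
    explicit AtomicAdd(std::uint8_t v) : value(v) {}

    void
    operator()(std::uint8_t *p) override
    {
        *p = static_cast<std::uint8_t>(*p + value);
    }

    AtomicOpFunctor *clone() const override { return new AtomicAdd(*this); }

    std::uint8_t value;
};

// Simplified request that owns its functor.
class Request
{
  public:
    explicit Request(AtomicOpFunctor *op) : atomicOp(op) {}

    // Deep copy: the new Request gets its own functor, so deleting
    // the original cannot leave the copy with a dangling pointer.
    Request(const Request &other)
        : atomicOp(other.atomicOp ? other.atomicOp->clone() : nullptr) {}

    Request &operator=(const Request &) = delete;

    AtomicOpFunctor *getAtomicOp() const { return atomicOp.get(); }

  private:
    std::unique_ptr<AtomicOpFunctor> atomicOp;
};
```

After copying a request and destroying the original, the copy's functor can still be applied to memory safely.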
Change-Id: I6649532b37f711e55f4552ad26893efeb300dd37 Reviewed-on: https://gem5-review.googlesource.com/8185 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12756:7f7bd5dbfcb1 |
13-Jun-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unnecessary cast in SectorTags::findVictim
Removes an unnecessary cast that also caused an unused variable error (due to -Werror) when compiling .fast targets.
Change-Id: Ic043f462925e7eaa7b691455f1d9e08a1c101980 Reviewed-on: https://gem5-review.googlesource.com/11119 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12754:15c1d281ce1a |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Insert on block allocation
When a block is being replaced in an allocation, if successful, the block will be inserted. Therefore we move the insertion functionality to allocateBlock().
allocateBlock's signature has been modified to allow this modification.
Change-Id: I60d17a83ff4f3021fdc976378868ccde6c7507bc Reviewed-on: https://gem5-review.googlesource.com/10812 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12753:fe5b2dbe42bb |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make packet const in insertBlock
The packet should not be modified within insertBlock.
Change-Id: If7d2b01fe131f9923194efd155c9e85eeab24d5a Reviewed-on: https://gem5-review.googlesource.com/10811 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12752:6a0e3eb1cc5d |
05-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Sector Cache
Implementation of Sector Caches, i.e., a cache with multiple sequential data entries per tag entry for Set Associative placement policies.
Change-Id: I8e1e9448fa44ba308ccb16cd5bcc5fd36c988feb Reviewed-on: https://gem5-review.googlesource.com/9741 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12749:223c83ed9979 |
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
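The effect of the type change can be illustrated with reduced versions of the classes involved. `Request` and `Packet` below are simplified stand-ins that carry only what the example needs; the alias itself follows the description above.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Simplified stand-in for a memory request.
struct Request
{
    explicit Request(std::uint64_t addr) : paddr(addr) {}
    std::uint64_t paddr;
};

// The alias now names a smart pointer instead of Request*.
using RequestPtr = std::shared_ptr<Request>;

// Simplified packet that shares ownership of its request.
struct Packet
{
    explicit Packet(RequestPtr r) : req(std::move(r)) {}
    RequestPtr req;
};
```

A request created with `std::make_shared<Request>(0x1000)` stays alive for as long as any packet still refers to it, and is freed automatically when the last holder goes away, so no explicit `delete` of the request is needed.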
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12748:ae5ce8e42de7 |
03-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Substitute pointer to Request with aliased RequestPtr
Every usage of Request* in the code has been replaced with the RequestPtr alias. This is a preparatory patch for when RequestPtr will be typedefed to a smart pointer to Request rather than a raw pointer to Request.
Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10995 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12747:785f582e44ab |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Change Cache block tag check
Change tag to address check for compatibility with sector design. Cache should not use tag, as sector sub-blocks share them, and it could lead to wrong accesses.
Change-Id: Id1fa26f417595f475c5b5c07ae1f02f5fa0684ba Reviewed-on: https://gem5-review.googlesource.com/10723 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12746:0d0c266663d4 |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use secure bit in findVictim
Sector caches must know if there was a sector hit in order to decide whether a victim's sector must be fully evicted to give place to a new sector or not.
In order to do so it needs the tag and secure information.
Change-Id: Ib554169e25fa131d6bf986561f7970b787c56874 Reviewed-on: https://gem5-review.googlesource.com/10722 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12745:e28c117a9806 |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move tagsInUse to children
Move tagsInUse to children, as sector caches have different tag invalidation and insertion, and thus they must handle updating this variable.
Change-Id: I875c9b7364a909c76daf610d1e226c4e82063870 Reviewed-on: https://gem5-review.googlesource.com/10721 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12744:d1ff0b42b747 |
24-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Return evictions along with victims
For both sector and compressed caches multiple blocks may need to be evicted in order to make room for a new block.
For example, when replacing a sector, all the blocks in this sector must be evicted. A replacement, however, does not always need to evict multiple blocks, as is the case when inserting a block whose sector is already present in the cache (i.e., its corresponding entry in the sector had not been brought in yet, so it was invalid).
This patch creates the cache framework for that to happen.
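The two cases in the example above can be sketched with a toy sector store. Everything here is hypothetical: the types, the fixed sector size, the `findVictim` signature and the trivial "always replace the first sector" policy are illustrative assumptions, not the framework this patch adds.

```cpp
#include <array>
#include <cstdint>
#include <vector>

constexpr unsigned kBlksPerSector = 4;  // assumed sector size

struct Blk
{
    bool valid = false;
};

struct Sector
{
    std::uint64_t tag = 0;
    std::array<Blk, kBlksPerSector> blks;

    bool
    anyValid() const
    {
        for (const Blk &b : blks)
            if (b.valid)
                return true;
        return false;
    }
};

// Return the block to fill and append to evict_blks every valid
// block that must leave the cache to make room for it.
Blk *
findVictim(std::vector<Sector> &sectors, std::uint64_t tag,
           unsigned offset, std::vector<Blk *> &evict_blks)
{
    // Sector hit: only the slot at this offset is affected.
    for (Sector &s : sectors) {
        if (s.anyValid() && s.tag == tag) {
            if (s.blks[offset].valid)
                evict_blks.push_back(&s.blks[offset]);
            return &s.blks[offset];
        }
    }

    // Sector miss: a whole sector is replaced (toy policy: the first).
    Sector &victim = sectors.front();
    for (Blk &b : victim.blks)
        if (b.valid)
            evict_blks.push_back(&b);
    return &victim.blks[offset];
}
```

A sector hit on an invalid slot yields no evictions, while a sector miss returns every valid block of the replaced sector.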
Change-Id: I77bedf69637cf899fef4d9432eb6da8529ea398b Reviewed-on: https://gem5-review.googlesource.com/10142 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12743:b5ccee582b40 |
20-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use ReplaceableEntry in findBlockBySetAndWay
With a sector cache you can't find a block using only its set and way, as there is the sector offset to take into account. As all of these blocks inherit from ReplaceableEntry, the return type of this function has been updated.
This function has also been declared closer to findBlock() due to their similar functionality.
Change-Id: I4730a2b4ebb5738f7fc118201e208a1b9c3ba8e8 Reviewed-on: https://gem5-review.googlesource.com/10141 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12731:36a41bd85c0f |
17-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Privatize extractSet
Only BaseSetAssoc uses extractSet(). Besides, skewed caches need the way information to know which set an address is located at.
Change-Id: Id222e907dc550d053018561bb2683cfc415471ec Reviewed-on: https://gem5-review.googlesource.com/9962 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12730:6c2ea88bf129 |
16-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create an address aware TempCacheBlk
tempBlock has its member variables manually set in order to allow it to be used in the block address regeneration function. This is not necessary, and it can simply be given the address, so it does not need to be aware of set and tag. This will simplify implementation of sector and skewed caches.
Change-Id: Iaffb10c323509722cd5589fe1030b818d43336d6 Reviewed-on: https://gem5-review.googlesource.com/9961 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12729:9870d6f73e04 |
30-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix secure bit modification
Secure bit was being updated outside insertion.
Change-Id: I83d9b010e8cf64013bbea9bae3ea68b0c414a189 Reviewed-on: https://gem5-review.googlesource.com/10622 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12728:57bdea4f96aa |
30-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Replace block visitor with std::function
This change modifies the forEachBlk tags function to accept a std::function as parameter. It also adds an anyBlk tags function that, given a condition, iterates through the blocks and returns whether the condition is met for any of them.
Finally, it uses forEachBlk to implement the print, computeStats and cleanupRefs functions that also work for the FALRU class.
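The visitor interface can be sketched as below. The `Tags` and `CacheBlk` classes are reduced to a bare minimum, and the exact parameter types are assumptions; only the function names and the use of std::function come from the description above.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Simplified stand-in for a cache block.
struct CacheBlk
{
    bool valid = false;
    bool dirty = false;
};

// Simplified tag store exposing the two visitor functions.
class Tags
{
  public:
    explicit Tags(std::size_t num_blocks) : blks(num_blocks) {}

    // Apply the visitor to every block.
    void
    forEachBlk(const std::function<void(CacheBlk &)> &visitor)
    {
        for (CacheBlk &blk : blks)
            visitor(blk);
    }

    // Return true as soon as the visitor holds for some block.
    bool
    anyBlk(const std::function<bool(CacheBlk &)> &visitor)
    {
        for (CacheBlk &blk : blks)
            if (visitor(blk))
                return true;
        return false;
    }

    CacheBlk &at(std::size_t i) { return blks[i]; }

  private:
    std::vector<CacheBlk> blks;
};
```

Callers can pass lambdas directly, for example `tags.anyBlk([](CacheBlk &b) { return b.dirty; })` to ask whether any block is dirty, instead of defining a visitor class for each query.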
Change-Id: I2f75f4baa1fdd5a1d343a63ecace3eb9458fbf03 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10621 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12727:56c23b54bcb1 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix include directives in the cache related classes
Change-Id: I111b0f662897c43974aadb08da1ed85c7542585c Reviewed-on: https://gem5-review.googlesource.com/10433 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12726:850e9965525b |
05-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add a non-coherent cache
The class re-uses the existing MSHR and write queue. At the moment every single access is handled by the cache, even uncacheable accesses, and nothing is forwarded.
This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com>
Change-Id: I41f7f9c2b8c7fa5ec23712a4446e8adb1c9a336a Reviewed-on: https://gem5-review.googlesource.com/8291 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12725:3dcb96899659 |
03-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move cache bypass mechanism to the ports
Cache bypass is necessary for cpu models like the KvmCPU. Previously the bypass would happen at the cache classes. With this change the bypassing happens directly at the ports.
Change-Id: I34de9fc63383aee8590643e169501ea6060d2d62 Reviewed-on: https://gem5-review.googlesource.com/10432 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12724:4f6fac3191d2 |
02-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Adopt a more sensible cache class hierarchy
This patch changes what goes into the BaseCache and what goes into the Cache, to make it easier to add a NoncoherentCache with as much re-use as possible. A number of redundant members and definitions are also removed in the process.
This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com>
Change-Id: Ie9dd73c4ec07732e778e7416b712dad8b4bd5d4b Reviewed-on: https://gem5-review.googlesource.com/10431 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12723:530dc4bf1a00 |
04-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add helper function to perform evictions
Change-Id: I2df24eb1a8516220bec9b685c8c09bf55be18681 Reviewed-on: https://gem5-review.googlesource.com/10430 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12722:d84f756891fe |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Delegate block invalidation to block allocation
For a block replacement we first select a victim block, we invalidate it and then populate it with the new information. Prior to this change BaseTags::insertBlock() did the invalidation and filled in the block with the new information. Now that the replacements stat is moved to the BaseCache, insertBlock does not need to perform the invalidation and as a result we can unify the block eviction code in BaseCache.
Change-Id: I5bdf00b2dab2752ed2137ab7201ed1dc451333b3 Reviewed-on: https://gem5-review.googlesource.com/10429 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12721:7f611e9412f0 |
04-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the recvAtomic function
The recvAtomic function in the cache handles atomic requests. Over time, recvAtomic has grown in complexity and code size. This change factors out some of its functionality into a separate function. The new function handles atomic requests that miss.
Change-Id: If77d2de1e3e802e1da37f889f68910e700c59209 Reviewed-on: https://gem5-review.googlesource.com/10425 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12720:8db2ee0c2cf6 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the cache recvTimingReq function
The recvTimingReq function in the cache handles timing requests. Over time, recvTimingReq has grown in complexity and code size. This change factors out some of its functionality in two separate functions. The new functions handle timing requests that hit and timing requests that miss separately.
Change-Id: I09902d648d7272f0f9ec2851fa6376f7305ba418 Reviewed-on: https://gem5-review.googlesource.com/10424 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12719:68a20fbd07a6 |
01-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the cache recvTimingResp function
The recvTimingResp function in the cache handles timing responses. Over time, recvTimingResp has grown in complexity and code size. This change factors out some of its functionality to a separate function. The new function iterates through the in-service targets and handles them accordingly.
Change-Id: I0ef28288640f6be1b30452b0664d32432e692ea6 Reviewed-on: https://gem5-review.googlesource.com/10423 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12718:abad79926b86 |
31-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix RandomReplData
Random replacement policy's data was being instantiated with the incorrect class.
Change-Id: Ib573a6b5a63868d6069997c6279bec3b10c6b9b9 Reviewed-on: https://gem5-review.googlesource.com/10623 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12715:0c8b4f376378 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Determine if an MSHR has requests from another cache
To decide whether we allocate upon receiving a response we need to determine if any of the currently serviced requests (non-deferred targets) is coming from another cache. This change adds support for tracking this information in the MSHR.
Change-Id: If1db93c12b6af5813b91b9d6b6e5e196d327f038 Reviewed-on: https://gem5-review.googlesource.com/10422 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12704:4d2bcc64d469 |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move reference count stats update to blk invalidation
The tags in the cache keep track of the number of references to the blocks as well as the average number of references between an insertion and the next invalidation. Previously the stats were updated only on block insertion and invalidations were ignored. This change moves the update of the counters to the block invalidation function.
Change-Id: Ie7672c13813ec278a65232694024d2e5e17c4612 Reviewed-on: https://gem5-review.googlesource.com/10428 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12703:2d0e4d2d76b3 |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove isTouched field from the CacheBlk
At the moment isTouched is used in the warm-up detection mechanism but it keeps track of the same information as isValid(). This change removes it and substitutes its use by isValid().
Change-Id: I611ddf2fa4562ae3b3b2ed2fb74d26abd2e5ec62 Reviewed-on: https://gem5-review.googlesource.com/10427 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12702:27cb33a96e0f |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move replacements stat to the base cache class
Change-Id: I25dbcfcddfe1c422a76eb1af3f726c1360d8d110 Reviewed-on: https://gem5-review.googlesource.com/10426 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12700:c44381b89f9e |
30-Apr-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Simplify writeback for the tempBlock in recvTimingResp
When we use the tempBlock to fill-in, we have to write it back and invalidate it at the end of current transaction. This patch simplifies the writeback flow by treating it as a regular writeback.
Change-Id: I257be7bbff211e2832ad001a4e991daf67704485 Reviewed-on: https://gem5-review.googlesource.com/10421 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12691:8e1371fde4be |
13-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create block insertion function
Create a block insertion function to be used when inserting blocks. This resets the number of references to 1 (the insertion is taken into account), sets the insertion tick, and sets the secure state.
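A minimal sketch of such an insertion function, assuming a simplified block type. The field names (`tag`, `secure`, `refCount`, `tickInserted`) and the `Tick` alias are illustrative choices for this example and are not taken from the gem5 sources.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;  // assumption: simulated time as a 64-bit count

// Hypothetical block; field names are illustrative only.
struct Blk {
    std::uint64_t tag = 0;
    bool secure = false;
    unsigned refCount = 0;
    Tick tickInserted = 0;

    // Gather all insertion-time bookkeeping in one place.
    void insert(std::uint64_t newTag, bool isSecure, Tick now) {
        tag = newTag;
        secure = isSecure;   // secure state is set as part of insertion
        tickInserted = now;  // remember when the block entered the cache
        refCount = 1;        // the insertion itself counts as a reference
    }
};
```

Keeping these updates together means callers cannot forget one of them, which is the kind of mistake a separate secure-bit update invites.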
Change-Id: Ifc34cbbd1c125207ce47912d188809221c7a157e Reviewed-on: https://gem5-review.googlesource.com/9824 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12685:dcf85db6ec5c |
23-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Second-Chance replacement policy
Implementation of a Second-Chance replacement policy. Similar to FIFO, but every block is given a second chance if it has been touched.
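The idea can be sketched with a queue and one "touched" bit per entry. This is the textbook formulation under assumed names (`SecondChance`, `Entry`, `evict`); the actual policy class in the changeset may be organised differently.

```cpp
#include <cassert>
#include <deque>

// Hypothetical entry: an id plus the "touched since insertion" bit.
struct Entry {
    int id;
    bool touched;
};

// The queue front is the oldest entry, as in FIFO.
class SecondChance {
  public:
    void insert(int id) { fifo.push_back({id, false}); }

    void touch(int id) {
        for (Entry &e : fifo)
            if (e.id == id) e.touched = true;
    }

    // Pop the victim. A touched entry is re-queued at the back with its
    // bit cleared instead of being evicted, so the loop ends after at
    // most one pass. Precondition: the queue is not empty.
    int evict() {
        assert(!fifo.empty());
        while (fifo.front().touched) {
            Entry e = fifo.front();
            fifo.pop_front();
            e.touched = false;
            fifo.push_back(e);
        }
        int victim = fifo.front().id;
        fifo.pop_front();
        return victim;
    }

  private:
    std::deque<Entry> fifo;
};
```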
Change-Id: Id4d52b698d0045a4914a4d848fdf9c3c00a28508 Reviewed-on: https://gem5-review.googlesource.com/9441 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12684:44ebd2bc020f |
27-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: ReplacementPolicy specific replacement data
Replacement data is specific for each replacement policy, and thus should be instantiated differently by each policy.
Touch() and reset() do not need to be aware of CacheBlk, as they only update its ReplacementData.
Invalidate() makes replacement policies independent of cache blocks, by removing the awareness of the valid state.
An inheritable base ReplaceableEntry class was created to allow usage of replacement policies with any table-like structure.
Change-Id: I998917d800fa48504ed95abffa2f1b7bfd68522b Reviewed-on: https://gem5-review.googlesource.com/9421 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12680:91f4d6668b4f |
04-Apr-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
sim,cpu,mem,arch: Introduced MasterInfo data structure
With this patch a gem5 System will store more info about its Masters. While it was previously keeping track of the Master name and Master ID only, it is now adding a per-Master pointer to the SimObject related to the Master. This will make it possible for a client to query a System for a Master using either the master's name or the master's pointer.
Change-Id: I8b97d328a65cd06f329e2cdd3679451c17d2b8f6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9781 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12679:6c416cb3ca06 |
25-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use block iteration in BaseSetAssoc
Use block iteration instead of numSets and assoc in print(), cleanupRefs() and computeStats().
This makes these functions rely solely on what they are used for: printing and calculating stats of blocks. With the addition of Sectors an extra indirection level is added, and thus these functions would be skipping blocks.
Change-Id: I0006f82736cce02ba3e501ffafe9236f748daf32 Reviewed-on: https://gem5-review.googlesource.com/10143 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12677:dd0af90f2e05 |
18-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use findBlock in FALRU's block access
An access must perform a block search, which is done by findBlock.
The tagHash is indexed by tags, so use extractTag instead of re-implementing its functionality.
Change-Id: Ib5abacbc65cddf0f2d7e4440eb5355b56998a585 Reviewed-on: https://gem5-review.googlesource.com/10082 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12676:d0a1f557c156 |
19-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use secure flag in FALRU's findBlock
FALRU's findBlock() must use the secure flag to assure proper functionality.
Change-Id: I54e9fbd3c9093b3e8043c4c6c850b74a8f1f5ec0 Reviewed-on: https://gem5-review.googlesource.com/10081 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12672:4e1c5ce90fcd |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create NRU Replacement Policy
Implementation of a Not Recently Used replacement policy.
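A minimal sketch of the textbook NRU scheme, with one reference bit per way. The `NRUSet` class and its method names are assumptions made for this example; the policy added by the changeset may handle the bits differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reference bit per way; true means "recently used".
class NRUSet {
  public:
    explicit NRUSet(std::size_t ways) : used(ways, false) {}

    void touch(std::size_t way) { used[way] = true; }

    // Pick the first way whose bit is clear. If every way was recently
    // used, clear all bits and search again.
    std::size_t victim() {
        for (int pass = 0; pass < 2; ++pass) {
            for (std::size_t w = 0; w < used.size(); ++w)
                if (!used[w]) return w;
            used.assign(used.size(), false);
        }
        return 0;  // only reachable for a zero-way set
    }

  private:
    std::vector<bool> used;
};
```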
Change-Id: I24ab3a6f1db6dcb756b869cfebb5c4bc544170e8 Reviewed-on: https://gem5-review.googlesource.com/9001 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12665:4ca9fc117b95 |
12-Apr-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Revamp multiple size tracking for FALRU caches
This change fixes a few bugs and refactors the mechanism by which caches that use the FALRU tags can output statistics for multiple cache sizes ranging from the minimum cache of interest up to the actual configured cache size.
Change-Id: Ibea029cf275a8c068c26eceeb06c761fc53aede2 Reviewed-on: https://gem5-review.googlesource.com/9826 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12648:78941f188bb3 |
27-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add MoveToTail to FALRU
FALRU was missing MoveToTail functionality within its invalidate function, and MoveToHead was doing unnecessary passes when the moved block was the head already.
Besides, added some comments to make the code understandable.
Change-Id: I2430d82b5d53c88b102a62610ea38b46d6e03a55 Reviewed-on: https://gem5-review.googlesource.com/9541 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12637:bfc3cb9c7e6c |
30-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Remove unused 'using namespace'
Removal of unused/barely used 'using namespace' from C++ files.
Change-Id: I66dc548c04506db2e41180b9ea7ab5abd7d5375a Reviewed-on: https://gem5-review.googlesource.com/9601 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12636:9859213e2662 |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move insertBlock functionality in FALRU
Block insertion is being done in the getCandidates function, while the insertBlock function does not do anything.
Besides, BaseTags' stats weren't being updated.
Change-Id: Iadab9c1ea61519214f66fa24c4b91c4fc95604c0 Reviewed-on: https://gem5-review.googlesource.com/8882 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12635:3abc52e4b4f3 |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create LIP Replacement Policy
Implementation of a LRU Insertion Policy replacement policy.
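LIP behaves like LRU on hits but places new blocks at the LRU position, so a block that is never reused is the next one to leave. The sketch below uses per-way timestamps; the `LIPSet` class and the choice of 0 as the "oldest" timestamp are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class LIPSet {
  public:
    explicit LIPSet(std::size_t ways) : lastTouch(ways, 0) {}

    // New blocks enter at the LRU position: timestamp 0 marks them oldest.
    void reset(std::size_t way) { lastTouch[way] = 0; }

    // A hit promotes the block to MRU, exactly as in LRU.
    void touch(std::size_t way) { lastTouch[way] = ++clock; }

    // The victim is the way with the smallest timestamp
    // (lowest index on ties).
    std::size_t victim() const {
        std::size_t v = 0;
        for (std::size_t w = 1; w < lastTouch.size(); ++w)
            if (lastTouch[w] < lastTouch[v]) v = w;
        return v;
    }

  private:
    std::vector<std::uint64_t> lastTouch;
    std::uint64_t clock = 0;
};
```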
Change-Id: I1a9aa0091ff2cdc1b1652c1d5ec7a3b33fba5b44 Reviewed-on: https://gem5-review.googlesource.com/9002 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12634:e69074a3c9b9 |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BIP Replacement Policy
Implementation of a Bimodal Insertion Policy replacement policy.
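BIP usually inserts at the LRU position and only occasionally at the MRU position. A minimal sketch, assuming the chance of an MRU insertion is given as a percentage: the `BIPSet` class, the `btpPercent` parameter and the fixed seed are hypothetical choices for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

class BIPSet {
  public:
    // btpPercent: chance (0..100) that a new block is inserted as MRU.
    BIPSet(std::size_t ways, unsigned btpPercent)
        : lastTouch(ways, 0), btp(btpPercent), rng(42) {}

    void touch(std::size_t way) { lastTouch[way] = ++clock; }

    void reset(std::size_t way) {
        std::uniform_int_distribution<unsigned> pct(1, 100);
        if (pct(rng) <= btp)
            lastTouch[way] = ++clock;  // rare case: insert as MRU
        else
            lastTouch[way] = 0;        // common case: insert as LRU
    }

    // The victim is the way with the smallest timestamp.
    std::size_t victim() const {
        std::size_t v = 0;
        for (std::size_t w = 1; w < lastTouch.size(); ++w)
            if (lastTouch[w] < lastTouch[v]) v = w;
        return v;
    }

  private:
    std::vector<std::uint64_t> lastTouch;
    std::uint64_t clock = 0;
    unsigned btp;
    std::mt19937 rng;
};
```

With a percentage of 0 the sketch always inserts at the LRU position, and with 100 it degenerates to plain LRU insertion.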
Change-Id: Ife058d0d4310dbcb35858348006189f0b2bf7c37 Reviewed-on: https://gem5-review.googlesource.com/9003 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12633:675cd1260b40 |
04-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use Packet functions to write data blocks
Instead of using raw memcpy, use the proper writer functions from the Packet class in Cache.
Fixed typos in comments of these functions.
Change-Id: I156a00989c6cbaa73763349006a37a18243d6ed4 Reviewed-on: https://gem5-review.googlesource.com/9661 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12630:2208bf99bffd |
05-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unused return value from the recvTimingReq func
The recvTimingReq function in the cache always returns true. This changeset removes the return value.
Change-Id: I00dddca65ee7224ecfa579ea5195c841dac02972 Reviewed-on: https://gem5-review.googlesource.com/8289 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12629:c17d4dc2379e |
22-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU data block seg fault
FALRU didn't initialize the blocks' data, causing seg faults. This patch does not make FALRU functional yet.
Change-Id: I10cbcf5afc3f8bc357eeb8b7cb46789dec47ba8b Reviewed-on: https://gem5-review.googlesource.com/9302 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12628:458d655f2abb |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create LFU replacement policy
Implementation of a Least Frequently Used replacement policy.
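A minimal LFU sketch with one reference counter per way. The `LFUSet` class is hypothetical, and starting the counter at 1 on insertion is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class LFUSet {
  public:
    explicit LFUSet(std::size_t ways) : refCount(ways, 0) {}

    void reset(std::size_t way) { refCount[way] = 1; }       // insertion
    void touch(std::size_t way) { ++refCount[way]; }         // hit
    void invalidate(std::size_t way) { refCount[way] = 0; }  // evict first

    // The victim is the way with the fewest references
    // (lowest index on ties).
    std::size_t victim() const {
        std::size_t v = 0;
        for (std::size_t w = 1; w < refCount.size(); ++w)
            if (refCount[w] < refCount[v]) v = w;
        return v;
    }

  private:
    std::vector<unsigned> refCount;
};
```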
Change-Id: I772afccd3a7955777e53d59341e922718db44e5c Reviewed-on: https://gem5-review.googlesource.com/8890 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12627:33d3bb6f19a5 |
12-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create RRIP Replacement Policy
Implementation of a Re-Reference Interval Prediction replacement policy.
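RRIP keeps a small re-reference prediction value (RRPV) per block and evicts a block predicted to be re-referenced in the distant future. The sketch below follows the static, hit-priority variant from the RRIP literature; the `RRIPSet` class and its parameters are assumptions, and the policy in the changeset may make different choices on hits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class RRIPSet {
  public:
    // bits: width of each RRPV counter (assumed 1 <= bits < 32).
    // All ways start at the maximum value, so empty ways are evicted first.
    RRIPSet(std::size_t ways, unsigned bits)
        : maxRRPV((1u << bits) - 1), rrpv(ways, (1u << bits) - 1) {}

    // Insertion predicts a long re-reference interval.
    void reset(std::size_t way) { rrpv[way] = maxRRPV - 1; }

    // A hit predicts a near-immediate re-reference.
    void touch(std::size_t way) { rrpv[way] = 0; }

    // Find a way with the maximum RRPV. If there is none, age every way
    // by one and search again. Precondition: at least one way.
    std::size_t victim() {
        assert(!rrpv.empty());
        while (true) {
            for (std::size_t w = 0; w < rrpv.size(); ++w)
                if (rrpv[w] == maxRRPV) return w;
            for (unsigned &v : rrpv) ++v;
        }
    }

  private:
    unsigned maxRRPV;
    std::vector<unsigned> rrpv;
};
```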
Change-Id: Iba716eb5df2bf2be156e765f889d94f6ad00c91b Reviewed-on: https://gem5-review.googlesource.com/8981 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12626:e161d7725d4b |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BRRIP replacement policy
Implementation of a Bimodal Re-Reference Interval Prediction replacement policy.
Change-Id: I25d4a59a60ef7ac496c66852e394fd6cbaf50912 Reviewed-on: https://gem5-review.googlesource.com/8891 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12613:40c18bb90501 |
23-Mar-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: fix missing overrides in repl policies
Change-Id: I67759a4532e8a46c1643d4c3a9c546ad6b565b81 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/9321 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12607:b1cc6815194e |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create FIFO replacement policy
Implementation of a First-In, First-Out replacement policy.
Change-Id: Id234ec9d29c092dd4516e609da14b8a75a96b5e4 Reviewed-on: https://gem5-review.googlesource.com/8888 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12606:3bb0c54096e8 |
23-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix MRU rebase
Rebase of MRU missed a const qualifier, introducing a compilation error.
Change-Id: Ia25aa30523613a1a87593a353abe439946656f63 Reviewed-on: https://gem5-review.googlesource.com/9301 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12601:21a10e7b578a |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create MRU replacement policy
Implementation of a Most Recently Used replacement policy.
Change-Id: Id52cb247ca25d4523dcc53490d113695dac6a3f1 Reviewed-on: https://gem5-review.googlesource.com/8889 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12600:e670dd17c8cf |
19-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Split array indexing and replacement policies.
Replacement policies (LRU, Random) are currently considered as array indexing methods, but have completely different functionalities:
- Array indexers determine the possible locations for block allocation. This information is used to generate replacement candidates when conflicts happen.
- Replacement policies determine which of the replacement candidates should be evicted to make room for new allocations.
For this reason, they were split into different classes. Advantages:
- Easier and more straightforward to implement other replacement policies (RRIP, LFU, ARC, ...)
- Allow easier future implementation of cache organization schemes
As now we can't assure the use of sets, the previous way to create a true LRU is not viable. Now a timestamp_bits parameter controls how many bits are dedicated for the timestamp, and a true LRU can be achieved through an infinite number of bits (although a few bits suffice in practice).
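The division of labour described in this entry can be sketched with two small classes: one that only answers "where may this address live?" and one that only answers "which candidate should go?". All names here (`SetAssocIndexing`, `LRUPolicy`, `getCandidates`, `getVictim`) are hypothetical and chosen for the example, and the timestamp is an unbounded counter rather than a fixed number of bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical replaceable entry carrying only replacement state.
struct Entry {
    std::uint64_t lastTouch = 0;
};

// Indexing side: which entries may hold a given address?
class SetAssocIndexing {
  public:
    SetAssocIndexing(std::size_t sets, std::size_t ways, std::size_t blk)
        : numSets(sets), assoc(ways), blkSize(blk), entries(sets * ways) {}

    std::vector<Entry *> getCandidates(std::uint64_t addr) {
        std::size_t set = (addr / blkSize) % numSets;
        std::vector<Entry *> cands;
        for (std::size_t way = 0; way < assoc; ++way)
            cands.push_back(&entries[set * assoc + way]);
        return cands;
    }

  private:
    std::size_t numSets, assoc, blkSize;
    std::vector<Entry> entries;
};

// Replacement side: which of the candidates should be evicted?
class LRUPolicy {
  public:
    void touch(Entry &e) { e.lastTouch = ++clock; }

    // Precondition: cands is not empty.
    Entry *getVictim(const std::vector<Entry *> &cands) const {
        assert(!cands.empty());
        Entry *victim = cands.front();
        for (Entry *e : cands)
            if (e->lastTouch < victim->lastTouch) victim = e;
        return victim;
    }

  private:
    std::uint64_t clock = 0;
};
```

The policy never sees sets or ways, only a list of candidates, so a different indexing scheme could be swapped in without touching it.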
Change-Id: I23750db121f1474d17831137e6ff618beb2b3eda Reviewed-on: https://gem5-review.googlesource.com/8501 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12599:43ade6cf92b7 |
12-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Allow clean operations when block allocation fails
Block allocation can fail when there is an in-service MSHR that operates on the victim block. This can happen due to:
* an upgrade operation: a request that needs a writable copy of the block finds a shared (non-writable) copy of the block in the cache and has allocated an MSHR for the pending upgrade operation, or
* a clean operation: a clean request finds a dirty copy of the block and allocates an MSHR for the pending clean operation.
This change relaxes an assertion to allow for the second case (cache clean operations).
Change-Id: Ib51482160b5f2b3702ed744b0eac2029d34bc9d4 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9021 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12574:22936e2eb2da |
06-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use CacheBlk parameter on address regeneration
Skewed caches need to know the way to regenerate a block address.
Change-Id: I62c61ac9509eff2f37bad36862751956db7a6e40 Reviewed-on: https://gem5-review.googlesource.com/8782 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12573:b69e74b5baba |
08-Mar-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: Fix missing overrides
clang doesn't like inconsistent overrides. Add override to all overridden functions in lru.hh
Change-Id: I100ff4a7d90757439afee879ff9838c15f5c0b1d Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/8861 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12567:fef8623b1796 |
28-Jun-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Make the block invalidate functions virtual
This change makes the cache block invalidation functions in the BaseTags and CacheBlk classes virtual so that derived classes can override them.
Change-Id: I2e64b01c6ca637f16d10474fc8b08eeec3f23453 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8287 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12566:d6d48df9bf0f |
31-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Make invalidate a common function between tag classes
invalidate was defined as a separate function in the base associative and fully-associative tags classes although both functions should implement identical functionality. This patch moves the invalidate function into the base tags class.
Change-Id: I206ee969b00ab9e05873c6d87531474fcd712907 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8286 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12565:950ef64cb0a8 |
17-Jan-2018 |
Xiaoyu Ma <xiaoyuma@google.com> |
mem-cache: Allow prefetchers to override setCache.
This lets them hook setCache, perhaps to set up additional state based on the set cache.
Change-Id: Ic3b34fa43d052c71e8ef733a57fe47c70899cd27 Reviewed-on: https://gem5-review.googlesource.com/8701 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12557:16b682f1d8a2 |
05-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix bug generated by 8282
Merge 1ae7fced4d32898531a6875a339ef00e43e20e66 generated a bug in tagsInUse calculation.
Change-Id: I079e327a0a26a7968f2ed8e433dd6e790c80998b Reviewed-on: https://gem5-review.googlesource.com/8781 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12556:522b57ee9abf |
07-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Populate whenReady for blocks filled from writebacks
Writebacks write data to either an existing block or a newly allocated block. In either case we need to populate the whenReady field of the block which will determine when the new value can be used.
Change-Id: I5788fad0b8086a1be96714639bf6a9470b334926 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8285 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12555:4ecdaa830686 |
05-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use findBlock() in accessBlock()
Use placement policy specific block search within generic access.
Change-Id: I6070035e6e00595bcf073d4011f78a55ba7e7a8a Reviewed-on: https://gem5-review.googlesource.com/8721 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12554:86264baddf36 |
02-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove redundant block initialization on allocation
Change-Id: I7496e12e6a517529316c480d5f6e2ade601f0e2d Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8282 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12553:514f2e4fb751 |
31-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove mumBlock redundant initialization from FALRU
Change-Id: Id3afec0a62446d6d0f44ccb655032343037637e0 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8281 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12552:5615a3de961f |
22-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Populate the secure bit when the temp block is filled
The secure bit should be set when we fill a block with data from a secure location, as indicated by the packet that triggers the fill. This patch fixes a bug in which the cache wouldn't populate the secure bit when filling the temp block.
Change-Id: I95c706146449804ff42b205b25dd79750f3e882a Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8284 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12551:a5016c69f510 |
02-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unnecessary block initialization on writeback
Change-Id: Ia9b825bcbb8d326705f74c15a93a88703153ba5a Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8283 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12549:d3e5cfe631fc |
27-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra block init in BaseSetAssoc
Removed extra initialization of cache blocks just after they have been created and organized the comments.
Change-Id: I75c1beaf0489e3e530fd8cbff2739dc7593e3e6f Reviewed-on: https://gem5-review.googlesource.com/8661 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12548:285f1792a2da |
26-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Vectorize C arrays in BaseSetAssoc.
Transform BaseSetAssoc's arrays into C++ vectors to avoid unnecessary resource management.
Change-Id: I656f42f29e5f9589eba491b410ca1df5a64f2f34 Reviewed-on: https://gem5-review.googlesource.com/8621 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12545:13eaf39f933b |
23-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix CacheSet memory leak
CacheSet blocks were being allocated but never freed. Used vector to avoid using pure C array.
Change-Id: I6f32fa5a305ff4e1d7602535026c1396764102ed Reviewed-on: https://gem5-review.googlesource.com/8603 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12513:4dfc54394b5a |
07-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make cache warmup percentage a parameter.
The warmupPercentage is the percentage of different tags (based on the cache size) that need to be touched in order to warm up the cache. If Warmup failed (i.e., not enough tags were touched), warmup_cycle = 0.
The warmup is not being taken into account to calculate the stats (i.e., stats acquisition starts before cache is warmed up). Maybe in the future this functionality should be added.
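The warm-up rule can be sketched as a threshold on the number of distinct tags touched. The `WarmupTracker` class, its method names and the integer rounding are assumptions for this example; only the meaning of the percentage and the "0 if warm-up never completes" convention come from the description above.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;  // assumption: simulated time as a 64-bit count

class WarmupTracker {
  public:
    // bound = number of tags that must be touched to call the cache warm.
    WarmupTracker(std::uint64_t numBlocks, unsigned warmupPercentage)
        : bound(numBlocks * warmupPercentage / 100) {}

    // Call once for every newly touched tag.
    void tagTouched(Tick now) {
        ++touched;
        if (!warmedUp && touched >= bound) {
            warmedUp = true;
            cycle = now;  // record when the threshold was first reached
        }
    }

    // Stays 0 if the threshold was never reached.
    Tick warmupCycle() const { return cycle; }

  private:
    std::uint64_t bound;
    std::uint64_t touched = 0;
    bool warmedUp = false;
    Tick cycle = 0;
};
```

For example, with 8 blocks and a percentage of 50, the fourth touched tag marks the cache as warmed up.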
Change-Id: I2b93a99c19fddb99a4c60e6d4293fa355744d05e Reviewed-on: https://gem5-review.googlesource.com/8061 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12501:42537a80ef17 |
19-Dec-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Only pendingModified MSHRs can satisfy CMO snoops
We set the satisfied flag when a cache clean request encounters: 1) a block with the dirty bit set, or 2) a pending modified MSHR, which means that the cache will get a copy of the block that will soon be modified.
This changeset fixes a previous bug that set the satisfied flag on snooping MSHR hits even when the pendingModified flag was not set.
Change-Id: I4968c4820997be5cc1238148eea12a1ba39837d4 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7822 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12500:a91cf4e8b6a4 |
14-Dec-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Cleaned blocks should be marked as not writable
A writeclean packet writes a dirty block to the memory below and therefore sets the dirty flag for the block when the memory below is a cache. If the block was also marked as writable it can satisfy future write requests without further requests/snoops. This can lead to multiple copies of the same block marked as dirty which is not allowed. This changeset clears the writable flag from the cleaned block to prevent the cache from satisfying future write requests without sending a downstream request.
Change-Id: I14d3c62fd33f81b1a8ba62374c8565ccab00a6fe Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7821 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12493:a1cf71a6de73 |
06-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra numSets zero check.
numSets is unsigned, so it cannot be lower than 0. Besides, isPowerOf2(0) is false by definition (and implementation*), so there is no need for the double check.
* As presented in base/intmath.hh
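For reference, a common way to write such a check is shown below. This is a generic bit-trick formulation under the name `isPowerOf2Sketch` (a name chosen for this example); it is not a copy of the code in base/intmath.hh.

```cpp
#include <cassert>
#include <cstdint>

// Zero is rejected explicitly: 0 & (0 - 1) evaluates to 0, so without
// the n != 0 test zero would be misreported as a power of two.
inline bool
isPowerOf2Sketch(std::uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

Because the zero case is already covered inside the function, a separate `numSets > 0` test in front of it adds nothing.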
Change-Id: I3f6296694a937434feddc7ed21f11c2a6fdfc5a9 Reviewed-on: https://gem5-review.googlesource.com/7901 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12492:4e76959883a6 |
05-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Standardize mem folder header guards
Standardize all header guards in the mem directory according to the most frequent patterns. In general they have the form:
mem: __FOLDER_TREE_FILE_NAME_HH__
ruby: __FOLDER_TREE_FILENAME_HH__
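The two patterns can be illustrated with a small helper that maps a source-relative path to a guard name. The function `headerGuardFor` and the paths used with it are hypothetical; the sketch only assumes that file names in mem use underscores between words while ruby file names do not.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: derive a guard name from a path relative to src/.
std::string
headerGuardFor(const std::string &relPath)
{
    std::string guard = "__";
    for (char c : relPath) {
        if (std::isalnum(static_cast<unsigned char>(c))) {
            guard += static_cast<char>(
                std::toupper(static_cast<unsigned char>(c)));
        } else {
            guard += '_'; // '/', '.', and '_' all map to '_'
        }
    }
    guard += "__";
    return guard;
}
```

For example, an illustrative path `mem/some_dir/file_name.hh` yields `__MEM_SOME_DIR_FILE_NAME_HH__`, and `mem/ruby/some_dir/FileName.hh` yields `__MEM_RUBY_SOME_DIR_FILENAME_HH__`.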
Change-Id: I983853e292deb302becf151bf0e970057dc24774 Reviewed-on: https://gem5-review.googlesource.com/7881 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12425:7f8c9032b18c |
04-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Prune unnecessary writebacks in exclusive caches
Exclusive caches use the tempBlock to fill for responses from a downstream cache. The reason for this is that they only pass the block to the cache above without keeping a copy. When all requests are serviced the block is immediately invalidated unless it is dirty, in which case it has to be written back to the memory below.
To avoid unnecessary writebacks, this changeset forces mostly exclusive caches to issue requests that can only fetch clean data when possible.
Reported-by: Quereshi Muhammad Avais <avais@kaist.ac.kr>
Change-Id: I01b377563f5aa3e12d22f425a04db7c023071849 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5061 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12351:17eaa27bef22 |
21-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Co-ordination of CMOs in the xbar
A clean packet request serving a cache maintenance operation (CMO) visits all memories down to the specified xbar. The visited caches invalidate their copy (if the CMO is invalidating) and if a dirty copy is found a write packet writes the dirty data to the memory level below the specified xbar. A response is sent back when all the caches are clean and/or invalidated and the specified xbar has seen the write packet.
This patch adds the following functionality in the xbar: 1) Accounts for the cache clean requests that go through the xbar 2) Generates the cache clean response when both the cache clean request and the corresponding writeclean packet have crossed the destination xbar.
Previously transactions in the xbar were identified using the pointer of the original request. Cache clean transactions consist of two different packets, the clean request and the writeclean, and therefore have different request pointers. This patch adds support for custom transaction IDs that by default take the value of the request pointer but can be overridden by the constructor. This allows the clean request and writeclean to share the same id, which the coherent xbar uses to co-ordinate them and send the response in a timely manner.
Change-Id: I80db76386a1caded38dc66e6e18f930c3bb800ff Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5051 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12350:811452f255d5 |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for handling CMOs in the MSHRs
To add support for cache maintenance operations (CMOs) in the MSHRs, this change adds the following functionality:
- If a CMO request hits in the MSHRs, we defer it, as we can't coalesce it with any other requests.
- When we promote any deferred targets, we promote them in order and stop if we encounter a CMO request. If the CMO request is at the beginning of the deferred targets list it will be the only promoted target.
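The promotion rule can be sketched as follows. The `Target` record and the `promoteDeferred` function are simplified stand-ins invented for this example; the real MSHR target lists carry considerably more state.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical, simplified target record.
struct Target
{
    int id;
    bool isCMO; // true for a cache maintenance operation
};

// Move deferred targets to the active list in order. A CMO cannot be
// coalesced with other requests, so promotion stops when one is reached;
// a CMO at the head of the list is promoted on its own.
std::vector<Target>
promoteDeferred(std::deque<Target> &deferred)
{
    std::vector<Target> promoted;
    while (!deferred.empty()) {
        const Target next = deferred.front();
        if (next.isCMO) {
            if (promoted.empty()) {
                promoted.push_back(next);
                deferred.pop_front();
            }
            break;
        }
        promoted.push_back(next);
        deferred.pop_front();
    }
    return promoted;
}
```

Given deferred targets 1, 2, CMO 3, 4, the first call promotes 1 and 2, the second promotes only the CMO, and target 4 waits for a third call.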
Change-Id: I10d1f7e16bd6d522d917279c5d408a3f0cee4286 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5050 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12349:47f454120200 |
01-Jun-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for CMOs in the cache
This change adds support for cache maintenance operations (CMOs) in the cache. The supported memory operations clean and/or invalidate a cache block as specified by its VA to the specified xbar (PoU, PoC).
A cache maintenance packet visits all memories down to the specified xbar. Caches need to invalidate their copy if it is an invalidating CMO. If it is (additionally) a cleaning CMO and a dirty copy exists, the cache cleans it with a WriteClean request.
Change-Id: Ibf31daa7213925898f3408738b11b1dd76c90b79 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5049 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12348:bef2d9d3c353 |
07-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Promote deferred targets only when the block is valid
When a response indicates that there are no other sharers of the block, the cache can promote its copy of the block to writable and potentially service deferred targets even if the request didn't ask for a writable copy.
Previously, a response would guarantee the presence of the block in the cache. A response could either be filling, upgrading or a response to an invalidation due to a pending whole line write. Responses to cache maintenance invalidations break this assumption. This change adds an extra check to make sure that the block was already valid or that the response is filling before promoting the block.
Change-Id: I6839f683a05d4dad4205c23f365a925b7b05e366 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5048 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12346:9b1144d046ca |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Support for specifying the destination of a WriteClean
Previously, WriteClean packets would always write to the first memory below unless the memory was unable to allocate in which case it would be forwarded further below.
This change adds support for specifying the destination of a WriteClean packet. The cache annotates the request with the specified destination and marks the packet as write-through upon its creation. The coherent xbar checks packets for their destination and resets the write-through flag when necessary e.g., the coherent xbar that is set as the PoC will reset the write-through flag for packets to the PoC.
Change-Id: I84b653f5cb6e46e97e09508649a3725d72d94606 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5046 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12345:70c783a93195 |
31-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for WriteClean packets in the memory system
This change adds support for creating and handling WriteClean packets. The WriteClean operation is almost identical to a WritebackDirty with the exception that the cache generating a WriteClean retains a copy of the block.
Change-Id: I63c8de62919fad0f9547d412f8266aa4292ebecd Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5045 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12343:51ae6d08466f |
29-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add support for checking whether a cache is busy
This changeset adds support for checking whether the cache is currently busy and a timing request would be rejected.
Change-Id: I5e37b011b2387b1fa1c9e687b9be545f06ffb5f5 Reviewed-on: https://gem5-review.googlesource.com/5042 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12334:e0ab29a34764 |
30-Nov-2017 |
Gabe Black <gabeblack@google.com> |
misc: Rename misc.(hh|cc) to logging.(hh|cc)
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.).
Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12167:24eb63c709c2 |
02-Aug-2017 |
Pau Cabre <pau.cabre@metempsy.com> |
mem-cache: Delete squashed HWPrefetches
Request and Packet for squashed HWPrefetches were not deleted
Change-Id: I9b66bb01b8ed6a5ddfaaa8739a68165dc4a7006c Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/4340 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12091:f2d1af96ad2d |
13-Jun-2017 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem-cache: Add missing overrides to BaseCache
Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3880 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12084:5a3769ff3d55 |
07-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
mem: Replace EventWrapper use with EventFunctionWrapper
NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s event names to not properly match the rank they were generated by. This could occur if the public rank member is modified after the Rank's construction. A patch would mean refactoring Rank and `DRAMCtrl` to privatize many of the members of Rank behind getters.
Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3745 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
11893:3033b3e6a32a |
30-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Make blkAlign a common function between all tag classes
blkAlign was defined as a separate function in the base associative and fully-associative tags classes although both functions implemented identical functionality. This patch moves blkAlign into the base tags class.
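Block alignment of this kind is typically a single mask operation. The sketch below, with the illustrative name `blkAlignSketch`, assumes the block size is a power of two; it is not copied from the tags classes.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Clear the block-offset bits; blkSize is assumed to be a power of two.
inline Addr
blkAlignSketch(Addr addr, Addr blkSize)
{
    assert(blkSize != 0 && (blkSize & (blkSize - 1)) == 0);
    return addr & ~(blkSize - 1);
}
```

With 64-byte blocks, address 0x1234 aligns to 0x1200.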
Change-Id: I3d415d0e62bddeec7ce0d559667e40a8c5fdc2d4 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11892:c7ea349e1cd3 |
26-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Use pkt::getBlockAddr instead of BaseCache::blockAlign
Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11872:ba90ffa751b6 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused size field from the CacheBlk class
Change-Id: I6149290d6d2ac1a4bd6165871c93d7b7d6a980ad Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11871:474ac613d0d7 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove the unused asid field from the CacheBlk class
Change-Id: I29f45733c5fad822bdd0d8dcc7939d86b2e8c97b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11870:b470020b29de |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused arguments (asid/context_id) from accessBlock
Change-Id: I79c2662fc81630ab321db8a75be6cd15fa07d372 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11869:aa9d04c7e3bb |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused type BlkList from the cache and the tags
Change-Id: If9ebb8488e8db587482ecfa99d2c12cfe5734fb9 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11868:cc435f8f8b05 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused functions from the tag classes
Change-Id: I4f3c2c027b1acaaf791a4c71086f34a9b9fbf4df Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11867:1342b4dbc556 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Always use the helper function to invalidate a block
Policies like LRU need to be notified when a block is invalidated; the helper function does this along with invalidating the block.
Change-Id: I3ed59cf07938caa7f394ee6054b0af9e00b267ea Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11866:8732d8d0a9e5 |
21-Feb-2017 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Fix MSHR assert triggering for invalidated prefetches
This changeset updates an assert in src/mem/cache/mshr.cc which was erroneously catching invalidated prefetch requests. These requests can become invalidated if another component writes (an exclusive access) to this location during the time that the read request is in flight. The original assert made the assumption that these cases can only occur for reads generated by the CPU, and hence prefetcher-generated requests would sometimes trip the assert.
Change-Id: If4f043273a688c2bab8f7a641192a2b583e7b20e Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11865:608f8c34f549 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Populate the secure flag in the writeback visitor
Previously the writeback visitor would not consider and set the secure flag for the blocks that are written back to memory. This patch fixes this.
Change-Id: Ie1a425fa9211407a70a4343f2c6b3d073371378f Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11863:b47dda418ae6 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove stale argument from a panic statement
Change-Id: I7ae5fa44a937f641a2ddd242a49e0cd23f68b9f2 Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11859:76c36516e0ae |
19-Feb-2017 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Ensure draining is deterministic
The traversal of drainable objects could potentially be non-deterministic when using an unordered set containing object pointers. To ensure that the iteration is deterministic, we switch to a vector. Note that the lookup and traversal of the drainable objects is not performance critical, so the change has no negative consequences. |
11858:5869c83bc8c7 |
19-Feb-2017 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure deferred snoops are cache-line aligned
This patch fixes a bug where a deferred snoop ended up being to a partial cache line, and not cache-line aligned, all due to how we copy the packet. |
11830:79c3f6a60392 |
11-Feb-2017 |
Bjoern A. Zeeb <baz21@cam.ac.uk> |
mem: fix printing of 1st cache tags line
Rather than having the 1st line on the Log line and every other line on its own, add a new line to have a common format for all of them. Makes parsing a lot easier.
Reviewed at http://reviews.gem5.org/r/3808/
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11800:54436a1784dc |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 3/22] reduce include dependencies in some headers
Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include. |
11793:ef606668d247 |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes |
11751:cd6248b276a8 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Respond to InvalidateReq when the block is (pending) dirty
Previously when an InvalidateReq snooped a cache with a dirty block or a pending modified MSHR, it would invalidate the block or set the postInv flag. The cache would not send an InvalidateResp, though, causing memory order violations. This patch changes this behavior, making the cache with the dirty block or pending modified MSHR the ordering point.
Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11750:c15cc4d973ea |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Invalidate a blk when servicing the 1st invalidating target
Previously an MSHR with one or more invalidating targets would first service all targets in the MSHR TargetList and then invalidate the block. As a result, any serviced snooping target would look up the block in the cache and incorrectly find it. This patch forces the invalidation to happen when the first invalidating target is encountered.
Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11749:3b2cb95f48ed |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Allow non invalidating snoops on an InvalidateReq MSHR
This patch changes an assertion that previously assumed that a non invalidating snoop request should never be serviced by an InvalidateReq MSHR. The MSHR serves as the ordering point for the snooping packet. When the InvalidateResp reaches the cache the snooping packet snoops the caches above to find the requested block. One or more of the caches above will have the block since earlier it has seen a WriteLineReq.
Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11747:a6da15219f95 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Always use InvalidateReq to service WriteLineReq misses
Previously, a WriteLineReq that missed in a cache would send out an InvalidateReq if the block lookup failed or an UpgradeReq if the block lookup succeeded but the block had sharers. This change ensures that a WriteLineReq always sends an InvalidateReq to invalidate all copies of the block and satisfy the WriteLineReq.
Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11745:3102db8903f5 |
05-Dec-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure InvalidateReq is considered isForward by MSHRs
This patch fixes an issue where an MSHR would incorrectly be perceived to provide data to targets arriving after an InvalidateReq. To address this the InvalidateReq is now treated as isForward, much like an UpgradeReq that did not hit in the cache.
Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11744:5d33c6972dda |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Make packet debug printing more uniform
Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change.
Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11742:3dcf0b891749 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Service only the 1st FromCPU MSHR target on ReadRespWithInv
A response to a ReadReq can either be a ReadResp or a ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we assume that the response will be a ReadResp. When the response is invalidating (ReadRespWithInvalidate), servicing more than one target can potentially violate the memory ordering. This change fixes the way we handle a ReadRespWithInvalidate. When a cache receives a ReadRespWithInvalidate we service only the first FromCPU target and all the FromSnoop targets from the MSHR target list. The rest of the FromCPU targets are deferred and serviced by a new request.
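The servicing rule for an invalidating response can be sketched as a partition of the target list. The types `Target` and `Split` and the function `splitOnInvalidatingResp` are hypothetical simplifications for this example.

```cpp
#include <cassert>
#include <vector>

enum class Source { FromCPU, FromSnoop };

// Hypothetical, simplified target record.
struct Target
{
    int id;
    Source source;
};

struct Split
{
    std::vector<Target> serviced;
    std::vector<Target> deferred;
};

// On an invalidating response, service the first FromCPU target and every
// FromSnoop target; all later FromCPU targets are deferred to a new request.
Split
splitOnInvalidatingResp(const std::vector<Target> &targets)
{
    Split out;
    bool cpuServiced = false;
    for (const Target &t : targets) {
        if (t.source == Source::FromSnoop) {
            out.serviced.push_back(t);
        } else if (!cpuServiced) {
            out.serviced.push_back(t);
            cpuServiced = true;
        } else {
            out.deferred.push_back(t);
        }
    }
    return out;
}
```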
Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11741:72916416d2e2 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Keep track of allocOnFill in the TargetList
Previously the information of whether a response was allocating or not was a property of the MSHR. This change makes this flag a property of the TargetList. Different TargetLists, e.g. the targets and the deferred targets lists, might have different values. Additionally, the information about whether each of the targets expects an allocating response is stored inside the TargetList container. This allows for repopulating the flag in case some of the targets are removed.
Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11740:6e1cb0f750c0 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for repopulating the flags of an MSHR TargetList
This patch adds support for repopulating the flags of an MSHR TargetList. The added functionality makes it possible to remove targets from a TargetList without leaving it in an inconsistent state.
Change-Id: I3f7a8e97bfd3e2e49bebad056d11bbfb087aad91 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11722:f15f02d8c79e |
30-Nov-2016 |
Sophiane Senni <sophiane.senni@gmail.com> |
mem: Split the hit_latency into tag_latency and data_latency
If the cache access mode is parallel, i.e. the "sequential_access" parameter is set to "False", tags and data are accessed in parallel. Therefore, the hit_latency is the maximum of tag_latency and data_latency. On the other hand, if the cache access mode is sequential, i.e. the "sequential_access" parameter is set to "True", tags and data are accessed sequentially. Therefore, the hit_latency is the sum of tag_latency and data_latency.
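The relationship between the three latencies reduces to one expression. In this sketch `Cycles` is a plain integer alias and `hitLatency` is an illustrative function name, both chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Cycles = std::uint64_t; // simplified stand-in for a cycle count

// Parallel access: the slower of the two lookups dominates.
// Sequential access: the data lookup starts after the tag lookup.
inline Cycles
hitLatency(Cycles tagLatency, Cycles dataLatency, bool sequentialAccess)
{
    return sequentialAccess ? tagLatency + dataLatency
                            : std::max(tagLatency, dataLatency);
}
```

With a tag latency of 2 and a data latency of 4, parallel access gives 4 cycles and sequential access gives 6.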
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11610:3fb50f935a6a |
14-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Print an MSHR without triggering any assertions
Previously printing an MSHR would trigger an assertion if the MSHR was not in service or if the targets list was empty. This patch changes the print function to bypass the accessor functions for postInvalidate and postDowngrade and avoid the relevant assertions. It also checks if the targets list is empty before calling print on it.
Change-Id: Ic18bee6cb088f63976112eba40e89501237cfe62 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11602:7e0199f80816 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update mostly exclusive policy even further
This patch takes yet another step in maintaining the clusivity, in that it allows a mostly-inclusive cache to hold on to blocks even when responding to a ReadExReq or UpgradeReq. Previously the cache simply invalidated these blocks, but there is no strict need to do so.
The most important part of this patch is that we simply mark the block clean when satisfying the upstream request where the cache is allowed to keep the block. The only tricky part of the patch is in the memory management of deferred snoops, where we need to distinguish the cases where only the packet was copied (we expected to respond), and the cases where we created an entirely new packet and request (we kept it only to replay later).
The code in satisfyRequest is definitely ready for some refactoring after this.
Change-Id: I201ddc7b2582eaa46fb8cff0c7ad09e02d64b0fc Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11601:382e0637fae0 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update mostly exclusive cache policy to cover more cases
This patch changes how the mostly exclusive policy is enforced to ensure that we drop blocks when we should. As part of this change, the actual invalidation due to the clusivity enforcement is moved outside the hit handling, to a separate method maintainClusivity. For the timing mode that means we can deal with all MSHR targets before taking any action and possibly dropping the block. The method satisfyCpuSideRequest is also renamed satisfyRequest as part of this change (since we only ever see requests from the cpu-side port).
Change-Id: If6f3d1e0c3e7be9a67b72a55e4fc2ec4a90fd3d2 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11600:a38c3f9c82d1 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a FromCache packet attribute
This patch adds a FromCache attribute to the packet, and updates a number of the existing request commands to reflect that the request originates from a cache. The attribute simplifies checking if a request came from a cache or not, and this is used by both the cache and snoop filter in follow-on patches.
Change-Id: Ib0a7a080bbe4d6036ddd84b46fd45bc7eb41cd8f Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Steve Reinhardt <stever@gmail.com> |
11558:b921b96cbf74 |
11-Jul-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove stale argument from a DPRINTF in the cache code
Change-Id: I70dd11c23b45dfc606ef08233d2e50fcc0817505 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11522:348411ec525a |
06-Jun-2016 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
sim: Call regStats of base-class as well
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there.
Patch originally provided by Akash Bagdia at ARM Ltd. |
11493:06b73eb44660 |
26-May-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix memory leak in handling of deferred snoops
This patch fixes a memory leak where deferred snoop packets never got deallocated. On the call to MSHR::handleSnoop these snoops were treated as if a response will be sent, as the MSHR was pendingModified. Consequently, a copy of the packet was created and added to the MSHR targets. However, a preceding target to the same MSHR, originally from a CPU, was serviced before the snoop, and caused the block to be invalidated. This happens for ReadExReq and UpgradeReq.
Note that the original snoop will receive a response, just not from the cache in question, but instead from the cache upstream that issued the ReadExReq or UpgradeReq.
Change-Id: I4ac012fbc8a46cf693ca390fe9476105d444e6f4 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
11490:e03a6233d061 |
26-May-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not set cacheResponding on MSHR snoop if not responding
This patch changes the flow control for MSHR::handleSnoop to ensure that we only set cacheResponding on the snoop packet if we are actually responding. This avoids situations where a responder is stalling indefinitely on a response that never arrives.
Change-Id: I691dd01755b614b30203581aa74fc743b350eacc Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
11486:f09bb73b3050 |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: fix headers include order in the cache related classes
Change-Id: Ia57cc104978861ab342720654e408dbbfcbe4b69 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11485:8ca4fbefff3e |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: remove redundant check whether the cache forwards snoops
Change-Id: I57b56771086e1e2f512977fb7248d93c171ab925 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11484:08b33c52a16d |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: change NULL to nullptr in the cache related classes
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11483:d4c2e56d18b2 |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: fix the line length in the cache related classes
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11455:067177a1b578 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Include WriteLineReq in cache demand stats
Somehow the WriteLineReq were never added to the list of commands considered demand. |
11454:e55afadc4e19 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache stats
Prune cache stats that are never actually used. |
11453:dd9763792521 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deallocate all write-queue entries when sent
This patch removes the write-queue entry tracking previously used for uncacheable writes. The write-queue entry is now deallocated as soon as the packet is sent. As a result we also forego the stats for uncacheable writes. Additionally, there is no longer a need to attach the write-queue entry to the packet. |
11452:4bc3a0c0861c |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align downstream cache packet creation in atomic and timing
This patch makes the control flow more uniform in atomic and timing, ultimately making the code easier to understand. |
11439:d0368996f1e0 |
07-Apr-2016 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Add priority to QueuedPrefetcher
Queued prefetcher entries now have a priority field. The idea is to add packets ordered by priority and then by age.
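One way to keep a queue ordered by priority and then by age is a stable insertion, sketched below. The `QueuedPrefetch` record and `insertByPriority` function are hypothetical, and the sketch assumes that a larger number means higher priority and that older entries go first among equals.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified queue entry.
struct QueuedPrefetch
{
    std::uint64_t addr;
    int priority;
};

// Insert so the queue stays sorted by descending priority; among equal
// priorities the new (youngest) entry goes behind the older ones.
void
insertByPriority(std::list<QueuedPrefetch> &queue, const QueuedPrefetch &pf)
{
    auto it = queue.begin();
    while (it != queue.end() && it->priority >= pf.priority)
        ++it;
    queue.insert(it, pf);
}
```

If every entry uses priority 0, the loop always runs to the end and the queue degenerates to plain FIFO order.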
For the existing algorithms in which priority doesn't make sense, it is set to 0 for all deferred packets in the queue. |
11438:3c9fd319a982 |
07-Apr-2016 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Handful extra features for BasePrefetcher
Some common functionality added to the base prefetcher, mainly dealing with extracting the block address, page address, block index inside the page and some other information that can be inferred from the block address. This is used for some prefetching algorithms, and having the methods in the base, as well as the block size and other information is the sensible way. |
11436:f351b7f248db |
27-May-2015 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Add unused prefetch counter in caches
Added stat to the cache to account for HardPF'ed blocks that are evicted before being referenced (over-prefetching). |
11435:0f1b46dde3fa |
07-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit. |
11429:cf5af0cc3be4 |
06-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11428:20264eb69fbf |
05-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. |
11377:a06a4debe272 |
17-Mar-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Adjust cache queue reserve to more conservative values
The cache queue reserve is there as an overflow to give us enough headroom based on when we block the cache, and how many transactions we may already have accepted before actually blocking. The previous values were probably chosen to be "big enough", when we actually know that we check the MSHRs after every single allocation, and for the write buffers we know that we implicitly may need one entry for every outstanding MSHR. |
11375:f98df9231cdd |
17-Mar-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Create a separate class for the cache write buffer
This patch breaks out the cache write buffer into a separate class, without affecting any stats. The goal of the patch is to avoid encumbering the much-simpler write queue with the complex MSHR handling. In a follow on patch this simplification allows us to implement write combining.
The WriteQueue gets its own class, but shares a common ancestor, the generic Queue, with the MSHRQueue. |
11357:6668387fa488 |
10-Aug-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem, cpu: Add assertions to snoop invalidation logic
This patch adds assertions that enforce that only invalidating snoops will ever reach into the logic that tracks in-order load completion and also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds some comments to MSHR::replaceUpgrades(). |
11352:4e195fb9ec4f |
24-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure that InvalidateReq is not forwarded as ReadExReq
This patch fixes an issue where an InvalidateReq only traversed one level of the cache hierarchy, and was subsequently turned into a ReadExReq because it needs writable and the command was not checked for explicitly. |
11340:dc0ed2d4da50 |
15-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid using invalid iterator in cache lock list traversal
Fix up issue highlighted by Valgrind and the clang Address Sanitizer. |
11335:42961fda6d75 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Be less conservative in clearing load locks in the cache
Avoid being overly conservative in clearing load locks in the cache, and allow writes to the line if they are from the same context. This is in line with ALPHA and ARM. |
11334:9bd2e84abdca |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move the point of coherency to the coherent crossbar
This patch introduces the ability of making the coherent crossbar the point of coherency. If so, the crossbar does not forward packets where a cache with ownership has already committed to responding, and also does not forward any coherency-related packets that are not intended for a downstream memory controller. Thus, invalidations and upgrades are turned around in the crossbar, and the memory controller only sees normal reads and writes.
In addition this patch moves the express snoop promotion of a packet to the crossbar, thus allowing the downstream cache to check the express snoop flag (as it should) for bypassing any blocking, rather than relying on whether a cache is responding or not. |
11333:c41d552d6f2e |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align cache behaviour in atomic when upstream is responding
Adopt the same flow as in timing mode, where the caches on the path to memory get to keep the line (if present), and we use the responderHadWritable flag to determine if we need to forward the (invalidating) packet or not. |
11332:40bcb0e97de9 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align how snoops are handled when hitting writebacks
This patch unifies the snoop handling in case of hitting writebacks with how we handle snoops hitting in the tags. As a result, we end up using the same optimisation as the normal snoops, where we inform the downstream cache if we encounter a line in Modified (writable and dirty) state, which enables us to avoid sending out express snoops to invalidate any Shared copies of the line. A few regressions consequently change, as some transactions are sunk higher up in the cache hierarchy. |
11331:cd5c48db28e6 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and fewer parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implementation. |
11321:02e930db812d |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'. |
11288:57c340f947c7 |
31-Dec-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: add CacheVerbose debug flag, filter noisy DPRINTFs
Some of the DPRINTFs added to the classic cache in cset 45df88079f04, while useful to those unfamiliar with the cache code, end up being noise when you're familiar with the code but are trying to debug tricky protocol issues. (Particularly getting two messages from each cache as it receives a snoop request then declares that there was no match.)
This patch introduces a CacheVerbose debug flag, and moves a subset of the added DPRINTFs into that category, so that Cache by itself returns to being a more succinct summary of cache activity.
Also added a CacheAll compound flag to turn on all the cache-related debug flags (other than CacheTags, which you *really* have to want badly to turn it on, IMO). |
11286:2071db8f864b |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not allocate space for packet data if not needed
This patch looks at the request and response command to determine if either actually has any data payload, and if not, we do not allocate any space for packet data.
The only tricky case is where the command type is changed as part of the MSHR functionality. In these cases where the original packet had no data, but the new packet does, we need to explicitly call allocate(). |
11285:25715951a4b8 |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not alter cache block state on uncacheable snoops
This patch ensures we do not respond with a Modified (dirty and writable) line if the request is uncacheable, and that the cache responding retains the line without modifying the state (even if responding). |
11284:b3926db25371 |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make cache terminology easier to understand
This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions.
The following name changes are made:
* the packet memInhibit flag is renamed to cacheResponding
* the packet sharedAsserted flag is renamed to hasSharers
* the packet NeedsExclusive attribute is renamed to NeedsWritable
* the packet isSupplyExclusive is renamed responderHadWritable
* the MSHR pendingDirty is renamed to pendingModified
The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand. |
11279:3fd1142adad9 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Explicitly check MSHR snoops for cases not dealt with
Add a sanity check to make it explicit that we currently do not allow an I/O coherent agent to directly issue writes into the coherent part of the memory system (it has to go via a cache, and get transformed into a read ex, upgrade or invalidation). |
11278:18411ccc4f3c |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache squash functionality
This patch removes the unused squash function from the MSHR queue, and the associated (and also unused) threadNum member from the MSHR. |
11277:4f8703832608 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid unnecessary checks when creating HardPFReq in cache
The checks made before sending out a HardPFReq were unnecessarily complex, and checked for cases that never occur. This patch tidies them up. |
11276:3561d002d8c7 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not use sender state to track forwarded snoops in cache
This patch changes how the cache tracks which snoops are forwarded, and which ones are created locally. Previously the identification was based on an empty sender state of a specific class, but this method fails to distinguish which cache actually attached the sender state. Instead we use the same mechanism as the crossbar, and keep track of the requests that have outstanding snoops. |
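The bookkeeping this entry describes, remembering which requests have outstanding forwarded snoops instead of tagging packets with a sender state, might look roughly like the sketch below. SnoopTracker and its methods are hypothetical names, and Request is a placeholder type rather than the simulator's own class.

```cpp
#include <cassert>
#include <unordered_set>

struct Request { int id; };  // placeholder for the real request type

// Hypothetical tracker for snoops that this cache forwarded upstream.
class SnoopTracker
{
  public:
    // Remember that a snoop for this request was forwarded by us.
    void forwarded(const Request *req) { outstanding.insert(req); }

    // Returns true if the response belongs to a snoop we forwarded, and
    // stops tracking it; returns false for locally created snoops.
    bool isForwardedResponse(const Request *req)
    {
        return outstanding.erase(req) == 1;
    }

  private:
    std::unordered_set<const Request *> outstanding;
};
```

Because the set is private to each cache, two caches on the same path cannot confuse each other's snoops, which is the weakness the entry attributes to the sender-state approach.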
11275:fc2b0e6550ad |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix cache sender state handling and add clarification
This patch addresses a bug in how the cache attached the MSHR as a sender state. Rather than overwriting any existing sender state it now pushes a new one. The handling of upward snoops is also clarified. |
11271:f4ad5be63ba8 |
17-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix memory allocation bug in deferred snoop handling
This patch fixes a corner case in the deferred snoop handling, where requests ended up being used by multiple packets with different lifetimes, and inadvertently got deleted while they were still in use. |
11211:4e70e13c1a2c |
15-Nov-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Add missing explicit overrides for classic caches
Make clang happy when compiling on OSX. |
11199:929fd978ab4e |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add an option to perform clean writebacks from caches
This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated.
The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable.
The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py |
11197:f8fdd931e674 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add cache clusivity
This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive.
The choice of policy guides the behaviour on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective.
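As a rough illustration of such a fill decision, consider the sketch below. Only the helper name allocOnFill and the two policy options come from this entry; the FillRequest type and the rule used for the mostly exclusive case are assumptions, and the real helper may base its decision on different inputs.

```cpp
#include <cassert>

enum class Clusivity { MostlyInclusive, MostlyExclusive };

// Hypothetical description of the request that triggered the fill.
struct FillRequest
{
    bool fromCacheAbove;  // assumed: true if an upstream cache keeps the line
};

// Decide whether the incoming line gets a block in this cache.
inline bool
allocOnFill(Clusivity clusivity, const FillRequest &req)
{
    // A mostly inclusive cache always allocates on a fill.
    if (clusivity == Clusivity::MostlyInclusive)
        return true;
    // Assumed rule: a mostly exclusive cache skips allocation when an
    // upstream cache is going to hold the line anyway.
    return !req.fromCacheAbove;
}
```

When the function returns false, the line would pass through a temporary block (the tempBlock mentioned in the entry) rather than displacing anything in the cache.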
This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache.
Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks.
The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly). |
11194:c3ba89c653a9 |
06-Nov-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Enforce insertion order on the cache response path
This patch enforces insertion order transmission of packets on the response path in the cache. Note that the logic to enforce order is already present in the packet queue, this patch simply turns it on for queues in the response path.
Without this patch, there are corner cases where a request-response is faster than a response-response forwarded through the cache. This violation of queuing order causes problems in the snoop filter leaving it with inaccurate information. This causes assert failures in the snoop filter later on.
A follow on patch relaxes the order enforcement in the packet queue to limit the performance impact. |
11191:9eabb2bf349b |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not treat CleanEvict as a write operation
This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it. |
11190:0964165d1857 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Unify delayed packet deletion
This patch unifies how we deal with delayed packet deletion, where the receiving slave is responsible for deleting the packet, but the sending agent (e.g. a cache) is still relying on the pointer until the call to sendTimingReq completes. Previously we used a mix of a deletion vector and a construct using unique_ptr. With this patch we ensure all slaves use the latter approach. |
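The unique_ptr construct mentioned in this entry can be sketched as follows. Responder, parked, and the Packet placeholder are hypothetical; the point is only that the receiver parks the packet instead of deleting it right away, so the sender's pointer stays valid until the call returns.

```cpp
#include <cassert>
#include <memory>

struct Packet { int id; };  // placeholder for the real packet type

// Hypothetical receiving agent that owns packets it has accepted.
class Responder
{
  public:
    // The receiver is responsible for deleting pkt, but the sender may
    // still dereference it until this call returns. Park the packet; the
    // previously parked packet (if any) is freed at this point.
    bool recvTimingReq(Packet *pkt)
    {
        pendingDelete.reset(pkt);
        return true;
    }

    // For inspection only: the packet currently awaiting deletion.
    const Packet *parked() const { return pendingDelete.get(); }

  private:
    std::unique_ptr<Packet> pendingDelete;
};
```

The last parked packet is released automatically when the Responder is destroyed, so no explicit deletion vector is needed.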
11189:4237221d3e31 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Appease clang static analyzer
A few minor fixes to issues identified by the clang static analyzer. |
11177:524c44cf8278 |
29-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify cache MSHR handling on fill
This patch addresses the upgrading of deferred targets in the MSHR, and makes it clearer by explicitly calling out what is happening (deferred targets are promoted if we get exclusivity without asking for it). |
11169:44b5c183c3cd |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication.
As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). |
11168:f98eb2da15a4 |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. |
11137:0229c7b15ca1 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add check for block status on WriteLineReq fill
More checks to help with understanding of functionality. |
11136:3fd483cdd458 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix WriteLineReq fill behaviour
This patch fixes issues in the interactions between deferred snoops and WriteLineReq. More specifically, the patch addresses an issue where deferred snoops caused assertion failures when being serviced on the arrival of an InvalidateResp. The response packet was perceived to be invalidating, when actually it is not for the cache that sent out the original invalidation request. |
11130:45a23e44e93d |
25-Sep-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add snoops for CleanEvicts and Writebacks in atomic mode
This patch mirrors the logic in timing mode which sends up snoops to check for cached copies before sending CleanEvicts and Writebacks down the memory hierarchy. In case there is a copy in a cache above, discard CleanEvicts and set the BLOCK_CACHED flag in Writebacks so that writebacks do not reset the cache residency bit in the snoop filter below. |
11127:f39c2cc0d44e |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the coherent crossbar account for timing snoops
This patch introduces the concept of a snoop latency. Given the requirement to snoop and forward packets in zero time (due to the coherency mechanism), the latency is accounted for later.
On a snoop, we establish the latency, and later add it to the header delay of the packet. To allow multiple caches to contribute to the snoop latency, we use a separate variable in the packet, and then take the maximum before adding it to the header delay. |
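The accounting described in this entry, a separate per-packet variable that keeps the maximum snoop latency and is later added to the header delay, can be sketched as below. PacketDelays, snoopDelay, recordSnoop, and applySnoopDelay are hypothetical names chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Tick = uint64_t;

// Hypothetical subset of the timing annotations carried by a packet.
struct PacketDelays
{
    Tick headerDelay = 0;  // delay charged before the header is visible
    Tick snoopDelay = 0;   // largest snoop latency reported so far
};

// Each snooped cache reports its own latency; only the maximum is kept,
// since the snoops conceptually happen in parallel.
inline void
recordSnoop(PacketDelays &pkt, Tick cacheLatency)
{
    pkt.snoopDelay = std::max(pkt.snoopDelay, cacheLatency);
}

// Once all snoops are done, charge the latency to the header delay.
inline void
applySnoopDelay(PacketDelays &pkt)
{
    pkt.headerDelay += pkt.snoopDelay;
    pkt.snoopDelay = 0;
}
```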
11083:61b329833f74 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid setting markPending if not needed
In cases where a newly added target does not have any upstream MSHR to mark as downstreamPending, remember that nothing is marked. This allows us to avoid attempting to find the MSHR as part of the clearing of downstreamPending. |
11082:8539728fd457 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up CacheSet
Minor tweaks and house keeping. |
11081:4d8b7783a692 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the snoop state-transition logic
Remove broken and unused option to pass dirty data on non-exclusive snoops. Also beef up the comments a bit. |
11055:54071fd5c397 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to MISCREG_LOCKFLAG and so the request flag is no longer needed. The corresponding functionality in the cache tags is also removed. |
11054:00bddca96da6 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache squash functionality
Tidying up. |
11053:62544e45c0f4 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. |
11051:81b1f46061c8 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move cache_impl.hh to cache.cc
There is no longer any need to keep the implementation in a header. |
11005:e7f403b6b76f |
07-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs. |
10943:329eef4c58f0 |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add missing clean eviction on uncacheable access
This patch adds a missing clean eviction, occurring when an uncacheable access flushes and invalidates an existing block. |
10942:224c85495f96 |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused RequestCause in cache
This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used. |
10941:a39646f4c407 |
30-Jul-2015 |
David Guillen-Fandos <david.guillen@arm.com> |
mem: Make caches way aware
This patch makes cache sets aware of the way number. This enables some nice features, such as the ability to restrict way allocation. The implemented mechanism allows setting a maximum number of ways, 'k', that may be allocated, which must fulfill 0 < k <= N (where N is the number of ways). In the future more sophisticated mechanisms can be implemented. |
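A way-restricted victim search under the constraint 0 < k <= N could look like the sketch below. The Way struct, the findVictim name, and the choice of an LRU-style policy are assumptions for illustration; the entry itself only specifies the limit on the number of ways.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-way state within one cache set.
struct Way
{
    bool valid;          // does the way currently hold a block?
    unsigned lastTouch;  // smaller value means touched longer ago
};

// Pick a victim among the first k ways of a set, with 0 < k <= N.
// Prefers an invalid way; otherwise returns the least recently touched.
inline unsigned
findVictim(const std::vector<Way> &set, unsigned k)
{
    assert(k > 0 && k <= set.size());
    unsigned victim = 0;
    for (unsigned w = 0; w < k; ++w) {
        if (!set[w].valid)
            return w;
        if (set[w].lastTouch < set[victim].lastTouch)
            victim = w;
    }
    return victim;
}
```

Ways with an index of k or above are never chosen for allocation, which is what restricts the effective associativity.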
10940:49d9b53b21dc |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Transition away from isSupplyExclusive for writebacks
This patch changes how writebacks communicate whether the line is passed as modified or owned. Previously we relied on the isSupplyExclusive mechanism, which was originally designed to avoid unnecessary snoops.
For normal cache requests we use the sharedAsserted mechanism to determine if a block should be marked writeable or not, and with this patch we transition the writebacks to also use this mechanism. Conceptually this is cleaner and more consistent. |
10939:6f23825b091b |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up CacheBlk class
This patch modernises and tidies up the CacheBlk, removing dead code. |
10922:5ee72f4b2931 |
13-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix (ab)use of emplace to avoid temporary object creation |
10913:38dbdeea7f1f |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining.
This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error.
Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects. |
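The drain protocol described in this entry can be sketched with a simplified, standalone model. The DrainState values and the drain() and signalDrainDone() names come from the entry; requestDrain(), drainState(), and the WriteBuffer example class are hypothetical, and the real Drainable class interacts with a DrainManager that is omitted here.

```cpp
#include <cassert>

enum class DrainState { Running, Draining, Drained };

// Simplified stand-in for the Drainable base class (no DrainManager).
class Drainable
{
  public:
    virtual ~Drainable() = default;

    // Hypothetical entry point used by whoever coordinates draining.
    DrainState requestDrain()
    {
        state = drain();
        // Returning Running from drain() is considered an error.
        assert(state != DrainState::Running);
        return state;
    }

    DrainState drainState() const { return state; }

  protected:
    // No parameters; returns Draining or Drained.
    virtual DrainState drain() = 0;

    // Safe to call at any time: it only has an effect while Draining.
    void signalDrainDone()
    {
        if (state == DrainState::Draining)
            state = DrainState::Drained;
    }

  private:
    DrainState state = DrainState::Running;
};

// Example object that must finish outstanding work before it is drained.
class WriteBuffer : public Drainable
{
  public:
    unsigned pending = 0;

    void completeOne()
    {
        if (pending > 0 && --pending == 0)
            signalDrainDone();  // no need to check for a drain request
    }

  protected:
    DrainState drain() override
    {
        return pending ? DrainState::Draining : DrainState::Drained;
    }
};
```

Note how completeOne() calls signalDrainDone() unconditionally once the buffer is empty; the state check lives in the base class, which is the code reduction the entry aims for.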
10912:b99a6662d7c2 |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Decouple draining from the SimObject hierarchy
Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this.
This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object.
While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining.
A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains. |
10910:32f3d1c454ec |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario. |
10905:a6ca6831e775 |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor the serialization base class
Objects that can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a ScopedCheckpointSection helper class. Some objects need to serialize data structures that do not derive from Serializable into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. |
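The scoped-section idea from the list above is an instance of the RAII pattern, sketched below in standalone form. The CheckpointOut class here is a hypothetical in-memory stand-in, not the real checkpoint type, and ScopedSection is a simplified analogue of ScopedCheckpointSection; the "a.b:key=value" line format is invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical checkpoint writer that tracks a stack of section names.
class CheckpointOut
{
  public:
    void push(const std::string &name) { path.push_back(name); }
    void pop() { path.pop_back(); }

    // Record "outer.inner:key=value" using the current section path.
    void write(const std::string &key, const std::string &value)
    {
        std::string section;
        for (const auto &p : path)
            section += (section.empty() ? "" : ".") + p;
        lines.push_back(section + ":" + key + "=" + value);
    }

    std::vector<std::string> lines;

  private:
    std::vector<std::string> path;
};

// Enters a subsection on construction and leaves it on destruction, so
// everything written during its lifetime lands inside that subsection.
class ScopedSection
{
  public:
    ScopedSection(CheckpointOut &cp, const std::string &name) : out(cp)
    {
        out.push(name);
    }
    ~ScopedSection() { out.pop(); }

    ScopedSection(const ScopedSection &) = delete;
    ScopedSection &operator=(const ScopedSection &) = delete;

  private:
    CheckpointOut &out;
};
```

Nesting two ScopedSection objects produces a nested section name, and leaving the inner scope returns subsequent writes to the outer section without any manual name generation.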
10888:85a001f2193b |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Delay responses in the crossbar before forwarding
This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules in paying for the latency (through the pkt->headerDelay), we now queue them before sending them.
The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it.
As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc.
Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop. |
10887:279efb97ec99 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant is_top_level cache parameter
This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed.
This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2). |
10886:fdd4a895f325 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split WriteInvalidateReq into write and invalidate
WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp.
The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the initial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory.
The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason. |
10885:3ac92bf1f31f |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add ReadCleanReq and ReadSharedReq packets
This patch adds two new read request packets:
ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data.
ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified.
Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature.
With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches. |
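The split between reads issued by caches and reads issued by other masters can be summarised as a small lookup table. The sketch below is an illustrative Python model only: the dictionary and function names are hypothetical and are not gem5's data structures; only the request names and the allowed response states come from the commit text.

```python
# Illustrative model only; the names below are hypothetical, not gem5 code.
# Coherence states a response may carry, per the commit description.
ALLOWED_RESPONSE_STATES = {
    "ReadCleanReq": {"shared", "exclusive"},
    "ReadSharedReq": {"shared", "exclusive", "owned", "modified"},
}

# Read requests that only caches issue; devices and CPUs use plain ReadReq.
CACHE_READ_REQUESTS = {"ReadCleanReq", "ReadExReq", "ReadSharedReq"}


def issued_by_cache(cmd):
    """Return True if the read request type identifies a cache as the source."""
    return cmd in CACHE_READ_REQUESTS


def response_is_legal(cmd, state):
    """Check a response state against the table (unknown commands are rejected)."""
    return state in ALLOWED_RESPONSE_STATES.get(cmd, set())
```

With such a table, the request type alone tells a cache from a non-caching master, which is the property the commit relies on instead of the packet size.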
10884:c60acdbdd6ad |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow read-only caches and check compliance
This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data.
A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data. |
10883:9294c4a60251 |
03-Jul-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add clean evicts to improve snoop filter tracking
This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory hierarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions, respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache.
The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but, unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards, CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory hierarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages.
We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implementation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message.
We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue. |
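The routing rules for CleanEvicts and the flagging of Writebacks can be sketched as follows. This is a simplified Python model under stated assumptions: caches are represented as plain sets of block addresses, and the function names and return strings are hypothetical, not gem5 identifiers.

```python
# Illustrative model only; function and argument names are hypothetical.
def route_clean_evict(addr, upper_caches, lower_caches):
    """Describe where a CleanEvict for `addr` ends up.

    upper_caches: sets of block addresses held above the evicting cache.
    lower_caches: sets of block addresses held below it, nearest first.
    """
    # Zero-time snoop upwards: if a copy still exists above, drop the notice.
    if any(addr in blocks for blocks in upper_caches):
        return "discarded by evicting cache"
    # Otherwise travel towards memory until some cache holds the block.
    for level, blocks in enumerate(lower_caches):
        if addr in blocks:
            return "stopped at lower cache %d" % level
    return "discarded by main memory"


def writeback_flags(addr, upper_caches):
    """A Writeback is still sent, but carries BLOCK_CACHED if a copy exists above."""
    cached_above = any(addr in blocks for blocks in upper_caches)
    return {"BLOCK_CACHED"} if cached_above else set()
```

For example, `route_clean_evict(0x40, [set()], [set(), {0x40}])` stops at the second cache below the evicting one, because that is the first level that still holds the block.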
10871:119cfadf2203 |
09-Jun-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix snoop packet data allocation bug
This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data. |
10826:fe0b1f40ea5a |
17-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@ARM.com> |
mem: Create a request copy for deferred snoops
Sometimes, we need to defer an express snoop in an MSHR, but the original request might complete and deallocate the original pkt->req. In those cases, create a copy of the request so that someone who is inspecting the delayed snoop can still inspect the request. All of this is rather hacky, but the allocation / linking and general life-time management of Packet and Request is rather tricky. Deleting the copy is another tricky area; testing so far has shown that the right copy is deleted at the right time. |
10821:581fb2484bd6 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent.
The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block. |
10819:2e8abe3bbe32 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Pass shared downstream through caches
This patch ensures that we pass on information about a packet being shared (rather than exclusive), when forwarding a packet downstream.
Without this patch there is a risk that a downstream cache considers the line exclusive when it really isn't. |
10818:9077e269ca4a |
05-May-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add forward snoop check for HardPFReqs
We should always check whether the cache is supposed to be forwarding snoops before generating snoops. |
10817:404b2b015a17 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add missing stats update for uncacheable MSHRs
This patch adds a missing counter update for the uncacheable accesses. By updating this counter we also get a meaningful average latency for uncacheable accesses (previously inf). |
10816:b3b9097f44a9 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up BaseCache parameters
This patch simply tidies up the BaseCache parameters and removes the unused "two_queue" parameter. |
10815:169af9a2779f |
05-May-2015 |
David Guillen <david.guillen@arm.com> |
mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods rather than using the replacement policy as a template argument.
There is no impact on the simulation performance, and overall the changes make it easier to modify (and subclass) the cache and/or replacement policy. |
10771:ea35886cd847 |
27-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Support any number of master-IDs in stride prefetcher
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs) that it could handle. Since master IDs need to be unique per system, and every core, cache etc. requires a separate master port, a static limit on these does not make much sense.
Instead, this patch adds a small hash map that will map all master IDs to the right prefetch state and dynamically allocates new state for new master IDs. |
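The idea of replacing a fixed number of contexts with a map keyed by master ID can be sketched in a few lines. This is a minimal Python illustration, not the prefetcher's actual code: the class names and the two state fields are hypothetical and the stride detection is deliberately reduced to a single subtraction.

```python
# Illustrative model only; class and field names are hypothetical.
class StrideState:
    """Per-master prefetch state (reduced to two fields for the sketch)."""

    def __init__(self):
        self.last_addr = None
        self.stride = 0


class PerMasterTable:
    def __init__(self):
        self._state = {}  # master ID -> StrideState, grown on demand

    def lookup(self, master_id):
        # Allocate state the first time a master ID is seen.
        if master_id not in self._state:
            self._state[master_id] = StrideState()
        return self._state[master_id]

    def observe(self, master_id, addr):
        """Record an access and return the stride seen for this master."""
        st = self.lookup(master_id)
        if st.last_addr is not None:
            st.stride = addr - st.last_addr
        st.last_addr = addr
        return st.stride
```

Because state is allocated on first use, any master ID works without a static limit, at the cost of one hash lookup per access.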
10770:c48310de1a51 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allocate cache writebacks before new MSHRs
This patch changes the order of writeback allocation such that any writebacks resulting from a tag lookup (e.g. for an uncacheable access) are added to the writebuffer before any new MSHR entries are allocated. This ensures that the writebacks logically precede the new allocations.
The patch also changes the uncacheable flush to use proper timed (or atomic) writebacks, as opposed to functional writes. |
10769:9e521c0c3877 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cleanup flow for uncacheable accesses
This patch simplifies the code dealing with uncacheable timing accesses, aiming to align it with the existing miss handling. Similar to what we do in atomic, a timing request now goes through Cache::access (where the block is also flushed), and then proceeds to ignore any existing MSHR for the block in question. This unifies the flow for cacheable and uncacheable accesses, and for atomic and timing. |
10768:9a34e28cd2c2 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ignore uncacheable MSHRs when finding matches
This patch changes how we search for matching MSHRs, ignoring any MSHR that is allocated for an uncacheable access. By doing so, this patch fixes a corner case in the MSHRs where incorrect data ended up being copied into a (cacheable) read packet due to a first uncacheable MSHR target of size 4, followed by a cacheable target to the same MSHR of size 64. The latter target was filled with nonsense data. |
10767:993c2baa485a |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant allocateUncachedReadBuffer in cache
This patch removes the no-longer-needed allocateUncachedReadBuffer. Besides the checks it is exactly the same as allocateMissBuffer and thus provides no value. |
10766:b2071d0eb5f1 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Modernise MSHR iterators to C++11
This patch updates the iterators in the MSHR and MSHR queues to use C++11 range-based for loops. It also does a bit of additional housekeeping. |
10764:b32578b2af99 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align all MSHR entries to block boundaries
This patch aligns all MSHR queue entries to block boundaries to simplify checks for matches. Previously there were corner cases that could lead to existing entries not being identified as matches.
There are, rather alarmingly, a few regressions that change with this patch. |
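Aligning to block boundaries reduces match checks to a comparison of block addresses. A minimal Python sketch, assuming a power-of-two block size; the helper names are hypothetical:

```python
# Illustrative helpers; names are hypothetical, block size must be a power of two.
def block_align(addr, blk_size):
    """Round an address down to the start of its cache block."""
    assert blk_size > 0 and blk_size & (blk_size - 1) == 0
    return addr & ~(blk_size - 1)


def same_block(addr_a, addr_b, blk_size):
    """Two accesses match the same entry if they fall in the same block."""
    return block_align(addr_a, blk_size) == block_align(addr_b, blk_size)
```

With 64-byte blocks, addresses 0x1204 and 0x123f both align to 0x1200 and therefore match, whereas 0x1240 starts the next block.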
10763:d524dc4f16ae |
27-Mar-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED
This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more generic BLOCK_CACHED flag. Future patches implementing cache eviction messages can use the BLOCK_CACHED flag in almost the same manner as hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The PREFETCH_SNOOP_SQUASH flag is set if the prefetch target is found in the tags or the MSHRs in any state, so we are simply replacing calls to setPrefetchSquashed() with setBlockCached(). The case where the prefetch target is found in the writeback MSHRs of upper level caches continues to be covered by the MEM_INHIBIT flag. |
10745:791e4619919d |
19-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use emplace front/back for deferred packets
Embrace C++11 for the deferred packets as we actually store the objects in the data structure, and not just pointers. |
10741:655ff3f6352d |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: remove redundant test in Cache::recvTimingResp()
For some reason we were checking mshr->hasTargets() even though we had already called mshr->getTarget() unconditionally earlier in the same function (which asserts if there are no targets). Get rid of this useless check, and while we're at it get rid of the redundant call to mshr->getTarget(), since we still have the value saved in a local var. |
10740:88d515925cbf |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: add local var in Cache::recvTimingResp()
The main loop in recvTimingResp() uses target->pkt all over the place. Create a local tgt_pkt to help keep lines under the line length limit. |
10738:a5f134ef30d3 |
14-Mar-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: clean up write buffer check in Cache::handleSnoop()
The 'if (writebacks.size)' check was redundant, because writeBuffer.findMatches() would return false if the writebacks list was empty.
Also renamed 'mshr' to 'wb_entry' in this context since we are pointing at a writebuffer entry and not an MSHR (even though it's the same C++ class). |
10725:d1387fcd94b8 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Unify all cache DPRINTF address formatting
This patch changes all the DPRINTF messages in the cache to use '%#llx' every time a packet address is printed. The inclusion of '#' ensures '0x' is prepended, and since the address type is a uint64_t %x really should be %llx. |
10724:1072b1381560 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix cache MSHR conflict determination
This patch fixes a rather subtle issue in the sending of MSHR requests in the cache, where the logic previously did not check for conflicts between the MSHR queue and the write queue when requests were not ready. The correct thing to do is to always check, since not having a ready MSHR does not guarantee that there is no conflict.
The underlying problem seems to have slipped past due to the symmetric timings used for the write queue and MSHR queue. However, with the recent timing changes the bug caused regressions to fail. |
10722:886d2458e0d6 |
02-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Add option to force in-order insertion in PacketQueue
By default, the packet queue is ordered by the ticks of the to-be-sent packets. With the recent modifications of packets sinking their header time when their response leaves the caches, there could be cases of MSHR targets being allocated and ordered A, B, but their responses being sent out in the order B, A. This led to inconsistencies in bus traffic, in particular the snoop filter observing first a ReadExResp and later a ReadRespWithInv. Logically, these were ordered the other way around behind the MSHR, but due to the timing adjustments when inserting into the PacketQueue, they were sent out in the wrong order on the bus, confusing the snoop filter.
This patch adds a flag (off by default) such that these special cases can request in-order insertion into the packet queue, which might offset timing slightly. This is expected to occur rarely and not affect timing results. |
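The difference between tick-ordered and forced in-order insertion can be sketched with a plain list. This is an illustrative Python model, not the PacketQueue implementation: the function name and the tuple layout are hypothetical, and clamping the tick of a forced entry to the tick of the current tail is an assumption made here to keep the list sorted.

```python
# Illustrative model only; `schedule` and the (tick, pkt) tuples are hypothetical.
def schedule(queue, when, pkt, force_order=False):
    """Insert `pkt` into `queue`, a list of (tick, pkt) sorted by tick."""
    if force_order:
        # Never overtake an earlier entry: append, delaying the send if needed.
        if queue:
            when = max(when, queue[-1][0])
        queue.append((when, pkt))
        return
    # Default: keep the list ordered by tick, after any entries with equal tick.
    idx = len(queue)
    while idx > 0 and queue[idx - 1][0] > when:
        idx -= 1
    queue.insert(idx, (when, pkt))
```

Scheduling A at tick 20 and then B at tick 10 sends B first by default. With `force_order=True`, B stays behind A and is delayed to tick 20, which is the kind of small timing offset the commit mentions.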
10721:3e6a3eaac71b |
02-Mar-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Downstream components consume new crossbar delays
This patch makes the caches and memory controllers consume the delay that is annotated to a packet by the crossbar. Previously many components simply threw these delays away. Note that the devices still do not pay for these delays. |
10714:9ba5e70964a4 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the cache debug messages
Avoid redundant inclusion of the name in the DPRINTF string. |
10713:eddb533708cb |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing issue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fix the previously seen deadlocks. |
10712:245cd4691cbf |
02-Mar-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Fix prefetchSquash + memInhibitAsserted bug
This patch resolves a bug with hardware prefetches. Before a hardware prefetch is sent towards the memory, the system generates a snoop request to check all caches above the prefetch generating cache for the presence of the prefetch target. If the prefetch target is found in the tags or the MSHRs of the upper caches, the cache sets the prefetchSquashed flag in the snoop packet. When the snoop packet returns with the prefetchSquashed flag set, the prefetch generating cache deallocates the MSHR reserved for the prefetch. If the prefetch target is found in the writeback buffer of the upper cache, the cache sets the memInhibit flag, which signals the prefetch generating cache to expect the data from the writeback. When the snoop packet returns with the memInhibitAsserted flag set, it marks the allocated MSHR as inService and waits for the data from the writeback.
If the prefetch target is found in multiple upper level caches, specifically in the tags or MSHRs of one upper level cache and the writeback buffer of another, the snoop packet will return with both prefetchSquashed and memInhibitAsserted set, while the current code is not written to handle such an outcome. The current code checks for the prefetchSquashed flag first and, if it finds the flag, deallocates the reserved MSHR. This leads to an assert failure when the data from the writeback arrives at the cache. In this fix, we simply switch the order of checks: we first check for memInhibitAsserted and then for prefetchSquashed. |
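The effect of the reordered checks can be shown with a small decision function. This is a Python sketch with hypothetical names; the returned strings merely label the actions described in the commit text.

```python
# Illustrative model only; the function name and return strings are hypothetical.
def handle_prefetch_snoop(mem_inhibit_asserted, prefetch_squashed):
    """Decide what to do with the MSHR reserved for a hardware prefetch."""
    # Checked first: data will arrive from a writeback, so the MSHR must stay.
    if mem_inhibit_asserted:
        return "mark MSHR inService"
    # Only squash when no writeback is going to supply the data.
    if prefetch_squashed:
        return "deallocate MSHR"
    return "send prefetch"
```

When both flags are set, the function keeps the MSHR alive, so the later writeback data still finds an entry to fill.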
10694:1a6785e37d81 |
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarification of packet crossbar timings
This patch clarifies the packet timings annotated when going through a crossbar.
The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated to the delivery of the header of the packet.
The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to process the payload of the packet.
For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay. |
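Under the additive convention that the commit announces for the future, the two delays combine as in the sketch below. The function is a hypothetical Python helper, not part of gem5, and the tick values in the example are arbitrary.

```python
# Illustrative helper; the name and signature are hypothetical.
def arrival_ticks(now, header_delay, payload_delay):
    """Return (header_tick, last_word_tick) when payload_delay is additive."""
    header_tick = now + header_delay             # pipeline latency of the crossbar
    last_word_tick = header_tick + payload_delay  # plus serialisation of the payload
    return header_tick, last_word_tick
```

For a packet sent at tick 1000 with a header delay of 500 and a payload delay of 2000, the header arrives at tick 1500 and the last word at tick 3500.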
10693:c0979b2ebda5 |
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarify usage of latency in the cache
This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components).
The introduced variables are:
- lookupLatency: latency of tag lookup, occurring on any access
- forwardLatency: latency that occurs in case of outbound miss
- fillLatency: latency to fill a cache block
We keep the existing responseLatency.
The forwardLatency is used by allocateInternalBuffer() for:
- MSHR allocateWriteBuffer (uncached write forwarded to WriteBuffer);
- MSHR allocateMissBuffer (cacheable miss in MSHR queue);
- MSHR allocateUncachedReadBuffer (uncached read allocated in MSHR queue)
It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency. |
10680:7639c17357dc |
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify express snoop behaviour
This patch adds a bit of documentation with insights around how express snoops really work. |
10679:204a0f53035e |
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify cache behaviour for pending dirty responses
This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made. |
10659:3a3bb559b112 |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove Packet source from ForwardResponseRecord
This patch removes the source field from the ForwardResponseRecord, but keeps the class as it is part of how the cache identifies responses to hardware prefetches that are snooped upwards. |
10648:8c9ed0314ed1 |
20-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix bug in cache request retry mechanism
This patch ensures that inhibited packets that are about to be turned into express snoops do not update the retry flag in the cache. |
10627:63edd4a1243f |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Change prefetcher to use random_mt
Prefetchers previously used rand() to generate random numbers. |
10626:7982e539d003 |
23-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Hide WriteInvalidate requests from prefetchers
Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate. |
10625:00965520c9f5 |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Fix event scheduling issue for prefetches
The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime(), which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarantee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessarily scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen.
This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as a MSHR becomes available. |
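The fix amounts to ignoring the prefetch ready time whenever no MSHR could be allocated. A minimal Python sketch, assuming a sentinel tick for "nothing pending"; the function name, its parameters and the sentinel value are hypothetical:

```python
# Illustrative model only; names and the sentinel value are assumptions.
MAX_TICK = 2**64 - 1  # stands for "no event pending"


def next_send_time(next_mshr_ready, next_prefetch_ready, free_mshrs):
    """Tick at which the memory-side queue should next try to send."""
    if free_mshrs > 0:
        return min(next_mshr_ready, next_prefetch_ready)
    # No MSHR can be allocated, so a ready prefetch must not trigger an event.
    return next_mshr_ready
```

With all MSHRs busy and no MSHR ready, the function returns the sentinel instead of the prefetch's ready time, so no event is scheduled until an MSHR is freed.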
10624:97aa1ee1c2d9 |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Fix bug relating to writebacks and prefetches
Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch requests in certain circumstances. |
10623:b9646f4546ad |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Rework the structuring of the prefetchers
Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class.
Finally, the stride-based prefetcher now assumes a customizable lookup table (sets/ways) rather than the previous fully associative structure. |
10622:0b969a35781f |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Add parameter to reserve MSHR entries for demand access
Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall. |
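The reservation can be expressed as a lower allocation limit for prefetches than for demand accesses. The Python sketch below uses hypothetical names for the function and its parameters; it only illustrates the policy described in the commit text.

```python
# Illustrative model only; the function and parameter names are hypothetical.
def can_allocate(allocated, num_entries, demand_reserve, is_prefetch):
    """Decide whether a new MSHR may be allocated.

    Prefetches may not use the last `demand_reserve` entries, which stay
    available for demand accesses.
    """
    limit = num_entries - demand_reserve if is_prefetch else num_entries
    return allocated < limit
```

With 4 entries and 1 reserved, a prefetch is refused once 3 entries are in use, while a demand access can still take the last one.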
10583:d1e1e8588881 |
02-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Support WriteInvalidate (again)
This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support.
Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible. |
10582:c04dc66e4316 |
02-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Remove WriteInvalidate support
Prepare for a different implementation following in the next patch |
10571:c848de089432 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clean up packet data allocation
This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data).
The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations.
All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not). |
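The allocation rules for a copied packet, as described in the text, can be summarised in a small decision function. This is a Python sketch with hypothetical names; the returned strings label the three cases and do not correspond to gem5 identifiers.

```python
# Illustrative model only; the function name and return strings are hypothetical.
def data_policy_for_copy(original_is_static, alloc_data):
    """Return how a copied packet obtains its data."""
    if not alloc_data:
        # Short-lived copy, e.g. a zero-time snoop: nothing new is allocated.
        return "no new allocation"
    if original_is_static:
        # Static data: carry the requester's pointer forward, never copy.
        return "carry original pointer"
    # Unknown life-time with dynamic data: the copy owns its own buffer.
    return "allocate own buffer"
```

The static case is the copy-avoidance scheme: the memory system writes to the requester's buffer directly, so no derived packet ever copies the data.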
10570:dcb908e40547 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cleanup Packet::checkFunctional and hasData usage
This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading. |
10569:ffd46545b284 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the requests carried by packets const
This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly. |
10567:926802ed1536 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add checks and explanation for assertMemInhibit usage |
10565:23593fdaadcd |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant Packet::allocate calls
This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions.
The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch. |
10563:755b18321206 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add const getters for write packet data
This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used.
The patch also removes the unused isReadWrite function. |
10509:d5554f97c451 |
30-Oct-2014 |
Ali Saidi <Ali.Saidi@ARM.com> |
arm, mem: Fix drain bug and provide drain prints for more components. |
10503:94d58056729f |
21-Oct-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: don't inhibit WriteInv's or defer snoops on their MSHRs
WriteInvalidate semantics depend on the unconditional writeback or they won't complete. Also, there's no point in deferring snoops on their MSHRs, as they don't get new data at the end of their life cycle the way other transactions do.
Add comment in the cache about a minor inefficiency re: WriteInvalidate. |
10502:f2f1dbfd505e |
30-Oct-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: have WriteInvalidate obsolete MSHRs
Since WriteInvalidate directly writes into the cache, it can create tricky timing interleavings with reads and writes to the same cache line that haven't yet completed. This patch ensures that these requests, when completed, don't overwrite the newer data from the WriteInvalidate. |
10466:73b7549d979e |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Dynamically determine page bytes in memory components
This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA. |
10424:a910aeb89098 |
09-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add packet sanity checks to cache and MSHRs
This patch adds a number of asserts to the cache, checking basic assumptions about packets being requests or responses. |
10412:6400a2ab4e22 |
27-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix a bunch of minor issues identified by static analysis
Add some missing initialisation, and fix a handful of benign resource leaks (including some false positives). |
10405:7a618c07e663 |
20-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus.
As part of this renaming, the code has also been cleaned up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code are due to the delay variables in the packet. |
10382:452a5f178ec5 |
20-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove the GHB prefetcher from the source tree
There are two primary issues with this code which make it deserving of deletion.
1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher
2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher.
It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher. |
10373:342348537a53 |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove assertions ensuring unsigned values >= 0 |
10371:a16e73f1297f |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add checks to sendTimingReq in cache
A small fix to ensure the return value is not ignored. |
10360:919c02740209 |
09-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix a number of uninitialised variables and members
Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round. |
10345:b5bef3c8e070 |
27-Jun-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: write streaming support via WriteInvalidate promotion
Support full-block writes directly rather than requiring RMW:
* a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp.
* only top-level caches allocate the line; the others just pass the request along and invalidate as necessary.
* to close a timing window between the *Req and the *Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory. |
10344:fa9ef374075f |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix a bug in the cache port flow control
This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ.
The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding. |
10343:a1eea45928e6 |
13-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
cpu, mem: Make software prefetches non-blocking
Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load). |
10325:7aacec2a247d |
03-Sep-2014 |
Geoffrey Blake <geoffrey.blake@arm.com> |
cache: Fix handling of LL/SC requests under contention
If a set of LL/SC requests contend on the same cache block we can get into a situation where CPUs will deadlock if they expect a failed SC to supply them data. This case happens where 3 or more cores are contending for a cache block using LL/SC and the system is configured where 2 cores are connected to a local bus and the third is connected to a remote bus. If a core on the local bus sends an SCUpgrade and the core on the remote bus sends an SCUpgrade they will race to see who will win the SC access. In the meantime if the other core appends a read to one of the SCUpgrades it will expect to be supplied data by that SCUpgrade transaction. If it happens that the SCUpgrade that was picked to supply the data is failed, it will drop the appended request for data and never respond, leaving the requesting core to deadlock. This patch makes all SCs behave as normal stores to prevent this case but still makes sure to check whether it can perform the update. |
10318:98771a936b61 |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Cleanup unused ISA traits constants
This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc.
The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed. |
10274:68da5ef4bb6f |
13-Aug-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Properly set cache block status fields on writebacks
When a cacheline is written back to a lower-level cache, tags->insertBlock() sets various status parameters. However these status bits were cleared immediately after calling. This patch makes it so that these status fields are not cleared by moving them outside of the tags->insertBlock() call. |
10263:c00b5ba43967 |
28-Jul-2014 |
Anthony Gutierrez <atgutier@umich.edu> |
mem: refactor LRU cache tags and add random replacement tags
this patch implements a new tags class that uses a random replacement policy. these tags prefer to evict invalid blocks first, if none are available a replacement candidate is chosen at random.
this patch factors out the common code in the LRU class and creates a new abstract class: the BaseSetAssoc class. any set associative tag class must implement the functionality related to the actual replacement policy in the following methods:
accessBlock() findVictim() insertBlock() invalidate() |
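The victim selection described for the random replacement tags can be sketched as follows. This is a minimal illustration, not the gem5 code: the dictionary-based block representation and the `find_victim` helper are hypothetical names chosen for the example.

```python
import random

def find_victim(blocks, rng=random):
    """Return the index of the block to evict from one cache set.

    `blocks` is a list of dicts with a boolean "valid" key (a hypothetical
    stand-in for real cache blocks). Invalid blocks are preferred; if every
    block is valid, a candidate is chosen at random.
    """
    for index, block in enumerate(blocks):
        if not block["valid"]:
            return index
    return rng.randrange(len(blocks))

# One invalid way in a 4-way set: it is always picked first.
cache_set = [{"valid": True}, {"valid": True}, {"valid": False}, {"valid": True}]
print(find_victim(cache_set))  # 2
```

When all ways are valid the result depends on the random source, so passing a seeded `random.Random` instance as `rng` makes runs reproducible.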
10192:5c2c4195b839 |
09-May-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Squash prefetch requests from downstream caches
This patch squashes prefetch requests from downstream caches, so that they do not steal cachelines away from caches closer to the cpu. It was originally coded by Mitch Hayenga and modified by Aasheesh Kolli. |
10174:73b035a42df1 |
01-Apr-2014 |
Mitch Hayenga <Mitch.Hayenga@ARM.com> |
mem: Don't print out the data of a cache block
This never actually worked since it was printing out only a word of the cache block and not the entire thing, and doubly didn't work because csprintf overrides the %#x specifier and assumes a char* array is actually a string. |
10108:83bb6e381cbf |
07-Mar-2014 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Fix incorrect assert failure in the Cache
This patch fixes an assert condition that is not true at all times. There are valid situations that arise in dual-core dual-workload runs where the assert condition is false. The function call following the assert however needs to be called only when the condition is true (a block cannot be invalidated in the tags structure if it has not been allocated in the structure, and the tempBlock is never allocated). Hence the 'assert' has been replaced with an 'if'. |
10067:3b30e9d30e10 |
18-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Filter cache snoops based on address ranges
This patch adds a filter to the cache to drop snoop requests that are not for a range covered by the cache. This fixes an issue observed when multiple caches are placed in parallel, covering different address ranges. Without this patch, all the caches will forward the snoop upwards, when only one should do so. |
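The filtering idea can be sketched as below, assuming address ranges are represented as half-open `(start, end)` pairs. The function names and the range representation are illustrative assumptions, not the cache's actual interface.

```python
def covers(ranges, addr):
    """True if addr falls inside any half-open (start, end) range."""
    return any(start <= addr < end for start, end in ranges)

def should_forward_snoop(cache_ranges, snoop_addr):
    """Forward a snoop upwards only if this cache covers the address."""
    return covers(cache_ranges, snoop_addr)

# Two caches placed in parallel, each covering half of a 2 GiB space.
cache_a = [(0x0000_0000, 0x4000_0000)]
cache_b = [(0x4000_0000, 0x8000_0000)]

snoop = 0x4000_1000
print(should_forward_snoop(cache_a, snoop))  # False: dropped by cache A
print(should_forward_snoop(cache_b, snoop))  # True: forwarded by cache B only
```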
10054:baaed1733069 |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Add additional tolerance to stride prefetcher
Forces the prefetcher to mispredict twice in a row before resetting the confidence of prefetching. This helps cases where a load PC strides by a constant factor, however it may operate on different arrays at times. Avoids the cost of retraining. Primarily helps with small iteration loops.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
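The tolerance described in the entry above can be sketched as follows. The class, its fields, and the exact confidence update are assumptions made for illustration; only the "two mispredictions in a row before reset" rule comes from the commit message.

```python
class StrideEntry:
    """Hypothetical per-PC stride table entry with a two-miss tolerance."""

    def __init__(self, stride):
        self.stride = stride
        self.confidence = 0
        self.consecutive_misses = 0

    def observe(self, new_stride):
        if new_stride == self.stride:
            self.confidence += 1
            self.consecutive_misses = 0
            return
        self.consecutive_misses += 1
        if self.consecutive_misses >= 2:
            # Only the second misprediction in a row forces retraining.
            self.stride = new_stride
            self.confidence = 0
            self.consecutive_misses = 0

entry = StrideEntry(stride=8)
for s in (8, 8, 8):
    entry.observe(s)
entry.observe(4096)      # one-off jump to a different array
entry.observe(8)         # stride resumes; confidence was kept
print(entry.confidence)  # 4
```

A single stray stride, such as the jump between arrays, leaves the learned stride and confidence intact, which is what avoids the retraining cost.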
10053:b0b69dbafc08 |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Allowed tagged instruction prefetching in stride prefetcher
For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride-based prefetcher (there is no stride to train). In these cases, it's better if the L2 stride prefetcher simply reverted back to a simple N-block ahead prefetcher. This patch enables this option.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10052:5bb8e054456b |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com>, Amin Farmahini <aminfar@gmail.com> |
mem: prefetcher: add options, support for unaligned addresses
This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192..., and thus would not detect a constant stride pattern. This patch fixes this, by training the prefetcher on access and not masking off the lower address bits.
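The 48-byte example can be reproduced with a few lines of arithmetic. This is only an illustration of the masking effect: the helper names are hypothetical, and the starting address of 16 is an assumption chosen so that the masked sequence matches the one quoted above.

```python
BLOCK_SIZE = 64  # cache line size in bytes, as in the example

def block_align(addr, block_size=BLOCK_SIZE):
    """Mask off the lower address bits (the behaviour before the patch)."""
    return addr & ~(block_size - 1)

def strides(addrs):
    """Differences between consecutive addresses."""
    return [b - a for a, b in zip(addrs, addrs[1:])]

# A load striding by 48 bytes, starting at address 16.
accesses = [16 + 48 * i for i in range(5)]   # 16, 64, 112, 160, 208
masked = [block_align(a) for a in accesses]  # 0, 64, 64, 128, 192

print(strides(masked))    # [64, 0, 64, 64]  -> no constant stride visible
print(strides(accesses))  # [48, 48, 48, 48] -> constant stride of 48
```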
It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data accesses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy.
Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean). |
10048:1548b7aa657c |
28-Jan-2014 |
Amin Farmahini <aminfar@gmail.com> |
mem: Remove redundant findVictim() input argument
The patch (1) removes the redundant writeback argument from findVictim() and (2) fixes the description of the access() function.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10028:fb8c44de891a |
24-Jan-2014 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
mem: Add support for a security bit in the memory system
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses. |
10025:fdf737112e46 |
24-Jan-2014 |
Timothy M. Jones <timothy.jones@arm.com> |
Cache: Collect very basic stats on tag and data accesses
Adds very basic statistics on the number of tag and data accesses within the cache, which is important for power modelling. For the tags, simply count the associativity of the cache each time. For the data, this depends on whether tags and data are accessed sequentially, which is given by a new parameter. In the parallel case, all data blocks are accessed each time, but with sequential accesses, a single data block is accessed only on a hit. |
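The counting rule described above can be written out as a small function. This is a sketch of the stated rule only; the function name, its arguments, and the return shape are hypothetical, not the statistics code itself.

```python
def count_accesses(assoc, hit, sequential_access):
    """Return (tag_accesses, data_accesses) for one cache lookup.

    Tags: every way in the set is checked, so count the associativity.
    Data: with parallel tag/data access all data blocks are read; with
    sequential access a single data block is read, and only on a hit.
    """
    tag_accesses = assoc
    if sequential_access:
        data_accesses = 1 if hit else 0
    else:
        data_accesses = assoc
    return tag_accesses, data_accesses

print(count_accesses(assoc=8, hit=True, sequential_access=False))  # (8, 8)
print(count_accesses(assoc=8, hit=True, sequential_access=True))   # (8, 1)
print(count_accesses(assoc=8, hit=False, sequential_access=True))  # (8, 0)
```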
10024:fc10e1f9f124 |
24-Jan-2014 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: per-thread cache occupancy and per-block ages
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump. |
10020:2f33cb012383 |
24-Jan-2014 |
Matt Horsnell <matt.horsnell@ARM.com> |
mem: track per-request latencies and access depths in the cache hierarchy
Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request. |
9944:4ff1c5c6dcbc |
17-Oct-2013 |
Matt Horsnell <matt.horsnell@ARM.com> |
cpu: add consistent guarding to *_impl.hh files. |
9850:87d6b41749e9 |
04-Sep-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Resurrect the NOISA build target and rename it NULL
This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models.
The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. |
9814:7ad2b0186a32 |
18-Jul-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Set the cache line size on a system level
This patch removes the notion of a peer block size and instead sets the cache line size on the system level.
Previously the size was set per cache, and communicated through the interconnect. There were plenty of checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used.
A follow-on patch updates the configuration scripts accordingly. |
9813:bba03800b376 |
18-Jul-2013 |
Xiangyu Dong <rioshering@gmail.com> |
mem: Add cache class destructor to avoid memory leaks
Make valgrind a little bit happier |
9796:485399270ca1 |
27-Jun-2013 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Reorganize cache tags and make them a SimObject
This patch reorganizes the cache tags to allow more flexibility to implement new replacement policies. The base tags class is now a clocked object so that derived classes can use a clock if they need one. Also, deriving from SimObject allows specialized Tag classes to be swapped in/out in .py files.
The cache set is now templatized to allow it to contain customized cache blocks with additional information. This involved moving code to the .hh file and removing cacheset.cc.
The statistics belonging to the cache tags now include ".tags" in their name. Hence, the stats need an update to reflect the change in naming. |
9795:a31d1a0888a2 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove the cache builder
This patch removes the redundant cache builder class. |
9784:d28825cebfcc |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align cache timing to clock edges
This patch changes the cache timing calculations such that the results are aligned to clock edges.
Plenty of stats change as a result of this patch. |
9782:285458078a09 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cycles converted to Ticks in atomic cache accesses
This patch fixes an outstanding issue in the cache timing calculations where an atomic access returned a time in Cycles, but the port forwarded it on as if it was in Ticks.
A separate patch will update the regression stats. |
9779:0742b0ccc430 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove a redundant heap allocation for a snoop packet
This patch changes the upwards snoop packet to avoid allocating and later deleting it. As the code executes in 0 time and the lifetime of the packet does not extend beyond the block there is no reason to heap allocate it. |
9725:0d4ee33078bb |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Spring cleaning of MSHR and MSHRQueue
This patch does some minor tidying up of the MSHR and MSHRQueue. The clean up started as part of some ad-hoc tracing and debugging, but seems worthwhile enough to go in as a separate patch.
The highlights of the changes are reduced scoping (private members where possible), avoiding redundant new/delete, and constructor initialisation to please static code analyzers. |
9724:7c7ed0cae353 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix MSHR print format
This patch fixes an incorrect print format string by adding an additional string element. |
9663:45df88079f04 |
22-Apr-2013 |
Uri Wiener <uri.wiener@arm.com> |
mem: Adding verbose debug output in the memory system
This patch provides useful printouts throughout the memory system. This includes pretty-printed cache tags and function call messages (call-stack like). |
9619:0da414aefaf6 |
27-Mar-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Fix cache latency bug
Fixes a latency calculation bug for accesses during a cache line fill.
Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order.
Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9614:c35b47fd0df8 |
26-Mar-2013 |
Rene de Jong <rene.dejong@arm.com> |
mem: Cancel cache retry event when blocking port
This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system. |
9559:e6347e559e8f |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix sender state bug and delay popping
This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled. |
9550:e0e2c8f83d08 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Fix up numerous warnings about name shadowing
This patch addresses the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged. |
9549:95a536fae9ac |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Enforce strict use of busFirst- and busLastWordTime
This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for.
As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored.
Since no time is added, all regressions remain the same. |
9548:63d36f7ef562 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Change accessor function names to match the port interface
This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself.
The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet. |
9547:6d81435f56cb |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make packet bus-related time accounting relative
This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to connect a Last-Level Cache (LLC) directly to a memory controller without a bus in between.
The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing. |
9546:ac0c18d738ce |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add deferred packet class to prefetcher
This patch removes the time field from the packet as it was only used by the prefetcher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent. |
9545:508784fad4e5 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Make clock private and access using clockPeriod()
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier. |
9543:a373b2e664ff |
19-Feb-2013 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Fix SenderState related cache deadlock
This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches.
This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing. |
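The traversal described above can be sketched as follows. The Python classes only mimic the shape of the C++ objects: the `predecessor` link name, the `OtherState` class, and the choice to clear the flag are assumptions made for the illustration, since the message only says the flag is updated "accordingly".

```python
class SenderState:
    """Node in the linked list of sender states attached to a packet."""
    def __init__(self, predecessor=None):
        self.predecessor = predecessor  # hypothetical name for the link

class MSHR(SenderState):
    """Stand-in for a cache miss status holding register."""
    def __init__(self, predecessor=None):
        super().__init__(predecessor)
        self.downstream_pending = True

class OtherState(SenderState):
    """State pushed by some non-cache component between two caches."""

def find_mshr(sender_state):
    """Walk the list until an MSHR is found or the list ends (None)."""
    state = sender_state
    while state is not None and not isinstance(state, MSHR):
        state = state.predecessor
    return state

def clear_downstream_pending(sender_state):
    """Update the flag if the packet passed through another cache."""
    mshr = find_mshr(sender_state)
    if mshr is not None:
        mshr.downstream_pending = False  # assumed update, see lead-in
    return mshr is not None

# A component between the caches pushed its own state on top of the MSHR.
upstream_mshr = MSHR()
top = OtherState(predecessor=upstream_mshr)
print(clear_downstream_pending(top))     # True
print(upstream_mshr.downstream_pending)  # False
```

If no MSHR is found in the list, the function leaves everything untouched, which matches the "otherwise, we do nothing" case.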
9542:683991c46ac8 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add predecessor to SenderState base class
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses.
There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest. |
9529:28d6d9663a7e |
15-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tighten up cache constness and scoping
This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected. |
9524:d6ffa982a68b |
15-Feb-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
sim: Add a system-global option to bypass caches
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virtualized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches.
To make memory mode tests cleaner, the following methods are added to the System class:
* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
* isTimingMode() -- True if the memory mode is 'timing'.
* bypassCaches() -- True if caches should be bypassed.
The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore. |
9486:569e1f1d762d |
28-Jan-2013 |
Anthony Gutierrez <atgutier@umich.edu> |
cache: remove drainManager because it's not used
the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch. |
9454:2694770a30d4 |
08-Jan-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Make LL/SC locks fine grained
The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range. |
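The overlap test at the heart of this change can be sketched as below. The `LockRecord` class and its fields are hypothetical; the sketch only shows how an address range lets a store to a different part of the same cache line leave the lock intact.

```python
class LockRecord:
    """Hypothetical LL/SC lock covering [start, end) within a cache line."""
    def __init__(self, addr, size):
        self.start = addr
        self.end = addr + size  # exclusive upper bound

    def overlaps(self, addr, size):
        """True if the access [addr, addr + size) touches the locked bytes."""
        return addr < self.end and self.start < addr + size

def store_invalidates(lock, store_addr, store_size):
    """A store clears the lock only when it overlaps the locked range."""
    return lock.overlaps(store_addr, store_size)

lock = LockRecord(addr=0x1000, size=4)       # LL on bytes 0x1000..0x1003
print(store_invalidates(lock, 0x1020, 4))    # False: same line, no overlap
print(store_invalidates(lock, 0x1002, 4))    # True: overlaps the locked bytes
```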
9445:5963165c00cb |
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem: Fix guest corruption when caches handle uncacheable accesses
When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS.
Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses. |
9422:34d2e8082912 |
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem: Remove the IIC replacement policy
The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters. |
9418:9923a5ab8c13 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Fatal if a clocked object is set to have a clock of 0
This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0. |
9379:40250293a6ae |
07-Jan-2013 |
Ali Saidi <Ali.Saidi@ARM.com> |
cache: add note about where conflicts are handled |
9347:b02075171b57 |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
mem: Add support for writing back and flushing caches
This patch adds support for the following optional drain methods in the classical memory system's cache model:
memWriteback() - Write back all dirty cache lines to memory using functional accesses.
memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested.
Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed. |
9342:6fec8f26e56d |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
sim: Move the draining interface into a separate base class
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. |
9338:97b4a2be1e5b |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
sim: Include object header files in SWIG interfaces
When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy.
This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce the use of the header attribute, but a warning will be generated for objects that do not use it. |
9294:8fb03b13de02 |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocol-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. |
9290:90dd57ca9a7e |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Fix: Address a few minor issues identified by cppcheck
This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only gets called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used. |
9288:3d6da8559605 |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles, which actually counted in ticks, is now updated to count in cycles.
As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions. |
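The relationship between cycles, ticks, and clock edges in the entry above can be illustrated with plain integer arithmetic. The function names and the example clock periods (500 and 1000 ticks) are assumptions for the sketch, not values taken from the patch.

```python
def cycles_to_ticks(cycles, clock_period):
    """Scale a latency in cycles by the clock period (in ticks)."""
    return cycles * clock_period

def next_clock_edge(now, clock_period):
    """Round a tick value up to the next clock edge (ceiling division)."""
    return -(-now // clock_period) * clock_period

HIT_LATENCY = 2  # cycles; the same parameter value for both clocks

print(cycles_to_ticks(HIT_LATENCY, 500))   # 1000 ticks with a 500-tick period
print(cycles_to_ticks(HIT_LATENCY, 1000))  # 2000 ticks with a 1000-tick period

print(next_clock_edge(1250, 500))  # 1500: mid-cycle events wait for the edge
print(next_clock_edge(1500, 500))  # 1500: already on an edge
```

Expressing the latency in cycles means the same configuration value follows the cache's clock when the frequency changes, while the rounding explains why short-running regressions can shift noticeably.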
9264:1607119c36bb |
25-Sep-2012 |
Djordje Kovacevic <djordje.kovacevic@arm.com> |
MEM: Put memory system document into doxygen |
9263:066099902102 |
25-Sep-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
Cache: add a response latency to the caches
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set on the cache for the backward path. |
9235:5aa4896ed55a |
19-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
AddrRange: Transition from Range<T> to AddrRange
This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap.
In addition to the obvious changes, including the removal of redundant includes, this patch also does some housekeeping in preparation for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. |
9216:a5f937d152bf |
11-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
clang: Fix issues identified by the clang static analyzer
This patch addresses a few minor issues reported by the clang static analyzer.
The analysis was run with:
scan-build -disable-checker deadcode \
    -enable-checker experimental.core \
    -disable-checker experimental.core.CastToStruct \
    -enable-checker experimental.cpluscplus |
9214:a42caed28e1f |
11-Sep-2012 |
Lena Olson <lena@cs.wisc.edu> |
Cache: Split invalidateBlk up to separate block vs. tags
This separates the functionality to clear the state in a block into blk.hh and the functionality to update the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing. |
9184:a1a8f137b796 |
07-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Param: Transition to Cycles for relevant parameters
This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition.
An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py. |
9165:f9e3dac185ba |
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Remove NACKs from packet and its use in endpoints
This patch removes the NACK from the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that).
The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occurred in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe.
Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up. |
9163:3b5e13ac1940 |
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Extend the QueuedPort interface and use where appropriate
This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value.
This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit. |
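The "sched" idea can be sketched with a toy model: the port exposes a scheduling call that hands the packet and its send tick to an unbounded queue, so the call cannot fail and returns nothing. This is illustrative Python, not the gem5 C++ interface; `ToyPacketQueue`, `ToyQueuedSlavePort`, and their methods are hypothetical names.

```python
import heapq
import itertools


class ToyPacketQueue:
    """Unbounded queue of packets ordered by their send tick."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # tie-breaker keeps FIFO order per tick

    def sched_send_timing(self, pkt, when):
        # The queue is unbounded, so scheduling always succeeds and
        # there is nothing useful to return.
        heapq.heappush(self._heap, (when, next(self._order), pkt))

    def pop_ready(self, now):
        """Return all packets whose send tick is <= now, earliest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


class ToyQueuedSlavePort:
    """Port that offers the 'sched' call itself, so owners need no wrapper."""

    def __init__(self):
        self.queue = ToyPacketQueue()

    def sched_timing_resp(self, pkt, when):
        self.queue.sched_send_timing(pkt, when)


port = ToyQueuedSlavePort()
port.sched_timing_resp("resp-B", when=20)
port.sched_timing_resp("resp-A", when=10)
early = port.queue.pop_ready(now=15)
late = port.queue.pop_ready(now=25)
```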
9152:86c0e6ca5e7c |
15-Aug-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs
This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements.
This is an incremental fix as there are still a few bugs/mem leaks with the switchout code, particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation. |
9131:b6b4d41ba9b9 |
27-Jul-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
cache: don't allow dirty data in the i-cache
Removes the optimization that forwards an exclusive copy to a requester on a read, only for the i-cache. This optimization isn't necessary because we typically won't be writing to the i-cache. |
9095:0e6bd7082fac |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics.
Most notably, the CPU, cache, and bus are addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus. |
9090:e4e22240398f |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Make getAddrRanges const
This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects. |
9088:73eeda352933 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going through getMasterPort to be able to ask the master. Over the course of the next few patches, all getMasterPort/getSlavePort in Port and MemObject are to be protocol agnostic, and the snooping is part of the protocol layer.
The function is already present on the master port, where it is implemented by the module itself, e.g. a cache. On the slave side, it is merely asking the connected master port. The same name is used by both functions despite their difference in behaviour. The initial design used isMasterSnooping on the slave port side, but the more verbose function name was later changed. |
9087:b5a084a6159b |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class.
The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway.
The patch has no impact on any regressions. |
9086:496304c8017d |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Fix: Address a few benign memory leaks
This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the destructor. The fix still has value as it removes false positives in the static analysis. |
9084:ace8383f2b7e |
29-Jun-2012 |
Lena Olson <lena@cs.wisc.edu> |
Cache: Fix the LRU policy for classic memory hierarchy
The LRU policy always evicted the least recently touched way, even if it contained valid data and another way was invalid, as can happen if a block has been invalidated by coherence. This can result in caches never warming up even though they are replacing blocks. This modifies the LRU policy to move blocks to LRU position on invalidation. |
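The effect of moving an invalidated block to the LRU position can be illustrated with a toy model. This is a minimal Python sketch, not the gem5 C++ code; `ToyLRUSet` and its methods are hypothetical names introduced only for illustration.

```python
class ToyLRUSet:
    """One cache set; index 0 is the MRU position, the last index is LRU."""

    def __init__(self, num_ways):
        # Each way is [tag, valid]; all ways start out invalid.
        self.ways = [[None, False] for _ in range(num_ways)]

    def find_victim(self):
        # The victim is always whatever sits in the LRU position.
        return self.ways[-1]

    def insert(self, tag):
        victim = self.ways.pop()            # evict the way in LRU position
        victim[0], victim[1] = tag, True
        self.ways.insert(0, victim)         # the new block becomes MRU

    def invalidate(self, tag):
        for i, way in enumerate(self.ways):
            if way[1] and way[0] == tag:
                way[1] = False
                # The behaviour described above: an invalidated way moves to
                # the LRU position so it is reused before any valid way.
                self.ways.append(self.ways.pop(i))
                return True
        return False


lru_set = ToyLRUSet(num_ways=2)
lru_set.insert("A")
lru_set.insert("B")                         # order is now [B, A]
lru_set.invalidate("B")                     # e.g. invalidated by coherence
victim_valid = lru_set.find_victim()[1]     # the invalid way is the victim
lru_set.insert("C")                         # reuses the invalid way, A survives
resident = sorted(tag for tag, valid in lru_set.ways if valid)
```

Without the move in `invalidate`, the valid block `A` would sit in the LRU position and be evicted while an invalid way stayed unused.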
9082:7f95b7f56577 |
29-Jun-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
Mem: fix master id assertion in cache_impl.hh
The assertion was applied to the wrong packet. This patch fixes the issue reported by Xiang Jiang on the gem5-dev mailing list. |
9076:fefce4388397 |
29-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Only invalidate a line in the cache when an uncacheable write is seen. |
9063:965c042379df |
07-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Delay deleting of incoming packets by one call.
This patch is a temporary fix until Andreas' four-phase patches get reviewed and committed. Removing FastAlloc seems to have exposed an issue which previously was reasonably rare in which packets are freed before the sending cache is done with them. This change puts incoming packets on a pendingDelete queue; they are deleted at the start of the next call, which breaks the dependency between when the caller returns true and when the packet is actually used by the sending cache.
Running valgrind on a multi-core linux boot and the memtester results in no valgrind warnings. |
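The deferred-deletion idea can be sketched as follows. This is a toy Python model under stated assumptions, not the gem5 implementation: `ToyCache`, `recv_timing`, and the `freed` list (which stands in for actually deleting a packet) are hypothetical.

```python
class ToyCache:
    """Receiver that delays releasing incoming packets by one call."""

    def __init__(self):
        self.pending_delete = []   # packets parked by the previous call
        self.freed = []            # stand-in for "delete pkt" in C++

    def recv_timing(self, pkt):
        # Release packets parked by the previous call; by now their sender
        # has returned from its own send call and no longer touches them.
        self.freed.extend(self.pending_delete)
        self.pending_delete.clear()

        # Park the current packet instead of releasing it immediately, so
        # the sender can still use it after this call returns True.
        self.pending_delete.append(pkt)
        return True


cache = ToyCache()
cache.recv_timing("pkt1")
freed_after_first = list(cache.freed)    # pkt1 is still alive here
cache.recv_timing("pkt2")
freed_after_second = list(cache.freed)   # pkt1 is released one call later
```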
9044:904ddeecc653 |
05-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
sim: Remove FastAlloc
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM. |
9032:42dfc00ee1a1 |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Turn the PortId into a transport function parameter
The main aim of this patch is to arrive at a suitable port interface for vector ports, including both the packet and the port id. This patch changes the bus transport functions (recvFunctional/Atomic/Timing) to require a PortId parameter indicating the source port. Previously this information was passed by setting the source field of the packet, and this is only required in the case of a timing request.
With this patch, the use of the source and destination field is also more restrictive, as they are only needed for timing accesses. The modifications to these fields for atomic snoops are now removed entirely, also making minor modifications to the cache. |
9031:32ecc0217c5e |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Unify the use of PortID in packet and port
This patch removes the Packet::NodeID typedef and unifies it with the Port::PortId. The src and dest fields in the packet are used to hold a port id (e.g. in the bus), and thus the two should actually be the same.
The typedef PortID is now global (in base/types.hh) and aligned with the ThreadID in terms of capitalisation and naming of the InvalidPortID constant.
Before this patch, two flags were used for valid destination and source, rather than relying on a named value (InvalidPortID), and this is now redundant, as the src and dest field themselves are sufficient to tell whether the current value is a valid port identifier or not. Consequently, the VALID_SRC and VALID_DST are removed.
As part of the cleaning up, a number of int parameters and local variables are updated to use PortID.
Note that Ruby still has its own NodeID typedef. Furthermore, the MemObject getMaster/SlavePort still has an int idx parameter with a default value of -1 which should eventually change to PortID idx = InvalidPortID. |
9019:ea7d6873af6e |
24-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Cache: Remove dangling doWriteback declaration
This patch removes the declaration of doWriteback as there is no implementation for this member function. |
8995:a029d2119487 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: restructure code that actually isn't a loop |
8991:69fad6658160 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: fix some iterator use and erase bugs |
8988:528f0fa80f76 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: Fix a number of incorrect case statements |
8985:4b517873c9ae |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Panic if you attempt to create a checkpoint with a cache in the system |
8975:7f36d4436074 |
01-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses.
For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the appropriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not allow snoop requests to be rejected or stalled. All these assumptions are now explicitly part of the port interface itself. |
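The split into request and response functions, with the type check living in the send functions, can be sketched with a toy model. This is Python for illustration only, not the gem5 C++ interface; all class and method names are hypothetical, and the slave here responds immediately, whereas real timing responses arrive at a later point in simulated time.

```python
class ToyPacket:
    def __init__(self):
        self.is_request = True

    def make_response(self):
        self.is_request = False


class ToySlavePort:
    """Receives requests and sends responses."""

    def __init__(self):
        self.peer = None

    def recv_timing_req(self, pkt):
        pkt.make_response()
        return self.send_timing_resp(pkt)

    def send_timing_resp(self, pkt):
        assert not pkt.is_request    # the check lives in the send function
        return self.peer.recv_timing_resp(pkt)


class ToyMasterPort:
    """Sends requests and receives responses."""

    def __init__(self):
        self.peer = None
        self.responses = []

    def send_timing_req(self, pkt):
        assert pkt.is_request        # the check lives in the send function
        return self.peer.recv_timing_req(pkt)

    def recv_timing_resp(self, pkt):
        self.responses.append(pkt)
        return True


master, slave = ToyMasterPort(), ToySlavePort()
master.peer, slave.peer = slave, master

pkt = ToyPacket()
accepted = master.send_timing_req(pkt)
```

Because each function handles exactly one packet type, no branch on the packet type is needed in the receiver.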
8949:3fa1ee293096 |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field.
Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class.
In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing. |
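The routing scheme described in this entry can be sketched as a toy model: requests are routed by address and stamped with a source, and responses are routed back by destination. This is Python for illustration, not the gem5 C++ code; `ToyPacket`, `ToyBus`, `NO_PORT`, and the handler callback are hypothetical names.

```python
NO_PORT = -1   # hypothetical marker for "no valid source/destination"


class ToyPacket:
    def __init__(self, addr):
        self.addr = addr
        self.src = NO_PORT     # a new request has neither source ...
        self.dest = NO_PORT    # ... nor destination
        self.is_request = True

    def make_response(self):
        # The (possibly populated) source becomes the destination.
        self.is_request = False
        self.dest = self.src
        self.src = NO_PORT


class ToyBus:
    """Multiplexes several masters onto address-decoded slaves."""

    def __init__(self, slave_ranges):
        # slave_ranges: list of (start, end, handler), end exclusive.
        self.slave_ranges = slave_ranges

    def recv_request(self, pkt, from_port_id):
        pkt.src = from_port_id           # remember which master port sent it
        for start, end, handler in self.slave_ranges:
            if start <= pkt.addr < end:  # requests are routed by address
                handler(pkt)             # the slave turns it into a response
                return self.route_response(pkt)
        raise ValueError("no slave responds to address %#x" % pkt.addr)

    def route_response(self, pkt):
        # Responses are routed by the destination field alone.
        return pkt.dest


def toy_memory(pkt):
    pkt.make_response()


bus = ToyBus([(0x0, 0x1000, toy_memory)])
pkt = ToyPacket(addr=0x100)
routed_to = bus.recv_request(pkt, from_port_id=2)
```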
8948:e95ee70f876c |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separate snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all used the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal requests and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses.
The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. |
8931:7a1dfb191e3f |
06-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Enable multiple distributed generalized memories
This patch removes the assumption of having a single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range.
All memories inherit from an AbstractMemory that encompasses the basic behaviour of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory.
To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables.
Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. |
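The wrapper's address decoding can be sketched with a toy model: each memory owns one contiguous range, and the wrapper picks the memory that contains the address. This is Python for illustration, not the gem5 C++ classes; `ToyMemory` and `ToyPhysicalMemory` are hypothetical names, and the byte-granularity access is a simplification.

```python
class ToyMemory:
    """A memory responsible for one contiguous address range."""

    def __init__(self, start, size):
        self.start = start
        self.size = size
        self.data = bytearray(size)

    def contains(self, addr):
        return self.start <= addr < self.start + self.size

    def read(self, addr):
        return self.data[addr - self.start]

    def write(self, addr, value):
        self.data[addr - self.start] = value


class ToyPhysicalMemory:
    """Wrapper that knows all memories and decodes addresses for them."""

    def __init__(self, memories):
        self.memories = list(memories)

    def _decode(self, addr):
        for mem in self.memories:
            if mem.contains(addr):
                return mem
        raise ValueError("address %#x is not backed by any memory" % addr)

    def read(self, addr):
        return self._decode(addr).read(addr)

    def write(self, addr, value):
        self._decode(addr).write(addr, value)


low = ToyMemory(start=0x0000, size=0x100)
high = ToyMemory(start=0x1000, size=0x100)
physmem = ToyPhysicalMemory([low, high])

physmem.write(0x1004, 0x5A)          # lands in the second memory
value = physmem.read(0x1004)
```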
8922:17f037ad8918 |
30-Mar-2012 |
William Wang <william.wang@arm.com> |
MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add assumptions and enforce compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specialisations are to come in later patches.
The getPort function is now getMasterPort and getSlavePort, which return a port reference rather than a pointer, as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. |
8914:8c3bd7bea667 |
22-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implementation of recvTiming and recvFunctional through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes. |
8883:c92153af04ac |
09-Mar-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
cache: Allow main memory to be at disjoint address ranges. |
8867:08cc303b718b |
01-Mar-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Fix an issue with LRU when bonus block is used to complete transaction.
The block is never inserted because it's the one extra block in the cache, but it can be invalidated twice in a row. In that case the block doesn't have a new master id (because it was never inserted), however it is valid and the accounting goes wrong at that point. |
8856:241ee47b0dc6 |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Simplify cache ports preparing for master/slave split
This patch splits the two cache ports into a master (memory-side) and slave (cpu-side) subclass of port with slightly different functionality. For example, it is only the CPU-side port that blocks incoming requests, and only the memory-side port that schedules send events outside of what the transmit list dictates.
This patch simplifies the two classes by relying further on SimpleTimingPort and also generalises the latter to better accommodate the changes (introducing trySendTiming and scheduleSend). The memory-side cache port overrides sendDeferredPacket to be able to not only send responses from the transmit list, but also send requests based on the MSHRs.
A follow on patch further simplifies the SimpleTimingPort and the cache ports. |
8839:eeb293859255 |
13-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port.
Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried out when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. |
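The binding rule, that a master may only be connected to a slave, can be sketched as a toy check at connection time. This is an illustrative Python model, not the gem5 Python port classes; `ToyPort` and `connect` are hypothetical names.

```python
class ToyPort:
    """A port with a role, checked only when it is connected."""

    def __init__(self, name, role):
        if role not in ("master", "slave"):
            raise ValueError("unknown port role: %r" % role)
        self.name = name
        self.role = role
        self.peer = None


def connect(a, b):
    # Exactly one side must be a master and the other a slave.
    if {a.role, b.role} != {"master", "slave"}:
        raise TypeError(
            "cannot bind %s (%s) to %s (%s)" % (a.name, a.role, b.name, b.role)
        )
    a.peer, b.peer = b, a


bus_master = ToyPort("bus.master", "master")
memory_port = ToyPort("memory.port", "slave")
connect(bus_master, memory_port)     # master to slave is accepted

cpu_port = ToyPort("cpu.dcache_port", "master")
try:
    connect(cpu_port, bus_master)    # master to master is rejected
    rejected = False
except TypeError:
    rejected = True
```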
8833:2870638642bd |
12-Feb-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids. Cache stats also display the requestor names in the vector subnames. Most cache stats now include "nozero" and "nonan" flags to reduce the amount of excessive cache stat dump. Also, simplified incMissCount()/incHitCount() functions. |
8832:247fee427324 |
12-Feb-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Add a master ID to each request object.
This change adds a master id to each request object which can be used to identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python. |
8831:6c08a877af8f |
12-Feb-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
prefetcher: Make prefetcher a sim object instead of it being a parameter on cache |
8809:bb10807da889 |
01-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head, hopefully the last time for this batch. |
8799:dac1e33e07b0 |
28-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with the main repo. |
8794:e2ac2b7164dd |
18-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of includes of config/full_system.hh. |
8786:8be24baf68b8 |
07-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of FULL_SYSTEM in mem. |
8737:770ccf3af571 |
31-Jan-2012 |
Koan-Sin Tan <koansin.tan@gmail.com> |
clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places. |
8736:2d8a57343fe3 |
31-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the otherPort from the cache ports
This patch is a very straightforward simplification, removing the unnecessary otherPort pointer from the cache port. The pointer was only used to forward range changes, and the address range is fixed for the cache. Removing the pointer simplifies the transition to master/slave ports. |
8712:7f762428a9f5 |
17-Jan-2012 |
William Wang <william.wang@arm.com> |
MEM: Remove the functional ports from the memory system
The functional ports are no longer used and this patch cleans up the legacy that is still present in buses, memories, CPUs etc. Note that this does not refer to the class FunctionalPort (already removed), but rather ports with the name (and use) functional. |
8711:c7e14f52c682 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges. |
8710:aab813d6a162 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove Port removeConn and MemObject deletePortRefs
Cleaning up and simplifying the ports and going towards a more strict elaboration-time creation and binding of the ports. |
8708:7ccbdea0fa12 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Simplify ports by removing EventManager
This patch removes the inheritance of EventManager from the ports and moves all responsibility for event queues to the owner. Eventually the event manager should be the interface block, which could either be the structural owner or a subblock like a LSQ in the O3 CPU for example. |
8702:2764cd55d2ad |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Differentiate functional cache accesses from CPU and memory
This patch changes the functionalAccess member function in the cache model such that it is aware of what port the access came from, i.e. if it came from the CPU side or from the memory side. By adding this information, it is possible to respect the 'forwardSnoops' flag for snooping requests coming from the memory side and not forward them. This fixes an outstanding issue with the IO bus getting accesses that have no valid destination port and also cleans up future changes to the bus model. |
8607:5fb918115c07 |
31-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions. |
8548:33bdc36bf46f |
13-Sep-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Prefetch: Don't prefetch if address is in the write queue.
Check that we're not currently writing back an address the prefetcher is trying to prefetch before issuing it. We previously checked the mshrQueue and the cache itself, but forgot to check the writeBuffer. This fixes a memory corruption issue with an L2 prefetcher. |
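The three-way check can be sketched as a small predicate. This is an illustrative Python model, not the gem5 C++ code; `should_issue_prefetch` is a hypothetical function, and plain sets of block addresses stand in for the cache, the MSHR queue, and the write buffer.

```python
def should_issue_prefetch(addr, cache_blocks, mshr_queue, write_buffer):
    """Return True only if no structure already holds or is moving addr."""
    if addr in cache_blocks:
        return False     # already present in the cache
    if addr in mshr_queue:
        return False     # a miss for this block is already outstanding
    if addr in write_buffer:
        return False     # the check added here: block is being written back
    return True


cache_blocks = {0x100}
mshr_queue = {0x200}
write_buffer = {0x300}

decisions = {
    addr: should_issue_prefetch(addr, cache_blocks, mshr_queue, write_buffer)
    for addr in (0x100, 0x200, 0x300, 0x400)
}
```

Only the address that appears in none of the three structures (`0x400` here) is prefetched.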
8533:8dac0abb7a1b |
01-Sep-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Fix build for gcc-4.2 opt/fast
Even though the code is safe, the compiler flags a warning here, and warnings are treated as errors for fast/opt. I know it's redundant but it has no side effects and fixes the compile. |
8526:2e5d41fbc4a5 |
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Put prefetcher notify call before packet is deleted. |
8509:afb40c3d4ba6 |
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Prefetcher: Fix some memory leaks with the prefetcher. |
8472:37d052b21555 |
15-Jul-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Fix issue with prefetches originating at non-L1 caches getting stale data
Prefetch requests issued from the L2 or below wouldn't check if valid data is present higher in the system. If a prefetch into the L2 occurred at the same time as a writeback from a higher-level cache, the dirty data could be replaced by unmodified data in memory. |
8335:9228e00459d4 |
02-Jun-2011 |
Nathan Binkert <nate@binkert.org> |
scons: rename TraceFlags to DebugFlags |
8240:38befb82b2c9 |
19-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
stats: rename stats so they can be used as python expressions |
8232:b28d06a175be |
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help |
8229:78bf55f23338 |
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes |
8134:b01a51ff05fa |
17-Mar-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.
This change fixes the problem for all the cases we actively use. If you want to try more creative I/O device attachments (E.g. sharing an L2), this won't work. You would need another level of caching between the I/O device and the cache (which you actually need anyway with our current code to make sure writes propagate). This is required so that you can mark the cache in between as top level and it won't try to send ownership of a block to the I/O device. Asserts have been added that should catch any issues. |
8066:cb7bf3919bdd |
23-Feb-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Includes: Don't include isa_traits.hh and use the TheISA namespace unless really needed. |
7823:dac01f14f20f |
08-Jan-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values. |
7768:cdb18c1b51ea |
19-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
SCons: Support building without an ISA |
7710:5e129d3c6d7e |
18-Oct-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: minor SC assertion fix
Thanks to Joe Gross for finding/testing this. |
7705:fd65f85fcc0c |
13-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Change the CLREX flag to CLEAR_LL.
CLREX is the name of an ARM instruction, not a name for this generic flag. |
7687:d1ba390671ec |
22-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: improve coherence handling of writebacks
If we write back an exclusive copy, we now mark it as such, so the cache receiving the writeback can mark its copy as exclusive. This avoids some unnecessary upgrade requests when a cache later tries to re-acquire exclusive access to the block. |
7676:92274350b953 |
10-Sep-2010 |
Nathan Binkert <nate@binkert.org> |
style: fix sorting of includes and whitespace in some files |
7669:cc222ba29079 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fail SC when invalidated while waiting for bus
Corrects an oversight in cset f97b62be544f. The fix there only failed queued SCUpgradeReq packets that encountered an invalidation, which meant that the upgrade had to reach the L2 cache. To handle pending requests in the L1 we must similarly fail StoreCondReq packets too. |
7668:aec271db42c9 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix functional accesses to deal with coherence change
We can't just obliviously return the first valid cache block we find any more... see comments for details. |
7667:aa8fd8f6a495 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: coherence protocol enhancements & bug fixes
Allow lower-level caches (e.g., L2 or L3) to pass exclusive copies to higher levels (e.g., L1). This eliminates a lot of unnecessary upgrade transactions on read-write sequences to non-shared data.
Also some cleanup of MSHR coherence handling and multiple bug fixes. |
7659:657f0adae97c |
26-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix m5.fast compile bug in previous cset |
7658:3148ae920301 |
26-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix a bug in atomic multilevel snoops |
7636:59b6a1b5bb0c |
25-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix dumb typo in copyrights |
7612:917946898102 |
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
MEM: Make CLREX a first-class request operation and clear locks in caches when it is received |
7611:c119da5a80c8 |
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: Make sure that software prefetch instructions can't change the state of the TLB |
7576:4154f3e1edae |
23-Aug-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
Compiler: Fixes for GCC 4.5. |
7510:fb7fc9aca918 |
22-Jul-2010 |
Timothy M. Jones <tjones1@inf.ed.ac.uk> |
Port: Only indicate that a SimpleTimingPort is drained if its send event is not scheduled, as well as the transmit list being empty. |
7497:aab017d1adc6 |
08-Jul-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix bug in SC upgrade handling
This bug was introduced with the recent rework of SC failure handling in cset f97b62be544f. |
7468:6b72468fbad3 |
23-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix longstanding prefetcher bug
Thanks to Joe Gross for pointing this out (again?). Apologies to anyone who pointed it out earlier and we didn't listen. |
7465:f97b62be544f |
16-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fail store conditionals when upgrade loses race
Requires new "SCUpgradeReq" message that marks upgrades for store conditionals, so downstream caches can fail these when they run into invalidations. See http://www.m5sim.org/flyspray/task/197 |
7464:8d92c2737ac8 |
16-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix dirty bit setting
Only set the dirty bit when we actually write to a block (not if we thought we might but didn't, as in a failed SC or CAS). This requires making sure the dirty bit stays set when we get an exclusive (writable) copy in a cache-to-cache transfer from another owner, which in turn requires copying the mem-inhibit flag from timing-mode requests to their associated responses. |
7461:5a07045d0af2 |
15-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
stats: only consider a formula initialized if there is a formula |
6979:7732bca47f60 |
24-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache stats: account for writebacks and/or device occupancy in the cache. Plus, a minor bugfix: blk->contextSrc was not being updated in certain cases on a cache insert. |
6978:ab05e20dc4a7 |
23-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: Make caches sharing aware and add occupancy stats. On the config end, if a shared L2 is created for the system, it is parameterized to have n sharers as defined by option.num_cpus. In addition to making the cache sharing aware so that discriminating tag policies can make use of context_ids to make decisions, I added an occupancy AverageStat and an occ % stat to each cache so that you could know which contexts are occupying how much cache on average, both in terms of blocks and percentage. Note that since devices have context_id -1, having an array of occ stats that correspond to each context_id will break here, so in FS mode I add an extra bucket for device blocks. This bucket is explicitly not added in SE mode in order to not only avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas break when a bucket is 0). |
6976:1d7008e14da6 |
23-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: pull CacheSet out of LRU so that other tags can use associative sets. |
6817:5aec45d0fc24 |
12-Jan-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: make tags->insertBlock() and tags->accessBlock() context aware so that the cache can make context-specific decisions within their various tag policy implementations. |
6666:3199397fd905 |
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Minor cleanup: Use the blockAlign() method where it applies in the cache. |
6665:874f2ee2f115 |
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Force prefetches to check cache and MSHRs immediately prior to issue. This prevents redundant prefetches from being issued, solving the occasional 'needsExclusive && !blk->isWritable()' assertion failure in cache_impl.hh that several people have run into. Eliminates "prefetch_cache_check_push" flag, neither setting of which really solved the problem. |
6658:f4de76601762 |
23-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh |
6429:7ed8937e375a |
02-Aug-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fix setting of INST_FETCH flag for O3 CPU. It's still broken in inorder. Also enhance DPRINTFs in cache and physical memory so we can see more easily whether it's getting set or not. |
6227:a17798f2a52c |
05-Jun-2009 |
Nathan Binkert <nate@binkert.org> |
types: clean up types, especially signed vs unsigned |
6221:58a3c04e6344 |
26-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: add a type for thread IDs and try to use it everywhere |
6216:2f4020838149 |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
includes: sort includes again |
6215:9aed64c9f10f |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
includes: use base/types.hh not inttypes.h or stdint.h |
6214:1ec0ec8933ae |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: Move stuff for global types into src/base/types.hh |
6122:9af6fb59752f |
16-Jul-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
mem: use single BadAddr responder per system. Previously there was one per bus, which caused some coherence problems when more than one decided to respond. Now there is just one on the main memory bus. The default bus responder on all other buses is now the downstream cache's cpu_side port. Caches no longer need to do address range filtering; instead, we just have a simple flag to prevent snoops from propagating to the I/O bus. |
6105:a27c0934de24 |
20-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: rename INST_READ to INST_FETCH. |
6102:7fbf97dc6540 |
20-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Change isLlsc to isLLSC. |
6076:e141cc7896ce |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Memory: Rename LOCKED for load locked store conditional to LLSC. |
6020:0647c8b31a99 |
06-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Merge ARM into the head. ARM will compile but may not actually work. |
6013:208de84f046d |
12-Mar-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: set dirty bit on swaps (oops!) |
6010:a1e71f3576f8 |
10-Mar-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
prefetch: don't panic on requests w/o contextID (e.g., writebacks). |
5999:3cf8e71257e0 |
05-Mar-2009 |
Nathan Binkert <nate@binkert.org> |
stats: Fix all stats usages to deal with template fixes |
5875:d82be3235ab4 |
16-Feb-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fixes to get prefetching working again. Apparently we broke it with the cache rewrite and never noticed. Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part of these changes (and for inspiring me to work on the rest). Some other overdue cleanup on the prefetch code too. |
5746:d7540fa81f1d |
14-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
Cache: get rid of obsolete Tag methods. I think readData() and writeData() were used for Erik's compression work, but that code is gone, these aren't called anymore, and they don't even really do what their names imply. |
5730:dea5fcd1ead0 |
10-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
Cache: Refactor packet forwarding a bit. Makes adding write-through operations easier. |
5717:6ed48cba2217 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
decouple eviction from insertion in the cache. |
5716:ee56bb539212 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Change the findBlock(addr, lat) to accessBlock, which I think has better connotations for what is really happening and how it should be used. |
5715:e8c1d4e669a7 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
get rid of all instances of readTid() and getThreadNum(). Unify and eliminate redundancies with threadId() as their replacement. |
5714:76abee886def |
02-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Add in Context IDs to the simulator. From now on, cpuId is almost never used, the primary identifier for a hardware context should be contextId(). The concept of threads within a CPU remains, in the form of threadId() because sometimes you need to know which context within a cpu to manipulate. |
5707:da86e00f87a0 |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
s/cpu_id/cpuId in o3 (to be consistent and match style), also fix some typos in comments. |
5706:2cc2387049bc |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
probe function no longer used anywhere. |
5705:aea94955635b |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
remove the totally obsolete split cache |
5699:ab3067124402 |
14-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
This function declaration isn't used anywhere. |
5606:6da7a58b0bc8 |
09-Oct-2008 |
Nathan Binkert <nate@binkert.org> |
eventq: convert all usage of events to use the new API. For now, there is still a single global event queue, but this is necessary for making the steps towards a parallelized m5. |
5543:3af77710f397 |
10-Sep-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
style: Remove non-leading tabs everywhere they shouldn't be. Developers should configure their editors to not insert tabs |
5494:85c8d296c1cb |
28-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Backed out changeset 94a7bb476fca: caused memory leak. |
5489:94a7bb476fca |
21-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Generate more useful error messages for unconnected ports. Force all non-default ports to provide a name and an owner in the constructor. |
5458:9ffc2be2d925 |
13-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Get rid of bogus cache assertion. I was asserting that the only reason you would defer targets is if a write came in while you had an outstanding read miss, but there's another case where you could get a read access after you've snooped an invalidation and buffered it because it applies to a prior outstanding miss. |
5402:05c388940eb6 |
15-May-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
Make sure that output files are always checked for success before they're used. Make OutputDirectory::resolve() private and change the functions using resolve() to instead use create(). |
5388:3b4772ca8368 |
25-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix handling of writeback-induced writebacks in atomic mode. |
5386:5614618f4027 |
24-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Don't FastAlloc MSHRs since we don't allocate them on the fly. |
5384:dc6bb852ca68 |
22-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix cache problem with writes to tempBlock getting wrong writeback address. |
5381:55789e3f65cd |
17-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix a few Packet memory leaks. |
5379:800a2f0641b5 |
15-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix subtle cache bug where read could return stale data if a prior write miss arrived while an even earlier read miss was still outstanding. |
5366:ccef4b20c987 |
27-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Revamp cache timing access mshr check to make stats sane again. |
5365:49bef92749d1 |
26-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Cache: better comments particularly regarding writeback situation. |
5350:67e5e13f4146 |
16-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Make L2+ caches allocate new block for writeback misses instead of forwarding down the line. |
5345:6a783e4946ac |
11-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Automated merge with file:/home/stever/hg/m5-orig |
5338:e75d02a09806 |
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix #include lines for renamed cache files. |
5337:f81512eb8bdf |
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Rename cache files for brevity and consistency with rest of tree. |
5321:14afee693b39 |
06-Jan-2008 |
Geoffrey Blake <blakeg@umich.edu> |
Temporary fix for ll/sc bug; see flyspray task for more info: http://www.m5sim.org/flyspray/task/197
Signed-off-by: Ali Saidi <saidi@eecs.umich.edu> |
5319:13cb690ba6d6 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Add ReadRespWithInvalidate to handle multi-level coherence situation where we defer a response to a read from a far-away cache A, then later defer a ReadExcl from a cache B on the same bus as us. We'll assert MemInhibit in both cases, but in the latter case MemInhibit will keep the invalidation from reaching cache A. This special response tells cache A that it gets the block to satisfy its read, but must immediately invalidate it. |
5318:fc6a69e31c8e |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Mark cache-to-cache MSHRs as downstreamPending when necessary. Don't mark upstream MSHR as pending if downstream MSHR is already in service. |
5317:5f5eb2456e8b |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Don't DPRINTF in the middle of a PrintReq. |
5315:30997e988446 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Additional comments and helper functions for PrintReq. |
5314:e902f12a3af1 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Add functional PrintReq command for memory-system debugging. |
5313:07c9a3b1539b |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix formatting and comments in cache_impl.hh |
5271:5e7547af97fb |
16-Nov-2007 |
Steve Reinhardt <stever@gmail.com> |
Tweak check for writable block fill. |
5270:ba8f3ca2a525 |
16-Nov-2007 |
Steve Reinhardt <stever@gmail.com> |
Fix bug on exclusive response to ReadReq with pending WriteReq. |
5213:ad68c4b99d6d |
04-Nov-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Cache: Fix for OS X 10.5 compiling. |
5197:5d7cf59548f5 |
16-Sep-2007 |
Steve Reinhardt <stever@gmail.com> |
mem: clean up bus/cache DPRINTFs a bit
Not so much noise on failed sends, and more complete info when grepping a trace using an address. |
5192:582e583f8e7e |
31-Oct-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Traceflags: Add SCons function to create a traceflag instead of having one file with them all. |
5034:6186ef720dd4 |
30-Aug-2007 |
Miles Kaufmann <milesck@eecs.umich.edu> |
params: Deprecate old-style constructors; update most SimObject constructors.
SimObjects not yet updated:
- Process and subclasses
- BaseCPU and subclasses
The SimObject(const std::string &name) constructor was removed. Subclasses that still rely on that behavior must call the parent initializer as: SimObject(makeParams(name)) |
5012:c0a28154d002 |
27-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head |
4986:b7c82ad6b3ef |
24-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Mem: Make errors in the memory system be responses, not requests. Fixes cache handling of error responses. |
4970:d0ed47928f9c |
12-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
MemorySystem: Fix the use of ?: to produce correct results. |
4965:ad0e792a5c78 |
10-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
DMA: Add IOCache and fix bus bridge to optionally only send requests one way so a cache can handle partial block requests for i/o devices. |
4948:55bcb35dc166 |
04-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head. |
4934:0573f1ffcc4d |
03-Aug-2007 |
Steve Reinhardt <stever@gmail.com> |
cache: get rid of obsolete params from python. |
4929:6db35d0c81c6 |
29-Jul-2007 |
Steve Reinhardt <stever@gmail.com> |
memory system: fix functional access bug. Make sure not to keep processing functional accesses after they've been responded to. Also use checkFunctional() return value instead of checking packet command field where possible, mostly just for consistency. |
4920:03b88702070e |
27-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache/memtest: fixes for functional accesses. |
4919:013a8e9117b6 |
27-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache: Get rid of unused variable. |
4918:3214e3694fb2 |
27-Jul-2007 |
Nathan Binkert <nate@binkert.org> |
Merge python and x86 changes with cache branch |
4917:9e84859dde4d |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Have owner respond to UpgradeReq to avoid race. |
4916:000ab733f1eb |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add downward express snoops for invalidations. |
4915:4c0e0f67fc94 |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Continue snooping after a writeback is encountered. |
4913:d81df43157b3 |
25-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Can't block on memInhibit packets (now that bus no longer filters them for us). |
4910:fd583ea6a3bb |
24-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
A couple more minor bug fixes for multilevel coherence. |
4908:771ec077a955 |
23-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace lowerMSHRPending flag with more robust scheme based on following Packet senderState links. |
4905:0ccda2bb3be7 |
22-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace DeferredSnoop flag with LowerMSHRPending flag. Turns out DeferredSnoop isn't quite the right bit of info we needed... see new comment in cache_impl.hh. |
4904:291184a5eb05 |
22-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
A few minor non-debug compilation issues. |
4903:865d314b7139 |
21-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Deal with invalidations intersecting outstanding upgrades. If the invalidation beats the upgrade at a lower level then the upgrade must be converted to a read exclusive "in the field". Restructure target list & deferred target list to factor out some common code. |
4902:bc666118c6e2 |
21-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Several more fixes for multi-level timing coherence.
- Add "deferred snoop" flag to Packet so upper-level caches can distinguish whether lower-level cache request was in-service or not at the time of the original snoop.
- Revamp response handling to properly handle deferred snoops on non-cache-fill requests (i.e. upgrades).
- Make sure forwarded writebacks are kept in write buffer at lower-level caches so they get snooped properly. |
4900:9397ff92c45c |
17-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Forward cache-to-cache responses through other caches. |
4899:6179f3039eb2 |
17-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Assert that an mshr has a target in getTarget(). |
4896:93f20b1f3925 |
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge from head. |
4895:d36959284fbc |
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up a bunch of multilevel coherence issues. Atomic mode seems to work. Timing is closer but not there yet. |
4889:a557a85bdb96 |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add CacheRepl trace flag and move a couple DPRINTFs to it. |
4888:a1c0cca0979f |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Move a couple of DPRINTFs from Cache to CachePort. |
4885:385a051ad874 |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge of DPRINTF fixes from head. |
4882:78904f539525 |
03-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Delete packets when we're done with them. |
4881:3e4b4f6ff9dd |
02-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Couple more minor bug fixes for FS timing mode.
src/cpu/simple/timing.cc: Fix another SC problem.
src/mem/cache/cache_impl.hh: Forgot to call makeTimingResponse() on uncached timing responses. |
4876:a18cedc19da5 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of remaining traces of obsolete CoherenceProtocol object. |
4872:c810a14f9a39 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Factor out a little more common code. |
4871:02c0ad6e09ee |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up a few statistics problems. Stats pretty much line up with old code, except:
- bug in old code included L1 latency in L2 miss time, making it too high
- UniCoherence did cache-to-cache transfers even from non-owner caches, so occasionally the icache would get a block from the dcache not the L2
- L2 can now receive ReadExReq from L1 since L1s have coherence |
4870:fcc39d001154 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of Packet result field. Error responses are now encoded in cmd field. |
4763:fef9a47b3732 |
24-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head. |
4762:c94e103c83ad |
24-Jul-2007 |
Nathan Binkert <nate@binkert.org> |
Major changes to how SimObjects are created and initialized. Almost all creation and initialization now happens in python. Parameter objects are generated and initialized by python. The .ini file is now solely for debugging purposes and is not used in construction of the objects in any way. |
4739:9f8edf47aeca |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix & tweak DPRINTFs for tracediff w/new cache code. Note that we should *not* print pointer values in DPRINTFs as these needlessly clutter tracediff output. |
4672:cc97e595e07d |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of coherence protocol object. |
4671:5d29d3be0f79 |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Revamp replacement-of-upgrade handling. |
4670:54ac1fb49a26 |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Handle deferred snoops better. |
4669:afd3ecbf9798 |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache_impl.hh: Change target overflow from assertion to warning.
src/mem/cache/cache_impl.hh: Change target overflow from assertion to warning. |
4668:fcce0b964c7c |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Handle replacement of block with pending upgrade.
src/mem/cache/tags/lru.cc: Add some replacement DPRINTFs |
4667:bf428e572091 |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Couple minor bug fixes...
src/mem/cache/cache_impl.hh: Handle grants with no packet.
src/mem/cache/miss/mshr.cc: Fix MSHR snoop hit handling. |
4666:5d110d024fcf |
25-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of requestCauses. Use timestamped queue to make sure we don't re-request bus prematurely. Use callback to avoid calling sendRetry() recursively within recvTiming. |
4665:9471921e5e08 |
24-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Better handling of deferred targets. |
4630:5a832c366b22 |
22-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fixes to hitLatency, blocking, buffer allocation. Single-cpu timing mode seems to work now. |
4628:17b3ce796176 |
21-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Getting closer...
configs/example/memtest.py: Add progress interval option.
src/base/traceflags.py: Add MemTest flag.
src/cpu/memtest/memtest.cc: Clean up tracing.
src/cpu/memtest/memtest.hh: Get rid of unused code. |
4627:2766d5cfbd9d |
17-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
configs/example/memtest.py: Hand merge redundant changes. |
4626:ed8aacb19c03 |
17-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
More major reorg of cache. Seems to work for atomic mode now, timing mode still broken.
configs/example/memtest.py: Revamp options. src/cpu/memtest/memtest.cc: No need for memory initialization. No need to make atomic response... memory system should do that now. src/cpu/memtest/memtest.hh: MemTest really doesn't want to snoop. src/mem/bridge.cc: checkFunctional() cleanup. src/mem/bus.cc: src/mem/bus.hh: src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.cc: src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/coherence_protocol.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/miss/SConscript: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: src/mem/packet.cc: src/mem/packet.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/tport.cc: More major reorg. Seems to work for atomic mode now, timing mode still broken. |
4621:0468bff29088 |
28-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4597:063f25d13229 |
20-Jun-2007 |
Nathan Binkert <binkertn@umich.edu> |
Make sure all parameters have default values if they're supposed to and make sure parameters have the right type. Also make sure that any object that should be an intermediate type has the right options set. |
4549:42b30b2529e1 |
10-Jun-2007 |
Nathan Binkert <binkertn@umich.edu> |
More realistic parameters |
4486:aaeb03a8a6e1 |
27-May-2007 |
Nathan Binkert <binkertn@umich.edu> |
Move SimObject python files alongside the C++ and fix the SConscript files so that only the objects that are actually available in a given build are compiled in. Remove a bunch of files that aren't used anymore. |
4478:33c4bf0ab4b9 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix getDeviceAddressRanges() to get snooping right. |
4477:375b35072b58 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
src/mem/cache/base_cache.hh: Manual conflict resolution. |
4475:fb185cc1c845 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Change getDeviceAddressRanges to use bool for snoop arg. |
4473:fa451e5f9f06 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Another pass of minor changes in preparation for new protocol.
src/mem/cache/cache_impl.hh: src/mem/cache/coherence/simple_coherence.hh: Get rid of old invalidate propagation logic in preparation for new multilevel snoop protocol.
src/mem/cache/coherence/coherence_protocol.cc: L2 cache now has protocol, so protocol must handle ReadExReq coming in from the CPU side.
src/mem/cache/miss/mshr_queue.cc: Assertion is failing, so let's take it out for now.
src/mem/packet.cc: src/mem/packet.hh: Add WritebackAck command. Reorganize enum to put responses next to corresponding requests. Get rid of unused WriteReqNoAck. |
4469:1a5deb8fffd3 |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
src/mem/bridge.cc: SCCS merged |
4458:d43aab911e6e |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
First set of changes for reorganized cache coherence support. Compiles but doesn't work... committing just so I can merge (stupid bk!).
src/mem/bridge.cc: Get rid of SNOOP_COMMIT.
src/mem/bus.cc: src/mem/packet.hh: Get rid of SNOOP_COMMIT & two-pass snoop. First bits of EXPRESS_SNOOP support.
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/prefetch/base_prefetcher.cc: Big reorg of ports and port-related functions & events.
src/mem/cache/cache.cc: src/mem/cache/cache_builder.cc: src/mem/cache/coherence/SConscript: Get rid of UniCoherence object. |
4456:02b3756b83e4 |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4451:bfb7c7c0b7ea |
14-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
add ugliness to fix DMAs
src/dev/io_device.cc: extra printing and assertions
src/mem/bridge.hh: deal with packets only satisfying part of a request by making many requests
src/mem/cache/cache_impl.hh: make the cache try to satisfy a functional request from the cache above it before checking itself |
4449:dc56f9418210 |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Eliminate unused PacketPtr from BaseCache's RequestEvent and ResponseEvent. Compiles but not tested. |
4448:4c1ae4adf9bb |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Split BaseCache::CacheEvent into RequestEvent and ResponseEvent. Compiles but not tested. |
4444:0648bdc8d1c9 |
10-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
remove hit_latency and make latency do the right thing
set the latency parameter in terms of a latency
add caches to tsunami-simple configs
configs/common/Caches.py: tests/configs/memtest.py: tests/configs/o3-timing-mp.py: tests/configs/o3-timing.py: tests/configs/simple-atomic-mp.py: tests/configs/simple-timing-mp.py: tests/configs/simple-timing.py: set the latency parameter in terms of a latency
configs/common/FSConfig.py: give the bridge a default latency too
src/mem/cache/cache_builder.cc: src/python/m5/objects/BaseCache.py: remove hit_latency and make latency do the right thing
tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: add caches to tsunami-simple configs |
4435:7da241055348 |
09-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
add a backoff algorithm when nacks are received by devices
add separate response buffers and request queue sizes in bus bridge
add delay to respond to a nack in the bus bridge
src/dev/i8254xGBe.cc: src/dev/ide_ctrl.cc: src/dev/ns_gige.cc: src/dev/pcidev.hh: src/dev/sinic.cc: add backoff delay parameters
src/dev/io_device.cc: src/dev/io_device.hh: add a backoff algorithm when nacks are received.
src/mem/bridge.cc: src/mem/bridge.hh: add separate response buffers and request queue sizes; add a new parameter to specify how long before a nack is ready to go after a packet that needs to be nacked is received
src/mem/cache/cache_impl.hh: assert on the
src/mem/tport.cc: add a friendly assert to make sure the packet was inserted into the list |
4321:6f8b597ab244 |
04-Apr-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
The MemoryObject that owns a port should delete it if it so chooses when deletePortRefs() is called on it with that port as a parameter. In this way a MemoryObject can keep a functional port around and give it to anyone who wants to do functional accesses rather than creating a new one each time.
src/mem/bus.cc: src/mem/bus.hh: src/mem/cache/cache_impl.hh: only keep around one func port we give to anyone who wants it. Otherwise we can run out of port ids reasonably quickly if a lot of functional accesses are happening (e.g. remote debugging, dprintk, etc) |
4300:39657530a8c3 |
28-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Call compare and Swap on the target, not the response. |
4296:f7855a71f660 |
27-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/head |
4284:c8800319ed0c |
23-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zamp.eecs.umich.edu:/z/ktlim2/clean/tmp/clean2
src/cpu/base_dyn_inst.hh: Hand merge. Line is no longer needed because it's handled in the ISA. |
4219:e3f636da1042 |
27-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
First Pass At Cmp/Swap in caches |
4211:d3a09a666b68 |
12-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Clean up more memory leaks |
4203:b5c2bb0b9cae |
12-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some of the memory leaks related to writebacks
src/cpu/memtest/memtest.cc: Add the [] to a delete to make it work correctly src/mem/cache/cache_impl.hh: Fix one of the memory leaks |
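The `delete` versus `delete []` fix mentioned above follows a general C++ rule: memory obtained with `new T[n]` must be released with `delete []`. A minimal sketch, assuming a hypothetical helper `sumScratchBuffer` that is not part of the memtester:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: fills a scratch buffer with 0, 1, 2, ... and sums it.
// The buffer comes from new[], so it has to be released with delete [].
std::uint64_t sumScratchBuffer(std::size_t n)
{
    assert(n > 0);
    std::uint8_t *buf = new std::uint8_t[n];
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<std::uint8_t>(i);
        sum += buf[i];
    }
    delete [] buf;   // a plain "delete buf;" here would be undefined behavior
    return sum;
}
```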
4202:f7a05daec670 |
11-Mar-2007 |
Nathan Binkert <binkertn@umich.edu> |
Rework the way SCons recurses into subdirectories, making it automatic. The point is that now a subdirectory can be added to the build process just by creating a SConscript file in it. The process has two passes. On the first pass, all subdirs of the root of the tree are searched for SConsopts files. These files contain any command line options that ought to be added for a particular subdirectory. On the second pass, all subdirs of the src directory are searched for SConscript files. These files describe how to build any given subdirectory. I have added a Source() function. Any file (relative to the directory in which the SConscript resides) passed to that function is added to the list of files to be built. Clean up everything to take advantage of Source(). |
4190:5069dfa3d62e |
08-Mar-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
stop m5 from leaking like a sieve; don't create a new physPort/virtPort every time activateContext() is called; add the ability to tell a memory object to delete its reference to a port and a method to have a port call deletePortRefs() on the port owner as well as delete its peer; still need to stop calling connectMemoPorts() every time activateContext() is called or we'll overflow the bus id and panic
src/cpu/thread_state.cc: if we have a (phys|virt)Port don't create a new one, have it delete its peer and then reuse it src/mem/bus.cc: src/mem/bus.hh: add ability to delete a port by using a hash_map instead of an array to store port ids; add a function to do deleting src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/mem_object.cc: src/mem/mem_object.hh: add a function to delete port references from a memory object src/mem/port.cc: src/mem/port.hh: add a removeConn function that tells the owner to delete any references to the port and then deletes its peer |
4167:ce5d0f62f13b |
06-Mar-2007 |
Nathan Binkert <binkertn@umich.edu> |
Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc|hh) to core.(cc|hh) since it really belongs there now. In the process, simplify how ticks are used in the python code. |
4040:eb894f3fc168 |
12-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
rename store conditional stuff as extra data so it can be used for conditional swaps as well. Add support for a twin 64 bit int load. Add Memory barrier and write barrier flags as appropriate. Make atomic memory ops atomic.
src/arch/alpha/isa/mem.isa: src/arch/alpha/locked_mem.hh: src/cpu/base_dyn_inst.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: rename store conditional stuff as extra data so it can be used for conditional swaps as well src/arch/alpha/types.hh: src/arch/mips/types.hh: src/arch/sparc/types.hh: add a largest read data type for statically allocating read buffers in atomic simple cpu src/arch/isa_parser.py: Add support for a twin 64 bit int load src/arch/sparc/isa/decoder.isa: Make atomic memory ops atomic Add Memory barrier and write barrier flags as appropriate src/arch/sparc/isa/formats/mem/basicmem.isa: add post access code block and define a twinload format for twin loads src/arch/sparc/isa/formats/mem/blockmem.isa: remove old microcoded twin load code src/arch/sparc/isa/formats/mem/mem.isa: swap.isa replaces the code in loadstore.isa src/arch/sparc/isa/formats/mem/util.isa: add a post access code block src/arch/sparc/isa/includes.isa: need bigint.hh for Twin64_t src/arch/sparc/isa/operands.isa: add a twin 64 int type src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: add support for twinloads add support for swap and conditional swap instructions rename store conditional stuff as extra data so it can be used for conditional swaps as well src/mem/packet.cc: src/mem/packet.hh: Add support for atomic swap memory commands src/mem/packet_access.hh: Add endian conversion function for Twin64_t type src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: Add support for atomic swap memory commands Rename sc code to extradata |
4034:ba523332c82b |
23-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
3 memory system fixes: 1. Update packet's flags properly when a snoop happens 2. Don't allow accesses to read a block's data if the block has outstanding MSHRs. This avoids a RAW hazard in MP systems that the memory system was not detecting properly earlier (a write required a block to upgrade, and while the upgrade was outstanding, a read came along and read old data). 3. Update MSHR's request upon a response being handled. If the MSHR has more targets than it can respond to in one cycle, then its request must be properly updated to the new head of the targets list.
src/mem/bus.cc: Update packet's flags properly upon snoop. src/mem/cache/cache_impl.hh: Be sure to not allow accesses to a block with outstanding MSHRs. src/mem/cache/miss/miss_queue.cc: Update MSHR's request upon a response being handled. |
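Fix 2 in the entry above (no reads from a block with outstanding MSHRs) can be sketched as a simple guard. The names `BlockView`, `outstandingMshrs`, and `canSatisfyRead` are hypothetical and chosen for illustration; the actual check in cache_impl.hh is not shown in this log.

```cpp
#include <cassert>

// Hypothetical, simplified view of a cache block for illustration only.
struct BlockView
{
    bool valid;
    int outstandingMshrs;   // misses/upgrades still in flight for this block
};

// A read may use the block's data only if no MSHR is pending on it.
// Otherwise a read arriving while an upgrade is outstanding could return
// old data, which is the RAW hazard the entry describes.
bool canSatisfyRead(const BlockView &blk)
{
    assert(blk.outstandingMshrs >= 0);
    return blk.valid && blk.outstandingMshrs == 0;
}
```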
4026:7c8c480474c6 |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into vm1.(none):/home/stever/bk/newmem-head |
4023:fbefb05ecf2e |
06-Feb-2007 |
Kevin Lim <ktlim@umich.edu> |
Fix for LL/SC that Ron sent me. |
4022:c422464ca16e |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make memory commands dense again to avoid cache stat table explosion. Created MemCmd class to wrap enum and provide handy methods to check attributes, convert to string/int, etc. |
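The idea of wrapping a dense enum in a class with attribute helpers can be sketched as follows. This is not the MemCmd class from the changeset: `SketchCmd`, its command list, and its attribute table are hypothetical stand-ins that only show the pattern (values run 0..NUM_CMDS-1, so per-command stat tables stay small, and attributes come from a lookup table).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a command wrapper; not the real MemCmd.
class SketchCmd
{
  public:
    // Dense values: usable directly as an index into per-command tables.
    enum Command { InvalidCmd, ReadReq, ReadResp, WriteReq, WriteResp, NUM_CMDS };

  private:
    struct Info { bool isRead; bool isResponse; const char *name; };

    // One attribute row per command, in enum order.
    static const Info &info(Command c)
    {
        static const Info table[NUM_CMDS] = {
            { false, false, "InvalidCmd" },
            { true,  false, "ReadReq"    },
            { true,  true,  "ReadResp"   },
            { false, false, "WriteReq"   },
            { false, true,  "WriteResp"  },
        };
        return table[c];
    }

    Command cmd;

  public:
    SketchCmd(Command c) : cmd(c) { assert(c >= 0 && c < NUM_CMDS); }

    bool isRead() const { return info(cmd).isRead; }
    bool isResponse() const { return info(cmd).isResponse; }
    int toInt() const { return static_cast<int>(cmd); }
    std::string toString() const { return info(cmd).name; }
};
```

Keeping the attributes in a table rather than encoding them as bits in the command value is what lets the values stay dense.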
4020:c77bd3d23e48 |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor DPRINTF fixes. |
3970:d54945bab95d |
03-Jan-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zower.eecs.umich.edu:/eecshome/m5/newmem |
3918:1f9a98d198e8 |
26-Jan-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
make our code a little more standards compliant; pretty close to compiling with Sun's compiler
briefly: add dummy return after panic()/fatal(); split out flags by compiler vendor; include cstring and cmath where appropriate; use std namespace for string ops
SConstruct: Add code to detect compiler and choose cflags based on detected compiler Fix zlib check to work with suncc src/SConscript: split out flags by compiler vendor src/arch/sparc/isa/decoder.isa: use correct namespace for sqrt src/arch/sparc/isa/formats/basic.isa: add dummy return around panic src/arch/sparc/isa/formats/integerop.isa: use correct namespace for stringops src/arch/sparc/isa/includes.isa: include cstring and cmath where appropriate src/arch/sparc/isa_traits.hh: remove dangling comma src/arch/sparc/system.cc: dummy return to make sun cc front end happy src/arch/sparc/tlb.cc: src/base/compression/lzss_compression.cc: use std namespace for string ops src/arch/sparc/utility.hh: no reason to say something is unsigned unsigned int src/base/compression/null_compression.hh: dummy returns for suncc front end src/base/cprintf.hh: use standard variadic argument syntax instead of gnuc specific renaming src/base/hashmap.hh: don't need to define hash for suncc src/base/hostinfo.cc: need stdio.h for sprintf src/base/loader/object_file.cc: munmap is in std namespace not null src/base/misc.hh: use M5 generic noreturn macros use standard variadic macro __VA_ARGS__ src/base/pollevent.cc: we need file.h for file flags src/base/random.cc: mess with include files to make suncc happy src/base/remote_gdb.cc: malloc memory for function instead of having a non-constant in an array size src/base/statistics.hh: use std namespace for floor src/base/stats/text.cc: include math.h for rint (cmath won't work) src/base/time.cc: use suncc version of ctime_r src/base/time.hh: change macro to work with both gcc and suncc src/base/timebuf.hh: include cstring for memset and use std:: src/base/trace.hh: change variadic macros to be normal format src/cpu/SConscript: add dummy returns where appropriate src/cpu/activity.cc: include cstring for memset src/cpu/exetrace.hh: include cstring for memcpy src/cpu/simple/base.hh: add dummy return for panic src/dev/baddev.cc:
src/dev/pciconfigall.cc: src/dev/platform.cc: src/dev/sparc/t1000.cc: add dummy return where appropriate src/dev/ide_atareg.h: make define work for both gnuc and suncc src/dev/io_device.hh: add dummy returns where appropriate src/dev/pcidev.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.hh: src/mem/dram.cc: src/mem/packet.cc: src/mem/port.cc: include cstring for string ops src/dev/sparc/mm_disk.cc: add dummy return where appropriate include cstring for string ops src/mem/cache/miss/blocking_buffer.hh: src/mem/port.hh: Add dummy return where appropriate src/mem/cache/tags/iic.cc: cast hastSets to double for log() call src/mem/physical.cc: cast pmemAddr to char* for munmap src/sim/byteswap.hh: make define work for suncc and gnuc |
3862:ec47e4243107 |
19-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Streamline Cache/Tags interface: get rid of redundant functions, don't regenerate address from block in cache so that tags can turn around and use address to look up block again. |
3861:3b35b0f0b6a9 |
19-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
No need to template prefetcher on cache TagStore type. |
3860:73e3642713a3 |
18-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of generic CacheTags object (fold back into Cache). |
3738:c06cd072bbbe |
14-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Split CachePort class into CpuSidePort and MemSidePort and push those into derived Cache template class to eliminate a few layers of virtual functions and conditionals ("if (isCpuSide) { ... }" etc.). |
3721:6d0e55c05a46 |
05-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Don't compress data on writebacks unless it's actually necessary. |
3719:23ca579a363a |
04-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Turn cache MissQueue/BlockingBuffer into virtual object instead of template parameter. |
3712:c8a8938402cd |
03-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make cache compression policy a runtime virtual thing instead of a template policy. |
3682:bf27fd870dae |
28-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Remove assertion. It's not needed and messes up writebacks when a 2 level cache is used in a uniprocessor setting. |
3678:a689a7cf337e |
22-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Do a functional access to levels above on a read as a temporary solution for L2's in FS
Fix a small writeback bug when missing in the L2 in atomic mode
src/mem/bus.cc: Fix a comment to make sense src/mem/cache/cache_impl.hh: Do a functional access to levels above on a read as a temporary solution for L2's in FS Also fix a small writeback miss in L2 issue src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Do a functional access to levels above on a read as a temporary solution for L2's in FS tests/quick/00.hello/ref/alpha/linux/o3-timing/m5stats.txt: tests/quick/00.hello/ref/alpha/linux/simple-timing/m5stats.txt: tests/quick/01.hello-2T-smt/ref/alpha/linux/o3-timing/m5stats.txt: Update ref's for writeback changes |
3660:63e9c578bf83 |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a bug to handle the fact that a CPU can send Functional accesses while a sendTiming has not returned in the call stack.
src/mem/cache/base_cache.cc: Sometimes a functional access comes while waiting on an outstanding packet being sent. This could be because the Timing CPU does some post-processing on the recvTiming which sends a functional access. Either the CPU should leave the pkt/req around (so they can be referenced in the mem system), or the mem system should remove them from outstanding lists and reinsert them if they fail in the sendTiming.
I did the latter; eventually we should consider doing the former if that is the correct behavior. |
3655:4cae75fbc19c |
14-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
If all the targets aren't satisfied, reinitialize the packet. |
3652:81081c5de9aa |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
If we didn't satisfy all targets, reset the packet we are requesting with. |
3648:e84414759d6b |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Since cpus now send out snoop ranges, remove it from the cache. |
3611:205c2bdcdbb0 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Handle packets being deleted by lower level properly. Fixes for Mem Leak associated with Writebacks.
src/mem/cache/miss/mshr_queue.cc: Fixes for Mem Leak associated with Writebacks. (Double Delete removed) |
3610:c0f97b22db1a |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't insert responses into the list more than once. If you get inserted in the front, reschedule the event |
3609:932a09e3e0c2 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Move code before a early return to make sure it is executed on all paths |
3608:8d8258faf7f6 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Yet another small bug in mem system related to flow control
src/mem/cache/cache_impl.hh: When upgrades change to readEx make sure to allocate the block Fix dprintf |
3607:7b7dd28784c4 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix functional access errors related to delayed responses in cachePort
src/mem/cache/base_cache.cc: On a delayed response, be sure to call the fixPacket wrapper to toggle hasData flag. src/mem/packet.cc: src/mem/packet.hh: Create a wrapper to toggle the hasData flag on delayed responses |
3606:9a4154893155 |
10-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More fixes for functional accesses. This now makes the writeback memory leak crash all configs. Working on that now.
src/mem/cache/base_cache.cc: Keep a list of the responders so we can search them on functional accesses. src/mem/cache/base_cache.hh: Properly put things on a list for responses so we can search the list. Also, be sure to check the outgoing ports lists on a functional access (factor some common code out there) src/mem/cache/cache_impl.hh: Properly return when the first read hit on a functional access. Make sure to call to check the other ports list of packets before forwarding it out. |
3503:0754b2b23408 |
07-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Fix up bus draining and add draining to the caches.
src/mem/bus.cc: Fix up draining to work properly. src/mem/bus.hh: Initialize drainEvent to NULL. src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Add draining to the caches. |
3487:dd7b0e5e907c |
02-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Caches return a new functional port whenever asked for one.
src/mem/cache/base_cache.cc: Have caches return a new functional port whenever asked for them. I'm pretty sure this is desired behavior. Ron can correct me if it's not. |
3401:1df0cb879413 |
31-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Ports now have a pointer to the MemObject that owns it (can be NULL).
src/cpu/simple/atomic.hh: Port now takes in the MemObject that owns it. src/cpu/simple/timing.hh: Port now takes in MemObject that owns it. src/dev/io_device.cc: src/mem/bus.hh: Ports now take in the MemObject that owns it. src/mem/cache/base_cache.cc: Ports now take in the MemObject that own it. src/mem/port.hh: src/mem/tport.hh: Ports now optionally take in the MemObject that owns it. |
3375:fc9deea82085 |
23-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Clean up cache DPRINTFs |
3374:d274a61e8e6c |
22-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
s/pktuest/request/ (all in comments) |
3369:1da3e60827b6 |
22-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Small bug fixes for timing LL/SC. Better now but not necessarily 100% there yet.
src/mem/cache/cache_impl.hh: Generate response packet on failed store conditional. src/mem/packet.hh: Clear packet flags when reinitializing. (SATISFIED in particular is one we don't want to leave set.) |
3367:5bd949e01861 |
21-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Just give up if a store conditional misses completely in the cache (don't treat as normal write miss). |
3365:323803612cbb |
21-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Refactor coherence state table initialization. |
3352:8e940d22b2a8 |
20-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/tport.cc: Merge PacketPtr changes |
3349:fec4a86fa212 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
Use PacketPtr everywhere |
3348:11f6ef023158 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
refactor code for the packet, get rid of packet_impl.hh and call it packet_access.hh and fix the #includes so things compile right. |
3342:19e716ad518e |
20-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Use fixPacket function everywhere. Fix fixPacket assert function. Stop timing port from forwarding the request if a response was found in its queue on a read.
src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: src/python/m5/objects/MemTest.py: Add parameter to configure what percentage of mem accesses are functional src/mem/cache/base_cache.cc: src/mem/cache/cache_impl.hh: Use fix Packet function src/mem/packet.cc: Fix an assert that was checking the wrong thing src/mem/tport.cc: Properly detect if we need to do the access to the functional device |
3341:82c51d920701 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix corner case on assertion. I need to move over to using the fixPacket function so I don't have to make the same changes everywhere. Still a functional access bug someplace I need to track down in timing mode.
src/mem/cache/base_cache.cc: src/mem/cache/cache_impl.hh: Fix corner case on assertion tests/configs/memtest.py: Updated memtester with uncacheable addresses and functional accesses |
3340:5b24f2c55fae |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix memtester to use functional access, fix cache to work functionally now that we could test it.
src/cpu/memtest/memtest.cc: Fix memtest to do functional accesses src/mem/cache/cache_impl.hh: Fix cache to handle functional accesses properly based on memtester changes Still need to fix functional accesses in timing mode now that the memtester can test it. |
3339:d1b3ec71baa4 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Small changes: ?? doesn't compile in warn statements. Should have been false, where I had a true.
src/cpu/o3/lsq_impl.hh: Apparently you can't have ?? in a warn statement (Something about trigraphs) src/mem/cache/cache_impl.hh: Forgot to signal atomic mode in snoopProbe |
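The trigraph issue noted above arises because, in C++ dialects that still process trigraphs, sequences such as `??!` or `??)` inside a string literal are replaced before the string is parsed. One common workaround is to escape the second question mark. A minimal sketch, with a hypothetical message text (the actual warn string is not shown in this log):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical warning text. Writing "?\?" instead of "??" keeps the
// compiler from treating "??!" as a trigraph, while the resulting string
// still contains two literal question marks followed by '!'.
const char *warnText()
{
    return "why?\?!";
}

// Sanity check: 'w','h','y','?','?','!' is six characters.
void checkWarnText()
{
    assert(std::strlen(warnText()) == 6);
}
```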
3338:fdb673b90ca7 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes to get single level uni-coherence to work. Now to try L2 caches in FS.
src/mem/cache/base_cache.hh: Fix uni-coherence for atomic accesses in coherence protocol access to port src/mem/cache/cache_impl.hh: Properly handle uni-coherence src/mem/cache/coherence/simple_coherence.hh: Properly forward invalidates (not done for MSI+ protocols (assumed top level for now)) src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Properly forward invalidates in atomic/timing uni-coherence |
3337:98e3fe23fe22 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3336:511088415da6 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Always get the functional access from the highest level of cache first.
src/mem/cache/cache_impl.hh: Get the read data from the highest level of cache on a functional access |
3334:79f481d4e307 |
18-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3328:50b7be1f9ab6 |
19-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
First cut at LL/SC support in caches (atomic mode only).
configs/example/fs.py: Add MOESI protocol to caches (uni coherence not quite working w/FS yet). |
3317:fc913ad3eba5 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Break a lot of overly long lines. Factor out some asserts that were on both sides of an if/else. |
3316:191bf5e30ac3 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of doData() lines (were already commented out). Reindent due to resulting changes in nesting. |
3315:f15ce6434ab0 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of obsolete in-cache copy support. |
3313:f44dfa966df5 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Include packet_impl.hh (need this on my laptop, but not on zizzer... g++ 4 thing maybe?) |
3310:21adbb41a37e |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for uni-coherence in timing mode for FS. Still a bug in atomic uni-coherence in FS.
src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: Make CPU models handle coherence requests src/mem/cache/base_cache.cc: Properly signal coherence CSHRs src/mem/cache/coherence/uni_coherence.cc: Only deallocate once |
3309:183edf675c27 |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes to cache eliminating the assumption that the Packet is still valid after sending out a request. Still need to rework upgrades into this system, but works for now.
src/mem/cache/base_cache.cc: Reorder code to be more readable src/mem/cache/base_cache.hh: Be sure to delete the copy on a bus block src/mem/cache/cache_impl.hh: Be sure to remove the copy on a writeback success src/mem/cache/miss/mshr_queue.cc: Apply De Morgan's law to make it easier to understand src/mem/tport.cc: Delete writebacks |
3308:b85887027c9b |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Properly check the pkt pointer on upgrades to ensure no segfaults when writebacks delete the packet. |
3307:ee2de66e23f1 |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix it so that the cache does not assume it still has the packet it sent out via sendTiming. Still need to fix upgrades to use this path
src/mem/cache/base_cache.cc: Copy the pkt to the MSHR before issuing the sendTiming where it may be changed/consumed src/mem/cache/cache_impl.hh: Use copy of packet, because sendTiming may have changed the pkt Also, delete the copy when the time comes |
3303:d67ab1244d38 |
14-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of unused CacheBlk << output operator. |
3293:4ac3d9486d6e |
13-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix for DMA's in FS caches. Fix CSHR's for flow control. Fix for Bus Bridges reusing packets (clean flags up)
Now both timing/atomic caches with MOESI in UP fail at same point.
src/dev/io_device.hh: DMA's should send WriteInvalidates src/mem/bridge.cc: Reusing packet, clean flags in the packet set by bus. src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Fix CSHR's for flow control. src/mem/packet.hh: Make a writeInvalidateResp, since the DMA expects responses to its writes |
3292:34666be8f3fb |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix CSHR retries |
3285:89b08bd7420e |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Remove bus and top level parameters from cache
src/mem/cache/base_cache.hh: Remove top level param from cache src/mem/cache/coherence/uni_coherence.cc: Remove top level parameters from the cache |
3284:917750443a75 |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Check the response queue on functional accesses. The response queue is not tying up an MSHR, should we change that or assume infinite storage for responses?
src/mem/cache/base_cache.cc: src/mem/tport.cc: Add in functional check of retry queued packets. |
3281:d0f7a2e1573f |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix problems with unCacheable addresses in timing-coherence
src/base/traceflags.py: src/mem/physical.cc: Add debug flags for physical memory accesses src/mem/cache/cache_impl.hh: Snoops to uncacheable blocks should not happen src/mem/cache/miss/miss_queue.cc: Set the size properly on unCacheable accesses |
3262:5f96609a30ef |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More cache fixes. Atomic coherence now works as well.
src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: Make Memtester able to test atomic as well src/mem/bus.cc: src/mem/bus.hh: Handle atomic snoops properly for cache->cache transfers src/mem/cache/cache_impl.hh: Debug output. Clean up memleak in atomic mode. Set hitLatency. Still need to send back reasonable number for atomic return value. src/mem/packet.cc: Add command strings for new commands src/python/m5/objects/MemTest.py: Add param to test atomic memory. |
3255:5b6cade9060f |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Use bus response time parameters. Fix bug with deadlocking
src/mem/cache/base_cache.cc: Make sure to not wait anymore |
3251:5ed435255205 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
When turning asserts into if's don't forget to invert.
src/mem/cache/base_cache.cc: When turning asserts into if's don't forget to invert. Must be too sleepy. |
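The inversion mentioned above is easy to get wrong because an assertion states the condition that must hold to continue, while an early-out `if` tests for the condition that must not hold. A small sketch with hypothetical names (`enqueueChecked`, `tryEnqueue`), not taken from base_cache.cc:

```cpp
#include <cassert>

// Assertion form: the condition is what must be TRUE to proceed.
// Returns the new occupancy.
int enqueueChecked(int used, int capacity)
{
    assert(used < capacity);
    return used + 1;
}

// Conditional form of the same check: the test is the NEGATION of the
// asserted condition, and the failure is handled instead of aborting.
bool tryEnqueue(int used, int capacity)
{
    if (!(used < capacity))   // inverted relative to the assert above
        return false;
    return true;
}
```

Copying the asserted expression into the `if` unchanged would reject exactly the cases that used to be legal.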
3250:e32f670162a5 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Writebacks can be pulled out from under the BusRequest when snoops of upgrades to owned blocks hit in the WB buffer |
3249:7144ab5a3c94 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Only issue responses if we aren't already blocked |
3246:29acc553907f |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Debugging info
src/base/traceflags.py: Add new flags for cacheport src/mem/bus.cc: Add debugging info src/mem/cache/base_cache.cc: Add debugging info |
3237:39baab979195 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some more code cleanup
src/mem/cache/base_cache.cc: Add sanity checks src/mem/cache/base_cache.hh: Fix for retry mechanism |
3236:f303f5e88656 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some more mem leaks, still some left. Update retry mechanism
src/mem/cache/base_cache.cc: Rework the retry mechanism src/mem/cache/base_cache.hh: Rework the retry mechanism Try to fix memory bug src/mem/cache/cache_impl.hh: Rework upgrades to not be blocked by slave src/mem/cache/miss/mshr_queue.cc: Fix mem leak on writebacks |
3235:87bec63ab497 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix cshr retries. Fix upgrades being blocked by slave |
3214:779bab9071b5 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/packet.hh: Hand merge code |
3212:41b04a73857f |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3211:fe9df4627b32 |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3208:97d9cc1e626f |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix several bugs pertaining to upgrades/mem leaks.
src/mem/cache/base_cache.cc: Fix a bug about not having a request to send src/mem/cache/base_cache.hh: Fix a bug with the blocking code src/mem/cache/cache.hh: Fix a bug with snoop hits in WB buffer src/mem/cache/cache_impl.hh: Fix a bug with snoop hits in WB buffer Also, add better DPRINTF's src/mem/cache/miss/miss_queue.cc: Fix a bug with upgrades (Need to clean it up later) src/mem/cache/miss/mshr.cc: Fix a memory leak bug, still some outstanding with writebacks not being deleted src/mem/cache/miss/mshr_queue.cc: Fix a bug about upgrades (need to clean up later) src/mem/packet.hh: Fix for newly added cmd attribute for upgrades tests/configs/memtest.py: More interesting testcase |
3207:0698f82cfbb3 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Handle NACK's that occur from devices on the same bus. Not fully implemented yet, but good enough for single level cache coherence
src/mem/packet.hh: Add a bit to distinguish invalidates and upgrades |
3205:135273dc77a9 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix how upgrades work. Remove some dead code.
src/mem/cache/cache_impl.hh: Upgrades don't need a response. Moved satisfied check into bus so removed some dead code. src/mem/cache/coherence/coherence_protocol.cc: src/mem/packet.hh: Upgrades don't require a response |
3204:1ac62ef68c44 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
One step closer to having NACK's work.
src/cpu/memtest/memtest.cc: Fix functional return path src/cpu/memtest/memtest.hh: Add snoop ranges in src/mem/cache/base_cache.cc: Properly signal NACKED src/mem/cache/cache_impl.hh: Catch nacked packet and panic for now |
3199:8e7749972f03 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a typo in the printf |
3197:c5c7d434d135 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a bitwise operation that was accidentally a logical operation. |
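The class of bug named above (a logical operator written where a bitwise one was intended) can be illustrated with a flag test. The function names and flag values below are hypothetical; the log does not show the expression that was fixed.

```cpp
#include <cassert>
#include <cstdint>

// Intended form: bitwise AND isolates the bits selected by the mask.
bool hasFlagBitwise(std::uint32_t flags, std::uint32_t mask)
{
    assert(mask != 0);
    return (flags & mask) != 0;
}

// Buggy form: logical AND only asks whether both operands are nonzero,
// so it reports "set" for any nonzero flags word, whatever the mask.
bool hasFlagLogical(std::uint32_t flags, std::uint32_t mask)
{
    return flags && mask;
}
```

With `flags = 0x1` and `mask = 0x4` the two forms disagree, which is why this kind of slip can go unnoticed until a flag other than the one under test happens to be set.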
3195:4aa11ac8395c |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3194:a304c81d654d |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Set size properly on uncache accesses. Don't use the senderState after you get a successful sendTiming. Not guaranteed to be correct
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.hh: Don't use the senderState after you get a successful sendTiming. Not guaranteed to be correct |
3193:3df743a775d5 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Add more DPRINTF's fix a supply condition.
src/mem/cache/cache_impl.hh: Add more useful DPRINTF's Remove the PC to get rid of asserts |
3192:f3e215dda3f6 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Have cpus send snoop ranges |
3189:bd5657abca1a |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't create a response if one isn't needed. |
3188:0a850349908c |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't block responses even if the cache is blocked. |
3185:1cc3355b84bf |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Make sure to propagate sendFunctional calls with functional not atomic.
src/mem/cache/cache_impl.hh: Fix an error case by putting a panic in. Make sure to propagate sendFunctional calls with functional not atomic. |
3175:693ce319ee95 |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Only respond if the pkt needs a response. Fix an issue with memory handling writebacks.
src/mem/cache/base_cache.hh: src/mem/tport.cc: Only respond if the pkt needs a response. src/mem/physical.cc: Make physical memory respond to writebacks, set satisfied for invalidates/upgrades. |
3174:b6b8440de50e |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3173:2df0d82268d6 |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Move away from using the statusChange function on snoops. Clean up snooping code in general. |
3172:2c84db071850 |
08-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace tests of LOCKED/UNCACHEABLE flags with isLocked()/isUncacheable(). |
3168:31c84f0573e1 |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
missing else |
3153:90c1e143e33d |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a missing pointer |
3152:f16d754e0b10 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
No need to keep trying to request the data bus if we are already waiting. |
3151:7e437baee004 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Add mechanism for caches to handle failure of the fast path on responses.
For now, responses have priority over requests (may want to revisit this).
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Add mechanism for caches to handle failure of the fast path on responses. |
3149:5409b6f356a3 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix infinite writebacks bug in cache.
src/mem/cache/cache_impl.hh: Make sure to pop the list. Fixes infinite writeback bug. src/mem/cache/miss/mshr_queue.cc: Add an assert as sanity check in case .full() stops working again. |
3148:765ddf2612f1 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3144:b6e9e1811d71 |
06-Oct-2006 |
Lisa Hsu <hsul@eecs.umich.edu> |
there are two main thrusts of this changeset.
1) return the periodicity of checkpoints back into the code (i.e. make m5 checkpoint n m meaningful again). 2) to do this, i had to muck around with being able to repeatedly schedule a SimLoopExitEvent, which led to changes in how exit simloop events are handled to make this easier.
src/arch/alpha/isa/decoder.isa: src/mem/cache/cache_impl.hh: modify arg. order for new calling convention of exitSimLoop. src/cpu/base.cc: src/sim/main.cc: src/sim/pseudo_inst.cc: src/sim/root.cc: now, instead of creating a new SimLoopExitEvent, call a wrapper schedExitSimLoop which handles all the default args. src/sim/sim_events.cc: src/sim/sim_events.hh: src/sim/sim_exit.hh: add the periodicity of checkpointing back into the code.
to facilitate this, there are now two wrappers (instead of just overloading exitSimLoop). exitSimLoop is only for exiting NOW (i.e. at curTick), while schedExitSimLoop schedules an exit event for the future. |
3137:5dd9b13986a7 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Another thread number removed |
3136:a1eba7e17de5 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Remove threadnum from cache everywhere for now. Fix so that blocking for the same reason doesn't fail, i.e. multiple writebacks want to set the blocked flag.
src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: Remove threadnum from cache everywhere for now |
3135:8e008e281579 |
05-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for functional accesses to use the snoop path. And small other tweaks to snooping coherence.
src/mem/cache/base_cache.hh: Make timing response at the time of send. src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: Update probe interface to be bi-directional for functional accesses src/mem/packet.hh: Add the function to create an atomic response to a given request |
3134:cf578b0dd70d |
05-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
First pass at snooping stuff that compiles and doesn't break.
Still need: - Handle NACKs on the receive side - Distinguish top level caches - Handle responses from caches failing the fast path - Handle BusError and propagate it - Fix the invalidate packet associated with snooping in the cache
src/mem/bus.cc: Make sure to snoop on functional accesses src/mem/cache/base_cache.cc: Wait to make a request into a response until it is ready to be issued src/mem/cache/base_cache.hh: Support range changes for snoops Set up snoop responses for cache->cache transfers src/mem/cache/cache_impl.hh: Only access the cache if it wasn't satisfied by cache->cache transfer Handle snoop phases (detect block, then snoop) Fix functional access to work properly (still need to fix snoop path for functional accesses) |
3075:b2e56d8b8566 |
22-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Still need LL/SC support in cache, add hack to always return success for now |
3039:9cec9533b941 |
17-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Changes to build m5.fast |
3013:a173458c7f4d |
16-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for blocking in the caches that needed to be pulled
src/mem/cache/base_cache.cc: Add in retry path for blocking with multi-level caches src/mem/cache/base_cache.hh: Pull more of the blocking fixes into head src/mem/packet.hh: Fix typo |
2994:f19cdc9c919c |
15-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zeep.pool:/z/saidi/tmp/m5.newmem |
2991:60cd98c72fd9 |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Pulled out changes to fix EIO programs with caches. Also fixes any translatingPort read/write Blob function problems with caches.
-Basically removed the ASID from places it is no longer needed due to PageTable
src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/base_prefetcher.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: Remove asid where it wasn't necessary anymore due to Page Table |
2990:d5074a2d3a9b |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zizzer.eecs.umich.edu:/.automount/zazzer/z/rdreslin/m5bk/newmem |
2989:9a6f66c38acc |
15-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
fixes for gcc 4.1. Nate needs to fix sinic builder stuff. Gabe needs to verify my fixes to decoder.isa.
OPT/DEBUG compiles for ALPHA_FS, ALPHA_SE, MIPS_SE, SPARC_SE with this changeset
README: Fix the swig version in the readme src/SConscript: remove sinic until nate fixes the builder crap for it src/arch/alpha/system.hh: src/arch/mips/isa/includes.isa: src/arch/sparc/isa/decoder.isa: src/base/stats/visit.cc: src/base/timebuf.hh: src/dev/ide_disk.cc: src/dev/sinic.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr_queue.cc: src/mem/packet.hh: src/mem/request.hh: src/sim/builder.hh: src/sim/system.hh: fixes for gcc 4.1 |
2982:0ecdb0879b14 |
14-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up doxygen. |
2980:eab855f06b79 |
15-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Cleaned up include files and got rid of many using directives in header files. |
2975:9f8a7f66c91b |
11-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
#include of iostream needed. |
2897:d30a4674261c |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some changes to support blocking in the caches
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache_impl.hh: Outstanding blocking updates for cache |
2885:703566816f07 |
10-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some fixes so that MSHRs are matched and we don't issue overlapping requests with detailed cpu
src/mem/cache/base_cache.cc: If we still have outstanding requests, need to schedule event again src/mem/cache/miss/miss_queue.cc: Need to use block size so overlapping requests match in the MSHRs src/mem/cache/miss/mshr.cc: Actually save the address, otherwise we can't match MSHRs |
2858:6b243823ac53 |
07-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix address range calculation. Still need bus to handle snoop ranges. On the way towards multi-level caches (L2)
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Fix address range calculation. Still need bus to handle snoop ranges. |
2856:89691405ec9c |
07-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Update cpus to use the getPort function to use a connector object to connect the I/D cache ports to memory
configs/test/test.py: Update to use new cpu getPort functionality src/cpu/base.cc: Make cpu's a memObject to expose getPort interface src/cpu/base.hh: Make cpu's a memObject to export getPort interface src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: Now use the connector via getPort interface src/mem/cache/base_cache.cc: Make sure the cache recognizes all port names |
2855:5ca2cdb32521 |
06-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Timing cache works for hello world test. Still need 1) detailed CPU (blocking ability in cache) 1a) Multiple outstanding requests (need to keep track of times for events) 2) Multi-level support 3) MP coherence support 4) LL/SC support 5) Functional path needs to be correctly implemented (temporarily works without multiple outstanding requests (simple cpu))
src/cpu/simple/timing.cc: Temp hack because timing cpu doesn't export ports properly so single I/D cache communicates only through the Icache port. src/mem/cache/base_cache.cc: Handle marking MSHR's in service Add support for getting CSHR's src/mem/cache/base_cache.hh: Make these functions visible at the base cache level src/mem/cache/cache.hh: make the functions virtual src/mem/cache/cache_impl.hh: Rename the function to make sense src/mem/packet.hh: Accidentally clearing the needsResponse field when sending a response back. |
2844:265f19c60d45 |
06-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Now timing reads work in single level of cache with simple cpu
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: Changes to handle timing reads in Simple CPU (blocking buffers) |
2835:d2a977df88de |
05-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some unset values in the request in the timing CPU. Properly implement the MSHR allocate function.
src/cpu/simple/timing.cc: Set the thread context in the CPU.
Need to do this properly, currently I just set it to Cpu=0 Thread=0. This will just cause all the stats in the cache based on these to just yield totals and not a distribution. src/mem/cache/miss/mshr.cc: Properly implement the allocate function for the MSHR. |
2827:45c3bdb0ffd4 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
AtomicSimpleCPU with a cache now runs the hello world! test program. Need to clean up a bunch of flags/hacks in the code. Then onto Timing mode.
Functional accesses also work properly, although not exactly how we wanted them. I'll need to clean that up as well.
src/cpu/simple/atomic.cc: Atomic CPU needs to set thread context so stats work in cache. Temporarily just use CPU=0 ThreadID=0 src/mem/cache/cache_impl.hh: Need to return success/failure properly still Physical memory object doesn't assert SATISFIED anymore, need to remove that flag src/mem/cache/tags/lru.cc: Doesn't work if the REQ doesn't set its ASID. Temporary fix use 0 always |
2826:d20db4a6f7d1 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
First pass, now compiles with current head of tree. Compile and initialization work, still working on functionality.
src/mem/cache/base_cache.cc: Temp fix for cpu's use of getPort functionality. CPU's will need to be ported to the new connector objects. Also, all packets have to have data or the delete fails. src/mem/cache/cache.hh: Fix function prototypes so overloading works src/mem/cache/cache_impl.hh: fix functions to match virtual base class src/mem/cache/miss/miss_queue.cc: Packets have to have data, or delete fails src/python/m5/objects/BaseCache.py: Update for newmem |
2825:d5d9593a1f19 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix the packet data allocation methods. Small fixes from changesets after my initial work.
This now compiles.
src/mem/cache/base_cache.cc: Fix getPort function that changed src/mem/cache/base_cache.hh: Fix get port function, provide default implementations of virtual functions in the base class src/mem/cache/cache.hh: Fix virtual function declarations src/mem/cache/cache_builder.cc: Fix params src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/iic.cc: src/mem/cache/tags/lru.cc: Properly allocate data in packet |
2814:b723c79f5349 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
All files compile in the mem directory except cache_builder
Missing some functionality (like split caches and copy support)
src/SConscript: Typo src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/ghb_prefetcher.hh: src/mem/cache/prefetch/stride_prefetcher.hh: src/mem/cache/prefetch/tagged_prefetcher_impl.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: src/mem/packet.hh: src/mem/request.hh: Fix so it compiles |
2813:89d9196456ac |
29-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Still missing prefetch and tags directories as well as cache builder. Some implementation details were left blank still, need to fill them in.
src/SConscript: Reorder build to compile all files first src/mem/cache/cache.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: More changesets pulled, now compiles everything in /miss directory and in the root directory src/mem/packet.hh: Add some more support, need to clean some of it out once everything is working |
2812:8e5feae75615 |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More Changes, working towards cache.cc compiling. Headers cleaned up.
src/mem/cache/cache_blk.hh: Remove XC |
2811:9da12e9830ce |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Backing in more changesets, getting closer to compiling. base_cache.cc compiles; continuing on
src/SConscript: Add in compilation flags for cache files src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Back in more fixes, now base_cache compiles src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/iic.cc: src/mem/cache/tags/lru.cc: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lru.cc: src/mem/packet.cc: src/mem/packet.hh: src/mem/request.hh: Backing in more changesets, getting closer to compile |
2810:5befce12ad70 |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Was having difficulty with merging the cache, reverted to an early version and will add back in the patches to make it work soon.
src/mem/cache/prefetch/tagged_prefetcher_impl.hh: Trying to merge src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.cc: src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/coherence_protocol.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/base_prefetcher.hh: src/mem/cache/prefetch/ghb_prefetcher.cc: src/mem/cache/prefetch/ghb_prefetcher.hh: src/mem/cache/prefetch/stride_prefetcher.cc: src/mem/cache/prefetch/stride_prefetcher.hh: src/mem/cache/prefetch/tagged_prefetcher.hh: src/mem/cache/tags/base_tags.cc: src/mem/cache/tags/base_tags.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/repl/gen.cc: src/mem/cache/tags/repl/gen.hh: src/mem/cache/tags/repl/repl.cc: src/mem/cache/tags/repl/repl.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_blk.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: Pulling an early version of the cache into the tree due to merging issues. Will apply patches and push. |
2665:a124942bacb8 |
31-May-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Updated Authors from bk prs info |
2632:1bb2f91485ea |
22-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
New directory structure: - simulator source now in 'src' subdirectory - imported files from 'ext' repository - support building in arbitrary places, including outside of the source tree. See comment at top of SConstruct file for more details. Regression tests are temporarily disabled; that system needs more extensive revisions.
SConstruct: Update for new directory structure. Modify to support build trees that are not subdirectories of the source tree. See comment at top of file for more details. Regression tests are temporarily disabled. src/arch/SConscript: src/arch/isa_parser.py: src/python/SConscript: Update for new directory structure. |