#
14193:7dd8a6df30e2 |
|
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Eliminate the Base(Slave|Master)Port classes.
The Port class has assumed all the duties of the less generic Base*Port classes, making them unnecessary. Since they don't add anything but make the code more complex, this change eliminates them.
Change-Id: Ibb9c56def04465f353362595c1f1c5ac5083e5e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20236 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
14035:60068a2d56e0 |
|
31-May-2019 |
Daniel Carvalho <odanrc@yahoo.com.br> |
Revert "mem-cache: Remove writebacks packet list"
This reverts commit bf0a722acdd8247602e83720a5f81a0b69c76250.
Reason for revert: This patch introduces a bug:
The problem here is that the insertion of block A may cause the eviction of block B, which on the lower level may cause the eviction of block A. Since A is not marked as present yet, A is "safely" removed from the snoop filter
However, by reverting it, using atomic and a Tags sub-class that can generate multiple evictions at once becomes broken when using Atomic mode and shall be fixed in a future patch.
Change-Id: I5b27e54b54ae5b50255588835c1a2ebf3015f002 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19088 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13981:577196ddd040 |
|
02-May-2019 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code.
This code will be preserved through version control, but otherwise creates clutter and will rot in place since it's never compiled.
Change-Id: Id265f6deac445116843956ea5cf1210d8127274e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13948:f8666d4d5855 |
|
18-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks packet list
Previously all atomic writebacks concerned a single block, therefore, when a block was evicted, no other block would be pending eviction. With sector tags (and compression), however, a single replacement can generate many evictions.
This can cause problems, since a writeback that evicts a block may evict blocks in the lower cache. If one of these conflict with one of the blocks pending eviction in the higher level, the snoop must inform it to the lower level. Since atomic mode does not have a writebuffer, this kind of conflict wouldn't be noticed.
Therefore, instead of evicting multiple blocks at once, we do it one by one.
Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13947:4cf8087cab09 |
|
08-Aug-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Handle data expansion
When a block in compressed form is overwriten, it may change its size. If the new compressed size is bigger, and the total size becomes bigger than the block size, one or more blocks will have to be evicted. This is called data expansion, or fat writes.
This change assumes that a first level cache cannot have a compressor, since otherwise data expansion should have been handled for atomic operations and writes. As such, data expansions should only be seen on writebacks. As writebacks are forwarded to the next level when failed, there should be no data expansions when servicing misses either.
This patch adds the functionality to handle data expansions by evicting the co-allocated blocks to make room for an expanded block.
Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13945:a573bed35a8b |
|
19-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression and decompression calls
Add a compressor to the base cache class and compress within block allocation and decompress on writebacks.
This change does not implement data expansion (fat writes) yet, nor it adds the compression latency to the block write time.
Change-Id: Ie36db65f7487c9b05ec4aedebc2c7651b4cb4821 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11410 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13892:0182a0601f66 |
|
22-Apr-2019 |
Gabe Black <gabeblack@google.com> |
mem: Minimize the use of MemObject.
MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
13860:8f8df5b68439 |
|
11-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Add packet matching functions
Add both block and non-block-aligned packet matching functions, so that both address and secure bits are checked when checking whether a packet matches a request.
Change-Id: Id0069befb925d112e06f250741cb47d9dfa249cc Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17533 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13784:1941dc118243 |
|
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
arch, cpu, dev, gpu, mem, sim, python: start using getPort.
Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary.
Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13749:b2486662285d |
|
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Allow tag-only accesses on latency calculation
Some accesses only need to search for a tag in the tag array, with no need to touch the data array. This is the case for CleanEvicts, evicts that don't find a corresponding block entry (since a write cannot be done in parallel with tag lookup), and maintenance operations.
Change-Id: I7365a915500b5d7ab636d49a9acc627072a7f58e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14878 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13746:723109f11d56 |
|
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use header delay on latency calculation
Previously the bus delay was being ignored for the access latency calculation, and then applied on top of the access latency. This patch fixes the order, as first the packet must arrive before the access starts.
Change-Id: I6d55299a911d54625c147814dd423bfc63ef1b65 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14876 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13717:11e81e2a98bd |
|
03-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: A Best-Offset Prefetcher
Michaud, P. (2015, June). A best-offset prefetcher. In 2nd Data Prefetching Championship.
Change-Id: I61bb89ca5639356d54aeb04e856d5bf6e8805c22 Reviewed-on: https://gem5-review.googlesource.com/c/14820 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13624:3d8220c2d41d |
|
13-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Updated version of the Signature Path Prefetcher
This implementation is based in the description available in: Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, and Zeshan Chishti. 2016. Path confidence based lookahead prefetching. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9 Reviewed-on: https://gem5-review.googlesource.com/c/14819 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13478:59414c401cd9 |
|
05-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks parameter from serviceMSHRTargets
Change 8ba77ae8fc98a355082da2bd9fdc6ecf4928f725 introduced the writebacks parameter, but it was never used.
Change-Id: I225e5b399de42d77c72fc0012d3dc93ef39b8853 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14896 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13418:08101e89101e |
|
18-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move access latency calculation to Cache
Access latency was not being calculated properly, as it was always assuming that for hits reads take as long as writes, and that parallel accesses would produce the same latency for read and write misses.
By moving the calculation to the Cache we can use the write/ read information, reduce latency variables duplication and remove Cache dependency from Tags.
The tag lookup latency is still calculated by the Tags.
Change-Id: I71bc68fb5c3515b372c3bf002d61b6f048a45540 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13697 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13416:d90887d0c889 |
|
09-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: implement a probe-based interface
The HW Prefetcher of a cache can now listen events from their associated CPUs and from its own cache.
Change-Id: I28aecd8faf8ed44be94464d84485bd1cea2efae3 Reviewed-on: https://gem5-review.googlesource.com/c/14155 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13358:5e1605b47a21 |
|
19-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move evictBlock(CacheBlk*, PacketList&) to base
Move evictBlock(CacheBlk*, PacketList&) to base cache, as it is both sub-classes implementations are equal.
Change-Id: I80fbd16813bfcc4938fb01ed76abe29b3f8b3018 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13656 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13352:75647326f19b |
|
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add write coalescing and write-no-allocate to the caches
Enable the cache to detect contiguous writes and hold on to the MSHR long enough to allow the entire line to be written. If the whole line is written, the MSHR will be sent out as an invalidation requests, as it is part of a whole-line write, i.e. no-fetch-on-write.
The cache is also able to switch to a write-no-allocate policy on the actual completion of the writes, and instead use the tempBlock and turn the write operation into a writeback.
These policies are all well-known, and described in works such as Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993.
Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae Reviewed-on: https://gem5-review.googlesource.com/c/12907 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13350:247e4108a5e8 |
|
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Restructure whole-line writes to simplify write merging
This patch changes how we deal with whole-line writes their responses. With these changes, we use the MSHR tracking to determine if a whole-line is written, and on a fill we simply handle the invalidation response, with the actual writes taking place as part of satisfying the CPU-side hit.
Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b Reviewed-on: https://gem5-review.googlesource.com/c/12905 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13223:081299f403fe |
|
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Rename blk.cc/hh to cache_blk.cc/hh
Rename the files blk.cc and blk.hh to cache_blk.cc and cache_blk.hh to comply with the usual file-class naming rules.
Change-Id: I8af45df3e4b8dd934fd9929ec914fb230cb2cb09 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13416 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13017:a620da03ab10 |
|
01-Sep-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix bug in handleAtomicReqMiss
"4976ff5 mem-cache: Refactor the recvAtomic function" introduced a bug where if an atomic request that fills in using the tempBlock it will not evict it when it finishes handling the request as it should. This triggers an assertion. This change fixes this bug.
Change-Id: I73c808a7e15237eddb36b5448ef6728f7bcf7fd9 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12644 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12754:15c1d281ce1a |
|
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Insert on block allocation
When a block is being replaced in an allocation, if successfull, the block will be inserted. Therefore we move the insertion functionality to allocateBlock().
allocateBlock's signature has been modified to allow this modification.
Change-Id: I60d17a83ff4f3021fdc976378868ccde6c7507bc Reviewed-on: https://gem5-review.googlesource.com/10812 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12730:6c2ea88bf129 |
|
16-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create an address aware TempCacheBlk
tempBlock has its member variables manually set in order to allow it to be used in the block address regeneration function. This is not necessary, and ti can be simply given the address, so it does not need to be aware of set and tag. This will simplify implementation of sector and skewed caches.
Change-Id: Iaffb10c323509722cd5589fe1030b818d43336d6 Reviewed-on: https://gem5-review.googlesource.com/9961 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12728:57bdea4f96aa |
|
30-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Replace block visitor with std::function
This change modifies forEachBlk tags function to accept std::function as parameter. It also adds an anyBlk tags function that given a condition, it iterates through the blocks and returns whether the condition is met.
Finally, it uses forEachBlk to implement the print, computeStats and cleanupRefs functions that also work for the FALRU class.
Change-Id: I2f75f4baa1fdd5a1d343a63ecace3eb9458fbf03 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10621 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12724:4f6fac3191d2 |
|
02-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Adopt a more sensible cache class hierarchy
This patch changes what goes into the BaseCache and what goes into the Cache, to make it easier to add a NoncoherentCache with as much re-use as possible. A number of redundant members and definitions are also removed in the process.
This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com>
Change-Id: Ie9dd73c4ec07732e778e7416b712dad8b4bd5d4b Reviewed-on: https://gem5-review.googlesource.com/10431 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12702:27cb33a96e0f |
|
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move replacements stat to the base cache class
Change-Id: I25dbcfcddfe1c422a76eb1af3f726c1360d8d110 Reviewed-on: https://gem5-review.googlesource.com/10426 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
#
12334:e0ab29a34764 |
|
30-Nov-2017 |
Gabe Black <gabeblack@google.com> |
misc: Rename misc.(hh|cc) to logging.(hh|cc)
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.).
Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
12091:f2d1af96ad2d |
|
13-Jun-2017 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem-cache: Add missing overrides to BaseCache
Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3880 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12084:5a3769ff3d55 |
|
07-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
mem: Replace EventWrapper use with EventFunctionWrapper
NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s event names to not properly match the rank they were generated by. This could occur if the public rank member is modified after the Rank's construction. A patch would mean refactoring Rank and `DRAMCtrl`b to privatize many of the members of Rank behind getters.
Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3745 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11892:c7ea349e1cd3 |
|
26-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Use pkt::getBlockAddr instead of BaseCace::blockAlign
Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
#
11744:5d33c6972dda |
|
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Make packet debug printing more uniform
Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change.
Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
#
11722:f15f02d8c79e |
|
30-Nov-2016 |
Sophiane Senni <sophiane.senni@gmail.com> |
mem: Split the hit_latency into tag_latency and data_latency
If the cache access mode is parallel, i.e. "sequential_access" parameter is set to "False", tags and data are accessed in parallel. Therefore, the hit_latency is the maximum latency between tag_latency and data_latency. On the other hand, if the cache access mode is sequential, i.e. "sequential_access" parameter is set to "True", tags and data are accessed sequentially. Therefore, the hit_latency is the sum of tag_latency plus data_latency.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
#
11483:d4c2e56d18b2 |
|
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: fix the line length in the cache related classes
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11454:e55afadc4e19 |
|
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache stats
Prune cache stats that are never actually used.
|
#
11436:f351b7f248db |
|
27-May-2015 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Add unused prefetch counter in caches
Added stat to the cache to account for HardPF'ed blocks that are evicted before being referenced (over-prefetching).
|
#
11375:f98df9231cdd |
|
17-Mar-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Create a separate class for the cache write buffer
This patch breaks out the cache write buffer into a separate class, without affecting any stats. The goal of the patch is to avoid encumbering the much-simpler write queue with the complex MSHR handling. In a follow on patch this simplification allows us to implement write combining.
The WriteQueue gets its own class, but shares a common ancestor, the generic Queue, with the MSHRQueue.
|
#
11331:cd5c48db28e6 |
|
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement
|
#
11284:b3926db25371 |
|
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make cache terminology easier to understand
This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions.
The following name changes are made:
* the packet memInhibit flag is renamed to cacheResponding
* the packet sharedAsserted flag is renamed to hasSharers
* the packet NeedsExclusive attribute is renamed to NeedsWritable
* the packet isSupplyExclusive is renamed responderHadWritable
* the MSHR pendingDirty is renamed to pendingModified
The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand.
|
#
11199:929fd978ab4e |
|
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add an option to perform clean writebacks from caches
This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated.
The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable.
The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py
|
#
11197:f8fdd931e674 |
|
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add cache clusivity
This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive.
The choice of policy guides the behaviuor on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective.
This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache.
Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks.
The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).
|
#
11191:9eabb2bf349b |
|
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not treat CleanEvict as a write operation
This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it.
|
#
11053:62544e45c0f4 |
|
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass.
|
#
11051:81b1f46061c8 |
|
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move cache_impl.hh to cache.cc
There is no longer any need to keep the implementation in a header.
|
#
10942:224c85495f96 |
|
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused RequestCause in cache
This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used.
|
#
10912:b99a6662d7c2 |
|
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Decouple draining from the SimObject hierarchy
Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this.
This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object.
While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining.
A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.
|
#
10887:279efb97ec99 |
|
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant is_top_level cache parameter
This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed.
This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).
|
#
10884:c60acdbdd6ad |
|
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow read-only caches and check compliance
This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data.
A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.
|
#
10821:581fb2484bd6 |
|
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent.
The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.
|
#
10767:993c2baa485a |
|
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant allocateUncachedReadBuffer in cache
This patch removes the no-longer-needed allocateUncachedReadBuffer. Besides the checks it is exactly the same as allocateMissBuffer and thus provides no value.
|
#
10764:b32578b2af99 |
|
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align all MSHR entries to block boundaries
This patch aligns all MSHR queue entries to block boundaries to simplify checks for matches. Previously there were corner cases that could lead to existing entries not being identified as matches.
There are, rather alarmingly, a few regressions that change with this patch.
|
#
10714:9ba5e70964a4 |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the cache debug messages
Avoid redundant inclusion of the name in the DPRINTF string.
|
#
10713:eddb533708cb |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.
|
#
10693:c0979b2ebda5 |
|
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarify usage of latency in the cache
This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components).
The introduced variables are: - lookupLatency: latency of tag lookup, occuring on any access - forwardLatency: latency that occurs in case of outbound miss - fillLatency: latency to fill a cache block We keep the existing responseLatency
The forwardLatency is used by allocateInternalBuffer() for: - MSHR allocateWriteBuffer (unchached write forwarded to WriteBuffer); - MSHR allocateMissBuffer (cacheable miss in MSHR queue); - MSHR allocateUncachedReadBuffer (unchached read allocated in MSHR queue) It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency.
|
#
10679:204a0f53035e |
|
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify cache behaviour for pending dirty responses
This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made.
|
#
10582:c04dc66e4316 |
|
02-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Remove WriteInvalidate support
Prepare for a different implementation following in the next patch
|
#
10345:b5bef3c8e070 |
|
27-Jun-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: write streaming support via WriteInvalidate promotion
Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the *Req and the *Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.
|
#
10344:fa9ef374075f |
|
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix a bug in the cache port flow control
This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ.
The patche fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.
|
#
10028:fb8c44de891a |
|
24-Jan-2014 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
mem: Add support for a security bit in the memory system
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
|
#
10020:2f33cb012383 |
|
24-Jan-2014 |
Matt Horsnell <matt.horsnell@ARM.com> |
mem: track per-request latencies and access depths in the cache hierarchy
Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request.
|
#
9529:28d6d9663a7e |
|
15-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tighten up cache constness and scoping
This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.
|
#
9486:569e1f1d762d |
|
28-Jan-2013 |
Anthony Gutierrez <atgutier@umich.edu> |
cache: remove drainManager because it's not used
the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.
|
#
9347:b02075171b57 |
|
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
mem: Add support for writing back and flushing caches
This patch adds support for the following optional drain methods in the classical memory system's cache model:
memWriteback() - Write back all dirty cache lines to memory using functional accesses.
memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested.
Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.
|
#
9342:6fec8f26e56d |
|
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
sim: Move the draining interface into a separate base class
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.
|
#
9294:8fb03b13de02 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
|
#
9288:3d6da8559605 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles that actually counter in ticks is now updated to count in cycles.
As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
|
#
9263:066099902102 |
|
25-Sep-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
Cache: add a response latency to the caches
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
|
#
9163:3b5e13ac1940 |
|
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Extend the QueuedPort interface and use where appropriate
This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value.
This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit.
|
#
9087:b5a084a6159b |
|
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class.
The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway.
The patch has no impact on any regressions.
|
#
8975:7f36d4436074 |
|
01-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses.
For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
|
#
8948:e95ee70f876c |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses.
The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
|
#
8922:17f037ad8918 |
|
30-Mar-2012 |
William Wang <william.wang@arm.com> |
MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
|
#
8914:8c3bd7bea667 |
|
22-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implemention of recvTiming and recvFunctional through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes.
|
#
8883:c92153af04ac |
|
09-Mar-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
cache: Allow main memory to be at disjoint address ranges.
|
#
8856:241ee47b0dc6 |
|
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Simplify cache ports preparing for master/slave split
This patch splits the two cache ports into a master (memory-side) and slave (cpu-side) subclass of port with slightly different functionality. For example, it is only the CPU-side port that blocks incoming requests, and only the memory-side port that schedules send events outside of what the transmit list dictates.
This patch simplifies the two classes by relying further on SimpleTimingPort and also generalises the latter to better accommodate the changes (introducing trySendTiming and scheduleSend). The memory-side cache port overrides sendDeferredPacket to be able to not only send responses from the transmit list, but also send requests based on the MSHRs.
A follow on patch further simplifies the SimpleTimingPort and the cache ports.
|
#
8833:2870638642bd |
|
12-Feb-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids. Cache stats also display the requestor names in the vector subnames. Most cache stats now include "nozero" and "nonan" flags to reduce the amount of excessive cache stat dump. Also, simplified incMissCount()/incHitCount() functions.
|
#
8809:bb10807da889 |
|
01-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head, hopefully the last time for this batch.
|
#
8799:dac1e33e07b0 |
|
28-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with the main repo.
|
#
8794:e2ac2b7164dd |
|
18-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of includes of config/full_system.hh.
|
#
8786:8be24baf68b8 |
|
07-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of FULL_SYSTEM in mem.
|
#
8737:770ccf3af571 |
|
31-Jan-2012 |
Koan-Sin Tan <koansin.tan@gmail.com> |
clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.
|
#
8736:2d8a57343fe3 |
|
31-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the otherPort from the cache ports
This patch is a very straight-forward simplification, removing the unecessary otherPort pointer from the cache port. The pointer was only used to forward range changes, and the address range is fixed for the cache. Removing the pointer simplifies the transition to master/slave ports.
|
#
8711:c7e14f52c682 |
|
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.
|
#
8232:b28d06a175be |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
|
#
8229:78bf55f23338 |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes
|
#
8134:b01a51ff05fa |
|
17-Mar-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.
This change fixes the problem for all the cases we actively use. If you want to try more creative I/O device attachments (E.g. sharing an L2), this won't work. You would need another level of caching between the I/O device and the cache (which you actually need anyway with our current code to make sure writes propagate). This is required so that you can mark the cache in between as top level and it won't try to send ownership of a block to the I/O device. Asserts have been added that should catch any issues.
|
#
7823:dac01f14f20f |
|
08-Jan-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Replace curTick global variable with accessor functions. This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values.
|
#
7676:92274350b953 |
|
10-Sep-2010 |
Nathan Binkert <nate@binkert.org> |
style: fix sorting of includes and whitespace in some files
|
#
7667:aa8fd8f6a495 |
|
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: coherence protocol enhancements & bug fixes Allow lower-level caches (e.g., L2 or L3) to pass exclusive copies to higher levels (e.g., L1). This eliminates a lot of unnecessary upgrade transactions on read-write sequences to non-shared data.
Also some cleanup of MSHR coherence handling and multiple bug fixes.
|
#
7461:5a07045d0af2 |
|
15-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
stats: only consider a formula initialized if there is a formula
|
#
6978:ab05e20dc4a7 |
|
23-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: Make caches sharing aware and add occupancy stats. On the config end, if a shared L2 is created for the system, it is parameterized to have n sharers as defined by option.num_cpus. In addition to making the cache sharing aware so that discriminating tag policies can make use of context_ids to make decisions, I added an occupancy AverageStat and an occ % stat to each cache so that you could know which contexts are occupying how much cache on average, both in terms of blocks and percentage. Note that since devices have context_id -1, having an array of occ stats that correspond to each context_id will break here, so in FS mode I add an extra bucket for device blocks. This bucket is explicitly not added in SE mode in order to not only avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas break when a bucket is 0).
|
#
6666:3199397fd905 |
|
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Minor cleanup: Use the blockAlign() method where it applies in the cache.
|
#
6227:a17798f2a52c |
|
05-Jun-2009 |
Nathan Binkert <nate@binkert.org> |
types: clean up types, especially signed vs unsigned
|
#
6215:9aed64c9f10f |
|
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
includes: use base/types.hh not inttypes.h or stdint.h
|
#
6122:9af6fb59752f |
|
16-Jul-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
mem: use single BadAddr responder per system. Previously there was one per bus, which caused some coherence problems when more than one decided to respond. Now there is just one on the main memory bus. The default bus responder on all other buses is now the downstream cache's cpu_side port. Caches no longer need to do address range filtering; instead, we just have a simple flag to prevent snoops from propagating to the I/O bus.
|
#
5999:3cf8e71257e0 |
|
05-Mar-2009 |
Nathan Binkert <nate@binkert.org> |
stats: Fix all stats usages to deal with template fixes
|
#
5875:d82be3235ab4 |
|
16-Feb-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fixes to get prefetching working again. Apparently we broke it with the cache rewrite and never noticed. Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part of these changes (and for inspiring me to work on the rest). Some other overdue cleanup on the prefetch code too.
|
#
5715:e8c1d4e669a7 |
|
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
get rid of all instances of readTid() and getThreadNum(). Unify and eliminate redundancies with threadId() as their replacement.
|
#
5338:e75d02a09806 |
|
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix #include lines for renamed cache files.
|
#
5337:f81512eb8bdf |
|
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Rename cache files for brevity and consistency with rest of tree.
|