History log of /gem5/src/mem/cache/base.hh
Revision Date Author Comments
# 14193:7dd8a6df30e2 17-Aug-2019 Gabe Black <gabeblack@google.com>

mem: Eliminate the Base(Slave|Master)Port classes.

The Port class has assumed all the duties of the less generic
Base*Port classes, making them unnecessary. Since they don't add
anything but make the code more complex, this change eliminates them.

Change-Id: Ibb9c56def04465f353362595c1f1c5ac5083e5e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20236
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 14035:60068a2d56e0 31-May-2019 Daniel Carvalho <odanrc@yahoo.com.br>

Revert "mem-cache: Remove writebacks packet list"

This reverts commit bf0a722acdd8247602e83720a5f81a0b69c76250.

Reason for revert: This patch introduces a bug:

The problem here is that the insertion of block A may cause the
eviction of block B, which on the lower level may cause the
eviction of block A. Since A is not marked as present yet, A is
"safely" removed from the snoop filter

However, by reverting it, using atomic and a Tags sub-class that
can generate multiple evictions at once becomes broken when using
Atomic mode and shall be fixed in a future patch.

Change-Id: I5b27e54b54ae5b50255588835c1a2ebf3015f002
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19088
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13981:577196ddd040 02-May-2019 Gabe Black <gabeblack@google.com>

arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code.

This code will be preserved through version control, but otherwise
creates clutter and will rot in place since it's never compiled.

Change-Id: Id265f6deac445116843956ea5cf1210d8127274e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13948:f8666d4d5855 18-Apr-2019 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Remove writebacks packet list

Previously all atomic writebacks concerned a single block,
therefore, when a block was evicted, no other block would be
pending eviction. With sector tags (and compression),
however, a single replacement can generate many evictions.

This can cause problems, since a writeback that evicts a block
may evict blocks in the lower cache. If one of these conflict
with one of the blocks pending eviction in the higher level, the
snoop must inform it to the lower level. Since atomic mode does
not have a writebuffer, this kind of conflict wouldn't be noticed.

Therefore, instead of evicting multiple blocks at once, we
do it one by one.

Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13947:4cf8087cab09 08-Aug-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Handle data expansion

When a block in compressed form is overwriten, it may change
its size. If the new compressed size is bigger, and the total
size becomes bigger than the block size, one or more blocks
will have to be evicted. This is called data expansion, or
fat writes.

This change assumes that a first level cache cannot have a
compressor, since otherwise data expansion should have been
handled for atomic operations and writes. As such, data
expansions should only be seen on writebacks. As writebacks
are forwarded to the next level when failed, there should
be no data expansions when servicing misses either.

This patch adds the functionality to handle data expansions
by evicting the co-allocated blocks to make room for an
expanded block.

Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13945:a573bed35a8b 19-Jun-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Add compression and decompression calls

Add a compressor to the base cache class and compress within
block allocation and decompress on writebacks.

This change does not implement data expansion (fat writes) yet,
nor it adds the compression latency to the block write time.

Change-Id: Ie36db65f7487c9b05ec4aedebc2c7651b4cb4821
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11410
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13892:0182a0601f66 22-Apr-2019 Gabe Black <gabeblack@google.com>

mem: Minimize the use of MemObject.

MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.

Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13860:8f8df5b68439 11-Feb-2019 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem: Add packet matching functions

Add both block and non-block-aligned packet matching functions,
so that both address and secure bits are checked when checking
whether a packet matches a request.

Change-Id: Id0069befb925d112e06f250741cb47d9dfa249cc
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17533
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13784:1941dc118243 07-Mar-2019 Gabe Black <gabeblack@google.com>

arch, cpu, dev, gpu, mem, sim, python: start using getPort.

Replace the getMasterPort, getSlavePort, and getEthPort functions
with getPort, and remove extraneous mechanisms that are no longer
necessary.

Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13749:b2486662285d 04-Dec-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Allow tag-only accesses on latency calculation

Some accesses only need to search for a tag in the tag array, with
no need to touch the data array. This is the case for CleanEvicts,
evicts that don't find a corresponding block entry (since a write
cannot be done in parallel with tag lookup), and maintenance
operations.

Change-Id: I7365a915500b5d7ab636d49a9acc627072a7f58e
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14878
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13746:723109f11d56 04-Dec-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Use header delay on latency calculation

Previously the bus delay was being ignored for the access latency
calculation, and then applied on top of the access latency. This
patch fixes the order, as first the packet must arrive before the
access starts.

Change-Id: I6d55299a911d54625c147814dd423bfc63ef1b65
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14876
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13717:11e81e2a98bd 03-Dec-2018 Ivan Pizarro <ivan.pizarro@metempsy.com>

mem-cache: A Best-Offset Prefetcher

Michaud, P. (2015, June). A best-offset prefetcher.
In 2nd Data Prefetching Championship.

Change-Id: I61bb89ca5639356d54aeb04e856d5bf6e8805c22
Reviewed-on: https://gem5-review.googlesource.com/c/14820
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13624:3d8220c2d41d 13-Dec-2018 Javier Bueno <javier.bueno@metempsy.com>

mem-cache: Updated version of the Signature Path Prefetcher

This implementation is based in the description available in:
Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy,
Chris Wilkerson, and Zeshan Chishti. 2016.
Path confidence based lookahead prefetching.
In The 49th Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.

Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9
Reviewed-on: https://gem5-review.googlesource.com/c/14819
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13478:59414c401cd9 05-Dec-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Remove writebacks parameter from serviceMSHRTargets

Change 8ba77ae8fc98a355082da2bd9fdc6ecf4928f725 introduced the
writebacks parameter, but it was never used.

Change-Id: I225e5b399de42d77c72fc0012d3dc93ef39b8853
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/14896
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13418:08101e89101e 18-Oct-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Move access latency calculation to Cache

Access latency was not being calculated properly, as it was
always assuming that for hits reads take as long as writes,
and that parallel accesses would produce the same latency
for read and write misses.

By moving the calculation to the Cache we can use the write/
read information, reduce latency variables duplication and
remove Cache dependency from Tags.

The tag lookup latency is still calculated by the Tags.

Change-Id: I71bc68fb5c3515b372c3bf002d61b6f048a45540
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13697
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13416:d90887d0c889 09-Nov-2018 Javier Bueno <javier.bueno@metempsy.com>

mem-cache: implement a probe-based interface

The HW Prefetcher of a cache can now listen events
from their associated CPUs and from its own cache.

Change-Id: I28aecd8faf8ed44be94464d84485bd1cea2efae3
Reviewed-on: https://gem5-review.googlesource.com/c/14155
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13358:5e1605b47a21 19-Oct-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Move evictBlock(CacheBlk*, PacketList&) to base

Move evictBlock(CacheBlk*, PacketList&) to base cache,
as it is both sub-classes implementations are equal.

Change-Id: I80fbd16813bfcc4938fb01ed76abe29b3f8b3018
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13656
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13352:75647326f19b 10-Oct-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem: Add write coalescing and write-no-allocate to the caches

Enable the cache to detect contiguous writes and hold on to the MSHR
long enough to allow the entire line to be written. If the whole line
is written, the MSHR will be sent out as an invalidation requests, as
it is part of a whole-line write, i.e. no-fetch-on-write.

The cache is also able to switch to a write-no-allocate policy on the
actual completion of the writes, and instead use the tempBlock and
turn the write operation into a writeback.

These policies are all well-known, and described in works such as
Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993.

Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae
Reviewed-on: https://gem5-review.googlesource.com/c/12907
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13350:247e4108a5e8 10-Oct-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem: Restructure whole-line writes to simplify write merging

This patch changes how we deal with whole-line writes their
responses. With these changes, we use the MSHR tracking to determine
if a whole-line is written, and on a fill we simply handle the
invalidation response, with the actual writes taking place as part of
satisfying the CPU-side hit.

Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b
Reviewed-on: https://gem5-review.googlesource.com/c/12905
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13223:081299f403fe 11-Oct-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Rename blk.cc/hh to cache_blk.cc/hh

Rename the files blk.cc and blk.hh to cache_blk.cc and cache_blk.hh
to comply with the usual file-class naming rules.

Change-Id: I8af45df3e4b8dd934fd9929ec914fb230cb2cb09
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/13416
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13017:a620da03ab10 01-Sep-2018 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem-cache: Fix bug in handleAtomicReqMiss

"4976ff5 mem-cache: Refactor the recvAtomic function" introduced a bug
where if an atomic request that fills in using the tempBlock it will
not evict it when it finishes handling the request as it should. This
triggers an assertion. This change fixes this bug.

Change-Id: I73c808a7e15237eddb36b5448ef6728f7bcf7fd9
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/12644
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12754:15c1d281ce1a 06-Jun-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Insert on block allocation

When a block is being replaced in an allocation, if successfull,
the block will be inserted. Therefore we move the insertion
functionality to allocateBlock().

allocateBlock's signature has been modified to allow this
modification.

Change-Id: I60d17a83ff4f3021fdc976378868ccde6c7507bc
Reviewed-on: https://gem5-review.googlesource.com/10812
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12730:6c2ea88bf129 16-Apr-2018 Daniel R. Carvalho <odanrc@yahoo.com.br>

mem-cache: Create an address aware TempCacheBlk

tempBlock has its member variables manually set in order to allow
it to be used in the block address regeneration function. This is
not necessary, and ti can be simply given the address, so it does
not need to be aware of set and tag. This will simplify
implementation of sector and skewed caches.

Change-Id: Iaffb10c323509722cd5589fe1030b818d43336d6
Reviewed-on: https://gem5-review.googlesource.com/9961
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12728:57bdea4f96aa 30-May-2018 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem-cache: Replace block visitor with std::function

This change modifies forEachBlk tags function to accept std::function
as parameter. It also adds an anyBlk tags function that given a
condition, it iterates through the blocks and returns whether the
condition is met.

Finally, it uses forEachBlk to implement the print, computeStats and
cleanupRefs functions that also work for the FALRU class.

Change-Id: I2f75f4baa1fdd5a1d343a63ecace3eb9458fbf03
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10621
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12724:4f6fac3191d2 02-Feb-2018 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem-cache: Adopt a more sensible cache class hierarchy

This patch changes what goes into the BaseCache and what goes into the
Cache, to make it easier to add a NoncoherentCache with as much re-use
as possible. A number of redundant members and definitions are also
removed in the process.

This is a modified version of a changeset put together by Andreas
Hansson <andreas.hansson@arm.com>

Change-Id: Ie9dd73c4ec07732e778e7416b712dad8b4bd5d4b
Reviewed-on: https://gem5-review.googlesource.com/10431
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12702:27cb33a96e0f 10-May-2018 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem-cache: Move replacements stat to the base cache class

Change-Id: I25dbcfcddfe1c422a76eb1af3f726c1360d8d110
Reviewed-on: https://gem5-review.googlesource.com/10426
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12334:e0ab29a34764 30-Nov-2017 Gabe Black <gabeblack@google.com>

misc: Rename misc.(hh|cc) to logging.(hh|cc)

These files aren't a collection of miscellaneous stuff, they're the
definition of the Logger interface, and a few utility macros for
calling into that interface (panic, warn, etc.).

Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1
Reviewed-on: https://gem5-review.googlesource.com/6226
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12091:f2d1af96ad2d 13-Jun-2017 Andreas Sandberg <andreas.sandberg@arm.com>

mem-cache: Add missing overrides to BaseCache

Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3880
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12084:5a3769ff3d55 07-Jun-2017 Sean Wilson <spwilson2@wisc.edu>

mem: Replace EventWrapper use with EventFunctionWrapper

NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s
event names to not properly match the rank they were generated by. This
could occur if the public rank member is modified after the Rank's
construction. A patch would mean refactoring Rank and `DRAMCtrl`b to
privatize many of the members of Rank behind getters.

Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3745
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 11892:c7ea349e1cd3 26-Oct-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem: Use pkt::getBlockAddr instead of BaseCace::blockAlign

Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>


# 11744:5d33c6972dda 05-Dec-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem: Make packet debug printing more uniform

Previously DPRINTFs printing information about a packet would use ad hoc
formats. This patch changes all DPRINTFs to use the print function
defined by the packet class, making the packet printing format more
uniform and easier to change.

Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>


# 11722:f15f02d8c79e 30-Nov-2016 Sophiane Senni <sophiane.senni@gmail.com>

mem: Split the hit_latency into tag_latency and data_latency

If the cache access mode is parallel, i.e. "sequential_access" parameter
is set to "False", tags and data are accessed in parallel. Therefore,
the hit_latency is the maximum latency between tag_latency and
data_latency. On the other hand, if the cache access mode is
sequential, i.e. "sequential_access" parameter is set to "True",
tags and data are accessed sequentially. Therefore, the hit_latency
is the sum of tag_latency plus data_latency.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>


# 11483:d4c2e56d18b2 26-May-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

mem: fix the line length in the cache related classes

Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11454:e55afadc4e19 21-Apr-2016 Andreas Hansson <andreas.hansson@arm.com>

mem: Remove unused cache stats

Prune cache stats that are never actually used.


# 11436:f351b7f248db 27-May-2015 Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com>

mem: Add unused prefetch counter in caches

Added stat to the cache to account for HardPF'ed blocks that are evicted
before being referenced (over-prefetching).


# 11375:f98df9231cdd 17-Mar-2016 Andreas Hansson <andreas.hansson@arm.com>

mem: Create a separate class for the cache write buffer

This patch breaks out the cache write buffer into a separate class,
without affecting any stats. The goal of the patch is to avoid
encumbering the much-simpler write queue with the complex MSHR
handling. In a follow on patch this simplification allows us to
implement write combining.

The WriteQueue gets its own class, but shares a common ancestor, the
generic Queue, with the MSHRQueue.


# 11331:cd5c48db28e6 10-Feb-2016 Andreas Hansson <andreas.hansson@arm.com>

mem: Deduce if cache should forward snoops

This patch changes how the cache determines if snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, then snoops are forwarded. Less error
prone, and less parameters to worry about.

The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implement


# 11284:b3926db25371 31-Dec-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Make cache terminology easier to understand

This patch changes the name of a bunch of packet flags and MSHR member
functions and variables to make the coherency protocol easier to
understand. In addition the patch adds and updates lots of
descriptions, explicitly spelling out assumptions.

The following name changes are made:

* the packet memInhibit flag is renamed to cacheResponding

* the packet sharedAsserted flag is renamed to hasSharers

* the packet NeedsExclusive attribute is renamed to NeedsWritable

* the packet isSupplyExclusive is renamed responderHadWritable

* the MSHR pendingDirty is renamed to pendingModified

The cache states, Modified, Owned, Exclusive, Shared are also called
out in the cache and MSHR code to make it easier to understand.


# 11199:929fd978ab4e 06-Nov-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Add an option to perform clean writebacks from caches

This patch adds the necessary commands and cache functionality to
allow clean writebacks. This functionality is crucial, especially when
having exclusive (victim) caches. For example, if read-only L1
instruction caches are not sending clean writebacks, there will never
be any spills from the L1 to the L2. At the moment the cache model
defaults to not sending clean writebacks, and this should possibly be
re-evaluated.

The implementation of clean writebacks relies on a new packet command
WritebackClean, which acts much like a Writeback (renamed
WritebackDirty), and also much like a CleanEvict. On eviction of a
clean block the cache either sends a clean evict, or a clean
writeback, and if any copies are still cached upstream the clean
evict/writeback is dropped. Similarly, if a clean evict/writeback
reaches a cache where there are outstanding MSHRs for the block, the
packet is dropped. In the typical case though, the clean writeback
allocates a block in the downstream cache, and marks it writable if
the evicted block was writable.

The patch changes the O3_ARM_v7a L1 cache configuration and the
default L1 caches in config/common/Caches.py


# 11197:f8fdd931e674 06-Nov-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Add cache clusivity

This patch adds a parameter to control the cache clusivity, that is if
the cache is mostly inclusive or exclusive. At the moment there is no
intention to support strict policies, and thus the options are: 1)
mostly inclusive, or 2) mostly exclusive.

The choice of policy guides the behaviuor on a cache fill, and a new
helper function, allocOnFill, is created to encapsulate the decision
making process. For the timing mode, the decision is annotated on the
MSHR on sending out the downstream packet, and in atomic we directly
pass the decision to handleFill. We (ab)use the tempBlock in cases
where we are not allocating on fill, leaving the rest of the cache
unaffected. Simple and effective.

This patch also makes it more explicit that multiple caches are
allowed to consider a block writable (this is the case
also before this patch). That is, for a mostly inclusive cache,
multiple caches upstream may also consider the block exclusive. The
caches considering the block writable/exclusive all appear along the
same path to memory, and from a coherency protocol point of view it
works due to the fact that we always snoop upwards in zero time before
querying any downstream cache.

Note that this patch does not introduce clean writebacks. Thus, for
clean lines we are essentially removing a cache level if it is made
mostly exclusive. For example, lines from the read-only L1 instruction
cache or table-walker cache are always clean, and simply get dropped
rather than being passed to the L2. If the L2 is mostly exclusive and
does not allocate on fill it will thus never hold the line. A follow
on patch adds the clean writebacks.

The patch changes the L2 of the O3_ARM_v7a CPU configuration to be
mostly exclusive (and stats are affected accordingly).


# 11191:9eabb2bf349b 06-Nov-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Do not treat CleanEvict as a write operation

This patch changes the CleanEvict command type to not be considered a
write. Initially it was made a zero-sized write to match the writeback
command, but as things developed it became clear that it causes more
problems than it solves. For example, the memory modules (and bridge)
should not consider the CleanEvict as a write, but instead discard
it. With this patch it will be neither a read, nor write, and as it
does not need a response the slave will simply sink it.


# 11053:62544e45c0f4 21-Aug-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Add explicit Cache subclass and make BaseCache abstract

Open up for other subclasses to BaseCache and transition to using the
explicit Cache subclass.


# 11051:81b1f46061c8 21-Aug-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Move cache_impl.hh to cache.cc

There is no longer any need to keep the implementation in a header.


# 10942:224c85495f96 30-Jul-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Remove unused RequestCause in cache

This patch removes the RequestCause, and also simplifies how we
schedule the sending of packets through the memory-side port. The
deassertion of bus requests is removed as it is not used.


# 10912:b99a6662d7c2 07-Jul-2015 Andreas Sandberg <andreas.sandberg@arm.com>

sim: Decouple draining from the SimObject hierarchy

Draining is currently done by traversing the SimObject graph and
calling drain()/drainResume() on the SimObjects. This is not ideal
when non-SimObjects (e.g., ports) need draining since this means that
SimObjects owning those objects need to be aware of this.

This changeset moves the responsibility for finding objects that need
draining from SimObjects and the Python-side of the simulator to the
DrainManager. The DrainManager now maintains a set of all objects that
need draining. To reduce the overhead in classes owning non-SimObjects
that need draining, objects inheriting from Drainable now
automatically register with the DrainManager. If such an object is
destroyed, it is automatically unregistered. This means that drain()
and drainResume() should never be called directly on a Drainable
object.

While implementing the new functionality, the DrainManager has now
been made thread safe. In practice, this means that it takes a lock
whenever it manipulates the set of Drainable objects since SimObjects
in different threads may create Drainable objects
dynamically. Similarly, the drain counter is now an atomic_uint, which
ensures that it is manipulated correctly when objects signal that they
are done draining.

A nice side effect of these changes is that it makes the drain state
changes stricter, which the simulation scripts can exploit to avoid
redundant drains.


# 10887:279efb97ec99 03-Jul-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Remove redundant is_top_level cache parameter

This patch takes the final step in removing the is_top_level parameter
from the cache. With the recent changes to read requests and write
invalidations, the parameter is no longer needed, and consequently
removed.

This also means that asymmetric cache hierarchies are now fully
supported (and we are actually using them already with L1 caches, but
no table-walker caches, connected to a shared L2).


# 10884:c60acdbdd6ad 03-Jul-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Allow read-only caches and check compliance

This patch adds a parameter to the BaseCache to enable a read-only
cache, for example for the instruction cache, or table-walker cache
(not for x86). A number of checks are put in place in the code to
ensure a read-only cache does not end up with dirty data.

A follow-on patch adds suitable read requests to allow a read-only
cache to explicitly ask for clean data.


# 10821:581fb2484bd6 05-May-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Snoop into caches on uncacheable accesses

This patch takes a last step in fixing issues related to uncacheable
accesses. We do not separate uncacheable memory from uncacheable
devices, and in cases where it is really memory, there are valid
scenarios where we need to snoop since we do not support cache
maintenance instructions (yet). On snooping an uncacheable access we
thus provide data if possible. In essence this makes uncacheable
accesses IO coherent.

The snoop filter is also queried to steer the snoops, but not updated
since the uncacheable accesses do not allocate a block.


# 10767:993c2baa485a 27-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Remove redundant allocateUncachedReadBuffer in cache

This patch removes the no-longer-needed
allocateUncachedReadBuffer. Besides the checks it is exactly the same
as allocateMissBuffer and thus provides no value.


# 10764:b32578b2af99 27-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Align all MSHR entries to block boundaries

This patch aligns all MSHR queue entries to block boundaries to
simplify checks for matches. Previously there were corner cases that
could lead to existing entries not being identified as matches.

There are, rather alarmingly, a few regressions that change with this
patch.


# 10714:9ba5e70964a4 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Tidy up the cache debug messages

Avoid redundant inclusion of the name in the DPRINTF string.


# 10713:eddb533708cb 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Split port retry for all different packet classes

This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.


# 10693:c0979b2ebda5 11-Feb-2015 Marco Balboni <Marco.Balboni@ARM.com>

mem: Clarify usage of latency in the cache

This patch adds some much-needed clarity in the specification of the
cache timing. For now, hit_latency and response_latency are kept as
top-level parameters, but the cache itself has a number of local
variables to better map the individual timing variables to different
behaviours (and sub-components).

The introduced variables are:
- lookupLatency: latency of tag lookup, occuring on any access
- forwardLatency: latency that occurs in case of outbound miss
- fillLatency: latency to fill a cache block
We keep the existing responseLatency

The forwardLatency is used by allocateInternalBuffer() for:
- MSHR allocateWriteBuffer (unchached write forwarded to WriteBuffer);
- MSHR allocateMissBuffer (cacheable miss in MSHR queue);
- MSHR allocateUncachedReadBuffer (unchached read allocated in MSHR
queue)
It is our assumption that the time for the above three buffers is the
same. Similarly, for snoop responses passing through the cache we use
forwardLatency.


# 10679:204a0f53035e 03-Feb-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Clarify cache behaviour for pending dirty responses

This patch adds a bit of clarification around the assumptions made in
the cache when packets are sent out, and dirty responses are
pending. As part of the change, the marking of an MSHR as in service
is simplified slightly, and comments are added to explain what
assumptions are made.


# 10582:c04dc66e4316 02-Dec-2014 Curtis Dunham <Curtis.Dunham@arm.com>

mem: Remove WriteInvalidate support

Prepare for a different implementation following in the next patch


# 10345:b5bef3c8e070 27-Jun-2014 Curtis Dunham <Curtis.Dunham@arm.com>

mem: write streaming support via WriteInvalidate promotion

Support full-block writes directly rather than requiring RMW:
* a cache line is allocated in the cache upon receipt of a
WriteInvalidateReq, not the WriteInvalidateResp.
* only top-level caches allocate the line; the others just pass
the request along and invalidate as necessary.
* to close a timing window between the *Req and the *Resp, a new
metadata bit tracks whether another cache has read a copy of
the new line before the writeback to memory.


# 10344:fa9ef374075f 03-Sep-2014 Andreas Hansson <andreas.hansson@arm.com>

mem: Fix a bug in the cache port flow control

This patch fixes a bug in the cache port where the retry flag was
reset too early, allowing new requests to arrive before the retry was
actually sent, but with the event already scheduled. This caused a
deadlock in the interactions with the O3 LSQ.

The patche fixes the underlying issue by shifting the resetting of the
flag to be done by the event that also calls sendRetry(). The patch
also tidies up the flow control in recvTimingReq and ensures that we
also check if we already have a retry outstanding.


# 10028:fb8c44de891a 24-Jan-2014 Giacomo Gabrielli <Giacomo.Gabrielli@arm.com>

mem: Add support for a security bit in the memory system

This patch adds the basic building blocks required to support e.g. ARM
TrustZone by discerning secure and non-secure memory accesses.


# 10020:2f33cb012383 24-Jan-2014 Matt Horsnell <matt.horsnell@ARM.com>

mem: track per-request latencies and access depths in the cache hierarchy

Add some values and methods to the request object to track the translation
and access latency for a request and which level of the cache hierarchy responded
to the request.


# 9529:28d6d9663a7e 15-Feb-2013 Andreas Hansson <andreas.hansson@arm.com>

mem: Tighten up cache constness and scoping

This patch merely adopts a more strict use of const for the cache
member functions and variables, and also moves a large portion of the
member functions from public to protected.


# 9486:569e1f1d762d 28-Jan-2013 Anthony Gutierrez <atgutier@umich.edu>

cache: remove drainManager because it's not used

the cache drainManager is set but never cleared, this is because
the cache itself does not need to be drained and thus never
triggers a signalDrainDone(). because the drainManager variable
is not used properly and does not appear to be necessary it has
been removed with this patch.


# 9347:b02075171b57 02-Nov-2012 Andreas Sandberg <Andreas.Sandberg@arm.com>

mem: Add support for writing back and flushing caches

This patch adds support for the following optional drain methods in
the classical memory system's cache model:

memWriteback() - Write back all dirty cache lines to memory using
functional accesses.

memInvalidate() - Invalidate all cache lines. Dirty cache lines
are lost unless a writeback is requested.

Since memWriteback() is called when checkpointing systems, this patch
adds support for checkpointing systems with caches. The serialization
code now checks whether there are any dirty lines in the cache. If
there are dirty lines in the cache, the checkpoint is flagged as bad
and a warning is printed.


# 9342:6fec8f26e56d 02-Nov-2012 Andreas Sandberg <Andreas.Sandberg@arm.com>

sim: Move the draining interface into a separate base class

This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.


# 9294:8fb03b13de02 15-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Add protocol-agnostic ports in the port hierarchy

This patch adds an additional level of ports in the inheritance
hierarchy, separating out the protocol-specific and protocl-agnostic
parts. All the functionality related to the binding of ports is now
confined to use BaseMaster/BaseSlavePorts, and all the
protocol-specific parts stay in the Master/SlavePort. In the future it
will be possible to add other protocol-specific implementations.

The functions used in the binding of ports, i.e. getMaster/SlavePort
now use the base classes, and the index parameter is updated to use
the PortID typedef with the symbolic InvalidPortID as the default.


# 9288:3d6da8559605 15-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

Mem: Use cycles to express cache-related latencies

This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.

The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.

As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.


# 9263:066099902102 25-Sep-2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com>

Cache: add a response latency to the caches

In the current caches the hit latency is paid twice on a miss. This patch lets
a configurable response latency be set of the cache for the backward path.


# 9163:3b5e13ac1940 22-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Extend the QueuedPort interface and use where appropriate

This patch extends the queued port interfaces with methods for
scheduling the transmission of a timing request/response. The methods
are named similar to the corresponding sendTiming(Snoop)Req/Resp,
replacing the "send" with "sched". As the queues are currently
unbounded, the methods always succeed and hence do not return a value.

This functionality was previously provided in the subclasses by
calling PacketQueue::schedSendTiming with the appropriate
parameters. With this change, there is no need to introduce these
extra methods in the subclasses, and the use of the queued interface
is more uniform and explicit.


# 9087:b5a084a6159b 09-Jul-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Move retry from port base class to Master/SlavePort

This patch is the last part of moving all protocol-related
functionality out of the Port base class. All the send/recv functions
are already moved, and the retry (which still governs all the timing
transport functions) is the only part that remained in the base class.

The only point where this currently causes a bit of inconvenience is
in the bus where the retry list is global and holds Port pointers (not
Master/SlavePort). This is about to change with the split into a
request/response bus and will soon be removed anyway.

The patch has no impact on any regressions.


# 8975:7f36d4436074 01-May-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.


# 8948:e95ee70f876c 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate snoops and normal memory requests/responses

This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.


# 8922:17f037ad8918 30-Mar-2012 William Wang <william.wang@arm.com>

MEM: Introduce the master/slave port sub-classes in C++

This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.


# 8914:8c3bd7bea667 22-Mar-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Split SimpleTimingPort into PacketQueue and ports

This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).

As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.

The PioPort and MessagePort are cleaned up as part of the changes.


# 8883:c92153af04ac 09-Mar-2012 Ali Saidi <Ali.Saidi@ARM.com>

cache: Allow main memory to be at disjoint address ranges.


# 8856:241ee47b0dc6 24-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Simplify cache ports preparing for master/slave split

This patch splits the two cache ports into a master (memory-side) and
slave (cpu-side) subclass of port with slightly different
functionality. For example, it is only the CPU-side port that blocks
incoming requests, and only the memory-side port that schedules send
events outside of what the transmit list dictates.

This patch simplifies the two classes by relying further on
SimpleTimingPort and also generalises the latter to better accommodate
the changes (introducing trySendTiming and scheduleSend). The
memory-side cache port overrides sendDeferredPacket to be able to not
only send responses from the transmit list, but also send requests
based on the MSHRs.

A follow on patch further simplifies the SimpleTimingPort and the
cache ports.


# 8833:2870638642bd 12-Feb-2012 Dam Sunwoo <dam.sunwoo@arm.com>

mem: fix cache stats to use request ids correctly

This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.


# 8809:bb10807da889 01-Feb-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with head, hopefully the last time for this batch.


# 8799:dac1e33e07b0 28-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with the main repo.


# 8794:e2ac2b7164dd 18-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Get rid of includes of config/full_system.hh.


# 8786:8be24baf68b8 07-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Get rid of FULL_SYSTEM in mem.


# 8737:770ccf3af571 31-Jan-2012 Koan-Sin Tan <koansin.tan@gmail.com>

clang: Enable compiling gem5 using clang 2.9 and 3.0

This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).

clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.


# 8736:2d8a57343fe3 31-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Remove the otherPort from the cache ports

This patch is a very straight-forward simplification, removing the
unecessary otherPort pointer from the cache port. The pointer was only
used to forward range changes, and the address range is fixed for the
cache. Removing the pointer simplifies the transition to master/slave
ports.


# 8711:c7e14f52c682 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate queries for snooping and address ranges

This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.


# 8232:b28d06a175be 15-Apr-2011 Nathan Binkert <nate@binkert.org>

trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help


# 8229:78bf55f23338 15-Apr-2011 Nathan Binkert <nate@binkert.org>

includes: sort all includes


# 8134:b01a51ff05fa 17-Mar-2011 Ali Saidi <Ali.Saidi@ARM.com>

Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.

This change fixes the problem for all the cases we actively use. If you want to try
more creative I/O device attachments (E.g. sharing an L2), this won't work. You
would need another level of caching between the I/O device and the cache
(which you actually need anyway with our current code to make sure writes
propagate). This is required so that you can mark the cache in between as
top level and it won't try to send ownership of a block to the I/O device.
Asserts have been added that should catch any issues.


# 7823:dac01f14f20f 08-Jan-2011 Steve Reinhardt <steve.reinhardt@amd.com>

Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.


# 7676:92274350b953 10-Sep-2010 Nathan Binkert <nate@binkert.org>

style: fix sorting of includes and whitespace in some files


# 7667:aa8fd8f6a495 09-Sep-2010 Steve Reinhardt <steve.reinhardt@amd.com>

cache: coherence protocol enhancements & bug fixes
Allow lower-level caches (e.g., L2 or L3) to pass exclusive
copies to higher levels (e.g., L1). This eliminates a lot
of unnecessary upgrade transactions on read-write sequences
to non-shared data.

Also some cleanup of MSHR coherence handling and multiple
bug fixes.


# 7461:5a07045d0af2 15-Jun-2010 Nathan Binkert <nate@binkert.org>

stats: only consider a formula initialized if there is a formula


# 6978:ab05e20dc4a7 23-Feb-2010 Lisa Hsu <Lisa.Hsu@amd.com>

cache: Make caches sharing aware and add occupancy stats.
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode in order to not only
avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas
break when a bucket is 0).


# 6666:3199397fd905 26-Sep-2009 Steve Reinhardt <steve.reinhardt@amd.com>

Minor cleanup: Use the blockAlign() method where it applies in the cache.


# 6227:a17798f2a52c 05-Jun-2009 Nathan Binkert <nate@binkert.org>

types: clean up types, especially signed vs unsigned


# 6215:9aed64c9f10f 17-May-2009 Nathan Binkert <nate@binkert.org>

includes: use base/types.hh not inttypes.h or stdint.h


# 6122:9af6fb59752f 16-Jul-2008 Steve Reinhardt <Steve.Reinhardt@amd.com>

mem: use single BadAddr responder per system.
Previously there was one per bus, which caused some coherence problems
when more than one decided to respond. Now there is just one on
the main memory bus. The default bus responder on all other buses
is now the downstream cache's cpu_side port. Caches no longer need
to do address range filtering; instead, we just have a simple flag
to prevent snoops from propagating to the I/O bus.


# 5999:3cf8e71257e0 05-Mar-2009 Nathan Binkert <nate@binkert.org>

stats: Fix all stats usages to deal with template fixes


# 5875:d82be3235ab4 16-Feb-2009 Steve Reinhardt <steve.reinhardt@amd.com>

Fixes to get prefetching working again.
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.


# 5715:e8c1d4e669a7 04-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

get rid of all instances of readTid() and getThreadNum(). Unify and eliminate
redundancies with threadId() as their replacement.


# 5338:e75d02a09806 10-Feb-2008 Steve Reinhardt <stever@gmail.com>

Fix #include lines for renamed cache files.


# 5337:f81512eb8bdf 10-Feb-2008 Steve Reinhardt <stever@gmail.com>

Rename cache files for brevity and consistency with rest of tree.