#
14013:aeb3ca1762bb |
|
27-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Support for page crossing prefetches
Prefetchers can now issue hardware prefetch requests that go beyond the boundaries of the system page. Page crossing references will need to look up the TLBs to be able to compute the physical address to be prefetched.
Change-Id: Ib56374097e3b7dc87414139d210ea9272f96b06b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14620 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13751:614d6e02a5fb |
|
21-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added extra information to PrefetchInfo
Added additional information to the PrefetchInfo data structure - Whether the event is triggered by a cache miss - Whether the event is a write or a read - Size of the data accessed - Data accessed by the request
Change-Id: I070f3ffe837ea960a357388e7f2b8a61d7b2196c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16583 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13717:11e81e2a98bd |
|
03-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: A Best-Offset Prefetcher
Michaud, P. (2015, June). A best-offset prefetcher. In 2nd Data Prefetching Championship.
Change-Id: I61bb89ca5639356d54aeb04e856d5bf6e8805c22 Reviewed-on: https://gem5-review.googlesource.com/c/14820 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13624:3d8220c2d41d |
|
13-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Updated version of the Signature Path Prefetcher
This implementation is based in the description available in: Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, and Zeshan Chishti. 2016. Path confidence based lookahead prefetching. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9 Reviewed-on: https://gem5-review.googlesource.com/c/14819 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13551:f352df8e2863 |
|
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: virtual address support for prefetchers
Prefetchers can be configured to operate with virtual or physical addreses. The option can be configured through the "use_virtual_addresses" parameter of the Prefetcher object.
Change-Id: I4f8c3687988afecc8a91c3c5b2d44cc0580f72aa Reviewed-on: https://gem5-review.googlesource.com/c/14416 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13434:99807b35a66c |
|
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: a missing cast was truncating addresses
High bits were truncated when computing the block address
Change-Id: Iab2a4c6063ece2d1d4c24ce5686045a6d6d35434 Reviewed-on: https://gem5-review.googlesource.com/c/14415 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13422:4ec52da74cd5 |
|
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Cleanup prefetchers
Prefetcher code had extra variables, dependencies that could be removed, code duplication, and missing overrides.
Change-Id: I6e9fbf67a0bdab7eb591893039e088261f52d31a Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14355 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13416:d90887d0c889 |
|
09-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: implement a probe-based interface
The HW Prefetcher of a cache can now listen events from their associated CPUs and from its own cache.
Change-Id: I28aecd8faf8ed44be94464d84485bd1cea2efae3 Reviewed-on: https://gem5-review.googlesource.com/c/14155 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12727:56c23b54bcb1 |
|
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix include directives in the cache related classes
Change-Id: I111b0f662897c43974aadb08da1ed85c7542585c Reviewed-on: https://gem5-review.googlesource.com/10433 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12680:91f4d6668b4f |
|
04-Apr-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
sim,cpu,mem,arch: Introduced MasterInfo data structure
With this patch a gem5 System will store more info about its Masters. While it was previously keeping track of the Master name and Master ID only, it is now adding a per-Master pointer to the SimObject related to the Master. This will make it possible for a client to query a System for a Master using either the master's name or the master's pointer.
Change-Id: I8b97d328a65cd06f329e2cdd3679451c17d2b8f6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9781 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11793:ef606668d247 |
|
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes
|
#
11522:348411ec525a |
|
06-Jun-2016 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
sim: Call regStats of base-class as well
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there.
Patch originally provided by Akash Bagdia at ARM Ltd.
|
#
11438:3c9fd319a982 |
|
07-Apr-2016 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Handful extra features for BasePrefetcher
Some common functionality added to the base prefetcher, mainly dealing with extracting the block address, page address, block index inside the page and some other information that can be inferred from the block address. This is used for some prefetching algorithms, and having the methods in the base, as well as the block size and other information is the sensible way.
|
#
10883:9294c4a60251 |
|
03-Jul-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add clean evicts to improve snoop filter tracking
This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory heirarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache.
The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory heirarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages.
We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implmentation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message.
We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.
|
#
10626:7982e539d003 |
|
23-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Hide WriteInvalidate requests from prefetchers
Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.
|
#
10623:b9646f4546ad |
|
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Rework the structuring of the prefetchers
Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class.
Finally, the stride-based prefetcher now assumes a custimizable lookup table (sets/ways) rather than the previous fully associative structure.
|
#
10466:73b7549d979e |
|
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Dynamically determine page bytes in memory components
This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA.
|
#
10360:919c02740209 |
|
09-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix a number of unitialised variables and members
Static analysis unearther a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means less false positives next time round.
|
#
10318:98771a936b61 |
|
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Cleanup unused ISA traits constants
This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc.
The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.
|
#
10052:5bb8e054456b |
|
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com>, Amin Farmahini <aminfar@gmail.com> |
mem: prefetcher: add options, support for unaligned addresses
This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern. This patch fixes this, by training the prefetcher on access and not masking off the lower address bits.
It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data acceses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy.
Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).
|
#
10028:fb8c44de891a |
|
24-Jan-2014 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
mem: Add support for a security bit in the memory system
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
|
#
10024:fc10e1f9f124 |
|
24-Jan-2014 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: per-thread cache occupancy and per-block ages
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
|
#
9546:ac0c18d738ce |
|
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add deferred packet class to prefetcher
This patch removes the time field from the packet as it was only used by the preftecher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.
|
#
9545:508784fad4e5 |
|
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Make clock private and access using clockPeriod()
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.
|
#
9288:3d6da8559605 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles that actually counter in ticks is now updated to count in cycles.
As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
|
#
8991:69fad6658160 |
|
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: fix some iterator use and erase bugs
|
#
8949:3fa1ee293096 |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field.
Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class.
In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
|
#
8832:247fee427324 |
|
12-Feb-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Add a master ID to each request object.
This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.
|
#
8831:6c08a877af8f |
|
12-Feb-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
prefetcher: Make prefetcher a sim object instead of it being a parameter on cache
|
#
8533:8dac0abb7a1b |
|
01-Sep-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Fix build for gcc-4.2 opt/fast
Even though the code is safe, compiler flags a warning here, which are treated as errors for fast/opt. I know it's redundant but it has no side effects and fixes the compile.
|
#
8509:afb40c3d4ba6 |
|
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Prefetcher: Fix some memory leaks with the prefetcher.
|
#
8232:b28d06a175be |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
|
#
8229:78bf55f23338 |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes
|
#
6665:874f2ee2f115 |
|
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Force prefetches to check cache and MSHRs immediately prior to issue. This prevents redundant prefetches from being issued, solving the occasional 'needsExclusive && !blk->isWritable()' assertion failure in cache_impl.hh that several people have run into. Eliminates "prefetch_cache_check_push" flag, neither setting of which really solved the problem.
|
#
6658:f4de76601762 |
|
23-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh
|
#
6105:a27c0934de24 |
|
20-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: rename INST_READ to INST_FETCH.
|
#
5875:d82be3235ab4 |
|
16-Feb-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fixes to get prefetching working again. Apparently we broke it with the cache rewrite and never noticed. Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part of these changes (and for inspiring me to work on the rest). Some other overdue cleanup on the prefetch code too.
|
#
5714:76abee886def |
|
02-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Add in Context IDs to the simulator. From now on, cpuId is almost never used, the primary identifier for a hardware context should be contextId(). The concept of threads within a CPU remains, in the form of threadId() because sometimes you need to know which context within a cpu to manipulate.
|
#
5338:e75d02a09806 |
|
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix #include lines for renamed cache files.
|
#
5337:f81512eb8bdf |
|
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Rename cache files for brevity and consistency with rest of tree.
|