14300:22183ae13998 |
19-Sep-2019 |
Jing Qu <jqu32@wisc.edu> |
mem-ruby: prevent cacheProbe being called multiple times
The cacheProbe() function returns the victim entry, and it is called multiple times in the trigger function during a single miss. This causes a problem when we try to add a new replacement policy to the Ruby system. Certain policies, like RRIP, modify the block information every time the getVictim() function is called. To prevent future problems, we need to store the victim entry, so that we only call it once per miss.
Change-Id: Ic5ca05f789d9bbfb963b8e993ef707020f243702 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21099 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
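As an illustration of the cacheProbe change above, here is a minimal Python sketch of why a victim lookup with side effects should run only once per miss. It is not the Ruby code: AgingPolicy, handle_miss_repeated and handle_miss_cached are hypothetical names, and the toy aging rule only stands in for the kind of per-call state update the commit attributes to policies like RRIP.

```python
class AgingPolicy:
    """Toy replacement policy (hypothetical) whose lookup mutates state."""

    def __init__(self, ages):
        self.ages = list(ages)

    def get_victim(self):
        # Pick the oldest entry, then age every entry as a side effect.
        victim = self.ages.index(max(self.ages))
        self.ages = [age + 1 for age in self.ages]
        return victim


def handle_miss_repeated(policy):
    # Problematic pattern: probe once to inspect, then probe again to use.
    policy.get_victim()
    return policy.get_victim()


def handle_miss_cached(policy):
    # Pattern the commit moves to: probe once and keep the result.
    victim = policy.get_victim()
    return victim
```

With the repeated pattern every entry is aged twice for a single miss, while the cached pattern ages each entry once.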
14297:b4519e586f5e |
10-Sep-2019 |
Jordi Vaquero <jordi.vaquero@metempsy.com> |
cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>
This change modifies the way we move the AtomicOpFunctor* through gem5 in order to maintain proper ownership of the object and ensure its destruction when it is no longer used.
In doing so, we also fix a memory leak in Request.hh, where we were assigning a new AtomicOpFunctor* without destroying the previous one.
This change creates a new type AtomicOpFunctor_ptr as a std::unique_ptr<AtomicOpFunctor> and moves its ownership as needed, except where it is merely used, namely when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14296:dafc66b0212f |
11-Sep-2019 |
Gabe Black <gabeblack@google.com> |
mem: Delete the now unused Message*Port classes.
This port type is no longer used.
Change-Id: If4abbb774819644bea58fd82e00dfdec8f79b5a6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20822 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14269:7e364bd625e1 |
01-Aug-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix BDI size calculation
The bitmask field indicates to which base a delta refers, and in the original paper it is fixed and proportional to the highest number of bases allowed in the compressed data.
Change-Id: I271bf2e19e0765de52b933eaf6d4fcc2ce25d185 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19748 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14262:991410960fdb |
11-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Move Bloom Filters to base
All Bloom Filters are completely independent of Ruby, and therefore can be used everywhere.
As a side effect, Ruby was not using the filters, so their dependency was removed.
Change-Id: Ic5f430610c33c0791fb81c79101ebe737189497e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18875 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14261:134c8be7c1e5 |
06-Sep-2019 |
Gabe Black <gabeblack@google.com> |
mem: Mark MemObject as deprecated.
Its constructor will now warn that it's deprecated and suggest using ClockedObject directly. This change also gets rid of the params() method and the Params typedef since they are functionally equivalent to the ClockedObject versions.
It also removes the include of mem/port.hh which is not used in mem_object.hh. This may break code which purposefully or (more likely) accidentally depended on that transitive include from mem_object.hh.
Change-Id: I6dab3ba626e3f3ab6a6bd86edcf4f5cb4d6d2c45 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20720 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14217:68c3d00f780a |
26-Aug-2019 |
Srikant Bharadwaj <srikant.bharadwaj@amd.com> |
ruby: Fix the way stall map size is checked for availability
To ensure that the enqueuer observes the practical availability, we check the message buffer queue size at the start of the cycle. We also add the size of the stall queue to account for the total queue size. However, messages can be moved from the regular queue to the stall map. This leads to messages being counted twice, causing false flow control. This patch fixes it by storing the stall map size at the beginning of the cycle and using that stored value when checking availability.
Change-Id: I6ea94f34fe5279b91f74e106d43263e55ec4bf06 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20389 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
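The double counting described in the stall map fix above can be sketched in a few lines of Python. This is only a model under stated assumptions: ToyBuffer and all of its methods are hypothetical names, not the Ruby MessageBuffer interface, and the capacity check is simplified to a single comparison.

```python
class ToyBuffer:
    """Hypothetical bounded buffer with a regular queue and a stall list."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.queue = []
        self.stalled = []
        self.queue_size_at_cycle_start = 0
        self.stall_size_at_cycle_start = 0

    def start_cycle(self):
        # Snapshot both sizes so enqueuers see a stable view during the cycle.
        self.queue_size_at_cycle_start = len(self.queue)
        self.stall_size_at_cycle_start = len(self.stalled)

    def stall_head(self):
        # Move the head message from the regular queue to the stall list.
        self.stalled.append(self.queue.pop(0))

    def slots_available_old(self, n):
        # Mixes a snapshot with a live size: a message stalled mid-cycle
        # is counted in both terms.
        occupied = self.queue_size_at_cycle_start + len(self.stalled)
        return occupied + n <= self.max_size

    def slots_available_new(self, n):
        # Uses the two snapshots taken at the same instant.
        occupied = self.queue_size_at_cycle_start + self.stall_size_at_cycle_start
        return occupied + n <= self.max_size
```

With a capacity of 3 and two queued messages, stalling one message mid-cycle makes the old check report 3 occupied slots, so a single enqueue is refused even though only two messages exist.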
14211:acfef4916339 |
29-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use SatCounter for RRPV
Use SatCounter in RRIP's RRPV. As such, move validation functionality to a proper variable.
Change-Id: I142db2b7f6cd518ac3a2b68c9ed48005402b3464 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20452 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
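For readers unfamiliar with saturating counters, the idea behind the RRPV change above can be sketched in Python as follows. The class and field names are assumptions made for this illustration and do not mirror gem5's C++ SatCounter interface; in particular, keeping validity as a plain boolean next to the counter is this sketch's reading of "move validation functionality to a proper variable".

```python
class SatCounter:
    """Hypothetical n-bit counter that clamps instead of wrapping."""

    def __init__(self, bits, initial=0):
        self.max_value = (1 << bits) - 1
        self.value = min(max(initial, 0), self.max_value)

    def increment(self):
        if self.value < self.max_value:
            self.value += 1

    def decrement(self):
        if self.value > 0:
            self.value -= 1

    def saturate(self):
        self.value = self.max_value


class RRIPEntry:
    """Toy replacement data: the RRPV counter plus a separate valid flag."""

    def __init__(self, bits=2):
        self.rrpv = SatCounter(bits)
        self.valid = False
```

Because the counter clamps at both ends, validity cannot be encoded as an out-of-range counter value, which is why the sketch tracks it separately.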
14208:1c8f93faf08f |
27-Jun-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Convert CommMonitor to the new stat framework
Change-Id: I851c29909f3e6923c0233505a4d0f2d266bc254f Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19371 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
14204:c1bc1320aa86 |
11-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Define BloomFilter namespace
Define a BloomFilter namespace and put all BloomFilter related code in it.
As a side effect the BloomFilter classes have been renamed to remove the "BloomFilter" suffix.
Change-Id: I3ee8cc225bf3b820e561c3e25a6bf38e0012e3a8 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18874 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14203:186d80c1a87f |
11-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Make H3 inherit from MultiBitSelBloomFilter
Make MultiBitSelBloomFilter a generic BloomFilter that maps multiple entries to an address, and therefore uses multiple hash functions. This allows the common functionality of both filters to be merged into one, since they only differ in the hash functions being used.
Change-Id: I0984067b710a208715f5f2727b8c4312feb6529b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18873 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
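The refactoring described in the H3 commit above, where two filters share everything except their hash functions, can be sketched in Python like this. All names are hypothetical, and the two hash functions are toy stand-ins rather than gem5's bit-selection or H3 hashes.

```python
class MultiHashFilter:
    """Hypothetical base filter: each address maps to num_hashes entries."""

    def __init__(self, size, num_hashes):
        self.bits = [0] * size
        self.num_hashes = num_hashes

    def hash(self, addr, i):
        # Subclasses only provide the i-th hash of an address.
        raise NotImplementedError

    def set(self, addr):
        for i in range(self.num_hashes):
            self.bits[self.hash(addr, i)] = 1

    def is_set(self, addr):
        return all(self.bits[self.hash(addr, i)]
                   for i in range(self.num_hashes))


class BitSelectFilter(MultiHashFilter):
    def hash(self, addr, i):
        # Toy hash: use a different 4-bit-shifted slice of the address.
        return (addr >> (4 * i)) % len(self.bits)


class XorFoldFilter(MultiHashFilter):
    def hash(self, addr, i):
        # Toy hash: fold the address onto itself with a per-hash shift.
        return (addr ^ (addr >> (7 * (i + 1)))) % len(self.bits)
```

Both subclasses inherit set() and is_set() unchanged and override only hash(), which is the shape of the merge the commit message describes.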
14202:64f03da8df1e |
10-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Finish implementing BloomFilter merge
Not all Bloom Filters had their union functionality implemented. This change adds them.
Change-Id: I86af18d3c5eabd0da8280b57a88789b3af803c04 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18872 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14201:ecd0511edbc3 |
09-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Remove NonCountingBloomFilter
Make BlockBloomFilter accept having a single bitfield, in which case it behaves exactly as the NonCountingBloomFilter, and thus the latter can be removed.
Change-Id: I56d96a89290c933293ce434bbe0e8bcd4bbcaa42 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18871 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14200:e30d3fc98da7 |
10-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Make MultiGrainBloomFilter generic
Allow combining any number of Bloom Filters in the MultiGrain.
Change-Id: I73ae33063e1feed731af6f625d2f64245f21df18 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18869 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14199:89b4c53db683 |
09-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Parameterize xor bits in BlockBloomFilter
Parameterize bitfield ranges in BlockBloomFilter such that the hash is applied between masked bitfields of an address.
Change-Id: I008bd873458e9815e98530e308491adb65bb34cb Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18870 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14197:26cca0c29be6 |
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
cpu, mem: Add new getSendFunctional method to the base CPU.
This returns a sendFunctional delegate reference which can be used to send functional accesses directly, or more likely when constructing a PortProxy subclass. In those cases only the functional capabilities of those ports are needed so there's no reason to require a full port which supports all three protocols. Also, this removes the last remaining use of get(Data|Inst)Port which relies on those returning a port which supports the gem5 protocols, except the default implementations of this new function. If a CPU doesn't have traditional gem5 style ports, it can override this function to do whatever other behavior is necessary and return its real ports through get(Data|Inst)Port.
Change-Id: Ide4da81e3bc679662cd85902ba6bd537cce54a53 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20237 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
14196:ce364f5517f3 |
15-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Make PortProxy use a delegate for a sendFunctional function.
The only part of the MasterPort the PortProxy uses is the sendFunctional function which is part of the functional protocol. Rather than require a MasterPort which comes along with a lot of other mechanisms, this change slightly adjusts the PortProxy to only require that function through the use of a delegate. That allows lots of flexibility in how the actual packet gets sent and what sends it.
In cases where code constructs a PortProxy and passes its constructor an unbound MasterPort, the PortProxy will create a delegate to the sendFunctional method on its own.
This should also make it easier for objects which don't have traditional gem5 style ports, for instance systemc models, to implement just the little bit of the protocol they need, rather than having to stub out a whole port class, most of which will be ignored.
Change-Id: I234b42ce050f12313b551a61736186ddf2c9e2c7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20229 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
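The delegate idea in the PortProxy commit above can be illustrated with a short Python sketch in which the proxy depends on a single callable rather than on a full port object. ToyPortProxy, ToyMemory and the dictionary-based packet are hypothetical and are not the gem5 classes or packet format.

```python
class ToyPortProxy:
    """Hypothetical proxy that only needs a send_functional callable."""

    def __init__(self, send_functional):
        self._send_functional = send_functional

    def write(self, addr, data):
        self._send_functional({"cmd": "write", "addr": addr,
                               "data": bytes(data)})

    def read(self, addr, size):
        pkt = {"cmd": "read", "addr": addr, "size": size, "data": b""}
        self._send_functional(pkt)  # the receiver fills in pkt["data"]
        return pkt["data"]


class ToyMemory:
    """Any object can back the proxy as long as it offers one method."""

    def __init__(self):
        self.store = {}

    def handle(self, pkt):
        if pkt["cmd"] == "write":
            for off, byte in enumerate(pkt["data"]):
                self.store[pkt["addr"] + off] = byte
        else:
            pkt["data"] = bytes(self.store.get(pkt["addr"] + off, 0)
                                for off in range(pkt["size"]))
```

Constructing the proxy as ToyPortProxy(mem.handle) shows the point of the change: the backing object only has to supply one function, not stub out an entire port class.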
14193:7dd8a6df30e2 |
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Eliminate the Base(Slave|Master)Port classes.
The Port class has assumed all the duties of the less generic Base*Port classes, making them unnecessary. Since they don't add anything but make the code more complex, this change eliminates them.
Change-Id: Ibb9c56def04465f353362595c1f1c5ac5083e5e9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20236 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
14192:595a4358b844 |
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
cpu, dev, mem: Use the new Port methods.
Use getPeer, takeOverFrom, and << to simplify the use of ports in some areas.
Change-Id: Idfbda27411b5d6b742f5e4927894302ea6d6a53d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20235 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
14189:a363edac6a12 |
16-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem, sim, systemc: Reorganize Port and co.s bind, unbind slightly.
The base Port class can keep track of its peer, and also whether it's connected. This is partially delegated away from the port subclasses which still keep track of a cast version of their peer pointer for their own convenience, so that it can be used by generic code. Even with the Port mechanism's new flexibility, each port still has exactly one peer and is either connected or not based on whether there is a peer currently.
Change-Id: Id3228617dd1604d196814254a1aadeac5ade7cde Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20232 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
14185:f4017d66f4df |
16-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Put gem5 protocols in their own directory.
This reduces clutter in the src/mem directory, and makes it clear that those protocols are for the classic gem5 memory system, not ruby, TLM, etc.
Change-Id: I6cf6b21134d82f4f01991e4fe92dbea8c7e82081 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20231 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
14184:11ac1337c5e2 |
16-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Move ruby protocols into a directory called ruby_protocol.
Now that the gem5 protocols are split out, it would be nice to put them in their own protocol directory. It's also confusing to have files called *_protocol which are not in the protocol directory.
Change-Id: I7475ee111630050a2421816dfd290921baab9f71 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20230 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14183:8116c413222e |
15-Aug-2019 |
Gabe Black <gabeblack@google.com> |
mem: Split the various protocols out of the gem5 master/slave ports.
This makes the protocols easier to see in their entirety, and makes it easier to add a new type of port which only supports the functional protocol.
Change-Id: If5d639bef45062f0a23af2ac46f50933e6a8f144 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20228 Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
14182:04d886980a5e |
21-Aug-2019 |
Ciro Santilli <ciro.santilli@arm.com> |
mem-ruby: fix build with PROTOCOL=MOESI_hammer
Was failing with:
Error: Unrecognized variable: l1i_victim_addr
since: I2c43f22aba5af3a57e54b1c435e5d3fbba86d1d5
Change-Id: I7df666acb724ee541804dd7557753a9ba4005516 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20261 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14166:bcebed5b33ad |
06-Jun-2019 |
Pablo Prieto <pablo.prieto@unican.es> |
mem-ruby, arch-hsail: Removed hit latency from VIPERCoalescer
Removed the dcache hit latency from VIPERCoalescer so HSAIL_X86 compiles after commit 496d5ed3e1f7dad42b0c2ebe0050d84621be8f99
Change-Id: I050a58d90f0f6356824c3c3bcb3f0b3c76d145e0 Signed-off-by: Pablo Prieto <pablo.prieto@unican.es> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19148 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14161:67544e7cebb3 |
07-Aug-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Use check_on_cache_probe on MI
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MI.
Change-Id: I276822e987e52f7682ff30f55880f295b6af023d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19888 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14160:d2a04d0ad93c |
07-Aug-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Use check_on_cache_probe on MOESI hammer
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MOESI hammer.
Change-Id: I2c43f22aba5af3a57e54b1c435e5d3fbba86d1d5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19891 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14159:f0a7a75d049b |
07-Aug-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Use check_on_cache_probe on MOESI CMP
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MOESI CMP.
Change-Id: I3a8879e10ebd94ef68194836475e656761fed62c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19908 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14158:adfd1421d69e |
07-Aug-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Use check_on_cache_probe on MOESI
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MOESI.
Change-Id: Ie650ccdc15bb41b4088e534975b662408aaccf24 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19890 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14156:beb72fa8eb83 |
07-Aug-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Use check_on_cache_probe to protect locked lines from eviction
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MESI Three Level.
Change-Id: Ib0de54aa067c7603db1f7321cc4825b123b641ac Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19868 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14155:4e52ec7ae960 |
27-Feb-2019 |
Pouya Fotouhi <pfotouhi@ucdavis.edu> |
mem-ruby: Use check_on_cache_probe to protect locked lines from eviction
This change uses check_on_cache_probe statement to check if the cacheline subject to eviction is locked in MESI Two Level. Other protocols should be updated accordingly.
Signed-off-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Change-Id: Idcdbc8ee528eb5e4e2f8d56a268a3a92eadd95b1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16809 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14138:4d94d8df94aa |
08-Aug-2019 |
Brandon Potter <brandon.potter@amd.com> |
sim-se: remove unused parameter
The init function that processes invoke on their page tables takes a thread context pointer parameter. The parameter is not used by the code, so remove it.
Change-Id: Ic4766fbc105d81c1c9ee4b5c0f428497dff2ab30 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19948 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14131:f0529ae28f97 |
02-Aug-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix non-virtual base destructor of Repl Entry
ReplaceableEntry contains a virtual method, yet its destructor was not virtual, causing errors in some compilers.
Change-Id: I13deec843f4007d9deb924882a8d98ff6a89c84f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19808 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14125:89db448f81e6 |
30-Jul-2019 |
Pouya Fotouhi <Pouya.Fotouhi@amd.com> |
mem-ruby: Remove assertion with incorrect assumption
Current code assumes that only one cacheline would be in RW. This is not true for GPU protocols, and may not be true for some CPU-only protocols with state violations.
Change-Id: I70db4fbb4e80663551e8635307bb937a4db8dc63 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19708 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14123:3525a51d01ef |
21-Jun-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Move eraseIfNullEntry to when holder is updated
The entry should only be tested for deletion when holder is updated.
Change-Id: I5a10b6fa876912709b7467860d43c23c60f38568 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19750 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14122:11979370f6f8 |
17-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Encapsulate retry variables of SnoopFilter
Group all variables related to the restoration of a snoop filter entry due to a crossbar retry.
Change-Id: I4e03edb3afd06563b7a5812959739876709eceeb Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19749 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14118:3d2ee7721eb0 |
29-Jul-2019 |
Tiago Mück <tiago.muck@arm.com> |
mem-cache: mark block as dirty when handling SW prefetch
This addresses the issue described in 64687ee mem-cache: Mark block as dirty after a SWPrefetchEXResp.
Previous patch misses cases when the prefetch response is ReadExResp or UpgradeResp. Also, marking the block as dirty in serviceMSHRTargets instead of in handleFill covers cases when the prefetch is coalesced with other requests.
Change-Id: I2b377fdd240eb0f09e720b6bb284dee6545925ce Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19688 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14117:2f88285aaa8b |
30-Jul-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix set and way of sub-entries
Set and way of sub-entries were not being set previously. They must be set after the sub-blocks have been assigned to the main block.
Change-Id: I7b6921b8437b29c472d691cd78cf20f2bb6c7e07 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19669 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14097:1f8f4c773c49 |
27-Feb-2019 |
Pouya Fotouhi <pfotouhi@ucdavis.edu> |
mem-ruby: Adding a new slicc statement - to not evict locked cachelines
Ruby caches block incoming ports with messages on a locked address to make sure the line would not be replaced by others. But they do not check the lock upon capacity/conflict misses.
This change adds a new slicc statement "check_on_cache_probe" which takes two arguments (mandatoryQueue for the controller, and the line subject to eviction - i.e. address returned by cacheProbe). If the line is locked, incoming message is delayed for 1 cycle and the controller skips this request (i.e. does not trigger an event).
Coherence protocols should be updated accordingly. One use case for MESI Two Level will be added in a separate change.
Signed-off-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Change-Id: I79ca2d45518de7a4e382b520a11f8e221b0cb803 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16808 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Srikant Bharadwaj <srikant.bharadwaj@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
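The behaviour described for check_on_cache_probe above (delay the incoming message by one cycle and skip the request when the would-be victim is locked) can be modelled roughly as below. This is a Python sketch, not SLICC or generated controller code; the function signature, the ready_at field and the locked_lines set are assumptions made for illustration.

```python
def check_on_cache_probe(mandatory_queue, victim_addr, locked_lines, now):
    """Return True if the head request must be skipped this cycle.

    mandatory_queue: list of dict messages, head at index 0 (hypothetical)
    victim_addr:     address returned by the cache probe
    locked_lines:    set of addresses currently locked
    now:             current cycle number
    """
    if victim_addr in locked_lines:
        # Do not evict a locked line: retry the request one cycle later.
        mandatory_queue[0]["ready_at"] = now + 1
        return True
    return False
```

A caller would test the return value before triggering the replacement event, and simply move on to other work when it is True.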
14089:fe1e5813d62c |
30-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create CPack compressor
Implementation of C-Pack, as described in "C-Pack: A High-Performance Microprocessor Cache Compression Algorithm".
C-Pack uses pattern matching schemes to detect and compress frequently appearing data patterns. As in the original paper, it divides the input in 32-bit words, and uses 6 patterns to match with its dictionary.
For the patterns, each letter represents a byte: Z is a null byte, M is a dictionary match, X is a new value. The patterns are ZZZZ, XXXX, MMMM, MMXX, ZZZX, MMMX.
Change-Id: I2efc9db2c862620dcc1155300e39be558f9017e0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11105 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
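To make the six C-Pack patterns listed above concrete, here is a small Python sketch that classifies one 32-bit word against a dictionary. It is not the gem5 compressor: classify_word is a hypothetical helper, the order in which patterns are tried is this sketch's choice, and it assumes the pattern letters run from the most significant byte to the least significant byte.

```python
def classify_word(word, dictionary):
    """Return the pattern name for a 32-bit word (illustrative only)."""
    b = word.to_bytes(4, "big")
    entries = [e.to_bytes(4, "big") for e in dictionary]

    if word == 0:
        return "ZZZZ"  # four null bytes
    if b in entries:
        return "MMMM"  # all four bytes match a dictionary entry
    if b[:3] == b"\x00\x00\x00":
        return "ZZZX"  # three null bytes, one new byte
    if any(e[:3] == b[:3] for e in entries):
        return "MMMX"  # upper three bytes match, one new byte
    if any(e[:2] == b[:2] for e in entries):
        return "MMXX"  # upper two bytes match, two new bytes
    return "XXXX"      # nothing matches, the word is stored as is
```

For a dictionary holding only 0x12345678, the word 0x123456FF classifies as MMMX and 0x1234FFFF as MMXX.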
14084:9e60f14d5f0d |
11-Jul-2019 |
Chun-Chen TK Hsu <chunchenhsu@google.com> |
mem: Check response only when needed in CommMonitor
CommMonitor checks pkt->isResponse() for all packets in recvAtomic(). This assertion fails when packets don't need response, such as WritebackDirty. This change fixes this.
Signed-off-by: Chun-Chen TK Hsu Change-Id: I168e349e179b14fa5472698d9300478dc89693fb Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19428 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14079:0d35b63510ed |
08-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Fix MultiGrainBloomFilter accessing
When accessing the page filter the page hash should be used instead of the hash of the base filter.
Change-Id: I17b7c64f2a0d654c7d9a77a7bfb435385d81032c Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18739 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14078:90e52798b6ce |
07-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Remove Bloom Filters' ruby dependency
Substitute the common ruby header by base's bitfield to eliminate all ruby dependency in Bloom Filters.
As a side note, BulkBloomFilter now assumes addresses are 64 bit long.
Change-Id: Ibdb1f926ddcc06c848851c1e6a34863541808360 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18738 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14077:47938885514e |
07-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Parameterize block size in Bloom Filters
Substitute all occurrences of Ruby's block size by a Python configurable offset.
Change-Id: If4913e842921447deda943b0482fb0c78a44c275 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18737 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14076:271aa7778eb0 |
06-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Make Bloom Filters SimObjects
Make all bloom filters SimObjects.
Change-Id: I586293cdfb559361cb868b3198368e8b9b193356 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18736 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14075:0bfb08f318dd |
05-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Generalize use of bloom filters' isSet
In general the corresponding entries of an address are considered to be set when the sum of all of them reaches their maximum value (i.e., they are all set), so generalize that into the base class.
Change-Id: If50b8c56065ad339b4ff2322ddc3c077a3bfc518 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18735 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14074:45ed745858cc |
05-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Cleanup filters
Renamed member variables to comply with general naming conventions outside of the ruby folder so that the filters can be moved out.
Moved code to base to reduce code duplication.
Renamed the private get_index functions to hash, to make their functionality explicit.
Change-Id: Ic6519cfc5e09ea95bc502a29b27f750f04eda754 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18734 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14073:4a435d5c63f2 |
08-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Fix MultiGrainBloomFilter total count calculation
Previous value was always 0, and was never incrementing. The total count should take into account the value stored in the entry.
Change-Id: I93813e3f388198967b30cf11848a8a8c3a7b91f4 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18733 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14072:aede6dbe889e |
05-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Remove index based functions in bloom filters
Most of the index based functions were not implemented, and a user is more likely to be interested in checking the filter contents based on an address than an index.
As a side effect, the Bulk's hash function became unused, and according to the paper permute() was doing more than just permuting, so it was renamed.
Change-Id: I6423a2565a082fee2e7f11fa489a11f253064d99 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18732 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14071:054392802955 |
07-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Remove Bloom Filter's print()
Print was unused. As a side effect 'using namespace std' is no longer needed.
Change-Id: Ief10cba1a11dfdd4edb7464eb9291fc83d6668cd Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18731 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14070:104aec37a31f |
08-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Standardize Bloom Filter deletion support
Standard Bloom Filters do not support element deletion by default, however some variants do. Allow calling the unset function with all filters, and do nothing by default.
Change-Id: Icf4b0f8b997c4c70fa714b2576474810275db78b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18730 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
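The deletion behaviour described in entry 14070 above can be sketched as a virtual unset() that does nothing in the base class and is overridden only by variants that support deletion. This is a minimal sketch under assumed names (FilterBaseSketch, CountingFilterSketch, index); it is not the gem5 implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain filter: one entry per address, no deletion.
class FilterBaseSketch
{
  public:
    explicit FilterBaseSketch(std::size_t size) : filter(size, 0)
    {
        assert(size > 0);
    }
    virtual ~FilterBaseSketch() = default;

    virtual void set(unsigned long addr) { filter[index(addr)] = 1; }

    // Standard Bloom filters cannot delete, so the default does nothing.
    virtual void unset(unsigned long /* addr */) {}

    int getCount(unsigned long addr) const { return filter[index(addr)]; }

  protected:
    std::size_t index(unsigned long addr) const
    {
        return addr % filter.size();
    }
    std::vector<int> filter;
};

// Hypothetical counting variant: entries are counters, so deletion works.
class CountingFilterSketch : public FilterBaseSketch
{
  public:
    using FilterBaseSketch::FilterBaseSketch;

    void set(unsigned long addr) override { ++filter[index(addr)]; }

    void unset(unsigned long addr) override
    {
        int &entry = filter[index(addr)];
        if (entry > 0)
            --entry;
    }
};
```

Callers can then invoke unset() on any filter through the base interface without checking which variant they hold.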
14069:bdafa68f3cce |
06-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Bloom filters - Remove in/decrement
Increment and decrement were functions created to supply the different naming convention used by the counting bloom filter. They were removed, and the set and unset functions were used in their place instead, as in the other filters.
Change-Id: I45732bdfa3083add0a975f374a0f3560003e9d09 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18729 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14038:8ba13d8b7810 |
01-May-2019 |
Matthew Poremba <matthew.poremba@amd.com> |
mem: Option to toggle DRAM low-power states
Add an option to enable DRAM low-power states. The low-power states can have a significant impact on application performance (sim_ticks), on the order of 2-3x, especially for compute-gpu apps. The option allows them to be easily enabled or disabled to compare performance numbers. The option is disabled by default.
Change-Id: Ib9bddbb792a1a6a4afb5339003472ff8f00a5859 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18548 Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14037:a1e12c851596 |
20-Apr-2017 |
John Alsop <johnathan.alsop@amd.com> |
mem-ruby: Enable set size increase
Add NUMBER_BITS_PER_SET environment variable to control the size of the bitmask in Set.hh (default=64). Necessary for configs which require >64 instances of a given machine type. This can be set in the build_opts file, e.g. by adding the following line: NUMBER_BITS_PER_SET = <number>
Change-Id: I314a3cadca8ce975fcf4a60d9022494751688e88 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18968 Reviewed-by: Tiago Mück <tiago.muck@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14035:60068a2d56e0 |
31-May-2019 |
Daniel Carvalho <odanrc@yahoo.com.br> |
Revert "mem-cache: Remove writebacks packet list"
This reverts commit bf0a722acdd8247602e83720a5f81a0b69c76250.
Reason for revert: This patch introduces a bug:
The problem here is that the insertion of block A may cause the eviction of block B, which on the lower level may cause the eviction of block A. Since A is not marked as present yet, A is "safely" removed from the snoop filter.
However, with this revert, using a Tags sub-class that can generate multiple evictions at once becomes broken in atomic mode; this shall be fixed in a future patch.
Change-Id: I5b27e54b54ae5b50255588835c1a2ebf3015f002 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19088 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14019:4732393f8210 |
02-May-2019 |
Gabe Black <gabeblack@google.com> |
mem: Remove the now unused Copy* methods from the FS port proxy.
Change-Id: Ie433a9e4c9ee748911060eb7b1b47e617aa297a6 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18576 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14015:e709cec78417 |
16-May-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Accuracy-based rate control for prefetchers
Added a mechanism to control the number of prefetches generated based on the effectiveness of the prefetches generated so far.
Change-Id: I33af82546f74a5b5ab372c28574b76dd9a1bd46a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18808 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14013:aeb3ca1762bb |
27-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Support for page crossing prefetches
Prefetchers can now issue hardware prefetch requests that go beyond the boundaries of the system page. Page crossing references will need to look up the TLBs to be able to compute the physical address to be prefetched.
Change-Id: Ib56374097e3b7dc87414139d210ea9272f96b06b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14620 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
14012:1bdf42ed6add |
02-May-2019 |
Gabe Black <gabeblack@google.com> |
mem: Add a readString method to the PortProxy which takes a char *.
This version takes a char * instead of an std::string &, and a maximum length to fill in like strncpy. This is intended to be a replacement for the CopyStringOut function.
Change-Id: Ib661924a3fa7e05761d572ffecbe2c0cc8659d48 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18574 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
14011:faf0a568ba6b |
01-May-2019 |
Gabe Black <gabeblack@google.com> |
mem: Use a const T & in write<> to avoid an unnecessary copy.
If the type T is complex or large, it makes sense to access it in place rather than copy it, since it is never modified.
Change-Id: Idd24be4fbba636375637ff72b1ba5ee32eb76215 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18573 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
14009:a4b36ce75361 |
01-May-2019 |
Gabe Black <gabeblack@google.com> |
mem, arm: Replace the pointer type in PortProxy with void *.
The void * type is for pointers which point to an unknown type. We should use that when handling anonymous buffers in the PortProxy functions, instead of uint8_t * which points to bytes.
Importantly, C/C++ doesn't require you to do any casting to turn an arbitrary pointer type into a void *. This will get rid of lots of tedious, verbose casting throughout the code base.
Change-Id: Id1adecc283c866d8e24524efd64f37b079088bd9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18571 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
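The point made in entry 14009 above, that any object pointer converts to void * without a cast, can be illustrated with a small buffer-writing helper. The function names (writeBlobSketch, writeSketch) and the use of a std::vector as stand-in memory are assumptions for this example and do not reflect the real PortProxy interface. The typed wrapper also takes its argument by const reference, in the spirit of entry 14011.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical blob writer: the source buffer is anonymous, so it is
// passed as const void * rather than const uint8_t *.
inline void writeBlobSketch(std::vector<std::uint8_t> &mem, std::size_t addr,
                            const void *p, std::size_t size)
{
    assert(addr + size <= mem.size());
    std::memcpy(mem.data() + addr, p, size);
}

// Typed wrapper: &data converts to const void * implicitly, so no
// reinterpret_cast is needed. Taking const T & avoids copying data.
template <typename T>
void writeSketch(std::vector<std::uint8_t> &mem, std::size_t addr,
                 const T &data)
{
    writeBlobSketch(mem, addr, &data, sizeof(T));
}
```

Had the helper taken const uint8_t *, every call site with a non-byte type would need an explicit cast.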
14008:e36048ba1c2c |
01-May-2019 |
Gabe Black <gabeblack@google.com> |
mem, arm: Move some helper methods into the base PortProxy class.
These were originally in the SETranslatingPortProxy class, but they're not specific to SE mode in any way and are an unnecessary divergence between the SE and FS mode translating port proxies.
Change-Id: I8cb77531cc287bd15b2386410ffa7b43cdfa67d0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18570 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
14007:36f842f523c6 |
01-May-2019 |
Gabe Black <gabeblack@google.com> |
arm, mem: Move the SecurePortProxy subclass into its own file.
The idea of a "secure" memory area/access is specific to ARM and shouldn't be in the common mem directory, although it's built in to the generic memory protocol at this point.
Regardless, it should minimally be in its own file, like the virtual and physical port proxy classes are.
Change-Id: I140d4566ee2deded784adb04bcf6f11755a85c0c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18569 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14006:5258c91ede20 |
04-Mar-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem: Parameterize coherent xbar sanity checks
Parameters can be used to change coherent xbar limits for the routing table and outstanding snoops. We need the ability to tweak these values as the current defaults may be violated in simulations with large core counts.
Change-Id: Idb64b8c105683d02d8beba5bce13b815181ba824 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18789 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14005:6cad91d6136c |
07-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem: Snoop filter support for large systems
Changed SnoopMask to use std::bitset instead of uint64 so we can simulate larger systems without having to work around limitations on the number of ports. No noticeable performance drop was observed after this change. The size of the bitset is currently set to 256, which should fit most needs.
Change-Id: I216882300500e2dcb789889756e73a1033271621 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18791 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
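A minimal sketch of the idea in entry 14005 above: a std::bitset supports the same mask operations as a 64-bit integer but is not limited to 64 ports. The width of 256 comes from the commit message; the alias and helper names (SnoopMaskSketch, kMaxPortsSketch, portToMask) are made up for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// 256 matches the size quoted in the commit message; the names are
// illustrative only.
constexpr std::size_t kMaxPortsSketch = 256;
using SnoopMaskSketch = std::bitset<kMaxPortsSketch>;

// Build a mask with only the bit of the given port set.
inline SnoopMaskSketch portToMask(std::size_t port_id)
{
    assert(port_id < kMaxPortsSketch);
    SnoopMaskSketch mask;
    mask.set(port_id);
    return mask;
}
```

The usual |, & and ~ operators work unchanged, so code written against an integer mask mostly needs only its comparisons with zero replaced by none() or any().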
13998:2feca2ebe67b |
12-Dec-2018 |
Tiago Muck <tiago.muck@arm.com> |
mem: Add invalid context id check on LLSC checks
If the request's address is in the LLSC list, its context Id was being fetched unconditionally, which could cause the assert at Request::contextId() to fail.
Change-Id: Iae9791f81c8fe9a7fcd842cd8ab7db18f34f2808 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18792 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13991:102d94094d6b |
12-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem-cache: Add multi-prefetcher adaptor
This patch adds a meta-prefetcher that enables gem5's cache models to connect to multiple prefetchers. Sub-prefetchers still use the probes-based interface and training can be controlled independently. However, when the cache requests a prefetch packet, the adaptor traverses the priority list of prefetchers and uses the first prefetcher that is able to generate a prefetch.
Kudos to Mitch Hayenga for the original version of this patch.
Change-Id: I25569a834997e5404c7183ec995d212912c5dcdf Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18868 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
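The selection rule described in entry 13991 above (walk the priority list and use the first sub-prefetcher that can produce a prefetch) might look roughly like the following. PrefetcherSketch and MultiPrefetcherSketch are hypothetical stand-ins; the real gem5 classes deal in packets and probes, which are omitted here.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical sub-prefetcher: training simply queues an address.
class PrefetcherSketch
{
  public:
    void train(std::uint64_t addr) { pending.push_back(addr); }
    bool hasPrefetch() const { return !pending.empty(); }
    std::uint64_t getPrefetch()
    {
        assert(hasPrefetch());
        std::uint64_t addr = pending.front();
        pending.pop_front();
        return addr;
    }

  private:
    std::deque<std::uint64_t> pending;
};

// Hypothetical adaptor: sub-prefetchers are stored in priority order.
class MultiPrefetcherSketch
{
  public:
    explicit MultiPrefetcherSketch(std::vector<PrefetcherSketch *> list)
        : prefetchers(std::move(list)) {}

    // Returns true and fills addr from the first sub-prefetcher that
    // has something to offer; returns false if none does.
    bool getPrefetch(std::uint64_t &addr)
    {
        for (PrefetcherSketch *pf : prefetchers) {
            if (pf->hasPrefetch()) {
                addr = pf->getPrefetch();
                return true;
            }
        }
        return false;
    }

  private:
    std::vector<PrefetcherSketch *> prefetchers;
};
```

Each sub-prefetcher is still trained independently; only the request path goes through the adaptor.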
13981:577196ddd040 |
02-May-2019 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code.
This code will be preserved through version control, but otherwise creates clutter and will rot in place since it's never compiled.
Change-Id: Id265f6deac445116843956ea5cf1210d8127274e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13976:48a3d50649a1 |
04-Apr-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: MOESI_CMP_dir cleanup
Removed unused states and actions
Change-Id: I3dc684c78d4b92d219e71522ddb706a13f9874d1 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18415 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: John Alsop <johnathan.alsop@amd.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13975:31372ed09a54 |
19-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Cache latencies for MOESI_CMP_dir
Modified both L1 and L2 controllers to take into account the cache latency parameters. Default values in the configuration script updated as well.
Change-Id: I72bb8dd29ee0b02da06e1addf13b266fe4d1e979 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18414 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13974:af47a3ae0f6b |
19-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Hit latencies defined by the controllers
Removed the icache/dcache hit latency parameters from the Sequencer. They were replaced by the mandatory queue enqueue latency that is now defined by the top-level cache controller. By default, the latency is defined by the mandatory_queue_latency parameter. When the latency depends on specific protocol states or on the request type, the protocol may override the mandatoryQueueLatency function.
Change-Id: I72e57a7ea49501ef81dc7f591bef14134274647c Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18413 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13973:2f953d25716b |
25-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Do not change blocked msg enqueue info
Updating the message counter and enqueue times when adding blocked messages back to the queue does not make a lot of sense since these messages are not new arrivals. More importantly, this may lead to starvation. See the scenario below:
1) Request A for a blocked line X arrives
2) A is handled; X is blocked so A is stalled
3) Request B for X arrives; Response for X arrives
4) Response is handled; X unblocked; A added back to the request queue
5) B is handled ahead of A (since A's arrival was updated); X may become blocked again
If new requests keep coming for X, A may be stalled forever.
Change-Id: Icad79f3f716a870e91cb3455437b8b3c35f130ac Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18412 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
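The starvation scenario in entry 13973 above can be reproduced with a toy buffer ordered by enqueue time. The sketch assumes a simple priority queue and made-up names (MsgSketch, BufferSketch, reinsert); Ruby's MessageBuffer is more involved, but the ordering argument is the same: a reinserted message keeps its original arrival time and so stays ahead of later arrivals.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

struct MsgSketch
{
    char id;
    std::uint64_t enqueueTick;  // set once, on first arrival
};

// Orders the heap so that the oldest message is dequeued first.
struct OlderFirstSketch
{
    bool operator()(const MsgSketch &a, const MsgSketch &b) const
    {
        return a.enqueueTick > b.enqueueTick;
    }
};

class BufferSketch
{
  public:
    // New arrival: stamp the message with the current tick.
    void enqueue(char id, std::uint64_t now)
    {
        queue.push(MsgSketch{id, now});
    }

    // Previously stalled message: put it back without touching its
    // enqueue time, so it is not treated as a new arrival.
    void reinsert(const MsgSketch &msg) { queue.push(msg); }

    MsgSketch dequeue()
    {
        assert(!queue.empty());
        MsgSketch msg = queue.top();
        queue.pop();
        return msg;
    }

  private:
    std::priority_queue<MsgSketch, std::vector<MsgSketch>,
                        OlderFirstSketch> queue;
};
```

If reinsert() restamped the message with the current tick instead, request B from step 3 would be dequeued before A, which is the behaviour the commit removes.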
13972:b67844f26cd8 |
19-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Unique ranks for MOESI_CMP_dir in ports
Set different values for the rank parameter of all input ports. If left unset, it defaults to 0. This may cause issues since the rank is used as an index in the controller's list of stalled buffers.
Change-Id: Ie8ff660b7450df959292311040aebf802657efcf Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18411 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13971:0201983aad69 |
14-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Change MOESI_CMP_Dir L2 addressing
L1 controller selects the L2 to message based on the assigned address ranges instead of explicitly interleaving bits in the L1 controller. This simplifies the L1 controller implementation a bit and allows for more flexibility when changing the address->controller mapping.
Change-Id: Ie67999bb977566939432a5045f65dbd2da81816a Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18410 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13970:b5ae3dd624d4 |
14-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Fix MOESI_CMP_dir debug msg
Change-Id: I3fd32bd2e81dbf9a8ea49a43727564b8a9d64767 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18409 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13969:6893a5af1f06 |
07-Feb-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Prevent response stalls on MOESI_CMP_directory
When a message triggers a transition that has actions which allocate TBEs, the generated code automatically includes a check for the TBETable size before executing any action. If the table is full, the transition returns TransitionResult_ResourceStall and no more messages from the buffer are handled (until the next cycle).
This behavior may lead to deadlocks in the MOESI_CMP_directory protocol since events triggered by the response queue may allocate TBEs (e.g. L2 replacements triggered by the response queue). If the table is full, the queue is stalled preventing other responses from freeing TBEs.
This patch fixes this by handling WRITEBACK_DIRTY_DATA/CLEAN_DATA messages as requests and WB_ACK/WB_NACK as responses. All controllers are changed to work with the new types. With this fix, responses are always handled first in all controllers, and no response triggers TBE allocations.
Change-Id: I377c0ec4f06d528e9f0541daf3dcc621184f2524 Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18408 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: John Alsop <johnathan.alsop@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13963:94555f0223ba |
11-Apr-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Use SatCounter for prefetchers
Many prefetchers re-implement saturating counters with ints. Make them use SatCounters instead.
Added missing operators and constructors to SatCounter for that to be possible and their respective tests.
Change-Id: I36f10c89c27c9b3d1bf461e9ea546920f6ebb888 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17995 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
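For readers unfamiliar with the term in entry 13963 above, a saturating counter is one that stops at its limits instead of wrapping around. The class below is a self-contained sketch with an assumed interface (SatCounterSketch, a bit width plus an optional initial value); it is not a copy of gem5's SatCounter.

```cpp
#include <cassert>

// Hypothetical n-bit saturating counter.
class SatCounterSketch
{
  public:
    explicit SatCounterSketch(unsigned bits, unsigned initial = 0)
        : maxVal(maxFor(bits)), counter(initial)
    {
        assert(initial <= maxVal);
    }

    // Increment, but never go past the maximum value.
    SatCounterSketch &operator++()
    {
        if (counter < maxVal)
            ++counter;
        return *this;
    }

    // Decrement, but never go below zero.
    SatCounterSketch &operator--()
    {
        if (counter > 0)
            --counter;
        return *this;
    }

    operator unsigned() const { return counter; }

  private:
    static unsigned maxFor(unsigned bits)
    {
        assert(bits > 0 && bits < 32);
        return (1u << bits) - 1;
    }

    const unsigned maxVal;
    unsigned counter;
};
```

A prefetcher that previously clamped a plain int by hand can then rely on ++ and -- to do the clamping.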
13956:0a8aa25fb57e |
06-May-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Replace string parameter in MultiBitSelBloomFilter
Replace the string parameter of MultiBitSelBloomFilter's constructor with its tokenized counterparts.
Change-Id: I2e3db109dc4814fa0e9c13259f1136a6c4083092 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18728 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13954:2f400a5f2627 |
07-Jul-2017 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors.
Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
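The partial (masked) store described in entry 13954 above boils down to a per-byte enable flag. The helper below is a sketch on a plain byte vector; the name maskedStore and its signature are assumptions for illustration, and the real change threads the mask through requests and packets instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Write data into mem at offset, touching only the bytes whose
// byte_enable flag is true. Disabled bytes keep their old value.
inline void maskedStore(std::vector<std::uint8_t> &mem, std::size_t offset,
                        const std::vector<std::uint8_t> &data,
                        const std::vector<bool> &byte_enable)
{
    assert(data.size() == byte_enable.size());
    assert(offset + data.size() <= mem.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (byte_enable[i])
            mem[offset + i] = data[i];
    }
}
```

A masked load works the same way in the other direction: only enabled bytes are copied out of memory.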
13949:56b621267cf0 |
24-Jan-2019 |
Tiago Muck <tiago.muck@arm.com> |
mem-ruby: Fix MOESI_CMP_directory blocked line handling
Using recycle in the L2 controllers to put messages back into the buffer may lead to starvation when there are many L1 requests for the same line. This can easily trigger the deadlock detection mechanism in configurations with many cores (16+). Replacing recycle by stall_and_wait for L1 requests avoids this issue. wakeUpBuffers calls were added to all transitions from transient to stable states.
Change-Id: I28b8aeacc48919ccf38e69653cd9205a4153514b Signed-off-by: Tiago Muck <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17568 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Matthew Poremba <matthew.poremba@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13948:f8666d4d5855 |
18-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks packet list
Previously all atomic writebacks concerned a single block, therefore, when a block was evicted, no other block would be pending eviction. With sector tags (and compression), however, a single replacement can generate many evictions.
This can cause problems, since a writeback that evicts a block may evict blocks in the lower cache. If one of these conflicts with one of the blocks pending eviction in the higher level, the snoop must inform the lower level of it. Since atomic mode does not have a writebuffer, this kind of conflict wouldn't be noticed.
Therefore, instead of evicting multiple blocks at once, we do it one by one.
Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13947:4cf8087cab09 |
08-Aug-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Handle data expansion
When a block in compressed form is overwritten, it may change its size. If the new compressed size is bigger, and the total size becomes bigger than the block size, one or more blocks will have to be evicted. This is called data expansion, or fat writes.
This change assumes that a first level cache cannot have a compressor, since otherwise data expansion should have been handled for atomic operations and writes. As such, data expansions should only be seen on writebacks. As writebacks are forwarded to the next level when failed, there should be no data expansions when servicing misses either.
This patch adds the functionality to handle data expansions by evicting the co-allocated blocks to make room for an expanded block.
Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13946:8e96e9be7f2c |
19-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add co-allocation function to compressed tags
Implement a co-allocation function in compressed tags, so that compressed blocks can be co-allocated in a superblock. Co-allocation is possible when compression ratio (CR) blocks that share a superblock tag can be compressed to up to (100/CR)% of their size.
Change-Id: I937cc1fcbb488e70309cb5478c12db65f1b4b23f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11411 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
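One way to read the co-allocation condition in entry 13946 above is as a per-block size check: with a compression ratio of CR, a block fits alongside its superblock neighbours if it compresses to at most 1/CR of the uncompressed block size. The function below is a sketch of that reading only; its name and parameters are assumptions, and the actual gem5 check may differ.

```cpp
#include <cassert>
#include <cstddef>

// Returns true if a block of compressed_bits can share a data entry
// with up to compression_ratio blocks of blk_size_bits each.
inline bool canCoAllocateSketch(std::size_t compressed_bits,
                                std::size_t blk_size_bits,
                                std::size_t compression_ratio)
{
    assert(compression_ratio > 0);
    return compressed_bits <= blk_size_bits / compression_ratio;
}
```

For example, with 512-bit blocks and CR = 2, a block must compress to 256 bits (50% of its size) or less to be co-allocated.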
13945:a573bed35a8b |
19-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression and decompression calls
Add a compressor to the base cache class and compress within block allocation and decompress on writebacks.
This change does not implement data expansion (fat writes) yet, nor does it add the compression latency to the block write time.
Change-Id: Ie36db65f7487c9b05ec4aedebc2c7651b4cb4821 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11410 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13944:5000533e6b81 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BDI Compressor
Implement Base-Delta-Immediate compression, as described in 'Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches'
Change-Id: I7980c340ab53a086b748f4b2108de4adc775fac8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11412 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13943:4046b0c547be |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression stats
Add compression statistics to the compressors. It tracks the number of blocks that can fit into a certain power of two size, and the number of decompressions.
For example, if a block is compressed to 100 bits, it will belong to the 128-bits compression size. Although it could also fit bigger sizes, they are not taken into account for the stats (i.e., the 100-bit compression will fit only the 128-bits size, not 256 or higher).
We save stats for compressions that fail (i.e., compressed size is bigger than original cache line size).
Change-Id: Idab71a40a660e33259908ccd880e42a880b5ee06 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11103 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
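The bucketing rule from entry 13943 above (a 100-bit result counts toward the 128-bit size only) amounts to rounding the compressed size up to the next power of two. A minimal sketch, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstddef>

// Smallest power of two (in bits) that can hold a compressed block.
inline std::size_t compressionBucketBits(std::size_t compressed_bits)
{
    assert(compressed_bits > 0);
    std::size_t bucket = 1;
    while (bucket < compressed_bits)
        bucket <<= 1;
    return bucket;
}
```

Each compression then increments exactly one counter, the one for the returned bucket, rather than every bucket the block could fit into.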
13942:e8b59b523af6 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create cache compressor
Create basic template for cache compressors. A basic compressor must implement a compression and a decompression method.
Change-Id: I83dc4d2b8d2bc5ed9f760c938edfa4ebdd6b8583 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11100 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13941:2c19da00ef9c |
15-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add block size to findVictim
Add block size to findVictim. For standard caches it will not be used. Compressed caches, however, need to know the size of the compressed block to decide whether a block is co-allocatable or not.
Change-Id: Id07f79763687b29f75d707c080fa9bd978a408aa Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11198 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Mohammad Seyedzadeh <sm.seyedzade@gmail.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13940:33cc30e2de52 |
30-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add compression data to CompressionBlk
Add a compression bit, decompression latency and compressed block size and their respective getters and setters.
Change-Id: Ia9d8656552d60e8d4e85fe5379dd75fc5adb0abe Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11102 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13939:c9e81d00a992 |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create CacheComp debug flag
Create a debug flag for cache compression.
Change-Id: Id4b8e86d658d3aa550906ee0f8da3b54f4cdab7d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11104 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13938:14f80b6b37c1 |
29-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Stub compression framework
Create a stub of a compression framework where we can have multiple data blocks per tag entry. Only consecutive blocks can share a tag as of now.
For each tag entry there can be multiple data blocks. We have the same number of tags a conventional cache would have, but we instantiate the maximum number of data blocks (according to the compression ratio) per tag, to virtually implement compression without increasing the complexity of the simulator.
Change-Id: I549940c7afb2f744ab293ff8bb283967e7551a11 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/10763 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13932:24f825a9a080 |
07-Mar-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Mark block as dirty after a SWPrefetchEXResp
This is a workaround for a bug introduced from the change: 59e3585a8 arch-arm: We add PRFM PST instruction for arm which can cause deadlocks in the memory system.
The design of the classic memory system in gem5 makes the following two assumptions:
* A cache that fetches a block with an intention to modify it becomes the point of ordering and therefore commits to respond to any snoop requests [1].
* A cache that fetches an exclusive copy of the block does so with the intention to modify it [2]. Immediately after it receives the block, it will write to it and mark it as dirty. As the point of ordering, it responds to any outstanding snoops.
The current implementation of prefetch exclusive request breaks the second assumption. A cache can fetch an exclusive block without an immediate intention to modify it. If the block is not modified, it will not be marked as dirty. However, the cache has committed to respond to outstanding snoops, and if the block is clean it won't. This can result in deadlocks where a snoop gets stuck waiting for responses.
One solution (implemented by this patch) is to unconditionally mark the block dirty when filling due to a prefetch exclusive request. This makes the PrefetchExReq behave like a WriteReq. However, as it may mark a clean block as dirty, it creates the requirement for an unnecessary WritebackDirty in the future. In practice, this shouldn't be a big problem unless the application is unnecessarily using prefetch exclusive instructions.
Other solutions would require deeper changes to the design of the memory system to handle this properly.
[1]: When a cache commits to respond, it "informs" the xbar/PoC (point of coherence) and the other caches of its intention to respond. As a result the request will not be sent to the main memory.
[2]: In fact the assumption is that in the needsWritable MSHR there is at least one WriteReq before any snoops from other caches.
Change-Id: I378d3c0dadf25fc52e430b67102347b44d2f18ea Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17729 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Tested-by: kokoro <noreply+kokoro@google.com> |
13893:0e863b6c441a |
24-Apr-2019 |
Gabe Black <gabeblack@google.com> |
mem: Remove the ISA specialized versions of port proxy's read/write.
These selected their behavior based on ifdefs and had to be disabled when on the NULL ISA. The versions which take an explicit endianness have been renamed to just read/write instead of readGtoH and writeHtoG since the direction of the translation is obvious from context.
Change-Id: I6cfbfda6c4481962d442d3370534e50532d41814 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18372 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13892:0182a0601f66 |
22-Apr-2019 |
Gabe Black <gabeblack@google.com> |
mem: Minimize the use of MemObject.
MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
13875:656d633621fa |
23-Apr-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
cpu,mem: missing override specifier
Change-Id: I731d3ef021596450ac307461f215760a148bb28a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18348 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13867:9b10bbcf0543 |
15-Apr-2019 |
Alexandru Dutu <alexandru.dutu@amd.com> |
sim-se: Enhance clone for X86KvmCPU
This changeset enables clone to work with X86KvmCPU model, which will allow running multi-threaded applications at near hardware speeds. Even though the application is multi-threaded, the KvmCPU model uses one event queue, therefore, only one hardware thread will be used, through KVM, to simulate multiple application threads.
Change-Id: I2b2a7b1edb1c56eeb9c4fa0553cd236029cd53f8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18268 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13866:d0829f20374a |
22-Apr-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Fix fix of replacement count
Commit 7976b561de61b7523ca9a860154ad7ba701d12a7 tried fixing replacement update when a single location can be associated to multiple blocks.
Although the comment of the correct action was added, the proper validation check was forgotten. This change adds that check and moves doing the eviction to when there is a valid block.
Change-Id: I31d8bb914ccfd1849e9d97464d70a58a62f59533 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18210 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13863:f7391cb38ce7 |
18-Apr-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix increasing replacement count
Replacements should be increased when there is any evicted block, which does not necessarily have to be the victim.
For example, assume a superblock contains 4 blocks, and both A and C are stored compressed (belonging to SB_1). Then F, from SB_2, needs to make room by replacing SB_1. If F maps to location 2, the number of replacements should be increased, even though location 2 had no valid block:
 Tag     Data          Tag    Data
|SB_1|--|A|X|C|X| --> |SB_2| |X|F|X|X|
         1 2 3 4              1 2 3 4
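The counting rule above can be sketched as follows. This is only an illustration under simplified assumptions, not the gem5 implementation: `SuperBlock`, `Stats`, `evictsAnyBlock` and `replaceSuperBlock` are hypothetical names, and a superblock is modelled as four valid bits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model: one tag (superblock) guarding four data locations.
struct SuperBlock
{
    std::array<bool, 4> valid{};  // valid[i] is true if location i holds data
};

struct Stats
{
    std::size_t replacements = 0;
};

// A replacement takes place if at least one valid block is evicted,
// no matter which location the incoming block maps to.
inline bool
evictsAnyBlock(const SuperBlock &victim)
{
    for (bool v : victim.valid) {
        if (v)
            return true;
    }
    return false;
}

// Install a new superblock holding a single block at `location`, counting
// the replacement only when something valid was actually evicted.
inline void
replaceSuperBlock(SuperBlock &victim, std::size_t location, Stats &stats)
{
    assert(location < victim.valid.size());
    if (evictsAnyBlock(victim))
        ++stats.replacements;
    victim = SuperBlock{};
    victim.valid[location] = true;
}
```

In the example from the message, A and C occupy locations 1 and 3 (indices 0 and 2 here), so installing F at location 2 (index 1) still counts as one replacement.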
Change-Id: I7b3735d28a35faa8d8fa613a1555bb258da65859 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18208 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13862:9b6d6541244f |
11-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove blk_addr from Queue::trySatisfyFunctional
The blk_addr is pkt->getBlockAddr(), and therefore can be acquired internally, when needed, as long as the pkt is provided.
Change-Id: I2780445d2a0cb9e27257961efc4f438cc19550e5 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17537 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13861:7815aef6668f |
24-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add match functions to QueueEntry
Having the caller decide the matching logic is error-prone, and frequently ends up with the secure bit being forgotten. This change adds matching functions to the QueueEntry to avoid this problem.
As a side effect the signature of findPending has been changed.
Change-Id: I6e494a821c1e6e841ab103ec69632c0e1b269a08 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17530 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13860:8f8df5b68439 |
11-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Add packet matching functions
Add both block and non-block-aligned packet matching functions, so that both address and secure bits are checked when checking whether a packet matches a request.
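A minimal sketch of what such matching helpers might look like is given below. `DemoPacket` is a hypothetical stand-in for the real packet class, and the function names and signatures are assumptions chosen for illustration; the point is only that the secure bit is compared together with the address.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Address of the block containing addr; blkSize must be a power of two.
inline Addr
blockAlign(Addr addr, unsigned blkSize)
{
    assert(blkSize != 0 && (blkSize & (blkSize - 1)) == 0);
    return addr & ~Addr(blkSize - 1);
}

// Hypothetical, stripped-down packet: only the fields needed for matching.
struct DemoPacket
{
    Addr addr;
    bool secure;
};

// Exact match: the address and the secure bit must both agree.
inline bool
matchAddr(const DemoPacket &pkt, Addr addr, bool isSecure)
{
    return pkt.addr == addr && pkt.secure == isSecure;
}

// Block-aligned match: same cache block and same secure bit.
inline bool
matchBlockAddr(const DemoPacket &pkt, Addr addr, bool isSecure,
               unsigned blkSize)
{
    return blockAlign(pkt.addr, blkSize) == blockAlign(addr, blkSize) &&
           pkt.secure == isSecure;
}
```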
Change-Id: Id0069befb925d112e06f250741cb47d9dfa249cc Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17533 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13859:4156ac0c7257 |
30-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move Target to QueueEntry
WriteQueueEntry's Target has 100% functionality overlap with MSHR's, so move it to QueueEntry and make it the base of MSHR::Target.
Change-Id: I48614e78179d708bd91bbe75a752e5a05146e8eb Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17534 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13858:f01183becd57 |
24-Jan-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Assert Entry inherits from QueueEntry in Queue
Queue has several assumptions regarding its template parameter, so make sure they are fulfilled by forcing Entry to be derived from QueueEntry.
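The kind of compile-time check described here can be sketched with `static_assert` and `std::is_base_of`. The classes below are simplified, hypothetical stand-ins (only the names `Queue`, `Entry` and `QueueEntry` come from the message); the real Queue has a different interface.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical, minimal base class.
struct QueueEntry
{
    bool inService = false;
};

// Hypothetical entry type that satisfies the requirement.
struct DemoMSHR : QueueEntry {};

template <class Entry>
class Queue
{
    // Reject, at compile time, any Entry not derived from QueueEntry.
    static_assert(std::is_base_of<QueueEntry, Entry>::value,
                  "Entry must be derived from QueueEntry");

  public:
    explicit Queue(std::size_t numEntries) : entries(numEntries)
    {
        assert(numEntries > 0);
    }

    std::size_t size() const { return entries.size(); }

    // Relies on the inService member that QueueEntry guarantees.
    std::size_t
    numInService() const
    {
        std::size_t n = 0;
        for (const Entry &e : entries)
            n += e.inService ? 1 : 0;
        return n;
    }

  private:
    std::vector<Entry> entries;
};
```

With this in place, `Queue<DemoMSHR>` compiles, while something like `Queue<int>` fails with the message given to `static_assert` instead of an obscure error deep inside the template.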
Change-Id: I0203a62aec00c04ac89e9674d86a44a07f9f13ab Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17529 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13857:9255d7412a58 |
12-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Make DRAMCtrl::decodeAddr const
DRAMCtrl's decodeAddr does not need to modify the packet it receives, nor should it modify the contents of the class, and therefore both the packet and the function are made const.
Change-Id: I577f48d9a43611ba54878a9a793cb7b4fbb326f4 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17540 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13856:c4a7f25aacb4 |
08-Feb-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Allow packet to provide its own addr range
Add a getter to Packet to allow it to provide its own addr range.
Change-Id: I2128ea3b71906502d10d9376b050a62407defd23 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17536 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13855:b7bf081341d0 |
16-Apr-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem: missing override specifier
Change-Id: Ied4817bcda317826303a1bb688b41823b18b489b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18128 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13854:45d90a16247b |
25-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Teach SimpleMem to return a MemBackdoor when appropriate.
If the back door SimpleMem inherits from AbstractMem has a pointer and is hence valid, SimpleMem will return that pointer when asked.
Change-Id: I734daba48e4ae5b4ad8ac9a108e7b12b5e82803f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17669 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13853:7ec6a25d2bc1 |
25-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Maintain a back door into the AbstractMem's backing store.
The backing store pointer is added to the back door when it's set, assuming that the range isn't interleaved. If it is interleaved, then there isn't a way to get a flat pointer to the backing store.
Depending on how the backing store is set up, it may be possible to return a larger backdoor which applies to all interleaved memories at the same time and to avoid problems with interleaving. I'm leaving this as a todo.
Change-Id: I0e531c22835ec10954ab39f761b3d87666b59220 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17668 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13849:858526a875ab |
09-Apr-2019 |
Anis Peysieux <anis.peysieux@inria.fr> |
mem-cache: Fix RRPV for RRIP
This fixes the RRPV values for the RRIP and NRU replacement policies. The long re-reference interval was used instead of the distant re-reference interval and vice versa. The btp value permits choosing between the distant and long insertion ratio. A btp value of 0 forces the policy to always insert at a distant re-reference interval, and a btp value of 100 forces the policy to always insert at a long (intermediate) re-reference interval.
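The effect of btp on insertion can be sketched as below. This is an illustration under stated assumptions, not the gem5 code: the function names are hypothetical, the RRPV is taken to be an unsigned counter of `numBits` bits whose maximum value means "distant" and maximum minus one means "long", and the random roll in [1, 100] is passed in as a parameter so that the sketch stays deterministic.

```cpp
#include <cassert>

// Largest RRPV representable with numBits bits: the distant interval.
inline unsigned
distantRrpv(unsigned numBits)
{
    assert(numBits >= 1 && numBits < 32);
    return (1u << numBits) - 1;
}

// RRPV assigned to a newly inserted block.
// btp:  percentage (0..100) of insertions placed at the long interval.
// roll: a random draw in [1, 100] supplied by the caller.
inline unsigned
insertionRrpv(unsigned numBits, unsigned btp, unsigned roll)
{
    assert(btp <= 100);
    assert(roll >= 1 && roll <= 100);
    unsigned rrpv = distantRrpv(numBits);
    // With probability btp%, insert one step closer to re-reference.
    if (roll <= btp)
        --rrpv;
    return rrpv;
}
```

With a 2-bit RRPV, btp = 0 always yields 3 (distant) and btp = 100 always yields 2 (long), which matches the two extremes described in the message.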
Change-Id: I516098f73942b769dcc31fe0edfe07c3e9c3effd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17851 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13847:c9b92a513019 |
22-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Plumb backdoor requests through the xbar classes.
Change-Id: Ic8f49339ab95c31d2f00edfdf23a46f1271ec3aa Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17593 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> |
13845:60939226a345 |
21-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Add sendAtomicBackdoor/recvAtomicBackdoor port methods.
These both perform atomic accesses like their non-backdoor equivalents, and also request a backdoor corresponding to the access.
The default implementation for recvAtomicBackdoor prints a warning (once per port instance), calls recvAtomic to do the actual access, and leaves the backdoor pointer as nullptr. That way if an object doesn't know how to handle or transfer requests for a back door, it automatically replies in a safe way that ignores the back door request.
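The fallback behaviour described above could look roughly like the following sketch. All class names here (`DemoPacket`, `DemoBackdoor`, `DemoSlavePort`, `DemoMemPort`) are hypothetical stand-ins, the signatures are assumptions, and the warning goes to `std::cerr` rather than through gem5's logging macros.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

using Tick = std::uint64_t;

struct DemoPacket { std::uint64_t addr; };
struct DemoBackdoor {};  // stand-in for the backdoor descriptor

class DemoSlavePort
{
  public:
    virtual ~DemoSlavePort() = default;

    // Regular atomic access; returns the access latency.
    virtual Tick recvAtomic(DemoPacket &pkt) = 0;

    // Default: warn once per port instance, perform the normal access and
    // leave the caller's backdoor pointer untouched (expected to be null).
    virtual Tick
    recvAtomicBackdoor(DemoPacket &pkt, DemoBackdoor *&backdoor)
    {
        assert(backdoor == nullptr);
        if (!warned) {
            std::cerr << "warn: port does not support back doors\n";
            warned = true;
        }
        return recvAtomic(pkt);
    }

    bool hasWarned() const { return warned; }

  private:
    bool warned = false;
};

// A port that knows nothing about back doors inherits the safe default.
class DemoMemPort : public DemoSlavePort
{
  public:
    Tick recvAtomic(DemoPacket &) override { return 10; }
};
```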
Change-Id: Ia9fbbe9996eb4b71ea62214d203aa039a05f1618 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17590 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> |
13844:e409800a51c7 |
12-Feb-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix MSHR handling of cache clean requests
Previously satisfied clean requests would not snoop in-service MSHRs. This is a problem when a clean request is also invalidating, in which case we have to post-invalidate or post-downgrade outstanding requests. This change fixes this bug.
Change-Id: I31e42aa94dd3637b2818e00fbaae68c810145eaf Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17728 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13837:05fc1c522c80 |
21-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Add a MemBackdoor type to track memory backdoors.
These are similar to the structures TLM's DMI mechanism uses. Instead of having an invalidation broadcast which propagates backwards up the port hierarchy, this mechanism tracks a set of callbacks which are triggered when a back door is invalidated to let other holders clean up their bookkeeping.
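A callback-based invalidation scheme of this kind can be sketched as follows. `DemoBackdoor` is a hypothetical class written for illustration; its method names and the choice of `std::function` are assumptions, not a description of the actual MemBackdoor interface.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class DemoBackdoor
{
  public:
    using Callback = std::function<void(const DemoBackdoor &)>;

    // Each holder of the back door registers a clean-up callback.
    void
    addInvalidationCallback(Callback cb)
    {
        assert(static_cast<bool>(cb));
        callbacks.push_back(std::move(cb));
    }

    // Notify every holder once, then forget them: after invalidation
    // nobody should still be using this back door.
    void
    invalidate()
    {
        for (auto &cb : callbacks)
            cb(*this);
        callbacks.clear();
    }

  private:
    std::vector<Callback> callbacks;
};
```

Compared with a broadcast up the port hierarchy, only the objects that actually hold the back door are contacted.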
Change-Id: If24489258dcaee14d7b6e5b996dfb1c2636f26ab Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17589 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
13835:dff303952ba9 |
04-Apr-2019 |
Ryan Gambord <gambordr@oregonstate.edu> |
mem-cache: ambiguous use of abs function
std::abs doesn't accept unsigned long long, generating the error:
error: call to 'abs' is ambiguous
Use instead a compare-and-subtract idiom.
Also, changed the return type of distanceFromTrigger from unsigned int to Addr to prevent overflow problems.
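The compare-and-subtract idiom mentioned above can be written as in this short sketch. `absDiff` is a hypothetical helper name, and `Addr` is assumed to be a 64-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Absolute difference of two unsigned values without calling std::abs:
// always subtract the smaller operand from the larger one, so the
// subtraction can never wrap around.
inline Addr
absDiff(Addr a, Addr b)
{
    const Addr diff = a > b ? a - b : b - a;
    // The distance can never exceed the larger operand.
    assert(diff <= (a > b ? a : b));
    return diff;
}
```

Returning `Addr` instead of a narrower type keeps large distances intact, which is the second point of the change.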
Change-Id: Ia7752c1c7a838f98e8c7ed6ade9f586f31bbcf7d Signed-off-by: Ryan Gambord <gambordr@oregonstate.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17788 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13834:1a7c647cbeac |
04-Apr-2019 |
Jason Lowe-Power <jason@lowepower.com> |
mem: Reverse order of write/read mem queue check
For atomic RMW instructions that go directly to memory, we want to put them on the write queue instead of the read queue. Swap the if/else condition to accomplish this.
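The effect of swapping the condition can be illustrated with the sketch below. `DemoPacket`, `MemQueue` and `selectQueue` are hypothetical names, and an RMW is modelled simply as a packet that is flagged as both a read and a write.

```cpp
#include <cassert>

enum class MemQueue { Read, Write };

// Hypothetical packet: an atomic RMW sets both flags.
struct DemoPacket
{
    bool read;
    bool write;
};

// Test the write flag first, so that a packet that is both a read and a
// write is placed on the write queue.
inline MemQueue
selectQueue(const DemoPacket &pkt)
{
    if (pkt.write) {
        return MemQueue::Write;
    } else {
        assert(pkt.read);
        return MemQueue::Read;
    }
}
```

Had the read flag been tested first, the RMW case would have ended up on the read queue.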
Note: This is ignoring the read latency of the RMW, but these instructions should usually be handled in caches anyway.
Change-Id: I62dbfff3a16ac470f1ebdb489abe878962b20bb6 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17828 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13832:79e439e69d9b |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: AMPM Prefetcher fails when restoring from a checkpoint
The periodic event triggers an assertion due to an incorrect tick value to schedule when restoring from a checkpoint.
Change-Id: I9454dd0c97d5a098f8a409886e63f7a7e990947c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17732 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13829:b623eae407f0 |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Fix PIF prefetcher compilation error with NULL ISA
Referencing BaseCPU is causing a compilation error when using the NULL ISA. This patch changes the reference to a SimObject, which fixes the problem.
Change-Id: I2530486cab65974f5b83e54a733c4b0e98730d26 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17731 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13828:73addeac3dd3 |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: ISB prefetcher was triggering an assertion
An assertion ignored the case when an entry of the SP table had been invalidated.
Change-Id: I5bf04e7a0979300b0f41f680c371f6397d4cbf3f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17734 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13827:2764e4b4de5d |
02-Apr-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Fix panic in Indirect Memory prefetcher
Memory requests with a size non-power-of-two and less than 8 values were causing a panic, but these should be allowed and ignored by the prefetcher.
Change-Id: I86baa60058cc8a7f232d6ba5748d4c24a463c840 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17733 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13825:90e5b4dfeaff |
25-Feb-2019 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: Proactive Instruction Fetch Implementation
Ferdman, M., Kaynak, C., & Falsafi, B. (2011, December). Proactive instruction fetch. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 152-162). ACM.
Change-Id: I38c3ab30a94ab279f03e3d5936ce8ed118310c0e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16968 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13817:716bcdc780f9 |
27-Mar-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra cache header from AMAP
The cache header was being included in the AMAP, although not used, which resulted in slightly longer compilation time.
Change-Id: I3654bc719c6b5f558af116addae159301602a3cf Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17711 Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13809:8a34eb8a6339 |
26-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Deleting this init() method was accidentally dropped during rebase.
Deleting this init() method was part of a change just committed, but was accidentally dropped during a rebase.
Change-Id: I0f22778596ed11e182f3111d9999a0fef727f6cc Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17688 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13808:0a44fbc3a853 |
22-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Clean up the xbars a little.
Get rid of comments which just restate the code, get rid of redundant "virtual" keywords, add "override"s, fix style, and get rid of xbar::init which was empty and hiding the parent class init.
Change-Id: I8ce20abee340baa88084d142f2fb8c633ee54ba9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17592 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13799:15badf7874ee |
19-Mar-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
misc: missing override specifier
Missing specifier of overridden virtual function declared in sim_object.hh
Removed redundant "virtual" keyword
Change-Id: I42aa3349b537c9e62607bce20cf1b3aabdb99bf2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17468 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13786:860c780d9f30 |
07-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the STeMS prefetcher
Reference: Stephen Somogyi, Thomas F. Wenisch, Anastasia Ailamaki, and Babak Falsafi. 2009. Spatio-temporal memory streaming. In Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09). ACM, New York, NY, USA, 69-80.
Change-Id: I58cea1a7faa9391f8aa4469eb4973feabd31097a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16423 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13784:1941dc118243 |
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
arch, cpu, dev, gpu, mem, sim, python: start using getPort.
Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary.
Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13782:9f6654f478e2 |
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Move bind() and unbind() into the Port class.
These are now pure virtual methods which more specialized port subclasses will need to implement. The SlavePort class implements them by ignoring them and then providing parallel functions for the MasterPort to call. The MasterPort's methods do basically what they did before, except now bind() uses dynamic cast to check if its peer is of the appropriate type and also to convert it into that type before connecting to it.
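The arrangement described above can be sketched roughly as follows. The `Demo*` classes, the `slaveBind`/`slaveUnbind` helper names and the use of an exception on a type mismatch are assumptions made for this illustration; only the overall shape (pure virtual `bind`/`unbind`, a slave that ignores them, a master that uses a dynamic cast) is taken from the message.

```cpp
#include <cassert>
#include <stdexcept>

class Port
{
  public:
    virtual ~Port() = default;
    virtual void bind(Port &peer) = 0;
    virtual void unbind() = 0;
};

class DemoSlavePort;

class DemoMasterPort : public Port
{
  public:
    void bind(Port &peer) override;
    void unbind() override;
    bool isConnected() const { return slave != nullptr; }

  private:
    DemoSlavePort *slave = nullptr;
};

class DemoSlavePort : public Port
{
  public:
    // The slave side ignores the generic calls; the master drives the
    // connection through the parallel functions below.
    void bind(Port &) override {}
    void unbind() override {}

    void slaveBind(DemoMasterPort &m) { master = &m; }
    void slaveUnbind() { master = nullptr; }
    bool isConnected() const { return master != nullptr; }

  private:
    DemoMasterPort *master = nullptr;
};

inline void
DemoMasterPort::bind(Port &peer)
{
    assert(slave == nullptr);  // must not already be bound
    // Check the peer's type and convert it in one step.
    auto *s = dynamic_cast<DemoSlavePort *>(&peer);
    if (!s)
        throw std::invalid_argument("peer is not a slave port");
    slave = s;
    slave->slaveBind(*this);
}

inline void
DemoMasterPort::unbind()
{
    if (slave) {
        slave->slaveUnbind();
        slave = nullptr;
    }
}
```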
Change-Id: I0948799bc954acaebf371e6b6612cee1d3023bc4 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17038 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13775:36b71cff789e |
15-Mar-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem-cache: tautological comparison of byteOrder
Error:
build/X86/mem/cache/prefetch/indirect_memory.cc:56:24: error: result of comparison of constant -1 with expression of type 'const ByteOrder' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
    fatal_if(byteOrder == -1, "This prefetcher requires a defined ISA\n");
             ~~~~~~~~~ ^  ~~
build/X86/base/logging.hh:205:14: note: expanded from macro 'fatal_if'
    if ((cond)) { \
         ^~~~
1 error generated.
Fix: cast of constant (-1) used in comparison
Change-Id: I3deb154c2fe5b92c4ddf499176cb185c4ec7cf64 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17388 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13773:fc2f9a60cb2d |
14-Mar-2019 |
Ryan Gambord <gambordr@oregonstate.edu> |
mem: Removed circular include ref
If BasicLink.hh is modified, the style checker forces a reordering of the includes, which results in build errors because it ends up including Topology.hh before including its xxxParams.hh files, which include forward declarations of the BasicLink family of classes, and so Topology.hh throws errors that BasicLink etc. are not declared.
Change-Id: I664a0652e53f0cc61763c2190a980c655b85d397 Signed-off-by: Ryan Gambord <gambordr@oregonstate.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17270 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13772:31b71dadc472 |
07-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Indirect Memory Prefetcher
Reference: Xiangyao Yu, Christopher J. Hughes, Nadathur Satish, and Srinivas Devadas. 2015. IMP: indirect memory prefetcher. In Proceedings of the 48th International Symposium on Microarchitecture (MICRO-48). ACM, New York, NY, USA, 178-190. DOI: https://doi.org/10.1145/2830772.2830807
Change-Id: I52790f69c13ec55b8c1c8b9396ef9a1fb1be9797 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16223 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13771:10d990934f15 |
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Move the Port base class into sim.
The Port class is going to be officially used for more than just memory system connections.
Change-Id: I493e721f99051865c5f0c06946a2303ff723c2af Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17036 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13769:b8f532287e81 |
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
mem: Track the MemObject owner in MasterPort and SlavePort.
These types are much more tied to MemObjects and the gem5 memory protocol than the Port or BaseMasterPort and BaseSlavePort classes.
Change-Id: I36bc8c75b9c74d28ee8b65dbcbf742cd41135742 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17032 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
13765:7936e603ac0d |
13-Mar-2019 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Fix write hit latency calculation order
Patch 6d8694a5fb5cfb905186249581cc6a3fde6cc38a changed the order in which the access latency is calculated for hits. This order is incorrect, since the calculations must use the blk's whenReady value before the access is satisfied.
Change-Id: I30dae5435f54200cc8fdf71fd0dbd2cf9c6f8b17 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17190 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13755:a1d7b56e3a64 |
10-Mar-2019 |
Ryan Gambord <gambordr@oregonstate.edu> |
mem-cache: Removed default arg from get() in prefetch/base.hh
commit b0d1643 caused building against NULL to break due to NULLIsa::GuestByteOrder not being defined.
Removal of default argument in src/mem/cache/prefetch/base.hh fixes this.
Change-Id: I99a4abb4be1418fadec145481164f7caa3334ca0 Signed-off-by: Ryan Gambord Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17070 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13752:135bb759ee9c |
08-Mar-2019 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Revert "mem-cache: Remove Packet dependency in Tags"
Reverting patch due to polymorphism limitations.
This reverts commit 86a54d91936b524c0ef0f282959f0fc29bafe7eb.
Change-Id: Ie032dcc5176448c62118c89732b3cc6b8efd5a13 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17049 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13751:614d6e02a5fb |
21-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added extra information to PrefetchInfo
Added additional information to the PrefetchInfo data structure:
- Whether the event is triggered by a cache miss
- Whether the event is a write or a read
- Size of the data accessed
- Data accessed by the request
Change-Id: I070f3ffe837ea960a357388e7f2b8a61d7b2196c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16583 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13750:11dd302dfaa4 |
05-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add header delay to handleFill whenReady
A prefetch response will have a header delay, which was not being taken into account.
Change-Id: I66a071bc81ef41b8c0de37aa2df75171d1979a6f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14895 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13749:b2486662285d |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Allow tag-only accesses on latency calculation
Some accesses only need to search for a tag in the tag array, with no need to touch the data array. This is the case for CleanEvicts, evicts that don't find a corresponding block entry (since a write cannot be done in parallel with tag lookup), and maintenance operations.
Change-Id: I7365a915500b5d7ab636d49a9acc627072a7f58e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14878 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13748:de3b813c4b90 |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add lookup latency to access' whenReady
When dealing with writebacks, as soon as the packet metadata arrives there will be a tag lookup, done sequentially because a write can't be done in parallel. While the tag lookup is being done, the payload will arrive. Once the payload is present and the tag lookup has determined the correct block entry, the fill happens.
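The timing relationship described here can be sketched as a small calculation. This is a simplified model rather than the actual gem5 code: `writebackFillReady` and its parameters are hypothetical, and the payload delay is assumed to be counted from the moment the header has arrived, so it overlaps with the tag lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Earliest tick at which a writeback can fill its block.
// The tag lookup starts once the header (metadata) has arrived and runs
// while the payload is still in flight; the fill needs both to be done.
inline Tick
writebackFillReady(Tick now, Tick headerDelay, Tick payloadDelay,
                   Tick lookupLatency)
{
    const Tick ready =
        now + headerDelay + std::max(lookupLatency, payloadDelay);
    assert(ready >= now);
    return ready;
}
```

For instance, with a header delay of 5, a lookup latency of 8 and a payload delay of 20, the payload is the limiting factor and the fill is ready 25 ticks after `now`; with a payload delay of 2 the lookup dominates and the fill is ready after 13 ticks.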
Change-Id: If1a0085d742458b675bfc012b6d908d9d9a25e32 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14877 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13747:5c90d834a58c |
29-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix recvTimingReq doWritebacks tick
Before being sent to the writebuffer, the evicted blocks must be selected for replacement, and therefore the access latency must be applied. The forward latency is then applied on top of that delay.
Change-Id: I16a25a8bf6051f63eb7a02fe66acb6af26d434fc Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14736 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13746:723109f11d56 |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use header delay on latency calculation
Previously the bus delay was being ignored for the access latency calculation, and then applied on top of the access latency. This patch fixes the order, as first the packet must arrive before the access starts.
Change-Id: I6d55299a911d54625c147814dd423bfc63ef1b65 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14876 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13745:1cf82fb6c4ab |
04-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove old todo about latency in hit function
The header and payload delays have already been accounted for and zeroed prior to calling this function. The probe is not allowed to modify the packet, therefore no extra delays are added, and it is safe to remove the todo note.
Change-Id: I8ddf7e189fbe609cdec34364f3c013427930daf7 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14875 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13736:e678df1f0bf2 |
22-Feb-2019 |
Srikant Bharadwaj <srikant.bharadwaj@amd.com> |
ruby: Fix garnet's round robin arbitration for vc selection
Garnet uses a round robin policy to select a VC for transmission at Network Interfaces and Routers. The current logic for round robin is only fair if all the virtual networks are active at a given router. If the router or network interface is not receiving traffic from some vnet, then the priority is always taken up by the numerically next vnet (or loops back to 0).
This fix changes the way we perform round robin arbitration. When a VC is selected in a cycle, the round robin pointer is set to the VC next to it and is iterated from there on. If any VC does not have a flit in a given cycle, it will lose its turn until the next round. At maximum traffic this will model round robin correctly even if a certain VNET is not active at that unit.
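The arbitration scheme described above can be sketched as follows. This is a simplified model and not Garnet's implementation: the function `select_vc`, the `has_flit` list, and the `pointer` argument are hypothetical names chosen for the illustration.

```python
def select_vc(has_flit, pointer):
    """Return (selected_vc, next_pointer); selected_vc is None if no VC is ready."""
    num_vcs = len(has_flit)
    for offset in range(num_vcs):
        vc = (pointer + offset) % num_vcs
        if has_flit[vc]:
            # The pointer moves to the VC right after the winner, so a VC
            # without a flit simply loses its turn until the next round.
            return vc, (vc + 1) % num_vcs
    # Nothing to send this cycle: leave the pointer where it was.
    return None, pointer
```

Calling `select_vc` once per cycle and feeding the returned pointer back in means an idle VC never hands its priority to one fixed neighbour; the search always resumes just after the last winner.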
Change-Id: I9bf805221054f9f25bee14b57ff521f4ce4ca980 Reviewed-on: https://gem5-review.googlesource.com/c/16688 Reviewed-by: Jieming Yin <Jieming.Yin@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13735:52ab3bab4f28 |
13-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: Sandbox Based Optimal Offset Implementation
Brown, N. T., & Sendag, R. Sandbox Based Optimal Offset Estimation.
Change-Id: Ieb693b6b2c3d8bdfb6948389ca10e92c85454862 Reviewed-on: https://gem5-review.googlesource.com/c/15095 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13732:43e7199f511f |
22-Jan-2019 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Copy over flags to forwarded response
A cache that forwards a request to the memory below does not fill, and forwards the response with the data to the cache above. This change ensures that the flags of the original response are also preserved.
Change-Id: I244b20b073c31b976358816c5b14bba413b8271f Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16182 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13721:a80fcb3e1322 |
25-Feb-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem-cache: added missing override specifier in BoP
Added missing specifier for various virtual functions.
Change-Id: I41aebb3b76bce6dd3bee21ac0e2b0e52cb90fc80 Reviewed-on: https://gem5-review.googlesource.com/c/16728 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13720:18f5d3990ac9 |
26-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Stop using basestring to test for strings
The base class basestring doesn't exist in Python 3. Use string_types from six instead.
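A minimal sketch of the idea, written without the third-party six package so that it runs on its own. The tuple below mimics what `six.string_types` provides; the helper `is_string` is a hypothetical name used only for this illustration.

```python
try:
    string_types = (basestring,)  # Python 2: common base of str and unicode
except NameError:
    string_types = (str,)  # Python 3: basestring no longer exists


def is_string(value):
    """Version-independent replacement for isinstance(value, basestring)."""
    return isinstance(value, string_types)
```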
Change-Id: I7e84903fb7dd4a0af7ae4e9f4ec2e54338f212bb Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15998 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Juha Jäykkä <juha.jaykka@arm.com> |
13717:11e81e2a98bd |
03-Dec-2018 |
Ivan Pizarro <ivan.pizarro@metempsy.com> |
mem-cache: A Best-Offset Prefetcher
Michaud, P. (2015, June). A best-offset prefetcher. In 2nd Data Prefetching Championship.
Change-Id: I61bb89ca5639356d54aeb04e856d5bf6e8805c22 Reviewed-on: https://gem5-review.googlesource.com/c/14820 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13709:dd6b7ac5801f |
26-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Make iterator handling Python 3 compatible
Many functions that used to return lists (e.g., dict.items()) now return iterators and their iterator counterparts (e.g., dict.iteritems()) have been removed. Switch calls to the Python 2.7 iterator methods to use the Python 3 equivalent and add explicit list conversions where necessary.
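The two cases mentioned above (plain iteration versus places that need a real list) can be illustrated as follows. The dictionary and its keys are made up for the example.

```python
stats = {"hits": 10, "misses": 0, "evictions": 3}

# dict.items() returns a view in Python 3; iterating it directly is fine
# as long as the dictionary is not resized inside the loop.
total = sum(value for _, value in stats.items())

# Deleting keys while iterating needs a snapshot, hence the explicit list().
for name, value in list(stats.items()):
    if value == 0:
        del stats[name]

# Indexing also needs a real list, since views do not support it.
first_key = list(stats.keys())[0]
```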
Change-Id: I0c18114955af8f4932d81fb689a0adb939dafaba Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15992 Reviewed-by: Juha Jäykkä <juha.jaykka@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13707:5aab50651a66 |
21-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Add a mechanism to iterate all entries of an AssociativeSet
Added functions to obtain an iterator to access all entries of an AssociativeSet container.
Change-Id: I1ec555bd97d97e3edaced2b8f61287e922279c26 Reviewed-on: https://gem5-review.googlesource.com/c/16582 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13700:56fa28e6fab4 |
31-Jan-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Slim AMPM Prefetcher
Reference: Towards Bandwidth-Efficient Prefetching with Slim AMPM. Young, V., & Krishna, A. (2015). The 2nd Data Prefetching Championship.
Slim AMPM is composed of two prefetchers, the DPCT and the AMPM (both already in gem5).
Change-Id: I6e868faf216e3e75231cf181d59884ed6f0d382a Reviewed-on: https://gem5-review.googlesource.com/c/16383 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13675:afeab32b3655 |
24-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Replace dict.has_key with 'key in dict'
Python 3 has removed dict.has_key in favour of 'key in dict'.
Change-Id: I9852a5f57d672bea815308eb647a0ce45624fad5 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15987 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> |
13672:2969e4d5abf4 |
12-Feb-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Replace orderdict with collections.OrderedDict
Python 2.7 and newer has support for ordered dictionaries in the standard library. Remove this custom class.
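For reference, a short example of the standard library class. The keys and values are placeholders and are not taken from the gem5 sources.

```python
from collections import OrderedDict

params = OrderedDict()
params["size"] = "32kB"
params["assoc"] = 4
params["tag_latency"] = 2

# Iteration follows insertion order.
names = list(params.keys())

# popitem(last=False) removes and returns the oldest entry.
oldest = params.popitem(last=False)
```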
Change-Id: I4b720405aa3c4ce8d5c0b401eefe744a85ac3a3e Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16362 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13669:24ef552b4d6d |
05-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Irregular Stream Buffer Prefetcher
Based on the description in the following publication: Akanksha Jain and Calvin Lin. 2013. Linearizing irregular memory accesses for improved correlated prefetching. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46). ACM, New York, NY, USA, 247-259.
Change-Id: Ibeb6abc93ca40ad634df6ed5cf8becb0a49d1165 Reviewed-on: https://gem5-review.googlesource.com/c/15215 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13667:e3ae3619b9ab |
05-Feb-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Added the Delta Correlating Prediction Tables Prefetcher
Reference: Multi-level hardware prefetching using low complexity delta correlating prediction tables with partial matching. Marius Grannaes, Magnus Jahre, and Lasse Natvig. 2010. In Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers (HiPEAC'10)
Change-Id: I7b5d7ede9284862a427cfd5693a47652a69ed49d Reviewed-on: https://gem5-review.googlesource.com/c/16062 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13665:9c7fe3811b88 |
25-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Don't assume SimObjects live in the global namespace
The importer in Python 3 doesn't like the way we import SimObjects from the global namespace. Convert the existing SimObject declarations to import from m5.objects. As a side-effect, this makes these files consistent with configuration files.
Change-Id: I11153502b430822130722839e1fa767b82a027aa Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15981 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> |
13661:c6e84ef6a309 |
19-Jan-2019 |
Pouya Fotouhi <pfotouhi@ucdavis.edu> |
mem-ruby: Fixing Topology
The constructor assumes that the number of nodes (i.e. controllers) is equal to the number of external nodes. This is not necessarily valid in all cases (e.g. MESI_Three_Level, where L0s are directly connected to L1s). MachineType_base_number(MachineType_NUM) provides the total number of controllers.
Signed-off-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Change-Id: Id906099dc967ec70aa34dedb0b55351031ff242c Reviewed-on: https://gem5-review.googlesource.com/c/15716 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13660:7c2b97d962f0 |
19-Jan-2019 |
Pouya Fotouhi <pfotouhi@ucdavis.edu> |
mem-ruby: Fixing MESI Three Level
Adding back some changes done in patch 676ae57827. Transient state IS_I, STALE_DATA, Data_Stale event are necessary.
Issue: (cacheline A, initial state for P0 and P1 is I)
| P0       | P1       |
| GETX (A) |          |
|          | GETS (A) |
| Inv_All  |          |
P1 never sends the ACK, which causes a deadlock. It should ACK, later treat the data as stale when it arrives, and go to I.
Solution: P1(A):
GETS: I->IS
Inv_All: IS->IS_I, Send ACK
Data: IS_I->I, STALE_DATA to L0
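The transitions listed in the solution can be modelled as a small table-driven state machine. This is only a sketch of the three transitions named in the message, not the SLICC protocol definition; the `TRANSITIONS` table and the `run` helper are hypothetical names.

```python
# (current state, event) -> (next state, action)
TRANSITIONS = {
    # The message does not state an action for GETS, so None is used here.
    ("I", "GETS"): ("IS", None),
    ("IS", "Inv_All"): ("IS_I", "send ACK"),
    ("IS_I", "Data"): ("I", "send STALE_DATA to L0"),
}


def run(state, events):
    """Apply events in order; return the final state and the actions taken."""
    actions = []
    for event in events:
        state, action = TRANSITIONS[(state, event)]
        actions.append(action)
    return state, actions
```

Feeding the sequence GETS, Inv_All, Data into `run` starting from I ends in I again, with the ACK sent before the stale data is delivered, which is the behaviour that avoids the deadlock.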
Signed-off-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Change-Id: I1e7b2c05439d08579c68d8eb444e0f332e75e07f Reviewed-on: https://gem5-review.googlesource.com/c/15715 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13624:3d8220c2d41d |
13-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Updated version of the Signature Path Prefetcher
This implementation is based on the description available in: Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, and Zeshan Chishti. 2016. Path confidence based lookahead prefetching. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-49). IEEE Press, Piscataway, NJ, USA, Article 60, 12 pages.
Change-Id: I4b8b54efef48ced7044bd535de9a69bca68d47d9 Reviewed-on: https://gem5-review.googlesource.com/c/14819 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13584:6b33c3d70473 |
13-Jan-2019 |
Zicong Wang <wangzicong@nudt.edu.cn> |
mem-ruby: Fix missing TBE allocation and deallocation
The TBE allocation and deallocation are currently missing during the directory state transition from I to M in protocol MI_example.
Change-Id: If7569c02faf56ea84c34ee1345f1a33d318cdfff Signed-off-by: Zicong Wang <wangzicong@nudt.edu.cn> Reviewed-on: https://gem5-review.googlesource.com/c/15535 Reviewed-by: Pouya Fotouhi <pfotouhi@ucdavis.edu> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13573:3223a8c1c3dd |
14-Nov-2018 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Add tryTiming support to CommMonitor
The CommMonitor did not support tryTiming, which resulted in gem5 panicking if the CommMonitor was used.
With this change, we update the CommMonitor to pass through the tryTiming() calls.
Change-Id: I86810170e5e10a0c5d63af76fc4a6ab70710d2fb Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15736 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13565:fe1169a7502d |
04-Dec-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Allow inserts in the beginning of a packet queue
A packet queue keeps track of packets that are scheduled to be sent at a specified time. Packets are sorted such that the packet with the earliest scheduled time is at the front of the list (unless there are other ordering requirements). Previously, the implemented algorithm didn't allow packets to be placed at the front of the queue, resulting in unnecessary delays. This change fixes the implementation of schedSendTiming.
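A minimal sketch of a time-sorted insertion that permits the head position. It deliberately ignores the "other ordering requirements" mentioned above and is not the schedSendTiming code; the function `sched_send` and the `(when, packet)` tuple layout are assumptions made for the illustration.

```python
def sched_send(queue, packet, when):
    """Insert (when, packet) so that the queue stays sorted by send time."""
    index = len(queue)
    # Walk backwards past every entry scheduled later than the new packet.
    while index > 0 and queue[index - 1][0] > when:
        index -= 1
    # index may reach 0, so the new packet can become the head of the queue.
    queue.insert(index, (when, packet))
    return index
```

Packets scheduled for the same time keep their arrival order, because the walk stops at the first entry that is not strictly later.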
Change-Id: Ic74abec7c3f4c12dbf67b5ab26a8d4232e18e19e Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15556 Reviewed-by: Bradley Wang <radwang@ucdavis.edu> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13564:9bbd53a77887 |
27-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Determine if a packet queue forces ordering at construction
A packet queue is typically used to hold on to packets that are scheduled to be sent in the future or when they need to queue behind younger packets that have not been sent out yet. Due to memory order requirements, some MemObjects need to maintain the order for packets (mostly responses) that reference the same cache block.
Prior to this patch the ordering requirements were determined when the packet was scheduled to be sent. This patch moves the parameter to the constructor.
Change-Id: Ieb4d94e86bc7514f5036b313ec23ea47dd653164 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15555 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13554:f16adb9b35cc |
12-Dec-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Access Map Pattern Matching Prefetcher
Implementation of the Access Map Pattern Matching prefetcher, based on the description in the following paper: Access map pattern matching for high performance data cache prefetch. Ishii, Y., Inaba, M., & Hiraki, K. (2011). Journal of Instruction-Level Parallelism, 13, 1-24.
Change-Id: I0d4b7f7afc2ab4938bdd8755bfed26e26a28530c Reviewed-on: https://gem5-review.googlesource.com/c/15096 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13553:047def1fa787 |
29-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: Signature Path Prefetcher
Related paper: Lookahead Prefetching with Signature Path J Kim, PV Gratz, ALN Reddy The 2nd Data Prefetching Championship (DPC2), 2015
Change-Id: I2319be2fa409f955f65e1bf1e1bb2d6d9a4fea11 Reviewed-on: https://gem5-review.googlesource.com/c/14737 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13552:86c9a15aa4ef |
29-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: allow prefetchers to emit page crossing references
QueuedPrefetcher takes responsibility for checking for page crossing references.
Change-Id: I0ae6bf8be465118990d9ea1cac0da8f70e69aeb1 Reviewed-on: https://gem5-review.googlesource.com/c/14735 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13551:f352df8e2863 |
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: virtual address support for prefetchers
Prefetchers can be configured to operate with virtual or physical addresses. The option can be configured through the "use_virtual_addresses" parameter of the Prefetcher object.
Change-Id: I4f8c3687988afecc8a91c3c5b2d44cc0580f72aa Reviewed-on: https://gem5-review.googlesource.com/c/14416 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13486:d69584f27c78 |
07-Dec-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
mem: Compile tracePacket only when TRACING_ON is defined
If TRACING_ON is not defined (e.g. when building gem5.fast), clang compilations will fail reporting an unused function.
Change-Id: I959dba6e9fcf74b951e16365077939ae4d4ef924 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/14975 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13485:12e16073f6a7 |
07-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Workaround for setWhenReady assertion
Change 174da8e2da6a896d2e97bc264f9c827a0f4c35ac added an assert that is not satisfiable with current implementation, breaking some regression tests.
Change-Id: Ibafaf0c51906384364f0b2a4b931f8ec6126d858 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14955 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13478:59414c401cd9 |
05-Dec-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove writebacks parameter from serviceMSHRTargets
Change 8ba77ae8fc98a355082da2bd9fdc6ecf4928f725 introduced the writebacks parameter, but it was never used.
Change-Id: I225e5b399de42d77c72fc0012d3dc93ef39b8853 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14896 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13477:044307c0d0b8 |
28-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add getter and setter to CacheBlk::whenReady
Add a getter and a setter function to access CacheBlk::whenReady to encapsulate the variable and allow error checking. This error checking consists of verifying that writes to a block after it has been inserted follow a chronological order.
As a side effect, tickInserted retains its value until updated, that is, it is not reset in invalidate().
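The encapsulation described above can be sketched with a property whose setter rejects ticks earlier than the insertion tick. This is an illustration only: the class `Block`, its attribute names, and the use of an exception instead of an assertion are assumptions, not the CacheBlk interface.

```python
class Block:
    """Toy block that guards its ready time behind a getter and a setter."""

    def __init__(self):
        self.tick_inserted = 0
        self._when_ready = 0

    def insert(self, tick):
        # Record the insertion time; the block is ready no earlier than that.
        self.tick_inserted = tick
        self._when_ready = tick

    @property
    def when_ready(self):
        return self._when_ready

    @when_ready.setter
    def when_ready(self, tick):
        # Error check: the ready time may not precede the insertion.
        if tick < self.tick_inserted:
            raise ValueError("ready tick precedes insertion tick")
        self._when_ready = tick
```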
Change-Id: Idc3c5a99c3f002ee9acc2424f00e554877fd3a69 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14715 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13475:5189e2334f1a |
28-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
base, sim: Add missing destructors
Derived classes with virtual functions need to define a virtual destructor or a protected destructor, otherwise calling the base class destructor has undefined behavior. This change adds a virtual destructor to the base class.
Change-Id: I1c855aa56dff6585ff99b9147bdb4eb9729a0a53 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/14815 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13473:ba87e4c95508 |
25-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Optimize sector valid and secure check
Previously a loop was being done to check whether the block was valid/secure or not. Variables have been added to skip this loop and save and update sector block state when sub-blocks are validated, invalidated and secured.
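The optimization can be illustrated with a counter that is kept up to date as sub-blocks change state, so the validity check no longer scans them. This sketch covers only the valid state (the secure state would be tracked analogously) and assumes that a sector counts as valid when at least one sub-block is valid; the class `Sector` and its method names are hypothetical.

```python
class Sector:
    """Toy sector that tracks how many of its sub-blocks are valid."""

    def __init__(self, num_blocks):
        self.valid = [False] * num_blocks
        self.num_valid = 0

    def validate_sub_blk(self, index):
        if not self.valid[index]:
            self.valid[index] = True
            self.num_valid += 1

    def invalidate_sub_blk(self, index):
        if self.valid[index]:
            self.valid[index] = False
            self.num_valid -= 1

    def is_valid(self):
        # Constant time: no loop over the sub-blocks.
        return self.num_valid > 0
```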
Change-Id: Ie1734f7dfda9698c7bf22a1fcbfc47ffb9239cea Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14363 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13449:2f7efa89c58b |
26-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, gpu, mem: Replace assert(0 or false) with panic.
Neither assert(0) nor assert(false) give any hint as to why control getting to them is bad, and their more descriptive versions, assert(0 && "description") and assert(false && "description"), jury rig assert to add an error message when the utility function panic() already does that directly with better formatting options.
This change replaces that flavor of call to assert with panic, except in the actual code which processes the formatting that panic uses (to avoid infinitely recursing error handling), and in some *.sm files since I don't know what rules those have to follow and don't want to accidentally break them.
Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0 Reviewed-on: https://gem5-review.googlesource.com/c/14636 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13445:070fc4d948c0 |
25-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add setters to validate and secure block
In order to allow polymorphism of the block these two functions have been added, and all direct status assignments to these bits have been substituted.
We also assert that the block has been invalidated before insertion. Then the block is validated in the insertion.
Change-Id: Ie7be42408721ad4c2c9dc880f82a62cb594f8668 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14362 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13434:99807b35a66c |
17-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: a missing cast was truncating addresses
High bits were truncated when computing the block address
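The commit message does not show the offending expression. The following is one plausible illustration of how this class of bug arises, namely a block mask computed in 32 bits and then applied to a 64-bit address. Python integers do not overflow, so the narrower width is emulated with explicit masks; the function names and the 64-byte block size are assumptions.

```python
BLK_SIZE = 64
MASK_32 = 0xFFFFFFFF
MASK_64 = 0xFFFFFFFFFFFFFFFF


def block_addr_truncated(addr):
    # The block mask is formed in 32 bits, so its upper 32 bits are zero
    # and the high bits of the address are lost.
    blk_mask = ~(BLK_SIZE - 1) & MASK_32
    return addr & blk_mask


def block_addr(addr):
    # Widening the mask to 64 bits first keeps the high address bits.
    blk_mask = ~(BLK_SIZE - 1) & MASK_64
    return addr & blk_mask
```

For an address above 4 GiB such as 0x123456789, the first function drops the leading 0x1 while the second preserves it.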
Change-Id: Iab2a4c6063ece2d1d4c24ce5686045a6d6d35434 Reviewed-on: https://gem5-review.googlesource.com/c/14415 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13430:4b0a26035e4b |
07-Sep-2016 |
Matteo Andreozzi <Matteo.Andreozzi@arm.com> |
mem: avoid calling regStat twice on a QoSPolicy
Change-Id: I216c57073fabe29c3f898a5d89cee41efd4277d5 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13696 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13428:ceddb3964aea |
15-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: fix invalid iterator access
An iterator was assigned end() and then it was used to access its corresponding element.
Change-Id: I87246cf56cbc694dd6b4e2cabbe84a08429d2ac3 Reviewed-on: https://gem5-review.googlesource.com/c/14361 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13427:72a3afac3e78 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Make StridePrefetcher use Replacement Policies
Previously StridePrefetcher was only able to use random replacement policy. This change allows all replacement policies to be applied to the pc table.
Change-Id: I8714e71a6a4c9c31fbca49a07a456dcacd3e402c Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14360 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13426:d2b0e9ec67f1 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Add invalidation function to StrideEntry
Add invalidation function to StrideEntry so that every entry can be invalidated appropriately.
Change-Id: I38c42b7d7c93d839f797d116f1d2c88572123c0e Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14359 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13425:00abf35b2f7e |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Make PCTable context independent
Move the unordered_map outside of the PCTable, as it belongs to the StridePrefetcher. By doing so we are moving towards a table that resembles the ones of the Tags classes.
Some functions have been moved from the prefetcher to the PCTable, as they didn't belong there. As such, they have been renamed to remove the unnecessary prefix.
Change-Id: I3e54bc7dee65e1f78d96b0d548ac8345b7bd4364 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14358 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13424:1744211c9a65 |
13-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Vectorize StridePrefetcher's entries.
Turn StridePrefetcher::PCTable::entries into a vector of vectors.
Change-Id: I2a4589a76eb205910c43723638b7989eddd5ca24 Reviewed-on: https://gem5-review.googlesource.com/c/14357 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13423:a414d6fccc4e |
13-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Return entry in StridePrefetcher::pcTableHit()
Return a pointer to the entry instead of returning a boolean and passing a pointer reference. As a side effect, change the name of the function to be more descriptive of the functionality.
Change-Id: Iad44979e98031754c1d0857b1790c0eaf77e9765 Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14356 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13422:4ec52da74cd5 |
11-Nov-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Cleanup prefetchers
Prefetcher code had extra variables, dependencies that could be removed, code duplication, and missing overrides.
Change-Id: I6e9fbf67a0bdab7eb591893039e088261f52d31a Signed-off-by: Daniel <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14355 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13419:aaadcfae091a |
13-Nov-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove Cache dependency from Tags
Tags do not need to be aware of caches.
Change-Id: Ib6a082b74dcd9b2f10852651634b59512732fb2a Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/14296 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13418:08101e89101e |
18-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move access latency calculation to Cache
Access latency was not being calculated properly, as it was always assuming that for hits reads take as long as writes, and that parallel accesses would produce the same latency for read and write misses.
By moving the calculation to the Cache we can use the read/write information, reduce the duplication of latency variables, and remove the Cache dependency from Tags.
The tag lookup latency is still calculated by the Tags.
Change-Id: I71bc68fb5c3515b372c3bf002d61b6f048a45540 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13697 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13416:d90887d0c889 |
09-Nov-2018 |
Javier Bueno <javier.bueno@metempsy.com> |
mem-cache: implement a probe-based interface
The HW Prefetcher of a cache can now listen to events from its associated CPUs and from its own cache.
Change-Id: I28aecd8faf8ed44be94464d84485bd1cea2efae3 Reviewed-on: https://gem5-review.googlesource.com/c/14155 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13412:bc5b08f44e6d |
06-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Align how we handle requests in atomic with timing
Requests for which a cache has already committed to respond do not perform any lookups. Previously, in atomic mode the packet would pay the lookup latency while in timing mode it wouldn't. This patch aligns recvAtomic with recvTimingReq and removes the lookup latency from the handling of such requests.
Change-Id: I50a0631f8058e5086d94d55af0e1788a60e2883f Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/14175 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13399:98f54e365584 |
14-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-ruby: Use Packet writing functions instead of memcpy
Classes were using memcpy instead of the Packet functions created for writing to/from the packet. This allows these writes to be better checked and tracked.
Change-Id: Iae3fba1351330916ee1d4103809c71e151b1639e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13915 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13378:038ea95fd793 |
02-Nov-2018 |
Gabe Black <gabeblack@google.com> |
mem-cache: Rename the tag class init function to tagsInit.
Since the tag classes are subclasses of SimObject, they inherit an init function which does generic initialization at simulation startup and which doesn't take any parameters. A new function was added which does take a parameter, and which is just for doing tag specific initialization as triggered by the base cache. These two names clashed, and clang complained that the tag local name was hiding the SimObject name (which it was).
Change-Id: I399775aceaf8f1a8e2646d434facef22e6d3e7d0 Reviewed-on: https://gem5-review.googlesource.com/c/13875 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Gabe Black <gabeblack@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13377:2e04ce7d3fd4 |
15-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Use Packet writing functions instead of memcpy
Classes were using memcpy instead of the Packet functions created for writing to/from the packet. This allows these writes to be better checked and tracked.
This also fixes a bug in MemCheckerMonitor, which was using the incorrect type for the packet pointer.
Change-Id: I5bbc8a24e59464e8219bb6d54af8209e6d4ee1af Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13695 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13376:2165f3f012ed |
26-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix double block invalidation
Block was being invalidated twice when not a tempBlock. Make explicit that the else case is only to be applied when handling the tempBlock, as otherwise the Tags should be taking care of the invalidation.
Change-Id: Ie7603fdbe156c54e94bbdc83541b55e66f8d250f Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13895 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13368:c6c62c2cb733 |
04-Oct-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-ruby: Fix MOESI_CMP_directory in ports order
To avoid deadlocks, Ruby objects typically prioritize the handling of responses over all other events. The order in which in_port statements are written determines the order in which they are handled. This patch fixes the order of in_port statements for the L2 cache in MOESI_CMP_directory.
Change-Id: I62248b0480a88ac2cd945425155f0961a1cf6cb1 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13595 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13367:dc06baae4275 |
19-Oct-2018 |
yuetsu.kodama <yuetsu.kodama@riken.jp> |
arch-arm: We add PRFM PST instruction for arm
Note that PRFM currently supports only PLD, but PST (prefetch for store) is also important for latency hiding. This also fixes a bug in the disassembler so that prfop is displayed correctly.
Change-Id: I9144e7233900aa2d555e1c1a6a2c2e41d837aa13 Signed-off-by: Yuetsu Kodama <yuetsu.kodama@riken.jp> Reviewed-on: https://gem5-review.googlesource.com/c/13675 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13358:5e1605b47a21 |
19-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move evictBlock(CacheBlk*, PacketList&) to base
Move evictBlock(CacheBlk*, PacketList&) to the base cache, as the implementations in both sub-classes are identical.
Change-Id: I80fbd16813bfcc4938fb01ed76abe29b3f8b3018 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13656 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13353:63f4073c1fc7 |
18-Oct-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix unused variable warning in FALRU::invalidate()
Change-Id: I3b902045433ca56b3e62c251158e784b5fa9e4d7 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13600 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
13352:75647326f19b |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add write coalescing and write-no-allocate to the caches
Enable the cache to detect contiguous writes and hold on to the MSHR long enough to allow the entire line to be written. If the whole line is written, the MSHR will be sent out as an invalidation request, as it is part of a whole-line write, i.e. no-fetch-on-write.
The cache is also able to switch to a write-no-allocate policy on the actual completion of the writes, and instead use the tempBlock and turn the write operation into a writeback.
These policies are all well-known, and described in works such as Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993.
Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae Reviewed-on: https://gem5-review.googlesource.com/c/12907 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13351:1d456a63bfbc |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Delay servicing an MSHR after its allocation
An MSHR is allocated and the computed latency determines when the MSHR will be ready and can be serviced by the cache. This patch adds a function that allows changing the time that an MSHR is ready and adjusts the queue such that other MSHRs can be serviced first if they are ready.
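The idea of pushing back an entry's ready time and re-sorting the queue can be sketched as follows. The types and names here (`Entry`, `ReadyQueue`, `delay`) are hypothetical and chosen for illustration; gem5's MSHR queue is more involved, so treat this as a sketch of the ordering behaviour only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>

using Tick = std::uint64_t;

// Hypothetical queue entry: an id plus the tick at which it is ready.
struct Entry
{
    int id;
    Tick readyTime;
};

// Queue kept sorted by ascending readyTime.
struct ReadyQueue
{
    std::list<Entry> entries;

    void insert(Entry e)
    {
        // Insert before the first entry that becomes ready later.
        auto pos = std::find_if(entries.begin(), entries.end(),
            [&](const Entry &o) { return o.readyTime > e.readyTime; });
        entries.insert(pos, e);
    }

    // Push back the ready time of entry `id` and re-sort it, so that
    // entries that are ready earlier can be serviced first.
    void delay(int id, Tick delta)
    {
        auto it = std::find_if(entries.begin(), entries.end(),
            [&](const Entry &o) { return o.id == id; });
        assert(it != entries.end());
        Entry e = *it;
        entries.erase(it);
        e.readyTime += delta;
        insert(e);
    }

    int nextReady() const { return entries.front().id; }
};
```

After delaying the head entry past its neighbour, `nextReady()` returns the neighbour instead.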
Change-Id: Ie908191fcb3c2d84d4c6f855c8b1e41ca5881bff Reviewed-on: https://gem5-review.googlesource.com/c/12906 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13350:247e4108a5e8 |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Restructure whole-line writes to simplify write merging
This patch changes how we deal with whole-line writes and their responses. With these changes, we use the MSHR tracking to determine if a whole line is written, and on a fill we simply handle the invalidation response, with the actual writes taking place as part of satisfying the CPU-side hit.
Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b Reviewed-on: https://gem5-review.googlesource.com/c/12905 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13349:20890038e8a0 |
10-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Determine if an MSHR does a whole-line write
This patch adds support for determining whether the targets in an MSHR are 1) only writes and 2) whether these writes are effectively a whole-line write. This patch adds the necessary functions in the MSHR to allow for write coalescing in the cache.
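The two conditions above (only writes, and together covering the whole line) can be sketched as a small check. The `Target` struct and `isWholeLineWrite` function are hypothetical names, and the per-byte bookkeeping is an assumption made for clarity rather than a description of how the MSHR actually tracks writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical target: an access to part of a cache line.
struct Target
{
    bool isWrite;
    std::size_t offset; // byte offset within the line
    std::size_t size;   // number of bytes accessed
};

// True if every target is a write and together the writes cover
// every byte of the line.
bool isWholeLineWrite(const std::vector<Target> &targets,
                      std::size_t line_size)
{
    std::vector<bool> written(line_size, false);
    for (const Target &t : targets) {
        if (!t.isWrite)
            return false;
        assert(t.offset + t.size <= line_size);
        for (std::size_t i = 0; i < t.size; ++i)
            written[t.offset + i] = true;
    }
    for (bool byte_written : written) {
        if (!byte_written)
            return false;
    }
    return true;
}
```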
Change-Id: I2c9a9a83d2d9b506a491ba5b0b9ac1054bdb31b4 Reviewed-on: https://gem5-review.googlesource.com/c/12904 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13348:e3d801b37735 |
17-Oct-2018 |
Gabe Black <gabeblack@google.com> |
mem: Mark the guest endianness packet accessors as deprecated.
Change-Id: Iebefeb5b1ce905f2b45b30b7656d6a01d0724584 Reviewed-on: https://gem5-review.googlesource.com/c/13575 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13347:4085b1fa2288 |
12-Oct-2018 |
Gabe Black <gabeblack@google.com> |
null: Stop specifying an endianness in isa_traits.hh.
The NULL ISA doesn't really have an endianness. Now that the packet accessors which consumed that endianness are gone, we can get rid of that setting as well.
Change-Id: I8dd4c7b8236b07df4458fea377865f30141121d4 Reviewed-on: https://gem5-review.googlesource.com/c/13466 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13346:67e56546fd5a |
12-Oct-2018 |
Gabe Black <gabeblack@google.com> |
mem: Explicitly specify the endianness in the abstract memory.
The accessors are used for debugging output. If we're using an ISA where there's an endianness, we use that explicitly, falling back to a binary dump if the size isn't supported. If not, then we just dump the data without interpretation regardless of size.
Change-Id: Ib050c4c876ee41f17cfd14ad657150bf6ab1de39 Reviewed-on: https://gem5-review.googlesource.com/c/13464 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
13236:8ea2f58940b0 |
12-Oct-2018 |
Daniel <odanrc@yahoo.com.br> |
mem-cache: Add missing includes in TreePLRU
Add missing includes to TreePLRU files.
Change-Id: Ia1e7b2aa91eec8a30b6dccf513cca37a3058b350 Reviewed-on: https://gem5-review.googlesource.com/c/13477 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13235:83ad50c4a285 |
12-Oct-2018 |
Gabe Black <gabeblack@google.com> |
mem: Get rid of some stray lines which ended up in packet.hh.
These were left in by mistake when refactoring patches for review.
Change-Id: I4c39b5a3e2a2d3957e725a6ffcf48c25b8a69f2e Reviewed-on: https://gem5-review.googlesource.com/c/13495 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13228:c7257ea8d84a |
12-Oct-2018 |
Gabe Black <gabeblack@google.com> |
mem: Expose the raw packet accessor functions.
This avoids a place where data has its endianness switched so that when the endianness based accessors switch it back it returns to normal. It also makes it easier to show intent when accessing single bytes where endianness doesn't matter, and there's no contextual endianness.
Change-Id: I1b97396c1b9bb39727d35112d90e3969e5fe0aab Reviewed-on: https://gem5-review.googlesource.com/c/13455 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13225:8d1621fc586e |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Factor ReplaceableEntry out
ReplaceableEntry is referenced by many classes that do not necessarily need access to the replacement policies. Therefore, in order to allow better compilation units, we factor it out to a new file.
Change-Id: I0823567bf1ca336ffcdf783682ef473e8878d7fd Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13418 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13224:1e74ea6ffe51 |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move sector_blks to tags folder
Move sector_blks.hh and sector_blks.cc to the tags folder, as its usage scope is restricted to the tags, and caches should not be aware of them.
Change-Id: Ia7a71f51ec251d827872daf108c87da543a0ba57 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13417 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13223:081299f403fe |
11-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Rename blk.cc/hh to cache_blk.cc/hh
Rename the files blk.cc and blk.hh to cache_blk.cc and cache_blk.hh to comply with the usual file-class naming rules.
Change-Id: I8af45df3e4b8dd934fd9929ec914fb230cb2cb09 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13416 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13222:0dbcc7d7d66f |
10-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Virtualize block print
Encapsulate and virtualize block print, so that relevant information can be easily printed anywhere.
Change-Id: I91109c29c126755183a0fd2b4446f5335e64076b Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13415 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13221:48bce2835200 |
05-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Tree-PLRU replacement policy
Implementation of a Tree-PLRU replacement policy. It is based on the assumption that a set associative cache is used.
Change-Id: I74b227e88fd6c93aab5bb2cd0e8730376db28f52 Reviewed-on: https://gem5-review.googlesource.com/c/11106 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13220:78a8391a0f95 |
12-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove CacheSet.hh
Replacement policies aren't aware of cache sets and do not organize blocks based on replacement data. Block search is independent of block placement.
Besides, indexing policies have their own way of addressing the sets, therefore there is no need to use this class anymore.
BlkType has been removed, as it wasn't being used.
Change-Id: Ia79c2a491e59f295c8d60a0466c317eb0e2bdab9 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/9782 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13219:454ecc63338d |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Split Tags for indexing policies
Split indexing functionality from tags, so that code duplication is reduced when adding new classes that use different indexing policies, such as set associative, skewed associative or other hash-based policies.
An indexing policy defines the mapping between an address' set and its physical location. For example, a conventional set assoc cache maps an address to all ways in a set using an immutable function, that is, a set x is always mapped to set x. However, skewed assoc caches map an address to a different set for each way, using a skewing function.
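The difference between the two mappings can be sketched as below. Both functions are hypothetical; in particular, the mixing step in `skewedIndex` is a placeholder invented for this example and is not one of the skewing functions used by gem5 or described by Seznec.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Conventional set-associative mapping: the way is ignored, so every
// way of a given address uses the same set index.
std::uint32_t setAssocIndex(Addr block_addr, unsigned /* way */,
                            std::uint32_t num_sets)
{
    assert(num_sets > 0);
    return static_cast<std::uint32_t>(block_addr % num_sets);
}

// Skewed mapping: the set index depends on the way.
// The mixing below is a placeholder, not a real skewing function.
std::uint32_t skewedIndex(Addr block_addr, unsigned way,
                          std::uint32_t num_sets)
{
    assert(num_sets > 0);
    const Addr upper = block_addr / num_sets;
    return static_cast<std::uint32_t>((block_addr + way * upper) % num_sets);
}
```

For the same address, `setAssocIndex` returns one set for all ways, while `skewedIndex` can return a different set per way.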
FALRU has been left unmodified as it is a specialization with its own complexity.
Change-Id: I0838b41663f21eba0aeab7aeb7839e3703ca3324 Reviewed-on: https://gem5-review.googlesource.com/c/8885 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13218:5e7df60c6cab |
07-Sep-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use set and way for ReplaceableEntry
Replaceable entries belong to table-like structures, and therefore they should be indexable by combining a row and a column. In conventional cache nomenclature, these translate to sets and ways.
Make these entries aware of their sets and ways. The idea is to make indexing policies usable by other table-like structures. In order to do so we move sets and ways to ReplaceableEntry, which will be the common base among table entries.
Change-Id: If0e3dacf9ea2f523af9cface067469ccecf82648 Reviewed-on: https://gem5-review.googlesource.com/c/12764 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13217:725b1701b4ee |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use possible locations to find block
Use possible locations to find block to make it placement policy independent.
Change-Id: I4c9d9e1e1ff91ce12e85ca1970f927d8f4f5a93b Reviewed-on: https://gem5-review.googlesource.com/c/8884 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13216:6ae030076b29 |
21-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create tags initialization function
Having the blocks initialized in the constructor makes it harder to apply inheritance in the tags classes. This patch decouples the block initialization functionality from the constructor by using an init() function. It also sets the parent cache.
Change-Id: I0da7fdaae492b1177c7cc3bda8639f79921fbbeb Reviewed-on: https://gem5-review.googlesource.com/c/11509 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13215:82cdb8db4643 |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove Packet dependency in Tags
Decouple Tags from Packets, only extracting the necessary functionality for block insertion. As a side effect, create a new function to update common insertion statistics.
Change-Id: I5c58f7c17de3255beee531f72a3fd25a30d74c90 Reviewed-on: https://gem5-review.googlesource.com/c/11098 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13165:d52afbf4cdfe |
04-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU hash invalidation
The block was being invalidated before the hash could erase its entry, therefore it was using invalid values (tag was being assigned MaxAddr and the secure bit was reset).
This change reorders the calls, so that the appropriate hash entry is erased.
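The ordering problem can be sketched with a small tag store. `Block`, `TagStore`, and the use of a plain tag as the hash key are assumptions for illustration (the secure bit is left out); the point is only that the hash entry must be erased while the block still holds its valid tag.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

constexpr std::uint64_t MaxAddr = ~std::uint64_t(0);

// Hypothetical block: invalidation resets the tag to MaxAddr.
struct Block
{
    std::uint64_t tag = MaxAddr;
    bool valid = false;

    void invalidate()
    {
        tag = MaxAddr;
        valid = false;
    }
};

// Hypothetical tag store with a hash from tag to block.
struct TagStore
{
    std::unordered_map<std::uint64_t, Block *> tagHash;

    void insert(Block &blk, std::uint64_t tag)
    {
        blk.tag = tag;
        blk.valid = true;
        tagHash[tag] = &blk;
    }

    void invalidate(Block &blk)
    {
        // Erase first, while blk.tag is still the key that was inserted.
        // Doing this after blk.invalidate() would look up MaxAddr and
        // leave a stale entry behind.
        tagHash.erase(blk.tag);
        blk.invalidate();
    }
};
```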
Change-Id: I161463df0f8f5220179bc68d7be12051e5390d01 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13210 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13164:da6240a1ccfb |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make checking function const in FALRU
The checking function should be able to modify neither the head and tail pointers nor its class.
Change-Id: I2ad495f0c8c6b778d48512143e94b4c9a353f22e Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13209 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13163:55923cb33a7e |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make boundaries in FALRU an STL container
Turn the dynamically allocated array of pointers "boundaries" into an STL vector.
Change-Id: I3409898473b155f69b4c6e038eba2dffb5b09380 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13208 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13162:b6a5d452d52d |
03-Oct-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU inCachesMask initialization
inCachesMask is not being initialized, which triggers an assertion on insertion. Fix this by implementing a default constructor for the FALRUBlk.
Change-Id: I587cf5e0191c4587d938e6ab6036ec1b32f37793 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13207 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13062:6f9defe1c11e |
19-Sep-2018 |
Xianwei Zhang <xianwei.zhang@amd.com> |
mem-ruby: Fix a bug in MessageBuffer randomization
In the previous implementation, messages were randomly inserted with delays only if both the RubySystem and MessageBuffer randomization flags were set to true. However, to find race conditions and cover more SLICC transitions, the Ruby random testers rely on setting the RubySystem flag to turn on randomization on all message buffers. As a fix, this patch enables a message buffer to have randomization when either the RubySystem flag or its own flag is set.
Change-Id: I1e076908ff07e5846ebad4f4fc1c8f28d40bbfd4 Reviewed-on: https://gem5-review.googlesource.com/12784 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13026:c8380b98c0ef |
19-Sep-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix non-bijective function in Skewed caches
The hash() function must be bijective for the skewed caches to work, however when the hashing is done on top of a one-bit address, the MSB and LSB refer to the same bit, and therefore their xor will always be zero.
This patch adds a fatal error to not allow the user to set an invalid value for the number of sets that would generate that bug.
As a side note, the missing header for the bitfields functions has been added.
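The one-bit failure can be checked exhaustively. The sketch below assumes a hash of the shape the message implies (shift right by one and place MSB xor LSB in the top bit); `hashBits` and `isBijective` are hypothetical names, and the real hash() may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Assumed hash shape: shift the value right by one and put
// (MSB xor LSB) into the top bit. `value` must fit in `width` bits.
std::uint32_t hashBits(std::uint32_t value, unsigned width)
{
    assert(width >= 1 && width < 32);
    const std::uint32_t lsb = value & 1u;
    const std::uint32_t msb = (value >> (width - 1)) & 1u;
    return (value >> 1) | ((msb ^ lsb) << (width - 1));
}

// True if hashBits maps the 2^width inputs to 2^width distinct outputs.
bool isBijective(unsigned width)
{
    std::set<std::uint32_t> outputs;
    const std::uint32_t count = 1u << width;
    for (std::uint32_t v = 0; v < count; ++v)
        outputs.insert(hashBits(v, width));
    return outputs.size() == count;
}
```

With a width of one bit, MSB and LSB are the same bit, so both inputs hash to zero and the mapping stops being bijective; wider values keep distinct outputs.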
Change-Id: I35a03ac5fdc4debb091f7f2db5db33568d0b0021 Reviewed-on: https://gem5-review.googlesource.com/12724 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13023:a379876f2244 |
17-Aug-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
mem: Implement QoS Proportional Fair policy
Provide a configurable fair scheduling policy based on utilization; utilization is directly proportional to a score which is inversely proportional to the QoS priority. It is meant to avoid starvation of low priority packets.
Users can tune the policy by adjusting the weight parameter (the weight in the following formula):
new_score = ((1.0 - weight) * old_score) + (weight * served_bytes)
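The update rule is an exponentially weighted moving average, which can be written directly as a function. `updateScore` is a hypothetical name, and restricting `weight` to the range [0, 1] is an assumption of this sketch rather than something the message states.

```cpp
#include <cassert>

// One score update step, following the formula above.
// Assumption: weight lies in [0, 1].
double updateScore(double old_score, double served_bytes, double weight)
{
    assert(weight >= 0.0 && weight <= 1.0);
    return ((1.0 - weight) * old_score) + (weight * served_bytes);
}
```

A weight of 0 keeps the old score unchanged, a weight of 1 replaces it with the bytes just served, and values in between blend the two.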
Change-Id: I7679e234b916c57ebed06cec0ff3cff3cf2aef22 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12359 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
13017:a620da03ab10 |
01-Sep-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix bug in handleAtomicReqMiss
"4976ff5 mem-cache: Refactor the recvAtomic function" introduced a bug where if an atomic request that fills in using the tempBlock it will not evict it when it finishes handling the request as it should. This triggers an assertion. This change fixes this bug.
Change-Id: I73c808a7e15237eddb36b5448ef6728f7bcf7fd9 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12644 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12969:52de9d619ce6 |
04-Aug-2017 |
Matteo Andreozzi <Matteo.Andreozzi@arm.com> |
mem: Make DRAMCtrl a QoS-aware Memory Controller
This patch turns DRAMCtrl into a QoS-aware Memory Controller with "no policy" as the default policy.
Change-Id: I48163da8c8208498cf0398b07094cb840272507f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11973 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12968:1c2b8dd9241f |
10-May-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
mem: Implement base QoS Policies.
This patch implements a base fixed priority policy and an ideal turnaround policy for the QoS memory controller.
Change-Id: I38ce16f845fc0ec86d6fc4cc5dc5406f213a465e Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11972 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12967:16c791dd6ee6 |
10-May-2018 |
Matteo Andreozzi <matteo.andreozzi@arm.com> |
mem: Add a simple QoS-aware Memory Controller
This patch implements QoSMemorySink: a simple generic QoS-aware memory controller which inherits from QoS::MemCtrl.
Change-Id: I537a4e2d4cb8f54fa0002eb088b2c6957afb9973 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11971 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Matthew Poremba <porembam@gmail.com> |
12966:3b20a7f755d5 |
10-Jan-2018 |
Matteo Andreozzi <Matteo.Andreozzi@arm.com> |
mem: Add a QoS-aware Memory Controller type
This is the implementation of QoS algorithm support for gem5 memory objects. This change-list provides a framework for specifying QoS algorithms which can be used to prioritise service to specific masters in the memory controller. The QoS support implemented here is designed to be extendable so that new QoS algorithms can be easily plugged into the memory controller as "QoS Policies".
Change-Id: I0b611f13fce54dd1dd444eb806f8e98afd248bd5 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11970 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12964:0315ef861b8a |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Skewed Assoc placement policy
Create a class that implements the skewed associative placement policy. It uses the hash function and expansions of the skewing functions described in "Skewed-Associative caches", by Seznec.
Only 8 skewing functions are implemented, and therefore if more are needed a hash function will be recursively applied on top of the output of one of these functions to generate different values. This is not optimal, and if more functions are needed it might be more effective to implement them.
Change-Id: Ibc77edffd8128114a8b200cec5d8deedfb5105cb Reviewed-on: https://gem5-review.googlesource.com/8886 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12917:c18b776f460c |
16-May-2018 |
Stanislaw Czerniawski <stacze01@arm.com> |
mem: Add StreamID and SubstreamID
This patch adds StreamID and SubstreamID to Request. These fields can be used by a SMMU/IOMMU model to pick up the correct translation context for each request and they correspond to an ASID in a device. For this reason they have been merged together with the request asid in a union, so that a cpu will set the asid and a device will set the Stream and Substream ID.
Change-Id: Iac2b5a1ba9c6598ee7635c30845dc68ba6787c34 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12187 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12892:796175b0e2dc |
15-Feb-2018 |
Brandon Potter <brandon.potter@amd.com> |
scons,ruby: do not generate unnecessary files
Do not generate garnet tester file or Ruby debug headers without a Ruby protocol (i.e. PROTOCOL=None). It makes no sense to include these files into the build when there will be no protocol to utilize them.
Change-Id: I8db4dd532f60008217a10c88a2e089f85df9d104 Reviewed-on: https://gem5-review.googlesource.com/8381 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12890:2dcd62e80e96 |
19-Feb-2018 |
Brandon Potter <brandon.potter@amd.com> |
ruby: remove unused code inside '#if 0 ... #endif'
The commented code contains bitrot. It is not clear how to fix the code so remove it.
The code will not compile if the preprocessor defines are removed. The llocker and uulocker variables that are used as indices into the persistent_randomize array are undefined. It's not clear what they should be from the current code.
5ab13e2deb shows when the lines were last modified. The functionality contained in the comments probably has not been used since that time. (This is an example of why one should never add commented code that is enabled by removing defines. The code rots and sits in the source forever.)
Change-Id: I3e0e7c9afc0b6088130e6f319075809fb6f16e5a Reviewed-on: https://gem5-review.googlesource.com/8481 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12843:d2ab5af49985 |
13-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem-cache: TempCacheBlk allocates and destroys its own data
This change is because I want to make CacheBlk::data private, so that I can track all the places which write to it. But to keep that commit smaller (it is pretty big, because of all the places which might change it), I have split this into a commit of its own.
Change-Id: I15a2fc1752085ff3681f5c74ec90be3828a559ea Reviewed-on: https://gem5-review.googlesource.com/11829 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12823:ba630bc7a36d |
19-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem: Rename Packet::checkFunctional to trySatisfyFunctional
Packet::checkFunctional also wrote data to or from the packet, depending on whether it was a read or a write, respectively, contrary to what the 'check' in the name suggests. This renames it to trySatisfyFunctional, which is more suggestive. It also renames any function called checkFunctional which calls Packet::checkFunctional. These are:
- Bridge::BridgeMasterPort::checkFunctional
  - calls Packet::checkFunctional
- MSHR::checkFunctional
  - calls Packet::checkFunctional
- MSHR::TargetList::checkFunctional
  - calls Packet::checkFunctional
- Queue<>::checkFunctional (of src/mem/cache/queue.hh, not src/cpu/minor/buffers.h)
  - Instantiated with Queue<WriteQueueEntry> and Queue<MSHR>
  - WriteQueueEntry
    - calls Packet::checkFunctional
  - WriteQueueEntry::TargetList
    - calls Packet::checkFunctional
- MemDelay::checkFunctional
  - calls QueuedSlavePort/QueuedMasterPort::checkFunctional
- Packet::checkFunctional
- PacketQueue::checkFunctional
  - calls Packet::checkFunctional
- QueuedSlavePort::checkFunctional
  - calls PacketQueue::doFunctional
- QueuedMasterPort::checkFunctional
  - calls PacketQueue::doFunctional
- SerialLink::SerialLinkMasterPort::checkFunctional
  - calls Packet::doFunctional
Change-Id: Ieca2579c020c329040da053ba8e25820801b62c5 Reviewed-on: https://gem5-review.googlesource.com/11810 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12822:fe6f6d605214 |
19-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem: Removed "using namespace std;" from src/mem/packet.cc
To avoid unintentional variable capture, all std calls must be prefixed. These are the identifiers which are in the std namespace (according to https://en.cppreference.com/w/cpp/symbol_index), but that will remain unprefixed with this change:
int8_t int16_t int32_t int64_t uint8_t uint16_t uint32_t uint64_t
The (u)int types are included from the packet header file, which includes <inttypes.h>, where they occur in the global namespace. They are in the std namespace in <cinttypes>/<cstdint>.
There is an occurrence of "set" in this file, which is "Packet::set" and not "std::set", so it is not prefixed with the std namespace
Change-Id: I7f6c0b61b09658e224fe31a9f73150b81861d6f8 Reviewed-on: https://gem5-review.googlesource.com/11809 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12821:3663a543ed2a |
13-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem: Fix off-by-one error in checkFunctional, and simplify it
There was an off-by-one error in the isRead() case, as `val_end` and `func_end` pointed to the last byte to write to (not one past the last byte), and thus `*_end - *_start` was not the length of the data to memcpy.
This was correct in the case of
val_start >= func_start && val_end <= func_end
where `overlap_size = size`, but if it were (as the other cases suggest) `overlap_size = val_end - val_start`, then it would also be off by one.
Also, the isWrite() case catered for this.
I simplified the four ifs into one case which uses min/max (this is how I spotted the inconsistency).
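The min/max formulation for inclusive ranges can be sketched as follows. `overlapSize` is a hypothetical helper, not the code from the patch; it shows where the `+ 1` comes from when both end points name the last byte rather than one past it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Number of bytes shared by [a_start, a_end] and [b_start, b_end].
// Both ends are inclusive, i.e. they point at the last byte.
Addr overlapSize(Addr a_start, Addr a_end, Addr b_start, Addr b_end)
{
    assert(a_start <= a_end && b_start <= b_end);
    const Addr start = std::max(a_start, b_start);
    const Addr last = std::min(a_end, b_end);
    if (start > last)
        return 0; // the ranges do not overlap
    // With inclusive ends the length is last - start + 1; dropping
    // the + 1 is the off-by-one described above.
    return last - start + 1;
}
```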
Change-Id: Ib5c5da084652e752f6baf1eec56b51b4f0f5c95c Reviewed-on: https://gem5-review.googlesource.com/11750 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12820:5d66b60a2c47 |
13-Jul-2018 |
Robert Kovacsics <rmk35@cl.cam.ac.uk> |
mem-cache: Typo in comment: 'proceed' -> 'precede'
The writebacks happen before anything below, not after.
Change-Id: I7eaefbbf33aa17c496255dedd964a56118a28741 Reviewed-on: https://gem5-review.googlesource.com/11749 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12802:c861c5743fc0 |
02-May-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Add a memory delay simulator
Add a memory system component that delays traffic. The base functionality to delay packets is implemented in the abstract MemDelay class. This class exposes three methods that control packet delays:
* delayReq(pkt)
* delayResp(pkt)
* delaySnoopResp(pkt)
These methods should be specialized to implement delays for specific packet types.
The class SimpleMemDelay uses the MemDelay base class to implement constant delays for read/write requests and responses.
The intention is that these classes can be used for rapid prototyping of components that add a small fixed delay and the same throughput as the interconnect. I.e., any buffering done in the base class will be small and proportional to the introduced delay.
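The split between a base class with three delay hooks and a constant-delay specialization can be sketched as below. `Pkt`, `DelayBase`, and `ConstantDelay` are hypothetical stand-ins, and the zero-delay defaults in the base class are an assumption of this sketch; the real MemDelay and SimpleMemDelay classes also handle ports and queuing, which is omitted here.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Hypothetical minimal packet: only what the sketch needs.
struct Pkt
{
    bool isRead;
};

// Base class: one hook per packet direction, no delay by default.
class DelayBase
{
  public:
    virtual ~DelayBase() = default;
    virtual Tick delayReq(const Pkt &) { return 0; }
    virtual Tick delayResp(const Pkt &) { return 0; }
    virtual Tick delaySnoopResp(const Pkt &) { return 0; }
};

// Constant delays for read/write requests and responses.
class ConstantDelay : public DelayBase
{
  public:
    ConstantDelay(Tick read_req, Tick read_resp,
                  Tick write_req, Tick write_resp)
        : readReq(read_req), readResp(read_resp),
          writeReq(write_req), writeResp(write_resp)
    {}

    Tick delayReq(const Pkt &pkt) override
    { return pkt.isRead ? readReq : writeReq; }

    Tick delayResp(const Pkt &pkt) override
    { return pkt.isRead ? readResp : writeResp; }

  private:
    Tick readReq, readResp, writeReq, writeResp;
};
```

Snoop responses fall through to the base class and are not delayed in this sketch.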
Change-Id: I158cb85f20e32bfdbcbfed66a785b4b2dd47b628 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nicholas Lindsey <nicholas.lindsay@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11521 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12794:ba78a382b0f6 |
18-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Promote deferred targets on cache clean responses
While a cache clean operation is pending, all requests to the corresponding block get deferred. When the response of a cache clean operation is received, if the block is present and the response is not invalidating, we can service all deferred targets that didn't require writable. This change implements this functionality.
Change-Id: Ief47e74d07749a6a9736ab450eb46eefa53464a2 Reviewed-on: https://gem5-review.googlesource.com/11018 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12793:dda6af979353 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Promote targets that don't require writable
Until now, all deferred targets of an MSHR would be promoted together as soon as the targets were serviced. Due to the way we handle cache clean operations we might need to promote only deferred targets that don't require writable, leaving some targets as deferred. This change adds support for this selective promotion.
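One way to picture selective promotion is a predicate-driven splice between two lists. The sketch below is an illustration under assumptions: Target, TargetList and the two function names are hypothetical stand-ins, and it assumes that promotion stops at the first deferred target that fails the predicate so that request ordering is preserved.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <list>

// Hypothetical, simplified MSHR target; not gem5's Target class.
struct Target {
    int id;
    bool needsWritable;
};

using TargetList = std::list<Target>;

// Move the leading run of deferred targets that satisfy pred to the end
// of the active list. Everything from the first failing target onwards
// stays deferred.
void promoteIf(TargetList &targets, TargetList &deferred,
               const std::function<bool(const Target &)> &pred)
{
    auto last = std::find_if_not(deferred.begin(), deferred.end(), pred);
    targets.splice(targets.end(), deferred, deferred.begin(), last);
}

// Promote only the deferred targets that do not require a writable copy.
void promoteReadable(TargetList &targets, TargetList &deferred)
{
    promoteIf(targets, deferred,
              [](const Target &t) { return !t.needsWritable; });
}
```

Using splice moves the list nodes without copying the targets, which keeps any pointers or iterators to the moved targets valid.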
Change-Id: I502e523dc9adbaf394955cbacea8286ab6a9b6bc Reviewed-on: https://gem5-review.googlesource.com/11017 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12792:9af3470e24e7 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix promoting of targets that need writable
There are cases where a request which does not need a writable copy gets an upgraded response and fills in a writable copy. When this happens, we promote deferred MSHR targets that were deferred because they needed a writable copy to service them immediately.
Previously, we would unconditionally promote deferred targets. Since the deferred targets might contain a cache invalidation operation, we have to make sure that any targets following the cache invalidation are not promoted.
Change-Id: I1f7b28f7d35f84329e065c8f63117db21852365a Reviewed-on: https://gem5-review.googlesource.com/11016 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12791:8f27b3c23a91 |
16-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Selectively clear downstream pending
Until now, all deferred targets of an MSHR would be promoted together as soon as the targets were serviced. When we promote deferred targets we also clear the downstreamPending flag.
Due to the way we handle cache clean operations we might need to promote only deferred targets that don't require writable, leaving some targets as deferred. To allow for partial target promotion, this change adds support for clearing the downstreamPending only for a subset of a TargetsList.
Change-Id: Id06953643ba9a975ebacc76ac10215441e264e74 Reviewed-on: https://gem5-review.googlesource.com/11015 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12782:558fb870aefe |
18-Jun-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: Fix TempCacheBlock insert
TempCacheBlock insert() had a different signature than the parent class which caused an error on clang. This matches the signature with default zero values.
Change-Id: Ic096914497f3d17e88295c9e65a04d76fdddf365 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/11349 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12781:bfb560f980f6 |
04-Jun-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Use address range to find the right physical address
Previously, we used the start address to determine the right physical memory while servicing memory requests. This change uses the full address range to correctly determine the right physical memory and expose bugs where requests might not fully map to a single physical memory.
Change-Id: I183d7552918106000f917a62ceb877511ff0ff71 Reviewed-on: https://gem5-review.googlesource.com/11118 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12780:14937f6495b4 |
04-Jun-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Use address range to find the destination port in the xbar
Previously the xbar used the start address to lookup the port map and determine the right destination of an incoming packet. This change uses the full address range to correctly determine the right master.
Change-Id: I5118712c43ae65aba64e71bf030bca5c99770bdd Reviewed-on: https://gem5-review.googlesource.com/11117 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12779:c1dc175bb9be |
19-Oct-2017 |
Gabe Black <gabeblack@google.com> |
mem: Use the caching in the AddrRangeMap class in PhysicalMemory
Use it instead of custom implemented caching.
Change-Id: Ie21012a77a3cb6ce57f34f879fa391678913896a Reviewed-on: https://gem5-review.googlesource.com/5244 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12778:ca8c50112a66 |
18-Oct-2017 |
Gabe Black <gabeblack@google.com> |
mem: Use the caching built into AddrRangeMap in the xbar
Use that instead of caching built into the crossbar.
Change-Id: If5a5355a0a1a6e532b14efc88a319de4c023f8c1 Reviewed-on: https://gem5-review.googlesource.com/5243 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12776:410b60d8a397 |
18-Apr-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
base, mem: Disambiguate if an addr range is contained or overlaps
We need to determine whether an address range is fully contained in, or merely overlaps with, an address range in the address range map. As an example, we use address range maps to associate ports to address ranges and we determine to which port we will forward the request based on which address range contains the addresses accessed by the request. We also need to make sure that when we add a new port to the address range map, its address range does not overlap with any of the existing ports.
This patch splits the function find() into two functions contains() and intersects() to implement this distinct functionality. It also changes the xbar and the physical memory to use the right function.
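The difference between the two checks can be shown on a single pair of ranges. The Range struct below is a hypothetical stand-in that uses half-open intervals; in the patch itself the two functions belong to the address range map, so this sketch only illustrates the two predicates, not the map interface.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical half-open address range [start, end); not gem5's AddrRange.
struct Range {
    Addr start;
    Addr end;

    // True if other lies entirely inside this range. This is the check
    // needed when routing a request to the port that owns its addresses.
    bool contains(const Range &other) const {
        return start <= other.start && other.end <= end;
    }

    // True if the two ranges share at least one address. This is the
    // check needed when rejecting a new port that overlaps an existing one.
    bool intersects(const Range &other) const {
        return start < other.end && other.start < end;
    }
};
```

A request that straddles the end of a range intersects it without being contained in it, which is exactly the case the start-address lookup could not detect.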
Change-Id: If3fd3f774a16b27db2df76dc04f1d61824938008 Reviewed-on: https://gem5-review.googlesource.com/11115 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12775:84d56bc8cd8b |
21-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix support for secure blocks in the FALRU cache
Fully associative caches use an unordered map to enable efficient lookups of existing blocks. Previously this map was indexed using the tag of the block. Security extensions allow secure and non-secure versions of a block with the same tag to co-exist in the cache. This patch amends the block map to allow correct lookups for FALRU caches.
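A plausible way to make such a map distinguish secure from non-secure blocks is to key it on the pair (tag, secure bit). The sketch below assumes that approach; TagKey, TagKeyHash, Block and BlockMap are hypothetical names, and the commit message does not state how the actual map key is built.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Addr = std::uint64_t;
using TagKey = std::pair<Addr, bool>;  // (tag, is_secure)

// std::pair has no std::hash specialization, so provide one.
struct TagKeyHash {
    std::size_t operator()(const TagKey &k) const {
        // Fold the secure bit into the low bit of the tag hash.
        return std::hash<Addr>()(k.first) * 2 + (k.second ? 1 : 0);
    }
};

struct Block {
    int data;
};

class BlockMap {
  public:
    void insert(Addr tag, bool secure, Block *blk) {
        map[TagKey(tag, secure)] = blk;
    }

    // Returns nullptr when no block with this tag and security state exists.
    Block *find(Addr tag, bool secure) const {
        auto it = map.find(TagKey(tag, secure));
        return it == map.end() ? nullptr : it->second;
    }

  private:
    std::unordered_map<TagKey, Block *, TagKeyHash> map;
};
```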
Change-Id: Iccf07464deab56d1d270bae14bb3b154047e3556 Reviewed-on: https://gem5-review.googlesource.com/11309 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12774:b7948f858593 |
13-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Initialize CacheBlk data pointer
Initialize CacheBlk's data pointer as a nullptr.
Change-Id: Ice85b4b11495cad4b0a160ccb9efe1be673e57e2 Reviewed-on: https://gem5-review.googlesource.com/11097 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12773:387fa9e5c9ff |
07-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Forward declare ReplaceableEntry
Forward declare ReplaceableEntry in classes where pointers to it are used.
Change-Id: I49c08d36442a563d7a6b4c9bcd7eba3591d29b60 Reviewed-on: https://gem5-review.googlesource.com/11096 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12766:1c347e60c7fd |
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
base,mem: Support AtomicOpFunctor in the classic memory system
AtomicOpFunctor can be used to implement atomic memory operations. AtomicOpFunctor is captured inside a memory request and executed directly in the memory hierarchy in a single step.
This patch enables AtomicOpFunctor pointers to be included in a memory request and executed in a single step in the classic cache system.
This patch also makes the copy constructor of Request class do a deep copy of AtomicOpFunctor object. This prevents a copy of a Request object from accessing a deleted AtomicOpFunctor object.
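The deep copy can be sketched with a virtual clone() on the functor. The classes below are simplified stand-ins written for illustration: the AtomicAdd functor, the one-byte operand and the accessor name are assumptions, and only the idea of cloning in the copy constructor is taken from the description.

```cpp
#include <cassert>
#include <cstdint>

// Simplified functor interface; the real class has a different shape.
struct AtomicOpFunctor {
    virtual ~AtomicOpFunctor() = default;
    virtual void operator()(std::uint8_t *p) = 0;
    virtual AtomicOpFunctor *clone() const = 0;
};

// Hypothetical atomic add on a single byte.
struct AtomicAdd : AtomicOpFunctor {
    std::uint8_t operand;
    explicit AtomicAdd(std::uint8_t v) : operand(v) {}
    void operator()(std::uint8_t *p) override { *p += operand; }
    AtomicOpFunctor *clone() const override { return new AtomicAdd(operand); }
};

class Request {
  public:
    // Takes ownership of op.
    explicit Request(AtomicOpFunctor *op) : atomicOp(op) {}

    // Deep copy: the copy owns its own functor, so destroying the
    // original cannot leave the copy with a dangling pointer.
    Request(const Request &other)
        : atomicOp(other.atomicOp ? other.atomicOp->clone() : nullptr) {}

    Request &operator=(const Request &) = delete;

    ~Request() { delete atomicOp; }

    AtomicOpFunctor *getAtomicOp() const { return atomicOp; }

  private:
    AtomicOpFunctor *atomicOp;
};
```

With a shallow copy, both Request objects would delete the same functor; cloning gives each one an independent lifetime.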
Change-Id: I6649532b37f711e55f4552ad26893efeb300dd37 Reviewed-on: https://gem5-review.googlesource.com/8185 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12765:8c15915ba978 |
28-May-2018 |
Jason Lowe-Power <jason@lowepower.com> |
ruby: Revamp standalone SLICC script
There was some bitrot in the standalone SLICC script (util/slicc and src/mem/slicc/main.py). Fix the changes to the SLICC interface and also add some better documentation.
Change-Id: I91c0ec78d5072fba83edf32b661ae67967af7822 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/10561 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12756:7f7bd5dbfcb1 |
13-Jun-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unnecessary cast in SectorTags::findVictim
Removes an unnecessary cast that also caused an unused variable error (due to -Werror) when compiling .fast targets.
Change-Id: Ic043f462925e7eaa7b691455f1d9e08a1c101980 Reviewed-on: https://gem5-review.googlesource.com/11119 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12754:15c1d281ce1a |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Insert on block allocation
When a block is being replaced in an allocation, if successful, the block will be inserted. Therefore we move the insertion functionality to allocateBlock().
allocateBlock's signature has been modified to allow this modification.
Change-Id: I60d17a83ff4f3021fdc976378868ccde6c7507bc Reviewed-on: https://gem5-review.googlesource.com/10812 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12753:fe5b2dbe42bb |
06-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make packet const in insertBlock
The packet should not be modified within insertBlock.
Change-Id: If7d2b01fe131f9923194efd155c9e85eeab24d5a Reviewed-on: https://gem5-review.googlesource.com/10811 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12752:6a0e3eb1cc5d |
05-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Sector Cache
Implementation of Sector Caches, i.e., a cache with multiple sequential data entries per tag entry for Set Associative placement policies.
Change-Id: I8e1e9448fa44ba308ccb16cd5bcc5fd36c988feb Reviewed-on: https://gem5-review.googlesource.com/9741 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12750:1dde69fad30f |
05-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
ruby: Fix initial weight in weighted LRU
Initial weight was using the timestamp instead of the weight.
Change-Id: I61d3c8424f85fd6856957087c477afda111f8ca7 Reviewed-on: https://gem5-review.googlesource.com/10801 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12749:223c83ed9979 |
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
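The effect of the alias change can be illustrated with a small sketch. The Request and Packet structs below are minimal stand-ins with assumed fields; only the alias from a raw pointer to std::shared_ptr<Request> reflects the description.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Stand-in for the simulator's Request; fields are assumptions.
struct Request {
    std::uint64_t paddr;
    unsigned size;
    Request(std::uint64_t a, unsigned s) : paddr(a), size(s) {}
};

// Before the change this alias named a raw pointer (Request *).
using RequestPtr = std::shared_ptr<Request>;

// Hypothetical packet that keeps its request alive by holding a RequestPtr.
struct Packet {
    RequestPtr req;
    explicit Packet(RequestPtr r) : req(std::move(r)) {}
};
```

Because every holder shares ownership, the request is destroyed when the last RequestPtr goes away, so no component has to decide who calls delete.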
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12748:ae5ce8e42de7 |
03-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Substitute pointer to Request with aliased RequestPtr
Every usage of Request* in the code has been replaced with the RequestPtr alias. This is a preparatory patch for when RequestPtr will be typedef'd to a smart pointer to Request rather than a raw pointer to Request.
Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10995 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12747:785f582e44ab |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Change Cache block tag check
Change tag to address check for compatibility with sector design. Cache should not use tag, as sector sub-blocks share them, and it could lead to wrong accesses.
Change-Id: Id1fa26f417595f475c5b5c07ae1f02f5fa0684ba Reviewed-on: https://gem5-review.googlesource.com/10723 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12746:0d0c266663d4 |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use secure bit in findVictim
Sector caches must know if there was a sector hit in order to decide whether a victim's sector must be fully evicted to give place to a new sector or not.
In order to do so it needs the tag and secure information.
Change-Id: Ib554169e25fa131d6bf986561f7970b787c56874 Reviewed-on: https://gem5-review.googlesource.com/10722 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12745:e28c117a9806 |
02-Jun-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move tagsInUse to children
Move tagsInUse to children, as sector caches have different tag invalidation and insertion, and thus they must handle updating this variable.
Change-Id: I875c9b7364a909c76daf610d1e226c4e82063870 Reviewed-on: https://gem5-review.googlesource.com/10721 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12744:d1ff0b42b747 |
24-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Return evictions along with victims
For both sector and compressed caches multiple blocks may need to be evicted in order to make room for a new block.
For example, when replacing a sector, all the blocks in this sector must be evicted. A replacement, however, does not always need to evict multiple blocks, as it is in the case of an insertion of a block whose sector is already present in the cache (i.e., its corresponding entry in the sector had not been brought in yet, so it was invalid).
This patch creates the cache framework for that to happen.
Change-Id: I77bedf69637cf899fef4d9432eb6da8529ea398b Reviewed-on: https://gem5-review.googlesource.com/10142 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12743:b5ccee582b40 |
20-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use ReplaceableEntry in findBlockBySetAndWay
With a sector cache you can't find a block using only its set and way, as there is the sector offset to take into account. As all of these blocks inherit from ReplaceableEntry, the return type of this function has been updated.
This function has also been declared closer to findBlock() due to their similar functionality.
Change-Id: I4730a2b4ebb5738f7fc118201e208a1b9c3ba8e8 Reviewed-on: https://gem5-review.googlesource.com/10141 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12731:36a41bd85c0f |
17-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Privatize extractSet
Only BaseSetAssoc uses extractSet(). Besides, skewed caches need the way information to know which set an address is located at.
Change-Id: Id222e907dc550d053018561bb2683cfc415471ec Reviewed-on: https://gem5-review.googlesource.com/9962 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12730:6c2ea88bf129 |
16-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create an address aware TempCacheBlk
tempBlock has its member variables manually set in order to allow it to be used in the block address regeneration function. This is not necessary, and it can be simply given the address, so it does not need to be aware of set and tag. This will simplify implementation of sector and skewed caches.
Change-Id: Iaffb10c323509722cd5589fe1030b818d43336d6 Reviewed-on: https://gem5-review.googlesource.com/9961 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12729:9870d6f73e04 |
30-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix secure bit modification
Secure bit was being updated outside insertion.
Change-Id: I83d9b010e8cf64013bbea9bae3ea68b0c414a189 Reviewed-on: https://gem5-review.googlesource.com/10622 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12728:57bdea4f96aa |
30-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Replace block visitor with std::function
This change modifies the forEachBlk tags function to accept a std::function as parameter. It also adds an anyBlk tags function that, given a condition, iterates through the blocks and returns whether the condition is met.
Finally, it uses forEachBlk to implement the print, computeStats and cleanupRefs functions that also work for the FALRU class.
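A visitor interface based on std::function might look like the sketch below. The Tags class, its vector storage and the CacheBlk fields are simplified assumptions for illustration; only the function names forEachBlk and anyBlk come from the description.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, stripped-down cache block.
struct CacheBlk {
    bool valid;
    bool dirty;
};

class Tags {
  public:
    explicit Tags(std::vector<CacheBlk> b) : blks(std::move(b)) {}

    // Apply visitor to every block.
    void forEachBlk(const std::function<void(CacheBlk &)> &visitor) {
        for (CacheBlk &blk : blks)
            visitor(blk);
    }

    // Return true as soon as one block satisfies the condition.
    bool anyBlk(const std::function<bool(CacheBlk &)> &visitor) {
        for (CacheBlk &blk : blks) {
            if (visitor(blk))
                return true;
        }
        return false;
    }

  private:
    std::vector<CacheBlk> blks;
};
```

Callers can then pass lambdas that capture local state, for example a counter of valid blocks, instead of defining a dedicated visitor class for each use.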
Change-Id: I2f75f4baa1fdd5a1d343a63ecace3eb9458fbf03 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10621 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12727:56c23b54bcb1 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Fix include directives in the cache related classes
Change-Id: I111b0f662897c43974aadb08da1ed85c7542585c Reviewed-on: https://gem5-review.googlesource.com/10433 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12726:850e9965525b |
05-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add a non-coherent cache
The class re-uses the existing MSHR and write queue. At the moment every single access is handled by the cache, even uncacheable accesses, and nothing is forwarded.
This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com>
Change-Id: I41f7f9c2b8c7fa5ec23712a4446e8adb1c9a336a Reviewed-on: https://gem5-review.googlesource.com/8291 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12725:3dcb96899659 |
03-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move cache bypass mechanism to the ports
Cache bypass is necessary for cpu models like the KvmCPU. Previously the bypass would happen at the cache classes. With this change the bypassing happens directly at the ports.
Change-Id: I34de9fc63383aee8590643e169501ea6060d2d62 Reviewed-on: https://gem5-review.googlesource.com/10432 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12724:4f6fac3191d2 |
02-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Adopt a more sensible cache class hierarchy
This patch changes what goes into the BaseCache and what goes into the Cache, to make it easier to add a NoncoherentCache with as much re-use as possible. A number of redundant members and definitions are also removed in the process.
This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com>
Change-Id: Ie9dd73c4ec07732e778e7416b712dad8b4bd5d4b Reviewed-on: https://gem5-review.googlesource.com/10431 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12723:530dc4bf1a00 |
04-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add helper function to perform evictions
Change-Id: I2df24eb1a8516220bec9b685c8c09bf55be18681 Reviewed-on: https://gem5-review.googlesource.com/10430 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12722:d84f756891fe |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Delegate block invalidation to block allocation
For a block replacement we first select a victim block, we invalidate it and then populate it with the new information. Prior to this change BaseTags::insertBlock() did the invalidation and filled in the block with the new information. Now that the replacements stat is moved to the BaseCache, insertBlock does not need to perform the invalidation and as a result we can unify the block eviction code in BaseCache.
Change-Id: I5bdf00b2dab2752ed2137ab7201ed1dc451333b3 Reviewed-on: https://gem5-review.googlesource.com/10429 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12721:7f611e9412f0 |
04-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the recvAtomic function
The recvAtomic function in the cache handles atomic requests. Over time, recvAtomic has grown in complexity and code size. This change factors out some of its functionality into a separate function. The new function handles atomic requests that miss.
Change-Id: If77d2de1e3e802e1da37f889f68910e700c59209 Reviewed-on: https://gem5-review.googlesource.com/10425 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12720:8db2ee0c2cf6 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the cache recvTimingReq function
The recvTimingReq function in the cache handles timing requests. Over time, recvTimingReq has grown in complexity and code size. This change factors out some of its functionality in two separate functions. The new functions handle timing requests that hit and timing requests that miss separately.
Change-Id: I09902d648d7272f0f9ec2851fa6376f7305ba418 Reviewed-on: https://gem5-review.googlesource.com/10424 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12719:68a20fbd07a6 |
01-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Refactor the cache recvTimingResp function
The recvTimingResp function in the cache handles timing responses. Over time, recvTimingResp has grown in complexity and code size. This change factors out some of its functionality to a separate function. The new function iterates through the in-service targets and handles them accordingly.
Change-Id: I0ef28288640f6be1b30452b0664d32432e692ea6 Reviewed-on: https://gem5-review.googlesource.com/10423 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12718:abad79926b86 |
31-May-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix RandomReplData
Random replacement policy's data was being instantiated with the incorrect class.
Change-Id: Ib573a6b5a63868d6069997c6279bec3b10c6b9b9 Reviewed-on: https://gem5-review.googlesource.com/10623 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12715:0c8b4f376378 |
02-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Determine if an MSHR has requests from another cache
To decide whether we allocate upon receiving a response we need to determine if any of the currently serviced requests (non-deferred targets) is coming from another cache. This change adds support for tracking this information in the MSHR.
Change-Id: If1db93c12b6af5813b91b9d6b6e5e196d327f038 Reviewed-on: https://gem5-review.googlesource.com/10422 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12706:456304051464 |
28-Mar-2017 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add support for more flexible DRAM timing and topologies
This patch has 2 main aspects: 1) Add new parameter to adjust write-to-write delay 2) Enable support of more than 64 banks per controller
Changes for new parameter: Incorporated a new parameter, tCCD_L_WR, which defaults to tCCD_L. This parameter can be used to set a unique delay between writes and between reads.
To incorporate this parameter in the controller, modified the DRAMCtrl class to have separate variables for read and write column delays. Used these variables to account for tRTW, tWTR, tBURST, tCCD_L, and tCS as well as the new tCCD_L_WR parameter.
Changes to support more than 64 banks: Modified the logic selecting the next command (reorderQueue and minBankPrep functions). Replaced the uint64_t variables with a vector of uint32_t elements. There is a uint32_t element defined per rank to allow up to 32 banks per rank. This will automatically scale with ranks without issue. Change will allow analysis of memory sub-systems beyond the current landscape.
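The per-rank bitmask layout can be sketched as follows. The BankMask class and its method names are hypothetical and exist only to show how one uint32_t per rank replaces a single 64-bit mask; they are not taken from the DRAM controller code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 32-bit mask per rank: bit b of mask[r] marks bank b of rank r.
class BankMask {
  public:
    explicit BankMask(unsigned ranks) : mask(ranks, 0) {}

    void set(unsigned rank, unsigned bank) {
        assert(rank < mask.size() && bank < 32);
        mask[rank] |= (std::uint32_t{1} << bank);
    }

    bool test(unsigned rank, unsigned bank) const {
        assert(rank < mask.size() && bank < 32);
        return ((mask[rank] >> bank) & 1u) != 0;
    }

  private:
    std::vector<std::uint32_t> mask;
};
```

With a single uint64_t the product of ranks and banks per rank was capped at 64. With one element per rank the total grows with the number of ranks, and only the per-rank limit of 32 banks remains.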
Change-Id: I0ce466efed58276f843ad90e9ecc0ece6c37d646 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10103 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12705:9668a82ead4b |
06-Apr-2017 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Optimize self-refresh entry
Self-refresh is entered during a refresh event, when the rank was previously in a precharge power-down state. The original code would enter self-refresh after a refresh was issued. The device subsequently will issue a refresh on self-refresh entry. On self-refresh exit, the controller will issue another refresh command.
Devices require at least one additional refresh to be issued between self-refresh exit and re-entry. This ensures that enough refreshes occur in the case when the device narrowly missed a refresh on self-refresh exit.
To minimize the number of refresh operations and still maintain the device requirement, the current logic does the following: 1) The controller will still enter self-refresh from a refresh event, when the previous state was precharge power-down. However, the refresh itself will be bypassed and the controller will immediately issue a self-refresh entry. 2) On a self-refresh exit, the controller will immediately issue a refresh command (per the original logic). This ensures the device requirements are met and is a convenient way to kick off the command state machine.
Change-Id: I1c4b0dcbfa3bdafd755f3ccd65e267fcd700c491 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10102 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12704:4d2bcc64d469 |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move reference count stats update to blk invalidation
The tags in the cache keep track of the number of references to the blocks as well as the average number of references between an insertion and the next invalidation. Previously the stats were updated only on block insertion and invalidations were ignored. This change moves the update of the counters to the block invalidation function.
Change-Id: Ie7672c13813ec278a65232694024d2e5e17c4612 Reviewed-on: https://gem5-review.googlesource.com/10428 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12703:2d0e4d2d76b3 |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove isTouched field from the CacheBlk
At the moment isTouched is used in the warm-up detection mechanism but it keeps track of the same information as isValid(). This change removes it and substitutes its use by isValid().
Change-Id: I611ddf2fa4562ae3b3b2ed2fb74d26abd2e5ec62 Reviewed-on: https://gem5-review.googlesource.com/10427 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12702:27cb33a96e0f |
10-May-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Move replacements stat to the base cache class
Change-Id: I25dbcfcddfe1c422a76eb1af3f726c1360d8d110 Reviewed-on: https://gem5-review.googlesource.com/10426 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12700:c44381b89f9e |
30-Apr-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Simplify writeback for the tempBlock in recvTimingResp
When we use the tempBlock to fill-in, we have to write it back and invalidate it at the end of current transaction. This patch simplifies the writeback flow by treating it as a regular writeback.
Change-Id: I257be7bbff211e2832ad001a4e991daf67704485 Reviewed-on: https://gem5-review.googlesource.com/10421 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12697:cd71b966be1e |
27-Apr-2018 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
style: fix amd license and style issues
Change-Id: I26136fb49f743c4a597f8021cfd27f78897267b5 Reviewed-on: https://gem5-review.googlesource.com/10463 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12691:8e1371fde4be |
13-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create block insertion function
Create a block insertion function to be used when inserting blocks. This resets the number of references to 1 (the insertion is taken into account), sets the insertion tick, and sets the secure state.
Change-Id: Ifc34cbbd1c125207ce47912d188809221c7a157e Reviewed-on: https://gem5-review.googlesource.com/9824 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12687:f26377b7f0c1 |
03-May-2018 |
Brad Beckmann <brad.beckmann@amd.com> |
mem-ruby: Consistent dprintf formats for issue outcomes
Change-Id: I053fc42f0d5f678f8e3434b53a0f09e00fc3e345 Reviewed-on: https://gem5-review.googlesource.com/10221 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12685:dcf85db6ec5c |
23-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create Second-Chance replacement policy
Implementation of a Second-Chance replacement policy. Similar to FIFO, but every block is given a second chance if it has been touched.
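The policy can be illustrated with the textbook queue formulation of Second-Chance. This is a sketch of the general technique under assumptions, not the code added by this changeset: SecondChanceQueue, Entry and the method names are hypothetical, and the actual replacement policy may track insertion order differently.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical entry: an identifier plus a "touched since insertion" bit.
struct Entry {
    int id;
    bool touched;
};

class SecondChanceQueue {
  public:
    // New entries go to the back of the FIFO, untouched.
    void insert(int id) { fifo.push_back(Entry{id, false}); }

    // Mark an entry as referenced.
    void touch(int id) {
        for (Entry &e : fifo) {
            if (e.id == id)
                e.touched = true;
        }
    }

    // Evict like FIFO, except that a touched entry at the front has its
    // bit cleared and is moved to the back instead of being evicted.
    int evict() {
        assert(!fifo.empty());
        while (fifo.front().touched) {
            Entry e = fifo.front();
            fifo.pop_front();
            e.touched = false;
            fifo.push_back(e);
        }
        int victim = fifo.front().id;
        fifo.pop_front();
        return victim;
    }

    std::size_t size() const { return fifo.size(); }

  private:
    std::deque<Entry> fifo;
};
```

The loop in evict() always terminates because every entry it skips has its bit cleared, so after at most one pass over the queue the front entry is untouched.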
Change-Id: Id4d52b698d0045a4914a4d848fdf9c3c00a28508 Reviewed-on: https://gem5-review.googlesource.com/9441 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12684:44ebd2bc020f |
27-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: ReplacementPolicy specific replacement data
Replacement data is specific for each replacement policy, and thus should be instantiated differently by each policy.
Touch() and reset() do not need to be aware of CacheBlk, as they only update its ReplacementData.
Invalidate() makes replacement policies independent of cache blocks, by removing the awareness of the valid state.
An inheritable base ReplaceableEntry class was created to allow usage of replacement policies with any table-like structure.
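One way to picture this separation is sketched here, under the assumption that each entry owns an opaque data object created by the policy. The type and method names are illustrative (LfuSketch and count are hypothetical) and the signatures are not taken from the patch.

```cpp
#include <cassert>
#include <memory>

// Opaque per-entry data; each policy derives its own concrete type.
struct ReplacementData {
    virtual ~ReplacementData() = default;
};

// Any table-like structure can hold entries of this shape.
struct ReplaceableEntry {
    std::shared_ptr<ReplacementData> replacementData;
};

// Hypothetical frequency-counting policy that only sees its own data.
class LfuSketch {
  public:
    struct Data : ReplacementData {
        unsigned refCount = 0;
    };

    // The policy decides what data an entry carries.
    std::shared_ptr<ReplacementData> instantiateEntry() const {
        return std::make_shared<Data>();
    }

    void reset(ReplacementData &d) const { cast(d).refCount = 1; }
    void touch(ReplacementData &d) const { ++cast(d).refCount; }
    void invalidate(ReplacementData &d) const { cast(d).refCount = 0; }

    unsigned count(const ReplacementData &d) const {
        return static_cast<const Data &>(d).refCount;
    }

  private:
    static Data &cast(ReplacementData &d) { return static_cast<Data &>(d); }
};
```

Because the policy never inspects the entry itself, the same policy could serve a cache, a TLB-like table, or any other structure built from such entries.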
Change-Id: I998917d800fa48504ed95abffa2f1b7bfd68522b Reviewed-on: https://gem5-review.googlesource.com/9421 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12680:91f4d6668b4f |
04-Apr-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
sim,cpu,mem,arch: Introduced MasterInfo data structure
With this patch a gem5 System will store more info about its Masters. While it was previously keeping track of the Master name and Master ID only, it is now adding a per-Master pointer to the SimObject related to the Master. This will make it possible for a client to query a System for a Master using either the master's name or the master's pointer.
Change-Id: I8b97d328a65cd06f329e2cdd3679451c17d2b8f6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9781 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12679:6c416cb3ca06 |
25-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use block iteration in BaseSetAssoc
Use block iteration instead of numSets and assoc in print(), cleanupRefs() and computeStats().
This makes these functions rely solely on what they are used for: printing and calculating stats of blocks. With the addition of Sectors an extra indirection level is added, and thus these functions would be skipping blocks.
Change-Id: I0006f82736cce02ba3e501ffafe9236f748daf32 Reviewed-on: https://gem5-review.googlesource.com/10143 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12677:dd0af90f2e05 |
18-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use findBlock in FALRU's block access
An access must perform a block search, which is done by findBlock.
The tagHash is indexed by tags, so use extractTag instead of re-implementing its functionality.
Change-Id: Ib5abacbc65cddf0f2d7e4440eb5355b56998a585 Reviewed-on: https://gem5-review.googlesource.com/10082 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12676:d0a1f557c156 |
19-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use secure flag in FALRU's findBlock
FALRU's findBlock() must use the secure flag to ensure proper functionality.
Change-Id: I54e9fbd3c9093b3e8043c4c6c850b74a8f1f5ec0 Reviewed-on: https://gem5-review.googlesource.com/10081 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12672:4e1c5ce90fcd |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create NRU Replacement Policy
Implementation of a Not Recently Used replacement policy.
Change-Id: I24ab3a6f1db6dcb756b869cfebb5c4bc544170e8 Reviewed-on: https://gem5-review.googlesource.com/9001 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12665:4ca9fc117b95 |
12-Apr-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Revamp multiple size tracking for FALRU caches
This change fixes a few bugs and refactors the mechanism by which caches that use the FALRU tags can output statistics for multiple cache sizes ranging from the minimum cache of interest up to the actual configured cache size.
Change-Id: Ibea029cf275a8c068c26eceeb06c761fc53aede2 Reviewed-on: https://gem5-review.googlesource.com/9826 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12662:bcda9eb2aef5 |
16-Apr-2018 |
John Alsop <johnathan.alsop@amd.com> |
mem-ruby: enable DPRINTFN calls in slicc for temporary debug printing
Change-Id: Ib92f8bb4ab7b61ebc96b935cb8abc42cf5ec6ac8 Reviewed-on: https://gem5-review.googlesource.com/9921 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12652:bae1a1865204 |
06-Mar-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Add a helper function to get a word of variable length
There are many devices that need to handle reads/writes of different word sizes. A common pattern is a switch statement that checks for the size of a packet and then calls the corresponding Packet::(get|set)<uintXX_t> methods. Simplify this by implementing Packet::(get|set)UintX helper functions.
The getter reads a word of the size specified in the packet and the specified endianness. The word is then zero-extended to 64 bits. Conversely, the setter truncates the word down to the size required in the packet and then byte-swaps it to the desired endianness.
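The getter and setter semantics described above can be sketched with free functions over a raw byte buffer. This is an assumption-laden illustration: the real helpers are Packet members that take the size from the packet, whereas getUintXSketch and setUintXSketch are hypothetical names that take the buffer and size explicitly and accept any size from 1 to 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ByteOrder { little, big };

// Read `size` bytes in the given byte order, zero-extended to 64 bits.
inline std::uint64_t getUintXSketch(const std::uint8_t *buf, std::size_t size,
                                    ByteOrder order)
{
    assert(size >= 1 && size <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        // Walk from the most significant byte to the least significant one.
        std::size_t idx = (order == ByteOrder::little) ? size - 1 - i : i;
        value = (value << 8) | buf[idx];
    }
    return value;
}

// Truncate `value` to `size` bytes and store it in the given byte order.
inline void setUintXSketch(std::uint8_t *buf, std::size_t size,
                           ByteOrder order, std::uint64_t value)
{
    assert(size >= 1 && size <= 8);
    for (std::size_t i = 0; i < size; ++i) {
        // i counts bytes from the least significant one upward.
        std::size_t idx = (order == ByteOrder::little) ? i : size - 1 - i;
        buf[idx] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}
```

A device model could then call one helper with the access size instead of switching over 1, 2, 4 and 8 byte cases.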
Change-Id: I2f0c27fe3903abf3859bea13b07c7f5f0fb0809f Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9761 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12651:1c704ca0944a |
20-Feb-2018 |
Brandon Potter <brandon.potter@amd.com> |
ruby,gpu-compute: bugfix for GPU_VIPER* protocols
12db50c895 changed how directory mapping works, but it seems to have broken the VIPER variants of the GPU protocols. The fix involves declaring the function in the related '.sm' files.
Change-Id: I116980d42a4aa648369058b529c9f8d9693eb894 Reviewed-on: https://gem5-review.googlesource.com/8521 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12650:afa8b777a821 |
16-Feb-2018 |
Brandon Potter <brandon.potter@amd.com> |
ruby: bugfix for MESI_Three_Level protocol
Since a3177645, the MESI_Three_Level protocol does not build. This changeset addresses the problem by adding the L0Cache machine type to the static machine type declaration in Ruby's export file.
Change-Id: I6327547fcb34595619caeb73932c0032f5f65c9f Reviewed-on: https://gem5-review.googlesource.com/8383 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12649:a7ae239df810 |
13-Apr-2018 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
mem-ruby: fix more style issues in AMD licenses
Change-Id: I6585c5664d966989991f61303548aed634cf298a Reviewed-on: https://gem5-review.googlesource.com/9841 Reviewed-by: Michael LeBeane <Michael.Lebeane@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12648:78941f188bb3 |
27-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Add MoveToTail to FALRU
FALRU was missing MoveToTail functionality within its invalidate function, and MoveToHead was doing unnecessary passes when the moved block was the head already.
Besides, added some comments to make the code understandable.
Change-Id: I2430d82b5d53c88b102a62610ea38b46d6e03a55 Reviewed-on: https://gem5-review.googlesource.com/9541 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12647:6d7e2f321496 |
12-Apr-2018 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
configs, mem-ruby: fix issues with style in AMD license
Fixes line length and whitespace issues.
Change-Id: Ia04a91ec68cae2bcdabeb93bb1a0f74e8e5486c3 Reviewed-on: https://gem5-review.googlesource.com/9801 Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Maintainer: Bradford Beckmann <brad.beckmann@amd.com> |
12637:bfc3cb9c7e6c |
30-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Remove unused 'using namespace'
Removal of unused/barely used 'using namespace' from C++ files.
Change-Id: I66dc548c04506db2e41180b9ea7ab5abd7d5375a Reviewed-on: https://gem5-review.googlesource.com/9601 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12636:9859213e2662 |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Move insertBlock functionality in FALRU
Block insertion is being done in the getCandidates function, while the insertBlock function does not do anything.
Besides, BaseTags' stats weren't being updated.
Change-Id: Iadab9c1ea61519214f66fa24c4b91c4fc95604c0 Reviewed-on: https://gem5-review.googlesource.com/8882 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12635:3abc52e4b4f3 |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create LIP Replacement Policy
Implementation of an LRU Insertion Policy replacement policy.
Change-Id: I1a9aa0091ff2cdc1b1652c1d5ec7a3b33fba5b44 Reviewed-on: https://gem5-review.googlesource.com/9002 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12634:e69074a3c9b9 |
11-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BIP Replacement Policy
Implementation of a Bimodal Insertion Policy replacement policy.
Change-Id: Ife058d0d4310dbcb35858348006189f0b2bf7c37 Reviewed-on: https://gem5-review.googlesource.com/9003 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12633:675cd1260b40 |
04-Apr-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use Packet functions to write data blocks
Instead of using raw memcpy, use the proper writer functions from the Packet class in Cache.
Fixed typos in comments of these functions.
Change-Id: I156a00989c6cbaa73763349006a37a18243d6ed4 Reviewed-on: https://gem5-review.googlesource.com/9661 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12630:2208bf99bffd |
05-Feb-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unused return value from the recvTimingReq func
The recvTimingReq function in the cache always returns true. This changeset removes the return value.
Change-Id: I00dddca65ee7224ecfa579ea5195c841dac02972 Reviewed-on: https://gem5-review.googlesource.com/8289 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12629:c17d4dc2379e |
22-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix FALRU data block seg fault
FALRU didn't initialize the blocks' data, causing seg faults. This patch does not make FALRU functional yet.
Change-Id: I10cbcf5afc3f8bc357eeb8b7cb46789dec47ba8b Reviewed-on: https://gem5-review.googlesource.com/9302 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12628:458d655f2abb |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create LFU replacement policy
Implementation of a Least Frequently Used replacement policy.
Change-Id: I772afccd3a7955777e53d59341e922718db44e5c Reviewed-on: https://gem5-review.googlesource.com/8890 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12627:33d3bb6f19a5 |
12-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create RRIP Replacement Policy
Implementation of a Re-Reference Interval Prediction replacement policy.
Change-Id: Iba716eb5df2bf2be156e765f889d94f6ad00c91b Reviewed-on: https://gem5-review.googlesource.com/8981 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12626:e161d7725d4b |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create BRRIP replacement policy
Implementation of a Bimodal Re-Reference Interval Prediction replacement policy.
Change-Id: I25d4a59a60ef7ac496c66852e394fd6cbaf50912 Reviewed-on: https://gem5-review.googlesource.com/8891 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12613:40c18bb90501 |
23-Mar-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: fix missing overrides in repl policies
Change-Id: I67759a4532e8a46c1643d4c3a9c546ad6b565b81 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/9321 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12612:a64e6b723e5f |
27-Jul-2017 |
Jason Lowe-Power <jason@lowepower.com> |
ruby: Make sure addresses print in hex
Added fix in the invalid transition panic and various places in ruby random tester.
Change-Id: I879264da58369faf7de49d1a28b2da1cb935ef0a Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/8941 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12607:b1cc6815194e |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create FIFO replacement policy
Implementation of a First-In, First-Out replacement policy.
Change-Id: Id234ec9d29c092dd4516e609da14b8a75a96b5e4 Reviewed-on: https://gem5-review.googlesource.com/8888 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12606:3bb0c54096e8 |
23-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix MRU rebase
Rebase of MRU missed a const qualifier, introducing a compilation error.
Change-Id: Ia25aa30523613a1a87593a353abe439946656f63 Reviewed-on: https://gem5-review.googlesource.com/9301 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12601:21a10e7b578a |
09-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Create MRU replacement policy
Implementation of a Most Recently Used replacement policy.
Change-Id: Id52cb247ca25d4523dcc53490d113695dac6a3f1 Reviewed-on: https://gem5-review.googlesource.com/8889 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12600:e670dd17c8cf |
19-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Split array indexing and replacement policies.
Replacement policies (LRU, Random) are currently considered as array indexing methods, but have completely different functionalities:
- Array indexers determine the possible locations for block allocation. This information is used to generate replacement candidates when conflicts happen.
- Replacement policies determine which of the replacement candidates should be evicted to make room for new allocations.
For this reason, they were split into different classes. Advantages:
- Easier and more straightforward to implement other replacement policies (RRIP, LFU, ARC, ...)
- Allow easier future implementation of cache organization schemes
Since we can no longer guarantee the use of sets, the previous way to create a true LRU is not viable. Now a timestamp_bits parameter controls how many bits are dedicated to the timestamp, and a true LRU can be achieved through an infinite number of bits (although a few bits suffice in practice).
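The division of labour between indexing and replacement might look like the sketch below. All names (BlockSketch, SetAssocIndexerSketch, LruPolicySketch) are hypothetical, the timestamp is a plain 64-bit value, and the timestamp_bits parameter is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct BlockSketch {
    std::uint64_t lastTouch;  // replacement data consumed by the policy
};

// Indexing: which block slots may hold a given address.
struct SetAssocIndexerSketch {
    std::size_t numSets;
    std::size_t assoc;
    std::size_t blockSize;

    std::vector<std::size_t> candidates(std::uint64_t addr) const {
        std::size_t set = (addr / blockSize) % numSets;
        std::vector<std::size_t> slots;
        for (std::size_t way = 0; way < assoc; ++way)
            slots.push_back(set * assoc + way);
        return slots;
    }
};

// Replacement: which candidate to evict. LRU picks the oldest timestamp.
struct LruPolicySketch {
    std::size_t victim(const std::vector<BlockSketch> &blocks,
                       const std::vector<std::size_t> &slots) const {
        assert(!slots.empty());
        std::size_t best = slots.front();
        for (std::size_t s : slots) {
            if (blocks[s].lastTouch < blocks[best].lastTouch)
                best = s;
        }
        return best;
    }
};
```

Because the policy only receives a candidate list, a different indexer (for example a skewed one) could be swapped in without touching the policy.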
Change-Id: I23750db121f1474d17831137e6ff618beb2b3eda Reviewed-on: https://gem5-review.googlesource.com/8501 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12599:43ade6cf92b7 |
12-Mar-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Allow clean operations when block allocation fails
Block allocation can fail when there is an in-service MSHR that operates on the victim block. This can happen due to:
* an upgrade operation: a request that needs a writable copy of the block finds a shared (non-writable) copy of the block in the cache and has allocated an MSHR for the pending upgrade operation, or
* a clean operation: a clean request finds a dirty copy of the block and allocates an MSHR for the pending clean operation.
This change relaxes an assertion to allow for the second case (cache clean operations).
Change-Id: Ib51482160b5f2b3702ed744b0eac2029d34bc9d4 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9021 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12578:8196ae6fceba |
05-Mar-2018 |
Rico Amslinger <rico.amslinger@informatik.uni-augsburg.de> |
mem-ruby: Fix RubyPrefetcher support in MESI_Two_Level
Only a small quantity of prefetches were issued, as the positive feedback mechanism was not implemented. This commit adds a new action po_observeHit, which notifies the RubyPrefetcher of successful prefetches and resets the prefetch flag.
When a cache line was replaced by a prefetch, the wrong queue could be stalled. This commit adds a new event PF_L1_Replacement, which stalls the correct queue.
The behavior when receiving a prefetch or instruction fetch while in PF_IS_I (prefetch caused GETs, but got invalidated before the response was received) was undefined. This was changed to drop the prefetch request or change the state to non-prefetch, respectively. This behavior is analogous to IS_I (non-prefetch caused GETs, but got invalidated before the response was received) and the data case, respectively.
In my local branch a major (20+%) performance increase can be observed in SPEC2006 gobmk and leslie3d when enabling the prefetcher. Some other benchmarks like bwaves, GemsFDTD, sphinx and wrf show smaller (~10%) performance increases. Unfortunately, the performance in most other SPEC benchmarks is still poor, most likely because the prefetcher does not detect strides fast/often enough. To push the change in a timely manner after checkout (most benchmarks have runtimes on the order of days on my machine even with the smallest parameters), I have only run gobmk with the base repository + this commit. The results match those of my local branch.
Change-Id: I9903a2fcd02060ea5e619b409f31f7d6fac47ae8 Reviewed-on: https://gem5-review.googlesource.com/8801 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Swapnil Haria <swapnilster@gmail.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12574:22936e2eb2da |
06-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use CacheBlk parameter on address regeneration
Skewed caches need to know the way to regenerate a block address.
Change-Id: I62c61ac9509eff2f37bad36862751956db7a6e40 Reviewed-on: https://gem5-review.googlesource.com/8782 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12573:b69e74b5baba |
08-Mar-2018 |
Jason Lowe-Power <jason@lowepower.com> |
mem-cache: Fix missing overrides
clang doesn't like inconsistent overrides. Add override to all overridden functions in lru.hh
Change-Id: I100ff4a7d90757439afee879ff9838c15f5c0b1d Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/8861 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12567:fef8623b1796 |
28-Jun-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Make the block invalidate functions virtual
This change makes the cache block invalidation functions in the BaseTags and CacheBlk classes virtual so that derived classes can override them.
Change-Id: I2e64b01c6ca637f16d10474fc8b08eeec3f23453 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8287 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12566:d6d48df9bf0f |
31-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Make invalidate a common function between tag classes
invalidate was defined as a separate function in the base associative and fully-associative tags classes although both functions should implement identical functionality. This patch moves the invalidate function into the base tags class.
Change-Id: I206ee969b00ab9e05873c6d87531474fcd712907 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8286 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12565:950ef64cb0a8 |
17-Jan-2018 |
Xiaoyu Ma <xiaoyuma@google.com> |
mem-cache: Allow prefetchers to override setCache.
This lets them hook setCache, perhaps to set up additional state based on the set cache.
Change-Id: Ic3b34fa43d052c71e8ef733a57fe47c70899cd27 Reviewed-on: https://gem5-review.googlesource.com/8701 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12563:8d59ed22ae79 |
06-Mar-2018 |
Gabe Black <gabeblack@google.com> |
scons: Switch from the print statement to the print function.
Starting with version 3, scons imposes using the print function instead of the print statement in code it processes. To get things building again, this change moves all python code within gem5 to use the function version. Another change by another author separately made this same change to the site_tools and site_init.py files.
Change-Id: I2de7dc3b1be756baad6f60574c47c8b7e80ea3b0 Reviewed-on: https://gem5-review.googlesource.com/8761 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12557:16b682f1d8a2 |
05-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix bug generated by 8282
Merge 1ae7fced4d32898531a6875a339ef00e43e20e66 generated a bug in tagsInUse calculation.
Change-Id: I079e327a0a26a7968f2ed8e433dd6e790c80998b Reviewed-on: https://gem5-review.googlesource.com/8781 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12556:522b57ee9abf |
07-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Populate whenReady for blocks filled from writebacks
Writebacks write data to either an existing block or a newly allocated block. In either case we need to populate the whenReady field of the block which will determine when the new value can be used.
Change-Id: I5788fad0b8086a1be96714639bf6a9470b334926 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8285 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12555:4ecdaa830686 |
05-Mar-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Use findBlock() in accessBlock()
Use placement policy specific block search within generic access.
Change-Id: I6070035e6e00595bcf073d4011f78a55ba7e7a8a Reviewed-on: https://gem5-review.googlesource.com/8721 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12554:86264baddf36 |
02-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove redundant block initialization on allocation
Change-Id: I7496e12e6a517529316c480d5f6e2ade601f0e2d Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8282 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12553:514f2e4fb751 |
31-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove redundant numBlocks initialization from FALRU
Change-Id: Id3afec0a62446d6d0f44ccb655032343037637e0 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8281 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12552:5615a3de961f |
22-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Populate the secure bit when the temp block is filled
The secure bit should be set when we fill a block with data from a secure location, as indicated by the packet that triggers the fill. This patch fixes a bug in which the cache wouldn't populate the secure bit when filling the temp block.
Change-Id: I95c706146449804ff42b205b25dd79750f3e882a Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8284 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12551:a5016c69f510 |
02-Nov-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Remove unnecessary block initialization on writeback
Change-Id: Ia9b825bcbb8d326705f74c15a93a88703153ba5a Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8283 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> |
12549:d3e5cfe631fc |
27-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra block init in BaseSetAssoc
Removed extra initialization of cache blocks just after they have been created and organized the comments.
Change-Id: I75c1beaf0489e3e530fd8cbff2739dc7593e3e6f Reviewed-on: https://gem5-review.googlesource.com/8661 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12548:285f1792a2da |
26-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Vectorize C arrays in BaseSetAssoc.
Transform BaseSetAssoc's arrays into C++ vectors to avoid unnecessary resource management.
Change-Id: I656f42f29e5f9589eba491b410ca1df5a64f2f34 Reviewed-on: https://gem5-review.googlesource.com/8621 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12545:13eaf39f933b |
23-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Fix CacheSet memory leak
CacheSet blocks were being allocated but never freed. Used vector to avoid using pure C array.
Change-Id: I6f32fa5a305ff4e1d7602535026c1396764102ed Reviewed-on: https://gem5-review.googlesource.com/8603 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12536:0a1d2ced2d4c |
19-Feb-2018 |
Brandon Potter <brandon.potter@amd.com> |
mem: fix page_table bug for .fast build
Since b8b13206c8, the '.fast' build has failed to compile with an error caused by a variable and an assert.
As a reminder, assert macros are optimized out of the build for '.fast'. If an assert check requires a variable that is unused anywhere else in the code, the compiler complains that the variable is unused and the scons build fails. The solution is to add an M5_VAR_USED specifier to tell the compiler to ignore the variable.
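The pattern can be illustrated with a stand-in macro. SKETCH_VAR_USED and removeMappingSketch are hypothetical names used only for this example, and the expansion to a GCC/Clang attribute is an assumption about how such a specifier is typically defined.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-in for a "variable is intentionally unused" specifier (assumption).
#if defined(__GNUC__)
#  define SKETCH_VAR_USED __attribute__((unused))
#else
#  define SKETCH_VAR_USED
#endif

// Erase a key and return the number of mappings that remain.
inline std::size_t removeMappingSketch(std::map<int, int> &table, int key)
{
    // `erased` is read only by the assert. When asserts are compiled out
    // (NDEBUG), nothing else reads it, so the specifier keeps an
    // unused-variable warning from breaking a warnings-as-errors build.
    SKETCH_VAR_USED std::size_t erased = table.erase(key);
    assert(erased == 1);
    return table.size();
}
```

The call to erase still runs in every build; only the check on its result disappears when asserts are disabled.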
Change-Id: I38f6bbed1e4c0506c5bbc1206c21f1f7e3d8dfe6 Reviewed-on: https://gem5-review.googlesource.com/8462 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12532:a86ce386add1 |
13-Feb-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Refactor port proxies to support secure accesses
The current physical port proxy doesn't know how to tag memory accesses as secure. Refactor the class slightly to create a set of methods (readBlobPhys, writeBlobPhys, memsetBlobPhys) that always access physical memory and take a set of Request::Flags as an argument. The new port proxy, SecurePortProxy, uses this interface to issue secure physical accesses.
Change-Id: I8232a4b35025be04ec8f91a00f0580266bacb338 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8364 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12522:463b7803e8dd |
08-Feb-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Add PortProxy read/write helper with explicit endianness
Change-Id: Ia9a11ca68b2892dafd02f2c37324b99b35c77d34 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jack Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8146 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12519:1fcc0d0a8f91 |
13-Feb-2018 |
Rico Amslinger <rico.amslinger@informatik.uni-augsburg.de> |
mem, sim-se: Fixed seg-fault in EmulationPageTable::remap
When moving a memory region the target region should be unmapped. The assertion does reflect this, but the following line accesses the invalid pointer regardless. This commit replaces the pointer access with an emplace.
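A simplified picture of the fix, assuming the page table is a hash map keyed by virtual address. PageEntrySketch, PageTableSketch and remapPageSketch are hypothetical names and do not reproduce the actual EmulationPageTable code.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct PageEntrySketch {
    std::uint64_t paddr;
};

using PageTableSketch = std::unordered_map<std::uint64_t, PageEntrySketch>;

// Move the mapping at oldVaddr to newVaddr.
inline void remapPageSketch(PageTableSketch &table, std::uint64_t oldVaddr,
                            std::uint64_t newVaddr)
{
    auto oldIt = table.find(oldVaddr);
    assert(oldIt != table.end());                 // source must be mapped
    assert(table.find(newVaddr) == table.end());  // target must be unmapped

    // The target has no entry yet, so dereferencing a lookup result for
    // newVaddr would touch an invalid iterator. Create the entry instead.
    PageEntrySketch moved = oldIt->second;
    table.erase(oldIt);
    table.emplace(newVaddr, moved);
}
```

The entry is copied before the erase so that no iterator into the map is used after the map has been modified.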
Change-Id: I85f9be4e6c223eab447c75043e593ed3f90017e1 Reviewed-on: https://gem5-review.googlesource.com/8261 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12516:483fc7339fb1 |
09-Feb-2018 |
Wendy Elsasser <wendy.elsasser@arm.com> |
Fix DDR4_2400_8x8 DRAMCTRL configuration
Change-Id: I7af361e146909acc158590354ab22732d4b2f3d5 Signed-off-by: Wendy Elsasser <wendy.elsasser@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/8101 Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12513:4dfc54394b5a |
07-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Make cache warmup percentage a parameter.
The warmupPercentage is the percentage of different tags (based on the cache size) that need to be touched in order to warm up the cache. If warmup fails (i.e., not enough tags are touched), warmup_cycle = 0.
The warmup is not being taken into account to calculate the stats (i.e., stats acquisition starts before cache is warmed up). Maybe in the future this functionality should be added.
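The threshold logic can be sketched as below. WarmupTrackerSketch is a hypothetical class; the integer rounding of the bound and the way first-time tag touches are reported are assumptions, not details taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class WarmupTrackerSketch {
  public:
    WarmupTrackerSketch(std::size_t numBlocks, unsigned warmupPercentage)
        : bound(numBlocks * warmupPercentage / 100),
          tagsInUse(0), warmedUp(false), warmupCycle(0) {}

    // Call when a tag is used for the first time, passing the current tick.
    void tagTouched(std::uint64_t now) {
        ++tagsInUse;
        if (!warmedUp && tagsInUse >= bound) {
            warmedUp = true;
            warmupCycle = now;  // recorded once, when the bound is reached
        }
    }

    // Stays 0 if the bound was never reached.
    std::uint64_t cycle() const { return warmupCycle; }

  private:
    std::size_t bound;          // number of tags needed to count as warm
    std::size_t tagsInUse;      // distinct tags touched so far
    bool warmedUp;
    std::uint64_t warmupCycle;
};
```

For an 8-block cache with a 50% setting, the fourth distinct tag marks the cache as warm and later touches leave the recorded cycle unchanged.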
Change-Id: I2b93a99c19fddb99a4c60e6d4293fa355744d05e Reviewed-on: https://gem5-review.googlesource.com/8061 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12501:42537a80ef17 |
19-Dec-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Only pendingModified MSHRs can satisfy CMO snoops
We set the satisfied flag when a cache clean request encounters: 1) a block with the dirty bit set, or 2) a pending modified MSHR, which means that the cache will get a copy of the block that will soon be modified.
This changeset fixes a bug that set the satisfied flag on snooping MSHR hits even when the pendingModified flag was not set.
Change-Id: I4968c4820997be5cc1238148eea12a1ba39837d4 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7822 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12500:a91cf4e8b6a4 |
14-Dec-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Cleaned blocks should be marked as not writable
A writeclean packet writes a dirty block to the memory below and therefore sets the dirty flag for the block when the memory below is a cache. If the block was also marked as writable it can satisfy future write requests without further requests/snoops. This can lead to multiple copies of the same block marked as dirty which is not allowed. This changeset clears the writable flag from the cleaned block to prevent the cache from satisfying future write requests without sending a downstream request.
Change-Id: I14d3c62fd33f81b1a8ba62374c8565ccab00a6fe Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7821 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12493:a1cf71a6de73 |
06-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem-cache: Remove extra numSets zero check.
numSets is unsigned, so it cannot be lower than 0. Besides, isPowerOf2(0) is false by definition (and implementation*), so there is no need for the double check.
* As presented in base/intmath.hh
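The reasoning can be checked with a small stand-in. isPowerOf2Sketch below is an assumption written for illustration, not the helper from base/intmath.hh.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-in for a power-of-two test; zero is rejected.
constexpr bool isPowerOf2Sketch(std::uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

// With an unsigned numSets, a separate zero check adds nothing:
// the power-of-two test already fails for 0.
inline bool validNumSetsSketch(unsigned numSets)
{
    return isPowerOf2Sketch(numSets);
}
```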
Change-Id: I3f6296694a937434feddc7ed21f11c2a6fdfc5a9 Reviewed-on: https://gem5-review.googlesource.com/7901 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12492:4e76959883a6 |
05-Feb-2018 |
Daniel R. Carvalho <odanrc@yahoo.com.br> |
mem: Standardize mem folder header guards
Standardize all header guards in the mem directory according to the most frequent patterns. In general they have the form:
mem: __FOLDER_TREE_FILE_NAME_HH__
ruby: __FOLDER_TREE_FILENAME_HH__
Change-Id: I983853e292deb302becf151bf0e970057dc24774 Reviewed-on: https://gem5-review.googlesource.com/7881 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12461:a4cb506cda74 |
09-Jan-2018 |
Gabe Black <gabeblack@google.com> |
arch, mem: Abstract the data stored in the SE page tables.
Rather than store the actual TLB entry that corresponds to a mapping, we can just store some abstracted information (address, a few flags) and then let the caller turn that into the appropriate entry. There could potentially be some small amount of overhead from creating entries vs. storing them and just installing them, but it's likely pretty minimal since that only happens on a TLB miss (ideally rare), and, if it is problematic, there could be some preallocated TLB entries which are just minimally filled in as necessary.
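As a rough sketch of the idea, the page table could keep only an ISA-neutral record and let the caller build the TLB entry on a miss. All type and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical ISA-neutral record kept by the page table.
struct PageMappingSketch {
    std::uint64_t paddr;
    bool uncacheable;
    bool readOnly;
};

// Hypothetical ISA-specific TLB entry, built on demand by the caller.
struct TlbEntrySketch {
    std::uint64_t vaddr;
    std::uint64_t paddr;
    bool uncacheable;
    bool writable;
};

// Called on a TLB miss: turn the abstract mapping into a real entry.
inline TlbEntrySketch
makeTlbEntrySketch(std::uint64_t vaddr, const PageMappingSketch &m)
{
    return TlbEntrySketch{vaddr, m.paddr, m.uncacheable, !m.readOnly};
}
```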
This has the nice effect of finally making the page tables ISA agnostic.
Change-Id: I11e630f60682f0a0029b0683eb8ff0135fbd4317 Reviewed-on: https://gem5-review.googlesource.com/7350 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12460:0f221912b014 |
08-Jan-2018 |
Gabe Black <gabeblack@google.com> |
x86, mem: Rewrite the multilevel page table class.
The new version extracts all the x86 specific aspects of the class, and builds the interface around a variable collection of template arguments which are classes that represent the different levels of the page table. The multilevel page table class is now much more ISA independent.
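The "variable collection of template arguments" can be pictured with a recursive variadic template. The sketch below is hypothetical: it only sums the index bits of each level and does not reflect the actual class interface.

```cpp
#include <cassert>

// Hypothetical descriptor for one level: how many address bits index it.
template <unsigned Bits>
struct LevelSketch {
    static constexpr unsigned bits = Bits;
};

// Recursive variadic template: one step per page table level.
template <typename... Levels>
struct TableLayoutSketch;

template <>
struct TableLayoutSketch<> {
    static constexpr unsigned indexBits = 0;
};

template <typename First, typename... Rest>
struct TableLayoutSketch<First, Rest...> {
    static constexpr unsigned indexBits =
        First::bits + TableLayoutSketch<Rest...>::indexBits;
};

// Four 9-bit levels, as in a 4-level x86-64 style layout.
using FourLevelSketch = TableLayoutSketch<
    LevelSketch<9>, LevelSketch<9>, LevelSketch<9>, LevelSketch<9>>;
static_assert(FourLevelSketch::indexBits == 36, "4 levels x 9 bits");
```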
Change-Id: Id42e168a78d0e70f80ab2438480cb6e00a3aa636 Reviewed-on: https://gem5-review.googlesource.com/7347 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12458:8de44b407db4 |
08-Jan-2018 |
Gabe Black <gabeblack@google.com> |
x86, mem: Don't try to force physical addresses on the system.
Use the system object to allocate physical memory instead of manually placing certain structures and then forcing the system to start other allocations after them in physical memory.
Change-Id: Ie18c81645c3b648c64a6d7a649a0e50f7028f344 Reviewed-on: https://gem5-review.googlesource.com/7346 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> |
12457:b9b7bdb5a8ac |
06-Jan-2018 |
Gabe Black <gabeblack@google.com> |
x86, mem: Get rid of PageTableOps::getBasePtr.
Pass this constant into the page table constructor.
Change-Id: Icbf730f18d9dfcfebd10a196f7f799514728b0fb Reviewed-on: https://gem5-review.googlesource.com/7345 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> |
12456:9d042ae9dd5b |
05-Jan-2018 |
Gabe Black <gabeblack@google.com> |
x86, mem: Pass the multi level page table layout in as a parameter.
Don't get it from a global constant declared in an ISA header file.
Change-Id: Ie19440abdd76500a5e12e6791e6f755ad9e95af3 Reviewed-on: https://gem5-review.googlesource.com/7344 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12455:c88f0b37f433 |
05-Jan-2018 |
Gabe Black <gabeblack@google.com> |
arch, mem: Make the page table lookup function return a pointer.
This avoids having a copy in the lookup function itself, and the declaration of a lot of temporary TLB entry pointers in callers. The gpu TLB seems to have had the most dependence on the original signature of the lookup function, partially because it was relying on a somewhat unsafe copy to a TLB entry using a base class pointer type.
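A pointer-returning lookup might look like the following sketch. PageTableSketch and EntrySketch are hypothetical and keyed by virtual page number for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct EntrySketch {
    std::uint64_t paddr;
};

class PageTableSketch
{
  public:
    void map(std::uint64_t vpage, std::uint64_t paddr)
    {
        table[vpage] = EntrySketch{paddr};
    }

    // Returns a pointer into the table, or nullptr on a miss.
    // No entry is copied, and callers need no temporary of their own.
    const EntrySketch *lookup(std::uint64_t vpage) const
    {
        auto it = table.find(vpage);
        return it == table.end() ? nullptr : &it->second;
    }

  private:
    std::unordered_map<std::uint64_t, EntrySketch> table;
};
```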
Change-Id: I8b1cf494468163deee000002d243541657faf57f Reviewed-on: https://gem5-review.googlesource.com/7343 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12448:b299e560f1d8 |
04-Jan-2018 |
Gabe Black <gabeblack@google.com> |
arch, mem, sim: Consolidate and rename the SE mode page table classes.
Now that nothing inherits from PageTableBase directly, it can be merged into FuncPageTable. This change also takes the opportunity to rename the combined class to EmulationPageTable, which lets you know that it's specifically for SE mode.
Also remove the page table entry cache since it doesn't seem to actually improve performance. The TLBs likely absorb the majority of the locality, essentially acting like a cache like they would in real hardware.
Change-Id: If1bcb91aed08686603bf7bee37298c0eee826e13 Reviewed-on: https://gem5-review.googlesource.com/7342 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12446:c4ad23df356a |
04-Jan-2018 |
Gabe Black <gabeblack@google.com> |
mem: Change the multilevel page table to inherit from FuncPageTable.
KVM looks up translations using the image of the page table in the guest's memory, but we don't have to. Maintaining that image in addition to, rather than instead of, an abstract copy makes our lookups faster and, ironically, avoids duplicate implementation.
Change-Id: I9ff4cae6f7cf4027c3738b75f74eae50dde2fda1 Reviewed-on: https://gem5-review.googlesource.com/7341 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12442:e003b72b46ac |
22-Dec-2017 |
Gabe Black <gabeblack@google.com> |
mem: Track TLB entries in the lookup cache as pointers.
Using the architectural page table on x86 and the functional page table on ARM, both with the twolf benchmark in SE mode, there was no performance penalty for doing so, and again possibly a performance improvement. By using a pointer instead of an inline instance, it's possible for the actual type of the TLB entry to be hidden somewhat, taking a step towards abstracting away another aspect of the ISAs.
Since the TLB entries are no longer overwritten and now need to be allocated and freed, this change introduces return types from the updateCache and eraseCacheEntry functions. These functions will return the pointer to any entry which has been displaced from the cache which the caller can either free or ignore, depending on whether the entry has a purpose outside of the cache.
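The ownership convention can be sketched with a one-slot cache. This is an assumption-laden illustration: LookupCacheSketch, update and erase are hypothetical stand-ins for updateCache and eraseCacheEntry, whose real signatures are not shown here.

```cpp
#include <cassert>
#include <cstdint>

struct CachedEntrySketch {
    std::uint64_t paddr;
};

// One-slot lookup cache that holds a pointer rather than an inline copy.
class LookupCacheSketch
{
  public:
    // Installs entry and returns the pointer it displaced, if any.
    // The caller decides whether to delete or ignore the result.
    CachedEntrySketch *update(std::uint64_t vpage, CachedEntrySketch *entry)
    {
        CachedEntrySketch *displaced = (slot == entry) ? nullptr : slot;
        tag = vpage;
        slot = entry;
        return displaced;
    }

    // Removes the entry for vpage and hands it back, or returns nullptr.
    CachedEntrySketch *erase(std::uint64_t vpage)
    {
        if (slot == nullptr || tag != vpage)
            return nullptr;
        CachedEntrySketch *removed = slot;
        slot = nullptr;
        return removed;
    }

  private:
    std::uint64_t tag = 0;
    CachedEntrySketch *slot = nullptr;
};
```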
Because the functional page table stores its entries over a longer time period, it will generally not delete the pointer returned from those functions. The "architectural" page table, i.e., the one which is backed by memory, doesn't have any other use for the TlbEntrys and will delete them. That leads to more news and deletes than there used to be.
To address that, and also to speed up the architectural page table in general, it would be a good idea to augment the functional page table with an image of the table in memory, instead of replacing it with one. The functional page table would provide quick lookups and also avoid having to translate page table entries to TLB entries, making performance essentially equivalent to the functional case. The backing page tables, which are primarily for consumption by the physical hardware when in KVM, can be updated when mappings change but otherwise left alone.
If we end up doing that, we could just let the ISA specific process classes enable whatever additional TLB machinery they need, likely a backing copy in memory, without any knowledge or involvement from the ISA agnostic class. We would be able to get rid of the useArchPT setting and the bits of code in the configs which set it.
Change-Id: I2e21945cd852bb1b3d0740fe6a4c5acbfd9548c5 Reviewed-on: https://gem5-review.googlesource.com/6983 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12437:0ef54c28bb34 |
02-Jan-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-ruby: Fix wakeup timeouts for the MOESI_CMP_token protocol
This changeset fixes a bug that was affecting the MOESI_CMP_token protocol where setting the next timeout required an absolute tick in the future.
Change-Id: Ibfdb59354e13c7e552cb3389e71bda010f333249 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7163 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12436:c56112090c61 |
02-Jan-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-ruby: Remove function that maps responses to a DMA engine
The function map_Address_to_DMA was used to route responses to the first (and assumed to be the only) DMA engine in the system. This function is now unused as protocols handle responses and route them to the right DMA engine.
Change-Id: I2fba913cf2f12321d1a1e38e7ee85bdf26b8a47a Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7162 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12435:146e465343b4 |
02-Jan-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-ruby: Add support for multiple DMA engines in MESI_Two_Level
Previously the MESI_Two_Level protocol supported systems with a single DMA engine and responses from the directory to DMA requests were routed back to the only DMA engine. This changeset adds support for multiple DMA engines in the system by routing the response to the DMA engine that originally sent the request.
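The routing change can be pictured as remembering the requestor per outstanding address. The sketch below is not SLICC and uses hypothetical names throughout.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical directory-side bookkeeping for outstanding DMA requests.
class DmaRouterSketch
{
  public:
    // Remember which DMA engine issued the request for this address.
    void recvDmaRequest(std::uint64_t addr, int engineId)
    {
        pending[addr] = engineId;
    }

    // Returns the engine that should get the response, or -1 if the
    // address has no outstanding DMA request.
    int routeDmaResponse(std::uint64_t addr)
    {
        auto it = pending.find(addr);
        if (it == pending.end())
            return -1;
        const int engineId = it->second;
        pending.erase(it);
        return engineId;
    }

  private:
    std::unordered_map<std::uint64_t, int> pending;
};
```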
Change-Id: I10ceda682ea29746636862ec8ef2a9c4220ca045 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7161 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12432:2480d8b432f5 |
22-Dec-2017 |
Gabe Black <gabeblack@google.com> |
arch,mem: Remove the default value for page size.
This breaks one more architecture dependence outside of the ISAs.
Change-Id: I071f9ed73aef78e1cd1752247c183e30854b2d28 Reviewed-on: https://gem5-review.googlesource.com/6982 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> |
12431:000549e1f497 |
22-Dec-2017 |
Gabe Black <gabeblack@google.com> |
arch,mem: Move page table construction into the arch classes.
This gets rid of an awkward NoArchPageTable class, and also gives the arch a place to inject ISA specific parameters (specifically page size) without having to have TheISA:: in the generic version of these types.
Change-Id: I1412f303460d5c43dafdb9b3cd07af81c908a441 Reviewed-on: https://gem5-review.googlesource.com/6981 Reviewed-by: Alexandru Duțu <alexandru.dutu@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12429:beefb9f5f551 |
09-Jan-2018 |
BKP <brandon.potter@amd.com> |
style: change C/C++ source permissions to noexec
Several files in the repository were tracked with execute permissions even though the files are just normal C/C++ files (and the one .isa).
Change-Id: I976b096acab4a1fc74c5699ef1f9b222c1e635c2 Reviewed-on: https://gem5-review.googlesource.com/7241 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12425:7f8c9032b18c |
04-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Prune unnecessary writebacks in exclusive caches
Exclusive caches use the tempBlock to fill for responses from a downstream cache. The reason for this is that they only pass the block to the cache above without keeping a copy. When all requests are serviced the block is immediately invalidated unless it is dirty, in which case it has to be written back to the memory below.
To avoid unnecessary writebacks, this changeset forces mostly exclusive caches to issue requests that can only fetch clean data when possible.
Reported-by: Quereshi Muhammad Avais <avais@kaist.ac.kr>
Change-Id: I01b377563f5aa3e12d22f425a04db7c023071849 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5061 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12406:86bde4a026b5 |
22-Dec-2017 |
Gabe Black <gabeblack@google.com> |
arch,cpu: "virtualize" the TLB interface.
CPUs have historically instantiated the architecture specific version of the TLBs to avoid a virtual function call, making them a little bit more dependent on what the current ISA is. Some simple performance measurement, the x86 twolf regression on the atomic CPU, shows that there isn't actually any performance benefit, and if anything the simulator goes slightly faster (although still within margin of error) when the TLB functions are virtual.
This change switches everything outside of the architectures themselves to use the generic BaseTLB type, and then inside the ISA for them to cast that to their architecture specific type to call into architecture specific interfaces.
The ARM TLB needed the most adjustment since it was using non-standard translation function signatures. Specifically, they all took an extra "type" parameter which defaulted to normal, and translateTiming returned a Fault. translateTiming actually doesn't need to return a Fault because everywhere that consumed it just stored it into a structure which it then deleted(?), and the fault is stored in the Translation object when the translation is done.
A little more work is needed to fully obviate the arch/tlb.hh header, so the TheISA::TLB type is still visible outside of the ISAs. Specifically, the TlbEntry type is used in the generic PageTable which lives in src/mem.
Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575 Reviewed-on: https://gem5-review.googlesource.com/6921 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12395:322bb93e5f06 |
09-Nov-2017 |
Swapnil Haria <swapnilster@gmail.com> |
mem-ruby: Support atomic_noncaching accesses in ruby
Ruby has no support for atomic_noncaching accesses, which prevents using it with kvm-cpu. This patch fixes this by directly forwarding atomic requests from the ruby port/sequencer to the corresponding directory based on the destination address of the packet.
Change-Id: I0b4928bfda44fd9e5e48583c51d1ea422800da2d Reviewed-on: https://gem5-review.googlesource.com/5601 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Bradford Beckmann <brad.beckmann@amd.com> |
12392:e0dbdf30a2a5 |
13-Dec-2017 |
Jason Lowe-Power <jason@lowepower.com> |
misc: Updates for gcc7.2 for x86
GCC 7.2 is much stricter than previous GCC versions. The following changes are needed:
* There is now a warning if there is an implicit fallthrough between two case statements. C++17 adds the [[fallthrough]]; declaration. However, to support non-C++17 standards (i.e., C++11), we use M5_FALLTHROUGH. M5_FALLTHROUGH checks for a [[fallthrough]]-compliant C++17 compiler and, if that doesn't exist, it defaults to nothing (no older compilers generate warnings).
* The above resulted in a couple of bugs that were found. This is noted in the review request on gerrit.
* throw() for dynamic exception specification is deprecated.
* There were a couple of new uninitialized variable warnings.
* Can no longer perform bitwise operations on a bool.
* Must now include <functional> for std::function.
* Compiler bug for void* lambda. Changed to auto as a workaround. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82878
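A fallthrough macro of the kind described in the first item could be sketched as below. SKETCH_FALLTHROUGH and classifySketch are hypothetical; the actual M5_FALLTHROUGH definition may differ.

```cpp
#include <cassert>

// Use [[fallthrough]] when the compiler reports support for it,
// otherwise expand to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(fallthrough)
#    define SKETCH_FALLTHROUGH [[fallthrough]]
#  endif
#endif
#ifndef SKETCH_FALLTHROUGH
#  define SKETCH_FALLTHROUGH
#endif

// Level 2 deliberately falls through into the level 1 case.
inline int classifySketch(int level)
{
    int score = 0;
    switch (level) {
      case 2:
        score += 10;
        SKETCH_FALLTHROUGH;
      case 1:
        score += 1;
        break;
      default:
        break;
    }
    return score;
}
```

Marking the intentional case this way lets the compiler warning catch the unintentional ones.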
Change-Id: I5d4c782a4e133fa4cdb119e35d9aff68c6e2958e Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/5802 Reviewed-by: Gabe Black <gabeblack@google.com> |
12386:2bf5fb25a5f1 |
13-Dec-2017 |
Gabe Black <gabeblack@google.com> |
arm,sparc,x86,base,cpu,sim: Replace the Twin(32|64)_t types with std::array<>s.
Replace them with std::array<>s.
Change-Id: I76624c87a1cd9b21c386a96147a18de92b8a8a34 Reviewed-on: https://gem5-review.googlesource.com/6602 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12357:86b87f330638 |
07-Oct-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-ruby: Prevent ruby from crashing on CMOs
Ruby has no support for cache maintenance operations. As a workaround, after printing a warning, we treat them as no-ops in the memory system and respond immediately without handling them. There should be workarounds in the memory system already that allow execution to proceed without the requirement for cache maintenance operations.
Change-Id: I125ee4fa37b674c636d87f2d9205bbc1a74da101 Reviewed-on: https://gem5-review.googlesource.com/5057 Reviewed-by: Jieming Yin <bjm419@gmail.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12355:568ec3a0c614 |
07-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu: Add support for CMOs in the cpu models
Cache maintenance operations go through the write channel of the cpu. This change makes sure that the cpu does not try to fill in the packet with data.
Change-Id: Ic83205bb1cda7967636d88f15adcb475eb38d158 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5055 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12354:f7c29d65a656 |
12-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Ignore clean requests in the abstract memory
Systems with atomic cores and the fastmem option enabled bypass the whole memory system and access the abstract memory directly. Cache maintenance operations which would be normally handled before the point of unification/coherence should be ignored by the abstract memory.
Change-Id: I696cdd158222e5fd67f670cddbcf2efbbfd5eca4 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5054 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12353:5650eb170bfb |
26-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Handle CMO responses in the snoop filter
Previously, responses would transfer either the ownership of the line or the actual data to the cache that sent out the original request. Cache clean operations are different since they bring neither data nor ownership. When they are also invalidating, the cache that sent out the original request will invalidate any existing copies. This patch makes the snoop filter handle cache clean responses accordingly.
Change-Id: I27165cb45b9dc57882526329c62db35f100d23df Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5053 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12352:3bddc8785a99 |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Allow CMOs as snooping requests in the snoop filter
The snoop filter performs sanity checks of the type of packets that are expected to snoop caches above. Cache maintenance operations are expected to perform a clean and/or invalidate on all caches down to the specified point of reference and therefore could also generate snoops.
Change-Id: I7f8fef246a85faa87ccd289c28b49686ed7caa08 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5052 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12351:17eaa27bef22 |
21-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Co-ordination of CMOs in the xbar
A clean packet request serving a cache maintenance operation (CMO) visits all memories down to the specified xbar. The visited caches invalidate their copy (if the CMO is invalidating) and, if a dirty copy is found, a write packet writes the dirty data to the memory level below the specified xbar. A response is sent back when all the caches are clean and/or invalidated and the specified xbar has seen the write packet.
This patch adds the following functionality in the xbar:
1) Accounts for the cache clean requests that go through the xbar.
2) Generates the cache clean response when both the cache clean request and the corresponding writeclean packet have crossed the destination xbar.
Previously, transactions in the xbar were identified using the pointer of the original request. Cache clean transactions comprise two different packets, the clean request and the writeclean, and therefore have different request pointers. This patch adds support for custom transaction IDs that by default take the value of the request pointer but can be overridden by the constructor. This allows the clean request and the writeclean to share the same ID, which the coherent xbar uses to co-ordinate them and send the response in a timely manner.
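The co-ordination by shared transaction ID can be sketched as follows. CleanCoordinatorSketch is hypothetical and covers only the case where a writeclean accompanies the clean request; IDs are plain integers rather than request pointers.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Tracks, per transaction ID, which of the two packets of a cache clean
// transaction have crossed the destination xbar.
class CleanCoordinatorSketch
{
  public:
    // Each call returns true once both packets have been seen, i.e.
    // when the cache clean response can be generated.
    bool sawCleanReq(std::uint64_t id) { return mark(id, SeenReq); }
    bool sawWriteClean(std::uint64_t id) { return mark(id, SeenWrite); }

  private:
    enum : unsigned { SeenReq = 1, SeenWrite = 2 };

    bool mark(std::uint64_t id, unsigned bit)
    {
        const unsigned state = (seen[id] |= bit);
        if (state != (SeenReq | SeenWrite))
            return false;
        seen.erase(id); // transaction complete, forget it
        return true;
    }

    std::unordered_map<std::uint64_t, unsigned> seen;
};
```

The order in which the two packets arrive does not matter in this sketch; the response is released by whichever arrives second.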
Change-Id: I80db76386a1caded38dc66e6e18f930c3bb800ff Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5051 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12350:811452f255d5 |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for handling CMOs in the MSHRs
To add support for cache maintenance operations (CMOs) in the MSHRs, this change adds the following functionality:
- If a CMO request hits in the MSHRs, we defer it, as we can't coalesce it with any other requests.
- When we promote any deferred targets, we promote them in order and stop if we encounter a CMO request. If the CMO request is at the beginning of the deferred targets list, it will be the only promoted target.
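The promotion rule can be sketched with a plain queue of targets. TargetSketch and promoteDeferredSketch are hypothetical names, and real MSHR targets carry far more state.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct TargetSketch {
    int id;
    bool isCmo;
};

// Moves targets from the deferred list to the promoted list in order.
// A CMO at the front is promoted alone; otherwise promotion stops just
// before the first CMO.
inline std::vector<TargetSketch>
promoteDeferredSketch(std::deque<TargetSketch> &deferred)
{
    std::vector<TargetSketch> promoted;
    if (deferred.empty())
        return promoted;

    if (deferred.front().isCmo) {
        promoted.push_back(deferred.front());
        deferred.pop_front();
        return promoted;
    }

    while (!deferred.empty() && !deferred.front().isCmo) {
        promoted.push_back(deferred.front());
        deferred.pop_front();
    }
    return promoted;
}
```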
Change-Id: I10d1f7e16bd6d522d917279c5d408a3f0cee4286 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5050 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12349:47f454120200 |
01-Jun-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for CMOs in the cache
This change adds support for cache maintenance operations (CMOs) in the cache. The supported memory operations clean and/or invalidate a cache block as specified by its VA to the specified xbar (PoU, PoC).
A cache maintenance packet visits all memories down to the specified xbar. Caches need to invalidate their copy if it is an invalidating CMO. If it is (additionally) a cleaning CMO and a dirty copy exists, the cache cleans it with a WriteClean request.
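A single cache's reaction to a CMO, as described above, might be sketched like this. The types and handleCmoSketch are hypothetical, and the WriteClean is reduced to a boolean result.

```cpp
#include <cassert>

enum class CmoKindSketch { Clean, Invalidate, CleanInvalidate };

struct BlockSketch {
    bool valid;
    bool dirty;
};

// Applies a CMO to one block. Returns true if the cache has to issue a
// WriteClean to push dirty data towards the memory below.
inline bool handleCmoSketch(CmoKindSketch kind, BlockSketch &blk)
{
    if (!blk.valid)
        return false;

    const bool cleaning = kind != CmoKindSketch::Invalidate;
    const bool invalidating = kind != CmoKindSketch::Clean;

    bool issueWriteClean = false;
    if (cleaning && blk.dirty) {
        issueWriteClean = true; // dirty data leaves with the WriteClean
        blk.dirty = false;
    }
    if (invalidating) {
        blk.valid = false;
        blk.dirty = false;
    }
    return issueWriteClean;
}
```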
Change-Id: Ibf31daa7213925898f3408738b11b1dd76c90b79 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5049 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12348:bef2d9d3c353 |
07-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Promote deferred targets only when the block is valid
When a response indicates that there are no other sharers of the block, the cache can promote its copy of the block to writable and potentially service deferred targets even if the request didn't ask for a writable copy.
Previously, a response would guarantee the presence of the block in the cache. A response could either be filling, upgrading or a response to an invalidation due to a pending whole line write. Responses to cache maintenance invalidations break this assumption. This change adds an extra check to make sure that the block was already valid or that the response is filling before promoting the block.
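The extra check amounts to one more condition on the promotion. The function below is a hypothetical illustration of that condition only.

```cpp
#include <cassert>

// The block may be promoted to writable only if no other cache shares
// it and the block is actually present: either it was already valid or
// this response is filling it.
inline bool mayPromoteSketch(bool noOtherSharers, bool blockWasValid,
                             bool responseIsFill)
{
    return noOtherSharers && (blockWasValid || responseIsFill);
}
```

A response to a cache maintenance invalidation on an absent block gives the second and third arguments as false, so nothing is promoted.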
Change-Id: I6839f683a05d4dad4205c23f365a925b7b05e366 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5048 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12347:c4bb52d1aba4 |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for cache maintenance operation requests
This change adds new packet cmds and request flags for cache maintenance operations.
1) A cache clean operation writes dirty data in the first memory below the specified xbar and updates any old copies in the memories above it.
2) A cache invalidate operation invalidates all copies of the specified block in the memories above the specified xbar.
3) A clean and invalidate operation is a combination of the two operations above.
Change-Id: If45702848bdd568de532cd57cba58499e5e4354c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5047 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12346:9b1144d046ca |
22-Sep-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Support for specifying the destination of a WriteClean
Previously, WriteClean packets would always write to the first memory below unless the memory was unable to allocate in which case it would be forwarded further below.
This change adds support for specifying the destination of a WriteClean packet. The cache annotates the request with the specified destination and marks the packet as write-through upon its creation. The coherent xbar checks packets for their destination and resets the write-through flag when necessary e.g., the coherent xbar that is set as the PoC will reset the write-through flag for packets to the PoC.
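The interplay between the destination annotation and the write-through flag can be sketched as below. DestSketch, WriteCleanSketch and XbarSketch are hypothetical simplifications of the request flags and xbar parameters.

```cpp
#include <cassert>

enum class DestSketch { None, PoU, PoC };

struct WriteCleanSketch {
    DestSketch dest;   // where the data has to end up
    bool writeThrough; // keep forwarding instead of allocating
};

// The cache marks the packet write-through when a destination is given.
inline WriteCleanSketch makeWriteCleanSketch(DestSketch dest)
{
    return WriteCleanSketch{dest, dest != DestSketch::None};
}

struct XbarSketch {
    bool isPoU;
    bool isPoC;

    // Reset the flag once the packet crosses the xbar it was aimed at,
    // so that the memory below is free to hold the data.
    void forward(WriteCleanSketch &pkt) const
    {
        const bool reached = (pkt.dest == DestSketch::PoC && isPoC) ||
                             (pkt.dest == DestSketch::PoU && isPoU);
        if (reached)
            pkt.writeThrough = false;
    }
};
```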
Change-Id: I84b653f5cb6e46e97e09508649a3725d72d94606 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5046 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12345:70c783a93195 |
31-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for WriteClean packets in the memory system
This change adds support for creating and handling WriteClean packets. The WriteClean operation is almost identical to a WritebackDirty with the exception that the cache generating a WriteClean retains a copy of the block.
Change-Id: I63c8de62919fad0f9547d412f8266aa4292ebecd Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5045 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12344:57364c030de3 |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add a WriteClean command to the packet class
A WriteClean packet allows a cache to write a block to a memory below without evicting its copy. A typical usecase for a WriteClean packet is a cache clean operation.
Change-Id: If356cb067da5ddf3210c135f41ef0891fb811568 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5044 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12343:51ae6d08466f |
29-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem-cache: Add support for checking whether a cache is busy
This changeset adds support for checking whether the cache is currently busy and a timing request would be rejected.
Change-Id: I5e37b011b2387b1fa1c9e687b9be545f06ffb5f5 Reviewed-on: https://gem5-review.googlesource.com/5042 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12342:53a3828f2468 |
29-Sep-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add function to check if the slave can receive a timing req
This changeset adds support for tryTiming, an interface that allows a master to check if the slave is busy or otherwise if it can accept a timing request.
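A minimal sketch of such a query, assuming a slave with a fixed number of free slots. SlaveSketch is hypothetical and omits the packet argument that a real port interface would take.

```cpp
#include <cassert>

class SlaveSketch
{
  public:
    explicit SlaveSketch(unsigned capacity) : freeSlots(capacity) {}

    // Side-effect-free query: would a timing request be accepted now?
    bool tryTiming() const { return freeSlots > 0; }

    // Actually sends the request; fails when the slave is busy.
    bool recvTimingReq()
    {
        if (!tryTiming())
            return false;
        --freeSlots;
        return true;
    }

  private:
    unsigned freeSlots;
};
```

A master can call tryTiming() first and hold back a request that would otherwise be rejected and need a retry.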
Change-Id: Idc7c2337ae9ccf5dec54f308e488660591419a63 Reviewed-on: https://gem5-review.googlesource.com/5041 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Christian Menard <christian.menard@tu-dresden.de> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12341:6eebba99d117 |
31-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add the notion of point of unification in the coherent xbar
The point of unification is the first crossbar at which the instruction cache, the data cache and the translation table walks of the core are guaranteed to see the same copy of a memory location.
Change-Id: Ica79b34c8ed4f1a8f2379748e8520a8f8afffa90 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5040 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12334:e0ab29a34764 |
30-Nov-2017 |
Gabe Black <gabeblack@google.com> |
misc: Rename misc.(hh|cc) to logging.(hh|cc)
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.).
Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12266:63b8da9eeca4 |
19-Dec-2016 |
Radhika Jagtap <radhika.jagtap@arm.com> |
ext, mem: Pull DRAMPower SHA 90d6290 and rebase
This patch syncs the DRAMPower library of gem5 to the external github (https://github.com/ravenrd/DRAMPower).
The version pulled in is the commit: 90d6290f802c29b3de9e10233ceee22290907ce6 from 30th Oct. 2016.
This change also modifies the DRAM Ctrl interaction with the DRAMPower, due to changes in the lib API in the above version.
Previously, multiple functions were called to prepare the power lib before calling the function that would calculate the energy. With the new API, these functions are encompassed inside the function to calculate the energy and therefore should now be removed from the DRAM controller.
The other key difference is the introduction of a new function called calcWindowEnergy which can be useful for any system that wants to do measurements over intervals. For gem5 DRAM ctrl that means we now need to accumulate the window energy measurements into the total stat.
Change-Id: I3570fff2805962e166ff2a1a3217ebf2d5a197fb Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5724 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12246:9ffa51416f39 |
08-Nov-2017 |
Gabe Black <gabeblack@google.com> |
scons: Move Transform and termcap functionality into their own files.
Change-Id: Ica08e93f3873a7eafd02fe7d44c3bdbf0ce7f6b7 Reviewed-on: https://gem5-review.googlesource.com/5565 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
12241:5257f14fea78 |
31-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Align the snoop behavior in the XBar for atomic and timing
When the XBar receives a Writeback/WriteClean packet, it doesn't need to snoop the upstream caches. It only queries the snoop filter and sets the blockCached flag accordingly. This is in line with the recvTimingReq.
Change-Id: I0ae22f21491d75a111019124bb95bac7b16d3cd3 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5043 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12218:8c5db15dc8e7 |
13-Jun-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Signal the local monitor when clearing the global monitor
ARM systems require the coordination of the global and local monitors. When the system is run without caches the global monitor is implemented in the abstract memory object. This change adds a callback from the abstract memory that notifies the local monitor when the global monitor is cleared.
Additionally, for ARM systems the local monitor signals the event register and wakes the thread context up. Subsequent wait-for-event (WFE) instructions will be immediately signaled.
Change-Id: If6c038f3a6bea7239ba4258f07f39c7f9a30500b Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3760 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12191:402368e3ca19 |
21-Sep-2017 |
Gabe Black <gabeblack@google.com> |
mem: Fill the new packet ID fields with master IDs when tracing packets.
This will let somebody consuming the memory packet trace make sense out of the master IDs passed along with individual accesses.
Change-Id: I621d915f218728066ce95e6fc81f36d14ae7e597 Reviewed-on: https://gem5-review.googlesource.com/4800 Reviewed-by: Rahul Thakur <rjthakur@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12189:18937fe3cc31 |
01-Sep-2017 |
Gabe Black <gabeblack@google.com> |
mem: Trace the request master ID in the MemTraceProbe.
There's a spot for it in the packet trace protobuf, so we should fill it with something.
Change-Id: I784feb3f668e1b20d67b6ef98d012bcf59b7bd40 Reviewed-on: https://soc-sim-internal-review.googlesource.com/3483 Reviewed-by: Rahul Thakur <rjthakur@google.com> Reviewed-on: https://gem5-review.googlesource.com/4781 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12167:24eb63c709c2 |
02-Aug-2017 |
Pau Cabre <pau.cabre@metempsy.com> |
mem-cache: Delete squashed HWPrefetches
Request and Packet for squashed HWPrefetches were not deleted
Change-Id: I9b66bb01b8ed6a5ddfaaa8739a68165dc4a7006c Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/4340 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12133:ca42be3276af |
28-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
ruby: Refactor some Event subclasses to lambdas
Change-Id: I9f47a20a869553515a759d9a29c05f6ce4b42d64 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3930 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12128:75e1a5bed42e |
27-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
kvm, mem: Refactor some Event subclasses into lambdas
Change-Id: Ifafdcf4692d58a17f90e66ff8de8fa3e146c34bb Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3924 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12091:f2d1af96ad2d |
13-Jun-2017 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem-cache: Add missing overrides to BaseCache
Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3880 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12084:5a3769ff3d55 |
07-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
mem: Replace EventWrapper use with EventFunctionWrapper
NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s event names to not properly match the rank they were generated by. This could occur if the public rank member is modified after the Rank's construction. A patch would mean refactoring Rank and `DRAMCtrl` to privatize many of the members of Rank behind getters.
Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3745 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12083:d6a612791733 |
13-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
mem: Replace EventWrapper in PacketQueue with EventFunctionWrapper
In order to replicate the same `name()` output with `PacketQueue`, subclasses using EventFunctionWrapper must initialize PacketQueue with their own name so the sendEvent holds the name of the subclass.
Change-Id: Ib091e118bab8858192e1d1370d61def42958ec29 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3744 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12081:cb5fe81fd522 |
12-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
mem: Move the Rank construction logic to the Rank constructor
This change was made so Rank objects have their name assigned when they are instantiated. Therefore, they can initialize their member objects with their name and it is less likely to change during runtime.
(NOTE: I would recommend hiding the fields which would cause the name to change behind getters, since modification of `Rank.rank` during runtime will cause the `name()` to change.)
Change-Id: Id51c3553b40e489792c57950e18b8ce927e43173 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3742 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12071:fd0b4bd769dd |
06-Jun-2017 |
Javier Cano-Cano <javier.cano555@gmail.com> |
mem-garnet: Fix garnet stats
This patch fixes some statistics that were not reset in the presence of a resetStats instruction. This bug made it impossible to obtain reliable network statistics when the simulation doesn't start from tick zero.
Change-Id: Ibec45f08d95bf0a533d94b70ec960719206ae945 Maintainer: Tushar Krishna <tushar@ece.gatech.edu> Reviewed-on: https://gem5-review.googlesource.com/3700 Reviewed-by: Jieming Yin <bjm419@gmail.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12065:e3e51756dfef |
13-Mar-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
ruby: Add support for address ranges in the directory
Previously the directory covered a flat address range that always started from address 0. This change adds a vector of address ranges with interleaving and hashing that each directory keeps track of and the necessary flexibility to support systems with non-contiguous memory ranges.
Change-Id: I6ea1c629bdf4c5137b7d9c89dbaf6c826adfd977 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2903 Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12051:4cc27e53748d |
03-Mar-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
ruby: Don't set the block data when a store conditional fails
Previously the Sequencer upon a Store Conditional would unconditionally set the data of the memory location. This change checks and prevents a failed Store Conditional from modifying any data.
Change-Id: Id63c9579d8f054f0e95c6d338a7e31aa48762755 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2902 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
11966:23791911437e |
13-Mar-2017 |
Javier Cano-Cano <javier.cano555@gmail.com> |
ruby: Fix MOESI_CMP_directory for new DMA status changes.
Multiple outstanding DMA requests introduced new DMA states that were not considered in the SLICC code. This patch implements the missing DMA state changes in the MOESI_CMP_directory protocol.
Change-Id: I700d441d76556b7e77e0d507904af6ec6ba59cc2 Signed-off-by: Michael LeBeane <michael.lebeane@amd.com> Reviewed-on: https://gem5-review.googlesource.com/2380 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Michael LeBeane <Michael.Lebeane@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
11904:870e25baf014 |
24-Feb-2017 |
Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> |
misc: add missing copyright/author information in previous commit
See a06a46f and a854373.
Change-Id: Id66427db22b7d7764c218b9cd78d95db929f4127 Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-on: https://gem5-review.googlesource.com/2224 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
11903:a75e4eae89c0 |
05-Feb-2017 |
Lena Olson <leolson@google.com> |
ruby: fix MOESI_hammer directory to work with > 3GB memory
The MOESI_hammer directory assumes a contiguous address space, but X86 has an IO gap from 3-4GB. This patch allows the directory to work with more than 3GB of memory on X86.
Assumptions: the physical address space (range of possible physical addresses) is 0-XGB when X <= 3GB, and 0-(X+1)GB when X > 3GB. If there is no IO gap this patch should still work.
Change-Id: I5453a09e953643cada2c096a91d339a3676f55ee Reviewed-on: https://gem5-review.googlesource.com/2169 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
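The assumption stated in the commit above (0-XGB when X <= 3GB, 0-(X+1)GB otherwise) can be written as a small function. This is a minimal Python sketch, not the MOESI_hammer code; the function name and the `gap_start`/`gap_size` parameters are hypothetical.

```python
GIB = 1 << 30

def physical_address_limit(mem_size_bytes, gap_start=3 * GIB, gap_size=1 * GIB):
    """Return the top of the physical address space for a given memory size.

    Hypothetical helper: memory that fits below the IO gap is mapped
    directly; anything larger is pushed up by the size of the gap.
    """
    if mem_size_bytes <= gap_start:
        return mem_size_bytes
    return mem_size_bytes + gap_size
```

With the default 3-4GB gap, 2GB of memory ends at 2GB while 4GB of memory ends at 5GB.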
11900:0787df49546b |
27-Feb-2017 |
Andreas Sandberg <andreas.sandberg@arm.com> |
gpu-compute: Fix Python/C++ object hierarchy discrepancies
The GPUCoalescer and the Shader classes have different base classes in C++ and Python. This causes subtle bugs in SWIG and compilation errors for PyBind.
Change-Id: I1ddd2a8ea43f083470538ddfea891347b21d14d8 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2228 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> |
11893:3033b3e6a32a |
30-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Make blkAlign a common function between all tag classes
blkAlign was defined as a separate function in the base associative and fully-associative tags classes although both functions implemented identical functionality. This patch moves blkAlign into the base tags class.
Change-Id: I3d415d0e62bddeec7ce0d559667e40a8c5fdc2d4 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11892:c7ea349e1cd3 |
26-Oct-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Use pkt::getBlockAddr instead of BaseCache::blockAlign
Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11888:d89dc575c7cb |
05-Feb-2017 |
Lena Olson <leolson@google.com> |
ruby: fix and/or precedence in slicc
The slicc compiler currently treats && and || with the same precedence. This is highly non-intuitive to people used to C, and was probably an error. This patch makes && bind tighter than ||.
For example, previously:
    if (A || B && C)
compiled to:
    if ((A || B) && C)
With this patch, it compiles to:
    if (A || (B && C))
Change-Id: Idbbd5b50cc86a8d6601045adc14a253284d7b791 Signed-off-by: Lena Olson (leolson@google.com) Reviewed-on: https://gem5-review.googlesource.com/2168 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Joe Gross <criusx@gmail.com> Reviewed-by: Sooraj Puthoor <puthoorsooraj@gmail.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> [ Rebased onto master ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
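The grouping change described in the commit above can be reproduced with a tiny precedence-climbing parser. This is a minimal Python sketch over a hypothetical two-operator grammar, not the actual SLICC parser; `tokenize`, `parse` and both precedence tables are illustrative names.

```python
import re

# Hypothetical precedence tables: a higher number binds tighter.
FLAT = {"||": 1, "&&": 1}      # old behaviour: same precedence
C_LIKE = {"||": 1, "&&": 2}    # new behaviour: && binds tighter than ||

def tokenize(text):
    return re.findall(r"\|\||&&|\(|\)|\w+", text)

def parse(tokens, table, min_prec=1):
    """Precedence climbing; consumes `tokens` in place and returns a tuple tree."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, table, 1)
        tokens.pop(0)  # discard the matching ")"
    else:
        left = tok
    while tokens and tokens[0] in table and table[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand may only contain tighter operators.
        right = parse(tokens, table, table[op] + 1)
        left = (op, left, right)
    return left

old_tree = parse(tokenize("A || B && C"), FLAT)
new_tree = parse(tokenize("A || B && C"), C_LIKE)
```

Here `old_tree` groups as `(A || B) && C` and `new_tree` as `A || (B && C)`.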
11886:43b882cada33 |
27-Feb-2017 |
Brandon Potter <brandon.potter@amd.com> |
syscall_emul: [PATCH 15/22] add clone/execve for threading and multiprocess simulations
Modifies the clone system call and adds execve system call. Requires allowing processes to steal thread contexts from other processes in the same system object and the ability to detach pieces of process state (such as MemState) to allow dynamic sharing. |
11872:ba90ffa751b6 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused size field from the CacheBlk class
Change-Id: I6149290d6d2ac1a4bd6165871c93d7b7d6a980ad Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11871:474ac613d0d7 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove the unused asid field from the CacheBlk class
Change-Id: I29f45733c5fad822bdd0d8dcc7939d86b2e8c97b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11870:b470020b29de |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused arguments (asid/context_id) from accessBlock
Change-Id: I79c2662fc81630ab321db8a75be6cd15fa07d372 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11869:aa9d04c7e3bb |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused type BlkList from the cache and the tags
Change-Id: If9ebb8488e8db587482ecfa99d2c12cfe5734fb9 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11868:cc435f8f8b05 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove unused functions from the tag classes
Change-Id: I4f3c2c027b1acaaf791a4c71086f34a9b9fbf4df Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11867:1342b4dbc556 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Always use the helper function to invalidate a block
Policies like the LRU need to be notified when a block is invalidated, the helper function does this along with invalidating the block.
Change-Id: I3ed59cf07938caa7f394ee6054b0af9e00b267ea Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11866:8732d8d0a9e5 |
21-Feb-2017 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Fix MSHR assert triggering for invalidated prefetches
This changeset updates an assert in src/mem/cache/mshr.cc which was erroneously catching invalidated prefetch requests. These requests can become invalidated if another component writes (an exclusive access) to this location during the time that the read request is in flight. The original assert made the assumption that these cases can only occur for reads generated by the CPU, and hence prefetcher-generated requests would sometimes trip the assert.
Change-Id: If4f043273a688c2bab8f7a641192a2b583e7b20e Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11865:608f8c34f549 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Populate the secure flag in the writeback visitor
Previously the writeback visitor would not consider and set the secure flag for the blocks that are written back to memory. This patch fixes this.
Change-Id: Ie1a425fa9211407a70a4343f2c6b3d073371378f Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11863:b47dda418ae6 |
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove stale argument from a panic statement
Change-Id: I7ae5fa44a937f641a2ddd242a49e0cd23f68b9f2 Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11859:76c36516e0ae |
19-Feb-2017 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Ensure draining is deterministic
The traversal of drainable objects could potentially be non-deterministic when using an unordered set containing object pointers. To ensure that the iteration is deterministic, we switch to a vector. Note that the lookup and traversal of the drainable objects is not performance critical, so the change has no negative consequences. |
11858:5869c83bc8c7 |
19-Feb-2017 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure deferred snoops are cache-line aligned
This patch fixes a bug where a deferred snoop ended up being to a partial cache line, and not cache-line aligned, all due to how we copy the packet. |
11857:77b4fd593427 |
19-Feb-2017 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix memory footprint includes
Fix compilation errors due to missing include. |
11848:f438fcbab00e |
15-Feb-2017 |
Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> |
mem, stats: fix typos in CommMonitor and Stats
Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed at http://reviews.gem5.org/r/3802/ |
11847:22d08b519cb0 |
15-Feb-2017 |
Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> |
mem, misc: fix building issue with CommMonitor (unused variables)
Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed at http://reviews.gem5.org/r/3801/ |
11846:b9436a4bbbb9 |
15-Feb-2017 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: fix assertion in respondEvent
Assertion in the respondEvent erroneously fired. The assertion verifies that the controller has not moved to a low-power state prior to receiving read data from the memory. The original assertion triggered if the state was not: PWR_IDLE or PWR_ACT.
In the case that failed, a periodic refresh event occurred around the read. The REF is stalled until the final read burst is issued and the subsequent PRE closes the bank. While the PRE will temporarily move the state to PWR_IDLE, state will immediately transition to PWR_REF due to the pending refresh operation. This state does not match the assertion, which is subsequently triggered.
Fixed the assertion by explicitly checking that the state is not a low-power state: !PWR_SREF && !PWR_PRE_PDN && !PWR_ACT_PDN
Change-Id: I82921a733bbeac2bcb5a487c2f981448d41ed50b Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
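The difference between the two assertion conditions in the commit above can be sketched as follows. The state names are taken from the commit message; the Python enum and the two helper functions are hypothetical and are not the DRAM controller code.

```python
from enum import Enum, auto

class PowerState(Enum):
    PWR_IDLE = auto()
    PWR_REF = auto()
    PWR_SREF = auto()
    PWR_PRE_PDN = auto()
    PWR_ACT = auto()
    PWR_ACT_PDN = auto()

LOW_POWER_STATES = {
    PowerState.PWR_SREF,
    PowerState.PWR_PRE_PDN,
    PowerState.PWR_ACT_PDN,
}

def old_condition(state):
    # Original assertion: only idle or active were accepted.
    return state in (PowerState.PWR_IDLE, PowerState.PWR_ACT)

def new_condition(state):
    # Fixed assertion: anything except a low-power state is accepted.
    return state not in LOW_POWER_STATES
```

A rank in PWR_REF fails `old_condition` but passes `new_condition`, which is the case the commit describes.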
11837:17b37f38944a |
14-Feb-2017 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Update DRAM configuration names
Names of DRAM configurations were updated to reflect both the channel and device data width.
Previous naming format was: <DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>
The following nomenclature is now used: <DEVICE_TYPE>_<DATA_RATE>_<n>x<w>
where
n = the number of devices per rank on the channel
w = device width
Total channel width can be calculated as n*w.
Example: a 64-bit DDR4-2400 channel consisting of 4-bit devices:
n = 16
w = 4
The resulting configuration name is: DDR4_2400_16x4
Updated scripts to match new naming convention.
Added unique configurations for DDR4 for: 1) 16x4 2) 8x8 3) 4x16
Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> |
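The naming scheme from the commit above can be decoded mechanically. This is a minimal Python sketch, assuming names follow exactly the simple `<DEVICE_TYPE>_<DATA_RATE>_<n>x<w>` form; `parse_dram_name` is a hypothetical helper, not part of gem5, and names with extra fields are rejected.

```python
import re

# Assumed pattern for the simple form only: DEVICE_RATE_NxW
NAME_RE = re.compile(r"^(?P<device>[A-Za-z0-9]+)_(?P<rate>\d+)_(?P<n>\d+)x(?P<w>\d+)$")

def parse_dram_name(name):
    """Split a configuration name into its fields and derive the channel width."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised DRAM configuration name: {name}")
    n, w = int(m["n"]), int(m["w"])
    return {
        "device_type": m["device"],
        "data_rate": int(m["rate"]),
        "devices_per_rank": n,
        "device_width": w,
        "channel_width": n * w,  # total channel width is n*w
    }

info = parse_dram_name("DDR4_2400_16x4")
```

For `DDR4_2400_16x4` the derived channel width is 16 * 4 = 64 bits, matching the example in the commit message.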
11831:3c38d3e74980 |
12-Feb-2017 |
Tushar Krishna <tushar@ece.gatech.edu> |
ruby: fix round robin arbiter in garnet2.0
The rr arbiter pointer in garnet was getting updated on every request, even if there is no grant. This was leading to a huge variance in wait time at a router at high injection rates. This patch corrects it to update upon a grant. |
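The corrected pointer behaviour from the commit above can be sketched as below. This is a minimal Python model, not the garnet2.0 code; the class, the `can_grant` flag (standing in for whatever prevents a grant) and the method names are hypothetical.

```python
class RoundRobinArbiter:
    """Round-robin arbiter whose pointer advances only when a grant is issued."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.pointer = 0  # index of the input with highest priority

    def arbitrate(self, requests, can_grant=True):
        """Return the granted input index, or None if nothing is granted."""
        if not can_grant:
            return None  # no grant, so the pointer must not move
        for offset in range(self.num_inputs):
            idx = (self.pointer + offset) % self.num_inputs
            if requests[idx]:
                # Advance past the winner only on an actual grant.
                self.pointer = (idx + 1) % self.num_inputs
                return idx
        return None
```

Because a request that is not granted leaves the pointer untouched, a waiting input keeps its place in the rotation instead of being skipped.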
11830:79c3f6a60392 |
11-Feb-2017 |
Bjoern A. Zeeb <baz21@cam.ac.uk> |
mem: fix printing of 1st cache tags line
Rather than having the 1st line on the Log line and every other line on its own, add a new line to have a common format for all of them. Makes parsing a lot easier.
Reviewed at http://reviews.gem5.org/r/3808/
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11817:594d96c093d0 |
09-Feb-2017 |
Christian Menard <Christian.Menard@tu-dresden.de> |
misc: add a MasterId to the ExternalPort
The Request constructor requires a MasterID. However, an external transactor has no chance of getting a MasterID as it does not have a pointer to the System. This patch adds a MasterID to ExternalMaster to allow external modules to easily generate new Requests.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11804:220375a47eeb |
27-Jan-2017 |
Rahul Thakur <rjthakur@google.com> |
mem: Refactor CommMonitor stats, add basic atomic mode stats
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11803:4f04a6593119 |
27-Jan-2017 |
Rahul Thakur <rjthakur@google.com> |
mem: Add memory footprint probe
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11800:54436a1784dc |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 3/22] reduce include dependencies in some headers
Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include. |
11798:e034a4566653 |
19-Jan-2017 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: guard usage of GPUCoalescer code in Profiler
the GPUCoalescer code is used in the ruby profiler regardless of whether or not the coalescer code has been compiled, which can lead to link/run time errors. here we add #ifdefs to guard the usage of GPUCoalescer code. eventually we should refactor this code to use probe points. |
11797:f61fbb7ceb88 |
19-Jan-2017 |
Matthew Poremba <matthew.poremba@amd.com> |
ruby: Check MessageBuffer space in garnet NetworkInterface
Garnet's NetworkInterface does not consider the size of MessageBuffers when ejecting a Message from the network. Add a size check for the MessageBuffer and only enqueue if space is available. If space is not available, the message is placed in a queue and the credit is held. A callback from the MessageBuffer is implemented to wake the NetworkInterface. If there are messages in the stalled queue, they are processed first, in a FIFO manner, and if successfully ejected, the credit is finally sent back upstream. The maximum size of the stall queue is equal to the number of valid VNETs with MessageBuffers attached. |
11796:315e133f45df |
19-Jan-2017 |
Matthew Poremba <matthew.poremba@amd.com> |
ruby: Add occupancy stats to MessageBuffers
This patch is an updated version of /r/3297.
"The most important statistic for measuring memory hierarchy performance is throughput, which is affected by independent variables, buffer sizing and communication latency. It is difficult/impossible to debug performance issues through series buffers without knowing which are the bottlenecks. For finite buffers, this patch adds statistics for the average number of messages in the buffer, the occupancy of the buffer slots, and number of message stalls." |
11795:588a45268ce4 |
19-Jan-2017 |
Matthew Poremba <matthew.poremba@amd.com> |
ruby: Check all VNETs for injection in garnet NetworkInterface
The NetworkInterface wakeup currently iterates over all VNETs and breaks the loop if a VNET is unable to allocate a VC. This can cause a deadlock if a lower numbered VNET is unable to allocate a VC while a higher numbered VNET has idle VCs. This seems like a bug as Garnet 1.0 uses a while loop over an if-statement, suggesting the break was intended for this while loop. This patch removes the break statement, which allows up to one message to be dequeued from a VNET and injected into the network. |
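The effect of removing the break in the commit above can be illustrated with a small loop. This is a minimal Python sketch, not the garnet NetworkInterface; `inject`, `vnet_queues` and `free_vcs` are hypothetical names.

```python
def inject(vnet_queues, free_vcs):
    """Try to inject at most one message per VNET.

    vnet_queues: list of per-VNET lists of pending messages.
    free_vcs:    list of per-VNET counts of idle virtual channels.
    Returns the list of (vnet, message) pairs that were injected.
    """
    injected = []
    for vnet, queue in enumerate(vnet_queues):
        if not queue:
            continue
        if free_vcs[vnet] == 0:
            # The old code broke out of the loop here, starving higher VNETs.
            continue
        free_vcs[vnet] -= 1
        injected.append((vnet, queue.pop(0)))
    return injected
```

With VNET 0 blocked and VNET 1 holding an idle VC, the message on VNET 1 is still injected.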
11793:ef606668d247 |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes |
11779:25dd0fd23474 |
20-Dec-2016 |
Joel Hestness <jthestness@gmail.com> |
ruby: Make MessageBuffers actually finite sized
When Ruby controllers stall messages in MessageBuffers, the buffer moves those messages off the priority heap and into a per-address stall map. When buffers are finite-sized, the test areNSlotsAvailable() only checks the size of the priority heap, but ignores the stall map, so the map is allowed to grow unbounded if the controller stalls numerous messages. This patch fixes the problem by tracking the stall map size and testing the total number of messages in the buffer appropriately. |
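The accounting fix in the commit above can be sketched as follows. This is a minimal Python model, not the Ruby MessageBuffer; the class and its method names are hypothetical, chosen to mirror the heap, stall map and areNSlotsAvailable() test the commit mentions.

```python
import heapq
from collections import defaultdict

class FiniteBuffer:
    """Finite buffer that counts stalled messages against its capacity."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.heap = []                      # (ready_time, seq, addr, msg)
        self.stall_map = defaultdict(list)  # addr -> [(ready_time, msg)]
        self.stall_map_size = 0             # tracked explicitly, as in the fix
        self._seq = 0                       # tie-breaker for equal ready times

    def are_n_slots_available(self, n):
        # Count both the heap and the stall map, not the heap alone.
        return len(self.heap) + self.stall_map_size + n <= self.max_size

    def enqueue(self, ready_time, addr, msg):
        if not self.are_n_slots_available(1):
            return False
        heapq.heappush(self.heap, (ready_time, self._seq, addr, msg))
        self._seq += 1
        return True

    def stall_head(self):
        """Move the head message off the heap into the per-address stall map."""
        ready_time, _, addr, msg = heapq.heappop(self.heap)
        self.stall_map[addr].append((ready_time, msg))
        self.stall_map_size += 1

    def reanalyze(self, addr):
        """Return all messages stalled on `addr` to the heap."""
        for ready_time, msg in self.stall_map.pop(addr, []):
            heapq.heappush(self.heap, (ready_time, self._seq, addr, msg))
            self._seq += 1
            self.stall_map_size -= 1
```

A buffer of size 2 with two stalled messages has an empty heap but still refuses a third message.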
11778:dccdf4e12a0b |
20-Dec-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: fix typo in DMASequencer::ackCallback() |
11777:ca38721228f3 |
20-Dec-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: fix issue with unused var in DMASequencer
the iterator declared in DMASequencer::ackCallback() is only used in an assert, this causes clang to fail when building fast. here we move the find call on the request table directly into the assert. |
11767:6ef6e5dbff2d |
19-Dec-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Make the BaseXBar public to not confuse Python wrappers
The Python wrappers generally assume that destructors are public. Make the BaseXBar destructor public to avoid confusing the Python wrapper.
Change-Id: If958802409c0be74e875dd6e279742abfdb3ede1 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> |
11762:29d401db3746 |
15-Dec-2016 |
Jieming Yin <jieming.yin@amd.com> |
ruby: Detect garnet network-level deadlock.
This patch detects garnet network deadlock by monitoring network interfaces. If a network interface continuously fails to allocate virtual channels for a message, a possible deadlock is detected. |
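The detection idea in the commit above amounts to counting consecutive failed virtual channel allocations. This is a minimal Python sketch; the class name and the `threshold` parameter are hypothetical, and the commit message does not state how the actual limit is chosen.

```python
class DeadlockMonitor:
    """Flag a possible deadlock after too many consecutive VC allocation failures."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical limit on consecutive failures
        self.failures = 0

    def record(self, allocated):
        """Record one allocation attempt; return True if a deadlock is suspected."""
        if allocated:
            self.failures = 0  # any success resets the streak
        else:
            self.failures += 1
        return self.failures >= self.threshold
```

A single successful allocation resets the counter, so only continuous failure triggers the warning.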
11755:81db27b8869a |
05-Dec-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Remove RubyMemoryControl and associated files
This patch removes the deprecated RubyMemoryControl. The DRAMCtrl module should be used instead. |
11751:cd6248b276a8 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Respond to InvalidateReq when the block is (pending) dirty
Previously when an InvalidateReq snooped a cache with a dirty block or a pending modified MSHR, it would invalidate the block or set the postInv flag. The cache would not send an InvalidateResp, though, causing memory order violations. This patch changes this behavior, making the cache with the dirty block or pending modified MSHR the ordering point.
Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11750:c15cc4d973ea |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Invalidate a blk when servicing the 1st invalidating target
Previously an MSHR with one or more invalidating targets would first service all targets in the MSHR TargetList and then invalidate the block. As a result, any snooping targets serviced in the meantime would look up the cache and incorrectly find the block. This patch forces the invalidation to happen when the first invalidating target is encountered.
Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11749:3b2cb95f48ed |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Allow non invalidating snoops on an InvalidateReq MSHR
This patch changes an assertion that previously assumed that a non invalidating snoop request should never be serviced by an InvalidateReq MSHR. The MSHR serves as the ordering point for the snooping packet. When the InvalidateResp reaches the cache the snooping packet snoops the caches above to find the requested block. One or more of the caches above will have the block since earlier it has seen a WriteLineReq.
Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11748:55bd32c72867 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Don't use hasSharers in the snoopFilter for memory responses
When the snoopFilter receives a response, it updates its state using the hasSharers flag (indicates whether there is more than one copy of the block in the caches above). The hasSharers flag of the packet was previously populated when the request was traversing and snooping the caches looking for the block. 1) When the response is coming from the memory-side port, its order with respect to other responses is not necessarily preserved (e.g., a request that arrived second to the xbar can get its response first). As a result the snoopFilter might process responses out of order, updating its residency information using a hasSharers flag that is no longer valid, having been populated much earlier. 2) When the response is from an on-chip cache, the MSHRs preserve a well-defined order and the hasSharers flag should contain valid information.
This patch changes the snoopFilter by avoiding the hasSharers flag when the response is from the memory-side port.
Change-Id: Ib2d22a5b7bf3eccac64445127d2ea20ee74bb25b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11747:a6da15219f95 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Always use InvalidateReq to service WriteLineReq misses
Previously, a WriteLineReq that missed in a cache would send out an InvalidateReq if the block lookup failed or an UpgradeReq if the block lookup succeeded but the block had sharers. This change ensures that a WriteLineReq always sends an InvalidateReq to invalidate all copies of the block and satisfy the WriteLineReq.
Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> |
11746:6b84b831f47d |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Assert that the responderHadWritable is set only once
Change-Id: Ie3beeef25331f84a0a5bcc17f7a791f4a829695b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11745:3102db8903f5 |
05-Dec-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure InvalidateReq is considered isForward by MSHRs
This patch fixes an issue where an MSHR would incorrectly be perceived to provide data to targets arriving after an InvalidateReq. To address this the InvalidateReq is now treated as isForward, much like an UpgradeReq that did not hit in the cache.
Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11744:5d33c6972dda |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Make packet debug printing more uniform
Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change.
Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11742:3dcf0b891749 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Service only the 1st FromCPU MSHR target on ReadRespWithInv
A response to a ReadReq can either be a ReadResp or a ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we assume that the response will be a ReadResp. When the response is invalidating (ReadRespWithInvalidate), servicing more than one target can potentially violate the memory ordering. This change fixes the way we handle a ReadRespWithInvalidate. When a cache receives a ReadRespWithInvalidate we service only the first FromCPU target and all the FromSnoop targets from the MSHR target list. The rest of the FromCPU targets are deferred and serviced by a new request.
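The servicing rule described above can be sketched as follows. This is only an illustration under stated assumptions: `Target`, `Source`, `Split` and `splitOnInvalidatingResponse` are hypothetical names, not gem5's actual MSHR classes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of an MSHR target; not gem5's real class.
enum class Source { FromCPU, FromSnoop };

struct Target {
    Source source;
    int id;
};

struct Split {
    std::vector<Target> serviced;  // handled by this invalidating response
    std::vector<Target> deferred;  // to be serviced by a new request
};

// On an invalidating response, service the first FromCPU target and every
// FromSnoop target; defer the remaining FromCPU targets.
inline Split splitOnInvalidatingResponse(const std::vector<Target> &targets)
{
    Split out;
    bool cpuServiced = false;
    for (const Target &t : targets) {
        if (t.source == Source::FromSnoop) {
            out.serviced.push_back(t);
        } else if (!cpuServiced) {
            out.serviced.push_back(t);
            cpuServiced = true;
        } else {
            out.deferred.push_back(t);
        }
    }
    return out;
}
```

The original order is kept inside each output list, so deferred targets can be replayed in arrival order.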
Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11741:72916416d2e2 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Keep track of allocOnFill in the TargetList
Previously the information of whether a response was allocating or not was a property of the MSHR. This change makes this flag a property of the TargetList. Different TargetLists, e.g. the targets and the deferred targets lists, might have different values. Additionally, the information about whether each target expects an allocating response is stored inside the TargetList container. This allows for repopulating the flag in case some of the targets are removed.
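A minimal sketch of the idea, assuming a much simplified list: each target records its own allocOnFill expectation, and the list-level flag is rebuilt after a removal. The class layout and the names `add`, `remove`, `needsAllocOnFill` and `populateFlags` are illustrative assumptions, not the gem5 interface.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Hypothetical, simplified target; field names are illustrative only.
struct Target {
    int id;
    bool allocOnFill;  // does this target expect an allocating response?
};

class TargetList {
  public:
    void add(int id, bool allocOnFill)
    {
        targets.push_back({id, allocOnFill});
        allocOnFillFlag = allocOnFillFlag || allocOnFill;
    }

    // Remove a target, then rebuild the summary flag from what is left.
    void remove(int id)
    {
        targets.remove_if([id](const Target &t) { return t.id == id; });
        populateFlags();
    }

    bool needsAllocOnFill() const { return allocOnFillFlag; }

  private:
    void populateFlags()
    {
        allocOnFillFlag = std::any_of(
            targets.begin(), targets.end(),
            [](const Target &t) { return t.allocOnFill; });
    }

    std::list<Target> targets;
    bool allocOnFillFlag = false;
};
```

Keeping the per-target bit is what makes the rebuild possible; a single list-level flag alone could not be corrected once the target that set it is gone.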
Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11740:6e1cb0f750c0 |
05-Dec-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for repopulating the flags of an MSHR TargetList
This patch adds support for repopulating the flags of an MSHR TargetList. The added functionality makes it possible to remove targets from a TargetList without leaving it in an inconsistent state.
Change-Id: I3f7a8e97bfd3e2e49bebad056d11bbfb087aad91 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11732:e15e445c21a6 |
02-Dec-2016 |
Matthew Poremba <matthew.poremba@amd.com> |
ruby: Fix overflow reported by ASAN in MessageBuffer.
In MessageBuffer the m_not_avail_count member is incremented but not used. This causes an overflow reported by ASAN. This patch changes from an int to Stats::Scalar, since the count is useful in debugging finite MessageBuffers. |
11722:f15f02d8c79e |
30-Nov-2016 |
Sophiane Senni <sophiane.senni@gmail.com> |
mem: Split the hit_latency into tag_latency and data_latency
If the cache access mode is parallel, i.e. "sequential_access" parameter is set to "False", tags and data are accessed in parallel. Therefore, the hit_latency is the maximum latency between tag_latency and data_latency. On the other hand, if the cache access mode is sequential, i.e. "sequential_access" parameter is set to "True", tags and data are accessed sequentially. Therefore, the hit_latency is the sum of tag_latency plus data_latency.
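The relationship between the three latencies can be written down directly. The sketch below assumes a plain integer `Cycles` alias and a free function `hitLatency`; both are stand-ins for illustration, not the cache model's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Cycles = std::uint64_t;  // stand-in for a cycle count type

// Parallel access overlaps the tag and data lookups, so the slower one
// dominates; sequential access performs them one after the other.
inline Cycles hitLatency(Cycles tagLatency, Cycles dataLatency,
                         bool sequentialAccess)
{
    return sequentialAccess ? tagLatency + dataLatency
                            : std::max(tagLatency, dataLatency);
}
```

For example, with a tag latency of 2 and a data latency of 3, the parallel mode gives 3 cycles and the sequential mode gives 5.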
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11715:31b2c4b52047 |
21-Nov-2016 |
Jieming Yin <jieming.yin@amd.com> |
ruby: Fix potential bugs in garnet2.0
1. Delete unused variable from struct LinkEntry 2. Correct GarnetExtLink and GarnetIntLink inheritance |
11712:70052ef97ce1 |
21-Nov-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: add default ctor for MachineID type
not all uses of MachineID initialize its fields, so here we add a default ctor. |
11710:9e5050028323 |
19-Nov-2016 |
Sooraj Puthoor <puthoorsooraj@gmail.com> |
ruby: init MessageSizeType of SequencerMsg to Request_Control
SequencerMsg is autogenerated by slicc scripts and the MessageSizeType is initialized to the max enum value by default. The DMASequencer pushes this message to the mandatory queue and since the MessageSizeType is uninitialized, the string_to_MessageSizeType() function used by traces to print the message fails with a panic. This patch avoids this problem by initializing MessageSizeType of SequencerMsg to Request_Control. |
11704:c38fcdaa5fe5 |
26-Oct-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
hsail,gpu-compute: fixes to appease clang++
fixes to appease clang++. tested on:
Ubuntu clang version 3.5.0-4ubuntu2~trusty2 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Ubuntu clang version 3.6.0-2ubuntu1~trusty1 (tags/RELEASE_360/final) (based on LLVM 3.6.0)
the fixes address the following five issues:
1) the exec continuations in gpu_static_inst.hh were marked as protected when they should be public. here we mark them as public
2) the Abs instruction uses std::abs() in its execute method. because Abs is templated, it can also operate on U32 and U64 types, which cause Abs::execute() to pass uint32_t and uint64_t types to std::abs() respectively. this triggers a warning because std::abs() has no effect in this case. to remedy this we add template specialization for the execute() method of Abs when its template parameter is U32 or U64.
3) Some protocols that utilize the code in cprintf.hh were missing includes to BoolVec.hh, which defines operator<< for the BoolVec type. This would cause issues when the generated code would try to pass a BoolVec type to a method in cprintf.hh that used operator<< on an instance of a BoolVec.
4) Surprise, clang doesn't like it when you clobber all the bits in a newly allocated object. I.e., this code:
tlb = new GpuTlbEntry[size]; std::memset(tlb, 0, sizeof(GpuTlbEntry) * size);
Let's use std::vector to track the TLB entries in the GpuTlb now...
5) There were a few variables used only in DPRINTFs, so we mark them with M5_VAR_USED. |
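Items 2 and 4 in the list above lend themselves to a short sketch. The code below is an illustration under assumptions: `absValue`, `TlbEntry` and `makeTlb` are hypothetical stand-ins for the Abs instruction's execute() method and the GPU TLB entry type, and only show the general technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Item 2: generic version for signed operands.
template <typename T>
inline T absValue(T v)
{
    return std::abs(v);
}

// Specializations for unsigned operands: the value is already non-negative,
// so std::abs() is never called on an unsigned type.
template <>
inline std::uint32_t absValue<std::uint32_t>(std::uint32_t v)
{
    return v;
}

template <>
inline std::uint64_t absValue<std::uint64_t>(std::uint64_t v)
{
    return v;
}

// Item 4: a hypothetical entry type with default member initializers.
struct TlbEntry {
    std::uint64_t vaddr = 0;
    bool valid = false;
};

// std::vector value-initializes every entry, so no memset over the raw
// object bytes is needed and the storage is released automatically.
inline std::vector<TlbEntry> makeTlb(std::size_t size)
{
    return std::vector<TlbEntry>(size);
}
```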
11702:0bf388858d1e |
26-Oct-2016 |
Michael LeBeane <michael.lebeane@amd.com> |
ruby: Allow multiple outstanding DMA requests
DMA sequencers and protocols can currently only issue one DMA access at a time. This patch implements the necessary functionality to support multiple outstanding DMA requests in Ruby. |
11689:9d19bb965564 |
26-Oct-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: make a RequestDesc class instead of std::pair
the RequestDesc was previously implemented as a std::pair, which made the implementation overly complex and error prone. here we encapsulate the packet, primary, and secondary types all in a single data structure with all members properly initialized in a ctor |
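A sketch of what such a structure could look like, assuming an opaque `Packet` type and a small `ReqType` enum. The member names and the enum values are assumptions made for illustration; the actual definitions live in the gem5 sources.

```cpp
#include <cassert>

// Hypothetical stand-ins for the packet and request-type definitions.
struct Packet;
enum class ReqType { Null, Load, Store };

// Named members replace the nested first/second accesses of a std::pair,
// and both constructors leave every member in a defined state.
struct RequestDesc {
    RequestDesc()
        : pkt(nullptr), primaryType(ReqType::Null),
          secondaryType(ReqType::Null)
    {}

    RequestDesc(Packet *p, ReqType primary, ReqType secondary)
        : pkt(p), primaryType(primary), secondaryType(secondary)
    {}

    Packet *pkt;
    ReqType primaryType;
    ReqType secondaryType;
};
```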
11679:4aa51b4a2f24 |
13-Oct-2016 |
Omar Naji <Omar.Naji@arm.com> |
mem: add DRAM powerdown current
Change-Id: I763cffe0c69f5ebbbf6a6eb12bec5c13d5d0161d Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11678:8c6991a00515 |
13-Oct-2016 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add DRAM low-power functionality
Added power-down state transitions to the DRAM controller model.
Added per rank parameter, outstandingEvents, which tracks the number of outstanding command events and is used to determine when the controller should transition to a low power state. The controller will only transition when there are no outstanding events scheduled and the number of command entries for the given rank is 0.
The outstandingEvents parameter is incremented for every RD/WR burst, PRE, and REF event scheduled. ACT is implicitly covered by RD/WR since burst will always issue and complete after a required ACT. The parameter is decremented when the event is serviced (completed).
The controller will automatically transition to ACT power down, PRE power down, or SREF.
Transition to ACT power down state scheduled from: 1) The RespondEvent, where read data is received from the memory. ACT power-down entry will be scheduled when one or more banks is open, all commands for the rank have completed (no more commands scheduled), and there are no commands in queue for the rank
Transition to PRE power down scheduled from: 1) respondEvent, when all banks are closed, all commands have completed, and there are no commands in queue for the rank 2) prechargeEvent when all banks are closed, all commands have completed, and there are no commands in queue for the rank 3) refreshEvent, after the refresh is complete when the previous state was ACT power-down 4) refreshEvent, after the refresh is complete when the previous state was PRE power-down and there are commands in the queue.
Transition to SREF will be scheduled from: 1) refreshEvent, after the refresh completes when the previous state was PRE power-down with no commands in queue
Power-down exit commands are scheduled from: 1) The refreshEvent, prior to issuing a refresh 2) doDRAMAccess, to wake-up the rank for RD/WR command issue.
Self-refresh exit commands are scheduled from: 1) The next request event, when the queue has commands for the rank in the readQueue or there are commands for the rank in the writeQueue and the bus state is WRITE.
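The respond-event part of the rules above can be sketched with a small per-rank counter. This is a deliberately reduced model: `Rank`, `PowerTarget` and `onRespond` are hypothetical names, and the refresh-driven transitions, SREF and the exit commands are left out.

```cpp
#include <cassert>

enum class PowerTarget { None, ActPowerDown, PrePowerDown };

struct Rank {
    unsigned outstandingEvents = 0;  // scheduled RD/WR burst, PRE, REF events
    unsigned queuedCommands = 0;     // command entries queued for this rank
    unsigned openBanks = 0;

    void eventScheduled() { ++outstandingEvents; }

    void eventServiced()
    {
        assert(outstandingEvents > 0);
        --outstandingEvents;
    }

    // Decision taken when a respond event completes: stay up while work is
    // pending, otherwise pick the power-down flavour from the bank state.
    PowerTarget onRespond() const
    {
        if (outstandingEvents != 0 || queuedCommands != 0)
            return PowerTarget::None;
        return openBanks > 0 ? PowerTarget::ActPowerDown
                             : PowerTarget::PrePowerDown;
    }
};
```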
Change-Id: I6103f660776e36c686655e71d92ec7b5b752050a Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11677:beaf1afe2f83 |
13-Oct-2016 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add callback to compute stats prior to dump event
The per rank statistics are periodically updated based on state transition and refresh events.
Add a method to update these when a dump event occurs to ensure they reflect accurate values. Specifically, need to ensure that the low-power state durations, power, and energy are logged correctly.
Change-Id: Ib642a6668340de8f494a608bb34982e58ba7f1eb Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11676:8a882e297eb2 |
13-Oct-2016 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Modify drain to ensure banks and power are idled
Add constraint that all ranks have to be in PWR_IDLE before signaling drain complete
This will ensure that the banks are all closed and the rank has exited any low-power states.
On suspend, update the power stats to sync the DRAM power logic
The logic maintains the location of the signalDrainDone method, which is still triggered from either: 1) Read response event 2) Next request event
This ensures that the drain will complete in the READ bus state and minimizes the changes required.
Change-Id: If1476e631ea7d5999fe50a0c9379c5967a90e3d1 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11675:60d18201148d |
13-Oct-2016 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Sort memory commands and update DRAMPower
Add local variable to store commands to be issued. These commands are in order within a single bank but will be out of order across banks & ranks.
A new procedure, flushCmdList, sorts commands across banks / ranks, and flushes the sorted list, up to curTick() to DRAMPower. This is currently called in refresh, once all previous commands are guaranteed to have completed. Could be called in other events like the powerEvent as well.
By only flushing commands up to curTick(), the command list will not get out of sync when flushed at a periodic stats dump (done in a subsequent patch).
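The sort-then-flush step might look roughly like the sketch below. The `Command` layout, the free-function form of `flushCmdList`, and the choice to treat a timestamp equal to `now` as flushable are all assumptions; the real method hands the commands to DRAMPower rather than returning them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Tick = std::uint64_t;

// Hypothetical command record; the field set is illustrative only.
struct Command {
    int type;
    int bank;
    Tick timestamp;
};

// Sort the pending commands by time across banks and ranks, then hand over
// only those at or before `now`. Later commands stay in `pending`.
inline std::vector<Command> flushCmdList(std::vector<Command> &pending,
                                         Tick now)
{
    std::stable_sort(pending.begin(), pending.end(),
                     [](const Command &a, const Command &b) {
                         return a.timestamp < b.timestamp;
                     });
    auto firstLater = std::find_if(
        pending.begin(), pending.end(),
        [now](const Command &c) { return c.timestamp > now; });
    std::vector<Command> flushed(pending.begin(), firstLater);
    pending.erase(pending.begin(), firstLater);
    return flushed;
}
```

A stable sort keeps the per-bank issue order intact for commands that share a timestamp.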
Change-Id: I4ac65a52407f64270db1e16a1fb04cfe7f638851 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11674:b4015b449dd3 |
13-Oct-2016 |
Omar Naji <Omar.Naji@arm.com> |
mem: update DDR3 die revision
Change-Id: I8992ddc1664c3ed4b2d36d8a34e4ce8be113b9de Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> |
11673:9f3ccf96bb5a |
13-Oct-2016 |
Omar Naji <Omar.Naji@arm.com> |
mem: add DRAM powerdown timing |
11672:55276af429ed |
13-Oct-2016 |
Omar Naji <Omar.Naji@arm.com> |
mem: make DDR4 x16 |
11667:ebf2acd02fc5 |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
ruby: Add M5_VAR_USED before variables used only inside assert in garnet2.0. This removes errors when building gem5.fast |
11666:10d59d546ea2 |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
ruby: garnet2.0 Revamped version of garnet with more optimized single-cycle routers, more configurability, and cleaner code. |
11665:db895719c482 |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
ruby: remove the original garnet code. Only garnet2.0 will be supported henceforth. |
11664:2365e9e396f7 |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
config: add port directions and per-router delay in topology. This patch adds port direction names to the links during topology creation, which can be used for better printed names for the links or for users to code up their own adaptive routing algorithms. It also adds support for every router to have an independent latency value to support heterogeneous topologies with the subsequent garnet2.0 patch. |
11663:cf870cd20cfc |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
config: make internal links in network topology unidirectional. This patch makes the internal links within the network topology unidirectional, thus allowing any deadlock-free routing algorithms to be specified from the topology itself using weights. This patch also renames Mesh.py and MeshDirCorners.py to Mesh_XY.py and MeshDirCorners_XY.py (Mesh with XY routing). It also adds a Mesh_westfirst.py and CrossbarGarnet.py topologies. |
11660:cfa97c37117a |
06-Oct-2016 |
Tushar Krishna <tushar@ece.gatech.edu> |
ruby: rename ALPHA_Network_test protocol to Garnet_standalone. Over the past 6 years, we realized that the protocol is essentially used to run the garnet network in a standalone manner, and feed standard synthetic traffic patterns through it. |
11654:49cbf4bb0d36 |
29-Sep-2016 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: correct size for partial memory writes
Fixed AbstractController::queueMemoryWritePartial to specify the correct size for partial memory writes. |
11653:fab5e4523380 |
29-Sep-2016 |
Brad Beckmann <Brad.Beckmann@amd.com> |
mem: minor dprintf fix to abstract mem
print number of bytes written as a decimal number, not hex |
11614:29606f000389 |
22-Aug-2016 |
David Hashe <david.j.hashe@gmail.com> |
cpu, mem, sim: Change how KVM maps memory
Only map memories into the KVM guest address space that are marked as usable by KVM. Create BackingStoreEntry class containing flags for is_conf_reported, in_addr_map, and kvm_map. |
11610:3fb50f935a6a |
14-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Print an MSHR without triggering any assertions
Previously printing an MSHR would trigger an assertion if the MSHR was not in service or if the targets list was empty. This patch changes the print function to bypass the accessor functions for postInvalidate and postDowngrade and avoid the relevant assertions. It also checks if the targets list is empty before calling print on it.
Change-Id: Ic18bee6cb088f63976112eba40e89501237cfe62 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11605:65ae342b627b |
12-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Add support for secure packets in the snoop filter
Secure and non-secure data can coexist in the cache and therefore the snoop filter should treat packets with secure and non-secure accesses differently. This patch uses the lower bits of the line address to keep track of whether the packet is addressing secure memory or not.
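One way to picture the encoding: a line address is block aligned, so its low bits are always zero and one of them can carry the secure flag. The helper `filterKey` and the choice of bit 0 are assumptions for illustration, not necessarily what the snoop filter does.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Build a lookup key from a packet address. Assumes the block size is a
// power of two greater than one, so bit 0 of the line address is unused.
inline Addr filterKey(Addr addr, Addr blockSize, bool isSecure)
{
    assert(blockSize > 1 && (blockSize & (blockSize - 1)) == 0);
    const Addr lineAddr = addr & ~(blockSize - 1);
    return isSecure ? (lineAddr | Addr{1}) : lineAddr;
}
```

With this key, a secure and a non-secure access to the same line map to different filter entries.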
Change-Id: I54a5e614dad566a5083582bede86c86896f2c2c1 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11604:b254396b7759 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add snoop filter to SystemXBar by default
This patch changes the default behaviour of the SystemXBar, adding a snoop filter. With the recent updates to the snoop filter allocation behaviour this change no longer causes problems for the regressions without caches.
Change-Id: Ibe0cd437b71b2ede9002384126553679acc69cc1 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11603:900cca8c5b04 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use FromCache attribute in snoop filter allocation
This patch improves the snoop filter allocation decisions by not only looking at whether a port is snooping or not, but also if the packet actually came from a cache. The issue with only looking at isSnooping is that the CPU ports, for example, are snooping, but not actually caching. Previously we ended up incorrectly allocating entries in systems without caches (such as the atomic and timing quick regressions). Eventually these misguided allocations caused the snoop filter to panic due to an excessive size.
On the request path we now include the fromCache check on the packet itself, and for responses we check if we actually have a snoop-filter entry.
Change-Id: Idd2dbc4f00c7e07d331e9a02658aee30d0350d7e Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11602:7e0199f80816 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update mostly exclusive policy even further
This patch takes yet another step in maintaining the clusivity, in that it allows a mostly-inclusive cache to hold on to blocks even when responding to a ReadExReq or UpgradeReq. Previously the cache simply invalidated these blocks, but there is no strict need to do so.
The most important part of this patch is that we simply mark the block clean when satisfying the upstream request where the cache is allowed to keep the block. The only tricky part of the patch is in the memory management of deferred snoops, where we need to distinguish the cases where only the packet was copied (we expected to respond), and the cases where we created an entirely new packet and request (we kept it only to replay later).
The code in satisfyRequest is definitely ready for some refactoring after this.
Change-Id: I201ddc7b2582eaa46fb8cff0c7ad09e02d64b0fc Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11601:382e0637fae0 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update mostly exclusive cache policy to cover more cases
This patch changes how the mostly exclusive policy is enforced to ensure that we drop blocks when we should. As part of this change, the actual invalidation due to the clusivity enforcement is moved outside the hit handling, to a separate method maintainClusivity. For the timing mode that means we can deal with all MSHR targets before taking any action and possibly dropping the block. The method satisfyCpuSideRequest is also renamed satisfyRequest as part of this change (since we only ever see requests from the cpu-side port).
Change-Id: If6f3d1e0c3e7be9a67b72a55e4fc2ec4a90fd3d2 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> |
11600:a38c3f9c82d1 |
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a FromCache packet attribute
This patch adds a FromCache attribute to the packet, and updates a number of the existing request commands to reflect that the request originates from a cache. The attribute simplifies checking if a request came from a cache or not, and this is used by both the cache and snoop filter in follow-on patches.
Change-Id: Ib0a7a080bbe4d6036ddd84b46fd45bc7eb41cd8f Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Steve Reinhardt <stever@gmail.com> |
11596:329e49c419b1 |
10-Aug-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
ruby: Implement support for functional accesses to PIO ranges
There are cases where we want to put boot ROMs on the PIO bus. Ruby currently doesn't support functional accesses to such memories since functional accesses are always assumed to go to physical memory. Add the required support for routing functional accesses to the PIO bus.
Change-Id: Ia5b0fcbe87b9642bfd6ff98a55f71909d1a804e3 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brad Beckmann <brad.beckmann@amd.com> Reviewed-by: Michael LeBeane <michael.lebeane@amd.com> |
11564:dac4b77b5a49 |
21-Jul-2016 |
David Guillen Fandos <david.guillen@arm.com> |
mem: Add snoop traffic statistic |
11558:b921b96cbf74 |
11-Jul-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Remove stale argument from a DPRINTF in the cache code
Change-Id: I70dd11c23b45dfc606ef08233d2e50fcc0817505 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11556:7aa1142a5730 |
01-Jul-2016 |
Matthew Poremba <Matthew.Poremba@amd.com> |
ruby: Fix double statistic registration in garnet
Currently garnet will not run due to double statistic registration of new stats in ClockedObject. This occurs because a temporary array named 'cls' is being added as a child to garnet internal and external link SimObjects. This patch simply renames the temporary array which prevents it from being added as a child object and avoids the assertion that a statistic was already registered.
Committed by Jason Lowe-Power <jason@lowepower.com> |
11555:2efa95cf8504 |
01-Jul-2016 |
Matthias Jung <jungma@eit.uni-kl.de> |
ext: Update DRAMPower
Sync DRAMPower to external tool
This patch syncs the DRAMPower library of gem5 to the external one on github (https://github.com/ravenrd/DRAMPower) of which I am a maintainer.
The version used is the commit: 902a00a1797c48a9df97ec88868f20e847680ae6 from 07. May. 2016.
Committed by Jason Lowe-Power <jason@lowepower.com> |
11551:d24ad08b22b0 |
01-Jul-2016 |
Abdul Mutaal Ahmad <abdul.mutaal@gmail.com> |
mem: different HMC configuration
In this new HMC configuration we have used the existing components in gem5, mainly [SerialLink], [NoncoherentXbar] & [DRAMCtrl], to define 3 different architectures for HMC.
Highlights
1- It explores 3 different HMC architectures
2- It creates 4-HMC crossbars and attaches 16 vault controllers with it. This will connect vaults to serial links
3- Compared to the previous version, the HMCController with round-robin functionality is removed and all the serial links are accessible directly from user ports
4- Latency incorporated by HMCController (in previous version) is being added to SerialLink
Committed by Jason Lowe-Power <jason@lowepower.com> |
11544:2383451ff6a5 |
20-Jun-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: Fix the snoop filter when there is a downstream addr mapper
The snoop filter handles requests in two steps which precede and follow the call to send the packet downstream. An address mapper could possibly change the address of the packet when it is sent downstream, breaking the snoop filter assumption that the address is unchanged.
Change-Id: Ib2db755e9ebef4f2f7c0169a46b1b11185ffbe79 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11523:81332eb10367 |
06-Jun-2016 |
David Guillen Fandos <david.guillen@arm.com> |
stats: Fixing regStats function for some SimObjects
Fixing an issue with regStats not calling the parent class method for most SimObjects in Gem5. This causes issues if one adds new stats in the base class (since they are never initialized properly!).
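The pattern being fixed is the usual "call the base-class override first" idiom. The miniature hierarchy below is hypothetical (`BaseObject`, `CacheObject` and the stat names are made up) and only shows why the parent call matters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of an object hierarchy with registered statistics.
struct BaseObject {
    virtual ~BaseObject() = default;

    virtual void regStats() { registered.push_back("base.numCycles"); }

    std::vector<std::string> registered;
};

struct CacheObject : BaseObject {
    void regStats() override
    {
        // Without this call, stats added to the base class are never set up.
        BaseObject::regStats();
        registered.push_back("cache.hits");
    }
};
```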
Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11522:348411ec525a |
06-Jun-2016 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
sim: Call regStats of base-class as well
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there.
Patch originally provided by Akash Bagdia at ARM Ltd. |
11519:bf08fb8ccf4b |
03-Jun-2016 |
Marco Elver <marco.elver@ed.ac.uk> |
ruby: Implement SwapReq support
This implements SwapReq for Ruby memory.
A SwapReq should be treated like a write, except that the response packet contains the overwritten data.
Note that, in particular, the conditional checking for isStore/isLoad needs to be reversed, as a SwapReq is both. |
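The semantics described above (write the new value, return the old one) correspond to an exchange operation. The function `swapAccess` below is a hypothetical sketch on a single word, not the Ruby implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// A swap behaves like a write, except that the overwritten data is what
// goes back in the response.
inline std::uint64_t swapAccess(std::uint64_t &memory, std::uint64_t newValue)
{
    return std::exchange(memory, newValue);
}
```

Because the operation both reads and writes, code that classifies an access must test for the write property before falling back to the read-only path.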
11493:06b73eb44660 |
26-May-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix memory leak in handling of deferred snoops
This patch fixes a memory leak where deferred snoop packets never got deallocated. On the call to MSHR::handleSnoop these snoops were treated as if a response will be sent, as the MSHR was pendingModified. Consequently, a copy of the packet was created and added to the MSHR targets. However, a preceding target to the same MSHR, originally from a CPU, was serviced before the snoop, and caused the block to be invalidated. This happens for ReadExReq and UpgradeReq.
Note that the original snoop will receive a response, just not from the cache in question, but instead from the cache upstream that issued the ReadExReq or UpgradeReq.
Change-Id: I4ac012fbc8a46cf693ca390fe9476105d444e6f4 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
11490:e03a6233d061 |
26-May-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not set cacheResponding on MSHR snoop if not responding
This patch changes the flow control for MSHR::handleSnoop to ensure that we only set cacheResponding on the snoop packet if we are actually responding. This avoids situations where a responder is stalling indefinitely on a response that never arrives.
Change-Id: I691dd01755b614b30203581aa74fc743b350eacc Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> |
11489:47aca087ebb4 |
26-May-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix MemChecker unique_ptr type mismatch
This patch fixes the type of the unique_ptr instances, to ensure that the data that is allocated with new[] is also deleted with delete[]. The issue was highlighted by ASAN.
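The general rule behind this fix: memory obtained with new[] must be owned by the array form of std::unique_ptr so that delete[] is used. The helper `makeBuffer` is a hypothetical example, not MemChecker code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// std::unique_ptr<std::uint8_t> would release this memory with delete,
// which is undefined behaviour for an array. The T[] form uses delete[].
inline std::unique_ptr<std::uint8_t[]> makeBuffer(std::size_t size)
{
    // The trailing () value-initializes the bytes to zero.
    return std::unique_ptr<std::uint8_t[]>(new std::uint8_t[size]());
}
```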
Change-Id: I2c5510424959d862a9954d83e728d901bb18d309 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
11486:f09bb73b3050 |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: fix headers include order in the cache related classes
Change-Id: Ia57cc104978861ab342720654e408dbbfcbe4b69 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11485:8ca4fbefff3e |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: remove redundant check whether the cache forwards snoops
Change-Id: I57b56771086e1e2f512977fb7248d93c171ab925 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11484:08b33c52a16d |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: change NULL to nullptr in the cache related classes
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11483:d4c2e56d18b2 |
26-May-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
mem: fix the line length in the cache related classes
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11459:e41eca4aecbb |
26-Apr-2016 |
Matthew Poremba <matthew.poremba@amd.com> |
ruby: Rename pkt to m_pkt so it may be accessed via SLICC
Allow usage of packet class in ruby for convenience purposes. This may be used to access members of the packet/request class (e.g., via helper functions) and/or push protocol specific information to the packets SenderState without needing to modify SLICC types and protocols in multiple locations. |
11455:067177a1b578 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Include WriteLineReq in cache demand stats
Somehow the WriteLineReq were never added to the list of commands considered demand. |
11454:e55afadc4e19 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache stats
Prune cache stats that are never actually used. |
11453:dd9763792521 |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deallocate all write-queue entries when sent
This patch removes the write-queue entry tracking previously used for uncacheable writes. The write-queue entry is now deallocated as soon as the packet is sent. As a result we also forego the stats for uncacheable writes. Additionally, there is no longer a need to attach the write-queue entry to the packet. |
11452:4bc3a0c0861c |
21-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align downstream cache packet creation in atomic and timing
This patch makes the control flow more uniform in atomic and timing, ultimately making the code easier to understand. |
11448:8d94df4c9da4 |
15-Apr-2016 |
Joel Hestness <jthestness@gmail.com> |
ruby: Fix block_on behavior
Ruby's controller block_on behavior aimed to block MessageBuffer requests into SLICC controllers when a Locked_RMW was in flight. Unfortunately, this functionality only partially works: When non-Locked_RMW memory accesses are issued to the sequencer to an address with an in-flight Locked_RMW, the sequencer may pass those accesses through to the controller. At the controller, a number of incorrect activities can occur depending on the protocol. In MOESI_hammer, for example, an intermediate IFETCH will cause an L1D to L2 transfer, which cannot be serviced, because the block_on functionality blocks the trigger queue, resulting in a deadlock. Further, if an intermediate store arrives (e.g. from a separate SMT thread), the sequencer allows the request through to the controller, and the atomicity of the Locked_RMW may be broken.
To avoid these problems, disallow the Sequencer from passing any memory accesses to the controller besides Locked_RMW_Write when a Locked_RMW is in-flight. |
11446:ae6e3dd1c32c |
15-Apr-2016 |
Bjoern A. Zeeb <baz21@cam.ac.uk> |
mem: FreeBSD does not provide MAP_NORESERVE either
Like OS X, FreeBSD does not support MAP_NORESERVE. Handle accordingly and update comment.
Committed by Jason Lowe-Power <power.jg@gmail.com> |
11443:df24b9af42c7 |
13-Apr-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix issues flagged by gcc 6
A few warnings (and thus errors) pop up after being added to -Wall:
1. -Wmisleading-indentation
In the auto-generated code there were instances of if/else blocks that were not indented to gcc's liking. This is addressed by adding braces.
2. -Wshift-negative-value
gcc is clever enough to consider ~0 a negative constant, and rightfully complains. This is addressed by using mask() which explicitly casts to unsigned before shifting.
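The shift issue can be avoided by building the mask from an unsigned value. The helpers `lowMask` and `clearLow` below are stand-alone sketches in the spirit of the mask() helper mentioned above; they are assumptions, not the gem5 implementation.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `nbits` bits set, computed entirely on unsigned values.
inline std::uint64_t lowMask(unsigned nbits)
{
    return nbits >= 64 ? ~std::uint64_t{0}
                       : (std::uint64_t{1} << nbits) - 1;
}

// Replacement for expressions of the form (value & (~0 << nbits)), where
// ~0 is a negative int and left-shifting it triggers the gcc 6 warning.
inline std::uint64_t clearLow(std::uint64_t value, unsigned nbits)
{
    return value & ~lowMask(nbits);
}
```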
That is all. Porting done. |
11439:d0368996f1e0 |
07-Apr-2016 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Add priority to QueuedPrefetcher
Queued prefetcher entries now have a priority field. The idea is to add packets ordered by priority and then by age.
For the existing algorithms in which priority doesn't make sense, it is set to 0 for all deferred packets in the queue. |
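An insertion that keeps a queue ordered by priority and then by age could look like this sketch. `DeferredPacket`, `insertByPriority` and the convention that a larger value means higher priority are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical queue entry; a larger priority value is assumed more urgent.
struct DeferredPacket {
    std::uint64_t addr;
    std::uint64_t tick;     // creation time, i.e. the entry's age
    std::int32_t priority;
};

inline void insertByPriority(std::list<DeferredPacket> &queue,
                             const DeferredPacket &dp)
{
    auto it = queue.begin();
    // Skip entries of higher or equal priority. Equal-priority entries that
    // are already queued are older, so they stay ahead of the new one.
    while (it != queue.end() && it->priority >= dp.priority)
        ++it;
    queue.insert(it, dp);
}
```

When every entry has priority 0, the loop always runs to the end and the queue degenerates to plain FIFO order, which matches the behaviour described for the existing algorithms.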
11438:3c9fd319a982 |
07-Apr-2016 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Handful extra features for BasePrefetcher
Some common functionality added to the base prefetcher, mainly dealing with extracting the block address, page address, block index inside the page and some other information that can be inferred from the block address. This is used for some prefetching algorithms, and having the methods in the base, as well as the block size and other information is the sensible way. |
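Illustration (not part of the changeset): a minimal C++ sketch of the kind of address arithmetic described above. AddrHelper and its method names are hypothetical, and power-of-two block and page sizes are assumed.

```cpp
#include <cassert>
#include <cstdint>

using Addr = uint64_t;

// Hypothetical helper bundle; both sizes must be powers of two.
struct AddrHelper
{
    Addr blkSize;
    Addr pageBytes;

    AddrHelper(Addr blk, Addr page) : blkSize(blk), pageBytes(page)
    {
        assert(blk != 0 && (blk & (blk - 1)) == 0);
        assert(page != 0 && (page & (page - 1)) == 0);
        assert(page >= blk);
    }

    // Address of the first byte of the block containing `a`.
    Addr blockAddress(Addr a) const { return a & ~(blkSize - 1); }
    // Address of the first byte of the page containing `a`.
    Addr pageAddress(Addr a) const { return a & ~(pageBytes - 1); }
    // Byte offset of `a` within its page.
    Addr pageOffset(Addr a) const { return a & (pageBytes - 1); }
    // Index of the block containing `a` within its page.
    Addr blockIndexInPage(Addr a) const { return pageOffset(a) / blkSize; }
};
```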
11437:210624864179 |
07-Apr-2016 |
Victor Garcia <victor.garcia@arm.com> |
mem: Add Program Counter to MemTraceProbe |
11436:f351b7f248db |
27-May-2015 |
Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
mem: Add unused prefetch counter in caches
Added stat to the cache to account for HardPF'ed blocks that are evicted before being referenced (over-prefetching). |
11435:0f1b46dde3fa |
07-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit. |
11430:bd1c6789c33f |
07-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
Revert to 74c1e6513bd0 (sim: Thermal support for Linux) |
11429:cf5af0cc3be4 |
06-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11428:20264eb69fbf |
05-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. |
11422:4f749e00b667 |
18-Nov-2014 |
Akash Bagdia <akash.bagdia@ARM.com> |
power: Add power states to ClockedObject
Add 4 power states to the ClockedObject and provide the necessary access functions to check and update the power state. The default power state is UNDEFINED; it is the responsibility of the respective simulation model to provide the startup state and any other logic for state change.
Add number of transition stat. Add distribution of time spent in clock gated state. Add power state residency stat.
Add dump call back function to allow stats update of distribution and residency stats. |
11377:a06a4debe272 |
17-Mar-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Adjust cache queue reserve to more conservative values
The cache queue reserve is there as an overflow to give us enough headroom based on when we block the cache, and how many transactions we may already have accepted before actually blocking. The previous values were probably chosen to be "big enough", when we actually know that we check the MSHRs after every single allocation, and for the write buffers we know that we implicitly may need one entry for every outstanding MSHR. |
11375:f98df9231cdd |
17-Mar-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Create a separate class for the cache write buffer
This patch breaks out the cache write buffer into a separate class, without affecting any stats. The goal of the patch is to avoid encumbering the much-simpler write queue with the complex MSHR handling. In a follow on patch this simplification allows us to implement write combining.
The WriteQueue gets its own class, but shares a common ancestor, the generic Queue, with the MSHRQueue. |
11357:6668387fa488 |
10-Aug-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem, cpu: Add assertions to snoop invalidation logic
This patch adds assertions that enforce that only invalidating snoops will ever reach into the logic that tracks in-order load completion and also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds some comments to MSHR::replaceUpgrades(). |
11352:4e195fb9ec4f |
24-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure that InvalidateReq is not forwarded as ReadExReq
This patch fixes an issue where an InvalidationReq only traversed one level of the cache hierarchy, and was subsequently turned into a ReadExReq due to it needing writable, and the command not being checked for explicitly. |
11347:faf5195f6ca7 |
23-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Add missing override to appease clang
Make clang happy...again. |
11346:64e862d3758f |
18-Feb-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: move range change send from RubyPort to derived classes. |
11343:e777659dcff6 |
17-Feb-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: send address ranges from RubyPort |
11341:bda2c39fd9fd |
15-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add missing overrides to appease clang
Since the last round of fixes a few new issues have snuck in. We should consider switching the regression runs to clang. |
11340:dc0ed2d4da50 |
15-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid using invalid iterator in cache lock list traversal
Fix up issue highlighted by Valgrind and the clang Address Sanitizer. |
11339:c45bfadcd51b |
14-Feb-2016 |
Michael LeBeane <Michael.Lebeane@amd.com> |
ruby: make DMASequencer inherit from RubyPort
This patch essentially rolls back 10518:30e3715c9405 to make RubyPort the parent class of DMASequencer. It removes redundant code and restores some features which were lost when directly inheriting from MemObject. For example, DMASequencer can now communicate to other devices using PIO, which is useful for memory-mapped communication between multiple DMADevices. |
11335:42961fda6d75 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Be less conservative in clearing load locks in the cache
Avoid being overly conservative in clearing load locks in the cache, and allow writes to the line if they are from the same context. This is in line with ALPHA and ARM. |
11334:9bd2e84abdca |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move the point of coherency to the coherent crossbar
This patch introduces the ability of making the coherent crossbar the point of coherency. If so, the crossbar does not forward packets where a cache with ownership has already committed to responding, and also does not forward any coherency-related packets that are not intended for a downstream memory controller. Thus, invalidations and upgrades are turned around in the crossbar, and the memory controller only sees normal reads and writes.
In addition this patch moves the express snoop promotion of a packet to the crossbar, thus allowing the downstream cache to check the express snoop flag (as it should) for bypassing any blocking, rather than relying on whether a cache is responding or not. |
11333:c41d552d6f2e |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align cache behaviour in atomic when upstream is responding
Adopt the same flow as in timing mode, where the caches on the path to memory get to keep the line (if present), and we use the responderHadWritable flag to determine if we need to forward the (invalidating) packet or not. |
11332:40bcb0e97de9 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align how snoops are handled when hitting writebacks
This patch unifies the snoop handling in case of hitting writebacks with how we handle snoops hitting in the tags. As a result, we end up using the same optimisation as the normal snoops, where we inform the downstream cache if we encounter a line in Modified (writable and dirty) state, which enables us to avoid sending out express snoops to invalidate any Shared copies of the line. A few regressions consequently change, as some transactions are sunk higher up in the cache hierarchy. |
11331:cd5c48db28e6 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implementation. |
11325:67cc559d513a |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: eliminate explicit boolean comparisons
Result of running 'hg m5style --skip-all --fix-control -a' to get rid of '== true' comparisons, plus trivial manual edits to get rid of '== false'/'== False' comparisons.
Left a couple of explicit comparisons in where they didn't seem unreasonable: invalid boolean comparison in src/arch/mips/interrupts.cc:155 >> DPRINTF(Interrupt, "Interrupts OnCpuTimerINterrupt(tc) == true\n");<< invalid boolean comparison in src/unittest/unittest.hh:110 >> "EXPECT_FALSE(" #expr ")", (expr) == false)<< |
11321:02e930db812d |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'. |
11320:42ecb523c64a |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'. |
11311:01e65448c425 |
22-Jan-2016 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed Write_Only AccessPermission |
11309:9be8a40026df |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: split CPU and GPU latency stats |
11308:7d8836fd043d |
19-Jan-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
gpu-compute: AMD's baseline GPU model |
11307:bd7d06ea90f5 |
19-Jan-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
mem: write combining for ruby protocols
This patch adds support for write-combining in ruby. |
11306:a5340a2a24f9 |
19-Jan-2016 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
mem: support for gpu-style RMWs in ruby
This patch adds support for GPU-style read-modify-write (RMW) operations in ruby. Such atomic operations are traditionally executed at the memory controller (instead of through an L1 cache using cache-line locking).
Currently, this patch works by propagating operation functors through the memory system. |
11305:78c1e4f5dfc5 |
20-Jul-2015 |
Blake Hechtman <blake.hechtman@amd.com> |
mem: misc flags for AMD gpu model
This patch adds support to mark memory requests/packets with attributes defined in HSA, such as memory order and scope. |
11295:14029d75688d |
11-Jan-2016 |
Steve Reinhardt <stever@gmail.com> |
mem: fix bug in packet access endianness changes
The new Packet::setRaw() method incorrectly still contained an htog() conversion. As a result, calls to the old set() method (now defined as setRaw(htog(v))) underwent two htog conversions, which breaks things when htog() is not a no-op.
Interestingly the only test that caught this was a SPARC boot test, where an IsaFake device with a non-zero return value was getting swapped twice resulting in a register getting loaded with 0x100000000000000 instead of 1. (Good reason for keeping SPARC around, perhaps?) |
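Illustration (not part of the changeset): a minimal C++ sketch of why a redundant conversion matters. A byte swap is its own inverse, so applying it twice leaves the value unconverted. swapBytes64 is a hypothetical stand-in for an htog() that is not a no-op; the actual Packet code is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Portable 64-bit byte swap (no compiler builtins assumed).
inline uint64_t swapBytes64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFF);  // move the lowest byte of v into r
        v >>= 8;
    }
    return r;
}
```

A single swap of 1 gives 0x100000000000000, the value quoted in the message, while two swaps give back 1.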
11294:a368064a2ab5 |
11-Jan-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Enable -Wextra by default
Make best use of the compiler, and enable -Wextra as well as -Wall. There are a few issues that had to be resolved, but they are all trivial. |
11288:57c340f947c7 |
31-Dec-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: add CacheVerbose debug flag, filter noisy DPRINTFs
Some of the DPRINTFs added to the classic cache in cset 45df88079f04, while useful to those unfamiliar with the cache code, end up being noise when you're familiar with the code but are trying to debug tricky protocol issues. (Particularly getting two messages from each cache as it receives a snoop request then declares that there was no match.)
This patch introduces a CacheVerbose debug flag, and moves a subset of the added DPRINTFs into that category, so that Cache by itself returns to being a more succinct summary of cache activity.
Also added a CacheAll compound flag to turn on all the cache-related debug flags (other than CacheTags, which you *really* have to want badly to turn it on, IMO). |
11287:0d5bbeaeb8ca |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not rely on the NeedsWritable flag for responses
This patch removes the NeedsWritable flag for all responses, as it is really only the request that needs a writable response. The response, on the other hand, should in these cases always provide the line in a writable state, as indicated by the hasSharers flag not being set.
When we send requests that have NeedsWritable set, the response will always have the hasSharers flag not set. Additionally, there are cases where the request did not have NeedsWritable set, and we still get a writable response with the hasSharers flag not set. This never happens on snoops, but is used by downstream caches to pass ownership upstream.
As part of this patch, the affected response types are updated, and the snoop filter is similarly modified to check only the hasSharers flag (as it should). A sanity check is also added to the packet class, asserting that we never look at the NeedsWritable flag for responses.
No regressions are affected. |
11286:2071db8f864b |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not allocate space for packet data if not needed
This patch looks at the request and response command to determine if either actually has any data payload, and if not, we do not allocate any space for packet data.
The only tricky case is where the command type is changed as part of the MSHR functionality. In these cases where the original packet had no data, but the new packet does, we need to explicitly call allocate(). |
11285:25715951a4b8 |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not alter cache block state on uncacheable snoops
This patch ensures we do not respond with a Modified (dirty and writable) line if the request is uncacheable, and that the cache responding retains the line without modifying the state (even if responding). |
11284:b3926db25371 |
31-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make cache terminology easier to understand
This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions.
The following name changes are made:
* the packet memInhibit flag is renamed to cacheResponding
* the packet sharedAsserted flag is renamed to hasSharers
* the packet NeedsExclusive attribute is renamed to NeedsWritable
* the packet isSupplyExclusive is renamed responderHadWritable
* the MSHR pendingDirty is renamed to pendingModified
The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand. |
11283:4cc8b312f026 |
20-Jul-2015 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: slicc: have a static MachineType
This patch is imported from reviewboard patch 2551 by Nilay. This patch moves from a dynamically defined MachineType to a statically defined one. The need for this patch was felt since a dynamically defined type prevents us from having types for which no machine definition may exist.
The following changes have been made:
i. each machine definition now uses a type from the MachineType enumeration instead of any random identifier. This required changing the grammar and the *.sm files.
ii. MachineType enumeration defined statically in RubySlicc_Exports.sm.
* * *
normal protocol fixes for nilay's parser machine type fix |
11282:afdcebd314be |
20-Jul-2015 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: slicc: remove support for single machine, multiple types
This patch is imported from reviewboard patch 2550 by Nilay. It was possible to specify multiple machine types with a single state machine. This seems unnecessary and is being removed. |
11279:3fd1142adad9 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Explicitly check MSHR snoops for cases not dealt with
Add a sanity check to make it explicit that we currently do not allow an I/O coherent agent to directly issue writes into the coherent part of the memory system (it has to go via a cache, and get transformed into a read ex, upgrade or invalidation). |
11278:18411ccc4f3c |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache squash functionality
This patch removes the unused squash function from the MSHR queue, and the associated (and also unused) threadNum member from the MSHR. |
11277:4f8703832608 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid unnecessary checks when creating HardPFReq in cache
The checks made before sending out a HardPFReq were unnecessarily complex, and checked for cases that never occur. This patch tidies it up. |
11276:3561d002d8c7 |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not use sender state to track forwarded snoops in cache
This patch changes how the cache tracks which snoops are forwarded, and which ones are created locally. Previously the identification was based on an empty sender state of a specific class, but this method fails to distinguish which cache actually attached the sender state. Instead we use the same mechanism as the crossbar, and keep track of the requests that have outstanding snoops. |
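Illustration (not part of the changeset): a minimal C++ sketch of tracking forwarded snoops by request rather than by sender state. SnoopTracker and the integer request identifier are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical tracker keyed by a request identifier.
class SnoopTracker
{
    std::unordered_set<uint64_t> outstanding;

  public:
    // Record that a snoop for this request was forwarded by this cache.
    void forwardSnoop(uint64_t reqId) { outstanding.insert(reqId); }

    // Returns true if the response belongs to a snoop forwarded here,
    // false if the snoop must have been created locally.
    bool recvSnoopResp(uint64_t reqId)
    {
        return outstanding.erase(reqId) == 1;
    }
};
```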
11275:fc2b0e6550ad |
28-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix cache sender state handling and add clarification
This patch addresses a bug in how the cache attached the MSHR as a sender state. Rather than overwriting any existing sender state it now pushes a new one. The handling of upward snoops is also clarified. |
11271:f4ad5be63ba8 |
17-Dec-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix memory allocation bug in deferred snoop handling
This patch fixes a corner case in the deferred snoop handling, where requests ended up being used by multiple packets with different lifetimes, and inadvertently got deleted while they were still in use. |
11269:33434d6cbd20 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
mem: add request types for acquire and release
Add support for acquire and release requests. These synchronization operations are commonly supported by several modern instruction sets. |
11266:452e10b868ea |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: more flexible ruby tester support
This patch allows the ruby random tester to use ruby ports that may only support instr or data requests. This patch is similar to a previous changeset (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets. This current patch implements the support in a more straight-forward way. Since retries are now tested when running the ruby random tester, this patch splits up the retry and drain check behavior so that RubyPort children, such as the GPUCoalescer, can perform those operations correctly without having to duplicate code. Finally, the patch also includes better DPRINTFs for debugging the tester. |
11256:65db40192591 |
09-Dec-2015 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
mem: remove acq/rel cmds from packet and add mem fence req |
11253:daf9f91b11e9 |
07-Dec-2015 |
Radhika Jagtap <radhika.jagtap@ARM.com> |
cpu: Support virtual addr in elastic traces
This patch adds support to optionally capture the virtual address and asid for load/store instructions in the elastic traces. If they are present in the traces, Trace CPU will set those fields of the request during replay. |
11248:f6db1e80a878 |
07-Dec-2015 |
Radhika Jagtap <radhika.jagtap@ARM.com> |
mem: Add instruction sequence number to request
This patch adds the instruction sequence number to the request and provides a request constructor that accepts a sequence number for initialization. |
11229:1b9331fd8966 |
25-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix search-replace issues in DRAMPower wrapper license
Fix a number of unintentional insertions of 'const'. |
11211:4e70e13c1a2c |
15-Nov-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Add missing explicit overrides for classic caches
Make clang happy when compiling on OSX. |
11210:64c0ebeae224 |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added stl vector of ints to be used by SLICC |
11209:d5a7a4da9f63 |
13-Nov-2015 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
slicc: fixes for the Address to Addr changeset (11025)
misc changes now that Address has become Addr including int to address util function |
11208:fa3e56b6e0b6 |
13-Nov-2015 |
Joe Gross <joseph.gross@amd.com> |
ruby: add BoolVec
The BoolVec typedef and insertion operator overload function simplify usage of vectors of type bool |
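Illustration (not part of the changeset): a minimal C++ sketch of such a typedef and insertion operator. The output format (values separated by spaces) is an assumption.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <vector>

typedef std::vector<bool> BoolVec;

// Print each element followed by a space (format assumed for the sketch).
inline std::ostream &operator<<(std::ostream &os, const BoolVec &v)
{
    for (bool b : v)
        os << b << " ";
    return os;
}
```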
11207:7b7e352f8d7f |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
mem: add boolean to disable PacketQueue's size sanity check
The sanity check, while generally useful for exposing memory system bugs, may be spurious with respect to GPU workloads, which may generate many more requests than typical CPU workloads. The large number of requests generated by the GPU may cause the req/resp queues to back up, thus queueing more than 100 packets. |
11199:929fd978ab4e |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add an option to perform clean writebacks from caches
This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated.
The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable.
The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py |
11197:f8fdd931e674 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add cache clusivity
This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive.
The choice of policy guides the behaviour on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective.
This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache.
Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks.
The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly). |
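Illustration (not part of the changeset): a minimal C++ sketch of an allocate-on-fill decision. The rule used for the mostly exclusive case (allocate only when the requester is not an upstream cache) is an assumption for illustration, and the signature is hypothetical.

```cpp
#include <cassert>

enum class Clusivity { MostlyInclusive, MostlyExclusive };

// Decide whether a returning fill should allocate a block in this cache.
// A mostly inclusive cache always allocates. For the mostly exclusive
// case this sketch assumes the line is kept only when no upstream cache
// will hold it.
inline bool
allocOnFill(Clusivity clusivity, bool requesterIsUpstreamCache)
{
    return clusivity == Clusivity::MostlyInclusive ||
           !requesterIsUpstreamCache;
}
```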
11196:53d4f7e452d6 |
06-Nov-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Avoid unnecessary snoops on writebacks and clean evictions
This patch optimises the handling of writebacks and clean evictions when using a snoop filter. Instead of snooping into the caches to determine if the block is cached or not, simply set the status based on the snoop-filter result. |
11195:6f8b2a005abb |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Order packet queue only on matching addresses
Instead of conservatively enforcing order for all packets, which may negatively impact the simulated-system performance, this patch updates the packet queue such that it only applies the restriction if there are already packets with the same address in the queue.
The basic need for the order enforcement is due to coherency interactions where requests/responses to the same cache line must not over-take each other. We rely on the fact that any packet that needs order enforcement will have a block-aligned address. Thus, there is no need for the queue to know about the cacheline size. |
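Illustration (not part of the changeset): a minimal C++ sketch of ordering only on matching addresses. QueuedPkt and schedInsert are hypothetical names; the real packet queue carries more state.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>

struct QueuedPkt
{
    uint64_t when;  // tick at which the packet may be sent
    uint64_t addr;  // assumed block-aligned where ordering matters
};

// Insert `p` in send-time order, except that `p` is never placed ahead of
// an already queued packet with the same address.
inline void
schedInsert(std::list<QueuedPkt> &q, const QueuedPkt &p)
{
    // Walk backwards from the tail to find the insertion point.
    auto it = q.end();
    while (it != q.begin()) {
        auto prev = std::prev(it);
        if (prev->when <= p.when || prev->addr == p.addr)
            break;
        it = prev;
    }
    q.insert(it, p);
}
```

A packet to a different address may overtake later-scheduled packets, while one to the same address always queues behind them.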
11194:c3ba89c653a9 |
06-Nov-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Enforce insertion order on the cache response path
This patch enforces insertion order transmission of packets on the response path in the cache. Note that the logic to enforce order is already present in the packet queue, this patch simply turns it on for queues in the response path.
Without this patch, there are corner cases where a request-response is faster than a response-response forwarded through the cache. This violation of queuing order causes problems in the snoop filter leaving it with inaccurate information. This causes assert failures in the snoop filter later on.
A follow on patch relaxes the order enforcement in the packet queue to limit the performance impact. |
11193:564e2e7e86f4 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use the packet delays and do not just zero them out
This patch updates the I/O devices, bridge and simple memory to take the packet header and payload delay into account in their latency calculations. In all cases we add the header delay, i.e. the accumulated pipeline delay of any crossbars, and the payload delay needed for deserialisation of any payload.
Due to the additional unknown latency contribution, the packet queue of the simple memory is changed to use insertion sorting based on the time stamp. Moreover, since the memory hands out exclusive (non shared) responses, we also need to ensure ordering for reads to the same address. |
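Illustration (not part of the changeset): a minimal C++ sketch of the latency calculation described above. PktDelays and responseTime are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint64_t;

// Delays accumulated by a packet on its way to the device.
struct PktDelays
{
    Tick headerDelay;   // pipeline delay of any crossbars traversed
    Tick payloadDelay;  // time needed to deserialise the payload
};

// Tick at which the device can respond: its own latency plus both delays.
inline Tick
responseTime(Tick now, Tick deviceLatency, const PktDelays &pkt)
{
    return now + deviceLatency + pkt.headerDelay + pkt.payloadDelay;
}
```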
11192:4c28abcf8249 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align rules for sinking inhibited packets at the slave
This patch aligns how the memory-system slaves, i.e. the various memory controllers and the bridge, identify and deal with sinking of inhibited packets that are only useful within the coherent part of the memory system.
In the future we could shift the onus to the crossbar, and add a parameter "is_point_of_coherence" that would allow it to sink the aforementioned packets. |
11191:9eabb2bf349b |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not treat CleanEvict as a write operation
This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it. |
11190:0964165d1857 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Unify delayed packet deletion
This patch unifies how we deal with delayed packet deletion, where the receiving slave is responsible for deleting the packet, but the sending agent (e.g. a cache) is still relying on the pointer until the call to sendTimingReq completes. Previously we used a mix of a deletion vector and a construct using unique_ptr. With this patch we ensure all slaves use the latter approach. |
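Illustration (not part of the changeset): a minimal C++ sketch of delayed deletion with unique_ptr. Pkt, Receiver and the pendingDelete member are hypothetical names used for the sketch.

```cpp
#include <cassert>
#include <memory>

struct Pkt
{
    static int live;  // number of Pkt objects currently alive
    Pkt() { ++live; }
    ~Pkt() { --live; }
};
int Pkt::live = 0;

// Hypothetical receiver that takes ownership of packets it accepts.
class Receiver
{
    // Holds the most recently accepted packet until the next one arrives.
    std::unique_ptr<Pkt> pendingDelete;

  public:
    bool recvTimingReq(Pkt *pkt)
    {
        // Frees the previous packet (if any) and parks this one, so the
        // sender's pointer stays valid until its send call has returned.
        pendingDelete.reset(pkt);
        return true;
    }
};
```

At most one accepted packet is kept alive per receiver, and it is released either on the next accept or when the receiver is destroyed.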
11189:4237221d3e31 |
06-Nov-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Appease clang static analyzer
A few minor fixes to issues identified by the clang static analyzer. |
11188:091531fa23ad |
06-Nov-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Check the XBar's port queues on functional snoops
The CoherentXBar currently doesn't check its queued slave ports when receiving a functional snoop. This caused data corruption in cases when a modified cache line is forwarded between two caches.
Add the required functional calls into the queued slave ports. |
11186:2d1d51615e0e |
03-Nov-2015 |
Erfan Azarkhish <erfan.azarkhish@unibo.it> |
mem: hmc: minor fixes
This patch performs two minor fixes to DRAMCtrl.py and xbar.hh in favor of the HMC patch series.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
11185:0ff78be3bc67 |
03-Nov-2015 |
Erfan Azarkhish <erfan.azarkhish@unibo.it> |
mem: hmc: serial link model
This changeset adds a serial link model for the Hybrid Memory Cube (HMC). SerialLink is a simple variation of the Bridge class, with the ability to account for the latency of packet serialization. Also trySendTiming has been modified to correctly model bandwidth.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
11184:07b0dacf27d6 |
03-Nov-2015 |
Erfan Azarkhish <erfan.azarkhish@unibo.it> |
mem: hmc: adds controller
This patch models a simple HMC Controller. It simply schedules the incoming packets to HMC Serial Links using a round robin mechanism. This patch should be applied in series with other patches modeling a complete HMC device.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
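Illustration (not part of the changeset): a minimal C++ sketch of round-robin selection across serial links. RoundRobin is a hypothetical name, and link availability is ignored here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical round-robin selector over `numLinks` serial links.
class RoundRobin
{
    std::size_t numLinks;
    std::size_t next = 0;

  public:
    explicit RoundRobin(std::size_t n) : numLinks(n) { assert(n > 0); }

    // Return the link for the next packet and advance the pointer.
    std::size_t pick()
    {
        std::size_t link = next;
        next = (next + 1) % numLinks;
        return link;
    }
};
```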
11177:524c44cf8278 |
29-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify cache MSHR handling on fill
This patch addresses the upgrading of deferred targets in the MSHR, and makes it clearer by explicitly calling out what is happening (deferred targets are promoted if we get exclusivity without asking for it). |
11175:2324ed5fa9f4 |
23-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
x86: Add missing explicit overrides for X86 devices
Make clang >= 3.5 happy when compiling build/X86/gem5.opt on OSX. |
11173:3a4d1b5cd05c |
14-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Pass snoop retries through the CommMonitor
Allow the monitor to be placed after a snooping port, and do not fail on snoop retries, but instead pass them on to the slave port. |
11172:9261e98e4501 |
14-Oct-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: profiler: provide the number of vnets through ruby system
The aim is to ultimately do away with the static function Network::getNumberOfVirtualNetworks(). |
11171:60d4dfa3241a |
14-Oct-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused functionalRead() function.
Not required since functional reads cannot rely on messages that are inflight. |
11170:1151cfea92e3 |
14-Oct-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: flexible: refactor flit |
11169:44b5c183c3cd |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication.
As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). |
11168:f98eb2da15a4 |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. |
11151:ca4ea9b5c052 |
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu,isa,mem: Add per-thread wakeup logic
Changes wakeup functionality so that only specific threads on SMT capable cpus are woken. |
11145:939f3919b108 |
29-Sep-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Fix CacheMemory allocate leak
If a cache entry permission was previously set to NotPresent, but the entry was not deleted, a following cache allocation can cause the entry to be leaked by setting the entry pointer to a newly allocated entry. To eliminate this possibility, check if the new entry is different from the old one, and if so, delete the old one. |
11143:d2114f5629ff |
29-Sep-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: RubyPort delete snoop requests
In RubyPort::ruby_eviction_callback, prior changes fixed a memory leak caused by instantiating separate packets for each port that the eviction was forwarded to. That change, however, left the instantiated request to also leak. Allocate it on the stack to avoid the leak. |
11142:c5ac64b4b020 |
29-Sep-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Fix memory leak in AbstractController
Recent changes to memory access queuing allocate requests for packets sent to memory controllers, but did not free the requests. Delete them to avoid leaks. |
11141:526e6ad9bceb |
29-Sep-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: RubyMemoryControl delete requests
Changes to the RubyMemoryControl removed the dequeue function, which deleted MemoryNode instances. This results in leaked MemoryNode instances. Correctly delete these instances. |
11139:bd894d2bdd7c |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add PacketInfo to be used for packet probe points
This patch fixes a use-after-delete issue in the packet probe points by adding a PacketInfo struct to retain the key fields before passing the packet onwards. We want to probe the packet after it is successfully sent, but by that time the fields may be modified, and the packet may even be deleted.
Amazingly enough the issue has gone undetected for months, and only recently popped up in our regressions. |
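The idea behind PacketInfo can be sketched in a few lines: copy the fields a probe listener needs before the packet is sent, and notify the listener with the copy afterwards. The Python below is only a model; the field set (cmd, addr, size) and the helper names are assumptions for illustration, not the gem5 definitions.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    cmd: str
    addr: int
    size: int


@dataclass(frozen=True)
class PacketInfo:
    """Immutable copy of the fields a probe listener needs."""
    cmd: str
    addr: int
    size: int


def send_and_probe(pkt, send, notify):
    # Snapshot first: after send() the packet may be modified or deleted.
    info = PacketInfo(pkt.cmd, pkt.addr, pkt.size)
    accepted = send(pkt)
    if accepted:
        notify(info)  # the probe only ever sees the snapshot
    return accepted
```

Even if the receiver turns the request into a response in place, the listener still observes the original command, address and size.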
11137:0229c7b15ca1 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add check for block status on WriteLineReq fill
More checks to help with understanding of functionality. |
11136:3fd483cdd458 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix WriteLineReq fill behaviour
This patch fixes issues in the interactions between deferred snoops and WriteLineReq. More specifically, the patch addresses an issue where deferred snoops caused assertion failures when being serviced on the arrival of an InvalidateResp. The response packet was perceived to be invalidating, when actually it is not for the cache that sent out the original invalidation request. |
11135:9d09dab39689 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Comment clean-up for the snoop filter
Merely fixing up some style issues and adding more comments. |
11134:dfa51840de1f |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid adding and then removing empty snoop-filter items
This patch tidies up how we access the snoop filter for snoops, and avoids adding items only to later remove them. |
11133:81e46b63daff |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Only track snooping ports in the snoop filter
This patch changes the tracking of ports in the snoop filter to use local dense port IDs so that we can have 64 snooping ports (rather than crossbar slave ports). This is achieved by adding a simple remapping vector that translates the actual port IDs into the local slave IDs used in the SnoopMask.
Ultimately this patch allows us to scale to much larger systems without introducing a hierarchy of crossbars. |
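A minimal sketch of the remapping idea, assuming a 64-bit mask and a per-port flag saying whether the port snoops. All names here are hypothetical and the code is a Python model, not the SnoopFilter implementation.

```python
MAX_SNOOPING_PORTS = 64  # width of the mask in this sketch


def build_local_ids(port_is_snooping):
    """Map each crossbar port ID to a dense local ID, or None if not snooping."""
    local_ids = []
    next_id = 0
    for snooping in port_is_snooping:
        if not snooping:
            local_ids.append(None)
            continue
        if next_id >= MAX_SNOOPING_PORTS:
            raise ValueError("too many snooping ports for the mask")
        local_ids.append(next_id)
        next_id += 1
    return local_ids


def port_to_mask(local_ids, port_id):
    """Return the one-hot mask bit for a snooping port."""
    local = local_ids[port_id]
    if local is None:
        raise ValueError("port %d is not snooping" % port_id)
    return 1 << local
```

Only snooping ports consume a mask bit, so the crossbar can have more than 64 slave ports as long as at most 64 of them snoop.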
11132:fbd597034299 |
25-Sep-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add snoop filters to L2 crossbars, and check size
This patch adds a snoop filter to the L2XBar. For now we refrain from globally adding a snoop filter to the SystemXBar, since the latter is also used in systems without caches. In scenarios without caches the snoop filter will not see any writeback/clean evicts from the CPU ports, despite the fact that they are snooping. To avoid inadvertent use of the snoop filter in these cases we leave it out for now.
A size check is added to the snoop filter, merely to ensure it does not grow beyond the total capacity of the caches above it. The size has to be set manually, and a value of 8 MByte is chosen as a suitably high default. |
11131:22e739752f47 |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Store snoop filter lookup result to avoid second lookup
This patch introduces a private member storing the iterator from the lookupRequest call, such that it can be re-used when the request eventually finishes. The method previously called updateRequest is renamed finishRequest to make it more clear that the two functions must be called together. |
11130:45a23e44e93d |
25-Sep-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add snoops for CleanEvicts and Writebacks in atomic mode
This patch mirrors the logic in timing mode which sends up snoops to check for cached copies before sending CleanEvicts and Writebacks down the memory hierarchy. In case there is a copy in a cache above, discard CleanEvicts and set the BLOCK_CACHED flag in Writebacks so that writebacks do not reset the cache residency bit in the snoop filter below. |
11129:48c02e8b0bbb |
25-Sep-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add CleanEvict and Writeback support to snoop filters
This patch adds the functionality to properly track CleanEvicts and Writebacks in the snoop filter. Previously there were no CleanEvicts, and Writebacks did not send up snoops to ensure there were no copies in caches above. Hence a writeback could never erase an entry from the snoop filter.
When a CleanEvict message reaches a snoop filter, it confirms that the BLOCK_CACHED flag is not set and resets the bits corresponding to the CleanEvict address and port it arrived on. If none of the other peer caches have (or have requested) the block, the snoop filter forwards the CleanEvict to lower levels of memory. In case of a Writeback message, the snoop filter checks if the BLOCK_CACHED flag is not set and only then resets the bits corresponding to the Writeback address. If any of the other peer caches have (or have requested) the same block, the snoop filter sets the BLOCK_CACHED flag in the Writeback before forwarding it to lower levels of the memory hierarchy. |
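The CleanEvict and Writeback rules above can be modelled roughly as below. This is a simplified Python sketch: the entry layout (two integer bitmasks named 'holder' and 'requested'), the function name and the return convention are assumptions made for illustration.

```python
def process_eviction(entry, port_mask, kind, block_cached=False):
    """Update one snoop-filter entry for a CleanEvict or Writeback.

    entry: dict with integer bitmasks 'holder' and 'requested'.
    Returns (forward, block_cached) for the packet sent downstream.
    """
    if kind not in ("CleanEvict", "Writeback"):
        raise ValueError(kind)
    if kind == "CleanEvict" and block_cached:
        raise ValueError("CleanEvict must not carry BLOCK_CACHED")
    if not block_cached:
        # The sending port no longer holds the line.
        entry["holder"] &= ~port_mask
    others = (entry["holder"] | entry["requested"]) & ~port_mask
    if kind == "CleanEvict":
        # Forward only if no peer has, or has requested, the block.
        return others == 0, False
    # Writebacks are always forwarded; flag them if a peer still has the block.
    return True, block_cached or others != 0
```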
11128:b6532152a64a |
25-Sep-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add check for snooping ports in the snoop filter
This patch prevents the snoop filter from creating items for requests originating from non-snooping ports. The allocation decision is thus based both on the cacheability of the line, and the snooping status of the source port. Ultimately we should check if the source of the packet is caching, since also the CPU ports are snooping (but not allocating). Thus, at the moment we rely on the snoop filter being used together with caches.
The patch also transitions to use the Packet::getBlockAddr in determining the line address. |
11127:f39c2cc0d44e |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the coherent crossbar account for timing snoops
This patch introduces the concept of a snoop latency. Given the requirement to snoop and forward packets in zero time (due to the coherency mechanism), the latency is accounted for later.
On a snoop, we establish the latency, and later add it to the header delay of the packet. To allow multiple caches to contribute to the snoop latency, we use a separate variable in the packet, and then take the maximum before adding it to the header delay. |
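In other words, the snoop contributions are combined with a maximum rather than a sum. A one-function sketch, with hypothetical names and latencies expressed as plain integers:

```python
def apply_snoop_latency(header_delay, snoop_latencies):
    """Add the largest per-cache snoop latency to the packet's header delay."""
    # Caches are snooped in parallel, so only the slowest one matters.
    snoop_delay = max(snoop_latencies, default=0)
    return header_delay + snoop_delay
```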
11126:823a6aa11fbd |
25-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Do not include snoop-filter latency in crossbar occupancy
This patch ensures that the snoop-filter latency only contributes to the packet latency, and not to the crossbar throughput/occupancy. In essence we treat the snoop-filter lookup as pipelined. |
11124:5d38dc2f7d66 |
24-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: simple network: refactor code
Drops an unused variable and marks three variables as const. |
11123:a8980f67b3fc |
23-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: refactor code in network links |
11122:721d3e248f75 |
23-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: bloom filters: refactor code |
11121:370488a55495 |
23-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: abstract controller: mark some variables as const |
11120:eef83ecab5bf |
22-Sep-2015 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add initial HBM configurations
Created the following HBM configurations:
1) HBM gen1 (x128/CH), 2Gb die, 4H stack, 1Gbps, 8 channels
2) HBM gen2 (x64/PC), 8Gb die, 4H stack, 1Gbps, 16 pseudo-channels
The configuration values are based on:
- The HBM gen1 public JEDEC spec
- Publicly released data from MemCon presentations
- Timing extrapolated from existing LPDDR configurations
Will adjust once specs become available. |
11119:3be6083fd774 |
18-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: mark some variables as const |
11118:75c1e564a725 |
18-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: print addresses in hex
Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to print statements. So the addresses which were being printed in hex earlier along with their line address, were now being printed in decimal. This patch adds a function printAddress(Addr) that can be used to print the address in hex along with the line address. This function has been put to use in some places. At other places, a change has been made to print just the address in hex. |
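A small sketch of what printing an address "in hex along with the line address" could look like. The output format and the 64-byte block size (6 offset bits) are assumptions for illustration, and the Python function is only a stand-in for the C++ printAddress(Addr).

```python
def print_address(addr, block_size_bits=6):
    """Format an address in hex together with its cache-line address."""
    # Clear the block-offset bits to obtain the line address.
    line_addr = addr & ~((1 << block_size_bits) - 1)
    return "[{:#x}, line {:#x}]".format(addr, line_addr)
```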
11117:2a1a21f79047 |
18-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: derive DataMember class from Var instead of PairContainer
The DataMember class in Type.py was being derived from PairContainer. A separate Var object was also created for the DataMember. This meant some duplication across the members of these two classes (Var and DataMember). This patch derives DataMember from Var instead. There is no obvious reason to derive from PairContainer which can only hold pairs, something that Var class already supports. The only thing that DataMember has over Var is init_code, which is being retained. This change would later on help in having pointers in DataMembers. |
11116:d6fb95dbf3e2 |
17-Sep-2015 |
Tony Gutierrez <anthony.gutierrez@amd.com> |
ruby: update WireBuffer API to match that of MessageBuffer
This patch updates the WireBuffer API to mirror the changes in revision 11111. |
11114:2910a31917b7 |
16-Sep-2015 |
Lena Olson <lena@cs.wisc.edu> |
ruby: Add missing block deallocations in MOESI_hammer
Some blocks in MOESI hammer were not getting deallocated when they were set to an idle state (e.g. by invalidate or other_getx/s messages). While functionally correct, this caused some bad effects on performance, such as blocks in I in the L1s getting sent to the L2 upon eviction, in turn evicting valid blocks. Also, if a valid block was in LRU, that block could be evicted rather than a block in I. This patch adds in the missing deallocations.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
11113:5a2e1b1b5c43 |
16-Sep-2015 |
Joe Gross <joe.gross@amd.com> |
ruby: fix message buffer init order
The recent changes to make MessageBuffers SimObjects required them to be initialized in a particular order, which could break some protocols. Fix this by calling initNetQueues on the external nodes of each external link in the constructor of Network.
This patch also refactors the duplicated code for checking network allocation and setting net queues (which are called by initNetQueues) from the simple and garnet networks to be in Network. |
11111:6da33e720481 |
16-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer, timer table: significant changes
This patch changes MessageBuffer and TimerTable, two structures used for buffering messages by components in ruby. These structures would no longer maintain pointers to clock objects. Functions in these structures have been changed to take as input current time in Tick. Similarly, these structures will not operate on Cycle valued latencies for different operations. The corresponding functions would need to be provided with these latencies by components invoking the relevant functions. These latencies should also be in Ticks.
I felt the need for these changes while trying to speed up ruby. The ultimate aim is to eliminate Consumer class and replace it with an EventManager object in the MessageBuffer and TimerTable classes. This object would be used for scheduling events. The event itself would contain information on the object and function to be invoked.
In hindsight, it seems I should have done this while I was moving away from use of a single global clock in the memory system. That change led to introduction of clock objects that replaced the global clock object. It never crossed my mind that having clock object pointers is not a good design. And now I really don't like the fact that we have separate consumer, receiver and sender pointers in message buffers. |
11110:8647458d421d |
16-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused function removeRequest() |
11109:bf3d0f56a6ba |
16-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: sequencer: remove commented out function printProgress() |
11108:6342ddf6d733 |
16-Sep-2015 |
David Hashe <david.hashe@amd.com> |
ruby: rename System.{hh,cc} to RubySystem.{hh,cc}
The eventual aim of this change is to pass RubySystem pointers through to objects generated from the SLICC protocol code.
Because some of these objects need to dereference their RubySystem pointers, they need access to the System.hh header file.
In src/mem/ruby/SConscript, the MakeInclude function creates single-line header files in the build directory that do nothing except include the corresponding header file from the source tree.
However, SLICC also generates a list of header files from its symbol table, and writes it to mem/protocol/Types.hh in the build directory. This code assumes that the header file name is the same as the class name.
The end result of this is that many of the generated SLICC files try to include RubySystem.hh, when the file they really need is System.hh. The path of least resistance is just to rename System.hh to RubySystem.hh. |
11107:43857904aff3 |
16-Sep-2015 |
Anthony Gutierrez <atgutier@umich.edu> |
slicc: export uint64_t instead of uint64 |
11096:efaacec43726 |
14-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: topology: refactor code. |
11095:12c36d719139 |
14-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove member buffer_expr from Var class
This was added by changeset 51f40b101a56. Instead, buffer_expr would now be associated with the InPort class. |
11093:8049ffff6d68 |
12-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: perfect switch: refactor code
Refactored the code in operateVnet(), moved partly to a new function operateMessageBuffer(). This is required since a later patch moves to having a wakeup event per MessageBuffer instead of one event for the entire Switch. |
11092:a51ef09e3a78 |
12-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: simple network: store Switch* in PerfectSwitch and Throttle
There are two reasons for doing so:
a. provide a source of clock to PerfectSwitch. A follow-on patch removes the sender and receiver pointers from MessageBuffer, which means that the object owning the buffer should have some way of providing timing info.
b. schedule events. A follow-on patch removes the consumer class. So the PerfectSwitch needs some EventManager object to schedule events on its own. |
11089:4808f8c4a47e |
08-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove nextLineHack from Type.py |
11087:3c4bda5a2f66 |
05-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: call setMRU from L1 controllers, not from sequencer
Currently the sequencer calls the function setMRU that updates the replacement policy structures with the first level caches. While functionally this is correct, the problem is that this requires calling findTagInSet() which is an expensive function. This patch removes the calls to setMRU from the sequencer. All controllers should now update the replacement policy on their own.
The set and the way index for a given cache entry can be found within the AbstractCacheEntry structure. Use these indices to update the replacement policy structures. |
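The benefit of keeping the set and way index in the entry is that the replacement policy can be updated without searching the set for the tag. A rough Python model, with hypothetical class and field names and a simple timestamp-based LRU standing in for the real policy:

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    addr: int
    set_idx: int  # which set the entry lives in
    way: int      # which way within that set


class LRUPolicy:
    def __init__(self, num_sets, assoc):
        self.last_ref = [[0] * assoc for _ in range(num_sets)]

    def touch(self, set_idx, way, time):
        self.last_ref[set_idx][way] = time

    def get_victim(self, set_idx):
        ways = self.last_ref[set_idx]
        return ways.index(min(ways))  # least recently referenced way


def set_mru(policy, entry, time):
    # No tag search needed: the entry already knows its own location.
    policy.touch(entry.set_idx, entry.way, time)
```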
11086:672cda252689 |
05-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: adds set and way indices to AbstractCacheEntry |
11085:f1fe63d949c0 |
05-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: set: reimplement using std::bitset
The current Set data structure is slow and therefore is being reimplemented using std::bitset. A maximum limit of 64 is being set on the number of controllers of each type. This means that for simulating a system with more controllers of a given type, one would need to change the value of the variable NUMBER_BITS_PER_SET. |
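The fixed-width bitset approach can be modelled with a single integer used as a bit mask. The Python class below is only an analogue of the C++ std::bitset-based Set; the class and method names are hypothetical, and only NUMBER_BITS_PER_SET comes from the commit message.

```python
NUMBER_BITS_PER_SET = 64  # upper bound on controllers of one type


class NodeSet:
    """Fixed-capacity set of controller indices stored as a bit mask."""

    def __init__(self):
        self.bits = 0

    def add(self, index):
        if not 0 <= index < NUMBER_BITS_PER_SET:
            raise ValueError("index exceeds NUMBER_BITS_PER_SET")
        self.bits |= 1 << index

    def is_element(self, index):
        return bool((self.bits >> index) & 1)

    def count(self):
        return bin(self.bits).count("1")
```

Membership tests and insertions become single bit operations, at the cost of a hard upper limit on the number of elements.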
11084:ee2fcca7b58a |
05-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: declare all protocol message buffers as parameters
MessageBuffer is a SimObject now. There were protocols that still declared some of the message buffers as variables of the controller, but not as input parameters. Special handling was required for these variables in the SLICC compiler. This patch changes this. Now all message buffers are declared as input parameters. |
11083:61b329833f74 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid setting markPending if not needed
In cases where a newly added target does not have any upstream MSHR to mark as downstreamPending, remember that nothing is marked. This allows us to avoid attempting to find the MSHR as part of the clearing of downstreamPending. |
11082:8539728fd457 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up CacheSet
Minor tweaks and house keeping. |
11081:4d8b7783a692 |
04-Sep-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the snoop state-transition logic
Remove broken and unused option to pass dirty data on non-exclusive snoops. Also beef up the comments a bit. |
11074:2763a59c73ff |
01-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove random seed
We no longer use the C library based random number generator: random(). Instead we use the C++ library provided rng. So setting the random seed for the RubySystem class has no effect. Hence the variable and the corresponding option are being dropped. |
11073:a8afeb8bc3f0 |
01-Sep-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: directory memory: drop unused variable. |
11065:37e19af67f62 |
30-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: specify number of vnets for each protocol
The default value for number of virtual networks is being removed. Each protocol should now specify the value it needs. |
11064:386a5200e298 |
30-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: drop member m_in_use
This member indicates whether or not a particular virtual network is in use. Instead of having a default big value for the number of virtual networks and then checking whether a virtual network is in use, the next patch removes the default value and the protocol configuration file would now specify the number of virtual networks it requires.
Additionally, the patch also refactors some of the code used for computing the virtual channel next in the round robin order. |
11063:b254723105b5 |
30-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: mark few functions const in BaseGarnetNetwork.hh |
11062:262d8494b253 |
30-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: avoid duplicate code for function argument check
Both FuncCallExprAST and MethodCallExprAST had code for checking the arguments with which a function is being called. The patch does away with this duplication. Now the code for checking function call arguments resides in the Func class. |
11061:25b53a7195f7 |
29-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t. |
11060:a1c1c3aa359b |
28-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
ruby: Use the const serialize interface in RubySystem
The new serialization code (kudos to Tim Jones) moves all of the state mangling in RubySystem to memWriteback. This makes it possible to use the new const serialization interface.
This changeset moves the cache recorder cleanup from the checkpoint() method to drainResume() to make checkpointing truly constant and updates the checkpointing code to use the new interface. |
11059:40e622551656 |
27-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: handle llsc accesses through CacheEntry, not CacheMemory
The sequencer takes care of llsc accesses by calling upon functions from the CacheMemory. This is unnecessary once the required CacheEntry object is available. Thus some of the calls to findTagInSet() are avoided. |
11057:ccdaf2f353ba |
24-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Revert requirement on packet addr/size always valid
This patch reverts part of (842f56345a42), as apparently there are use-cases outside the main repository relying on the late setting of the physical address. |
11056:842f56345a42 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Reflect that packet address and size are always valid
This patch simplifies the packet, and removes the possibility of creating a packet without a valid address and/or size. Under no circumstances are these fields set at a later point, and thus they really have to be provided at construction time.
The patch also fixes a case where the MinorCPU creates a packet without a valid address and size, only to later delete it. |
11055:54071fd5c397 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to MISCREG_LOCKFLAG and so the request flag is no longer needed. The corresponding functionality in the cache tags is also removed. |
11054:00bddca96da6 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused cache squash functionality
Tidying up. |
11053:62544e45c0f4 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. |
11052:3137d34acf29 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Move Ruby's cache class from Cache.py to RubyCache.py
This patch serves to avoid name clashes with the classic cache. For some reason having two 'SimObject' files with the same name creates problems. |
11051:81b1f46061c8 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move cache_impl.hh to cache.cc
There is no longer any need to keep the implementation in a header. |
11050:65fc1db5d795 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
cpu: Move invldPid constant from Request to BaseCPU
A more natural home for this constant. |
11049:dfb0aa3f0649 |
19-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: reverts to changeset: bf82f1f7b040 |
11048:110cce93d398 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add accessor functions to SLICC def of MachineID |
11047:dcf729f0bbfa |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: simple network: refactor code
Drops an unused variable and marks three variables as const. |
11046:0cd13910b063 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: profiler: provide the number of vnets through ruby system
The aim is to ultimately do away with the static function Network::getNumberOfVirtualNetworks(). |
11045:0bffd44521f5 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: directory memory: drop unused variable. |
11044:25b2a428e2eb |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove a stray line in StateMachine.py |
11043:d22f7d7dfd5c |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: flexible: refactor flit |
11042:d34a75cb3646 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: DataBlock: adds a comment |
11041:d3bae341e151 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove random seed
We no longer use the C library based random number generator: random(). Instead we use the C++ library provided rng. So setting the random seed for the RubySystem class has no effect. Hence the variable and the corresponding option are being dropped. |
11040:ec668f8466eb |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: SubBlock: refactor code |
11039:fe230bcf3f38 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: cache recorder: move check on block size to RubySystem. |
11038:6d709f3c4c09 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: abstract controller: mark some variables as const |
11037:91d6a2d95cf8 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: simple network: store Switch* in PerfectSwitch and Throttle |
11036:3de670f298b1 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused functionalRead() function. |
11035:690ecdba9324 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: perfect switch: refactor code
Refactored the code in operateVnet(), moved partly to a new function operateMessageBuffer(). |
11034:a89984ca7d15 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: cache memory: drop {try,test}CacheAccess functions |
11033:9a0022457323 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: call setMRU from L1 controllers, not from sequencer
Currently the sequencer calls the function setMRU that updates the replacement policy structures with the first level caches. While functionally this is correct, the problem is that this requires calling findTagInSet() which is an expensive function. This patch removes the calls to setMRU from the sequencer. All controllers should now update the replacement policy on their own.
The set and the way index for a given cache entry can be found within the AbstractCacheEntry structure. Use these indices to update the replacement policy structures. |
11032:dec9cb2c5cde |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: adds set and way indices to AbstractCacheEntry |
11031:3815437cb231 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: eliminate type uint64 and int64
These types are being replaced with uint64_t and int64_t. |
11030:17240f381d6a |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: use default argument value
Before this patch, one could declare or define a function with default argument values, but the actual function call would still require one to specify all the arguments. This patch changes the check for function arguments. Now a function call needs to specify at least as many arguments as there are parameters without default values, and at most the total number of arguments taken as input by the function. |
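A sketch of such an argument-count check, reading the rule as "at least the number of parameters without a default, at most the total number of parameters". The function name and the way defaults are represented are hypothetical and not taken from the SLICC compiler.

```python
def check_arg_count(num_given, param_defaults):
    """Return True if a call with num_given arguments is acceptable.

    param_defaults has one entry per declared parameter;
    None means the parameter has no default value.
    """
    max_args = len(param_defaults)
    min_args = sum(1 for default in param_defaults if default is None)
    return min_args <= num_given <= max_args
```

With one required parameter and one defaulted parameter, calls with one or two arguments pass the check, while calls with zero or three arguments are rejected.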
11029:32604f9e190b |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: avoid duplicate code for function argument check
Both FuncCallExprAST and MethodCallExprAST had code for checking the arguments with which a function is being called. The patch does away with this duplication. Now the code for checking function call arguments resides in the Func class. |
11028:3a5190683bf2 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: drop the [] notation for lookup function.
This is in preparation for adding a second argument to the lookup function for the CacheMemory class. The change to *.sm files was made using the following sed command:
sed -i 's/\[\([0-9A-Za-z._()]*\)\]/.lookup(\1)/' src/mem/protocol/*.sm |
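For readers less familiar with sed's basic regular expressions, the same rewrite can be expressed with Python's re module. This is an equivalent sketch for illustration, not part of the original change; the example input names are made up. Because the sed command has no g flag, only the first match on each line is replaced, which count=1 mirrors here.

```python
import re

# [name] -> .lookup(name), where name may contain letters, digits, '.', '_', '(' and ')'
BRACKET_LOOKUP = re.compile(r"\[([0-9A-Za-z._()]*)\]")


def rewrite_line(line):
    """Replace the first [expr] on a line with .lookup(expr)."""
    return BRACKET_LOOKUP.sub(r".lookup(\1)", line, count=1)
```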
11027:bf82f1f7b040 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: handle llsc accesses through CacheEntry, not CacheMemory
The sequencer takes care of llsc accesses by calling upon functions from the CacheMemory. This is unnecessary once the required CacheEntry object is available. Thus some of the calls to findTagInSet() are avoided. |
11025:4872dbdea907 |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Address by Addr
This patch eliminates the type Address defined by the ruby memory system. This memory system would now use the type Addr that is in use by the rest of the system. |
11024:bc179fa0b91b |
14-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: rename variables Addr to addr
Avoid clash between type Addr and variable name Addr. |
11022:e6e3b7097810 |
14-Aug-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Protocol changes for SimObject MessageBuffers |
11021:e8a6637afa4c |
14-Aug-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Expose MessageBuffers as SimObjects
Expose MessageBuffers from SLICC controllers as SimObjects that can be manipulated in Python. This patch has numerous benefits:
1) First and foremost, it exposes MessageBuffers as SimObjects that can be manipulated in Python code. This allows parameters to be set and checked in Python code to avoid obfuscating parameters within protocol files. Further, now as SimObjects, MessageBuffer parameters are printed to config output files as a way to track parameters across simulations (e.g. buffer sizes).
2) Cleans up special-case code for responseFromMemory buffers, and aligns their instantiation and use with mandatoryQueue buffers. These two special buffers are the only MessageBuffers that are exposed to components outside of SLICC controllers, and they're both slave ends of these buffers. They should be exposed outside of SLICC in the same way, and this patch does it.
3) Distinguishes buffer-specific parameters from buffer-to-network parameters. Specifically, buffer size, randomization, ordering, recycle latency, and ports are all specific to a MessageBuffer, while the virtual network ID and type are intrinsics of how the buffer is connected to network ports. The former are specified in the Python object, while the latter are specified in the controller *.sm files. Unlike buffer-specific parameters, which may need to change depending on the simulated system structure, buffer-to-network parameters can be specified statically for most or all different simulated systems. |
11020:882ce080c9f7 |
14-Aug-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Change PerfectCacheMemory::lookup to return pointer
CacheMemory and DirectoryMemory lookup functions return pointers to entries stored in the memory. Bring PerfectCacheMemory in line with this convention, and clean up SLICC code generation that was in place solely to handle references like that which was returned by PerfectCacheMemory::lookup. |
11019:fc1e41e88fd3 |
14-Aug-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Remove the RubyCache/CacheMemory latency
The RubyCache (CacheMemory) latency parameter is only used for top-level caches instantiated for Ruby coherence protocols. However, the top-level cache hit latency is assessed by the Sequencer as accesses flow through to the cache hierarchy. Further, protocol state machines should be enforcing these cache hit latencies, but RubyCaches do not expose their latency to any existing state machines through the SLICC/C++ interface. Thus, the RubyCache latency parameter is superfluous for all caches. This is confusing for users.
As a step toward pushing L0/L1 cache hit latency into the top-level cache controllers, move their latencies out of the RubyCache declarations and over to their Sequencers. Eventually, these Sequencer parameters should be exposed as parameters to the top-level cache controllers, which should assess the latency. NOTE: Assessing these latencies in the cache controllers will require modifying each to eliminate instantaneous Ruby hit callbacks in transitions that finish accesses, which is likely a large undertaking. |
11016:bc759340631f |
11-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: allow mathematical operations on Ticks |
11013:7e31bd5968c0 |
07-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Cleanup packet accessor methods
The Packet::get() and Packet::set() methods both have very strange semantics. Currently, they automatically convert between the guest system's endianness and the host system's endianness. This behavior is usually undesired and unexpected.
This patch introduces three new method pairs to access data:
* getLE() / setLE() - Get or set data stored as little endian.
* getBE() / setBE() - Get or set data stored as big endian.
* get(ByteOrder) / set(v, ByteOrder) - Configurable endianness.
For example, a little-endian device that is receiving a write request will use the getLE() method to get the data from the packet.
The old interface will be deprecated once all existing devices have been ported to the new interface. |
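The difference between the explicit-endianness accessors can be illustrated with a small Python sketch. This is not gem5 code: `PacketSketch` and its snake_case methods are hypothetical stand-ins that only borrow the getLE()/getBE() naming from the message above, and the explicit `size` argument is an assumption made for the illustration.

```python
class PacketSketch:
    """Hypothetical stand-in for a packet data buffer (not gem5's Packet)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)

    def get_le(self, size: int) -> int:
        # Interpret the first `size` bytes as a little-endian integer.
        return int.from_bytes(self.data[:size], "little")

    def get_be(self, size: int) -> int:
        # Interpret the first `size` bytes as a big-endian integer.
        return int.from_bytes(self.data[:size], "big")

    def set_le(self, value: int, size: int) -> None:
        # Store `value` in the first `size` bytes, least significant first.
        self.data[:size] = value.to_bytes(size, "little")

    def set_be(self, value: int, size: int) -> None:
        # Store `value` in the first `size` bytes, most significant first.
        self.data[:size] = value.to_bytes(size, "big")

    def get(self, size: int, byte_order: str) -> int:
        # Configurable variant: byte_order is "little" or "big".
        return int.from_bytes(self.data[:size], byte_order)


pkt = PacketSketch(bytes([0x78, 0x56, 0x34, 0x12]))
le_value = pkt.get_le(4)  # same bytes read as little endian
be_value = pkt.get_be(4)  # same bytes read as big endian
```

The same four bytes yield two different integers, which is why a device model should state the byte order it expects rather than rely on an implicit guest-to-host conversion.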
11005:e7f403b6b76f |
07-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs. |
11003:ba91725c8f6b |
07-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove extraneous acquire/release flags and attributes
This patch removes the extraneous flags and attributes from the request and packet, and simply leaves the new commands. The change introduced when adding acquire/release breaks all compatibility with existing traces, and there is really no need for any new flags and attributes. The commands should be sufficient.
This patch fixes packet tracing (urgent), and also removes the unnecessary complexity. |
11001:80f018934c3a |
05-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Fixup incorrect include guards |
10996:d48fda705f4d |
04-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Move trace functionality from the CommMonitor to a probe
This changeset moves the access trace functionality from the CommMonitor into a separate probe. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet.
This patch moves the dependency on Google's Protocol Buffers library from the CommMonitor to the MemTraceProbe, which means that the CommMonitor (including stack distance profiling) no longer depends on it. |
10995:a114e2712642 |
04-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Redesign the stack distance calculator as a probe
This changeset removes the stack distance calculator hooks from the CommMonitor class and implements a stack distance calculator as a memory system probe instead. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet. |
10994:51ff41f6a4a5 |
04-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Add probe support to the CommMonitor
This changeset adds a standardized probe point type to monitor packets in the memory system and adds two probe points to the CommMonitor class. These probe points enable monitoring of successfully delivered requests and successfully delivered responses.
Memory system probe listeners should use the BaseMemProbe base class to provide a unified configuration interface and reuse listener registration code. Unlike the ProbeListenerObject class, the BaseMemProbe allows objects to be wired to multiple ProbeManager instances as long as they use the same probe point name. |
10991:72781d410e48 |
04-Aug-2015 |
Timothy Jones <timothy.jones@cl.cam.ac.uk> |
ruby: Fix checkpointing and restore
There are 2 problems with the existing checkpoint and restore code in ruby. The first is that when the event queue is altered by ruby during serialization, some events that are currently scheduled cannot be found (e.g. the event to stop simulation that always lives on the queue), causing a panic. The second is that ruby is sometimes serialized after the memory system, meaning that the dirty data in its cache is flushed back to memory too late and so isn't included in the checkpoint.
These are fixed by implementing memory writeback in ruby, using the same technique of hijacking the event queue, but first descheduling all events that are currently on it. They are saved, along with their scheduled time, so that the event queue can be faithfully reconstructed after writeback has finished. Events with the AutoDelete flag set will delete themselves when they are descheduled, causing an error when attempting to schedule them again. This is fixed by simply not recording them when taking them off the queue.
Writeback is still implemented using flushing, so the cache recorder object, that is created to generate the trace and manage flushing, is kept around and used during serialization to write the trace to disk.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
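The deschedule-and-restore technique described in this entry can be sketched in Python as follows. This is a simplified model under stated assumptions, not the gem5 event queue: `Event`, `EventQueueSketch`, and every method name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    auto_delete: bool = False  # models the AutoDelete flag from the message


class EventQueueSketch:
    """Hypothetical event queue holding (when, event) pairs sorted by time."""

    def __init__(self):
        self.scheduled = []

    def schedule(self, event, when):
        self.scheduled.append((when, event))
        self.scheduled.sort(key=lambda item: item[0])

    def deschedule_all(self):
        # Save each event with its scheduled time. Auto-deleting events are
        # not recorded, because they destroy themselves when descheduled and
        # must not be scheduled again.
        saved = [(when, ev) for when, ev in self.scheduled
                 if not ev.auto_delete]
        self.scheduled.clear()
        return saved

    def restore(self, saved):
        # Rebuild the queue from the saved (time, event) pairs.
        for when, ev in saved:
            self.schedule(ev, when)


queue = EventQueueSketch()
queue.schedule(Event("exit_simulation"), 1000)
queue.schedule(Event("one_shot", auto_delete=True), 500)

saved = queue.deschedule_all()
# The cache writeback would run here, on the now-empty queue.
queue.restore(saved)
restored_names = [ev.name for _, ev in queue.scheduled]
```

After the restore, the long-lived exit event is back at its original time, while the auto-deleting event is gone.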
10990:0a45bbe8536a |
03-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi three level: multiple corrections to the protocol
1. Eliminate state NP in L0 and L1 Caches: The two states 'NP' and 'I' both mean that the cache block is not present in the cache. 'I' also means that the cache entry has been allocated. This causes problems when we do not correctly initialize the cache entry when it is re-used. Hence, this patch eliminates the state NP altogether. Every time a new block comes into the cache, a cache entry is allocated. Every time a block leaves, the corresponding entry is deallocated.
2. Separate transient state for instruction fetches: purely for accounting purposes.
3. Drop state IS_I in L1 Cache and the message type STALE_DATA: when an invalidation was received for a block in IS, the block used to be moved to IS_I. This meant that the data arriving later would be used but not stored, since the controller lost the permissions after gaining them. This state is being dropped, and invalidation messages are now not processed until the data has arrived. This also means that the STALE_DATA type is no longer required. |
10989:75f7ae6304f3 |
03-Aug-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi two,three level: copy data only when dirty
The level 2 controller has a bug. In one particular action, the data block was copied from a message irrespective of whether the block was dirty or not. In cases where L1 sends no data, the data value copied was incorrect. |
10987:a618349a7953 |
01-Aug-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed invalid assert in message comparator
It is perfectly valid to compare the same message and the greater than operator should work correctly. |
10986:4fbe4b0adb4d |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: improved stall and wait debugging
Added dprintfs and asserts for identifying stall and wait bugs. |
10985:d87a25259254 |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: fix error in conflicting symbol declaration |
10984:a86f453a7caa |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: enable overloading in functions not in classes
For many years the slicc symbol table has supported overloaded functions in external classes. This patch extends that support to functions that are not part of classes (a.k.a. no parent). For example, this support allows slicc to understand that mapAddressToRange is overloaded and the NodeID is an optional parameter. |
10983:6036e4555eda |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: change router pipeline stages to 2
This patch changes the router pipeline stages from 4 to 2. The canonical 4-stage router is conservative while a lower-latency router with look ahead routing and speculative allocation is well acknowledged. |
10982:a47c4db94389 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: change advance_stage for flit_d
Sets m_stage.second to the second parameter of the function. Then, for every place where advance_stage is called, adds a cycle to the argument being passed. |
10981:b300dcda5896 |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: improved stalling support in protocols
Adds features to allow protocols to reschedule controllers when conditionally stalling within inport logic or actions. Also ensures that resource and protocol stalls are re-evaluated the next cycle. |
10980:7de6f95a0817 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: expose access permission to replacement policies
This patch adds support that allows the replacement policy to identify each cache block's access permission. This information can be useful when making replacement decisions. |
10979:3c11859e4a81 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: adds size and empty apis to the msg buffer stallmap |
10978:436d5dde4bb7 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: fix deadlock bug in banked array resource checks
The Ruby banked array resource checks (initiated from SLICC) did a check and allocate at the same time. If a transition needs more than one resource, then it might check/allocate resource #1, then fail to get resource #2. Another transition might then try to get the same resources, but in reverse order. Deadlock.
This patch separates resource checking and resource reservation into two steps to avoid deadlock. |
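The two-step scheme can be sketched in Python as shown here. This is a minimal model assuming a transition lists all of the resources it needs up front; `BankedArraySketch` and `try_acquire` are hypothetical names, not the Ruby BankedArray API.

```python
class BankedArraySketch:
    """Hypothetical banked resource: each bank is either free or busy."""

    def __init__(self, num_banks):
        self.busy = [False] * num_banks

    def check(self, bank):
        # Step 1: pure query, changes no state.
        return not self.busy[bank]

    def reserve(self, bank):
        # Step 2: only called once every check has passed.
        assert not self.busy[bank]
        self.busy[bank] = True


def try_acquire(resources):
    """Reserve every (array, bank) pair, or none of them.

    Assumes the pairs in `resources` are distinct.
    """
    if not all(array.check(bank) for array, bank in resources):
        return False  # nothing was reserved, so nothing is held while waiting
    for array, bank in resources:
        array.reserve(bank)
    return True


tag_array = BankedArraySketch(num_banks=2)
data_array = BankedArraySketch(num_banks=2)
data_array.reserve(0)  # another transition already holds data bank 0

first_try = try_acquire([(tag_array, 0), (data_array, 0)])
tag_held_after_failure = tag_array.busy[0]
second_try = try_acquire([(data_array, 1), (tag_array, 0)])
```

Because a failed attempt holds nothing, two transitions requesting the same resources in opposite orders can no longer each hold one resource while waiting for the other.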
10977:9b3b9be42dd9 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: Fix for stallAndWait bug
It was previously possible for a stalled message to be reordered after an incoming message. This patch ensures that any stalled message stays in its original request order. |
10975:eba4e93665fc |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
mem: add request types for acquire and release
Add support for acquire and release requests. These synchronization operations are commonly supported by several modern instruction sets. |
10974:bbdf1177f250 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: allocate a block in CacheMemory without updating LRU state |
10973:4820cc8408b0 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: speed up function used for cache walks
This patch adds a few helpful functions that allow .sm files to directly invalidate all cache blocks using a trigger queue rather than rely on each individual cache block to be invalidated via requests from the mandatory queue. |
10972:53d63eeee46f |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: support for arbitrary DPRINTF flags (not just RubySlicc)
This patch allows DPRINTFs to be used in SLICC state machines similar to how they are used by the rest of gem5. Previously all DPRINTFs in the .sm files had to use the RubySlicc flag. |
10971:3ed88c8334f1 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: support for local variable declarations in action blocks |
10970:ea8bdb1d9f1e |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: initialize replacement policies with their own simobjs
This is in preparation for other replacement policies that take additional parameters. |
10969:a588fceeb834 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
ruby: give access to cache tag/data latencies from SLICC
This patch exposes the tag and data array latencies to the SLICC state machines so that it can be used to determine the correct enqueue latency for response messages. |
10968:bde347fc89ae |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: support for multiple cache entry types in the same state machine
To have multiple Entry types (e.g., a cache Entry type and a directory Entry type), just declare one of them as a secondary type by using the pair 'main="false"', e.g.:
structure(DirEntry, desc="...", interface="AbstractCacheEntry", main="false") {
...and the primary type would be declared:
structure(Entry, desc="...", interface="AbstractCacheEntry") { |
10967:b36204de88c0 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: Fix bug in enqueue and peek statements.
These were not generating the correct c names for types declared within a machine scope. |
10966:198726a3c723 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: fix missing inline function in LocalVariableAST |
10965:6f433e7f9767 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: improve support for prefix operations
This patch fixes the type handling when prefix operations are used. Previously prefix operators would assume a void return type, which made it impossible to combine prefix operations with other expressions. This patch allows SLICC programmers to use prefix operations more naturally. |
10964:2b4fe083d17b |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: support for transitions with a wildcard next state
This patch adds support for transitions of the form:
transition(START, EVENTS, *) { ACTIONS }
This allows a machine to collapse transitions that differ only in their next state into one, which can shorten and simplify some protocols significantly.
When * is encountered as an end state of a transition, the next state is determined by calling the machine-specific getNextState function. The next state is determined before any actions of the transition execute, and therefore the next state calculation cannot depend on any of the transition actions. |
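The ordering constraint (next state resolved before any action runs) can be illustrated with a Python sketch. The state names, the lookup table, and `do_transition` are hypothetical; only the idea of a machine-specific getNextState function comes from the message above.

```python
def get_next_state(state, event):
    # Hypothetical machine-specific lookup, standing in for getNextState.
    table = {
        ("I", "Load"): "S",
        ("I", "Store"): "M",
        ("S", "Store"): "M",
    }
    return table[(state, event)]


def do_transition(state, event, next_state, actions):
    # A wildcard end state is resolved first, so the result cannot depend
    # on anything the actions do afterwards.
    if next_state == "*":
        next_state = get_next_state(state, event)
    for action in actions:
        action()
    return next_state


log = []
actions = [lambda: log.append("allocate"), lambda: log.append("issue")]
after_load = do_transition("I", "Load", "*", actions)
after_store = do_transition("I", "Store", "*", actions)
explicit = do_transition("S", "Load", "S", [])
```

A single wildcard transition here replaces two transitions that would otherwise differ only in their end state.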
10963:51f40b101a56 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: support for multiple message types on the same buffer
This patch allows SLICC protocols to use more than one message type with a message buffer. For example, you can declare two in ports as such:
in_port(ResponseQueue_in, ResponseMsg, responseFromDir, rank=3) { ... }
in_port(tgtResponseQueue_in, TgtResponseMsg, responseFromDir, rank=2) { ... } |
10962:7233a5f7ac8f |
01-Aug-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: fatal->panic on invalid transitions |
10961:cf35e8b92a5c |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
mem: Hit callback delay fix
This patch was created by Bihn Pham during his internship at AMD.
There is no need to delay hit callback response messages by a cycle because the response latency is already incurred in the Ruby protocol. This ensures correct timing of memory instructions. |
10956:19515f842044 |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: re-added the addressToInt slicc interface function
This helper function is very useful for converting address offsets to integers that can be used for protocol-specific destination mapping. |
10954:255ebb0b32b4 |
20-Jul-2015 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: add useful dprints to sequencer
Added two data block dprints that are useful when tracking down data check failures in the ruby random tester. |
10953:1b21c87b7c18 |
20-Jul-2015 |
David Hashe <david.hashe@amd.com> |
slicc: isinstance bugfix
This fix prevents spurious errors when searching for a symbol that may be located in one of multiple symbol tables. |
10943:329eef4c58f0 |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add missing clean eviction on uncacheable access
This patch adds a missing clean eviction, occurring when an uncacheable access flushes and invalidates an existing block. |
10942:224c85495f96 |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused RequestCause in cache
This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used. |
10941:a39646f4c407 |
30-Jul-2015 |
David Guillen-Fandos <david.guillen@arm.com> |
mem: Make caches way aware
This patch makes cache sets aware of the way number. This enables some nice features, such as the ability to restrict way allocation. The implemented mechanism allows setting a maximum way number 'k' for allocation, which must fulfill 0 < k <= N (where N is the number of ways). In the future, more sophisticated mechanisms can be implemented. |
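One way to picture the restriction is the Python sketch below. It assumes that 'k' limits allocation to ways 0 through k-1 and that victims are chosen by LRU among those ways; both are assumptions made for the illustration, and `find_victim_way` is a hypothetical helper rather than the gem5 tag-store code.

```python
def find_victim_way(ways_valid, last_used, max_alloc_ways):
    """Pick a way to allocate into, considering only the first k ways.

    ways_valid[i] says whether way i holds a valid block; last_used[i] is
    its last access time. Both lists have one entry per way.
    """
    num_ways = len(ways_valid)
    if not 0 < max_alloc_ways <= num_ways:
        raise ValueError("max_alloc_ways must satisfy 0 < k <= N")

    candidates = range(max_alloc_ways)
    # Prefer an invalid way if one exists among the allowed ways.
    for way in candidates:
        if not ways_valid[way]:
            return way
    # Otherwise evict the least recently used block among the allowed ways.
    return min(candidates, key=lambda way: last_used[way])


all_valid = [True, True, True, True]
last_used = [5, 3, 9, 1]
restricted_victim = find_victim_way(all_valid, last_used, max_alloc_ways=2)
unrestricted_victim = find_victim_way(all_valid, last_used, max_alloc_ways=4)
```

With k = 2, the globally oldest block in way 3 is never considered, so the restricted choice falls on the older of ways 0 and 1.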
10940:49d9b53b21dc |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Transition away from isSupplyExclusive for writebacks
This patch changes how writebacks communicate whether the line is passed as modified or owned. Previously we relied on the isSupplyExclusive mechanism, which was originally designed to avoid unnecessary snoops.
For normal cache requests we use the sharedAsserted mechanism to determine if a block should be marked writeable or not, and with this patch we transition the writebacks to also use this mechanism. Conceptually this is cleaner and more consistent. |
10939:6f23825b091b |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up CacheBlk class
This patch modernises and tidies up the CacheBlk, removing dead code. |
10938:75c5a45170d7 |
30-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up packet
Some minor fixes and removal of dead code. Changing the flags to be enums rather than static const (to avoid any linking issues caused by the latter). Also adding a getBlockAddr member which hopefully can slowly find its way into caches, snoop filters etc. |
10928:afe7e137943a |
24-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
ruby: dma sequencer: removes redundant code |
10927:9689ead7b479 |
22-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: NetworkLink inherits from Consumer now. |
10922:5ee72f4b2931 |
13-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix (ab)use of emplace to avoid temporary object creation |
10921:07811efc0fde |
13-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Updated DRAMSim2 wrapper to new drain API
Somehow this one slipped through without being updated. |
10920:58fbfddff18d |
10-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
ruby: replace global g_abs_controls with per-RubySystem var
This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation.
The list of abstract controllers is per-RubySystem and should be represented that way, rather than as a global.
Since this is the last remaining Ruby global variable, the src/mem/ruby/Common/Global.* files are also removed. |
10919:80069a602c83 |
10-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
ruby: replace global g_system_ptr with per-object pointers
This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation.
With possibly multiple RubySystem objects, we can no longer use a global variable to find "the" RubySystem object. Instead, each Ruby component has to carry a pointer to the RubySystem object to which it belongs. |
10918:dd3ab1f109ad |
10-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
ruby: replace g_ruby_start with per-RubySystem m_start_cycle
This patch begins the process of removing global variables from the Ruby source with the goal of eventually allowing users to create multiple Ruby instances in a single simulation. Currently, users cannot do so because several global variables and static members are referenced by the RubySystem object in a way that assumes that there will only ever be a single RubySystem. These need to be replaced with per-RubySystem equivalents.
This specific patch replaces the global var g_ruby_start, which is used to calculate throughput statistics for Throttles in simple networks and links in Garnet networks, with a RubySystem instance var m_start_cycle. |
10917:c38f28fad4c3 |
10-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
ruby: remove extra whitespace and correct misspelled words |
10913:38dbdeea7f1f |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining.
This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error.
Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects. |
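The contract described in this entry can be modelled with a short Python sketch. It is a simplification under stated assumptions: `DrainManagerSketch`, `DrainableSketch`, and the `outstanding` counter are hypothetical, and the real Drainable class tracks its state in C++ rather than inside drain() itself.

```python
from enum import Enum


class DrainState(Enum):
    Running = 0
    Draining = 1
    Drained = 2


class DrainManagerSketch:
    """Hypothetical manager that counts objects still draining."""

    def __init__(self):
        self.pending = 0

    def try_drain(self, objects):
        for obj in objects:
            if obj.drain() is DrainState.Draining:
                self.pending += 1
        return self.pending == 0

    def notify_done(self):
        self.pending -= 1


class DrainableSketch:
    def __init__(self, manager, outstanding=0):
        self.manager = manager
        self.outstanding = outstanding  # in-flight requests (assumed)
        self.state = DrainState.Running

    def drain(self):
        # No parameters; returns a DrainState instead of an integer.
        self.state = (DrainState.Drained if self.outstanding == 0
                      else DrainState.Draining)
        return self.state

    def signal_drain_done(self):
        # Safe to call at any time: it only notifies the manager when a
        # drain is actually in progress.
        if self.state is DrainState.Draining:
            self.state = DrainState.Drained
            self.manager.notify_done()

    def complete_request(self):
        self.outstanding -= 1
        if self.outstanding == 0:
            self.signal_drain_done()


manager = DrainManagerSketch()
idle = DrainableSketch(manager)
busy = DrainableSketch(manager, outstanding=1)

idle.signal_drain_done()  # no drain requested yet, so this is a no-op
all_drained_at_once = manager.try_drain([idle, busy])
busy.complete_request()   # last request finishes, drain completes
```

The object never has to ask whether a drain was requested before signalling, which is the code reduction the message refers to.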
10912:b99a6662d7c2 |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Decouple draining from the SimObject hierarchy
Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this.
This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object.
While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining.
A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains. |
10910:32f3d1c454ec |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario. |
10905:a6ca6831e775 |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor the serialization base class
Objects that can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a ScopedCheckpointSection helper class. Some objects need to serialize data structures that do not derive from Serializable into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/, and subsequent serialization calls during the lifetime of this helper object happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. |
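The scoped-section idea from the list above maps naturally onto a context manager, as in this Python sketch. `CheckpointOutSketch`, `scoped_section`, and `param_out` are hypothetical names chosen for the illustration; the gem5 helper is a C++ class whose lifetime defines the section, and the dotted section naming is an assumption.

```python
from contextlib import contextmanager


class CheckpointOutSketch:
    """Hypothetical output checkpoint: section name -> {key: value}."""

    def __init__(self):
        self.sections = {}
        self._path = []

    @contextmanager
    def scoped_section(self, name):
        # Entering the scope pushes a subsection; leaving it pops the
        # subsection again, even if serialization raises.
        self._path.append(name)
        try:
            yield
        finally:
            self._path.pop()

    def param_out(self, key, value):
        # The fully qualified section name is built on the fly from the
        # current nesting, so callers never assemble it by hand.
        section = ".".join(self._path)
        self.sections.setdefault(section, {})[key] = value


cp = CheckpointOutSketch()
with cp.scoped_section("system"):
    cp.param_out("ticks", 10)
    with cp.scoped_section("cache"):
        cp.param_out("size", 64)
    cp.param_out("cpus", 2)
```

Values written inside the nested scope land in the nested section, and writes after the scope ends return to the enclosing section.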
10902:36b9241fa027 |
06-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
mem: Cleanup CommMonitor in preparation for probe support
Make configuration parameters constant and get rid of an unnecessary dependency on the Time class. |
10896:66e131813346 |
04-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
mem: packet: Add const to constructor argument |
10895:287285860dd6 |
04-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: drop NetworkMessage class
This patch drops the NetworkMessage class. The relevant data members and functions have been moved to the Message class, which was the parent of NetworkMessage. |
10894:52c793be01e7 |
04-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi three level: name change to avoid clash
The accessor function getDestination() for Destination variable in the coherence message clashes with the getDestination() that is part of the Message class. Hence the name change. |
10893:f567e80c0714 |
04-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove message buffer node
This structure's only purpose was to provide a comparison function for ordering messages in the MessageBuffer. The comparison function is now being moved to the Message class itself. So we no longer require this structure. |
10891:d958fc5f4a00 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Increase the default buffer sizes for the DDR4 controller
This patch increases the default read/write buffer sizes for the DDR4 controller config to values that are more suitable for the high bandwidth and high bank count. |
10890:bac38d2a4acb |
03-Jul-2015 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Update DRAM command scheduler for bank groups
This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank group (tCCD_L) delays.
The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command:
1) Bank is prepped and burst can issue seamlessly, without a bubble
2) Bank is not prepped, but can prep and issue seamlessly, without a bubble
3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus
Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exists. Otherwise, the default selection will choose the packet with the minimum bank prep delay. |
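The priority bands described in this entry can be written down as a small selection function. This Python sketch assumes each queued packet already carries the three properties it needs; `Candidate` and `choose` are hypothetical, and the real controller derives these properties from bank and bus timing state.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    prepped: bool    # bank already prepared for this access
    seamless: bool   # burst can issue without a bubble (prep hidden if any)
    prep_delay: int  # time needed to prepare the bank


def choose(queue):
    """Select the next packet; `queue` is in arrival (FCFS) order."""
    bands = [
        lambda c: c.prepped and c.seamless,      # 1) prepped, no bubble
        lambda c: not c.prepped and c.seamless,  # 2) prep can be hidden
        lambda c: c.prepped and not c.seamless,  # 3) prepped, bubble
    ]
    for in_band in bands:
        for candidate in queue:  # first match wins, i.e. FCFS in the band
            if in_band(candidate):
                return candidate
    # Default: minimum bank prep delay (min keeps the earliest on ties).
    return min(queue, key=lambda c: c.prep_delay) if queue else None


a = Candidate("a", prepped=True, seamless=False, prep_delay=0)
b = Candidate("b", prepped=False, seamless=True, prep_delay=3)
c = Candidate("c", prepped=False, seamless=False, prep_delay=2)
d = Candidate("d", prepped=False, seamless=False, prep_delay=1)

hidden_prep_wins = choose([a, b, c])
bubble_fallback = choose([a, c])
default_choice = choose([c, d])
```

The unprepped packet with a hidden prep beats the prepped packet that would leave a bubble on the bus, matching the ordering of bands 2 and 3.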
10889:c4c13fced000 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid DRAM write queue iteration for merging and read lookup
This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed. |
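A minimal Python sketch of the lookup structure follows, assuming writes are tracked at burst-aligned addresses. `WriteQueueSketch` and its methods are hypothetical, and the read-side iteration is reduced to a bare address match where the real controller would examine the queued entries in more detail.

```python
class WriteQueueSketch:
    """Hypothetical write queue with a set of buffered burst addresses."""

    def __init__(self, burst_size):
        self.burst_size = burst_size
        self.queue = []        # queued write bursts, in order
        self.buffered = set()  # burst-aligned addresses currently queued

    def _burst_addr(self, addr):
        return addr - addr % self.burst_size

    def add_write(self, addr):
        burst = self._burst_addr(addr)
        if burst in self.buffered:
            # Merge decision relies entirely on the set, with no iteration.
            # As in the message, unmergeable corner cases are ignored.
            return False
        self.buffered.add(burst)
        self.queue.append(burst)
        return True

    def read_hits_write(self, addr):
        burst = self._burst_addr(addr)
        if burst not in self.buffered:
            return False  # common case: no queue walk needed
        # Only on a set match is the queue actually searched.
        return any(entry == burst for entry in self.queue)


wq = WriteQueueSketch(burst_size=64)
first_write_queued = wq.add_write(0x100)
second_write_queued = wq.add_write(0x120)  # same 64-byte burst, merged
read_in_burst = wq.read_hits_write(0x13F)
read_outside = wq.read_hits_write(0x140)
```

The set turns the common miss case into a constant-time check, at the cost of keeping it in sync with the queue whenever entries are added or drained.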
10888:85a001f2193b |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Delay responses in the crossbar before forwarding
This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules in paying for the latency (through the pkt->headerDelay), we now queue them before sending them.
The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it.
As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc.
Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop. |
10887:279efb97ec99 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant is_top_level cache parameter
This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed.
This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2). |
10886:fdd4a895f325 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split WriteInvalidateReq into write and invalidate
WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp.
The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the initial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory.
The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason. |
10885:3ac92bf1f31f |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add ReadCleanReq and ReadSharedReq packets
This patch adds two new read request packets:
ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data.
ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified.
Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature.
With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches. |
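The classification this enables can be sketched as below. The command names come from the description above; the helper names and the string-based representation are assumptions made for illustration only.

```python
# Read commands that, per the description, are only issued by caches.
CACHE_READ_CMDS = {"ReadCleanReq", "ReadSharedReq", "ReadExReq"}

# Line states a response may carry for the two new commands.
ALLOWED_RESPONSE_STATES = {
    "ReadCleanReq": {"shared", "exclusive"},
    "ReadSharedReq": {"shared", "exclusive", "owned", "modified"},
}


def issued_by_cache(cmd):
    """A plain ReadReq is reserved for non-caching masters."""
    return cmd in CACHE_READ_CMDS


def response_state_ok(cmd, state):
    """Check a response state against what the request asked for."""
    return state in ALLOWED_RESPONSE_STATES[cmd]
```

With this split, the source of a read no longer has to be inferred from the packet size.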
10884:c60acdbdd6ad |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow read-only caches and check compliance
This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data.
A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data. |
10883:9294c4a60251 |
03-Jul-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add clean evicts to improve snoop filter tracking
This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory hierarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache.
The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but, unlike a Writeback, it does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards, CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory hierarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages.
We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implementation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message.
We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue. |
10882:3e84b8b49c77 |
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Convert Request static const flags to enums
This patch fixes an issue which is very widespread in the codebase, causing sporadic linking failures. The issue is that we declare static const class variables in the header, without any definition (as part of a source file). In most cases the compiler propagates the value and we have no issues. However, especially for less optimising builds such as debug, we get sporadic linking failures due to undefined references.
This patch fixes the Request class, by turning the static const flags and master IDs into C++11 typed enums. |
10877:73d4798871a5 |
25-Jun-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove README
No longer maintained. Updates are only made to the wiki page. So being dropped. |
10876:7544f29b7dfc |
25-Jun-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message: remove a data member added by mistake
I (Nilay) had mistakenly added a data member to the Message class in revision c1694b4032a6. The data member is being removed. |
10875:60eb3fef9c2d |
25-Jun-2015 |
Jason Power <power.jg@gmail.com> |
Ruby: Remove assert in RubyPort retry list logic
Remove the assert when adding a port to the RubyPort retry list. Instead of asserting, just ignore the added port, since it's already on the list. Without this patch, Ruby+detailed fails for even the simplest tests |
10872:ebb3d0737aa7 |
09-Jun-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add check for express snoop in packet destructor
Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleanEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly whether the current packet is an express snoop, in which case we do not delete the request. |
10871:119cfadf2203 |
09-Jun-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix snoop packet data allocation bug
This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data. |
10865:282c2a89ace8 |
07-Jun-2015 |
Marco Elver <marco.elver@ed.ac.uk> |
ruby: Fix MESI consistency bug
Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load reordering, as the LQ is never notified of the invalidate.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10864:83cec4049505 |
07-Jun-2015 |
Matthias Jung <jungma@eit.uni-kl.de> |
mem: Add HMC Timing Parameters
A single HMC-2500 x32 model based on:
[1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV (capacitance-voltage) models to estimate the DRAM bank latency and power numbers.
[2] A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube (E. Azarkhish et al.)
Assumed for the HMC model is a 30 nm technology node. The modelled HMC consists of a 4 Gbit part with 4 layers connected with TSVs. Each layer has 16 vaults and each vault consists of 2 banks per layer. In order to be able to use the same controller used for 2D DRAM generations for HMC, the following analogy is done:
Channel (DDR) => Vault (HMC)
device_size (DDR) => size of a single layer in a vault
ranks per channel (DDR) => number of layers
banks per rank (DDR) => banks per layer
devices per rank (DDR) => devices per layer (1 for HMC)
The parameters for which no input is available are inherited from the DDR3 configuration. |
10862:c78bfcfdfb02 |
30-May-2015 |
Christoph Pfister <pfistchr@student.ethz.ch> |
mem: addr_mapper: restore old address if request not sent
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10849:30bbc9b60a8c |
26-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Deprecation warning for RubyMemoryControl
A step towards removing RubyMemoryControl and shifting users to DRAMCtrl. The latter is faster, more representative, very versatile, and is integrated with power models. |
10837:ecbab2522757 |
19-May-2015 |
Joel Hestness <jthestness@gmail.com> |
ruby: Fix RubySystem warm-up and cool-down scope
The processes of warming up and cooling down Ruby caches are simulation-wide processes, not just RubySystem instance-specific processes. Thus, the warm-up and cool-down variables should be globally visible to any Ruby components participating in either process. Make these variables static members and track the warm-up and cool-down processes as appropriate.
This patch also has two side benefits:
1) It removes references to the RubySystem g_system_ptr, which are problematic for allowing multiple RubySystem instances in a single simulation. Warmup and cooldown variables being static (global) reduces the need for instance-specific dereferences through the RubySystem.
2) From the AbstractController, it removes local RubySystem pointers, which are used inconsistently with other uses of the RubySystem: 11 other uses reference the RubySystem with the g_system_ptr. Only sequencers have local pointers. |
10826:fe0b1f40ea5a |
17-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@ARM.com> |
mem: Create a request copy for deferred snoops
Sometimes, we need to defer an express snoop in an MSHR, but the original request might complete and deallocate the original pkt->req. In those cases, create a copy of the request so that someone who is inspecting the delayed snoop can still inspect the request. All of this is rather hacky, but the allocation / linking and general life-time management of Packet and Request is rather tricky. Deleting the copy is another tricky area; testing so far has shown that the right copy is deleted at the right time. |
10824:308771bd2647 |
05-May-2015 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations:
* Speculation: CPU models are not allowed to issue speculative loads.
* Write combining: CPU models and caches are not allowed to merge writes to the same cache line.
Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior. |
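A small sketch of how the two flags divide responsibilities. The flag names follow the description above, but the bit values and the helper functions are hypothetical and do not reflect the actual Request layout.

```python
from enum import IntFlag


class ReqFlags(IntFlag):
    # Bit positions are placeholders chosen for this sketch.
    UNCACHEABLE = 0x1
    STRICT_ORDER = 0x2


def cpu_may_speculate(flags):
    """CPU models must not issue speculative loads to strictly ordered memory."""
    return not (flags & ReqFlags.STRICT_ORDER)


def may_merge_writes(flags):
    """Write combining is disallowed for strictly ordered accesses."""
    return not (flags & ReqFlags.STRICT_ORDER)


def memory_may_reorder(flags):
    """The memory system may still reorder unless UNCACHEABLE is set."""
    return not (flags & ReqFlags.UNCACHEABLE)


# Typical device access: both flags combined, as the note above expects.
DEVICE_ACCESS = ReqFlags.UNCACHEABLE | ReqFlags.STRICT_ORDER
```

Setting STRICT_ORDER alone constrains the CPU side only, which is why the two flags are expected to be combined.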
10823:64cd1dcd61a5 |
05-May-2015 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem, alpha: Move Alpha-specific request flags
Move Alpha-specific memory request flags to an architecture-specific header and map them to the architecture specific flag bit range. |
10821:581fb2484bd6 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent.
The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block. |
10819:2e8abe3bbe32 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Pass shared downstream through caches
This patch ensures that we pass on information about a packet being shared (rather than exclusive), when forwarding a packet downstream.
Without this patch there is a risk that a downstream cache considers the line exclusive when it really isn't. |
10818:9077e269ca4a |
05-May-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Add forward snoop check for HardPFReqs
We should always check whether the cache is supposed to be forwarding snoops before generating snoops. |
10817:404b2b015a17 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add missing stats update for uncacheable MSHRs
This patch adds a missing counter update for the uncacheable accesses. By updating this counter we also get a meaningful average latency for uncacheable accesses (previously inf). |
10816:b3b9097f44a9 |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up BaseCache parameters
This patch simply tidies up the BaseCache parameters and removes the unused "two_queue" parameter. |
10815:169af9a2779f |
05-May-2015 |
David Guillen <david.guillen@arm.com> |
mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods rather than using the replacement policy as a template argument.
There is no impact on the simulation performance, and overall the changes make it easier to modify (and subclass) the cache and/or replacement policy. |
10809:e3963342ead4 |
29-Apr-2015 |
Rizwana Begum <rb639@drexel.edu> |
mem: Simplify page close checks for adaptive policies
Both open_adaptive and close_adaptive page policies keep the page open if a row hit is found. If a row hit is not found, close_adaptive page policy precharges the row, and open_adaptive policy precharges the row only if there is a bank conflict request waiting in the queue.
This patch makes the checks for the above conditions simpler.
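The decision described above reduces to a small predicate. The function and argument names here are assumptions for illustration; only the two policy names come from the description.

```python
def should_precharge(policy, row_hit_pending, bank_conflict_pending):
    """Decide whether to close (precharge) the open row after an access.

    policy: "open_adaptive" or "close_adaptive"
    row_hit_pending: a queued request hits the currently open row
    bank_conflict_pending: a queued request targets another row in this bank
    """
    if policy not in ("open_adaptive", "close_adaptive"):
        raise ValueError("not an adaptive page policy: %s" % policy)
    if row_hit_pending:
        return False  # both policies keep the page open on a row hit
    if policy == "close_adaptive":
        return True   # no row hit: always precharge
    # open_adaptive: precharge only if a conflicting request is waiting
    return bank_conflict_pending
```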
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10808:c1694b4032a6 |
29-Apr-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: set: replace long by unsigned long
UBSan complains about negative value being shifted |
10783:631e736554c9 |
13-Apr-2015 |
Lena Olson <lena@cs.wisc.edu> |
ruby: allow restoring from checkpoint when using DRAMCtrl
Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not working, because ruby and DRAMCtrl disagreed on the current tick during warmup. Since there is no reason to do timing requests during warmup, use functional requests instead.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10771:ea35886cd847 |
27-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Support any number of master-IDs in stride prefetcher
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs) that it could handle. Since master IDs need to be unique per system, and every core, cache etc. requires a separate master port, a static limit on these does not make much sense.
Instead, this patch adds a small hash map that will map all master IDs to the right prefetch state and dynamically allocates new state for new master IDs. |
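The idea can be sketched with a dictionary keyed by master ID. The class and method names are hypothetical, and the per-context state is reduced to a last-address table so the on-demand allocation stays visible.

```python
class StridePrefetcher:
    """Toy stride tracker with per-master state allocated on demand."""

    def __init__(self):
        # master ID -> {pc: last address seen}; no static limit on contexts
        self.tables = {}

    def table_for(self, master_id):
        # Allocate fresh state the first time a master ID is seen.
        if master_id not in self.tables:
            self.tables[master_id] = {}
        return self.tables[master_id]

    def observe(self, master_id, pc, addr):
        """Record an access and return the stride, or None on first sight."""
        table = self.table_for(master_id)
        prev = table.get(pc)
        table[pc] = addr
        return None if prev is None else addr - prev
```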
10770:c48310de1a51 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allocate cache writebacks before new MSHRs
This patch changes the order of writeback allocation such that any writebacks resulting from a tag lookup (e.g. for an uncacheable access), are added to the writebuffer before any new MSHR entries are allocated. This ensures that the writebacks logically precede the new allocations.
The patch also changes the uncacheable flush to use proper timed (or atomic) writebacks, as opposed to functional writes. |
10769:9e521c0c3877 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cleanup flow for uncacheable accesses
This patch simplifies the code dealing with uncacheable timing accesses, aiming to align it with the existing miss handling. Similar to what we do in atomic, a timing request now goes through Cache::access (where the block is also flushed), and then proceeds to ignore any existing MSHR for the block in question. This unifies the flow for cacheable and uncacheable accesses, and for atomic and timing. |
10768:9a34e28cd2c2 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ignore uncacheable MSHRs when finding matches
This patch changes how we search for matching MSHRs, ignoring any MSHR that is allocated for an uncacheable access. By doing so, this patch fixes a corner case in the MSHRs where incorrect data ended up being copied into a (cacheable) read packet due to a first uncacheable MSHR target of size 4, followed by a cacheable target to the same MSHR of size 64. The latter target was filled with nonsense data. |
10767:993c2baa485a |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant allocateUncachedReadBuffer in cache
This patch removes the no-longer-needed allocateUncachedReadBuffer. Besides the checks it is exactly the same as allocateMissBuffer and thus provides no value. |
10766:b2071d0eb5f1 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Modernise MSHR iterators to C++11
This patch updates the iterators in the MSHR and MSHR queues to use C++11 range-based for loops. It also does a bit of additional house keeping. |
10764:b32578b2af99 |
27-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align all MSHR entries to block boundaries
This patch aligns all MSHR queue entries to block boundaries to simplify checks for matches. Previously there were corner cases that could lead to existing entries not being identified as matches.
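Block alignment makes the match check a plain equality test. A minimal sketch, assuming a power-of-two block size of 64 bytes and hypothetical helper names:

```python
def block_align(addr, blk_size=64):
    """Round an address down to the start of its cache block."""
    # The mask trick only works for power-of-two block sizes.
    assert blk_size > 0 and blk_size & (blk_size - 1) == 0
    return addr & ~(blk_size - 1)


def same_block(addr_a, addr_b, blk_size=64):
    """Two accesses match an entry if their aligned addresses are equal."""
    return block_align(addr_a, blk_size) == block_align(addr_b, blk_size)
```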
There are, rather alarmingly, a few regressions that change with this patch. |
10763:d524dc4f16ae |
27-Mar-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED
This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more generic BLOCK_CACHED flag. Future patches implementing cache eviction messages can use the BLOCK_CACHED flag in almost the same manner as hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The PREFETCH_SNOOP_SQUASH flag is set if the prefetch target is found in the tags or the MSHRs in any state, so we are simply replacing calls to setPrefetchSquashed() with setBlockCached(). The case where the prefetch target is found in the writeback MSHRs of upper level caches continues to be covered by the MEM_INHIBIT flag. |
10760:8f5993cfa916 |
23-Mar-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from LLSC operations. Using "locked" by itself should be obviously ambiguous now. |
10755:dcd7cf19f7c5 |
23-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up Request
This patch does a bit of house keeping, fixing up typos, removing dead code etc. |
10745:791e4619919d |
19-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use emplace front/back for deferred packets
Embrace C++11 for the deferred packets as we actually store the objects in the data structure, and not just pointers. |
10744:116c6cd45fff |
19-Mar-2015 |
Geoffrey Blake <Geoffrey.Blake@arm.com> |
mem: Enable CommMonitor to output traces in atomic mode
The CommMonitor by default only allows memory traces to be gathered in timing mode. This patch allows memory traces to be gathered in atomic mode if all one needs is a functional trace of memory addresses used and timing information is of secondary concern. |
10741:655ff3f6352d |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: remove redundant test in Cache::recvTimingResp()
For some reason we were checking mshr->hasTargets() even though we had already called mshr->getTarget() unconditionally earlier in the same function (which asserts if there are no targets). Get rid of this useless check, and while we're at it get rid of the redundant call to mshr->getTarget(), since we still have the value saved in a local var. |
10740:88d515925cbf |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: add local var in Cache::recvTimingResp()
The main loop in recvTimingResp() uses target->pkt all over the place. Create a local tgt_pkt to help keep lines under the line length limit. |
10739:4cfe55719da5 |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: restructure Packet cmd initialization a bit more
Refactor the way that specific MemCmd values are generated for packets. The new approach is a little more elegant in that we assign the right value up front, and it's also more amenable to non-heap-allocated Packet objects.
Also replaced the code in the Minor model that was still doing it the ad-hoc way.
This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249. |
10738:a5f134ef30d3 |
14-Mar-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: clean up write buffer check in Cache::handleSnoop()
The 'if (writebacks.size)' check was redundant, because writeBuffer.findMatches() would return false if the writebacks list was empty.
Also renamed 'mshr' to 'wb_entry' in this context since we are pointing at a writebuffer entry and not an MSHR (even though it's the same C++ class). |
10725:d1387fcd94b8 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Unify all cache DPRINTF address formatting
This patch changes all the DPRINTF messages in the cache to use '%#llx' every time a packet address is printed. The inclusion of '#' ensures '0x' is prepended, and since the address type is a uint64_t %x really should be %llx. |
10724:1072b1381560 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix cache MSHR conflict determination
This patch fixes a rather subtle issue in the sending of MSHR requests in the cache, where the logic previously did not check for conflicts between the MSHR queue and the write queue when requests were not ready. The correct thing to do is to always check, since not having a ready MSHR does not guarantee that there is no conflict.
The underlying problem seems to have slipped past due to the symmetric timings used for the write queue and MSHR queue. However, with the recent timing changes the bug caused regressions to fail. |
10723:b1d90d88420e |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add byte mask to Packet::checkFunctional
This patch changes the valid-bytes start/end to a proper byte mask. With the changes in timing introduced in previous patches there are more packets waiting in queues, and there are regressions using the checker CPU failing due to non-contiguous read data being found in the various cache queues.
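Why a start/end pair is not enough can be seen in a small sketch. The function name and the list-of-booleans mask are assumptions for illustration and are not how Packet::checkFunctional stores its state.

```python
def merge_valid_bytes(mask, data, offset, src):
    """Copy src into data at offset and mark those bytes valid in mask."""
    for i, byte in enumerate(src):
        data[offset + i] = byte
        mask[offset + i] = True


def all_valid(mask):
    """The read is satisfied only once every byte has been found."""
    return all(mask)


# An 8-byte read that finds its data in two separate queue entries first.
mask = [False] * 8
data = bytearray(8)
merge_valid_bytes(mask, data, 0, b"\x01\x02")  # bytes 0-1
merge_valid_bytes(mask, data, 6, b"\x07\x08")  # bytes 6-7, leaving a hole
```

After the two merges the valid bytes are 0-1 and 6-7, a pattern that a single start/end range cannot describe but a per-byte mask can.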
This patch also adds some more comments explaining what is going on, and adds the fourth and missing case to Packet::checkFunctional. |
10722:886d2458e0d6 |
02-Mar-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Add option to force in-order insertion in PacketQueue
By default, the packet queue is ordered by the ticks of the to-be-sent packets. With the recent modifications of packets sinking their header time when their response leaves the caches, there could be cases of MSHR targets being allocated and ordered A, B, but their responses being sent out in the order B, A. This led to inconsistencies in bus traffic, in particular the snoop filter observing first a ReadExResp and later a ReadRespWithInv. Logically, these were ordered the other way around behind the MSHR, but due to the timing adjustments when inserting into the PacketQueue, they were sent out in the wrong order on the bus, confusing the snoop filter.
This patch adds a flag (off by default) such that these special cases can request in-order insertion into the packet queue, which might offset timing slightly. This is expected to occur rarely and not affect timing results. |
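The reordering and the effect of the flag can be illustrated with a toy queue. This sketch assumes that in-order insertion simply appends at the tail regardless of tick; the actual PacketQueue logic may differ, and all names here are hypothetical.

```python
class PacketQueue:
    """Toy transmit list holding (tick, packet_name) tuples."""

    def __init__(self):
        self.transmit_list = []

    def sched_send(self, tick, pkt, force_order=False):
        if force_order:
            # Assumption: keep arrival order by appending at the tail.
            self.transmit_list.append((tick, pkt))
            return
        # Default: keep the list sorted by tick, inserting after any
        # entries that share the same tick.
        idx = len(self.transmit_list)
        while idx > 0 and self.transmit_list[idx - 1][0] > tick:
            idx -= 1
        self.transmit_list.insert(idx, (tick, pkt))

    def order(self):
        return [pkt for _, pkt in self.transmit_list]
```

With tick ordering, a response A scheduled for tick 20 is overtaken by a later-allocated response B scheduled for tick 10; with `force_order=True` the allocation order A, B is preserved.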
10721:3e6a3eaac71b |
02-Mar-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Downstream components consume new crossbar delays
This patch makes the caches and memory controllers consume the delay that is annotated to a packet by the crossbar. Previously many components simply threw these delays away. Note that the devices still do not pay for these delays. |
10720:67b3e74de9ae |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move crossbar default latencies to subclasses
This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios.
Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXBar. More can be added if needed, and the defaults can be overridden. |
10719:b4fc9ad648aa |
02-Mar-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Add crossbar latencies
This patch introduces latencies in the crossbar that were neglected before. In particular, it adds three parameters to the crossbar model: front_end_latency, forward_latency, and response_latency. Along with these parameters, three corresponding members are added: frontEndLatency, forwardLatency, and responseLatency. The coherent crossbar has an additional snoop_response_latency.
The latency of the request path through the xbar is set as --> frontEndLatency + forwardLatency
In case the snoop filter is enabled, the request path latency also includes the look-up latency of the snoop filter. --> frontEndLatency + SF(lookupLatency) + forwardLatency.
The latency of the response path through the xbar is set instead as --> responseLatency.
In case of snoop response, if the response is treated as a normal response the latency associated is again --> responseLatency;
If instead it is forwarded as a snoop response, we use an additional variable, snoopResponseLatency, and the latency associated is --> snoopResponseLatency;
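The latency rules above can be written out as two small functions. The function names and the cycle counts in the usage note are assumptions; only the way the terms combine is taken from the description.

```python
def request_path_latency(front_end_latency, forward_latency,
                         snoop_filter_lookup_latency=None):
    """Request path: front end + forward, plus the SF lookup if enabled."""
    latency = front_end_latency + forward_latency
    if snoop_filter_lookup_latency is not None:
        latency += snoop_filter_lookup_latency
    return latency


def response_path_latency(response_latency, snoop_response_latency=None,
                          forwarded_as_snoop=False):
    """Response path: snoopResponseLatency only when forwarded as a snoop."""
    if forwarded_as_snoop:
        if snoop_response_latency is None:
            raise ValueError("snoop_response_latency is required")
        return snoop_response_latency
    # Normal responses, and snoop responses treated as normal responses.
    return response_latency
```

For example, with a 3-cycle front end, a 4-cycle forward stage and a 1-cycle snoop filter lookup, a request pays 8 cycles through a coherent crossbar with the filter enabled and 7 without it.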
Furthermore, this patch lets the crossbar progress on the next clock edge after an unused retry, changing the time the crossbar considers itself busy after sending a retry that was not acted upon. |
10714:9ba5e70964a4 |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the cache debug messages
Avoid redundant inclusion of the name in the DPRINTF string. |
10713:eddb533708cb |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing issue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fix the previously seen deadlocks. |
10712:245cd4691cbf |
02-Mar-2015 |
Ali Jafri <ali.jafri@arm.com> |
mem: Fix prefetchSquash + memInhibitAsserted bug
This patch resolves a bug with hardware prefetches. Before a hardware prefetch is sent towards the memory, the system generates a snoop request to check all caches above the prefetch generating cache for the presence of the prefetch target. If the prefetch target is found in the tags or the MSHRs of the upper caches, the cache sets the prefetchSquashed flag in the snoop packet. When the snoop packet returns with the prefetchSquashed flag set, the prefetch generating cache deallocates the MSHR reserved for the prefetch. If the prefetch target is found in the writeback buffer of the upper cache, the cache sets the memInhibit flag, which signals the prefetch generating cache to expect the data from the writeback. When the snoop packet returns with the memInhibitAsserted flag set, it marks the allocated MSHR as inService and waits for the data from the writeback.
If the prefetch target is found in multiple upper level caches, specifically in the tags or MSHRs of one upper level cache and the writeback buffer of another, the snoop packet will return with both prefetchSquashed and memInhibitAsserted set, while the current code is not written to handle such an outcome. The current code checks for the prefetchSquashed flag first and, if it finds the flag, deallocates the reserved MSHR. This leads to an assert failure when the data from the writeback arrives at the cache. In this fix, we simply switch the order of checks: we first check for memInhibitAsserted and then for prefetchSquashed. |
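The corrected order of checks can be sketched as a small decision function. The function name and the returned action strings are hypothetical; only the flag names and their priority follow the description.

```python
def handle_prefetch_snoop(mem_inhibit_asserted, prefetch_squashed):
    """Decide what to do with the MSHR reserved for a hardware prefetch."""
    if mem_inhibit_asserted:
        # Data will arrive from an upper-level writeback, so the MSHR
        # must stay allocated even if the prefetch was also squashed.
        return "mark_in_service"
    if prefetch_squashed:
        # Target already present above and no writeback is on its way.
        return "deallocate_mshr"
    return "send_prefetch"
```

Checking memInhibitAsserted first means the case where both flags are set keeps the MSHR alive for the incoming writeback data.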
10706:4206946d60fe |
26-Feb-2015 |
Jason Power <power.jg@gmail.com> |
Ruby: Update backing store option to propagate through to all RubyPorts
Previously, the user would have to manually set access_backing_store=True on all RubyPorts (Sequencers) in the config files. Now, instead there is one global option that each RubyPort checks on initialization.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10703:41413f830836 |
16-Feb-2015 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Fix initial value problem with MemChecker
In highly loaded cases, reads might actually overlap with writes to the initial memory state. The mem checker needs to detect such cases and permit the read to return either the value from the writes (what it is doing now) or the initial, unknown value.
This patch adds this logic. |
10700:417ba77dedb4 |
16-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: mmap the backing store with MAP_NORESERVE
This patch ensures we can run simulations with very large simulated memories (at least 64 TB based on some quick runs on a Linux workstation). In essence this allows us to efficiently deal with sparse address maps without having to implement a redirection layer in the backing store.
This opens up for run-time errors if we eventually exhaust the host's memory and swap space, but this should hopefully never happen. |
10699:d0004c12d024 |
16-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use the range cache for lookup as well as access
This patch changes the range cache used in the global physical memory to be an iterator so that we can use it not only as part of isMemAddr, but also access and functionalAccess. This matches use-cases where a core is using the atomic non-caching memory mode, and repeatedly calls isMemAddr and access.
Linux boot on aarch32, with a single atomic CPU, is now more than 30% faster when using "--fastmem" compared to not using the direct memory access. |
10694:1a6785e37d81 |
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarification of packet crossbar timings
This patch clarifies the packet timings annotated when going through a crossbar.
The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated to the delivery of the header of the packet.
The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to process the payload of the packet.
For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay. |
10693:c0979b2ebda5 |
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarify usage of latency in the cache
This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components).
The introduced variables are:
- lookupLatency: latency of tag lookup, occurring on any access
- forwardLatency: latency that occurs in case of outbound miss
- fillLatency: latency to fill a cache block
We keep the existing responseLatency.
The forwardLatency is used by allocateInternalBuffer() for:
- MSHR allocateWriteBuffer (uncached write forwarded to WriteBuffer);
- MSHR allocateMissBuffer (cacheable miss in MSHR queue);
- MSHR allocateUncachedReadBuffer (uncached read allocated in MSHR queue)
It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency. |
10680:7639c17357dc |
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify express snoop behaviour
This patch adds a bit of documentation with insights around how express snoops really work. |
10679:204a0f53035e |
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify cache behaviour for pending dirty responses
This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made. |
10675:bb7cd7193edc |
03-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
config: Adjust DRAM channel interleaving defaults
This patch changes the DRAM channel interleaving default behaviour to be more representative. The default address mapping (RoRaBaCoCh) moves the channel bits towards the least significant bits, and uses 128 byte as the default channel interleaving granularity.
These defaults can be overridden if desired, but should serve as a sensible starting point for most use-cases. |
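As a rough illustration of what placing the channel bits near the least significant bits means, the sketch below picks a channel from an address using a 128-byte interleaving granularity. The function name `channel_of` and its signature are hypothetical and are not taken from the gem5 configuration scripts; the actual address decoding in the DRAM controller may differ in detail.

```python
def channel_of(addr: int, num_channels: int, granularity: int = 128) -> int:
    """Pick a channel, assuming the channel bits sit directly above the
    interleaving-granularity bits (hypothetical helper, not gem5 code)."""
    if num_channels < 1 or granularity < 1:
        raise ValueError("channel count and granularity must be positive")
    if num_channels & (num_channels - 1) or granularity & (granularity - 1):
        raise ValueError("channel count and granularity must be powers of two")
    # Consecutive 128-byte chunks rotate across the channels.
    return (addr // granularity) % num_channels
```

With four channels, addresses 0 to 127 map to channel 0, 128 to 255 to channel 1, and the pattern wraps around at address 512.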
10660:87f7b5a07584 |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused Packet src and dest fields
This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed. |
10659:3a3bb559b112 |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove Packet source from ForwardResponseRecord
This patch removes the source field from the ForwardResponseRecord, but keeps the class as it is part of how the cache identifies responses to hardware prefetches that are snooped upwards. |
10658:1de300588c4f |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused RequestState in the bridge
This patch removes the bridge sender state as the Crossbar now takes care of remembering its own routing decisions. |
10657:8bb4a9717eaa |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Always use SenderState for response routing in RubyPort
This patch aligns how the response routing is done in the RubyPort, using the SenderState for both memory and I/O accesses. Before this patch, only the I/O used the SenderState, whereas the memory accesses relied on the src field in the packet. With this patch we shift to using SenderState in both cases, thus not relying on the src field any longer. |
10656:bd376adfb7d4 |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the XBar responsible for tracking response routing
This patch removes the need for a source and destination field in the packet by shifting the onus of the tracking to the crossbar, much like a real implementation. This change in behaviour also means we no longer need a SenderState to remember the source/dest whenever we have multiple crossbars in the system. Thus, the stack that was created by the SenderState is not needed, and each crossbar locally tracks the response routing.
The fields in the packet are still left behind as the RubyPort (which also acts as a crossbar) does routing based on them. In the succeeding patches the uses of the src and dest field will be removed. Combined, these patches improve the simulation performance by roughly 2%. |
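The idea of a crossbar tracking its own response routing can be sketched as a local table that remembers which port each outstanding request came from. The class and method names below are hypothetical and only illustrate the bookkeeping; they are not the gem5 crossbar interface.

```python
class CrossbarRouteTable:
    """Per-crossbar record of where each outstanding request came from
    (illustrative sketch, not the gem5 implementation)."""

    def __init__(self):
        self._routes = {}  # request id -> source port id

    def record_request(self, req_id, src_port):
        # Remember the ingress port when the request is forwarded.
        if req_id in self._routes:
            raise KeyError(f"request {req_id} is already outstanding")
        self._routes[req_id] = src_port

    def route_response(self, req_id):
        # Look up and forget the ingress port when the response returns.
        return self._routes.pop(req_id)

    def outstanding(self):
        return len(self._routes)
```

Because every crossbar keeps its own table, a packet crossing several crossbars needs no stack of sender states: each hop resolves the return path locally.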
10653:e3fc6bc7f97e |
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clean up Request initialisation
This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we called setVirt instead use the appropriate constructor. |
10648:8c9ed0314ed1 |
20-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix bug in cache request retry mechanism
This patch ensures that inhibited packets that are about to be turned into express snoops do not update the retry flag in the cache. |
10646:17d8d0a624a0 |
20-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move DRAM interleaving check to init
This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving). |
10627:63edd4a1243f |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Change prefetcher to use random_mt
Prefetchers previously used rand() to generate random numbers. |
10626:7982e539d003 |
23-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Hide WriteInvalidate requests from prefetchers
Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate. |
10625:00965520c9f5 |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Fix event scheduling issue for prefetches
The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime() which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarantee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessarily scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen.
This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as a MSHR becomes available. |
10624:97aa1ee1c2d9 |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Fix bug relating to writebacks and prefetches
Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch requests in certain circumstances. |
10623:b9646f4546ad |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Rework the structuring of the prefetchers
Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class.
Finally, the stride-based prefetcher now assumes a customizable lookup table (sets/ways) rather than the previous fully associative structure. |
10622:0b969a35781f |
23-Dec-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Add parameter to reserve MSHR entries for demand access
Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall. |
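A minimal sketch of the reservation policy described above: a prefetch may only take an MSHR if doing so still leaves the reserved number of entries free for demand accesses. The function and parameter names are hypothetical and do not reflect the cache's actual parameter names.

```python
def can_allocate_mshr(in_use: int, total: int, demand_reserve: int,
                      is_prefetch: bool) -> bool:
    """Decide whether a request may take an MSHR (illustrative only)."""
    free = total - in_use
    if is_prefetch:
        # Prefetches must leave `demand_reserve` entries untouched.
        return free > demand_reserve
    # Demand accesses may use any free entry, including reserved ones.
    return free > 0
```

For example, with four MSHRs, one reserved and three in use, a prefetch is refused while a demand access can still allocate the last entry.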
10620:74834c49fbbe |
23-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
config: Expose the DRAM ranks as a command-line option
This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels).
The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding). |
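The power-of-two requirement follows from decoding the rank as a bit slice of the address. A check along these lines could look as follows; the helper name `rank_bits` is hypothetical and this is not the check used in the gem5 config scripts.

```python
def rank_bits(ranks: int) -> int:
    """Validate a rank count and return the number of address bits it
    occupies (hypothetical helper, for illustration)."""
    if ranks < 1 or ranks & (ranks - 1):
        raise ValueError(f"number of ranks ({ranks}) must be a power of two")
    # A power of two 2**n needs exactly n bits in the address decoding.
    return ranks.bit_length() - 1
```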
10619:6dd27a0e0d23 |
23-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure DRAM controller is idle when in atomic mode
This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller force the simulator to switch out of the KVM mode, thus killing performance.
The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions.
The switcheroo regression requires a minor stats bump as a result. |
10618:bb665366cc00 |
23-Dec-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Add rank-wise refresh to the DRAM controller
This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank.
Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy.
The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank. |
10617:471d390943f0 |
23-Dec-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Fix a bug in the DRAM controller arbitration
Fix a minor issue that affects multi-rank systems. |
10615:cd8aae15f89a |
23-Dec-2014 |
Kanishk Sugand <kanishk.sugand@arm.com> |
mem: Add stack distance statistics to the CommMonitor
This patch adds the stack distance calculator to the CommMonitor. The stats are disabled by default. |
10614:da37aec3ed1a |
23-Dec-2014 |
Kanishk Sugand <kanishk.sugand@arm.com> |
mem: Add a stack distance calculator
This patch adds a stand-alone stack distance calculator. The stack distance calculator is a passive SimObject that observes the addresses passed to it. It calculates stack distances (LRU Distances) of incoming addresses based on the partial sum hierarchy tree algorithm described by Alamasi et al. http://doi.acm.org/10.1145/773039.773043.
For each transaction a hashtable look-up is performed. At every non-unique transaction the tree is traversed from the leaf at the returned index to the root, the old node is deleted from the tree, and the sums (to the right) are collected and decremented. The collected sum represents the stack distance of the found node. At every unique transaction the stack distance is returned as numeric_limits<uint64>::max().
In addition to the basic stack distance calculation, a feature to mark an old node in the tree is added. This is useful if it is required to see the reuse pattern. For example, Writebacks to the lower level (e.g. membus from L2), can be marked instead of being removed from the stack (isMarked flag of Node set to True). And then later if this same address is accessed (by L1), the value of the isMarked flag would be True. This gives some insight on how the Writeback policy of the lower level affects the read/write accesses in an application.
Debugging is enabled by setting the verify flag to true. Debugging is implemented using a dummy stack that behaves in a naive way, using STL vectors. Note that this has a large impact on run time. |
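For intuition, the naive approach mentioned above (a plain LRU stack kept in a list) can be sketched as follows. This is not the partial sum hierarchy tree and not the gem5 code; the function name, the `INFINITE` constant, and the convention that an immediate re-access has distance 0 are assumptions made for this example.

```python
INFINITE = 2**64 - 1  # stands in for numeric_limits<uint64>::max()

def stack_distances(addresses):
    """Return the LRU stack distance of each access using a naive
    list-based stack (linear search per access; illustration only)."""
    stack = []  # most recently used address is at the end
    distances = []
    for addr in addresses:
        if addr in stack:
            idx = stack.index(addr)
            # Count the distinct addresses touched since the last access.
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            # First time this address is seen.
            distances.append(INFINITE)
        stack.append(addr)
    return distances
```

The linear search and removal per access explain why such a naive stack has a large impact on run time compared with the tree-based algorithm.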
10612:6332c9d471a8 |
23-Dec-2014 |
Marco Elver <Marco.Elver@ARM.com> |
mem: Add MemChecker and MemCheckerMonitor
This patch adds the MemChecker and MemCheckerMonitor classes. While MemChecker can be integrated anywhere in the system and is independent, the most convenient usage is through the MemCheckerMonitor -- this however, puts limitations on where the MemChecker is able to observe read/write transactions. |
10583:d1e1e8588881 |
02-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Support WriteInvalidate (again)
This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support.
Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible. |
10582:c04dc66e4316 |
02-Dec-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Remove WriteInvalidate support
Prepare for a different implementation following in the next patch |
10572:fc4c90a7d2f5 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Relax packet src/dest check and shift onus to crossbar
This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars.
The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits. |
10571:c848de089432 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clean up packet data allocation
This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data).
The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations.
All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not). |
10570:dcb908e40547 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cleanup Packet::checkFunctional and hasData usage
This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading. |
10569:ffd46545b284 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the requests carried by packets const
This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly. |
10568:e70523bd0d26 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make Request getters const
This patch tidies up the Request class, making all getters const. The odd one out is incAccessDepth which is called by the memory system as packets carry the request around. This is also const to enable the packet to hold on to a const Request. |
10567:926802ed1536 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add checks and explanation for assertMemInhibit usage |
10566:c99c8d2a7c31 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Assume all dynamic packet data is array allocated
This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers.
The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks.
As the last part of the patch, it renames dataDynamicArray to dataDynamic. |
10565:23593fdaadcd |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove redundant Packet::allocate calls
This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions.
The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch. |
10564:a8c16e2d466a |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use const pointers for port proxy write functions
This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works).
The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage. |
10563:755b18321206 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add const getters for write packet data
This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used.
The patch also removes the unused isReadWrite function. |
10562:b99fdc295c34 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove null-check bypassing in Packet::getPtr
This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null.
The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone objects or has any good suggestions).
Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer. |
10561:e1a853349529 |
02-Dec-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Add a GDDR5 DRAM config
This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies. |
10559:62f5f7363197 |
24-Nov-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Another round of static analysis fixups
Mostly addressing uninitialised members. |
10558:426665ec11a9 |
23-Nov-2014 |
Alexandru Dutu <alexandru.dutu@amd.com> |
mem: Page Table map api modification
This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes. |
10557:3a17e8c018b4 |
23-Nov-2014 |
Alexandru Dutu <alexandru.dutu@amd.com> |
mem: Multi Level Page Table bug fix
The multi level page table was giving false positives for already mapped translations. This patch fixes the bogus behavior. |
10556:1e3b3c7a0cba |
23-Nov-2014 |
Alexandru Dutu <alexandru.dutu@amd.com> |
mem: Page Table long lines
Trimmed down all the lines greater than 78 characters. |
10536:aa97958ce2aa |
14-Nov-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Clarify unit of DRAM controller buffer size |
10534:50bbc64efbb8 |
12-Nov-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Delete unused variable in Garnet NetworkLink
With recent changes OSX clang compilation fails due to an unused variable. |
10525:77787650cbbc |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: provide a backing store
Ruby's functional accesses are not guaranteed to succeed as of now. While this is not a problem for the protocols that are currently in the mainline repo, it seems that coherence protocols for GPUs rely on a backing store to supply the correct data. The aim of this patch is to make this backing store configurable, i.e. it comes into play only when a particular option, --access-backing-store, is invoked.
The backing store has been there since M5 and GEMS were integrated. The only difference is that earlier the system used to maintain the backing store and ruby's copy was write-only. Sometime last year, we moved to data being supplied by ruby in SE mode simulations. And now we have patches on the reviewboard, which remove ruby's copy of memory altogether and rely completely on the system's memory to supply data. This patch adds back a SimpleMemory member to RubySystem. This member is used only if the option access-backing-store is set to true. By default, the memory would not be accessed. |
10524:fff17530cef6 |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: interface with classic memory controller
This patch is the final in the series. The whole series and this patch in particular were written with the aim of interfacing ruby's directory controller with the memory controller in the classic memory system. This is being done since ruby's memory controller has not been kept up to date with the changes going on in DRAMs. Classic's memory controller is more up to date and supports multiple different types of DRAM. This also brings classic and ruby ever closer. The patch also changes ruby's memory controller to expose the same interface. |
10523:5777a3e55603 |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove the function functionalReadBuffers()
This function was added when I had incorrectly arrived at the conclusion that such a function can improve the chances of a functional read succeeding. As was later realized, this is not possible in the current setup. While the code using this function was dropped long back, this function was not. Hence the patch. |
10522:13312d6e1caf |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: coherence protocols: remove data block from directory entry
This patch removes the data block present in the directory entry structure of each protocol in gem5's mainline. Firstly, this is required for moving towards a common set of memory controllers for classic and ruby memory systems. Secondly, the data block was being misused in several places. It was being used for having free access to the physical memory instead of calling on the memory controller.
From now on, the directory controller will not have a direct visibility into the physical memory. The Memory Vector object now resides in the Memory Controller class. This also means that some significant changes are being made to the functional accesses in ruby. |
10521:ca248520649f |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: allow adding a bool to an int, like C++. |
10520:7740e0d97d48 |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove sparse memory.
In my opinion, it creates needless complications in the rest of the code. Also, this structure hinders the move towards a common set of code for physical memory controllers. |
10519:7a3ad4b09ce4 |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: single physical memory in fs mode
Both ruby and the system used to maintain memory copies. With the changes carried out for programmed io accesses, only one single memory is required for fs simulations. This patch sets the copy of memory that used to reside with the system to null, so that no space is allocated, but address checks can still be carried out. All the memory accesses now source and sink values to the memory maintained by ruby. |
10518:30e3715c9405 |
06-Nov-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: dma sequencer: remove RubyPort as parent class
As of now DMASequencer inherits from the RubyPort class. But the code in RubyPort class is heavily tailored for the CPU Sequencer. There are parts of the code that are not required at all for the DMA sequencer. Moreover, the next patch uses the dma sequencer for carrying out memory accesses for all the io devices. Hence, it is better to have a leaner dma sequencer. |
10509:d5554f97c451 |
30-Oct-2014 |
Ali Saidi <Ali.Saidi@ARM.com> |
arm, mem: Fix drain bug and provide drain prints for more components. |
10503:94d58056729f |
21-Oct-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: don't inhibit WriteInv's or defer snoops on their MSHRs
WriteInvalidate semantics depend on the unconditional writeback or they won't complete. Also, there's no point in deferring snoops on their MSHRs, as they don't get new data at the end of their life cycle the way other transactions do.
Add comment in the cache about a minor inefficiency re: WriteInvalidate. |
10502:f2f1dbfd505e |
30-Oct-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: have WriteInvalidate obsolete MSHRs
Since WriteInvalidate directly writes into the cache, it can create tricky timing interleavings with reads and writes to the same cache line that haven't yet completed. This patch ensures that these requests, when completed, don't overwrite the newer data from the WriteInvalidate. |
10492:59f9f18aae0c |
20-Oct-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Fix DRAM activationLimit bug
Ensure that we do the proper event scheduling also when the activation limit is disabled. |
10489:99d59caa4c8f |
20-Oct-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Add DRAM device size and check against config
This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match gem5 will print a warning. In order to do correct DRAM research the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device with a specific size and would differ for another device with a different size. |
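The size comparison described above can be sketched as a product of the device size and the device, rank and channel counts, compared against the memory size defined in the system. The function and parameter names below are assumptions for illustration and are not necessarily the names used in the DRAM config.

```python
def dram_capacity(device_size: int, devices_per_rank: int,
                  ranks_per_channel: int, channels: int = 1) -> int:
    """Total DRAM capacity in bytes implied by a config (illustrative)."""
    return device_size * devices_per_rank * ranks_per_channel * channels

def size_mismatch(system_size: int, **config) -> bool:
    """True if the system memory size differs from the config's capacity,
    i.e. the situation in which a warning would be printed."""
    return dram_capacity(**config) != system_size
```

For instance, eight 512 MiB devices per rank with two ranks on one channel give 8 GiB, so a system defined with 4 GiB of memory would trigger the warning.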
10482:f1baf4f7723f |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Modernise PhysicalMemory with C++11 features
Bring the PhysicalMemory up-to-date by making use of range-based for loops and vector initialisation where possible. |
10481:59fb5779ec6e |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Move AddrRangeList from port.hh to addr_range.hh
The new location seems like a better fit. The iterator typedefs are removed in favour of using C++11 auto. |
10478:7135f938ff28 |
16-Oct-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
mem: Add ExternalMaster and ExternalSlave ports
This patch adds two MemoryObjects: ExternalMaster and ExternalSlave. Each object has a single port which can be bound to an externally-provided bridge to a port of another simulation system at initialisation. |
10472:399f35ed5cca |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use shared_ptr for Ruby Message classes
This patch transitions the Ruby Message and its derived classes from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared".
The cloning of derived messages is slightly changed as they previously relied on overriding the base-class through covariant return types. |
10467:dcf27c8220ac |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch,x86,mem: Dynamically determine the ISA for Ruby store check
This patch makes the memory system ISA-agnostic by enabling the Ruby Sequencer to dynamically determine if it has to do a store check. To enable this check, the ISA is encoded as an enum, and the system is able to provide the ISA to the Sequencer at run time. |
10466:73b7549d979e |
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Dynamically determine page bytes in memory components
This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA. |
10446:bb00790bc85c |
11-Oct-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: garnet: add statistics for different activities
This patch adds some statistics to garnet that record the activity of certain structures in the on-chip network. These statistics, in a later patch, will be used for computing the energy consumed by the on-chip network. |
10445:e9fe0dc3cda3 |
11-Oct-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: garnet: remove functions for computing power |
10444:bbe7f8bd41ae |
11-Oct-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: drop Orion network power model
Orion is being dropped from ruby. It will be replaced with DSENT, which has better models. Note that the power / energy numbers reported after this patch has been applied should not be used. |
10443:36afc9dc6f7e |
11-Oct-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi: slight renaming |
10441:5377550e1e15 |
11-Oct-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: structures: correct #ifndef macros in header files |
10432:da98b90b5df0 |
29-Jul-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: DRAMPower integration for on-line DRAM power stats
This patch takes the final step in integrating DRAMPower and adds the appropriate calls in the DRAM controller to provide the command trace and extract the power and energy stats. The debug printouts are still left in place, but will eventually be removed.
At the moment the DRAM power calculation is always on when using the DRAM controller model. The run-time impact of this addition is around 1.5% when looking at the total host seconds of the regressions. We deem this a sensible trade-off to avoid the complication of adding an enable/disable mechanism. |
10431:d9415c7f61a9 |
29-Jul-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Add DRAMPower wrapping class
This patch adds a class to wrap DRAMPower Library in gem5. This class initiates an object of class MemorySpecification of the DRAMPower Library, passes the parameters from DRAMCtrl.py to this object and creates an object of drampower library using the memory specification. |
10430:f958ccec628f |
25-Jul-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Add missing timing and current parameters to DRAM configs
This patch adds missing timing and current parameters to the existing DRAM configs. These missing timing and current parameters are required by DRAMPower for the DRAM power calculations. The missing values are datasheet values of the specified DRAMs, and the appropriate references are added for the various configs. |
10429:025a459edb87 |
09-Oct-2014 |
Omar Naji <Omar.Naji@arm.com> |
mem: Remove DRAMSim2 DDR3 configuration
This patch prunes the DDR3 config that was initially created to match the default config of DRAMSim2. The config is not complete as it is, and to avoid having to maintain it, the easiest way forward is to simply prune it. Going forward we are adding power numbers etc. to the other configurations. |
10424:a910aeb89098 |
09-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add packet sanity checks to cache and MSHRs
This patch adds a number of asserts to the cache, checking basic assumptions about packets being requests or responses. |
10423:cc7f3988c5a9 |
09-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow packet queue to move next send event forward
This patch changes the packet queue such that when scheduling a send, the queue is allowed to move the event forward. |
10422:148b96b7bc77 |
01-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix issues identified by static analysis
Another bunch of issues addressed. |
10414:3dabe649f1df |
27-Sep-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Output precise range when XBar has conflicts |
10413:1f12c11d89b6 |
27-Sep-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Provide better diagnostic for unconnected port
When _masterPort is null, a message to that effect is more helpful than a segfault. |
10412:6400a2ab4e22 |
27-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix a bunch of minor issues identified by static analysis
Add some missing initialisation, and fix a handful of benign resource leaks (including some false positives). |
10405:7a618c07e663 |
20-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus.
As part of this renaming, the code has also been cleaned up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code are due to the delay variables in the packet. |
10403:b3231fc8ae9d |
24-Apr-2014 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Add access statistics for the snoop filter
Adds a simple access counter for requests and snoops for the snoop filter and also classifies hits based on whether a single other holder existed or whether multiple sharers held the line. |
10402:33b4ea05c261 |
20-Sep-2014 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Tie in the snoop filter in the coherent bus |
10401:3ab6c2a5a407 |
24-Apr-2014 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Add a simple snoop counter per bus
This patch adds a simple counter for both total messages and a histogram for the fan-out of snoop messages. The fan-out describes to how many ports snoops had to be sent per incoming request / snoop-from-below. Without any cleverness, this usually means to either all, or all but the requesting port. |
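The fan-out accounting described in the entry above can be illustrated with a minimal Python sketch. It is not the gem5 implementation: the class name `SnoopCounter`, its fields, and the `requesting_port` argument are hypothetical, and the "all, or all but the requesting port" rule is taken directly from the commit message.

```python
from collections import Counter


class SnoopCounter:
    """Hypothetical per-bus snoop statistics (illustration only)."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.total_snoops = 0      # total snoop messages sent
        self.fan_out = Counter()   # histogram: fan-out -> occurrences

    def record(self, requesting_port=None):
        """Record one incoming request or snoop-from-below.

        With no filtering, a snoop-from-below goes to all ports, while a
        request goes to all ports except the requesting one.
        """
        if requesting_port is None:
            fan_out = self.num_ports
        else:
            fan_out = self.num_ports - 1
        self.total_snoops += fan_out
        self.fan_out[fan_out] += 1
        return fan_out
```

A snoop filter would reduce the value passed into the histogram; the sketch only models the unfiltered case that the message calls "without any cleverness".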
10399:0644819fc32f |
20-Sep-2014 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Simple Snoop Filter
This is a first cut at a simple snoop filter that tracks presence of lines in the caches "above" it. The snoop filter can be applied at any given cache hierarchy and will then handle the caches above it appropriately; there is no need to use this only in the last-level bus.
This design currently has some limitations: missing stats, no notion of clean evictions (these will not update the underlying snoop filter, because they are not sent from the evicting cache down), no notion of capacity for the snoop filter and thus no need for invalidations caused by capacity pressure in the snoop filter. These are planned to be added on top with future change sets. |
10394:70cfafa17653 |
20-Sep-2014 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add DDR4 bank group timing
Added the following parameter to the DRAMCtrl class:
- bank_groups_per_rank
This defaults to 1. For the DDR4 case, the default is overridden to indicate bank group architecture, with multiple bank groups per rank.
Added the following delays to the DRAMCtrl class:
- tCCD_L : CAS-to-CAS, same bank group delay
- tRRD_L : RAS-to-RAS, same bank group delay
These parameters are only applied when bank group timing is enabled. Bank group timing is currently enabled only for DDR4 memories.
For all other memories, these delays will default to '0 ns'
In the DRAM controller model, applied the bank group timing to the per bank parameters actAllowedAt and colAllowedAt. The actAllowedAt will be updated based on bank group when an ACT is issued. The colAllowedAt will be updated based on bank group when a RD/WR burst is issued.
At the moment no modifications are made to the scheduling. |
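As a rough illustration of the bank group timing in the entry above, the Python sketch below updates the per-bank `actAllowedAt` and `colAllowedAt` values when a command is issued. It is a sketch under stated assumptions, not the controller code: the `Bank` class and function names are hypothetical, and the use of `t_rrd` and `t_burst` as the delays between different bank groups is an assumption that the commit message does not state.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    bank_group: int
    act_allowed_at: int = 0
    col_allowed_at: int = 0


def issue_act(banks, target, now, t_rrd, t_rrd_l, bank_group_timing):
    """Update actAllowedAt for every bank in the rank after an ACT.

    Assumption: banks in the same bank group as the target wait tRRD_L,
    all other banks wait the (shorter) tRRD.
    """
    group = banks[target].bank_group
    for bank in banks:
        same_group = bank_group_timing and bank.bank_group == group
        delay = t_rrd_l if same_group else t_rrd
        bank.act_allowed_at = max(bank.act_allowed_at, now + delay)


def issue_burst(banks, target, now, t_burst, t_ccd_l, bank_group_timing):
    """Update colAllowedAt for every bank after a RD/WR burst.

    Assumption: same-group banks wait tCCD_L, others wait one burst time.
    """
    group = banks[target].bank_group
    for bank in banks:
        same_group = bank_group_timing and bank.bank_group == group
        delay = t_ccd_l if same_group else t_burst
        bank.col_allowed_at = max(bank.col_allowed_at, now + delay)
```

With `bank_group_timing` set to False, every bank sees the same delay, which corresponds to the non-DDR4 case in the message.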
10393:0fafa62b6c01 |
20-Sep-2014 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add memory rank-to-rank delay
Add the following delay to the DRAM controller:
- tCS : Different rank bus turnaround delay
This will be applied for 1) read-to-read, 2) write-to-write, 3) write-to-read, and 4) read-to-write command sequences, where the new command accesses a different rank than the previous burst.
The delay defaults to 2*tCK for each defined memory class. Note that this does not correspond to one particular timing constraint, but is a way of modelling all the associated constraints.
The DRAM controller has some minor changes to prioritize commands to the same rank. This prioritization will only occur when the command stream is not switching from a read to write or vice versa (in the case of switching we have a gap in any case).
To prioritize commands to the same rank, the model will determine if there are any commands queued (same type) to the same rank as the previous command. This check will ensure that the 'same rank' command will be able to execute without adding bubbles to the command flow, e.g. any ACT delay requirements can be done under the hood, allowing the burst to issue seamlessly. |
10382:452a5f178ec5 |
20-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove the GHB prefetcher from the source tree
There are two primary issues with this code which make it deserving of deletion.
1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher
2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher.
It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher. |
10376:28c63d075e0c |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Use safe_cast when assumptions are made about return value
This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking). |
10373:342348537a53 |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove assertions ensuring unsigned values >= 0 |
10372:0a810481d511 |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Check return value of checkFunctional in SimpleMemory
Simple fix to ensure we only iterate until we are done. |
10371:a16e73f1297f |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add checks to sendTimingReq in cache
A small fix to ensure the return value is not ignored. |
10370:4466307b8a2a |
15-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: revert some of the changes from ad9c042dce54 The changeset ad9c042dce54 made changes to the structures under the network directory to use a map of buffers instead of vector of buffers. The reasoning was that not all vnets that are created are used and we needlessly allocate more buffers than required and then iterate over them while processing network messages. But the move to map resulted in a slow down which was pointed out by Andreas Hansson. This patch moves things back to using vector of message buffers. |
10362:535e088955ca |
09-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Add accessor function for vaddr
Determine if a request has an associated virtual address. |
10360:919c02740209 |
09-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Fix a number of uninitialised variables and members
Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round. |
10348:c91b23c72d5e |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
base: Use the global Mersenne twister throughout
This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator.
As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed.
Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calls random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global random generator is not thread safe at this point.
Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly. |
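The point about instantiation order in the entry above can be shown with a small Python sketch. Python's `random.Random` is itself a Mersenne Twister, so it serves as a stand-in for the single global generator; the `run` helper, the seed and the module names "A" and "B" are hypothetical.

```python
import random


def run(order, seed=42):
    """Let each module draw one number from a shared generator, in order."""
    rng = random.Random(seed)  # stand-in for the single global generator
    draws = {}
    for module in order:
        draws[module] = rng.randint(0, 2**32 - 1)
    return draws


ab = run(["A", "B"])  # A calls random before B
ba = run(["B", "A"])  # B calls random before A
```

The sequence of numbers produced is identical in both runs, but which module receives which number depends on the call order: the value A sees in the first run is the value B sees in the second.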
10347:d548d1d7597c |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid unnecessary retries when bus peer is not ready
This patch removes unnecessary retries that happened when the bus layer itself was no longer busy, but the peer was not yet ready. Instead of sending a retry that will inevitably not succeed, the bus now silently waits until the peer sends a retry. |
10345:b5bef3c8e070 |
27-Jun-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: write streaming support via WriteInvalidate promotion
Support full-block writes directly rather than requiring RMW:
* a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp.
* only top-level caches allocate the line; the others just pass the request along and invalidate as necessary.
* to close a timing window between the *Req and the *Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory. |
10344:fa9ef374075f |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix a bug in the cache port flow control
This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ.
The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding. |
10343:a1eea45928e6 |
13-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
cpu, mem: Make software prefetches non-blocking
Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load). |
10342:711eb0e64249 |
13-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
mem: Refactor assignment of Packet types
Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function. |
10325:7aacec2a247d |
03-Sep-2014 |
Geoffrey Blake <geoffrey.blake@arm.com> |
cache: Fix handling of LL/SC requests under contention
If a set of LL/SC requests contend on the same cache block we can get into a situation where CPUs will deadlock if they expect a failed SC to supply them data. This case happens where 3 or more cores are contending for a cache block using LL/SC and the system is configured where 2 cores are connected to a local bus and the third is connected to a remote bus. If a core on the local bus sends an SCUpgrade and the core on the remote bus sends an SCUpgrade they will race to see who will win the SC access. In the meantime if the other core appends a read to one of the SCUpgrades it will expect to be supplied data by that SCUpgrade transaction. If it happens that the SCUpgrade that was picked to supply the data is failed, it will drop the appended request for data and never respond, leaving the requesting core to deadlock. This patch makes all SC's behave as normal stores to prevent this case but still makes sure to check whether it can perform the update. |
10322:7f4059e4f2d5 |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Packet queue clean up
No change in functionality, just a bit of tidying up. |
10318:98771a936b61 |
03-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Cleanup unused ISA traits constants
This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc.
The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed. |
10314:94b6b28fc968 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove typedef of Index as int64 The Index type defined as typedef int64 does not really provide any help since in most places we use primitive types instead of Index. Also, the name Index is so generic that it does not merit being used as a typename. |
10312:08f4deeb5b48 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: PerfectSwitch: moves code to a per vnet helper function This patch moves code from the wakeup() function to an operateVnet() helper function. The aim is to improve the readability of the code. |
10311:ad9c042dce54 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffers: significant changes
This patch is the final patch in a series of patches. The aim of the series is to make ruby more configurable than it was. More specifically, the connections between controllers are not at all possible (unless one is ready to make significant changes to the coherence protocol). Moreover the buffers themselves are magically connected to the network inside the slicc code. These connections are not part of the configuration file.
This patch makes changes so that these connections will now be made in the python configuration files associated with the protocols. This requires each state machine to expose the message buffers it uses for input and output. So, the patch makes these buffers configurable members of the machines.
The patch drops the slicc code that used to connect these buffers to the network. Now these buffers are exposed to the python configuration system as Master and Slave ports. In the configuration files, any master port can be connected to any slave port. The file pyobject.cc has been modified to take care of allocating the actual message buffer. This is in line with how other port connections work. |
10310:61c7f1d06575 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
build opts: add MI_example to NULL ISA A later changeset changes the file src/python/swig/pyobject.cc to include a header file that includes a header file generated at build time depending on the PROTOCOL in use. Since NULL ISA was not specifying any protocol, this resulted in compilation problems. Hence, the changeset. |
10309:ccb1801742a1 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
mem: change the namespace Message to ProtoMessage The namespace Message conflicts with the Message data type used extensively in Ruby. Since Ruby is being moved to the same Master/Slave ports based configuration style as the rest of gem5, this conflict needs to be resolved. Hence, the namespace is being renamed to ProtoMessage. |
10308:8c0870dbae5c |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: change the way configurable members are specified There are two changes this patch makes to the way configurable members of a state machine are specified in SLICC. The first change is that the data member declarations will need to be separated by a semi-colon instead of a comma. Secondly, the default value to be assigned would now use SLICC's assignment operator i.e. ':='. |
10307:6df951dcd7d9 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: improve the grammar This patch changes the grammar for SLICC so as to remove some of the redundant / duplicate rules. In particular rules for object/variable declaration and class member declaration have been unified. Similarly, the rules for a general function and a class method have been unified.
One more change is in the priority of two rules. The first rule is on declaring a function with all the params typed and named. The second rule is on declaring a function with all the params only typed. Earlier the second rule had a higher priority. Now the first rule has a higher priority. |
10306:4c0de6e0669c |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi three level: slight naming changes. |
10305:76745b567dc3 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: do not prefix machine name to variables This changeset does away with prefixing of member variables of state machines with the identity of the machine itself. |
10304:a2f88c6d9e54 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused toString() from AbstractController |
10303:71e0934af9f1 |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: move getNumNodes() to base class All the implementations were doing the same things. |
10302:0e9e99e6369a |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: eliminate type Time There is another type Time in src/base class which results in a conflict. |
10301:44839e8febbd |
01-Sep-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: move files from ruby/system to ruby/structures
The directory ruby/system is crowded and unorganized. Hence, the files that hold actual physical structures are being moved to the directory ruby/structures. This includes Cache Memory, Directory Memory, Memory Controller, Wire Buffer, TBE Table, Perfect Cache Memory, Timer Table, Bank Array.
The directory ruby/system has the glue code that holds these structures together. |
10299:bec0c5ffc323 |
28-Aug-2014 |
Alexandru <alexandru.dutu@amd.com> |
mem: adding architectural page table support for SE mode This patch enables the use of page tables that are stored in system memory and respect the x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs' page tables, giving the possibility for future implementation. |
10298:77af86f37337 |
01-Apr-2014 |
Alexandru <alexandru.dutu@amd.com> |
mem: adding a multi-level page table class This patch defines a multi-level page table class that stores the page table in system memory, consistent with ISA specifications. In this way, cpu models that use the actual hardware to execute (e.g. KvmCPU), are able to traverse the page table. |
10296:35738ad3c7c6 |
26-Aug-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix DRAMSim2 cycle check when restoring from checkpoint
This patch ensures the cycle check is still valid even when restoring from a checkpoint. In this case the DRAMSim2 cycle count is relative to the startTick rather than 0. |
10287:4966471a1ba1 |
26-Aug-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update DRAM controller comments
Update comments and add a reference for more information. |
10286:e95a0ab1d368 |
26-Aug-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix address interleaving bug in DRAM controller
This patch fixes a bug in the DRAM controller address decoding. In cases where the DRAM burst size (e.g. 32 bytes in a rank with a single LPDDR3 x32) was smaller than the channel interleaving size (e.g. systems with a 64-byte cache line) one address bit effectively got used as a channel bit when it should have been a low-order column bit.
This patch adds a notion of "columns per stripe", and more clearly deals with the low-order column bits and high-order column bits. The patch also relaxes the granularity check such that it is possible to use interleaving granularities other than the cache line size.
The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as it is only used in the debug build for now. |
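The "columns per stripe" idea in the entry above can be sketched as follows. This is an illustrative Python sketch, not the controller's decoding code: the function name and return layout are hypothetical, and it assumes a mapping where the low-order column bits sit directly below the channel bits.

```python
def decode_low_bits(addr, burst_size, interleave_size, channels):
    """Split an address into (low-order column, channel, remaining bits).

    Assumption: the interleaving granularity is a multiple of the burst
    size, so each stripe holds interleave_size // burst_size bursts.
    """
    if interleave_size % burst_size != 0:
        raise ValueError("interleaving must be a multiple of the burst size")
    columns_per_stripe = interleave_size // burst_size

    burst_index = addr // burst_size
    # Low-order column bits: position of the burst within the stripe.
    low_column = burst_index % columns_per_stripe
    burst_index //= columns_per_stripe
    # Channel bits sit above the low-order column bits.
    channel = burst_index % channels
    # High-order column, bank, rank and row bits are left undecoded here.
    remaining = burst_index // channels
    return low_column, channel, remaining
```

For the example in the message (32-byte bursts, 64-byte interleaving), addresses 0 and 32 fall in the same channel as two consecutive low-order columns, and address 64 is the first burst of the next channel.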
10274:68da5ef4bb6f |
13-Aug-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Properly set cache block status fields on writebacks
When a cacheline is written back to a lower-level cache, tags->insertBlock() sets various status parameters. However these status bits were cleared immediately after calling. This patch makes it so that these status fields are not cleared by moving them outside of the tags->insertBlock() call. |
10263:c00b5ba43967 |
28-Jul-2014 |
Anthony Gutierrez <atgutier@umich.edu> |
mem: refactor LRU cache tags and add random replacement tags
this patch implements a new tags class that uses a random replacement policy. these tags prefer to evict invalid blocks first, if none are available a replacement candidate is chosen at random.
this patch factors out the common code in the LRU class and creates a new abstract class: the BaseSetAssoc class. any set associative tag class must implement the functionality related to the actual replacement policy in the following methods:
accessBlock() findVictim() insertBlock() invalidate() |
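The victim selection described in the entry above (invalid blocks first, otherwise a random candidate) can be sketched in Python. The `Block` class and the `find_victim` signature are hypothetical and do not mirror the gem5 tags classes.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    way: int
    valid: bool


def find_victim(cache_set: List[Block], rng: random.Random) -> Block:
    """Pick a replacement victim within one set."""
    # Prefer an invalid block, since evicting it costs nothing.
    for blk in cache_set:
        if not blk.valid:
            return blk
    # Every block is valid: choose the victim uniformly at random.
    return rng.choice(cache_set)
```

Passing the generator in explicitly keeps the sketch deterministic under test; a real implementation would draw from whatever random source the simulator provides.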
10247:0ad233f0a77d |
30-Jun-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: DRAMPower trace output
This patch adds a DRAMPower flag to enable off-line DRAM power analysis using the DRAMPower tool. A follow-on patch adds a Python script to post-process the output and order it based on time stamps.
The long-term goal is to link DRAMPower as a library and provide the commands through function calls to the model rather than first printing and then parsing the commands. At the moment it is also up to the user to ensure that the same DRAM configuration is used by the gem5 controller model and DRAMPower. |
10246:e0e3efe3b1d5 |
30-Jun-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add bank and rank indices as fields to the DRAM bank
This patch adds the index of the bank and rank as a field so that we can determine the identity of a given bank (reference or pointer) for the power tracing. We also grab the opportunity of cleaning up the arguments used for identifying the bank when activating. |
10245:70333502b9b5 |
30-Jun-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Extend DRAM row bits from 16 to 32 for larger densities
This patch extends the DRAM row bits to 32 to support larger density memories. Additional checks are also added to ensure the row fits in the 32 bits. |
10231:cb2e6950956d |
31-May-2014 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: eliminate equality tests with true and false
Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'.
It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up.
Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code. |
10228:1a85c4fc805c |
23-May-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove unused ids DNUCA* |
10227:3ffc86fefc49 |
23-May-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove old protocol documentation |
10226:056363356d15 |
23-May-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: drop dequeue_getDelayCycles() The functionality of updating and returning the delay cycles would now be performed by the dequeue() function itself. |
10217:baf8754fd5be |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Update DDR3 and DDR4 based on datasheets
This patch makes a more firm connection between the DDR3-1600 configuration and the corresponding datasheet, and also adds a DDR3-2133 and a DDR4-2400 configuration. At the moment there is also an ongoing effort to align the choice of datasheets to what is available in DRAMPower. |
10216:52c869140fc2 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add DRAM cycle time
This patch extends the current timing parameters with the DRAM cycle time. This is needed as the DRAMPower tool expects timestamps in DRAM cycles. At the moment we could get away with doing this in a post-processing step as the DRAMPower execution is separate from the simulation run. However, in the long run we want the tool to be called during the simulation, and then the cycle time is needed. |
10215:52d46098c1b6 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Simplify DRAM response scheduling
This patch simplifies the DRAM response scheduling based on the assumption that they are always returned in order. |
10214:39eb5d4c400a |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add precharge all (PREA) to the DRAM controller
This patch adds the basic ingredients for a precharge all operation, to be used in conjunction with DRAM power modelling.
Currently we do not try and apply any cleverness when precharging all banks, thus even if only a single bank is open we use PREA as opposed to PRE. At the moment we only have a single tRP (tRPpb), and do not model the slightly longer all-bank precharge constraint (tRPab). |
10213:2e630c6c2042 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove printing of DRAM params
This patch removes the redundant printing of DRAM params. |
10212:acc1131e01d6 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add tRTP to the DRAM controller
This patch adds the tRTP timing constraint, governing the minimum time between a read command and a precharge. Default values are provided for the existing DRAM types. |
10211:e084db2b1527 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Merge DRAM latency calculation and bank state update
This patch merges the two control paths used to estimate the latency and update the bank state. As a result of this merging the computation is now in one place only, and should be easier to follow as it is all done in absolute (rather than relative) time.
As part of this change, the scheduling is also refined to ensure that we look at a sensible estimate of the bank ready time in choosing the next request. The bank latency stat is removed as it ends up being misleading when the DRAM access code gets evaluated ahead of time (due to the eagerness of waking the model up for scheduling the next request). |
10210:793e5ff26e0b |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add tWR to DRAM activate and precharge constraints
This patch adds the write recovery time to the DRAM timing constraints, and changes the current tRASDoneAt to a more generic preAllowedAt, capturing when a precharge is allowed to take place.
The part of the DRAM access code that accounts for the precharge and activate constraints is updated accordingly. |
10209:ac71c857e1e1 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Merge DRAM page-management calculations
This patch treats the closed page policy as yet another case of auto-precharging, and thus merges the code with that used for the other policies. |
10208:c249f7660eb7 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add DRAM power states to the controller
This patch adds power states to the controller. These states and the transitions can be used together with the Micron power model. As a more elaborate use-case, the transitions can be used to drive the DRAMPower tool.
At the moment, the power-down modes are not used, and this patch simply serves to capture the idle, auto refresh and active modes. The patch adds a third state machine that interacts with the refresh state machine. |
10207:3112b31596f0 |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Ensure DRAM refresh respects timings
This patch adds a state machine for the refresh scheduling to ensure that no accesses are allowed while the refresh is in progress, and that all banks are properly precharged.
As part of this change, the precharging of banks is broken out into a method of its own, making it similar to how activations are dealt with. The idle accounting is also updated to ensure that the refresh duration is not added to the time that the DRAM is in the idle state with all banks precharged. |
10206:823f7fd1a82f |
09-May-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make DRAM read/write switching less conservative
This patch changes the read/write event loop to use a single event (nextReqEvent), along with a state variable, thus joining the two control flows. This change makes it easier to follow the state transitions, and control what happens when.
With the new loop we modify the overly conservative switching times such that the write-to-read switch allows bank preparation to happen in parallel with the bus turn around. Similarly, the read-to-write switch uses the introduced tRTW constraint. |
10192:5c2c4195b839 |
09-May-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Squash prefetch requests from downstream caches
This patch squashes prefetch requests from downstream caches, so that they do not steal cachelines away from caches closer to the cpu. It was originally coded by Mitch Hayenga and modified by Aasheesh Kolli. |
10189:94d6ffac1e9b |
09-May-2014 |
Sascha Bischoff <sascha.bischoff@ARM.com> |
mem: Auto-generate CommMonitor trace file names
Splits the CommMonitor trace_file parameter into three parameters. Previously, the trace was only enabled if the trace_file parameter was set, and would be written to this file. This patch adds in a trace_enable and trace_compress parameter to the CommMonitor.
No trace is generated if trace_enable is set to False. If it is set to True, the trace is written to a file based on the name of the SimObject in the simulation hierarchy. For example, system.cluster.il1_commmonitor.trc. This filename can be overridden by additionally specifying a file name to the trace_file parameter (more on this later).
The trace_compress parameter will append .gz to any filename if set to True. This enables compression of the generated traces. If the file name already ends in .gz, then no changes are made.
The trace_file parameter will override the name set by the trace_enable parameter. In the case that the specified name does not end in .gz but trace_compress is set to true, .gz is appended to the supplied file name. |
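The naming rules in the entry above can be sketched as a small Python function. This is only an illustration of the described behaviour, not the CommMonitor code: the function name `resolve_trace_name` is hypothetical, and the `.trc` suffix is assumed from the example file name in the message.

```python
def resolve_trace_name(sim_object_name, trace_enable,
                       trace_file="", trace_compress=False):
    """Return the trace file name, or None when tracing is disabled."""
    if not trace_enable:
        return None
    # An explicit trace_file overrides the auto-generated name.
    name = trace_file if trace_file else sim_object_name + ".trc"
    # Append .gz only when compression is requested and not already present.
    if trace_compress and not name.endswith(".gz"):
        name += ".gz"
    return name
```

Under these assumptions, a monitor named system.cluster.il1_commmonitor with trace_compress set would write to system.cluster.il1_commmonitor.trc.gz.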
10174:73b035a42df1 |
01-Apr-2014 |
Mitch Hayenga <Mitch.Hayenga@ARM.com> |
mem: Don't print out the data of a cache block
This never actually worked, since it printed only a word of the cache block rather than the entire thing, and doubly didn't work because csprintf overrides the %#x specifier and assumes a char* array is actually a string. |
10166:2681ac1d671a |
19-Apr-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove old documentation Has not been maintained at all. Since there is alternate documentation available on gem5.org, no need to have this separately. |
10165:7e9edf4297a9 |
19-Apr-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: slight change to rule for transitions It had an unnecessary pairs token which is being removed. |
10163:e8608cdddae2 |
19-Apr-2014 |
Marco Elver <marco.elver@ed.ac.uk> |
ruby: recorder: Fix (de-)serializing with different cache block-sizes
Upon aggregating records, serialize the system's cache-block size, as the cache-block size can be different when restoring from a checkpoint. This way, we can correctly read all records when restoring from a checkpoint, even if the cache-block size is different.
Note that it is only possible to restore from a checkpoint if the desired cache-block size is smaller than or equal to the cache-block size when the checkpoint was taken; we can split one larger request into multiple small ones, but it is not reliable to do the opposite.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
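The restriction in the note above (a larger recorded block can be split, but smaller ones cannot be reliably merged) can be pictured with a short sketch. The function `split_record` is hypothetical and assumes that block sizes divide evenly, as power-of-two sizes do; it is not the recorder's actual code.

```python
def split_record(addr, recorded_block_size, restore_block_size):
    """Split one recorded block into blocks of the restore-time size."""
    if restore_block_size > recorded_block_size:
        # Merging several small records into one larger block is not reliable.
        raise ValueError("restore block size exceeds recorded block size")
    if recorded_block_size % restore_block_size != 0:
        raise ValueError("block sizes are assumed to divide evenly")
    return [addr + offset
            for offset in range(0, recorded_block_size, restore_block_size)]
```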
10155:3b0bcc8c34ca |
08-Apr-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: change enqueue statement As of now, the enqueue statement can take in any number of 'pairs' as argument. But we only use the pair in which latency is the key. This latency is allowed to be either a fixed integer or a member variable of the controller in which the expression appears. This patch drops the use of pairs in an enqueue statement. Instead, an expression is allowed which will be interpreted to be the latency of the enqueue. This expression can be anything allowed by slicc, including a constant integer or a member variable. |
10154:525b7e432f76 |
08-Apr-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: coherence protocols: drop the phrase IntraChip The phrase is no longer valid since we do not distinguish between inter and intra chip communication. |
10147:3e51a30b8071 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Track DRAM read/write switching and add hysteresis
This patch adds stats for tracking the number of reads/writes per bus turn around, and also adds hysteresis to the write-to-read switching to ensure that the queue does not oscillate around the low threshold. |
10146:27dfed4c8403 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Rename SimpleDRAM to a more suitable DRAMCtrl
This patch renames the not-so-simple SimpleDRAM to a more suitable DRAMCtrl. The name change is intended to ensure that we do not send the wrong message (although the "simple" in SimpleDRAM was originally intended as in cleverly simple, or elegant).
As the DRAM controller modelling work is being presented at ISPASS'14 our hope is that a broader audience will use the model in the future. |
10145:d19f759b7340 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Change memory defaults to be more representative
Make the default memory type DDR3-1600 x64, and use the open-adaptive page policy. This change is aiming to ensure that users by default are using a realistic memory system. |
10144:dc354b327d69 |
23-Mar-2014 |
Wendy Elsasser <wendy.elsasser@arm.com> |
mem: Add close adaptive paging policy to DRAM controller model
This patch adds a second adaptive page policy to the DRAM controller, closing the page unless there are already queued accesses to the open page. |
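A minimal sketch of the close-adaptive decision described above, assuming the controller queue can be viewed as a list of (bank, row) pairs. The function name and queue representation are hypothetical and chosen only for illustration.

```python
def close_adaptive_auto_precharge(queue, bank, open_row):
    """Close the page unless a queued access targets the open row."""
    pending_hit = any(b == bank and r == open_row for b, r in queue)
    return not pending_hit
```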
10143:ed192b1a2114 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: DRAM controller tidying up
Minor tidying up and removing of redundant code, including the printing of queue state every million accesses. |
10142:c4d058c993bf |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix bug in DRAM bytes per activate
This patch ensures that we do not sample the bytes per activate when the row has already been closed. |
10141:9d44ba8964a5 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Limit the accesses to a page before forcing a precharge
This patch adds a basic starvation-prevention mechanism where a DRAM page is forced to close after a certain number of accesses. The limit is combined with the open and open-adaptive page policy and if reached causes an auto-precharge. |
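The access limit can be sketched as a per-bank counter. This is a hedged illustration only: the class `Bank` and the parameter name `max_accesses_per_row` are assumptions made for the example, and the real controller tracks considerably more state.

```python
class Bank:
    """Tracks accesses to the open row and forces a close at the limit."""

    def __init__(self, max_accesses_per_row):
        self.max_accesses_per_row = max_accesses_per_row
        self.open_row = None
        self.row_accesses = 0

    def access(self, row):
        """Return True if this access ends with an auto-precharge."""
        if self.open_row != row:
            # A different (or no) row was open: activate the new row.
            self.open_row = row
            self.row_accesses = 0
        self.row_accesses += 1
        if self.row_accesses >= self.max_accesses_per_row:
            # Limit reached: close the page so other rows are not starved.
            self.open_row = None
            self.row_accesses = 0
            return True
        return False
```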
10140:1a778b31add7 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make DRAM write queue draining more aggressive
This patch changes the triggering condition for the write draining such that we grab the opportunity to issue writes if there are no reads waiting (as opposed to waiting for the writes to reach the high threshold). As a result, we potentially drain some of the writes in read idle periods (if any).
A low threshold is added to be able to control how many write bursts are kept in the memory controller queue (acting as on-chip storage).
The high and low thresholds are updated to sensible values for a 32/64 size write buffer. Note that the thresholds should be adjusted along with the queue sizes.
This patch also adds some basic initialisation sanity checks and moves part of the initialisation to the constructor. |
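The triggering condition described above can be sketched roughly as follows. This is a hypothetical illustration rather than the controller's code: the function name, the argument names, and the exact comparisons (strict versus non-strict) are assumptions.

```python
def should_drain_writes(reads_waiting, writes_queued,
                        low_threshold, high_threshold):
    """Decide whether to start issuing writes (illustrative only)."""
    if writes_queued == 0:
        return False
    if reads_waiting == 0:
        # Read idle period: drain opportunistically, but keep up to
        # low_threshold write bursts queued as on-chip storage.
        return writes_queued > low_threshold
    # Reads are waiting: only switch once the write queue is nearly full.
    return writes_queued >= high_threshold
```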
10137:a90bd7b35d78 |
23-Mar-2014 |
Neha Agarwal <neha.agarwal@arm.com> |
mem: DDR3 config for comparing with DRAMSim2
This patch adds a new DDR3 configuration to match with the parameters that are specified in one of the DDR3 configs used in DRAMSim2. |
10136:ba1ed063e3af |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: More descriptive address-mapping scheme names
This patch adds the row bits to the name of the address mapping schemes to make it more clear that all the current schemes place the row bits as the most significant bits. |
10133:0749c3ec92f4 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Move Ruby debug flags to ruby dir and remove stale options
This patch moves the Ruby-related debug flags to the ruby sub-directory, and also removes the stale SConsopts that add the no-longer-used NO_VECTOR_BOUNDS_CHECK. |
10131:cd2270b2f758 |
23-Mar-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Include the DRAMSim2 wrapper in NULL build
This patch makes sure DRAMSim2 is included in a build of the NULL ISA. |
10129:eb34ae5204b8 |
23-Mar-2014 |
Sascha Bischoff <Sascha.Bischoff@ARM.com> |
mem: CommMonitor trace warn on non-timing mode
Add a warning to the CommMonitor which will alert the user if they try and record a trace when the system is not in timing mode. |
10123:e958cdc5c669 |
20-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: consumer: avoid accessing wakeup times when waking up Each consumer object maintains a set of tick values when the object is supposed to wakeup and do some processing. As of now, the object accesses this set both when scheduling a wakeup event and when the object actually wakes up. The set is accessed during wakeup to remove the current tick value from the set. This functionality is now being moved to the scheduling function where ticks are removed at a later time. |
10122:1268f1fd2714 |
20-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: convert network interfaces into clocked objects This helps in configuring the network interfaces from the python script and these objects no longer rely on the network object for the timing information. |
10121:64545628f5a7 |
20-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: code refactor |
10117:37e333de580f |
20-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: no piobus in se mode Piobus was recently added to se scripts for ruby so that the interrupt controller can be connected to something (required since the interrupt controller sends address range messages). This patch removes the piobus and instead, the pio port of ruby port will now ignore the range change messages in se mode. |
10115:0e0a0dd558db |
17-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove some of the unnecessary code |
10108:83bb6e381cbf |
07-Mar-2014 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Fix incorrect assert failure in the Cache
This patch fixes an assert condition that is not true at all times. There are valid situations that arise in dual-core dual-workload runs where the assert condition is false. The function call following the assert however needs to be called only when the condition is true (a block cannot be invalidated in the tags structure if it has not been allocated in the structure, and the tempBlock is never allocated). Hence the 'assert' has been replaced with an 'if'. |
10102:b5de69974a2e |
07-Mar-2014 |
Ali Saidi <ali.saidi@arm.com> |
mem: Wakeup sleeping CPUs without caches on LLSC
For systems without caches, the LLSC code does not get snoops for wake-ups. We add the LLSC code in the abstract memory to do the job for us. |
10097:c7fe7555d587 |
02-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: changes related to tracking push/pop times The last pop operation is now tracked as a Tick instead of in Cycles. This helps in avoiding use of the receiver's clock during the enqueue operation. |
10096:e0167dda38dc |
02-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: make the max_size variable of the MessageBuffer unsigned |
10094:5be102721895 |
02-Mar-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: profiler: statically allocate stats variable A couple of users observed a segmentation fault when the simulator tries to register the statistical variable m_IncompleteTimes. It seems that there is some problem with the initialization of these variables when allocated in the constructor. |
10090:4eec7bdde5b0 |
23-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: route all packets through ruby port Currently, the interrupt controller in x86 is connected to the io bus directly. Therefore the packets between the io devices and the interrupt controller do not go through ruby. This patch changes ruby port so that these packets arrive at the ruby port first, which then routes them to their destination. Note that the patch does not make these packets go through the ruby network. That would happen in a subsequent patch. |
10089:bc3126a05a7f |
23-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Simplify RubyPort flow control and routing
This patch simplifies the retry logic in the RubyPort, avoiding redundant attributes, and enforcing more stringent checks on the interactions with the normal ports. The patch also simplifies the routing done by the RubyPort, using the port identifiers instead of a heavy-weight sender state.
The patch also fixes a bug in the sending of responses from PIO ports. Previously these responses bypassed the queue in the queued port, and ignored the return value, potentially leading to response packets being lost.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10087:86f3b546c214 |
23-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: refactor code Code in two of the functions was exactly the same. This patch moves this code to a new function which is called from the two functions mentioned initially. |
10086:bd1089db3a88 |
23-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove few not required #includes |
10085:b9891fbae4c8 |
23-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove unused COPY_HEAD functionality |
10084:38aeea570604 |
23-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: protocols: remove unused action z_stall |
10082:70f350b13ec0 |
21-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: move message buffers to base network class. |
10081:26670ac8244e |
21-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: garnet: fixed: removes net_ptr from links |
10080:7a1bfe330d14 |
21-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: cache: remove not required variable m_cache_name |
10079:fb7859dc2273 |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: garnet: fixed: removes next cycle functions At several places, there are functions that take a cycle value as input and perform some computation. Along with each such function, another function was being defined that simply added one more cycle to the input and computed the same function. This patch removes this second copy of the function. Places where these functions were being called have been updated to use the original function with the argument being current cycle + 1. |
10078:9400a90ec5d1 |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: controller: slight code refactoring |
10077:552db6109dd3 |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi three level: rename incorrectly named files Two files had been incorrectly named with a .cache suffix. |
10076:f81d94b53661 |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: removes unused code. |
10075:7322d2b2ec76 |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: slight code refactoring |
10074:0e013fa647ac |
20-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: removes some unnecessary functions. |
10070:83957204d43b |
18-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix bug in PhysicalMemory use of mmap and munmap
This patch fixes a bug in how physical memory used to be mapped and unmapped. Previously we unmapped and re-mapped if restoring from a checkpoint. However, we never checked that the new mapping was actually the same, it was just magically working as the OS seems to fairly reliably give us the same chunk back. This patch fixes this issue by relying entirely on the mmap call in the constructor. |
10067:3b30e9d30e10 |
18-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Filter cache snoops based on address ranges
This patch adds a filter to the cache to drop snoop requests that are not for a range covered by the cache. This fixes an issue observed when multiple caches are placed in parallel, covering different address ranges. Without this patch, all the caches will forward the snoop upwards, when only one should do so. |
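The filtering idea can be sketched as a simple range check, assuming each cache knows the half-open address ranges it covers. The names `covers` and `should_handle_snoop` are hypothetical and are not the cache's actual interface.

```python
def covers(ranges, addr):
    """True if addr falls in any [start, end) range covered by the cache."""
    return any(start <= addr < end for start, end in ranges)


def should_handle_snoop(cache_ranges, snoop_addr):
    # Snoops outside the covered ranges are dropped rather than
    # forwarded upwards.
    return covers(cache_ranges, snoop_addr)
```

With two parallel caches covering disjoint ranges, only the cache whose range contains the snooped address would forward the snoop.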
10066:06a33d872798 |
18-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a wrapped DRAMSim2 memory controller
This patch adds DRAMSim2 as a memory controller by wrapping the external library and creating a subclass of AbstractMemory that bridges between the semantics of gem5 and the DRAMSim2 interface.
The DRAMSim2 wrapper extracts the clock period from the config file. There is no way of extracting this information from DRAMSim2 itself, so we simply read the same config file and get it from there.
To properly model the response queue, the wrapper keeps track of how many transactions are in the actual controller, and how many are stacking up waiting to be sent back as responses (in the wrapper). The latter requires us to move away from the queued port and manage the packets ourselves. This is due to DRAMSim2 not having any flow control on the response path.
DRAMSim2 assumes that the transactions it is given match the burst size of the chosen memory. The wrapper checks to ensure the cache line size of the system matches the burst size of DRAMSim2 as there are currently no provisions to split the system requests. In theory we could allow a cache line size smaller than the burst size, but that would lead to inefficient use of the DRAM, so for now we fatal also in this case. |
10064:0267a9b58c8e |
18-Feb-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix input to DPRINTF in CommMonitor
Minor fix of the debug message parameters. |
10059:b29a58680b47 |
06-Feb-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: memory controller: use MemoryNode * |
10054:baaed1733069 |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Add additional tolerance to stride prefetcher Forces the prefetcher to mispredict twice in a row before resetting the confidence of prefetching. This helps cases where a load PC strides by a constant factor, however it may operate on different arrays at times. Avoids the cost of retraining. Primarily helps with small iteration loops.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
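The added tolerance in the entry above (two consecutive mispredictions before confidence is reset) can be sketched as follows. This is a simplified, hypothetical table entry: the class `StrideEntry` and its fields are assumptions, and the real prefetcher's confidence handling differs in detail.

```python
class StrideEntry:
    """One stride-table entry with two-miss tolerance (illustrative)."""

    def __init__(self, last_addr, stride):
        self.last_addr = last_addr
        self.stride = stride
        self.confidence = 0
        self.mispredicts = 0  # consecutive mispredictions

    def train(self, addr):
        new_stride = addr - self.last_addr
        if new_stride == self.stride:
            self.confidence += 1
            self.mispredicts = 0
        else:
            self.mispredicts += 1
            if self.mispredicts >= 2:
                # Second miss in a row: retrain on the new stride.
                self.stride = new_stride
                self.confidence = 0
                self.mispredicts = 0
        self.last_addr = addr
```

A single jump to a different array therefore leaves the learned stride and confidence intact, which is what avoids the retraining cost.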
10053:b0b69dbafc08 |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Allowed tagged instruction prefetching in stride prefetcher For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride-based prefetcher (there is no stride to train). In these cases, it's better if the L2 stride prefetcher simply reverts to a simple N-block ahead prefetcher. This patch enables this option.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10052:5bb8e054456b |
30-Jan-2014 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com>, Amin Farmahini <aminfar@gmail.com> |
mem: prefetcher: add options, support for unaligned addresses
This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192..., thus not detecting a constant stride pattern. This patch fixes this by training the prefetcher on access and not masking off the lower address bits.
It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data accesses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy.
Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean). |
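The 48-byte stride example from the message can be reproduced with a few lines of Python. The starting offset of 16 is an assumption, picked because it yields exactly the pattern quoted in the message; the helper `strides` is hypothetical.

```python
BLOCK_SIZE = 64


def strides(addrs):
    """Differences between consecutive addresses."""
    return [b - a for a, b in zip(addrs, addrs[1:])]


# A load striding by 48 bytes, starting at an assumed offset of 16.
accesses = [16 + 48 * i for i in range(5)]

# What a prefetcher sees after masking off the lower address bits.
aligned = [a & ~(BLOCK_SIZE - 1) for a in accesses]

print(aligned)            # block-aligned view: 0, 64, 64, 128, 192
print(strides(aligned))   # irregular strides, no constant pattern
print(strides(accesses))  # constant 48-byte stride on raw addresses
```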
10048:1548b7aa657c |
28-Jan-2014 |
Amin Farmahini <aminfar@gmail.com> |
mem: Remove redundant findVictim() input argument The patch (1) removes the redundant writeback argument from findVictim() (2) fixes the description of access() function
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10047:f11b4c0e52f8 |
28-Jan-2014 |
Amin Farmahini <aminfar@gmail.com> |
mem: Fixes a bug in simple_dram write merging Fixes updating the value of size in the write merge function.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10031:79d034cd6ba3 |
24-Jan-2014 |
Ali Saidi <Ali.Saidi@ARM.com> |
cpu: Add support for instructions that zero cache lines. |
10029:45779e2f844b |
24-Jan-2014 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
mem: Add flag to request if it was generated by a page table walk |
10028:fb8c44de891a |
24-Jan-2014 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
mem: Add support for a security bit in the memory system
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses. |
10025:fdf737112e46 |
24-Jan-2014 |
Timothy M. Jones <timothy.jones@arm.com> |
Cache: Collect very basic stats on tag and data accesses
Adds very basic statistics on the number of tag and data accesses within the cache, which is important for power modelling. For the tags, simply count the associativity of the cache each time. For the data, this depends on whether tags and data are accessed sequentially, which is given by a new parameter. In the parallel case, all data blocks are accessed each time, but with sequential accesses, a single data block is accessed only on a hit. |
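The counting rule described above can be written out as a short sketch. The function and argument names are hypothetical, and "all data blocks" is assumed to mean every way of the indexed set.

```python
def count_accesses(assoc, hit, sequential_access):
    """Return (tag_accesses, data_accesses) for one cache lookup."""
    # Every lookup checks the tags of all ways in the set.
    tag_accesses = assoc
    if sequential_access:
        # Data is read after the tag check, so only on a hit, one block.
        data_accesses = 1 if hit else 0
    else:
        # Parallel access reads all data blocks alongside the tags.
        data_accesses = assoc
    return tag_accesses, data_accesses
```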
10024:fc10e1f9f124 |
24-Jan-2014 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: per-thread cache occupancy and per-block ages
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump. |
10020:2f33cb012383 |
24-Jan-2014 |
Matt Horsnell <matt.horsnell@ARM.com> |
mem: track per-request latencies and access depths in the cache hierarchy
Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request. |
10014:a362694dda2d |
17-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused label no_vector |
10012:ec5a5bfb941d |
10-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: move all statistics to stats.txt, eliminate ruby.stats |
10010:4aa1135c05d4 |
09-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: fix bug introduced in revision 8523754f8885 |
10009:8523754f8885 |
08-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove variable 'addr' used in calls to doTransition This variable causes trouble if a variable of same name is declared in a protocol file. Hence it is being eliminated. |
10008:5176f0a71e56 |
04-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add a three level MESI protocol.
The first two levels (L0, L1) are private to the core, the third level (L2) is possibly shared. The protocol supports clustered designs. For example, one can have two sets of two cores. Each core has an L0 and L1 cache. There are two L2 controllers where each set accesses only one of the L2 controllers. |
10007:94d286db85c1 |
04-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: rename MESI_CMP_directory to MESI_Two_Level
This is because the next patch introduces a three level hierarchy. |
10005:8c2b0dc16ccd |
04-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add support for clusters
A cluster over here means a set of controllers that can be accessed only by a certain set of cores. For example, consider a two level hierarchy. Assume there are 4 L1 controllers (private) and 2 L2 controllers. We can have two different hierarchies here:
a. the address space is partitioned between the two L2 controllers. Each L1 controller accesses both the L2 controllers. In this case, each L1 controller is a cluster in itself.
b. both the L2 controllers can cache any address. An L1 controller has access to only one of the L2 controllers. In this case, each L2 controller along with the L1 controllers that access it, form a cluster.
This patch allows for each controller to have a cluster ID, which is 0 by default. By setting the cluster ID properly, one can instantiate hierarchies with clusters. Note that the coherence protocol might have to be changed as well. |
10004:5d8b72563869 |
04-Jan-2014 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: some small changes |
9997:4e4437251d35 |
26-Dec-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: fix bugs in mesi cmp directory protocol This patch fixes a couple of bugs in the L2 controller of the mesi cmp directory protocol.
1. The state MT_I was transitioning to NP on receiving a clean writeback from the L1 controller. This patch makes it inform the directory controller about the writeback.
2. The L2 controller was sending the dirty bit to the L1 controller and the L2 controller used writeback from the L1 controller to update the dirty bit unconditionally. Now, the L1 controller always assumes that the incoming data is clean. The L2 controller updates the dirty bit only when the L1 controller writes to the block.
3. Certain unused functions and events are being removed. |
9996:150338b8ba12 |
20-Dec-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: replace max_in_port_rank with number of inports
This patch replaces max_in_port_rank with the number of inports. The use of max_in_port_rank was causing spurious re-builds and incorrect initialization of variables in ruby related regression tests. This was due to the variable value being used across threads while compiling when it was not meant to be.
Since the number of inports is state machine specific value, this problem should get solved. |
9995:2df9c3856989 |
20-Dec-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: declare variables to be unsigned in Address.hh |
9994:1aa497ac86b2 |
20-Dec-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi: remove owner and sharer fields from directory tags
The directory controller should not have the sharer field since there is only one level 2 cache. Anyway the field was not in use. The owner field was being used to track the l2 cache version (in case of distributed l2) that has the cache block under consideration. The information is not required since the version of the level 2 cache can be obtained from a subset of the address bits. |
9977:239d0abfe11c |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fixes for DRAM stats accounting
This patch fixes a number of stats accounting issues in the DRAM controller. Most importantly, it separates the system interface and DRAM interface so that it is clearer what the actual DRAM bandwidth (and consequently utilisation) is. |
9976:8d5ef049bb2c |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix the LPDDR3 page size
This patch corrects the LPDDR3 page size, which was set too low. |
9975:6d17ec8df4c7 |
01-Nov-2013 |
Neha Agarwal <neha.agarwal@arm.com> |
mem: Adding stats for DRAM power calculation
This patch adds stats which are used for offline power calculation from the 'Micron Power Calculator' spreadsheet. |
9974:a73b4e9284d4 |
01-Nov-2013 |
Neha Agarwal <neha.agarwal@arm.com> |
mem: Unify request selection for read and write queues
This patch unifies the request selection across read and write queues for FR-FCFS scheduling policy. It also fixes the request selection code to prioritize the row hits present in the request queues over the selection based on earliest bank availability. |
9973:af0028ff07db |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a simple adaptive version of the open-page policy
This patch adds a basic adaptive version of the open-page policy that guides the decision to keep open or close by looking at the contents of the controller queues. If no row hits are found, and bank conflicts are present, then the row is closed by means of an auto precharge. This is a well-known technique that should improve performance in most use-cases. |
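The open-adaptive decision described above can be sketched as follows, assuming the controller queues can be viewed as a list of (bank, row) pairs. The function name and queue representation are hypothetical.

```python
def open_adaptive_auto_precharge(queue, bank, open_row):
    """Close the row only if no row hits and a bank conflict are queued."""
    row_hit = any(b == bank and r == open_row for b, r in queue)
    bank_conflict = any(b == bank and r != open_row for b, r in queue)
    return (not row_hit) and bank_conflict
```

With an empty queue the row stays open, which is the difference from a policy that closes the page by default.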
9972:5ace73846f27 |
01-Nov-2013 |
Neha Agarwal <neha.agarwal@arm.com> |
mem: Just-in-time write scheduling in DRAM controller
This patch removes the untimed while loop in the write scheduling mechanism and now schedules commands taking into account the minimum timing constraint. It also introduces an optimization to track write queue size and switch from writes to reads if the number of write requests falls below the write low threshold. |
9971:6ab9ebbebed8 |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add tRRD as a timing parameter for the DRAM controller
This patch adds the tRRD parameter to the DRAM controller. With the recent addition of the actAllowedAt member for each bank, this addition is trivial. |
9970:cb786bfbd1ea |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Less conservative tRAS in DRAM configurations
This patch changes the default values of the tRAS timing parameter to be less conservative, and closer in line with existing parts. |
9969:8053f651a089 |
01-Nov-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Make tXAW enforcement less conservative and per rank
This patch changes the tXAW constraint so that it is enforced per rank rather than globally for all ranks in the channel. It also avoids using the bank freeAt to enforce the activation limit, as doing so also precludes performing any column or row command to the DRAM. Instead the patch introduces a new variable actAllowedAt for the banks and uses this to track when a potential activation can occur. |
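A per-rank activation window of this kind can be sketched with a bounded deque. This is an assumption-laden illustration: the class `RankActivationWindow`, its method names, and the use of plain integer ticks are hypothetical, and the sketch does not model the per-bank actAllowedAt bookkeeping.

```python
from collections import deque


class RankActivationWindow:
    """Limits a rank to activation_limit activates per t_xaw window."""

    def __init__(self, activation_limit, t_xaw):
        self.t_xaw = t_xaw
        # Only the most recent activation_limit activate ticks are kept.
        self.act_ticks = deque(maxlen=activation_limit)

    def earliest_activate(self, now):
        """Earliest tick at which another activate is allowed."""
        if len(self.act_ticks) == self.act_ticks.maxlen:
            # Window is full: wait until the oldest activate ages out.
            return max(now, self.act_ticks[0] + self.t_xaw)
        return now

    def record_activate(self, tick):
        self.act_ticks.append(tick)
```

Keeping one such window per rank means that a burst of activates in one rank does not delay activates in another.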
9968:31591b699509 |
01-Nov-2013 |
Neha Agarwal <neha.agarwal@arm.com> |
mem: Fix for 100% write threshold in DRAM controller
This patch fixes the controller when a write threshold of 100% is used. Previously, with a 100% write threshold, no data was written to memory because writes never got triggered; this corner case was not considered. |
9967:1ef53e046ca0 |
01-Nov-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Pick the next DRAM request based on bank availability
This patch changes the FCFS bit of FR-FCFS such that requests that target the earliest available bank are picked first (as suggested in the original work on FR-FCFS by Rixner et al). To accommodate this we add functionality to identify a bank through a one-dimensional identifier (bank id). The member names of the DRAMPacket are also updated to match the style guide. |
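One way to picture the one-dimensional bank identifier and the selection by bank availability is the sketch below. The flattening formula, the function names, and the data layout are assumptions made for illustration; the message does not specify them.

```python
def bank_id(rank, bank, banks_per_rank):
    """Flatten (rank, bank) into a single identifier (assumed layout)."""
    return rank * banks_per_rank + bank


def pick_next(queue, bank_free_at, banks_per_rank):
    """Index of the queued request whose bank is available earliest.

    queue holds (rank, bank) pairs in arrival order; min() returns the
    first minimum, so ties fall back to first-come first-served.
    """
    return min(
        range(len(queue)),
        key=lambda i: bank_free_at[bank_id(*queue[i], banks_per_rank)],
    )
```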
9966:5e8970397ab7 |
01-Nov-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Use the same timing calculation for DRAM read and write
This patch simplifies the DRAM model by re-using the function that computes the busy and access time for both reads and writes. |
9965:a01cc09ae34c |
01-Nov-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Fix DRAM bank occupancy for streaming access
This patch fixes an issue that allowed more than 100% bus utilisation in certain cases. |
9964:6da4081bcbb4 |
01-Nov-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Schedule time for DRAM event taking tRAS into account
This patch changes the time the controller is woken up to take the next scheduling decisions. tRAS is now handled in estimateLatency and doDRAMAccess and we do not need to worry about it at scheduling time. The earliest we need to wake up is to do a pre-charge, row access and column access before the bus becomes free for use. |
9963:44cd6322f5d4 |
01-Nov-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Add tRAS parameter to the DRAM controller model
This patch adds an explicit tRAS parameter to the DRAM controller model. Previously tRAS was, rather conservatively, assumed to be tRCD + tCL + tRP. The default values for tRAS are chosen to match the previous behaviour and will be updated later. |
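The earlier implicit assumption can be written as a one-line formula. The timing values below are hypothetical, in picoseconds, and chosen only to make the arithmetic concrete; they are not taken from this changeset.

```python
def conservative_tras(t_rcd, t_cl, t_rp):
    """Previous implicit assumption: tRAS = tRCD + tCL + tRP."""
    return t_rcd + t_cl + t_rp


# Hypothetical timings in picoseconds, for illustration only.
T_RCD = T_CL = T_RP = 13750
implicit = conservative_tras(T_RCD, T_CL, T_RP)
print(implicit)  # sum of the three timings, in ps
```

With an explicit tRAS parameter, a configuration can specify a value smaller than this sum.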
9951:2f5eec8c1010 |
31-Oct-2013 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem: Add "const" attribute to Packet getters
Add "const" keywords to the getters in the Packet class so these can be invoked on const Packet objects. |
9950:4b7f60080149 |
31-Oct-2013 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Add privilege info to request class
This patch adds a flag in the request class that indicates if the request was made in privileged mode. |
9947:964b9eaab6b0 |
30-Oct-2013 |
Lluc Alvarez <lluc.alvarez@bsc.es> |
ruby: set SenderMachine in messages of MOESI_CMP_directory This patch adds missing initializations of the SenderMachine field of out_msg's when they are created in the L2 cache controller of the MOESI_CMP_directory coherence protocol. When an out_msg is created and this field is left uninitialized, it is set to the default value MachineType_NUM. This causes a panic in the MachineType_to_string function when gem5 is executed with the Ruby debug flag on and it tries to print the message.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9946:ebd44da818d5 |
30-Oct-2013 |
Emilio Castillo <castilloe@unican.es> |
ruby: Fixed a deadlock when restoring a checkpoint with garnet This patch fixes a problem where in Garnet, the enqueue time in the VCallocator and the SWallocator which is of type Cycles was being stored inside a variable with int type.
This led to a known problem when restoring checkpoints with garnet & the fixed pipeline enabled. That value was really big and didn't fit in the variable, overflowing it; therefore some conditions on the VC allocation stage & the SW allocation stage were not met and the packets didn't advance through the network, leading to a deadlock panic right after the checkpoint was restored.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
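The overflow described in the entry above can be illustrated by forcing a large cycle count through a 32-bit signed integer. This sketch assumes that int is 32 bits wide on the affected platform; the helper `store_in_int32` is hypothetical and only mimics the truncation.

```python
import ctypes


def store_in_int32(value):
    """Mimic storing a wide cycle count in a 32-bit signed int."""
    return ctypes.c_int32(value).value


small = store_in_int32(100)        # fits, value is preserved
large = store_in_int32(2**31 + 5)  # does not fit, wraps to a negative value
print(small, large)
```

A wrapped, negative enqueue time is the sort of value that would make comparisons against the current cycle behave unexpectedly after a checkpoint restore.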
9945:cd0a9c975c8c |
17-Oct-2013 |
Stephan Diestelhorst <stephan.diestelhorst@ARM.com> |
mem: De-virtualise interfaces in the CoherentBus
The CoherentBus eventually got virtual methods for its interface. The "virtuality" of the CoherentBus, however, comes already from the virtual interface of the bus' ports. There is no need to add another layer of virtual functions, here. |
9944:4ff1c5c6dcbc |
17-Oct-2013 |
Matt Horsnell <matt.horsnell@ARM.com> |
cpu: add consistent guarding to *_impl.hh files. |
9943:cc1e0ea8e450 |
17-Oct-2013 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Add PortID to QueuedMasterPort constructor
This patch adds the PortID to the QueuedMasterPort. This allows a PortID to be specified, as it was previously set to the default value of -1. |
9931:086fc5c038af |
17-Oct-2013 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Make MemoryAccess flag more verbose
This patch extends the MemoryAccess debug flag to report who sent the requests and the cacheability. |
9923:cdd51a15e9be |
15-Oct-2013 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: eliminate non-determinism from ruby.stats output
Get rid of non-deterministic "stats" in ruby.stats output such as time & date of run, elapsed & CPU time used, and memory usage. These values cause spurious miscomparisons when looking at output diffs (though they don't affect regressions, since the regressions pass/fail status currently ignores ruby.stats entirely).
Most of this information is already captured in other places (time & date in stdout, elapsed time & mem usage in stats.txt), where the regression script is smart enough to filter it out. It seems easier to get rid of the redundant output rather than teaching the regression tester to ignore the same information in two different places. |
9912:3de4393f5649 |
15-Oct-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
mem: Rename the ASI_BITS flag field in Request
ASI_BITS in the Request object were originally used to store a memory request's ASI on SPARC. This is not the case any more since other ISAs use the ASI bits to store architecture-dependent information. This changeset renames the ASI_BITS to ARCH_BITS which better describes their use. Additionally, the getAsi() accessor is renamed to getArchFlags(). |
9911:676d3dcf1cc2 |
15-Oct-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
mem: Use a flag instead of address bit 63 for generic IPRs
Using address bit 63 to identify generic IPRs caused problems on SPARC, where IPRs are heavily used. This changeset redefines how generic IPRs are identified. Instead of using bit 63, we now use a separate flag (GENERIC_IPR) in a memory request. |
9878:cfb305ba76bd |
18-Sep-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix scheduling bug in SimpleMemory
This patch ensures that a dequeue event is not scheduled if the memory controller is waiting for a retry already. Without this check it is possible for the controller to attempt sending something whilst already having one packet that is in retry, thus causing the bus to have an assertion failure. |
9869:a204694db4f9 |
11-Sep-2013 |
Joel Hestness <jthestness@gmail.com> |
ruby: Fix Topology throttle connections
The Topology source sets up input and output buffers for each of the external nodes of a topology by indexing on Ruby's generated controller unique IDs. These unique IDs are found by adding the MachineType_base_number to the version number of each controller (see any generated *_Controller.cc - init() calls getToNetQueue and getFromNetQueue using m_version + base). However, the Topology object used the cntrl_id - which is required to be unique across all controllers - to index the controllers list as they are being connected to their input and output buffers. If the cntrl_ids did not match the Ruby unique ID, the throttles end up connected to incorrectly indexed nodes in the network, resulting in packets traversing incorrect network paths. This patch fixes the Topology indexing scheme by using the Ruby unique ID to match that of the SimpleNetwork buffer vectors. |
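The indexing rule described in this entry can be sketched as follows. This is a minimal illustration, not the generated gem5 code: the enum, the `Counts` type and both function names are hypothetical, and the base number is assumed to be the number of controllers of all preceding machine types.

```cpp
#include <array>
#include <cstddef>

// Hypothetical machine types, in the order the controllers are numbered.
enum MachineTypeSketch : std::size_t { L1Cache = 0, L2Cache, Directory, NumTypes };

// Number of controllers instantiated for each machine type.
using Counts = std::array<unsigned, NumTypes>;

// Assumed definition of the base number: the total count of controllers
// belonging to all machine types that come before `type`.
unsigned baseNumber(const Counts &counts, MachineTypeSketch type)
{
    unsigned base = 0;
    for (std::size_t t = 0; t < type; ++t)
        base += counts[t];
    return base;
}

// Unique ID as described in the entry: base number plus controller version.
unsigned rubyUniqueId(const Counts &counts, MachineTypeSketch type,
                      unsigned version)
{
    return baseNumber(counts, type) + version;
}
```

With four L1 caches, two L2 caches and one directory, the second L2 cache (version 1) gets ID 5 in this sketch, regardless of the cntrl_id assigned to it in the configuration.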
9866:94dac7d7bb88 |
11-Sep-2013 |
Joel Hestness <jthestness@gmail.com> |
ruby: Statically allocate stats in SimpleNetwork, Switch, Throttle
The previous changeset (9863:9483739f83ee) used STL vector containers to dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For gcc versions before at least 4.6.3, this causes the standard vector allocator to call Stats copy constructors (a no-no, since stats should be allocated in the body of each SimObject instance). Since the size of these stats arrays is known at compile time (NOTE: after code generation), this patch changes their allocation to be static rather than using an STL vector. |
9863:9483739f83ee |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: convert to gem5 style stats |
9861:022a71603c7e |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: profiler: removes function resourceUsage() |
9860:7248fa3e6e0f |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove undefined message size type This message size type does not work well with one of the statistical variables. It also seems unnecessary. |
9859:1bd310386038 |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: removes reset functionality |
9858:f2417ecf5cc9 |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: shorten variable names |
9856:69bb50791e25 |
06-Sep-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: converts sparse memory stats to gem5 style |
9850:87d6b41749e9 |
04-Sep-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Resurrect the NOISA build target and rename it NULL
This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models.
The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. |
9838:43d22d746e7a |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
stats: Cumulative stats update
This patch updates the stats to reflect the: 1) addition of the internal queue in SimpleMemory, 2) moving of the memory class outside FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying burst size and interface width for the DRAM instead of relying on cache-line size, 5) performing merging in the DRAM controller write buffer, and 6) fixing how idle cycles are counted in the atomic and timing CPU models.
The main reason for bundling them up is to minimise the changeset size. |
9836:4411b4e0c03a |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
config: Command line support for multi-channel memory
This patch adds support for specifying multi-channel memory configurations on the command line, e.g. 'se/fs.py --mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it enhances the functionality of MemConfig and moves the existing makeMultiChannel class method from SimpleDRAM to the support scripts.
The se/fs.py example scripts are updated to make use of the new feature. |
9835:cc7a7fc71c42 |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Change AbstractMemory defaults to match the common case
This patch changes the default parameter value of conf_table_reported to match the common case. It also simplifies the regression and config scripts to reflect this change. |
9833:f3facce3d2b4 |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use STL deque in favour of list for DRAM queues
This patch changes the data structure used for the DRAM read, write and response queues from an STL list to deque. This optimisation is based on the observation that the size is small (and fixed), and that the structures are frequently iterated over in a linear fashion. |
9832:eaf87dfcdbb9 |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Perform write merging in the DRAM write queue
This patch implements basic write merging in the DRAM to avoid redundant bursts. When a new access is added to the queue it is compared against the existing entries, and if it is either intersecting or immediately succeeding/preceding an existing item it is merged.
There is currently no attempt made at avoiding iterating over the existing items in determining whether merging is possible or not. |
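The merging rule described in this entry can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the controller's code: `WriteBurst`, `canMerge` and `enqueueWrite` are hypothetical names, and the sketch ignores burst boundaries and any limit on the merged size.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>

// Hypothetical queue entry covering the byte range [addr, addr + size).
struct WriteBurst
{
    uint64_t addr;
    uint64_t size;
    uint64_t end() const { return addr + size; }
};

// Two entries can merge if their ranges intersect or touch end to start.
bool canMerge(const WriteBurst &a, const WriteBurst &b)
{
    return a.addr <= b.end() && b.addr <= a.end();
}

// Linear scan over the queue, as in the entry's description. Returns true
// if the incoming write was merged into an existing entry.
bool enqueueWrite(std::deque<WriteBurst> &queue, WriteBurst incoming)
{
    for (auto &entry : queue) {
        if (canMerge(entry, incoming)) {
            const uint64_t start = std::min(entry.addr, incoming.addr);
            const uint64_t end = std::max(entry.end(), incoming.end());
            entry.addr = start;
            entry.size = end - start;
            return true;
        }
    }
    queue.push_back(incoming);
    return false;
}
```

The scan visits every queued entry in the worst case, which matches the entry's remark that no attempt is made to avoid the iteration.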
9831:286ae4a124e1 |
19-Aug-2013 |
Amin Farmahini <aminfar@gmail.com> |
mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAM
This patch gets rid of bytesPerCacheLine parameter and makes the DRAM configuration separate from cache line size. Instead of bytesPerCacheLine, we define a parameter for the DRAM called burst_length. The burst_length parameter shows the length of a DRAM device burst in bits. Also, lines_per_rowbuffer is replaced with device_rowbuffer_size to improve code portability.
This patch adds a burst length in beats for each memory type, an interface width for each memory type, and the memory controller model is extended to reason about "system" packets vs "dram" packets and assemble the responses properly. It means that system packets larger than a full burst are split into multiple dram packets. |
9825:410c4238a1bd |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Warn instead of panic for tXAW violation
Until the performance bug is fixed, avoid killing simulations. |
9824:727c1f23d5ec |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow disabling of tXAW through a 0 activation limit
This patch fixes an issue where an activation limit of 0 was not allowed. With this patch, setting the limit to 0 simply disables the tXAW constraint. |
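A minimal sketch of an activation window in which a limit of 0 disables the tXAW constraint. The struct and its member names are hypothetical and the bookkeeping is simplified compared to a real controller.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>

using Tick = uint64_t;

// Hypothetical tracker for the "at most `limit` activates per tXAW" rule.
struct ActivationWindow
{
    unsigned limit;          // max activates per window, 0 disables the check
    Tick tXAW;               // window length in ticks
    std::deque<Tick> recent; // activation times, most recent first

    // Earliest tick at which an activate requested at `now` may issue.
    Tick earliestActivate(Tick now) const
    {
        if (limit == 0 || recent.size() < limit)
            return now;
        // The oldest tracked activate must be at least tXAW in the past.
        return std::max(now, recent.back() + tXAW);
    }

    // Remember an issued activate, keeping only the last `limit` entries.
    void record(Tick when)
    {
        if (limit == 0)
            return;
        recent.push_front(when);
        if (recent.size() > limit)
            recent.pop_back();
    }
};
```

With a limit of 4 and a window of 100 ticks, activates at ticks 0, 10, 20 and 30 push a fifth activate requested at tick 40 out to tick 100. With a limit of 0 the same request is allowed at tick 40.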
9823:c8dd3368c6ba |
19-Aug-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add an internal packet queue in SimpleMemory
This patch adds a packet queue in SimpleMemory to avoid using the packet queue in the port (and thus have no involvement in the flow control). The port queue was bound to 100 packets, and as the SimpleMemory is modelling both a controller and an actual RAM, it potentially has a large number of packets in flight. There is currently no limit on the number of packets in the memory controller, but this could easily be added in a follow-on patch.
As a result of the added internal storage, the functional access and draining is updated. Some minor cleaning up and renaming has also been done.
The memtest regression changes as a result of this patch and the stats will be updated. |
9820:2f9aecba2362 |
07-Aug-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: remove double trigger, continueProcessing These constructs are not in use and are not being maintained by anyone. In addition, it is not known if doubleTrigger works correctly with Ruby now. |
9819:e4b12145f4eb |
07-Aug-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: move some code to AbstractController Some of the code in StateMachine.py file is added to all the controllers and is independent of the controller definition. This code is being moved to the AbstractController class which is the parent class of all controllers. |
9814:7ad2b0186a32 |
18-Jul-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Set the cache line size on a system level
This patch removes the notion of a peer block size and instead sets the cache line size on the system level.
Previously the size was set per cache, and communicated through the interconnect. There were plenty of checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used.
A follow-on patch updates the configuration scripts accordingly. |
9813:bba03800b376 |
18-Jul-2013 |
Xiangyu Dong <rioshering@gmail.com> |
mem: Add cache class destructor to avoid memory leaks
Make valgrind a little bit happier |
9804:6a043adb1e8d |
11-Jul-2013 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed the very old double trigger hack
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9801:04414c223a6a |
28-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: append transition comment only when in opt/debug |
9799:5aed42e54180 |
28-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: remove reconfiguration code This code seems not to be of any use now. There is no path in the simulator that allows for reconfiguring the network. A better approach would be to take a checkpoint and start the simulation from the checkpoint with the new configuration. |
9796:485399270ca1 |
27-Jun-2013 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Reorganize cache tags and make them a SimObject
This patch reorganizes the cache tags to allow more flexibility to implement new replacement policies. The base tags class is now a clocked object so that derived classes can use a clock if they need one. Also, deriving from SimObject allows specialized Tag classes to be swapped in/out in .py files.
The cache set is now templatized to allow it to contain customized cache blocks with additional information. This involved moving code to the .hh file and removing cacheset.cc.
The statistics belonging to the cache tags are now including ".tags" in their name. Hence, the stats need an update to reflect the change in naming. |
9795:a31d1a0888a2 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove the cache builder
This patch removes the redundant cache builder class. |
9793:6e6cefc1db1f |
27-Jun-2013 |
Akash Bagdia <akash.bagdia@arm.com> |
sim: Add the notion of clock domains to all ClockedObjects
This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains.
The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For a piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead).
The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains.
All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a separate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain.
The clock period of all domains is pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children. |
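The pre-computation scheme described in this entry can be sketched as below. This is an illustrative sketch only: `ClockDomainSketch` and its members are hypothetical names, a single class stands in for both source and derived domains, and only a divider is supported, as in the entry.

```cpp
#include <cstdint>
#include <vector>

using Tick = uint64_t;

// Hypothetical domain: a source domain has no parent, a derived domain has
// a parent and a divider. The period is cached so that clockPeriod() is a
// plain load with no virtual call or multiplication.
class ClockDomainSketch
{
  public:
    // Source domain with an explicit period.
    explicit ClockDomainSketch(Tick period) : cachedPeriod(period) {}

    // Derived domain: registers with its parent so updates can propagate.
    ClockDomainSketch(ClockDomainSketch &parent, uint64_t div)
        : divider(div), cachedPeriod(parent.cachedPeriod * div)
    {
        parent.children.push_back(this);
    }

    ClockDomainSketch(const ClockDomainSketch &) = delete;
    ClockDomainSketch &operator=(const ClockDomainSketch &) = delete;

    Tick clockPeriod() const { return cachedPeriod; }

    // Intended for source domains: change the period and refresh all
    // derived domains below this one.
    void setPeriod(Tick period)
    {
        cachedPeriod = period;
        propagate();
    }

  private:
    void propagate()
    {
        for (ClockDomainSketch *child : children) {
            child->cachedPeriod = cachedPeriod * child->divider;
            child->propagate();
        }
    }

    uint64_t divider = 1;
    Tick cachedPeriod;
    std::vector<ClockDomainSketch *> children;
};
```

Tracking the children is what makes the cached value safe: a change at a source domain walks down the tree once, and every later clockPeriod() call stays cheap.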
9788:5558ee8dd7d9 |
27-Jun-2013 |
Akash Bagdia <akash.bagdia@arm.com> |
config: Remove redundant explicit setting of default clocks
This patch removes the explicit setting of the clock period for certain instances of CoherentBus, NonCoherentBus and IOCache where the specified clock is same as the default value of the system clock. As all the values used are the defaults, there are no performance changes. There are similar cases where the toL2Bus is set to use the parent CPU clock which is already the default behaviour.
The main motivation for these simplifications is to ease the introduction of clock domains. |
9786:03a075377221 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up the bridge with const and additional checks
This patch does a bit of tidying up in the bridge code, adding const where appropriate and also removing redundant checks and adding a few new ones.
There are no changes to the behaviour of any regressions. |
9785:face72b7bb78 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix CommMonitor style and response check
This patch fixes the CommMonitor local variable names, and also introduces a variable to capture if it expects to see a response. The latter check considers both needsResponse and memInhibitAsserted. |
9784:d28825cebfcc |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Align cache timing to clock edges
This patch changes the cache timing calculations such that the results are aligned to clock edges.
Plenty of stats change as a result of this patch. |
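Aligning a time to a clock edge can be sketched as rounding up to the next multiple of the clock period. The helper below is a hypothetical illustration that assumes edges fall on multiples of the period counted from tick 0. It is not the cache's actual code.

```cpp
#include <cstdint>

using Tick = uint64_t;

// Return the tick of the next clock edge at or after `now`, plus an
// optional number of whole cycles. Assumes edges at multiples of `period`.
Tick alignedEdge(Tick now, Tick period, uint64_t cycles = 0)
{
    const Tick aligned = ((now + period - 1) / period) * period;
    return aligned + cycles * period;
}
```

With a period of 500 ticks, a time of 1001 rounds up to the edge at 1500, while a time already on an edge, such as 1000, is left unchanged.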
9782:285458078a09 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Cycles converted to Ticks in atomic cache accesses
This patch fixes an outstanding issue in the cache timing calculations where an atomic access returned a time in Cycles, but the port forwarded it on as if it was in Ticks.
A separate patch will update the regression stats. |
9779:0742b0ccc430 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove a redundant heap allocation for a snoop packet
This patch changes the upwards snoop packet to avoid allocating and later deleting it. As the code executes in 0 time and the lifetime of the packet does not extend beyond the block there is no reason to heap allocate it. |
9778:9fd974959aa3 |
27-Jun-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove CoherentBus snoop port unused private member
This patch removes an unused member to avoid getting compiler warnings when using clang. |
9775:09ea1346e89e |
25-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: moesi cmp directory: separate actions for external hits This patch adds separate actions for requests that missed in the local cache and messages were sent out to get the requested line. These separate actions are required for differentiating between the hit and miss latencies in the statistics collected. |
9774:f9bf34ba4172 |
25-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi cmp directory: separate actions for external hits This patch adds separate actions for requests that missed in the local cache and messages were sent out to get the requested line. These separate actions are required for differentiating between the hit and miss latencies in the statistics collected. |
9773:915be89faf30 |
25-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: profiler: lots of inter-related changes The patch started off with removing the global variables from the profiler for profiling the miss latency of requests made to the cache. The corresponding histograms have been moved to the Sequencer. These are combined together when the histograms are printed. Separate histograms are now maintained for tracking latency of all requests together, of hits only and of misses only.
A particular set of histograms used to use the type GenericMachineType defined in one of the protocol files. This patch removes this type. Now, everything that relied on this type would use MachineType instead. To do this, SLICC has been changed so that multiple machine types can be declared by a controller in its preamble. |
9771:57aac1719f86 |
24-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove the three files related to profiling This patch removes the following three files: RubySlicc_Profiler.sm, RubySlicc_Profiler_interface.cc and RubySlicc_Profiler_interface.hh. Only one function prototyped in the file RubySlicc_Profiler.sm is in use. The rest of the code appearing in any of these files is not in use. Therefore, these files are being removed.
That one single function, profileMsgDelay(), is being moved to the protocol files where it is in use. If we need any of these deleted functions, I think the right way to make them visible is to have the AbstractController class in a .sm and let the controller state machine inherit from this class. The AbstractController class can then have the prototypes of these profiling functions in its definition. |
9770:a0ee1b3aec39 |
24-Jun-2013 |
Joel Hestness <jthestness@gmail.com>, Nilay Vaish <nilay@cs.wisc.edu> |
ruby: MessageBuffer: Remove unused m_size variable
The m_size variable attempted to track m_prio_heap.size(), but it did so incorrectly due to the functions reanalyzeMessages and reanalyzeAllMessages(). Since this variable is intended to track m_prio_heap.size(), we can simply replace instances where m_size is referenced with m_prio_heap.size(), which has the added bonus of removing the need for m_size.
Note: This patch also removes an extraneous DPRINTF format string designator from reanalyzeAllMessages()
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9769:bf245b82de17 |
20-Jun-2013 |
Lena Olson <lena@cs.wisc.edu> |
ruby: fix typo in MOESI_CMP_token protocol |
9768:ff17ab994003 |
18-Jun-2013 |
Lena Olson <lena@cs.wisc.edu> |
ruby: Fix prefetching for MESI_CMP_Directory
Transitions from present on PF_Ifetch were missing, causing a crash when prefetching is enabled.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9767:82758c79a71f |
18-Jun-2013 |
Lena Olson <lena@cs.wisc.edu> |
ruby: fix slicc compiler to complain about duplicate symbols
Previously, .sm files were allowed to use the same name for a type and a variable. This is unnecessarily confusing and has some bad side effects, like not being able to declare later variables in the same scope with the same type. This causes the compiler to complain and die on things like Address Address.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9766:488a71df39bc |
18-Jun-2013 |
Lena Olson <lena@cs.wisc.edu> |
ruby: restrict Address to being a type and not a variable name Change all occurrences of Address as a variable name to instead use Addr. Address is an allowed name in slicc even when Address is also being used as a type, leading to declarations of "Address Address". While this works, it prevents adding another field of type Address because the compiler then thinks Address is a variable name, not type.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9760:9db8a438608c |
18-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
kvm: Use the address finalization code in the TLB
Reuse the address finalization code in the TLB instead of replicating it when handling MMIO. This patch also adds support for injecting memory mapped IPR requests into the memory system. |
9747:fbe79534d024 |
09-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove several unused variables in Profiler This patch removes per processor cycle count, histogram for filter stats, histogram for multicasts, histogram for prefetch wait, some function prototypes that do not have definitions. |
9746:7d235b709425 |
09-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove periodic event from Profiler The Profiler class does not need an event for dumping statistics periodically. This is because there is a method for dumping statistics for all the sim objects periodically. Since Ruby is a sim object, its statistics are also included. |
9745:884ad4638236 |
09-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: stats: use gem5's stats for cache and memory controllers This moves event and transition count statistics for cache controllers to gem5's statistics. It does the same for the statistics associated with the memory controller in ruby.
All the cache/directory/dma controllers individually collect the event and transition counts. A callback function, collateStats(), has been added that is invoked on the controller version 0 of each controller class. This function adds all the individual controller statistics to vector variables. All the code for registering the statistical variables and collating them is generated by SLICC. The patch removes the files *_Profiler.{cc,hh} and *_ProfileDumper.{cc,hh} which were earlier used for collecting and dumping statistics respectively. |
9744:9ab496e12335 |
09-Jun-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove undefined functions in Address class |
9728:7daeab1685e9 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: More descriptive DRAM config names
This patch changes the class names of the various DRAM configurations to better reflect what memory they are based on. The speed and interface width is now part of the name, and also the alias that is used to select them on the command line.
Some minor changes are done to the actual parameters, to better reflect the named configurations. As a result of these changes the regressions change slightly and the stats will be bumped in a separate patch. |
9727:3af04c92c9aa |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add bytes per activate DRAM controller stat
This patch adds a histogram to track how many bytes are accessed in an open row before it is closed. This metric is useful in characterising a workload and the efficiency of the DRAM scheduler. For example, a DDR3-1600 device requires 44 cycles (tRC) before it can activate another row in the same bank. For a x32 interface (8 bytes per cycle) that means 8 x 44 = 352 bytes must be transferred to hide the preparation time. |
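The arithmetic in this entry can be written out as a small helper. The function name is hypothetical. The 8 bytes per cycle for a x32 interface and the tRC of 44 cycles are the values quoted in the entry.

```cpp
// Bytes that must be transferred from an open row to hide the time before
// another row in the same bank can be activated (tRC, in cycles).
constexpr unsigned bytesToHideActivate(unsigned bytesPerCycle,
                                       unsigned tRCCycles)
{
    return bytesPerCycle * tRCCycles;
}

// Values from the entry: x32 interface (8 bytes per cycle), tRC = 44.
static_assert(bytesToHideActivate(8, 44) == 352,
              "8 bytes/cycle * 44 cycles = 352 bytes");
```

A workload whose bytes-per-activate histogram sits well below this figure is not keeping rows open long enough to hide the activation cost.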
9726:1ce1a59b4060 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add static latency to the DRAM controller
This patch adds a frontend and backend static latency to the DRAM controller by delaying the responses. Two parameters expressing the frontend and backend contributions in absolute time are added to the controller, and the appropriate latency is added to the responses when adding them to the (infinite) queued port for sending.
For writes and reads that hit in the write buffer, only the frontend latency is added. For reads that are serviced by the DRAM, the static latency is the sum of the pipeline latencies of the entire frontend, backend and PHY. The default values are chosen based on having roughly 10 pipeline stages in total at 500 MHz.
In the future, it would be sensible to make the controller use its clock and convert these latencies (and a few of the DRAM timings) to cycles. |
9725:0d4ee33078bb |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Spring cleaning of MSHR and MSHRQueue
This patch does some minor tidying up of the MSHR and MSHRQueue. The clean up started as part of some ad-hoc tracing and debugging, but seems worthwhile enough to go in as a separate patch.
The highlights of the changes are reduced scoping (private) members where possible, avoiding redundant new/delete, and constructor initialisation to please static code analyzers. |
9724:7c7ed0cae353 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix MSHR print format
This patch fixes an incorrect print format string by adding an additional string element. |
9716:131cd1e24b70 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make returning snoop responses occupy response layer
This patch introduces a mirrored internal snoop port to facilitate easy addition of flow control for the snoop responses that are turned into normal responses on their return. To perform this, the slave ports of the coherent bus are wrapped in internal master ports that are passed as the source ports to the response layer in question.
As a result of this patch, there is more contention for the response resources, and as such system performance will decrease slightly.
A consequence of the mirrored internal port is that the port the bus tells to retry (the internal one) and the port actually retrying (the mirrored one) are not the same. Thus, the existing check in tryTiming is no longer correct. In fact, the test is redundant as the layer is only in the retry state while calling sendRetry on the waiting port, and if the latter does not immediately call the bus then the retry state is left. Consequently the check is removed. |
9715:0edf1445cf4d |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make the buses multi layered
This patch makes the buses multi layered, and effectively creates a crossbar structure with distributed contention ports at the destination ports. Before this patch, a bus could have a single request, response and snoop response in flight at any time, and with these changes there can be as many requests as connected slaves (bus master ports), and as many responses as connected masters (bus slave ports).
Together with address interleaving, this patch enables us to create high-throughput memory interconnects, e.g. 50+ GByte/s. |
9714:19a76cedd4ea |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Separate the two snoop response cases in the bus
This patch makes the flow control and state updates of the coherent bus more clear by separating the two cases, i.e. forward as a snoop response, or turn it into a normal response.
With this change it is also more clear what resources are being occupied, and that we effectively bypass the busy check for the second case. As a result of the change in resource usage some stats change. |
9713:d5a97bfa8569 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up a few variables in the bus
This patch does some minor housekeeping on the bus code, removing redundant code, and moving the extraction of the destination id to the top of the functions using it. |
9712:09deddf4e447 |
30-May-2013 |
Uri Wiener <uri.wiener@arm.com> |
mem: Add basic stats to the buses
This patch adds a basic set of stats which are hard to impossible to implement using only communication monitors, and are needed for insight such as bus utilization, transactions through the bus etc.
Stats added include throughput and transaction distribution, and also a two-dimensional vector capturing how many packets and how much data is exchanged between the masters and slaves connected to the bus. |
9711:d98f85e25441 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Use unordered set in bus request tracking
This patch changes the set used to track outstanding requests to an unordered set (part of C++11 STL). There is no need to maintain the order, and hopefully there might even be a small performance benefit. |
9710:03b21a385c47 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Check for waiting state in bus draining
This patch fixes a bug in the bus where the bus transitions from busy to idle and still has a port that is waiting for a retry from a peer. |
9709:fe54045c8670 |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a LPDDR3-1600 configuration
This patch adds a typical (leaning towards fast) LPDDR3 configuration based on publicly available data. As expected, it looks very similar to the LPDDR2-S4 configuration, only with a slightly lower burst time. |
9708:5dd29a521cac |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Adapt the LPDDR2 to match a single x32 channel
This patch adapts the existing LPDDR2 configuration to make use of the multi-channel functionality. Thus, to get a x64 interface two controllers should be instantiated using the makeMultiChannel method.
The page size and ranks are also adapted to better suit a typical LPDDR2 part. |
9707:1305bec2733f |
30-May-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Avoid explicitly zeroing the memory backing store
This patch removes the explicit memset as it is redundant and causes the simulator to touch the entire space, forcing the host system to allocate the pages.
Anonymous pages are mapped on the first access, and the page-fault handler is responsible for zeroing them. Thus, the pages are still zeroed, but we avoid touching the entire allocated space which enables us to use much larger memory sizes as long as not all the memory is actually used. |
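The allocation strategy described in this entry can be sketched with a plain POSIX `mmap` call. The function name `allocBackingStore` is hypothetical and the sketch assumes a Linux-like host where anonymous private mappings are zero-filled on first access.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Reserve `bytes` of anonymous memory without touching it. The kernel maps
// and zeroes each page on first access, so no explicit memset is needed
// and untouched pages cost no physical memory.
uint8_t *allocBackingStore(std::size_t bytes)
{
    void *p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<uint8_t *>(p);
}
```

Reading any byte of the returned region yields zero, yet only the pages actually read or written are backed by host memory. The caller releases the region with `munmap`.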
9703:782b7284de21 |
21-May-2013 |
Malek Musleh <malek.musleh@gmail.com> |
ruby: slicc: fix error msg in TypeFieldMemberAST.py |
9697:f037e7b4a827 |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: moesi hammer: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens variable names. |
9696:744fb905297c |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi cmp directory: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens variable names. |
9695:df1d9fee32a5 |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: moesi cmp token: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens variable names. |
9694:692776126391 |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: moesi cmp directory: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens variable names. |
9692:67d9da312ef0 |
21-May-2013 |
Nilay Vaish <nilay@cs.wisc.edu>, Malek Musleh <malek.musleh@gmail.com> |
ruby: add stats to .sm files, remove cache profiler
This patch changes the way cache statistics are collected in ruby.
As of now, there is a separate entity called CacheProfiler which holds statistical variables for caches. The CacheMemory class defines different functions for accessing the CacheProfiler. These functions are then invoked in the .sm files. I find this approach opaque and prone to error. Secondly, we probably should not be paying the cost of a function call for recording statistics.
Instead, this patch allows for accessing statistical variables in the .sm files. The collection would become transparent. Secondly, it would happen in place, so no function calls. The patch also removes the CacheProfiler class. |
9676:83d5112e71dd |
23-Apr-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
sim: Fix two bugs relating to software caching of PageTable entries.
The existing implementation can read uninitialized data or stale information from the cached PageTable entries.
1) Add a valid bit for the cache entries. Simply using zero for the virtual address to signify invalid entries is not sufficient. Speculative, wrong-path accesses frequently access page zero. The current implementation would return an uninitialized TLB entry when address zero was accessed and the PageTable cache entry was invalid.
2) When unmapping/mapping/remapping a page, invalidate the corresponding PageTable cache entry if one already exists. |
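The two fixes above can be sketched as follows. This is a hedged illustration rather than the gem5 implementation: the class names, the cache size, and the replacement order are assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    valid: bool = False  # explicit valid bit; vaddr == 0 is a legal address
    vaddr: int = 0
    paddr: int = 0


class PageTableCacheSketch:
    """Hypothetical software cache in front of a page table."""

    def __init__(self, size=3):
        self.entries = [CacheEntry() for _ in range(size)]

    def lookup(self, vaddr):
        for entry in self.entries:
            # Without the valid check, a lookup of page zero would match
            # a never-filled entry and return garbage.
            if entry.valid and entry.vaddr == vaddr:
                return entry.paddr
        return None

    def insert(self, vaddr, paddr):
        # Drop the oldest entry and put the new one at the front.
        self.entries.pop()
        self.entries.insert(0, CacheEntry(True, vaddr, paddr))

    def invalidate(self, vaddr):
        # Called on unmap/map/remap so stale translations are not reused.
        for entry in self.entries:
            if entry.valid and entry.vaddr == vaddr:
                entry.valid = False


cache = PageTableCacheSketch()
miss_on_page_zero = cache.lookup(0)
cache.insert(0, 0x1000)
hit_on_page_zero = cache.lookup(0)
cache.invalidate(0)
after_remap = cache.lookup(0)
```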
9673:3885582ecc52 |
23-Apr-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi coherence protocol: remove unused state M_MB |
9670:fa4eedccce17 |
23-Apr-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: patch checkpoint restore with garnet
Due to recent changes to the clocking system in Ruby and the way Ruby restores state from a checkpoint, garnet was failing to run from a checkpointed state. The problem is that Ruby resets the time to zero while warming up the caches. If any component records a local copy of the time (read: calls curCycle()) before the simulation has started, then that component will not operate until that time is reached. In the context of this particular patch, the Garnet Network class calls curCycle() at multiple places. Any non-operational component can block requests in the memory system, which the system interprets as a deadlock. This patch makes changes so that Garnet can successfully run from a checkpointed state.
It adds a globally visible time at which the actual execution started. This time is initialized in the RubySystem::startup() function. This variable is only meant for components within Ruby. This replaces the private variable that was maintained within Garnet since it is not possible to figure out the correct time when the value of this variable can be set.
The patch also does away with all cases where curCycle() is called within some Ruby component before the system has actually started executing. This is required due to the quirky manner in which ruby restores from a checkpoint. |
9669:c5b24e8ed428 |
22-Apr-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Address mapping with fine-grained channel interleaving
This patch adds an address mapping scheme where the channel interleaving takes place on a cache line granularity. It is similar to the existing RaBaChCo that interleaves on a DRAM page, but should give higher performance when there is less locality in the address stream. |
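The difference between the two interleaving granularities can be illustrated with a small sketch. The 64-byte cache line, 1 KiB DRAM page, and four channels are assumed values for illustration, and channel_of is a hypothetical helper, not a gem5 function.

```python
def channel_of(addr, granularity, num_channels):
    """Channel selected when interleaving at the given granularity (bytes)."""
    return (addr // granularity) % num_channels


CACHE_LINE = 64   # assumed cache line size in bytes
DRAM_PAGE = 1024  # assumed DRAM page size in bytes
CHANNELS = 4

# Four consecutive cache lines of a streaming access pattern.
addrs = [i * CACHE_LINE for i in range(4)]

# Cache-line interleaving spreads neighbouring lines over all channels.
fine_grained = [channel_of(a, CACHE_LINE, CHANNELS) for a in addrs]

# Page interleaving keeps them on one channel (and in one DRAM page).
page_grained = [channel_of(a, DRAM_PAGE, CHANNELS) for a in addrs]
```

With these assumed sizes, fine_grained is [0, 1, 2, 3] while page_grained is [0, 0, 0, 0], which is why the fine-grained scheme is expected to help when the address stream has little locality.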
9668:1d0387a172b0 |
22-Apr-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: More descriptive enum names for address mapping
This patch changes the slightly ambiguous names used for the address mapping scheme to be more descriptive, and actually spell out what they do. With this patch we also open up for adding more flavours of open- and close-type mappings, i.e. interleaving across channels with the open map. |
9664:7c91f58b19af |
22-Apr-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a WideIO DRAM configuration
This patch adds a WideIO 200 MHz configuration that can be used as a baseline to compare with DDRx and LPDDRx. Note that it is a single channel and that it should be replicated 4 times. It is based on publicly available information and attempts to capture an envisioned 8 Gbit single-die part (i.e. without TSVs). |
9663:45df88079f04 |
22-Apr-2013 |
Uri Wiener <uri.wiener@arm.com> |
mem: Adding verbose debug output in the memory system
This patch provides useful printouts throughout the memory system. This includes pretty-printed cache tags and function call messages (call-stack like). |
9662:59a7df953d5e |
22-Apr-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Replace check with panic where inhibited should not happen
This patch changes the SimpleTimingPort and RubyPort to panic on inhibited requests as this should never happen in either of the cases. The SimpleTimingPort is only used for the I/O devices PIO port and the DMA devices config port and should thus never see an inhibited request. Similarly, the SimpleTimingPort is also used for the MessagePort in x86, and there should also not be any cases where the port sees an inhibited request. |
9648:f10eb34e3e38 |
22-Apr-2013 |
Dam Sunwoo <dam.sunwoo@arm.com> |
sim: separate nextCycle() and clockEdge() in clockedObjects
Previously, nextCycle() could return the *current* cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-enter the sim loop. (Initially a cycle difference, but a significant impact later on.)
This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing. |
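The distinction between the two functions can be sketched with plain tick arithmetic. The helper names below are hypothetical free functions, and the result of next_cycle for an unaligned tick is an assumption; the message itself only describes the aligned case.

```python
def clock_edge(cur_tick, period):
    """First clock edge at or after cur_tick (may be cur_tick itself)."""
    return -(-cur_tick // period) * period  # ceiling division


def next_cycle(cur_tick, period):
    """An edge strictly after cur_tick: one period past clock_edge()."""
    return clock_edge(cur_tick, period) + period


PERIOD = 500  # illustrative clock period in ticks

aligned_edge = clock_edge(1000, PERIOD)    # tick already on an edge
aligned_next = next_cycle(1000, PERIOD)    # never the current tick
unaligned_edge = clock_edge(1200, PERIOD)  # rounds up to the next edge
```

On resume, scheduling with next_cycle avoids reusing a tick that was already processed before draining, which is the divergence the message describes.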
9639:a1609f47cb83 |
17-Apr-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: moesi cmp directory: add copyright notice |
9633:3bf3100e9fa1 |
09-Apr-2013 |
Joel Hestness <jthestness@gmail.com> |
Ruby: Fix RubyPort evict packet memory leak
When using the o3 or inorder CPUs with many Ruby protocols, the caches may need to forward invalidations to the CPUs. The RubyPort was instantiating a packet to be sent to the CPUs to signal the eviction, but the packets were not being freed by the CPUs. Consistent with the classic memory model, stack allocate the packet and heap allocate the request so that on ruby_eviction_callback() completion, the packet destructor is called and deletes the request (*Note: stack allocating the request causes double deletion, since it will be deleted in the packet destructor). This results in the fewest memory allocations without memory errors. |
9632:476febc1aff0 |
09-Apr-2013 |
Joel Hestness <jthestness@gmail.com> |
Ruby: Delete packet requests during warmup
When warming up caches in Ruby, the CacheRecorder sends fetch requests into Ruby Sequencers with packet types that require responses. Since responses are never generated for these CacheRecorder requests, the requests are not deleted in the packet destructor called from the Ruby hit callback. Free the request. |
9631:5ebde5544529 |
09-Apr-2013 |
Joel Hestness <jthestness@gmail.com> |
Ruby: Add field to slicc machine for generic type
This allows you to have, for example, an L2 cache that is not named "L2Cache" but is still a GenericMachineType_L2Cache. This is particularly helpful if the protocol has multiple L2 controllers. |
9630:a3525ee464b8 |
09-Apr-2013 |
Joel Hestness <hestness@cs.wisc.edu> |
Ruby: Order profilers based on version
When Ruby stats are printed for events and transitions, they include stats for all of the controllers of the same type, but they are not necessarily printed in order of the controller ID "version", because of the way the profilers were added to the profiler vector. This patch fixes the push order problem so that the stats are printed in ascending order 0->(# controllers), so statistics parsers may correctly assume the controller to which the stats belong. |
9629:c52b4c5f46f8 |
09-Apr-2013 |
Jason Power <powerjg@cs.wisc.edu> |
Ruby: More descriptive message buffer connection fatal
When connecting message buffers between Ruby controllers, it is easy to mistakenly connect multiple controllers to the same message buffer. This patch prints a more descriptive fatal message than the previous assert statement in order to facilitate easier debugging. |
9628:195d92059654 |
09-Apr-2013 |
Jason Power <powerjg@cs.wisc.edu> |
Ruby: Fix typo in Slicc if-statement AST error
The error in the SLICC code was hidden by the python error in the SLICC parser before this patch. |
9627:fa31189e1fb5 |
07-Apr-2013 |
Joel Hestness <jthestness@gmail.com> |
Ruby System, Cache Recorder: Use delete [] for trace vars
The cache trace variables are array allocated uint8_t* in the RubySystem and the Ruby CacheRecorder, but the code used delete to free the memory, resulting in Valgrind memory errors. Change these deletes to delete [] to get rid of the errors. |
9619:0da414aefaf6 |
27-Mar-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Fix cache latency bug
Fixes a latency calculation bug for accesses during a cache line fill.
Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order.
Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
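The latency mismatch described in this entry can be worked through with concrete numbers. The tick values below are made up for illustration, and the rule that a later access is served at max(arrival, whenReady) is a simplifying assumption, not the cache's actual code path.

```python
# Illustrative values in ticks; none of these are gem5 defaults.
fill_done = 1000
response_latency = 20
bus_serialization_latency = 8

# MSHR targets waiting on the fill pay both components.
mshr_target_ready = fill_done + response_latency + bus_serialization_latency

# Before the fix, the line's whenReady omitted the response latency.
when_ready_before = fill_done + bus_serialization_latency
# After the fix, it carries the same penalty as the MSHR targets.
when_ready_after = fill_done + response_latency + bus_serialization_latency

# A later access to the same line that arrives just after the fill.
later_arrival = 1005
served_before_fix = max(later_arrival, when_ready_before)
served_after_fix = max(later_arrival, when_ready_after)

# True means the later access answered ahead of the earlier MSHR target.
reordered_before_fix = served_before_fix < mshr_target_ready
reordered_after_fix = served_after_fix < mshr_target_ready
```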
9614:c35b47fd0df8 |
26-Mar-2013 |
Rene de Jong <rene.dejong@arm.com> |
mem: Cancel cache retry event when blocking port
This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system. |
9612:ffe96405c828 |
26-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Separate waiting for the bus and waiting for a peer
This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction.
As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired. |
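The split between the two kinds of waiting can be sketched with a FIFO and a map. This is an illustration under assumptions, not the bus code: the class, the method names, and the use of strings as ports are all hypothetical.

```python
from collections import deque


class BusRetrySketch:
    """Hypothetical bookkeeping for the two kinds of waiting ports."""

    def __init__(self):
        # Ports waiting for the bus itself, served strictly in FIFO order.
        self.wait_for_bus = deque()
        # Destination port -> source port whose forwarding attempt failed.
        self.wait_for_peer = {}

    def defer_for_bus(self, src_port):
        self.wait_for_bus.append(src_port)

    def defer_for_peer(self, src_port, dest_port):
        self.wait_for_peer[dest_port] = src_port

    def bus_became_idle(self):
        # Only a port that was waiting for the bus gets this retry.
        return self.wait_for_bus.popleft() if self.wait_for_bus else None

    def peer_sent_retry(self, dest_port):
        # The retry goes to the port that initiated the failed transaction.
        return self.wait_for_peer.pop(dest_port, None)


bus = BusRetrySketch()
bus.defer_for_bus("cpu0")
bus.defer_for_bus("cpu1")
bus.defer_for_peer("dma", "mem")

retry_from_mem = bus.peer_sent_retry("mem")
first_for_bus = bus.bus_became_idle()
second_for_bus = bus.bus_became_idle()
```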
9611:1f6fa87e9095 |
26-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Introduce a variable for the retrying port
This patch introduces a variable to keep track of the retrying port instead of relying on it being the front of the retryList.
Besides the improvement in readability, this patch is a step towards separating out the two cases where a port is waiting for the bus to be free, and where the forwarding did not succeed and the bus is waiting for a retry to pass on to the original initiator of the transaction.
The changes made are currently such that the regressions are not affected. This is ensured by always prioritizing the currently retrying port and putting it back at the front of the retry list. |
9609:2904589daa6b |
26-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add optional request flags to the packet trace
This patch adds an optional flags field to the packet trace to encode the request flags that contain information about whether the request is (un)cacheable, instruction fetch, prefetch etc. |
9604:ce3382ab8772 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: slicc: set sender, receiver clock objs for optional queue |
9603:bf5e46a02a38 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: correct previous errors
A recent set of patches added support for multiple clock domains to ruby. I had made some errors while writing those patches. The sender was using the receiver side clock while enqueuing a message in the buffer. Those errors became visible while creating (or restoring from) checkpoints. The errors also become visible when a multi eventq scenario occurs. |
9602:21f39f6c1e92 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer: remove _ptr from some variables
The names were getting too long. |
9601:fe4eb64480bf |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: message buffer node: used Tick in place of Cycles
The message buffer node used to keep time in terms of Cycles. Since the sender and the receiver can have different clock periods, storing node time in cycles requires some conversion. Instead store the time directly in Ticks. |
9600:34df8f24be7e |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: consumer: avoid using receiver side clock
A set of patches was recently committed to allow multiple clock domains in ruby. In those patches, I had inadvertently made an incorrect use of the clocks. Suppose object A needs to schedule an event on object B. It was possible that A accesses B's clock to schedule the event. This is not possible in an actual system. Hence, changes are being made to the Consumer class so as to avoid such happenings. Note that in a multi eventq simulation, this can possibly lead to an incorrect simulation.
There are two functions in the Consumer class that are used for scheduling events. The first function takes in the relative delay over the current time as the argument and adds the current time to it for scheduling the event. The second function takes in the absolute time (in ticks) for scheduling the event. The first function is now being moved to protected section of the class so that only objects of the derived classes can use it. All other objects will have to specify absolute time while scheduling an event for some consumer. |
9599:e95479c2926f |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused profile functions |
9598:a58b28c17d7f |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: keep histogram of outstanding requests in seq
The histogram for tracking outstanding counts per cycle is maintained in the profiler. For a parallel implementation of the memory system, this histogram needs to be maintained locally. Hence it will now be kept in the sequencer itself. The resulting histograms will be merged when the stats are printed. |
9597:f9b731fc6064 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
slicc: remove check if the L1Cache has a sequencer |
9596:aa73a81cf92c |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: move stall and wakeup functions to AbstractController
These functions are currently implemented in one of the files related to Slicc. Since these are purely C++ functions, they are better suited to be in the base class. |
9595:470016acf37d |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: connect two controllers using only message buffers
This patch modifies ruby so that two controllers can be connected to each other with only message buffers in between. Before this patch, all the controllers had to be connected to the network for them to communicate with each other. With this patch, one can have protocols where a controller is not connected to the network, but communicates with another controller through a message buffer. |
9594:219ad5fe8c04 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: convert Topology to regular class
The Topology class in Ruby does not need to inherit from SimObject class. This patch turns it into a regular class. The topology object is now created in the constructor of the Network class. All the parameters for the topology class have been moved to the network class. |
9593:9441ca79f3c8 |
22-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: network: move routers from topology to network |
9587:fa9f28c0bfae |
18-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix missing delete of packet in DRAM access
This patch fixes a memory leak caused by not deleting packets that require no response. |
9586:3c62e3b7f658 |
15-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: set: corrects csprintf() call introduced by 7d95b650c9b6 |
9580:d1e6329cd367 |
07-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Fix gcc 4.8 maybe-uninitialized compilation error
This patch fixes the one-and-only gcc 4.8 compilation error, being a warning about "maybe uninitialized" in Orion. |
9577:91cac7c9c636 |
06-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove the functional copy of memory in se mode
This patch removes the functional copy of the memory that was maintained in the se mode. Now ruby itself will provide the data. |
9576:2c094ad4dc70 |
06-Mar-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: garnet: fixed: implement functional access |
9572:13ae8000f771 |
03-Mar-2013 |
Blake Hechtman <bah13@duke.edu>, Nilay Vaish <nilay@cs.wisc.edu> |
ruby: fixes functional writes to RubyRequest
The functional write code was assuming that all writes are block sized, which may not be true for Ruby Requests. This bug can lead to a buffer overflow.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9570:cc7a6660c8b7 |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add check if SimpleDRAM nextReqEvent is scheduled
This check covers a case where a retry is called from the SimpleDRAM causing a new request to appear before the DRAM itself schedules a nextReqEvent. By adding this check, the event is not scheduled twice. |
9569:3f70fc65b14c |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add a method to build multi-channel DRAM configurations
This patch adds a class method that allows easy creation of channel-interleaved multi-channel DRAM configurations. It is enabled by a class method to allow customisation of the class independent of the channel configuration. For example, the user can create a MyDDR subclass of e.g. SimpleDDR3, and then create a four-channel configuration of the subclass by calling MyDDR.makeMultiChannel(4, mem_start, mem_size). |
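The reason a class method is useful here can be shown without gem5. The sketch below uses a hypothetical stand-in class rather than the real SimObject, and the attribute names and snake_case method name are illustrative; only the call shape MyDDR.makeMultiChannel(4, mem_start, mem_size) comes from the message.

```python
class SimpleDRAMSketch:
    """Hypothetical stand-in for a DRAM controller class (not a SimObject)."""

    burst_bytes = 64  # illustrative class-level parameter

    def __init__(self, mem_start, mem_size, channel, num_channels):
        self.mem_start = mem_start
        self.mem_size = mem_size
        self.channel = channel
        self.num_channels = num_channels

    @classmethod
    def make_multi_channel(cls, num_channels, mem_start, mem_size):
        # cls is the class the method was called on, so a subclass keeps
        # its customised parameters in every channel that is created.
        return [cls(mem_start, mem_size, ch, num_channels)
                for ch in range(num_channels)]


class MyDDR(SimpleDRAMSketch):
    burst_bytes = 32  # customisation independent of the channel count


channels = MyDDR.make_multi_channel(4, 0, 1 << 30)
channel_ids = [c.channel for c in channels]
```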
9567:929da5a00a10 |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: SimpleDRAM variable naming and whitespace fixes
This patch fixes a number of small cosmetic issues in the SimpleDRAM module. The most important change is to move the accounting of received packets to after the check is made if the packet should be retried or not. Thus, packets are only counted if they are actually accepted. |
9566:b1e1409922ad |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add support for multi-channel DRAM configurations
This patch adds support for multi-channel instances of the DRAM controller model by stripping away the channel bits in the address decoding. The patch relies on the availability of address interleaving and, at this time, it is up to the user to configure the interleaving appropriately. At the moment it is assumed that the channel interleaving bits are immediately following the column bits (smallest sensible interleaving). Convenience methods for building multi-channel configurations will be added later. |
9565:c2f393be5f14 |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Merge interleaved ranges when creating backing store
This patch adds merging of interleaved ranges before creating the backing stores. The backing stores are always a contiguous chunk of the address space, and with this patch it is possible to have interleaved memories in the system. |
9564:69262a1bf067 |
01-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Merge ranges in bus before passing them on
This patch adds basic merging of address ranges to the bus, such that interleaved ranges are merged together before being passed on by the bus. As such, the bus aggregates the address ranges of the connected slave ports and then passes on the merged ranges through its master ports. The bus thus hides the complexity of the interleaved ranges and only exposes contiguous ranges to the surrounding system.
As part of this patch, the bus ranges are also cached for any future queries. |
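The merging step can be sketched as grouping stripes that cover the same span. The tuple layout (start, end, stripe, num_stripes) and the function name are assumptions made for this illustration and do not mirror the gem5 address range class.

```python
from collections import defaultdict


def merge_interleaved(ranges):
    """Collapse complete sets of interleaved stripes into contiguous ranges.

    Each input range is a hypothetical tuple
    (start, end, stripe, num_stripes); a non-interleaved range uses
    stripe 0 of 1.
    """
    groups = defaultdict(set)
    for start, end, stripe, num_stripes in ranges:
        groups[(start, end, num_stripes)].add(stripe)

    merged = []
    for (start, end, num_stripes), stripes in sorted(groups.items()):
        if stripes != set(range(num_stripes)):
            raise ValueError("incomplete set of interleaved ranges")
        merged.append((start, end))
    return merged


GIB = 1 << 30
# Four channels interleaved over the first GiB, plus one plain range.
downstream = [(0, GIB, ch, 4) for ch in range(4)] + [(GIB, 2 * GIB, 0, 1)]
exposed = merge_interleaved(downstream)
```

The surrounding system then sees only the two contiguous ranges in exposed, not the four stripes.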
9563:08d097040f90 |
28-Feb-2013 |
Dibakar Gope <gope@wisc.edu>, Nilay Vaish <nilay@cs.wisc.edu> |
ruby: mesi coherence protocol: invalidate lock
The MESI CMP directory coherence protocol, while transitioning from SM to IM, did not invalidate the lock that it might have taken on a cache line. This patch adds an action for doing so.
The problem was found by Dibakar, but I was not happy with his proposed solution. So I implemented a different solution.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9561:bc043a0455e3 |
19-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
slicc: remove unused variable message_buffer_names |
9560:322472967603 |
19-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused variable m_print_config in class Topology |
9559:e6347e559e8f |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix sender state bug and delay popping
This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled. |
9557:8666e81607a6 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Fix warnings issued by clang 3.2svn (XCode 4.6)
This patch fixes the warnings that clang3.2svn emits due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned. |
9554:406fbcf60223 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Add warning for missing declarations
This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code. |
9550:e0e2c8f83d08 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Fix up numerous warnings about name shadowing
This patch addresses the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged. |
9549:95a536fae9ac |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Enforce strict use of busFirst- and busLastWordTime
This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for.
As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored.
Since no time is added, all regressions remain the same. |
9548:63d36f7ef562 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Change accessor function names to match the port interface
This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself.
The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet. |
9547:6d81435f56cb |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Make packet bus-related time accounting relative
This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to create a Last-Level Cache (LLC) directly to a memory controller without a bus in between.
The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing. |
9546:ac0c18d738ce |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add deferred packet class to prefetcher
This patch removes the time field from the packet as it was only used by the prefetcher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent. |
9545:508784fad4e5 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Make clock private and access using clockPeriod()
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier. |
9543:a373b2e664ff |
19-Feb-2013 |
Sascha Bischoff <sascha.bischoff@arm.com> |
mem: Fix SenderState related cache deadlock
This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches.
This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing. |
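The traversal described above can be sketched as a walk over a linked list of sender states. The classes below are simplified stand-ins: the predecessor link follows the field named in the adjacent SenderState entry, OtherState and the function names are hypothetical, and isinstance plays the role of the C++ cast.

```python
class SenderState:
    def __init__(self, predecessor=None):
        self.predecessor = predecessor  # next-older state on the stack


class MSHR(SenderState):
    def __init__(self, predecessor=None):
        super().__init__(predecessor)
        self.downstream_pending = True


class OtherState(SenderState):
    """Hypothetical state pushed by a component between two caches."""


def find_mshr(sender_state):
    # Walk the list until an MSHR is found or the list ends.
    state = sender_state
    while state is not None and not isinstance(state, MSHR):
        state = state.predecessor
    return state


def clear_downstream_pending(sender_state):
    mshr = find_mshr(sender_state)
    if mshr is not None:
        # The packet passed through another cache upstream.
        mshr.downstream_pending = False
    # Otherwise do nothing.


upstream_mshr = MSHR()
top_of_stack = OtherState(OtherState(upstream_mshr))
clear_downstream_pending(top_of_stack)
```

Because the walk no longer assumes the MSHR is the top-most state, a component that pushes its own state between two caches does not hide the upstream MSHR.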
9542:683991c46ac8 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add predecessor to SenderState base class
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses.
There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest. |
9540:9ddb996931d7 |
19-Feb-2013 |
Andreas Hansson <andreas.hansson@armm.com> |
mem: Ensure trace captures packet fields before forwarding
This patch fixes a bug in the CommMonitor caused by the packet being modified before it is captured in the trace. By recording the fields before passing the packet on, and then putting these values in the trace we ensure that even if the packet is modified the trace captures what the CommMonitor saw. |
9529:28d6d9663a7e |
15-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tighten up cache constness and scoping
This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected. |
9524:d6ffa982a68b |
15-Feb-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
sim: Add a system-global option to bypass caches
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virtualized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches.
To make memory mode tests cleaner, the following methods are added to the System class:
* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
* isTimingMode() -- True if the memory mode is 'timing'.
* bypassCaches() -- True if caches should be bypassed.
The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore. |
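The three predicates can be sketched as thin wrappers over a mode string. This is an illustration, not the System class: the Python names are hypothetical, and it assumes that 'direct' in the list above refers to the 'atomic_noncaching' mode introduced by this changeset.

```python
class SystemSketch:
    """Hypothetical stand-in for the memory-mode tests on System."""

    def __init__(self, memory_mode):
        self.memory_mode = memory_mode

    def is_atomic_mode(self):
        # Assumption: the cache-bypassing mode also counts as atomic.
        return self.memory_mode in ("atomic", "atomic_noncaching")

    def is_timing_mode(self):
        return self.memory_mode == "timing"

    def bypass_caches(self):
        return self.memory_mode == "atomic_noncaching"


virtualized = SystemSketch("atomic_noncaching")
detailed = SystemSketch("timing")
```

Callers then ask the question they care about (atomic or timing, bypass or not) instead of comparing against the raw mode value.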
9511:615456167b9d |
14-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Fix compilation errors on gcc 4.7 and clang 3.2
This patch fixes a few (recently added) errors that prevented gem5 from compiling on more recent versions of gcc and clang. |
9509:0adea7868e77 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: MI protocol: add a missing transition
The transition for state MII and event Store was found missing during testing. The transition is being added. The controller will not stall the Store request in state MII. |
9508:dde110931867 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: enable multiple clock domains
This patch allows ruby to have multiple clock domains. As I understand with this patch, controllers can have different frequencies. The entire network needs to run at a single frequency.
The idea is that within an object, time is treated in terms of cycles. But the messages that are passed from one entity to another should contain the time in Ticks. As of now, this is only true for the message buffers, but not for the links in the network. As I understand the code, all the entities in different networks (simple, garnet-fixed, garnet-flexible) should be clocked at the same frequency.
Another problem is that the directory controller has to operate at the same frequency as the ruby system. This is because the memory controller does not make use of the Message Buffer, and instead implements a buffer of its own. So, it has no idea of the frequency at which the directory controller is operating and uses ruby system's frequency for scheduling events. |
9507:d2ab6d889fc7 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Cycles (final patch in the series)
This patch is as of now the final patch in the series of patches that replace Time with Cycles. This patch further replaces Time with Cycles in Sequencer, Profiler, different protocols and related entities.
Though Time has not been completely removed, the places where it is in use seem benign as of now. |
9506:f5335ac67f41 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Cycles in garnet fixed and flexible |
9505:66b3ed9a176e |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Tick in replacement policy classes |
9504:5c6de9a7f8d8 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: convert block size, memory size to unsigned |
9503:98ad73bdc579 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Cycles in MessageBuffer |
9502:45cd0bc6c507 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Cycles in Memory Controller |
9501:378817542866 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: Replace Time with Cycles in SequencerMessage |
9500:9c3e3d1c7a87 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replace Time with Cycles in Message class Concomitant changes are being committed as well, including the io operator<< for the Cycles class. |
9499:b03b556a8fbb |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: replaces Time with Cycles in many places The patch started off with replacing Time with Cycles in the Consumer class. But to get ruby to compile, the rest of the changes had to be carried out. Subsequent patches will further this process, till we completely replace Time with Cycles. |
9497:2759161b9d7f |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: modifies histogram add() function This patch modifies the Histogram class' add() function so that it can add linear histograms as well. The function assumes that the left end points of the ranges of the two histograms are the same. It also assumes that when the ranges of the two histograms are changed to accommodate an element not in the range, the factor used in changing the range is the same for both histograms.
This function is then used in removing one of the calls to the global profiler*. The histograms for recording the delays incurred in processing different requests are now maintained by the controllers. The profiler adds these histograms when it needs to print the stats. |
9496:28d88a0fda74 |
10-Feb-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: record fully busy cycle within the controller This patch does several things. First, the counter for fully busy cycles for a controller is now kept within the controller, instead of being part of the profiler. Second, the topology class no longer keeps an array of controllers which was only used for printing stats. Instead, ruby system will now ask each controller to print the stats. Third, the statistical variable for recording how many different types were created is being moved into the controller from the profiler. Note that for printing, the profiler will collate results from different controllers. |
9492:d4953634e9ee |
31-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: correct computation of number of bits required for address The number of bits required for an address was set to floorLog2(memory size). This is correct under the assumption that the memory size is a power of 2, which is not always true. Hence, floorLog2 is being replaced with ceilLog2. |
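The difference between the two roundings can be seen in a minimal Python sketch. The helper names `floor_log2` and `ceil_log2` are stand-ins for the gem5 functions named above, and the 3 GiB memory size is an assumed example, not a value taken from the patch.

```python
def floor_log2(n: int) -> int:
    """Largest k with 2**k <= n (n must be positive)."""
    return n.bit_length() - 1


def ceil_log2(n: int) -> int:
    """Smallest k with 2**k >= n (n must be positive)."""
    return (n - 1).bit_length()


mem_size = 3 * 2**30  # assumed example: 3 GiB, not a power of two

# Rounding down yields 31 bits, which only covers 2 GiB of the 3 GiB.
assert floor_log2(mem_size) == 31
# Rounding up yields 32 bits, enough to address every byte.
assert ceil_log2(mem_size) == 32
# For an exact power of two the two functions agree.
assert floor_log2(2**30) == ceil_log2(2**30) == 30
```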
9491:5a3d2fb86a78 |
31-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add comments for the DRAM address decoding
This patch adds more verbose comments to explain the two different address mapping schemes of the DRAM controller. |
9489:172dbcb74a0e |
31-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add DDR3 and LPDDR2 DRAM controller configurations
This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward.
The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration. |
9488:2304663a11d9 |
31-Jan-2013 |
Ani Udipi <ani.udipi@arm.com> |
mem: Add tTAW and tFAW to the SimpleDRAM model
This patch adds two additional scheduling constraints to the DRAM controller model, to constrain the activation rate. The two metrics determine the size of the activation window in terms of the number of activates and the minimum time required for that number of activates. This maps to current DDRx, LPDDRx and WIOx standards that have either tFAW (4 activate window) or tTAW (2 activate window) scheduling constraints. |
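An activation window of this kind can be sketched as a sliding record of recent activates. The Python below is an illustration only: the class `ActivationWindow`, its method names, and the numbers used are hypothetical and are not taken from the SimpleDRAM code.

```python
from collections import deque


class ActivationWindow:
    """Allow at most `limit` activates in any span of `window` ticks."""

    def __init__(self, limit: int, window: int):
        self.limit = limit      # 4 for a tFAW-style rule, 2 for tTAW
        self.window = window    # minimum time for `limit` activates
        self.recent = deque(maxlen=limit)  # times of the last `limit` activates

    def earliest_activate(self, now: int) -> int:
        """Earliest tick at or after `now` at which an activate is legal."""
        if len(self.recent) < self.limit:
            return now
        # The oldest of the last `limit` activates must be a full window old.
        return max(now, self.recent[0] + self.window)

    def activate(self, now: int) -> int:
        """Schedule an activate and return the tick at which it happens."""
        when = self.earliest_activate(now)
        self.recent.append(when)
        return when


faw = ActivationWindow(limit=4, window=100)
times = [faw.activate(t) for t in (0, 10, 20, 30, 40)]
# The fifth activate is pushed out until the first one is 100 ticks old.
assert times == [0, 10, 20, 30, 100]
```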
9487:93389c3d9195 |
31-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Separate out the different cases for DRAM bus busy time
This patch changes how the data bus busy time is calculated such that it is delayed to the actual scheduling time of the request as opposed to being done as soon as possible.
This patch changes a bunch of statistics, and the stats update is bundled together with the introduction of tFAW/tTAW and the named DRAM configurations like DDR3 and LPDDR2. |
9486:569e1f1d762d |
28-Jan-2013 |
Anthony Gutierrez <atgutier@umich.edu> |
cache: remove drainManager because it's not used
The cache drainManager is set but never cleared. This is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). Because the drainManager variable is not used properly and does not appear to be necessary, it has been removed with this patch. |
9484:e96ff45795bc |
28-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove get_time() This patch replaces get_time() in *.sm files with curCycle() which is now possible since controllers are clocked objects. |
9483:c2d205f278fc |
28-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove call to curCycle in panic() The panic() function already prints the current tick value. This call to curCycle() is as such redundant. Since we are trying to move towards multiple clock domains, this call will print misleading time. |
9475:736909f5c13b |
17-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove calls to g_system_ptr->getTime() This patch further removes calls to g_system_ptr->getTime() wherever other clocked objects are available for providing current time. |
9467:8da5ee073b92 |
14-Jan-2013 |
Malek Musleh <malek.musleh@gmail.com> |
ruby sequencer: converts cycles to ticks in deadlock panic() This patch converts the panic() print outs in the Sequencer::wakeup() call from ruby cycles to Ticks(). This makes it easier to debug deadlocks with the ProtocolTrace flag so the issue time indicated in the panic message can be quickly searched for.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
9466:23e13ad7091f |
14-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: remove reference to g_system_ptr from class Message This patch was initiated so as to remove reference to g_system_ptr, the pointer to Ruby System that is used for getting the current time. That simple change actually requires changing a great many things in slicc and garnet. All these changes are related to how time is handled.
In most of the places, g_system_ptr has been replaced by another clock object. The changes have been done under the assumption that all the components in the memory system are on the same clock frequency, but the actual clocks might be distributed. |
9465:4ae4f3f4b870 |
14-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: use ClockedObject in Consumer class Many Ruby structures inherit from the Consumer, which is used for scheduling events. The Consumer used to rely on an Event Manager for scheduling events and on g_system_ptr for time. With this patch, the Consumer will now use a ClockedObject to schedule events and to query for current time. This resulted in several structures being converted from SimObjects to ClockedObjects. Also, the MessageBuffer class now requires a pointer to a ClockedObject so as to query for time. |
9454:2694770a30d4 |
08-Jan-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Make LL/SC locks fine grained
The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range. |
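The overlap test described above can be sketched in a few lines of Python. The `LockRecord` type and the `surviving_locks` helper are hypothetical names chosen for illustration and do not reflect the actual gem5 lock structure.

```python
from dataclasses import dataclass


@dataclass
class LockRecord:
    """Hypothetical lock entry: the bytes covered by a load-linked."""
    start: int
    size: int

    def overlaps(self, addr: int, size: int) -> bool:
        # Two half-open intervals overlap when each starts before the other ends.
        return addr < self.start + self.size and self.start < addr + size


def surviving_locks(locks, store_addr, store_size):
    """Return the locks that a store leaves intact."""
    return [lock for lock in locks if not lock.overlaps(store_addr, store_size)]


locks = [LockRecord(start=0x1000, size=4)]

# A store elsewhere in the same 64-byte line no longer breaks the LL/SC pair.
assert surviving_locks(locks, 0x1008, 4) == locks
# A store that touches any locked byte still invalidates the lock.
assert surviving_locks(locks, 0x1002, 4) == []
```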
9453:0694ba392248 |
08-Jan-2013 |
Mitch Hayenga <mitch.hayenga+gem5@gmail.com> |
mem: Fix use-after-free bug
Running with valgrind I noticed a use after free originating from simple_mem.cc. It looks like this is a known issue and this additional call site was missed in an earlier patch. |
9445:5963165c00cb |
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem: Fix guest corruption when caches handle uncacheable accesses
When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS.
Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses. |
9422:34d2e8082912 |
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem: Remove the IIC replacement policy
The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters. |
9418:9923a5ab8c13 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
sim: Fatal if a clocked object is set to have a clock of 0
This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0. |
9413:0937a00d3f68 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Merge ranges that are part of the conf table
This patch adds basic merging of address ranges when determining which address ranges should be reported in the configuration table. By performing this merging it is possible to distribute an address range across many memory channels (controllers). This is essential to enable address interleaving. |
9411:22e15f9c3fda |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add interleaving bits to the address ranges
This patch adds support for interleaving bits for the address ranges. What was previously just a start and end address, now has an additional three fields, for the high bit, and number of bits to use for interleaving, and a match value to compare against. If the number of interleaving bits is set to zero it is effectively disabled.
A number of convenience functions are added to the range to enquire about the interleaving, its granularity and the number of stripes it is part of. |
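To make the role of the three extra fields concrete, here is a small Python sketch of an interleaved range. The class and field names (`InterleavedRange`, `high_bit`, `bits`, `match`) are assumptions made for this example, as is the choice to derive the granularity from the lowest selected bit; the real AddrRange may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class InterleavedRange:
    start: int
    end: int            # inclusive
    high_bit: int = 0   # most significant interleaving bit
    bits: int = 0       # number of interleaving bits; 0 disables interleaving
    match: int = 0      # value the selected bits must equal

    def interleaved(self) -> bool:
        return self.bits != 0

    def stripes(self) -> int:
        """Number of ranges the address space is split across."""
        return 1 << self.bits

    def granularity(self) -> int:
        """Size of one contiguous chunk (assumed: set by the lowest selected bit)."""
        return 1 << (self.high_bit - self.bits + 1)

    def contains(self, addr: int) -> bool:
        if not (self.start <= addr <= self.end):
            return False
        if not self.interleaved():
            return True
        low_bit = self.high_bit - self.bits + 1
        selected = (addr >> low_bit) & ((1 << self.bits) - 1)
        return selected == self.match


# Four channels share 64 KiB, interleaved on address bits [7:6] (64-byte chunks).
channels = [InterleavedRange(0x0, 0xFFFF, high_bit=7, bits=2, match=m)
            for m in range(4)]

assert channels[0].granularity() == 64 and channels[0].stripes() == 4
# Every address belongs to exactly one channel.
assert [c.contains(0x40) for c in channels] == [False, True, False, False]
```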
9409:e399b6c18b76 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
base: Simplify the AddrRangeMap by removing unused code
This patch cleans up the AddrRangeMap in preparation for the addition of interleaving by removing unused code. The non-const editions of find are never used, and hence the duplication is not needed. |
9407:deb866e1d768 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Tidy up bus addr range debug messages
This patch tidies up a number of the bus DPRINTFs related to range manipulation. In particular, it shifts the message about range changes to the start of the member function, and also adds information about when all ranges are received. |
9406:024edfcfcbbf |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Skip address mapper range checks to allow more flexibility
This patch makes the address mapper less stringent about checking the before and after ranges, i.e. the original and remapped ranges. The checks were not really necessary, and there are situations when the previous checks were too strict. |
9405:c0a0593510db |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
base: Encapsulate the underlying fields in AddrRange
This patch makes the start and end address private in a move to prevent direct manipulation and matching of ranges based on these fields. This is done so that a transition to ranges with interleaving support is possible.
As a result of hiding the start and end, a number of member functions are needed to perform the comparisons and manipulations that previously took place directly on the members. An accessor function is provided for the start address, and a function is added to test if an address is within a range. As a result of the latter the != and == operator is also removed in favour of the member function. A member function that returns a string representation is also created to allow debug printing.
In general, this patch does not add any functionality, but it does take us closer to a situation where interleaving (and more cleverness) can be added under the bonnet without exposing it to the user. More on that in a later patch. |
9404:c194718a592c |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove the joining of neighbouring ranges
This patch temporarily removes the joining of ranges when creating the backing store, to reserve this functionality for the interleaved ranges that are about to be introduced.
When creating the mmaps for the backing store, there is no point in creating larger contiguous chunks than what is necessary. The larger chunks will only make life more difficult for the host.
Merging will be re-added later, but then only for interleaved ranges. |
9398:6a348f61220c |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add tracing support in the communication monitor
This patch adds packet tracing to the communication monitor using a protobuf as the mechanism for creating the trace.
If no file is specified, then the tracing is disabled. If a file is specified, then for every packet that is successfully sent, a protobuf message is serialized to the file. |
9390:5490105626dc |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add sanity check to packet queue size
This patch adds a basic check to ensure that the packet queue does not grow absurdly large. The queue should only be used to store packets that were delayed due to blocking from the neighbouring port, and not for actual storage. Thus, a limit of 100 has been chosen for now (which is already quite substantial). |
9389:8f8c911ab5a7 |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
ruby: Fix missing cxx_header in Switch
This patch addresses a warning related to the swig interface generation for the Switch class. The cxx_header is now specified correctly, and the header in question has got a few includes added to make it all compile. |
9386:b08ec9cf2e3f |
07-Jan-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix a bug in the memory serialization file naming
This patch fixes a bug that caused multiple systems to overwrite each other's physical memory. The system name is now included in the filename such that this is avoided. |
9379:40250293a6ae |
07-Jan-2013 |
Ali Saidi <Ali.Saidi@ARM.com> |
cache: add note about where conflicts are handled |
9366:bf8eb26c7b7e |
11-Dec-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add support for prefetching to MESI protocol |
9364:e5fc9d588132 |
11-Dec-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: change slicc to allow for constructor args The patch adds support to slicc for recognizing arguments that should be passed to the constructor of a class. I did not like the fact that an explicit check was being carried on the type 'TBETable' to figure out the arguments to be passed to the constructor. The patch also moves some of the member variables that are declared for all the controllers to the base class AbstractController. |
9363:e2616dc035ce |
11-Dec-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add a prefetcher This patch adds a prefetcher for the ruby memory system. The prefetcher is based on a prefetcher implemented by others (well, I don't know who wrote the original). The prefetcher does stride-based prefetching, both unit and non-unit. It observes the misses in the cache and trains on these. After the training period is over, the prefetcher starts issuing prefetch requests to the controller. |
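The general idea of training on misses before issuing prefetches can be sketched as follows. This is a generic toy stride detector in Python, not the Ruby prefetcher itself: the class `StrideDetector`, the `train` threshold, and the single-stream design are all assumptions made for illustration.

```python
class StrideDetector:
    """Toy stride trainer: issue prefetches after `train` repeats of a stride."""

    def __init__(self, train: int = 2):
        self.train = train
        self.last_addr = None
        self.stride = 0
        self.hits = 0

    def observe_miss(self, addr: int):
        """Record a miss; return a prefetch address once trained, else None."""
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.stride:
                self.hits += 1          # same stride seen again
            else:
                self.stride = stride    # new candidate stride, restart training
                self.hits = 0
        self.last_addr = addr
        if self.hits >= self.train:
            return addr + self.stride
        return None


unit = StrideDetector()
# Unit stride of one 64-byte block: trained after the stride repeats twice.
assert [unit.observe_miss(a) for a in (0, 64, 128, 192)] == [None, None, None, 256]

non_unit = StrideDetector()
# Non-unit stride of two blocks is handled the same way.
assert [non_unit.observe_miss(a) for a in (0, 128, 256, 384)] == [None, None, None, 512]
```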
9362:d7f4abbf52e3 |
11-Dec-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: add functions for computing next stride/page address |
9356:b279bad40aa3 |
16-Nov-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
sim: have a curTick per eventq This patch adds a _curTick variable to an eventq. This variable is updated whenever an event is serviced in function serviceOne(), or all events up to a particular time are processed in function serviceEvents(). This change helps when there are eventqs that do not make use of curTick for scheduling events. |
9354:7691ec6b173b |
10-Nov-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: support functional accesses in garnet flexible network |
9353:b25c55c87d60 |
10-Nov-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: bug in functionalRead, revert recent changes Recent changes to functionalRead() in the memory system were not correct. The change allowed for returning data from the first message found in the buffers of the memory system. This is not correct since it is possible that a timing message has data from an older state of the block.
The changes are being reverted. |
9352:9bbde5309ac6 |
08-Nov-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix DRAM draining to ensure write queue is empty
This patch fixes the draining of the SimpleDRAM controller model. The controller performs buffering of writes and normally there is no need to ever empty the write buffer (if you have a fast on-chip memory, then use it). The patch adds checks to ensure the write buffer is drained when the controller is asked to do so. |
9350:ddb946b131c8 |
02-Nov-2012 |
Hamid Reza Khaleghzadeh <khaleghzadeh@gmail.com>, Lluc Alvarez <lluc.alvarez@bsc.es>, Nilay Vaish <nilay@cs.wisc.edu> |
ruby: reset and dump stats along with reset of the system This patch adds support to ruby so that the statistics maintained by ruby are reset/dumped when the statistics for the rest of the system are reset/dumped. For resetting the statistics, ruby now provides the resetStats() function that a sim object can provide. As a consequence, the clearStats() function has been removed from RubySystem. For dumping stats, Ruby now adds a callback event to the dumpStatsQueue. The exit callback that ruby used to add earlier is being removed.
Created by: Hamid Reza Khaleghzadeh. Improved by: Lluc Alvarez, Nilay Vaish Committed by: Nilay Vaish |
9349:844f9e724343 |
02-Nov-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: fix use after free issue in memories until 4-phase work complete. |
9347:b02075171b57 |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
mem: Add support for writing back and flushing caches
This patch adds support for the following optional drain methods in the classical memory system's cache model:
memWriteback() - Write back all dirty cache lines to memory using functional accesses.
memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested.
Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed. |
9342:6fec8f26e56d |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
sim: Move the draining interface into a separate base class
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. |
9338:97b4a2be1e5b |
02-Nov-2012 |
Andreas Sandberg <Andreas.Sandberg@arm.com> |
sim: Include object header files in SWIG interfaces
When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy.
This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce the use of the header attribute, but a warning will be generated for objects that do not use it. |
9332:ae2a5329ce96 |
02-Nov-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
ARM: dump stats and process info on context switches
This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM). |
9325:da1c9ac339f9 |
31-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Fix typo in port comments
This patch merely fixes a few typos in the port comments. |
9313:0ad73254027b |
25-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
dev: Make default clock more reasonable for system and devices
This patch changes the default system clock from 1THz to 1GHz. This clock is used by all modules that do not override the default (parent clock), and primarily affects the IO subsystem. Every DMA device uses its clock to schedule the next transfer, and the change will thus cause this inter-transfer delay to be longer.
The default clock of the bus is removed, as the clock inherited from the system provides exactly the same value.
A follow-on patch will bump the stats. |
9305:ac608464be80 |
18-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: functional access updates to network test protocol I had forgotten to change the network test protocol while making changes to ruby for supporting functional accesses. This patch updates the protocol so that it can compile correctly. |
9302:c2e70a9bc340 |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: improved support for functional accesses This patch adds support to different entities in the ruby memory system for more reliable functional read/write accesses. Only the simple network has been augmented as of now. Later on Garnet will also support functional accesses. The patch adds functional access code to all the different types of messages that protocols can send around. These messages are functionally accessed by going through the buffers maintained by the network entities. The patch also rectifies some of the bugs found in coherence protocols while testing the patch.
With this patch applied, functional writes always succeed. But functional reads can still fail. |
9300:7edfd33b40e2 |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: register multiple memory controllers Currently the Ruby System maintains a pointer to only one of the memory controllers. But there can be multiple controllers in the system. This patch adds a vector of memory controllers. |
9299:bfd2ccb8841b |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove AbstractMemOrCache The only place where this abstract class is in use is the memory controller, which itself is an abstract class. Does not seem useful at all. |
9298:9a087e046c58 |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: allow function definition in slicc structs This patch adds support for function definitions to appear in slicc structs. This is required for supporting functional accesses for different types of messages. Subsequent patches will make use of this. |
9297:b6d1e257d488 |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby banked array: do away with event scheduling It seems unnecessary that the BankedArray class needs to schedule an event to figure out when the access ends. Instead only the time for the end of access needs to be tracked. |
9296:f4ba9a861e65 |
15-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: reset timing after cache warm up Ruby system was recently converted to a clocked object. Such objects maintain state related to the time that has passed so far. During the cache warmup, Ruby system changes its own time and the global time. Later on, the global time is restored. So Ruby system also needs to reset its own time. |
9295:0b9fcd304b58 |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Fix incorrect logic in bus blocksize check
This patch fixes the logic in the blocksize check such that the warning is printed if the size is not 16, 32, 64 or 128. |
9294:8fb03b13de02 |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocol-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. |
9293:df7c3f99ebca |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Separate the host and guest views of memory backing store
This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate.
With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contiguous chunk per address range.
Functionality for future extensions which involve CPU virtualization also enables the host to get pointers to the backing store. |
9291:b27d3e9a333f |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use deque instead of list for bus retries
This patch changes the data structure used to keep track of ports that should be told to retry. As the bus is doing this in an FCFS way, there is no point having a list. A deque is a better match (and is at least in theory a better choice from a performance point of view). |
9290:90dd57ca9a7e |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Fix: Address a few minor issues identified by cppcheck
This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only gets called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used. |
9288:3d6da8559605 |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles, which actually counted in ticks, is now updated to count in cycles.
As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions. |
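The relationship between cycle-based latencies, the clock period, and clock-edge rounding can be sketched in Python. The function names and all numbers below are assumptions for illustration (including a tick of 1 ps and a clock edge at tick 0); they are not taken from the cache code.

```python
def cycles_to_ticks(cycles: int, clock_period: int) -> int:
    """Scale a latency in cycles by the clock period (in ticks)."""
    return cycles * clock_period


def next_clock_edge(tick: int, clock_period: int) -> int:
    """Round `tick` up to the next multiple of clock_period."""
    return -(-tick // clock_period) * clock_period


hit_latency = 2  # cycles; the same setting works for any cache frequency

# Assuming 1 tick = 1 ps: a 1 GHz cache has a 1000-tick period, 2 GHz has 500.
assert cycles_to_ticks(hit_latency, 1000) == 2000
assert cycles_to_ticks(hit_latency, 500) == 1000

# A request arriving between edges is first aligned to the next edge,
# which is why timing-sensitive results shift slightly.
arrival = 1250
done = next_clock_edge(arrival, 500) + cycles_to_ticks(hit_latency, 500)
assert done == 2500
```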
9279:8b16c3804bda |
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Use range operations in bus in preparation for striping
This patch transitions the bus to use the AddrRange operations instead of directly accessing the start and end. The change facilitates the move to a more elaborate AddrRange class that also supports address striping in the bus by specifying interleaving bits in the ranges.
Two new functions are added to the AddrRange to determine if two ranges intersect, and if one is a subset of another. The bus propagation of address ranges is also tweaked such that an update is only propagated if the bus received information from all the downstream slave modules. This avoids the iteration and need for the cycle-breaking scheme that was previously used. |
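The two range queries mentioned above can be illustrated with a short Python sketch. The `Range` class and its method names are hypothetical, and interleaving is ignored here; only plain start/end ranges are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    start: int
    end: int  # inclusive

    def intersects(self, other: "Range") -> bool:
        """True if the two ranges share at least one address."""
        return self.start <= other.end and other.start <= self.end

    def is_subset(self, other: "Range") -> bool:
        """True if every address in this range also lies in `other`."""
        return other.start <= self.start and self.end <= other.end


dram = Range(0x0000, 0xFFFF)
window = Range(0x8000, 0x8FFF)
io = Range(0x10000, 0x10FFF)

assert window.is_subset(dram) and window.intersects(dram)
assert not dram.is_subset(window)   # the larger range is not contained
assert not io.intersects(dram)      # adjacent but disjoint
```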
9278:6681c1027563 |
11-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Determine bus block size during initialisation
This patch moves the block size computation from findBlockSize to initialisation time, once all the neighbouring ports are connected.
There is no need to dynamically update the block size, and the caching of the value effectively avoided that anyhow. This is very similar to what was already in place, just with a slightly leaner implementation. |
9275:ef43e69c837a |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: makes some members non-static This patch makes some of the members (profiler, network, memory vector) of ruby system non-static. |
9274:ba635023d4bb |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: changes to simple network This patch makes the Switch structure inherit from BasicRouter, as is done in two other networks. |
9273:05b12cb19cc8 |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: rename template_hack to template I don't like using the word hack. Hence, the patch. |
9272:67c11eeafacf |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove unused code in protocols |
9271:3859f5d4f2c6 |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: remove some unused things in slicc This patch removes the parts of slicc that were required for multi-chip protocols. Going ahead, it seems multi-chip protocols would be implemented by playing with the network itself. |
9270:92aad0e984ff |
02-Oct-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: move functional access to ruby system This patch moves the code for functional accesses to ruby system. This is because the subsequent patches add support for making functional accesses to the messages in the interconnect. Making those accesses from the ruby port would be cumbersome. |
9269:4ece3d8d22fa |
30-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MI coherence protocol: add copyright notice |
9264:1607119c36bb |
25-Sep-2012 |
Djordje Kovacevic <djordje.kovacevic@arm.com> |
MEM: Put memory system document into doxygen |
9263:066099902102 |
25-Sep-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
Cache: add a response latency to the caches
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set on the cache for the backward path. |
9259:fc28f3ca5b21 |
25-Sep-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Add a gasket that allows memory ranges to be re-mapped.
For example if DRAM is at two locations and mirrored this patch allows the mirroring to occur. |
9245:e215ee9db617 |
23-Sep-2012 |
Joel Hestness <hestness@cs.wisc.edu> |
RubyPort and Sequencer: Fix draining
Fix the drain functionality of the RubyPort to only call drain on child ports during a system-wide drain process, instead of calling each time that a ruby_hit_callback is executed.
This fixes the issue of the RubyPort ports being reawakened during the drain simulation, possibly with work they didn't previously have to complete. If they have new work, they may call process on the drain event that they had not registered work for, causing an assertion failure when completing the drain event.
Also, in RubyPort, set the drainEvent to NULL when there are no events to be drained. If not set to NULL, the drain loop can result in stale drainEvents being used. |
9243:9b6ff962d62f |
21-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
DRAM: Introduce SimpleDRAM to capture a high-level controller
This patch introduces a high-level model of a DRAM controller, with a basic read/write buffer structure, a selectable and customisable arbiter, a few address mapping options, and the basic DRAM timing constraints. The parameters make it possible to turn this model into any desired DDRx/LPDDRx/WideIOx memory controller.
The intention is not to be cycle accurate or capture every aspect of a DDR DRAM interface, but rather to enable exploring of the high-level knobs with a good simulation speed. Thus, contrary to e.g. DRAMSim this module emphasizes simulation speed with a good-enough accuracy.
This module is merely a starting point, and there are plenty of additions and improvements to come. A notable addition is the support for address-striping in the bus to enable a multi-channel DRAM controller. Also note that there are still a few "todo's" in the code base that will be addressed as we go along.
A follow-up patch will add basic performance regressions that use the traffic generator to exercise a few well-defined corner cases. |
9240:7d506c3ef13d |
21-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Tidy up bus member variables types
This patch merely tidies up the types used for the bus member variables. It also makes the constant ones const. |
9237:cb942df51335 |
20-Sep-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
bus: removed outdated warn regarding 64 B block sizes
this warn is outdated as 64 B blocks are very common, and even the default size for some CPU types. E.g., arm_detailed. |
9236:c38988024f1f |
19-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Remove the file parameter from AbstractMemory
This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contiguous host memory backing store, and the gem5 memory controllers.
Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contiguous ranges to the user.
To the best of my knowledge there is no use of the parameter, so the change should not affect anyone. |
9235:5aa4896ed55a |
19-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
AddrRange: Transition from Range<T> to AddrRange
This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap.
In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. |
9231:cecc64db9b3b |
18-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: eliminate typedef integer_t |
9230:33eb3c8a98b9 |
18-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: avoid using g_system_ptr for event scheduling This patch removes the use of g_system_ptr for event scheduling. Each consumer object now needs to specify upfront an EventManager object it would use for scheduling events. This makes the ruby memory system more amenable for a multi-threaded simulation. |
9228:bbdca4088834 |
18-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Add a maximum bandwidth to SimpleMemory
This patch makes a minor addition to the SimpleMemory by enforcing a maximum data rate. The bandwidth is configurable, and a reasonable value (12.8GB/s) has been chosen as the default.
The changes do add some complexity to the SimpleMemory, but they should definitely be justifiable as this enables a far more realistic setup using even this simple memory controller.
The rate regulation is done for reads and writes combined to reflect the bidirectional data busses used by most (if not all) relevant memories. Moreover, the regulation is done per packet as opposed to long term, as it is the short term data rate (data bus width times frequency) that is the limiting factor.
A follow-up patch bumps the stats for the regressions. |
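The per-packet regulation described in this entry can be sketched as follows. This is a minimal illustration and not the SimpleMemory code: the function names, the 1 ps tick resolution, and the 64-byte packet size are assumptions chosen only to make the arithmetic concrete.

```python
# Illustrative sketch only; names and the 1 ps tick are assumptions.
TICKS_PER_SECOND = 10**12  # assumed tick resolution of 1 ps


def occupancy_ticks(size_bytes, bandwidth_bytes_per_sec):
    """Ticks one packet occupies the data bus (rounded up)."""
    num = size_bytes * TICKS_PER_SECOND
    return -(-num // bandwidth_bytes_per_sec)  # ceiling division


def schedule(packets, bandwidth_bytes_per_sec):
    """Return the tick at which each packet may start.

    `packets` is a list of (arrival_tick, size_bytes). Reads and writes
    share one budget, so a single `busy_until` covers both directions.
    """
    busy_until = 0
    starts = []
    for arrival, size in packets:
        start = max(arrival, busy_until)
        starts.append(start)
        busy_until = start + occupancy_ticks(size, bandwidth_bytes_per_sec)
    return starts
```

Under these assumptions, a 64-byte packet at the 12.8 GB/s default occupies the bus for 5000 ticks (5 ns), so two packets arriving together are spaced 5 ns apart.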
9224:b0539d08bda8 |
14-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
scons: Use c++0x with gcc >= 4.4 instead of 4.6
This patch shifts the version of gcc for which we enable c++0x from 4.6 to 4.4. The more long term plan is to see what the c++0x features can bring and what level of support would be enabled simply by bumping the required version of gcc from 4.3 to 4.4.
A few minor things had to be fixed in the code base, most notably the choice of a hashmap implementation. In the Ruby Sequencer there were also a few minor issues that gcc 4.4 was not too happy about. |
9219:258753d3bc47 |
12-Sep-2012 |
Jason Power <power.jg@gmail.com> |
Ruby: Modify Scons so that we can put .sm files in extras Also allows for header files which are required in slicc generated code to be in a directory other than src/mem/ruby/slicc_interface. |
9216:a5f937d152bf |
11-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
clang: Fix issues identified by the clang static analyzer
This patch addresses a few minor issues reported by the clang static analyzer.
The analysis was run with:
scan-build -disable-checker deadcode \
    -enable-checker experimental.core \
    -disable-checker experimental.core.CastToStruct \
    -enable-checker experimental.cplusplus |
9214:a42caed28e1f |
11-Sep-2012 |
Lena Olson <lena@cs.wisc.edu> |
Cache: Split invalidateBlk up to separate block vs. tags
This separates the functionality to clear the state in a block into blk.hh and the functionality to update the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing. |
9209:f9633070689b |
11-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Use uint32_t instead of uint32 everywhere |
9208:2451e60d4555 |
11-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Use uint8_t instead of uint8 everywhere |
9206:f6483789d23a |
10-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby System: Convert to Clocked Object This patch moves Ruby System from being a SimObject to recently introduced ClockedObject. |
9205:cc41d310241f |
10-Sep-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Slicc: remove the call to cin.get() function If I understand correctly, this was put in place so that a debugger can be attached when the protocol aborts. While this sounds useful, it is a problem when the simulation is not being actively monitored. I think it is better to remove this. |
9203:939077a54014 |
10-Sep-2012 |
Marco Elver <marco.elver@ed.ac.uk> |
Mem: Allow serializing of more than INT_MAX bytes
Despite gzwrite taking an unsigned for length, it returns an int for bytes written; gzwrite fails if (int)len < 0. Because of this, call gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if data to be written is larger than INT_MAX. |
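The chunking rule in this entry can be sketched as below. This is only an illustration of the splitting logic, not the serialization code itself: `chunk_lengths` is a hypothetical helper, and `INT_MAX` is assumed to be that of a 32-bit signed int.

```python
# Illustrative sketch only; chunk_lengths is a hypothetical helper.
INT_MAX = 2**31 - 1  # assumes a 32-bit signed int


def chunk_lengths(total_len, max_chunk=INT_MAX):
    """Split total_len bytes into write sizes no larger than max_chunk."""
    lengths = []
    remaining = total_len
    while remaining > 0:
        n = min(remaining, max_chunk)
        lengths.append(n)
        remaining -= n
    return lengths
```

Each returned length would be passed to one write call, so no single call ever sees a length that turns negative when cast to int.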
9184:a1a8f137b796 |
07-Sep-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Param: Transition to Cycles for relevant parameters
This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition.
An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py. |
9182:43da8ae0f36e |
05-Sep-2012 |
Joel Hestness <hestness@cs.wisc.edu> |
Ruby Memory Controller: Fix clocking |
9181:42807286d6cb |
28-Aug-2012 |
Jason Power <power.jg@gmail.com> |
Ruby: Correct DataBlock =operator The =operator for the DataBlock class was incorrectly interpreting the class member m_alloc. This variable stands for whether the assigned memory for the data block needs to be freed or not by the class itself. It seems that the =operator interpreted the variable as whether the memory is assigned to the data block. This wrong interpretation was causing values not to propagate to RubySystem::m_mem_vec_ptr. This caused major issues with restoring from checkpoints when using a protocol which verified that the cache data was consistent with the backing store (i.e. MOESI-hammer). |
9180:ee8d7a51651d |
28-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Clock: Add a Cycles wrapper class and use where applicable
This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time.
Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though.
The changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles.
In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words.
An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes. |
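The idea of a wrapper type that keeps relative cycle counts apart from plain integers and ticks can be sketched as follows. This is a hypothetical Python illustration, not the Cycles class added by the patch; the method set and `cycles_to_ticks` are assumptions.

```python
# Illustrative sketch only; not the gem5 Cycles class.
class Cycles:
    """Hypothetical wrapper marking a value as a relative cycle count."""

    def __init__(self, count):
        if not isinstance(count, int) or count < 0:
            raise ValueError("cycle count must be a non-negative int")
        self.count = count

    def __add__(self, other):
        # Only Cycles may be added to Cycles; a bare int is rejected.
        if not isinstance(other, Cycles):
            return NotImplemented
        return Cycles(self.count + other.count)

    def __eq__(self, other):
        return isinstance(other, Cycles) and self.count == other.count

    def __repr__(self):
        return f"Cycles({self.count})"


def cycles_to_ticks(cycles, clock_period_ticks):
    """Convert a relative cycle count to ticks for a given clock period."""
    return cycles.count * clock_period_ticks
```

Because mixing a wrapped count with a bare number raises a TypeError, code that still treats cycles as int/uint/Tick shows up as an error instead of silently computing the wrong time.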
9178:6a0ff1770e6e |
28-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Stricter port bind/unbind semantics
This patch tightens up the semantics around port binding and checks that the ports that are being bound are currently not connected, and similarly connected before unbind is called.
The patch consequently also changes the order of the unbind and bind for the switching of CPUs to ensure that the rules are adhered to. Previously the ports would be "over-written" without any check.
There are no changes in behaviour due to this patch, and the only place where the unbind functionality is used is in the CPU. |
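The stricter semantics can be sketched as below: binding requires both ends to be unconnected, unbinding requires an existing connection, and a CPU switch therefore unbinds before it binds. The `Port` class and `switch_cpu` function are hypothetical and only mirror the rules stated in this entry.

```python
# Illustrative sketch only; Port and switch_cpu are hypothetical.
class Port:
    """Hypothetical port with strict bind/unbind checks."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def bind(self, other):
        # Both ends must be unconnected before binding.
        if self.peer is not None or other.peer is not None:
            raise RuntimeError("bind: port already connected")
        self.peer, other.peer = other, self

    def unbind(self):
        # A port must be connected before it can be unbound.
        if self.peer is None:
            raise RuntimeError("unbind: port not connected")
        self.peer.peer = None
        self.peer = None


def switch_cpu(old_cpu_port, new_cpu_port):
    """Move the old CPU's peer to the new CPU: unbind first, then bind."""
    peer = old_cpu_port.peer
    old_cpu_port.unbind()
    new_cpu_port.bind(peer)
```

Reversing the two calls in `switch_cpu` would raise, which is the "over-writing" case the patch is meant to catch.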
9173:631daf17b0be |
27-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: remove README.debugging and Decommissioning_note These files were relevant when Ruby was part of GEMS. They are not required any longer. |
9171:ae88ecf37145 |
27-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Remove RubyEventQueue This patch removes RubyEventQueue. Consumer objects now rely on RubySystem or themselves for scheduling events. |
9170:88d422d737db |
27-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Memory Vector: Allow more than 4GB of memory The memory size variable was a 32-bit int. This meant that the size of the memory was limited to 4GB. This patch changes the type of the variable to 64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian for bringing this to notice. |
9168:4dc0fc0f68c2 |
25-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MESI Protocol: Correct the virtual network in profile functions The virtual network in a couple of places was incorrectly mentioned as 3 in place of 1. This is being corrected. |
9167:b8de57c70759 |
25-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MESI Coherence Protocol: Add copyright notice |
9165:f9e3dac185ba |
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Remove NACKs from packet and its use in endpoints
This patch removes the NACK from the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that).
The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occurred in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe.
Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up. |
9164:d112473185ea |
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bridge: Remove NACKs in the bridge and unify with packet queue
This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them.
The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be completely removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split).
As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch.
A bit of tidying up has also been done as part of the simplifications.
Surprisingly, this patch has no impact on any of the regressions. Hence, there were never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up. |
9163:3b5e13ac1940 |
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Extend the QueuedPort interface and use where appropriate
This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value.
This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit. |
9160:584662eaaecf |
21-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
PacketQueue: Allow queuing in the same tick as desired send tick
This patch allows packets to be enqueued in the same tick as they are intended to be sent. This does not imply they actually are sent that tick, although that is possible.
This change is useful for modules that use the queued ports primarily to avoid handling the flow control involved in sending and retrying packets. |
9157:e0bad9d7bbd6 |
21-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Clock: Move the clock and related functions to ClockedObject
This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach to clock domains.
The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules.
The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced). |
9155:4c67c26fa76e |
19-Aug-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Banked Array: add copyrights |
9154:198352d722e4 |
17-Aug-2012 |
Jason Power <power.jg@gmail.com> |
Ruby: Add RubySystem parameter to MemoryControl This guarantees that RubySystem object is created before the MemoryController object is created. |
9152:86c0e6ca5e7c |
15-Aug-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs
This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements.
This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation. |
9148:a7a72f42919e |
10-Aug-2012 |
Jason Power <powerjg@cs.wisc.edu> |
Ruby: Clean up topology changes This patch moves instantiateTopology into Ruby.py and removes the mem/ruby/network/topologies directory. It also adds some extra inheritance to the topologies to clean up some issues in the existing topologies. |
9145:42dd80dee4dd |
06-Aug-2012 |
Steve Reinhardt <steve.reinhardt@amd.com> |
SETranslatingPortProxy: fix bug in tryReadString()
Off-by-one loop termination meant that we were stuffing the terminating '\0' into the std::string value, which makes for difficult-to-debug string comparison failures. |
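The intended termination behaviour can be sketched as follows: the loop stops at the terminating NUL without appending it to the result. `read_c_string` and the flat `memory` buffer are hypothetical stand-ins, not the port proxy interface.

```python
# Illustrative sketch only; read_c_string is a hypothetical helper.
def read_c_string(memory, addr):
    """Read bytes from `memory` starting at `addr` up to, but not
    including, the terminating NUL. Returns None if no NUL is found."""
    chars = []
    while addr < len(memory):
        byte = memory[addr]
        if byte == 0:
            # Stop before the terminator so it never enters the string.
            return bytes(chars).decode("latin-1")
        chars.append(byte)
        addr += 1
    return None
```

Appending the byte before testing it for zero would return "abc\0" instead of "abc", and the two compare unequal even though they look the same when printed.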
9138:b4d0bdb52694 |
01-Aug-2012 |
Jason Power <powerjg@cs.wisc.edu> |
Ruby NetDest: add assert for bad element in netdest |
9131:b6b4d41ba9b9 |
27-Jul-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
cache: don't allow dirty data in the i-cache
removes the optimization that forwards an exclusive copy to a requester on a read, only for the i-cache. this optimization isn't necessary because we typically won't be writing to the i-cache. |
9128:6921ec2e77c4 |
23-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bridge: Use EventWrapper instead of Event subclass for sendEvent
This patch simply cleans up the code by making use of the EventWrapper convenience class to schedule the sendEvent in the bridge ports. |
9120:48eeef8a0997 |
12-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Make SimpleMemory single ported
This patch changes the simple memory to have a single slave port rather than a vector port. The simple memory makes no attempts at modelling the contention between multiple ports, and any such multiplexing and demultiplexing could be done in a bus (or crossbar) outside the memory controller. This scenario also matches with the ongoing work on a SimpleDRAM model, which will be a single-ported single-channel controller that can be used in conjunction with a bus (or crossbar) to create a multi-port multi-channel controller.
There are only very few regressions that make use of the vector port, and these are all for functional accesses only. To facilitate these cases, memtest and memtest-ruby have been updated to also have a "functional" bus to perform the (de)multiplexing of the functional memory accesses. |
9117:49116b947194 |
12-Jul-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: remove config information from ruby.stats This patch removes printConfig() functions from all structures in Ruby. Most of the information is already part of config.ini, and wherever it is not, it will be added in due course. |
9116:9171e26543fa |
12-Jul-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: remove some unused stuff from SLICC files |
9114:8b0ce484dfdc |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: improved DRAM reset comment |
9109:6bce09259194 |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed fatal print statement |
9107:66b2e1ce53da |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed msgptr print call |
9106:aa9b75db7ea0 |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
imported patch jason/slicc-external-structure-fix |
9105:b576c490e7d1 |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: banked cache array resource model
This patch models a cache as separate tag and data arrays. The patch exposes the banked array as another resource that is checked by SLICC before a transition is allowed to execute. This is similar to how TBE entries and slots in output ports are modeled. |
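The resource check described in this entry can be sketched as below: a bank that is still busy makes the access fail, and the caller stalls the transition and retries later. The class, its method names, and the modulo bank mapping are assumptions for illustration, not the Ruby implementation.

```python
# Illustrative sketch only; names and the bank mapping are assumptions.
class BankedArray:
    """Hypothetical banked array: each bank is busy for a fixed latency."""

    def __init__(self, num_banks, access_latency):
        self.num_banks = num_banks
        self.access_latency = access_latency
        self.busy_until = [0] * num_banks  # tick at which each bank frees up

    def bank_of(self, index):
        return index % self.num_banks

    def try_access(self, index, now):
        """Reserve the bank for `index` if it is free at tick `now`.

        Returns True and marks the bank busy, or False if the caller
        must stall the transition and retry later.
        """
        bank = self.bank_of(index)
        if now < self.busy_until[bank]:
            return False
        self.busy_until[bank] = now + self.access_latency
        return True
```

A transition would proceed only if `try_access` succeeds for every array it touches, in the same way it would wait for a free TBE entry or output port slot.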
9104:27d56b644e78 |
11-Jul-2012 |
Joel Hestness <hestness@cs.utexas.edu> |
ruby: tag and data cache access support
Updates to Ruby to support statistics counting of cache accesses. This feature serves multiple purposes beyond simple stats collection. It provides the foundation for ruby to model the cache tag and data arrays as physical resources, as well as provide the necessary input data for McPAT power modeling. |
9103:956796e06b7f |
11-Jul-2012 |
Nuwan Jayasena <Nuwan.Jayasena@amd.com> |
ruby: adds reset function to Ruby memory controllers |
9102:5464eb9a684b |
11-Jul-2012 |
Nuwan Jayasena <Nuwan.Jayasena@amd.com> |
ruby: memory controllers now inherit from an abstract "MemoryControl" class |
9100:3caf131d7a95 |
11-Jul-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: changes how Topologies are created
Instead of just passing a list of controllers to the makeTopology function in src/mem/ruby/network/topologies/<Topo>.py we pass in a function pointer which knows how to make the topology, possibly with some extra state set in the configs/ruby/<protocol>.py file. Thus, we can move all of the files from network/topologies to configs/topologies. A new class BaseTopology is added which all topologies in configs/topologies must inherit from and follow its API. |
9098:7909b6cf7188 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Mem: Make members relating to range and size constant
This patch makes the address-range related members const. The change is trivial and merely ensures that they can be called on a const memory. |
9097:4e1ceddba87b |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Hide the queue implementation in SimpleTimingPort
This patch makes the queue implementation in the SimpleTimingPort private to avoid confusion with the protected member queue in the QueuedSlavePort. The SimpleTimingPort provides the queue_impl to the QueuedSlavePort and it can be accessed via the reference in the base class. The use of the member name queue is thus no longer overloaded. |
9095:0e6bd7082fac |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics.
Most notably, the CPU, cache, and bus are addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus. |
9094:407c06cd29a3 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Make the default bus width 8 bytes instead of 64
This patch changes the default bus width to a more sensible 8 bytes (64 bits), which is in line with most on-chip buses. Although there are cases where a wider or narrower bus is useful, the 8 bytes is a good compromise to serve as the default.
This patch changes essentially all statistics, and will be bundled with the outstanding changes to the bus. |
9093:d33332605782 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Split the bus into separate request/response layers
This patch splits the existing buses into multiple layers. The non-coherent bus is split into a request and a response layer, and the coherent bus adds an additional layer for the snoop responses. The layer is modified to be templatised on the port type, such that the different layers can have retryLists with either master or slave ports. This patch also removes the dynamic cast from the retry, as previously promised when moving the recvRetry from the port base class to the master/slave port respectively.
Overall, the split bus more closely reflects any modern on-chip bus and should be a step in the right direction. From this point, it would be reasonably straightforward to add separate layers (and thus contention points and arbitration) for each port and thus create a true crossbar.
The regressions all produce the correct output, but have varying degrees of changes to their statistics. A separate patch will be pushed with the updates to the reference statistics. |
9092:bb27575ebb33 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Add a notion of layers to the buses
This patch moves all flow control, arbitration and state information into a bus layer. The layer is thus responsible for all the state transitions, and for keeping hold of the retry list. Consequently the layer is also responsible for the draining.
With this change, the non-coherent and coherent bus are given a single layer to avoid changing any temporal behaviour, but the patch opens up for adding more layers. |
9091:9b29b9a4dda6 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Replace tickNextIdle and inRetry with a state variable
This patch adds a state enum and member variable in the bus, tracking the bus state, thus eliminating the need for tickNextIdle and inRetry, and fixing an issue that allowed the bus to be occupied by multiple packets at once (hopefully it also makes it easier to understand the code).
The bus, in its current form, uses tickNextIdle and inRetry to keep track of the state of the bus. However, it only updates tickNextIdle _after_ forwarding a packet using sendTiming, and the result is that the bus is still seen as idle, and a module that receives the packet and starts transmitting new packets in zero time will still see the bus as idle (and this is done by a number of DMA devices). The issue can also be seen in isOccupied where the bus calls reschedule on an event instead of schedule.
This patch addresses the problem by marking the bus as _not_ idle already by the time we conclude that the bus is not occupied and we will deal with the packet.
As a result of not allowing multiple packets to occupy the bus, some regressions have slight changes in their statistics. A separate patch updates these accordingly.
Further ahead, a follow-on patch will introduce a separate state variable for request/responses/snoop responses, and thus implement a split request/response bus with separate flow control for the different message types (even further ahead it will introduce a multi-layer bus). |
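The state-variable approach can be sketched as follows. The key point is that the bus is marked busy at the moment a packet is accepted, before it is forwarded, so a receiver that replies in zero time cannot see an idle bus. The state names, class, and methods are assumptions for illustration and do not reproduce the bus code.

```python
# Illustrative sketch only; states and methods are assumptions.
from enum import Enum


class BusState(Enum):
    IDLE = 0
    BUSY = 1
    RETRY = 2


class Bus:
    """Hypothetical bus that tracks its occupancy with one state variable."""

    def __init__(self):
        self.state = BusState.IDLE
        self.retry_list = []
        self.retrying = None  # port currently being offered a retry

    def try_send(self, port):
        """Accept a packet from `port`, or queue the port for a retry."""
        blocked = self.state == BusState.BUSY or (
            self.state == BusState.RETRY and port != self.retrying)
        if blocked:
            if port not in self.retry_list:
                self.retry_list.append(port)
            return False
        # Mark the bus busy before the packet is forwarded.
        self.retrying = None
        self.state = BusState.BUSY
        return True

    def release(self):
        """End the occupancy; return the port to retry, if any."""
        if self.retry_list:
            self.state = BusState.RETRY
            self.retrying = self.retry_list.pop(0)
            return self.retrying
        self.state = BusState.IDLE
        return None
```

With a single state there is no window in which the bus has accepted a packet but still reports itself as idle, which is the window that let multiple packets occupy it at once.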
9090:e4e22240398f |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Make getAddrRanges const
This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects. |
9089:da918cb3462e |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add getAddrRanges to master port (asking slave port)
This patch adds getAddrRanges to the master port, and thus avoids going through getSlavePort to be able to ask the slave. Similar to the previous patch that added isSnooping to the SlavePort, this patch aims to introduce an additional level of hierarchy in the ports (base port being protocol-agnostic) and getSlave/MasterPort will return port pointers to these base classes.
The function is named getAddrRanges also on the master port, but does nothing besides asking the connected slave port. The slave port, as before, has to provide an implementation and actually produce a list of address ranges. The initial design used the name getSlaveAddrRanges for the new function, but the more verbose name was later changed. |
9088:73eeda352933 |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going through getMasterPort to be able to ask the master. Over the course of the next few patches, all getMasterPort/getSlavePort in Port and MemObject are to be protocol agnostic, and the snooping is part of the protocol layer.
The function is already present on the master port, where it is implemented by the module itself, e.g. a cache. On the slave side, it is merely asking the connected master port. The same name is used by both functions despite their difference in behaviour. The initial design used isMasterSnooping on the slave port side, but the more verbose function name was later changed. |
9087:b5a084a6159b |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class.
The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway.
The patch has no impact on any regressions. |
9086:496304c8017d |
09-Jul-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Fix: Address a few benign memory leaks
This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the destructor. The fix still has value as it removes false positives in the static analysis. |
9084:ace8383f2b7e |
29-Jun-2012 |
Lena Olson <lena@cs.wisc.edu> |
Cache: Fix the LRU policy for classic memory hierarchy
The LRU policy always evicted the least recently touched way, even if it contained valid data and another way was invalid, as can happen if a block has been invalidated by coherence. This can result in caches never warming up even though they are replacing blocks. This modifies the LRU policy to move blocks to LRU position on invalidation. |
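A minimal sketch of the behaviour described above, assuming a hypothetical `LruSet` class that tracks recency for a single cache set. It is not gem5's tag-store code; it only illustrates why moving an invalidated way to the LRU position makes it the next victim.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for one cache set; not gem5's LRU tag class.
class LruSet
{
  public:
    explicit LruSet(std::size_t ways) : valid(ways, false)
    {
        // Front of the list is the MRU position, back is the LRU position.
        for (std::size_t w = 0; w < ways; ++w)
            order.push_back(w);
    }

    // A hit or fill makes the way valid and most recently used.
    void touch(std::size_t way)
    {
        assert(way < valid.size());
        valid[way] = true;
        order.remove(way);
        order.push_front(way);
    }

    // Invalidation moves the way to the LRU position so it is reused first.
    void invalidate(std::size_t way)
    {
        assert(way < valid.size());
        valid[way] = false;
        order.remove(way);
        order.push_back(way);
    }

    // The victim is always the way at the LRU position.
    std::size_t victim() const { return order.back(); }

    bool isValid(std::size_t way) const { return valid[way]; }

  private:
    std::vector<bool> valid;
    std::list<std::size_t> order;
};
```

With this arrangement the victim choice never has to scan for invalid ways: an invalid way is already at the LRU position.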
9083:fe8355ca560e |
29-Jun-2012 |
Uri Wiener <uri.wiener@arm.com> |
Bus: enable non/coherent buses sub-classes This patch merely changes several methods to be virtual in order to enable non/coherent buses sub-classes. |
9082:7f95b7f56577 |
29-Jun-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
Mem: fix master id assertion in cache_impl.hh The assertion was applied to the wrong packet. This patch fixes the issue reported by Xiang Jiang on the gem5-dev mailing list. |
9080:753fc1c3618c |
29-Jun-2012 |
Matt Evans <matt.evans@arm.com> |
Mem: Fix a livelock resulting in LLSC/locked memory access implementation.
Currently when multiple CPUs perform a load-linked/store-conditional sequence, the loads all create a list of reservations which is then scanned when the stores occur. A reservation matching the context and address of the store is sought, BUT all reservations matching the address are also erased at this point.
The upshot is that a store-conditional will remove all reservations even if the store itself does not succeed. A livelock was observed using 7-8 CPUs where a thread would erase the reservations of other threads, not succeed, loop and put its own reservation in again only to have it blown by another thread that unsuccessfully now tries to store-conditional -- no forward progress was made, hanging the system.
The correct way to do this is to only blow a reservation when a store (conditional or not) actually /occurs/ to its address. One thread always wins (the one that does the store-conditional first). |
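The rule above can be sketched as follows. `LockedMemory`, `Reservation` and the method names are hypothetical and chosen for illustration; this is not the gem5 implementation, only a small model in which a failed store-conditional leaves other reservations untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reservation record; names are illustrative only.
struct Reservation
{
    int context;        // CPU/thread context that issued the load-linked
    std::uint64_t addr; // reserved address
};

class LockedMemory
{
  public:
    // Load-linked: record (or refresh) this context's reservation.
    void loadLinked(int context, std::uint64_t addr)
    {
        clearContext(context);
        reservations.push_back({context, addr});
    }

    // Store-conditional: succeeds only if this context still holds a
    // reservation on addr. A failed attempt erases nothing.
    bool storeConditional(int context, std::uint64_t addr)
    {
        auto it = std::find_if(reservations.begin(), reservations.end(),
            [&](const Reservation &r) {
                return r.context == context && r.addr == addr;
            });
        if (it == reservations.end())
            return false; // no store occurs, so no reservation is blown
        store(addr);      // the store actually occurs
        return true;
    }

    // Any store that actually occurs clears all reservations on addr.
    void store(std::uint64_t addr)
    {
        reservations.erase(
            std::remove_if(reservations.begin(), reservations.end(),
                [&](const Reservation &r) { return r.addr == addr; }),
            reservations.end());
    }

    std::size_t numReservations() const { return reservations.size(); }

  private:
    // Drop any earlier reservation held by this context.
    void clearContext(int context)
    {
        reservations.erase(
            std::remove_if(reservations.begin(), reservations.end(),
                [&](const Reservation &r) { return r.context == context; }),
            reservations.end());
    }

    std::vector<Reservation> reservations;
};
```

In this model the first context to store-conditionally write an address wins, and later attempts by other contexts fail without disturbing anyone else.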
9076:fefce4388397 |
29-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Only invalidate a line in the cache when an uncacheable write is seen. |
9063:965c042379df |
07-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Delay deleting of incoming packets by one call.
This patch is a temporary fix until Andreas' four-phase patches get reviewed and committed. Removing FastAlloc seems to have exposed an issue which previously was reasonably rare in which packets are freed before the sending cache is done with them. This change puts incoming packets on a pendingDelete queue, whose entries are deleted at the start of the next call, and thus breaks the dependency between when the caller returns true and when the packet is actually used by the sending cache.
Running valgrind on a multi-core linux boot and the memtester results in no valgrind warnings. |
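A minimal sketch of the deferred-delete idea, assuming a hypothetical `Receiver` class and a `Packet` type that counts live instances. Only the name pendingDelete comes from the commit message; everything else is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical packet type that counts live instances for illustration.
struct Packet
{
    static int live;
    Packet() { ++live; }
    ~Packet() { --live; }
};
int Packet::live = 0;

class Receiver
{
  public:
    // Takes ownership of pkt but defers freeing it until the next call,
    // so the sender can still touch the packet after this call returns.
    bool recvTiming(Packet *pkt)
    {
        // Packets queued by the previous call are freed here.
        pendingDelete.clear();
        pendingDelete.emplace_back(pkt);
        return true;
    }

    std::size_t numPending() const { return pendingDelete.size(); }

  private:
    std::vector<std::unique_ptr<Packet>> pendingDelete;
};
```

The packet handed to one call therefore stays alive until the following call begins.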
9053:9cad1c26c3b3 |
05-Jun-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
Mem: add per-master stats to physmem
Added per-master stats (similar to cache stats) to physmem. |
9044:904ddeecc653 |
05-Jun-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
sim: Remove FastAlloc
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM. |
9036:6385cf85bf12 |
31-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses.
A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses.
A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect.
The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses.
A bit of minor tidying up has also been done. |
9033:cbe7d60037f3 |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Remove redundant packet parameter from isOccupied
This patch merely removes the Packet* from the isOccupied member function. Historically this was used to check if the packet was an express snoop, but this is now done outside this function (where relevant). |
9032:42dfc00ee1a1 |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Turn the PortId into a transport function parameter
The main aim of this patch is to arrive at a suitable port interface for vector ports, including both the packet and the port id. This patch changes the bus transport functions (recvFunctional/Atomic/Timing) to require a PortId parameter indicating the source port. Previously this information was passed by setting the source field of the packet, and this is only required in the case of a timing request.
With this patch, the use of the source and destination field is also more restrictive, as they are only needed for timing accesses. The modifications to these fields for atomic snoops is now removed entirely, also making minor modifications to the cache. |
9031:32ecc0217c5e |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Unify the use of PortID in packet and port
This patch removes the Packet::NodeID typedef and unifies it with the Port::PortId. The src and dest fields in the packet are used to hold a port id (e.g. in the bus), and thus the two should actually be the same.
The typedef PortID is now global (in base/types.hh) and aligned with the ThreadID in terms of capitalisation and naming of the InvalidPortID constant.
Before this patch, two flags were used for valid destination and source, rather than relying on a named value (InvalidPortID), and this is now redundant, as the src and dest field themselves are sufficient to tell whether the current value is a valid port identifier or not. Consequently, the VALID_SRC and VALID_DST are removed.
As part of the cleaning up, a number of int parameters and local variables are updated to use PortID.
Note that Ruby still has its own NodeID typedef. Furthermore, the MemObject getMaster/SlavePort still has an int idx parameter with a default value of -1 which should eventually change to PortID idx = InvalidPortID. |
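A minimal sketch of relying on a named sentinel instead of separate validity flags. The typedef and constant below are assumptions written for illustration (the real definitions are in base/types.hh), and the `Packet` here is a hypothetical reduction, not the gem5 class.

```cpp
#include <cassert>
#include <cstdint>

// Assumed definitions for illustration; see base/types.hh for the real ones.
typedef std::int16_t PortID;
const PortID InvalidPortID = -1;

// Hypothetical reduced packet: validity is encoded in the field itself,
// so no VALID_SRC / VALID_DST flags are needed.
struct Packet
{
    PortID src = InvalidPortID;
    PortID dest = InvalidPortID;

    bool isSrcValid() const { return src != InvalidPortID; }
    bool isDestValid() const { return dest != InvalidPortID; }

    void setSrc(PortID id) { assert(id != InvalidPortID); src = id; }
    void setDest(PortID id) { assert(id != InvalidPortID); dest = id; }
    void clearSrc() { src = InvalidPortID; }
    void clearDest() { dest = InvalidPortID; }
};
```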
9030:047bd5f02c6e |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Updated comments for src and dest fields
This patch updates the comments for the src and dest fields to reflect their actual use. Due to a number of patches (e.g. removing the Broadcast flag), the old comments are no longer indicative of the current usage. |
9029:120ba616606e |
30-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bridge: Split deferred request, response and sender state
This patch splits the PacketBuffer class into a RequestState and a DeferredRequest and DeferredResponse. Only the requests need a SenderState, and the deferred requests and responses only need an associated point in time for the request and the response queue.
Besides the cleaning up, the goal is to simplify the transition to a new port handshake. With these changes, the two packet queues are starting to look very similar to the generic packet queue, but currently they do a few unique things relating to the NACK and counting of requests/responses, such that the packet queue cannot be conveniently used. This will be addressed in a later patch. |
9019:ea7d6873af6e |
24-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Cache: Remove dangling doWriteback declaration
This patch removes the declaration of doWriteback as there is no implementation for this member function. |
9018:4fbbd05809d2 |
23-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Cleaning up packet command and attribute
This patch removes unused commands and attributes from the packet to avoid any confusion. It is part of an effort to clear up how and where different commands and attributes are used. |
9012:6d64aa6a26af |
22-May-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Remove the unused src/mem/ruby/common/Driver.* files. |
9011:52574306c576 |
22-May-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Sequencer: Schedule deadlock check event at correct time The scheduling of the deadlock check event was being done incorrectly as the clock was not being multiplied, so as to convert the time into ticks. This patch removes that bug. |
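The unit conversion at the heart of this fix can be sketched as below. The helper name and parameters are hypothetical; the only point is that a threshold expressed in cycles has to be multiplied by the clock period before it is added to the current tick.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t Tick;

// Hypothetical helper: absolute tick at which the deadlock check should run.
// thresholdCycles is in cycles, clockPeriod is in ticks per cycle.
inline Tick
deadlockCheckTick(Tick now, std::uint64_t thresholdCycles, Tick clockPeriod)
{
    return now + thresholdCycles * clockPeriod;
}
```

For example, with a period of 500 ticks per cycle, a 50-cycle threshold starting at tick 1000 lands at tick 26000 rather than tick 1050.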
8996:8601533b6f70 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: fix bug with CopyStringOut and null string termination. |
8995:a029d2119487 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: restructure code that actually isn't a loop |
8992:e68dd2ba4fa4 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: assert before indexing into arrays to verify bounds |
8991:69fad6658160 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: fix some iterator use and erase bugs |
8988:528f0fa80f76 |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
gem5: Fix a number of incorrect case statements |
8985:4b517873c9ae |
10-May-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Panic if you attempt to create a checkpoint with a cache in the system |
8981:6f4ec692716f |
09-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Add the communication monitor
This patch adds a communication monitor MemObject that can be inserted between a master and slave port to provide a range of statistics about the communication passing through it. The communication monitor is non-invasive and does not change any properties or timing of the packets, with the exception of adding a sender state to be able to track latency. The statistics are only collected in timing mode (not atomic) to avoid slowing down any fast forwarding.
Examples of the statistics captured by the monitor are: read/write burst lengths, bandwidth, request-response latency, outstanding transactions, inter transaction time, transaction count, and address distribution. The monitor can be used in combination with periodic resetting and dumping of stats (through schedStatEvent) to study the behaviour over time.
In future patches, a selection of convenience scripts will be added to aid in visualising the statistics collected by the monitor. |
8979:591a755b3ddf |
08-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Do not forward uncacheable to bus snoopers
This patch adds a guarding if-statement to avoid forwarding uncacheable requests (or rather their corresponding request packets) to bus snoopers. These packets should never have any effect on the caches, and thus there is no need to forward them to the snoopers. |
8978:4388495beb44 |
04-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Ensure snoop requests are sent using sendTimingSnoopReq
This patch fixes a bug that caused snoop requests to be placed in a packet queue. Instead, the packet is now sent immediately using sendTimingSnoopReq, thus bypassing the packet queue and any normal responses waiting to be sent. |
8975:7f36d4436074 |
01-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses.
For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the appropriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not allow snoop requests to be rejected or stalled. All these assumptions are now explicitly part of the port interface itself. |
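A minimal sketch of the request/response split, using the member function names from the commit message on heavily reduced, hypothetical `MasterPort` and `SlavePort` classes. The `bind` helper, the `CpuPort` and `MemPort` endpoints, and the immediate response are assumptions made to keep the example short; in a real timing model the response would be sent at a later point in time.

```cpp
#include <cassert>

// Hypothetical reduced packet that is either a request or a response.
struct Packet
{
    bool request;
    bool isRequest() const { return request; }
    bool isResponse() const { return !request; }
};

class SlavePort;

// A master sends requests and receives responses.
class MasterPort
{
  public:
    virtual ~MasterPort() {}
    void bind(SlavePort &s) { peer = &s; }
    bool sendTimingReq(Packet *pkt);
    virtual bool recvTimingResp(Packet *pkt) = 0;

  private:
    SlavePort *peer = nullptr;
};

// A slave receives requests and sends responses.
class SlavePort
{
  public:
    virtual ~SlavePort() {}
    void bind(MasterPort &m) { peer = &m; }
    bool sendTimingResp(Packet *pkt);
    virtual bool recvTimingReq(Packet *pkt) = 0;

  private:
    MasterPort *peer = nullptr;
};

// The packet type is checked in the send functions, not by the receiver.
inline bool
MasterPort::sendTimingReq(Packet *pkt)
{
    assert(pkt->isRequest());
    return peer->recvTimingReq(pkt);
}

inline bool
SlavePort::sendTimingResp(Packet *pkt)
{
    assert(pkt->isResponse());
    return peer->recvTimingResp(pkt);
}

// Illustrative endpoints: a CPU-like master and a memory-like slave.
struct CpuPort : MasterPort
{
    int responses = 0;
    bool recvTimingResp(Packet *) override { ++responses; return true; }
};

struct MemPort : SlavePort
{
    int requests = 0;
    bool recvTimingReq(Packet *pkt) override
    {
        ++requests;
        pkt->request = false; // turn the request into a response
        return sendTimingResp(pkt);
    }
};
```

Because each direction has its own function, neither endpoint needs to branch on `pkt->isRequest()` to work out what it has received.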
8970:1fc1256d5798 |
28-Apr-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Garnet: Correct computation of link utilization The computation for link utilization was incorrect for the flexible network. The utilization was being divided twice by the total time. |
8967:fc2c4db64ded |
25-Apr-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Remove extra statements from Sequencer |
8966:354202312a21 |
25-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Use base class Master/SlavePort pointers in the bus
This patch makes some rather trivial simplifications to the bus in that it changes the use of BusMasterPort and BusSlavePort pointers to simply use MasterPort and SlavePort (iterators are also updated accordingly).
This change is a step towards a future patch that introduces a separation of the interface and the structural port itself. |
8965:1ebd7c856abc |
25-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Add the PortId type and a corresponding id field to Port
This patch introduces the PortId type, moves the definition of INVALID_PORT_ID to the Port class, and also gives every port an id to reflect the fact that each element in a vector port has an identifier/index.
Previously the bus and Ruby testers (and potentially other users of the vector ports) added the id field in their port subclasses, and now this functionality is always present as it is moved to the base class. |
8949:3fa1ee293096 |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple masters (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field.
Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class.
In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing. |
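The routing convention described in this entry can be sketched as follows. The `Packet` and `Bus` types, the `kNoPort` sentinel, and the method names are hypothetical; the sketch only shows the source field being stamped on the way to the slave and moved to the destination field when the response is made.

```cpp
#include <cassert>

// Hypothetical sentinel meaning "no port recorded".
const int kNoPort = -1;

// Hypothetical reduced packet: created with neither source nor destination.
struct Packet
{
    int src = kNoPort;
    int dest = kNoPort;
    bool response = false;

    // At the final slave the source becomes the destination.
    void makeResponse()
    {
        response = true;
        dest = src;
        src = kNoPort;
    }
};

// Hypothetical multiplexing component with numbered master-facing ports.
class Bus
{
  public:
    // Requests are routed by address (not modelled here); the bus only
    // records which port the request arrived on.
    void recvRequest(Packet &pkt, int fromPort) { pkt.src = fromPort; }

    // Responses are routed back using the destination field.
    int routeResponse(const Packet &pkt) const
    {
        assert(pkt.response && pkt.dest != kNoPort);
        return pkt.dest;
    }
};
```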
8948:e95ee70f876c |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separate snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal requests and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with requests being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses.
The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. |
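A minimal sketch of the pattern in the last paragraph, on a hypothetical reduced `MasterPort`. Throwing `std::logic_error` stands in for gem5's panic here, and the return type and the two derived classes are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

struct Packet {};

// Hypothetical reduced master port.
class MasterPort
{
  public:
    virtual ~MasterPort() {}

    // Normal receive function: every port must implement it.
    virtual bool recvTiming(Packet *pkt) = 0;

    // Snoop receive function: the default stands in for panic(), so
    // non-coherent ports do not have to provide an implementation.
    virtual std::uint64_t recvAtomicSnoop(Packet *)
    {
        throw std::logic_error("recvAtomicSnoop not implemented");
    }
};

// A non-coherent master only implements the normal receive function.
struct NonCoherentMaster : MasterPort
{
    bool recvTiming(Packet *) override { return true; }
};

// A coherent master additionally handles snoops (latency value is made up).
struct CoherentMaster : MasterPort
{
    bool recvTiming(Packet *) override { return true; }
    std::uint64_t recvAtomicSnoop(Packet *) override { return 10; }
};
```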
8946:fb6c89334b86 |
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6
This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang have to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning.
The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated:
1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed.
2) another issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128.
3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11.
As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb". |
8943:f954ee138ca3 |
12-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Ensure order-dependent iteration uses an ordered map
This patch fixes a bug in Ruby that caused non-deterministic simulation when changing the underlying hash map implementation. The reason is order-dependent behaviour in combination with iteration over the hash map contents. The two locations where a sorted container is assumed are now changed to make use of a std::map instead of the unordered hash map.
With this change, the stats change slightly and the follow-on changeset will update the relevant statistics. |
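A small sketch of why an ordered container gives deterministic iteration. The helper name and the table contents are made up; the only facts relied on are that `std::map` iterates in ascending key order, while the iteration order of `std::unordered_map` is unspecified and can change between implementations.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: visit the keys of a hash table in a reproducible
// order by copying the entries into a std::map first.
inline std::vector<int>
sortedKeys(const std::unordered_map<int, std::string> &table)
{
    std::map<int, std::string> ordered(table.begin(), table.end());
    std::vector<int> keys;
    for (const auto &kv : ordered)
        keys.push_back(kv.first);
    return keys;
}
```

Code whose results depend on visit order should iterate over the ordered container (or use it in place of the hash map), so that swapping the hash map implementation cannot change simulation results.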
8938:7925057dc4d8 |
06-Apr-2012 |
Lisa Hsu <Lisa.Hsu@amd.com> |
slicc: Controllers attached to Sequencers no longer have to be named L1Cache. |
8937:225590437eb2 |
06-Apr-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
sim-ruby: checkpointing fixes and dependent eventq improvements
Fixes checkpointing with respect to lost events after swapping event queues. Also adds DPRINTFs to better understand what's going on when Ruby serializes and unserializes. |
8936:c04af06738e0 |
06-Apr-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: fixed error message when the type has no inheritance |
8935:c955a451271e |
06-Apr-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: tbe allocation and dependent wakeup fixes |
8933:2727a5a0aadc |
06-Apr-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed bug with single cpu + flushes, then modified the regression tester to check this functionality |
8932:1b2c17565ac8 |
06-Apr-2012 |
Brad Beckmann <Brad.Beckmann@amd.com> |
rubytest: separated read and write ports.
This patch allows the ruby tester to support protocols where the i-cache and d-cache are managed by separate controllers. |
8931:7a1dfb191e3f |
06-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Enable multiple distributed generalized memories
This patch removes the assumption of having a single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range.
All memories inherit from an AbstractMemory that encompasses the basic behaviour of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory.
To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables.
Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. |
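The wrapper's address decoding can be sketched as below. The class names follow the commit message, but the members, method names and byte-granular storage are assumptions made for illustration and do not reflect the real gem5 interfaces.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint64_t Addr;

// Hypothetical memory responsible for one contiguous range
// [start, start + size), with untimed byte accesses.
class AbstractMemory
{
  public:
    AbstractMemory(Addr start, Addr size) : base(start), data(size, 0) {}

    bool contains(Addr a) const
    {
        return a >= base && a - base < data.size();
    }

    std::uint8_t read(Addr a) const
    {
        assert(contains(a));
        return data[a - base];
    }

    void write(Addr a, std::uint8_t v)
    {
        assert(contains(a));
        data[a - base] = v;
    }

  private:
    Addr base;
    std::vector<std::uint8_t> data;
};

// Hypothetical wrapper that knows all memories and decodes addresses,
// acting like an infinitely fast bus for "shortcut" accesses.
class PhysicalMemory
{
  public:
    void addMemory(AbstractMemory *m) { memories.push_back(m); }

    bool isMemAddr(Addr a) const { return find(a) != nullptr; }

    std::uint8_t read(Addr a) const
    {
        AbstractMemory *m = find(a);
        assert(m);
        return m->read(a);
    }

    void write(Addr a, std::uint8_t v)
    {
        AbstractMemory *m = find(a);
        assert(m);
        m->write(a, v);
    }

  private:
    // Linear search over the ranges; returns nullptr if nothing matches.
    AbstractMemory *find(Addr a) const
    {
        for (AbstractMemory *m : memories)
            if (m->contains(a))
                return m;
        return nullptr;
    }

    std::vector<AbstractMemory *> memories;
};
```

A memory that should stay out of the global address map would simply not be registered with the wrapper in this model.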
8924:5f6cfd09fdaf |
30-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove legacy DRAM in preparation for memory updates
This patch removes the DRAM memory class in preparation for updates to the memory system, with the first one introducing an abstract memory class, and removing the assumption of a single physical memory. |
8923:820111f58fbb |
30-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Remove the physMemPort and instead access memory directly
This patch removes the physMemPort from the RubySequencer and instead uses the system pointer to access the physmem. The system already keeps track of the physmem and the valid memory address ranges, and with this patch we merely make use of that existing functionality. The memory is modified so that it is possible to call the access functions (atomic and functional) without going through the port, and the memory is allowed to be unconnected, i.e. have no ports (since Ruby does not attach it like the conventional memory system). |
8922:17f037ad8918 |
30-Mar-2012 |
William Wang <william.wang@arm.com> |
MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add assumptions and enforce compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specialisations are to come in later patches.
The getPort function is now getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. |
8916:7d95b650c9b6 |
23-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Fix Set::print for 32-bit hosts
This patch fixes a compilation error caused by a length mismatch on 32-bit hosts. The ifdef and sprintf is replaced by a csprintf. |
8915:31648cc2e0d9 |
22-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Unify bus access methods and prepare for master/slave split
This patch unifies the recvFunctional, recvAtomic and recvTiming to all be based on a similar structure: 1) extract information about the incoming packet, 2) send it out to the appropriate snoopers, 3) determine where it is going, and 4) forward it to the right destination. The naming of variables across the different access functions is now consistent as well.
Additionally, the patch introduces the member functions releaseBus and retryWaiting to better distinguish between the two cases when we should tell a sender to retry. The first case is when the bus goes from busy to idle, and the second case is when it receives a retry from a destination that did not immediately accept a packet.
As a very minor change, the MMU debug flag is no longer used in the bus. |
8914:8c3bd7bea667 |
22-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implementation of recvTiming and recvFunctional through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes. |
8913:8b223e308b08 |
22-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Scons: Remove Werror=False in SConscript files
This patch removes the overriding of "-Werror" in a handful of cases. The code compiles with gcc 4.6.3 and clang 3.0 without any warnings, and thus without any errors. There are no functional changes introduced by this patch. In the future, rather than bypassing "-Werror", address the warnings. |
8903:c739a3a829f5 |
19-Mar-2012 |
Tushar Krishna <tushar@csail.mit.edu> |
Garnet: Stats at vnet granularity + code cleanup
This patch (1) Moves redundant code from fixed and flexible networks to BaseGarnetNetwork. (2) Prints network stats at vnet granularity. |
8883:c92153af04ac |
09-Mar-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
cache: Allow main memory to be at disjoint address ranges. |
8881:042d509574c1 |
06-Mar-2012 |
Marc Orr <marc.orr@gmail.com> |
build scripts: Made minor modifications to reduce build overhead time.
1. --implicit-cache behavior is default. 2. makeEnv in src/SConscript is conditionally called. 3. decider set to MD5-timestamp 4. NO_HTML build option changed to SLICC_HTML (defaults to False) |
8874:9e2a4cf89be6 |
02-Mar-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Rename RubyPort::sendTiming to avoid overriding base class
This patch renames the sendTiming member function in the RubyPort to avoid inadvertently hiding Port::sendTiming (discovered through some rather painful debugging). The RubyPort does, in fact, rely on the functionality of the queued port and the implementation merely schedules a send the next cycle. The new name for the member function is sendNextCycle to better reflect this behaviour.
In the unlikely event that we ever shift to using C++11 the member functions in Port should have a "final" identifier to prevent any overriding in derived classes. |
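The hiding problem can be reproduced in a few lines. The classes below are hypothetical reductions that return strings so the selected function is visible; only the names sendTiming and sendNextCycle come from the commit message.

```cpp
#include <cassert>
#include <string>

struct Packet {};

// Hypothetical reduced base class with a non-virtual send function.
class Port
{
  public:
    std::string sendTiming(Packet *) { return "Port::sendTiming"; }
};

// Before the rename: a member with the same name hides Port::sendTiming
// for any call made through the derived type.
class HidingRubyPort : public Port
{
  public:
    std::string sendTiming(Packet *) { return "schedule send next cycle"; }
};

// After the rename: both functions stay reachable and unambiguous.
class RenamedRubyPort : public Port
{
  public:
    std::string sendNextCycle(Packet *) { return "schedule send next cycle"; }
};
```

Which function runs in the hiding case depends on the static type used at the call site, which is what makes this kind of bug hard to spot.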
8867:08cc303b718b |
01-Mar-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Cache: Fix an issue with LRU when bonus block is used to complete transaction.
The block is never inserted because it's the one extra block in the cache, but it can be invalidated twice in a row. In that case the block doesn't have a new master id (because it was never inserted), however it is valid and the accounting goes wrong at that point. |
8861:56d011130987 |
29-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Make all the port proxy members const
This is a trivial patch that merely makes all the member functions of the port proxies const. There is no good reason why they should not be, and this change only serves to make it explicit that they are not modified through their use. |
8856:241ee47b0dc6 |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Simplify cache ports preparing for master/slave split
This patch splits the two cache ports into a master (memory-side) and slave (cpu-side) subclass of port with slightly different functionality. For example, it is only the CPU-side port that blocks incoming requests, and only the memory-side port that schedules send events outside of what the transmit list dictates.
This patch simplifies the two classes by relying further on SimpleTimingPort and also generalises the latter to better accommodate the changes (introducing trySendTiming and scheduleSend). The memory-side cache port overrides sendDeferredPacket to be able to not only send responses from the transmit list, but also send requests based on the MSHRs.
A follow on patch further simplifies the SimpleTimingPort and the cache ports. |
8855:74490e94da0c |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Prepare mport for master/slave split
This patch simplifies the mport in preparation for a split into a master and slave role for the message ports. In particular, sendMessageAtomic was only used in a single location, and the same was true of sendMessageTiming. The affected interrupt device is updated accordingly. |
8853:0216ed80991b |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Move all read/write blob functions from Port to PortProxy
This patch moves the readBlob/writeBlob/memsetBlob from the Port class to the PortProxy class, thus making a clear separation of the basic port functionality (recv/send functional/atomic/timing), and the higher-level functional accessors available on the port proxies.
There are only a few places in the code base where the blob functions were used on ports, and they are all for peeking into the memory system without making a normal memory access (in the memtest, and the malta and tsunami pchip). The memtest also exemplifies how easy it is to create a non-translating proxy if desired. The malta and tsunami pchip used a slave port to perform a functional read, and this is now changed to rely on the physProxy of the system (to which they already have a pointer). |
8852:c744483edfcf |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Make port proxies use references rather than pointers
This patch adds a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rather than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear.
The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out).
Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies. |
8851:7e966326ef5b |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Move port creation to the memory object(s) construction
This patch moves all port creation from the getPort method to be consistently done in the MemObject's constructor. This is possible thanks to the Swig interface passing the length of the vector ports. Previously there was a mix of: 1) creating the ports as members (at object construction time) and using getPort for the name resolution, or 2) dynamically creating the ports in the getPort call. This is now uniform. Furthermore, objects that would not be complete without a port have these ports as members rather than having pointers to dynamically allocated ports.
This patch also enables an elaboration-time enumeration of all the ports in the system which can be used to determine the masterId. |
8850:ed91b534ed04 |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
CPU: Round-two unifying instr/data CPU ports across models
This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps in simplifying the code, makes assumptions more explicit, and further eases future patches related to the CPU ports.
The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports. |
8849:6c15fa5fe377 |
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Fatal when no port can be found for an address
This patch adds a check in the findPort method to ensure that an invalid port id is never returned. Previously this could happen if no default port was set, and no address matched the request, in which case -1 was returned causing a SEGFAULT when using the id to index in the port array. To clean things up further a symbolic name is added for the invalid port id. |
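A minimal sketch of the findPort check described in the entry above, written in Python for brevity rather than taken from the gem5 sources. The names INVALID_PORT_ID, Bus and find_port, and the exception raised in place of a fatal, are assumptions made for illustration.

```python
INVALID_PORT_ID = -1  # symbolic name for "no port"; the value is an assumption


class Bus:
    """Toy bus that routes an address to one of its ports (hypothetical)."""

    def __init__(self, ranges, default_port=INVALID_PORT_ID):
        # ranges[i] is the (start, end) range of port i, with end exclusive.
        self.ranges = ranges
        self.default_port = default_port

    def find_port(self, addr):
        # First look for a port whose address range covers addr.
        for port_id, (start, end) in enumerate(self.ranges):
            if start <= addr < end:
                return port_id
        # Fall back to the default port if one was configured.
        if self.default_port != INVALID_PORT_ID:
            return self.default_port
        # Fail loudly instead of returning an id that cannot index the port array.
        raise RuntimeError(f"Unable to find destination for addr {addr:#x}")
```

The point of the sketch is that the caller never sees the invalid id: an unmatched address either resolves to the default port or stops the run with a message.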
8839:eeb293859255 |
13-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port.
Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried out when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. |
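The role check performed at binding time in the entry above can be illustrated with a small stand-alone sketch. This is not the gem5 Python API: the classes Port, MasterPort and SlavePort and the bind helper are hypothetical names chosen for the example.

```python
class Port:
    role = None  # overridden by the subclasses below

    def __init__(self, name):
        self.name = name
        self.peer = None


class MasterPort(Port):
    role = "master"  # issues requests, receives responses


class SlavePort(Port):
    role = "slave"  # receives requests, sends responses


def bind(a, b):
    """Connect two ports, insisting on exactly one master and one slave."""
    if {a.role, b.role} != {"master", "slave"}:
        raise TypeError(
            f"cannot bind {a.name} ({a.role}) to {b.name} ({b.role})")
    a.peer, b.peer = b, a
```

Binding a bus master port to a memory slave port succeeds, while connecting two masters (or two slaves) is rejected before any simulation starts.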
8833:2870638642bd |
12-Feb-2012 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids. Cache stats also display the requestor names in the vector subnames. Most cache stats now include "nozero" and "nonan" flags to reduce the amount of excessive cache stat dump. Also, simplified incMissCount()/incHitCount() functions. |
8832:247fee427324 |
12-Feb-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Add a master ID to each request object.
This change adds a master id to each request object which can be used to identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python. |
8831:6c08a877af8f |
12-Feb-2012 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
prefetcher: Make prefetcher a sim object instead of it being a parameter on cache |
8828:e8fd0fc4a417 |
10-Feb-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Remove isTagPresent() calls from Sequencer.cc
This patch removes the calls to isTagPresent() from Sequencer.cc. These calls are made just for setting the cache block to have been most recently used. The calls have been folded in to the function setMRU(). |
8827:38b8b9a97500 |
10-Feb-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MESI: Add queues for stalled requests
This patch adds support for stalling the requests queued up at different controllers for the MESI CMP directory protocol. Earlier the controllers would recycle the requests using some fixed latency. This results in younger requests getting serviced first at times, and can result in starvation. Instead all the requests that need a particular block to be in a stable state are moved to a separate queue, where they wait till that block returns to a stable state and then they are processed. |
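The stall-and-wake-up scheme from the entry above can be sketched as a per-block queue. This is only an illustration under assumed names (StallingController, stall, wake_up); the real mechanism lives in SLICC-generated controllers and is not reproduced here.

```python
from collections import defaultdict, deque


class StallingController:
    """Toy controller that parks requests per block address (hypothetical)."""

    def __init__(self):
        # Block address -> requests waiting for that block, oldest first.
        self.stalled = defaultdict(deque)
        # Requests that may be processed again, in the order they were woken.
        self.ready = deque()

    def stall(self, addr, request):
        # The block is in a transient state: park the request instead of
        # recycling it after a fixed latency.
        self.stalled[addr].append(request)

    def wake_up(self, addr):
        # The block reached a stable state: release its waiters in arrival
        # order, so a younger request cannot overtake an older one.
        self.ready.extend(self.stalled.pop(addr, ()))
```

Because waiters are released in arrival order and only when their block is stable, the reordering and starvation attributed to fixed-latency recycling do not arise in this model.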
8819:4f2ad221ae32 |
09-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove onRetryList from BusPort and rely on retryList
This patch removes the onRetryList field from the BusPort class and entirely relies on the retryList which holds all ports that are waiting to retry. The onRetryList field and the retryList were previously used with overloaded functionalities and only one is really needed (there were also checks to assert they held the same information). After this patch the bus ports will be split into master and slave ports and this simplifies that transition. |
8809:bb10807da889 |
01-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head, hopefully the last time for this batch. |
8799:dac1e33e07b0 |
28-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with the main repo. |
8798:adaa92be9037 |
16-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge yet again with the main repository. |
8797:3202eb01e01e |
07-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Another merge with the main repository. |
8796:a2ae5c378d0a |
07-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with the main repository again. |
8795:0909f8ed7aa0 |
07-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with main repository. |
8794:e2ac2b7164dd |
18-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of includes of config/full_system.hh. |
8786:8be24baf68b8 |
07-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of FULL_SYSTEM in mem. |
8766:b0773af78423 |
30-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Build the base process class in FS. |
8763:509e9bb84dfa |
16-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Turn on the page table class in FS. |
8762:c77d9ef26d2b |
16-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Build in the tport in FS mode. |
8761:20322354b80b |
16-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Build/expose vport in SE mode. |
8737:770ccf3af571 |
31-Jan-2012 |
Koan-Sin Tan <koansin.tan@gmail.com> |
clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places. |
8736:2d8a57343fe3 |
31-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the otherPort from the cache ports
This patch is a very straightforward simplification, removing the unnecessary otherPort pointer from the cache port. The pointer was only used to forward range changes, and the address range is fixed for the cache. Removing the pointer simplifies the transition to master/slave ports. |
8733:64a7bf8fa56c |
31-Jan-2012 |
Geoffrey Blake <geoffrey.blake@arm.com> |
CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5
Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification. |
8731:bb0aaf3ffa18 |
30-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Make the RubyPort physMemPort a PioPort instead of M5Port
This patch makes the physMemPort of the RubyPort a PioPort rather than an M5Port. This reflects the fact that the M5Port and PioPort have different roles. The M5Port is really a coherent slave that is connected to the CPUs and other coherent masters of the system, e.g. DMA ports. The PioPort, on the other hand, is a master port that is connected to the memory and other slaves, for example the pio devices.
This simplifies future changes into master/slave ports and is consistent with the port roles throughout the system. |
8722:78b08f92c290 |
12-Jan-2012 |
Mitchell Hayenga <Mitchell.Hayenga@ARM.com> |
Fix memory corruption issue with CopyStringOut()
CopyStringOut() used the wrong index when writing the terminating null character, which zeroed a byte of memory past the end of the character array (out of bounds). |
8719:d70a85ee7062 |
25-Jan-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Add simple bandwidth stats to PhysicalMemory |
8717:5c253f1031d7 |
23-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
O3, Ruby: Forward invalidations from Ruby to O3 CPU
This patch implements the functionality for forwarding invalidations and replacements from the L1 cache of the Ruby memory system to the O3 CPU. The implementation adds a list of ports to RubyPort. Whenever a replacement or an invalidation is performed, the L1 cache forwards this to all the ports, which is the LSQ in case of the O3 CPU. |
8716:a91aba9f2cd4 |
23-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MemCmd: Add a command for invalidation requests to LSQ
This command will be sent from the memory system (Ruby) to the LSQ of an O3 CPU so that the LSQ, if it needs to, invalidates the address in the request packet. |
8715:03e09db82c80 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Make the bus default port yet another port
This patch removes the idiosyncratic nature of the default bus port and makes it yet another port in the list of interfaces. Rather than having a specific pointer to the default port we merely track the identifier of this port. This change makes future port diversification easier and overall cleans up the bus code. |
8713:2f1a3e335255 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Make the bus bridge unidirectional and fixed address range
This patch makes the bus bridge uni-directional and specialises the bus ports to be a master port and a slave port. This greatly simplifies the assumptions on both sides as either port only has to deal with requests or responses. The following patches introduce the notion of master and slave ports, and would not be possible without this split of responsibilities.
In making the bridge unidirectional, the address range mechanism of the bridge is also changed. For the cases where communication is taking place both ways, an additional bridge is needed. This causes issues with the existing mechanism, as the busses cannot determine when to stop iterating the address updates from the two bridges. To avoid this issue, and also greatly simplify the specification, the bridge now has a fixed set of address ranges, specified at creation time. |
8712:7f762428a9f5 |
17-Jan-2012 |
William Wang <william.wang@arm.com> |
MEM: Remove the functional ports from the memory system
The functional ports are no longer used and this patch cleans up the legacy that is still present in buses, memories, CPUs etc. Note that this does not refer to the class FunctionalPort (already removed), but rather ports with the name (and use) functional. |
8711:c7e14f52c682 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges. |
8710:aab813d6a162 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove Port removeConn and MemObject deletePortRefs
Cleaning up and simplifying the ports and going towards a more strict elaboration-time creation and binding of the ports. |
8709:d7358736ac70 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the notion of the default port
This patch removes the default port and instead relies on the peer being set to NULL initially. The binding check (i.e. is a port connected or not) will eventually be moved to the init function of the modules. |
8708:7ccbdea0fa12 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Simplify ports by removing EventManager
This patch removes the inheritance of EventManager from the ports and moves all responsibility for event queues to the owner. Eventually the event manager should be the interface block, which could either be the structural owner or a subblock like a LSQ in the O3 CPU for example. |
8706:b1838faf3bcc |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Add port proxies instead of non-structural ports
Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy.
The following replacements are made: FunctionalPort > PortProxy; TranslatingPort > SETranslatingPortProxy; VirtualPort > FSTranslatingPortProxy. |
8704:683e7b0b5771 |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Ruby: Change the access permissions for MOESI hammer
This patch changes the access permission for the WB_E_W state from Busy to Read_Write to avoid having issues in follow-on patches with functional accesses going through Ruby. This change was made after consultation with all involved parties and is more of a work-around than a fix. |
8702:2764cd55d2ad |
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Differentiate functional cache accesses from CPU and memory
This patch changes the functionalAccess member function in the cache model such that it is aware of what port the access came from, i.e. if it came from the CPU side or from the memory side. By adding this information, it is possible to respect the 'forwardSnoops' flag for snooping requests coming from the memory side and not forward them. This fixes an outstanding issue with the IO bus getting accesses that have no valid destination port and also cleans up future changes to the bus model. |
8693:99ba36eaa789 |
12-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
PerfectCacheMemory: Remove references to CacheMsg
The definition of the class CacheMsg was removed long ago. A declaration had survived and was only recently removed. Since the PerfectCacheMemory class relied on this particular declaration, its absence led to a compilation failure. Hence this patch. |
8692:d131677ccfcf |
11-Jan-2012 |
Ali Saidi <saidi@eecs.umich.edu> |
Packet: Put back part of the assert |
8691:caf280f1268d |
11-Jan-2012 |
Ali Saidi <saidi@eecs.umich.edu> |
Packet: Remove meaningless assert statement |
8688:5ca9dd977386 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Resurrect Cache Warmup Capability
This patch resurrects ruby's cache warmup capability. It essentially makes use of all the infrastructure that was added to the controllers, memories and the cache recorder. |
8687:0b7825ddbb17 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Debug Flags: Remove one, add another
The flag RubyStoreBuffer is being removed; RubySystem is being added instead. |
8686:71ac9dda5432 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Port: Add a list of cpu ports attached to this port |
8685:2854ed06ce05 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby EventQueue: Remove unused functions |
8684:9a2ac57eb22c |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Sparse Memory: Add function for collating blocks
This patch adds a function to the Sparse Memory so that the blocks can be recorded in a cache trace. The blocks are added to the cache recorder which can later write them into a file. |
8683:9feb100066e1 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Add infrastructure for recording cache contents
This patch changes CacheRecorder, CacheMemory, CacheControllers so that the contents of a cache can be recorded for checkpointing purposes. |
8682:d70d2dfb1c20 |
11-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Memory Vector: Functions for collating and populating pages
This patch adds functions to the memory vector class that can be used for collating memory pages to raw trace and for populating pages from a raw trace. |
8681:db978f3bcf51 |
10-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: remove the files related to the tracer
The Ruby Tracer is out of date with the changes that are being carried out to support checkpointing. Hence, it needs to be removed. |
8678:08d6c1fbaecb |
10-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MOESI Hammer: Remove a couple of bugs
A couple of bugs were observed while building checkpointing support in Ruby. This patch changes transitions to remove those errors. |
8677:e7f6268d7ef3 |
10-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Sparse Memory: Simplify the structure for an entry
The SparseMemEntry structure includes just one void* pointer. It seems unnecessary that we have a structure for this. The patch removes the structure and makes use of a typedef on void* instead. |
8668:be72c2a127b2 |
09-Jan-2012 |
Geoffrey Blake <geoffrey.blake@arm.com> |
Packet: Add derived class FunctionalPacket to enable partial functional reads
This adds the derived class FunctionalPacket to fix a long standing deficiency in the Packet class where it was unable to handle finding data to partially satisfy a functional access. Made this a derived class as functional accesses are used only in certain contexts and to not add any additional overhead to the existing Packet class. |
8663:e1bb31f243e2 |
09-Jan-2012 |
Min Kyu Jeong <MinKyu.Jeong@arm.com> |
mem: Change DPRINTF to print a more useful destination port number.
The old code prints 0 for the destination, since pkt->getDest() returns 0 when pkt->getDest() == Packet::Broadcast, which is always true. |
8653:15d4da9d2042 |
07-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Cache: Add param for marking caches as instruction only |
8651:c3d878fbdaea |
06-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
AbstractController: Remove some of the unused functions |
8650:33a48f15e94a |
06-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
Ruby Set: Move NUMBER_WORDS_PER_SET to Set.hh
This constant is currently in System.hh, but is only used in Set.hh. It is being moved to Set.hh to remove this artificial dependence of Set.hh on System.hh. |
8647:31ae249b397a |
05-Jan-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
MESI Coherence Protocol: Fix L2 miss statistics
This patch removes calls to uu_ProfileMiss from transitions where the request is satisfied by the L2 cache controller. |
8645:89929730804b |
31-Dec-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Shuffle some of the included files
This patch adds and removes included files from some of the files so as to remove some false dependencies and include some files directly instead of transitively. |
8644:acf68e5a8cd7 |
31-Dec-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Use pointers for directory entries
SLICC uses pointers for cache and TBE entries but not for directory entries. This patch changes the protocols, SLICC and Ruby memory system so that even directory entries are referenced using pointers. |
8641:4d3ecac1abec |
13-Dec-2011 |
Nathan Binkert <nate@binkert.org> |
gcc: fix unused variable warnings from GCC 4.6.1 |
8637:13ea13815bf9 |
01-Dec-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed L2 to L1 infinite stalls and deadlock |
8636:4ee9dec30f8c |
01-Dec-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
physmem: Improved fatal message for size mismatch |
8619:2f1875b5f107 |
23-Nov-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
Topology: bug fix in external link initialization |
8618:ce41ec640691 |
22-Nov-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
Remove standard_1level_CMP-protocol.sm include statement from Network |
8615:e66a566f2cfa |
14-Nov-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Process packet instead of RubyRequest in Sequencer
This patch changes the implementation of Ruby's recvTiming() function so that it pushes a packet in to the Sequencer instead of a RubyRequest. This requires changes in the Sequencer's makeRequest() and issueRequest() functions, as they also need to operate on a Packet instead of RubyRequest. |
8612:df3b7a1e883f |
04-Nov-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
GARNET: adding a fault model for resilient on-chip network research.
This patch adds a fault model, which provides the probability of a number of architectural faults in the interconnection network (e.g., data corruption, misrouting). These probabilities can be used to realistically inject faults in GARNET and faithfully evaluate the effectiveness of novel resilient NoC architectures. |
8611:d9c61e6f1848 |
04-Nov-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
MESI Protocol: Add functions for profiling misses |
8609:78da831670e4 |
03-Nov-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Protocol: Remove standard one and two level files |
8608:02d7ac5fb855 |
03-Nov-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove some unused typedefs
This patch removes some of the unused typedefs. It also moves some of the typedefs from Global.hh to TypeDefines.hh. The patch also eliminates the file NodeID.hh. |
8607:5fb918115c07 |
31-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions. |
8602:836f8fad4a4c |
28-Oct-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Reorganize mapping of components
In RubySlicc_ComponentMapping.hh, certain '#define's have been used for mapping MachineType to GenericMachineType. These '#define's are being eliminated and the code will now be generated by SLICC instead. Some of the unused functions in RubySlicc_ComponentMapping.sm are also being eliminated. |
8601:af28085882dc |
23-Oct-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
SE: move page allocation from PageTable to Process
PageTable supported an allocate() call that called back through the Process to allocate memory, but did not have a method to map addresses without allocating new pages. It makes more sense for Process to do the allocation, so this method was renamed allocateMem() and moved to Process, and uses a new map() call on PageTable.
The remaining uses of the process pointer in PageTable were only to get the name and the PID, so by passing these in directly in the constructor, we can make PageTable completely independent of Process. |
8600:b0d7c64ada19 |
23-Oct-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
syscall_emul: implement MAP_FIXED option to mmap() |
8581:56f97760eadd |
22-Sep-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
event: minor cleanup
Initialize flags via the Event constructor instead of calling setFlags() in the body of the derived class's constructor. I forget exactly why, but this made life easier when implementing multi-queue support.
Also rename Event::getFlags() to isFlagSet() to better match common usage, and get rid of some unused Event methods. |
8551:4e09d02322fb |
13-Sep-2011 |
Daniel Johnson <daniel.johnson@arm.com> |
Mem: Allow ASID to be set after request is created. |
8548:33bdc36bf46f |
13-Sep-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Prefetch: Don't prefetch if address is in the write queue.
Check that we're not currently writing back an address the prefetcher is trying to prefetch before issuing it. We previously checked the mshrQueue and the cache itself, but forgot to check the writeBuffer. This fixes a memory corruption issue with an L2 prefetcher. |
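A minimal sketch of the three-way check described in the entry above. The function name should_issue_prefetch and the use of plain sets to stand in for the cache, the MSHR queue and the write buffer are assumptions made for illustration, not the gem5 implementation.

```python
def should_issue_prefetch(addr, cache, mshr_queue, write_buffer):
    """Return True only if no structure already holds or is handling addr."""
    return (addr not in cache              # block already present
            and addr not in mshr_queue     # miss already outstanding
            and addr not in write_buffer)  # block currently being written back
```

The third condition is the one the entry adds: without it, a prefetch could be issued for a block that is still on its way out in the write buffer.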
8539:7d3ea3c65c66 |
09-Sep-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Stack: Tidy up some comments, a warning, and make stack extension consistent.
Do some minor cleanup of some recently added comments, a warning, and change other instances of stack extension to be like what's now being done for x86. |
8533:8dac0abb7a1b |
01-Sep-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Fix build for gcc-4.2 opt/fast
Even though the code is safe, the compiler flags a warning here, and warnings are treated as errors for fast/opt. I know it's redundant but it has no side effects and fixes the compile. |
8532:8f27cf8971fe |
01-Sep-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Functional Accesses: Update states to support Broadcast/Snooping protocols.
In the current implementation of Functional Accesses, it's very hard to implement broadcast or snooping protocols where the memory has no idea if it has exclusive access to a cache block or not. Without this knowledge, making sure the RW vs. RO permissions are right is next to impossible. So we add a new state called Backing_Store to convey that this is the backup storage for a block, so that it can be written if it is the only possibly RW block in the system, or written even if there is another RW block in the system, without causing problems.
Also, a small change to actually set the m_name field for each Controller so that debugging can be easier. Now you can access a controller's name just by controller->getName(). |
8531:bfc59fbde824 |
29-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Pass arguments by reference
Arguments to functions were being passed by value. This patch changes SLICC so that arguments are passed by reference. |
8530:3aaa99208a84 |
29-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove some unused code |
8529:00ca5af1b954 |
26-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Eliminate modulo op for computing set size. |
8526:2e5d41fbc4a5 |
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Put prefetcher notify call before packet is deleted. |
8509:afb40c3d4ba6 |
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Prefetcher: Fix some memory leaks with the prefetcher. |
8505:442804117f95 |
15-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Initialize some variables. |
8492:1ad244a20877 |
08-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
BuildEnv: Eliminate RUBY as build environment variable
This patch replaces RUBY with PROTOCOL in all the SConscript files as the environment variable that decides whether or not certain components of the simulator are compiled. |
8485:7a9a7f2a3d46 |
03-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove files and includes not in use |
8483:b5052cad1fd3 |
02-Aug-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Scons: Make some Action objects fit the abbreviated output format. |
8482:353abb676fa2 |
02-Aug-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Scons: Drop RUBY as compile time option.
This patch drops RUBY as a compile time option. Instead the PROTOCOL option is used to figure out whether or not to build Ruby. If the specified protocol is 'None', then Ruby is not compiled. |
8478:435179113834 |
27-Jul-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Put functions of a controller in its .cc file Currently, functions associated with a controller go into separate files. This patch puts all the functions in the controller's .cc file. This should hopefully take away some time from compilation. |
8472:37d052b21555 |
15-Jul-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Fix issue with prefetches originating at non-L1 caches getting stale data
Prefetch requests issued from the L2 or below wouldn't check if valid data is present higher in the system. If a prefetch into the L2 occurred at the same time as a writeback from a higher-level cache, the dirty data could be replaced by the unmodified data in memory. |
8456:5204873afc05 |
06-Jul-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added generic dma machine |
8455:d59189f372e7 |
06-Jul-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: Fixed uniprocessor DMA bug |
8454:fad37c6670a6 |
05-Jul-2011 |
Nathan Binkert <nate@binkert.org> |
slicc: add a protocol statement and an include statement
All protocols must specify their name. The include statement allows any file to include another file. |
8453:82fc1267d3bb |
05-Jul-2011 |
Nathan Binkert <nate@binkert.org> |
slicc: cleanup slicc code and make it less verbose |
8452:3f2c329e9046 |
05-Jul-2011 |
Nathan Binkert <nate@binkert.org> |
grammar: better encapsulation of a grammar and parsing This makes it possible to use the grammar multiple times and use the multiple instances concurrently. This makes implementing an include statement as part of a grammar possible. |
8446:be8f4157c8f4 |
03-Jul-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Network_test: Conform it with functional access changes in Ruby Addition of functional access support to Ruby necessitated some changes to the way coherence protocols are written. I had forgotten to update the Network_test protocol. This patch makes those updates. |
8439:559ef3da5dac |
01-Jul-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Commit files missing from previous commit The previous commit on functional access support in Ruby did not have some of the files required. This patch adds those files to the repository. |
8436:5648986156db |
30-Jun-2011 |
Brad Beckmann <Brad.Beckmann@amd.com>, Nilay Vaish <nilay@cs.wisc.edu> |
Ruby: Add support for functional accesses
This patch provides functional access support in Ruby. Currently only the M5Port of RubyPort supports functional accesses. Support for functional accesses through the PioPort will be added as a separate patch. |
8434:0412ba528ad6 |
24-Jun-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
Ruby: remove unused functions in CacheMemory: get/setMemoryValue |
8351:f897d0483b06 |
14-Jun-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Correctly set LONG_BITS and INDEX_SHIFT in class Set.
The code for the Set class was written under the assumption that std::numeric_limits<long>::digits returns the number of bits used for data type long, which was presumed to be either 32 or 64. But the return value is actually one less, that is, it is either 31 or 63. The value is now being incremented by 1 so as to correctly set it. |
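The off-by-one described in this entry can be checked directly. The following is a minimal sketch, not the Set class code: the function name `longBits` is hypothetical, and the equality with the storage width assumes a platform whose `long` has no padding bits (true of common targets). For a signed type, `digits` counts only the value bits and leaves out the sign bit.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical helper: the number of bits in a long, derived the way
// the commit message describes (digits plus one for the sign bit).
inline int longBits()
{
    // digits is 31 or 63 on typical platforms, not 32 or 64.
    const int bits = std::numeric_limits<long>::digits + 1;

    // Assumption: long has no padding bits, so this matches its storage.
    assert(bits == static_cast<int>(sizeof(long) * CHAR_BIT));
    return bits;
}
```

On an LP64 host `longBits()` would evaluate to 64; on an ILP32 or LLP64 host, to 32.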
8341:30daf1dd5c91 |
08-Jun-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Correctly set access permissions for directory entries
The access permissions for the directory entries are not being set correctly. This is because pointers are not used for handling directory entries. Get and set functions for access permissions have been added to the Controller state machine. The changePermission() function provided by the AbstractEntry and AbstractCacheEntry classes has been exposed to SLICC code once again. The set_permission() functionality has been removed.
NOTE: Each protocol will have to define these get and set functions in order to compile successfully. |
8340:e39a9c0493ad |
08-Jun-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Use sysconf to get the page size instead of the PAGE_SIZE macro. |
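A minimal sketch of the run-time query this entry refers to, assuming a POSIX host. The function name `hostPageSize` is hypothetical, and `_SC_PAGE_SIZE` is one standard POSIX name for the query; the entry does not say which spelling the patch uses.

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical helper: ask the OS for the page size at run time
// instead of relying on a compile-time PAGE_SIZE macro.
inline long hostPageSize()
{
    const long size = sysconf(_SC_PAGE_SIZE);
    assert(size > 0);  // sysconf returns -1 on error
    return size;
}
```

The run-time query avoids depending on a macro that not every platform's headers define.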
8337:b9ba22cb23f2 |
03-Jun-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Remove machine name as prefix to functions
Currently, the machine name is prepended to the names of the functions defined within the sm files. This is not necessary and it also means that these functions cannot be used outside the sm files. This patch does away with the prefixes. Note that the generated C++ files in which the code for these functions is present are still named such that the machine name is the prefix. |
8335:9228e00459d4 |
02-Jun-2011 |
Nathan Binkert <nate@binkert.org> |
scons: rename TraceFlags to DebugFlags |
8332:23711432221f |
02-Jun-2011 |
Nathan Binkert <nate@binkert.org> |
copyright: clean up copyright blocks |
8330:681497e0356b |
31-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
orion: bug fix in link power, and some reorg |
8329:24a00a6d5992 |
31-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
garnet: added network ptr to links to be used by orion |
8322:19949c6de823 |
23-May-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
config: tweak ruby configs to clean up hierarchy
Re-enabling implicit parenting (see previous patch) causes current Ruby config scripts to create some strange hierarchies and generate several warnings. This patch makes three general changes to address these issues.
1. The order of object creation in the ruby config files makes the L1 caches children of the sequencer rather than the controller; these config files are rewritten to assign the L1 caches to the controller first.
2. The assignment of the sequencer list to system.ruby.cpu_ruby_ports causes the sequencers to be children of system.ruby, generating warnings because they are already parented to their respective controllers. Changing this attribute to _cpu_ruby_ports fixes this because the leading underscore means this is now treated as a plain Python attribute rather than a child assignment. As a result, the configuration hierarchy changes such that, e.g., system.ruby.cpu_ruby_ports0 becomes system.l1_cntrl0.sequencer.
3. In the topology classes, the routers become children of some random internal link node rather than direct children of the topology. The topology classes are rewritten to assign the routers to the topology object first. |
8313:1eaa1fbd2212 |
21-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
garnet: use vnet_type from protocol to decide buffer depths
The virtual channels within "response" vnets are made buffers_per_data_vc deep (default=4), while virtual channels within other vnets are made buffers_per_ctrl_vc deep (default = 1). This is for accurate power estimates. |
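The depth rule in this entry can be sketched as a small selection function. This is an illustration only: `BufferParams` and `vcDepth` are hypothetical names, the string comparison stands in for whatever representation garnet uses for the vnet type, and the defaults are the ones quoted in the message.

```cpp
#include <cassert>
#include <string>

// Hypothetical parameter bundle; defaults taken from the commit message.
struct BufferParams
{
    int buffers_per_data_vc = 4;
    int buffers_per_ctrl_vc = 1;
};

// Pick the depth of every VC in a vnet from the vnet's declared type:
// "response" vnets carry data packets, all others carry control packets.
inline int vcDepth(const std::string &vnet_type, const BufferParams &p)
{
    assert(!vnet_type.empty());
    return vnet_type == "response" ? p.buffers_per_data_vc
                                   : p.buffers_per_ctrl_vc;
}
```

With the defaults, a "response" vnet gets 4 buffers per VC and any other vnet gets 1.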
8310:adb2d5f7407d |
20-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
slicc: added vnet_type to MI_example
Forgot to add this to MI_example in my previous patch. |
8308:79cf09f5a234 |
18-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
slicc: added vnet_type field to identify response vnets from others
Identifying response vnets versus other vnets will allow garnet to determine which vnets will carry data packets, and which will carry ctrl packets, and use appropriate buffer sizes (since data packets are larger than ctrl packets). This in turn allows the orion power model to accurately estimate buffer power. |
8307:76f7c2858c5c |
18-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
garnet: rename and rearrange config parameters.
Renamed (message) class to vnet for consistency with rest of ruby. Moved some parameters specific to fixed/flexible garnet networks into their corresponding py files. |
8292:0990d8c19b64 |
07-May-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
network: added Torus and Pt2Pt topologies |
8289:f64b07758814 |
05-May-2011 |
Korey Sewell <ksewell@umich.edu> |
ruby: use RubyMemory flag & remove setDebug() functionality
The RubyMemory flag wasn't used in the code, creating large gaps in trace output. Replace cprintfs w/dprintfs using RubyMemory in memory controller. Using DPRINTF also deprecates the usage of the setDebug() pure virtual function in the AbstractMemoryOrCache class as well as the m_debug/cprintf functions in MemoryControl.hh/cc |
8266:66a3187a6714 |
02-May-2011 |
Korey Sewell <ksewell@umich.edu> |
ruby: dbg: use system ticks instead of cycles |
8263:8743998edfd3 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: set the ExtLink bw to 16 bytes
Therefore all links by default are 16 bytes wide and thus work with Garnet's uniform link bandwidth assumption. |
8262:89d0e7c17d1e |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
garnet: removed flit_width from Routers |
8261:39e42ccddd63 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: adjusted default endpoint bandwidth
The simple network's endpoint bandwidth value is used to adjust the overall bandwidth of the network. Specifically, the ratio between endpoint bandwidth and the MESSAGE_SIZE_MULTIPLIER determines the increase. By setting the value to 1000, the bandwidth factor specified in the links translates to the link bandwidth in bytes. Previously, it was increasing that value by 10.
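The scaling described above can be sketched as plain arithmetic. This is an illustration under stated assumptions: `kMessageSizeMultiplier = 1000` is inferred from the message (a value of 1000 yields a ratio of one) rather than quoted from the source, and `effectiveLinkBandwidth` is a hypothetical name, not the simple network's API.

```cpp
#include <cassert>

// Assumption: MESSAGE_SIZE_MULTIPLIER is 1000, as the message implies.
constexpr int kMessageSizeMultiplier = 1000;

// Hypothetical helper: scale a link's bandwidth factor by the ratio
// of endpoint bandwidth to the message size multiplier.
inline int effectiveLinkBandwidth(int bw_factor, int endpoint_bandwidth)
{
    assert(bw_factor > 0 && endpoint_bandwidth > 0);
    return bw_factor * endpoint_bandwidth / kMessageSizeMultiplier;
}
```

Under this assumption, an endpoint bandwidth of 1000 leaves a factor of 16 at 16 bytes, while a ratio of 10 (for example, a hypothetical endpoint bandwidth of 10000) would turn the same factor into 160.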
This patch will likely require a reset of the ruby regression tester stats. |
8260:f113f73dd494 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: removed the unused network-wide latency param |
8259:36987780169e |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: moved network config params
Moved the buffer_size, endpoint_bandwidth, and adaptive_routing params out of the top-level parent network object and to only those networks that actually use those parameters. |
8258:7c377f5162f8 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: basic link bw for garnet and simple networks
This patch ensures that both Garnet and the simple networks use the bw value specified in the topology. To do so, the patch generalizes the specification of bw for basic links. This value is then translated to the specific value used by the simple and Garnet networks. Since Garnet does not support non-uniform link bandwidth, the patch also adds a check to ensure all bws are equal. |
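The uniformity check mentioned in this entry can be sketched as follows. The names `hasUniformBandwidth` and `uniformBandwidthOrDie` are hypothetical, and the sketch uses `assert` where simulator code would more likely report a configuration error through its own error-reporting mechanism.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True when every link bandwidth equals the first one (or the list is empty).
inline bool hasUniformBandwidth(const std::vector<int> &link_bws)
{
    return std::all_of(link_bws.begin(), link_bws.end(),
                       [&](int bw) { return bw == link_bws.front(); });
}

// Hypothetical helper: return the single bandwidth shared by all links,
// refusing configurations that mix bandwidths.
inline int uniformBandwidthOrDie(const std::vector<int> &link_bws)
{
    assert(!link_bws.empty() && hasUniformBandwidth(link_bws));
    return link_bws.front();
}
```

A topology whose links are all 16 bytes wide passes the check; one mixing 16 and 32 does not.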
8257:7226aebb77b4 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
network: convert links & switches to first class C++ SimObjects
This patch converts links and switches from second class simobjects that were virtually ignored by the networks (both simple and Garnet) to first class simobjects that directly correspond to C++ objects manipulated by the topology and network classes. This is especially true for Garnet, where the links and switches directly correspond to specific C++ objects.
By making this change, many aspects of the Topology class were simplified. |
8256:2284cec55ef4 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
garnet: cleaned up flexible network header file |
8255:73089f793a0a |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: moved topology to the top network directory
Moved the Topology class to the top network directory because it is shared by both the simple and Garnet networks. |
8254:779d775abc11 |
28-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed dated comment in SimpleNetwork |
8245:a9d06c894afe |
20-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
fix some build problems from prior changesets |
8240:38befb82b2c9 |
19-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
stats: rename stats so they can be used as python expressions |
8232:b28d06a175be |
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help |
8229:78bf55f23338 |
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes |
8214:02cb69e5cfeb |
06-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixes to support more types of RubyRequests |
8194:aeec9e157d06 |
01-Apr-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
hammer: fixed dma uniproc error
Fixed an error regarding DMA for uniprocessor systems. Basically removed an overly aggressive optimization that led to inconsistent state between the cache and the directory. |
8193:c7302d55d644 |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
CacheMemory: add allocateVoid() that is == allocate() but no return value. This function duplicates the functionality of allocate() exactly, except that it does not return a return value. In protocols where you just want to allocate a block but do not want that block to be your implicitly passed cache_entry, use this function. Otherwise, SLICC will complain if you do not consume the pointer returned by allocate(), and if you do a dummy assignment Entry foo := cache.allocate(address), the C++ compiler will complain of an unused variable. This is kind of a hack to get around those issues, but suggestions welcome. |
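The relationship between the two functions can be sketched with a toy cache. This is an illustration only: `TinyCache`, `Entry`, and `isPresent` are hypothetical names, and the real CacheMemory manages sets and ways rather than a map.

```cpp
#include <cassert>
#include <map>
#include <memory>

struct Entry { int data = 0; };

// Hypothetical toy cache keyed by address.
class TinyCache
{
  public:
    // Allocates an entry and returns a pointer the caller is expected to use.
    Entry *allocate(long address)
    {
        auto &slot = entries[address];
        assert(!slot);  // allocating the same address twice is a caller bug
        slot.reset(new Entry());
        return slot.get();
    }

    // Same side effect as allocate(), but the pointer is dropped here,
    // so the caller has nothing to consume or leave unused.
    void allocateVoid(long address) { (void)allocate(address); }

    bool isPresent(long address) const { return entries.count(address) != 0; }

  private:
    std::map<long, std::unique_ptr<Entry>> entries;
};
```

Callers that need the new block use `allocate()`; callers that only want the block to exist use `allocateVoid()`.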
8192:be38f7b6ad9e |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: Simplify SLICC and Entry/TBE handling.
Before this changeset, all local variables of type Entry and TBE were considered to be pointers, but an immediate use of said variables would not be automatically dereferenced in SLICC-generated code. Instead, dereferences occurred when such variables were passed to functions, and were automatically dereferenced in the bodies of the functions (e.g. the implicitly passed cache_entry).
This is a more general way to do it, which leaves in place the assumption that parameters to functions and local variables of type AbstractCacheEntry and TBE are always pointers, but instead of dereferencing to access member variables on a contextual basis, the dereferencing automatically occurs on a type basis at the moment a member is being accessed. So, now, things you can do that you couldn't before include:
Entry foo := getCacheEntry(address); cache_entry.DataBlk := foo.DataBlk;
or
cache_entry.DataBlk := getCacheEntry(address).DataBlk;
or even
cache_entry.DataBlk := static_cast(Entry, pointer, cache.lookup(address)).DataBlk; |
8191:777459f7c61f |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: Add new object called WireBuffer to mimic a Wire.
This is a substitute for MessageBuffers between controllers where you don't want messages to actually go through the Network, because requests/responses can always get reordered with respect to one another (even if you turn off Randomization and turn on Ordered) because you are, after all, going through a network with contention. For systems where you model multiple controllers that are very tightly coupled and do not actually go through a network, it is a pain to have to write a coherence protocol to account for mixed up request/response orderings despite the fact that it's completely unrealistic. This is *not* meant as a substitute for real MessageBuffers when messages do in fact go over a network. |
8189:d5ad24eb015f |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: enable multiple sequencers in one controller. |
8188:20dbef14192d |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: pass Packet->Req->contextId() to Ruby. It is useful for Ruby to understand from whence request packets came. This has all request packets going into Ruby pass the contextId value, if it exists. This supplants the old libruby proc_id value passed around in all the Messages, so I've also removed the unused unsigned proc_id; member generated by SLICC for all Message types. |
8187:99428f716e7b |
31-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: Bug in SLICC forgot semicolon at end of code. |
8184:a8d64545cda6 |
28-Mar-2011 |
Somayeh Sardashti <somayeh@cs.wisc.edu> |
This patch supports cache flushing in MOESI_hammer |
8174:e21f6e70169e |
22-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove CacheMsg class from SLICC
The goal of the patch is to do away with the CacheMsg class currently in use in coherence protocols. In place of CacheMsg, the RubyRequest class will be used. This class is already present in slicc_interface/RubyRequest.hh. In fact, objects of class CacheMsg are generated by copying values from a RubyRequest object. |
8173:2c47dc111abd |
21-Mar-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
This patch makes garnet use the info about active and inactive vnets during allocation and power estimations etc |
8172:bdb039c42553 |
21-Mar-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
fix garnet flexible pipeline |
8171:19444b1f092c |
21-Mar-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
This patch adds the network tester for simple and garnet networks. The tester code is in testers/networktest. The tester can be invoked by configs/example/ruby_network_test.py. A dummy coherence protocol called Network_test is also added for network-only simulations and testing. The protocol takes in messages from the tester and just pushes them into the network in the appropriate vnet, without storing any state. |
8170:c1c6f36e118e |
20-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Remove WakeUp* import calls from ast/__init__.py I had recently committed a patch that removed the WakeUp*.py files from the slicc/ast directory. I had forgotten to remove the import calls for these files from slicc/ast/__init__.py. This resulted in error while running regressions on zizzer. This patch does the needful. |
8165:5955406f7ed0 |
19-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Convert CacheRequestType to RubyRequestType This patch converts CacheRequestType to RubyRequestType so that both the protocol dependent and independent code makes use of the same request type. |
8164:b043c0efa024 |
19-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Convert AccessModeType to RubyAccessMode This patch converts AccessModeType to RubyAccessMode so that both the protocol dependent and independent code uses the same access mode. |
8163:19a654839a04 |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: minor fixes to full-bit dir |
8162:5f69f1b0039e |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
Ruby: dma retry fix
This patch fixes the problem where Ruby would fail to call sendRetry on ports after it nacked the port. This patch is particularly helpful for bursty dma requests which often include several packets. |
8161:ebb373fcb206 |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
RubyPort: minor fixes to trace flag and dprintfs |
8160:0b3252d3b400 |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added useful dma progress dprintf |
8159:de9e34de70ff |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: improved invalid transition message |
8158:519fba665871 |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed dma bug with shared data |
8157:d2cf4b19e8ad |
19-Mar-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_CMP_directory: significant dma bug fixes |
8156:9a6a02a235f1 |
18-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Remove external_type for structures
In SLICC, in order to define a data type for which it should not generate any code, the keyword external_type is used. For those data types for which code should be generated, the keyword structure is used. This patch eliminates the use of the keyword external_type for defining structures. The structure keyword can now have an optional attribute external, which is used for figuring out whether or not to generate the code for this structure. Also, structures can now have functions as well as data members in them. |
8155:099771c7725d |
18-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Remove the keyword wake_up_dependents In order to add stall and wait facility for protocols, a keyword wake_up_dependents was introduced. This patch removes the keyword, instead this functionality is now implemented as function call. |
8154:f3d1493787d4 |
18-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
SLICC: Remove the keyword wake_up_all_dependents In order to add stall and wait facility for protocols, a keyword wake_up_all_dependents was introduced. This patch removes the keyword, instead this functionality is now implemented as function call. |
8134:b01a51ff05fa |
17-Mar-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.
This change fixes the problem for all the cases we actively use. If you want to try more creative I/O device attachments (E.g. sharing an L2), this won't work. You would need another level of caching between the I/O device and the cache (which you actually need anyway with our current code to make sure writes propagate). This is required so that you can mark the cache in between as top level and it won't try to send ownership of a block to the I/O device. Asserts have been added that should catch any issues. |
8132:b0ecadb07742 |
17-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: minor bugfix, line did not adhere to some macro usage conventions. |
8131:03f7df749b9d |
17-Mar-2011 |
Lisa Hsu <Lisa.Hsu@amd.com> |
Ruby: expose a simple mod function in slicc interface. |
8125:05d2937bacbf |
11-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Gems: Eliminate the now unused GEMS_ROOT scons variable. |
8124:0dc6769af3a1 |
11-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Ruby: Get rid of the dead ruby tester.
None of the code in the ruby tester directory is compiled or referred to outside of that directory. This change eliminates it. If it's needed in the future, it can be revived from the history. In the mean time, this removes clutter and the only use of the GEMS_ROOT scons variable. |
8121:457c24115bde |
04-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SCons: Clean up some inconsistent capitalization in scons options. |
8105:906864dd0937 |
02-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Spelling: Fix a spelling error by changing mmaped to mmapped.
There may not be a formally correct spelling for the past tense of mmap, but mmapped is the spelling Google doesn't try to autocorrect. This makes sense because it mirrors the past tense of map->mapped and not the past tense of cape->caped. |
8101:2e1ee8ec6266 |
01-Mar-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Fix DPRINTF bugs in PerfectSwitch and MessageBuffer At a couple of places in PerfectSwitch.cc and MessageBuffer.cc, DPRINTF() has not been provided with correct number of arguments. The patch fixes these bugs. |
8099:265202bbac87 |
01-Mar-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Ruby: Mention that Ruby's bound checking option only applies to Ruby. |
8094:baf4b5f6782e |
27-Feb-2011 |
Nathan Binkert <nate@binkert.org> |
getopt: Remove GPL code. This code is unused and should never have been committed |
8093:05a2f6ac1f8e |
25-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove store buffer
This patch removes the store buffer from Ruby. It is not in use currently. Since libruby is being removed and the store buffer makes calls to libruby, it is not possible to maintain it until substantial changes are made. |
8092:6782b51ae8a8 |
25-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Remove libruby
This patch removes libruby_internal.hh, libruby.hh and libruby.cc. It moves the contents of libruby.hh to the RubyRequest.hh and RubyRequest.cc files. |
8091:04078b1214dd |
25-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Make Address.hh independent of RubySystem
This patch changes Address.hh so that it is not dependent on RubySystem. This dependence seems unnecessary. All those functions that depend on RubySystem have been moved to the Address.cc file. |
8090:722a0d28ee83 |
25-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Make DataBlock.hh independent of RubySystem
This patch changes DataBlock.hh so that it is not dependent on RubySystem. This dependence seems unnecessary. All those functions that depend on RubySystem have been moved to the DataBlock.cc file. |
8086:bf0335d98250 |
23-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: automate permission setting
This patch integrates permissions with cache and memory states, and then automates the setting of permissions within the generated code. No longer does one need to manually set the permissions within the setState function. This patch will facilitate easier functional access support by always correctly setting permissions for both cache and memory states. |
8085:d1eb504fd302 |
23-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: cache probe address clean up |
8084:d1bb88080be4 |
23-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: cleaned up access permission enum |
8083:bba14984f2ce |
23-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed unsupported protocol files |
8077:7544ad480a38 |
23-Feb-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Print out memory when access > 8 bytes |
8066:cb7bf3919bdd |
23-Feb-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
Includes: Don't include isa_traits.hh and use the TheISA namespace unless really needed. |
8055:e1fd27c723a2 |
23-Feb-2011 |
Korey Sewell <ksewell@umich.edu> |
ruby: extend dprintfs for RubyGenerated TraceFlag
"executing" isn't a very descriptive debug message and in going through the output you get multiple messages that say "executing" but nothing to help you parse through the code/execution.
So instead, at least print out the name of the action that is taking place in these functions. |
8054:9138d38eccd7 |
23-Feb-2011 |
Korey Sewell <ksewell@umich.edu> |
ruby: cleaning up RubyQueue and RubyNetwork dprintfs
Overall, continue to progress Ruby debug messages to more of the normal M5 debug message style
- add a name() to the Ruby Throttle & PerfectSwitch objects so that the debug output isn't littered w/"global:" everywhere.
- clean up messages that print over multiple lines when possible
- clean up duplicate prints in the message buffer |
8053:e6ce478c05d3 |
22-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
m5: merged in hammer fix |
8051:c7f591ccf3a1 |
10-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed wakeup for SS->S transition |
8050:5e58eaf00b58 |
19-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Machine Type missing in MOESI CMP directory protocol In certain actions of the L1 cache controller, while creating an outgoing message, the machine type was not being set. This results in a segmentation fault when trace is collected. Joseph Pusudesris provided his patch for fixing this issue. |
8049:44f1ac4f587f |
19-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: clean MOESI CMP directory protocol The L1 cache controller file contains references to foo and goo queues, which are not in use at all. These have been removed. |
7973:e5550966464a |
14-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Improve PerfectSwitch's wakeup function
Currently the wakeup function for the PerfectSwitch contains three loops -
loop on number of virtual networks
  loop on number of incoming links
    loop till all messages for this (link, network) have been routed
With an 8 processor mesh network and Hammer protocol, about 11-12% of the execution time was observed to have been spent in this function, which is the highest amongst all the functions. It was found that the innermost loop is executed about 45 times per invocation of the wakeup function, when each invocation of the wakeup function processes just about one message.
The patch tries to do away with the redundant executions of the innermost loop. Counters have been added for each virtual network that record the number of messages that need to be routed for that virtual network. The inner loops are only executed when the number of messages for that particular virtual network > 0. This does away with almost 80% of the executions of the innermost loop. The function now consumes about 5-6% of the total execution time. |
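The per-vnet counter idea from this entry can be sketched with a toy switch. This is an illustration, not the PerfectSwitch code: `TinySwitch`, `enqueue`, and the integer messages are hypothetical, and popping a message stands in for routing it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy switch: one queue per (vnet, incoming link) pair,
// plus a counter of messages waiting on each vnet.
class TinySwitch
{
  public:
    TinySwitch(std::size_t vnets, std::size_t links)
        : queues(vnets, std::vector<std::deque<int>>(links)),
          pending(vnets, 0)
    {}

    void enqueue(std::size_t vnet, std::size_t link, int msg)
    {
        queues.at(vnet).at(link).push_back(msg);
        ++pending[vnet];
    }

    // Routes everything that is waiting and returns how many
    // (vnet, link) queues had to be inspected to do so.
    std::size_t wakeup()
    {
        std::size_t inspected = 0;
        for (std::size_t v = 0; v < queues.size(); ++v) {
            if (pending[v] == 0)
                continue;  // nothing to route: skip both inner loops
            for (auto &q : queues[v]) {
                ++inspected;
                while (!q.empty()) {
                    q.pop_front();  // stand-in for routing the message
                    assert(pending[v] > 0);
                    --pending[v];
                }
            }
        }
        return inspected;
    }

  private:
    std::vector<std::vector<std::deque<int>>> queues;
    std::vector<std::size_t> pending;
};
```

With three vnets, four links, and one message on vnet 1, a wakeup inspects only the four queues of vnet 1 instead of all twelve.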
7961:e8f4bb35dca9 |
12-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Reorder Cache Lookup in Protocol Files
The patch changes the order in which L1 dcache and icache are looked up when a request comes in. Earlier, if a request came in for instruction fetch, the dcache was looked up before the icache, to correctly handle self-modifying code. But, in the common case, dcache is going to report a miss and the subsequent icache lookup is going to report a hit. Given the invariant that caches under the same controller keep track of disjoint sets of cache blocks, we can move the icache lookup before the dcache lookup. In case of a hit in the icache, using our invariant, we know that the dcache would have reported a miss. In case of a miss in the icache, we know that icache would have missed even if the dcache was looked up before looking up the icache. Effectively, we are doing the same thing as before, though in the common case, we expect reduction in the number of lookups. This was empirically confirmed for MOESI hammer. The ratio of lookups to access requests is now about 1.1 to 1. |
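The effect of the reordering on lookup counts can be sketched with two disjoint sets. This is an illustration only: `L1Pair`, `probe`, and `ifetchPresent` are hypothetical names, and the sketch only reports whether a block is held by either cache, leaving out what the protocol does next.

```cpp
#include <cassert>
#include <set>

// Hypothetical pair of L1 caches under one controller.
struct L1Pair
{
    std::set<long> icache;
    std::set<long> dcache;
    int lookups = 0;  // number of counted tag lookups

    bool probe(const std::set<long> &cache, long addr)
    {
        ++lookups;
        return cache.count(addr) != 0;
    }

    // Instruction fetch: try the icache first, fall back to the dcache.
    bool ifetchPresent(long addr)
    {
        // Invariant from the commit message: the two caches are disjoint.
        assert(!(icache.count(addr) && dcache.count(addr)));
        return probe(icache, addr) || probe(dcache, addr);
    }
};
```

An instruction fetch that hits in the icache costs one counted lookup; only icache misses pay for the second one.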
7940:d6294150a32e |
09-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed duplicate make response call |
7936:9c245e375e05 |
08-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
MESI CMP: Unset TBE pointer in L2 cache controller
The TBE pointer in the MESI CMP implementation was not being set to NULL when the TBE is deallocated. This resulted in a segmentation fault on testing the protocol when the ProtocolTrace was switched on. |
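The class of bug described here, a pointer left dangling after its entry is deallocated, can be sketched as follows. All names (`TinyTBETable`, `releaseTBE`) are hypothetical; in the protocol itself the reset is done from SLICC, not by a helper like this.

```cpp
#include <cassert>
#include <map>

struct TBE { int state = 0; };

// Hypothetical toy transaction buffer table keyed by address.
class TinyTBETable
{
  public:
    TBE *allocate(long addr) { return &table[addr]; }
    void deallocate(long addr) { table.erase(addr); }
    TBE *lookup(long addr)
    {
        auto it = table.find(addr);
        return it == table.end() ? nullptr : &it->second;
    }

  private:
    std::map<long, TBE> table;
};

// Deallocate the entry and clear the caller's pointer in the same step,
// so later code (such as trace printing) sees NULL, not freed memory.
inline void releaseTBE(TinyTBETable &t, long addr, TBE *&tbe)
{
    assert(tbe != nullptr);
    t.deallocate(addr);
    tbe = nullptr;
}
```

Any later use of the pointer can then test it against NULL before dereferencing.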
7929:68f37178b408 |
07-Feb-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Orion: Replace printf() with fatal()
The code for Orion 2.0 makes use of printf() at several places where there is an error in configuration of the model. These have been replaced with fatal(). |
7928:5f2a2deb377d |
07-Feb-2011 |
Korey Sewell <ksewell@umich.edu> |
ruby: add stdio header in SRAM.hh missing header file caused RUBY_FS to not compile |
7922:7532067f818e |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: support to stallAndWait the mandatory queue
By stalling and waiting the mandatory queue instead of recycling it, one can ensure that no incoming messages are starved when the mandatory queue puts significant pressure on the L1 cache controller (i.e. the ruby memtester). |
7921:351f1761765f |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: minor fix to deadlock panic message |
7919:3a02353d6e43 |
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
garnet: Split network power in ruby.stats
Split out dynamic and static power numbers for printing to ruby.stats |
7918:409a2692b8e6 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed dir bug counting received acks |
7917:d9afb18a5008 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: numa bit fix for sparse memory |
7916:b3d642f01495 |
07-Feb-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
MOESI_CMP_token: removed unused message fields |
7915:bc39c93a5519 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
mem: Added support for Null data packet
The packet now identifies whether static or dynamic data has been allocated and is used by Ruby to determine whether to copy the data pointer into the ruby request. Subsequently, Ruby can be told not to update phys memory when receiving packets. |
7910:8a92b39be50e |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fix RubyPort to properly handle retries |
7909:eee578ed2130 |
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
Ruby: Fix to return cache block size to CPU for split data transfers |
7908:4e83ebb67794 |
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
Ruby: Add support for locked memory accesses in X86_FS |
7907:d648b8409d4c |
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
Ruby: Update the Ruby request type names for LL/SC |
7906:5ccd97218ca0 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Assert for x86 misaligned access
This patch ensures only aligned access are passed to ruby and includes a fix to the DPRINTF address print. |
7904:6f5299ff8260 |
07-Feb-2011 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: Added full-bit directory support |
7898:73bc24002f82 |
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
MOESI_hammer: trigger queue fix. |
7896:46e9b3bf447f |
07-Feb-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
garnet: added orion2.0 for network power calculation |
7895:8439266ec9e5 |
07-Feb-2011 |
Tushar Krishna <tushar@csail.mit.edu> |
garnet: separate data and ctrl VCs
Separate data VCs and ctrl VCs in garnet, as ctrl VCs have 1 buffer per VC, while data VCs have > 1 buffers per VC. This is for correct power estimations. |
7839:9e556fb25900 |
17-Jan-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Change interface between coherence protocols and CacheMemory The purpose of this patch is to change the way CacheMemory interfaces with coherence protocols. Currently, whenever a cache controller (defined in the protocol under consideration) needs to carry out any operation on a cache block, it looks up the tag hash map and figures out whether or not the block exists in the cache. In case it does exist, the operation is carried out (which requires another lookup). As observed through profiling of different protocols, multiple such lookups take place for a given cache block. It was noted that the tag lookup takes anything from 10% to 20% of the simulation time. In order to reduce this time, this patch is being posted.
I have to acknowledge that many of the thoughts that went into this patch belong to Brad.
Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to the CacheMemory class now returns a pointer to a cache block entry, instead of a reference. The pointer is NULL in case the block being looked up is not present in the cache. A similar change has been carried out in the lookup function of the TBETable class.
2. Functions for setting and getting the access permission of a cache block have been moved from the CacheMemory class to the AbstractCacheEntry class.
3. The allocate function in the CacheMemory class now returns a pointer to the allocated cache entry.
Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry, if != NULL, must point to the cache entry for the address on which the action is being carried out. Similarly, tbe should also point to the transaction buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the actions are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry and set/unset_tbe are to be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() have been made available for testing whether a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers always.
7. It is now possible for an object of the derived class to make calls to a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is the same as the NULL token used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now take an optional parameter that asks for casting the given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is set to the type which inherits from 'AbstractCacheEntry'. There can only be one such type in the machine. TBEType is set to the type for which 'TBE' is used as the name.
All the protocols have been modified to conform with the new interface. |
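The pointer-returning lookup described in this changeset can be sketched in C++ as follows. This is a minimal illustration, not the actual Ruby code: `ToyCache`, `ToyEntry`, `Addr`, and `setPermissionIfPresent` are hypothetical stand-ins. It only shows how one lookup can serve both the presence check and the later operation, which is the double lookup the message says the patch aims to avoid.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for illustration; not the real Ruby classes.
using Addr = std::uint64_t;

struct ToyEntry
{
    int permission = 0;
};

class ToyCache
{
  public:
    // Returns a pointer to the entry, or nullptr if the block is absent.
    ToyEntry *
    lookup(Addr addr)
    {
        auto it = m_entries.find(addr);
        return it == m_entries.end() ? nullptr : &it->second;
    }

    // Allocates (or finds) an entry and returns a pointer to it.
    ToyEntry *
    allocate(Addr addr)
    {
        return &m_entries[addr];
    }

  private:
    std::unordered_map<Addr, ToyEntry> m_entries;
};

// One lookup serves both the presence check and the update.
bool
setPermissionIfPresent(ToyCache &cache, Addr addr, int permission)
{
    ToyEntry *entry = cache.lookup(addr);
    if (entry == nullptr)
        return false;
    entry->permission = permission;
    return true;
}
```

A caller that already holds the pointer can pass it on to later steps instead of repeating the tag lookup.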
7835:8f37a23e02d7 |
13-Jan-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Fixes MESI CMP directory protocol The current implementation of MESI CMP directory protocol is broken. This patch, from Arkaprava Basu, fixes the protocol. |
7832:de7601e6e19d |
10-Jan-2011 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of ruby's Debug.hh
Get rid of the Debug class
Get rid of ASSERT and use assert
Use DPRINTFR for ProtocolTrace |
7823:dac01f14f20f |
08-Jan-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Replace curTick global variable with accessor functions. This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values. |
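A minimal sketch of the accessor pattern this changeset describes, assuming a single backing variable. The storage name `tickStorage` and the exact signatures are assumptions made for illustration, not taken from the changeset.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

namespace
{
// Assumed backing storage: still one global value in this sketch.
Tick tickStorage = 0;
} // namespace

// Read accessor: callers no longer touch the variable directly.
inline Tick
curTick()
{
    return tickStorage;
}

// Write accessor: the only place that updates the value.
inline void
curTick(Tick newVal)
{
    tickStorage = newVal;
}
```

Because every read and write goes through the two functions, a later switch to per-thread values would only need to change the storage and these two bodies, not the call sites.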
7815:9f9e10967912 |
04-Jan-2011 |
Nilay Vaish<nilay@cs.wisc.edu> |
Ruby: Updates MOESI Hammer protocol This patch changes the manner in which data is copied from L1 to L2 cache in the implementation of the Hammer's cache coherence protocol. Earlier, data was copied directly from one cache entry to another. This has been broken into two parts. First, the data is copied from the source cache entry to a transaction buffer entry. Then, data is copied from the transaction buffer entry to the destination cache entry.
This has been done to maintain the invariant - at any given instant, multiple caches under a controller are exclusive with respect to each other. |
7811:a8fc35183c10 |
03-Jan-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Make commenting on close namespace brackets consistent.
Ran all the source files through 'perl -pi' with this script:
s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;
Also did a little manual editing on some of the arch/*/isa_traits.hh files and src/SConscript. |
7806:fbf4b1b18202 |
23-Dec-2010 |
Nilay Vaish<nilay@cs.wisc.edu> |
PerfectCacheMemory: Add return statements to two functions. Two functions in src/mem/ruby/system/PerfectCacheMemory.hh, tryCacheAccess() and cacheProbe(), end with calls to panic(). Both of these functions have a return type other than void. Any file that includes this header file fails to compile because of the missing return statements. This patch adds dummy return values so as to avoid the compiler warnings. |
7805:f249937228b5 |
23-Dec-2010 |
Nilay Vaish<nilay@cs.wisc.edu> |
This patch removes the WARN_* and ERROR_* from src/mem/ruby/common/Debug.hh file. These statements have been replaced with warn(), panic() and fatal() defined in src/base/misc.hh |
7793:f6cbeb8712d3 |
08-Dec-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: remove Ruby asserts for m5.fast
This diff is for changing the way ASSERT is handled in Ruby. m5.fast compiles out the assert statements by using the macro NDEBUG. Ruby uses the macro RUBY_NO_ASSERT to do so. This macro has been removed and NDEBUG has been put in its place. |
7780:42da07116e12 |
01-Dec-2010 |
Nilay Vaish <nilay@cs.wisc.edu> |
ruby: Converted old ruby debug calls to M5 debug calls
This patch developed by Nilay Vaish converts all the old GEMS-style ruby debug calls to the appropriate M5 debug calls. |
7770:6286bb50127e |
19-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
SE: Fix simulating more than 4GB of RAM in SE mode
This change removes some dead code in PhysicalMemory, uses a 64 bit type for the page pointer in System (instead of 32 bit) and cleans up some style. |
7768:cdb18c1b51ea |
19-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
SCons: Support building without an ISA |
7733:08d6a773d1b6 |
08-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Add checkpointing support |
7730:982b4c6c1470 |
08-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
Mem: Finish half-baked support for mmaping file in physmem.
Physmem has a parameter to be able to mem map a file, however it isn't actually used. This changeset utilizes the parameter so a file can be mmapped. |
7710:5e129d3c6d7e |
18-Oct-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: minor SC assertion fix
Thanks to Joe Gross for finding/testing this. |
7708:956ac83b0a58 |
16-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Reclaim some request flags used by MIPS for alignment checking.
These flags were being used to identify what alignment a request needed, but the same information is available using the request size. This change also eliminates the isMisaligned function. If more complicated alignment checks are needed, they can be signaled using the ASI_BITS space in the flags vector like is currently done with ARM. |
7705:fd65f85fcc0c |
13-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Change the CLREX flag to CLEAR_LL.
CLREX is the name of an ARM instruction, not a name for this generic flag. |
7691:358c00c482f7 |
30-Sep-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
CPU/Cache: Fix some errors exposed by valgrind |
7687:d1ba390671ec |
22-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: improve coherence handling of writebacks If we write back an exclusive copy, we now mark it as such, so the cache receiving the writeback can mark its copy as exclusive. This avoids some unnecessary upgrade requests when a cache later tries to re-acquire exclusive access to the block. |
7678:f19b6a3a8cec |
13-Sep-2010 |
Gabe Black <gblack@eecs.umich.edu> |
Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.
Also move the "Fault" reference counted pointer type into a separate file, sim/fault.hh. It would be better to name this less similarly to sim/faults.hh to reduce confusion, but fault.hh matches the name of the type. We could change Fault to FaultPtr to match other pointer types, and then changing the name of the file would make more sense. |
7676:92274350b953 |
10-Sep-2010 |
Nathan Binkert <nate@binkert.org> |
style: fix sorting of includes and whitespace in some files |
7672:d609cd948ca0 |
09-Sep-2010 |
Nathan Binkert <nate@binkert.org> |
code_formatter: make it easier to insert whitespace
A newline can be inserted by just doing "code()". indent() and dedent() now take a "count" parameter to indent/dedent multiple levels. |
7669:cc222ba29079 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fail SC when invalidated while waiting for bus Corrects an oversight in cset f97b62be544f. The fix there only failed queued SCUpgradeReq packets that encountered an invalidation, which meant that the upgrade had to reach the L2 cache. To handle pending requests in the L1 we must similarly fail StoreCondReq packets too. |
7668:aec271db42c9 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix functional accesses to deal with coherence change We can't just obliviously return the first valid cache block we find any more... see comments for details. |
7667:aa8fd8f6a495 |
09-Sep-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: coherence protocol enhancements & bug fixes Allow lower-level caches (e.g., L2 or L3) to pass exclusive copies to higher levels (e.g., L1). This eliminates a lot of unnecessary upgrade transactions on read-write sequences to non-shared data.
Also some cleanup of MSHR coherence handling and multiple bug fixes. |
7659:657f0adae97c |
26-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix m5.fast compile bug in previous cset |
7658:3148ae920301 |
26-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix a bug in atomic multilevel snoops |
7636:59b6a1b5bb0c |
25-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: fix dumb typo in copyrights |
7632:acf43d6bbc18 |
24-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
testers: move testers to a new directory
This patch moves the testers to a new subdirectory under src/cpu and includes the necessary fixes to work with latest m5 initialization patches. |
7631:9bd6e86476d2 |
24-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: fixed bug for dma reads in single cpu systems |
7612:917946898102 |
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
MEM: Make CLREX a first class request operation and clear locks in caches when it is received |
7611:c119da5a80c8 |
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: Make sure that software prefetch instructions can't change the state of the TLB |
7576:4154f3e1edae |
23-Aug-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
Compiler: Fixes for GCC 4.5. |
7569:96a602c5368d |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added merge GETS optimization to hammer
Added an optimization that merges multiple pending GETS requests into a single request to the owner node. |
7567:238f99c9f441 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Stall and wait input messages instead of recycling
This patch allows messages to be stalled in their input buffers and wait until a corresponding address changes state. In order to make this work, all in_ports must be ranked in order of dependence and those in_ports that may unblock an address must wake up the stalled messages. A lot of this complexity is handled in slicc and the specification files simply annotate the in_ports. |
7566:6919df046bba |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Recycle latency fix for hammer
Patch allows each individual message buffer to have different recycle latencies and allows the overall recycle latency to be specified at the cmd line. The patch also adds profiling info to make sure no one processor's requests are recycled too much. |
7565:9fc3475e8175 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_hammer: break down miss latency stalled cycles
This patch tracks the number of cycles a transaction is delayed at different points of the request-forward-response loop. |
7564:3559d47839a1 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added probe filter support to hammer |
7563:406e98960def |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed DirectoryMemory's numa_high_bit configuration
This fix includes the off-by-one bit selection bug for numa mapping. |
7562:ec3b148b14f3 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Reset ruby stats in RubySystem unserialize
The main purpose for clearing stats in the unserialize process is so that the profiler can correctly set its start time to the unserialized value of curTick. |
7561:02a9a597fce4 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Disable migratory sharing for token and hammer
This patch allows one to disable migratory sharing for those cache blocks that are accessed by atomic requests. While the implementations are different between the token and hammer protocols, the motivation is the same. For Alpha, LLSC semantics expect that normal loads do not unlock cache blocks that have been locked by LL accesses. Therefore, locked blocks should not transfer write permissions when responding to these load requests. Instead, they only transfer read permissions so that the subsequent SC access can possibly succeed. |
7560:29d5891a96d6 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added SC fail indication to trace profiling |
7558:6c3f81b176da |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed RubyPort sendTiming callbacks
Fixed RubyPort schedSendTiming calls to match ruby frequency. |
7556:5dc128cab5dc |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed token bugs associated with owner token counts
This patch fixes several bugs related to previous inconsistent assumptions on how many tokens the Owner had. Mike Marty should have fixed these bugs years ago. :) |
7554:40ba2226f274 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MOESI_CMP_token dma fixes
This patch fixes various protocol bugs regarding races between dma requests and persistent requests. |
7553:fcdd99057b8a |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Resurrected Ruby's deterministic tests
Added the request series and invalidate deterministic tests as new cpu models and removed the no longer needed ruby tests |
7552:2e4786ed3f90 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Updated MOESI_hammer L2 latency behavior
Previously, the MOESI_hammer protocol calculated the same latency for L1 and L2 hits. This was because the protocol was written using the old ruby assumption that L1 hits used the sequencer fast path. Since ruby no longer uses the fast-path, the protocol delays L2 hits by placing them on the trigger queue. |
7551:b10ee98aea91 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Reduced ruby latencies
The previous slower ruby latencies created a mismatch between the faster M5 cpu models and the much slower ruby memory system. Specifically smp interrupts were much slower and infrequent, as well as cpus moving in and out of spin locks. The result was many cpus were idle for large periods of time.
These changes fix the latency mismatch. |
7550:7d97cec15818 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fix ruby llsc support to sync sc outcomes
Added support so that ruby can determine the outcome of store conditional operations and reflect that outcome to M5 physical memory and cpus. |
7549:08dbd22d58a0 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed L2 cache miss profiling
Fixed L2 cache miss profiling for the MOESI_CMP_token protocol |
7548:764a7401e217 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added bcast msg profiling to hammer and token |
7547:a5ddcb2abfa1 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added consolidated network msg stats |
7546:84e8f914b3b8 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Reincarnated the responding machine profiling
This patch adds back to ruby the capability to understand the response time for messages that hit in different levels of the cache hierarchy. Specifically, it adds support for the MI_example, MOESI_hammer, and MOESI_CMP_token protocols. |
7545:3d5d3653eaa4 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MOESI_CMP_token: Fixed dma persistent lockdown bugs |
7544:90c5eb6a5e66 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
memtest: Memtester support for DMA
This patch adds DMA testing to the Memtester and inherits many changes from Polina's old tester_dma_extension patch. Since Ruby does not work in atomic mode, the atomic mode options are removed. |
7543:e660ab620115 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added ruby_request_type ostream def to libruby.hh |
7542:49327b849c7f |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
slicc: Consolidated the protocol stats printing
Created a separate ProfileDumper that consolidates the generated stats for each controller of a certain type. |
7540:86c3bf056a0d |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
config: Added the topology description to m5 config.ini |
7537:8178df9c17c4 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed printout when Sequencer detects a deadlock |
7536:2eb9d43d7b41 |
20-Aug-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
MESI_CMP_directory: bug fix for old PUTX requests |
7523:9c8fdcdae976 |
17-Aug-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
bus: clean up default responder code. Clean up some minor things left over from the default responder change in rev 9af6fb59752f. Mostly renaming the 'responder_set' param to 'use_default_range' to actually reflect what it does... old name wasn't that descriptive in the first place, but now it really doesn't make sense at all.
Also got rid of the bogus obsolete assignment to 'bus.responder' which used to be a parameter but now is interpreted as an implicit child assignment, and which was giving me problems in the config restructuring to come. (A good argument for not allowing implicit child assignments, IMO, but that's water under the bridge, I'm afraid.)
Also moved the Bus constructor to the .cc file since that's where it should have been all along. |
7510:fb7fc9aca918 |
22-Jul-2010 |
Timothy M. Jones <tjones1@inf.ed.ac.uk> |
Port: Only indicate that a SimpleTimingPort is drained if its send event is not scheduled, as well as the transmit list being empty. |
7497:aab017d1adc6 |
08-Jul-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix bug in SC upgrade handling This bug was introduced with the recent rework of SC failure handling in cset f97b62be544f. |
7496:10510cc7bb9f |
08-Jul-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
garnet: Added topology print function to Garnet printStats |
7495:2e8e5bb0b4c2 |
08-Jul-2010 |
Tushar Krishna <Tushar.Krishna@amd.com> |
NetworkMessage copy constructor fix |
7486:3006bde825fd |
22-Jun-2010 |
Tushar Krishna <Tushar.Krishna@amd.com> |
style: updated garnet to match M5 coding style |
7468:6b72468fbad3 |
23-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix longstanding prefetcher bug Thanks to Joe Gross for pointing this out (again?). Apologies to anyone who pointed it out earlier and we didn't listen. |
7465:f97b62be544f |
16-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fail store conditionals when upgrade loses race Requires new "SCUpgradeReq" message that marks upgrades for store conditionals, so downstream caches can fail these when they run into invalidations. See http://www.m5sim.org/flyspray/task/197 |
7464:8d92c2737ac8 |
16-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: fix dirty bit setting Only set the dirty bit when we actually write to a block (not if we thought we might but didn't, as in a failed SC or CAS). This requires making sure the dirty bit stays set when we get an exclusive (writable) copy in a cache-to-cache transfer from another owner, which in turn requires copying the mem-inhibit flag from timing-mode requests to their associated responses. |
7461:5a07045d0af2 |
15-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
stats: only consider a formula initialized if there is a formula |
7456:8b9be6e12c9b |
11-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of PrioHeap and use STL
One big difference is that PrioHeap puts the smallest element at the top of the heap, whereas stl puts the largest element on top, so I changed all comparisons so they did the right thing.
Some usage of PrioHeap was simply changed to a std::vector, using sort at the right time, other usage had me just use the various heap functions in the stl. |
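The ordering difference noted in this changeset can be illustrated with a short C++ sketch. The standard heap functions keep the largest element on top by default, so a smallest-on-top heap needs the comparison reversed, here with `std::greater`. The helper names `minHeapPush` and `minHeapPop` are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Push a value onto a vector maintained as a min-heap.
void
minHeapPush(std::vector<int> &heap, int value)
{
    heap.push_back(value);
    // std::greater reverses the default max-heap ordering.
    std::push_heap(heap.begin(), heap.end(), std::greater<int>());
}

// Remove and return the smallest element of a non-empty min-heap.
int
minHeapPop(std::vector<int> &heap)
{
    assert(!heap.empty());
    // Moves the smallest element to the back, then re-heapifies the rest.
    std::pop_heap(heap.begin(), heap.end(), std::greater<int>());
    int smallest = heap.back();
    heap.pop_back();
    return smallest;
}
```

The same comparator must be used for every heap operation on a given container; mixing comparators leaves the heap property undefined.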
7455:586f99bf0dc4 |
11-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of the Map class |
7454:3a3e8e8cce1b |
11-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of Vector and use STL
add a couple of helper functions to base for deleting all pointers in a container and outputting containers to a stream |
7453:1a5db3dd0f62 |
11-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of RefCnt and Allocator stuff use base/refcnt.hh
This was somewhat tricky because the RefCnt API was somewhat odd. The biggest confusion was that the RefCnt object's constructor that took a TYPE& cloned the object. I created an explicit virtual clone() function for things that took advantage of this version of the constructor. I was conservative and used clone() when I was in doubt of whether or not it was necessary. I still think that there are probably too many instances of clone(), but hopefully not too many.
I converted several instances of const MsgPtr & to a simple MsgPtr. If the function wants to avoid the overhead of creating another reference, then it should just use a regular pointer instead of a ref counting ptr.
There were a couple of instances where refcounted objects were created on the stack. This seems pretty dangerous since if you ever accidentally make a reference to that object with a ref counting pointer, bad things are bound to happen. |
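The distinction between sharing a reference and explicitly cloning can be sketched as follows. This is an illustration only: it uses `std::shared_ptr` as a stand-in for the reference-counting pointer in base/refcnt.hh, and `ToyMessage` and `ToyMsgPtr` are hypothetical names, not the types touched by the changeset.

```cpp
#include <cassert>
#include <memory>

// Hypothetical message type; std::shared_ptr stands in for the
// reference-counting pointer used in the real code.
struct ToyMessage
{
    int payload = 0;

    virtual ~ToyMessage() = default;

    // Explicit deep copy: the caller asks for a new object by name
    // instead of getting one implicitly from a constructor.
    virtual std::shared_ptr<ToyMessage>
    clone() const
    {
        return std::make_shared<ToyMessage>(*this);
    }
};

using ToyMsgPtr = std::shared_ptr<ToyMessage>;
```

Copying a `ToyMsgPtr` yields a second reference to the same object, while `clone()` yields an independent copy, which makes the intent visible at the call site.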
7450:2302e04c506e |
07-Jun-2010 |
Steve Reinhardt <stever@gmail.com> |
scons: make RUBY a regular (non-global) sticky var and force it to True for builds that imply Ruby protocols (else unexpected things happen when testing these builds with RUBY=False). |
7089:9ea24d102d66 |
01-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
style: clean up ruby's Set class
Further cleanup should probably be done to make this class be non-Ruby specific and put it in src/base.
There are probably several cases where this class is used and std::bitset could be used instead. |
7064:586b0e3a12b3 |
15-Apr-2010 |
Nathan Binkert <nate@binkert.org> |
tick: rename Clock namespace to SimClock |
7056:b66b558578bd |
02-Apr-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of gems_common/util.hh and .cc and use stuff in src/base |
7055:4e24742201d7 |
02-Apr-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get "using namespace" out of headers In addition to obvious changes, this required a slight change to the slicc grammar to allow types with :: in them. Otherwise slicc barfs on std::string which we need for the headers that slicc generates. |
7054:7d6862b80049 |
31-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
style: another ruby style pass |
7048:2ab58c54de63 |
24-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: continue style pass |
7044:8a05ebc9d372 |
23-Mar-2010 |
Korey Sewell <ksewell@umich.edu> |
m5merge(2): another merge of regression stats |
7039:bc0b6ea676b5 |
22-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: style pass |
7035:b78b3a9e205f |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: improved isReadWrite fix me comment |
7033:35cb92cbd50c |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the unnecessary MachineType message fields |
7032:9f938aea1942 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Reorganized Ruby topology and protocol files |
7030:a200627c3d42 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Disable adaptive routing by default for faster simulation perf. |
7029:9a48c447bc19 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Changed the default set size to 1
Previously, the set size was set to 4. This was mostly due to the fact that a crazy graduate student used to create networks with 256 l2 cache banks. Now it is far more likely that users will create systems with fewer than 64 of any particular controller type. Therefore Ruby should be optimized for a set size of 1. |
7028:56dfde6abe48 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Reordered protocol buffers
Reordered vnet priorities to agree with PerfectSwitch for protocols MI_example, MOESI_CMP_token, and MOESI_hammer |
7027:46b02e79bf2c |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Adds configurable bit selection for numa mapping |
7026:3f4c23e9d67d |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added flag to disable mem_vec allocation
The RubySystem flag no_mem_vec will disable Ruby from allocating its memory data array. |
7025:9adf5b0ccc79 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby support for sparse memory
The patch includes direct support for the MI example protocol. |
7024:30883414ad10 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Finally removed bash code circa 2001ish! |
7023:185ad61a4117 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby support for LLSC |
7022:836ba66301a1 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Minor dma latency initialization fix |
7021:0b3c02da71b3 |
22-Mar-2010 |
Tushar Krishna <tushar@csail.mit.edu> |
ruby: Fix multiple wakeups in Ruby Eventqueue
Fix bug in Ruby Event queue to avoid multiple wakeups of same consumer in same cycle |
7020:34a5bdcce1e6 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the obsolete file-specified network files |
7019:a49fd5febdce |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added copyright to many Ruby *.py files |
7017:7b557c87a19a |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed small data msg bug in MOESI_hammer-dir |
7013:7827a86b8d24 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the no longer used rubymem files |
7012:0ef205fb6d6f |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fix MOESI_hammer cache profiler calls for L2 misses |
7010:c769c45253c9 |
22-Mar-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed deprecated stats from the main profiler |
7009:44ed5e0c7228 |
16-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
orion: Make declarations match definition |
7008:90c097fb76e1 |
14-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: Fix copyrights on files Mostly files missed during import or screwed up during import |
7007:79413d1ec307 |
12-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
slicc: Change the code generation so that the generated code is easier to read |
7006:77c9c4d5007d |
12-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
packet: add a method to set the size |
7003:5af96fb1ebde |
12-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
bugfix: since pow() causes a bug don't use it It's a power of two anyway, so why use it in the first place. |
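The changeset does not say what the bug was or what replaced pow(), so the following is only a sketch of one common alternative for powers of two: an integer shift, which avoids the floating-point round trip. The function name `powerOfTwo` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Computes 2^exponent with an integer shift instead of pow().
inline std::uint64_t
powerOfTwo(unsigned exponent)
{
    // Shifting a 64-bit value by 64 or more bits is undefined behavior.
    assert(exponent < 64);
    return std::uint64_t{1} << exponent;
}
```

The result is exact for every exponent in range, with no conversion between double and integer types.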
7002:48a19d52d939 |
10-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of std-includes.hh
Do not use "using namespace std;" in headers
Include header files as needed |
7001:eca0b78f6d96 |
10-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: remove calc_host.diff since we don't use it |
7000:084c8da32789 |
10-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
ruby: get rid of the ioutil stuff since it isn't used anymore |
6999:f226c098c393 |
10-Mar-2010 |
Nathan Binkert <nate@binkert.org> |
slicc: have a central mechanism for creating a code_formatter. This makes it easier to add global variables like protocol |
6979:7732bca47f60 |
24-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache stats: account for writebacks and/or device occupancy in the cache. Plus, a minor fix for a bug that neglected to update blk->contextSrc in certain cases on a cache insert. |
6978:ab05e20dc4a7 |
23-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: Make caches sharing aware and add occupancy stats. On the config end, if a shared L2 is created for the system, it is parameterized to have n sharers as defined by option.num_cpus. In addition to making the cache sharing aware so that discriminating tag policies can make use of context_ids to make decisions, I added an occupancy AverageStat and an occ % stat to each cache so that you could know which contexts are occupying how much cache on average, both in terms of blocks and percentage. Note that since devices have context_id -1, having an array of occ stats that correspond to each context_id will break here, so in FS mode I add an extra bucket for device blocks. This bucket is explicitly not added in SE mode in order to not only avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas break when a bucket is 0). |
6976:1d7008e14da6 |
23-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: pull CacheSet out of LRU so that other tags can use associative sets. |
6971:12cfde8f819b |
10-Feb-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed data block assignment fix
Fixed data block assignment to not delete if not internally allocated. |
6970:3d8241813e4b |
10-Feb-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Initialize sender in MI_example-dir |
6969:1ab268977bbb |
10-Feb-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed slicc to initialize the m_is_blocking flag |
6968:33d2b758697b |
01-Feb-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added FS support to the simple mesh topology
Added full-system support to the simple mesh topology by allowing dma controllers to be attached to router zero in the network. |
6967:ee82497f749c |
01-Feb-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Set default protocol back to MI_example |
6928:5bd33f7c26ea |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
m5: Regression Tester Update
This patch includes the necessary regression updates to test the new ruby configuration system. The patch includes support for multiple ruby protocols and adds the ruby random tester. The patch removes the atomic mode test for ruby since ruby does not support atomic mode accesses. These tests can be added back in when ruby supports atomic mode for real. |
6927:a162a2b0b1f8 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Replaced gems_common debug statements
Replaced Ruby debug statements with M5 statements. |
6926:775342cda4db |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed last level cache support
Removed the last level cache support and MOESI_hammer's dependency on it. Replaces the LLC support with the more generic MachineType count. |
6925:a27441e3d106 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added a Scons option to prevent HTML file creation |
6922:1620cffaa3b6 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed static members in RubyPort including hitcallback
Removed static members in RubyPort and removed the ruby request unique id. |
6921:fd852ed8c6b4 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the old config interface
Removed the old config interface from RubySystem and libruby. |
6920:e031f09a7dcc |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Re-enabled orion power models
Removed the dummy power function implementations so that Orion can implement them correctly. Since Orion lacks modular design, this patch simply enables scons to compile it. There are no python configuration changes in this patch. |
6918:9b57f0108bc8 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Converted Garnet to M5 configuration |
6917:341a71fd2600 |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Garnet: reorganize directory tree. Rename the ruby/network/garnet-foo directories to garnet/foo. Move the common NetworkHeader.hh file from garnet-fixed-pipeline up to the common garnet directory. Fix up include paths. |
6916:a421f60f0e87 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added a mesh topology |
6915:13e4df0df905 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MESI_CMP_directory updated to the new config system |
6914:af5360e5ccd6 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Sorted the file includes to maintain consistency |
6913:a846f65efe55 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Renamed the MESI directory sm file
Renamed the MESI directory file to be consistent with all other protocols. |
6912:5d6887ca9dc4 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the GPL header in MESI_CMP_directory-msg
I'm not sure how this got past our initial ruby code import, but this obviously needed to be removed. |
6911:1fdbff869ff4 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MOESI_CMP_directory updated to the new config system |
6910:70026d87d4f1 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added atomic support to MOESI_CMP_token |
6909:041a27b02642 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed memory fetch bug for persistent requests |
6908:0e1d7624e641 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MOESI_CMP_token updates to use the new config system |
6907:b05de761960e |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Allows boolean and string defaults for StateMachine parameters |
6906:35da51c349e2 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MI_example updates to use the new config system |
6903:27f47cf65ab7 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: convert to M5 MemorySize Converted both ruby caches and directory memory to use the M5 MemorySize python type. |
6902:b5baf1dc44b4 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added Cache and MemCntrl profiler calls |
6901:a375402313df |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added data print to ruby request |
6900:5d01c182d6a7 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added atomic support to MOESI_hammer |
6899:f8057af86bf7 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added the GEMS ruby tester |
6898:dff0720d106d |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed MOESI_hammer data writebacks to the directory |
6897:cfeb3d9563dd |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: cleaned up ruby profilers Cleaned up the ruby profilers by moving the memory controller profiling code out of the main profiler object and into a separate object similar to the current CacheProfiler. Both the CacheProfiler and MemCntrlProfiler are specific to a particular Ruby object, CacheMemory and MemoryControl respectively. Therefore, these profilers should not be SimObjects and created by the python configuration system, but instead private objects. This simplifies the creation of these profilers. |
6896:649e40aad897 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed RubySystem::getNumberOfSequencers removed the static function RubySystem::getNumberOfSequencers and replaced it with a python config variable |
6895:5f3d2d3f977e |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added ruby stats print Moved the previous rubymem stats print feature to ruby System so that ruby stats are printed on simulation exit. |
6894:fcd9e5ed33f7 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed Set.cc bug to allow zero sized sets This is necessary for example when no dma sequencers are necessary in the simulated system. |
6893:9cdf9b65d946 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: FS support using the new configuration system |
6892:6a2db6c8a9b1 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: reorganized ruby python configuration
Reorganized ruby python configuration so that protocol and ruby memory system configuration code can be shared by multiple front-end configuration files (i.e. memory tester, full system, and hopefully the regression tester). This code works for the memory tester, but fs mode has not been tested. |
6891:77451885bb00 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed out_link_vec from Consumer
Removed the out_link_vec data structure from the Consumer. I'm not sure what this did before, but currently it serves no purpose. |
6890:87dea2f9f791 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Converted ruby tracing support usage of sequencer
Modified ruby's tracing support to no longer rely on the RubySystem map to convert a sequencer string name to a sequencer pointer. As a temporary solution, the code uses the sim_object find function. Eventually, we should develop a better fix. |
6889:323cd43a3c46 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Memory Controller Profiler with new config system
This patch includes a rather substantial change to the memory controller profiler in order to work with the new configuration system. Most notably, the mem_cntrl_profiler no longer uses a string map, but instead a vector. Eventually this support should be removed from the main profiler and go into a separate object. Each memory controller should have a pointer to that new mem_cntrl profile object. |
6888:de8e755aca4f |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Converted MOESI_hammer dma cntrl to new config system |
6887:b10cae7bacf4 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added the cache profiler to the new config system |
6886:3137c3d41107 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Converted the sequencer deadlock event to m5 eventq |
6885:e07489ad819f |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Wrapped ruby events into m5 events Wrapped ruby events using the m5 event object. Removed the prio_heap from ruby's event queue and instead schedule ruby events on the m5 event queue. |
6884:28a5d2e6b1ff |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed the tech_nm variable from RubySystem |
6883:f57e272cf8a1 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added clock to ruby system As a first step to migrate ruby to the M5 eventqueue, added a clock variable to the ruby system. |
6882:898047a3672c |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby changes required to use the python config system
This patch includes the necessary changes to connect ruby objects using the python configuration system. Mainly it consists of removing unnecessary ruby object pointers and connecting the necessary object pointers using the generated param objects. This patch includes the slicc changes necessary to connect generated ruby objects together using the python configuration system. |
6881:5a61a8a9009a |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: connects sm queues to the network |
6880:a9e3c07205a8 |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: Calculate system total memory capacity in Python rather than in RubySystem object. |
6879:c07cf29b5a33 |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: Add support for generating topologies in Python. |
6878:c3a3c09af8be |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
scons: ignore blank lines in .slicc files |
6877:2a1a3d916ca8 |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: Make SLICC-generated objects SimObjects. Also add SLICC support for state-machine parameter defaults (passed through to Python as SimObject Param defaults). |
6876:a658c315512c |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: Convert most Ruby objects to M5 SimObjects.
The companion conversion of SLICC-generated Ruby objects to M5 SimObjects is in the following patch, so this patch alone does not compile. Conversion of Garnet network models is also handled in a separate patch; that code is temporarily disabled from compiling to allow testing of interim code. |
6875:5eb6e323b595 |
29-Jan-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: get rid of obsolete, unused CustomTopology class. |
6873:f55de179b43d |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fix out_port declaration |
6872:b26f60c254c1 |
29-Jan-2010 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added message type check to OutPortDeclAST.py
Though OutPort's message type is not used to generate code, this fix checks that the programmer's intent is correct. Eventually, we may want to remove the message type from the OutPort declaration statement. |
6865:abfa00a2a23a |
22-Jan-2010 |
Derek Hower <drh5@cs.wisc.edu> |
Automated merge with ssh://hg@m5sim.org/m5 |
6863:21fbf0412e0d |
19-Jan-2010 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: new atomics implementation
This patch changes the way that Ruby handles atomic RMW instructions. This implementation, unlike the prior one, is protocol independent. It works by locking an address from the sequencer immediately after the read portion of an RMW completes. When that address is locked, the coherence controller will only satisfy requests coming from one port (e.g., the mandatory queue) and will ignore all others. After the write portion completes, the line is unlocked. This should also work with multi-line atomics, as long as the blocks are always acquired in the same order. |
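The locking scheme described in this message can be sketched roughly as below. This is a toy model under stated assumptions, not Ruby's implementation: the class `LockingController`, its method names, and the string port labels are all hypothetical, and "ignoring" a request is modeled simply as returning False.

```python
class LockingController:
    """Hypothetical model of address locking around an RMW (illustration only)."""

    def __init__(self):
        # Maps a locked address to the single port still served for it.
        self.locked = {}

    def rmw_read_done(self, addr, port="mandatory"):
        # Lock the address as soon as the read half of the RMW completes.
        self.locked[addr] = port

    def rmw_write_done(self, addr):
        # Unlock once the write half completes.
        self.locked.pop(addr, None)

    def accepts(self, addr, port):
        # While locked, only the owning port is served; others are ignored.
        owner = self.locked.get(addr)
        return owner is None or owner == port
```

Locks are kept per address, so requests to other lines proceed normally while one line is held between the read and write halves.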
6862:3d308cbd1657 |
19-Jan-2010 |
Derek Hower <drh5@cs.wisc.edu> |
merge |
6861:7561088131f9 |
04-Dec-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: cleaned up ruby-lang configuration |
6859:5de565c4b7bd |
18-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added sequencer stats to track what requests are waiting on |
6858:92135335e177 |
18-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: turned off randomization by default, turned on memory controller random arbitrate |
6857:14d7cd6f09a6 |
13-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added -A option to TwoLevel_SplitL1UnifiedL2 to set the L1 cache size |
6856:f3caa1cd1d9a |
13-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: gave ALIASED_REQUEST priority over BUFFER_FULL in sequencer |
6855:5a55833aede4 |
13-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: reduce the memory usage of ruby by making memory vector page based |
6854:575b029534f1 |
13-Nov-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: cache memory bugfix |
6853:971902a8740e |
20-Oct-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed obsolete configuration files |
6852:e98ede05836c |
16-Oct-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: add parameter to config to set # of l2 banks |
6851:9952fc5e0a70 |
07-Oct-2009 |
Derek Hower <drh5@cs.wisc.edu> |
merge |
6850:d480ef5b9028 |
21-Sep-2009 |
Polina Dudnik <pdudnik@gmail.com> |
Atomics bug fix |
6849:3c557ac2ca74 |
25-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
protocol: cleaned up MESI...got rid of unnecessary virtual networks |
6848:1139f1e51da9 |
25-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: more helpful config error message |
6847:44cf8bfb66ff |
25-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: removed unused atomics code from StateMachine |
6846:60e0df8086f0 |
17-Sep-2009 |
Polina Dudnik <pdudnik@cs.wisc.edu> |
Functionality migrated to sequencer. |
6845:9740ade45962 |
15-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: improve libruby_issue_request feedback |
6844:b8421af116e5 |
15-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
removed isReady from the library interface |
6843:de4b394c6792 |
15-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added broadcast mechanism |
6842:346b8460b306 |
15-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added unified assert script |
6841:be6ad0778565 |
15-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: made configuration parameters uniform |
6840:a78dc9a782b8 |
14-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed unused code from CacheMemory |
6839:0bf5c598c9c5 |
14-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: configuration updates |
6838:829892ec644c |
14-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed stray printf |
6836:1a01f799bd76 |
11-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: cleaned up unified MESI/MOESI configuration |
6835:ec28f4e6df9e |
11-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
merge |
6833:38da844de114 |
10-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed SMT-related Sequencer assert |
6832:576153b639d0 |
10-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: made randomization true by default |
6831:f1ee92cfcc10 |
10-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
protocol: made MI_example work with unordered networks |
6830:0173532b03f0 |
10-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: made L2 request/response latency based on cache latency by default |
6829:4169f24434ef |
09-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: made Locked read/write atomic requests within ruby |
6827:2431d803c355 |
01-Sep-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: fixed config assertion failure |
6825:104115ebc206 |
21-Aug-2009 |
pdudnik@gmail.com |
[mq]: first_patch |
6823:c47323cc8f98 |
25-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: CacheMemory tag lookup uses a hash instead of a loop |
6822:79d81f0b6217 |
18-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added random seed option to config scripts |
6820:2980bd04e6df |
20-Jan-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
util: do checkpoint aggregation more cleanly, fix last changeset.
1) Move alpha-specific code out of page_table.cc:serialize().
2) Begin serializing M5_pid and unserializing it, but adding a function to do optional paramIn so that old checkpoints don't need to be fixed up.
3) Fix up alpha startup code so that the unserialized M5_pid value is properly written to DTB_IPR_ASN.
4) Fix the memory unserialize that I forgot somehow in the last changeset.
5) Add in an agg_se.py to handle aggregated checkpoints. --bench foo-bar plus positional arguments foo bar are the only changes in usage from se.py.
Note this aggregation stuff has only been tested for Alpha and nothing else, though it should take a very minimal amount of work to get it to work with another ISA. |
6818:5a0e3a283826 |
18-Jan-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
util: make a generic checkpoint aggregator that can aggregate different cpts into one multi-programmed cpt. Make minor changes to serialization/unserialization to get it to work properly. Note that checkpoints were made with a comment at the beginning with // - this must be changed to ## to work properly with the python config parser in the aggregator. |
6817:5aec45d0fc24 |
12-Jan-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
cache: make tags->insertBlock() and tags->accessBlock() context aware so that the cache can make context-specific decisions within their various tag policy implementations. |
6797:7bf0a839c237 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
Resurrection of the CMP token protocol to GEM5 |
6795:394bc95d417b |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: removed the chip pointer from MessageBuffer The Chip object no longer exists and thus is removed from the MessageBuffer constructor. |
6794:b431ec0ad43d |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added error message to isinstance check Added error message when a symbol is not an instance of a particular expected type. |
6793:bc8c8617c4f0 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added boolean to State Machine parameters * * * ruby: Removed primitive .hh includes |
6791:71021368db4a |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: The persistent table files from GEMS
These files are needed by the MOESI_CMP_token protocol. |
6790:14c356da6ed3 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: MOESI hammer support for DMA reads and writes |
6789:53caf4b9186d |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added a memory controller feature to MOESI hammer |
6788:c43f6fdcc24c |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Hammer ruby configuration support |
6787:b9c7716f6aa6 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Changes necessary to get the hammer protocol to work in GEM5 |
6786:000fa68c57a9 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: added the original hammer protocols from old ruby |
6785:bb675ba62c79 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: returns the number of LLC needed for broadcast
Added feature to CacheMemory to return the number of last level caches. This count is needed for broadcast protocols such as MOESI_hammer. |
6784:13387a838449 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: cache configuration fix to use bytes Changed cache size to be in bytes instead of kb so that testers can use very small caches and increase the chance of writeback races. |
6783:c82047a62104 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fix CacheMemory destructor |
6782:db88ebe2c9fc |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: split CacheMemory.hh into a .hh and a .cc |
6781:8da9d36fc14a |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added default names to message buffers Added default names to message buffers created by the simple network. |
6780:2d3fc2e6f368 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: slicc method error fix Added error message when a method call is not supported by an object. |
6779:4e611eba2b13 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: slicc action error fix Small fix to the State Machine error message when duplicate actions are defined. |
6778:b3f2dfbe8006 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: slicc state machine error fixes
Added error messages when:
- a state does not exist in a machine's list of known states
- an event does not exist in a machine
- the actions of a certain machine have not been declared |
6777:6b6b8f01429c |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Removed unused action z_stall |
6774:554d84a850d6 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: fixed dma mi example to work with multiple dma ports |
6771:c2dfa12ea482 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: getPort function fix Fixed RubyMemory::getPort function to not pass in a -1 for the idx parameter |
6770:5ea2e2b3b39f |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Fixed Directory memory destructor |
6768:c5401cb99aae |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
m5: Added isValidSrc and isValidDest calls to packet.hh |
6767:71b272bd988e |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: included ruby config parameter ports per core
Slightly improved the major hack needed to correctly assign the number of ports per core. CPUs have two ports: icache + dcache. MemTester has one port. |
6766:01202c147598 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added error check for opening the ruby config file |
6765:b5101309174d |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Support for merging ALPHA_FS and ruby
Connects M5 cpu and dma ports directly to ruby sequencers and dma sequencers. Rubymem also includes a pio port so that pio requests can be forwarded to a special pio bus connecting to device pio ports. |
6764:668d24eb6e0f |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Added more info to bridge error message |
6763:5a879a3513dc |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby 64-bit address output fixes. |
6762:a22a47e60c21 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby destruction fix. |
6761:81e9d83f87c0 |
18-Nov-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
ruby: Ruby debug print fixes. |
6739:48d10ba361c9 |
11-Nov-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Eliminate the NO_FAULT request flag. |
6714:028047200ff7 |
05-Nov-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
slicc: tweak file enumeration for scons Right now .cc and .hh files are handled separately, but then they're just munged together at the end by scons, so it doesn't buy us anything. Might as well munge from the start since we'll eventually be adding generated Python files to the list too. |
6713:4b6fb0a99039 |
05-Nov-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
slicc: whack some of Nate's leftover debug code |
6712:b95abe00dd9d |
04-Nov-2009 |
Nathan Binkert <nate@binkert.org> |
build: fix compile problems pointed out by gcc 4.4 |
6700:deb871e1fc27 |
28-Oct-2009 |
Nathan Binkert <nate@binkert.org> |
license: Fix license on network model code
This mostly was a matter of changing the license owner to Princeton which is as it should have been. The code was originally licensed under the GPL but was relicensed as BSD by Li-Shiuan Peh on July 27, 2009. This relicensing was in an explicit e-mail to Nathan Binkert, Brad Beckmann, Mark Hill, David Wood, and Steve Reinhardt. |
6690:4dc4e494e4d8 |
26-Oct-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
fixed error message generation bug in SLICC ast files |
6674:300266bf68ec |
03-Oct-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
bus: add assertion to catch illegal retry on mem-inhibited transaction. |
6666:3199397fd905 |
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Minor cleanup: Use the blockAlign() method where it applies in the cache. |
6665:874f2ee2f115 |
26-Sep-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Force prefetches to check cache and MSHRs immediately prior to issue. This prevents redundant prefetches from being issued, solving the occasional 'needsExclusive && !blk->isWritable()' assertion failure in cache_impl.hh that several people have run into. Eliminates "prefetch_cache_check_push" flag, neither setting of which really solved the problem. |
6659:60e8bbcae401 |
23-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Disable all debug output by default |
6658:f4de76601762 |
23-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh |
6657:ef5fae93a3b2 |
22-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
slicc: Pure python implementation of slicc. This is simply a translation of the C++ slicc into python with very minimal reorganization of the code. The output can be verified as nearly identical by doing a "diff -wBur".
Slicc can easily be run manually by using util/slicc |
6655:380a32b43336 |
22-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
scons: add slicc and ply to sys.path and PYTHONPATH so everyone has access |
6654:4c84e771cca7 |
22-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
python: Move more code into m5.util, allow SCons to use that code.
Get rid of misc.py and just stick misc things in __init__.py
Move utility functions out of SCons files and into m5.util
Move utility type stuff from m5/__init__.py to m5/util/__init__.py
Remove buildEnv from m5 and allow access only from m5.defines
Rename AddToPath to addToPath while we're moving it to m5.util
Rename read_command to readCommand while we're moving it
Rename compare_versions to compareVersions while we're moving it. |
6635:3b2d7fdff6b1 |
11-Sep-2009 |
pdudnik@gmail.com |
Added new MESI files |
6634:737662612eb7 |
11-Sep-2009 |
pdudnik@gmail.com |
Config adjustments for MESI |
6633:9082a3fe5608 |
11-Sep-2009 |
pdudnik@gmail.com |
Somayeh's MESI protocol with Polina's bug fixes |
6632:deb20a55147c |
11-Sep-2009 |
pdudnik@gmail.com |
MI data corruption bug fix |
6631:5437a0eeb822 |
11-Sep-2009 |
pdudnik@gmail.com |
Object print bug fix |
6630:17e885fd7246 |
11-Sep-2009 |
pdudnik@gmail.com |
MOESI data corruption bug fix |
6628:369b61762d7b |
31-Aug-2009 |
pdudnik@gmail.com |
[mq]: MOESI_patch |
6627:c7fb413a369f |
28-Aug-2009 |
pdudnik@gmail.com |
Reset the atomics flags if RMW_Read is not followed by a RMW_Read or RMW_Write |
6626:8ea43024230b |
28-Aug-2009 |
pdudnik@gmail.com |
imported patch mi_patch |
6510:336a194c8500 |
15-Aug-2009 |
pdudnik@gmail.com |
Made servicing_atomic a counter and added started writes: a function for setting the flag to indicate that the rmw_writes started issuing |
6509:d7894bb9c4b5 |
14-Aug-2009 |
pdudnik@gmail.com |
Bug fix: indicate when writes started coming in |
6507:df6d844345f4 |
14-Aug-2009 |
pdudnik@gmail.com |
Added proc_id to CacheMsg for SMT.
Not yet necessary, but in case each of the threads is allowed to initiate an atomic, will come in handy |
6506:e9e7ca667575 |
14-Aug-2009 |
pdudnik@gmail.com |
Multi-line RMW handling |
6505:a2306c563df2 |
14-Aug-2009 |
pdudnik@gmail.com |
SMT atomics modifications: don't allow enqueuing from other threads if servicing an atomic for a thread |
6497:64bf776c5e70 |
13-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
Automated merge with ssh://hg@m5sim.org/m5 |
6496:41bcaefab1a0 |
13-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: config bugfix |
6495:04a90b404da6 |
11-Aug-2009 |
Tushar Krishna <Tushar.Krishna@amd.com> |
ruby/network data_msg_size bug fix with updated stats |
6494:be123e27612f |
11-Aug-2009 |
Brad Beckmann <Brad.Beckmann@amd.com> |
merged Tushar's bug fix with public repository changes |
6493:1fa51760a963 |
07-Aug-2009 |
Tushar Krishna <Tushar.Krishna@amd.com> |
bug fix for data_msg_size in network/Network.cc |
6491:fe8a24516bb7 |
09-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
protocol: added recycle actions to MOESI DMA events |
6490:2b448f0329b0 |
06-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
fixed MOESI_CMP_directory bug |
6489:105109db1847 |
06-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
protocol: fixed MOESI_CMP_directory bug |
6488:692b62dfc8b0 |
06-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: better configuration assert message |
6472:d7bb25f0687a |
05-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
merge |
6470:e76348cb11de |
05-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: configuration supports multiple runs in same session
These changes allow Ruby-gems to be run multiple times from the same ruby-lang script with different configurations |
6469:e983bc0f31a0 |
05-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
protocol: made MI_example dma mapping generic |
6468:26abdfe2d980 |
05-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: made mapAddressToRange based off a bit count |
6467:5670eee2a866 |
04-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: added MOESI_CMP_directory, DMA SequencerMsg, parameterized controllers
This changeset contains a lot of different changes that are too mingled to separate. They are:
1. Added MOESI_CMP_directory
I made the changes necessary to bring back MOESI_CMP_directory, including adding a DMA controller. I got rid of MOESI_CMP_directory_m and made MOESI_CMP_directory use a memory controller. Added a new configuration for two level protocols in general, and MOESI_CMP_directory in particular.
2. DMA Sequencer uses a generic SequencerMsg
I will eventually make the cache Sequencer use this type as well. It doesn't contain an offset field, just a physical address and a length. MI_example has been updated to deal with this.
3. Parameterized Controllers
SLICC controllers can now take custom parameters to use for mapping, latencies, etc. Currently, only int parameters are supported. |
6466:4e66dd2decd7 |
04-Aug-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: generate html by default |
6439:0e78ffeebffd |
04-Aug-2009 |
Nathan Binkert <nate@binkert.org> |
slicc: better error messages when the python parser fails |
6434:a6e8795b73de |
29-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: fixed clearStats |
6433:0f0f0fbef977 |
27-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed unused/incorrect profiler state |
6429:7ed8937e375a |
02-Aug-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fix setting of INST_FETCH flag for O3 CPU. It's still broken in inorder. Also enhance DPRINTFs in cache and physical memory so we can see more easily whether it's getting set or not. |
6428:9e35cdc95e81 |
02-Aug-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Clean up some inconsistencies with Request flags. |
6427:50125d42559c |
02-Aug-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Rename internal Request fields to start with '_'. The inconsistency was causing a subtle bug with some of the constructors where the params had the same name as the fields. This is also a first step to switching the accessors over to our new "standard", e.g., getVaddr() -> vaddr(). |
6386:82ee4a597908 |
22-Jul-2009 |
Polina Dudnik <pdudnik@gmail.com> |
Fixed the licences plus minor fixes for compilation |
6381:fb39bf847dbe |
21-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: fixed sequencer RMW data bug |
6380:6ed66f196c89 |
21-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: libruby_init now takes parsed Ruby-lang config text
libruby_init now expects to get a file that contains the output of running a ruby-lang configuration, as opposed to the ruby-lang configuration itself. |
6374:11423b4639c0 |
20-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: moved cache stats from Profiler to CacheMemory
Caches are now responsible for their own statistic gathering. This requires a direct callback from the protocol on misses, and so all future protocols need to take this into account. |
6373:544d33334ee1 |
19-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
scons: removed RubyConfig from scons |
6372:f1a41ea3bbab |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed all refs to old RubyConfig |
6371:a1768b396928 |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed dead files |
6370:ebfc37fa8615 |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed dead files |
6369:82ac95f4d9f0 |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
merge |
6368:cecc7019b458 |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: fixed dma sequencer bug
The DMASequencer was still using a parameter from the old RubyConfig, causing an offset error when the requested data wasn't block aligned. This changeset also includes a fix to MI_example for a similar bug. |
6367:c4e91b8e3da3 |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: better debug print for DataBlock |
6366:c6254810bb6c |
18-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: made coherence profilers per-controller |
6357:bd813379f121 |
15-Jul-2009 |
pdudnik@gmail.com |
Tester update |
6356:ceb2e719e36c |
13-Jul-2009 |
pdudnik@gmail.com |
Changed the state machine to generate code such that multiple processors can make atomic requests at once |
6355:79464d8a4d2f |
13-Jul-2009 |
pdudnik@gmail.com |
1. Got rid of unused functions in DirectoryMemory 2. Reintroduced RMW_Read and RMW_Write 3. Defined -2 in the Sequencer as well as made a note about mandatory queue
Did not address the issues in the slicc because the atomics are being remade altogether to allow multiple processors to issue atomic requests at once. |
6354:390fefc98e2b |
13-Jul-2009 |
pdudnik@gmail.com |
Changes to add tracing and replaying command-line options. Trace is automatically ended upon a manual checkpoint. |
6353:979add6f6fb7 |
13-Jul-2009 |
pdudnik@gmail.com |
Locked requests should actually be converted to ST rather than ATOMIC, because ATOMIC is for RMW. |
6352:849c8cc5c995 |
13-Jul-2009 |
pdudnik@gmail.com |
Added atomics implementation which would work for MI_example |
6351:31d19bdd9d85 |
13-Jul-2009 |
pdudnik@gmail.com |
Minor fixes for compiling |
6350:accdf59eedd3 |
13-Jul-2009 |
pdudnik@gmail.com |
Replaced RMW with Locked. RMW will be used for the coherence-aided atomics other than LLSC |
6349:1b3d165d890d |
13-Jul-2009 |
pdudnik@gmail.com |
Moved the lock check and clearing the lock into makeRequest |
6348:374e1d9b0660 |
13-Jul-2009 |
pdudnik@gmail.com |
Forgot to replace one of the RubyRequest_RMW |
6347:a532849ca78f |
13-Jul-2009 |
Polina pdudnik@gmail.com |
Reintegrated Derek's functional implementation of atomics with a minor change: don't clear lock on failure |
6339:61f8eb04e96d |
13-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
regression: updated memtest-ruby stats
This also includes a change to the default Ruby random seed, which was previously set using the wall clock. It is now set to 1234 so that the stat files don't change for the regression tester. |
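Why a fixed seed keeps regression stats stable can be shown with a small sketch. `std::mt19937` and the helper `drawSequence` are assumptions used for illustration; the entry does not say which generator Ruby uses.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw n values from a generator built from 'seed'.
// The same seed always yields the same sequence, so any statistics
// derived from it are reproducible from run to run.
std::vector<std::uint32_t> drawSequence(std::uint32_t seed, int n)
{
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> out;
    out.reserve(n);
    for (int i = 0; i < n; ++i)
        out.push_back(static_cast<std::uint32_t>(gen()));
    return out;
}
```

Calling `drawSequence(1234, 8)` twice returns identical vectors, whereas a seed taken from the wall clock would differ on every run.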
6329:5d8b91875859 |
09-Jul-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Registers: Add a registers.hh file as an ISA switched header. This file is for register indices, Num* constants, and register types. copyRegs and copyMiscRegs were moved to utility.hh and utility.cc. |
6297:57650468aff1 |
08-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: fixed MI_example bug. The directory wasn't deallocating the TBE, leading to a leak. Also increased the default max TBE size to 256 to allow memtest to pass the regression. |
6296:553a34ccd03b |
08-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: set the default values of the debug object so that nothing is printed |
6295:5b1049c70664 |
08-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
slicc: Fixed MI_example bug. The directory was not writing data to DRAM after a PUTX. |
6294:b42cea5e1625 |
08-Jul-2009 |
Derek Hower <drh5@cs.wisc.edu> |
removed stray debug print |
6289:a9e7d19871b5 |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Fix RubyMemory to work with the newer ruby. |
6288:083a6806dd96 |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: apply some fixes that were overwritten by the recent ruby import. |
6287:d60118c43d60 |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
slicc: update parser.py for changes in slicc language. |
6286:40b142645016 |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
scons: update SCons files for changes in ruby. |
6285:ce086eca1ede |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Import the latest ruby changes from gems. This was done with an automated process, so there could be things that were done in this tree in the past that didn't make it. One known regression is that atomic memory operations do not seem to work properly anymore. |
6284:a63d1dc4c820 |
06-Jul-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: replace strings that were missed in original ruby import. |
6239:0c808c6d4481 |
10-Jun-2009 |
Nathan Binkert <nate@binkert.org> |
copyright: I missed some copyrights during ruby integration |
6227:a17798f2a52c |
05-Jun-2009 |
Nathan Binkert <nate@binkert.org> |
types: clean up types, especially signed vs unsigned |
6223:3623155c0e95 |
29-May-2009 |
Nathan Binkert <nate@binkert.org> |
request: add accessor and constructor for setting time other than curTick |
6221:58a3c04e6344 |
26-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: add a type for thread IDs and try to use it everywhere |
6216:2f4020838149 |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
includes: sort includes again |
6215:9aed64c9f10f |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
includes: use base/types.hh not inttypes.h or stdint.h |
6214:1ec0ec8933ae |
17-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: Move stuff for global types into src/base/types.hh |
6205:39a0b4026bda |
13-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: deal with printf warnings and convert some to cprintf |
6204:b247610d8882 |
13-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: remove random uint typedef and use unsigned |
6203:6baf252c5ad1 |
13-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Make ruby's Map use hashmap.hh to simplify things. |
6201:39d8a2197b77 |
13-May-2009 |
Nathan Binkert <nate@binkert.org> |
slicc: work around improper initialization of a global in slicc. |
6200:0e8d74461d51 |
13-May-2009 |
Nathan Binkert <nate@binkert.org> |
slicc: clean up the slicc environment so things build properly on mac. |
6177:8684c61ac457 |
11-May-2009 |
Korey Sewell <ksewell@umich.edu> |
Merge Ruby Stuff |
6173:5a809dcfed1e |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: assert(false) should be panic. This also fixes some compiler warnings |
6168:ba6fe02228db |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: add RUBY sticky option that must be set to add ruby to the build. Default is false. |
6165:2d26c346f1be |
11-May-2009 |
Daniel Sanchez <sanchezd@stanford.edu> |
ruby: Working M5 interface and updated Ruby interface. This changeset also includes a lot of work from Derek Hower <drh5@cs.wisc.edu>
RubyMemory is now both a driver for Ruby and a port for M5. Changed makeRequest/hitCallback interface. Brought packets (superficially) into the sequencer. Modified tester infrastructure to be packet based. M5 and Ruby can be used together through the example ruby_se.py script. SPARC parallel applications work, and the timing *seems* right from combined M5/Ruby debug traces. To run, % build/ALPHA_SE/m5.debug configs/example/ruby_se.py -c tests/test-progs/hello/bin/alpha/linux/hello -n 4 -t |
6164:29b7b7aba911 |
11-May-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
ruby: Check stderr and not stdin before hanging on an assert. |
6163:92318648212f |
11-May-2009 |
Polina Dudnik <pdudnik@gmail.com> |
ruby: decommission code
1. Set.* and BigSet.* are replaced with OptBigSet.* which was renamed Set.* 2. Decommissioned all bloom filters 3. Decommissioned ruby/simics directory |
6162:cbd6debc4fd0 |
11-May-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: removed dead functions from the sequencer |
6161:8ad9be15d1e1 |
11-May-2009 |
Polina Dudnik <pdudnik@gmail.com> |
ruby: Removed g_SIMULATING flag. 1. Removed checks from tester files. 2. Removed else clause in Sequencer and DirectoryMemory; the else clause is needed by the tester, and it is up to Derek to revive it elsewhere when he gets to it.
Also: 1. Changed m_entries in DirectoryMemory to a map. 2. Replaced SIMICS_read_physical_memory with a call to Derek's to-be-written readPhysMem function, which is a dummy for now. |
6160:91e31308be1e |
11-May-2009 |
Polina Dudnik <pdudnik@gmail.com> |
ruby: Remove transactional access types (e.g. LD_XACT) from CacheRequestType
1. Modified enumeration 2. Also modified profiler 3. Remove transactions from Tester 4. Edited XACT_MEM out of Synthetic Driver |
6159:25181e8dd68e |
11-May-2009 |
Polina Dudnik <pdudnik@gmail.com> |
ruby: reordered Debug and RubyConfig::init to fix segfault due to uninitialized output file pointer. |
6158:5e0a261d57b8 |
11-May-2009 |
Dan Gibson <gibson@cs.wisc.edu> |
ruby: Disabled RubyEventQueue's deletion of its home-grown priority heap. Temporarily to fix unusual memory problem. |
6157:eaf2fd8f54c0 |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Migrate all of ruby and slicc to SCons. Add the PROTOCOL sticky option, which sets the coherence protocol that slicc will parse and therefore ruby will use. This whole process was made difficult by the fact that the set of files that are output by slicc are not easily known ahead of time. The easiest thing wound up being to write a parser for slicc that would tell me. Incidentally this means we now have a slicc grammar written in python. |
6156:76de2027b8ad |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: clean up a few warnings |
6155:2b8fec056712 |
11-May-2009 |
Dan Gibson <gibson@cs.wisc.edu> |
ruby: Fixed some unresolved references. |
6154:6bb54dcb940e |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Make ruby #includes use full paths to the files they're including. This basically means changing all #include statements and changing autogenerated code so that it generates the correct paths. Because slicc generates #includes, I had to hard code the include paths to mem/protocol. |
6153:0011560d49b0 |
11-May-2009 |
Dan Gibson <gibson@cs.wisc.edu> |
ruby: remove unnecessary code.
1) Removing files from the ruby build left some unresolved symbols. Those have been fixed.
2) Most of the dependencies on Simics data types and the simics interface files have been removed.
3) Almost all mention of opal is gone.
4) Huge chunks of LogTM are now gone.
5) Handling 1-4 left ~hundreds of unresolved references, which were fixed, yielding a snowball effect (and the massive size of this delta). |
6152:705b277e1141 |
11-May-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: Cleaned up sequencer. Removed LogTM specific code. |
6151:bc6b84108443 |
11-May-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: added Packet interface to makeRequest and isReady. Also pushed Packet usage into the Sequencer |
6150:a2ebddfe1a37 |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: fold the debugging options into Debug.cc |
6149:ff34514cbf37 |
11-May-2009 |
Derek Hower <drh5@cs.wisc.edu> |
ruby: Renamed Ruby's EventQueue to RubyEventQueue |
6148:71a683318799 |
11-May-2009 |
Daniel Sanchez <sanchezd@stanford.edu> |
ruby: Removed System name clash by renaming ruby's System to RubySystem |
6147:e9a8bb75c3a8 |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: rename config.include to config.hh and clean up the macro stuff. I did the macro cleanup because I was worried that the SCons scanner would get confused. This code will hopefully go away soon anyway. |
6146:0390b60a0b51 |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: strip out some unused defines |
6145:15cca6ab723a |
11-May-2009 |
Nathan Binkert <nate@binkert.org> |
ruby: Import ruby and slicc from GEMS
We eventually plan to replace the m5 cache hierarchy with the GEMS hierarchy, but for now we will make both live alongside each other. |
6133:5af0a83d9021 |
23-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: reorganize flags to group related flags together. |
6122:9af6fb59752f |
16-Jul-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
mem: use single BadAddr responder per system. Previously there was one per bus, which caused some coherence problems when more than one decided to respond. Now there is just one on the main memory bus. The default bus responder on all other buses is now the downstream cache's cpu_side port. Caches no longer need to do address range filtering; instead, we just have a simple flag to prevent snoops from propagating to the I/O bus. |
6107:52a5e1c63380 |
21-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Minor tweaks for future Ruby compatibility. |
6106:d41da05de9ad |
21-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: add PREFETCH flag. |
6105:a27c0934de24 |
20-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: rename INST_READ to INST_FETCH. |
6104:ca0915f8d86d |
20-Apr-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
request: split public and private flags into separate fields. This frees up needed space for more public flags. Also: - remove unused Request accessor methods - make Packet use public Request accessors, so it need not be a friend |
6103:549511187a5c |
20-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Fill out the comment that describes the LOCKED request flag. |
6102:7fbf97dc6540 |
20-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Change isLlsc to isLLSC. |
6077:37aac5b2c2b7 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Memory: Add a LOCKED flag back in for x86 style locking. |
6076:e141cc7896ce |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Memory: Rename LOCKED for load locked store conditional to LLSC. |
6064:46d327d42036 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a function which gets called when an interrupt message has been delivered. |
6063:5e719a1e5d82 |
19-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the flags for interrupt response messages. |
6020:0647c8b31a99 |
06-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Merge ARM into the head. ARM will compile but may not actually work. |
6013:208de84f046d |
12-Mar-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cache: set dirty bit on swaps (oops!) |
6010:a1e71f3576f8 |
10-Mar-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
prefetch: don't panic on requests w/o contextID (e.g., writebacks). |
5999:3cf8e71257e0 |
05-Mar-2009 |
Nathan Binkert <nate@binkert.org> |
stats: Fix all stats usages to deal with template fixes |
5890:bdef71accd68 |
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Get rid of translate... functions from various interface classes. |
5877:9fe574944f31 |
16-Feb-2009 |
Lisa Hsu <hsul@eecs.umich.edu> |
syscalls: implement mremap() and add DATA flag for getrlimit(). mremap has been tested on Alpha, compiles for the rest but not tested. I don't see why it wouldn't work though. |
5875:d82be3235ab4 |
16-Feb-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Fixes to get prefetching working again. Apparently we broke it with the cache rewrite and never noticed. Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part of these changes (and for inspiring me to work on the rest). Some other overdue cleanup on the prefetch code too. |
5793:321f79ddb500 |
13-Jan-2009 |
Nathan Binkert <nate@binkert.org> |
SCons: centralize the Dir() workaround for newer versions of scons. Scons bug id: 2006 M5 Bug id: 308 |
5764:f07df23e1fc8 |
06-Dec-2008 |
Nathan Binkert <nate@binkert.org> |
flags: Change naming of functions to be clearer |
5748:f28f020f3006 |
15-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
syscalls: fix latent brk/obreak bug. Bogus calls to ChunkGenerator with negative size were triggering a new assertion that was added there. Also did a little renaming and cleanup in the process. |
5746:d7540fa81f1d |
14-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
Cache: get rid of obsolete Tag methods. I think readData() and writeData() were used for Erik's compression work, but that code is gone, these aren't called anymore, and they don't even really do what their names imply. |
5745:6b0f8306704b |
14-Nov-2008 |
Nathan Binkert <nate@binkert.org> |
Fix a bunch of bugs I introduced when I changed the flags stuff for packets. I did some of the flags and assertions wrong. Thanks to Brad Beckmann for pointing this out. I should have run the opt regressions instead of the fast. I also screwed up some of the logical functions in the Flags class. |
5744:342cbc20a188 |
14-Nov-2008 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Refactor read/write in the simple timing CPU. |
5740:983b71bfc1bd |
10-Nov-2008 |
Nathan Binkert <nate@binkert.org> |
Clean up the SimpleTimingPort class a little bit. Move the constructor into the .cc file and get rid of the typedef for SendEvent. |
5735:a88e8e7dec75 |
10-Nov-2008 |
Nathan Binkert <nate@binkert.org> |
style: clean up the Packet stuff |
5731:453f320129a1 |
10-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
mem: Assert that requests have non-negative size. Would have saved me much debugging time if these had been in there previously. |
5730:dea5fcd1ead0 |
10-Nov-2008 |
Steve Reinhardt <Steve.Reinhardt@amd.com> |
Cache: Refactor packet forwarding a bit. Makes adding write-through operations easier. |
5717:6ed48cba2217 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
decouple eviction from insertion in the cache. |
5716:ee56bb539212 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Change the findBlock(addr, lat) to accessBlock, which I think has better connotations for what is really happening and how it should be used. |
5715:e8c1d4e669a7 |
04-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
get rid of all instances of readTid() and getThreadNum(). Unify and eliminate redundancies with threadId() as their replacement. |
5714:76abee886def |
02-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Add in Context IDs to the simulator. From now on, cpuId is almost never used, the primary identifier for a hardware context should be contextId(). The concept of threads within a CPU remains, in the form of threadId() because sometimes you need to know which context within a cpu to manipulate. |
5707:da86e00f87a0 |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
s/cpu_id/cpuId in o3 (to be consistent and match style), also fix some typos in comments. |
5706:2cc2387049bc |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
probe function no longer used anywhere. |
5705:aea94955635b |
23-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
remove the totally obsolete split cache |
5699:ab3067124402 |
14-Oct-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
This function declaration isn't used anywhere. |
5693:4bf6f614871b |
13-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
Get rid of some commented out code. |
5650:d2782c951841 |
12-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
Create a message port for sending messages as opposed to reading/writing a memory range. |
5607:eb9b92bf37ec |
09-Oct-2008 |
Nathan Binkert <nate@binkert.org> |
mem: Add a method for setting the time on a packet. |
5606:6da7a58b0bc8 |
09-Oct-2008 |
Nathan Binkert <nate@binkert.org> |
eventq: convert all usage of events to use the new API. For now, there is still a single global event queue, but this is necessary for making the steps towards a parallelized m5. |
5605:b194a80157e2 |
09-Oct-2008 |
Nathan Binkert <nate@binkert.org> |
eventq: Major API change for the Event and EventQueue structures.
Since the early days of M5, an event needed to know which event queue it was on, and that data was required at the time of construction of the event object. In the future parallelized M5, this sort of requirement does not work well since the proper event queue will not always be known at the time of construction of an event. Now, events are created, and the EventQueue itself has the schedule function, e.g. eventq->schedule(event, when). To simplify the syntax, I created a class called EventManager which holds a pointer to an EventQueue and provides the schedule interface that is a proxy for the EventQueue. The intent is that objects that frequently schedule events can be derived from EventManager and then they have the schedule interface. SimObject and Port are examples of objects that will become EventManagers. The end result is that any SimObject can just call schedule(event, when) and it will just call that SimObject's eventq->schedule function. Of course, some objects may have more than one EventQueue, so this interface might not be perfect for those, but they should be relatively few. |
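The shape of the new API can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual M5 code: only `EventQueue`, `EventManager`, and the `schedule(event, when)` call come from the entry above; `Tick`, the `Event` layout, `serviceAll()`, `now()`, and `Device` are hypothetical.

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

using Tick = std::uint64_t;  // assumed time type

// An event no longer needs to know its queue at construction time.
struct Event
{
    std::function<void()> action;
    Tick when = 0;  // filled in by the queue when the event is scheduled
};

class EventQueue
{
  public:
    // Scheduling is a method of the queue, not of the event.
    void schedule(Event *event, Tick when)
    {
        event->when = when;
        pending.push(event);
    }

    // Hypothetical driver loop: run all pending events in time order.
    void serviceAll()
    {
        while (!pending.empty()) {
            Event *e = pending.top();
            pending.pop();
            curTick = e->when;
            e->action();
        }
    }

    Tick now() const { return curTick; }

  private:
    struct Later
    {
        bool operator()(const Event *a, const Event *b) const
        { return a->when > b->when; }
    };

    std::priority_queue<Event *, std::vector<Event *>, Later> pending;
    Tick curTick = 0;
};

// Holds a pointer to a queue and proxies schedule() to it.
class EventManager
{
  public:
    explicit EventManager(EventQueue *eq) : eventq(eq) {}

    void schedule(Event *event, Tick when) { eventq->schedule(event, when); }

  protected:
    EventQueue *eventq;
};

// Hypothetical object that frequently schedules events; deriving from
// EventManager gives it schedule(event, when) directly.
class Device : public EventManager
{
  public:
    using EventManager::EventManager;
};
```

Because the queue is chosen when `schedule()` is called rather than when the event is built, an object could later be pointed at a different queue without changing how its events are constructed.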
5562:875cb7d09831 |
26-Sep-2008 |
Nathan Binkert <nate@binkert.org> |
When nesting if statements, use braces to avoid ambiguous else clauses. |
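The ambiguity this entry guards against can be shown with a small sketch. Both functions are hypothetical and exist only to contrast the two forms; compilers may warn about the first one, which is the point.

```cpp
#include <string>

// Without braces, the 'else' binds to the nearest 'if' (the inner one),
// no matter how the code is indented.
std::string classifyUnbraced(bool outer, bool inner)
{
    std::string result = "none";
    if (outer)
        if (inner)
            result = "both";
    else
        result = "not outer";  // actually runs when outer && !inner
    return result;
}

// With braces, the 'else' pairs with the outer 'if' as intended.
std::string classifyBraced(bool outer, bool inner)
{
    std::string result = "none";
    if (outer) {
        if (inner)
            result = "both";
    } else {
        result = "not outer";
    }
    return result;
}
```

For `outer = true, inner = false`, the unbraced version returns "not outer" even though `outer` is true, while the braced version returns "none".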
5543:3af77710f397 |
10-Sep-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
style: Remove non-leading tabs everywhere they shouldn't be. Developers should configure their editors to not insert tabs |
5530:bbfff6d0c42c |
11-Aug-2008 |
Nathan Binkert <nate@binkert.org> |
params: Get rid of the remnants of the old style parameter configuration stuff. |
5520:cf280b3621cf |
03-Aug-2008 |
Steve Reinhardt <stever@gmail.com> |
Make default PhysicalMemory latency slightly more realistic. Also update stats to reflect change. |
5507:52bcc301b467 |
15-Jul-2008 |
Steve Reinhardt <stever@gmail.com> |
Use ReadResp instead of LoadLockedResp for LoadLockedReq responses. |
5506:c1f203a35cc3 |
15-Jul-2008 |
Steve Reinhardt <stever@gmail.com> |
Add missing newlines to Bus DPRINTFs. |
5499:8bfc7650c344 |
01-Jul-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
Remove delVirtPort() and make getVirtPort() only return cached version. |
5498:2af99511ded4 |
01-Jul-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
Change everything to use the cached virtPort rather than creating their own each time. This appears to work, but I don't want to commit it until it gets tested a lot more. I haven't deleted the functionality in this patch; that will come later. One question is how to enforce or encourage that objects that call getVirtPort() do not cache the virtual port, since if the CPU changes out from under them it will be worse than useless. Perhaps a null function like delVirtPort() is still useful in that case. |
5495:0d9fac06c402 |
28-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Automated merge after backout. |
5494:85c8d296c1cb |
28-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Backed out changeset 94a7bb476fca: caused memory leak. |
5490:88f1e9295945 |
21-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Make bus address conflict error more informative |
5489:94a7bb476fca |
21-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Generate more useful error messages for unconnected ports. Force all non-default ports to provide a name and an owner in the constructor. |
5477:dc04d655315a |
16-Jun-2008 |
Nathan Binkert <nate@binkert.org> |
physmem: Add a null option to physical memory so it doesn't store data. |
5476:758c2413765a |
16-Jun-2008 |
Nathan Binkert <nate@binkert.org> |
port: Clean up default port setup and port switchover code. |
5466:a1981d557252 |
14-Jun-2008 |
Nathan Binkert <nate@binkert.org> |
MemReq: Add option to reset the time on a request. |
5459:b84a60dbf862 |
13-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Get rid of bogus bus assertion. It turns out that if a MemObject turns around and does a send in its receive callback, and there are other sends already scheduled, then it could observe a state where it's not at the head of the list but the bus's sendEvent is not scheduled (because we're still in the middle of processing the prior sendEvent). |
5458:9ffc2be2d925 |
13-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Get rid of bogus cache assertion. I was asserting that the only reason you would defer targets is if a write came in while you had an outstanding read miss, but there's another case where you could get a read access after you've snooped an invalidation and buffered it because it applies to a prior outstanding miss. |
5402:05c388940eb6 |
15-May-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
Make sure that output files are always checked for success before they're used. Make OutputDirectory::resolve() private and change the functions using resolve() to instead use create(). |
5400:fee00a595efc |
10-Apr-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
SCons: add comments to SConscript documenting bug workaround |
5399:e951ca2d56e2 |
10-Apr-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
PhysicalMemory: Add parameter for variance in memory delay. |
5398:9727ba4600de |
08-Apr-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
SCons: Manually specifying header only directories with Dir() works around the problem |
5388:3b4772ca8368 |
25-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix handling of writeback-induced writebacks in atomic mode. |
5387:3323952c3bb4 |
24-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Delete the Request for a no-response Packet when the Packet is deleted, since the requester can't possibly do it. |
5386:5614618f4027 |
24-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Don't FastAlloc MSHRs since we don't allocate them on the fly. |
5384:dc6bb852ca68 |
22-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix cache problem with writes to tempBlock getting wrong writeback address. |
5381:55789e3f65cd |
17-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix a few Packet memory leaks. |
5380:51b02aad92af |
17-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Restructure bus timing calcs to cope with pkt being deleted by target. |
5379:800a2f0641b5 |
15-Mar-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix subtle cache bug where read could return stale data if a prior write miss arrived while an even earlier read miss was still outstanding. |
5366:ccef4b20c987 |
27-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Revamp cache timing access mshr check to make stats sane again. |
5365:49bef92749d1 |
26-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Cache: better comments particularly regarding writeback situation. |
5354:a15145af7ae7 |
26-Feb-2008 |
Gabe Black <gblack@eecs.umich.edu> |
Bus: Fix the bus timing to be more realistic. |
5350:67e5e13f4146 |
16-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Make L2+ caches allocate new block for writeback misses instead of forwarding down the line. |
5345:6a783e4946ac |
11-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Automated merge with file:/home/stever/hg/m5-orig |
5340:9616120063c4 |
10-Feb-2008 |
Nicolas Zea <nicolas.zea@gmail.com> |
Bus: Only update port cache when there is an item to update it with. |
5338:e75d02a09806 |
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix #include lines for renamed cache files. |
5337:f81512eb8bdf |
10-Feb-2008 |
Steve Reinhardt <stever@gmail.com> |
Rename cache files for brevity and consistency with rest of tree. |
5336:c7e21f4e5a2e |
06-Feb-2008 |
Stephen Hines <hines@cs.fsu.edu> |
Make the Event::description() a const function |
5321:14afee693b39 |
06-Jan-2008 |
Geoffrey Blake <blakeg@umich.edu> |
Temporary fix for ll/sc bug; see flyspray task for more info: http://www.m5sim.org/flyspray/task/197
Signed-off by: Ali Saidi <saidi@eecs.umich.edu> |
5319:13cb690ba6d6 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Add ReadRespWithInvalidate to handle multi-level coherence situation where we defer a response to a read from a far-away cache A, then later defer a ReadExcl from a cache B on the same bus as us. We'll assert MemInhibit in both cases, but in the latter case MemInhibit will keep the invalidation from reaching cache A. This special response tells cache A that it gets the block to satisfy its read, but must immediately invalidate it. |
5318:fc6a69e31c8e |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Mark cache-to-cache MSHRs as downstreamPending when necessary. Don't mark upstream MSHR as pending if downstream MSHR is already in service. |
5317:5f5eb2456e8b |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Don't DPRINTF in the middle of a PrintReq. |
5316:e5cefff77060 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Bug fix: functional cache port now needs otherPort set. |
5315:30997e988446 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Additional comments and helper functions for PrintReq. |
5314:e902f12a3af1 |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Add functional PrintReq command for memory-system debugging. |
5313:07c9a3b1539b |
02-Jan-2008 |
Steve Reinhardt <stever@gmail.com> |
Fix formatting and comments in cache_impl.hh |
5283:3ab643fa74be |
28-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make ports that aren't connected to anything fail more gracefully. |
5275:5279ced1dd8b |
19-Nov-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Memory: Cache the physical memory start and size so we don't need a dynamic cast on every access. |
5271:5e7547af97fb |
16-Nov-2007 |
Steve Reinhardt <stever@gmail.com> |
Tweak check for writable block fill. |
5270:ba8f3ca2a525 |
16-Nov-2007 |
Steve Reinhardt <stever@gmail.com> |
Fix bug on exclusive response to ReadReq with pending WriteReq. |
5248:b27aab7165da |
14-Nov-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Checkpointing: Name SE page table entries better so that there isn't a problem if multiple workloads are being run at once. |
5223:5f581fe175ce |
14-Nov-2007 |
Korey Sewell <ksewell@umich.edu> |
remove unnecessary debug messages I added |
5222:bb733a878f85 |
13-Nov-2007 |
Korey Sewell <ksewell@umich.edu> |
Add in files from merge-bare-iron, get them compiling in FS and SE mode |
5213:ad68c4b99d6d |
04-Nov-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Cache: Fix for OS X 10.5 compiling. |
5203:de0fcc102302 |
01-Nov-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
DRAM: Make latency parameters be Param.Latency instead of ints. |
5197:5d7cf59548f5 |
16-Sep-2007 |
Steve Reinhardt <stever@gmail.com> |
mem: clean up bus/cache DPRINTFs a bit. Not so much noise on failed sends, and more complete info when grepping a trace using an address. |
5192:582e583f8e7e |
31-Oct-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Traceflags: Add SCons function to create a traceflag instead of having one file with them all. |
5184:8782de2949e5 |
25-Oct-2007 |
Gabe Black <gblack@eecs.umich.edu> |
TLB: Fix serialization issues with the tlb entries and make the page table store the process, not the system. |
5183:b4decf133fe4 |
25-Oct-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
SE: Fix page table and system serialization, don't reinit process if this is a checkpoint restore. |
5057:2979e5e3f457 |
05-Sep-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Bus: Fix drain code; old method could return 1 in atomic mode and never call de->process(). |
5034:6186ef720dd4 |
30-Aug-2007 |
Miles Kaufmann <milesck@eecs.umich.edu> |
params: Deprecate old-style constructors; update most SimObject constructors.
SimObjects not yet updated: - Process and subclasses - BaseCPU and subclasses
The SimObject(const std::string &name) constructor was removed. Subclasses that still rely on that behavior must call the parent initializer as: SimObject(makeParams(name)) |
5012:c0a28154d002 |
27-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head |
5004:7d94cedab264 |
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Address translation: Make the page table more flexible. The page table now stores actual page table entries. It is still a templated class here, but this will be corrected in the near future. |
4986:b7c82ad6b3ef |
24-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Mem: Make errors in the memory system be responses, not requests. Fixes cache handling of error responses. |
4970:d0ed47928f9c |
12-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
MemorySystem: Fix the use of ?: to produce correct results. |
4965:ad0e792a5c78 |
10-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
DMA: Add IOCache and fix bus bridge to optionally only send requests one way so a cache can handle partial block requests for i/o devices. |
4964:7a8a941f4059 |
10-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Bus: Only call end() on an stl object once in a loop |
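The pattern named in this entry can be sketched as below. The function `countMatches`, the `std::list<int>` container, and the idea of matching port ids are assumptions for illustration; the entry does not show the actual Bus loop.

```cpp
#include <list>

// Hypothetical loop over a list of port ids.
int countMatches(const std::list<int> &ports, int wanted)
{
    int matches = 0;
    // Evaluate end() once, before the loop, instead of on every iteration.
    // This is safe only because the loop neither inserts into nor erases
    // from 'ports'.
    const auto end = ports.end();
    for (auto it = ports.begin(); it != end; ++it) {
        if (*it == wanted)
            ++matches;
    }
    return matches;
}
```

Optimizing compilers can often hoist the call themselves, so the gain is most visible in unoptimized or debug builds.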
4963:ba55203d1bdc |
08-Aug-2007 |
Vincentius Robby <acolyte@umich.edu> |
Port, StaticInst: Revert unnecessary changes. |
4962:4e939f4629c3 |
08-Aug-2007 |
Vincentius Robby <acolyte@umich.edu> |
alpha: Make the TLB cache actually work. Improve MRU checking for StaticInst, Bus, TLB |
4958:b24c65d715ad |
04-Aug-2007 |
Vincentius Robby <acolyte@umich.edu> |
port: Implement cache for port interfaces and ranges |
4948:55bcb35dc166 |
04-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head. |
4934:0573f1ffcc4d |
03-Aug-2007 |
Steve Reinhardt <stever@gmail.com> |
cache: get rid of obsolete params from python. |
4929:6db35d0c81c6 |
29-Jul-2007 |
Steve Reinhardt <stever@gmail.com> |
memory system: fix functional access bug. Make sure not to keep processing functional accesses after they've been responded to. Also use checkFunctional() return value instead of checking packet command field where possible, mostly just for consistency. |
4927:dcf6af85cea8 |
29-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
bus: take out response prioritization (timing was messed up). Also make express snoops not occupy bus (since they're magic). |
4921:bcf49547dae7 |
27-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
packet: get rid of unused intersect() function. |
4920:03b88702070e |
27-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache/memtest: fixes for functional accesses. |
4919:013a8e9117b6 |
27-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache: Get rid of unused variable. |
4918:3214e3694fb2 |
27-Jul-2007 |
Nathan Binkert <nate@binkert.org> |
Merge python and x86 changes with cache branch |
4917:9e84859dde4d |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Have owner respond to UpgradeReq to avoid race. |
4916:000ab733f1eb |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add downward express snoops for invalidations. |
4915:4c0e0f67fc94 |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Continue snooping after a writeback is encountered. |
4914:5a560cfb4976 |
26-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
bus: Fix default port handling. |
4913:d81df43157b3 |
25-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Can't block on memInhibit packets (now that bus no longer filters them for us). |
4912:e6edaa59f845 |
25-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Integrate snoop loop functions into their respective call sites. Also some additional cleanup of Bus::recvTiming(). |
4911:3a0ee63f490c |
25-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Don't delete request at target... requester still needs it. |
4910:fd583ea6a3bb |
24-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
A couple more minor bug fixes for multilevel coherence. |
4908:771ec077a955 |
23-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace lowerMSHRPending flag with more robust scheme based on following Packet senderState links. |
4905:0ccda2bb3be7 |
22-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace DeferredSnoop flag with LowerMSHRPending flag. Turns out DeferredSnoop isn't quite the right bit of info we needed... see new comment in cache_impl.hh. |
4904:291184a5eb05 |
22-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
A few minor non-debug compilation issues. |
4903:865d314b7139 |
21-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Deal with invalidations intersecting outstanding upgrades. If the invalidation beats the upgrade at a lower level then the upgrade must be converted to a read exclusive "in the field". Restructure target list & deferred target list to factor out some common code. |
4902:bc666118c6e2 |
21-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Several more fixes for multi-level timing coherence. - Add "deferred snoop" flag to Packet so upper-level caches can distinguish whether lower-level cache request was in-service or not at the time of the original snoop. - Revamp response handling to properly handle deferred snoops on non-cache-fill requests (i.e. upgrades). - Make sure forwarded writebacks are kept in write buffer at lower-level caches so they get snooped properly. |
4901:623403628a23 |
17-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make sure responses never get blocked. |
4900:9397ff92c45c |
17-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Forward cache-to-cache responses through other caches. |
4899:6179f3039eb2 |
17-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Assert that an mshr has a target in getTarget(). |
4896:93f20b1f3925 |
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge from head. |
4895:d36959284fbc |
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up a bunch of multilevel coherence issues. Atomic mode seems to work. Timing is closer but not there yet. |
4894:0e137be78ad0 |
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make Bus::findPort() a little more useful. Move check for loops outside, since half the call sites end up working around it anyway. Return integer port ID instead of port object pointer. |
4889:a557a85bdb96 |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add CacheRepl trace flag and move a couple DPRINTFs to it. |
4888:a1c0cca0979f |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Move a couple of DPRINTFs from Cache to CachePort. |
4887:a784c507ea84 |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix bug in copying packet with static data pointer. |
4885:385a051ad874 |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge of DPRINTF fixes from head. |
4882:78904f539525 |
03-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Delete packets when we're done with them. |
4881:3e4b4f6ff9dd |
02-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Couple more minor bug fixes for FS timing mode.
src/cpu/simple/timing.cc: Fix another SC problem. src/mem/cache/cache_impl.hh: Forgot to call makeTimingResponse() on uncached timing responses. |
4880:4de4d072e977 |
02-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix a couple LL/SC bugs that only affected timing mode.
src/cpu/simple/timing.cc: Fix swap/stq_c command bug. src/mem/packet.cc: Fix incorrect LoadLockedReq command response field. |
4879:b768f6b9ef80 |
02-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
bus.cc: Fix atomic timing issue.
src/mem/bus.cc: Fix atomic timing issue. |
4877:8f00ebb86efd |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Don't propagate snoops across bridges. Wouldn't work anyway. |
4876:a18cedc19da5 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of remaining traces of obsolete CoherenceProtocol object. |
4874:cdae9adbd276 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of obsolete fixPacket() functions. Handled by Packet::checkFunctional() now. |
4873:b135f6e6adfe |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Event descriptions should not end in "event" (they function as adjectives not nouns) |
4872:c810a14f9a39 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Factor out a little more common code. |
4871:02c0ad6e09ee |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up a few statistics problems. Stats pretty much line up with old code, except: - bug in old code included L1 latency in L2 miss time, making it too high - UniCoherence did cache-to-cache transfers even from non-owner caches, so occasionally the icache would get a block from the dcache not the L2 - L2 can now receive ReadExReq from L1 since L1s have coherence |
4870:fcc39d001154 |
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of Packet result field. Error responses are now encoded in cmd field. |
4763:fef9a47b3732 |
24-Jul-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head. |
4762:c94e103c83ad |
24-Jul-2007 |
Nathan Binkert <nate@binkert.org> |
Major changes to how SimObjects are created and initialized. Almost all creation and initialization now happens in python. Parameter objects are generated and initialized by python. The .ini file is now solely for debugging purposes and is not used in construction of the objects in any way. |
4739:9f8edf47aeca |
14-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix & tweak DPRINTFs for tracediff w/new cache code. Note that we should *not* print pointer values in DPRINTFs as these needlessly clutter tracediff output. |
4672:cc97e595e07d |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of coherence protocol object. |
4671:5d29d3be0f79 |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Revamp replacement-of-upgrade handling. |
4670:54ac1fb49a26 |
27-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Handle deferred snoops better. |
4669:afd3ecbf9798 |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
cache_impl.hh: Change target overflow from assertion to warning.
src/mem/cache/cache_impl.hh: Change target overflow from assertion to warning. |
4668:fcce0b964c7c |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Handle replacement of block with pending upgrade.
src/mem/cache/tags/lru.cc: Add some replacement DPRINTFs |
4667:bf428e572091 |
26-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Couple minor bug fixes...
src/mem/cache/cache_impl.hh: Handle grants with no packet. src/mem/cache/miss/mshr.cc: Fix MSHR snoop hit handling. |
4666:5d110d024fcf |
25-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of requestCauses. Use timestamped queue to make sure we don't re-request bus prematurely. Use callback to avoid calling sendRetry() recursively within recvTiming. |
4665:9471921e5e08 |
24-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Better handling of deferred targets. |
4660:8ba283606f48 |
23-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor fix plus new assertion to catch similar bugs.
src/cpu/memtest/memtest.cc: Need to set packet source field so that response from cache doesn't run into assertion failure when copying source to dest. src/mem/packet.hh: Copy source field when copying packets. Assert that source is valid before copying it to dest when turning packets around. |
4630:5a832c366b22 |
22-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fixes to hitLatency, blocking, buffer allocation. Single-cpu timing mode seems to work now. |
4629:6c153fdd70bc |
21-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4628:17b3ce796176 |
21-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Getting closer...
configs/example/memtest.py: Add progress interval option. src/base/traceflags.py: Add MemTest flag. src/cpu/memtest/memtest.cc: Clean up tracing. src/cpu/memtest/memtest.hh: Get rid of unused code. |
4627:2766d5cfbd9d |
17-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
configs/example/memtest.py: Hand merge redundant changes. |
4626:ed8aacb19c03 |
17-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
More major reorg of cache. Seems to work for atomic mode now, timing mode still broken.
configs/example/memtest.py: Revamp options. src/cpu/memtest/memtest.cc: No need for memory initialization. No need to make atomic response... memory system should do that now. src/cpu/memtest/memtest.hh: MemTest really doesn't want to snoop. src/mem/bridge.cc: checkFunctional() cleanup. src/mem/bus.cc: src/mem/bus.hh: src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.cc: src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/coherence_protocol.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/miss/SConscript: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: src/mem/packet.cc: src/mem/packet.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/tport.cc: More major reorg. Seems to work for atomic mode now, timing mode still broken. |
4622:f681e10844f3 |
28-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4621:0468bff29088 |
28-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4610:97834b18a8b4 |
21-Jun-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Use FastAlloc for Packet, Request, CoherenceState, and SenderState so we don't spend so much time calling malloc() |
4600:82bec5b42595 |
20-Jun-2007 |
Vincentius Robby <acolyte@umich.edu> |
Minor error. |
4599:b3cdf938a853 |
20-Jun-2007 |
Vincentius Robby <acolyte@umich.edu> |
Removed "adding instead of dividing" trick. Caused slowdown in performance instead of speeding up.
src/cpu/base.cc: Removed "adding instead of dividing" trick. src/mem/bus.cc: Fixed spelling in comments. Removed "adding instead of dividing" trick. |
4597:063f25d13229 |
20-Jun-2007 |
Nathan Binkert <binkertn@umich.edu> |
Make sure all parameters have default values if they're supposed to and make sure parameters have the right type. Also make sure that any object that should be an intermediate type has the right options set. |
4552:45315931d83b |
10-Jun-2007 |
Nathan Binkert <binkertn@umich.edu> |
Add a startup function that will fast forward to the right clock edge using a divide in order to not loop forever after resuming from a checkpoint |
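One way to read "fast forward to the right clock edge using a divide" is the rounding arithmetic below. This is a sketch under assumptions: the `Tick` alias and the `nextClockEdge` name are illustrative, and the period is taken to be nonzero.

```cpp
#include <cstdint>

using Tick = std::uint64_t;  // assumed tick type for this sketch

// Returns the first clock edge at or after `now` for a clock with the
// given period. A single divide replaces stepping one period at a time,
// which matters when `now` is far ahead after a checkpoint restore.
inline Tick nextClockEdge(Tick now, Tick period) {
    return ((now + period - 1) / period) * period;
}
```

Stepping edge by edge would take `now / period` iterations, which can be very large when the restored tick count is high.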
4549:42b30b2529e1 |
10-Jun-2007 |
Nathan Binkert <binkertn@umich.edu> |
More realistic parameters |
4521:0236d1cdb330 |
05-Jun-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Clean up some of Vincent's code and commit it. Makes page table cache scheme actually work.
src/mem/page_table.cc: src/mem/page_table.hh: fix caching scheme to actually work and improve performance |
4493:0757d7c8a0e5 |
30-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
tport.cc: Oops... forgot to update call site after changing function argument semantics.
src/mem/tport.cc: Oops... forgot to update call site after changing function argument semantics. |
4492:75dabb0392ee |
30-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
A little more cleanup & refactoring of SimpleTimingPort. Make it a better base class for cache ports. |
4490:f9d3db907eec |
28-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Restructure SimpleTimingPort a bit: - factor out checkFunctional() code so it can be called from derived classes - use EventWrapper for sendEvent, move event handling code from event to port where it belongs - make sendEvent a pointer so derived classes can override it - replace std::pair with new class for readability |
4489:381fcb5b6c31 |
28-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Reformat comments to meet line length restriction. |
4486:aaeb03a8a6e1 |
27-May-2007 |
Nathan Binkert <binkertn@umich.edu> |
Move SimObject python files alongside the C++ and fix the SConscript files so that only the objects that are actually available in a given build are compiled in. Remove a bunch of files that aren't used anymore. |
4478:33c4bf0ab4b9 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix getDeviceAddressRanges() to get snooping right. |
4477:375b35072b58 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
src/mem/cache/base_cache.hh: Manual conflict resolution. |
4475:fb185cc1c845 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Change getDeviceAddressRanges to use bool for snoop arg. |
4473:fa451e5f9f06 |
22-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Another pass of minor changes in preparation for new protocol.
src/mem/cache/cache_impl.hh: src/mem/cache/coherence/simple_coherence.hh: Get rid of old invalidate propagation logic in preparation for new multilevel snoop protocol. src/mem/cache/coherence/coherence_protocol.cc: L2 cache now has protocol, so protocol must handle ReadExReq coming in from the CPU side. src/mem/cache/miss/mshr_queue.cc: Assertion is failing, so let's take it out for now. src/mem/packet.cc: src/mem/packet.hh: Add WritebackAck command. Reorganize enum to put responses next to corresponding requests. Get rid of unused WriteReqNoAck. |
4470:9ab7a98dae90 |
20-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Insist that PhysicalMemory object have at least one connection. |
4469:1a5deb8fffd3 |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2
src/mem/bridge.cc: SCCS merged |
4468:25046012019e |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Oops... some places in C++ explicitly ask for a "functional" port. It would be better to move this to python IMO but for now I'll stick in a compatibility hack. |
4467:cb5715e021ca |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
PhysicalMemory has vector of uniform ports instead of one special one.
configs/example/memtest.py: PhysicalMemory has vector of uniform ports instead of one special one. Other updates to fix obsolete brokenness. src/mem/physical.cc: src/mem/physical.hh: src/python/m5/objects/PhysicalMemory.py: Have vector of uniform ports instead of one special one. src/python/swig/pyobject.cc: Add comment. |
4458:d43aab911e6e |
19-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
First set of changes for reorganized cache coherence support. Compiles but doesn't work... committing just so I can merge (stupid bk!).
src/mem/bridge.cc: Get rid of SNOOP_COMMIT. src/mem/bus.cc: src/mem/packet.hh: Get rid of SNOOP_COMMIT & two-pass snoop. First bits of EXPRESS_SNOOP support. src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/prefetch/base_prefetcher.cc: Big reorg of ports and port-related functions & events. src/mem/cache/cache.cc: src/mem/cache/cache_builder.cc: src/mem/cache/coherence/SConscript: Get rid of UniCoherence object. |
4456:02b3756b83e4 |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge vm1.(none):/home/stever/bk/newmem-head into vm1.(none):/home/stever/bk/newmem-cache2 |
4454:8125c4b9e306 |
15-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
hopefully the final hacky change to make the bus bridge work ok; cache blocks that get dmaed ARE NOT marked invalid in the caches, so it's a performance issue here
src/mem/bridge.cc: src/mem/bridge.hh: hopefully the final hacky change to make the bus bridge work ok |
4451:bfb7c7c0b7ea |
14-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
add ugliness to fix dmas
src/dev/io_device.cc: extra printing and assertions src/mem/bridge.hh: deal with packets only satisfying part of a request by making many requests src/mem/cache/cache_impl.hh: make the cache try to satisfy a functional request from the cache above it before checking itself |
4450:54dbcf524f0b |
13-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
fix handling of atomic packets; fix up code for counting requests and responses |
4449:dc56f9418210 |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Eliminate unused PacketPtr from BaseCache's RequestEvent and ResponseEvent. Compiles but not tested. |
4448:4c1ae4adf9bb |
14-May-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Split BaseCache::CacheEvent into RequestEvent and ResponseEvent. Compiles but not tested. |
4444:0648bdc8d1c9 |
10-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
remove hit_latency and make latency do the right thing; set the latency parameter in terms of a latency; add caches to tsunami-simple configs
configs/common/Caches.py: tests/configs/memtest.py: tests/configs/o3-timing-mp.py: tests/configs/o3-timing.py: tests/configs/simple-atomic-mp.py: tests/configs/simple-timing-mp.py: tests/configs/simple-timing.py: set the latency parameter in terms of a latency configs/common/FSConfig.py: give the bridge a default latency too src/mem/cache/cache_builder.cc: src/python/m5/objects/BaseCache.py: remove hit_latency and make latency do the right thing tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: add caches to tsunami-simple configs |
4436:364be2d08f3a |
09-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
undo my previous bus change, it can make the bus deadlock.. so it still constantly reschedules itself |
4435:7da241055348 |
09-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
add a backoff algorithm when nacks are received by devices; add separate response buffers and request queue sizes in bus bridge; add delay to respond to a nack in the bus bridge
src/dev/i8254xGBe.cc: src/dev/ide_ctrl.cc: src/dev/ns_gige.cc: src/dev/pcidev.hh: src/dev/sinic.cc: add backoff delay parameters src/dev/io_device.cc: src/dev/io_device.hh: add a backoff algorithm when nacks are received. src/mem/bridge.cc: src/mem/bridge.hh: add separate response buffers and request queue sizes add a new parameter to specify how long before a nack is ready to go after a packet that needs to be nacked is received src/mem/cache/cache_impl.hh: assert on the src/mem/tport.cc: add a friendly assert to make sure the packet was inserted into the list |
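The entry above does not say which backoff schedule is used. Purely as an illustration, the sketch below assumes a doubling delay bounded by minimum and maximum parameters, reset when a packet is accepted; the `Backoff` type and all of its member names are hypothetical.

```cpp
#include <cstdint>

using Tick = std::uint64_t;  // assumed tick type for this sketch

// Hypothetical backoff state for a device that retries after nacks.
class Backoff {
  public:
    Backoff(Tick minDelay, Tick maxDelay)
        : minDelay(minDelay), maxDelay(maxDelay), current(0) {}

    // On a nack: start at minDelay, then double, never exceeding maxDelay.
    Tick onNack() {
        if (current < minDelay) {
            current = minDelay;
        } else if (current < maxDelay) {
            Tick doubled = current * 2;
            current = (doubled > maxDelay) ? maxDelay : doubled;
        }
        return current;
    }

    // On an accepted packet: drop back to no delay (an assumption; a
    // gentler decay would also be plausible).
    void onSuccess() { current = 0; }

  private:
    Tick minDelay;
    Tick maxDelay;
    Tick current;
};
```

The returned delay would be the time to wait before resending the nacked packet.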
4434:2ea7b6e0b78f |
09-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
fix the translating ports so it can add a page on a fault |
4433:4722c6787f69 |
07-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space; fix the timing cpu to handle receiving a nacked packet
src/cpu/simple/timing.cc: make the timing cpu handle receiving a nacked packet src/mem/bridge.cc: src/mem/bridge.hh: the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space |
4432:5e55857abb01 |
07-May-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
fix partial writes with a functional memory hack; figure out the block size from devices attached to the bus, otherwise use a default block size when no devices that care are attached
configs/common/FSConfig.py: src/mem/bridge.cc: src/mem/bridge.hh: src/python/m5/objects/Bridge.py: fix partial writes with a functional memory hack src/mem/bus.cc: src/mem/bus.hh: src/python/m5/objects/Bus.py: figure out the block size from devices attached to the bus otherwise use a default block size when no devices that care are attached src/mem/packet.cc: fix WriteInvalidateResp to not be a request that needs a response since it isn't src/mem/port.hh: by default return 0 for deviceBlockSize instead of panicking. This makes finding the block size the bus should use easier |
4321:6f8b597ab244 |
04-Apr-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
The MemoryObject that owns a port should delete it if it so chooses when deletePortRefs() is called on it with that port as a parameter. In this way a MemoryObject can keep a functional port around and give it to anyone who wants to do functional accesses rather than creating a new one each time.
src/mem/bus.cc: src/mem/bus.hh: src/mem/cache/cache_impl.hh: only keep around one func port we give to anyone who wants it. Otherwise we can run out of port ids reasonably quickly if a lot of functional accesses are happening (e.g. remote debugging, dprintk, etc) |
4300:39657530a8c3 |
28-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Call compare and Swap on the target, not the response. |
4296:f7855a71f660 |
27-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/head |
4284:c8800319ed0c |
23-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zamp.eecs.umich.edu:/z/ktlim2/clean/tmp/clean2
src/cpu/base_dyn_inst.hh: Hand merge. Line is no longer needed because it's handled in the ISA. |
4219:e3f636da1042 |
27-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
First Pass At Cmp/Swap in caches |
4211:d3a09a666b68 |
12-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Clean up more memory leaks |
4203:b5c2bb0b9cae |
12-Mar-2007 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some of the memory leaks related to writebacks
src/cpu/memtest/memtest.cc: Add the [] to a delete to make it work correctly src/mem/cache/cache_impl.hh: Fix one of the memory leaks |
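The `delete` fix mentioned for memtest.cc concerns matching `new[]` with `delete[]`. The sketch below is not the memtest code; `Tracked` and `allocateAndFree` are hypothetical names used to make the effect observable through a destructor counter.

```cpp
#include <cstddef>

// Counts destructor calls so the behavior of delete[] can be observed.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates n objects with new[] and releases them with the matching
// delete[]. Using plain `delete buf` here would be undefined behavior.
inline int allocateAndFree(std::size_t n) {
    Tracked::destroyed = 0;
    Tracked *buf = new Tracked[n];
    delete [] buf;
    return Tracked::destroyed;
}
```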
4202:f7a05daec670 |
11-Mar-2007 |
Nathan Binkert <binkertn@umich.edu> |
Rework the way SCons recurses into subdirectories, making it automatic. The point is that now a subdirectory can be added to the build process just by creating a SConscript file in it. The process has two passes. On the first pass, all subdirs of the root of the tree are searched for SConsopts files. These files contain any command line options that ought to be added for a particular subdirectory. On the second pass, all subdirs of the src directory are searched for SConscript files. These files describe how to build any given subdirectory. I have added a Source() function. Any file (relative to the directory in which the SConscript resides) passed to that function is added to the build. Clean up everything to take advantage of Source(). |
4192:7accc6365bb9 |
09-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
Two fixes: 1. Make sure connectMemPorts() only gets called when the CPU's peer gets changed. This is done by making setPeer() virtual, and overriding it in the CPU's ports. When it gets called on a CPU's port (dcache specifically), it calls the normal setPeer() function, and also connectMemPorts(). 2. Consolidate redundant code that handles switching in a CPU.
src/cpu/base.cc: Move common code of switching over peers to base CPU. src/cpu/base.hh: Move common code of switching over peers to BaseCPU. src/cpu/o3/cpu.cc: Add in function that updates thread context's ports. Also use updated function to takeOverFrom() in BaseCPU. This gets rid of some repeated code. src/cpu/o3/cpu.hh: Include function to update thread context's memory ports. src/cpu/o3/lsq.hh: Add function to dcache port that will update the memory ports upon getting a new peer. Also include a function that will tell the CPU to update those memory ports. src/cpu/o3/lsq_impl.hh: Add function that will update the memory ports upon getting a new peer. src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: Add function that will update thread context's memory ports upon getting a new peer. Also use the new BaseCPU's take over from function. src/cpu/simple/atomic.hh: Add in function (and dcache port) that will allow the dcache to update memory ports when it gets assigned a new peer. src/cpu/simple/timing.hh: Add function that will update thread context's memory ports upon getting a new peer. src/mem/port.hh: Make setPeer virtual so that other classes can override it. |
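The first fix in this entry can be sketched as below. The classes are heavily simplified stand-ins (the real `Port` and CPU classes carry far more state); only `setPeer()` and `connectMemPorts()` are names taken from the message, and the counter exists just to make the call visible.

```cpp
// Hypothetical, simplified base port with a virtual setPeer().
class Port {
  public:
    virtual ~Port() = default;
    virtual void setPeer(Port *p) { peer = p; }
    Port *getPeer() const { return peer; }

  private:
    Port *peer = nullptr;
};

// Hypothetical CPU stand-in that records how often ports are reconnected.
struct Cpu {
    int connects = 0;
    void connectMemPorts() { ++connects; }
};

// The CPU's dcache port overrides setPeer(): it performs the normal
// peer update, then tells the CPU to refresh its memory ports.
class DcachePort : public Port {
  public:
    explicit DcachePort(Cpu *c) : cpu(c) {}

    void setPeer(Port *p) override {
        Port::setPeer(p);
        cpu->connectMemPorts();
    }

  private:
    Cpu *cpu;
};
```

Because the call goes through the virtual function, code that only holds a `Port *` still triggers the reconnect, and it happens only when the peer actually changes.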
4190:5069dfa3d62e |
08-Mar-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
stop m5 from leaking like a sieve: don't create a new physPort/virtPort every time activateContext() is called; add the ability to tell a memory object to delete its reference to a port, and a method to have a port call deletePortRefs() on the port owner as well as delete its peer; still need to stop calling connectMemPorts() every time activateContext() is called or we'll overflow the bus id and panic
src/cpu/thread_state.cc: if we have a (phys|virt)Port don't create a new one, have it delete its peer and then reuse it src/mem/bus.cc: src/mem/bus.hh: add ability to delete a port by using a hash_map instead of an array to store port ids; add a function to do deleting src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/mem_object.cc: src/mem/mem_object.hh: add a function to delete port references from a memory object src/mem/port.cc: src/mem/port.hh: add a removeConn function that tells the owner to delete any references to the port and then deletes its peer |
4183:3d19c1d46946 |
07-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Move the magic m5 PageTableFault into sim/faults.[hh,cc] since it's the same across all architectures. |
4176:2d52a9751dfc |
07-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make byteswap work correctly on Twin??_t types. |
4167:ce5d0f62f13b |
06-Mar-2007 |
Nathan Binkert <binkertn@umich.edu> |
Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc|hh) to core.(cc|hh) since it really belongs there now. In the process, simplify how ticks are used in the python code. |
4115:cc1d6df13c7d |
02-Mar-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
make ldtw(a) -- Twin 32 bit load work correctly -- by doing it the same way as the twin 64 bit loads
src/arch/isa_parser.py: src/arch/sparc/isa/decoder.isa: src/arch/sparc/isa/operands.isa: src/base/bigint.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: src/mem/packet_access.hh: make ldtw(a) Twin 32 bit load work correctly |
4070:74449a198a44 |
18-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
implement vtophys and 32bit gdb support
src/arch/alpha/vtophys.cc: src/arch/alpha/vtophys.hh: src/arch/sparc/arguments.hh: move Copy* to vport since it's generic for all the ISAs src/arch/sparc/isa_traits.hh: the Solaris kernel sets up a virtual-> real mapping for all memory starting at SegKPMBase src/arch/sparc/pagetable.hh: add a class for getting bits out of the TteTag src/arch/sparc/remote_gdb.cc: add 32bit support kinda.... If its 32 bit src/arch/sparc/remote_gdb.hh: Add 32bit register offsets too. src/arch/sparc/tlb.cc: cleanup generation of tsb pointers src/arch/sparc/tlb.hh: add function to return tsb pointers for an address make lookup public so vtophys can use it src/arch/sparc/vtophys.cc: src/arch/sparc/vtophys.hh: write vtophys for sparc src/base/bitfield.hh: return a mask of bits first->last src/mem/vport.cc: src/mem/vport.hh: move Copy* here since it's ISA generic |
4052:895ad21ffbf3 |
12-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
some forgotten commits |
4040:eb894f3fc168 |
12-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
rename store conditional stuff as extra data so it can be used for conditional swaps as well; Add support for a twin 64 bit int load; Add Memory barrier and write barrier flags as appropriate; Make atomic memory ops atomic
src/arch/alpha/isa/mem.isa: src/arch/alpha/locked_mem.hh: src/cpu/base_dyn_inst.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: rename store conditional stuff as extra data so it can be used for conditional swaps as well src/arch/alpha/types.hh: src/arch/mips/types.hh: src/arch/sparc/types.hh: add a largest read data type for statically allocating read buffers in atomic simple cpu src/arch/isa_parser.py: Add support for a twin 64 bit int load src/arch/sparc/isa/decoder.isa: Make atomic memory ops atomic Add Memory barrier and write barrier flags as appropriate src/arch/sparc/isa/formats/mem/basicmem.isa: add post access code block and define a twinload format for twin loads src/arch/sparc/isa/formats/mem/blockmem.isa: remove old microcoded twin load code src/arch/sparc/isa/formats/mem/mem.isa: swap.isa replaces the code in loadstore.isa src/arch/sparc/isa/formats/mem/util.isa: add a post access code block src/arch/sparc/isa/includes.isa: need bigint.hh for Twin64_t src/arch/sparc/isa/operands.isa: add a twin 64 int type src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: add support for twinloads add support for swap and conditional swap instructions rename store conditional stuff as extra data so it can be used for conditional swaps as well src/mem/packet.cc: src/mem/packet.hh: Add support for atomic swap memory commands src/mem/packet_access.hh: Add endian conversion function for Twin64_t type src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: Add support for atomic swap memory commands Rename sc code to extradata |
4034:ba523332c82b |
23-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
3 memory system fixes: 1. Update packet's flags properly when a snoop happens 2. Don't allow accesses to read a block's data if the block has outstanding MSHRs. This avoids a RAW hazard in MP systems that the memory system was not detecting properly earlier (a write required a block to upgrade, and while the upgrade was outstanding, a read came along and read old data). 3. Update MSHR's request upon a response being handled. If the MSHR has more targets than it can respond to in one cycle, then its request must be properly updated to the new head of the targets list.
src/mem/bus.cc: Update packet's flags properly upon snoop. src/mem/cache/cache_impl.hh: Be sure to not allow accesses to a block with outstanding MSHRs. src/mem/cache/miss/miss_queue.cc: Update MSHR's request upon a response being handled. |
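Fix 2 above can be pictured with a small sketch. This is not the cache_impl.hh code; the class, its fields, and the method names are hypothetical stand-ins. It only illustrates the rule that a read must not be satisfied from a block that still has an outstanding miss/upgrade.

```python
class ToyCache:
    """Hypothetical stand-in for a cache; not the gem5 Cache class."""

    def __init__(self):
        self.data = {}             # block address -> current value
        self.outstanding = set()   # block addresses with a pending MSHR

    def start_upgrade(self, addr):
        # A write needs the block upgraded; record the pending request.
        self.outstanding.add(addr)

    def finish_upgrade(self, addr, new_value):
        # The upgrade response arrived: apply the write, clear the MSHR.
        self.data[addr] = new_value
        self.outstanding.discard(addr)

    def read(self, addr):
        """Return the value on a hit, or None if the access must wait."""
        if addr in self.outstanding:
            # Reading now could return stale data (the RAW hazard).
            return None
        return self.data.get(addr)
```

With the check in place, a read that arrives between start_upgrade() and finish_upgrade() is held back instead of returning the old value.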
4026:7c8c480474c6 |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into vm1.(none):/home/stever/bk/newmem-head |
4024:9eada81a030b |
07-Feb-2007 |
Nathan Binkert <binkertn@umich.edu> |
Include compiler.hh since we use some of the #defines |
4023:fbefb05ecf2e |
06-Feb-2007 |
Kevin Lim <ktlim@umich.edu> |
Fix for LL/SC that Ron sent me. |
4022:c422464ca16e |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make memory commands dense again to avoid cache stat table explosion. Created MemCmd class to wrap enum and provide handy methods to check attributes, convert to string/int, etc. |
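A rough sketch of the idea in this change, assuming nothing about the real MemCmd beyond what the message says: commands are numbered densely so a per-command stat table stays small, and a wrapper class answers attribute queries and converts to string/int. The command names and attribute sets below are illustrative only.

```python
from enum import IntEnum


class Cmd(IntEnum):
    # Dense numbering: values run 0..N-1 with no gaps.
    READ_REQ = 0
    READ_RESP = 1
    WRITE_REQ = 2
    WRITE_RESP = 3


# Attribute table keyed by command (illustrative attribute names).
_ATTRS = {
    Cmd.READ_REQ: {"read", "request"},
    Cmd.READ_RESP: {"read", "response"},
    Cmd.WRITE_REQ: {"write", "request"},
    Cmd.WRITE_RESP: {"write", "response"},
}


class MemCmdSketch:
    """Hypothetical wrapper around the enum; not the gem5 MemCmd API."""

    def __init__(self, cmd):
        self.cmd = Cmd(cmd)

    def is_read(self):
        return "read" in _ATTRS[self.cmd]

    def is_request(self):
        return "request" in _ATTRS[self.cmd]

    def to_int(self):
        return int(self.cmd)

    def __str__(self):
        return self.cmd.name
```

Because the values are dense, a stat table can be a plain list of length len(Cmd) indexed by to_int(), rather than a table sized by the largest bit-encoded command value.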
4021:bdae6aac5c43 |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
More DPRINTF cleanup. |
4020:c77bd3d23e48 |
07-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor DPRINTF fixes. |
3970:d54945bab95d |
03-Jan-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zower.eecs.umich.edu:/eecshome/m5/newmem |
3940:b87f85bb4275 |
27-Jan-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
While I'm waiting for legion to run, make m5 compile with a few more compilers
SConstruct: src/SConscript: Add flags for Intel CC while I'm at it src/base/compiler.hh: the _Pragma stuff needs to be called this way unless someone happens to have a cleaner way src/base/cprintf_formats.hh: add std:: where appropriate src/base/statistics.hh: use this->map since icc was getting confused about std::map vs the locally defined map src/cpu/static_inst.hh: Add some more dummy returns where needed src/mem/packet.hh: add more dummy returns where needed src/sim/host.hh: use limits to come up with max tick |
3918:1f9a98d198e8 |
26-Jan-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
make our code a little more standards compliant; pretty close to compiling w/ Sun's compiler
briefly: add dummy return after panic()/fatal() split out flags by compiler vendor include cstring and cmath where appropriate use std namespace for string ops
SConstruct: Add code to detect compiler and choose cflags based on detected compiler Fix zlib check to work with suncc src/SConscript: split out flags by compiler vendor src/arch/sparc/isa/decoder.isa: use correct namespace for sqrt src/arch/sparc/isa/formats/basic.isa: add dummy return around panic src/arch/sparc/isa/formats/integerop.isa: use correct namespace for stringops src/arch/sparc/isa/includes.isa: include cstring and cmath where appropriate src/arch/sparc/isa_traits.hh: remove dangling comma src/arch/sparc/system.cc: dummy return to make sun cc front end happy src/arch/sparc/tlb.cc: src/base/compression/lzss_compression.cc: use std namespace for string ops src/arch/sparc/utility.hh: no reason to say something is unsigned unsigned int src/base/compression/null_compression.hh: dummy returns for suncc front end src/base/cprintf.hh: use standard variadic argument syntax instead of gnuc-specific renaming src/base/hashmap.hh: don't need to define hash for suncc src/base/hostinfo.cc: need stdio.h for sprintf src/base/loader/object_file.cc: munmap is in std namespace not null src/base/misc.hh: use M5 generic noreturn macros use standard variadic macro __VA_ARGS__ src/base/pollevent.cc: we need file.h for file flags src/base/random.cc: mess with include files to make suncc happy src/base/remote_gdb.cc: malloc memory for function instead of having a non-constant in an array size src/base/statistics.hh: use std namespace for floor src/base/stats/text.cc: include math.h for rint (cmath won't work) src/base/time.cc: use suncc version of ctime_r src/base/time.hh: change macro to work with both gcc and suncc src/base/timebuf.hh: include cstring for memset and use std:: src/base/trace.hh: change variadic macros to be normal format src/cpu/SConscript: add dummy returns where appropriate src/cpu/activity.cc: include cstring for memset src/cpu/exetrace.hh: include cstring for memcpy src/cpu/simple/base.hh: add dummy return for panic src/dev/baddev.cc: 
src/dev/pciconfigall.cc: src/dev/platform.cc: src/dev/sparc/t1000.cc: add dummy return where appropriate src/dev/ide_atareg.h: make define work for both gnuc and suncc src/dev/io_device.hh: add dummy returns where appropriate src/dev/pcidev.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.hh: src/mem/dram.cc: src/mem/packet.cc: src/mem/port.cc: include cstring for string ops src/dev/sparc/mm_disk.cc: add dummy return where appropriate include cstring for string ops src/mem/cache/miss/blocking_buffer.hh: src/mem/port.hh: Add dummy return where appropriate src/mem/cache/tags/iic.cc: cast hashSets to double for log() call src/mem/physical.cc: cast pmemAddr to char* for munmap src/sim/byteswap.hh: make define work for suncc and gnuc |
3879:1712a36b22cb |
27-Dec-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Change MemoryAccess dprintfs to print the data as well |
3862:ec47e4243107 |
19-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Streamline Cache/Tags interface: get rid of redundant functions, don't regenerate address from block in cache so that tags can turn around and use address to look up block again. |
3861:3b35b0f0b6a9 |
19-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
No need to template prefetcher on cache TagStore type. |
3860:73e3642713a3 |
18-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of generic CacheTags object (fold back into Cache). |
3837:c174c1e40f22 |
15-Dec-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Use my range_map to speed up findPort() in the bus. The snoop code could still use some work. |
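The speed-up from a range map comes from keeping the address ranges sorted so a lookup is a binary search rather than a scan over every port. A minimal sketch of that idea follows; the class and method names are hypothetical, ranges are half-open [start, end), and this is not the range_map implementation in the tree.

```python
import bisect


class RangeMapSketch:
    """Sorted, non-overlapping [start, end) ranges mapped to a value."""

    def __init__(self):
        self._starts = []    # sorted range start addresses
        self._entries = []   # parallel list of (start, end, value)

    def insert(self, start, end, value):
        i = bisect.bisect_left(self._starts, start)
        # Reject a range that overlaps its left or right neighbour.
        if i > 0 and self._entries[i - 1][1] > start:
            raise ValueError("overlaps previous range")
        if i < len(self._entries) and self._entries[i][0] < end:
            raise ValueError("overlaps next range")
        self._starts.insert(i, start)
        self._entries.insert(i, (start, end, value))

    def find(self, addr):
        """Return the value whose range contains addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, end, value = self._entries[i]
            if addr < end:
                return value
        return None
```

Here the stored value would be a port id; find() costs O(log n) in the number of ranges.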
3823:1c8f87aa103e |
06-Dec-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Many more fixes for SPARC_FS. Gets us to the point where SOFTINT starts getting touched.
configs/common/FSConfig.py: Physical memory on the T1 starts at 1MB, The first megabyte is unmapped to catch bugs src/arch/isa_parser.py: we should readmiscregwitheffect not readmiscreg src/arch/sparc/asi.cc: Fix AsiIsNucleus spelling with respect to header file Add ASI_LSU_CONTROL_REG to AsiSiMmu src/arch/sparc/asi.hh: Fix spelling of two ASIs src/arch/sparc/isa/decoder.isa: switch back to defaults letting the isa_parser insert readMiscRegWithEffect src/arch/sparc/isa/formats/mem/util.isa: Flesh out priviledgedString with hypervisor checks Make load alternate set the flags correctly src/arch/sparc/miscregfile.cc: insert some forgotten break statements src/arch/sparc/miscregfile.hh: Add some comments to make it easier to find which misc register is which number src/arch/sparc/tlb.cc: flesh out the tlb memory mapped registers a lot more src/base/traceflags.py: add an IPR traceflag src/mem/request.hh: Fix a bad assert() in request |
3806:65ae5388c059 |
29-Nov-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Add support for mmapped iprs to atomic cpu
src/arch/SConscript: add mmaped_ipr.hh to switch headers src/arch/sparc/asi.hh: make ASI_IMPLICT=0 so by default nothing needs to be done src/arch/sparc/miscregfile.hh: miscregfile no longer needs to include asi.hh src/arch/sparc/tlb.cc: src/arch/sparc/tlb.hh: implement panic instructions for mmaped ipr reads src/cpu/simple/atomic.cc: add check for mmaped iprs and handle them if they exist src/mem/request.hh: allocate space in the flags for mmaped iprs. Put it in the first 8 bits so that by default it's fast. Move the other flags up 8 bits |
3804:fa7a01dddc7a |
23-Nov-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
first cut at a sparc tlb
src/arch/sparc/SConscript: Add code to serialize/unserialize tlb entries src/arch/sparc/asi.cc: src/arch/sparc/asi.hh: update asi names for how they're listed in the supplement add asis add more asi functions src/arch/sparc/isa_traits.hh: move the interrupt stuff and some basic address space stuff into isa traits src/arch/sparc/miscregfile.cc: src/arch/sparc/miscregfile.hh: add mmu registers to tlb get rid of implicit asi stuff... the tlb will handle it src/arch/sparc/regfile.hh: make inst/dataAsid return ints not asis src/arch/sparc/tlb.cc: src/arch/sparc/tlb.hh: first cut at sparc tlb src/arch/sparc/vtophys.hh: pagetable needs to be included here src/mem/request.hh: add asi and whether the request is a memory mapped register to the request object src/sim/host.hh: fix incorrect definition of LL |
3751:b422ffec62c1 |
22-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Added a parameter to set memory to zero. This is to support Legion, and once we can make our own hypervisor binary, we probably won't need it. |
3749:89fb514175fe |
20-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Fix an assert to correctly make sure a request falls entirely inside a memory. |
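A containment check of this kind has to test both ends of the request against the memory's base and limit. A small sketch, assuming a memory described by a base address and a size (the function and parameter names are hypothetical, not the ones in physical.cc):

```python
def request_fits(mem_base, mem_size, req_addr, req_size):
    """True if [req_addr, req_addr + req_size) lies entirely inside
    [mem_base, mem_base + mem_size)."""
    return mem_base <= req_addr and req_addr + req_size <= mem_base + mem_size
```

Checking only the start address would accept a request that begins inside the memory and runs past its end; comparing against the size alone, without the base, would also break for a memory that does not start at address 0.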
3738:c06cd072bbbe |
14-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Split CachePort class into CpuSidePort and MemSidePort and push those into derived Cache template class to eliminate a few layers of virtual functions and conditionals ("if (isCpuSide) { ... }" etc.). |
3721:6d0e55c05a46 |
05-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Don't compress data on writebacks unless it's actually necessary. |
3719:23ca579a363a |
04-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Turn cache MissQueue/BlockingBuffer into virtual object instead of template parameter. |
3712:c8a8938402cd |
03-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make cache compression policy a runtime virtual thing instead of a template policy. |
3682:bf27fd870dae |
28-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Remove assertion. It's not needed and messes up writebacks when a 2 level cache is used in a uniprocessor setting. |
3678:a689a7cf337e |
22-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Do a functional access to levels above on a read as a temporary solution for L2's in FS
Fix a small writeback bug when missing in the L2 in atomic mode
src/mem/bus.cc: Fix a comment to make sense src/mem/cache/cache_impl.hh: Do a functional access to levels above on a read as a temporary solution for L2's in FS Also fix a small writeback miss in L2 issue src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Do a functional access to levels above on a read as a temporary solution for L2's in FS tests/quick/00.hello/ref/alpha/linux/o3-timing/m5stats.txt: tests/quick/00.hello/ref/alpha/linux/simple-timing/m5stats.txt: tests/quick/01.hello-2T-smt/ref/alpha/linux/o3-timing/m5stats.txt: Update ref's for writeback changes |
3665:307a21253be8 |
14-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix bugs around uni-coherence invalidates being propagated properly.
src/mem/bus.cc: Make it so that invalidates being sent from the responder up don't call the responder but they should also not Panic. src/mem/packet.hh: If we don't have data in the packet, don't call deleteData: Example: InvalidateRequests never have data. |
3664:5e1cdb2f10cc |
14-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Update atomic and functional paths for snoops as well |
3662:9dacf0926b69 |
14-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Update bus bridges now that snoop ranges are passed properly
src/mem/bridge.cc: Update bridges, now that snoop addresses are properly forwarded. Bus bridge should only handle snoops on the second phase (SNOOP_COMMIT) src/mem/bus.cc: src/mem/bus.hh: Make sure if a busBridge has access to both things that snoop and things that respond it only takes the request once |
3660:63e9c578bf83 |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a bug to handle the fact that a CPU can send Functional accesses while a sendTiming has not returned in the call stack.
src/mem/cache/base_cache.cc: Sometimes a functional access comes while waiting on an outstanding packet being sent. This could be because the Timing CPU does some post processing on the recvTiming which sends a functional access. Either the CPU should leave the pkt/req around (so they can be referenced in the mem system), or the mem system should remove them from outstanding lists and reinsert them if they fail in the sendTiming.
I did the latter; eventually we should consider doing the former if that is the correct behavior. |
3655:4cae75fbc19c |
14-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
If all the targets aren't satisfied, reinitialize the packet. |
3652:81081c5de9aa |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
If we didn't satisfy all targets, reset the packet we are requesting with. |
3651:e9c8b64e9e21 |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some errors related to snooping and functional access in the bus
src/mem/bus.cc: Only call snoop once per port, need to fix it so snoop ranges that overlap aren't added to list Functional accesses that call snoop and it goes to a higher bus may change the src, reset it after each snoop. |
3650:eb7cd03c1024 |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix problems with snoop ranges not working properly on functional accesses
src/mem/bus.cc: Actually return the snoop list when asked for it. Don't get stuck in infinite functional loops |
3648:e84414759d6b |
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Since cpus now send out snoop ranges, remove it from the cache. |
3612:936dcb3f3e2d |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Physical memory overrides the tport version of recvFunctional, need to do the check here for responses that match as well |
3611:205c2bdcdbb0 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Handle packets being deleted by lower level properly. Fixes for Mem Leak associated with Writebacks.
src/mem/cache/miss/mshr_queue.cc: Fixes for Mem Leak associated with Writebacks. (Double Delete removed) |
3610:c0f97b22db1a |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't insert responses into the list more than once. If you get inserted in the front, reschedule the event |
3609:932a09e3e0c2 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Move code before an early return to make sure it is executed on all paths |
3608:8d8258faf7f6 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Yet another small bug in mem system related to flow control
src/mem/cache/cache_impl.hh: When upgrades change to readEx make sure to allocate the block Fix dprintf |
3607:7b7dd28784c4 |
12-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix functional access errors related to delayed responses in cachePort
src/mem/cache/base_cache.cc: On a delayed response, be sure to call the fixPacket wrapper to toggle hasData flag. src/mem/packet.cc: src/mem/packet.hh: Create a wrapper to toggle the hasData flag on delayed responses |
3606:9a4154893155 |
10-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More fixes for functional accesses. It now makes the writeback memory leak crash all configs. Working on that now.
src/mem/cache/base_cache.cc: Keep a list of the responders so we can search them on functional accesses. src/mem/cache/base_cache.hh: Properly put things on a list for responses so we can search the list. Also, be sure to check the outgoing ports lists on a functional access (factor some common code out there) src/mem/cache/cache_impl.hh: Properly return when the first read hit on a functional access. Make sure to call to check the other ports list of packets before forwarding it out. |
3605:ed3c5b4e8bca |
10-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Big fix for functional access, where we forgot to copy the last byte on write intersections.
src/mem/packet.cc: Make sure to copy the whole data (we were one byte short) src/mem/tport.cc: Fix for the proper semantics of fixPacket |
3584:8c3cdb2c001c |
09-Nov-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Get SPARC to the point that it starts running. Add ability to load the ROM bin files, cleanup lockstep printing a bit Since we don't have a platform yet, you need to comment out the default responder stuff in Bus.py to make it work.
SConstruct: Add TARGET_ISA to the list of environment variables that end up in the build_env for python configs/common/FSConfig.py: add a simple SPARC system to begin testing with, you'll need to change makeLinuxAlphaSystem to makeSparcSystem in fs.py for now src/SConscript: add a raw file object, at least until we get more info about how to compile openboot properly src/arch/sparc/system.cc: src/arch/sparc/system.hh: add parameters for ROM files (OBP/Reset/Hypervisor), a ROM, load files into ROM src/base/loader/object_file.cc: src/base/loader/object_file.hh: add option to try raw when nothing works src/cpu/exetrace.cc: cleanup lockstep printing a little bit src/cpu/m5legion_interface.h: change the instruction to be 32 bits because it is src/mem/physical.cc: fix assert that doesn't work if memory starts somewhere above 0 src/python/m5/objects/BaseCPU.py: Add if statement to choose between sparc tlbs and alpha tlbs src/python/m5/objects/System.py: Add a sparc system that sets the rom addresses correctly src/python/m5/params.py: add the ability to add Addr() together |
3570:aacc19068f25 |
08-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Put the ProcessInfo and StackTrace objects into the ISA namespaces. |
3513:3dd360e9bbc4 |
09-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Be sure to populate the packet's finishTime field in the atomic timing case. |
3512:cefe7f965104 |
09-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Draining fixes.
src/cpu/o3/cpu.cc: Handle draining properly when CPU isn't actually being used. src/cpu/simple/atomic.cc: Be sure to set status properly when draining. src/mem/bus.cc: Fix for draining. |
3503:0754b2b23408 |
07-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Fix up bus draining and add draining to the caches.
src/mem/bus.cc: Fix up draining to work properly. src/mem/bus.hh: Initialize drainEvent to NULL. src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Add draining to the caches. |
3490:37a313c96683 |
02-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-busfix |
3489:a90b0ecd17a5 |
02-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Have bus use the BadAddress device to handle bad addresses. The O3 CPU should be able to boot into Linux with caches on after this change.
src/mem/bus.cc: src/mem/bus.hh: Bus now will be setup with a default responder, unless the user overrides it. This default responder should return BadAddress if no matching port is found. src/python/m5/objects/Bus.py: Bus now has a default responder for FS mode if the user doesn't override it. It returns BadAddress if no matching port is found. src/python/m5/objects/Tsunami.py: Add bad address device. Also record when the user has specified their own default responder. |
3487:dd7b0e5e907c |
02-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Caches return a new functional port whenever asked for one.
src/mem/cache/base_cache.cc: Have caches return a new functional port whenever asked for them. I'm pretty sure this is desired behavior. Ron can correct me if it's not. |
3479:4fbcaa81d105 |
01-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem/ into zeep.eecs.umich.edu:/home/gblack/m5/newmemmemops |
3476:0e26b5458236 |
31-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-busfix
configs/example/fs.py: configs/example/se.py: src/mem/tport.hh: Hand merge. |
3470:119eb2ef3772 |
01-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Added code to handle draining. |
3450:4dbe91f2b2cf |
31-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add the ability to insert into the middle of the timing port send list |
3403:92c08efc9d53 |
25-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Fix simple timing port keep a list of all packets, have only one event, and scan all packets on a functional access. |
3401:1df0cb879413 |
31-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Ports now have a pointer to the MemObject that owns it (can be NULL).
src/cpu/simple/atomic.hh: Port now takes in the MemObject that owns it. src/cpu/simple/timing.hh: Port now takes in MemObject that owns it. src/dev/io_device.cc: src/mem/bus.hh: Ports now take in the MemObject that owns it. src/mem/cache/base_cache.cc: Ports now take in the MemObject that own it. src/mem/port.hh: src/mem/tport.hh: Ports now optionally take in the MemObject that owns it. |
3400:469db0566924 |
25-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix fixPacket functionality to calculate sizes properly
src/mem/packet.cc: Copy size is calculated by END-BEGIN not BEGIN-END |
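The size bug fixed here (and the earlier one-byte-short copy on write intersections) are both easy to make when intersecting two byte ranges. The sketch below shows one consistent way to do it with half-open ranges; it is an illustration only, and the function names and signatures are hypothetical rather than those of fixPacket().

```python
def overlap(a_addr, a_size, b_addr, b_size):
    """Return (begin, length) of the overlap of two byte ranges, or None."""
    begin = max(a_addr, b_addr)
    end = min(a_addr + a_size, b_addr + b_size)   # exclusive end
    if end <= begin:
        return None
    return begin, end - begin                     # END - BEGIN, not BEGIN - END


def copy_overlap(dst_addr, dst, src_addr, src):
    """Copy the overlapping bytes of src into dst (both bytearrays).

    Returns the number of bytes copied.
    """
    hit = overlap(dst_addr, len(dst), src_addr, len(src))
    if hit is None:
        return 0
    begin, length = hit
    d = begin - dst_addr    # offset of the overlap inside dst
    s = begin - src_addr    # offset of the overlap inside src
    dst[d:d + length] = src[s:s + length]
    return length
```

With an exclusive end, the length is simply end minus begin and no +1 correction is needed; with an inclusive end the same computation would come out one byte short.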
3375:fc9deea82085 |
23-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Clean up cache DPRINTFs |
3374:d274a61e8e6c |
22-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
s/pktuest/request/ (all in comments) |
3369:1da3e60827b6 |
22-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Small bug fixes for timing LL/SC. Better now but not necessarily 100% there yet.
src/mem/cache/cache_impl.hh: Generate response packet on failed store conditional. src/mem/packet.hh: Clear packet flags when reinitializing. (SATISFIED in particular is one we don't want to leave set.) |
3367:5bd949e01861 |
21-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Just give up if a store conditional misses completely in the cache (don't treat as normal write miss). |
3366:922d6f4dfa97 |
21-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix formatting that got screwed up when tabs were removed. |
3365:323803612cbb |
21-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Refactor coherence state table initialization. |
3353:495bb0a961f2 |
20-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Get rid of a variable put back by merge. |
3352:8e940d22b2a8 |
20-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/tport.cc: Merge PacketPtr changes |
3349:fec4a86fa212 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
Use PacketPtr everywhere |
3348:11f6ef023158 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
refactor code for the packet, get rid of packet_impl.hh and call it packet_access.hh and fix the #includes so things compile right. |
3347:c182ee45f6b4 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
initialize end, clean up loop |
3346:247e6b9b57b7 |
20-Oct-2006 |
Nathan Binkert <binkertn@umich.edu> |
Fix compile of m5.fast |
3342:19e716ad518e |
20-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Use fixPacket function everywhere. Fix fixPacket assert function. Stop timing port from forwarding the request if a response was found in its queue on a read.
src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: src/python/m5/objects/MemTest.py: Add parameter to configure what percentage of mem accesses are functional src/mem/cache/base_cache.cc: src/mem/cache/cache_impl.hh: Use fix Packet function src/mem/packet.cc: Fix an assert that was checking the wrong thing src/mem/tport.cc: Properly detect if we need to do the access to the functional device |
3341:82c51d920701 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix corner case on assertion. I need to move over to using the fixPacket function so I don't have to make the same changes everywhere. Still a functional access bug someplace I need to track down in timing mode.
src/mem/cache/base_cache.cc: src/mem/cache/cache_impl.hh: Fix corner case on assertion tests/configs/memtest.py: Updated memtester with uncacheable addresses and functional accesses |
3340:5b24f2c55fae |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix memtester to use functional access, fix cache to work functionally now that we could test it.
src/cpu/memtest/memtest.cc: Fix memtest to do functional accesses src/mem/cache/cache_impl.hh: Fix cache to handle functional accesses properly based on memtester changes Still need to fix functional accesses in timing mode now that the memtester can test it. |
3339:d1b3ec71baa4 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Small changes: ?? doesn't compile in warn statements Should have been false, where I had a true.
src/cpu/o3/lsq_impl.hh: Apparently you can't have ?? in a warn statement (Something about trigraphs) src/mem/cache/cache_impl.hh: Forgot to signal atomic mode in snoopProbe |
3338:fdb673b90ca7 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes to get single level uni-coherence to work. Now to try L2 caches in FS.
src/mem/cache/base_cache.hh: Fix uni-coherence for atomic accesses in coherence protocol access to port src/mem/cache/cache_impl.hh: Properly handle uni-coherence src/mem/cache/coherence/simple_coherence.hh: Properly forward invalidates (not done for MSI+ protocols (assumed top level for now)) src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Properly forward invalidates in atomic/timing uni-coherence |
3337:98e3fe23fe22 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3336:511088415da6 |
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Always get the functional access from the highest level of cache first.
src/mem/cache/cache_impl.hh: Get the read data from the highest level of cache on a functional access |
3335:71bef174e59f |
18-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix WriteInvalidateResp |
3334:79f481d4e307 |
18-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/bk/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3328:50b7be1f9ab6 |
19-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
First cut at LL/SC support in caches (atomic mode only).
configs/example/fs.py: Add MOESI protocol to caches (uni coherence not quite working w/FS yet). |
3320:a8910dbabb44 |
18-Oct-2006 |
Lisa Hsu <hsul@eecs.umich.edu> |
need some initializations before doing the loop. |
3317:fc913ad3eba5 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Break a lot of overly long lines. Factor out some asserts that were on both sides of an if/else. |
3316:191bf5e30ac3 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of doData() lines (were already commented out). Reindent due to resulting changes in nesting. |
3315:f15ce6434ab0 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of obsolete in-cache copy support. |
3313:f44dfa966df5 |
18-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Include packet_impl.hh (need this on my laptop, but not on zizzer... g++ 4 thing maybe?) |
3311:7eb47a60dbd4 |
17-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add code to serialize se structures. Lisa is working on the python side of things and will test
src/mem/page_table.cc: src/mem/page_table.hh: add code to serialize/unserialize page table src/sim/process.cc: src/sim/process.hh: add code to serialize/unserialize process |
3310:21adbb41a37e |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for uni-coherence in timing mode for FS. Still a bug in atomic uni-coherence in FS.
src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: Make CPU models handle coherence requests src/mem/cache/base_cache.cc: Properly signal coherence CSHRs src/mem/cache/coherence/uni_coherence.cc: Only deallocate once |
3309:183edf675c27 |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes to cache eliminating the assumption that the Packet is still valid after sending out a request. Still need to rework upgrades into this system, but works for now.
src/mem/cache/base_cache.cc: Reorder code to be more readable src/mem/cache/base_cache.hh: Be sure to delete the copy on a bus block src/mem/cache/cache_impl.hh: Be sure to remove the copy on a writeback success src/mem/cache/miss/mshr_queue.cc: Apply De Morgan's law to make it easier to understand src/mem/tport.cc: Delete writebacks |
3308:b85887027c9b |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Properly check the pkt pointer on upgrades to ensure no segfaults when writebacks delete the packet. |
3307:ee2de66e23f1 |
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix it so that the cache does not assume it still has the packet it sent out via sendTiming. Still need to fix upgrades to use this path
src/mem/cache/base_cache.cc: Copy the pkt to the MSHR before issuing the sendTiming where it may be changed/consumed src/mem/cache/cache_impl.hh: Use copy of packet, because sendTiming may have changed the pkt Also, delete the copy when the time comes |
3303:d67ab1244d38 |
14-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of unused CacheBlk << output operator. |
3296:58498b71afd8 |
12-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
replace functional code in tport with fixPacket(). fixPacket() should be used anywhere a functional packet and timing packet are found to have the same address. |
3293:4ac3d9486d6e |
13-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix for DMA's in FS caches. Fix CSHR's for flow control. Fix for Bus Bridges reusing packets (clean flags up)
Now both timing/atomic caches with MOESI in UP fail at same point.
src/dev/io_device.hh: DMA's should send WriteInvalidates src/mem/bridge.cc: Reusing packet, clean flags in the packet set by bus. src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: Fix CSHR's for flow control. src/mem/packet.hh: Make a writeInvalidateResp, since the DMA expects responses to its writes |
3292:34666be8f3fb |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix CSHR retries |
3288:c78574da82b6 |
12-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
small bus updates for functional accesses |
3286:21d9d32ab8ab |
12-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zeep.pool:/z/saidi/work/m5.newmem.head
src/mem/packet.hh: hand merge |
3285:89b08bd7420e |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Remove bus and top level parameters from cache
src/mem/cache/base_cache.hh: Remove top level param from cache src/mem/cache/coherence/uni_coherence.cc: Remove top level parameters from the cache |
3284:917750443a75 |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Check the response queue on functional accesses. The response queue is not tying up an MSHR, should we change that or assume infinite storage for responses?
src/mem/cache/base_cache.cc: src/mem/tport.cc: Add in functional check of retry queued packets. |
3281:d0f7a2e1573f |
12-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix problems with unCacheable addresses in timing-coherence
src/base/traceflags.py: src/mem/physical.cc: Add debug flags for physical memory accesses src/mem/cache/cache_impl.hh: Snoops to uncacheable blocks should not happen src/mem/cache/miss/miss_queue.cc: Set the size properly on unCacheable accesses |
3265:4e67752fdfe0 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Make default ID unique (not broadcast) Fix a segfault associated with DefaultId
src/mem/bus.cc: Handle a segfault in the bus when DefaultPort was being used src/mem/bus.hh: Make the Default ID more unique (it overlapped with Broadcast ID) |
3264:9f72c06d14de |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Forgot to mark myself as on the retry list |
3263:e532da529c9f |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix bus in FS mode.
src/mem/bus.cc: Add debugging statement src/mem/bus.hh: Fix implementation of bus for subsequent recvTimings while handling a retry request. src/mem/tport.cc: Rework timing port to retry properly |
3262:5f96609a30ef |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More cache fixes. Atomic coherence now works as well.
src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: Make Memtester able to test atomic as well src/mem/bus.cc: src/mem/bus.hh: Handle atomic snoops properly for cache->cache transfers src/mem/cache/cache_impl.hh: Debug output. Clean up memleak in atomic mode. Set hitLatency. Still need to send back reasonable number for atomic return value. src/mem/packet.cc: Add command strings for new commands src/python/m5/objects/MemTest.py: Add param to test atomic memory. |
3261:e3ca644e51d4 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Update for Atomic Coherence with Gabe's bus |
3260:d9ef6d4cbe2a |
12-Oct-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add a traceflag for functional accesses implement fix packet and add the ability to print a packet to a ostream remove tabs in packet.hh (Could people stop inserting them??!?!?!) mark const functions in packet.hh as such
src/base/traceflags.py: add a traceflag for functional accesses src/mem/packet.cc: implement fix packet and add the ability to print a packet to a ostream src/mem/packet.hh: add the ability to print a packet to an ostream remove tabs in file mark const functions as such |
3255:5b6cade9060f |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Use bus response time parameters Fix bug with deadlocking
src/mem/cache/base_cache.cc: Make sure to not wait anymore |
3253:055dae946472 |
11-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Don't call recvRetry if the bus is busy anyway. This takes care of a corner case as well when dealing with grants that aren't used. |
3252:6b010687be4e |
11-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Make the bus work if the other side's recvRetry doesn't call sendTiming for some reason. |
3251:5ed435255205 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
When turning asserts into if's don't forget to invert.
src/mem/cache/base_cache.cc: When turning asserts into if's don't forget to invert. Must be too sleepy. |
3250:e32f670162a5 |
11-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Writebacks can be pulled out from under the BusRequest when snoops of upgrades to owned blocks hit in the WB buffer |
3249:7144ab5a3c94 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Only issue responses if we aren't already blocked |
3248:26b6037719ef |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/n/wexford/x/gblack/m5/newmem_bus into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/bus.cc: SCCS merged |
3247:aeca83138049 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/n/wexford/x/gblack/m5/newmem_bus into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/bus.cc: SCCS merged |
3246:29acc553907f |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Debugging info
src/base/traceflags.py: Add new flags for cacheport src/mem/bus.cc: Add debugging info src/mem/cache/base_cache.cc: Add debugging info |
3244:c0ca5153f7e2 |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Make sure the bus is occupied for non-broadcast packets as well. |
3243:0f95e7d73043 |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Put in an accounting mechanism and an assert to make sure something doesn't try to send another packet while it's still waiting for the bus. |
3242:46cbc3bd6564 |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Fixed a corner case and simplified the logic in Packet::intersect. |
3241:76bb7218674f |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Changed the bus to use a bool to keep track of retries rather than a pointer
src/mem/tport.cc: minor formatting tweak |
3238:dadfdbbf3955 |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3237:39baab979195 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some more code cleanup
src/mem/cache/base_cache.cc: Add sanity checks src/mem/cache/base_cache.hh: Fix for retry mechanism |
3236:f303f5e88656 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some more mem leaks, still some left Update retry mechanism
src/mem/cache/base_cache.cc: Rework the retry mechanism src/mem/cache/base_cache.hh: Rework the retry mechanism Try to fix memory bug src/mem/cache/cache_impl.hh: Rework upgrades to not be blocked by slave src/mem/cache/miss/mshr_queue.cc: Fix mem leak on writebacks |
3235:87bec63ab497 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix cshr Retry's Fix Upgrades being blocked by slave |
3230:e86a03911728 |
09-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zamp.eecs.umich.edu:/z/ktlim2/clean/o3-merge/newmem
src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: src/cpu/simple/timing.hh: tests/configs/o3-timing-mp.py: Hand merge. |
3224:60e426da682b |
09-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Update memory assertion to check for whole range.
src/mem/physical.cc: Update assertion to check for full range. |
3219:32e49a9eea07 |
10-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Fixed a bug where a packet was attempted to be sent even though another packet was waiting for the bus. |
3218:41e0f5606940 |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Fixes to the bus, and added fields to the packet.
src/mem/bus.cc: Put back the check to see if the bus is busy. Also, populate the fields in the packet to indicate when the first word and the entire packet will be delivered. src/mem/bus.hh: Remove the occupyBus function. src/mem/packet.hh: Added fields to the packet to indicate when the first chunk of a packet arrives, and when the entire packet arrives. |
3217:317ca1c50bbf |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Yet another fix to the HasData command attribute. |
3216:24d3fbc238d8 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Actually set the HasData attribute on Read Responses |
3215:cf2f7f09cab2 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix another merge issue |
3214:779bab9071b5 |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/packet.hh: Hand merge code |
3212:41b04a73857f |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3211:fe9df4627b32 |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3210:e8c2c35d5c2b |
09-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Potentially functional partially timed bandwidth limited bus model.
src/mem/bus.cc: Fixes to the previous hand merging, and put the snooping back into recvTiming and out of its own function. src/mem/bus.hh: Put snooping back into recvTiming and not in its own function. |
3209:91744bf6c92a |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus
src/mem/bus.cc: Hand merged. Needs to be fixed |
3208:97d9cc1e626f |
10-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix several bugs pertaining to upgrades/mem leaks.
src/mem/cache/base_cache.cc: Fix a bug about not having a request to send src/mem/cache/base_cache.hh: Fix a bug with the blocking code src/mem/cache/cache.hh: Fix a bug with snoop hits in WB buffer src/mem/cache/cache_impl.hh: Fix a bug with snoop hits in WB buffer Also, add better DPRINTF's src/mem/cache/miss/miss_queue.cc: Fix a bug with upgrades (Need to clean it up later) src/mem/cache/miss/mshr.cc: Fix a memory leak bug, still some outstanding with writebacks not being deleted src/mem/cache/miss/mshr_queue.cc: Fix a bug about upgrades (need to clean up later) src/mem/packet.hh: Fix for newly added cmd attribute for upgrades tests/configs/memtest.py: More interesting testcase |
3207:0698f82cfbb3 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Handle NACK's that occur from devices on the same bus. Not fully implemented yet, but good enough for single level cache coherence
src/mem/packet.hh: Add a bit to distinguish invalidates and upgrades |
3206:ba8d40305e98 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a typo preventing compilation |
3205:135273dc77a9 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix how upgrades work. Remove some dead code.
src/mem/cache/cache_impl.hh: Upgrades don't need a response. Moved satisfied check into bus so removed some dead code. src/mem/cache/coherence/coherence_protocol.cc: src/mem/packet.hh: Upgrades don't require a response |
3204:1ac62ef68c44 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
One step closer to having NACK's work.
src/cpu/memtest/memtest.cc: Fix functional return path src/cpu/memtest/memtest.hh: Add snoop ranges in src/mem/cache/base_cache.cc: Properly signal NACKED src/mem/cache/cache_impl.hh: Catch nacked packet and panic for now |
3199:8e7749972f03 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a typo in the printf |
3197:c5c7d434d135 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a bitwise operation that was accidentally a logical operation. |
3196:8eb90bc29df8 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Make memtest work with 8 memtesters
src/mem/physical.cc: Update comment to match memtest use src/python/m5/objects/PhysicalMemory.py: Make memtester have a way to connect functionally tests/configs/memtest.py: Properly create 8 memtesters and connect them to the memory system |
3195:4aa11ac8395c |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3194:a304c81d654d |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Set size properly on uncache accesses Don't use the senderState after you get a successful sendTiming. Not guaranteed to be correct
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.hh: Don't use the senderState after you get a successful sendTiming. Not guaranteed to be correct |
3193:3df743a775d5 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Add more DPRINTF's fix a supply condition.
src/mem/cache/cache_impl.hh: Add more useful DPRINTF's Remove the PC to get rid of asserts |
3192:f3e215dda3f6 |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Have cpus send snoop ranges |
3189:bd5657abca1a |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't create a response if one isn't needed. |
3188:0a850349908c |
09-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Don't block responses even if the cache is blocked. |
3185:1cc3355b84bf |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Make sure to propagate sendFunctional calls with functional not atomic.
src/mem/cache/cache_impl.hh: Fix an error case by putting a panic in. Make sure to propagate sendFunctional calls with functional not atomic. |
3184:8edaf4539e05 |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for functional path.
If the cpu needs to update any state when it gets a functional write (LSQ??) then that code needs to be written.
src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: CPU's can receive functional accesses, they need to determine if they need to do anything with them. src/mem/bus.cc: src/mem/bus.hh: Make the functional path do the correct type of snoop |
3175:693ce319ee95 |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Only respond if the pkt needs a response. Fix an issue with memory handling writebacks.
src/mem/cache/base_cache.hh: src/mem/tport.cc: Only respond if the pkt needs a response. src/mem/physical.cc: Make physical memory respond to writebacks, set satisfied for invalidates/upgrades. |
3174:b6b8440de50e |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3173:2df0d82268d6 |
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Move away from using the statusChange function on snoops. Clean up snooping code in general. |
3172:2c84db071850 |
08-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Replace tests of LOCKED/UNCACHEABLE flags with isLocked()/isUncacheable(). |
3170:37fd1e73f836 |
08-Oct-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. *No* support for caches or O3CPU. Note that properly setting cpu_id on all CPUs is now required for correct operation.
src/arch/SConscript: src/base/traceflags.py: src/cpu/base.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: src/python/m5/objects/BaseCPU.py: tests/configs/simple-atomic.py: tests/configs/simple-timing.py: tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. *No* support for caches or O3CPU. |
3168:31c84f0573e1 |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
missing else |
3167:8c2a0a0d4ed5 |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
bus changes
src/mem/bus.cc: src/mem/bus.hh: minor fix and some formatting changes src/python/m5/objects/Bus.py: changed bits to bytes |
3166:0ee91593402d |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into zeep.eecs.umich.edu:/home/gblack/m5/newmem_bus |
3159:a7e0e903d13b |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
A possible implementation of a multiplexed bus. |
3158:e36d37280687 |
08-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Add in HasData, and move the define of NUM_MEM_CMDS to a more visible location. |
3156:2e6fc95d9ccf |
05-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Partial reimplementation of the bus. The "clock" and "width" parameters have been added, and the HasData flag has been partially added to packets. |
3153:90c1e143e33d |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix a missing pointer |
3152:f16d754e0b10 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
No need to keep trying to request the data bus if we are already waiting. |
3151:7e437baee004 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Add mechanism for caches to handle failure of the fast path on responses.
For now, responses have priority over requests (may want to revisit this).
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Add mechanism for caches to handle failure of the fast path on responses. |
3149:5409b6f356a3 |
07-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix infinite writebacks bug in cache.
src/mem/cache/cache_impl.hh: Make sure to pop the list. Fixes infinite writeback bug. src/mem/cache/miss/mshr_queue.cc: Add an assert as sanity check in case .full() stops working again. |
3148:765ddf2612f1 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest |
3144:b6e9e1811d71 |
06-Oct-2006 |
Lisa Hsu <hsul@eecs.umich.edu> |
there are two main thrusts of this changeset.
1) return the periodicity of checkpoints back into the code (i.e. make m5 checkpoint n m meaningful again). 2) to do this, i had to muck around with being able to repeatedly schedule a SimLoopExitEvent, which led to changes in how exit simloop events are handled to make this easier.
src/arch/alpha/isa/decoder.isa: src/mem/cache/cache_impl.hh: modify arg. order for new calling convention of exitSimLoop. src/cpu/base.cc: src/sim/main.cc: src/sim/pseudo_inst.cc: src/sim/root.cc: now, instead of creating a new SimLoopExitEvent, call a wrapper schedExitSimLoop which handles all the default args. src/sim/sim_events.cc: src/sim/sim_events.hh: src/sim/sim_exit.hh: add the periodicity of checkpointing back into the code.
to facilitate this, there are now two wrappers (instead of just overloading exitSimLoop). exitSimLoop is only for exiting NOW (i.e. at curTick), while schedExitSimLoop schedules an exit event for the future. |
3137:5dd9b13986a7 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Another thread number removed |
3136:a1eba7e17de5 |
06-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Remove threadnum from cache everywhere for now Fix so that blocking for the same reason doesn't fail. I.E. multiple writebacks want to set the blocked flag.
src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: Remove threadnum from cache everywhere for now |
3135:8e008e281579 |
05-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for functional accesses to use the snoop path. And small other tweaks to snooping coherence.
src/mem/cache/base_cache.hh: Make timing response at the time of send. src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: Update probe interface to be bi-directional for functional accesses src/mem/packet.hh: Add the function to create an atomic response to a given request |
3134:cf578b0dd70d |
05-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
First pass at snooping stuff that compiles and doesn't break.
Still need: -Handle NACK's on the receive side -Distinguish top level caches -Handle responses from caches failing the fast path -Handle BusError and propagate it -Fix the invalidate packet associated with snooping in the cache
src/mem/bus.cc: Make sure to snoop on functional accesses src/mem/cache/base_cache.cc: Wait to make a request into a response until it is ready to be issued src/mem/cache/base_cache.hh: Support range changes for snoops Set up snoop responses for cache->cache transfers src/mem/cache/cache_impl.hh: Only access the cache if it wasn't satisfied by cache->cache transfer Handle snoop phases (detect block, then snoop) Fix functional access to work properly (still need to fix snoop path for functional accesses) |
3091:dba513d68c16 |
30-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Move more common functionality into SimpleTimingPort, allowing derived classes to be simplified. |
3090:3cced9156352 |
30-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor include file & formatting cleanup. |
3081:05343a8bf269 |
28-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Make address formats consistent in DPRINTFs. |
3075:b2e56d8b8566 |
22-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Still need LL/SC support in cache, add hack to always return success for now |
3074:e87fbe7941f8 |
22-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Committing a version of the multi-phase snoop atomic bus so people can see the framework. Doesn't work, but also doesn't break uni-processor systems. Working on pulling out the changes in the cache so that it remains working.
src/mem/bus.cc: Changes for multi-phase snoop Some code for registering snoop ranges (a version that compiles and runs, but does nothing) src/mem/bus.hh: Changes for multi-phase snoop src/mem/packet.hh: Flag for multi-phase snoop src/mem/port.hh: Status for multi-phase snoop |
3051:b4f73000973b |
21-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zizzer.eecs.umich.edu:/.automount/zazzer/z/rdreslin/m5bk/newmem
src/python/m5/objects/BaseCPU.py: Merge duplicate change |
3039:9cec9533b941 |
17-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Changes to build m5.fast |
3029:02fdde6319b7 |
16-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
DRAM Memory doesn't crash the simulator now.. still untested. |
3018:6130a3c2db41 |
21-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Changes so that time in the packet is actually set properly.
src/mem/packet.hh: Make sure packets set the time parameter correctly. |
3015:5c8e078331bc |
16-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zizzer.eecs.umich.edu:/.automount/zazzer/z/rdreslin/m5bk/newmem |
3013:a173458c7f4d |
16-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for blocking in the caches that needed to be pulled
src/mem/cache/base_cache.cc: Add in retry path for blocking with multi-level caches src/mem/cache/base_cache.hh: Pull more of the blocking fixes into head src/mem/packet.hh: Fix typo |
3012:1d5e18f6a100 |
16-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Fix Physical Memory to allow memory sizes bigger than 128MB. Kinda port DRAM to new memory system. The code is *really* ugly (not my fault) and right now something about the stats it uses causes a simulator segfault.
src/SConscript: Add dram.cc to sconscript src/mem/physical.cc: src/mem/physical.hh: Add params struct to physical memory, use params, make latency function be virtual src/python/m5/objects/PhysicalMemory.py: Add DRAMMemory python class |
2994:f19cdc9c919c |
15-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zeep.pool:/z/saidi/tmp/m5.newmem |
2991:60cd98c72fd9 |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Pulled out changes to fix EIO programs with caches. Also fixes any translatingPort read/write Blob function problems with caches.
-Basically removed the ASID from places it is no longer needed due to PageTable
src/mem/cache/cache.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/base_prefetcher.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: Remove asid where it wasn't necessary anymore due to Page Table |
2990:d5074a2d3a9b |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Merge zizzer:/z/m5/Bitkeeper/newmem into zizzer.eecs.umich.edu:/.automount/zazzer/z/rdreslin/m5bk/newmem |
2989:9a6f66c38acc |
15-Aug-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
fixes for gcc 4.1 Nate needs to fix sinic builder stuff Gabe needs to verify my fixes to decoder.isa
OPT/DEBUG compiles for ALPHA_FS, ALPHA_SE, MIPS_SE, SPARC_SE with this changeset
README: Fix the swig version in the readme src/SConscript: remove sinic until nate fixes the builder crap for it src/arch/alpha/system.hh: src/arch/mips/isa/includes.isa: src/arch/sparc/isa/decoder.isa: src/base/stats/visit.cc: src/base/timebuf.hh: src/dev/ide_disk.cc: src/dev/sinic.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr_queue.cc: src/mem/packet.hh: src/mem/request.hh: src/sim/builder.hh: src/sim/system.hh: fixes for gcc 4.1 |
2985:c010893f23ae |
15-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into ewok.(none):/home/gblack/m5/newmem
src/cpu/static_inst.hh: SCCS merged |
2982:0ecdb0879b14 |
14-Aug-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up doxygen. |
2980:eab855f06b79 |
15-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Cleaned up include files and got rid of many using directives in header files. |
2979:88f767122b58 |
14-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Changed the size parameter from int to int64_t |
2975:9f8a7f66c91b |
11-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
#include of iostream needed. |
2972:f84c6c5309ce |
11-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Pushed most of constants.hh back into isa_traits.hh and regfile.hh and created a separate file for the syscallreturn class. |
2914:2c524dc023d2 |
20-Jul-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Move PioPort timing code into Simple Timing Port object Make PioPort use it Make Physical memory use it as well
src/SConscript: Add timing port to sconscript src/dev/io_device.cc: src/dev/io_device.hh: Move simple timing pio port stuff into a simple timing port class so it can be used by the physical memory src/mem/physical.cc: src/mem/physical.hh: use a simple timing port stuff instead of rolling our own here |
2897:d30a4674261c |
15-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some changes to support blocking in the caches
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache_impl.hh: Outstanding blocking updates for cache |
2885:703566816f07 |
10-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Some fixes so that MSHR's are matched and we don't issue overlapping requests with detailed cpu
src/mem/cache/base_cache.cc: If we still have outstanding requests, need to schedule event again src/mem/cache/miss/miss_queue.cc: Need to use block size so overlapping requests match in the MSHR's src/mem/cache/miss/mshr.cc: Actually save the address, otherwise we can't match MSHR's |
2883:20cbfd9cf24c |
10-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix offset calculation. Now L2's work with timing&atomic.
src/mem/packet.hh: Offset is based on packet, not request. |
2858:6b243823ac53 |
07-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix address range calculation. Still need bus to handle snoop ranges. On the way towards multi-level caches (L2)
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Fix address range calculation. Still need bus to handle snoop ranges. |
2856:89691405ec9c |
07-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Update cpus to use the getPort function to use a connector object to connect the I/D cache ports to memory
configs/test/test.py: Update to use new cpu getPort functionality src/cpu/base.cc: Make cpu's a memObject to expose getPort interface src/cpu/base.hh: Make cpu's a memObject to export getPort interface src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: Now use the connector via getPort interface src/mem/cache/base_cache.cc: Make sure the cache recognizes all port names |
2855:5ca2cdb32521 |
06-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Timing cache works for hello world test. Still need 1) detailed CPU (blocking ability in cache) 1a) Multiple outstanding requests (need to keep track of times for events) 2)Multi-level support 3)MP coherence support 4)LL/SC support 5)Functional path needs to be correctly implemented (temporarily works without multiple outstanding requests (simple cpu))
src/cpu/simple/timing.cc: Temp hack because timing cpu doesn't export ports properly so single I/D cache communicates only through the Icache port. src/mem/cache/base_cache.cc: Handle marking MSHR's in service Add support for getting CSHR's src/mem/cache/base_cache.hh: Make these functions visible at the base cache level src/mem/cache/cache.hh: make the functions virtual src/mem/cache/cache_impl.hh: Rename the function to make sense src/mem/packet.hh: Accidentally clearing the needsResponse field when sending a response back. |
2846:89fbe74d8ea8 |
06-Jul-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Add default responder to bus Update configuration for new default responder on bus Update to devices to handle their own pci config space without pciconfigall Remove most of pciconfigall, it now is a dumbdevice which gets its address based on the bus it's supposed to respond for Remove need for pci config space from platform, add registerPciDevice function to prevent more than one device from having same bus:dev:func and interrupt Remove pciconfigspace from pci devices, and py files Add calcConfigAddr that returns address for config space based on bus/dev/function + offset
configs/test/fs.py: Update configuration for new default responder on bus src/dev/ide_ctrl.cc: src/dev/ide_ctrl.hh: src/dev/ns_gige.cc: src/dev/ns_gige.hh: src/dev/pcidev.cc: src/dev/pcidev.hh: Update to handle its own pci config space without pciconfigall src/dev/io_device.cc: src/dev/io_device.hh: change naming for pio port break out recvTiming into two functions to reuse code src/dev/pciconfigall.cc: src/dev/pciconfigall.hh: removing most of pciconfigall, it now is a dumbdevice which gets its address based on the bus it's supposed to respond for src/dev/pcireg.h: add a max size for PCI config space (per PCI spec) src/dev/platform.cc: src/dev/platform.hh: remove need for pci config space from platform, add registerPciDevice function to prevent more than one device from having same bus:dev:func and interrupt src/dev/sinic.cc: remove pciconfigspace as it's no longer a needed parameter src/dev/tsunami.cc: src/dev/tsunami.hh: src/dev/tsunami_pchip.cc: src/dev/tsunami_pchip.hh: add calcConfigAddr that returns address for config space based on bus/dev/function + offset (per PCI spec) src/mem/bus.cc: src/mem/bus.hh: src/python/m5/objects/Bus.py: add idea of default responder to bus src/python/m5/objects/Pci.py: add config port for pci devices add latency, bus and size parameters for pci config all (min is 8MB, max is 256MB see pci spec) |
2844:265f19c60d45 |
06-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Now timing reads work in single level of cache with simple cpu
src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.hh: Changes to handle timing reads in Simple CPU (blocking buffers) |
2835:d2a977df88de |
05-Jul-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix some unset values in the request in the timing CPU. Properly implement the MSHR allocate function.
src/cpu/simple/timing.cc: Set the thread context in the CPU.
Need to do this properly, currently I just set it to Cpu=0 Thread=0. This will just cause all the stats in the cache based on these to just yield totals and not a distribution. src/mem/cache/miss/mshr.cc: Properly implement the allocate function for the MSHR. |
2827:45c3bdb0ffd4 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
AtomicSimpleCPU with a cache now runs the hello world! test program. Need to clean up a bunch of flags/hacks in the code. Then onto Timing mode.
Functional accesses also work properly, although not exactly how we wanted them. I'll need to clean that up as well.
src/cpu/simple/atomic.cc: Atomic CPU needs to set thread context so stats work in cache. Temporarily just use CPU=0 ThreadID=0 src/mem/cache/cache_impl.hh: Need to return success/failure properly still Physical memory object doesn't assert SATISFIED anymore, need to remove that flag src/mem/cache/tags/lru.cc: Doesn't work if the REQ doesn't set its ASID. Temporary fix use 0 always |
2826:d20db4a6f7d1 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
First pass, now compiles with current head of tree. Compile and initialization work, still working on functionality.
src/mem/cache/base_cache.cc: Temp fix for cpu's use of getPort functionality. CPU's will need to be ported to the new connector objects. Also, all packets have to have data or the delete fails. src/mem/cache/cache.hh: Fix function prototypes so overloading works src/mem/cache/cache_impl.hh: fix functions to match virtual base class src/mem/cache/miss/miss_queue.cc: Packets have to have data, or delete fails src/python/m5/objects/BaseCache.py: Update for newmem |
2825:d5d9593a1f19 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fix the packet data allocation methods. Small fixes from changesets after my initial work.
This now compiles.
src/mem/cache/base_cache.cc: Fix getPort function that changed src/mem/cache/base_cache.hh: Fix get port function, provide default implementations of virtual functions in the base class src/mem/cache/cache.hh: Fix virtual function declarations src/mem/cache/cache_builder.cc: Fix params src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/iic.cc: src/mem/cache/tags/lru.cc: Properly allocate data in packet |
2814:b723c79f5349 |
30-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
All files compile in the mem directory except cache_builder
Missing some functionality (like split caches and copy support)
src/SConscript: Typo src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/ghb_prefetcher.hh: src/mem/cache/prefetch/stride_prefetcher.hh: src/mem/cache/prefetch/tagged_prefetcher_impl.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: src/mem/packet.hh: src/mem/request.hh: Fix so it compiles |
2813:89d9196456ac |
29-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Still missing prefetch and tags directories as well as cache builder. Some implementation details were left blank still, need to fill them in.
src/SConscript: Reorder build to compile all files first src/mem/cache/cache.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: More changesets pulled, now compiles everything in /miss directory and in the root directory src/mem/packet.hh: Add some more support, need to clean some of it out once everything is working |
2812:8e5feae75615 |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
More Changes, working towards cache.cc compiling. Headers cleaned up.
src/mem/cache/cache_blk.hh: Remove XC |
2811:9da12e9830ce |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Backing in more changesets, getting closer to compiling. base_cache.cc compiles; continuing on.
src/SConscript: Add in compilation flags for cache files src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: Back in more fixes, now base_cache compiles src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/iic.cc: src/mem/cache/tags/lru.cc: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lru.cc: src/mem/packet.cc: src/mem/packet.hh: src/mem/request.hh: Backing in more changesets, getting closer to compiling |
2810:5befce12ad70 |
28-Jun-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Was having difficulty with merging the cache, reverted to an early version and will add back in the patches to make it work soon.
src/mem/cache/prefetch/tagged_prefetcher_impl.hh: Trying to merge src/mem/cache/base_cache.cc: src/mem/cache/base_cache.hh: src/mem/cache/cache.cc: src/mem/cache/cache.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_builder.cc: src/mem/cache/cache_impl.hh: src/mem/cache/coherence/coherence_protocol.cc: src/mem/cache/coherence/coherence_protocol.hh: src/mem/cache/coherence/simple_coherence.hh: src/mem/cache/coherence/uni_coherence.cc: src/mem/cache/coherence/uni_coherence.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/miss/blocking_buffer.hh: src/mem/cache/miss/miss_queue.cc: src/mem/cache/miss/miss_queue.hh: src/mem/cache/miss/mshr.cc: src/mem/cache/miss/mshr.hh: src/mem/cache/miss/mshr_queue.cc: src/mem/cache/miss/mshr_queue.hh: src/mem/cache/prefetch/base_prefetcher.cc: src/mem/cache/prefetch/base_prefetcher.hh: src/mem/cache/prefetch/ghb_prefetcher.cc: src/mem/cache/prefetch/ghb_prefetcher.hh: src/mem/cache/prefetch/stride_prefetcher.cc: src/mem/cache/prefetch/stride_prefetcher.hh: src/mem/cache/prefetch/tagged_prefetcher.hh: src/mem/cache/tags/base_tags.cc: src/mem/cache/tags/base_tags.hh: src/mem/cache/tags/fa_lru.cc: src/mem/cache/tags/fa_lru.hh: src/mem/cache/tags/iic.cc: src/mem/cache/tags/iic.hh: src/mem/cache/tags/lru.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/repl/gen.cc: src/mem/cache/tags/repl/gen.hh: src/mem/cache/tags/repl/repl.cc: src/mem/cache/tags/repl/repl.hh: src/mem/cache/tags/split.cc: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_blk.hh: src/mem/cache/tags/split_lifo.cc: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.cc: src/mem/cache/tags/split_lru.hh: Pulling an early version of the cache into the tree due to merging issues. Will apply patches and push. |
2809:9cb5fba079ed |
27-Jun-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
change the page table from map to hash_map and create a small cache to speed up lookups |
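The idea in the changeset above (a hashed page table fronted by a small cache of recent translations) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the changeset: the class name SketchPageTable, the page size, the three-entry cache, and the round-robin replacement are all hypothetical, and std::unordered_map stands in for the hash_map of the time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

// Hypothetical sketch; not the simulator's actual PageTable class.
class SketchPageTable
{
  public:
    static constexpr Addr PageBytes = 8192;        // assumed page size
    static constexpr std::size_t CacheEntries = 3; // assumed cache size

    // Map one page-aligned virtual page to a page-aligned physical page.
    void map(Addr vpage, Addr ppage)
    {
        assert(vpage % PageBytes == 0 && ppage % PageBytes == 0);
        table[vpage] = ppage;
        // Drop any cached copy so a remap is never served stale.
        for (auto &e : cache) {
            if (e.vpage == vpage)
                e.valid = false;
        }
    }

    // Translate vaddr into paddr; returns false if the page is unmapped.
    bool translate(Addr vaddr, Addr &paddr)
    {
        const Addr offset = vaddr % PageBytes;
        const Addr vpage = vaddr - offset;

        // Fast path: scan the few most recently used translations.
        for (const auto &e : cache) {
            if (e.valid && e.vpage == vpage) {
                ++cacheHits;
                paddr = e.ppage + offset;
                return true;
            }
        }

        // Slow path: hash lookup, then remember the result.
        const auto it = table.find(vpage);
        if (it == table.end())
            return false;

        cache[nextVictim] = Entry{true, vpage, it->second};
        nextVictim = (nextVictim + 1) % CacheEntries;
        paddr = it->second + offset;
        return true;
    }

    unsigned cacheHits = 0; // counts fast-path hits, for illustration only

  private:
    struct Entry { bool valid; Addr vpage; Addr ppage; };

    std::unordered_map<Addr, Addr> table;
    std::array<Entry, CacheEntries> cache{}; // zero-initialized: all invalid
    std::size_t nextVictim = 0;              // round-robin replacement index
};
```

Repeated translations within the same page hit the small array and skip the hash lookup entirely, which is the likely source of the speed-up the message refers to.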
2800:18a615ca6e19 |
26-Jun-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add syscall emulation page table fault so we can allocate more stack pages
src/cpu/simple/base.cc: add syscall emulation page table fault so we can allocate more stack pages FaultBase::invoke will do this, we don't need to do it here src/sim/faults.hh: I have no idea why this #if was there... gone src/sim/process.cc: make stack_min actually be the current minimum |
2796:8d58290b85c7 |
25-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Allow ports to be created without a name. |
2772:f0f52cbe744d |
17-Jun-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
minor header cleanups
src/dev/alpha_console.cc: Remove my name twice from header src/dev/ide_disk.cc: Spell my full name correctly src/mem/bus.hh: I think I edited much of this src/sim/byteswap.hh: I believe most of this code is mine or nate's |
2738:5d7a31c7fa29 |
13-Jun-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Move SimObject creation and Port connection loops into Python. Add Port and VectorPort objects and support for specifying port connections via assignment. The whole C++ ConfigNode hierarchy is gone now, as are C++ Connector objects.
configs/test/fs.py: configs/test/test.py: Rewrite for new port connector syntax. src/SConscript: Remove unneeded files: - mem/connector.* - sim/config* src/dev/io_device.hh: src/mem/bridge.cc: src/mem/bridge.hh: src/mem/bus.cc: src/mem/bus.hh: src/mem/mem_object.hh: src/mem/physical.cc: src/mem/physical.hh: Allow getPort() to take an optional index to support vector ports (eventually). src/python/m5/__init__.py: Move SimObject construction and port connection operations into Python (with C++ calls). src/python/m5/config.py: Move SimObject construction and port connection operations into Python (with C++ calls). Add support for declaring and connecting MemObject ports in Python. src/python/m5/objects/Bus.py: src/python/m5/objects/PhysicalMemory.py: Add port declaration. src/sim/builder.cc: src/sim/builder.hh: src/sim/serialize.cc: src/sim/serialize.hh: ConfigNodes are gone; builder just gets the name of a .ini file section now. src/sim/main.cc: Move SimObject construction and port connection operations into Python (with C++ calls). Split remaining initialization operations into two parts, loadIniFile() and finalInit(). src/sim/param.cc: src/sim/param.hh: SimObject resolution done globally in Python now (not via ConfigNode hierarchy). src/sim/sim_object.cc: Remove unneeded #include. |
2685:a0821abe7132 |
08-Jun-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add nacked result and a function to swizzle nacked packet into something that can be sent out again implement ability for i/o devices to handle
src/dev/io_device.cc: src/dev/io_device.hh: implement ability for i/o devices to handle src/mem/packet.hh: add nacked result and a function to swizzle nacked packet into something that can be sent out again |
2684:71f3cabf891f |
08-Jun-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
add write/read functions that have endian conversions in them; when we get a virtual port, delete it (even though delete does nothing in these cases)
src/arch/alpha/linux/system.cc: src/arch/alpha/stacktrace.cc: src/base/remote_gdb.cc: src/cpu/simple_thread.cc: when we get a virtual port delete it (even though delete does nothing in this case) src/mem/port.hh: src/mem/vport.hh: add write/read functions that have endian conversions in them |
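As a rough illustration of read/write helpers that carry the endian conversion themselves, the sketch below stores values into a byte array in an explicitly chosen guest byte order. The names SketchPort and ByteOrder are hypothetical and the backing store is a plain vector; the real port.hh and vport.hh interfaces are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

enum class ByteOrder { Little, Big };

// Hypothetical sketch of a port backed by a flat byte array.
class SketchPort
{
  public:
    explicit SketchPort(std::size_t bytes) : mem(bytes, 0) {}

    // Store value at addr using the guest's byte order.
    template <typename T>
    void write(std::size_t addr, T value, ByteOrder guest)
    {
        static_assert(std::is_unsigned<T>::value, "unsigned integers only");
        assert(addr + sizeof(T) <= mem.size());
        for (std::size_t i = 0; i < sizeof(T); ++i)
            mem[addr + i] = static_cast<std::uint8_t>(value >> shiftFor<T>(i, guest));
    }

    // Load a value from addr, converting from the guest's byte order.
    template <typename T>
    T read(std::size_t addr, ByteOrder guest) const
    {
        static_assert(std::is_unsigned<T>::value, "unsigned integers only");
        assert(addr + sizeof(T) <= mem.size());
        T value = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i) {
            const T byte = static_cast<T>(mem[addr + i]);
            value = static_cast<T>(value | static_cast<T>(byte << shiftFor<T>(i, guest)));
        }
        return value;
    }

    std::uint8_t byteAt(std::size_t addr) const { return mem.at(addr); }

  private:
    // Bit shift for the i-th byte in memory under the given byte order.
    template <typename T>
    static unsigned shiftFor(std::size_t i, ByteOrder guest)
    {
        const std::size_t index =
            (guest == ByteOrder::Little) ? i : sizeof(T) - 1 - i;
        return static_cast<unsigned>(8 * index);
    }

    std::vector<std::uint8_t> mem;
};
```

Because the conversion is done with shifts rather than by reinterpreting memory, the sketch behaves the same on little-endian and big-endian hosts; callers never swap bytes themselves.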
2680:246e7104f744 |
06-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Change ExecContext to ThreadContext. This is being renamed to differentiate between the interface used by objects outside of the CPU, and the interface used by the ISA. ThreadContext is used by objects outside of the CPU and is specifically defined in thread_context.hh. ExecContext is more implicit, and is defined by files such as base_dyn_inst.hh or cpu/simple/base.hh.
Further renames/reorganization will be coming shortly; what is currently CPUExecContext (the old ExecContext from m5) will be renamed to SimpleThread or something similar.
src/arch/alpha/arguments.cc: src/arch/alpha/arguments.hh: src/arch/alpha/ev5.cc: src/arch/alpha/faults.cc: src/arch/alpha/faults.hh: src/arch/alpha/freebsd/system.cc: src/arch/alpha/freebsd/system.hh: src/arch/alpha/isa/branch.isa: src/arch/alpha/isa/decoder.isa: src/arch/alpha/isa/main.isa: src/arch/alpha/linux/process.cc: src/arch/alpha/linux/system.cc: src/arch/alpha/linux/system.hh: src/arch/alpha/linux/threadinfo.hh: src/arch/alpha/process.cc: src/arch/alpha/regfile.hh: src/arch/alpha/stacktrace.cc: src/arch/alpha/stacktrace.hh: src/arch/alpha/tlb.cc: src/arch/alpha/tlb.hh: src/arch/alpha/tru64/process.cc: src/arch/alpha/tru64/system.cc: src/arch/alpha/tru64/system.hh: src/arch/alpha/utility.hh: src/arch/alpha/vtophys.cc: src/arch/alpha/vtophys.hh: src/arch/mips/faults.cc: src/arch/mips/faults.hh: src/arch/mips/isa_traits.cc: src/arch/mips/isa_traits.hh: src/arch/mips/linux/process.cc: src/arch/mips/process.cc: src/arch/mips/regfile/float_regfile.hh: src/arch/mips/regfile/int_regfile.hh: src/arch/mips/regfile/misc_regfile.hh: src/arch/mips/regfile/regfile.hh: src/arch/mips/stacktrace.hh: src/arch/sparc/faults.cc: src/arch/sparc/faults.hh: src/arch/sparc/isa_traits.hh: src/arch/sparc/linux/process.cc: src/arch/sparc/linux/process.hh: src/arch/sparc/process.cc: src/arch/sparc/regfile.hh: src/arch/sparc/solaris/process.cc: src/arch/sparc/stacktrace.hh: src/arch/sparc/ua2005.cc: src/arch/sparc/utility.hh: src/arch/sparc/vtophys.cc: src/arch/sparc/vtophys.hh: src/base/remote_gdb.cc: src/base/remote_gdb.hh: src/cpu/base.cc: src/cpu/base.hh: src/cpu/base_dyn_inst.hh: src/cpu/checker/cpu.cc: src/cpu/checker/cpu.hh: src/cpu/checker/exec_context.hh: src/cpu/cpu_exec_context.cc: src/cpu/cpu_exec_context.hh: src/cpu/cpuevent.cc: src/cpu/cpuevent.hh: src/cpu/exetrace.hh: src/cpu/intr_control.cc: src/cpu/memtest/memtest.hh: src/cpu/o3/alpha_cpu.hh: src/cpu/o3/alpha_cpu_impl.hh: src/cpu/o3/alpha_dyn_inst_impl.hh: src/cpu/o3/commit.hh: src/cpu/o3/commit_impl.hh: 
src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/regfile.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/back_end.hh: src/cpu/ozone/cpu.hh: src/cpu/ozone/cpu_impl.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/inorder_back_end.hh: src/cpu/ozone/lw_back_end.hh: src/cpu/ozone/lw_back_end_impl.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/ozone/thread_state.hh: src/cpu/pc_event.cc: src/cpu/pc_event.hh: src/cpu/profile.cc: src/cpu/profile.hh: src/cpu/quiesce_event.cc: src/cpu/quiesce_event.hh: src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: src/cpu/static_inst.cc: src/cpu/static_inst.hh: src/cpu/thread_state.hh: src/dev/alpha_console.cc: src/dev/ns_gige.cc: src/dev/sinic.cc: src/dev/tsunami_cchip.cc: src/kern/kernel_stats.cc: src/kern/kernel_stats.hh: src/kern/linux/events.cc: src/kern/linux/events.hh: src/kern/system_events.cc: src/kern/system_events.hh: src/kern/tru64/dump_mbuf.cc: src/kern/tru64/tru64.hh: src/kern/tru64/tru64_events.cc: src/kern/tru64/tru64_events.hh: src/mem/vport.cc: src/mem/vport.hh: src/sim/faults.cc: src/sim/faults.hh: src/sim/process.cc: src/sim/process.hh: src/sim/pseudo_inst.cc: src/sim/pseudo_inst.hh: src/sim/syscall_emul.cc: src/sim/syscall_emul.hh: src/sim/system.cc: src/cpu/thread_context.hh: src/sim/system.hh: src/sim/vptr.hh: Change ExecContext to ThreadContext. |
2679:737e9f158843 |
06-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Fix checker to work in newmem in SE mode.
src/cpu/o3/fetch_impl.hh: Give the checker a pointer to the icachePort. src/cpu/o3/lsq_unit_impl.hh: Give the checker a pointer to the dcachePort. src/mem/request.hh: Allow checking for the scResult being valid prior to accessing it. |
2670:9107b8bd08cd |
02-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zizzer:/bk/newmem into zizzer.eecs.umich.edu:/.automount/zamp/z/ktlim2/clean/newmem |
2669:f2b336e89d2a |
02-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Fixes to get compiling to work. This is mainly fixing up some includes; changing functions within the XCs; changing MemReqPtrs to Requests or Packets where appropriate.
Currently the O3 and Ozone CPUs do not work in the new memory system; I still need to fix up the ports to work and handle responses properly. This check-in is so that the merge between m5 and newmem is no longer outstanding.
src/SConscript: Need to include FU Pool for new CPU model. I'll try to figure out a cleaner way to handle this in the future. src/base/traceflags.py: Include new trace flags, fix up merge mess up. src/cpu/SConscript: Include the base_dyn_inst.cc as one of the sources. Don't compile the Ozone CPU for now. src/cpu/base.cc: Remove an extra } from the merge. src/cpu/base_dyn_inst.cc: Fixes to make compiling work. Don't instantiate the OzoneCPU for now. src/cpu/base_dyn_inst.hh: src/cpu/o3/2bit_local_pred.cc: src/cpu/o3/alpha_cpu_builder.cc: src/cpu/o3/alpha_cpu_impl.hh: src/cpu/o3/alpha_dyn_inst.hh: src/cpu/o3/alpha_params.hh: src/cpu/o3/bpred_unit.cc: src/cpu/o3/btb.hh: src/cpu/o3/commit.hh: src/cpu/o3/commit_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/free_list.hh: src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue.hh: src/cpu/o3/inst_queue_impl.hh: src/cpu/o3/regfile.hh: src/cpu/o3/sat_counter.hh: src/cpu/op_class.hh: src/cpu/ozone/cpu.hh: src/cpu/checker/cpu.cc: src/cpu/checker/cpu.hh: src/cpu/checker/exec_context.hh: src/cpu/checker/o3_cpu_builder.cc: src/cpu/ozone/cpu_impl.hh: src/mem/request.hh: src/cpu/o3/fu_pool.hh: src/cpu/o3/lsq.hh: src/cpu/o3/lsq_unit.hh: src/cpu/o3/lsq_unit_impl.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/back_end.hh: src/cpu/ozone/dyn_inst.cc: src/cpu/ozone/dyn_inst.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/inorder_back_end.hh: src/cpu/ozone/lw_back_end.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/ozone_impl.hh: src/cpu/ozone/thread_state.hh: Fixes to get compiling to work. src/cpu/o3/alpha_cpu.hh: Fixes to get compiling to work. Float reg accessors have changed, as well as MemReqPtrs to RequestPtrs. src/cpu/o3/alpha_dyn_inst_impl.hh: Fixes to get compiling to work. Pass in the packet to the completeAcc function. Fix up syscall function. |
2665:a124942bacb8 |
31-May-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Updated Authors from bk prs info |
2663:c82193ae8467 |
31-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Streamline interface to Request object.
src/SConscript: mem/request.cc no longer needed (all functions inline). src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/timing.cc: src/dev/io_device.cc: src/mem/port.cc: Modified Request object interface. src/mem/packet.hh: Modified Request object interface. Address & size are always set together now, so track with single flag. src/mem/request.hh: Streamline interface to support a handful of calls that set multiple fields reflecting common usage patterns. Reduce number of validFoo booleans by combining flags for fields which must be set together. |
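The point about combining validity flags for fields that must be set together can be shown with a small sketch. It assumes a hypothetical SketchRequest class with a single setter for the physical address and size; the actual Request interface in request.hh has more fields and is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical sketch; field and method names are illustrative only.
class SketchRequest
{
  public:
    // Address and size are always set by the same call, so a single
    // flag records that both are valid.
    void setPhys(Addr addr, unsigned size)
    {
        paddr = addr;
        bytes = size;
        validPaddr = true;
    }

    bool hasPaddr() const { return validPaddr; }

    // Accessors check the shared flag before returning a field.
    Addr getPaddr() const { assert(validPaddr); return paddr; }
    unsigned getSize() const { assert(validPaddr); return bytes; }

  private:
    Addr paddr = 0;
    unsigned bytes = 0;
    bool validPaddr = false; // covers both paddr and bytes
};
```

With one setter per usage pattern, a caller cannot leave the request half-initialized (an address without a size), and the class needs fewer booleans to track.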
2662:f24ae2d09e27 |
30-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor further cleanup & commenting of Packet class.
src/cpu/simple/atomic.cc: Make common ifetch setup based on Request rather than Packet. Packet::reset() no longer a separate function. sendAtomic() returns latency, not absolute tick. src/cpu/simple/atomic.hh: sendAtomic returns latency, not absolute tick. src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: Make common ifetch setup based on Request rather than Packet. src/dev/alpha_console.cc: src/dev/ide_ctrl.cc: src/dev/io_device.cc: src/dev/isa_fake.cc: src/dev/ns_gige.cc: src/dev/pciconfigall.cc: src/dev/sinic.cc: src/dev/tsunami_cchip.cc: src/dev/tsunami_io.cc: src/dev/tsunami_pchip.cc: src/dev/uart8250.cc: src/mem/physical.cc: Get rid of redundant Packet time field. src/mem/packet.cc: Eliminate reset() method. src/mem/packet.hh: Fold reset() function into reinitFromRequest()... it was only ever called together with that function. Get rid of redundant time field. Cleanup/add comments. src/mem/port.hh: Document in comment that sendAtomic returns latency, not absolute tick. |
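The distinction between a latency and an absolute tick matters to the caller, which has to add the returned value to the current time itself. A minimal sketch, assuming a hypothetical SketchMemPort with a fixed access latency (the real sendAtomic() takes a packet, which is omitted here):

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Hypothetical stand-in for a memory-side port with a fixed latency.
struct SketchMemPort
{
    Tick accessLatency;

    // Returns how long the access takes, not the tick at which it ends.
    Tick sendAtomic() const { return accessLatency; }
};

// Hypothetical CPU-side helper: the caller converts the latency into
// an absolute completion time by adding it to the current tick.
inline Tick completionTick(Tick curTick, const SketchMemPort &port)
{
    const Tick latency = port.sendAtomic();
    return curTick + latency;
}
```

Returning a relative latency lets a component accumulate the cost of several accesses (for example an instruction fetch followed by a data access) by simple addition, without each callee needing to know the current time.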
2661:2fe54b1abfa7 |
30-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix Port pointer initialization.
src/mem/port.hh: Initialize peer port pointer to NULL. Move private data members together. |
2657:b119b774656b |
30-May-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then. Change recvRetry() to not accept a packet; sendTiming() should be called again (and can respond with false or true). Removed Port Blocked/Unblocked and replaced with sendRetry(). Remove possibility of packet mangling in bridge if packet is going to be refused anyway.
src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: Change recvRetry() to not accept a packet; sendTiming() should be called again (and can respond with false or true). src/dev/io_device.cc: src/dev/io_device.hh: Make DMA Timing requests/responses work. Change recvRetry() to not accept a packet; sendTiming() should be called again (and can respond with false or true). src/mem/bridge.cc: src/mem/bridge.hh: Change recvRetry() to not accept a packet; sendTiming() should be called again (and can respond with false or true). Removed Port Blocked/Unblocked and replaced with sendRetry(). Remove possibility of packet mangling if packet is going to be refused anyway. src/mem/bus.cc: src/mem/bus.hh: Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then. src/mem/port.hh: Change recvRetry() to not accept a packet; sendTiming() should be called again (and can respond with false or true). Removed Blocked/Unblocked port status; their functionality is really duplicated in the recvRetry() method. |
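The retry handshake described in this changeset (a send is refused, the receiver later signals a retry that carries no packet, and the sender simply sends again) can be sketched as below. The classes SketchSender and SketchReceiver, the becomeFree() helper, and the queueing policy are assumptions made for illustration; they are not the Port interface from port.hh.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

struct SketchPacket { int id; };

class SketchSender;

// Hypothetical receiving side: refuses packets while busy and
// remembers that it owes the sender a retry.
class SketchReceiver
{
  public:
    bool busy = false;
    int lastAccepted = -1;
    SketchSender *sender = nullptr;

    // Returns true if the packet was accepted, false if refused.
    bool recvTiming(const SketchPacket &pkt)
    {
        if (busy) {
            retryOwed = true;
            return false;
        }
        lastAccepted = pkt.id;
        return true;
    }

    void becomeFree(); // clears busy and issues the owed retry

  private:
    void sendRetry();
    bool retryOwed = false;
};

// Hypothetical sending side: keeps refused packets queued and
// re-sends them when told to retry.
class SketchSender
{
  public:
    explicit SketchSender(SketchReceiver &r) : peer(r) { r.sender = this; }

    void send(const SketchPacket &pkt)
    {
        queue.push_back(pkt);
        trySend();
    }

    // The retry carries no packet; the sender decides what to re-send.
    void recvRetry() { trySend(); }

    std::size_t pending() const { return queue.size(); }

  private:
    void trySend()
    {
        while (!queue.empty() && peer.recvTiming(queue.front()))
            queue.pop_front();
    }

    SketchReceiver &peer;
    std::deque<SketchPacket> queue;
};

inline void SketchReceiver::sendRetry()
{
    if (sender)
        sender->recvRetry();
}

inline void SketchReceiver::becomeFree()
{
    busy = false;
    if (retryOwed) {
        retryOwed = false;
        sendRetry();
    }
}
```

Keeping the packet on the sender's side means the receiver never holds state for a request it refused, which fits the message's remark that separate blocked/unblocked status was duplicating what the retry call already expresses.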
2643:67ac7b611c56 |
26-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Reorganize bridge as pair of cooperating ports. Store original source & senderState for timing packets that get a response, so we can properly route the response packet back to the original sender. |
2642:c162e0359b49 |
26-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add a little more tracing support for Bus/Port stuff.
src/base/traceflags.py: Sort flags so you can find things. Add BusAddrRanges flag for tracking RangeChange events separately from general bus activity. src/mem/bus.cc: Add BusAddrRanges flag for tracking RangeChange events separately from general bus activity. src/mem/port.cc: src/mem/port.hh: Print Config trace message when peers are set up. |
2641:6d9d837e2032 |
26-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Significant rework of Packet class interface: - new constructor guarantees initialization of most fields - flags track status of non-guaranteed fields (addr, size, src) - accessor functions (getAddr() etc.) check status on access - Command & Result classes are nested in Packet class scope - Command now built from vector of behavior bits - string version of Command for tracing - reinitFromRequest() and makeTimingResponse() encapsulate common manipulations of existing packets
src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/timing.cc: src/dev/alpha_console.cc: src/dev/ide_ctrl.cc: src/dev/io_device.cc: src/dev/io_device.hh: src/dev/isa_fake.cc: src/dev/ns_gige.cc: src/dev/pciconfigall.cc: src/dev/sinic.cc: src/dev/tsunami_cchip.cc: src/dev/tsunami_io.cc: src/dev/tsunami_pchip.cc: src/dev/uart8250.cc: src/mem/bus.cc: src/mem/bus.hh: src/mem/physical.cc: src/mem/port.cc: src/mem/port.hh: src/mem/request.hh: Update for new Packet interface. |
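Building commands from behavior bits, with a string form for tracing, might look roughly like the sketch below. The specific attribute bits, command names, and helper functions are assumptions chosen for illustration and may not match the definitions in packet.hh; responseFor() only loosely mirrors the idea behind makeTimingResponse().

```cpp
#include <cassert>
#include <string>

// Hypothetical behavior bits; each describes one property of a command.
enum CommandAttribute : unsigned {
    IsRead        = 1u << 0,
    IsWrite       = 1u << 1,
    IsRequest     = 1u << 2,
    IsResponse    = 1u << 3,
    NeedsResponse = 1u << 4
};

// Each command is just the OR of the behaviors it has.
enum Command : unsigned {
    ReadReq   = IsRead  | IsRequest | NeedsResponse,
    ReadResp  = IsRead  | IsResponse,
    WriteReq  = IsWrite | IsRequest | NeedsResponse,
    WriteResp = IsWrite | IsResponse
};

inline bool isRead(Command c)        { return (c & IsRead) != 0; }
inline bool isWrite(Command c)       { return (c & IsWrite) != 0; }
inline bool needsResponse(Command c) { return (c & NeedsResponse) != 0; }

// Derive the response command by clearing the request bits and
// setting the response bit.
inline Command responseFor(Command c)
{
    assert(needsResponse(c));
    const unsigned requestBits = IsRequest | NeedsResponse;
    return static_cast<Command>((c & ~requestBits) | IsResponse);
}

// String version of a command, for trace output.
inline std::string cmdString(Command c)
{
    switch (c) {
      case ReadReq:   return "ReadReq";
      case ReadResp:  return "ReadResp";
      case WriteReq:  return "WriteReq";
      case WriteResp: return "WriteResp";
      default:        return "Unknown";
    }
}
```

Queries such as isRead() then work for any command that has the bit set, so code that handles reads does not need to enumerate every read-like command.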
2640:266b80dd5eca |
26-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Add names to memory Port objects for tracing. |
2639:78773954274f |
23-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor fixes for full-system timing memory. Need to rewrite bus bridge to get any further.
src/dev/io_device.cc: Set packet dest on timing responses. src/mem/bus.cc: Fix dest addr bounds check assertion. Add assertion to catch infinite loopbacks. src/mem/physical.cc: Add comment. |
2632:1bb2f91485ea |
22-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
New directory structure: - simulator source now in 'src' subdirectory - imported files from 'ext' repository - support building in arbitrary places, including outside of the source tree. See comment at top of SConstruct file for more details. Regression tests are temporarily disabled; that system needs more extensive revisions.
SConstruct: Update for new directory structure. Modify to support build trees that are not subdirectories of the source tree. See comment at top of file for more details. Regression tests are temporarily disabled. src/arch/SConscript: src/arch/isa_parser.py: src/python/SConscript: Update for new directory structure. |