#
14297:b4519e586f5e |
|
10-Sep-2019 |
Jordi Vaquero <jordi.vaquero@metempsy.com> |
cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>
This change is based on modify the way we move the AtomicOpFunctor* through gem5 in order to mantain proper ownership of the object and ensuring its destruction when it is no longer used.
Doing that we fix at the same time a memory leak in Request.hh where we were assigning a new AtomicOpFunctor* without destroying the previous one.
This change creates a new type AtomicOpFunctor_ptr as a std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except for its only usage when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14194:967b9c450b04 |
|
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
cpu: Move O3's data port into the LSQ.
That's where it's used, and putting it there avoids having to pass around the port using the top level getDataPort function.
Change-Id: I0dea25d0c5f4bb3f58a6574a8f2b2d242784caf2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20238 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
14111:14c05f862590 |
|
15-Nov-2018 |
Gabor Dozsa <gabor.dozsa@arm.com> |
cpu-o3: Fix too strict assert condition in writeback()
The assert() in the LSQ writeback() only allowed ReExec faults. However, a SplitRequest which completed the translation in PartialFault state (i.e. any but the very first cacheline translation failed) may end up here. The assert() condition is extended accordingly.
The patch also removes the superfluous/unused Complete/Squashed states from the LSQ request. (The completion of the request is recorded in the flags still.)
Change-Id: Ie575f4d3b4d5295585828ad8c7d3f4c7c1fe15d0 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19174 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
14105:969b4e972b07 |
|
27-Feb-2019 |
Gabor Dozsa <gabor.dozsa@arm.com> |
cpu: Add first-/non-faulting load support to Minor and O3
Some architectures allow masking faults of memory load instructions in some specific circumstances (e.g. first-faulting and non-faulting loads in Arm SVE). This patch adds support for such loads in the Minor and O3 CPU models.
Change-Id: I264a81a078f049127779aa834e89f0e693ba0bea Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14080:4472576445e7 |
|
15-Feb-2018 |
Gabor Dozsa <gabor.dozsa@arm.com> |
cpu-o3: Reset fault status for mem access in pushRequest
Reset the fault status always before translation is initiated in pushRequest() in the LSQ. This avoids the problem when a strictly ordered load needs to be re-executed multiple times. If the translation is delayed at one of those attempts then the internal panicFault (from the previous execution attempt) can get fired at commit.
Change-Id: I0c22b2f7afd6e2cb00bc359a4a01042efd2d01d2 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19388 Reviewed-by: Ciro Santilli <ciro.santilli@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13954:2f400a5f2627 |
|
07-Jul-2017 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors.
Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13710:5ba1d8066ef0 |
|
25-Jun-2018 |
Gabor Dozsa <gabor.dozsa@arm.com> |
cpu-o3: Add cache read ports limit to LSQ
This change introduces cache read ports to limit the number of per-cycle loads. Previously only the number of per-cycle stores could be limited.
Change-Id: I39bbd984056c5a696725ee2db462a55b2079e2d4 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13517 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13688:5bb3bf2f2559 |
|
15-Feb-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
cpu: Fix fast build broken due to unused variable
This fixes fast build for commit 25dc765889d948693995cfa622f001aa94b5364b (fast build is striping out assertions)
Change-Id: I9536ad58a3d85990b16a1f8c2515f6bf5d3acf71 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16463 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13652:45d94ac03a27 |
|
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system.
Atomic memory instruction is treated as a special store instruction in all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13590:d7e018859709 |
|
13-Feb-2017 |
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> |
cpu-o3: O3 LSQ Generalisation
This patch does a large modification of the LSQ in the O3 model. The main goal of the patch is to remove the 'an operation can be served with one or two memory requests' assumption that is present in the LSQ and the instruction with the req, reqLow, reqHigh triplet, and generalising it to operations that can be addressed with one request, and operations that require many requests, embodied in the SingleDataRequest and the SplitDataRequest.
This modification has been done mimicking the minor model to an extent, shifting the responsibilities of dealing with VtoP translation and tracking the status and resources from the DynInst to the LSQ via the LSQRequest. The LSQRequest models the information concerning the operation, handles the creation of fragments for translation and request as well as assembling/splitting the data accordingly.
With this modifications, the implementation of vector ISAs, particularly on the memory side, become more rich, as the new model permits a dissociation of the ISA characteristics as vector length, from the microarchitectural characteristics that govern how contiguous loads are executing, allowing exploration of different LSQ to DL1 bus widths to understand the tradeoffs in complexity and performance.
Part of the complexities introduced stem from the fact that gem5 keeps a large amount of metadata regarding, in particular, memory operations, thus, when an instruction is squashed while some operation as TLB lookup or cache access is ongoing, when the relevant structure communicates to the LSQ that the operation is over, it tries to access some pieces of data that should have died when the instruction is squashed, leading to asserts, panics, or memory corruption. To ensure the correct behaviour, the LSQRequest rely on assesing who is their owner, and self-destroying if they detect their owner is done with the request, and there will be no subsequent action. For example, in the case of an instruction squashed whal the TLB is doing a walk to serve the translation, when the translation is served by the TLB, the LSQRequest detects that the instruction was squashed, and as the translation is done, no one else expect to access its information, and therefore, it self-destructs. Having destroyed the LSQRequest earlier, would lead to wrong behaviour as the TLB walk may access some fields of it.
Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com>
Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13516 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
|
#
13560:f8732494c155 |
|
24-Dec-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu-o3: Make the smtLSQPolicy a Param.ScopedEnum
The smtLSQPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum.
Change-Id: I82041b88bd914c5dc660058d9e3998e3114e7c35 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15397 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13472:7ceacede4f1e |
|
01-Mar-2017 |
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> |
cpu: Change raw pointers to STL Containers
This patch changes two members from being raw pointers to being STL containers. The reason behind, other than cleanlyness and arguable OO best practices is that containers have more intronspections capabilities than naked pointers do, as the size is known.
Using STL containers adds little overhead and eases the automation of process during debugging (gdb).
Change-Id: I4d9d3eedafa8b5e50ac512ea93b458a4200229f2 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13126 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13449:2f7efa89c58b |
|
26-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, gpu, mem: Replace assert(0 or false with panic.
Neither assert(0) nor assert(false) give any hint as to why control getting to them is bad, and their more descriptive versions, assert(0 && "description") and assert(false && "description"), jury rig assert to add an error message when the utility function panic() already does that directly with better formatting options.
This change replaces that flavor of call to assert with panic, except in the actual code which processes the formatting that panic uses (to avoid infinitely recurring error handling), and in some *.sm files since I don't know what rules those have to follow and don't want to accidentaly break them.
Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0 Reviewed-on: https://gem5-review.googlesource.com/c/14636 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13429:a1e199fd8122 |
|
06-Feb-2017 |
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> |
cpu: Fix the usage of const DynInstPtr
Summary: Usage of const DynInstPtr& when possible and introduction of move operators to RefCountingPtr.
In many places, scoped references to dynamic instructions do a copy of the DynInstPtr when a reference would do. This is detrimental to performance. On top of that, in case there is a need for reference tracking for debugging, the redundant copies make the process much more painful than it already is.
Also, from the theoretical point of view, a function/method that defines a convenience name to access an instruction should not be considered an owner of the data, i.e., doing a copy and not a reference is not justified.
On a related topic, C++11 introduces move semantics, and those are useful when, for example, there is a class modelling a HW structure that contains a list, and has a getHeadOfList function, to prevent doing a copy to an internal variable -> update pointer, remove from the list -> update pointer, return value making a copy to the assined variable -> update pointer, destroy the returned value -> update pointer.
Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13105 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
12749:223c83ed9979 |
|
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11435:0f1b46dde3fa |
|
07-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit.
|
#
11429:cf5af0cc3be4 |
|
06-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models 831c7f2f9e39: power: Low-power idle power state for idle CPUs 4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11428:20264eb69fbf |
|
05-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
|
#
10713:eddb533708cb |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.
|
#
10575:a8d612fa170b |
|
02-Dec-2014 |
Marco Elver <Marco.Elver@ARM.com> |
cpu, o3: Ignored invalidate causing same-address load reordering
In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response.
If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5.
Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering.
A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed.
Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later loads fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.
|
#
10573:3b405d11d6dc |
|
02-Dec-2014 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
cpu: Move packet deallocation to recvTimingResp in the O3 CPU
Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.
|
#
10333:6be8945d226b |
|
03-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Fix cache blocked load behavior in o3 cpu
This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked.
Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that.
Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior.
|
#
10239:592f0bb6bd6f |
|
21-Jun-2014 |
Binh Pham <binhpham@cs.rutgers.edu> |
o3: split load & store queue full cases in rename
Check for free entries in Load Queue and Store Queue separately to avoid cases when load cannot be renamed due to full Store Queue and vice versa.
This work was done while Binh was an intern at AMD Research.
|
#
9944:4ff1c5c6dcbc |
|
17-Oct-2013 |
Matt Horsnell <matt.horsnell@ARM.com> |
cpu: add consistent guarding to *_impl.hh files.
|
#
9868:44a67004d6b4 |
|
11-Sep-2013 |
Joel Hestness <jthestness@gmail.com> |
cpu: Dynamically instantiate O3 CPU LSQUnits
Previously, the LSQ would instantiate MaxThreads LSQUnits in the body of it's object, but it would only initialize numThreads LSQUnits as specified by the user. This had the effect of leaving some LSQUnits uninitialized when the number of threads was less than MaxThreads, and when adding statistics to the LSQUnit that must be initialized, this caused the stats initialization check to fail. By dynamically instantiating LSQUnits, they are all initialized and this avoids uninitialized LSQUnits from floating around during runtime.
|
#
9444:ab47fe7f03f0 |
|
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
cpu: Rewrite O3 draining to avoid stopping in microcode
Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust.
Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.
|
#
9440:fdc91cab5760 |
|
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
cpu: Fix O3 LSQ debug dumping constness and formatting
|
#
9358:aa761458ddcb |
|
06-Dec-2012 |
Nathanael Premillieu <nathanael.premillieu@irisa.fr> |
o3 cpu: remove some unused buggy functions in the lsq Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
#
8975:7f36d4436074 |
|
01-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses.
For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
|
#
8948:e95ee70f876c |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses.
The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
|
#
8850:ed91b534ed04 |
|
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
CPU: Round-two unifying instr/data CPU ports across models
This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps in simplifying the code, make assumptions more explicit, andfurther ease future patches related to the CPU ports.
The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports.
|
#
8799:dac1e33e07b0 |
|
28-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with the main repo.
|
#
8793:5f25086326ac |
|
18-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of FULL_SYSTEM in the CPU directory.
|
#
8707:489489c67fd9 |
|
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
CPU: Moving towards a more general port across CPU models
This patch performs minimal changes to move the instruction and data ports from specialised subclasses to the base CPU (to the largest degree possible). Ultimately it servers to make the CPU(s) have a well-defined interface to the memory sub-system.
|
#
8706:b1838faf3bcc |
|
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Add port proxies instead of non-structural ports
Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy.
The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy
|
#
8545:a3992291e230 |
|
13-Sep-2011 |
Ali Saidi <saidi@eecs.umich.edu> |
LSQ: Only trigger a memory violation with a load/load if the value changes.
Only create a memory ordering violation when the value could have changed between two subsequent loads, instead of just when loads go out-of-order to the same address. While not very common in the case of Alpha, with an architecture with a hardware table walker this can happen reasonably frequently beacuse a translation will miss and start a table walk and before the CPU re-schedules the faulting instruction another one will pass it to the same address (or cache block depending on the dendency checking).
This patch has been tested with a couple of self-checking hand crafted programs to stress ordering between two cores.
The performance improvement on SPEC benchmarks can be substantial (2-10%).
|
#
8346:ce8b9a250021 |
|
10-Jun-2011 |
Korey Sewell <ksewell@umich.edu> |
o3: missing newlines on some dprintfs
|
#
8232:b28d06a175be |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
|
#
7823:dac01f14f20f |
|
08-Jan-2011 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Replace curTick global variable with accessor functions. This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values.
|
#
6221:58a3c04e6344 |
|
26-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: add a type for thread IDs and try to use it everywhere
|
#
5714:76abee886def |
|
02-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
Add in Context IDs to the simulator. From now on, cpuId is almost never used, the primary identifier for a hardware context should be contextId(). The concept of threads within a CPU remains, in the form of threadId() because sometimes you need to know which context within a cpu to manipulate.
|
#
5557:03c186e416aa |
|
26-Sep-2008 |
Kevin Lim <ktlim@umich.edu> |
O3CPU: Fix thread writeback logic. Fix the logic in the LSQ that determines if there are any stores to write back. In the commit stage, check for thread specific writebacks instead of just any writeback.
|
#
5529:9ae69b9cd7fd |
|
11-Aug-2008 |
Nathan Binkert <nate@binkert.org> |
params: Convert the CPU objects to use the auto generated param structs. A whole bunch of stuff has been converted to use the new params stuff, but the CPU wasn't one of them. While we're at it, make some things a bit more stylish. Most of the work was done by Gabe, I just cleaned stuff up a bit more at the end.
|
#
5494:85c8d296c1cb |
|
28-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Backed out changeset 94a7bb476fca: caused memory leak.
|
#
5489:94a7bb476fca |
|
21-Jun-2008 |
Steve Reinhardt <stever@gmail.com> |
Generate more useful error messages for unconnected ports. Force all non-default ports to provide a name and an owner in the constructor.
|
#
5012:c0a28154d002 |
|
27-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head
|
#
4986:b7c82ad6b3ef |
|
24-Aug-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Mem: Make errors in the memory system be responses, not requests. Fixes cache handling of error responses.
|
#
4985:9f577f468009 |
|
21-Aug-2007 |
Kevin Lim <ktlim@umich.edu> |
o3: Fix for retry ID bug. It should be cleared prior to the call to recvRetry. Add extra DPRINTF statement for clearer debugging output.
|
#
4895:d36959284fbc |
|
15-Jul-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Fix up a bunch of multilevel coherence issues. Atomic mode seems to work. Timing is closer but not there yet.
|
#
4329:52057dbec096 |
|
04-Apr-2007 |
Kevin Lim <ktlim@umich.edu> |
Pass ISA-specific O3 CPU as a constructor parameter instead of using setCPU functions.
src/cpu/o3/alpha/cpu_impl.hh: Pass ISA-specific O3 CPU to FullO3CPU as a constructor parameter instead of using setCPU functions.
|
#
4318:eb4241362a80 |
|
02-Apr-2007 |
Kevin Lim <ktlim@umich.edu> |
Remove/comment out DPRINTFs that were causing a segfault.
The removed ones were unnecessary. The commented out ones could be useful in the future, should this problem get fixed. See flyspray task #243.
src/cpu/o3/commit_impl.hh: src/cpu/o3/decode_impl.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/o3/lsq_unit_impl.hh: src/cpu/o3/rename_impl.hh: src/cpu/o3/rob_impl.hh: Remove/comment out DPRINTFs that were causing a segfault.
|
#
4192:7accc6365bb9 |
|
09-Mar-2007 |
Kevin Lim <ktlim@umich.edu> |
Two fixes: 1. Make sure connectMemPorts() only gets called when the CPU's peer gets changed. This is done by making setPeer() virtual, and overriding it in the CPU's ports. When it gets called on a CPU's port (dcache specifically), it calls the normal setPeer() function, and also connectMemPorts(). 2. Consolidate redundant code that handles switching in a CPU.
src/cpu/base.cc: Move common code of switching over peers to base CPU. src/cpu/base.hh: Move common code of switching over peers to BaseCPU. src/cpu/o3/cpu.cc: Add in function that updates thread context's ports. Also use updated function to takeOverFrom() in BaseCPU. This gets rid of some repeated code. src/cpu/o3/cpu.hh: Include function to update thread context's memory ports. src/cpu/o3/lsq.hh: Add function to dcache port that will update the memory ports upon getting a new peer. Also include a function that will tell the CPU to update those memory ports. src/cpu/o3/lsq_impl.hh: Add function that will update the memory ports upon getting a new peer. src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: Add function that will update thread context's memory ports upon getting a new peer. Also use the new BaseCPU's take over from function. src/cpu/simple/atomic.hh: Add in function (and dcache port) that will allow the dcache to update memory ports when it gets assigned a new peer. src/cpu/simple/timing.hh: Add function that will update thread context's memory ports upon getting a new peer. src/mem/port.hh: Make setPeer virtual so that other classes can override it.
|
#
3870:fc7a16797788 |
|
22-Dec-2006 |
Nathan Binkert <binkertn@umich.edu> |
style
|
#
3867:807483cfab77 |
|
21-Dec-2006 |
Nathan Binkert <binkertn@umich.edu> |
don't use (*activeThreads).begin(), use activeThreads->blah(). Also don't call (*activeThreads).end() over and over. Just call activeThreads->end() once and save the result. Make sure we always check that there are elements in the list before we grab the first one.
|
#
3647:8121d4503cbc |
|
13-Nov-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Make CPU models signal to update the snoop ranges
|
#
3639:251dfe00c03d |
|
13-Nov-2006 |
Kevin Lim <ktlim@umich.edu> |
Change warn to DPRINTF.
|
#
3339:d1b3ec71baa4 |
|
19-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Small changes: ?? doesn't compile in warn statements Should have been false, where I had a true.
src/cpu/o3/lsq_impl.hh: Apparently you can't have ?? in a warn statement (Something about trigraphs) src/mem/cache/cache_impl.hh: Forgot to signal atomic mode in snoopProbe
|
#
3310:21adbb41a37e |
|
17-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for uni-coherence in timing mode for FS. Still a bug in atomic uni-coherence in FS.
src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: Make CPU models handle coherence requests src/mem/cache/base_cache.cc: Properly signal coherence CSHRs src/mem/cache/coherence/uni_coherence.cc: Only deallocate once
|
#
3184:8edaf4539e05 |
|
08-Oct-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for functional path.
If the cpu needs to update any state when it gets a functional write (LSQ??) then that code needs to be written.
src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: CPU's can recieve functional accesses, they need to determine if they need to do anything with them. src/mem/bus.cc: src/mem/bus.hh: Make the fuctional path do the correct tye of snoop
|
#
3126:756092c6383c |
|
02-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Updates to fix merge issues and bring almost everything up to working speed. Ozone CPU remains untested, but everything else compiles and runs.
src/arch/alpha/isa_traits.hh: This got changed to the wrong version by accident. src/cpu/base.cc: Fix up progress event to not schedule itself if the interval is set to 0. src/cpu/base.hh: Fix up the CPU Progress Event to not print itself if it's set to 0. Also remove stats_reset_inst (something I added to m5 but isn't necessary here). src/cpu/base_dyn_inst.hh: src/cpu/checker/cpu.hh: Remove float variable of instResult; it's always held within the double part now. src/cpu/checker/cpu_impl.hh: Use thread and not cpuXC. src/cpu/o3/alpha/cpu_builder.cc: src/cpu/o3/checker_builder.cc: src/cpu/ozone/checker_builder.cc: src/cpu/ozone/cpu_builder.cc: src/python/m5/objects/BaseCPU.py: Remove stats_reset_inst. src/cpu/o3/commit_impl.hh: src/cpu/ozone/lw_back_end_impl.hh: Get TC, not XCProxy. src/cpu/o3/cpu.cc: Switch out updates from the version of m5 I have. Also remove serialize code that got added twice. src/cpu/o3/iew_impl.hh: src/cpu/o3/lsq_impl.hh: src/cpu/thread_state.hh: Remove code that was added twice. src/cpu/o3/lsq_unit.hh: Add back in stats that got lost in the merge. src/cpu/o3/lsq_unit_impl.hh: Use proper method to get flags. Also wake CPU if we're coming back from a cache miss. src/cpu/o3/thread_context_impl.hh: src/cpu/o3/thread_state.hh: Support profiling. src/cpu/ozone/cpu.hh: Update to use proper typename. src/cpu/ozone/cpu_impl.hh: src/cpu/ozone/dyn_inst_impl.hh: Updates for newmem. src/cpu/ozone/lw_lsq_impl.hh: Get flags correctly. src/cpu/ozone/thread_state.hh: Reorder constructor initialization, use tc. src/sim/pseudo_inst.cc: Allow for loading of symbol file. Be sure to use ThreadContext and not ExecContext.
|
#
3125:febd811bccc6 |
|
30-Sep-2006 |
Kevin Lim <ktlim@umich.edu> |
Merge ktlim@zamp:./local/clean/o3-merge/m5 into zamp.eecs.umich.edu:/z/ktlim2/clean/o3-merge/newmem
configs/boot/micro_memlat.rcS: configs/boot/micro_tlblat.rcS: src/arch/alpha/ev5.cc: src/arch/alpha/isa/decoder.isa: src/arch/alpha/isa_traits.hh: src/cpu/base.cc: src/cpu/base.hh: src/cpu/base_dyn_inst.hh: src/cpu/checker/cpu.hh: src/cpu/checker/cpu_impl.hh: src/cpu/o3/alpha/cpu_impl.hh: src/cpu/o3/alpha/params.hh: src/cpu/o3/checker_builder.cc: src/cpu/o3/commit_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/decode_impl.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue.hh: src/cpu/o3/lsq.hh: src/cpu/o3/lsq_impl.hh: src/cpu/o3/lsq_unit.hh: src/cpu/o3/lsq_unit_impl.hh: src/cpu/o3/regfile.hh: src/cpu/o3/rename_impl.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/checker_builder.cc: src/cpu/ozone/cpu.hh: src/cpu/ozone/cpu_impl.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/lw_back_end.hh: src/cpu/ozone/lw_back_end_impl.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/ozone/thread_state.hh: src/cpu/simple/base.cc: src/cpu/simple_thread.cc: src/cpu/simple_thread.hh: src/cpu/thread_state.hh: src/dev/ide_disk.cc: src/python/m5/objects/O3CPU.py: src/python/m5/objects/Root.py: src/python/m5/objects/System.py: src/sim/pseudo_inst.cc: src/sim/pseudo_inst.hh: src/sim/system.hh: util/m5/m5.c: Hand merge.
|
#
3014:b4309193255a |
|
16-Aug-2006 |
Ron Dreslinski <rdreslin@umich.edu> |
Fixes for Kevins O3 model to work with the blocking caches.
src/cpu/o3/fetch_impl.hh: Fix ordering so dereference works src/cpu/o3/lsq_impl.hh: Check to make sure we didn't squash already src/cpu/o3/lsq_unit.hh: Fix for counting squashed retrys in the WB count src/cpu/o3/lsq_unit_impl.hh: Make sure to set retryID for stores, and clear it appropriately
|
#
2980:eab855f06b79 |
|
15-Aug-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Cleaned up include files and got rid of many using directives in header files.
|
#
2907:7b0ababb4166 |
|
13-Jul-2006 |
Kevin Lim <ktlim@umich.edu> |
Move Dcache port creation from LSQUnit to LSQ in order to support Ron's recent changes, and using the O3CPU in SMT mode.
src/cpu/o3/lsq.hh: Update to have LSQ work with only one dcache port for all LSQ Units. LSQ has the dcache port, and the LSQ Units must tell the LSQ if the cache has become blocked. src/cpu/o3/lsq_impl.hh: Updates to have the LSQ work with only one dcache port for all LSQUnits. src/cpu/o3/lsq_unit.hh: src/cpu/o3/lsq_unit_impl.hh: Update for LSQ to create dcache port instead of LSQUnits. Now LSQUnits are given the dcache port from the LSQ, and also must check the LSQ if the cache is blocked prior to accessing the cache.
|
#
2864:eab7ff8f6d72 |
|
06-Jul-2006 |
Kevin Lim <ktlim@umich.edu> |
Support serializing and unserializing in the O3 CPU. Also a few small fixes for draining/switching CPUs.
src/cpu/o3/commit_impl.hh: Fix to clear drainPending variable on call to resume. src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: Support serializing and unserializing in the O3 CPU. src/cpu/o3/lsq_impl.hh: Be sure to say we have no stores to write back if the active thread list is empty. src/cpu/simple_thread.cc: src/cpu/simple_thread.hh: Slightly change how SimpleThread is used to copy from other ThreadContexts.
|
#
2733:e0eac8fc5774 |
|
16-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Two updates that got combined into one ChangeSet accidentally. They're both pretty simple so they shouldn't cause any trouble.
First: Rename FullCPU and its variants in the o3 directory to O3CPU to differentiate from the old model, and also to specify it's an out of order model.
Second: Include build options for selecting the Checker to be used. These options make sure if the Checker is being used there is a CPU that supports it also being compiled.
SConstruct: Add in option USE_CHECKER to allow for not compiling in checker code. The checker is enabled through this option instead of through the CPU_MODELS list. However it's still necessary to treat the Checker like a CPU model, so it is appended onto the CPU_MODELS list if enabled. configs/test/test.py: Name change for DetailedCPU to DetailedO3CPU. Also include option for max tick. src/base/traceflags.py: Add in O3CPU trace flag. src/cpu/SConscript: Rename AlphaFullCPU to AlphaO3CPU.
Only include checker sources if they're necessary. Also add a list of CPUs that support the Checker, and only allow the Checker to be compiled in if one of those CPUs are also being included. src/cpu/base_dyn_inst.cc: src/cpu/base_dyn_inst.hh: Rename typedef to ImplCPU instead of FullCPU, to differentiate from the old FullCPU. src/cpu/cpu_models.py: src/cpu/o3/alpha_cpu.cc: src/cpu/o3/alpha_cpu.hh: src/cpu/o3/alpha_cpu_builder.cc: src/cpu/o3/alpha_cpu_impl.hh: Rename AlphaFullCPU to AlphaO3CPU to differentiate from old FullCPU model. src/cpu/o3/alpha_dyn_inst.hh: src/cpu/o3/alpha_dyn_inst_impl.hh: src/cpu/o3/alpha_impl.hh: src/cpu/o3/alpha_params.hh: src/cpu/o3/commit.hh: src/cpu/o3/cpu.hh: src/cpu/o3/decode.hh: src/cpu/o3/decode_impl.hh: src/cpu/o3/fetch.hh: src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue.hh: src/cpu/o3/lsq.hh: src/cpu/o3/lsq_impl.hh: src/cpu/o3/lsq_unit.hh: src/cpu/o3/regfile.hh: src/cpu/o3/rename.hh: src/cpu/o3/rename_impl.hh: src/cpu/o3/rob.hh: src/cpu/o3/rob_impl.hh: src/cpu/o3/thread_state.hh: src/python/m5/objects/AlphaO3CPU.py: Rename FullCPU to O3CPU to differentiate from old FullCPU model. src/cpu/o3/commit_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/fetch_impl.hh: src/cpu/o3/lsq_unit_impl.hh: Rename FullCPU to O3CPU to differentiate from old FullCPU model. Also #ifdef the checker code so it doesn't need to be included if it's not selected.
|
#
2727:91e17c7ee622 |
|
13-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Minor updates for stats.
src/cpu/o3/commit_impl.hh: src/cpu/o3/fetch.hh: Update stats comments. src/cpu/o3/fetch_impl.hh: Differentiate stats. src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue.hh: src/cpu/o3/inst_queue_impl.hh: Update for stats. src/cpu/o3/lsq.hh: LSQ now has stats. src/cpu/o3/lsq_impl.hh: Register stats of all LSQ units. src/cpu/o3/lsq_unit.hh: src/cpu/o3/lsq_unit_impl.hh: Add in stats.
|
#
2698:d5f35d41e017 |
|
09-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Removing of old code and adding in new comments.
src/cpu/base_dyn_inst.cc: Clean up old functions, comments. src/cpu/o3/alpha_cpu_builder.cc: src/cpu/o3/alpha_params.hh: src/cpu/o3/cpu.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/lsq.hh: src/cpu/o3/lsq_impl.hh: src/cpu/o3/rename_impl.hh: src/cpu/ozone/lsq_unit.hh: src/cpu/ozone/lsq_unit_impl.hh: Remove old commented code. src/cpu/o3/fetch.hh: Remove old commented code, add in comments. src/cpu/o3/inst_queue_impl.hh: Move comment to better place. src/cpu/o3/lsq_unit.hh: Remove old commented code, add in new comments. src/cpu/o3/lsq_unit_impl.hh: Remove old commented code, rename variable.
|
#
2689:dbf969c18a65 |
|
07-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Update copyright.
|
#
2669:f2b336e89d2a |
|
02-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Fixes to get compiling to work. This is mainly fixing up some includes; changing functions within the XCs; changing MemReqPtrs to Requests or Packets where appropriate.
Currently the O3 and Ozone CPUs do not work in the new memory system; I still need to fix up the ports to work and handle responses properly. This check-in is so that the merge between m5 and newmem is no longer outstanding.
src/SConscript: Need to include FU Pool for new CPU model. I'll try to figure out a cleaner way to handle this in the future. src/base/traceflags.py: Include new traces flags, fix up merge mess up. src/cpu/SConscript: Include the base_dyn_inst.cc as one of othe sources. Don't compile the Ozone CPU for now. src/cpu/base.cc: Remove an extra } from the merge. src/cpu/base_dyn_inst.cc: Fixes to make compiling work. Don't instantiate the OzoneCPU for now. src/cpu/base_dyn_inst.hh: src/cpu/o3/2bit_local_pred.cc: src/cpu/o3/alpha_cpu_builder.cc: src/cpu/o3/alpha_cpu_impl.hh: src/cpu/o3/alpha_dyn_inst.hh: src/cpu/o3/alpha_params.hh: src/cpu/o3/bpred_unit.cc: src/cpu/o3/btb.hh: src/cpu/o3/commit.hh: src/cpu/o3/commit_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/free_list.hh: src/cpu/o3/iew.hh: src/cpu/o3/iew_impl.hh: src/cpu/o3/inst_queue.hh: src/cpu/o3/inst_queue_impl.hh: src/cpu/o3/regfile.hh: src/cpu/o3/sat_counter.hh: src/cpu/op_class.hh: src/cpu/ozone/cpu.hh: src/cpu/checker/cpu.cc: src/cpu/checker/cpu.hh: src/cpu/checker/exec_context.hh: src/cpu/checker/o3_cpu_builder.cc: src/cpu/ozone/cpu_impl.hh: src/mem/request.hh: src/cpu/o3/fu_pool.hh: src/cpu/o3/lsq.hh: src/cpu/o3/lsq_unit.hh: src/cpu/o3/lsq_unit_impl.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/back_end.hh: src/cpu/ozone/dyn_inst.cc: src/cpu/ozone/dyn_inst.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/inorder_back_end.hh: src/cpu/ozone/lw_back_end.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/ozone_impl.hh: src/cpu/ozone/thread_state.hh: Fixes to get compiling to work. src/cpu/o3/alpha_cpu.hh: Fixes to get compiling to work. Float reg accessors have changed, as well as MemReqPtrs to RequestPtrs. src/cpu/o3/alpha_dyn_inst_impl.hh: Fixes to get compiling to work. Pass in the packet to the completeAcc function. Fix up syscall function.
|