14297:b4519e586f5e |
10-Sep-2019 |
Jordi Vaquero <jordi.vaquero@metempsy.com> |
cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>
This change is based on modify the way we move the AtomicOpFunctor* through gem5 in order to mantain proper ownership of the object and ensuring its destruction when it is no longer used.
Doing that we fix at the same time a memory leak in Request.hh where we were assigning a new AtomicOpFunctor* without destroying the previous one.
This change creates a new type AtomicOpFunctor_ptr as a std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except for its only usage when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
14198:9c2f67392409 |
17-Aug-2019 |
Gabe Black <gabeblack@google.com> |
cpu: Make get(Data|Inst)Port return a Port and not a MasterPort.
No caller uses any of the MasterPort specific properties of these function's return values, so we can instead return a reference to the base Port class. This makes it possible for the data and inst ports to be of any port type, not just gem5 style MasterPorts. This makes life simpler for, for example, systemc based CPUs which might have TLM ports.
It also makes it possible for any two CPUs which have compatible ports to be switched between, as long as the ports they use support being unbound. Unfortunately that does not include TLM or systemc ports which are bound permanently.
Change-Id: I98fce5a16d2ef1af051238e929dd96d57a4ac838 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20240 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
14105:969b4e972b07 |
27-Feb-2019 |
Gabor Dozsa <gabor.dozsa@arm.com> |
cpu: Add first-/non-faulting load support to Minor and O3
Some architectures allow masking faults of memory load instructions in some specific circumstances (e.g. first-faulting and non-faulting loads in Arm SVE). This patch adds support for such loads in the Minor and O3 CPU models.
Change-Id: I264a81a078f049127779aa834e89f0e693ba0bea Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13966:3189413c5894 |
01-Mar-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
Revert "cpu: fix how a thread starts up in MinorCPU"
This reverts commit 02dafc5498750d9734ba8f2a1608a846f90b71d1. The commit was part of a patchset which broke MinorCPU regressions (switcheroo)
Change-Id: I0a8098fc71abe5838014e587dbe372b258d8aa9f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18604 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13965:347e04956cfe |
01-Mar-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
Revert "cpu: stop scheduling suspended threads in MinorCPU"
This reverts commit 6a6668bbc4b038b98eb3ee64ffb034719316afd9. The commit was part of a patchset which broke MinorCPU regressions (switcheroo)
Change-Id: I3c16a6478ba44b9d27cdd3d64a710a356999df05 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18603 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13964:e4dbd156a640 |
01-Mar-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
Revert "cpu: fix branching when thread is suspended in MinorCPU"
This reverts commit e437086341712f1435db655b3527ea29b3311f4e. The commit was part of a patchset which broke MinorCPU regressions (switcheroo)
Change-Id: Ib8482034c2402008ccfa552325a8eb31e731b619 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18602 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13954:2f400a5f2627 |
07-Jul-2017 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors.
Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13953:43ae8a30ec1f |
23-Oct-2018 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu: Add a memory access predicate
This changeset introduces a new predicate to guard memory accesses. The most immediate use for this is to allow proper handling of predicated-false vector contiguous loads and predicated-false micro-ops of vector gather loads (added in separate changesets).
Change-Id: Ice6894fe150faec2f2f7ab796a00c99ac843810a Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17991 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bradley Wang <radwang@ucdavis.edu> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
13910:d5deee7b4279 |
28-Apr-2019 |
Gabe Black <gabeblack@google.com> |
cpu: alpha: Delete all occurrances of the simPalCheck function.
This is now handled within the ISA description.
Change-Id: Ie409bb46d102e59d4eb41408d9196fe235626d32 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18434 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13908:6ab98c626b06 |
27-Apr-2019 |
Gabe Black <gabeblack@google.com> |
cpu: Remove hwrei from the generic interfaces.
This mechanism is specific to Alpha and doesn't belong sprinkled around the CPU's generic mechanisms.
Change-Id: I87904d1a08df2b03eb770205e2c4b94db25201a1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18432 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13900:d4bcfecd871e |
28-Apr-2019 |
Gabe Black <gabeblack@google.com> |
cpu: Get rid of the (read|set)RegOtherThread methods.
These are implemented by MIPS internally now.
Change-Id: If7465e1666e51e1314968efb56a5a814e62ee2d1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18436 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> |
13818:f0126488ef9e |
26-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
cpu: Added a probe to notify the address of retired instructions
A probe is added to notify the address of each retired instruction.
Change-Id: Iefc1b09d74b3aa0aa5773b17ba637bf51f5a59c9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17632 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13759:9941fca869a9 |
16-Oct-2018 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
arch-arm,cpu: Add initial support for Arm SVE
This changeset adds initial support for the Arm Scalable Vector Extension (SVE) by implementing: - support for most data-processing instructions (no loads/stores yet); - basic system-level support.
Additional authors: - Javier Setoain <javier.setoain@arm.com> - Gabor Dozsa <gabor.dozsa@arm.com> - Giacomo Travaglini <giacomo.travaglini@arm.com>
Thanks to Pau Cabre for his contribution of bugfixes.
Change-Id: I1808b5ff55b401777eeb9b99c9a1129e0d527709 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13515 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13709:dd6b7ac5801f |
26-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Make iterator handling Python 3 compatible
Many functions that used to return lists (e.g., dict.items()) now return iterators and their iterator counterparts (e.g., dict.iteritems()) have been removed. Switch calls to the Python 2.7 iterator methods to use the Python 3 equivalent and add explicit list conversions where necessary.
Change-Id: I0c18114955af8f4932d81fb689a0adb939dafaba Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15992 Reviewed-by: Juha Jäykkä <juha.jaykka@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13665:9c7fe3811b88 |
25-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
python: Don't assume SimObjects live in the global namespace
The importer in Python 3 doesn't like the way we import SimObjects from the global namespace. Convert the existing SimObject declarations to import from m5.objects. As a side-effect, this makes these files consistent with configuration files.
Change-Id: I11153502b430822130722839e1fa767b82a027aa Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15981 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> |
13652:45d94ac03a27 |
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system.
Atomic memory instruction is treated as a special store instruction in all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13647:7a9b7c0373b1 |
02-Apr-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: fix how branching is handled when a thread is suspended in MinorCPU
When a thread is suspended, all instructions after the suspension need to be discarded since the thread will take a different execution stream when it wakes up.
To do that, in MinorCPU, whenever a thread gets suspended, we change the current execution stream by updating the current branch with BranchData::SuspendThread reason.
Change-Id: I7cdcda22c1cf6e8ac8db8800b7d9ec052433fdf3 Reviewed-on: https://gem5-review.googlesource.com/c/9626 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@gmail.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13646:626670cc6da4 |
02-Apr-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: stop scheduling suspended threads in all stages of MinorCPU
This patch makes suspended threads non-schedulable in Fetch1, Fetch2, Decode and Execute stages in MinorCPU.
Change-Id: Ie79857e13b7b782d9c58c32310993a132b609cf9 Reviewed-on: https://gem5-review.googlesource.com/c/9625 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@gmail.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13632:483aaa00c69c |
02-Apr-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: fix how a thread starts up in MinorCPU
When a thread is activated by another thread calling a clone system call, the child thread's context is initialized in the middle of the clone system call and before the context is fully initialized. Therefore, the child thread starts fetching an unitialized PC, which could lead to a page fault.
This patch adds a pipeline wakeup event that is scheduled later in the cycle when the thread is activated. This event ensures that the first fetch only happens after the thread context is fully initialized (e.g., in case of clone syscall, it is when the parent thread copies its context over to the child thread).
When a thread first starts or wakes up, input queue to the Fetch2 stage needs to be drained since the execution flow is likely to change and previously fetched instructions in the queue may no longer be in the correct flow. This patch dumps/drains all inputs in the input queue of a thread context in the Fetch2 stage when the associated thread wakes up.
Change-Id: Iad970638e435858b7289cd471158cc0afdbbb0e5 Reviewed-on: https://gem5-review.googlesource.com/c/8182 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Brandon Potter <Brandon.Potter@amd.com> |
13628:332f730a1855 |
04-Feb-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
misc: added missing override specifier
Added missing specifier for various virtual functions.
Change-Id: I4783e92d78789a9ae182fad79aadceafb00b2458 Reviewed-on: https://gem5-review.googlesource.com/c/16103 Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13622:ba31c2a23eca |
21-Nov-2018 |
Gabe Black <gabeblack@google.com> |
cpu, arch: Replace the CCReg type with RegVal.
Most architectures weren't using the CCReg type, and in x86 and arm it was already a uint64_t.
Change-Id: I0b3d5e690e6b31db6f2627f449c89bde0f6750a6 Reviewed-on: https://gem5-review.googlesource.com/c/14515 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13611:c8b7847b4171 |
19-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch: cpu: Rename *FloatRegBits* to *FloatReg*.
Now that there's no plain FloatReg, there's no reason to distinguish FloatRegBits with a special suffix since it's the only way to read or write FP registers.
Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded Reviewed-on: https://gem5-review.googlesource.com/c/14460 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
13610:5d5404ac6288 |
16-Oct-2018 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
arch,cpu: Add vector predicate registers
Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector Extension (SVE), introduce the notion of a predicate register file. This changeset adds this feature across architectures and CPU models.
Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13715 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13598:39220222740c |
04-Jan-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
cpu: Fix VecElemClass bugs in cpu models
This patch is:
* Adding a missing VecElemClass entry * Fixing assertion in rename map which was checking the number of free vector registers rather than free vector element registers * Fixing assertion in read/setVecElemOperand APIs. * Using the right register index in SimpleThread * Using VecElem instead of VecReg on O3 readArchVecElem
Change-Id: I265320dcbe35eb47075991301dfc99333c5190c4 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15598 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
13597:6b0f8e9cdeb5 |
10-Jan-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
cpu: Add VecElem entries in MinorCPU Scoreboard
This patch is: * Increasing the number of bits in the Scoreboard so that it is keeping track of VecElemClass dependencies. * Fixing VecElemClass entry in the scoreboard table so that it correctly uses flatIndex rather than index.
Change-Id: Ie4877e5fe410b1437447adebbe289602a443f7c0 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15597 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
13582:989577bf6abc |
18-Oct-2018 |
Gabe Black <gabeblack@google.com> |
arch: cpu: Stop passing around misc registers by reference.
These values are all basic integers (specifically uint64_t now), and so passing them by const & is actually less efficient since there's a extra level of indirection and an extra value, and the same sized value (a 64 bit pointer vs. a 64 bit int) is being passed around.
Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3 Reviewed-on: https://gem5-review.googlesource.com/c/13626 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
13557:fc33e6048b25 |
13-Oct-2018 |
Gabe Black <gabeblack@google.com> |
cpu: dev: sim: gpu-compute: Banish some ISA specific register types.
These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are some remaining types, specifically the vector registers and the CCReg. I'm less familiar with these new types of registers, and so will look at getting rid of them at some later time.
Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b Reviewed-on: https://gem5-review.googlesource.com/c/13624 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> |
13500:6e0a2a7c6d8c |
19-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch, cpu: Remove float type accessors.
Use the binary accessors instead.
Change-Id: Iff1877e92c79df02b3d13635391a8c2f025776a2 Reviewed-on: https://gem5-review.googlesource.com/c/14457 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Gabe Black <gabeblack@google.com> |
13475:5189e2334f1a |
28-Nov-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
base, sim: Add missing destructors
Derived classes with virtual functions need to define a virtual destructor or a protected destructor otherwise calling the base class destructor has undefined behavior. This change adds a virtual distructor in the base class.
Change-Id: I1c855aa56dff6585ff99b9147bdb4eb9729a0a53 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/14815 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13449:2f7efa89c58b |
26-Nov-2018 |
Gabe Black <gabeblack@google.com> |
arch, base, cpu, gpu, mem: Replace assert(0 or false with panic.
Neither assert(0) nor assert(false) give any hint as to why control getting to them is bad, and their more descriptive versions, assert(0 && "description") and assert(false && "description"), jury rig assert to add an error message when the utility function panic() already does that directly with better formatting options.
This change replaces that flavor of call to assert with panic, except in the actual code which processes the formatting that panic uses (to avoid infinitely recurring error handling), and in some *.sm files since I don't know what rules those have to follow and don't want to accidentaly break them.
Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0 Reviewed-on: https://gem5-review.googlesource.com/c/14636 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13429:a1e199fd8122 |
06-Feb-2017 |
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> |
cpu: Fix the usage of const DynInstPtr
Summary: Usage of const DynInstPtr& when possible and introduction of move operators to RefCountingPtr.
In many places, scoped references to dynamic instructions do a copy of the DynInstPtr when a reference would do. This is detrimental to performance. On top of that, in case there is a need for reference tracking for debugging, the redundant copies make the process much more painful than it already is.
Also, from the theoretical point of view, a function/method that defines a convenience name to access an instruction should not be considered an owner of the data, i.e., doing a copy and not a reference is not justified.
On a related topic, C++11 introduces move semantics, and those are useful when, for example, there is a class modelling a HW structure that contains a list, and has a getHeadOfList function, to prevent doing a copy to an internal variable -> update pointer, remove from the list -> update pointer, return value making a copy to the assined variable -> update pointer, destroy the returned value -> update pointer.
Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13105 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
13172:b816da4c5e9f |
14-May-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
cpu: Fix MinorCPU executing Crypto Instructions
Crypto instruction classes added to the MinorDefaultFloatSimdFU.
Change-Id: I0cd4aa422bec74285595312a8cf01f5f425a82cd Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13251 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12759:ab260678199a |
08-Jun-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu-minor: Remove redundant thread startup call
Don't call startup() twice on each of the threads.
Change-Id: Ibe3d1f25c4fdff291ee310abb9bcad3b184bab20 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11037 |
12749:223c83ed9979 |
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> |
12748:ae5ce8e42de7 |
03-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Substitute pointer to Request with aliased RequestPtr
Every usage of Request* in the code has been replaced with the RequestPtr alias. This is a preparing patch for when RequestPtr will be the typdefed to a smart pointer to Request rather then a raw pointer to Request.
Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10995 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> |
12563:8d59ed22ae79 |
06-Mar-2018 |
Gabe Black <gabeblack@google.com> |
scons: Switch from the print statement to the print function.
Starting with version 3, scons imposes using the print function instead of the print statement in code it processes. To get things building again, this change moves all python code within gem5 to use the function version. Another change by another author separately made this same change to the site_tools and site_init.py files.
Change-Id: I2de7dc3b1be756baad6f60574c47c8b7e80ea3b0 Reviewed-on: https://gem5-review.googlesource.com/8761 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12489:76d7f5f55f40 |
08-Nov-2017 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
cpu: MinorCPU handling IsSquashAfter flag
MinorCPU was not handling IsSquashAfter flagged instructions. The behaviour was to force a branch (hence enforcing refetching) for SerializeAfter instructions only. This has now been extended to SquashAfter in order to correctly support ISB barrier instruction behaviour.
Change-Id: Ie525b23350b0de121372d3b98b433e36b097d5c4 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5702 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12420:f5c80f4ed41f |
05-Jan-2018 |
Gabe Black <gabeblack@google.com> |
cpu, power: Get rid of the remnants of the EA computation insts.
Get rid of some remnants of a system which was intended to separate address computation into its own instruction object.
Change-Id: I23f9ffd70fcb89a8ea5bbb934507fb00da9a0b7f Reviewed-on: https://gem5-review.googlesource.com/7122 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Gabe Black <gabeblack@google.com> |
12392:e0dbdf30a2a5 |
13-Dec-2017 |
Jason Lowe-Power <jason@lowepower.com> |
misc: Updates for gcc7.2 for x86
GCC 7.2 is much stricter than previous GCC versions. The following changes are needed:
* There is now a warning if there is an implicit fallthrough between two case statments. C++17 adds the [[fallthrough]]; declaration. However, to support non C++17 standards (i.e., C++11), we use M5_FALLTHROUGH. M5_FALLTHROUGH checks for [[fallthrough]] compliant C++17 compiler and if that doesn't exist, it defaults to nothing (no older compilers generate warnings). * The above resulted in a couple of bugs that were found. This is noted in the review request on gerrit. * throw() for dynamic exception specification is deprecated * There were a couple of new uninitialized variable warnings * Can no longer perform bitwise operations on a bool. * Must now include <functional> for std::function * Compiler bug for void* lambda. Changed to auto as work around. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82878
Change-Id: I5d4c782a4e133fa4cdb119e35d9aff68c6e2958e Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/5802 Reviewed-by: Gabe Black <gabeblack@google.com> |
12355:568ec3a0c614 |
07-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu: Add support for CMOs in the cpu models
Cache maintenance operations go through the write channel of the cpu. This changes makes sure that the cpu does not try to fill in the packet with data.
Change-Id: Ic83205bb1cda7967636d88f15adcb475eb38d158 Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5055 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12334:e0ab29a34764 |
30-Nov-2017 |
Gabe Black <gabeblack@google.com> |
misc: Rename misc.(hh|cc) to logging.(hh|cc)
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.).
Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com> |
12324:6142a2fec8d9 |
16-Jun-2016 |
David Guillen Fandos <david.guillen@arm.com> |
cpu-minor: Add missing instruction stats
Change-Id: I811b552989caf3601ac65a128dbee6b7bb405d7f Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Updated to use IsVector instruction flag. ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5732 Reviewed-by: Gabe Black <gabeblack@google.com> |
12284:b91c036913da |
20-Jul-2017 |
Jose Marinho <jose.marinho@arm.com> |
cpu, cpu, sim: move Cycle probe update
Move the code responsible for performing the actual probe point notify into BaseCPU. Use BaseCPU activateContext and suspendContext to keep track of sleep cycles. Create a probe point (ppActiveCycles) that does not count cycles where the processor was asleep. Rename ppCycles to ppAllCycles to reflect its nature.
Change-Id: I1907ddd07d0ff9f2ef22cc9f61f5f46c630c9d66 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5762 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> |
12276:22c220be30c5 |
16-Mar-2017 |
Anouk Van Laer <anouk.vanlaer@arm.com> |
pwr: Adds logic to enter power gating for the cpu model
If the CPU has been clock gated for a sufficient amount of time (configurable via pwrGatingLatency), the CPU will go into the OFF power state. This does not model hardware, just behaviour.
Change-Id: Ib3681d1ffa6ad25eba60f47b4020325f63472d43 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3969 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12179:432a44667130 |
01-Sep-2017 |
Pau Cabre <pau.cabre@metempsy.com> |
cpu-minor: Fix for addr range coverage calculation
Coverage was wrongly set to PartialAddrRangeCoverage in the case of disjoint adjacent ranges
Change-Id: I29aaf5145e6cdcf5f0b8f4e009d57ee57bd4c944 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/4640 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> |
12127:4207df055b0d |
28-Jun-2017 |
Sean Wilson <spwilson2@wisc.edu> |
cpu: Refactor some Event subclasses to lambdas
Change-Id: If765c6100d67556f157e4e61aa33c2b7eeb8d2f0 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3923 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> |
12109:f29e9c5418aa |
05-Apr-2017 |
Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
cpu: Added interface for vector reg file
This patch adds some more functionality to the cpu model and the arch to interface with the vector register file.
This change consists mainly of augmenting ThreadContexts and ExecContexts with calls to get/set full vectors, underlying microarchitectural elements or lanes. Those are meant to interface with the vector register file. All classes that implement this interface also get an appropriate implementation.
This requires implementing the vector register file for the different models using the VecRegContainer class.
This change set also updates the Result abstraction to contemplate the possibility of having a vector as result.
The changes also affect how the remote_gdb connection works.
There are some (nasty) side effects, such as the need to define dummy numPhysVecRegs parameter values for architectures that do not implement vector extensions.
Nathanael Premillieu's work with an increasing number of fixes and improvements of mine.
Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues and CC reg free list initialisation ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2705 |
12106:7784fac1b159 |
05-Apr-2017 |
Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com> |
cpu: Simplify the rename interface and use RegId
With the hierarchical RegId there are a lot of functions that are redundant now.
The idea behind the simplification is that instead of having the regId, telling which kind of register read/write/rename/lookup/etc. and then the function panic_if'ing if the regId is not of the appropriate type, we provide an interface that decides what kind of register to read depending on the register type of the given regId.
Change-Id: I7d52e9e21fc01205ae365d86921a4ceb67a57178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2702 |
12104:edd63f9c6184 |
05-Apr-2017 |
Nathanael Premillieu <nathanael.premillieu@arm.com> |
arch, cpu: Architectural Register structural indexing
Replace the unified register mapping with a structure associating a class and an index. It is now much easier to know which class of register the index is referring to. Also, when adding a new class there is no need to modify existing ones.
Change-Id: I55b3ac80763702aa2cd3ed2cbff0a75ef7620373 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2700 |
11877:5ea85692a53e |
20-Jul-2015 |
Brandon Potter <brandon.potter@amd.com> |
syscall_emul: [patch 13/22] add system call retry capability
This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault).
This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction.
To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation.
The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state.
In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient.
As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread contexts with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally. |
11800:54436a1784dc |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 3/22] reduce include dependencies in some headers
Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include. |
11793:ef606668d247 |
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes |
11783:f94c14fd6561 |
21-Dec-2016 |
Arthur Perais <arthur.perais@inria.fr> |
cpu: disallow speculative update of branch predictor tables (o3)
The Minor and o3 cpu models share the branch prediction code. Minor relies on the BPredUnit::squash() function to update the branch predictor tables on a branch mispre- diction. This is fine because Minor executes in-order, so the update is on the correct path. However, this causes the branch predictor to be updated on out-of-order branch mispredictions when using the o3 model, which should not be the case.
This patch guards against speculative update of the branch prediction tables. On a branch misprediction, BPredUnit::squash() calls BpredUnit::update(..., squashed = true). The underlying branch predictor tests against the value of squashed. If it is true, it restores any speculatively updated internal state it might have (e.g., global/local branch history), then returns. If false, it updates its prediction tables. Previously, exist- ing predictors did not test against the "squashed" parameter.
To accomodate for this change, the Minor model must now call BPredUnit::squash() then BPredUnit::update(..., squashed = false) on branch mispredictions. Before, calling BpredUnit::squash() performed the prediction tables update.
The effect is a slight MPKI improvement when using the o3 model. A further patch should perform the same modifications for the indirect target predictor and BTB (less critical).
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11683:f1e198a028be |
15-Oct-2016 |
Fernando Endo <fernando.endo2@gmail.com> |
cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClass
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which distinguishes writes to the INT and FP register banks. Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72, where the "latency" of FMADD is 3 if the next instruction is a FMADD and has only the augend to destination dependency, otherwise it's 7 cycles.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com> |
11612:985d9b9a68bf |
14-Aug-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu: Add missing override in Minor's exec context
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11611:818913b8ce80 |
14-Aug-2016 |
Reiley Jeapaul <Reiley.Jeyapaul@arm.com> |
cpu: Fixed clang errors. Added 'override' keyword for virtual functions.
Change-Id: Ic37311443ca11ee6d95bceffea599e054e7aa110 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11608:6319a1125f1c |
14-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11568:91e95eb78191 |
21-Jul-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Fix Minor SMT WFI/drain interaction issues
The behavior of WFI is to cause minor to cease evaluating pipeline logic until an interrupt is observed, however a user may wish to drain the system while a core is sleeping due to a WFI. This patch makes WFI drain. If an actual drain occurs during a WFI, the CPU is already drained and will immediately be ready for swapping, checkpointing, etc. This should not negatively impact performance as WFI instructions are 'stream-changing' (treated like unpredicted branches), so all remaining instructions are wrong-path and will be squashed rapidly.
Change-Id: I63833d5acb53d8dde78f9f0c9611de0ece385e45 |
11567:560d7fbbddd1 |
21-Jul-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Add SMT support to MinorCPU
This patch adds SMT support to the MinorCPU. Currently RoundRobin or Random thread scheduling are supported.
Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3 |
11526:5b81895e5d5e |
06-Jun-2016 |
David Guillen Fandos <david.guillen@arm.com> |
pwr: Low-power idle power state for idle CPUs
Add functionality to the BaseCPU that will put the entire CPU into a low-power idle state whenever all threads in it are idle.
Change-Id: I984d1656eb0a4863c87ceacd773d2d10de5cfd2b |
11499:16ceeed96e1c |
27-May-2016 |
Ilias Vougioukas <Ilias.Vougioukas@ARM.com> |
cpu: fix lastStopped unserialisation
MinorCPU fix for corrupt numCycles when resuming from a previous simulation. --- src/cpu/minor/cpu.cc | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) |
11435:0f1b46dde3fa |
07-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit. |
11429:cf5af0cc3be4 |
06-Apr-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models 831c7f2f9e39: power: Low-power idle power state for idle CPUs 4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> |
11428:20264eb69fbf |
05-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. |
11423:831c7f2f9e39 |
09-Dec-2014 |
Akash Bagdia <akash.bagdia@ARM.com> |
power: Low-power idle power state for idle CPUs
Add functionality to the BaseCPU that will put the entire CPU into a low-power idle state whenever all threads in it are idle. |
11419:9c7b55faea5d |
05-Apr-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Add instruction opclass histogram to minor |
11356:a80884911971 |
19-Jul-2015 |
Krishnendra Nathella <krinat01@arm.com> |
cpu: Fix LLSC atomic CPU wakeup
Writes to locked memory addresses (LLSC) did not wake up the locking CPU. This can lead to deadlocks on multi-core runs. In AtomicSimpleCPU, recvAtomicSnoop was checking if the incoming packet was an invalidation (isInvalidate) and only then handled a locked snoop. But, writes are seen instead of invalidates when running without caches (fast-forward configurations). As as simple fix, now handleLockedSnoop is also called even if the incoming snoop packet are from writes. |
11341:bda2c39fd9fd |
15-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add missing overrides to appease clang
Since the last round of fixes a few new issues have snuck in. We should consider switching the regression runs to clang. |
11331:cd5c48db28e6 |
10-Feb-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement |
11321:02e930db812d |
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'. |
11303:f694764d656d |
17-Jan-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cpu. arch: add initiateMemRead() to ExecContext interface
For historical reasons, the ExecContext interface had a single function, readMem(), that did two different things depending on whether the ExecContext supported atomic memory mode (i.e., AtomicSimpleCPU) or timing memory mode (all the other models). In the former case, it actually performed a memory read; in the latter case, it merely initiated a read access, and the read completion did not happen until later when a response packet arrived from the memory system.
This led to some confusing things, including timing accesses being required to provide a pointer for the return data even though that pointer was only used in atomic mode.
This patch splits this interface, adding a new initiateMemRead() function to the ExecContext interface to replace the timing-mode use of readMem().
For consistency and clarity, the readMemTiming() helper function in the ISA definitions is renamed to initiateMemRead() as well. For x86, where the access size is passed in explicitly, we can also get rid of the data parameter at this level. For other ISAs, where the access size is determined from the type of the data parameter, we have to keep the parameter for that purpose. |
11169:44b5c183c3cd |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication.
As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). |
11168:f98eb2da15a4 |
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. |
11151:ca4ea9b5c052 |
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu,isa,mem: Add per-thread wakeup logic
Changes wakeup functionality so that only specific threads on SMT capable cpus are woken. |
11150:a8a64cca231b |
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
isa,cpu: Add support for FS SMT Interrupts
Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems. |
11148:1bc3d93c7eaa |
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Add per-thread monitors
Adds per-thread address monitors to support FullSystem SMT. |
11056:842f56345a42 |
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Reflect that packet address and size are always valid
This patch simplifies the packet, and removes the possibility of creating a packet without a valid address and/or size. Under no circumstances are these fields set at a later point, and thus they really have to be provided at construction time.
The patch also fixes a case there the MinorCPU creates a packet without a valid address and size, only to later delete it. |
11005:e7f403b6b76f |
07-Aug-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs. |
10950:d262e02c26b3 |
31-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu: Update debug message from Fetch1 isDrained() in Minor
Fix a spurious %s and include the state of the Fetch1 stage in the debug printout. |
10949:7fc527ab626a |
31-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu: Fix Minor drain issues when switched out
The Minor CPU currently doesn't drain properly when it is switched out. This happens because Fetch 1 expects to be in the FetchHalted state when it is drained. However, because the CPU is switched out, it is stuck in the FetchWaitingForPC state. Fix this by ignoring drain requests and returning DrainState::Drained from MinorCPU::drain() if the CPU is switched out. This is always safe since a switched out CPU, by definition, doesn't have any instructions in flight. |
10946:6f10e35b57d1 |
30-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu: Only activate thread 0 in Minor if the CPU is active
Minor currently activates thread 0 in startup() to work around an issue where activateContext() is called from LiveProcess before the process entry point is known. When activateContext() is called, Minor creates a branch instruction to the process's entry point. The first time it is called, the branch points to an undefined location (0). The call in startup() updates the branch to point to the actual entry point.
When instantiating a switched out Minor CPU, it still tries to activate thread 0. This is clearly incorrect since a switched out CPU can't have any active threads. This changeset adds a check to ensure that the thread is active before reactivating it. |
10945:369861e3d5af |
30-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
cpu: Fix drain issues in the Minor CPU
The drain refactor patches introduced a couple of bugs in the way Minor handles draining. This patch fixes an incorrect assert and a case of infinite recursion when the CPU signals drain done. |
10935:acd48ddd725f |
28-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
revert 5af8f40d8f2c |
10934:5af8f40d8f2c |
26-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
cpu: implements vector registers
This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now. |
10913:38dbdeea7f1f |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining.
This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error.
Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects. |
10910:32f3d1c454ec |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario. |
10905:a6ca6831e775 |
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. |
10851:a50657a4f0c1 |
26-May-2015 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: Fix a bug in counting issued instructions in MinorCPU
The MinorCPU would count bubbles in Execute::issue as part of the num_insts_issued and so sometimes reach the instruction issue limit incorrectly.
Fixed by checking for a bubble in one new place. |
10824:308771bd2647 |
05-May-2015 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations:
* Speculation: CPU models are not allowed to issue speculative loads.
* Write combining: CPU models and caches are not allowed to merge writes to the same cache line.
Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior. |
10814:46b6043bd32c |
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
cpu: Work around gcc 4.9 issues with Num_OpClasses
This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members. |
10785:f56c10663a01 |
13-Apr-2015 |
Dibakar Gope <gope@wisc.edu> |
cpu: re-organizes the branch predictor structure.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10774:68d688cbe26c |
03-Apr-2015 |
Nikos Nikoleris <nikos.nikoleris@gmail.com> |
cpu: fix system total instructions accounting
The totalInstructions counter is only incremented when the whole instruction is commited and not on every microop. It was incorrectly reset in atomic and timing cpus.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>" |
10739:4cfe55719da5 |
11-Feb-2015 |
Steve Reinhardt <steve.reinhardt@amd.com> |
mem: restructure Packet cmd initialization a bit more
Refactor the way that specific MemCmd values are generated for packets. The new approach is a little more elegant in that we assign the right value up front, and it's also more amenable to non-heap-allocated Packet objects.
Also replaced the code in the Minor model that was still doing it the ad-hoc way.
This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249. |
10713:eddb533708cb |
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks. |
10698:829adc48e175 |
16-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Make readMiscRegNoEffect const throughout
Finally took the plunge and made this apply to all ISAs, not just ARM. |
10665:aef704eaedd2 |
25-Jan-2015 |
Ali Saidi <Ali.Saidi@ARM.com> |
sim: Clean up InstRecord
Track memory size and flags as well as add some comments and consts. |
10647:899c0e7e85f1 |
20-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
cpu: Fix retry bug in MinorCPU LSQ |
10634:c5a2c5ef6e68 |
03-Jan-2015 |
Andrew Lukefahr <lukefahr@umich.edu> |
minor: fixed LSQ MasterPortID
Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10581:7c4f1d0a8cff |
02-Dec-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: Fix retries on barrier/store in Minor's store buffer
This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is pre-maturely counted as having been issued, leading to the store buffer idling.
LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed.
For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To acheive this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared.
Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry. |
10580:953d7b741619 |
02-Dec-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: Fix memoryIssueLimit checking in Minor
This patch fixes the checking of the number of memory instructions issued per cycles in the Minor CPU. |
10566:c99c8d2a7c31 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Assume all dynamic packet data is array allocated
This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers.
The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks.
As the last part the patch, it renames dataDynamicArray to dataDynamic. |
10563:755b18321206 |
02-Dec-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add const getters for write packet data
This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used.
The patch also removes the unused isReadWrite function. |
10537:47fe87b0cf97 |
14-Nov-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Fixes based on UBSan and static analysis
Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base.
Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code. |
10529:05b5a6cf3521 |
06-Nov-2014 |
Marc Orr <morr@cs.wisc.edu> |
x86 isa: This patch attempts an implementation at mwait.
Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10527:d0c2ba70dc12 |
06-Nov-2014 |
Andrew Lukefahr <lukefahr@umich.edu> |
cpu: Minor Draining Bug
Fixes a bug where Minor drains in the midst of committing a conditional store.
While committing a conditional store, lastCommitWasEndOfMacroop is true (from the previous instruction) as we still haven't finished the conditional store. If a drain occurs before the cache response, Minor would check just lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch, which increases the streamSeqNum. This caused the conditional store to be squashed when the memory responded and it completed. However, to the memory the store succeeded, while to the instruction sequence it never occurred.
In the case of an LLSC, the instruction sequence will replay the squashed STREX, which will fail as the cache is no longer in LLSC. Then the instruction sequence will loop back to a LDREX, which receives the updated (incorrect) value.
Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
10504:58d5d471b598 |
30-Oct-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: Fix barrier push to store buffer when full bug in Minor
This patch fixes a bug where a completing load or store which is also a barrier can push a barrier into the store buffer without first checking that there is a free slot.
The bug was not fatal but would print a warning that the store buffer was full when inserting. |
10464:2a0fe8bca031 |
16-Oct-2014 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
cpu: Probe points for basic PMU stats
This changeset adds probe points that can be used to implement PMU counters for CPU stats. The following probes are supported:
* BaseCPU::ppCycles / Cycles * BaseCPU::ppRetiredInsts / RetiredInsts * BaseCPU::ppRetiredLoads / RetiredLoads * BaseCPU::ppRetiredStores / RetiredStores * BaseCPU::ppRetiredBranches RetiredBranches |
10417:710ee116eb68 |
27-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Use const StaticInstPtr references where possible
This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer. |
10407:a9023811bf9e |
20-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate
activate(), suspend(), and halt() used on thread contexts had an optional delay parameter. However this parameter was often ignored. Also, when used, the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were ever specified). This patch removes the delay parameter and 'Events' associated with them across all ISAs and cores. Unused activate logic is also removed. |
10379:c00f6d7e2681 |
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Pass faults by const reference where possible
This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unecessary copies being created, contributing to the increment/decrement of the reference counters. |
10368:a7cb233caa7b |
12-Sep-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: Fix memory access in Minor not setting parent Request flags
This patch fixes cases where uncacheable/memory type flags are not set correctly on a memory op which is split in the LSQ. Without this patch, request->request if freely used to check flags where the flags should actually come from the accumulation of request fragment flags.
This patch also fixes a bug where an uncacheable access which passes through tryToSendRequest more than once can increment LSQ::numAccessesInMemorySystem more than once. |
10366:128c1ed03f4e |
12-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
minor: Fix typo in DPRINTF for Minor branch prediction |
10319:4207f9bfcceb |
03-Sep-2014 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arch, cpu: Factor out the ExecContext into a proper base class
We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models.
The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%. |
10259:ebb376f73dd2 |
23-Jul-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: `Minor' in-order CPU model
This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute).
The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot).
Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py.
Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features.
Minor is faster than the o3 model. Sample results:
Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036 |