#
13966:3189413c5894 |
|
01-Mar-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
Revert "cpu: fix how a thread starts up in MinorCPU"
This reverts commit 02dafc5498750d9734ba8f2a1608a846f90b71d1. The commit was part of a patchset which broke MinorCPU regressions (switcheroo)
Change-Id: I0a8098fc71abe5838014e587dbe372b258d8aa9f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18604 Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13652:45d94ac03a27 |
|
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system.
Atomic memory instruction is treated as a special store instruction in all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
13632:483aaa00c69c |
|
02-Apr-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: fix how a thread starts up in MinorCPU
When a thread is activated by another thread calling a clone system call, the child thread's context is initialized in the middle of the clone system call and before the context is fully initialized. Therefore, the child thread starts fetching an unitialized PC, which could lead to a page fault.
This patch adds a pipeline wakeup event that is scheduled later in the cycle when the thread is activated. This event ensures that the first fetch only happens after the thread context is fully initialized (e.g., in case of clone syscall, it is when the parent thread copies its context over to the child thread).
When a thread first starts or wakes up, input queue to the Fetch2 stage needs to be drained since the execution flow is likely to change and previously fetched instructions in the queue may no longer be in the correct flow. This patch dumps/drains all inputs in the input queue of a thread context in the Fetch2 stage when the associated thread wakes up.
Change-Id: Iad970638e435858b7289cd471158cc0afdbbb0e5 Reviewed-on: https://gem5-review.googlesource.com/c/8182 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Brandon Potter <Brandon.Potter@amd.com>
|
#
12324:6142a2fec8d9 |
|
16-Jun-2016 |
David Guillen Fandos <david.guillen@arm.com> |
cpu-minor: Add missing instruction stats
Change-Id: I811b552989caf3601ac65a128dbee6b7bb405d7f Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Updated to use IsVector instruction flag. ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5732 Reviewed-by: Gabe Black <gabeblack@google.com>
|
#
11567:560d7fbbddd1 |
|
21-Jul-2016 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Add SMT support to MinorCPU
This patch adds SMT support to the MinorCPU. Currently RoundRobin or Random thread scheduling are supported.
Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3
|
#
10259:ebb376f73dd2 |
|
23-Jul-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
cpu: `Minor' in-order CPU model
This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute).
The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot).
Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py.
Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features.
Minor is faster than the o3 model. Sample results:
Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036
|