#
14280:9e3f2937f72c |
|
31-Jul-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Fix Data Abort ISS when caused by Atomic operation
Data Aborts caused by an atomic instruction have a special rule for their syndrome: From a ISS point of view they count as read if a read to that address would generate a fault; they count as writes otherwise (ISS.WnR bit) This patch is implementing this in the TLB. For permission faults we need to explicitly check if a read would trigger a fault (e.g. checking for the AP bits) since permissions can allow read-only accesses. For other MMU exceptions (like translation faults) we are confident the nature of the access doesn't affect the genration of a fault. This means that if the access is atomic, we treat it as a read from an ISS.WnR point of view.
Change-Id: Ia524aa6ae07f81513cdc26c516b5fd9b01a931c3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20981 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14278:45892d0d3e98 |
|
08-Sep-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: PSTATE.PAN affecting EL2 only when HCR_EL2.E2H=1
Change-Id: I6df0cdcbadca17f30d3de3bed887f75c739b00f0 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20979 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14172:bba55ff08279 |
|
16-Aug-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Replace occ of opModeToEL(currOpMode/cpsr) with currEL
Change-Id: I739a9be03ea5caa63540c62fd110eee86a058c4c Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20252 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14128:6ed23d07d0d1 |
|
28-Jul-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Implement ARMv8.1-PAN, Privileged access never
ARMv8.1-PAN adds a new bit to PSTATE. When the value of this PAN state bit is 1, any privileged data access from EL1 or EL2 to a virtual memory address that is accessible at EL0 generates a Permission fault. This feature is mandatory in ARMv8.1 implementations. This feature is supported in AArch64 and AArch32 states. The ID_AA64MMFR1_EL1.PAN, ID_MMFR3_EL1.PAN, and ID_MMFR3.PAN fields identify the support for ARMv8.1-PAN.
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Change-Id: I94a76311711739dd2394c72944d88ba9321fd159 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19729 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14088:8de55a7aa53b |
|
01-May-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Use ExceptionLevel type in TlbEntry
Replacing uint8_t with ExceptionLevel type in the arm TlbEntry. The variable is representing the translation regime it is targeting.
Change-Id: Ifcd6e86c5d73f752e8476a2b7fda9ea74a0c7a3b Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19488 Reviewed-by: Ciro Santilli <ciro.santilli@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
14066:c29e36d24f68 |
|
16-Nov-2018 |
Anouk Van Laer <anouk.vanlaer@arm.com> |
arch, arm: Update miscRegs in getTE
Normally, a translation will start via translateTiming/functional which will check if the miscRegs have been updated and if so, will update the TLB state accordingly. However, in a 2 stage system, if there is a hit in stage 1, the resulting IPA will be sent to the S2-TLB for translation via a getTE() function call (via the stage2_lookup object). This will cause the state of the S2-TLB to be out of sync.
Change-Id: I117e4032fc76d7d31f4f999887b5573a7e5811e6 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14995 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
|
#
13968:55d001a9732d |
|
14-May-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
arch-arm: Do not check MustBeOne flag for TLB requests from the prefetcher
Allow TLB requests generated from prefetchers to override the MustBeOne arch flag. This allows the prefetchers to issue requests without having to know architecutre-specific flags.
Change-Id: Id83e0c93f3d1a614da11c4f344ab4dc594423672 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18768 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13889:b329d40d4e78 |
|
01-Apr-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: updateMiscReg not setting isHyp in aarch64
The isHyp flag should be set for a TLB::NormalTran when in EL2. This was happening in aarch32 only, where the CPSR mode is checked, while aarch64 was only using it for explicit EL2 translations, like for AT instructions.
Change-Id: I54605811e9dde75b5cf8868190b0f4c2a8d46570 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18394 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13882:03fe9a85b435 |
|
10-Apr-2019 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Remove un-needed hyp flag in TLBI operations
The hyp flag was probably a legacy pre-v8 flag distinguishing invalidation targeting PL2 translation regime (hyp mode). Since the introduction of target_el parameter, hyp boolean is not needed anymore. The patch works by setting the hyp flag in the flush* methods in the TLB automatically by checking if target_el == EL2.
Change-Id: I798009e09ff24a383dea871e348188bae2685e8e Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jan-Peter Larsson <jan-peter.larsson@arm.com> Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18389 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13795:e21c61d9efb8 |
|
19-Mar-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
dev-arm: ambiguous use of getPort()
The recent introduction of getPort() creates a conflict with the existing method used in arm MMU.
This patch rename the old getPort() in getDMAPort() according to the returned value (DmaPort class type)
Change-Id: Ief3d83650fd6b08490522341631244be06e380ce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17469 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13784:1941dc118243 |
|
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
arch, cpu, dev, gpu, mem, sim, python: start using getPort.
Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary.
Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13741:d994984b842a |
|
22-Feb-2019 |
Andrea Mondelli <Andrea.Mondelli@ucf.edu> |
mem-cache: alias to mem::getMasterPort in TLB class
TLB:getMasterPort is used to obtain the PageWalkMasterPort if present and hides the BaseTLB::getMasterPort().
The TLB::getMasterPort() is renamed according to the expected behavior.
Change-Id: If4f61189094a706d59805cd10f4f814e5830eda8 Reviewed-on: https://gem5-review.googlesource.com/c/16648 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13453:4a7a060ea26e |
|
10-Feb-2017 |
Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> |
cpu,arch-arm: Initialise data members
The value that is not initialized has a bogus value that manifests when using some debug-flags what makes the usage of tracediff a bit more challenging.
In addition, while debugging with other techniques, it introduces the problem of understanding if the value of a field is 'intended' or just an effect of the lack of initialisation.
Change-Id: Ied88caa77479c6f1d5166d80d1a1a057503cb106 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13125 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
#
13374:b7f652df5e5b |
|
19-Oct-2018 |
Anouk Van Laer <anouk.vanlaer@arm.com> |
arch, arm: Effect of AT instructions on descriptor handling
Some address translation instructions will stop translation after the 1st stage and intercept the IPA, even in the presence of stage 2 (eg AT S1E1). However, in the case of a TLB miss, the table descriptors still need to be translated from IPA to PA to avoid fetching the wrong addresses. This commit splits whether IPA->PA translation is required for the VA and/or for the table descriptors.
Change-Id: Ie53cdc00585f116150256f1d833460931b3bfb7d Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13781 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12763:37c243ed1112 |
|
29-May-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Add Illegal Execution flag to PCState
This patch moves the detection of the Illegal Execution flag (PSTATE.IL) from the tlb translation stage (fetch) to the decoding stage. This is done by adding the illegalExecution field to the PCState.
Change-Id: I9c1c4e9c6bd5ded905c1d56b3034e4e9322582fa Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10813 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12749:223c83ed9979 |
|
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12735:e3da526a0654 |
|
16-May-2018 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arch-arm: Respect EL from translation type
There are cases where instructions request translations in the context of a lower EL. This is currently not respected in the TLB and the page table walker. Fix that.
Change-Id: Icd59657a1ecfd8bd75a001bb1a4e41a6f4808a36 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10506 Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
|
#
12528:a9960d039c29 |
|
23-Jan-2018 |
Chuan Zhu <chuan.zhu@arm.com> |
arch-arm: Fix syntax error in TLB::getResultTe
The patch fixes one syntax error in TLB::getResultTe
Change-Id: I31a72a52d5c03f43929a69ca1be61d9c20e76f5b Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7983 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12506:aed554105426 |
|
04-Jan-2018 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
arch-arm: Check cache maintenance insts for permission faults
In AArch32, data cache maintenance instructions that operate by VA do not generate permission faults.
In AArch64, a data cache invalidate instruction can generate a permission fault when there are no write permissions to the specified VA. Data cache clean and data cache clean and invalidate instructions do not generate permission faults.
Checks for external aborts are also bypassed for data cache maintenance instructions.
Change-Id: Iea5bc665e4cf66d528e36b671535b66637c4b224 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7827 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12499:b81688796004 |
|
09-Jan-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
arch-arm: Change function name for banked miscregs
This commit changes the function's name used for retrieving the index of a security banked register given the flatten index. This will avoid confusion with flattenRegId, which has a different purpose.
Change-Id: I470ffb55916cb7fc9f78e071a7f2e609c1829f1a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/7982 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
12406:86bde4a026b5 |
|
22-Dec-2017 |
Gabe Black <gabeblack@google.com> |
arch,cpu: "virtualize" the TLB interface.
CPUs have historically instantiated the architecture specific version of the TLBs to avoid a virtual function call, making them a little bit more dependent on what the current ISA is. Some simple performance measurement, the x86 twolf regression on the atomic CPU, shows that there isn't actually any performance benefit, and if anything the simulator goes slightly faster (although still within margin of error) when the TLB functions are virtual.
This change switches everything outside of the architectures themselves to use the generic BaseTLB type, and then inside the ISA for them to cast that to their architecture specific type to call into architecture specific interfaces.
The ARM TLB needed the most adjustment since it was using non-standard translation function signatures. Specifically, they all took an extra "type" parameter which defaulted to normal, and translateTiming returned a Fault. translateTiming actually doesn't need to return a Fault because everywhere that consumed it just stored it into a structure which it then deleted(?), and the fault is stored in the Translation object when the translation is done.
A little more work is needed to fully obviate the arch/tlb.hh header, so the TheISA::TLB type is still visible outside of the ISAs. Specifically, the TlbEntry type is used in the generic PageTable which lives in src/mem.
Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575 Reviewed-on: https://gem5-review.googlesource.com/6921 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
#
12356:e56e838c47cb |
|
20-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
arm: Add CMO support for Non-Cacheable memory
Cache Maintainance operations to the point of coherence are treated as normal cahceable requests and clean and/or invalidate the caches of all PEs.
Change-Id: Ia4a749c2318fe29c8601848b034b8315c4186c8a Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5056 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
#
12005:f4b9607db0af |
|
16-Feb-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Add support for memory-mapped m5ops
Add support for a memory mapped m5op interface. When enabled, the TLB intercepts accesses in the 64KiB region designated by the ArmTLB.m5ops_base parameter. An access to this range maps to a specific m5op call. The upper 8 bits of the offset into the range denote the m5op function to call and the lower 8 bits denote the subfunction.
Change-Id: I55fd8ac1afef4c3cc423b973870c9fe600a843a2 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2964
|
#
11861:9684637f3339 |
|
21-Feb-2017 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
arm: Blame the right instruction address on a Prefetch Abort
CPU models (e.g., O3CPU) issue instruction fetches for the whole cache block rather than a specific instruction. Consequently the TLB lookups translate the cache block virtual address. When the TLB lookup fails, however, the Prefetch Abort must be raised for the PC of the instruction that caused the fault rather than for the address of the block.
This change fixes the way we instantiate the PrefetchAbort faults to use the PC of the request rather the address of the instruction fetch request.
Change-Id: I8e45549da1c3be55ad204a060029c95ce822a851 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Rekai Gonzalez Alberquilla <rekai.gonzalezalberquilla@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11793:ef606668d247 |
|
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes
|
#
11608:6319a1125f1c |
|
14-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11584:bbd8448f104e |
|
02-Aug-2016 |
Dylan Johnson <Dylan.Johnson@ARM.com> |
arm: Add TLBI instruction for stage 2 IPA's
This patch adds support for stage 2 TLBI instructions such as TLBI IPAS2E1_Xt.
Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88
|
#
11580:afe051c345e9 |
|
02-Aug-2016 |
Dylan Johnson <Dylan.Johnson@ARM.com> |
arm: Fix stage 2 determination in table walker
We recompute if we are doing a stage 2 walk inside of the table walker but we have already figured it out in the tlb. Pass the information in to the walk instead of recomputing it.
Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401
|
#
11577:a26a328c20eb |
|
02-Aug-2016 |
Dylan Johnson <Dylan.Johnson@ARM.com> |
arm: Fix EL perceived at TLB for address translation instructions
During address translation instructions (such as AT S1E1R_Xt) the exception level can be different than the current exception level. This patch fixes how the TLB determines what EL to use during these instructions.
Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f
|
#
11575:0005b28685f0 |
|
02-Aug-2016 |
Dylan Johnson <Dylan.Johnson@ARM.com> |
arm: add stage2 translation support
Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1
|
#
11560:f050b8cf4754 |
|
11-Jul-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Don't consult the TLB test iface for functional translations
Don't consult the TLB test interface for PA's returned by functional translations by the AT instruction. We implement this by chaning the ISA code to synthesize 0-length functional reads for the TLB lookup. The TLB then bypasses the final PA check in the tester if the size is zero.
Change-Id: I2487b7f829cea88c37e229e9fc7a4543aced961b Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
|
#
11522:348411ec525a |
|
06-Jun-2016 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
sim: Call regStats of base-class as well
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there.
Patch originally provided by Akash Bagdia at ARM Ltd.
|
#
11517:54230f1ebef2 |
|
02-Jun-2016 |
Curtis Dunham <Curtis.Dunham@arm.com> |
arm: refactor page table format determination
In particular, when EL0 is in AArch32 but EL1 is AArch64, AArch64 memory translation must be used. This is essential for typical AArch64/32 interworking use cases.
|
#
11505:55256a05d9e9 |
|
30-May-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Correctly check translation mode (aarch64/aarch32)
According to the ARM ARM (see AArch32.TranslateAddress in the pseudocode library), the TLB should be operating in aarch64 mode if the EL0 is aarch32 and EL1 is aarch64. This is currently not the case in gem5, which breaks 64/32 interprocessing. Update the check to match the reference manual.
Change-Id: I6f1444d57c0e2eb5f8880f513f33a9197b7cb2ce Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11495:1f04f97c014d |
|
26-May-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Fix incorrect TLB permission check in aarch32
The TLB currently assumes that the pxn bit in an LPAE page descriptor disables execution from unprivileged mode. However, according to the architecture manual, this bit should disable execution from privileged modes. Update the TLB implementation to reflect this behavior.
Change-Id: I7f1bb232d7a94a93fd601a9230223195ac952947 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11395:032bc62120eb |
|
21-Mar-2016 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Refactor the TLB test interface
Refactor the TLB and page table walker test interface to use a dynamic registration mechanism. Instead of patching a couple of empty methods to wire up a TLB tester, this change allows such testers to register themselves using the setTestInterface() method.
|
#
11321:02e930db812d |
|
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'.
|
#
11152:11da02681277 |
|
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
arm: Change TLB Software Caching
In ARM, certain variables are only updated when a necessary change is detected. Having 2 SMT threads share a TLB resulted in these not being updated as required. This patch adds a thread context identifer to assist in the invalidation of these variables.
|
#
11055:54071fd5c397 |
|
21-Aug-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to MISCREG_LOCKFLAG and so the request flag is no longer needed. The corresponding functionality in the cache tags is also removed.
|
#
10905:a6ca6831e775 |
|
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.
|
#
10873:7c972b9aea16 |
|
21-Jun-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
arm: Cleanup arch headers to remove dma_device.hh dependency
Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header.
|
#
10854:f449d6f8a647 |
|
26-May-2015 |
Nathanael Premillieu <Nathanael.Premillieu@arm.com> |
arm: Make address translation faster with better caching
This patch adds better caching of the sys regs for AArch64, thus avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the non-faulting case.
|
#
10825:5d059b8ed8a4 |
|
05-May-2015 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arm: Relax ordering for some uncacheable accesses
We currently assume that all uncacheable memory accesses are strictly ordered. Instead of always enforcing strict ordering, we now only enforce it if the required memory type is device memory or strongly ordered memory.
|
#
10824:308771bd2647 |
|
05-May-2015 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations:
* Speculation: CPU models are not allowed to issue speculative loads.
* Write combining: CPU models and caches are not allowed to merge writes to the same cache line.
Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.
|
#
10822:d259f2bc2b31 |
|
05-May-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Remove unnecessary boot uncachability
With the recent patches addressing how we deal with uncacheable accesses there is no longer need for the work arounds put in place to enforce certain sections of memory to be uncacheable during boot.
|
#
10717:4f8c1bd6fdb8 |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Share a port for the two table walker objects
This patch changes how the MMU and table walkers are created such that a single port is used to connect the MMU and the TLBs to the memory system. Previously two ports were needed as there are two table walker objects (stage one and stage two), and they both had a port. Now the port itself is moved to the Stage2MMU, and each TableWalker is simply using the port from the parent.
By using the same port we also remove the need for having an additional crossbar joining the two ports before the walker cache or the L2. This simplifies the creation of the CPU cache topology in BaseCPU.py considerably. Moreover, for naming and symmetry reasons, the TLB walker port is connected through the stage-one table walker thus making the naming identical to x86. Along the same line, we use the stage-one table walker to generate the master id that is used by all TLB-related requests.
|
#
10611:3bba9f2d0c7d |
|
23-Dec-2014 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arm: Raise an alignment fault if a PC has illegal alignment
We currently don't handle unaligned PCs correctly. There is one check for unaligned PCs in the TLB when running in aarch64 mode, but this check does not cover cases where the CPU does not do a TLB lookup when decoding an instruction (e.g., a branch stays within the same cache line). Additionally, the Decoder class sometimes throws an assertion for unaligned PCs which breaks speculation.
This changeset introduces a decoder fault bit field in the ExtMachInst structure. This field can be used to signal a decoder failure. If set, the decoder generates an internal gem5fault instruction instead of a normal instruction. This instruction in turns either panics (fault type PANIC), returns an PCAlignmentFault (fault type UNALIGNED, aarch64) or PrefetchAbort (fault type UNALIGNED, aarch32).
The patch causes minor changes to the realview64 regressions, and a stats bump will follow.
|
#
10537:47fe87b0cf97 |
|
14-Nov-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Fixes based on UBSan and static analysis
Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base.
Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.
|
#
10508:aa46a8ae3487 |
|
30-Oct-2014 |
Ali Saidi <Ali.Saidi@ARM.com> |
arm: Fix multi-system AArch64 boot w/caches.
Automatically extract cpu release address from DTB file. Check SCTLR_EL1 to verify all caches are enabled.
|
#
10474:799c8ee4ecba |
|
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Use shared_ptr for all Faults
This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".
|
#
10463:25c5da51bbe0 |
|
16-Oct-2014 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arm: Add TLB PMU probes
This changeset adds probe points that can be used to implement PMU counters for TLB stats. The following probes are supported:
* ArmISA::TLB::ppRefills / TLB Refills (TLB insertions)
|
#
10418:7a76e13f0101 |
|
27-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Fixed undefined behaviours identified by gcc
This patch fixes the runtime errors highlighted by the undefined behaviour sanitizer. In the end there were two issues. First, when rotating an immediate, we ended up shifting an uint32_t by 32 in some cases. This case is fixed by checking for a rotation by 0 positions. Second, the Mrc15 and Mcr15 are operating on an IntReg and a MiscReg, but we used the type RegRegImmOp and passed a MiscRegIndex as an IntRegIndex. This issue is resolved by introducing a MiscRegRegImmOp and RegMiscRegImmOp with the appropriate types.
With these fixes there are no runtime errors identified for the full ARM regressions.
|
#
10367:bf52480abd01 |
|
12-Sep-2014 |
Andrew Bardsley <Andrew.Bardsley@arm.com> |
style: Fix line continuation, especially in debug messages
This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)
|
#
10194:e6d2e8083d9c |
|
09-May-2014 |
Geoffrey Blake <Geoffrey.Blake@arm.com> |
arch, arm: Preserve TLB bootUncacheability when switching CPUs
The ARM TLBs have a bootUncacheability flag used to make some loads and stores become uncacheable when booting in FS mode. Later the flag is cleared to let those loads and stores operate as normal. When doing a takeOverFrom(), this flag's state is not preserved and is momentarily reset until the CPSR is touched. On single core runs this is a non-issue. On multi-core runs this can lead to crashes on the O3 CPU model from the following series of events: 1) takeOverFrom executed to switch from Atomic -> O3 2) All bootUncacheability flags are reset to true 3) Core2 tries to execute a load covered by bootUncacheability, it is flagged as uncacheable 4) Core2's load needs to replay due to a pipeline flush 3) Core1 core does an action on CPSR 4) The handling code for CPSR then checks all other cores to determine if bootUncacheability can be set to false 5) Asynchronously set bootUncacheability on all cores to false 6) Core2 replays load previously set as uncacheable and notices it is now flagged as cacheable, leads to a panic. This patch implements takeOverFrom() functionality for the ARM TLBs to preserve flag values when switching from atomic -> detailed.
|
#
10037:5cac77888310 |
|
24-Jan-2014 |
ARM gem5 Developers |
arm: Add support for ARMv8 (AArch64 & AArch32)
Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch.
Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch.
Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black
|
#
10024:fc10e1f9f124 |
|
24-Jan-2014 |
Dam Sunwoo <dam.sunwoo@arm.com> |
mem: per-thread cache occupancy and per-block ages
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
|
#
9950:4b7f60080149 |
|
31-Oct-2013 |
Prakash Ramrakhyani <prakash.ramrakhyani@arm.com> |
mem: Add privilege info to request class
This patch adds a flag in the request class that indicates if the request was made in privileged mode.
|
#
9738:304a37519d11 |
|
03-Jun-2013 |
Andreas Sandberg <andreas@sandberg.pp.se> |
arch: Create a method to finalize physical addresses in the TLB
Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU in a using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch moves this patch allows that code to be used by such CPUs without doing full address translations.
|
#
9535:508aebb47ca6 |
|
15-Feb-2013 |
Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> |
arm: fix a page table walker issue where a page could be translated multiple times
If multiple memory operations to the same page are miss the TLB they are all inserted into the page table queue and before this change could result in multiple uncessesary walks as well as duplicate enteries being inserted into the TLB.
|
#
9439:fce94f92ea0f |
|
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arm: Invalidate cached TLB configuration in drainResume
Currently, we invalidate the cached miscregs in TLB::unserialize(). The intended use of the drainResume() method is to invalidate cached state and prepare the system to resume after a CPU handover or (un)serialization. This patch moves the TLB miscregs invalidation code to the drainResume() method to avoid surprising behavior.
|
#
9294:8fb03b13de02 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
|
#
8922:17f037ad8918 |
|
30-Mar-2012 |
William Wang <william.wang@arm.com> |
MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
|
#
8809:bb10807da889 |
|
01-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head, hopefully the last time for this batch.
|
#
8806:669e93d79ed9 |
|
29-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Implement Ali's review feedback.
Try to decrease indentation, and remove some redundant FullSystem checks.
|
#
8782:10c9297e14d5 |
|
02-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Get rid of FULL_SYSTEM in the ARM ISA.
|
#
8756:cce8cf3906ca |
|
16-Oct-2011 |
Gabe Black <gblack@eecs.umich.edu> |
ARM: Turn on the page table walker on ARM in SE mode.
|
#
8733:64a7bf8fa56c |
|
31-Jan-2012 |
Geoffrey Blake <geoffrey.blake@arm.com> |
CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5
Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.
|
#
8552:f51e3dce9521 |
|
13-Sep-2011 |
Daniel Johnson <daniel.johnson@arm.com> |
ARM: update TLB to set request packet ASID field
|
#
8527:6bac5b04d588 |
|
19-Aug-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Mark some variables uncacheable until boot all CPUs are enabled.
There are a set of locations is the linux kernel that are managed via cache maintence instructions until all processors enable their MMUs & TLBs. Writes to these locations are manually flushed from the cache to main memory when the occur so that cores operating without their MMU enabled and only issuing uncached accesses can receive the correct data. Unfortuantely, gem5 doesn't support any kind of software directed maintence of the cache. Until such time as that support exists this patch marks the specific cache blocks that need to be coherent as non-cacheable until all CPUs enable their MMU and thus allows gem5 to boot MP systems with caches enabled (a requirement for booting an O3 cpu and thus an O3 CPU regression).
|
#
8353:237ea6324ece |
|
16-Jun-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Handle case where new TLB size is different from previous TLB size.
After a checkpoint we need to make sure that we restore the right number of entries.
|
#
8352:9a3c002dab3e |
|
16-Jun-2011 |
Chander Sudanthi <Chander.Sudanthi@ARM.com> |
ARM: Fix memset on TLB flush and initialization
Instead of clearing the entire TLB on initialization and flush, the code was clearing only one element. This patch corrects the memsets in the init and flush routines.
|
#
8232:b28d06a175be |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
|
#
8202:1b63e9afeafc |
|
04-Apr-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Fix table walk going on while ASID changes error
|
#
8067:21f14583aa6a |
|
23-Feb-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Fix bug that let two table walks occur in parallel.
|
#
7944:1daf51f62013 |
|
11-Feb-2011 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
O3: Enhance data address translation by supporting hardware page table walkers.
Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs, when a TLB miss occurs, initiateTranslation() can return with NoFault but with the translation unfinished.
Instructions experiencing a delayed translation due to a hardware page table walk are deferred until the translation completes and kept into the IQ. In order to keep track of them, the IQ has been augmented with a queue of the outstanding delayed memory instructions. When their translation completes, instructions are re-executed (only their initiateAccess() was already executed; their DTB translation is now skipped). The IEW stage has been modified to support such a 2-pass execution.
|
#
7850:02450f4443ce |
|
18-Jan-2011 |
Matt Horsnell <Matt.Horsnell@arm.com> |
O3: Fixes the way prefetches are handled inside the iew unit.
This patch prevents the prefetch being added to the instCommit queue twice.
|
#
7781:a9f9eed35b18 |
|
07-Dec-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Support switchover with hardware table walkers
|
#
7749:859e8bc1cdc2 |
|
15-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Cache the misc regs at the TLB to limit readMiscReg() calls.
|
#
7734:85a8198aa2ff |
|
08-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Add some TLB statistics for ARM
|
#
7733:08d6a773d1b6 |
|
08-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Add checkpointing support
|
#
7720:65d338a8dba4 |
|
31-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about, the PC and the NPC, and the lsb of the PC signaled whether or not you were in PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next micropc, x86 and ARM introduced variable length instruction sets, and ARM started to keep track of mode bits in the PC. Each CPU model handled PCs in its own custom way that needed to be updated individually to handle the new dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack, the complexity could be hidden in the ISA at the ISA implementation's expense. Areas like the branch predictor hadn't been updated to handle branch delay slots or micropcs, and it turns out that had introduced a significant (10s of percent) performance bug in SPARC and to a lesser extend MIPS. Rather than perpetuate the problem by reworking O3 again to handle the PC features needed by x86, this change was introduced to rework PC handling in a more modular, transparent, and hopefully efficient way.
PC type:
Rather than having the superset of all possible elements of PC state declared in each of the CPU models, each ISA defines its own PCState type which has exactly the elements it needs. A cross product of canned PCState classes are defined in the new "generic" ISA directory for ISAs with/without delay slots and microcode. These are either typedef-ed or subclassed by each ISA. To read or write this structure through a *Context, you use the new pcState() accessor which reads or writes depending on whether it has an argument. If you just want the address of the current or next instruction or the current micro PC, you can get those through read-only accessors on either the PCState type or the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the move away from readPC. That name is ambiguous since it's not clear whether or not it should be the actual address to fetch from, or if it should have extra bits in it like the PAL mode bit. Each class is free to define its own functions to get at whatever values it needs however it needs to to be used in ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the PC and into a separate field like ARM.
These types can be reset to a particular pc (where npc = pc + sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as appropriate), printed, serialized, and compared. There is a branching() function which encapsulates code in the CPU models that checked if an instruction branched or not. Exactly what that means in the context of branch delay slots which can skip an instruction when not taken is ambiguous, and ideally this function and its uses can be eliminated. PCStates also generally know how to advance themselves in various ways depending on if they point at an instruction, a microop, or the last microop of a macroop. More on that later.
Ideally, accessing all the PCs at once when setting them will improve performance of M5 even though more data needs to be moved around. This is because often all the PCs need to be manipulated together, and by getting them all at once you avoid multiple function calls. Also, the PCs of a particular thread will have spatial locality in the cache. Previously they were grouped by element in arrays which spread out accesses.
Advancing the PC:
The PCs were previously managed entirely by the CPU which had to know about PC semantics, try to figure out which dimension to increment the PC in, what to set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction with the PC type itself. Because most of the information about how to increment the PC (mainly what type of instruction it refers to) is contained in the instruction object, a new advancePC virtual function was added to the StaticInst class. Subclasses provide an implementation that moves around the right element of the PC with a minimal amount of decision making. In ISAs like Alpha, the instructions always simply assign NPC to PC without having to worry about micropcs, nnpcs, etc. The added cost of a virtual function call should be outweighed by not having to figure out as much about what to do with the PCs and mucking around with the extra elements.
One drawback of making the StaticInsts advance the PC is that you have to actually have one to advance the PC. This would, superficially, seem to require decoding an instruction before fetch could advance. This is, as far as I can tell, realistic. fetch would advance through memory addresses, not PCs, perhaps predicting new memory addresses using existing ones. More sophisticated decisions about control flow would be made later on, after the instruction was decoded, and handed back to fetch. If branching needs to happen, some amount of decoding needs to happen to see that it's a branch, what the target is, etc. This could get a little more complicated if that gets done by the predecoder, but I'm choosing to ignore that for now.
Variable length instructions:
To handle variable length instructions in x86 and ARM, the predecoder now takes in the current PC by reference to the getExtMachInst function. It can modify the PC however it needs to (by setting NPC to be the PC + instruction length, for instance). This could be improved since the CPU doesn't know if the PC was modified and always has to write it back.
ISA parser:
To support the new API, all PC related operand types were removed from the parser and replaced with a PCState type. There are two warts on this implementation. First, as with all the other operand types, the PCState still has to have a valid operand type even though it doesn't use it. Second, using syntax like PCS.npc(target) doesn't work for two reasons, this looks like the syntax for operand type overriding, and the parser can't figure out if you're reading or writing. Instructions that use the PCS operand (which I've consistently called it) need to first read it into a local variable, manipulate it, and then write it back out.
Return address stack:
The return address stack needed a little extra help because, in the presence of branch delay slots, it has to merge together elements of the return PC and the call PC. To handle that, a buildRetPC utility function was added. There are basically only two versions in all the ISAs, but it didn't seem short enough to put into the generic ISA directory. Also, the branch predictor code in O3 and InOrder were adjusted so that they always store the PC of the actual call instruction in the RAS, not the next PC. If the call instruction is a microop, the next PC refers to the next microop in the same macroop which is probably not desirable. The buildRetPC function advances the PC intelligently to the next macroop (in an ISA specific way) so that that case works.
Change in stats:
There were no change in stats except in MIPS and SPARC in the O3 model. MIPS runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could likely be improved further by setting call/return instruction flags and taking advantage of the RAS.
TODO:
Add != operators to the PCState classes, defined trivially to be !(a==b). Smooth out places where PCs are split apart, passed around, and put back together later. I think this might happen in SPARC's fault code. Add ISA specific constructors that allow setting PC elements without calling a bunch of accessors. Try to eliminate the need for the branching() function. Factor out Alpha's PAL mode pc bit into a separate flag field, and eliminate places where it's blindly masked out or tested in the PC.
|
#
7705:fd65f85fcc0c |
|
13-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
Mem: Change the CLREX flag to CLEAR_LL.
CLREX is the name of an ARM instruction, not a name for this generic flag.
|
#
7697:05b1a077977b |
|
01-Oct-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Make the TLB a little bit faster by moving most recently used items to front of list
|
#
7694:de057cccee82 |
|
01-Oct-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Implement functional virtual to physical address translation for debugging and program introspection.
|
#
7612:917946898102 |
|
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
MEM: Make CLREX a first class request operation and clear locks in caches when it in received
|
#
7611:c119da5a80c8 |
|
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: Make sure that software prefetch instructions can't change the state of the TLB
|
#
7608:17aabeaa1a8f |
|
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: Fix Uncachable TLB requests and decoding of xn bit
|
#
7606:c0d90ba69082 |
|
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: For non-cachable accesses set the UNCACHABLE flag
|
#
7603:66d853e566d2 |
|
23-Aug-2010 |
Gene Wu <Gene.Wu@arm.com> |
ARM: Implement CLREX
|
#
7461:5a07045d0af2 |
|
15-Jun-2010 |
Nathan Binkert <nate@binkert.org> |
stats: only consider a formula initialized if there is a formula
|
#
7439:b4c6b2532bbf |
|
02-Jun-2010 |
Dam Sunwoo <dam.sunwoo@arm.com> |
ARM: Allow multiple outstanding TLB walks to queue.
|
#
7438:8e4b37136330 |
|
02-Jun-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM TLB: Fix bug in memAttrs getting a bogus thread context
|
#
7437:5853fbdfba9b |
|
02-Jun-2010 |
Dam Sunwoo <dam.sunwoo@arm.com> |
ARM: Support table walks in timing mode.
|
#
7436:b578349f9371 |
|
02-Jun-2010 |
Dam Sunwoo <dam.sunwoo@arm.com> |
ARM: Added support for Access Flag and some CP15 regs (V2PCWPR, V2PCWPW, V2PCWUR, V2PCWUW,...)
|
#
7406:ddc26bd4ea7d |
|
02-Jun-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Some TLB bug fixes.
|
#
7404:bfc74724914e |
|
02-Jun-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Implement the ARM TLB/Tablewalker. Needs performance improvements.
|
#
7399:a378ac1e1615 |
|
02-Jun-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Start over with translation from Alpha code as opposed to something that has cruft from 4 different ISAs.
|
#
7362:9ea92e0eb4a9 |
|
02-Jun-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ARM: Implement and update the DFSR and IFSR registers on faults.
|
#
7301:04d8488d0d36 |
|
02-Jun-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ARM: Warn about not implementing MPU translation, not panic about MMU.
We'll start out with a stbu version of PMSA and switch over to VMSA for the full implementation.
|
#
7294:fda2c00880db |
|
02-Jun-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ARM: Implement the V7 version of alignment checking.
|
#
7093:9832d4b070fc |
|
02-Jun-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ARM: Track the current ISA mode using the PC.
|
#
6757:d86d3d6e5326 |
|
17-Nov-2009 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM: Boilerplate full-system code.
|
#
6428:9e35cdc95e81 |
|
02-Aug-2009 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Clean up some inconsistencies with Request flags.
|
#
6116:a5a97b04d796 |
|
21-Apr-2009 |
Nathan Binkert <nate@binkert.org> |
arm: Unify the ARM tlb. We forgot about this when we did the rest. This code compiles, but there are no tests still
|
#
6020:0647c8b31a99 |
|
06-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Merge ARM into the head. ARM will compile but may not actually work.
|
#
6019:76890d8b28f5 |
|
05-Apr-2009 |
Stephen Hines <hines@cs.fsu.edu> |
arm: add ARM support to M5
|