History log of /gem5/src/arch/arm/tlb.cc
Revision Date Author Comments
# 14280:9e3f2937f72c 31-Jul-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Fix Data Abort ISS when caused by Atomic operation

Data Aborts caused by an atomic instruction have a special rule for
their syndrome:
From a ISS point of view they count as read if a read to that address
would generate a fault; they count as writes otherwise (ISS.WnR bit)
This patch is implementing this in the TLB. For permission faults we
need to explicitly check if a read would trigger a fault
(e.g. checking for the AP bits) since permissions can allow read-only
accesses.
For other MMU exceptions (like translation faults) we are confident the
nature of the access doesn't affect the genration of a fault.
This means that if the access is atomic, we treat it as a read from an
ISS.WnR point of view.

Change-Id: Ia524aa6ae07f81513cdc26c516b5fd9b01a931c3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20981
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14278:45892d0d3e98 08-Sep-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: PSTATE.PAN affecting EL2 only when HCR_EL2.E2H=1

Change-Id: I6df0cdcbadca17f30d3de3bed887f75c739b00f0
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20979
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14172:bba55ff08279 16-Aug-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Replace occ of opModeToEL(currOpMode/cpsr) with currEL

Change-Id: I739a9be03ea5caa63540c62fd110eee86a058c4c
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20252
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14128:6ed23d07d0d1 28-Jul-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Implement ARMv8.1-PAN, Privileged access never

ARMv8.1-PAN adds a new bit to PSTATE. When the value of this PAN state
bit is 1, any privileged data access from EL1 or EL2 to a virtual memory
address that is accessible at EL0 generates a Permission fault.
This feature is mandatory in ARMv8.1 implementations.
This feature is supported in AArch64 and AArch32 states.
The ID_AA64MMFR1_EL1.PAN, ID_MMFR3_EL1.PAN, and ID_MMFR3.PAN fields
identify the support for ARMv8.1-PAN.

Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: I94a76311711739dd2394c72944d88ba9321fd159
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19729
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14088:8de55a7aa53b 01-May-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Use ExceptionLevel type in TlbEntry

Replacing uint8_t with ExceptionLevel type in the arm TlbEntry. The
variable is representing the translation regime it is targeting.

Change-Id: Ifcd6e86c5d73f752e8476a2b7fda9ea74a0c7a3b
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19488
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14066:c29e36d24f68 16-Nov-2018 Anouk Van Laer <anouk.vanlaer@arm.com>

arch, arm: Update miscRegs in getTE

Normally, a translation will start via translateTiming/functional
which will check if the miscRegs have been updated and if so,
will update the TLB state accordingly. However, in a 2 stage
system, if there is a hit in stage 1, the resulting IPA will be
sent to the S2-TLB for translation via a getTE() function call
(via the stage2_lookup object). This will cause the state of the
S2-TLB to be out of sync.

Change-Id: I117e4032fc76d7d31f4f999887b5573a7e5811e6
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/14995
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>


# 13968:55d001a9732d 14-May-2019 Javier Bueno <javier.bueno@metempsy.com>

arch-arm: Do not check MustBeOne flag for TLB requests from the prefetcher

Allow TLB requests generated from prefetchers to override the
MustBeOne arch flag. This allows the prefetchers to issue requests
without having to know architecutre-specific flags.

Change-Id: Id83e0c93f3d1a614da11c4f344ab4dc594423672
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18768
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13889:b329d40d4e78 01-Apr-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: updateMiscReg not setting isHyp in aarch64

The isHyp flag should be set for a TLB::NormalTran when in EL2. This
was happening in aarch32 only, where the CPSR mode is checked, while
aarch64 was only using it for explicit EL2 translations, like for AT
instructions.

Change-Id: I54605811e9dde75b5cf8868190b0f4c2a8d46570
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18394
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13882:03fe9a85b435 10-Apr-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Remove un-needed hyp flag in TLBI operations

The hyp flag was probably a legacy pre-v8 flag distinguishing
invalidation targeting PL2 translation regime (hyp mode).
Since the introduction of target_el parameter, hyp boolean is not needed
anymore. The patch works by setting the hyp flag in the flush* methods
in the TLB automatically by checking if target_el == EL2.

Change-Id: I798009e09ff24a383dea871e348188bae2685e8e
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jan-Peter Larsson <jan-peter.larsson@arm.com>
Reviewed-by: Anouk Van Laer <anouk.vanlaer@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18389
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13795:e21c61d9efb8 19-Mar-2019 Andrea Mondelli <Andrea.Mondelli@ucf.edu>

dev-arm: ambiguous use of getPort()

The recent introduction of getPort() creates a conflict with
the existing method used in arm MMU.

This patch rename the old getPort() in getDMAPort() according
to the returned value (DmaPort class type)

Change-Id: Ief3d83650fd6b08490522341631244be06e380ce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17469
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13784:1941dc118243 07-Mar-2019 Gabe Black <gabeblack@google.com>

arch, cpu, dev, gpu, mem, sim, python: start using getPort.

Replace the getMasterPort, getSlavePort, and getEthPort functions
with getPort, and remove extraneous mechanisms that are no longer
necessary.

Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13741:d994984b842a 22-Feb-2019 Andrea Mondelli <Andrea.Mondelli@ucf.edu>

mem-cache: alias to mem::getMasterPort in TLB class

TLB:getMasterPort is used to obtain the PageWalkMasterPort if present and
hides the BaseTLB::getMasterPort().

The TLB::getMasterPort() is renamed according to the expected behavior.

Change-Id: If4f61189094a706d59805cd10f4f814e5830eda8
Reviewed-on: https://gem5-review.googlesource.com/c/16648
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13453:4a7a060ea26e 10-Feb-2017 Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com>

cpu,arch-arm: Initialise data members

The value that is not initialized has a bogus value that manifests when
using some debug-flags what makes the usage of tracediff a bit more
challenging.

In addition, while debugging with other techniques, it introduces the
problem of understanding if the value of a field is 'intended' or just
an effect of the lack of initialisation.

Change-Id: Ied88caa77479c6f1d5166d80d1a1a057503cb106
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13125
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 13374:b7f652df5e5b 19-Oct-2018 Anouk Van Laer <anouk.vanlaer@arm.com>

arch, arm: Effect of AT instructions on descriptor handling

Some address translation instructions will stop translation after
the 1st stage and intercept the IPA, even in the presence of
stage 2 (eg AT S1E1). However, in the case of a TLB miss, the
table descriptors still need to be translated from IPA to PA to
avoid fetching the wrong addresses. This commit splits whether
IPA->PA translation is required for the VA and/or for the table
descriptors.

Change-Id: Ie53cdc00585f116150256f1d833460931b3bfb7d
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13781
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12763:37c243ed1112 29-May-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Add Illegal Execution flag to PCState

This patch moves the detection of the Illegal Execution flag (PSTATE.IL)
from the tlb translation stage (fetch) to the decoding stage. This is
done by adding the illegalExecution field to the PCState.

Change-Id: I9c1c4e9c6bd5ded905c1d56b3034e4e9322582fa
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10813
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12749:223c83ed9979 04-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Using smart pointers for memory Requests

This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12735:e3da526a0654 16-May-2018 Andreas Sandberg <andreas.sandberg@arm.com>

arch-arm: Respect EL from translation type

There are cases where instructions request translations in the context
of a lower EL. This is currently not respected in the TLB and the page
table walker. Fix that.

Change-Id: Icd59657a1ecfd8bd75a001bb1a4e41a6f4808a36
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10506
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>


# 12528:a9960d039c29 23-Jan-2018 Chuan Zhu <chuan.zhu@arm.com>

arch-arm: Fix syntax error in TLB::getResultTe

The patch fixes one syntax error in TLB::getResultTe

Change-Id: I31a72a52d5c03f43929a69ca1be61d9c20e76f5b
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7983
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12506:aed554105426 04-Jan-2018 Nikos Nikoleris <nikos.nikoleris@arm.com>

arch-arm: Check cache maintenance insts for permission faults

In AArch32, data cache maintenance instructions that operate by VA do
not generate permission faults.

In AArch64, a data cache invalidate instruction can generate a
permission fault when there are no write permissions to the specified
VA. Data cache clean and data cache clean and invalidate instructions
do not generate permission faults.

Checks for external aborts are also bypassed for data cache
maintenance instructions.

Change-Id: Iea5bc665e4cf66d528e36b671535b66637c4b224
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7827
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12499:b81688796004 09-Jan-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

arch-arm: Change function name for banked miscregs

This commit changes the function's name used for retrieving the index of a
security banked register given the flatten index. This will avoid confusion
with flattenRegId, which has a different purpose.

Change-Id: I470ffb55916cb7fc9f78e071a7f2e609c1829f1a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7982
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12406:86bde4a026b5 22-Dec-2017 Gabe Black <gabeblack@google.com>

arch,cpu: "virtualize" the TLB interface.

CPUs have historically instantiated the architecture specific version
of the TLBs to avoid a virtual function call, making them a little bit
more dependent on what the current ISA is. Some simple performance
measurement, the x86 twolf regression on the atomic CPU, shows that
there isn't actually any performance benefit, and if anything the
simulator goes slightly faster (although still within margin of error)
when the TLB functions are virtual.

This change switches everything outside of the architectures themselves
to use the generic BaseTLB type, and then inside the ISA for them to
cast that to their architecture specific type to call into architecture
specific interfaces.

The ARM TLB needed the most adjustment since it was using non-standard
translation function signatures. Specifically, they all took an extra
"type" parameter which defaulted to normal, and translateTiming
returned a Fault. translateTiming actually doesn't need to return a
Fault because everywhere that consumed it just stored it into a
structure which it then deleted(?), and the fault is stored in the
Translation object when the translation is done.

A little more work is needed to fully obviate the arch/tlb.hh header,
so the TheISA::TLB type is still visible outside of the ISAs.
Specifically, the TlbEntry type is used in the generic PageTable which
lives in src/mem.

Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575
Reviewed-on: https://gem5-review.googlesource.com/6921
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12356:e56e838c47cb 20-Feb-2017 Nikos Nikoleris <nikos.nikoleris@arm.com>

arm: Add CMO support for Non-Cacheable memory

Cache Maintainance operations to the point of coherence are treated as
normal cahceable requests and clean and/or invalidate the caches of
all PEs.

Change-Id: Ia4a749c2318fe29c8601848b034b8315c4186c8a
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5056
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12005:f4b9607db0af 16-Feb-2016 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Add support for memory-mapped m5ops

Add support for a memory mapped m5op interface. When enabled, the TLB
intercepts accesses in the 64KiB region designated by the
ArmTLB.m5ops_base parameter. An access to this range maps to a
specific m5op call. The upper 8 bits of the offset into the range
denote the m5op function to call and the lower 8 bits denote the
subfunction.

Change-Id: I55fd8ac1afef4c3cc423b973870c9fe600a843a2
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2964


# 11861:9684637f3339 21-Feb-2017 Nikos Nikoleris <nikos.nikoleris@arm.com>

arm: Blame the right instruction address on a Prefetch Abort

CPU models (e.g., O3CPU) issue instruction fetches for the whole cache
block rather than a specific instruction. Consequently the TLB lookups
translate the cache block virtual address. When the TLB lookup fails,
however, the Prefetch Abort must be raised for the PC of the
instruction that caused the fault rather than for the address of the
block.

This change fixes the way we instantiate the PrefetchAbort faults to
use the PC of the request rather the address of the instruction fetch
request.

Change-Id: I8e45549da1c3be55ad204a060029c95ce822a851
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Rekai Gonzalez Alberquilla <rekai.gonzalezalberquilla@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11793:ef606668d247 09-Nov-2016 Brandon Potter <brandon.potter@amd.com>

style: [patch 1/22] use /r/3648/ to reorganize includes


# 11608:6319a1125f1c 14-Aug-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

cpu, arch: fix the type used for the request flags

Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11584:bbd8448f104e 02-Aug-2016 Dylan Johnson <Dylan.Johnson@ARM.com>

arm: Add TLBI instruction for stage 2 IPA's

This patch adds support for stage 2 TLBI instructions
such as TLBI IPAS2E1_Xt.

Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88


# 11580:afe051c345e9 02-Aug-2016 Dylan Johnson <Dylan.Johnson@ARM.com>

arm: Fix stage 2 determination in table walker

We recompute if we are doing a stage 2 walk inside of the table walker
but we have already figured it out in the tlb. Pass the information in
to the walk instead of recomputing it.

Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401


# 11577:a26a328c20eb 02-Aug-2016 Dylan Johnson <Dylan.Johnson@ARM.com>

arm: Fix EL perceived at TLB for address translation instructions

During address translation instructions (such as AT S1E1R_Xt) the exception
level can be different than the current exception level. This patch fixes
how the TLB determines what EL to use during these instructions.

Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f


# 11575:0005b28685f0 02-Aug-2016 Dylan Johnson <Dylan.Johnson@ARM.com>

arm: add stage2 translation support

Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1


# 11560:f050b8cf4754 11-Jul-2016 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Don't consult the TLB test iface for functional translations

Don't consult the TLB test interface for PA's returned by functional
translations by the AT instruction. We implement this by chaning the
ISA code to synthesize 0-length functional reads for the TLB lookup.
The TLB then bypasses the final PA check in the tester if the size is
zero.

Change-Id: I2487b7f829cea88c37e229e9fc7a4543aced961b
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>


# 11522:348411ec525a 06-Jun-2016 Stephan Diestelhorst <stephan.diestelhorst@arm.com>

sim: Call regStats of base-class as well

We want to extend the stats of objects hierarchically and thus it is necessary
to register the statistics of the base-class(es), as well. For now, these are
empty, but generic stats will be added there.

Patch originally provided by Akash Bagdia at ARM Ltd.


# 11517:54230f1ebef2 02-Jun-2016 Curtis Dunham <Curtis.Dunham@arm.com>

arm: refactor page table format determination

In particular, when EL0 is in AArch32 but EL1 is AArch64, AArch64
memory translation must be used. This is essential for typical
AArch64/32 interworking use cases.


# 11505:55256a05d9e9 30-May-2016 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Correctly check translation mode (aarch64/aarch32)

According to the ARM ARM (see AArch32.TranslateAddress in the
pseudocode library), the TLB should be operating in aarch64 mode if
the EL0 is aarch32 and EL1 is aarch64. This is currently not the case
in gem5, which breaks 64/32 interprocessing. Update the check to match
the reference manual.

Change-Id: I6f1444d57c0e2eb5f8880f513f33a9197b7cb2ce
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 11495:1f04f97c014d 26-May-2016 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Fix incorrect TLB permission check in aarch32

The TLB currently assumes that the pxn bit in an LPAE page descriptor
disables execution from unprivileged mode. However, according to the
architecture manual, this bit should disable execution from privileged
modes. Update the TLB implementation to reflect this behavior.

Change-Id: I7f1bb232d7a94a93fd601a9230223195ac952947
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 11395:032bc62120eb 21-Mar-2016 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Refactor the TLB test interface

Refactor the TLB and page table walker test interface to use a dynamic
registration mechanism. Instead of patching a couple of empty methods
to wire up a TLB tester, this change allows such testers to register
themselves using the setTestInterface() method.


# 11321:02e930db812d 06-Feb-2016 Steve Reinhardt <steve.reinhardt@amd.com>

style: fix missing spaces in control statements

Result of running 'hg m5style --skip-all --fix-control -a'.


# 11152:11da02681277 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

arm: Change TLB Software Caching

In ARM, certain variables are only updated when a necessary change is
detected. Having 2 SMT threads share a TLB resulted in these not being
updated as required. This patch adds a thread context identifer to
assist in the invalidation of these variables.


# 11055:54071fd5c397 21-Aug-2015 Andreas Hansson <andreas.hansson@arm.com>

arm, mem: Remove unused CLEAR_LL request flag

Cleaning up dead code. The CLREX stores zero directly to
MISCREG_LOCKFLAG and so the request flag is no longer needed. The
corresponding functionality in the cache tags is also removed.


# 10905:a6ca6831e775 07-Jul-2015 Andreas Sandberg <andreas.sandberg@arm.com>

sim: Refactor the serialization base class

Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:

* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.

* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.

* Add a scoped ScopedCheckpointSection helper class. Some objects
need to serialize data structures, that are not deriving from
Serializable, into subsections. Previously, this was done using
nameOut() and manual section name generation. To simplify this,
this changeset introduces a ScopedCheckpointSection() helper
class. When this class is instantiated, it adds a new /subsection/
and subsequent serialization calls during the lifetime of this
helper class happen inside this section (or a subsection in case
of nested sections).

* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.

* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.


# 10873:7c972b9aea16 21-Jun-2015 Andreas Sandberg <andreas.sandberg@arm.com>

arm: Cleanup arch headers to remove dma_device.hh dependency

Break the dependency on dma_device.hh by forward-declaring DmaPort in
the relevant header.


# 10854:f449d6f8a647 26-May-2015 Nathanael Premillieu <Nathanael.Premillieu@arm.com>

arm: Make address translation faster with better caching

This patch adds better caching of the sys regs for AArch64, thus
avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the
non-faulting case.


# 10825:5d059b8ed8a4 05-May-2015 Andreas Sandberg <Andreas.Sandberg@ARM.com>

arm: Relax ordering for some uncacheable accesses

We currently assume that all uncacheable memory accesses are strictly
ordered. Instead of always enforcing strict ordering, we now only
enforce it if the required memory type is device memory or strongly
ordered memory.


# 10824:308771bd2647 05-May-2015 Andreas Sandberg <Andreas.Sandberg@ARM.com>

mem, cpu: Add a separate flag for strictly ordered memory

The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.

This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:

* Speculation: CPU models are not allowed to issue speculative
loads.

* Write combining: CPU models and caches are not allowed to merge
writes to the same cache line.

Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.


# 10822:d259f2bc2b31 05-May-2015 Andreas Hansson <andreas.hansson@arm.com>

arm: Remove unnecessary boot uncachability

With the recent patches addressing how we deal with uncacheable
accesses there is no longer need for the work arounds put in place to
enforce certain sections of memory to be uncacheable during boot.


# 10717:4f8c1bd6fdb8 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

arm: Share a port for the two table walker objects

This patch changes how the MMU and table walkers are created such that
a single port is used to connect the MMU and the TLBs to the memory
system. Previously two ports were needed as there are two table walker
objects (stage one and stage two), and they both had a port. Now the
port itself is moved to the Stage2MMU, and each TableWalker is simply
using the port from the parent.

By using the same port we also remove the need for having an
additional crossbar joining the two ports before the walker cache or
the L2. This simplifies the creation of the CPU cache topology in
BaseCPU.py considerably. Moreover, for naming and symmetry reasons,
the TLB walker port is connected through the stage-one table walker
thus making the naming identical to x86. Along the same line, we use
the stage-one table walker to generate the master id that is used by
all TLB-related requests.


# 10611:3bba9f2d0c7d 23-Dec-2014 Andreas Sandberg <Andreas.Sandberg@ARM.com>

arm: Raise an alignment fault if a PC has illegal alignment

We currently don't handle unaligned PCs correctly. There is one check
for unaligned PCs in the TLB when running in aarch64 mode, but this
check does not cover cases where the CPU does not do a TLB lookup when
decoding an instruction (e.g., a branch stays within the same cache
line). Additionally, the Decoder class sometimes throws an assertion
for unaligned PCs which breaks speculation.

This changeset introduces a decoder fault bit field in the ExtMachInst
structure. This field can be used to signal a decoder failure. If set,
the decoder generates an internal gem5fault instruction instead of a
normal instruction. This instruction in turns either panics (fault
type PANIC), returns an PCAlignmentFault (fault type UNALIGNED,
aarch64) or PrefetchAbort (fault type UNALIGNED, aarch32).

The patch causes minor changes to the realview64 regressions, and a
stats bump will follow.


# 10537:47fe87b0cf97 14-Nov-2014 Andreas Hansson <andreas.hansson@arm.com>

arm: Fixes based on UBSan and static analysis

Another churn to clean up undefined behaviour, mostly ARM, but some
parts also touching the generic part of the code base.

Most of the fixes are simply ensuring that proper intialisation. One
of the more subtle changes is the return type of the sign-extension,
which is changed to uint64_t. This is to avoid shifting negative
values (undefined behaviour) in the ISA code.


# 10508:aa46a8ae3487 30-Oct-2014 Ali Saidi <Ali.Saidi@ARM.com>

arm: Fix multi-system AArch64 boot w/caches.

Automatically extract cpu release address from DTB file.
Check SCTLR_EL1 to verify all caches are enabled.


# 10474:799c8ee4ecba 16-Oct-2014 Andreas Hansson <andreas.hansson@arm.com>

arch: Use shared_ptr for all Faults

This patch takes quite a large step in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr by adopting its use for all
Faults. There are no changes in behaviour, and the code modifications
are mostly just replacing "new" with "make_shared".


# 10463:25c5da51bbe0 16-Oct-2014 Andreas Sandberg <Andreas.Sandberg@ARM.com>

arm: Add TLB PMU probes

This changeset adds probe points that can be used to implement PMU
counters for TLB stats. The following probes are supported:

* ArmISA::TLB::ppRefills / TLB Refills (TLB insertions)


# 10418:7a76e13f0101 27-Sep-2014 Andreas Hansson <andreas.hansson@arm.com>

arm: Fixed undefined behaviours identified by gcc

This patch fixes the runtime errors highlighted by the undefined
behaviour sanitizer. In the end there were two issues. First, when
rotating an immediate, we ended up shifting an uint32_t by 32 in some
cases. This case is fixed by checking for a rotation by 0
positions. Second, the Mrc15 and Mcr15 are operating on an IntReg and
a MiscReg, but we used the type RegRegImmOp and passed a MiscRegIndex
as an IntRegIndex. This issue is resolved by introducing a
MiscRegRegImmOp and RegMiscRegImmOp with the appropriate types.

With these fixes there are no runtime errors identified for the full
ARM regressions.


# 10367:bf52480abd01 12-Sep-2014 Andrew Bardsley <Andrew.Bardsley@arm.com>

style: Fix line continuation, especially in debug messages

This patch closes a number of space gaps in debug messages caused by
the incorrect use of line continuation within strings. (There's also
one consistency change to a similar, but correct, use of line
continuation)


# 10194:e6d2e8083d9c 09-May-2014 Geoffrey Blake <Geoffrey.Blake@arm.com>

arch, arm: Preserve TLB bootUncacheability when switching CPUs

The ARM TLBs have a bootUncacheability flag used to make some loads
and stores become uncacheable when booting in FS mode. Later the
flag is cleared to let those loads and stores operate as normal. When
doing a takeOverFrom(), this flag's state is not preserved and is
momentarily reset until the CPSR is touched. On single core runs this
is a non-issue. On multi-core runs this can lead to crashes on the O3
CPU model from the following series of events:
1) takeOverFrom executed to switch from Atomic -> O3
2) All bootUncacheability flags are reset to true
3) Core2 tries to execute a load covered by bootUncacheability, it
is flagged as uncacheable
4) Core2's load needs to replay due to a pipeline flush
3) Core1 core does an action on CPSR
4) The handling code for CPSR then checks all other cores
to determine if bootUncacheability can be set to false
5) Asynchronously set bootUncacheability on all cores to false
6) Core2 replays load previously set as uncacheable and notices
it is now flagged as cacheable, leads to a panic.
This patch implements takeOverFrom() functionality for the ARM TLBs
to preserve flag values when switching from atomic -> detailed.


# 10037:5cac77888310 24-Jan-2014 ARM gem5 Developers

arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black


# 10024:fc10e1f9f124 24-Jan-2014 Dam Sunwoo <dam.sunwoo@arm.com>

mem: per-thread cache occupancy and per-block ages

This patch enables tracking of cache occupancy per thread along with
ages (in buckets) per cache blocks. Cache occupancy stats are
recalculated on each stat dump.


# 9950:4b7f60080149 31-Oct-2013 Prakash Ramrakhyani <prakash.ramrakhyani@arm.com>

mem: Add privilege info to request class

This patch adds a flag in the request class that indicates if the request
was made in privileged mode.


# 9738:304a37519d11 03-Jun-2013 Andreas Sandberg <andreas@sandberg.pp.se>

arch: Create a method to finalize physical addresses
in the TLB

Some architectures (currently only x86) require some fixing-up of
physical addresses after a normal address translation. This is usually
to remap devices such as the APIC, but could be used for other memory
mapped devices as well. When running the CPU in a using hardware
virtualization, we still need to do these address fix-ups before
inserting the request into the memory system. This patch moves this
patch allows that code to be used by such CPUs without doing full
address translations.


# 9535:508aebb47ca6 15-Feb-2013 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com>

arm: fix a page table walker issue where a page could be translated multiple times

If multiple memory operations to the same page are miss the TLB they are
all inserted into the page table queue and before this change could result
in multiple uncessesary walks as well as duplicate enteries being inserted
into the TLB.


# 9439:fce94f92ea0f 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

arm: Invalidate cached TLB configuration in drainResume

Currently, we invalidate the cached miscregs in
TLB::unserialize(). The intended use of the drainResume() method is to
invalidate cached state and prepare the system to resume after a CPU
handover or (un)serialization. This patch moves the TLB miscregs
invalidation code to the drainResume() method to avoid surprising
behavior.


# 9294:8fb03b13de02 15-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Add protocol-agnostic ports in the port hierarchy

This patch adds an additional level of ports in the inheritance
hierarchy, separating out the protocol-specific and protocl-agnostic
parts. All the functionality related to the binding of ports is now
confined to use BaseMaster/BaseSlavePorts, and all the
protocol-specific parts stay in the Master/SlavePort. In the future it
will be possible to add other protocol-specific implementations.

The functions used in the binding of ports, i.e. getMaster/SlavePort
now use the base classes, and the index parameter is updated to use
the PortID typedef with the symbolic InvalidPortID as the default.


# 8922:17f037ad8918 30-Mar-2012 William Wang <william.wang@arm.com>

MEM: Introduce the master/slave port sub-classes in C++

This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.


# 8809:bb10807da889 01-Feb-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with head, hopefully the last time for this batch.


# 8806:669e93d79ed9 29-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Implement Ali's review feedback.

Try to decrease indentation, and remove some redundant FullSystem checks.


# 8782:10c9297e14d5 02-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Get rid of FULL_SYSTEM in the ARM ISA.


# 8756:cce8cf3906ca 16-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

ARM: Turn on the page table walker on ARM in SE mode.


# 8733:64a7bf8fa56c 31-Jan-2012 Geoffrey Blake <geoffrey.blake@arm.com>

CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5

Brings the CheckerCPU back to life to allow FS and SE checking of the
O3CPU. These changes have only been tested with the ARM ISA. Other
ISAs potentially require modification.


# 8552:f51e3dce9521 13-Sep-2011 Daniel Johnson <daniel.johnson@arm.com>

ARM: update TLB to set request packet ASID field


# 8527:6bac5b04d588 19-Aug-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Mark some variables uncacheable until boot all CPUs are enabled.

There are a set of locations is the linux kernel that are managed via
cache maintence instructions until all processors enable their MMUs & TLBs.
Writes to these locations are manually flushed from the cache to main
memory when the occur so that cores operating without their MMU enabled
and only issuing uncached accesses can receive the correct data. Unfortuantely,
gem5 doesn't support any kind of software directed maintence of the cache.
Until such time as that support exists this patch marks the specific cache blocks
that need to be coherent as non-cacheable until all CPUs enable their MMU and
thus allows gem5 to boot MP systems with caches enabled (a requirement for
booting an O3 cpu and thus an O3 CPU regression).


# 8353:237ea6324ece 16-Jun-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Handle case where new TLB size is different from previous TLB size.

After a checkpoint we need to make sure that we restore the right
number of entries.


# 8352:9a3c002dab3e 16-Jun-2011 Chander Sudanthi <Chander.Sudanthi@ARM.com>

ARM: Fix memset on TLB flush and initialization

Instead of clearing the entire TLB on initialization and flush, the code was
clearing only one element. This patch corrects the memsets in the init and
flush routines.


# 8232:b28d06a175be 15-Apr-2011 Nathan Binkert <nate@binkert.org>

trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help


# 8202:1b63e9afeafc 04-Apr-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Fix table walk going on while ASID changes error


# 8067:21f14583aa6a 23-Feb-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Fix bug that let two table walks occur in parallel.


# 7944:1daf51f62013 11-Feb-2011 Giacomo Gabrielli <Giacomo.Gabrielli@arm.com>

O3: Enhance data address translation by supporting hardware page table walkers.

Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs,
when a TLB miss occurs, initiateTranslation() can return with NoFault but with
the translation unfinished.

Instructions experiencing a delayed translation due to a hardware page table
walk are deferred until the translation completes and kept into the IQ. In
order to keep track of them, the IQ has been augmented with a queue of the
outstanding delayed memory instructions. When their translation completes,
instructions are re-executed (only their initiateAccess() was already
executed; their DTB translation is now skipped). The IEW stage has been
modified to support such a 2-pass execution.


# 7850:02450f4443ce 18-Jan-2011 Matt Horsnell <Matt.Horsnell@arm.com>

O3: Fixes the way prefetches are handled inside the iew unit.

This patch prevents the prefetch being added to the instCommit queue twice.


# 7781:a9f9eed35b18 07-Dec-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Support switchover with hardware table walkers


# 7749:859e8bc1cdc2 15-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Cache the misc regs at the TLB to limit readMiscReg() calls.


# 7734:85a8198aa2ff 08-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Add some TLB statistics for ARM


# 7733:08d6a773d1b6 08-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Add checkpointing support


# 7720:65d338a8dba4 31-Oct-2010 Gabe Black <gblack@eecs.umich.edu>

ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.



This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.


# 7705:fd65f85fcc0c 13-Oct-2010 Gabe Black <gblack@eecs.umich.edu>

Mem: Change the CLREX flag to CLEAR_LL.

CLREX is the name of an ARM instruction, not a name for this generic flag.


# 7697:05b1a077977b 01-Oct-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Make the TLB a little bit faster by moving most recently used items to front of list


# 7694:de057cccee82 01-Oct-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Implement functional virtual to physical address translation
for debugging and program introspection.


# 7612:917946898102 23-Aug-2010 Gene Wu <Gene.Wu@arm.com>

MEM: Make CLREX a first class request operation and clear locks in caches when it in received


# 7611:c119da5a80c8 23-Aug-2010 Gene Wu <Gene.Wu@arm.com>

ARM: Make sure that software prefetch instructions can't change the state of the TLB


# 7608:17aabeaa1a8f 23-Aug-2010 Gene Wu <Gene.Wu@arm.com>

ARM: Fix Uncachable TLB requests and decoding of xn bit


# 7606:c0d90ba69082 23-Aug-2010 Gene Wu <Gene.Wu@arm.com>

ARM: For non-cachable accesses set the UNCACHABLE flag


# 7603:66d853e566d2 23-Aug-2010 Gene Wu <Gene.Wu@arm.com>

ARM: Implement CLREX


# 7461:5a07045d0af2 15-Jun-2010 Nathan Binkert <nate@binkert.org>

stats: only consider a formula initialized if there is a formula


# 7439:b4c6b2532bbf 02-Jun-2010 Dam Sunwoo <dam.sunwoo@arm.com>

ARM: Allow multiple outstanding TLB walks to queue.


# 7438:8e4b37136330 02-Jun-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM TLB: Fix bug in memAttrs getting a bogus thread context


# 7437:5853fbdfba9b 02-Jun-2010 Dam Sunwoo <dam.sunwoo@arm.com>

ARM: Support table walks in timing mode.


# 7436:b578349f9371 02-Jun-2010 Dam Sunwoo <dam.sunwoo@arm.com>

ARM: Added support for Access Flag and some CP15 regs (V2PCWPR, V2PCWPW, V2PCWUR, V2PCWUW,...)


# 7406:ddc26bd4ea7d 02-Jun-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Some TLB bug fixes.


# 7404:bfc74724914e 02-Jun-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Implement the ARM TLB/Tablewalker. Needs performance improvements.


# 7399:a378ac1e1615 02-Jun-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Start over with translation from Alpha code as opposed to something that has cruft from 4 different ISAs.


# 7362:9ea92e0eb4a9 02-Jun-2010 Gabe Black <gblack@eecs.umich.edu>

ARM: Implement and update the DFSR and IFSR registers on faults.


# 7301:04d8488d0d36 02-Jun-2010 Gabe Black <gblack@eecs.umich.edu>

ARM: Warn about not implementing MPU translation, not panic about MMU.

We'll start out with a stbu version of PMSA and switch over to VMSA for the
full implementation.


# 7294:fda2c00880db 02-Jun-2010 Gabe Black <gblack@eecs.umich.edu>

ARM: Implement the V7 version of alignment checking.


# 7093:9832d4b070fc 02-Jun-2010 Gabe Black <gblack@eecs.umich.edu>

ARM: Track the current ISA mode using the PC.


# 6757:d86d3d6e5326 17-Nov-2009 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Boilerplate full-system code.


# 6428:9e35cdc95e81 02-Aug-2009 Steve Reinhardt <steve.reinhardt@amd.com>

Clean up some inconsistencies with Request flags.


# 6116:a5a97b04d796 21-Apr-2009 Nathan Binkert <nate@binkert.org>

arm: Unify the ARM tlb. We forgot about this when we did the rest.
This code compiles, but there are no tests still


# 6020:0647c8b31a99 06-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

Merge ARM into the head. ARM will compile but may not actually work.


# 6019:76890d8b28f5 05-Apr-2009 Stephen Hines <hines@cs.fsu.edu>

arm: add ARM support to M5