History log of /gem5/src/cpu/o3/thread_context_impl.hh
Revision Date Author Comments
# 14022:a7cdc33dab35 02-May-2019 Gabe Black <gabeblack@google.com>

cpu, sim: Return PortProxy &s from all the proxy accessors.

This is a step towards merging the accessors for SE and FS modes.

Change-Id: I76818ab88b97097ac363e243be9cc1911b283090
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18579
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13865:cca49fc49c57 13-Apr-2019 Gabe Black <gabeblack@google.com>

cpu: Eliminate the ProxyThreadContext class.

Replace it with direct inheritance from the ThreadContext class in the
SimpleThread class which was the only place it was used.

Also take the opportunity to use some specialized types instead of
ints, etc., add some consts, and fix some style issues.

Change-Id: I5d2cfa87b20dc43615e33e6755c9d016564e9c0e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18048
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13641:648f3106ebdf 02-Apr-2018 Tuan Ta <qtt2@cornell.edu>

cpu: fixed how O3 CPU executes an exit system call

When a thread executed an exit syscall in SE mode, the thread context
was removed immediately in the same cycle, which left inflight squash
operations and trap event incomplete. The problem happened when a new
thread was assigned to the CPU later. The new thread started with some
incomplete transactions of the previous thread (e.g., squashing). This
problem could cause incorrect execution flow for the new thread (i.e.,
pc was not reset properly at the exit point), deadlock (i.e., some
stage-to-stage signals were not reset) and incorrect rename map between
logical and physical registers.

This patch adds a new state called 'Halting' to the thread context and
defers removing thread context from a CPU until a trap event initiated
by an exit syscall execution is processed. This patch also makes sure
that the removal of a thread context happens after all inflight
transactions of the to-be-removed thread in the pipeline complete.

Change-Id: If7ef1462fb8864e22b45371ee7ae67e2a5ad38b8
Reviewed-on: https://gem5-review.googlesource.com/c/8184
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 13622:ba31c2a23eca 21-Nov-2018 Gabe Black <gabeblack@google.com>

cpu, arch: Replace the CCReg type with RegVal.

Most architectures weren't using the CCReg type, and in x86 and arm
it was already a uint64_t.

Change-Id: I0b3d5e690e6b31db6f2627f449c89bde0f6750a6
Reviewed-on: https://gem5-review.googlesource.com/c/14515
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13611:c8b7847b4171 19-Nov-2018 Gabe Black <gabeblack@google.com>

arch: cpu: Rename *FloatRegBits* to *FloatReg*.

Now that there's no plain FloatReg, there's no reason to distinguish
FloatRegBits with a special suffix since it's the only way to read or
write FP registers.

Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded
Reviewed-on: https://gem5-review.googlesource.com/c/14460
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13610:5d5404ac6288 16-Oct-2018 Giacomo Gabrielli <giacomo.gabrielli@arm.com>

arch,cpu: Add vector predicate registers

Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector
Extension (SVE), introduce the notion of a predicate register file.
This changeset adds this feature across architectures and CPU models.

Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13715
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 13601:f5c84915eb7f 10-Jan-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

cpu, arch, arch-arm: Wire unused VecElem code in the O3 model

VecElem code had been introduced in order to simulate change of renaming
for vector registers. Most of the work is happening on the rename_map
switchRenameMode. Change of renaming can happen after a squash in the
pipeline.
This patch is also changing the interface to the ISA part so that
a PCState is used instead of ISA in order to check if rename mode
has changed.

Change-Id: I8af795d771b958e0a0d459abfeceff5f16b4b5d4
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15601


# 13582:989577bf6abc 18-Oct-2018 Gabe Black <gabeblack@google.com>

arch: cpu: Stop passing around misc registers by reference.

These values are all basic integers (specifically uint64_t now), and
so passing them by const & is actually less efficient since there's a
extra level of indirection and an extra value, and the same sized value
(a 64 bit pointer vs. a 64 bit int) is being passed around.

Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3
Reviewed-on: https://gem5-review.googlesource.com/c/13626
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 13557:fc33e6048b25 13-Oct-2018 Gabe Black <gabeblack@google.com>

cpu: dev: sim: gpu-compute: Banish some ISA specific register types.

These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are
some remaining types, specifically the vector registers and the CCReg.
I'm less familiar with these new types of registers, and so will look
at getting rid of them at some later time.

Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b
Reviewed-on: https://gem5-review.googlesource.com/c/13624
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13500:6e0a2a7c6d8c 19-Nov-2018 Gabe Black <gabeblack@google.com>

arch, cpu: Remove float type accessors.

Use the binary accessors instead.

Change-Id: Iff1877e92c79df02b3d13635391a8c2f025776a2
Reviewed-on: https://gem5-review.googlesource.com/c/14457
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12429:beefb9f5f551 09-Jan-2018 BKP <brandon.potter@amd.com>

style: change C/C++ source permissions to noexec

Several files in the repository were tracked with execute permissions
even though the files are just normal C/C++ files (and the one .isa).

Change-Id: I976b096acab4a1fc74c5699ef1f9b222c1e635c2
Reviewed-on: https://gem5-review.googlesource.com/7241
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12279:48bca1fee7a0 09-Oct-2017 Nikos Nikoleris <nikos.nikoleris@arm.com>

cpu-o3: Prevent cpu from suspending if it is already draining

Suspending the current thread context while draining due to a quiesce
pseudo instruction (for example a wfi instruction) could deadlock the
cpu and prevent it from successfully draining. This change ensures
that the cpu is not draining before suspending the thread context.

Change-Id: I7c019847f5a870d4bc9ce2b19936bc3dc45e5fd7
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5881
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12181:2150eff234c1 25-Aug-2017 Gabe Black <gabeblack@google.com>

stats: Get rid of some kernel stats related cruft.

The kernel stat mechanism should really be refactored and moved somewhere
else, but in the mean time there's some old cruft that can be cleared away.

Change-Id: I21e725de590dda0d20bf3bc675bbe976c7b1bd86
Reviewed-on: https://gem5-review.googlesource.com/4600
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12109:f29e9c5418aa 05-Apr-2017 Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com>

cpu: Added interface for vector reg file

This patch adds some more functionality to the cpu model and the arch to
interface with the vector register file.

This change consists mainly of augmenting ThreadContexts and ExecContexts
with calls to get/set full vectors, underlying microarchitectural elements
or lanes. Those are meant to interface with the vector register file. All
classes that implement this interface also get an appropriate implementation.

This requires implementing the vector register file for the different
models using the VecRegContainer class.

This change set also updates the Result abstraction to contemplate the
possibility of having a vector as result.

The changes also affect how the remote_gdb connection works.

There are some (nasty) side effects, such as the need to define dummy
numPhysVecRegs parameter values for architectures that do not implement
vector extensions.

Nathanael Premillieu's work with an increasing number of fixes and
improvements of mine.

Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues and CC reg free list initialisation ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2705


# 12106:7784fac1b159 05-Apr-2017 Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com>

cpu: Simplify the rename interface and use RegId

With the hierarchical RegId there are a lot of functions that are
redundant now.

The idea behind the simplification is that instead of having the regId,
telling which kind of register read/write/rename/lookup/etc. and then
the function panic_if'ing if the regId is not of the appropriate type,
we provide an interface that decides what kind of register to read
depending on the register type of the given regId.

Change-Id: I7d52e9e21fc01205ae365d86921a4ceb67a57178
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2702


# 10935:acd48ddd725f 28-Jul-2015 Nilay Vaish <nilay@cs.wisc.edu>

revert 5af8f40d8f2c


# 10934:5af8f40d8f2c 26-Jul-2015 Nilay Vaish <nilay@cs.wisc.edu>

cpu: implements vector registers

This adds a vector register type. The type is defined as a std::array of a
fixed number of uint64_ts. The isa_parser.py has been modified to parse vector
register operands and generate the required code. Different cpus have vector
register files now.


# 10407:a9023811bf9e 20-Sep-2014 Mitch Hayenga <mitch.hayenga@arm.com>

alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate

activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.


# 10033:21c14a2b2117 24-Jan-2014 Ali Saidi <Ali.Saidi@ARM.com>

arch, cpu: Add support for flattening misc register indexes.

With ARMv8 support the same misc register id results in accessing different
registers depending on the current mode of the processor. This patch adds
the same orthogonality to the misc register file as the others (int, float, cc).
For all the othre ISAs this is currently a null-implementation.

Additionally, a system variable is added to all the ISA objects.


# 9944:4ff1c5c6dcbc 17-Oct-2013 Matt Horsnell <matt.horsnell@ARM.com>

cpu: add consistent guarding to *_impl.hh files.


# 9920:028e4da64b42 15-Oct-2013 Yasuko Eckert <yasuko.eckert@amd.com>

cpu: add a condition-code register class

Add a third register class for condition codes,
in parallel with the integer and FP classes.
No ISAs use the CC class at this point though.


# 9478:ba80f7d4f452 22-Jan-2013 Nilay Vaish <nilay@cs.wisc.edu>

x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch
The changes made by the changeset 270c9a75e91f do not work well with switching
of cpus. The problem is that decoder for the old thread context holds state
that is not taken over by the new decoder.

This patch adds a takeOverFrom() function to Decoder class in each ISA. Except
for x86, functions in other ISAs are blank. For x86, the function copies state
from the old decoder to the new decoder.


# 9441:1133617844c8 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Fix broken thread context handover

The thread context handover code used to break when multiple handovers
were performed during the same quiesce period. Previously, the thread
contexts would assign the TC pointer in the old quiesce event to the
new TC. This obviously broke in cases where multiple switches were
performed within the same quiesce period, in which case the TC pointer
in the quiesce event would point to an old CPU.

The new implementation deschedules pending quiesce events in the old
TC and schedules a new quiesce event in the new TC. The code has been
refactored to remove most of the code duplication.


# 9436:4a0223da4924 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

o3 cpu: Remove unused variables


# 9428:029dfe6324d3 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Unify SimpleCPU and O3 CPU serialization code

The O3 CPU used to copy its thread context to a SimpleThread in order
to do serialization. This was a bit of a hack involving two static
SimpleThread instances and a magic constructor that was only used by
the O3 CPU.

This patch moves the ThreadContext serialization code into two global
procedures that, in addition to the normal serialization parameters,
take a ThreadContext reference as a parameter. This allows us to reuse
the serialization code in all ThreadContext implementations.


# 9426:0548b3e9734d 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Implement a flat register interface in thread contexts

Some architectures map registers differently depending on their mode
of operations. There is currently no architecture independent way of
accessing all registers. This patch introduces a flat register
interface to the ThreadContext class. This interface is useful, for
example, when serializing or copying thread contexts.


# 9384:877293183bdf 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@arm.com>

arch: Make the ISA class inherit from SimObject

The ISA class on stores the contents of ID registers on many
architectures. In order to make reset values of such registers
configurable, we make the class inherit from SimObject, which allows
us to use the normal generated parameter headers.

This patch introduces a Python helper method, BaseCPU.createThreads(),
which creates a set of ISAs for each of the threads in an SMT
system. Although it is currently only needed when creating
multi-threaded CPUs, it should always be called before instantiating
the system as this is an obvious place to configure ID registers
identifying a thread/CPU.


# 9382:1c97b57d5169 07-Jan-2013 Ali Saidi <Ali.Saidi@ARM.com>

cpu: rename the misleading inSyscall to noSquashFromTC

isSyscall was originally created because during handling of a syscall in SE
mode the threadcontext had to be updated. However, in many places this is used
in FS mode (e.g. fault handlers) and the name doesn't make much sense. The
boolean actually stops gem5 from squashing speculative and non-committed state
when a write to a threadcontext happens, so re-name the variable to something
more appropriate


# 9180:ee8d7a51651d 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Add a Cycles wrapper class and use where applicable

This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.

Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.

This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.

In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.

An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.


# 8887:20ea02da9c53 09-Mar-2012 Geoffrey Blake <geoffrey.blake@arm.com>

CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable

Enables the CheckerCPU to be selected at runtime with the --checker option
from the configs/example/fs.py and configs/example/se.py configuration
files. Also merges with the SE/FS changes.


# 8852:c744483edfcf 24-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Make port proxies use references rather than pointers

This patch is adding a clearer design intent to all objects that would
not be complete without a port proxy by making the proxies members
rathen than dynamically allocated. In essence, if NULL would not be a
valid value for the proxy, then we avoid using a pointer to make this
clear.

The same approach is used for the methods using these proxies, such as
loadSections, that now use references rather than pointers to better
reflect the fact that NULL would not be an acceptable value (in fact
the code would break and that is how this patch started out).

Overall the concept of "using a reference to express unconditional
composition where a NULL pointer is never valid" could be done on a
much broader scale throughout the code base, but for now it is only
done in the locations affected by the proxies.


# 8809:bb10807da889 01-Feb-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with head, hopefully the last time for this batch.


# 8799:dac1e33e07b0 28-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with the main repo.


# 8793:5f25086326ac 18-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Get rid of FULL_SYSTEM in the CPU directory.


# 8777:dd43f1c9fa0a 31-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Make the functions available from the TC consistent between SE and FS.


# 8767:e575781f71b8 30-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs.


# 8761:20322354b80b 16-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Build/expose vport in SE mode.


# 8733:64a7bf8fa56c 31-Jan-2012 Geoffrey Blake <geoffrey.blake@arm.com>

CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5

Brings the CheckerCPU back to life to allow FS and SE checking of the
O3CPU. These changes have only been tested with the ARM ISA. Other
ISAs potentially require modification.


# 8706:b1838faf3bcc 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Add port proxies instead of non-structural ports

Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy


# 8518:9c87727099ce 19-Aug-2011 Geoffrey Blake <geoffrey.blake@arm.com>

Fix bugs due to interaction between SEV instructions and O3 pipeline

SEV instructions were originally implemented to cause asynchronous squashes
via the generateTCSquash() function in the O3 pipeline when updating the
SEV_MAILBOX miscReg. This caused race conditions between CPUs in an MP system
that would lead to a pipeline either going inactive indefinitely or not being
able to commit squashed instructions. Fixed SEV instructions to behave like
interrupts and cause synchronous sqaushes inside the pipeline, eliminating
the race conditions. Also fixed up the semantics of the WFE instruction to
behave as documented in the ARMv7 ISA description to not sleep if SEV_MAILBOX=1
or unmasked interrupts are pending.


# 8232:b28d06a175be 15-Apr-2011 Nathan Binkert <nate@binkert.org>

trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help


# 8208:45331a355c38 04-Apr-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Fix checkpoint restoration into O3 CPU and the way O3 switchCpu works.

This change fixes a small bug in the arm copyRegs() code where some registers
wouldn't be copied if the processor was in a mode other than MODE_USER.
Additionally, this change simplifies the way the O3 switchCpu code works by
utilizing TheISA::copyRegs() to copy the required context information
rather than the adhoc copying that goes on in the CPU model. The current code
makes assumptions about the visibility of int and float registers that aren't
true for all architectures in FS mode.


# 7823:dac01f14f20f 08-Jan-2011 Steve Reinhardt <steve.reinhardt@amd.com>

Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.


# 7763:ff2213d13e58 15-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

O3: reset architetural state by calling clear()


# 7720:65d338a8dba4 31-Oct-2010 Gabe Black <gblack@eecs.umich.edu>

ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.



This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.


# 7679:f26cc2c68b48 14-Sep-2010 Gabe Black <gblack@eecs.umich.edu>

CPU: Get rid of the now unnecessary getInst/setInst family of functions.

This code is no longer needed because of the preceeding change which adds a
StaticInstPtr parameter to the fault's invoke method, obviating the only use
for this pair of functions.


# 7467:91994f36de7f 22-Jun-2010 Timothy M. Jones <tjones1@inf.ed.ac.uk>

O3ThreadContext: When taking over from a previous context, only assert that
the system pointers match in Full System mode.


# 6658:f4de76601762 23-Sep-2009 Nathan Binkert <nate@binkert.org>

arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh


# 6329:5d8b91875859 09-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

Registers: Add a registers.hh file as an ISA switched header.
This file is for register indices, Num* constants, and register types.
copyRegs and copyMiscRegs were moved to utility.hh and utility.cc.


# 6314:781969fbeca9 09-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

Registers: Get rid of the float register width parameter.


# 6313:95f69a436c82 09-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

Registers: Add an ISA object which replaces the MiscRegFile.
This object encapsulates (or will eventually) the identity and characteristics
of the ISA in the CPU.


# 6221:58a3c04e6344 26-May-2009 Nathan Binkert <nate@binkert.org>

types: add a type for thread IDs and try to use it everywhere


# 6029:007c36616f47 15-Apr-2009 Steve Reinhardt <steve.reinhardt@amd.com>

Get rid of the Unallocated thread context state.
Basically merge it in with Halted.
Also had to get rid of a few other functions that
called ThreadContext::deallocate(), including:
- InOrderCPU's setThreadRescheduleCondition.
- ThreadContext::exit(). This function was there to avoid terminating
simulation when one thread out of a multi-thread workload exits, but we
need to find a better (non-cpu-centric) way.


# 5958:2d9737bf3c2f 27-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

Processes: Make getting and setting system call arguments part of a process object.


# 5803:aae3d7089925 19-Jan-2009 Nathan Binkert <nate@binkert.org>

thread_context: move getSystemPtr so SE mode can get to it.
There was really no reason that it should be FS only.


# 5715:e8c1d4e669a7 04-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

get rid of all instances of readTid() and getThreadNum(). Unify and eliminate
redundancies with threadId() as their replacement.


# 5714:76abee886def 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

Add in Context IDs to the simulator. From now on, cpuId is almost never used,
the primary identifier for a hardware context should be contextId(). The
concept of threads within a CPU remains, in the form of threadId() because
sometimes you need to know which context within a cpu to manipulate.


# 5712:199d31b47f7b 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

make BaseCPU the provider of _cpuId, and cpuId() instead of being scattered
across the subclasses. generally make it so that member data is _cpuId and
accessor functions are cpuId(). The ID val comes from the python (default -1 if
none provided), and if it is -1, the index of cpuList will be given. this has
passed util/regress quick and se.py -n4 and fs.py -n4 as well as standard
switch.


# 5704:98224505352a 21-Oct-2008 Nathan Binkert <nate@binkert.org>

style: Use the correct m5 style for things relating to interrupts.


# 5499:8bfc7650c344 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

Remove delVirtPort() and make getVirtPort() only return cached version.


# 5494:85c8d296c1cb 28-Jun-2008 Steve Reinhardt <stever@gmail.com>

Backed out changeset 94a7bb476fca: caused memory leak.


# 5489:94a7bb476fca 21-Jun-2008 Steve Reinhardt <stever@gmail.com>

Generate more useful error messages for unconnected ports.
Force all non-default ports to provide a name and an
owner in the constructor.


# 5258:fcccd87d5178 15-Nov-2007 Korey Sewell <ksewell@umich.edu>

put the flattenIndex stuff back in O3 AND put fatal() back in faults


# 5250:42577371ff31 15-Nov-2007 Korey Sewell <ksewell@umich.edu>

Get MIPS simple regression working. Take out unecessary functions "setShadowSet", "CacheOp"


# 5235:f07f46843886 12-Nov-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Make the micropc available through the thread context objects.
This is necssary for fault handlers that branch to non-zero micro PCs.


# 5082:82dd253231c8 19-Sep-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Put in the foundation for x87 stack based fp registers.


# 4217:4c966fec2324 13-Mar-2007 Ali Saidi <saidi@eecs.umich.edu>

fix segfault when peer owner attempts to use functional port


# 4172:141705d83494 07-Mar-2007 Ali Saidi <saidi@eecs.umich.edu>

*MiscReg->*MiscRegNoEffect, *MiscRegWithEffect->*MiscReg


# 3778:ac52cbef744c 06-Dec-2006 Gabe Black <gblack@eecs.umich.edu>

Merge zizzer:/bk/newmem
into zower.eecs.umich.edu:/eecshome/m5/newmem

src/cpu/o3/commit_impl.hh:
Hand Merge


# 3776:4f88e76d8ebe 06-Dec-2006 Gabe Black <gblack@eecs.umich.edu>

Flattening and syscallReturn fixes

src/cpu/o3/thread_context_impl.hh:
Use flattened indices
src/cpu/simple_thread.hh:
Use flattened indices, and pass a thread context to setSyscallReturn rather than a register file.
src/cpu/thread_context.hh:
The SyscallReturn class is no longer in arch/syscallreturn.hh


# 3686:fa8d8b90cd8a 29-Nov-2006 Kevin Lim <ktlim@umich.edu>

Change the connecting of the physPort and virtPort to the memory object below the CPU to happen every time activateContext is called. The overhead is probably a little higher than necessary, but allows these connections to properly be made when there are CPUs that are inactive until they are switched in.

Right now this introduces a minor memory leak as old physPorts and virtPorts are not deleted when new ones are created. A flyspray task has been created for this issue. It can not be resolved until we determine how the bus will handle giving out ID's to functional ports that may be deleted.

src/cpu/o3/cpu.cc:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Change the setup of the physPort and virtPort to instead happen every time the CPU has a context activated. This is a little high overhead, but keeps it working correctly when the CPU does not have a physical memory attached to it until it switches in (like the case of switch CPUs).
src/cpu/o3/thread_context.hh:
Change function from being called at init() to just being called whenever the memory ports need to be connected.
src/cpu/o3/thread_context_impl.hh:
Update this to not delete the port if it's the same as the virtPort.
src/cpu/thread_context.hh:
Change function from being called at init() to whenever the memory ports need to be connected.
src/cpu/thread_state.cc:
Instead of initializing the ports, simply connect them, deleting any old ports that might exist. This allows these functions to be called multiple times.
src/cpu/thread_state.hh:
Ports are no longer initialized, but rather connected at context activation time.


# 3675:dc883b610345 19-Nov-2006 Kevin Lim <ktlim@umich.edu>

Update Virtual and Physical ports.

src/cpu/o3/alpha/cpu_impl.hh:
Handle the PhysicalPort and VirtualPort in the ThreadState.
src/cpu/o3/cpu.cc:
Initialize the thread context.
src/cpu/o3/thread_context.hh:
Add new function to initialize thread context.
src/cpu/o3/thread_context_impl.hh:
Use code now put into function.
src/cpu/simple_thread.cc:
Move code to ThreadState and use the new helper function.
src/cpu/simple_thread.hh:
Remove init() in this derived class; use init() from ThreadState base class.
src/cpu/thread_state.cc:
Move setting up of Physical and Virtual ports here. Change getMemFuncPort() to connectToMemFunc(), which connects a port to a functional port of the memory object below the CPU.
src/cpu/thread_state.hh:
Update functions.


# 3548:85e64c82c522 07-Nov-2006 Gabe Black <gblack@eecs.umich.edu>

Moved the switched version of kernel_stats.hh back to kern, and moved the base kernel_stats to base_kernel_stats


# 3468:cf23ad1ceef2 01-Nov-2006 Gabe Black <gblack@eecs.umich.edu>

Adjustments for the AlphaTLB changing to AlphaISA::TLB and changing register file functions to not take faults


# 3221:669a04468c0d 08-Oct-2006 Kevin Lim <ktlim@umich.edu>

Updates to O3 CPU. It should now work in FS mode, although sampling still has a bug.

src/cpu/o3/commit_impl.hh:
Fixes for compile and sampling.
src/cpu/o3/cpu.cc:
Deallocate and activate threads properly. Also hopefully fix being able to use caches while switching over.
src/cpu/o3/cpu.hh:
Fixes for deallocating and activating threads.
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/lsq_unit.hh:
Handle getting back a BadAddress result from the access.
src/cpu/o3/iew_impl.hh:
More debug output.
src/cpu/o3/lsq_unit_impl.hh:
Fixup store conditional handling (still a bit of a hack, but works now).

Also handle getting back a BadAddress result from the access.
src/cpu/o3/thread_context_impl.hh:
Deallocate context now records if the context should be fully removed.


# 3126:756092c6383c 02-Oct-2006 Kevin Lim <ktlim@umich.edu>

Updates to fix merge issues and bring almost everything up to working speed. Ozone CPU remains untested, but everything else compiles and runs.

src/arch/alpha/isa_traits.hh:
This got changed to the wrong version by accident.
src/cpu/base.cc:
Fix up progress event to not schedule itself if the interval is set to 0.
src/cpu/base.hh:
Fix up the CPU Progress Event to not print itself if it's set to 0. Also remove stats_reset_inst (something I added to m5 but isn't necessary here).
src/cpu/base_dyn_inst.hh:
src/cpu/checker/cpu.hh:
Remove float variable of instResult; it's always held within the double part now.
src/cpu/checker/cpu_impl.hh:
Use thread and not cpuXC.
src/cpu/o3/alpha/cpu_builder.cc:
src/cpu/o3/checker_builder.cc:
src/cpu/ozone/checker_builder.cc:
src/cpu/ozone/cpu_builder.cc:
src/python/m5/objects/BaseCPU.py:
Remove stats_reset_inst.
src/cpu/o3/commit_impl.hh:
src/cpu/ozone/lw_back_end_impl.hh:
Get TC, not XCProxy.
src/cpu/o3/cpu.cc:
Switch out updates from the version of m5 I have. Also remove serialize code that got added twice.
src/cpu/o3/iew_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/thread_state.hh:
Remove code that was added twice.
src/cpu/o3/lsq_unit.hh:
Add back in stats that got lost in the merge.
src/cpu/o3/lsq_unit_impl.hh:
Use proper method to get flags. Also wake CPU if we're coming back from a cache miss.
src/cpu/o3/thread_context_impl.hh:
src/cpu/o3/thread_state.hh:
Support profiling.
src/cpu/ozone/cpu.hh:
Update to use proper typename.
src/cpu/ozone/cpu_impl.hh:
src/cpu/ozone/dyn_inst_impl.hh:
Updates for newmem.
src/cpu/ozone/lw_lsq_impl.hh:
Get flags correctly.
src/cpu/ozone/thread_state.hh:
Reorder constructor initialization, use tc.
src/sim/pseudo_inst.cc:
Allow for loading of symbol file. Be sure to use ThreadContext and not ExecContext.


# 2986:99640058db70 15-Aug-2006 Gabe Black <gblack@eecs.umich.edu>

Some touchup to the reorganized includes and "using" directives.


# 2980:eab855f06b79 15-Aug-2006 Gabe Black <gblack@eecs.umich.edu>

Cleaned up include files and got rid of many using directives in header files.


# 2875:9b6f6b75b187 07-Jul-2006 Korey Sewell <ksewell@umich.edu>

Fix so that O3CPU doesnt segfault on exit.
Major thing was to not execute commit if there are no active threads in CPU.

src/cpu/o3/alpha/thread_context.hh:
call deallocate instead of deallocateContext
src/cpu/o3/commit_impl.hh:
dont run commit stage if there are no instructions
src/cpu/o3/cpu.cc:
add deallocate event, deactivateThread function, and edit deallocateContext.
src/cpu/o3/cpu.hh:
add deallocate event and add optional delay to deallocateContext
src/cpu/o3/thread_context.hh:
optional delay for deallocate
src/cpu/o3/thread_context_impl.hh:
edit DPRINTFs to say Thread Context instead of Alpha TC
src/cpu/thread_context.hh:
optional delay
src/sim/syscall_emul.hh:
name stuff


# 2834:c8342a71404b 03-Jul-2006 Korey Sewell <ksewell@umich.edu>

Fix for FS O3CPU compile ... missing forward class declaration/header file after files got split for ISA-independence

src/cpu/o3/alpha/thread_context.hh:
Use 'this' when accessing cpu
src/cpu/o3/cpu.hh:
add numActiveThreds function
src/cpu/o3/thread_context.hh:
forward class declarations
src/cpu/o3/thread_context_impl.hh:
add quiesce event header file
src/cpu/thread_context.hh:
add exit() function to thread context (read comments in file)
src/sim/syscall_emul.cc:
adjust exitFunc syscall


# 2817:273f7fb94f83 30-Jun-2006 Korey Sewell <ksewell@umich.edu>

Make O3CPU model independent of the ISA

Use O3CPU when building instead of AlphaO3CPU.

I could use some better python magic in the cpu_models.py file!

AUTHORS:
add middle initial
SConstruct:
change from AlphaO3CPU to O3CPU
src/cpu/SConscript:
edits to build O3CPU instead of AlphaO3CPU
src/cpu/cpu_models.py:
change substitution template to use proper CPU EXEC CONTEXT For O3CPU Model...

Actually, some Python expertise could be used here. The 'env' variable is not
passed to this file, so I had to parse through the ARGV to find the ISA...
src/cpu/o3/base_dyn_inst.cc:
src/cpu/o3/bpred_unit.cc:
src/cpu/o3/commit.cc:
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/decode.cc:
src/cpu/o3/fetch.cc:
src/cpu/o3/iew.cc:
src/cpu/o3/inst_queue.cc:
src/cpu/o3/lsq.cc:
src/cpu/o3/lsq_unit.cc:
src/cpu/o3/mem_dep_unit.cc:
src/cpu/o3/rename.cc:
src/cpu/o3/rob.cc:
use isa_specific.hh
src/sim/process.cc:
only initi NextNPC if not ALPHA
src/cpu/o3/alpha/cpu.cc:
alphao3cpu impl
src/cpu/o3/alpha/cpu.hh:
move AlphaTC to it's own file
src/cpu/o3/alpha/cpu_impl.hh:
Move AlphaTC to it's own file ...
src/cpu/o3/alpha/dyn_inst.cc:
src/cpu/o3/alpha/dyn_inst.hh:
src/cpu/o3/alpha/dyn_inst_impl.hh:
include paths
src/cpu/o3/alpha/impl.hh:
include paths, set default MaxThreads to 2 instead of 4
src/cpu/o3/alpha/params.hh:
set Alpha Specific Params here
src/python/m5/objects/O3CPU.py:
add O3CPU class
src/cpu/o3/SConscript:
include isa-specific build files
src/cpu/o3/alpha/thread_context.cc:
NEW HOME of AlphaTC
src/cpu/o3/alpha/thread_context.hh:
new home of AlphaTC
src/cpu/o3/isa_specific.hh:
includes ISA specific files
src/cpu/o3/params.hh:
base o3 params
src/cpu/o3/thread_context.hh:
base o3 thread context
src/cpu/o3/thread_context_impl.hh:
base o3 thead context impl