History log of /gem5/src/cpu/o3/thread_context.hh
Revision Date Author Comments
# 14024:abe47b13653d 02-May-2019 Gabe Black <gabeblack@google.com>

arch, base, cpu, gpu, sim: Merge getMemProxy and getVirtProxy.

These two functions were performing the same function but had two
different names for historical reasons. This change merges them
together, keeping the getVirtProxy name to be consistent with the
getPhysProxy method used to get a non-translating proxy port.

Change-Id: Idd83c6b899f9343795075b030ccbc723a79e52a4
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18581
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 14022:a7cdc33dab35 02-May-2019 Gabe Black <gabeblack@google.com>

cpu, sim: Return PortProxy &s from all the proxy accessors.

This is a step towards merging the accessors for SE and FS modes.

Change-Id: I76818ab88b97097ac363e243be9cc1911b283090
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18579
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13905:5cf30883255c 27-Apr-2019 Gabe Black <gabeblack@google.com>

arch: cpu: Track kernel stats using the base ISA agnostic type.

Then cast to the ISA specific type when necessary. This removes
(mostly) an ISA specific aspect to some of the interfaces. The ISA
specific version of the kernel stats still needs to be constructed and
stored in a few places which means that kernel_stats.hh still needs to
be a switching arch header, for instance.

In the future, I'd like to make the kernel its own object like the
Process objects in SE mode, and then it would be able to instantiate
and maintain its own stats.

Change-Id: I8309d49019124f6bea1482aaea5b5b34e8c97433
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18429
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13865:cca49fc49c57 13-Apr-2019 Gabe Black <gabeblack@google.com>

cpu: Eliminate the ProxyThreadContext class.

Replace it with direct inheritance from the ThreadContext class in the
SimpleThread class which was the only place it was used.

Also take the opportunity to use some specialized types instead of
ints, etc., add some consts, and fix some style issues.

Change-Id: I5d2cfa87b20dc43615e33e6755c9d016564e9c0e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18048
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13693:85fa3a41014b 14-Feb-2019 Giacomo Gabrielli <giacomo.gabrielli@arm.com>

cpu: Add ISA* getter in Thread interface

This patch is adding a ISA* getter to the TC interface

Change-Id: Ib8ddc5d8fdd44e782f50a2ad15878a6bcf931e58
Reviewed-on: https://gem5-review.googlesource.com/c/16462
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 13628:332f730a1855 04-Feb-2019 Andrea Mondelli <Andrea.Mondelli@ucf.edu>

misc: added missing override specifier

Added missing specifier for various virtual functions.

Change-Id: I4783e92d78789a9ae182fad79aadceafb00b2458
Reviewed-on: https://gem5-review.googlesource.com/c/16103
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 13622:ba31c2a23eca 21-Nov-2018 Gabe Black <gabeblack@google.com>

cpu, arch: Replace the CCReg type with RegVal.

Most architectures weren't using the CCReg type, and in x86 and arm
it was already a uint64_t.

Change-Id: I0b3d5e690e6b31db6f2627f449c89bde0f6750a6
Reviewed-on: https://gem5-review.googlesource.com/c/14515
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13611:c8b7847b4171 19-Nov-2018 Gabe Black <gabeblack@google.com>

arch: cpu: Rename *FloatRegBits* to *FloatReg*.

Now that there's no plain FloatReg, there's no reason to distinguish
FloatRegBits with a special suffix since it's the only way to read or
write FP registers.

Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded
Reviewed-on: https://gem5-review.googlesource.com/c/14460
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13610:5d5404ac6288 16-Oct-2018 Giacomo Gabrielli <giacomo.gabrielli@arm.com>

arch,cpu: Add vector predicate registers

Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector
Extension (SVE), introduce the notion of a predicate register file.
This changeset adds this feature across architectures and CPU models.

Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13715
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 13582:989577bf6abc 18-Oct-2018 Gabe Black <gabeblack@google.com>

arch: cpu: Stop passing around misc registers by reference.

These values are all basic integers (specifically uint64_t now), and
so passing them by const & is actually less efficient since there's a
extra level of indirection and an extra value, and the same sized value
(a 64 bit pointer vs. a 64 bit int) is being passed around.

Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3
Reviewed-on: https://gem5-review.googlesource.com/c/13626
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 13557:fc33e6048b25 13-Oct-2018 Gabe Black <gabeblack@google.com>

cpu: dev: sim: gpu-compute: Banish some ISA specific register types.

These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are
some remaining types, specifically the vector registers and the CCReg.
I'm less familiar with these new types of registers, and so will look
at getting rid of them at some later time.

Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b
Reviewed-on: https://gem5-review.googlesource.com/c/13624
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13500:6e0a2a7c6d8c 19-Nov-2018 Gabe Black <gabeblack@google.com>

arch, cpu: Remove float type accessors.

Use the binary accessors instead.

Change-Id: Iff1877e92c79df02b3d13635391a8c2f025776a2
Reviewed-on: https://gem5-review.googlesource.com/c/14457
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12429:beefb9f5f551 09-Jan-2018 BKP <brandon.potter@amd.com>

style: change C/C++ source permissions to noexec

Several files in the repository were tracked with execute permissions
even though the files are just normal C/C++ files (and the one .isa).

Change-Id: I976b096acab4a1fc74c5699ef1f9b222c1e635c2
Reviewed-on: https://gem5-review.googlesource.com/7241
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12406:86bde4a026b5 22-Dec-2017 Gabe Black <gabeblack@google.com>

arch,cpu: "virtualize" the TLB interface.

CPUs have historically instantiated the architecture specific version
of the TLBs to avoid a virtual function call, making them a little bit
more dependent on what the current ISA is. Some simple performance
measurement, the x86 twolf regression on the atomic CPU, shows that
there isn't actually any performance benefit, and if anything the
simulator goes slightly faster (although still within margin of error)
when the TLB functions are virtual.

This change switches everything outside of the architectures themselves
to use the generic BaseTLB type, and then inside the ISA for them to
cast that to their architecture specific type to call into architecture
specific interfaces.

The ARM TLB needed the most adjustment since it was using non-standard
translation function signatures. Specifically, they all took an extra
"type" parameter which defaulted to normal, and translateTiming
returned a Fault. translateTiming actually doesn't need to return a
Fault because everywhere that consumed it just stored it into a
structure which it then deleted(?), and the fault is stored in the
Translation object when the translation is done.

A little more work is needed to fully obviate the arch/tlb.hh header,
so the TheISA::TLB type is still visible outside of the ISAs.
Specifically, the TlbEntry type is used in the generic PageTable which
lives in src/mem.

Change-Id: I51b68ee74411f9af778317eff222f9349d2ed575
Reviewed-on: https://gem5-review.googlesource.com/6921
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12109:f29e9c5418aa 05-Apr-2017 Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com>

cpu: Added interface for vector reg file

This patch adds some more functionality to the cpu model and the arch to
interface with the vector register file.

This change consists mainly of augmenting ThreadContexts and ExecContexts
with calls to get/set full vectors, underlying microarchitectural elements
or lanes. Those are meant to interface with the vector register file. All
classes that implement this interface also get an appropriate implementation.

This requires implementing the vector register file for the different
models using the VecRegContainer class.

This change set also updates the Result abstraction to contemplate the
possibility of having a vector as result.

The changes also affect how the remote_gdb connection works.

There are some (nasty) side effects, such as the need to define dummy
numPhysVecRegs parameter values for architectures that do not implement
vector extensions.

Nathanael Premillieu's work with an increasing number of fixes and
improvements of mine.

Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues and CC reg free list initialisation ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2705


# 12106:7784fac1b159 05-Apr-2017 Rekai Gonzalez-Alberquilla <Rekai.GonzalezAlberquilla@arm.com>

cpu: Simplify the rename interface and use RegId

With the hierarchical RegId there are a lot of functions that are
redundant now.

The idea behind the simplification is that instead of having the regId,
telling which kind of register read/write/rename/lookup/etc. and then
the function panic_if'ing if the regId is not of the appropriate type,
we provide an interface that decides what kind of register to read
depending on the register type of the given regId.

Change-Id: I7d52e9e21fc01205ae365d86921a4ceb67a57178
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2702


# 11886:43b882cada33 27-Feb-2017 Brandon Potter <brandon.potter@amd.com>

syscall_emul: [PATCH 15/22] add clone/execve for threading and multiprocess simulations

Modifies the clone system call and adds execve system call. Requires allowing
processes to steal thread contexts from other processes in the same system
object and the ability to detach pieces of process state (such as MemState)
to allow dynamic sharing.


# 11877:5ea85692a53e 20-Jul-2015 Brandon Potter <brandon.potter@amd.com>

syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.


# 11005:e7f403b6b76f 07-Aug-2015 Andreas Sandberg <andreas.sandberg@arm.com>

base: Declare a type for context IDs

Context IDs used to be declared as ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.


# 10935:acd48ddd725f 28-Jul-2015 Nilay Vaish <nilay@cs.wisc.edu>

revert 5af8f40d8f2c


# 10934:5af8f40d8f2c 26-Jul-2015 Nilay Vaish <nilay@cs.wisc.edu>

cpu: implements vector registers

This adds a vector register type. The type is defined as a std::array of a
fixed number of uint64_ts. The isa_parser.py has been modified to parse vector
register operands and generate the required code. Different cpus have vector
register files now.


# 10698:829adc48e175 16-Feb-2015 Andreas Hansson <andreas.hansson@arm.com>

arch: Make readMiscRegNoEffect const throughout

Finally took the plunge and made this apply to all ISAs, not just ARM.


# 10664:61a0b02aa800 25-Jan-2015 Ali Saidi <Ali.Saidi@ARM.com>

cpu: Remove all notion that we know when the cpu is misspeculating.

We have no way of knowing if a CPU model is on the wrong path with
our execute-in-execute CPU models. Don't pretend that we do.


# 10407:a9023811bf9e 20-Sep-2014 Mitch Hayenga <mitch.hayenga@arm.com>

alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate

activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.


# 10190:fb83d025d1c3 09-May-2014 Akash Bagdia <akash.bagdia@arm.com>

cpu, arm: Allow the specification of a socket field

Allow the specification of a socket ID for every core that is reflected in the
MPIDR field in ARM systems. This allows studying multi-socket / cluster
systems with ARM CPUs.


# 10110:580b47334a97 07-Mar-2014 Andreas Hansson <andreas.hansson@arm.com>

cpu: Make CPU and ThreadContext getters const

This patch merely tidies up the CPU and ThreadContext getters by
making them const where appropriate.


# 10033:21c14a2b2117 24-Jan-2014 Ali Saidi <Ali.Saidi@ARM.com>

arch, cpu: Add support for flattening misc register indexes.

With ARMv8 support the same misc register id results in accessing different
registers depending on the current mode of the processor. This patch adds
the same orthogonality to the misc register file as the others (int, float, cc).
For all the othre ISAs this is currently a null-implementation.

Additionally, a system variable is added to all the ISA objects.


# 9920:028e4da64b42 15-Oct-2013 Yasuko Eckert <yasuko.eckert@amd.com>

cpu: add a condition-code register class

Add a third register class for condition codes,
in parallel with the integer and FP classes.
No ISAs use the CC class at this point though.


# 9428:029dfe6324d3 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Unify SimpleCPU and O3 CPU serialization code

The O3 CPU used to copy its thread context to a SimpleThread in order
to do serialization. This was a bit of a hack involving two static
SimpleThread instances and a magic constructor that was only used by
the O3 CPU.

This patch moves the ThreadContext serialization code into two global
procedures that, in addition to the normal serialization parameters,
take a ThreadContext reference as a parameter. This allows us to reuse
the serialization code in all ThreadContext implementations.


# 9426:0548b3e9734d 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Implement a flat register interface in thread contexts

Some architectures map registers differently depending on their mode
of operations. There is currently no architecture independent way of
accessing all registers. This patch introduces a flat register
interface to the ThreadContext class. This interface is useful, for
example, when serializing or copying thread contexts.


# 9382:1c97b57d5169 07-Jan-2013 Ali Saidi <Ali.Saidi@ARM.com>

cpu: rename the misleading inSyscall to noSquashFromTC

isSyscall was originally created because during handling of a syscall in SE
mode the threadcontext had to be updated. However, in many places this is used
in FS mode (e.g. fault handlers) and the name doesn't make much sense. The
boolean actually stops gem5 from squashing speculative and non-committed state
when a write to a threadcontext happens, so re-name the variable to something
more appropriate


# 9180:ee8d7a51651d 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Add a Cycles wrapper class and use where applicable

This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.

Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.

This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.

In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.

An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.


# 9023:e9201a7bce59 26-May-2012 Gabe Black <gblack@eecs.umich.edu>

CPU: Merge the predecoder and decoder.

These classes are always used together, and merging them will give the ISAs
more flexibility in how they cache things and manage the process.


# 9020:14321ce30881 25-May-2012 Gabe Black <gblack@eecs.umich.edu>

Decode: Make the Decoder class defined per ISA.


# 8902:75b524b64c28 19-Mar-2012 Andreas Hansson <andreas.hansson@arm.com>

gcc: Clean-up of non-C++0x compliant code, first steps

This patch cleans up a number of minor issues aiming to get closer to
compliance with the C++0x standard as interpreted by gcc and clang
(compile with std=c++0x and -pedantic-errors). In particular, the
patch cleans up enums where the last item was succeded by a comma,
namespaces closed by a curcly brace followed by a semi-colon, and the
use of the GNU-extension typeof (replaced by templated functions). It
does not address variable-length arrays, zero-size arrays, anonymous
structs, range expressions in switch statements, and the use of long
long. The generated CPU code also has a large number of issues that
remain to be fixed, mainly related to overflows in implicit constant
conversion (due to shifts).


# 8887:20ea02da9c53 09-Mar-2012 Geoffrey Blake <geoffrey.blake@arm.com>

CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable

Enables the CheckerCPU to be selected at runtime with the --checker option
from the configs/example/fs.py and configs/example/se.py configuration
files. Also merges with the SE/FS changes.


# 8852:c744483edfcf 24-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Make port proxies use references rather than pointers

This patch is adding a clearer design intent to all objects that would
not be complete without a port proxy by making the proxies members
rathen than dynamically allocated. In essence, if NULL would not be a
valid value for the proxy, then we avoid using a pointer to make this
clear.

The same approach is used for the methods using these proxies, such as
loadSections, that now use references rather than pointers to better
reflect the fact that NULL would not be an acceptable value (in fact
the code would break and that is how this patch started out).

Overall the concept of "using a reference to express unconditional
composition where a NULL pointer is never valid" could be done on a
much broader scale throughout the code base, but for now it is only
done in the locations affected by the proxies.


# 8809:bb10807da889 01-Feb-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with head, hopefully the last time for this batch.


# 8808:8af87554ad7e 31-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with main repository.


# 8799:dac1e33e07b0 28-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with the main repo.


# 8777:dd43f1c9fa0a 31-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Make the functions available from the TC consistent between SE and FS.


# 8767:e575781f71b8 30-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs.


# 8764:e4660687c49f 16-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Include getMemPort in FS.


# 8761:20322354b80b 16-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Build/expose vport in SE mode.


# 8754:0996451df6de 16-Oct-2011 Gabe Black <gblack@eecs.umich.edu>

CPU: Make physPort and getPhysPort available in SE mode.


# 8733:64a7bf8fa56c 31-Jan-2012 Geoffrey Blake <geoffrey.blake@arm.com>

CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5

Brings the CheckerCPU back to life to allow FS and SE checking of the
O3CPU. These changes have only been tested with the ARM ISA. Other
ISAs potentially require modification.


# 8730:0a742249f76b 30-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Clean-up of Functional/Virtual/TranslatingPort remnants

This patch cleans up forward declarations and a member-function
prototype that still referred to the old FunctionalPort, VirtualPort
and TranslatingPort. There is no change in functionality.


# 8706:b1838faf3bcc 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Add port proxies instead of non-structural ports

Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy


# 8541:27aaee8ec7cc 09-Sep-2011 Gabe Black <gblack@eecs.umich.edu>

Decode: Pull instruction decoding out of the StaticInst class into its own.

This change pulls the instruction decoding machinery (including caches) out of
the StaticInst class and puts it into its own class. This has a few intrinsic
benefits. First, the StaticInst code, which has gotten to be quite large, gets
simpler. Second, the code that handles decode caching is now separated out
into its own component and can be looked at in isolation, making it easier to
understand. I took the opportunity to restructure the code a bit which will
hopefully also help.

Beyond that, this change also lays some ground work for each ISA to have its
own, potentially stateful decode object. We'd be able to include less
contextualizing information in the ExtMachInst objects since that context
would be applied at the decoder. Also, the decoder could "know" ahead of time
that all the instructions it's going to see are going to be, for instance, 64
bit mode, and it will have one less thing to check when it decodes them.
Because the decode caching mechanism has been separated out, it's now possible
to have multiple caches which correspond to different types of decoding
context. Having one cache for each element of the cross product of different
configurations may become prohibitive, so it may be desirable to clear out the
cache when relatively static state changes and not to have one for each
setting.

Because the decode function is no longer universally accessible as a static
member of the StaticInst class, a new function was added to the ThreadContexts
that returns the applicable decode object.


# 8229:78bf55f23338 15-Apr-2011 Nathan Binkert <nate@binkert.org>

includes: sort all includes


# 7720:65d338a8dba4 31-Oct-2010 Gabe Black <gblack@eecs.umich.edu>

ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.



This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.


# 7679:f26cc2c68b48 14-Sep-2010 Gabe Black <gblack@eecs.umich.edu>

CPU: Get rid of the now unnecessary getInst/setInst family of functions.

This code is no longer needed because of the preceeding change which adds a
StaticInstPtr parameter to the fault's invoke method, obviating the only use
for this pair of functions.


# 6711:c79d72abdbe5 04-Nov-2009 Steve Reinhardt <steve.reinhardt@amd.com>

o3: get rid of unused physmem pointer


# 6658:f4de76601762 23-Sep-2009 Nathan Binkert <nate@binkert.org>

arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh


# 6314:781969fbeca9 09-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

Registers: Get rid of the float register width parameter.


# 6313:95f69a436c82 09-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

Registers: Add an ISA object which replaces the MiscRegFile.
This object encapsulates (or will eventually) the identity and characteristics
of the ISA in the CPU.


# 6180:1a8950d566ff 12-May-2009 Korey Sewell <ksewell@umich.edu>

inorder-bpred: edits to handle non-delay-slot ISAs
Changes so that InOrder can work for a non-delay-slot ISA like Alpha. Typically, changes have to do with handling misspeculated branches at different points in pipeline


# 6029:007c36616f47 15-Apr-2009 Steve Reinhardt <steve.reinhardt@amd.com>

Get rid of the Unallocated thread context state.
Basically merge it in with Halted.
Also had to get rid of a few other functions that
called ThreadContext::deallocate(), including:
- InOrderCPU's setThreadRescheduleCondition.
- ThreadContext::exit(). This function was there to avoid terminating
simulation when one thread out of a multi-thread workload exits, but we
need to find a better (non-cpu-centric) way.


# 6022:410194bb3049 09-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

tlb: Don't separate the TLB classes into an instruction TLB and a data TLB


# 5958:2d9737bf3c2f 27-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

Processes: Make getting and setting system call arguments part of a process object.


# 5803:aae3d7089925 19-Jan-2009 Nathan Binkert <nate@binkert.org>

thread_context: move getSystemPtr so SE mode can get to it.
There was really no reason that it should be FS only.


# 5715:e8c1d4e669a7 04-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

get rid of all instances of readTid() and getThreadNum(). Unify and eliminate
redundancies with threadId() as their replacement.


# 5714:76abee886def 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

Add in Context IDs to the simulator. From now on, cpuId is almost never used,
the primary identifier for a hardware context should be contextId(). The
concept of threads within a CPU remains, in the form of threadId() because
sometimes you need to know which context within a cpu to manipulate.


# 5712:199d31b47f7b 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

make BaseCPU the provider of _cpuId, and cpuId() instead of being scattered
across the subclasses. generally make it so that member data is _cpuId and
accessor functions are cpuId(). The ID val comes from the python (default -1 if
none provided), and if it is -1, the index of cpuList will be given. this has
passed util/regress quick and se.py -n4 and fs.py -n4 as well as standard
switch.


# 5668:5b5a9f4203d1 12-Oct-2008 Gabe Black <gblack@eecs.umich.edu>

Get rid of old RegContext code.


# 5595:6ebdae3f619b 09-Oct-2008 Gabe Black <gblack@eecs.umich.edu>

O3: Generalize the O3 CPU object so it isn't split out by ISA.


# 5499:8bfc7650c344 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

Remove delVirtPort() and make getVirtPort() only return cached version.


# 5497:89a6483d7047 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

Make the cached virtPort have a thread context so it can do everything that a newly created one can.


# 5259:74ef5093154f 15-Nov-2007 Korey Sewell <ksewell@umich.edu>

add microPC stuff back in. got deleted on changeset propragation somehow.


# 5250:42577371ff31 15-Nov-2007 Korey Sewell <ksewell@umich.edu>

Get MIPS simple regression working. Take out unecessary functions "setShadowSet", "CacheOp"


# 5249:49d44a466496 15-Nov-2007 Korey Sewell <ksewell@umich.edu>

branch merge


# 5235:f07f46843886 12-Nov-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Make the micropc available through the thread context objects.
This is necssary for fault handlers that branch to non-zero micro PCs.


# 5222:bb733a878f85 13-Nov-2007 Korey Sewell <ksewell@umich.edu>

Add in files from merge-bare-iron, get them compiling in FS and SE mode


# 4997:e7380529bd2d 26-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Address Translation: Make SE mode use an actual TLB/MMU for translation like FS.


# 4172:141705d83494 07-Mar-2007 Ali Saidi <saidi@eecs.umich.edu>

*MiscReg->*MiscRegNoEffect, *MiscRegWithEffect->*MiscReg


# 3789:9ce219516b5d 07-Dec-2006 Gabe Black <gblack@eecs.umich.edu>

Compilation fixes


# 3784:edc6cff4cbc1 06-Dec-2006 Gabe Black <gblack@eecs.umich.edu>

Got rid of some typedefs and moved the tlbs into the base o3 cpu.


# 3686:fa8d8b90cd8a 29-Nov-2006 Kevin Lim <ktlim@umich.edu>

Change the connecting of the physPort and virtPort to the memory object below the CPU to happen every time activateContext is called. The overhead is probably a little higher than necessary, but allows these connections to properly be made when there are CPUs that are inactive until they are switched in.

Right now this introduces a minor memory leak as old physPorts and virtPorts are not deleted when new ones are created. A flyspray task has been created for this issue. It can not be resolved until we determine how the bus will handle giving out ID's to functional ports that may be deleted.

src/cpu/o3/cpu.cc:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Change the setup of the physPort and virtPort to instead happen every time the CPU has a context activated. This is a little high overhead, but keeps it working correctly when the CPU does not have a physical memory attached to it until it switches in (like the case of switch CPUs).
src/cpu/o3/thread_context.hh:
Change function from being called at init() to just being called whenever the memory ports need to be connected.
src/cpu/o3/thread_context_impl.hh:
Update this to not delete the port if it's the same as the virtPort.
src/cpu/thread_context.hh:
Change function from being called at init() to whenever the memory ports need to be connected.
src/cpu/thread_state.cc:
Instead of initializing the ports, simply connect them, deleting any old ports that might exist. This allows these functions to be called multiple times.
src/cpu/thread_state.hh:
Ports are no longer initialized, but rather connected at context activation time.


# 3675:dc883b610345 19-Nov-2006 Kevin Lim <ktlim@umich.edu>

Update Virtual and Physical ports.

src/cpu/o3/alpha/cpu_impl.hh:
Handle the PhysicalPort and VirtualPort in the ThreadState.
src/cpu/o3/cpu.cc:
Initialize the thread context.
src/cpu/o3/thread_context.hh:
Add new function to initialize thread context.
src/cpu/o3/thread_context_impl.hh:
Use code now put into function.
src/cpu/simple_thread.cc:
Move code to ThreadState and use the new helper function.
src/cpu/simple_thread.hh:
Remove init() in this derived class; use init() from ThreadState base class.
src/cpu/thread_state.cc:
Move setting up of Physical and Virtual ports here. Change getMemFuncPort() to connectToMemFunc(), which connects a port to a functional port of the memory object below the CPU.
src/cpu/thread_state.hh:
Update functions.


# 3548:85e64c82c522 07-Nov-2006 Gabe Black <gblack@eecs.umich.edu>

Moved the switched version of kernel_stats.hh back to kern, and moved the base kernel_stats to base_kernel_stats


# 3468:cf23ad1ceef2 01-Nov-2006 Gabe Black <gblack@eecs.umich.edu>

Adjustments for the AlphaTLB changing to AlphaISA::TLB and changing register file functions to not take faults


# 2935:d1223a6c9156 23-Jul-2006 Korey Sewell <ksewell@umich.edu>

This changeset gets the MIPS ISA pretty much working in the O3CPU. It builds, runs, and gets very very close to completing the hello world
succesfully but there are some minor quirks to iron out. Who would've known a DELAY SLOT introduces that much complexity?! arrgh!

Anyways, a lot of this stuff had to do with my project at MIPS and me needing to know how I was going to get this working for the MIPS
ISA. So I figured I would try to touch it up and throw it in here (I hate to introduce non-completely working components... )

src/arch/alpha/isa/mem.isa:
spacing
src/arch/mips/faults.cc:
src/arch/mips/faults.hh:
Gabe really authored this
src/arch/mips/isa/decoder.isa:
add StoreConditional Flag to instruction
src/arch/mips/isa/formats/basic.isa:
Steven really did this file
src/arch/mips/isa/formats/branch.isa:
fix bug for uncond/cond control
src/arch/mips/isa/formats/mem.isa:
Adjust O3CPU memory access to use new memory model interface.
src/arch/mips/isa/formats/util.isa:
update LoadStoreBase template
src/arch/mips/isa_traits.cc:
update SERIALIZE partially
src/arch/mips/process.cc:
src/arch/mips/process.hh:
no need for this for NOW. ASID/Virtual addressing handles it
src/arch/mips/regfile/misc_regfile.hh:
add in clear() function and comments for future usage of special misc. regs
src/cpu/base_dyn_inst.hh:
add in nextNPC variable and supporting functions.

add isCondDelaySlot function

Update predTaken and mispredicted functions
src/cpu/base_dyn_inst_impl.hh:
init nextNPC
src/cpu/o3/SConscript:
add MIPS files to compile
src/cpu/o3/alpha/thread_context.hh:
no need for my name on this file
src/cpu/o3/bpred_unit_impl.hh:
Update RAS appropriately for MIPS
src/cpu/o3/comm.hh:
add some extra communication variables to aid in handling the
delay slots
src/cpu/o3/commit.hh:
minor name fix for nextNPC functions.
src/cpu/o3/commit_impl.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/inst_queue_impl.hh:
src/cpu/o3/rename_impl.hh:
Fix necessary variables and functions for squashes with delay slots
src/cpu/o3/cpu.cc:
Update function interface ...

adjust removeInstsNotInROB function to recognize delay slots insts
src/cpu/o3/cpu.hh:
update removeInstsNotInROB
src/cpu/o3/decode.hh:
declare necessary variables for handling delay slot
src/cpu/o3/dyn_inst.hh:
Add in MipsDynInst
src/cpu/o3/fetch.hh:
src/cpu/o3/iew.hh:
src/cpu/o3/rename.hh:
declare necessary variables and adjust functions for handling delay slot
src/cpu/o3/inst_queue.hh:
src/cpu/simple/base.cc:
no need for my name here
src/cpu/o3/isa_specific.hh:
add in MIPS files
src/cpu/o3/scoreboard.hh:
dont include alpha specific isa traits!
src/cpu/o3/thread_context.hh:
no need for my name here, i just rearranged where the file goes
src/cpu/static_inst.hh:
add isCondDelaySlot function
src/cpu/o3/mips/cpu.cc:
src/cpu/o3/mips/cpu.hh:
src/cpu/o3/mips/cpu_builder.cc:
src/cpu/o3/mips/cpu_impl.hh:
src/cpu/o3/mips/dyn_inst.cc:
src/cpu/o3/mips/dyn_inst.hh:
src/cpu/o3/mips/dyn_inst_impl.hh:
src/cpu/o3/mips/impl.hh:
src/cpu/o3/mips/params.hh:
src/cpu/o3/mips/thread_context.cc:
src/cpu/o3/mips/thread_context.hh:
MIPS file for O3CPU...mirrors ALPHA definition


# 2875:9b6f6b75b187 07-Jul-2006 Korey Sewell <ksewell@umich.edu>

Fix so that O3CPU doesnt segfault on exit.
Major thing was to not execute commit if there are no active threads in CPU.

src/cpu/o3/alpha/thread_context.hh:
call deallocate instead of deallocateContext
src/cpu/o3/commit_impl.hh:
dont run commit stage if there are no instructions
src/cpu/o3/cpu.cc:
add deallocate event, deactivateThread function, and edit deallocateContext.
src/cpu/o3/cpu.hh:
add deallocate event and add optional delay to deallocateContext
src/cpu/o3/thread_context.hh:
optional delay for deallocate
src/cpu/o3/thread_context_impl.hh:
edit DPRINTFs to say Thread Context instead of Alpha TC
src/cpu/thread_context.hh:
optional delay
src/sim/syscall_emul.hh:
name stuff


# 2834:c8342a71404b 03-Jul-2006 Korey Sewell <ksewell@umich.edu>

Fix for FS O3CPU compile ... missing forward class declaration/header file after files got split for ISA-independence

src/cpu/o3/alpha/thread_context.hh:
Use 'this' when accessing cpu
src/cpu/o3/cpu.hh:
add numActiveThreds function
src/cpu/o3/thread_context.hh:
forward class declarations
src/cpu/o3/thread_context_impl.hh:
add quiesce event header file
src/cpu/thread_context.hh:
add exit() function to thread context (read comments in file)
src/sim/syscall_emul.cc:
adjust exitFunc syscall


# 2817:273f7fb94f83 30-Jun-2006 Korey Sewell <ksewell@umich.edu>

Make O3CPU model independent of the ISA

Use O3CPU when building instead of AlphaO3CPU.

I could use some better python magic in the cpu_models.py file!

AUTHORS:
add middle initial
SConstruct:
change from AlphaO3CPU to O3CPU
src/cpu/SConscript:
edits to build O3CPU instead of AlphaO3CPU
src/cpu/cpu_models.py:
change substitution template to use proper CPU EXEC CONTEXT For O3CPU Model...

Actually, some Python expertise could be used here. The 'env' variable is not
passed to this file, so I had to parse through the ARGV to find the ISA...
src/cpu/o3/base_dyn_inst.cc:
src/cpu/o3/bpred_unit.cc:
src/cpu/o3/commit.cc:
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/decode.cc:
src/cpu/o3/fetch.cc:
src/cpu/o3/iew.cc:
src/cpu/o3/inst_queue.cc:
src/cpu/o3/lsq.cc:
src/cpu/o3/lsq_unit.cc:
src/cpu/o3/mem_dep_unit.cc:
src/cpu/o3/rename.cc:
src/cpu/o3/rob.cc:
use isa_specific.hh
src/sim/process.cc:
only initi NextNPC if not ALPHA
src/cpu/o3/alpha/cpu.cc:
alphao3cpu impl
src/cpu/o3/alpha/cpu.hh:
move AlphaTC to it's own file
src/cpu/o3/alpha/cpu_impl.hh:
Move AlphaTC to it's own file ...
src/cpu/o3/alpha/dyn_inst.cc:
src/cpu/o3/alpha/dyn_inst.hh:
src/cpu/o3/alpha/dyn_inst_impl.hh:
include paths
src/cpu/o3/alpha/impl.hh:
include paths, set default MaxThreads to 2 instead of 4
src/cpu/o3/alpha/params.hh:
set Alpha Specific Params here
src/python/m5/objects/O3CPU.py:
add O3CPU class
src/cpu/o3/SConscript:
include isa-specific build files
src/cpu/o3/alpha/thread_context.cc:
NEW HOME of AlphaTC
src/cpu/o3/alpha/thread_context.hh:
new home of AlphaTC
src/cpu/o3/isa_specific.hh:
includes ISA specific files
src/cpu/o3/params.hh:
base o3 params
src/cpu/o3/thread_context.hh:
base o3 thread context
src/cpu/o3/thread_context_impl.hh:
base o3 thead context impl