History log of /gem5/src/cpu/simple/timing.hh
Revision Date Author Comments
# 14297:b4519e586f5e 10-Sep-2019 Jordi Vaquero <jordi.vaquero@metempsy.com>

cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>

This change is based on modify the way we move the AtomicOpFunctor*
through gem5 in order to mantain proper ownership of the object and
ensuring its destruction when it is no longer used.

Doing that we fix at the same time a memory leak in Request.hh
where we were assigning a new AtomicOpFunctor* without destroying the
previous one.

This change creates a new type AtomicOpFunctor_ptr as a
std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except
for its only usage when AtomicOpFunc() is called.

Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14198:9c2f67392409 17-Aug-2019 Gabe Black <gabeblack@google.com>

cpu: Make get(Data|Inst)Port return a Port and not a MasterPort.

No caller uses any of the MasterPort specific properties of these
function's return values, so we can instead return a reference to the
base Port class. This makes it possible for the data and inst ports
to be of any port type, not just gem5 style MasterPorts. This makes
life simpler for, for example, systemc based CPUs which might have TLM
ports.

It also makes it possible for any two CPUs which have compatible ports
to be switched between, as long as the ports they use support being
unbound. Unfortunately that does not include TLM or systemc ports which
are bound permanently.

Change-Id: I98fce5a16d2ef1af051238e929dd96d57a4ac838
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20240
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 14085:0075b0d29d55 28-Jun-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

cpu: isDrained renamed to isCpuDrained

cpu models inheriting from BaseCPU implement a draining checker called
isDrained. This hides the base Drainable::isDrained method and might
create confusion in the reader.
This patch is renaming it to isCpuDrained in order to avoid any
ambiguity

Change-Id: Ie5221da6a4673432c2403996e42d451cae960bbf
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19468
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13954:2f400a5f2627 07-Jul-2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com>

cpu,mem: Add support for partial loads/stores and wide mem. accesses

This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13652:45d94ac03a27 22-Jan-2018 Tuan Ta <qtt2@cornell.edu>

cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 12749:223c83ed9979 04-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Using smart pointers for memory Requests

This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12085:de78ea63e0ca 07-Jun-2017 Sean Wilson <spwilson2@wisc.edu>

cpu, gpu-compute: Replace EventWrapper use with EventFunctionWrapper

Change-Id: Idd5992463bcf9154f823b82461070d1f1842cea3
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3746
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 11608:6319a1125f1c 14-Aug-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

cpu, arch: fix the type used for the request flags

Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11331:cd5c48db28e6 10-Feb-2016 Andreas Hansson <andreas.hansson@arm.com>

mem: Deduce if cache should forward snoops

This patch changes how the cache determines if snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, then snoops are forwarded. Less error
prone, and less parameters to worry about.

The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implement


# 11303:f694764d656d 17-Jan-2016 Steve Reinhardt <steve.reinhardt@amd.com>

cpu. arch: add initiateMemRead() to ExecContext interface

For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.

This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.

This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().

For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level. For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.


# 11169:44b5c183c3cd 12-Oct-2015 Andreas Hansson <andreas.hansson@arm.com>

misc: Add explicit overrides and fix other clang >= 3.5 issues

This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.

As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).


# 11168:f98eb2da15a4 12-Oct-2015 Andreas Hansson <andreas.hansson@arm.com>

misc: Remove redundant compiler-specific defines

This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.


# 11148:1bc3d93c7eaa 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

cpu: Add per-thread monitors

Adds per-thread address monitors to support FullSystem SMT.


# 11147:cc8d6e99cf46 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

config,cpu: Add SMT support to Atomic and Timing CPUs

Adds SMT support to the "simple" CPU models so that they can be
used with other SMT-supported CPUs. Example usage: this enables
the TimingSimpleCPU to be used to warmup caches before swapping to
detailed mode with the in-order or out-of-order based CPU models.


# 10913:38dbdeea7f1f 07-Jul-2015 Andreas Sandberg <andreas.sandberg@arm.com>

sim: Refactor and simplify the drain API

The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.

This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.

Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.


# 10713:eddb533708cb 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Split port retry for all different packet classes

This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.


# 10653:e3fc6bc7f97e 22-Jan-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Clean up Request initialisation

This patch tidies up how we create and set the fields of a Request. In
essence it tries to use the constructor where possible (as opposed to
setPhys and setVirt), thus avoiding spreading the information across a
number of locations. In fact, setPhys is made private as part of this
patch, and a number of places where we callede setVirt instead uses
the appropriate constructor.


# 10529:05b5a6cf3521 06-Nov-2014 Marc Orr <morr@cs.wisc.edu>

x86 isa: This patch attempts an implementation at mwait.

Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
evicted from the sleeping cpu's cache. This eviction is forwarded to the
sleeping cpu, which then wakes up.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>


# 10464:2a0fe8bca031 16-Oct-2014 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Probe points for basic PMU stats

This changeset adds probe points that can be used to implement PMU
counters for CPU stats. The following probes are supported:

* BaseCPU::ppCycles / Cycles
* BaseCPU::ppRetiredInsts / RetiredInsts
* BaseCPU::ppRetiredLoads / RetiredLoads
* BaseCPU::ppRetiredStores / RetiredStores
* BaseCPU::ppRetiredBranches RetiredBranches


# 10407:a9023811bf9e 20-Sep-2014 Mitch Hayenga <mitch.hayenga@arm.com>

alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate

activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.


# 10379:c00f6d7e2681 19-Sep-2014 Andreas Hansson <andreas.hansson@arm.com>

arch: Pass faults by const reference where possible

This patch changes how faults are passed between methods in an attempt
to copy as few reference-counting pointer instances as possible. This
should avoid unecessary copies being created, contributing to the
increment/decrement of the reference counters.


# 10030:b531e328342d 24-Jan-2014 Ali Saidi <Ali.Saidi@ARM.com>

cpu: Add CPU support for generatig wake up events when LLSC adresses are snooped.

This patch add support for generating wake-up events in the CPU when an address
that is currently in the exclusive state is hit by a snoop. This mechanism is required
for ARMv8 multi-processor support.


# 9840:c562aa658a6f 20-Aug-2013 Andreas Hansson <andreas.hansson@arm.com>

cpu: Fix timing CPU isDrained comment formatting

This patch fixes up the comment formatting for isDrained in the timing
CPU.


# 9830:5995f4d33a11 19-Aug-2013 Andreas Hansson <andreas.hansson@arm.com>

cpu: Fix timing CPU drain check

This patch modifies the SimpleTimingCPU drain check to also consider
the fetch event. Previously, there was an assumption that there is
never a fetch event scheduled if the CPU is not executing
microcode. However, when a context is activated, a fetch even is
scheduled, and microPC() is zero.


# 9608:e2b6b86fda03 26-Mar-2013 Andreas Hansson <andreas.hansson@arm.com>

cpu: Remove CpuPort and use MasterPort in the CPU classes

This patch changes the port in the CPU classes to use MasterPort
instead of the derived CpuPort. The functions of the CpuPort are now
distributed across the relevant subclasses. The port accessor
functions (getInstPort and getDataPort) now return a MasterPort
instead of a CpuPort. This simplifies creating derivative CPUs that do
not use the CpuPort.


# 9523:b8c8437f71d9 15-Feb-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Refactor memory system checks

CPUs need to test that the memory system is in the right mode in two
places, when the CPU is initialized (unless it's switched out) and on
a drainResume(). This led to some code duplication in the CPU
models. This changeset introduces the verifyMemoryMode() method which
is called by BaseCPU::init() if the CPU isn't switched out. The
individual CPU models are responsible for calling this method when
resuming from a drain as this code is CPU model specific.


# 9442:36967173340c 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Make sure that a drained timing CPU isn't executing ucode

Currently, the timing CPU can be in the middle of a microcode sequence
or multicycle (stayAtPC is true) instruction when it is drained. This
leads to two problems:

* When switching to a hardware virtualized CPU, we obviously can't
execute gem5 microcode.

* If stayAtPC is true we might execute half of an instruction twice
when restoring a checkpoint or switching CPUs, which leads to an
incorrect execution.

After applying this patch, the CPU will be on a proper instruction
boundary, which means that it is safe to switch to any CPU model
(including hardware virtualized ones). This changeset also fixes a bug
where the timing CPU sometimes switches out with while stayAtPC is
true, which corrupts the target state after a CPU switch or
checkpoint.

Note: This changeset removes the so_state variable from checkpoints
since the drain state isn't used anymore.


# 9342:6fec8f26e56d 02-Nov-2012 Andreas Sandberg <Andreas.Sandberg@arm.com>

sim: Move the draining interface into a separate base class

This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.


# 9258:baa17ba80e06 25-Sep-2012 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Squash outstanding walks when instructions are squashed.


# 9180:ee8d7a51651d 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Add a Cycles wrapper class and use where applicable

This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.

Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.

This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.

In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.

An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.


# 9179:666bc9df1e49 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Rework clocks to avoid tick-to-cycle transformations

This patch introduces the notion of a clock update function that aims
to avoid costly divisions when turning the current tick into a
cycle. Each clocked object advances a private (hidden) cycle member
and a tick member and uses these to implement functions for getting
the tick of the next cycle, or the tick of a cycle some time in the
future.

In the different modules using the clocks, changes are made to avoid
counting in ticks only to later translate to cycles. There are a few
oddities in how the O3 and inorder CPU count idle cycles, as seen by a
few locations where a cycle is subtracted in the calculation. This is
done such that the regression does not change any stats, but should be
revisited in a future patch.

Another, much needed, change that is not done as part of this patch is
to introduce a new typedef uint64_t Cycle to be able to at least hint
at the unit of the variables counting Ticks vs Cycles. This will be
done as a follow-up patch.

As an additional follow up, the thread context still uses ticks for
the book keeping of last activate and last suspend and this should
probably also be changed into cycles as well.


# 9095:0e6bd7082fac 09-Jul-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Align port names in C++ and Python

This patch is a first step to align the port names used in the Python
world and the C++ world. Ultimately it serves to make the use of
config.json together with output from the simulation easier, including
post-processing of statistics.

Most notably, the CPU, cache, and bus is addressed in this patch, and
there might be other ports that should be updated accordingly. The
dash name separator has also been replaced with a "." which is what is
used to concatenate the names in python, and a separation is made
between the master and slave port in the bus.


# 9087:b5a084a6159b 09-Jul-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Move retry from port base class to Master/SlavePort

This patch is the last part of moving all protocol-related
functionality out of the Port base class. All the send/recv functions
are already moved, and the retry (which still governs all the timing
transport functions) is the only part that remained in the base class.

The only point where this currently causes a bit of inconvenience is
in the bus where the retry list is global and holds Port pointers (not
Master/SlavePort). This is about to change with the split into a
request/response bus and will soon be removed anyway.

The patch has no impact on any regressions.


# 9066:35ac3a6f8ee0 08-Jun-2012 Andreas Hansson <andreas.hansson@arm.com>

Timing CPU: Remove a redundant port pointer

This patch is trivial and merely prunes a pointer that was never set
or used.


# 8975:7f36d4436074 01-May-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.


# 8948:e95ee70f876c 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate snoops and normal memory requests/responses

This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.


# 8850:ed91b534ed04 24-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Round-two unifying instr/data CPU ports across models

This patch continues the unification of how the different CPU models
create and share their instruction and data ports. Most importantly,
it forces every CPU to have an instruction and a data port, and gives
these ports explicit getters in the BaseCPU (getDataPort and
getInstPort). The patch helps in simplifying the code, make
assumptions more explicit, andfurther ease future patches related to
the CPU ports.

The biggest changes are in the in-order model (that was not modified
in the previous unification patch), which now moves the ports from the
CacheUnit to the CPU. It also distinguishes the instruction fetch and
load-store unit from the rest of the resources, and avoids the use of
indices and casting in favour of keeping track of these two units
explicitly (since they are always there anyways). The atomic, timing
and O3 model simply return references to their already existing ports.


# 8737:770ccf3af571 31-Jan-2012 Koan-Sin Tan <koansin.tan@gmail.com>

clang: Enable compiling gem5 using clang 2.9 and 3.0

This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).

clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.


# 8707:489489c67fd9 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Moving towards a more general port across CPU models

This patch performs minimal changes to move the instruction and data
ports from specialised subclasses to the base CPU (to the largest
degree possible). Ultimately it servers to make the CPU(s) have a
well-defined interface to the memory sub-system.


# 8706:b1838faf3bcc 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Add port proxies instead of non-structural ports

Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy


# 8444:56de1f9320df 03-Jul-2011 Gabe Black <gblack@eecs.umich.edu>

ExecContext: Rename the readBytes/writeBytes functions to readMem and writeMem.

readBytes and writeBytes had the word "bytes" in their names because they
accessed blobs of bytes. This distinguished them from the read and write
functions which handled higher level data types. Because those functions don't
exist any more, this change renames readBytes and writeBytes to more general
names, readMem and writeMem, which reflect the fact that they are how you read
and write memory. This also makes their names more consistent with the
register reading/writing functions, although those are still read and set for
some reason.


# 8443:530ff1bc8d70 03-Jul-2011 Gabe Black <gblack@eecs.umich.edu>

ExecContext: Get rid of the now unused read/write templated functions.


# 8229:78bf55f23338 15-Apr-2011 Nathan Binkert <nate@binkert.org>

includes: sort all includes


# 7945:32758425de8c 11-Feb-2011 Ali Saidi <Ali.Saidi@ARM.com>

SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk occurs.

This change fixes an issue where a DTLB fault occurs and redirects fetch to
handle the fault and the ITLB requires a walk which delays translation. In this
case the status of the cpu isn't updated appropriately, and an additional
instruction fetch occurs. Eventually this hits an assert as multiple instruction
fetches are occuring in the system and when the second one returns the
processor is in the wrong state.

Some asserts below are removed because it was always true (typo) and the state
after the initiateAcc() the processor could be in any valid state when a
d-side fault occurs.


# 7944:1daf51f62013 11-Feb-2011 Giacomo Gabrielli <Giacomo.Gabrielli@arm.com>

O3: Enhance data address translation by supporting hardware page table walkers.

Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs,
when a TLB miss occurs, initiateTranslation() can return with NoFault but with
the translation unfinished.

Instructions experiencing a delayed translation due to a hardware page table
walk are deferred until the translation completes and kept into the IQ. In
order to keep track of them, the IQ has been augmented with a queue of the
outstanding delayed memory instructions. When their translation completes,
instructions are re-executed (only their initiateAccess() was already
executed; their DTB translation is now skipped). The IEW stage has been
modified to support such a 2-pass execution.


# 7745:434b5dfb87d9 15-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

CPU: Fix bug when a split transaction is issued to a faster cache

In the case of a split transaction and a cache that is faster than a CPU we
could get two responses before next_tick expires. Add an event that is
scheduled in this case and return false rather than asserting.


# 7520:67c670459d01 13-Aug-2010 Gabe Black <gblack@eecs.umich.edu>

CPU: Add readBytes and writeBytes functions to the exec contexts.


# 6973:a123bd350935 12-Feb-2010 Timothy M. Jones <tjones1@inf.ed.ac.uk>

BaseDynInst: Make the TLB translation timing instead of atomic.

This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.

The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.


# 6023:47b4fcb10c11 09-Apr-2009 Nathan Binkert <nate@binkert.org>

tlb: More fixing of unified TLB


# 6022:410194bb3049 09-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

tlb: Don't separate the TLB classes into an instruction TLB and a data TLB


# 5894:8091ac99341a 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it.


# 5890:bdef71accd68 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: Get rid of translate... functions from various interface classes.


# 5744:342cbc20a188 14-Nov-2008 Gabe Black <gblack@eecs.umich.edu>

CPU: Refactor read/write in the simple timing CPU.


# 5728:9574f561dfa2 10-Nov-2008 Gabe Black <gblack@eecs.umich.edu>

CPU: Make unaligned accesses work in the timing simple CPU.


# 5710:b44dd45bd604 27-Oct-2008 Clint Smullen <cws3k@cs.virginia.edu>

CPU: The API change to EventWrapper did not get propagated to the entirety of TimingSimpleCPU.
The constructor no-longer schedules an event at construction and the implict conversion between int and bool was allowing the old code to compile without warning.

Signed-off By: Ali Saidi


# 5606:6da7a58b0bc8 09-Oct-2008 Nathan Binkert <nate@binkert.org>

eventq: convert all usage of events to use the new API.
For now, there is still a single global event queue, but this is
necessary for making the steps towards a parallelized m5.


# 5529:9ae69b9cd7fd 11-Aug-2008 Nathan Binkert <nate@binkert.org>

params: Convert the CPU objects to use the auto generated param structs.
A whole bunch of stuff has been converted to use the new params stuff, but
the CPU wasn't one of them. While we're at it, make some things a bit
more stylish. Most of the work was done by Gabe, I just cleaned stuff up
a bit more at the end.


# 5496:6899b894166f 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

After a checkpoint (and thus a stats reset), the not_idle_fraction/notIdleFraction statistic is really wrong.
The notIdleFraction statistic isn't updated when the statistics reset, probably because the cpu Status information
was pulled into the atomic and timing cpus. This changeset pulls Status back into the BaseSimpleCPU object. Anyone
care to comment on the odd naming of the Status instance? It shouldn't just be status because that is confusing
with Port::Status, but _status seems a bit strage too.


# 5336:c7e21f4e5a2e 06-Feb-2008 Stephen Hines <hines@cs.fsu.edu>

Make the Event::description() a const function


# 5315:30997e988446 02-Jan-2008 Steve Reinhardt <stever@gmail.com>

Additional comments and helper functions for PrintReq.


# 5177:4307a768e10e 22-Oct-2007 Gabe Black <gblack@eecs.umich.edu>

CPU: Add functions to the "ExecContext"s that translate a given address.


# 5169:bfd18d401251 18-Oct-2007 Ali Saidi <saidi@eecs.umich.edu>

CPU: Use the ThreadContext cpu id instead of the params cpu id in all cases.


# 5103:391933804192 01-Oct-2007 Ali Saidi <saidi@eecs.umich.edu>

CPU: fix sparc_fs booting with SimpleTimingCPU.


# 4873:b135f6e6adfe 30-Jun-2007 Steve Reinhardt <stever@eecs.umich.edu>

Event descriptions should not end in "event"
(they function as adjectives not nouns)


# 4475:fb185cc1c845 22-May-2007 Steve Reinhardt <stever@eecs.umich.edu>

Change getDeviceAddressRanges to use bool for snoop arg.


# 4471:4d86c4d096ad 21-May-2007 Steve Reinhardt <stever@eecs.umich.edu>

Add new EventWrapper constructor that takes a Tick value
and schedules the event immediately.


# 4192:7accc6365bb9 09-Mar-2007 Kevin Lim <ktlim@umich.edu>

Two fixes:
1. Make sure connectMemPorts() only gets called when the CPU's peer gets changed. This is done by making setPeer() virtual, and overriding it in the CPU's ports. When it gets called on a CPU's port (dcache specifically), it calls the normal setPeer() function, and also connectMemPorts().
2. Consolidate redundant code that handles switching in a CPU.

src/cpu/base.cc:
Move common code of switching over peers to base CPU.
src/cpu/base.hh:
Move common code of switching over peers to BaseCPU.
src/cpu/o3/cpu.cc:
Add in function that updates thread context's ports.
Also use updated function to takeOverFrom() in BaseCPU. This gets rid of some repeated code.
src/cpu/o3/cpu.hh:
Include function to update thread context's memory ports.
src/cpu/o3/lsq.hh:
Add function to dcache port that will update the memory ports upon getting a new peer.
Also include a function that will tell the CPU to update those memory ports.
src/cpu/o3/lsq_impl.hh:
Add function that will update the memory ports upon getting a new peer.
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Add function that will update thread context's memory ports upon getting a new peer.
Also use the new BaseCPU's take over from function.
src/cpu/simple/atomic.hh:
Add in function (and dcache port) that will allow the dcache to update memory ports when it gets assigned a new peer.
src/cpu/simple/timing.hh:
Add function that will update thread context's memory ports upon getting a new peer.
src/mem/port.hh:
Make setPeer virtual so that other classes can override it.


# 3846:a0fe3210ce53 15-Dec-2006 Lisa Hsu <hsul@eecs.umich.edu>

little fixes i noticed while searching for reason for address range issues (but these weren't the cause of the problem).

RangeSize as a function takes a start address, and a SIZE, and will make the range (start, start+size-1) for you.

src/cpu/memtest/memtest.hh:
src/cpu/o3/fetch.hh:
src/cpu/o3/lsq.hh:
src/cpu/ozone/front_end.hh:
src/cpu/ozone/lw_lsq.hh:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.hh:
Fix RangeSize arguments
src/dev/alpha/tsunami_cchip.cc:
src/dev/alpha/tsunami_io.cc:
src/dev/alpha/tsunami_pchip.cc:
src/dev/baddev.cc:
pioSize indicates SIZE, not a mask


# 3647:8121d4503cbc 13-Nov-2006 Ron Dreslinski <rdreslin@umich.edu>

Make CPU models signal to update the snoop ranges


# 3401:1df0cb879413 31-Oct-2006 Kevin Lim <ktlim@umich.edu>

Ports now have a pointer to the MemObject that owns it (can be NULL).

src/cpu/simple/atomic.hh:
Port now takes in the MemObject that owns it.
src/cpu/simple/timing.hh:
Port now takes in MemObject that owns it.
src/dev/io_device.cc:
src/mem/bus.hh:
Ports now take in the MemObject that owns it.
src/mem/cache/base_cache.cc:
Ports now take in the MemObject that own it.
src/mem/port.hh:
src/mem/tport.hh:
Ports now optionally take in the MemObject that owns it.


# 3349:fec4a86fa212 20-Oct-2006 Nathan Binkert <binkertn@umich.edu>

Use PacketPtr everywhere


# 3230:e86a03911728 09-Oct-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/o3-merge/newmem

src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
src/cpu/simple/timing.hh:
tests/configs/o3-timing-mp.py:
Hand merge.


# 3222:19bd4dd3be83 08-Oct-2006 Kevin Lim <ktlim@umich.edu>

Record numCycles properly.

src/cpu/simple/timing.cc:
Record numCycles stat properly.
src/cpu/simple/timing.hh:
Extra variable to help record numCycles stat.


# 3192:f3e215dda3f6 09-Oct-2006 Ron Dreslinski <rdreslin@umich.edu>

Have cpus send snoop ranges


# 3170:37fd1e73f836 08-Oct-2006 Steve Reinhardt <stever@eecs.umich.edu>

Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
Note that properly setting cpu_id on all CPUs is now required
for correct operation.

src/arch/SConscript:
src/base/traceflags.py:
src/cpu/base.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
src/python/m5/objects/BaseCPU.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.


# 2948:ae26cf37957c 20-Jul-2006 Ali Saidi <saidi@eecs.umich.edu>

Enforce the timing cpu ticking at it's clock rate
Add a max time option in seconds and a single system root clock be 1THz

configs/test/fs.py:
Add a max time option in seconds and a single system root clock be 1THz
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Enforce the timing cpu ticking at it's clock rate


# 2901:f9a45473ab55 12-Jul-2006 Ali Saidi <saidi@eecs.umich.edu>

memory mode information now contained in system object
States are now running, draining, or drained. memory state information moved into system object
system parameter is not fs only for cpus
Implement drain() support in devices
Update for drain() call that returns number of times drain_event->process() will be called

Break O3 CPU! No sense in putting in a hack change that kevin is going to remove in a few minutes i imagine

src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
Since se mode has a system, allow access to it
Verify that the atomic cpu is connected to an atomic system on resume
src/cpu/simple/base.cc:
Since se mode has a system, allow access to it
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Update for new drain() call that returns number of times drain_event->process() will be called and memory state being moved into the system
Since se mode has a system, allow access to it
Verify that the timing cpu is connected to an timing system on resume
src/dev/ide_disk.cc:
src/dev/io_device.cc:
src/dev/io_device.hh:
src/dev/ns_gige.cc:
src/dev/ns_gige.hh:
src/dev/pcidev.cc:
src/dev/pcidev.hh:
src/dev/sinic.cc:
src/dev/sinic.hh:
Implement drain() support in devices
src/python/m5/config.py:
Allow drain to return number of times drain_event->process() will be called. Normally 0 or 1 but things like O3 cpu or devices with multiple ports may want to call it many times
src/python/m5/objects/BaseCPU.py:
move system parameter out of fs to everyone
src/sim/sim_object.cc:
src/sim/sim_object.hh:
States are now running, draining, or drained. memory state information moved into system object
src/sim/system.cc:
src/sim/system.hh:
memory mode information now contained in system object


# 2869:4dbf4770df29 07-Jul-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-merge


# 2867:cc92d58a3210 07-Jul-2006 Kevin Lim <ktlim@umich.edu>

Switch out fixes for CPUs.

src/cpu/o3/cpu.cc:
Fix up keeping proper state when switched out and drained.
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Keep track of the event we use to schedule fetch initially and upon resume. We may have to cancel the event if the CPU is switched out.


# 2856:89691405ec9c 07-Jul-2006 Ron Dreslinski <rdreslin@umich.edu>

Update cpus to use the getPort function to use a connector object to connect the I/D cache ports to memory

configs/test/test.py:
Update to use new cpu getPort functionality
src/cpu/base.cc:
Make cpu's a memObject to expose getPort interface
src/cpu/base.hh:
Make cpu's a memObject to export getPort interface
src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Now use the connector via getPort interface
src/mem/cache/base_cache.cc:
Make sure the cache recognizes all port names


# 2839:d5dd8a3cdea0 05-Jul-2006 Kevin Lim <ktlim@umich.edu>

Rename quiesce to drain to avoid confusion with the pseudo instruction.

src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/python/m5/__init__.py:
src/python/m5/config.py:
src/sim/main.cc:
src/sim/sim_events.cc:
src/sim/sim_events.hh:
src/sim/sim_object.cc:
src/sim/sim_object.hh:
Rename quiesce to drain.


# 2798:751e9170247e 29-Jun-2006 Kevin Lim <ktlim@umich.edu>

Various fixes for the CPU models to support the features that have been moved to python.

src/cpu/base.cc:
src/cpu/base.hh:
src/cpu/simple/atomic.hh:
Switching out no longer takes a sampler.
src/cpu/simple/atomic.cc:
Fix up switching out. Also fix up serialization; the nameOut() was messing up the ordering.
src/cpu/simple/timing.cc:
Add in quiesce, fix up serialization.
src/cpu/simple/timing.hh:
Add in queisce, fix up serialization.


# 2665:a124942bacb8 31-May-2006 Ali Saidi <saidi@eecs.umich.edu>

Updated Authors from bk prs info


# 2657:b119b774656b 30-May-2006 Ali Saidi <saidi@eecs.umich.edu>

Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since
it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Port Blocked/Unblocked and replaced with sendRetry().
Remove possibility of packet mangling if packet is going to be refused anyway in bridge

src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
src/dev/io_device.cc:
src/dev/io_device.hh:
Make DMA Timing requests/responses work.
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
src/mem/bridge.cc:
src/mem/bridge.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Port Blocked/Unblocked and replaced with sendRetry().
Remove posibility of packet mangling if packet is going to be refused anyway.
src/mem/bus.cc:
src/mem/bus.hh:
Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since
it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then
src/mem/port.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Blocked/Unblocked port status, their functionality is really duplicated in the recvRetry() method


# 2644:8a45565c2c04 26-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Fixes for TimingSimpleCPU under full system. Now boots Alpha Linux!

src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
Move traceData->finalize() into postExecute().
src/cpu/simple/timing.cc:
Fixes for full system. Now boots Alpha Linux!
- Handle ifetch faults, suspend/resume.
- Delete memory request & packet objects on response.
- Don't try to do split memory accesses on prefetch references
(ISA description doesn't support this).
src/cpu/simple/timing.hh:
Minor reorganization of internal methods.


# 2640:266b80dd5eca 26-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Add names to memory Port objects for tracing.


# 2632:1bb2f91485ea 22-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

New directory structure:
- simulator source now in 'src' subdirectory
- imported files from 'ext' repository
- support building in arbitrary places, including
outside of the source tree. See comment at top
of SConstruct file for more details.
Regression tests are temporarily disabled; that
syetem needs more extensive revisions.

SConstruct:
Update for new directory structure.
Modify to support build trees that are not subdirectories
of the source tree. See comment at top of file for
more details.
Regression tests are temporarily disabled.
src/arch/SConscript:
src/arch/isa_parser.py:
src/python/SConscript:
Update for new directory structure.