History log of /gem5/src/cpu/simple/timing.cc
Revision Date Author Comments
# 14297:b4519e586f5e 10-Sep-2019 Jordi Vaquero <jordi.vaquero@metempsy.com>

cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>

This change is based on modify the way we move the AtomicOpFunctor*
through gem5 in order to mantain proper ownership of the object and
ensuring its destruction when it is no longer used.

Doing that we fix at the same time a memory leak in Request.hh
where we were assigning a new AtomicOpFunctor* without destroying the
previous one.

This change creates a new type AtomicOpFunctor_ptr as a
std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except
for its only usage when AtomicOpFunc() is called.

Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 14085:0075b0d29d55 28-Jun-2019 Giacomo Travaglini <giacomo.travaglini@arm.com>

cpu: isDrained renamed to isCpuDrained

cpu models inheriting from BaseCPU implement a draining checker called
isDrained. This hides the base Drainable::isDrained method and might
create confusion in the reader.
This patch is renaming it to isCpuDrained in order to avoid any
ambiguity

Change-Id: Ie5221da6a4673432c2403996e42d451cae960bbf
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19468
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13954:2f400a5f2627 07-Jul-2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com>

cpu,mem: Add support for partial loads/stores and wide mem. accesses

This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 13652:45d94ac03a27 22-Jan-2018 Tuan Ta <qtt2@cornell.edu>

cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 12769:f9c0d0a09dac 02-Apr-2018 Tuan Ta <qtt2@cornell.edu>

cpu: Prevent suspended TimingSimple CPUs from fetching next instructions

In TimingSimpleCPU model, when a CPU is suspended by a syscall (e.g.,
futex(FUTEX_WAIT)), the CPU waits for another CPU to wake it up
(e.g., FUTEX_WAKE operation). While staying Idle, the suspended CPU
should not try to fetch next instructions after the syscall.

This patch added a status check before a CPU schedule a fetch event
after a fault is handled.

Change-Id: I0cc953135686c9b35afe94942aa1d0b245ec60a2
Reviewed-on: https://gem5-review.googlesource.com/8181
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>


# 12749:223c83ed9979 04-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Using smart pointers for memory Requests

This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12748:ae5ce8e42de7 03-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Substitute pointer to Request with aliased RequestPtr

Every usage of Request* in the code has been replaced with the
RequestPtr alias. This is a preparing patch for when RequestPtr will be
the typdefed to a smart pointer to Request rather then a raw pointer to
Request.

Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10995
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 12386:2bf5fb25a5f1 13-Dec-2017 Gabe Black <gabeblack@google.com>

arm,sparc,x86,base,cpu,sim: Replace the Twin(32|64)_t types with.

Replace them with std::array<>s.

Change-Id: I76624c87a1cd9b21c386a96147a18de92b8a8a34
Reviewed-on: https://gem5-review.googlesource.com/6602
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12355:568ec3a0c614 07-Feb-2017 Nikos Nikoleris <nikos.nikoleris@arm.com>

cpu: Add support for CMOs in the cpu models

Cache maintenance operations go through the write channel of the
cpu. This changes makes sure that the cpu does not try to fill in the
packet with data.

Change-Id: Ic83205bb1cda7967636d88f15adcb475eb38d158
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5055
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12284:b91c036913da 20-Jul-2017 Jose Marinho <jose.marinho@arm.com>

cpu, cpu, sim: move Cycle probe update

Move the code responsible for performing the actual probe point notify
into BaseCPU. Use BaseCPU activateContext and suspendContext to keep
track of sleep cycles. Create a probe point (ppActiveCycles) that does
not count cycles where the processor was asleep. Rename ppCycles
to ppAllCycles to reflect its nature.

Change-Id: I1907ddd07d0ff9f2ef22cc9f61f5f46c630c9d66
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5762
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>


# 12276:22c220be30c5 16-Mar-2017 Anouk Van Laer <anouk.vanlaer@arm.com>

pwr: Adds logic to enter power gating for the cpu model

If the CPU has been clock gated for a sufficient amount of time
(configurable via pwrGatingLatency), the CPU will go into the OFF
power state. This does not model hardware, just behaviour.

Change-Id: Ib3681d1ffa6ad25eba60f47b4020325f63472d43
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3969
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12085:de78ea63e0ca 07-Jun-2017 Sean Wilson <spwilson2@wisc.edu>

cpu, gpu-compute: Replace EventWrapper use with EventFunctionWrapper

Change-Id: Idd5992463bcf9154f823b82461070d1f1842cea3
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3746
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>


# 11877:5ea85692a53e 20-Jul-2015 Brandon Potter <brandon.potter@amd.com>

syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.


# 11793:ef606668d247 09-Nov-2016 Brandon Potter <brandon.potter@amd.com>

style: [patch 1/22] use /r/3648/ to reorganize includes


# 11608:6319a1125f1c 14-Aug-2016 Nikos Nikoleris <nikos.nikoleris@arm.com>

cpu, arch: fix the type used for the request flags

Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11526:5b81895e5d5e 06-Jun-2016 David Guillen Fandos <david.guillen@arm.com>

pwr: Low-power idle power state for idle CPUs

Add functionality to the BaseCPU that will put the entire CPU
into a low-power idle state whenever all threads in it are idle.

Change-Id: I984d1656eb0a4863c87ceacd773d2d10de5cfd2b


# 11435:0f1b46dde3fa 07-Apr-2016 Mitch Hayenga <mitch.hayenga@arm.com>

mem: Remove threadId from memory request class

In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.


# 11429:cf5af0cc3be4 06-Apr-2016 Andreas Sandberg <andreas.sandberg@arm.com>

Revert power patch sets with unexpected interactions

The following patches had unexpected interactions with the current
upstream code and have been reverted for now:

e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11428:20264eb69fbf 05-Apr-2016 Mitch Hayenga <mitch.hayenga@arm.com>

mem: Remove threadId from memory request class

In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.


# 11423:831c7f2f9e39 09-Dec-2014 Akash Bagdia <akash.bagdia@ARM.com>

power: Low-power idle power state for idle CPUs

Add functionality to the BaseCPU that will put the entire CPU into a low-power
idle state whenever all threads in it are idle.


# 11356:a80884911971 19-Jul-2015 Krishnendra Nathella <krinat01@arm.com>

cpu: Fix LLSC atomic CPU wakeup

Writes to locked memory addresses (LLSC) did not wake up the locking
CPU. This can lead to deadlocks on multi-core runs. In AtomicSimpleCPU,
recvAtomicSnoop was checking if the incoming packet was an invalidation
(isInvalidate) and only then handled a locked snoop. But, writes are
seen instead of invalidates when running without caches (fast-forward
configurations). As as simple fix, now handleLockedSnoop is also called
even if the incoming snoop packet are from writes.


# 11321:02e930db812d 06-Feb-2016 Steve Reinhardt <steve.reinhardt@amd.com>

style: fix missing spaces in control statements

Result of running 'hg m5style --skip-all --fix-control -a'.


# 11320:42ecb523c64a 06-Feb-2016 Steve Reinhardt <steve.reinhardt@amd.com>

style: remove trailing whitespace

Result of running 'hg m5style --skip-all --fix-white -a'.


# 11303:f694764d656d 17-Jan-2016 Steve Reinhardt <steve.reinhardt@amd.com>

cpu. arch: add initiateMemRead() to ExecContext interface

For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.

This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.

This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().

For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level. For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.


# 11151:ca4ea9b5c052 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

cpu,isa,mem: Add per-thread wakeup logic

Changes wakeup functionality so that only specific threads on SMT
capable cpus are woken.


# 11148:1bc3d93c7eaa 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

cpu: Add per-thread monitors

Adds per-thread address monitors to support FullSystem SMT.


# 11147:cc8d6e99cf46 30-Sep-2015 Mitch Hayenga <mitch.hayenga@arm.com>

config,cpu: Add SMT support to Atomic and Timing CPUs

Adds SMT support to the "simple" CPU models so that they can be
used with other SMT-supported CPUs. Example usage: this enables
the TimingSimpleCPU to be used to warmup caches before swapping to
detailed mode with the in-order or out-of-order based CPU models.


# 10913:38dbdeea7f1f 07-Jul-2015 Andreas Sandberg <andreas.sandberg@arm.com>

sim: Refactor and simplify the drain API

The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.

This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.

Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.


# 10774:68d688cbe26c 03-Apr-2015 Nikos Nikoleris <nikos.nikoleris@gmail.com>

cpu: fix system total instructions accounting

The totalInstructions counter is only incremented when the whole instruction is
commited and not on every microop. It was incorrectly reset in atomic and
timing cpus.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>"


# 10713:eddb533708cb 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Split port retry for all different packet classes

This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.


# 10669:aae98c1cf4a0 03-Feb-2015 Andreas Hansson <andreas.hansson@arm.com>

cpu: Ensure timing CPU sinks response before sending new request

This patch changes how the timing CPU deals with processing responses,
always scheduling an event, even if it is for the current tick. This
helps to avoid situations where a new request shows up before a
response is finished in the crossbar, and also is more in line with
any realistic behaviour.


# 10665:aef704eaedd2 25-Jan-2015 Ali Saidi <Ali.Saidi@ARM.com>

sim: Clean up InstRecord

Track memory size and flags as well as add some comments and consts.


# 10653:e3fc6bc7f97e 22-Jan-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Clean up Request initialisation

This patch tidies up how we create and set the fields of a Request. In
essence it tries to use the constructor where possible (as opposed to
setPhys and setVirt), thus avoiding spreading the information across a
number of locations. In fact, setPhys is made private as part of this
patch, and a number of places where we callede setVirt instead uses
the appropriate constructor.


# 10596:1eec33d2fc52 05-Dec-2014 Gabe Black <gabeblack@google.com>

cpu: Only check for PC events on instruction boundaries.

Only the instruction address is actually checked, so there's no need to check
repeatedly while we're working through the microops of a macroop and that's
not changing.


# 10566:c99c8d2a7c31 02-Dec-2014 Andreas Hansson <andreas.hansson@arm.com>

mem: Assume all dynamic packet data is array allocated

This patch simplifies how we deal with dynamically allocated data in
the packet, always assuming that it is array allocated, and hence
should be array deallocated (delete[] as opposed to delete). The only
uses of dataDynamic was in the Ruby testers.

The ARRAY_DATA flag in the packet is removed accordingly. No
defragmentation of the flags is done at this point, leaving a gap in
the bit masks.

As the last part the patch, it renames dataDynamicArray to dataDynamic.


# 10533:d1dce0b728b6 12-Nov-2014 Ali Saidi <ali.saidi@arm.com>

arm: Fix timing wakeup with LLSC


# 10529:05b5a6cf3521 06-Nov-2014 Marc Orr <morr@cs.wisc.edu>

x86 isa: This patch attempts an implementation at mwait.

Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
evicted from the sleeping cpu's cache. This eviction is forwarded to the
sleeping cpu, which then wakes up.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>


# 10464:2a0fe8bca031 16-Oct-2014 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Probe points for basic PMU stats

This changeset adds probe points that can be used to implement PMU
counters for CPU stats. The following probes are supported:

* BaseCPU::ppCycles / Cycles
* BaseCPU::ppRetiredInsts / RetiredInsts
* BaseCPU::ppRetiredLoads / RetiredLoads
* BaseCPU::ppRetiredStores / RetiredStores
* BaseCPU::ppRetiredBranches RetiredBranches


# 10407:a9023811bf9e 20-Sep-2014 Mitch Hayenga <mitch.hayenga@arm.com>

alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate

activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.


# 10379:c00f6d7e2681 19-Sep-2014 Andreas Hansson <andreas.hansson@arm.com>

arch: Pass faults by const reference where possible

This patch changes how faults are passed between methods in an attempt
to copy as few reference-counting pointer instances as possible. This
should avoid unecessary copies being created, contributing to the
increment/decrement of the reference counters.


# 10342:711eb0e64249 13-May-2014 Curtis Dunham <Curtis.Dunham@arm.com>

mem: Refactor assignment of Packet types

Put the packet type swizzling (that is currently done in a lot of places)
into a refineCommand() member function.


# 10031:79d034cd6ba3 24-Jan-2014 Ali Saidi <Ali.Saidi@ARM.com>

cpu: Add support for instructions that zero cache lines.


# 10030:b531e328342d 24-Jan-2014 Ali Saidi <Ali.Saidi@ARM.com>

cpu: Add CPU support for generatig wake up events when LLSC adresses are snooped.

This patch add support for generating wake-up events in the CPU when an address
that is currently in the exclusive state is hit by a snoop. This mechanism is required
for ARMv8 multi-processor support.


# 10024:fc10e1f9f124 24-Jan-2014 Dam Sunwoo <dam.sunwoo@arm.com>

mem: per-thread cache occupancy and per-block ages

This patch enables tracking of cache occupancy per thread along with
ages (in buckets) per cache blocks. Cache occupancy stats are
recalculated on each stat dump.


# 10020:2f33cb012383 24-Jan-2014 Matt Horsnell <matt.horsnell@ARM.com>

mem: track per-request latencies and access depths in the cache hierarchy

Add some values and methods to the request object to track the translation
and access latency for a request and which level of the cache hierarchy responded
to the request.


# 9837:13a21202375d 19-Aug-2013 Lena Olson <lena@cs.wisc,edu>

cpu: Accurately count idle cycles for simple cpu

Added a couple missing updates to the notIdleFraction stat. Without
these, it sometimes gives a (not) idle fraction that is greater than 1
or less than 0.


# 9830:5995f4d33a11 19-Aug-2013 Andreas Hansson <andreas.hansson@arm.com>

cpu: Fix timing CPU drain check

This patch modifies the SimpleTimingCPU drain check to also consider
the fetch event. Previously, there was an assumption that there is
never a fetch event scheduled if the CPU is not executing
microcode. However, when a context is activated, a fetch even is
scheduled, and microPC() is zero.


# 9814:7ad2b0186a32 18-Jul-2013 Andreas Hansson <andreas.hansson@arm.com>

mem: Set the cache line size on a system level

This patch removes the notion of a peer block size and instead sets
the cache line size on the system level.

Previously the size was set per cache, and communicated through the
interconnect. There were plenty checks to ensure that everyone had the
same size specified, and these checks are now removed. Another benefit
that is not yet harnessed is that the cache line size is now known at
construction time, rather than after the port binding. Hence, the
block size can be locally stored and does not have to be queried every
time it is used.

A follow-on patch updates the configuration scripts accordingly.


# 9648:f10eb34e3e38 22-Apr-2013 Dam Sunwoo <dam.sunwoo@arm.com>

sim: separate nextCycle() and clockEdge() in clockedObjects

Previously, nextCycle() could return the *current* cycle if the current tick was
already aligned with the clock edge. This behavior is not only confusing (not
quite what the function name implies), but also caused problems in the
drainResume() function. When exiting/re-entering the sim loop (e.g., to take
checkpoints), the CPUs will drain and resume. Due to the previous behavior of
nextCycle(), the CPU tick events were being rescheduled in the same ticks that
were already processed before draining. This caused divergence from runs that
did not exit/re-entered the sim loop. (Initially a cycle difference, but a
significant impact later on.)

This patch separates out the two behaviors (nextCycle() and clockEdge()),
uses nextCycle() in drainResume, and uses clockEdge() everywhere else.
Nothing (other than name) should change except for the drainResume timing.


# 9524:d6ffa982a68b 15-Feb-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

sim: Add a system-global option to bypass caches

Virtualized CPUs and the fastmem mode of the atomic CPU require direct
access to physical memory. We currently require caches to be disabled
when using them to prevent chaos. This is not ideal when switching
between hardware virutalized CPUs and other CPU models as it would
require a configuration change on each switch. This changeset
introduces a new version of the atomic memory mode,
'atomic_noncaching', where memory accesses are inserted into the
memory system as atomic accesses, but bypass caches.

To make memory mode tests cleaner, the following methods are added to
the System class:

* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
* isTimingMode() -- True if the memory mode is 'timing'.
* bypassCaches() -- True if caches should be bypassed.

The old getMemoryMode() and setMemoryMode() methods should never be
used from the C++ world anymore.


# 9523:b8c8437f71d9 15-Feb-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Refactor memory system checks

CPUs need to test that the memory system is in the right mode in two
places, when the CPU is initialized (unless it's switched out) and on
a drainResume(). This led to some code duplication in the CPU
models. This changeset introduces the verifyMemoryMode() method which
is called by BaseCPU::init() if the CPU isn't switched out. The
individual CPU models are responsible for calling this method when
resuming from a drain as this code is CPU model specific.


# 9448:569d1e8f74e4 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Unify the serialization code for all of the CPU models

Cleanup the serialization code for the simple CPUs and the O3 CPU. The
CPU-specific code has been replaced with a (un)serializeThread that
serializes the thread state / context of a specific thread. Assuming
that the thread state class uses the CPU-specific thread state uses
the base thread state serialization code, this allows us to restore a
checkpoint with any of the CPU models.


# 9442:36967173340c 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Make sure that a drained timing CPU isn't executing ucode

Currently, the timing CPU can be in the middle of a microcode sequence
or multicycle (stayAtPC is true) instruction when it is drained. This
leads to two problems:

* When switching to a hardware virtualized CPU, we obviously can't
execute gem5 microcode.

* If stayAtPC is true we might execute half of an instruction twice
when restoring a checkpoint or switching CPUs, which leads to an
incorrect execution.

After applying this patch, the CPU will be on a proper instruction
boundary, which means that it is safe to switch to any CPU model
(including hardware virtualized ones). This changeset also fixes a bug
where the timing CPU sometimes switches out with while stayAtPC is
true, which corrupts the target state after a CPU switch or
checkpoint.

Note: This changeset removes the so_state variable from checkpoints
since the drain state isn't used anymore.


# 9433:34971d2e0019 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Rename defer_registration->switched_out

The defer_registration parameter is used to prevent a CPU from
initializing at startup, leaving it in the "switched out" mode. The
name of this parameter (and the help string) is confusing. This patch
renames it to switched_out, which should be more descriptive.


# 9429:7c787b8030c6 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Correctly call parent on switchOut() and takeOverFrom()

This patch cleans up the CPU switching functionality by making sure
that CPU models consistently call the parent on switchOut() and
takeOverFrom(). This has the following implications that might alter
current functionality:

* The call to BaseCPU::switchout() in the O3 CPU is moved from
signalDrained() (!) to switchOut().

* A call to BaseSimpleCPU::switchOut() is introduced in the simple
CPUs.


# 9424:d631aac65246 07-Jan-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

cpu: Check that the memory system is in the correct mode

This patch adds checks to all CPU models to make sure that the memory
system is in the correct mode at startup and when resuming after a
drain. Previously, we only checked that the memory system was in the
right mode when resuming. This is inadequate since this is a
configuration error that should be detected at startup as well as when
resuming. Additionally, since the check was done using an assert, it
wasn't performed when NDEBUG was set (e.g., the fast target).


# 9342:6fec8f26e56d 02-Nov-2012 Andreas Sandberg <Andreas.Sandberg@arm.com>

sim: Move the draining interface into a separate base class

This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.


# 9180:ee8d7a51651d 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Add a Cycles wrapper class and use where applicable

This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.

Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.

This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.

In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.

An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.


# 9179:666bc9df1e49 28-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Clock: Rework clocks to avoid tick-to-cycle transformations

This patch introduces the notion of a clock update function that aims
to avoid costly divisions when turning the current tick into a
cycle. Each clocked object advances a private (hidden) cycle member
and a tick member and uses these to implement functions for getting
the tick of the next cycle, or the tick of a cycle some time in the
future.

In the different modules using the clocks, changes are made to avoid
counting in ticks only to later translate to cycles. There are a few
oddities in how the O3 and inorder CPU count idle cycles, as seen by a
few locations where a cycle is subtracted in the calculation. This is
done such that the regression does not change any stats, but should be
revisited in a future patch.

Another, much needed, change that is not done as part of this patch is
to introduce a new typedef uint64_t Cycle to be able to at least hint
at the unit of the variables counting Ticks vs Cycles. This will be
done as a follow-up patch.

As an additional follow up, the thread context still uses ticks for
the book keeping of last activate and last suspend and this should
probably also be changed into cycles as well.


# 9165:f9e3dac185ba 22-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Packet: Remove NACKs from packet and its use in endpoints

This patch removes the NACK frrom the packet as there is no longer any
module in the system that issues them (the bridge was the only one and
the previous patch removes that).

The handling of NACKs was mostly avoided throughout the code base, by
using e.g. panic or assert false, but in a few locations the NACKs
were actually dealt with (although NACKs never occured in any of the
regressions). Most notably, the DMA port will now never receive a NACK
and the backoff time is thus never changed. As a consequence, the
entire backoff mechanism (similar to a PCI bus) is now removed and the
DMA port entirely relies on the bus performing the arbitration and
issuing a retry when appropriate. This is more in line with e.g. PCIe.

Surprisingly, this patch has no impact on any of the regressions. As
mentioned in the patch that removes the NACK from the bridge, a
follow-up patch should change the request and response buffer size for
at least one regression to also verify that the system behaves as
expected when the bridge fills up.


# 9152:86c0e6ca5e7c 15-Aug-2012 Anthony Gutierrez <atgutier@umich.edu>

O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs

This patch fixes some problems with the drain/switchout functionality
for the O3 cpu and for the ARM ISA and adds some useful debug print
statements.

This is an incremental fix as there are still a few bugs/mem leaks with the
switchout code. Particularly when switching from an O3CPU to a
TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA
I haven't encountered any more assertion failures; now the kernel will
typically panic inside of simulation.


# 9058:cc47e11ccec1 05-Jun-2012 Anthony Gutierrez <atgutier@umich.edu>

cpu: Don't init simple and inorder CPUs if they are defered.

initCPU() will be called to initialize switched out CPUs for the simple and
inorder CPU models. this patch prevents those CPUs from being initialized
because they should get their state from the active CPU when it is switched
out.


# 8975:7f36d4436074 01-May-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.


# 8949:3fa1ee293096 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Remove the Broadcast destination from the packet

This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.

Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).

The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.

In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.


# 8948:e95ee70f876c 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate snoops and normal memory requests/responses

This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.


# 8921:e53972f72165 30-Mar-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Unify initMemProxies across CPUs and simulation modes

This patch unifies where initMemProxies is called, in the init()
method of each BaseCPU subclass, before TheISA::initCPU is
called. Moreover, it also ensures that initMemProxies is called in
both full-system and syscall-emulation mode, thus unifying also across
the modes. An additional check is added in the ThreadState to ensure
that initMemProxies is only called once.


# 8850:ed91b534ed04 24-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Round-two unifying instr/data CPU ports across models

This patch continues the unification of how the different CPU models
create and share their instruction and data ports. Most importantly,
it forces every CPU to have an instruction and a data port, and gives
these ports explicit getters in the BaseCPU (getDataPort and
getInstPort). The patch helps in simplifying the code, make
assumptions more explicit, andfurther ease future patches related to
the CPU ports.

The biggest changes are in the in-order model (that was not modified
in the previous unification patch), which now moves the ports from the
CacheUnit to the CPU. It also distinguishes the instruction fetch and
load-store unit from the rest of the resources, and avoids the use of
indices and casting in favour of keeping track of these two units
explicitly (since they are always there anyways). The atomic, timing
and O3 model simply return references to their already existing ports.


# 8832:247fee427324 12-Feb-2012 Ali Saidi <Ali.Saidi@ARM.com>

mem: Add a master ID to each request object.

This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.


# 8809:bb10807da889 01-Feb-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with head, hopefully the last time for this batch.


# 8799:dac1e33e07b0 28-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

Merge with the main repo.


# 8793:5f25086326ac 18-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Get rid of FULL_SYSTEM in the CPU directory.


# 8779:2a590c51adb1 01-Nov-2011 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Expose the same methods on the CPUs in SE and FS modes.


# 8737:770ccf3af571 31-Jan-2012 Koan-Sin Tan <koansin.tan@gmail.com>

clang: Enable compiling gem5 using clang 2.9 and 3.0

This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).

clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.


# 8708:7ccbdea0fa12 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Simplify ports by removing EventManager

This patch removes the inheritance of EventManager from the ports and
moves all responsibility for event queues to the owner. Eventually the
event manager should be the interface block, which could either be the
structural owner or a subblock like a LSQ in the O3 CPU for example.


# 8707:489489c67fd9 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Moving towards a more general port across CPU models

This patch performs minimal changes to move the instruction and data
ports from specialised subclasses to the base CPU (to the largest
degree possible). Ultimately it servers to make the CPU(s) have a
well-defined interface to the memory sub-system.


# 8706:b1838faf3bcc 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Add port proxies instead of non-structural ports

Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy


# 8486:c4e77a9563f5 07-Aug-2011 Gabe Black <gblack@eecs.umich.edu>

Translation: Use a pointer type as the template argument.

This allows regular pointers and reference counted pointers without having to
use any shim structures or other tricks.


# 8444:56de1f9320df 03-Jul-2011 Gabe Black <gblack@eecs.umich.edu>

ExecContext: Rename the readBytes/writeBytes functions to readMem and writeMem.

readBytes and writeBytes had the word "bytes" in their names because they
accessed blobs of bytes. This distinguished them from the read and write
functions which handled higher level data types. Because those functions don't
exist any more, this change renames readBytes and writeBytes to more general
names, readMem and writeMem, which reflect the fact that they are how you read
and write memory. This also makes their names more consistent with the
register reading/writing functions, although those are still read and set for
some reason.


# 8443:530ff1bc8d70 03-Jul-2011 Gabe Black <gblack@eecs.umich.edu>

ExecContext: Get rid of the now unused read/write templated functions.


# 8277:bfaab04cb292 04-May-2011 Ali Saidi <Ali.Saidi@ARM.com>

CPU: Add some useful debug message to the timing simple cpu.


# 8276:66bb0d8ae8bf 04-May-2011 Ali Saidi <Ali.Saidi@ARM.com>

CPU: Fix a case where timing simple cpu faults can nest.

If we fault, change the state to faulting so that we don't fault again in the same cycle.


# 8232:b28d06a175be 15-Apr-2011 Nathan Binkert <nate@binkert.org>

trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help


# 8229:78bf55f23338 15-Apr-2011 Nathan Binkert <nate@binkert.org>

includes: sort all includes


# 8143:b0b94a7b7c1f 17-Mar-2011 Ali Saidi <Ali.Saidi@ARM.com>

ARM: Detect and skip udelay() functions in linux kernel.

This change speeds up booting, especially in MP cases, by not executing
udelay() on the core but instead skipping ahead tha amount of time that is being
delayed.


# 8105:906864dd0937 02-Mar-2011 Gabe Black <gblack@eecs.umich.edu>

Spelling: Fix the a spelling error by changing mmaped to mmapped.

There may not be a formally correct spelling for the past tense of mmap, but
mmapped is the spelling Google doesn't try to autocorrect. This makes sense
because it mirrors the past tense of map->mapped and not the past tense of
cape->caped.


# 7945:32758425de8c 11-Feb-2011 Ali Saidi <Ali.Saidi@ARM.com>

SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk occurs.

This change fixes an issue where a DTLB fault occurs and redirects fetch to
handle the fault and the ITLB requires a walk which delays translation. In this
case the status of the cpu isn't updated appropriately, and an additional
instruction fetch occurs. Eventually this hits an assert as multiple instruction
fetches are occuring in the system and when the second one returns the
processor is in the wrong state.

Some asserts below are removed because it was always true (typo) and the state
after the initiateAcc() the processor could be in any valid state when a
d-side fault occurs.


# 7911:267e1e16e51b 07-Feb-2011 Joel Hestness <hestness@cs.utexas.edu>

TimingSimpleCPU: split data sender state fix

In sendSplitData, keep a pointer to the senderState that may be updated after
the call to handle*Packet. This way, if the receiver updates the packet
senderState, it can still be accessed in sendSplitData.


# 7897:d9e8b1fd1a9f 07-Feb-2011 Joel Hestness <hestness@cs.utexas.edu>

mcpat: Adds McPAT performance counters

Updated patches from Rick Strong's set that modify performance counters for
McPAT


# 7823:dac01f14f20f 08-Jan-2011 Steve Reinhardt <steve.reinhardt@amd.com>

Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.


# 7745:434b5dfb87d9 15-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

CPU: Fix bug when a split transaction is issued to a faster cache

In the case of a split transaction and a cache that is faster than a CPU we
could get two responses before next_tick expires. Add an event that is
scheduled in this case and return false rather than asserting.


# 7725:00ea9430643b 08-Nov-2010 Ali Saidi <Ali.Saidi@ARM.com>

ARM/Alpha/Cpu: Change prefetchs to be more like normal loads.

This change modifies the way prefetches work. They are now like normal loads
that don't writeback a register. Previously prefetches were supposed to call
prefetch() on the exection context, so they executed with execute() methods
instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs
are blank, meaning that they get executed, but don't actually do anything.

On Alpha dead cache copy code was removed and prefetches are now normal ops.
They count as executed operations, but still don't do anything and IsMemRef is
not longer set on them.

On ARM IsDataPrefetch or IsInstructionPreftech is now set on all prefetch
instructions. The timing simple CPU doesn't try to do anything special for
prefetches now and they execute with the normal memory code path.


# 7720:65d338a8dba4 31-Oct-2010 Gabe Black <gblack@eecs.umich.edu>

ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.



This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.


# 7691:358c00c482f7 30-Sep-2010 Ali Saidi <Ali.Saidi@ARM.com>

CPU/Cache: Fix some errors exposed by valgrind


# 7678:f19b6a3a8cec 13-Sep-2010 Gabe Black <gblack@eecs.umich.edu>

Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.

Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.


# 7655:8bce423f2075 25-Aug-2010 Ali Saidi <ali.saidi@arm.com>

CPU: Print out traces for faluting inst when the flag ExecFaulting is set


# 7521:3c48b2b3cb83 13-Aug-2010 Gabe Black <gblack@eecs.umich.edu>

Merge with head.


# 7520:67c670459d01 13-Aug-2010 Gabe Black <gblack@eecs.umich.edu>

CPU: Add readBytes and writeBytes functions to the exec contexts.


# 7516:cfbbc9178e7a 12-Aug-2010 Joel Hestness <hestness@cs.utexas.edu>

TimingSimpleCPU: fix NO_ACCESS memory op handling

When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
to handle the case where the current status of the CPU is Running
and not DcacheWaitResponse or DTBWaitResponse


# 7046:d21d575a6f99 23-Mar-2010 Steve Reinhardt <steve.reinhardt@amd.com>

cpu: get rid of uncached access "events"
These recordEvent() calls could cause crashes since they
access the req pointer after it's potentially been
deleted during a failed translation call. (Similar
problem to the traceData bug fixed in the previous cset.)

Moving them above the translation call (as was done
recentlyi in cset 8b2b8e5e7d35) avoids the crash
but doesn't work, since at that point we don't know if
the access is uncached or not.

It's not clear why these calls are there, and no one
seems to use them, so we'll just delete them. If they
are needed, they should be moved to somewhere that's
guaranteed to be after the translation completes but
before the request is possibly deleted, e.g., in
finishTranslation().


# 7045:e21fe6a62b1c 23-Mar-2010 Steve Reinhardt <steve.reinhardt@amd.com>

cpu: fix exec tracing memory corruption bug
Accessing traceData (to call setAddress() and/or setData())
after initiating a timing translation was causing crashes,
since a failed translation could delete the traceData
object before returning.

It turns out that there was never a need to access traceData
after initiating the translation, as the traced data was
always available earlier; this ordering was merely
historical. Furthermore, traceData->setAddress() and
traceData->setData() were being called both from the CPU
model and the ISA definition, often redundantly.

This patch standardizes all setAddress and setData calls
for memory instructions to be in the CPU models and not
in the ISA definition. It also moves those calls above
the translation calls to eliminate the crashes.


# 7016:8b2b8e5e7d35 22-Mar-2010 Brad Beckmann <Brad.Beckmann@amd.com>

TimingSimpleCPU: Fixed uncacacheable request read bug

Previously the recording of an uncached read occurred after the request was
possibly deleted within the translateTiming function.


# 6973:a123bd350935 12-Feb-2010 Timothy M. Jones <tjones1@inf.ed.ac.uk>

BaseDynInst: Make the TLB translation timing instead of atomic.

This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.

The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.


# 6739:48d10ba361c9 11-Nov-2009 Gabe Black <gblack@eecs.umich.edu>

Mem: Eliminate the NO_FAULT request flag.


# 6658:f4de76601762 23-Sep-2009 Nathan Binkert <nate@binkert.org>

arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh


# 6227:a17798f2a52c 05-Jun-2009 Nathan Binkert <nate@binkert.org>

types: clean up types, especially signed vs unsigned


# 6221:58a3c04e6344 26-May-2009 Nathan Binkert <nate@binkert.org>

types: add a type for thread IDs and try to use it everywhere


# 6102:7fbf97dc6540 20-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

Mem: Change isLlsc to isLLSC.


# 6076:e141cc7896ce 19-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

Memory: Rename LOCKED for load locked store conditional to LLSC.


# 6043:19852407f5c9 19-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: If the simple CPU is already idle, just return from suspendContext, don't assert.


# 6023:47b4fcb10c11 09-Apr-2009 Nathan Binkert <nate@binkert.org>

tlb: More fixing of unified TLB


# 6022:410194bb3049 09-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

tlb: Don't separate the TLB classes into an instruction TLB and a data TLB


# 6012:47748a3b6ecf 12-Mar-2009 Steve Reinhardt <steve.reinhardt@amd.com>

cpu: fix minor endian issue with trace output
(no functional change)


# 5914:c92d57f579b1 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: Don't fetch when executing a macroop.
If the CPL changes mid macroop, the end of the instruction might not be
priveleged enough to execute the beginning.


# 5894:8091ac99341a 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it.


# 5891:73084c6bb183 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

ISA: Replace the translate functions in the TLBs with translateAtomic.


# 5890:bdef71accd68 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

CPU: Get rid of translate... functions from various interface classes.


# 5744:342cbc20a188 14-Nov-2008 Gabe Black <gblack@eecs.umich.edu>

CPU: Refactor read/write in the simple timing CPU.


# 5728:9574f561dfa2 10-Nov-2008 Gabe Black <gblack@eecs.umich.edu>

CPU: Make unaligned accesses work in the timing simple CPU.


# 5726:17157c5f7e15 10-Nov-2008 Gabe Black <gblack@eecs.umich.edu>

X86: Make the timing simple CPU handle variable length instructions.


# 5714:76abee886def 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

Add in Context IDs to the simulator. From now on, cpuId is almost never used,
the primary identifier for a hardware context should be contextId(). The
concept of threads within a CPU remains, in the form of threadId() because
sometimes you need to know which context within a cpu to manipulate.


# 5712:199d31b47f7b 02-Nov-2008 Lisa Hsu <hsul@eecs.umich.edu>

make BaseCPU the provider of _cpuId, and cpuId() instead of being scattered
across the subclasses. generally make it so that member data is _cpuId and
accessor functions are cpuId(). The ID val comes from the python (default -1 if
none provided), and if it is -1, the index of cpuList will be given. this has
passed util/regress quick and se.py -n4 and fs.py -n4 as well as standard
switch.


# 5710:b44dd45bd604 27-Oct-2008 Clint Smullen <cws3k@cs.virginia.edu>

CPU: The API change to EventWrapper did not get propagated to the entirety of TimingSimpleCPU.
The constructor no-longer schedules an event at construction and the implict conversion between int and bool was allowing the old code to compile without warning.

Signed-off By: Ali Saidi


# 5669:cbac62a59686 12-Oct-2008 Gabe Black <gblack@eecs.umich.edu>

X86: Don't fetch in the simple CPU if you're in the ROM.


# 5606:6da7a58b0bc8 09-Oct-2008 Nathan Binkert <nate@binkert.org>

eventq: convert all usage of events to use the new API.
For now, there is still a single global event queue, but this is
necessary for making the steps towards a parallelized m5.


# 5529:9ae69b9cd7fd 11-Aug-2008 Nathan Binkert <nate@binkert.org>

params: Convert the CPU objects to use the auto generated param structs.
A whole bunch of stuff has been converted to use the new params stuff, but
the CPU wasn't one of them. While we're at it, make some things a bit
more stylish. Most of the work was done by Gabe, I just cleaned stuff up
a bit more at the end.


# 5507:52bcc301b467 15-Jul-2008 Steve Reinhardt <stever@gmail.com>

Use ReadResp instead of LoadLockedResp for LoadLockedReq responses.


# 5497:89a6483d7047 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

Make the cached virtPort have a thread context so it can do everything that a newly created one can.


# 5496:6899b894166f 01-Jul-2008 Ali Saidi <saidi@eecs.umich.edu>

After a checkpoint (and thus a stats reset), the not_idle_fraction/notIdleFraction statistic is really wrong.
The notIdleFraction statistic isn't updated when the statistics reset, probably because the cpu Status information
was pulled into the atomic and timing cpus. This changeset pulls Status back into the BaseSimpleCPU object. Anyone
care to comment on the odd naming of the Status instance? It shouldn't just be status because that is confusing
with Port::Status, but _status seems a bit strage too.


# 5408:703f1779cc89 12-Jun-2008 Gabe Black <gblack@eecs.umich.edu>

CPU: Make the simple cpu trace data for loads/stores.


# 5348:7847a4bf9641 14-Feb-2008 Ali Saidi <saidi@eecs.umich.edu>

CPU: move the PC Events code to a place where the code won't be executed multiple times if an instruction faults.


# 5336:c7e21f4e5a2e 06-Feb-2008 Stephen Hines <hines@cs.fsu.edu>

Make the Event::description() a const function


# 5335:69d45f5f21a2 05-Feb-2008 Stephen Hines <hines@cs.fsu.edu>

Add base ARM code to M5


# 5315:30997e988446 02-Jan-2008 Steve Reinhardt <stever@gmail.com>

Additional comments and helper functions for PrintReq.


# 5310:4164e6bfcc8a 16-Dec-2007 Ali Saidi <saidi@eecs.umich.edu>

CPU: Update where the simple cpus read their cpu id from the thread context to init() to make sure they read the right value. This fixes a bug with multi-processor full-system configurations.


# 5221:dba788e614fe 08-Nov-2007 Ali Saidi <saidi@eecs.umich.edu>

TimingSimpleCPU: Add some DPRINTFs when the cpu suspends and resumes.


# 5177:4307a768e10e 22-Oct-2007 Gabe Black <gblack@eecs.umich.edu>

CPU: Add functions to the "ExecContext"s that translate a given address.


# 5169:bfd18d401251 18-Oct-2007 Ali Saidi <saidi@eecs.umich.edu>

CPU: Use the ThreadContext cpu id instead of the params cpu id in all cases.


# 5103:391933804192 01-Oct-2007 Ali Saidi <saidi@eecs.umich.edu>

CPU: fix sparc_fs booting with SimpleTimingCPU.


# 5101:8af5a6a6223d 28-Sep-2007 Ali Saidi <saidi@eecs.umich.edu>

Update stats for quiesced cycles


# 5100:7a0180040755 28-Sep-2007 Ali Saidi <saidi@eecs.umich.edu>

Rename cycles() function to ticks()


# 5099:8ff1345b3ae4 28-Sep-2007 Ali Saidi <saidi@eecs.umich.edu>

Update statistics to use cycles properly instead of ticks


# 5018:21795007349e 27-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Merge with head.


# 5012:c0a28154d002 27-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Merge with head


# 5001:31fda5c37c19 26-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Simple CPU: Don't trace instructions that fault. Otherwise they show up twice.


# 4998:51a0f9f59cc5 26-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Simple CPU: Make sure only instructions which complete without faulting are counted.


# 4997:e7380529bd2d 26-Aug-2007 Gabe Black <gblack@eecs.umich.edu>

Address Translation: Make SE mode use an actual TLB/MMU for translation like FS.


# 4986:b7c82ad6b3ef 24-Aug-2007 Ali Saidi <saidi@eecs.umich.edu>

Mem: Make errors in the memory system be responses, not requests. Fixes cache handling of error responses.


# 4928:951bd17db218 29-Jul-2007 Steve Reinhardt <stever@eecs.umich.edu>

Merge Gabe's changes from head.


# 4918:3214e3694fb2 27-Jul-2007 Nathan Binkert <nate@binkert.org>

Merge python and x86 changes with cache branch


# 4881:3e4b4f6ff9dd 02-Jul-2007 Steve Reinhardt <stever@eecs.umich.edu>

Couple more minor bug fixes for FS timing mode.

src/cpu/simple/timing.cc:
Fix another SC problem.
src/mem/cache/cache_impl.hh:
Forgot to call makeTimingResponse() on uncached timing responses.


# 4880:4de4d072e977 02-Jul-2007 Steve Reinhardt <stever@eecs.umich.edu>

Fix a couple LL/SC bugs that only affected timing mode.

src/cpu/simple/timing.cc:
Fix swap/stq_c command bug.
src/mem/packet.cc:
Fix incorrect LoadLockedReq command response field.


# 4878:5b747482d2d8 30-Jun-2007 Steve Reinhardt <stever@eecs.umich.edu>

Make CPU models use new LoadLockedReq/StoreCondReq commands.


# 4870:fcc39d001154 30-Jun-2007 Steve Reinhardt <stever@eecs.umich.edu>

Get rid of Packet result field. Error responses are
now encoded in cmd field.


# 4776:8c8407243a2c 28-Jul-2007 Gabe Black <gblack@eecs.umich.edu>

Turn the instruction tracing code into pluggable sim objects.
These need to be refined a little still and given parameters.


# 4762:c94e103c83ad 24-Jul-2007 Nathan Binkert <nate@binkert.org>

Major changes to how SimObjects are created and initialized. Almost all
creation and initialization now happens in python. Parameter objects
are generated and initialized by python. The .ini file is now solely for
debugging purposes and is not used in construction of the objects in any
way.


# 4584:81a11c930f2d 18-Jun-2007 Ali Saidi <saidi@eecs.umich.edu>

fix bug in timing cpu. getTime() is the time the requset was created, not the time it was repsonded to. In timing mode the
time it was responded to is curTick. Doesn't change the results, but it does make implementation of nextCycle() more difficult


# 4471:4d86c4d096ad 21-May-2007 Steve Reinhardt <stever@eecs.umich.edu>

Add new EventWrapper constructor that takes a Tick value
and schedules the event immediately.


# 4433:4722c6787f69 07-May-2007 Ali Saidi <saidi@eecs.umich.edu>

the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space
fix the timing cpu to handle receiving a nacked packet

src/cpu/simple/timing.cc:
make the timing cpu handle receiving a nacked packet
src/mem/bridge.cc:
src/mem/bridge.hh:
the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space


# 4224:7e828583f2cb 11-Mar-2007 Gabe Black <gblack@eecs.umich.edu>

Make sttw and sttwa use the twin memory operations.


# 4200:f55b59fc848b 10-Mar-2007 Ali Saidi <saidi@eecs.umich.edu>

I thought this code got deleted, but since it hasn't I've moved it to a place where it doesn't access freed memory.


# 4192:7accc6365bb9 09-Mar-2007 Kevin Lim <ktlim@umich.edu>

Two fixes:
1. Make sure connectMemPorts() only gets called when the CPU's peer gets changed. This is done by making setPeer() virtual, and overriding it in the CPU's ports. When it gets called on a CPU's port (dcache specifically), it calls the normal setPeer() function, and also connectMemPorts().
2. Consolidate redundant code that handles switching in a CPU.

src/cpu/base.cc:
Move common code of switching over peers to base CPU.
src/cpu/base.hh:
Move common code of switching over peers to BaseCPU.
src/cpu/o3/cpu.cc:
Add in function that updates thread context's ports.
Also use updated function to takeOverFrom() in BaseCPU. This gets rid of some repeated code.
src/cpu/o3/cpu.hh:
Include function to update thread context's memory ports.
src/cpu/o3/lsq.hh:
Add function to dcache port that will update the memory ports upon getting a new peer.
Also include a function that will tell the CPU to update those memory ports.
src/cpu/o3/lsq_impl.hh:
Add function that will update the memory ports upon getting a new peer.
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Add function that will update thread context's memory ports upon getting a new peer.
Also use the new BaseCPU's take over from function.
src/cpu/simple/atomic.hh:
Add in function (and dcache port) that will allow the dcache to update memory ports when it gets assigned a new peer.
src/cpu/simple/timing.hh:
Add function that will update thread context's memory ports upon getting a new peer.
src/mem/port.hh:
Make setPeer virtual so that other classes can override it.


# 4115:cc1d6df13c7d 02-Mar-2007 Ali Saidi <saidi@eecs.umich.edu>

make ldtw(a) -- Twin 32 bit load work correctly -- by doing it the same way as the twin 64 bit loads

src/arch/isa_parser.py:
src/arch/sparc/isa/decoder.isa:
src/arch/sparc/isa/operands.isa:
src/base/bigint.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/mem/packet_access.hh:
make ldtw(a) Twin 32 bit load work correctly


# 4040:eb894f3fc168 12-Feb-2007 Ali Saidi <saidi@eecs.umich.edu>

rename store conditional stuff as extra data so it can be used for conditional swaps as well
Add support for a twin 64 bit int load
Add Memory barrier and write barrier flags as appropriate
Make atomic memory ops atomic

src/arch/alpha/isa/mem.isa:
src/arch/alpha/locked_mem.hh:
src/cpu/base_dyn_inst.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_impl.hh:
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/arch/alpha/types.hh:
src/arch/mips/types.hh:
src/arch/sparc/types.hh:
add a largest read data type for statically allocating read buffers in atomic simple cpu
src/arch/isa_parser.py:
Add support for a twin 64 bit int load
src/arch/sparc/isa/decoder.isa:
Make atomic memory ops atomic
Add Memory barrier and write barrier flags as appropriate
src/arch/sparc/isa/formats/mem/basicmem.isa:
add post access code block and define a twinload format for twin loads
src/arch/sparc/isa/formats/mem/blockmem.isa:
remove old microcoded twin load coad
src/arch/sparc/isa/formats/mem/mem.isa:
swap.isa replaces the code in loadstore.isa
src/arch/sparc/isa/formats/mem/util.isa:
add a post access code block
src/arch/sparc/isa/includes.isa:
need bigint.hh for Twin64_t
src/arch/sparc/isa/operands.isa:
add a twin 64 int type
src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
add support for twinloads
add support for swap and conditional swap instructions
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/mem/packet.cc:
src/mem/packet.hh:
Add support for atomic swap memory commands
src/mem/packet_access.hh:
Add endian conversion function for Twin64_t type
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
Add support for atomic swap memory commands
Rename sc code to extradata


# 4022:c422464ca16e 07-Feb-2007 Steve Reinhardt <stever@eecs.umich.edu>

Make memory commands dense again to avoid cache stat table explosion.
Created MemCmd class to wrap enum and provide handy methods to
check attributes, convert to string/int, etc.


# 3686:fa8d8b90cd8a 29-Nov-2006 Kevin Lim <ktlim@umich.edu>

Change the connecting of the physPort and virtPort to the memory object below the CPU to happen every time activateContext is called. The overhead is probably a little higher than necessary, but allows these connections to properly be made when there are CPUs that are inactive until they are switched in.

Right now this introduces a minor memory leak as old physPorts and virtPorts are not deleted when new ones are created. A flyspray task has been created for this issue. It can not be resolved until we determine how the bus will handle giving out ID's to functional ports that may be deleted.

src/cpu/o3/cpu.cc:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Change the setup of the physPort and virtPort to instead happen every time the CPU has a context activated. This is a little high overhead, but keeps it working correctly when the CPU does not have a physical memory attached to it until it switches in (like the case of switch CPUs).
src/cpu/o3/thread_context.hh:
Change function from being called at init() to just being called whenever the memory ports need to be connected.
src/cpu/o3/thread_context_impl.hh:
Update this to not delete the port if it's the same as the virtPort.
src/cpu/thread_context.hh:
Change function from being called at init() to whenever the memory ports need to be connected.
src/cpu/thread_state.cc:
Instead of initializing the ports, simply connect them, deleting any old ports that might exist. This allows these functions to be called multiple times.
src/cpu/thread_state.hh:
Ports are no longer initialized, but rather connected at context activation time.


# 3673:34386ba8cb41 17-Nov-2006 Ron Dreslinski <rdreslin@umich.edu>

Make an initialization pass for the thread context and set the [phys,virt]Port correctly

src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Call the thread context initialization


# 3667:1d57100f8bf0 14-Nov-2006 Ron Dreslinski <rdreslin@umich.edu>

Merge zizzer:/bk/newmem
into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest


# 3661:efc80a01aeb6 14-Nov-2006 Ron Dreslinski <rdreslin@umich.edu>

Make cpu's capable of having a phase shift


# 3658:f0a7030c6bd9 14-Nov-2006 Kevin Lim <ktlim@umich.edu>

Various fixes to delete packet and request a little better.

src/cpu/simple/timing.cc:
Various updates for deleting requests more properly.

The major change is moving the deletion of the fetch request/packet to after the instruction has executed and completed. This should fix a few bugs because Ron's memory system didn't expect a call for a functional access while a timing access was being processed.


# 3647:8121d4503cbc 13-Nov-2006 Ron Dreslinski <rdreslin@umich.edu>

Make CPU models signal to update the snoop ranges


# 3617:384e3b1eae06 11-Nov-2006 Nathan Binkert <binkertn@umich.edu>

Get rid of the ParamContext for pseudo instructions and move
the parameters to the BaseCPU object.


# 3495:884bf1f0c0c9 06-Nov-2006 Kevin Lim <ktlim@umich.edu>

Clean up clock phase drift code a bit.

src/cpu/base.cc:
Move clock phase drift code to the base CPU so that any CPU model can use it.
src/cpu/base.hh:
Added two functions to help get the next cycle the CPU should be scheduled.
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Use the function now in BaseCPU.


# 3479:4fbcaa81d105 01-Nov-2006 Gabe Black <gblack@eecs.umich.edu>

Merge zizzer.eecs.umich.edu:/bk/newmem/
into zeep.eecs.umich.edu:/home/gblack/m5/newmemmemops


# 3453:c3ce58882751 31-Oct-2006 Gabe Black <gblack@eecs.umich.edu>

Put the Alpha tlb stuff into the AlphaISA namespace, and give the classes more neutral names.


# 3402:db60546818d0 31-Oct-2006 Kevin Lim <ktlim@umich.edu>

Remove mem parameter. Now the translating port asks the CPU's dcache's peer for its MemObject instead of having to have a paramter for the MemObject.

configs/example/fs.py:
configs/example/se.py:
src/cpu/simple/base.cc:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
src/cpu/simple_thread.cc:
src/cpu/simple_thread.hh:
src/cpu/thread_state.cc:
src/cpu/thread_state.hh:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing-mp.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
No need for mem parameter any more.
src/cpu/checker/cpu.cc:
Use new constructor for simple thread (no more MemObject parameter).
src/cpu/checker/cpu.hh:
Remove MemObject parameter.
src/cpu/memtest/memtest.hh:
Ports now take in their MemObject owner.
src/cpu/o3/alpha/cpu_builder.cc:
Remove mem parameter.
src/cpu/o3/alpha/cpu_impl.hh:
Remove memory parameter and clean up handling of TranslatingPort.
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/fetch.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/mips/cpu_builder.cc:
src/cpu/o3/mips/cpu_impl.hh:
src/cpu/o3/params.hh:
src/cpu/o3/thread_state.hh:
src/cpu/ozone/cpu.hh:
src/cpu/ozone/cpu_builder.cc:
src/cpu/ozone/cpu_impl.hh:
src/cpu/ozone/front_end.hh:
src/cpu/ozone/front_end_impl.hh:
src/cpu/ozone/lw_lsq.hh:
src/cpu/ozone/lw_lsq_impl.hh:
src/cpu/ozone/simple_params.hh:
src/cpu/ozone/thread_state.hh:
src/cpu/simple/atomic.cc:
Remove memory parameter.


# 3387:8f146ac8248f 23-Oct-2006 Gabe Black <gblack@eecs.umich.edu>

Don't let interupts interupt microcode at undesired points.


# 3349:fec4a86fa212 20-Oct-2006 Nathan Binkert <binkertn@umich.edu>

Use PacketPtr everywhere


# 3348:11f6ef023158 20-Oct-2006 Nathan Binkert <binkertn@umich.edu>

refactor code for the packet, get rid of packet_impl.hh
and call it packet_access.hh and fix the #includes so
things compile right.


# 3310:21adbb41a37e 17-Oct-2006 Ron Dreslinski <rdreslin@umich.edu>

Fixes for uni-coherence in timing mode for FS.
Still a bug in atomic uni-coherence in FS.

src/cpu/o3/fetch_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
Make CPU models handle coherence requests
src/mem/cache/base_cache.cc:
Properly signal coherence CSHRs
src/mem/cache/coherence/uni_coherence.cc:
Only deallocate once


# 3297:f0855ab36ff5 12-Oct-2006 Lisa Hsu <hsul@eecs.umich.edu>

Merge zizzer:/bk/newmem
into zed.eecs.umich.edu:/z/hsul/work/m5/newmem

src/cpu/simple/timing.cc:
hand merge


# 3230:e86a03911728 09-Oct-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/o3-merge/newmem

src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
src/cpu/simple/timing.hh:
tests/configs/o3-timing-mp.py:
Hand merge.


# 3227:fe19356d6f88 09-Oct-2006 Kevin Lim <ktlim@umich.edu>

Fix caches plus sampling switch over.

src/cpu/o3/cpu.cc:
Fix up caches plus sampling switch over.


# 3222:19bd4dd3be83 08-Oct-2006 Kevin Lim <ktlim@umich.edu>

Record numCycles properly.

src/cpu/simple/timing.cc:
Record numCycles stat properly.
src/cpu/simple/timing.hh:
Extra variable to help record numCycles stat.


# 3201:7c3b18c01b0e 11-Oct-2006 Lisa Hsu <hsul@eecs.umich.edu>

some drain changes in timing (kevin's) and some memory mode assertion changes so that when you come out of resume, you only assert if you're really wrong.

src/cpu/simple/atomic.cc:
memory mode assertion change so that it only goes off if it's supposed to.
src/cpu/simple/timing.cc:
some drain changes (kevin's) and some changes to memoryMode assertions so that they don't go off when they're not supposed to.


# 3184:8edaf4539e05 08-Oct-2006 Ron Dreslinski <rdreslin@umich.edu>

Fixes for functional path.

If the cpu needs to update any state when it gets a functional write (LSQ??)
then that code needs to be written.

src/cpu/o3/fetch_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/ozone/front_end_impl.hh:
src/cpu/ozone/lw_lsq_impl.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
CPU's can recieve functional accesses, they need to determine if they need to do anything with them.
src/mem/bus.cc:
src/mem/bus.hh:
Make the fuctional path do the correct tye of snoop


# 3172:2c84db071850 08-Oct-2006 Steve Reinhardt <stever@eecs.umich.edu>

Replace tests of LOCKED/UNCACHEABLE flags with isLocked()/isUncacheable().


# 3170:37fd1e73f836 08-Oct-2006 Steve Reinhardt <stever@eecs.umich.edu>

Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
Note that properly setting cpu_id on all CPUs is now required
for correct operation.

src/arch/SConscript:
src/base/traceflags.py:
src/cpu/base.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
src/python/m5/objects/BaseCPU.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.


# 3169:65bef767b5de 08-Oct-2006 Steve Reinhardt <stever@eecs.umich.edu>

Rename some vars for clarity.


# 3119:6c93a7460ecf 02-Oct-2006 Kevin Lim <ktlim@umich.edu>

Be sure to set progress interval.


# 2948:ae26cf37957c 20-Jul-2006 Ali Saidi <saidi@eecs.umich.edu>

Enforce the timing cpu ticking at it's clock rate
Add a max time option in seconds and a single system root clock be 1THz

configs/test/fs.py:
Add a max time option in seconds and a single system root clock be 1THz
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Enforce the timing cpu ticking at it's clock rate


# 2923:db8a876258df 14-Jul-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-merge

configs/test/fs.py:
configs/test/test.py:
SCCS merged


# 2915:1f4d02556ac1 12-Jul-2006 Kevin Lim <ktlim@umich.edu>

Updates for serialization. As long as the tickEvent doesn't need to be serialized (I don't believe it does because we drain all CPUs prior to checkpointing), it should be feasible to start up from other CPU's checkpoints.

src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/base.cc:
src/cpu/simple/timing.cc:
src/cpu/simple_thread.cc:
Updates for serialization.


# 2901:f9a45473ab55 12-Jul-2006 Ali Saidi <saidi@eecs.umich.edu>

memory mode information now contained in system object
States are now running, draining, or drained. memory state information moved into system object
system parameter is not fs only for cpus
Implement drain() support in devices
Update for drain() call that returns number of times drain_event->process() will be called

Break O3 CPU! No sense in putting in a hack change that kevin is going to remove in a few minutes i imagine

src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
Since se mode has a system, allow access to it
Verify that the atomic cpu is connected to an atomic system on resume
src/cpu/simple/base.cc:
Since se mode has a system, allow access to it
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Update for new drain() call that returns number of times drain_event->process() will be called and memory state being moved into the system
Since se mode has a system, allow access to it
Verify that the timing cpu is connected to an timing system on resume
src/dev/ide_disk.cc:
src/dev/io_device.cc:
src/dev/io_device.hh:
src/dev/ns_gige.cc:
src/dev/ns_gige.hh:
src/dev/pcidev.cc:
src/dev/pcidev.hh:
src/dev/sinic.cc:
src/dev/sinic.hh:
Implement drain() support in devices
src/python/m5/config.py:
Allow drain to return number of times drain_event->process() will be called. Normally 0 or 1 but things like O3 cpu or devices with multiple ports may want to call it many times
src/python/m5/objects/BaseCPU.py:
move system parameter out of fs to everyone
src/sim/sim_object.cc:
src/sim/sim_object.hh:
States are now running, draining, or drained. memory state information moved into system object
src/sim/system.cc:
src/sim/system.hh:
memory mode information now contained in system object


# 2869:4dbf4770df29 07-Jul-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-merge


# 2867:cc92d58a3210 07-Jul-2006 Kevin Lim <ktlim@umich.edu>

Switch out fixes for CPUs.

src/cpu/o3/cpu.cc:
Fix up keeping proper state when switched out and drained.
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Keep track of the event we use to schedule fetch initially and upon resume. We may have to cancel the event if the CPU is switched out.


# 2866:9b2e1d16d0aa 06-Jul-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-merge


# 2860:843426871cbc 06-Jul-2006 Kevin Lim <ktlim@umich.edu>

Fixes for draining.

src/cpu/simple/timing.cc:
Update for changed return values.
src/python/m5/__init__.py:
Loop in order to make sure all objects are really drained. Objects may become undrained as other objects become drained (e.g. a bus-bridge has a packet, while a bus is empty, and the first drain() will cause the bus-bridge to give the packet to the bus).

The only case we know every object is actually drained is if they all return immediately that they are drained.


# 2857:5f3e107e8f13 07-Jul-2006 Ron Dreslinski <rdreslin@umich.edu>

Remove hack now that ports work properly


# 2856:89691405ec9c 07-Jul-2006 Ron Dreslinski <rdreslin@umich.edu>

Update cpus to use the getPort function to use a connector object to connect the I/D cache ports to memory

configs/test/test.py:
Update to use new cpu getPort functionality
src/cpu/base.cc:
Make cpu's a memObject to expose getPort interface
src/cpu/base.hh:
Make cpu's a memObject to export getPort interface
src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Now use the connector via getPort interface
src/mem/cache/base_cache.cc:
Make sure the cache recognizes all port names


# 2855:5ca2cdb32521 06-Jul-2006 Ron Dreslinski <rdreslin@umich.edu>

Timing cache works for hello world test.
Still need
1) detailed CPU (blocking ability in cache)
1a) Multiple outstanding requests (need to keep track of times for events)
2)Multi-level support
3)MP coherece support
4)LL/SC support
5)Functional path needs to be correctly implemented (temporarily works without multiple outstanding requests (simple cpu))

src/cpu/simple/timing.cc:
Temp hack because timing cpu doesn't export ports properly so single I/D cache communicates only through the Icache port.
src/mem/cache/base_cache.cc:
Handle marking MSHR's in service
Add support for getting CSHR's
src/mem/cache/base_cache.hh:
Make these functions visible at the base cache level
src/mem/cache/cache.hh:
make the functions virtual
src/mem/cache/cache_impl.hh:
Rename the function to make sense
src/mem/packet.hh:
Accidentally clearing the needsResponse field when sending a response back.


# 2839:d5dd8a3cdea0 05-Jul-2006 Kevin Lim <ktlim@umich.edu>

Rename quiesce to drain to avoid confusion with the pseudo instruction.

src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/python/m5/__init__.py:
src/python/m5/config.py:
src/sim/main.cc:
src/sim/sim_events.cc:
src/sim/sim_events.hh:
src/sim/sim_object.cc:
src/sim/sim_object.hh:
Rename quiesce to drain.


# 2836:c8f549058964 05-Jul-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/newmem-merge

src/base/traceflags.py:
src/cpu/SConscript:
Hand merge.
src/cpu/o3/alpha/params.hh:
Hand merge. This needs to get changed.


# 2835:d2a977df88de 05-Jul-2006 Ron Dreslinski <rdreslin@umich.edu>

Fix some unset values in the request in the timing CPU.
Properly implement the MSHR allocate function.

src/cpu/simple/timing.cc:
Set the thread context in the CPU.

Need to do this properly, currently I just set it to Cpu=0 Thread=0. This will just cause all the stats in the cache based on these to just yield totals and not a distribution.
src/mem/cache/miss/mshr.cc:
Properly implement the allocate function for the MSHR.


# 2823:ff50d1693ee5 05-Jul-2006 Kevin Lim <ktlim@umich.edu>

Need to change state upon quiescing.


# 2798:751e9170247e 29-Jun-2006 Kevin Lim <ktlim@umich.edu>

Various fixes for the CPU models to support the features that have been moved to python.

src/cpu/base.cc:
src/cpu/base.hh:
src/cpu/simple/atomic.hh:
Switching out no longer takes a sampler.
src/cpu/simple/atomic.cc:
Fix up switching out. Also fix up serialization; the nameOut() was messing up the ordering.
src/cpu/simple/timing.cc:
Add in quiesce, fix up serialization.
src/cpu/simple/timing.hh:
Add in queisce, fix up serialization.


# 2683:d6b72bb2ed97 07-Jun-2006 Kevin Lim <ktlim@umich.edu>

Reorganization/renaming of CPUExecContext. Now it is called SimpleThread in order to clear up the confusion due to the many ExecContexts. It also derives from a common ThreadState object, which holds various state common to threads across CPU models.

Following with the previous check-in, ExecContext now refers only to the interface provided to the ISA in order to access CPU state. ThreadContext refers to the interface provided to all objects outside the CPU in order to access thread state. SimpleThread provides all thread state and the interface to access it, and is suitable for simple execution models such as the SimpleCPU.

src/SConscript:
Include thread state file.
src/arch/alpha/ev5.cc:
src/cpu/checker/cpu.cc:
src/cpu/checker/cpu.hh:
src/cpu/checker/thread_context.hh:
src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
src/cpu/o3/cpu.cc:
src/cpu/ozone/cpu_impl.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
Rename CPUExecContext to SimpleThread.
src/cpu/base_dyn_inst.hh:
Make thread member variables protected..
src/cpu/o3/alpha_cpu.hh:
src/cpu/o3/cpu.hh:
Make various members of ThreadState protected.
src/cpu/o3/alpha_cpu_impl.hh:
Push generation of TranslatingPort into the CPU itself.
Make various members of ThreadState protected.
src/cpu/o3/thread_state.hh:
Pull a lot of common code into the base ThreadState class.
src/cpu/ozone/thread_state.hh:
Rename CPUExecContext to SimpleThread, move a lot of common code into base ThreadState class.
src/cpu/thread_state.hh:
Push a lot of common code into base ThreadState class. This goes along with renaming CPUExecContext to SimpleThread, and making it derive from ThreadState.
src/cpu/simple_thread.cc:
Rename CPUExecContext to SimpleThread, make it derive from ThreadState. This helps push a lot of common code/state into a single class that can be used by all CPUs.
src/cpu/simple_thread.hh:
Rename CPUExecContext to SimpleThread, make it derive from ThreadState.
src/kern/system_events.cc:
Rename cpu_exec_context to thread_context.
src/sim/process.hh:
Remove unused forward declaration.


# 2680:246e7104f744 06-Jun-2006 Kevin Lim <ktlim@umich.edu>

Change ExecContext to ThreadContext. This is being renamed to differentiate between the interface used objects outside of the CPU, and the interface used by the ISA. ThreadContext is used by objects outside of the CPU and is specifically defined in thread_context.hh. ExecContext is more implicit, and is defined by files such as base_dyn_inst.hh or cpu/simple/base.hh.

Further renames/reorganization will be coming shortly; what is currently CPUExecContext (the old ExecContext from m5) will be renamed to SimpleThread or something similar.

src/arch/alpha/arguments.cc:
src/arch/alpha/arguments.hh:
src/arch/alpha/ev5.cc:
src/arch/alpha/faults.cc:
src/arch/alpha/faults.hh:
src/arch/alpha/freebsd/system.cc:
src/arch/alpha/freebsd/system.hh:
src/arch/alpha/isa/branch.isa:
src/arch/alpha/isa/decoder.isa:
src/arch/alpha/isa/main.isa:
src/arch/alpha/linux/process.cc:
src/arch/alpha/linux/system.cc:
src/arch/alpha/linux/system.hh:
src/arch/alpha/linux/threadinfo.hh:
src/arch/alpha/process.cc:
src/arch/alpha/regfile.hh:
src/arch/alpha/stacktrace.cc:
src/arch/alpha/stacktrace.hh:
src/arch/alpha/tlb.cc:
src/arch/alpha/tlb.hh:
src/arch/alpha/tru64/process.cc:
src/arch/alpha/tru64/system.cc:
src/arch/alpha/tru64/system.hh:
src/arch/alpha/utility.hh:
src/arch/alpha/vtophys.cc:
src/arch/alpha/vtophys.hh:
src/arch/mips/faults.cc:
src/arch/mips/faults.hh:
src/arch/mips/isa_traits.cc:
src/arch/mips/isa_traits.hh:
src/arch/mips/linux/process.cc:
src/arch/mips/process.cc:
src/arch/mips/regfile/float_regfile.hh:
src/arch/mips/regfile/int_regfile.hh:
src/arch/mips/regfile/misc_regfile.hh:
src/arch/mips/regfile/regfile.hh:
src/arch/mips/stacktrace.hh:
src/arch/sparc/faults.cc:
src/arch/sparc/faults.hh:
src/arch/sparc/isa_traits.hh:
src/arch/sparc/linux/process.cc:
src/arch/sparc/linux/process.hh:
src/arch/sparc/process.cc:
src/arch/sparc/regfile.hh:
src/arch/sparc/solaris/process.cc:
src/arch/sparc/stacktrace.hh:
src/arch/sparc/ua2005.cc:
src/arch/sparc/utility.hh:
src/arch/sparc/vtophys.cc:
src/arch/sparc/vtophys.hh:
src/base/remote_gdb.cc:
src/base/remote_gdb.hh:
src/cpu/base.cc:
src/cpu/base.hh:
src/cpu/base_dyn_inst.hh:
src/cpu/checker/cpu.cc:
src/cpu/checker/cpu.hh:
src/cpu/checker/exec_context.hh:
src/cpu/cpu_exec_context.cc:
src/cpu/cpu_exec_context.hh:
src/cpu/cpuevent.cc:
src/cpu/cpuevent.hh:
src/cpu/exetrace.hh:
src/cpu/intr_control.cc:
src/cpu/memtest/memtest.hh:
src/cpu/o3/alpha_cpu.hh:
src/cpu/o3/alpha_cpu_impl.hh:
src/cpu/o3/alpha_dyn_inst_impl.hh:
src/cpu/o3/commit.hh:
src/cpu/o3/commit_impl.hh:
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/regfile.hh:
src/cpu/o3/thread_state.hh:
src/cpu/ozone/back_end.hh:
src/cpu/ozone/cpu.hh:
src/cpu/ozone/cpu_impl.hh:
src/cpu/ozone/front_end.hh:
src/cpu/ozone/front_end_impl.hh:
src/cpu/ozone/inorder_back_end.hh:
src/cpu/ozone/lw_back_end.hh:
src/cpu/ozone/lw_back_end_impl.hh:
src/cpu/ozone/lw_lsq.hh:
src/cpu/ozone/lw_lsq_impl.hh:
src/cpu/ozone/thread_state.hh:
src/cpu/pc_event.cc:
src/cpu/pc_event.hh:
src/cpu/profile.cc:
src/cpu/profile.hh:
src/cpu/quiesce_event.cc:
src/cpu/quiesce_event.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
src/cpu/static_inst.cc:
src/cpu/static_inst.hh:
src/cpu/thread_state.hh:
src/dev/alpha_console.cc:
src/dev/ns_gige.cc:
src/dev/sinic.cc:
src/dev/tsunami_cchip.cc:
src/kern/kernel_stats.cc:
src/kern/kernel_stats.hh:
src/kern/linux/events.cc:
src/kern/linux/events.hh:
src/kern/system_events.cc:
src/kern/system_events.hh:
src/kern/tru64/dump_mbuf.cc:
src/kern/tru64/tru64.hh:
src/kern/tru64/tru64_events.cc:
src/kern/tru64/tru64_events.hh:
src/mem/vport.cc:
src/mem/vport.hh:
src/sim/faults.cc:
src/sim/faults.hh:
src/sim/process.cc:
src/sim/process.hh:
src/sim/pseudo_inst.cc:
src/sim/pseudo_inst.hh:
src/sim/syscall_emul.cc:
src/sim/syscall_emul.hh:
src/sim/system.cc:
src/cpu/thread_context.hh:
src/sim/system.hh:
src/sim/vptr.hh:
Change ExecContext to ThreadContext.


# 2665:a124942bacb8 31-May-2006 Ali Saidi <saidi@eecs.umich.edu>

Updated Authors from bk prs info


# 2663:c82193ae8467 31-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Streamline interface to Request object.

src/SConscript:
mem/request.cc no longer needed (all functions inline).
src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
src/cpu/simple/timing.cc:
src/dev/io_device.cc:
src/mem/port.cc:
Modified Request object interface.
src/mem/packet.hh:
Modified Request object interface.
Address & size are always set together now, so track
with single flag.
src/mem/request.hh:
Streamline interface to support a handful of calls that set
multiple fields reflecting common usage patterns.
Reduce number of validFoo booleans by combining flags for fields
which must be set together.


# 2662:f24ae2d09e27 30-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Minor further cleanup & commenting of Packet class.

src/cpu/simple/atomic.cc:
Make common ifetch setup based on Request rather than Packet.
Packet::reset() no longer a separate function.
sendAtomic() returns latency, not absolute tick.
src/cpu/simple/atomic.hh:
sendAtomic returns latency, not absolute tick.
src/cpu/simple/base.cc:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
Make common ifetch setup based on Request rather than Packet.
src/dev/alpha_console.cc:
src/dev/ide_ctrl.cc:
src/dev/io_device.cc:
src/dev/isa_fake.cc:
src/dev/ns_gige.cc:
src/dev/pciconfigall.cc:
src/dev/sinic.cc:
src/dev/tsunami_cchip.cc:
src/dev/tsunami_io.cc:
src/dev/tsunami_pchip.cc:
src/dev/uart8250.cc:
src/mem/physical.cc:
Get rid of redundant Packet time field.
src/mem/packet.cc:
Eliminate reset() method.
src/mem/packet.hh:
Fold reset() function into reinitFromRequest()... it was
only ever called together with that function.
Get rid of redundant time field.
Cleanup/add comments.
src/mem/port.hh:
Document in comment that sendAtomic returns latency, not absolute tick.


# 2657:b119b774656b 30-May-2006 Ali Saidi <saidi@eecs.umich.edu>

Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since
it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Port Blocked/Unblocked and replaced with sendRetry().
Remove possibility of packet mangling if packet is going to be refused anyway in bridge

src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
src/dev/io_device.cc:
src/dev/io_device.hh:
Make DMA Timing requests/responses work.
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
src/mem/bridge.cc:
src/mem/bridge.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Port Blocked/Unblocked and replaced with sendRetry().
Remove posibility of packet mangling if packet is going to be refused anyway.
src/mem/bus.cc:
src/mem/bus.hh:
Add a very poor implementation of dealing with retries on timing requests. It is especially slow with tracing on since
it ends up being O(N^2). But it's probably going to have to change for the real bus anyway, so it should be rewritten then
src/mem/port.hh:
Change recvRetry() to not accept a packet. Sendtiming should be called again (and can respond with false or true)
Removed Blocked/Unblocked port status, their functionality is really duplicated in the recvRetry() method


# 2644:8a45565c2c04 26-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Fixes for TimingSimpleCPU under full system. Now boots Alpha Linux!

src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
Move traceData->finalize() into postExecute().
src/cpu/simple/timing.cc:
Fixes for full system. Now boots Alpha Linux!
- Handle ifetch faults, suspend/resume.
- Delete memory request & packet objects on response.
- Don't try to do split memory accesses on prefetch references
(ISA description doesn't support this).
src/cpu/simple/timing.hh:
Minor reorganization of internal methods.


# 2641:6d9d837e2032 26-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

Significant rework of Packet class interface:
- new constructor guarantees initialization of most fields
- flags track status of non-guaranteed fields (addr, size, src)
- accessor functions (getAddr() etc.) check status on access
- Command & Result classes are nested in Packet class scope
- Command now built from vector of behavior bits
- string version of Command for tracing
- reinitFromRequest() and makeTimingResponse() encapsulate
common manipulations of existing packets

src/cpu/simple/atomic.cc:
src/cpu/simple/base.cc:
src/cpu/simple/timing.cc:
src/dev/alpha_console.cc:
src/dev/ide_ctrl.cc:
src/dev/io_device.cc:
src/dev/io_device.hh:
src/dev/isa_fake.cc:
src/dev/ns_gige.cc:
src/dev/pciconfigall.cc:
src/dev/sinic.cc:
src/dev/tsunami_cchip.cc:
src/dev/tsunami_io.cc:
src/dev/tsunami_pchip.cc:
src/dev/uart8250.cc:
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/physical.cc:
src/mem/port.cc:
src/mem/port.hh:
src/mem/request.hh:
Update for new Packet interface.


# 2632:1bb2f91485ea 22-May-2006 Steve Reinhardt <stever@eecs.umich.edu>

New directory structure:
- simulator source now in 'src' subdirectory
- imported files from 'ext' repository
- support building in arbitrary places, including
outside of the source tree. See comment at top
of SConstruct file for more details.
Regression tests are temporarily disabled; that
syetem needs more extensive revisions.

SConstruct:
Update for new directory structure.
Modify to support build trees that are not subdirectories
of the source tree. See comment at top of file for
more details.
Regression tests are temporarily disabled.
src/arch/SConscript:
src/arch/isa_parser.py:
src/python/SConscript:
Update for new directory structure.