Searched hist:13 (Results 1076 - 1100 of 1864) sorted by relevance

<<41424344454647484950>>

/gem5/src/dev/x86/
H A Dcmos.hh9808:13ffc0066b76 Thu Jul 11 22:57:00 EDT 2013 Steve Reinhardt <stever@gmail.com> dev: make BasicPioDevice take size in constructor

Instead of relying on derived classes explicitly assigning
to the BasicPioDevice pioSize field, require them to pass
a size value in to the constructor.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
5629:1565c13d1483 Sat Oct 11 04:13:00 EDT 2008 Gabe Black <gblack@eecs.umich.edu> X86: Change the CMOS from a sub-device to a real SimObject
/gem5/src/cpu/
H A DBaseCPU.py12325:48e41e644187 Fri Nov 24 13:38:00 EST 2017 Andreas Sandberg <andreas.sandberg@arm.com> cpu: Don't override ISA if provided by user

The BaseCPU.createThreads() method currently overrides the BaseCPU.isa
parameter. This is sometimes undesirable. Change the behavior so that
the default value for the isa parameter is the empty list and teach
createThreads() to only override the ISA if none has been specified.

Change-Id: I2ac5535e55fc57057e294d3c6a93088b33bf7b84
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jack Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/6121
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
12277:e6455b421c4b Thu Oct 19 13:45:00 EDT 2017 Jose Marinho <jose.marinho@arm.com> cpu: Make automatic transition to OFF optional

Add the power_gating_on_idle option to control whether a core
automatically enters the power gated state. The default behaviour is
to transition to clock gated when idle, but not to power gated. When
this option is set to true, the core automatically transitions to the
power gated state after a configurable latency.

Change-Id: Ida98c7fc532de4140d0e511c25613769b47b3702
Reviewed-on: https://gem5-review.googlesource.com/5741
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.
9849:603e2ed487f3 Wed Sep 04 13:22:00 EDT 2013 Andreas Hansson <andreas.hansson@arm.com> cpu: Move the branch predictor out of the BaseCPU

The branch predictor is guarded by having either the in-order or
out-of-order CPU as one of the available CPU models and therefore
should not be used in the BaseCPU. This patch moves the parameter to
the relevant CPU classes.
9650:d79319eb68d5 Mon Apr 22 13:20:00 EDT 2013 Timothy M. Jones <timothy.jones@arm.com> cpu: Let python scripts obtain the number of instructions executed
9647:5b6b315472e7 Mon Apr 22 13:20:00 EDT 2013 Dam Sunwoo <dam.sunwoo@arm.com> cpu: generate SimPoint basic block vector profiles

This patch is based on http://reviews.m5sim.org/r/1474/ originally written by
Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout
folder) based on start and end addresses of basic blocks.

Some comments to the original patch are addressed and hooks are added to create
and resume from checkpoints based on instruction counts dictated by external
SimPoint analysis tools.

SimPoint creation/resuming options will be implemented as a separate patch.
9446:644f2a2c9bfc Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Flush TLBs on switchOut()

This changeset inserts a TLB flush in BaseCPU::switchOut to prevent
stale translations when doing repeated switching. Additionally, the
TLB flushing functionality is exported to the Python to make debugging
of switching/checkpointing easier.

A simulation script will typically use the TLB flushing functionality
to generate a reference trace. The following sequence can be used to
simulate a handover (this depends on how drain is implemented, but is
generally the case) between identically configured CPU models:

m5.drain(test_sys)
[ cpu.flushTLBs() for cpu in test_sys.cpu ]
m5.resume(test_sys)

The generated trace should normally be identical to a trace generated
when switching between identically configured CPU models or
checkpointing and resuming.
9433:34971d2e0019 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rename defer_registration->switched_out

The defer_registration parameter is used to prevent a CPU from
initializing at startup, leaving it in the "switched out" mode. The
name of this parameter (and the help string) is confusing. This patch
renames it to switched_out, which should be more descriptive.
9430:a113f27b68bd Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Introduce sanity checks when switching between CPUs

This patch introduces the following sanity checks when switching
between CPUs:

* Check that the set of new and old CPUs do not overlap. Having an
overlap between the set of new CPUs and the set of old CPUs is
currently not supported. Doing such a switch used to result in the
following assertion error:
BaseCPU::takeOverFrom(BaseCPU*): \
Assertion `!new_itb_port->isConnected()' failed.

* Check that all new CPUs are in the switched out state.

* Check that all old CPUs are in the switched in state.
9384:877293183bdf Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Make the ISA class inherit from SimObject

The ISA class on stores the contents of ID registers on many
architectures. In order to make reset values of such registers
configurable, we make the class inherit from SimObject, which allows
us to use the normal generated parameter headers.

This patch introduces a Python helper method, BaseCPU.createThreads(),
which creates a set of ISAs for each of the threads in an SMT
system. Although it is currently only needed when creating
multi-threaded CPUs, it should always be called before instantiating
the system as this is an obvious place to configure ID registers
identifying a thread/CPU.
H A Dpc_event.cc9649:c717bd5e0a1d Mon Apr 22 13:20:00 EDT 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Enable support for triggering a sim panic on kernel panics

Add the options 'panic_on_panic' and 'panic_on_oops' to the
LinuxArmSystem SimObject. When these option are enabled, the simulator
panics when the guest kernel panics or oopses. Enable panic on panic
and panic on oops in ARM-based test cases.
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8231:51cf7f3cf9ac Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> debug: create a Debug namespace
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
/gem5/src/arch/sparc/
H A Dtlb.cc9423:43caa4ca5979 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Add support for invalidating TLBs when draining

This patch adds support for the memInvalidate() drain method. TLB
flushing is requested by calling the virtual flushAll() method on the
TLB.

Note: This patch renames invalidateAll() to flushAll() on x86 and
SPARC to make the interface consistent across all supported
architectures.
8751:a6c772fef2f1 Thu Oct 13 04:37:00 EDT 2011 Gabe Black <gblack@eecs.umich.edu> SPARC: Remove the last checks of FULL_SYSTEM.
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.

Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
7518:917208416d2a Fri Aug 13 09:10:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Tidy up endianness handling for mmapped "IPR"s.
5894:8091ac99341a Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it.
5891:73084c6bb183 Wed Feb 25 13:15:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> ISA: Replace the translate functions in the TLBs with translateAtomic.
5100:7a0180040755 Fri Sep 28 13:21:00 EDT 2007 Ali Saidi <saidi@eecs.umich.edu> Rename cycles() function to ticks()
4990:38d74405ddac Mon Aug 13 19:06:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> SPARC: Move tlb state into the tlb.
Each "strand" may need to have a private copy of this state, but I couldn't
find anywhere in the spec that said that after looking briefly.
This prevents writes to the thread context in o3 which was causing the
pipeline to be flushed and stopping any forward progress. The other ASI
accessible state will probably need to be accessed differently if/when we get
O3 full system up and running.
/gem5/src/cpu/o3/
H A Dlsq.hh13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses

This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
13590:d7e018859709 Mon Feb 13 04:41:00 EST 2017 Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> cpu-o3: O3 LSQ Generalisation

This patch does a large modification of the LSQ in the O3 model. The
main goal of the patch is to remove the 'an operation can be served with
one or two memory requests' assumption that is present in the LSQ
and the instruction with the req, reqLow, reqHigh triplet, and
generalising it to operations that can be addressed with one request,
and operations that require many requests, embodied in the
SingleDataRequest and the SplitDataRequest.

This modification has been done mimicking the minor model to an extent,
shifting the responsibilities of dealing with VtoP translation and
tracking the status and resources from the DynInst to the LSQ via the
LSQRequest. The LSQRequest models the information concerning the
operation, handles the creation of fragments for translation and request
as well as assembling/splitting the data accordingly.

With this modifications, the implementation of vector ISAs, particularly
on the memory side, become more rich, as the new model permits a
dissociation of the ISA characteristics as vector length, from the
microarchitectural characteristics that govern how contiguous loads are
executing, allowing exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.

Part of the complexities introduced stem from the fact that gem5 keeps a
large amount of metadata regarding, in particular, memory operations,
thus, when an instruction is squashed while some operation as TLB lookup
or cache access is ongoing, when the relevant structure communicates to
the LSQ that the operation is over, it tries to access some pieces of
data that should have died when the instruction is squashed, leading to
asserts, panics, or memory corruption. To ensure the correct behaviour,
the LSQRequest rely on assesing who is their owner, and self-destroying
if they detect their owner is done with the request, and there will be
no subsequent action. For example, in the case of an instruction
squashed whal the TLB is doing a walk to serve the translation, when the
translation is served by the TLB, the LSQRequest detects that the
instruction was squashed, and as the translation is done, no one else
expect to access its information, and therefore, it self-destructs.
Having destroyed the LSQRequest earlier, would lead to wrong behaviour
as the TLB walk may access some fields of it.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>

Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
10239:592f0bb6bd6f Sat Jun 21 13:26:00 EDT 2014 Binh Pham <binhpham@cs.rutgers.edu> o3: split load & store queue full cases in rename

Check for free entries in Load Queue and Store Queue separately to
avoid cases when load cannot be renamed due to full Store Queue and
vice versa.

This work was done while Binh was an intern at AMD Research.
9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode

Previously, the O3 CPU could stop in the middle of a microcode
sequence. This patch makes sure that the pipeline stops when it has
committed a normal instruction or exited from a microcode
sequence. Additionally, it makes sure that the pipeline has no
instructions in flight when it is drained, which should make draining
more robust.

Draining is controlled in the commit stage, which checks if the next
PC after a committed instruction is in microcode. If this isn't the
case, it requests a squash of all instructions after that the
instruction that just committed and immediately signals a drain stall
to the fetch stage. The CPU then continues to execute until the
pipeline and all associated buffers are empty.
9440:fdc91cab5760 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Fix O3 LSQ debug dumping constness and formatting
8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
7520:67c670459d01 Fri Aug 13 09:16:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Add readBytes and writeBytes functions to the exec contexts.
5494:85c8d296c1cb Sat Jun 28 13:19:00 EDT 2008 Steve Reinhardt <stever@gmail.com> Backed out changeset 94a7bb476fca: caused memory leak.
H A Dfetch_impl.hh10244:d2deb51a4abf Mon Jun 30 13:50:00 EDT 2014 Anthony Gutierrez <atgutier@umich.edu> cpu: implement a bi-mode branch predictor
9982:b2bfc23f932c Fri Nov 15 13:21:00 EST 2013 Anthony Gutierrez <atgutier@umich.edu> cpu: allow the fetch buffer to be smaller than a cache line

the current implementation of the fetch buffer in the o3 cpu
is only allowed to be the size of a cache line. some
architectures, e.g., ARM, have fetch buffers smaller than a cache
line, see slide 22 at:
http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf

this patch allows the fetch buffer to be set to values smaller
than a cache line.
9644:07352f119e48 Mon Apr 22 13:20:00 EDT 2013 Ali Saidi <Ali.Saidi@ARM.com> cpu: fix a switching issue with the o3 cpu.

This change fixes the switcheroo test that broke earlier this month. The code
that was checking for the pipeline being blocked wasn't checking for a pending
translation, only for a icache access.
9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode

Previously, the O3 CPU could stop in the middle of a microcode
sequence. This patch makes sure that the pipeline stops when it has
committed a normal instruction or exited from a microcode
sequence. Additionally, it makes sure that the pipeline has no
instructions in flight when it is drained, which should make draining
more robust.

Draining is controlled in the commit stage, which checks if the next
PC after a committed instruction is in microcode. If this isn't the
case, it requests a squash of all instructions after that the
instruction that just committed and immediately signals a drain stall
to the fetch stage. The CPU then continues to execute until the
pipeline and all associated buffers are empty.
9427:ddf45c1d54d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Initialize the O3 pipeline from startup()

The entire O3 pipeline used to be initialized from init(), which is
called before initState() or unserialize(). This causes the pipeline
to be initialized from an incorrect thread context. This doesn't
currently lead to correctness problems as instructions fetched from
the incorrect start PC will be squashed a few cycles after
initialization.

This patch will affect the regressions since the O3 CPU now issues its
first instruction fetch to the correct PC instead of 0x0.
9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change.
9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst.

This eliminates a use of the ExtMachInst type outside of the ISAs.
8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories

This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.
8641:4d3ecac1abec Tue Dec 13 14:49:00 EST 2011 Nathan Binkert <nate@binkert.org> gcc: fix unused variable warnings from GCC 4.6.1
H A Dcommit_impl.hh13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.
9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode

Previously, the O3 CPU could stop in the middle of a microcode
sequence. This patch makes sure that the pipeline stops when it has
committed a normal instruction or exited from a microcode
sequence. Additionally, it makes sure that the pipeline has no
instructions in flight when it is drained, which should make draining
more robust.

Draining is controlled in the commit stage, which checks if the next
PC after a committed instruction is in microcode. If this isn't the
case, it requests a squash of all instructions after that the
instruction that just committed and immediately signals a drain stall
to the fetch stage. The CPU then continues to execute until the
pipeline and all associated buffers are empty.
9437:8088e94a9de0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Fix broken squashAfter implementation in O3 CPU

Commit can currently both commit and squash in the same cycle. This
confuses other stages since the signals coming from the commit stage
can only signal either a squash or a commit in a cycle. This changeset
changes the behavior of squashAfter so that it commits all
instructions, including the instruction that requested the squash, in
the first cycle and then starts to squash in the next cycle.
9427:ddf45c1d54d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Initialize the O3 pipeline from startup()

The entire O3 pipeline used to be initialized from init(), which is
called before initState() or unserialize(). This causes the pipeline
to be initialized from an incorrect thread context. This doesn't
currently lead to correctness problems as instructions fetched from
the incorrect start PC will be squashed a few cycles after
initialization.

This patch will affect the regressions since the O3 CPU now issues its
first instruction fetch to the correct PC instead of 0x0.
9382:1c97b57d5169 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> cpu: rename the misleading inSyscall to noSquashFromTC

isSyscall was originally created because during handling of a syscall in SE
mode the threadcontext had to be updated. However, in many places this is used
in FS mode (e.g. fault handlers) and the name doesn't make much sense. The
boolean actually stops gem5 from squashing speculative and non-committed state
when a write to a threadcontext happens, so re-name the variable to something
more appropriate
8843:7d3ac6813147 Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BPred: Fix RAS to handle predicated call/return instructions.

Change RAS to fix issues with predicated call/return instructions.
Handled all cases in the life of a predicated call and return instruction.
8842:a02932e2e73d Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BP: Fix several Branch Predictor issues.
1. Updates the Branch Predictor correctly to the state
just after a mispredicted branch, if a squash occurs.
2. If a BTB does not find an entry, the branch is predicted not taken.
The global history is modified to correctly reflect this prediction.
3. Local history is now updated at the fetch stage instead of
execute stage.
4. In the Update stage of the branch predictor the local predictors are
now correctly updated according to the state of local history during
fetch stage.

This patch also improves performance by as much as 17% on some benchmarks
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8230:845c8eb5ac49 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: fix up code after sorting
/gem5/src/cpu/simple/
H A Dtiming.hh13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses

This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
12085:de78ea63e0ca Wed Jun 07 01:13:00 EDT 2017 Sean Wilson <spwilson2@wisc.edu> cpu, gpu-compute: Replace EventWrapper use with EventFunctionWrapper

Change-Id: Idd5992463bcf9154f823b82461070d1f1842cea3
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3746
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
9442:36967173340c Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Make sure that a drained timing CPU isn't executing ucode

Currently, the timing CPU can be in the middle of a microcode sequence
or multicycle (stayAtPC is true) instruction when it is drained. This
leads to two problems:

* When switching to a hardware virtualized CPU, we obviously can't
execute gem5 microcode.

* If stayAtPC is true we might execute half of an instruction twice
when restoring a checkpoint or switching CPUs, which leads to an
incorrect execution.

After applying this patch, the CPU will be on a proper instruction
boundary, which means that it is safe to switch to any CPU model
(including hardware virtualized ones). This changeset also fixes a bug
where the timing CPU sometimes switches out with while stayAtPC is
true, which corrupts the target state after a CPU switch or
checkpoint.

Note: This changeset removes the so_state variable from checkpoints
since the drain state isn't used anymore.
8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
7520:67c670459d01 Fri Aug 13 09:16:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Add readBytes and writeBytes functions to the exec contexts.
5894:8091ac99341a Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it.
5890:bdef71accd68 Wed Feb 25 13:15:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Get rid of translate... functions from various interface classes.
5169:bfd18d401251 Thu Oct 18 13:15:00 EDT 2007 Ali Saidi <saidi@eecs.umich.edu> CPU: Use the ThreadContext cpu id instead of the params cpu id in all cases.
/gem5/tests/configs/
H A Dtsunami-simple-atomic-dual.py9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations

Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.

The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.

* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.

* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.

Architecture specific implementations are provided for ARM, Alpha, and
X86.
9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus

This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.
8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes

This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
Note that properly setting cpu_id on all CPUs is now required
for correct operation.

src/arch/SConscript:
src/base/traceflags.py:
src/cpu/base.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
src/python/m5/objects/BaseCPU.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
H A Dtsunami-simple-atomic.py9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations

Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.

The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.

* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.

* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.

Architecture specific implementations are provided for ARM, Alpha, and
X86.
9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus

This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.
8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes

This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
Note that properly setting cpu_id on all CPUs is now required
for correct operation.

src/arch/SConscript:
src/base/traceflags.py:
src/cpu/base.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
src/python/m5/objects/BaseCPU.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
H A Dtsunami-simple-timing.py9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations

Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.

The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.

* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.

* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.

Architecture specific implementations are provided for ARM, Alpha, and
X86.
9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus

This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.
8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes

This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
Note that properly setting cpu_id on all CPUs is now required
for correct operation.

src/arch/SConscript:
src/base/traceflags.py:
src/cpu/base.hh:
src/cpu/simple/atomic.cc:
src/cpu/simple/timing.cc:
src/cpu/simple/timing.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
src/python/m5/objects/BaseCPU.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing)
and PhysicalMemory. *No* support for caches or O3CPU.
H A Dbase_config.py11156:a37dda0f0202 Mon Oct 05 14:13:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> tests: Update SMT tests to correctly configure CPUs

The 01.hello-2T-smt test case for the O3 CPU didn't correctly setup
the number of threads before creating interrupt controllers, which
confused the constructor in BaseCPU. This changeset adds SMT support
to the test configuration infrastructure.
9654:64b653b3d72f Mon Apr 22 13:20:00 EDT 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Add support for testing KVM-based CPUs

This changeset adds support for initializing a KVM VM in the
BaseSystem test class and adds the following methods in run.py:

require_file -- Test if a file exists and abort/skip if not.
require_kvm -- Test if KVM support has been compiled into gem5 (i.e.,
BaseKvmCPU exists) and the KVM device exists on the
host.
9447:156f74caf0d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Add CPU switching tests

This changeset adds a set of tests that stress the CPU switching
code. It adds the following test configurations:

* tsunami-switcheroo-full -- Alpha system (atomic, timing, O3)
* realview-switcheroo-atomic -- ARM system (atomic<->atomic)
* realview-switcheroo-timing -- ARM system (timing<->timing)
* realview-switcheroo-o3 -- ARM system (O3<->O3)
* realview-switcheroo-full -- ARM system (atomic, timing, O3)

Reference data is provided for the 10.linux-boot test case. All of the
tests trigger a CPU switch once per millisecond during the boot
process.

The in-order CPU model was not included in any of the tests as it does
not support CPU handover.
9408:10a84dceab25 Mon Jan 07 13:05:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> config: Do not use hardcoded physmem in fs script

This patch generalises the address range resolution for the I/O cache
and I/O bridge such that they do not assume a single memory. The patch
involves adding a parameter to the system which is then defined based
on the memories that are to be visible from the I/O subsystem, whether
behind a cache or a bridge.

The change is needed to allow interleaved memory controllers in the
system.
9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations

Most of the test cases currently contain a large amount of duplicated
boiler plate code. This changeset introduces a set of classes that
encapsulates most of the functionality when setting up a test
configuration.

The following base classes are introduced:
* BaseSystem - Basic system configuration that can be used for both
SE and FS simulation.

* BaseFSSystem - Basic FS configuration uni-processor and multi-processor
configurations.

* BaseFSSystemUniprocessor - Basic FS configuration for uni-processor
configurations. This is provided as a way
to make existing test cases backwards
compatible.

Architecture specific implementations are provided for ARM, Alpha, and
X86.
/gem5/tests/quick/fs/10.linux-boot/ref/x86/linux/pc-simple-timing/
H A Dstats.txt9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes

The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes

This patch bumps the stats to reflect the slight change in how the
retry is handled, and also the pruning of some redundant stats.
9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock

This patch updates the stats to reflect the change in the default
system clock from 1 THz to 1GHz. The changes are due to the DMA
devices now injecting requests at a lower pace.
9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM

This patch bumps the stats to match the use of SimpleDRAM instead of
SimpleMemory in all inorder and O3 regressions, and also all
full-system regressions. A number of performance-related stats change,
and a whole bunch of stats are added for the memory controller.
9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
/gem5/tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-timing/
H A Dstats.txt11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references
9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes

This patch bumps the stats to reflect the slight change in how the
retry is handled, and also the pruning of some redundant stats.
9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes.
9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock

This patch updates the stats to reflect the change in the default
system clock from 1 THz to 1GHz. The changes are due to the DMA
devices now injecting requests at a lower pace.
9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM

This patch bumps the stats to match the use of SimpleDRAM instead of
SimpleMemory in all inorder and O3 regressions, and also all
full-system regressions. A number of performance-related stats change,
and a whole bunch of stats are added for the memory controller.
/gem5/tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-timing-dual/
H A Dstats.txt11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references
9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes

This patch bumps the stats to reflect the slight change in how the
retry is handled, and also the pruning of some redundant stats.
9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes.
9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock

This patch updates the stats to reflect the change in the default
system clock from 1 THz to 1GHz. The changes are due to the DMA
devices now injecting requests at a lower pace.
9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM

This patch bumps the stats to match the use of SimpleDRAM instead of
SimpleMemory in all inorder and O3 regressions, and also all
full-system regressions. A number of performance-related stats change,
and a whole bunch of stats are added for the memory controller.
/gem5/src/arch/mips/
H A DSConscript9384:877293183bdf Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Make the ISA class inherit from SimObject

The ISA class on stores the contents of ID registers on many
architectures. In order to make reset values of such registers
configurable, we make the class inherit from SimObject, which allows
us to use the normal generated parameter headers.

This patch introduces a Python helper method, BaseCPU.createThreads(),
which creates a set of ISAs for each of the threads in an SMT
system. Although it is currently only needed when creating
multi-threaded CPUs, it should always be called before instantiating
the system as this is an obvious place to configure ID registers
identifying a thread/CPU.
9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change.
9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst.

This eliminates a use of the ExtMachInst type outside of the ISAs.
5793:321f79ddb500 Tue Jan 13 17:17:00 EST 2009 Nathan Binkert <nate@binkert.org> SCons: centralize the Dir() workaround for newer versions of scons.
Scons bug id: 2006 M5 Bug id: 308
5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode
H A Disa_traits.hh9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change.
9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst.

This eliminates a use of the ExtMachInst type outside of the ISAs.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode
4772:f08370a81812 Fri Jul 27 01:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> X86: Fix argument register indexing.
Code was assuming that all argument registers followed in order from ArgumentReg0. There is now an ArgumentReg array which is indexed to find the right index. There is a constant, NumArgumentRegs, which can be used to protect against using an invalid ArgumentReg.
H A Dutility.hh8300:eb279d6e08a2 Fri May 13 18:27:00 EDT 2011 Chander Sudanthi <chander.sudanthi@arm.com> Trace: Allow printing ASIDs and selectively tracing based on user/kernel code.

Debug flags are ExecUser, ExecKernel, and ExecAsid. ExecUser and
ExecKernel are set by default when Exec is specified. Use minus
sign with ExecUser or ExecKernel to remove user or kernel tracing
respectively.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode
4181:6edaeff44647 Tue Mar 13 12:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Replaced makeExtMI with predecode.
Removed the getOpcode function from StaticInst which only made sense for Alpha.
Started implementing the x86 predecoder.
4181:6edaeff44647 Tue Mar 13 12:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Replaced makeExtMI with predecode.
Removed the getOpcode function from StaticInst which only made sense for Alpha.
Started implementing the x86 predecoder.
/gem5/src/arch/x86/
H A Dfaults.cc8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.

Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
5909:ecbd27e5d1f8 Wed Feb 25 13:17:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Add a trace flag for tracing faults.
5895:569e3b31a868 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults.
5681:54c2d92f601e Mon Oct 13 01:42:00 EDT 2008 Gabe Black <gblack@eecs.umich.edu> X86: Make the x86 interrupt fault kick off the interrupt microcode.
H A Dpagetable_walker.hh14096:bde52fccbf0f Fri Jul 12 13:29:00 EDT 2019 Matthew Poremba <matthew.poremba@amd.com> arch-x86: Don't free PTW state with inflight requests

If a page table walk is squashed, the walker state is being deleted
in the squash code. If there are in flight requests, the deleted
walker state values may be clobbered, leading to undefined behavior.
This adds a squashed boolean to the walker state which is set if a
walk is squashed while requests are still in flight. When packets
for the in flight request return, we check if the walk was squashed
and return that the walk is complete once the number of in flight
requests reaches zero. The walker state is then freed by the PTW.

Change-Id: I57a64b1548b83a8a9e8441fc9d6f33e9842df2b3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19568
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
5897:29cecf4fe602 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Fix the timing mode of the page table walker.
5895:569e3b31a868 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults.
/gem5/src/dev/arm/
H A Dpl111.cc9648:f10eb34e3e38 Mon Apr 22 13:20:00 EDT 2013 Dam Sunwoo <dam.sunwoo@arm.com> sim: separate nextCycle() and clockEdge() in clockedObjects

Previously, nextCycle() could return the *current* cycle if the current tick was
already aligned with the clock edge. This behavior is not only confusing (not
quite what the function name implies), but also caused problems in the
drainResume() function. When exiting/re-entering the sim loop (e.g., to take
checkpoints), the CPUs will drain and resume. Due to the previous behavior of
nextCycle(), the CPU tick events were being rescheduled in the same ticks that
were already processed before draining. This caused divergence from runs that
did not exit/re-entered the sim loop. (Initially a cycle difference, but a
significant impact later on.)

This patch separates out the two behaviors (nextCycle() and clockEdge()),
uses nextCycle() in drainResume, and uses clockEdge() everywhere else.
Nothing (other than name) should change except for the drainResume timing.
9415:f5d159450dfb Mon Jan 07 13:05:00 EST 2013 Chander Sudanthi <Chander.Sudanthi@arm.com> ARM: pl111/LCD framebuffer checkpointing fix

Fixed check pointing of the framebuffer. Previously, the pixel size was not
considered in determining the size of the buffer to checkpoint. This patch
checkpoints the entire framebuffer instead of the first quarter.
9395:bf428987f54e Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Fix DMA event handling bug in the PL111 model

The PL111 model currently maintains a list of pre-allocated
DmaDoneEvents to prevent unnecessary heap allocations. This list
effectively works like a stack where the top element is the latest
scheduled event. When an event triggers, the top pointer is moved down
the stack. This obviously breaks since events usually retire from the
bottom (events don't necessarily have to retire in order), which
triggers the following assertion:

gem5.debug: build/ARM/dev/arm/pl111.cc:460: void Pl111::fillFifo(): \
Assertion `!dmaDoneEvent[dmaPendingNum-1].scheduled()' failed.

This changeset adds a vector listing the currently unused events. This
vector acts like a stack where the an element is popped off the stack
when a new event is needed an pushed on the stack when they trigger.
9394:e88cf95d33d3 Mon Jan 07 13:05:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> dev: Fix the Pl111 timings by separating pixel and DMA clock

This patch fixes the Pl111 timings by creating a separate clock for
the pixel timings. The device clock is used for all interactions with
the memory system, just like the AHB clock on the actual module.

The result without this patch is that the module only is allowed to
send one request every tick of the 24MHz clock which causes a huge
backlog.
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
/gem5/util/stats/
H A Dstats.py2343:a2b4a6ccee56 Sat Jun 10 13:08:00 EDT 2006 Nathan Binkert <binkertn@umich.edu> Remove all binning stuff
1318:57d49cdb92a5 Tue Jan 18 13:34:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> now really done with stability stats stuff
1317:0b6026d3000b Tue Jan 18 13:25:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> finished stability stats option
1307:e6b9976895c6 Wed Jan 12 13:44:00 EST 2005 Nathan Binkert <binkertn@umich.edu> More graph output junk

util/stats/stats.py:
Add the graphing output for 6GHz and 8GHz runs
1301:f85f6fb43474 Thu Jan 13 23:59:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> fix a display bug
add option to limit results to a set of ticks
fix ticks code to work

util/stats/info.py:
change samples -> ticks and pass all parameters
util/stats/stats.py:
add option to select a set of ticks and fix display bug
/gem5/src/arch/arm/isa/insts/
H A Dfp.isa13979:1e0c4607ac12 Tue Apr 30 13:24:00 EDT 2019 Ciro Santilli <ciro.santilli@arm.com> arch-arm: implement VMINNM and VMAXNM scalar version

ARMv8.2 16-bit versions have not yet been implemented, but a placeholders
were created for them.

Refactor the nearby decoding tree to closely match the ARM spec A32 decode
table.

That piece of the tree can also be called from thumb which decodes it in
the same way, although the thumb decode table has a different terminology

The old code didn't match neither A32 or T32 terminologies, so it is
better to at least match one of them to help verify correctness.

Change-Id: Iabbbca2932557cf6c98ce36690c385c3ddf39ed8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18690
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
13738:84439021dcf6 Mon Feb 18 13:06:00 EST 2019 Ciro Santilli <ciro.santilli@arm.com> arch-arm: implement floating point aarch32 VCVTA family

These instructions round floating point to integer, and were added to
aarch32 as an extension to ARMv7.

Change-Id: I62d1705badc95a4e8954a5ad62b2b6bc9e4ffe00
Reviewed-on: https://gem5-review.googlesource.com/c/16788
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
11671:520509f3e66c Thu Oct 13 14:22:00 EDT 2016 Mitch Hayenga <mitch.hayenga@arm.com> isa,arm: Add missing AArch32 FP instructions

This commit adds missing non-predicated, scalar floating point
instructions. Specifically VRINT* floating point integer rounding
instructions and VSEL* floating point conditional selects.

Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
8303:5a95f1d2494e Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Further break up condition code into NZ, C, V bits.

Break up the condition code bits into NZ, C, V registers. These are individually
written and this removes some incorrect dependencies between instructions.
8301:858384f3af1c Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Break up condition codes into normal flags, saturation, and simd.

This change splits out the condcodes from being one monolithic register
into three blocks that are updated independently. This allows CPUs
to not have to do RMW operations on the flags registers for instructions
that don't write all flags.
/gem5/src/sim/
H A Dfaults.hh11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.
8545:a3992291e230 Tue Sep 13 00:58:00 EDT 2011 Ali Saidi <saidi@eecs.umich.edu> LSQ: Only trigger a memory violation with a load/load if the value changes.

Only create a memory ordering violation when the value could have changed
between two subsequent loads, instead of just when loads go out-of-order
to the same address. While not very common in the case of Alpha, with
an architecture with a hardware table walker this can happen reasonably
frequently beacuse a translation will miss and start a table walk and
before the CPU re-schedules the faulting instruction another one will
pass it to the same address (or cache block depending on the dendency
checking).

This patch has been tested with a couple of self-checking hand crafted
programs to stress ordering between two cores.

The performance improvement on SPEC benchmarks can be substantial (2-10%).
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
7867:3ee9e6c2e8f7 Mon Jan 31 16:13:00 EST 2011 Gabe Black <gblack@eecs.umich.edu> Fault: Move the definition of NoFault from faults.hh to fault.hh.

Moving the definition of NoFault into fault.hh doesn't bring any new
dependencies with it, and allows some files to include just fault.hh which has
less baggage. NoFault will still be available to everything that includes
faults.hh because it includes fault.hh.
7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.

Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
/gem5/src/mem/ruby/network/
H A DNetwork.hh13799:15badf7874ee Tue Mar 19 13:12:00 EDT 2019 Andrea Mondelli <Andrea.Mondelli@ucf.edu> misc: missing override specifier

Missing specifier of overridden virtual function
declared in sim_object.hh

Removed redundant "virtual" keyword

Change-Id: I42aa3349b537c9e62607bce20cf1b3aabdb99bf2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17468
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
12065:e3e51756dfef Mon Mar 13 14:19:00 EDT 2017 Nikos Nikoleris <nikos.nikoleris@arm.com> ruby: Add support for address ranges in the directory

Previously the directory covered a flat address range that always
started from address 0. This change adds a vector of address ranges
with interleaving and hashing that each directory keeps track of and
the necessary flexibility to support systems with non continuous
memory ranges.

Change-Id: I6ea1c629bdf4c5137b7d9c89dbaf6c826adfd977
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2903
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
11113:5a2e1b1b5c43 Wed Sep 16 13:10:00 EDT 2015 Joe Gross <joe.gross@amd.com> ruby: fix message buffer init order

The recent changes to make MessageBuffers SimObjects required them to be
initialized in a particular order, which could break some protocols. Fix this
by calling initNetQueues on the external nodes of each external link in the
constructor of Network.

This patch also refactors the duplicated code for checking network allocation
and setting net queues (which are called by initNetQueues) from the simple and
garnet networks to be in Network.
6154:6bb54dcb940e Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Make ruby #includes use full paths to the files they're including.
This basically means changing all #include statements and changing
autogenerated code so that it generates the correct paths. Because
slicc generates #includes, I had to hard code the include paths to
mem/protocol.
6145:15cca6ab723a Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Import ruby and slicc from GEMS

We eventually plan to replace the m5 cache hierarchy with the GEMS
hierarchy, but for now we will make both live alongside eachother.

Completed in 318 milliseconds

<<41424344454647484950>>