Searched hist:13 (Results 976 - 1000 of 1864) sorted by relevance

<<31323334353637383940>>

/gem5/
H A DMAINTAINERS14285:dd51c2516464 Wed Sep 18 13:21:00 EDT 2019 Jason Lowe-Power <jason@lowepower.com> misc: Update MAINTAINERS

Change-Id: I13ca12d4dd170ce3db03d851829df9bc62d1a74c
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20999
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
/gem5/src/dev/x86/
H A DCmos.py5629:1565c13d1483 Sat Oct 11 04:13:00 EDT 2008 Gabe Black <gblack@eecs.umich.edu> X86: Change the CMOS from a sub-device to a real SimObject
/gem5/src/cpu/simple/
H A Dtiming.cc13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses

This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range. In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported. These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
12386:2bf5fb25a5f1 Wed Dec 13 03:53:00 EST 2017 Gabe Black <gabeblack@google.com> arm,sparc,x86,base,cpu,sim: Replace the Twin(32|64)_t types with.

Replace them with std::array<>s.

Change-Id: I76624c87a1cd9b21c386a96147a18de92b8a8a34
Reviewed-on: https://gem5-review.googlesource.com/6602
Maintainer: Gabe Black <gabeblack@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
12085:de78ea63e0ca Wed Jun 07 01:13:00 EDT 2017 Sean Wilson <spwilson2@wisc.edu> cpu, gpu-compute: Replace EventWrapper use with EventFunctionWrapper

Change-Id: Idd5992463bcf9154f823b82461070d1f1842cea3
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3746
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability

This changeset adds functionality that allows system calls to retry without
affecting thread context state such as the program counter or register values
for the associated thread context (when system calls return with a retry
fault).

This functionality is needed to solve problems with blocking system calls
in multi-process or multi-threaded simulations where information is passed
between processes/threads. Blocking system calls can cause deadlock because
the simulator itself is single threaded. There is only a single thread
servicing the event queue which can cause deadlock if the thread hits a
blocking system call instruction.

To illustrate the problem, consider two processes using the producer/consumer
sharing model. The processes can use file descriptors and the read and write
calls to pass information to one another. If the consumer calls the blocking
read system call before the producer has produced anything, the call will
block the event queue (while executing the system call instruction) and
deadlock the simulation.

The solution implemented in this changeset is to recognize that the system
calls will block and then generate a special retry fault. The fault will
be sent back up through the function call chain until it is exposed to the
cpu model's pipeline where the fault becomes visible. The fault will trigger
the cpu model to replay the instruction at a future tick where the call has
a chance to succeed without actually going into a blocking state.

In subsequent patches, we recognize that a syscall will block by calling a
non-blocking poll (from inside the system call implementation) and checking
for events. When events show up during the poll, it signifies that the call
would not have blocked and the syscall is allowed to proceed (calling an
underlying host system call if necessary). If no events are returned from the
poll, we generate the fault and try the instruction for the thread context
at a distant tick. Note that retrying every tick is not efficient.

As an aside, the simulator has some multi-threading support for the event
queue, but it is not used by default and needs work. Even if the event queue
was completely multi-threaded, meaning that there is a hardware thread on
the host servicing a single simulator thread contexts with a 1:1 mapping
between them, it's still possible to run into deadlock due to the event queue
barriers on quantum boundaries. The solution of replaying at a later tick
is the simplest solution and solves the problem generally.
10342:711eb0e64249 Tue May 13 01:20:00 EDT 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: Refactor assignment of Packet types

Put the packet type swizzling (that is currently done in a lot of places)
into a refineCommand() member function.
9837:13a21202375d Mon Aug 19 03:52:00 EDT 2013 Lena Olson <lena@cs.wisc,edu> cpu: Accurately count idle cycles for simple cpu

Added a couple missing updates to the notIdleFraction stat. Without
these, it sometimes gives a (not) idle fraction that is greater than 1
or less than 0.
9648:f10eb34e3e38 Mon Apr 22 13:20:00 EDT 2013 Dam Sunwoo <dam.sunwoo@arm.com> sim: separate nextCycle() and clockEdge() in clockedObjects

Previously, nextCycle() could return the *current* cycle if the current tick was
already aligned with the clock edge. This behavior is not only confusing (not
quite what the function name implies), but also caused problems in the
drainResume() function. When exiting/re-entering the sim loop (e.g., to take
checkpoints), the CPUs will drain and resume. Due to the previous behavior of
nextCycle(), the CPU tick events were being rescheduled in the same ticks that
were already processed before draining. This caused divergence from runs that
did not exit/re-entered the sim loop. (Initially a cycle difference, but a
significant impact later on.)

This patch separates out the two behaviors (nextCycle() and clockEdge()),
uses nextCycle() in drainResume, and uses clockEdge() everywhere else.
Nothing (other than name) should change except for the drainResume timing.
9448:569d1e8f74e4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Unify the serialization code for all of the CPU models

Cleanup the serialization code for the simple CPUs and the O3 CPU. The
CPU-specific code has been replaced with a (un)serializeThread that
serializes the thread state / context of a specific thread. Assuming
that the thread state class uses the CPU-specific thread state uses
the base thread state serialization code, this allows us to restore a
checkpoint with any of the CPU models.
9442:36967173340c Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Make sure that a drained timing CPU isn't executing ucode

Currently, the timing CPU can be in the middle of a microcode sequence
or multicycle (stayAtPC is true) instruction when it is drained. This
leads to two problems:

* When switching to a hardware virtualized CPU, we obviously can't
execute gem5 microcode.

* If stayAtPC is true we might execute half of an instruction twice
when restoring a checkpoint or switching CPUs, which leads to an
incorrect execution.

After applying this patch, the CPU will be on a proper instruction
boundary, which means that it is safe to switch to any CPU model
(including hardware virtualized ones). This changeset also fixes a bug
where the timing CPU sometimes switches out with while stayAtPC is
true, which corrupts the target state after a CPU switch or
checkpoint.

Note: This changeset removes the so_state variable from checkpoints
since the drain state isn't used anymore.
/gem5/src/cpu/o3/
H A Ddecode_impl.hh9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode

Previously, the O3 CPU could stop in the middle of a microcode
sequence. This patch makes sure that the pipeline stops when it has
committed a normal instruction or exited from a microcode
sequence. Additionally, it makes sure that the pipeline has no
instructions in flight when it is drained, which should make draining
more robust.

Draining is controlled in the commit stage, which checks if the next
PC after a committed instruction is in microcode. If this isn't the
case, it requests a squash of all instructions after that the
instruction that just committed and immediately signals a drain stall
to the fetch stage. The CPU then continues to execute until the
pipeline and all associated buffers are empty.
8842:a02932e2e73d Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BP: Fix several Branch Predictor issues.
1. Updates the Branch Predictor correctly to the state
just after a mispredicted branch, if a squash occurs.
2. If a BTB does not find an entry, the branch is predicted not taken.
The global history is modified to correctly reflect this prediction.
3. Local history is now updated at the fetch stage instead of
execute stage.
4. In the Update stage of the branch predictor the local predictors are
now correctly updated according to the state of local history during
fetch stage.

This patch also improves performance by as much as 17% on some benchmarks
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8230:845c8eb5ac49 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: fix up code after sorting
4636:afc8da9f526e Sat Apr 14 13:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Add support for microcode and pull out the special branch delay slot handling. Branch delay slots need to be squash on a mispredict as well because the nnpc they saw was incorrect.
4636:afc8da9f526e Sat Apr 14 13:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Add support for microcode and pull out the special branch delay slot handling. Branch delay slots need to be squash on a mispredict as well because the nnpc they saw was incorrect.
4632:be5b8f67b8fb Fri Apr 13 09:59:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Remove most of the special handling for delay slots since they have to be squashed anyway on a mispredict. This is because the NNPC value they saw when executing was incorrect.
4318:eb4241362a80 Mon Apr 02 13:55:00 EDT 2007 Kevin Lim <ktlim@umich.edu> Remove/comment out DPRINTFs that were causing a segfault.

The removed ones were unnecessary. The commented out ones could be useful in the future, should this problem get fixed. See flyspray task #243.

src/cpu/o3/commit_impl.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/inst_queue_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/o3/lsq_unit_impl.hh:
src/cpu/o3/rename_impl.hh:
src/cpu/o3/rob_impl.hh:
Remove/comment out DPRINTFs that were causing a segfault.
2935:d1223a6c9156 Sun Jul 23 13:39:00 EDT 2006 Korey Sewell <ksewell@umich.edu> This changeset gets the MIPS ISA pretty much working in the O3CPU. It builds, runs, and gets very very close to completing the hello world
succesfully but there are some minor quirks to iron out. Who would've known a DELAY SLOT introduces that much complexity?! arrgh!

Anyways, a lot of this stuff had to do with my project at MIPS and me needing to know how I was going to get this working for the MIPS
ISA. So I figured I would try to touch it up and throw it in here (I hate to introduce non-completely working components... )

src/arch/alpha/isa/mem.isa:
spacing
src/arch/mips/faults.cc:
src/arch/mips/faults.hh:
Gabe really authored this
src/arch/mips/isa/decoder.isa:
add StoreConditional Flag to instruction
src/arch/mips/isa/formats/basic.isa:
Steven really did this file
src/arch/mips/isa/formats/branch.isa:
fix bug for uncond/cond control
src/arch/mips/isa/formats/mem.isa:
Adjust O3CPU memory access to use new memory model interface.
src/arch/mips/isa/formats/util.isa:
update LoadStoreBase template
src/arch/mips/isa_traits.cc:
update SERIALIZE partially
src/arch/mips/process.cc:
src/arch/mips/process.hh:
no need for this for NOW. ASID/Virtual addressing handles it
src/arch/mips/regfile/misc_regfile.hh:
add in clear() function and comments for future usage of special misc. regs
src/cpu/base_dyn_inst.hh:
add in nextNPC variable and supporting functions.

add isCondDelaySlot function

Update predTaken and mispredicted functions
src/cpu/base_dyn_inst_impl.hh:
init nextNPC
src/cpu/o3/SConscript:
add MIPS files to compile
src/cpu/o3/alpha/thread_context.hh:
no need for my name on this file
src/cpu/o3/bpred_unit_impl.hh:
Update RAS appropriately for MIPS
src/cpu/o3/comm.hh:
add some extra communication variables to aid in handling the
delay slots
src/cpu/o3/commit.hh:
minor name fix for nextNPC functions.
src/cpu/o3/commit_impl.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/inst_queue_impl.hh:
src/cpu/o3/rename_impl.hh:
Fix necessary variables and functions for squashes with delay slots
src/cpu/o3/cpu.cc:
Update function interface ...

adjust removeInstsNotInROB function to recognize delay slots insts
src/cpu/o3/cpu.hh:
update removeInstsNotInROB
src/cpu/o3/decode.hh:
declare necessary variables for handling delay slot
src/cpu/o3/dyn_inst.hh:
Add in MipsDynInst
src/cpu/o3/fetch.hh:
src/cpu/o3/iew.hh:
src/cpu/o3/rename.hh:
declare necessary variables and adjust functions for handling delay slot
src/cpu/o3/inst_queue.hh:
src/cpu/simple/base.cc:
no need for my name here
src/cpu/o3/isa_specific.hh:
add in MIPS files
src/cpu/o3/scoreboard.hh:
dont include alpha specific isa traits!
src/cpu/o3/thread_context.hh:
no need for my name here, i just rearranged where the file goes
src/cpu/static_inst.hh:
add isCondDelaySlot function
src/cpu/o3/mips/cpu.cc:
src/cpu/o3/mips/cpu.hh:
src/cpu/o3/mips/cpu_builder.cc:
src/cpu/o3/mips/cpu_impl.hh:
src/cpu/o3/mips/dyn_inst.cc:
src/cpu/o3/mips/dyn_inst.hh:
src/cpu/o3/mips/dyn_inst_impl.hh:
src/cpu/o3/mips/impl.hh:
src/cpu/o3/mips/params.hh:
src/cpu/o3/mips/thread_context.cc:
src/cpu/o3/mips/thread_context.hh:
MIPS file for O3CPU...mirrors ALPHA definition
2843:19c4c6c2b5b1 Thu Jul 06 13:59:00 EDT 2006 Kevin Lim <ktlim@umich.edu> Support for draining, and the new method of switching out. Now switching out happens after the pipeline has been drained, deferring the three way handshake to the normal drain mechanism. The calls of switchOut() and takeOverFrom() both take action immediately.

src/cpu/o3/commit.hh:
src/cpu/o3/commit_impl.hh:
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/decode.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/rename.hh:
src/cpu/o3/rename_impl.hh:
Support for draining, new method of switching out.
H A Dlsq_unit.hh13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor

This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
13590:d7e018859709 Mon Feb 13 04:41:00 EST 2017 Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> cpu-o3: O3 LSQ Generalisation

This patch does a large modification of the LSQ in the O3 model. The
main goal of the patch is to remove the 'an operation can be served with
one or two memory requests' assumption that is present in the LSQ
and the instruction with the req, reqLow, reqHigh triplet, and
generalising it to operations that can be addressed with one request,
and operations that require many requests, embodied in the
SingleDataRequest and the SplitDataRequest.

This modification has been done mimicking the minor model to an extent,
shifting the responsibilities of dealing with VtoP translation and
tracking the status and resources from the DynInst to the LSQ via the
LSQRequest. The LSQRequest models the information concerning the
operation, handles the creation of fragments for translation and request
as well as assembling/splitting the data accordingly.

With this modifications, the implementation of vector ISAs, particularly
on the memory side, become more rich, as the new model permits a
dissociation of the ISA characteristics as vector length, from the
microarchitectural characteristics that govern how contiguous loads are
executing, allowing exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.

Part of the complexities introduced stem from the fact that gem5 keeps a
large amount of metadata regarding, in particular, memory operations,
thus, when an instruction is squashed while some operation as TLB lookup
or cache access is ongoing, when the relevant structure communicates to
the LSQ that the operation is over, it tries to access some pieces of
data that should have died when the instruction is squashed, leading to
asserts, panics, or memory corruption. To ensure the correct behaviour,
the LSQRequest rely on assesing who is their owner, and self-destroying
if they detect their owner is done with the request, and there will be
no subsequent action. For example, in the case of an instruction
squashed whal the TLB is doing a walk to serve the translation, when the
translation is served by the TLB, the LSQRequest detects that the
instruction was squashed, and as the translation is done, no one else
expect to access its information, and therefore, it self-destructs.
Having destroyed the LSQRequest earlier, would lead to wrong behaviour
as the TLB walk may access some fields of it.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>

Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
10342:711eb0e64249 Tue May 13 01:20:00 EDT 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: Refactor assignment of Packet types

Put the packet type swizzling (that is currently done in a lot of places)
into a refineCommand() member function.
10239:592f0bb6bd6f Sat Jun 21 13:26:00 EDT 2014 Binh Pham <binhpham@cs.rutgers.edu> o3: split load & store queue full cases in rename

Check for free entries in Load Queue and Store Queue separately to
avoid cases when load cannot be renamed due to full Store Queue and
vice versa.

This work was done while Binh was an intern at AMD Research.
9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode

Previously, the O3 CPU could stop in the middle of a microcode
sequence. This patch makes sure that the pipeline stops when it has
committed a normal instruction or exited from a microcode
sequence. Additionally, it makes sure that the pipeline has no
instructions in flight when it is drained, which should make draining
more robust.

Draining is controlled in the commit stage, which checks if the next
PC after a committed instruction is in microcode. If this isn't the
case, it requests a squash of all instructions after that the
instruction that just committed and immediately signals a drain stall
to the fetch stage. The CPU then continues to execute until the
pipeline and all associated buffers are empty.
9440:fdc91cab5760 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Fix O3 LSQ debug dumping constness and formatting
8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
8545:a3992291e230 Tue Sep 13 00:58:00 EDT 2011 Ali Saidi <saidi@eecs.umich.edu> LSQ: Only trigger a memory violation with a load/load if the value changes.

Only create a memory ordering violation when the value could have changed
between two subsequent loads, instead of just when loads go out-of-order
to the same address. While not very common in the case of Alpha, with
an architecture with a hardware table walker this can happen reasonably
frequently beacuse a translation will miss and start a table walk and
before the CPU re-schedules the faulting instruction another one will
pass it to the same address (or cache block depending on the dendency
checking).

This patch has been tested with a couple of self-checking hand crafted
programs to stress ordering between two cores.

The performance improvement on SPEC benchmarks can be substantial (2-10%).
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8230:845c8eb5ac49 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: fix up code after sorting
H A Dfree_list.cc8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
/gem5/src/arch/arm/isa/insts/
H A Dmisc.isa12762:f73d3a4aaf03 Mon Apr 30 12:13:00 EDT 2018 Giacomo Travaglini <giacomo.travaglini@arm.com> arch-arm: Read APSR in User Mode

This patch substitutes reads to the CPSR in user mode (MRS CPSR) to
reads to APSR (Application Program Status Register).
This is the user level alias for the CPSR. The APSR is a subset of the
CPSR.

Change-Id: I18a70693aef6fd305a4c4cb3c6f81f331bc60a2d
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10602
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
12542:03cb745f9982 Tue Feb 13 09:01:00 EST 2018 Giacomo Travaglini <giacomo.travaglini@arm.com> arch-arm: Add AArch32 HLT Semihosting interface

AArch32 HLT instruction is now able to issue Arm Semihosting commands as
the AArch64 counterpart in either Arm and Thumb mode.

Change-Id: I77da73d2e6a9288c704a5f646f4447022517ceb6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/8372
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
12541:de165cf2809e Tue Feb 13 08:55:00 EST 2018 Giacomo Travaglini <giacomo.travaglini@arm.com> arch-arm: Add AArch32 SVC Semihosting interface

AArch32 Svc instruction is now able to issue Arm Semihosting commands as
the AArch64 counterpart in either Arm and Thumb mode.

Change-Id: Ibe47ac23d0c26f3f819cc0e2b3ee874b5cdbb3d3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/8371
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
12504:6a6d80495bd6 Tue Dec 19 19:13:00 EST 2017 Nikos Nikoleris <nikos.nikoleris@arm.com> arch-arm: Fix printing of the data cache maintenance instructions

Change-Id: I2322c7bf65b38cb07a1ea2b5dc25dfc5a0496cf0
Reviewed-by: Jack Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/7825
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
12258:08990d24fe41 Fri Oct 13 05:03:00 EDT 2017 Giacomo Travaglini <giacomo.travaglini@arm.com> arm: Add support for armv8 CRC32 instructions

This patch introduces the ARM A32/T32/A64 CRC Instructions, which are
mandatory since ARMv8.1. The UNPREDICTABLE behaviours are implemented as
follows:
1) CRC32(C)X (64 bit) instructions are decoded as Undefined in Aarch32
2) The instructions support predication in Aarch32
3) Using R15(PC) as source/dest operand is permitted in Aarch32

Change-Id: Iaf29b05874e1370c7615da79a07f111ded17b6cc
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/5521
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
11355:46c7b3e35720 Mon Feb 29 20:13:00 EST 2016 Mitch Hayenga <mitch.hayenga@arm.com> arm: Squash after returning from exceptions in v7

Properly done for the ERET instruction in v8, but not for v7.
Many control register changes are only visible after explicit
instruction synchronization barriers or exception entry/exit.
This means mode changing instructions should squash any
younger in-flight speculative instructions.
8304:16911ff780d3 Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Construct the predicate test register for more instruction programatically.

If one of the condition codes isn't being used in the execution we should only
read it if the instruction might be dependent on it. With the preeceding changes
there are several more cases where we should dynamically pick instead of assuming
as we did before.
8303:5a95f1d2494e Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Further break up condition code into NZ, C, V bits.

Break up the condition code bits into NZ, C, V registers. These are individually
written and this removes some incorrect dependencies between instructions.
8302:9f23d01421de Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Remove the saturating (Q) condition code from the renamed register.

Move the saturating bit (which is also saturating) from the renamed register
that holds the flags to the CPSR miscreg and adds a allows setting it in a
similar way to the FP saturating registers. This removes a dependency in
instructions that don't write, but need to preserve the Q bit.
8301:858384f3af1c Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Break up condition codes into normal flags, saturation, and simd.

This change splits out the condcodes from being one monolithic register
into three blocks that are updated independently. This allows CPUs
to not have to do RMW operations on the flags registers for instructions
that don't write all flags.
/gem5/src/mem/ruby/system/
H A DSequencer.cc9224:b0539d08bda8 Fri Sep 14 00:13:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> scons: Use c++0x with gcc >= 4.4 instead of 4.6

This patch shifts the version of gcc for which we enable c++0x from
4.6 to 4.4 The more long term plan is to see what the c++0x features
can bring and what level of support would be enabled simply by bumping
the required version of gcc from 4.3 to 4.4.

A few minor things had to be fixed in the code base, most notably the
choice of a hashmap implementation. In the Ruby Sequencer there were
also a few minor issues that gcc 4.4 was not too happy about.
8641:4d3ecac1abec Tue Dec 13 14:49:00 EST 2011 Nathan Binkert <nate@binkert.org> gcc: fix unused variable warnings from GCC 4.6.1
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
6856:f3caa1cd1d9a Fri Nov 13 10:44:00 EST 2009 Derek Hower <drh5@cs.wisc.edu> ruby: gave ALIASED_REQUEST priority over BUFFER_FULL in sequencer
6355:79464d8a4d2f Mon Jul 13 18:22:00 EDT 2009 pdudnik@gmail.com 1. Got rid of unused functions in DirectoryMemory
2. Reintroduced RMW_Read and RMW_Write
3. Defined -2 in the Sequencer as well as made a note about mandatory queue

Did not address the issues in the slicc because remaking the atomics altogether to allow
multiple processors to issue atomic requests at once
6353:979add6f6fb7 Mon Jul 13 01:11:00 EDT 2009 pdudnik@gmail.com Locked requests should actually be converted to ST rather than ATOMIC, because ATOMIC is for RMW.
6350:accdf59eedd3 Mon Jul 13 12:37:00 EDT 2009 pdudnik@gmail.com Replaced RMW with Locked. RMW will be used for the coherence-aided atomics other than LLSC
6349:1b3d165d890d Mon Jul 13 12:34:00 EDT 2009 pdudnik@gmail.com Moved the lock check and clearing the lock into makeRequest
6348:374e1d9b0660 Mon Jul 13 12:25:00 EDT 2009 pdudnik@gmail.com Forgot to replace one of the RubyRequest_RMW
/gem5/system/alpha/console/
H A DMakefile8015:37634fc80b3c Tue Jun 28 01:13:00 EDT 2005 Nathan Binkert <binkertn@umich.edu> pass the location of the m5 backdoor via the m5AlphaAccess variable
only compile one console

console/Makefile:
Now that the location of the m5 backdoor is passed into the
console via the m5AlphaAccess variable, we only need to
compile one console, and don't need to define TLASER or TSUNAMI
console/console.c:
Don't hardcode the location of the AlphaAccess structure, but
rely on m5 to pass in the correct value.
Setup "volatile struct AlphaAccess *m5AlphaAccess" for use and
get rid of the hardcoded usage.
7981:4fb228b84c1e Mon Dec 22 13:04:00 EST 2003 Nathan Binkert <binkertn@umich.edu> Implement GetChar()

console/Makefile:
Quick install target to copy the binary to zizzer
/gem5/src/arch/sparc/
H A Dtlb_map.hh8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
3929:3640569369a5 Thu Jan 25 13:43:00 EST 2007 Ali Saidi <saidi@eecs.umich.edu> fix smul and sdiv to sign extend, and handle overflow/underflow corretly
Only allow writing/reading of 32 bits of Y
Only allow writing/reading 32 bits of pc when pstate.am
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
only erase a entry from the lookup table if it's valid
Put in a temporary check to make sure that lookup table and tlb array stay in sync
if we are interrupted in the middle of a mico-op, reset the micropc/nexpc
so we start on the first part of it when we come back

src/arch/sparc/isa/decoder.isa:
fix smul and sdiv to sign extend, and handle overflow/underflow corretly
Only allow writing/reading of 32 bits of Y
Only allow writing/reading 32 bits of pc when pstate.am
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
src/arch/sparc/isa/formats/mem/blockmem.isa:
Put any loaded data on the first half of a micro-op in uReg0 so it can't
overwrite the register we are using for address calculation
src/arch/sparc/isa/includes.isa:
Use limits for 32bit underflow/overflow detection
src/arch/sparc/tlb.cc:
only erase a entry from the lookup table if it's valid
Put in a temporary check to make sure that lookup table and tlb array stay in sync
src/arch/sparc/tlb_map.hh:
add a print function to dump the tlb lookup table
src/cpu/simple/base.cc:
if we are interrupted in the middle of a mico-op, reset the micropc/nexpc
so we start on the first part of it when we come back
/gem5/src/dev/
H A Dintel_8254_timer.hh8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
/gem5/src/mem/ruby/common/
H A DAddress.cc6154:6bb54dcb940e Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Make ruby #includes use full paths to the files they're including.
This basically means changing all #include statements and changing
autogenerated code so that it generates the correct paths. Because
slicc generates #includes, I had to hard code the include paths to
mem/protocol.
6145:15cca6ab723a Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Import ruby and slicc from GEMS

We eventually plan to replace the m5 cache hierarchy with the GEMS
hierarchy, but for now we will make both live alongside eachother.
/gem5/src/mem/ruby/slicc_interface/
H A DAbstractCacheEntry.cc6154:6bb54dcb940e Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Make ruby #includes use full paths to the files they're including.
This basically means changing all #include statements and changing
autogenerated code so that it generates the correct paths. Because
slicc generates #includes, I had to hard code the include paths to
mem/protocol.
6145:15cca6ab723a Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Import ruby and slicc from GEMS

We eventually plan to replace the m5 cache hierarchy with the GEMS
hierarchy, but for now we will make both live alongside eachother.
/gem5/tests/quick/fs/10.linux-boot/ref/x86/linux/pc-simple-timing/
H A Dsimout9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes

The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
H A Dconfig.ini9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes

The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
/gem5/tests/quick/fs/10.linux-boot/ref/x86/linux/pc-simple-atomic/
H A Dsimout9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes

The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
H A Dconfig.ini9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes

The updates to the x87 caused the stats for several regressions to
change. This was mainly caused by the addition of a working 32-bit and
80-bit FP load instruction and xsave support.
9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
/gem5/src/dev/mips/
H A Dmalta_io.hh10280:5b67e1bdf6ad Wed Aug 13 06:57:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> mips: Remove unused private members to fix compile-time warning

Certain versions of clang complain about unused private members if
they are not used. This changeset removes such members from the
MIPS-specific classes to silence the warning.
5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode
/gem5/src/base/
H A Doutput.hh11259:4006183015a1 Thu Nov 05 13:26:00 EST 2015 Sascha Bischoff <sascha.bischoff@ARM.com> sim: Disable gzip compression for writefile pseudo instruction

The writefile pseudo instruction uses OutputDirectory::create and
OutputDirectory::openFile to create the output files. However, by
default these will check the file extention for .gz, and create a gzip
compressed stream if the file ending matches. When writing out files,
we want to write them out exactly as they are in the guest simulation,
and never want to compress them with gzio. Additionally, this causes
m5 writefile to fail when checking the error flags for the output
steam.

With this patch we add an additional no_gz argument to
OutputDirectory::create and OutputDirectory::openFile which allows us
to override the gzip compression. Therefore, for m5 writefile we
disable the filename check, and always create a standard ostream.
9398:6a348f61220c Mon Jan 07 13:05:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> mem: Add tracing support in the communication monitor

This patch adds packet tracing to the communication monitor using a
protobuf as the mechanism for creating the trace.

If no file is specified, then the tracing is disabled. If a file is
specified, then for every packet that is successfully sent, a protobuf
message is serialized to the file.
/gem5/tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-atomic-dual/
H A Dconfig.ini9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes.
8844:a451e4eda591 Mon Feb 13 01:30:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> bp: fix up stats for changes to branch predictor
H A Dsimout9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes.
8844:a451e4eda591 Mon Feb 13 01:30:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> bp: fix up stats for changes to branch predictor
/gem5/tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby/
H A Dconfig.ini11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references
11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support

Additional auxv entries leads to more instructions in start-up
while walking the list, along with different cache conflicts
wrt stack entries.
/gem5/tests/quick/se/00.hello/ref/mips/linux/simple-timing-ruby/
H A Dconfig.ini11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references
11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support

Additional auxv entries leads to more instructions in start-up
while walking the list, along with different cache conflicts
wrt stack entries.
/gem5/tests/quick/se/00.hello/ref/alpha/linux/o3-timing/
H A Dconfig.ini11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references
11384:e3cbd2823210 Thu Mar 17 13:25:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for mmap() change.

SE O3 runs see an additional reg read per mmap() call.
/gem5/src/base/loader/
H A Draw_object.cc11392:5967db4cff04 Thu Mar 17 13:34:00 EDT 2016 Brandon Potter <brandon.potter@amd.com> base: add symbol support for dynamic libraries

Libraries are loaded into the process address space using the
mmap system call. Conveniently, this happens to be a good
time to update the process symbol table with the library's
incoming symbols so we handle the table update from within the
system call.

This works just like an application's normal symbols. The only
difference between a dynamic library and a main executable is
when the symbol table update occurs. The symbol table update for
an executable happens at program load time and is finished before
the process ever begins executing. Since dynamic linking happens
at runtime, the symbol loading happens after the library is
first loaded into the process address space. The library binary
is examined at this time for a symbol section and that section
is parsed for symbol types with specific bindings (global,
local, weak). Subsequently, these symbols are added to the table
and are available for use by gem5 for things like trace
generation.

Checkpointing should work just as it did previously. The address
space (and therefore the library) will be recorded and the symbol
table will be entirely recorded. (It's not possible to do anything
clever like checkpoint a program and then load the program back
with different libraries with LD_LIBRARY_PATH, because the
library becomes part of the address space after being loaded.)
8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help

Completed in 120 milliseconds

<<31323334353637383940>>