Search

Home

Sort by

Full Search
Definition
Symbol
File Path
History
Type

In Project(s)

Searched hist:13 (Results 1076 - 1100 of 1864) sorted by relevance

<<41 42 434445 46 47 48 49 50 >>

/gem5/src/dev/x86/
H A D	cmos.hh	`9808:13ffc0066b76 Thu Jul 11 22:57:00 EDT 2013 Steve Reinhardt <stever@gmail.com> dev: make BasicPioDevice take size in constructor Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu> 5629:1565c13d1483 Sat Oct 11 04:13:00 EDT 2008 Gabe Black <gblack@eecs.umich.edu> X86: Change the CMOS from a sub-device to a real SimObject`
/gem5/src/cpu/
H A D	BaseCPU.py	12325:48e41e644187 Fri Nov 24 13:38:00 EST 2017 Andreas Sandberg <andreas.sandberg@arm.com> cpu: Don't override ISA if provided by user The BaseCPU.createThreads() method currently overrides the BaseCPU.isa parameter. This is sometimes undesirable. Change the behavior so that the default value for the isa parameter is the empty list and teach createThreads() to only override the ISA if none has been specified. Change-Id: I2ac5535e55fc57057e294d3c6a93088b33bf7b84 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jack Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-on: https://gem5-review.googlesource.com/6121 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> 12277:e6455b421c4b Thu Oct 19 13:45:00 EDT 2017 Jose Marinho <jose.marinho@arm.com> cpu: Make automatic transition to OFF optional Add the power_gating_on_idle option to control whether a core automatically enters the power gated state. The default behaviour is to transition to clock gated when idle, but not to power gated. When this option is set to true, the core automatically transitions to the power gated state after a configurable latency. Change-Id: Ida98c7fc532de4140d0e511c25613769b47b3702 Reviewed-on: https://gem5-review.googlesource.com/5741 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> 11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault). This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction. To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation. The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state. In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient. As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread contexts with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally. 9849:603e2ed487f3 Wed Sep 04 13:22:00 EDT 2013 Andreas Hansson <andreas.hansson@arm.com> cpu: Move the branch predictor out of the BaseCPU The branch predictor is guarded by having either the in-order or out-of-order CPU as one of the available CPU models and therefore should not be used in the BaseCPU. This patch moves the parameter to the relevant CPU classes. 9650:d79319eb68d5 Mon Apr 22 13:20:00 EDT 2013 Timothy M. Jones <timothy.jones@arm.com> cpu: Let python scripts obtain the number of instructions executed 9647:5b6b315472e7 Mon Apr 22 13:20:00 EDT 2013 Dam Sunwoo <dam.sunwoo@arm.com> cpu: generate SimPoint basic block vector profiles This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch. 9446:644f2a2c9bfc Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Flush TLBs on switchOut() This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming. 9433:34971d2e0019 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive. 9430:a113f27b68bd Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Introduce sanity checks when switching between CPUs This patch introduces the following sanity checks when switching between CPUs: * Check that the set of new and old CPUs do not overlap. Having an overlap between the set of new CPUs and the set of old CPUs is currently not supported. Doing such a switch used to result in the following assertion error: BaseCPU::takeOverFrom(BaseCPU): \ Assertion `!new_itb_port->isConnected()' failed. Check that all new CPUs are in the switched out state. * Check that all old CPUs are in the switched in state. 9384:877293183bdf Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Make the ISA class inherit from SimObject The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.
H A D	pc_event.cc	9649:c717bd5e0a1d Mon Apr 22 13:20:00 EDT 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Enable support for triggering a sim panic on kernel panics Add the options 'panic_on_panic' and 'panic_on_oops' to the LinuxArmSystem SimObject. When these option are enabled, the simulator panics when the guest kernel panics or oopses. Enable panic on panic and panic on oops in ARM-based test cases. 8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help 8231:51cf7f3cf9ac Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> debug: create a Debug namespace 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc\|hh) to core.(cc\|hh) since it really belogs there now. In the process, simplify how ticks are used in the python code.
/gem5/src/arch/sparc/
H A D	tlb.cc	9423:43caa4ca5979 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Add support for invalidating TLBs when draining This patch adds support for the memInvalidate() drain method. TLB flushing is requested by calling the virtual flushAll() method on the TLB. Note: This patch renames invalidateAll() to flushAll() on x86 and SPARC to make the interface consistent across all supported architectures. 8751:a6c772fef2f1 Thu Oct 13 04:37:00 EDT 2011 Gabe Black <gblack@eecs.umich.edu> SPARC: Remove the last checks of FULL_SYSTEM. 8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method. Also move the "Fault" reference counted pointer type into a separate file, sim/fault.hh. It would be better to name this less similarly to sim/faults.hh to reduce confusion, but fault.hh matches the name of the type. We could change Fault to FaultPtr to match other pointer types, and then changing the name of the file would make more sense. 7518:917208416d2a Fri Aug 13 09:10:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Tidy up endianness handling for mmapped "IPR"s. 5894:8091ac99341a Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it. 5891:73084c6bb183 Wed Feb 25 13:15:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> ISA: Replace the translate functions in the TLBs with translateAtomic. 5100:7a0180040755 Fri Sep 28 13:21:00 EDT 2007 Ali Saidi <saidi@eecs.umich.edu> Rename cycles() function to ticks() 4990:38d74405ddac Mon Aug 13 19:06:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> SPARC: Move tlb state into the tlb. Each "strand" may need to have a private copy of this state, but I couldn't find anywhere in the spec that said that after looking briefly. This prevents writes to the thread context in o3 which was causing the pipeline to be flushed and stopping any forward progress. The other ASI accessible state will probably need to be accessed differently if/when we get O3 full system up and running.
/gem5/src/cpu/o3/
H A D	lsq.hh	13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com> Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> 13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 13590:d7e018859709 Mon Feb 13 04:41:00 EST 2017 Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> cpu-o3: O3 LSQ Generalisation This patch does a large modification of the LSQ in the O3 model. The main goal of the patch is to remove the 'an operation can be served with one or two memory requests' assumption that is present in the LSQ and the instruction with the req, reqLow, reqHigh triplet, and generalising it to operations that can be addressed with one request, and operations that require many requests, embodied in the SingleDataRequest and the SplitDataRequest. This modification has been done mimicking the minor model to an extent, shifting the responsibilities of dealing with VtoP translation and tracking the status and resources from the DynInst to the LSQ via the LSQRequest. The LSQRequest models the information concerning the operation, handles the creation of fragments for translation and request as well as assembling/splitting the data accordingly. With this modifications, the implementation of vector ISAs, particularly on the memory side, become more rich, as the new model permits a dissociation of the ISA characteristics as vector length, from the microarchitectural characteristics that govern how contiguous loads are executing, allowing exploration of different LSQ to DL1 bus widths to understand the tradeoffs in complexity and performance. Part of the complexities introduced stem from the fact that gem5 keeps a large amount of metadata regarding, in particular, memory operations, thus, when an instruction is squashed while some operation as TLB lookup or cache access is ongoing, when the relevant structure communicates to the LSQ that the operation is over, it tries to access some pieces of data that should have died when the instruction is squashed, leading to asserts, panics, or memory corruption. To ensure the correct behaviour, the LSQRequest rely on assesing who is their owner, and self-destroying if they detect their owner is done with the request, and there will be no subsequent action. For example, in the case of an instruction squashed whal the TLB is doing a walk to serve the translation, when the translation is served by the TLB, the LSQRequest detects that the instruction was squashed, and as the translation is done, no one else expect to access its information, and therefore, it self-destructs. Having destroyed the LSQRequest earlier, would lead to wrong behaviour as the TLB walk may access some fields of it. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13516 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> 10239:592f0bb6bd6f Sat Jun 21 13:26:00 EDT 2014 Binh Pham <binhpham@cs.rutgers.edu> o3: split load & store queue full cases in rename Check for free entries in Load Queue and Store Queue separately to avoid cases when load cannot be renamed due to full Store Queue and vice versa. This work was done while Binh was an intern at AMD Research. 9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty. 9440:fdc91cab5760 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Fix O3 LSQ debug dumping constness and formatting 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 7520:67c670459d01 Fri Aug 13 09:16:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Add readBytes and writeBytes functions to the exec contexts. 5494:85c8d296c1cb Sat Jun 28 13:19:00 EDT 2008 Steve Reinhardt <stever@gmail.com> Backed out changeset 94a7bb476fca: caused memory leak.
H A D	fetch_impl.hh	10244:d2deb51a4abf Mon Jun 30 13:50:00 EDT 2014 Anthony Gutierrez <atgutier@umich.edu> cpu: implement a bi-mode branch predictor 9982:b2bfc23f932c Fri Nov 15 13:21:00 EST 2013 Anthony Gutierrez <atgutier@umich.edu> cpu: allow the fetch buffer to be smaller than a cache line the current implementation of the fetch buffer in the o3 cpu is only allowed to be the size of a cache line. some architectures, e.g., ARM, have fetch buffers smaller than a cache line, see slide 22 at: http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf this patch allows the fetch buffer to be set to values smaller than a cache line. 9644:07352f119e48 Mon Apr 22 13:20:00 EDT 2013 Ali Saidi <Ali.Saidi@ARM.com> cpu: fix a switching issue with the o3 cpu. This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for a icache access. 9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty. 9427:ddf45c1d54d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0. 9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change. 9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs. 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. 8641:4d3ecac1abec Tue Dec 13 14:49:00 EST 2011 Nathan Binkert <nate@binkert.org> gcc: fix unused variable warnings from GCC 4.6.1
H A D	commit_impl.hh	13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault). This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction. To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation. The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state. In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient. As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread contexts with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally. 9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty. 9437:8088e94a9de0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Fix broken squashAfter implementation in O3 CPU Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle. 9427:ddf45c1d54d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0. 9382:1c97b57d5169 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> cpu: rename the misleading inSyscall to noSquashFromTC isSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so re-name the variable to something more appropriate 8843:7d3ac6813147 Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BPred: Fix RAS to handle predicated call/return instructions. Change RAS to fix issues with predicated call/return instructions. Handled all cases in the life of a predicated call and return instruction. 8842:a02932e2e73d Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BP: Fix several Branch Predictor issues. 1. Updates the Branch Predictor correctly to the state just after a mispredicted branch, if a squash occurs. 2. If a BTB does not find an entry, the branch is predicted not taken. The global history is modified to correctly reflect this prediction. 3. Local history is now updated at the fetch stage instead of execute stage. 4. In the Update stage of the branch predictor the local predictors are now correctly updated according to the state of local history during fetch stage. This patch also improves performance by as much as 17% on some benchmarks 8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help 8230:845c8eb5ac49 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: fix up code after sorting
/gem5/src/cpu/simple/
H A D	timing.hh	13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com> Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> 13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 12085:de78ea63e0ca Wed Jun 07 01:13:00 EDT 2017 Sean Wilson <spwilson2@wisc.edu> cpu, gpu-compute: Replace EventWrapper use with EventFunctionWrapper Change-Id: Idd5992463bcf9154f823b82461070d1f1842cea3 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3746 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 9442:36967173340c Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Make sure that a drained timing CPU isn't executing ucode Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out with while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore. 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 7520:67c670459d01 Fri Aug 13 09:16:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> CPU: Add readBytes and writeBytes functions to the exec contexts. 5894:8091ac99341a Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it. 5890:bdef71accd68 Wed Feb 25 13:15:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> CPU: Get rid of translate... functions from various interface classes. 5169:bfd18d401251 Thu Oct 18 13:15:00 EDT 2007 Ali Saidi <saidi@eecs.umich.edu> CPU: Use the ThreadContext cpu id instead of the params cpu id in all cases.
/gem5/tests/configs/
H A D	tsunami-simple-atomic-dual.py	9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations Most of the test cases currently contain a large amount of duplicated boiler plate code. This changeset introduces a set of classes that encapsulates most of the functionality when setting up a test configuration. The following base classes are introduced: * BaseSystem - Basic system configuration that can be used for both SE and FS simulation. * BaseFSSystem - Basic FS configuration uni-processor and multi-processor configurations. * BaseFSSystemUniprocessor - Basic FS configuration for uni-processor configurations. This is provided as a way to make existing test cases backwards compatible. Architecture specific implementations are provided for ARM, Alpha, and X86. 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. 4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc\|hh) to core.(cc\|hh) since it really belogs there now. In the process, simplify how ticks are used in the python code. 3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU. Note that properly setting cpu_id on all CPUs is now required for correct operation. src/arch/SConscript: src/base/traceflags.py: src/cpu/base.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: src/python/m5/objects/BaseCPU.py: tests/configs/simple-atomic.py: tests/configs/simple-timing.py: tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU.
H A D	tsunami-simple-atomic.py	9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations Most of the test cases currently contain a large amount of duplicated boiler plate code. This changeset introduces a set of classes that encapsulates most of the functionality when setting up a test configuration. The following base classes are introduced: * BaseSystem - Basic system configuration that can be used for both SE and FS simulation. * BaseFSSystem - Basic FS configuration uni-processor and multi-processor configurations. * BaseFSSystemUniprocessor - Basic FS configuration for uni-processor configurations. This is provided as a way to make existing test cases backwards compatible. Architecture specific implementations are provided for ARM, Alpha, and X86. 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. 4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc\|hh) to core.(cc\|hh) since it really belogs there now. In the process, simplify how ticks are used in the python code. 3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU. Note that properly setting cpu_id on all CPUs is now required for correct operation. src/arch/SConscript: src/base/traceflags.py: src/cpu/base.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: src/python/m5/objects/BaseCPU.py: tests/configs/simple-atomic.py: tests/configs/simple-timing.py: tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU.
H A D	tsunami-simple-timing.py	9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations Most of the test cases currently contain a large amount of duplicated boiler plate code. This changeset introduces a set of classes that encapsulates most of the functionality when setting up a test configuration. The following base classes are introduced: * BaseSystem - Basic system configuration that can be used for both SE and FS simulation. * BaseFSSystem - Basic FS configuration uni-processor and multi-processor configurations. * BaseFSSystemUniprocessor - Basic FS configuration for uni-processor configurations. This is provided as a way to make existing test cases backwards compatible. Architecture specific implementations are provided for ARM, Alpha, and X86. 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. 4167:ce5d0f62f13b Tue Mar 06 14:13:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Move all of the parameters of the Root SimObject so they are directly configured by python. Move stuff from root.(cc\|hh) to core.(cc\|hh) since it really belogs there now. In the process, simplify how ticks are used in the python code. 3170:37fd1e73f836 Sun Oct 08 13:53:00 EDT 2006 Steve Reinhardt <stever@eecs.umich.edu> Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU. Note that properly setting cpu_id on all CPUs is now required for correct operation. src/arch/SConscript: src/base/traceflags.py: src/cpu/base.hh: src/cpu/simple/atomic.cc: src/cpu/simple/timing.cc: src/cpu/simple/timing.hh: src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: src/python/m5/objects/BaseCPU.py: tests/configs/simple-atomic.py: tests/configs/simple-timing.py: tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: Implement Alpha LL/SC support for SimpleCPU (Atomic & Timing) and PhysicalMemory. No support for caches or O3CPU.
H A D	base_config.py	11156:a37dda0f0202 Mon Oct 05 14:13:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> tests: Update SMT tests to correctly configure CPUs The 01.hello-2T-smt test case for the O3 CPU didn't correctly setup the number of threads before creating interrupt controllers, which confused the constructor in BaseCPU. This changeset adds SMT support to the test configuration infrastructure. 9654:64b653b3d72f Mon Apr 22 13:20:00 EDT 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Add support for testing KVM-based CPUs This changeset adds support for initializing a KVM VM in the BaseSystem test class and adds the following methods in run.py: require_file -- Test if a file exists and abort/skip if not. require_kvm -- Test if KVM support has been compiled into gem5 (i.e., BaseKvmCPU exists) and the KVM device exists on the host. 9447:156f74caf0d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Add CPU switching tests This changeset adds a set of tests that stress the CPU switching code. It adds the following test configurations: * tsunami-switcheroo-full -- Alpha system (atomic, timing, O3) * realview-switcheroo-atomic -- ARM system (atomic<->atomic) * realview-switcheroo-timing -- ARM system (timing<->timing) * realview-switcheroo-o3 -- ARM system (O3<->O3) * realview-switcheroo-full -- ARM system (atomic, timing, O3) Reference data is provided for the 10.linux-boot test case. All of the tests trigger a CPU switch once per millisecond during the boot process. The in-order CPU model was not included in any of the tests as it does not support CPU handover. 9408:10a84dceab25 Mon Jan 07 13:05:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> config: Do not use hardcoded physmem in fs script This patch generalises the address range resolution for the I/O cache and I/O bridge such that they do not assume a single memory. The patch involves adding a parameter to the system which is then defined based on the memories that are to be visible from the I/O subsystem, whether behind a cache or a bridge. The change is needed to allow interleaved memory controllers in the system. 9380:e428871da248 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> tests: Create base classes to encapsulate common test configurations Most of the test cases currently contain a large amount of duplicated boiler plate code. This changeset introduces a set of classes that encapsulates most of the functionality when setting up a test configuration. The following base classes are introduced: * BaseSystem - Basic system configuration that can be used for both SE and FS simulation. * BaseFSSystem - Basic FS configuration uni-processor and multi-processor configurations. * BaseFSSystemUniprocessor - Basic FS configuration for uni-processor configurations. This is provided as a way to make existing test cases backwards compatible. Architecture specific implementations are provided for ARM, Alpha, and X86.
/gem5/tests/quick/fs/10.linux-boot/ref/x86/linux/pc-simple-timing/
H A D	stats.txt	9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes The updates to the x87 caused the stats for several regressions to change. This was mainly caused by the addition of a working 32-bit and 80-bit FP load instruction and xsave support. 9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes This patch bumps the stats to reflect the slight change in how the retry is handled, and also the pruning of some redundant stats. 9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock This patch updates the stats to reflect the change in the default system clock from 1 THz to 1GHz. The changes are due to the DMA devices now injecting requests at a lower pace. 9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM This patch bumps the stats to match the use of SimpleDRAM instead of SimpleMemory in all inorder and O3 regressions, and also all full-system regressions. A number of performance-related stats change, and a whole bunch of stats are added for the memory controller. 9039:9a22621c741c Mon Jun 04 13:43:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Update stats for the CPUID change.
/gem5/tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-timing/
H A D	stats.txt	11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references 9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes This patch bumps the stats to reflect the slight change in how the retry is handled, and also the pruning of some redundant stats. 9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes. 9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock This patch updates the stats to reflect the change in the default system clock from 1 THz to 1GHz. The changes are due to the DMA devices now injecting requests at a lower pace. 9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM This patch bumps the stats to match the use of SimpleDRAM instead of SimpleMemory in all inorder and O3 regressions, and also all full-system regressions. A number of performance-related stats change, and a whole bunch of stats are added for the memory controller.
/gem5/tests/quick/fs/10.linux-boot/ref/arm/linux/realview-simple-timing-dual/
H A D	stats.txt	11680:b4d943429dc6 Thu Oct 13 18:21:00 EDT 2016 Curtis Dunham <Curtis.Dunham@arm.com> stats: update references 9568:cd1351d4d850 Fri Mar 01 13:20:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect SimpleDRAM changes This patch bumps the stats to reflect the slight change in how the retry is handled, and also the pruning of some redundant stats. 9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes. 9314:63e7cfff4188 Thu Oct 25 13:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update the stats to reflect the 1GHz default system clock This patch updates the stats to reflect the change in the default system clock from 1 THz to 1GHz. The changes are due to the DMA devices now injecting requests at a lower pace. 9312:e05e1b69ebf2 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect use of SimpleDRAM This patch bumps the stats to match the use of SimpleDRAM instead of SimpleMemory in all inorder and O3 regressions, and also all full-system regressions. A number of performance-related stats change, and a whole bunch of stats are added for the memory controller.
/gem5/src/arch/mips/
H A D	SConscript	9384:877293183bdf Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@arm.com> arch: Make the ISA class inherit from SimObject The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU. 9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change. 9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs. 5793:321f79ddb500 Tue Jan 13 17:17:00 EST 2009 Nathan Binkert <nate@binkert.org> SCons: centralize the Dir() workaround for newer versions of scons. Scons bug id: 2006 M5 Bug id: 308 5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode
H A D	isa_traits.hh	9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change. 9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode 4772:f08370a81812 Fri Jul 27 01:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> X86: Fix argument register indexing. Code was assuming that all argument registers followed in order from ArgumentReg0. There is now an ArgumentReg array which is indexed to find the right index. There is a constant, NumArgumentRegs, which can be used to protect against using an invalid ArgumentReg.
H A D	utility.hh	8300:eb279d6e08a2 Fri May 13 18:27:00 EDT 2011 Chander Sudanthi <chander.sudanthi@arm.com> Trace: Allow printing ASIDs and selectively tracing based on user/kernel code. Debug flags are ExecUser, ExecKernel, and ExecAsid. ExecUser and ExecKernel are set by default when Exec is specified. Use minus sign with ExecUser or ExecKernel to remove user or kernel tracing respectively. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode 4181:6edaeff44647 Tue Mar 13 12:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Replaced makeExtMI with predecode. Removed the getOpcode function from StaticInst which only made sense for Alpha. Started implementing the x86 predecoder. 4181:6edaeff44647 Tue Mar 13 12:13:00 EDT 2007 Gabe Black <gblack@eecs.umich.edu> Replaced makeExtMI with predecode. Removed the getOpcode function from StaticInst which only made sense for Alpha. Started implementing the x86 predecoder.
/gem5/src/arch/x86/
H A D	faults.cc	8232:b28d06a175be Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help 7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method. Also move the "Fault" reference counted pointer type into a separate file, sim/fault.hh. It would be better to name this less similarly to sim/faults.hh to reduce confusion, but fault.hh matches the name of the type. We could change Fault to FaultPtr to match other pointer types, and then changing the name of the file would make more sense. 5909:ecbd27e5d1f8 Wed Feb 25 13:17:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Add a trace flag for tracing faults. 5895:569e3b31a868 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults. 5681:54c2d92f601e Mon Oct 13 01:42:00 EDT 2008 Gabe Black <gblack@eecs.umich.edu> X86: Make the x86 interrupt fault kick off the interrupt microcode.
H A D	pagetable_walker.hh	14096:bde52fccbf0f Fri Jul 12 13:29:00 EDT 2019 Matthew Poremba <matthew.poremba@amd.com> arch-x86: Don't free PTW state with inflight requests If a page table walk is squashed, the walker state is being deleted in the squash code. If there are in flight requests, the deleted walker state values may be clobbered, leading to undefined behavior. This adds a squashed boolean to the walker state which is set if a walk is squashed while requests are still in flight. When packets for the in flight request return, we check if the walk was squashed and return that the walk is complete once the number of in flight requests reaches zero. The walker state is then freed by the PTW. Change-Id: I57a64b1548b83a8a9e8441fc9d6f33e9842df2b3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19568 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com> 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 5897:29cecf4fe602 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Fix the timing mode of the page table walker. 5895:569e3b31a868 Wed Feb 25 13:16:00 EST 2009 Gabe Black <gblack@eecs.umich.edu> X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults.
/gem5/src/dev/arm/
H A D	pl111.cc	9648:f10eb34e3e38 Mon Apr 22 13:20:00 EDT 2013 Dam Sunwoo <dam.sunwoo@arm.com> sim: separate nextCycle() and clockEdge() in clockedObjects Previously, nextCycle() could return the current cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-entered the sim loop. (Initially a cycle difference, but a significant impact later on.) This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing. 9415:f5d159450dfb Mon Jan 07 13:05:00 EST 2013 Chander Sudanthi <Chander.Sudanthi@arm.com> ARM: pl111/LCD framebuffer checkpointing fix Fixed check pointing of the framebuffer. Previously, the pixel size was not considered in determining the size of the buffer to checkpoint. This patch checkpoints the entire framebuffer instead of the first quarter. 9395:bf428987f54e Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Fix DMA event handling bug in the PL111 model The PL111 model currently maintains a list of pre-allocated DmaDoneEvents to prevent unnecessary heap allocations. This list effectively works like a stack where the top element is the latest scheduled event. When an event triggers, the top pointer is moved down the stack. This obviously breaks since events usually retire from the bottom (events don't necessarily have to retire in order), which triggers the following assertion: gem5.debug: build/ARM/dev/arm/pl111.cc:460: void Pl111::fillFifo(): \ Assertion `!dmaDoneEvent[dmaPendingNum-1].scheduled()' failed. This changeset adds a vector listing the currently unused events. This vector acts like a stack where the an element is popped off the stack when a new event is needed an pushed on the stack when they trigger. 9394:e88cf95d33d3 Mon Jan 07 13:05:00 EST 2013 Andreas Hansson <andreas.hansson@arm.com> dev: Fix the Pl111 timings by separating pixel and DMA clock This patch fixes the Pl111 timings by creating a separate clock for the pixel timings. The device clock is used for all interactions with the memory system, just like the AHB clock on the actual module. The result without this patch is that the module only is allowed to send one request every tick of the 24MHz clock which causes a huge backlog. 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes
/gem5/util/stats/
H A D	stats.py	2343:a2b4a6ccee56 Sat Jun 10 13:08:00 EDT 2006 Nathan Binkert <binkertn@umich.edu> Remove all binning stuff 1318:57d49cdb92a5 Tue Jan 18 13:34:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> now really done with stability stats stuff 1317:0b6026d3000b Tue Jan 18 13:25:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> finished stability stats option 1307:e6b9976895c6 Wed Jan 12 13:44:00 EST 2005 Nathan Binkert <binkertn@umich.edu> More graph output junk util/stats/stats.py: Add the graphing output for 6GHz and 8GHz runs 1301:f85f6fb43474 Thu Jan 13 23:59:00 EST 2005 Ali Saidi <saidi@eecs.umich.edu> fix a display bug add option to limit results to a set of ticks fix ticks code to work util/stats/info.py: change samples -> ticks and pass all parameters util/stats/stats.py: add option to select a set of ticks and fix display bug
/gem5/src/arch/arm/isa/insts/
H A D	fp.isa	13979:1e0c4607ac12 Tue Apr 30 13:24:00 EDT 2019 Ciro Santilli <ciro.santilli@arm.com> arch-arm: implement VMINNM and VMAXNM scalar version ARMv8.2 16-bit versions have not yet been implemented, but a placeholders were created for them. Refactor the nearby decoding tree to closely match the ARM spec A32 decode table. That piece of the tree can also be called from thumb which decodes it in the same way, although the thumb decode table has a different terminology The old code didn't match neither A32 or T32 terminologies, so it is better to at least match one of them to help verify correctness. Change-Id: Iabbbca2932557cf6c98ce36690c385c3ddf39ed8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18690 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> 13738:84439021dcf6 Mon Feb 18 13:06:00 EST 2019 Ciro Santilli <ciro.santilli@arm.com> arch-arm: implement floating point aarch32 VCVTA family These instructions round floating point to integer, and were added to aarch32 as an extension to ARMv7. Change-Id: I62d1705badc95a4e8954a5ad62b2b6bc9e4ffe00 Reviewed-on: https://gem5-review.googlesource.com/c/16788 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> 11671:520509f3e66c Thu Oct 13 14:22:00 EDT 2016 Mitch Hayenga <mitch.hayenga@arm.com> isa,arm: Add missing AArch32 FP instructions This commit adds missing non-predicated, scalar floating point instructions. Specifically VRINT* floating point integer rounding instructions and VSEL* floating point conditional selects. Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com> 8303:5a95f1d2494e Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Further break up condition code into NZ, C, V bits. Break up the condition code bits into NZ, C, V registers. These are individually written and this removes some incorrect dependencies between instructions. 8301:858384f3af1c Fri May 13 18:27:00 EDT 2011 Ali Saidi <Ali.Saidi@ARM.com> ARM: Break up condition codes into normal flags, saturation, and simd. This change splits out the condcodes from being one monolithic register into three blocks that are updated independently. This allows CPUs to not have to do RMW operations on the flags registers for instructions that don't write all flags.
/gem5/src/sim/
H A D	faults.hh	11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault). This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction. To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation. The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state. In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient. As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread contexts with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally. 8545:a3992291e230 Tue Sep 13 00:58:00 EDT 2011 Ali Saidi <saidi@eecs.umich.edu> LSQ: Only trigger a memory violation with a load/load if the value changes. Only create a memory ordering violation when the value could have changed between two subsequent loads, instead of just when loads go out-of-order to the same address. While not very common in the case of Alpha, with an architecture with a hardware table walker this can happen reasonably frequently beacuse a translation will miss and start a table walk and before the CPU re-schedules the faulting instruction another one will pass it to the same address (or cache block depending on the dendency checking). This patch has been tested with a couple of self-checking hand crafted programs to stress ordering between two cores. The performance improvement on SPEC benchmarks can be substantial (2-10%). 8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes 7867:3ee9e6c2e8f7 Mon Jan 31 16:13:00 EST 2011 Gabe Black <gblack@eecs.umich.edu> Fault: Move the definition of NoFault from faults.hh to fault.hh. Moving the definition of NoFault into fault.hh doesn't bring any new dependencies with it, and allows some files to include just fault.hh which has less baggage. NoFault will still be available to everything that includes faults.hh because it includes fault.hh. 7678:f19b6a3a8cec Mon Sep 13 22:26:00 EDT 2010 Gabe Black <gblack@eecs.umich.edu> Faults: Pass the StaticInst involved, if any, to a Fault's invoke method. Also move the "Fault" reference counted pointer type into a separate file, sim/fault.hh. It would be better to name this less similarly to sim/faults.hh to reduce confusion, but fault.hh matches the name of the type. We could change Fault to FaultPtr to match other pointer types, and then changing the name of the file would make more sense.
/gem5/src/mem/ruby/network/
H A D	Network.hh	13799:15badf7874ee Tue Mar 19 13:12:00 EDT 2019 Andrea Mondelli <Andrea.Mondelli@ucf.edu> misc: missing override specifier Missing specifier of overridden virtual function declared in sim_object.hh Removed redundant "virtual" keyword Change-Id: I42aa3349b537c9e62607bce20cf1b3aabdb99bf2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17468 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> 12065:e3e51756dfef Mon Mar 13 14:19:00 EDT 2017 Nikos Nikoleris <nikos.nikoleris@arm.com> ruby: Add support for address ranges in the directory Previously the directory covered a flat address range that always started from address 0. This change adds a vector of address ranges with interleaving and hashing that each directory keeps track of and the necessary flexibility to support systems with non continuous memory ranges. Change-Id: I6ea1c629bdf4c5137b7d9c89dbaf6c826adfd977 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2903 Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 11113:5a2e1b1b5c43 Wed Sep 16 13:10:00 EDT 2015 Joe Gross <joe.gross@amd.com> ruby: fix message buffer init order The recent changes to make MessageBuffers SimObjects required them to be initialized in a particular order, which could break some protocols. Fix this by calling initNetQueues on the external nodes of each external link in the constructor of Network. This patch also refactors the duplicated code for checking network allocation and setting net queues (which are called by initNetQueues) from the simple and garnet networks to be in Network. 6154:6bb54dcb940e Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Make ruby #includes use full paths to the files they're including. This basically means changing all #include statements and changing autogenerated code so that it generates the correct paths. Because slicc generates #includes, I had to hard code the include paths to mem/protocol. 6145:15cca6ab723a Mon May 11 13:38:00 EDT 2009 Nathan Binkert <nate@binkert.org> ruby: Import ruby and slicc from GEMS We eventually plan to replace the m5 cache hierarchy with the GEMS hierarchy, but for now we will make both live alongside eachother.

Completed in 318 milliseconds

<<41 42 434445 46 47 48 49 50 >>