#
14297:b4519e586f5e |
|
10-Sep-2019 |
Jordi Vaquero <jordi.vaquero@metempsy.com> |
cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>
This change is based on modify the way we move the AtomicOpFunctor* through gem5 in order to mantain proper ownership of the object and ensuring its destruction when it is no longer used.
Doing that we fix at the same time a memory leak in Request.hh where we were assigning a new AtomicOpFunctor* without destroying the previous one.
This change creates a new type AtomicOpFunctor_ptr as a std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except for its only usage when AtomicOpFunc() is called.
Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13954:2f400a5f2627 |
|
07-Jul-2017 |
Giacomo Gabrielli <giacomo.gabrielli@arm.com> |
cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors.
Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com>
Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
13652:45d94ac03a27 |
|
22-Jan-2018 |
Tuan Ta <qtt2@cornell.edu> |
cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system.
Atomic memory instruction is treated as a special store instruction in all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
12749:223c83ed9979 |
|
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
12748:ae5ce8e42de7 |
|
03-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Substitute pointer to Request with aliased RequestPtr
Every usage of Request* in the code has been replaced with the RequestPtr alias. This is a preparing patch for when RequestPtr will be the typdefed to a smart pointer to Request rather then a raw pointer to Request.
Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10995 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
|
#
11608:6319a1125f1c |
|
14-Aug-2016 |
Nikos Nikoleris <nikos.nikoleris@arm.com> |
cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
11303:f694764d656d |
|
17-Jan-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cpu. arch: add initiateMemRead() to ExecContext interface
For historical reasons, the ExecContext interface had a single function, readMem(), that did two different things depending on whether the ExecContext supported atomic memory mode (i.e., AtomicSimpleCPU) or timing memory mode (all the other models). In the former case, it actually performed a memory read; in the latter case, it merely initiated a read access, and the read completion did not happen until later when a response packet arrived from the memory system.
This led to some confusing things, including timing accesses being required to provide a pointer for the return data even though that pointer was only used in atomic mode.
This patch splits this interface, adding a new initiateMemRead() function to the ExecContext interface to replace the timing-mode use of readMem().
For consistency and clarity, the readMemTiming() helper function in the ISA definitions is renamed to initiateMemRead() as well. For x86, where the access size is passed in explicitly, we can also get rid of the data parameter at this level. For other ISAs, where the access size is determined from the type of the data parameter, we have to keep the parameter for that purpose.
|
#
11169:44b5c183c3cd |
|
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication.
As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables).
|
#
11168:f98eb2da15a4 |
|
12-Oct-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.
|
#
11151:ca4ea9b5c052 |
|
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu,isa,mem: Add per-thread wakeup logic
Changes wakeup functionality so that only specific threads on SMT capable cpus are woken.
|
#
11147:cc8d6e99cf46 |
|
30-Sep-2015 |
Mitch Hayenga <mitch.hayenga@arm.com> |
config,cpu: Add SMT support to Atomic and Timing CPUs
Adds SMT support to the "simple" CPU models so that they can be used with other SMT-supported CPUs. Example usage: this enables the TimingSimpleCPU to be used to warmup caches before swapping to detailed mode with the in-order or out-of-order based CPU models.
|
#
10935:acd48ddd725f |
|
28-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
revert 5af8f40d8f2c
|
#
10934:5af8f40d8f2c |
|
26-Jul-2015 |
Nilay Vaish <nilay@cs.wisc.edu> |
cpu: implements vector registers
This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now.
|
#
10905:a6ca6831e775 |
|
07-Jul-2015 |
Andreas Sandberg <andreas.sandberg@arm.com> |
sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.
|
#
10698:829adc48e175 |
|
16-Feb-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Make readMiscRegNoEffect const throughout
Finally took the plunge and made this apply to all ISAs, not just ARM.
|
#
10664:61a0b02aa800 |
|
25-Jan-2015 |
Ali Saidi <Ali.Saidi@ARM.com> |
cpu: Remove all notion that we know when the cpu is misspeculating.
We have no way of knowing if a CPU model is on the wrong path with our execute-in-execute CPU models. Don't pretend that we do.
|
#
10537:47fe87b0cf97 |
|
14-Nov-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arm: Fixes based on UBSan and static analysis
Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base.
Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.
|
#
10529:05b5a6cf3521 |
|
06-Nov-2014 |
Marc Orr <morr@cs.wisc.edu> |
x86 isa: This patch attempts an implementation at mwait.
Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
#
10408:a59c189de383 |
|
20-Sep-2014 |
Mitch Hayenga <mitch.hayenga@arm.com> |
cpu: Remove unused deallocateContext calls
The call paths for de-scheduling a thread are halt() and suspend(), from the thread context. There is no call to deallocateContext() in general, though some CPUs chose to define it. This patch removes the function from BaseCPU and the cores which do not require it.
|
#
10379:c00f6d7e2681 |
|
19-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Pass faults by const reference where possible
This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unecessary copies being created, contributing to the increment/decrement of the reference counters.
|
#
10319:4207f9bfcceb |
|
03-Sep-2014 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
arch, cpu: Factor out the ExecContext into a proper base class
We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models.
The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%.
|
#
10193:d717abc806aa |
|
09-May-2014 |
Curtis Dunham <Curtis.Dunham@arm.com> |
cpu: add more instruction mix statistics
For the o3, add instruction mix (OpClass) histogram at commit (stats also already collected at issue). For the simple CPUs we add a histogram of executed instructions
|
#
10061:3b0d0c988ed6 |
|
09-Feb-2014 |
Andreas Sandberg <andreas@sandberg.pp.se> |
cpu: simple: Add support for using branch predictors
This changesets adds branch predictor support to the BaseSimpleCPU. The simple CPUs normally don't need a branch predictor, however, there are at least two cases where it can be desirable:
1) A simple CPU can be used to warm the branch predictor of an O3 CPU before switching to the slower O3 model.
2) The simple CPU can be used as a quick way of evaluating/debugging new branch predictors since it exposes branch predictor statistics.
Limitations: * Since the simple CPU doesn't speculate, only one instruction will be active in the branch predictor at a time (i.e., the branch predictor will never see speculative branches).
* The outcome of a branch prediction does not affect the performance of the simple CPU.
|
#
9920:028e4da64b42 |
|
15-Oct-2013 |
Yasuko Eckert <yasuko.eckert@amd.com> |
cpu: add a condition-code register class
Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.
|
#
9918:2c7219e2d999 |
|
15-Oct-2013 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cpu: rename *_DepTag constants to *_Reg_Base
Make these names more meaningful.
Specifically, made these substitutions:
s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;
|
#
9461:67a6ba6604c8 |
|
12-Jan-2013 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint.
This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.
|
#
9448:569d1e8f74e4 |
|
07-Jan-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
cpu: Unify the serialization code for all of the CPU models
Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.
|
#
9023:e9201a7bce59 |
|
26-May-2012 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Merge the predecoder and decoder.
These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process.
|
#
9020:14321ce30881 |
|
25-May-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Decode: Make the Decoder class defined per ISA.
|
#
8887:20ea02da9c53 |
|
09-Mar-2012 |
Geoffrey Blake <geoffrey.blake@arm.com> |
CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable
Enables the CheckerCPU to be selected at runtime with the --checker option from the configs/example/fs.py and configs/example/se.py configuration files. Also merges with the SE/FS changes.
|
#
8850:ed91b534ed04 |
|
24-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
CPU: Round-two unifying instr/data CPU ports across models
This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps in simplifying the code, make assumptions more explicit, andfurther ease future patches related to the CPU ports.
The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports.
|
#
8834:21e8d54ecf07 |
|
12-Feb-2012 |
Anthony Gutierrez <atgutier@umich.edu> |
cpu: add separate stats for insts/ops both globally and per cpu model
|
#
8809:bb10807da889 |
|
01-Feb-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Merge with head, hopefully the last time for this batch.
|
#
8806:669e93d79ed9 |
|
29-Jan-2012 |
Gabe Black <gblack@eecs.umich.edu> |
Implement Ali's review feedback.
Try to decrease indentation, and remove some redundant FullSystem checks.
|
#
8779:2a590c51adb1 |
|
01-Nov-2011 |
Gabe Black <gblack@eecs.umich.edu> |
SE/FS: Expose the same methods on the CPUs in SE and FS modes.
|
#
8737:770ccf3af571 |
|
31-Jan-2012 |
Koan-Sin Tan <koansin.tan@gmail.com> |
clang: Enable compiling gem5 using clang 2.9 and 3.0
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.
|
#
8733:64a7bf8fa56c |
|
31-Jan-2012 |
Geoffrey Blake <geoffrey.blake@arm.com> |
CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5
Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.
|
#
8557:f44572edfba3 |
|
19-Sep-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Syscall: Make the syscall function available in both SE and FS modes.
In FS mode the syscall function will panic, but the interface will be consistent and code which calls syscall can be compiled in. This will allow, for instance, instructions that use syscall to be built unconditionally but then not returned by the decoder.
|
#
8541:27aaee8ec7cc |
|
09-Sep-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Decode: Pull instruction decoding out of the StaticInst class into its own.
This change pulls the instruction decoding machinery (including caches) out of the StaticInst class and puts it into its own class. This has a few intrinsic benefits. First, the StaticInst code, which has gotten to be quite large, gets simpler. Second, the code that handles decode caching is now separated out into its own component and can be looked at in isolation, making it easier to understand. I took the opportunity to restructure the code a bit which will hopefully also help.
Beyond that, this change also lays some ground work for each ISA to have its own, potentially stateful decode object. We'd be able to include less contextualizing information in the ExtMachInst objects since that context would be applied at the decoder. Also, the decoder could "know" ahead of time that all the instructions it's going to see are going to be, for instance, 64 bit mode, and it will have one less thing to check when it decodes them. Because the decode caching mechanism has been separated out, it's now possible to have multiple caches which correspond to different types of decoding context. Having one cache for each element of the cross product of different configurations may become prohibitive, so it may be desirable to clear out the cache when relatively static state changes and not to have one for each setting.
Because the decode function is no longer universally accessible as a static member of the StaticInst class, a new function was added to the ThreadContexts that returns the applicable decode object.
|
#
8276:66bb0d8ae8bf |
|
04-May-2011 |
Ali Saidi <Ali.Saidi@ARM.com> |
CPU: Fix a case where timing simple cpu faults can nest.
If we fault, change the state to faulting so that we don't fault again in the same cycle.
|
#
8229:78bf55f23338 |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes
|
#
7897:d9e8b1fd1a9f |
|
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
mcpat: Adds McPAT performance counters
Updated patches from Rick Strong's set that modify performance counters for McPAT
|
#
7783:9b880b40ac10 |
|
07-Dec-2010 |
Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> |
O3: Make all instructions that write a misc. register not perform the write until commit.
ARM instructions updating cumulative flags (ARM FP exceptions and saturation flags) are not serialized.
Added aliases for ARM FP exceptions and saturation flags in FPSCR. Removed write accesses to the FP condition codes for most ARM VFP instructions: only VCMP and VCMPE instructions update the FP condition codes. Removed a potential cause of seg. faults in the O3 model for NEON memory macro-ops (ARM).
|
#
7764:03efcdc3421f |
|
15-Nov-2010 |
Gabe Black <gblack@eecs.umich.edu> |
O3: Make O3 support variably lengthed instructions.
|
#
7725:00ea9430643b |
|
08-Nov-2010 |
Ali Saidi <Ali.Saidi@ARM.com> |
ARM/Alpha/Cpu: Change prefetchs to be more like normal loads.
This change modifies the way prefetches work. They are now like normal loads that don't writeback a register. Previously prefetches were supposed to call prefetch() on the exection context, so they executed with execute() methods instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs are blank, meaning that they get executed, but don't actually do anything.
On Alpha dead cache copy code was removed and prefetches are now normal ops. They count as executed operations, but still don't do anything and IsMemRef is not longer set on them.
On ARM IsDataPrefetch or IsInstructionPreftech is now set on all prefetch instructions. The timing simple CPU doesn't try to do anything special for prefetches now and they execute with the normal memory code path.
|
#
7720:65d338a8dba4 |
|
31-Oct-2010 |
Gabe Black <gblack@eecs.umich.edu> |
ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about, the PC and the NPC, and the lsb of the PC signaled whether or not you were in PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next micropc, x86 and ARM introduced variable length instruction sets, and ARM started to keep track of mode bits in the PC. Each CPU model handled PCs in its own custom way that needed to be updated individually to handle the new dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack, the complexity could be hidden in the ISA at the ISA implementation's expense. Areas like the branch predictor hadn't been updated to handle branch delay slots or micropcs, and it turns out that had introduced a significant (10s of percent) performance bug in SPARC and to a lesser extend MIPS. Rather than perpetuate the problem by reworking O3 again to handle the PC features needed by x86, this change was introduced to rework PC handling in a more modular, transparent, and hopefully efficient way.
PC type:
Rather than having the superset of all possible elements of PC state declared in each of the CPU models, each ISA defines its own PCState type which has exactly the elements it needs. A cross product of canned PCState classes are defined in the new "generic" ISA directory for ISAs with/without delay slots and microcode. These are either typedef-ed or subclassed by each ISA. To read or write this structure through a *Context, you use the new pcState() accessor which reads or writes depending on whether it has an argument. If you just want the address of the current or next instruction or the current micro PC, you can get those through read-only accessors on either the PCState type or the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the move away from readPC. That name is ambiguous since it's not clear whether or not it should be the actual address to fetch from, or if it should have extra bits in it like the PAL mode bit. Each class is free to define its own functions to get at whatever values it needs however it needs to to be used in ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the PC and into a separate field like ARM.
These types can be reset to a particular pc (where npc = pc + sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as appropriate), printed, serialized, and compared. There is a branching() function which encapsulates code in the CPU models that checked if an instruction branched or not. Exactly what that means in the context of branch delay slots which can skip an instruction when not taken is ambiguous, and ideally this function and its uses can be eliminated. PCStates also generally know how to advance themselves in various ways depending on if they point at an instruction, a microop, or the last microop of a macroop. More on that later.
Ideally, accessing all the PCs at once when setting them will improve performance of M5 even though more data needs to be moved around. This is because often all the PCs need to be manipulated together, and by getting them all at once you avoid multiple function calls. Also, the PCs of a particular thread will have spatial locality in the cache. Previously they were grouped by element in arrays which spread out accesses.
Advancing the PC:
The PCs were previously managed entirely by the CPU which had to know about PC semantics, try to figure out which dimension to increment the PC in, what to set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction with the PC type itself. Because most of the information about how to increment the PC (mainly what type of instruction it refers to) is contained in the instruction object, a new advancePC virtual function was added to the StaticInst class. Subclasses provide an implementation that moves around the right element of the PC with a minimal amount of decision making. In ISAs like Alpha, the instructions always simply assign NPC to PC without having to worry about micropcs, nnpcs, etc. The added cost of a virtual function call should be outweighed by not having to figure out as much about what to do with the PCs and mucking around with the extra elements.
One drawback of making the StaticInsts advance the PC is that you have to actually have one to advance the PC. This would, superficially, seem to require decoding an instruction before fetch could advance. This is, as far as I can tell, realistic. fetch would advance through memory addresses, not PCs, perhaps predicting new memory addresses using existing ones. More sophisticated decisions about control flow would be made later on, after the instruction was decoded, and handed back to fetch. If branching needs to happen, some amount of decoding needs to happen to see that it's a branch, what the target is, etc. This could get a little more complicated if that gets done by the predecoder, but I'm choosing to ignore that for now.
Variable length instructions:
To handle variable length instructions in x86 and ARM, the predecoder now takes in the current PC by reference to the getExtMachInst function. It can modify the PC however it needs to (by setting NPC to be the PC + instruction length, for instance). This could be improved since the CPU doesn't know if the PC was modified and always has to write it back.
ISA parser:
To support the new API, all PC related operand types were removed from the parser and replaced with a PCState type. There are two warts on this implementation. First, as with all the other operand types, the PCState still has to have a valid operand type even though it doesn't use it. Second, using syntax like PCS.npc(target) doesn't work for two reasons, this looks like the syntax for operand type overriding, and the parser can't figure out if you're reading or writing. Instructions that use the PCS operand (which I've consistently called it) need to first read it into a local variable, manipulate it, and then write it back out.
Return address stack:
The return address stack needed a little extra help because, in the presence of branch delay slots, it has to merge together elements of the return PC and the call PC. To handle that, a buildRetPC utility function was added. There are basically only two versions in all the ISAs, but it didn't seem short enough to put into the generic ISA directory. Also, the branch predictor code in O3 and InOrder were adjusted so that they always store the PC of the actual call instruction in the RAS, not the next PC. If the call instruction is a microop, the next PC refers to the next microop in the same macroop which is probably not desirable. The buildRetPC function advances the PC intelligently to the next macroop (in an ISA specific way) so that that case works.
Change in stats:
There were no change in stats except in MIPS and SPARC in the O3 model. MIPS runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could likely be improved further by setting call/return instruction flags and taking advantage of the RAS.
TODO:
Add != operators to the PCState classes, defined trivially to be !(a==b). Smooth out places where PCs are split apart, passed around, and put back together later. I think this might happen in SPARC's fault code. Add ISA specific constructors that allow setting PC elements without calling a bunch of accessors. Try to eliminate the need for the branching() function. Factor out Alpha's PAL mode pc bit into a separate flag field, and eliminate places where it's blindly masked out or tested in the PC.
|
#
7664:487916d36377 |
|
31-Aug-2010 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Get rid of the unused ev5_trap function on the simple and checker CPUs.
|
#
7600:eff7f79f7dfd |
|
23-Aug-2010 |
Min Kyu Jeong <minkyu.jeong@arm.com> |
CPU: Make Exec trace to print predication result (if false) for memory instructions
|
#
7597:063f160e8b50 |
|
23-Aug-2010 |
Min Kyu Jeong <minkyu.jeong@arm.com> |
ARM/O3: store the result of the predicate evaluation in DynInst or Threadstate. THis allows the CPU to handle predicated-false instructions accordingly. This particular patch makes loads that are predicated-false to be sent straight to the commit stage directly, not waiting for return of the data that was never requested since it was predicated-false.
|
#
7445:dfd04ffc1773 |
|
03-Jun-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
Minor remote GDB cleanup. Expand the help text on the --remote-gdb-port option so people know you can use it to disable remote gdb without reading the source code, and thus don't waste any time trying to add a separate option to do that. Clean up some gdb-related cruft I found while looking for where one would add a gdb disable option, before I found the comment that told me that I didn't need to do that.
|
#
7045:e21fe6a62b1c |
|
23-Mar-2010 |
Steve Reinhardt <steve.reinhardt@amd.com> |
cpu: fix exec tracing memory corruption bug Accessing traceData (to call setAddress() and/or setData()) after initiating a timing translation was causing crashes, since a failed translation could delete the traceData object before returning.
It turns out that there was never a need to access traceData after initiating the translation, as the traced data was always available earlier; this ordering was merely historical. Furthermore, traceData->setAddress() and traceData->setData() were being called both from the CPU model and the ISA definition, often redundantly.
This patch standardizes all setAddress and setData calls for memory instructions to be in the CPU models and not in the ISA definition. It also moves those calls above the translation calls to eliminate the crashes.
|
#
6658:f4de76601762 |
|
23-Sep-2009 |
Nathan Binkert <nate@binkert.org> |
arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh
|
#
6314:781969fbeca9 |
|
09-Jul-2009 |
Gabe Black <gblack@eecs.umich.edu> |
Registers: Get rid of the float register width parameter.
|
#
6221:58a3c04e6344 |
|
26-May-2009 |
Nathan Binkert <nate@binkert.org> |
types: add a type for thread IDs and try to use it everywhere
|
#
5999:3cf8e71257e0 |
|
05-Mar-2009 |
Nathan Binkert <nate@binkert.org> |
stats: Fix all stats usages to deal with template fixes
|
#
5894:8091ac99341a |
|
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it.
|
#
5807:57f9f8b8e62f |
|
24-Jan-2009 |
Nathan Binkert <nate@binkert.org> |
cpu: provide a wakeup mechanism that can be used to pull CPUs out of sleep. Make interrupts use the new wakeup method, and pull all of the interrupt stuff into the cpu base class so that only the wakeup code needs to be updated. I tried to make wakeup, wakeCPU, and the various other mechanisms for waking and sleeping a little more sane, but I couldn't understand why the statistics were changing the way they were. Maybe we'll try again some day.
|
#
5712:199d31b47f7b |
|
02-Nov-2008 |
Lisa Hsu <hsul@eecs.umich.edu> |
make BaseCPU the provider of _cpuId, and cpuId() instead of being scattered across the subclasses. generally make it so that member data is _cpuId and accessor functions are cpuId(). The ID val comes from the python (default -1 if none provided), and if it is -1, the index of cpuList will be given. this has passed util/regress quick and se.py -n4 and fs.py -n4 as well as standard switch.
|
#
5704:98224505352a |
|
21-Oct-2008 |
Nathan Binkert <nate@binkert.org> |
style: Use the correct m5 style for things relating to interrupts.
|
#
5702:bf84e2fa05f7 |
|
20-Oct-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
O3CPU: Undo Gabe's changes to remove hwrei and simpalcheck from O3 CPU. Removing hwrei causes the instruction after the hwrei to be fetched before the ITB/DTB_CM register is updated in a call pal call sys and thus the translation fails because the user is attempting to access a super page address.
Minimally, it seems as though some sort of fetch stall or refetch after a hwrei is required. I think this works currently because the hwrei uses the exec context interface, and the o3 stalls when that occurs.
Additionally, these changes don't update the LOCK register and probably break ll/sc. Both o3 changes were removed since a great deal of manual patching would be required to only remove the hwrei change.
|
#
5640:c811ced9efc1 |
|
11-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Eliminate the simPalCheck funciton.
|
#
5639:67cc7f0427e7 |
|
11-Oct-2008 |
Gabe Black <gblack@eecs.umich.edu> |
CPU: Eliminate the hwrei function.
|
#
5543:3af77710f397 |
|
10-Sep-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
style: Remove non-leading tabs everywhere they shouldn't be. Developers should configure their editors to not insert tabs
|
#
5529:9ae69b9cd7fd |
|
11-Aug-2008 |
Nathan Binkert <nate@binkert.org> |
params: Convert the CPU objects to use the auto generated param structs. A whole bunch of stuff has been converted to use the new params stuff, but the CPU wasn't one of them. While we're at it, make some things a bit more stylish. Most of the work was done by Gabe, I just cleaned stuff up a bit more at the end.
|
#
5496:6899b894166f |
|
01-Jul-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
After a checkpoint (and thus a stats reset), the not_idle_fraction/notIdleFraction statistic is really wrong. The notIdleFraction statistic isn't updated when the statistics reset, probably because the cpu Status information was pulled into the atomic and timing cpus. This changeset pulls Status back into the BaseSimpleCPU object. Anyone care to comment on the odd naming of the Status instance? It shouldn't just be status because that is confusing with Port::Status, but _status seems a bit strage too.
|
#
5358:e9acb84bbafb |
|
26-Feb-2008 |
Gabe Black <gblack@eecs.umich.edu> |
TLB: Make a TLB base class and put a virtual demapPage function in it.
|
#
5348:7847a4bf9641 |
|
14-Feb-2008 |
Ali Saidi <saidi@eecs.umich.edu> |
CPU: move the PC Events code to a place where the code won't be executed multiple times if an instruction faults.
|
#
5250:42577371ff31 |
|
15-Nov-2007 |
Korey Sewell <ksewell@umich.edu> |
Get MIPS simple regression working. Take out unecessary functions "setShadowSet", "CacheOp"
|
#
5222:bb733a878f85 |
|
13-Nov-2007 |
Korey Sewell <ksewell@umich.edu> |
Add in files from merge-bare-iron, get them compiling in FS and SE mode
|
#
5169:bfd18d401251 |
|
18-Oct-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
CPU: Use the ThreadContext cpu id instead of the params cpu id in all cases.
|
#
4998:51a0f9f59cc5 |
|
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Simple CPU: Make sure only instructions which complete without faulting are counted.
|
#
4997:e7380529bd2d |
|
26-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Address Translation: Make SE mode use an actual TLB/MMU for translation like FS.
|
#
4950:f5f19784acf1 |
|
07-Aug-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make a microcode branch microop. Also some touch up for ruflag.
|
#
4870:fcc39d001154 |
|
30-Jun-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Get rid of Packet result field. Error responses are now encoded in cmd field.
|
#
4661:44458219add1 |
|
22-Jun-2007 |
Korey Sewell <ksewell@umich.edu> |
mips import pt. 1
src/arch/mips/SConscript: "mips import pt.1".
|
#
4564:d1fb13424616 |
|
13-Jun-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Seperate the pc-pc and the pc of the incoming bytes, and get rid of the "moreBytes" which just takes a MachInst.
src/arch/x86/predecoder.cc: Seperate the pc-pc and the pc of the incoming bytes, and get rid of the "moreBytes" which just takes a MachInst. Also make the "opSize" field describe the number of bytes and not the log of the number of bytes.
|
#
4377:ca55a0b1990a |
|
18-May-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Changes to make simple cpu handle pcs appropriately for x86
|
#
4240:cde9d7751cce |
|
14-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem into ahchoo.blinky.homelinux.org:/home/gblack/m5/newmem-x86
src/arch/mips/utility.hh: src/arch/x86/SConscript: Hand merge
|
#
4185:42c0395a03f9 |
|
07-Mar-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
I missed a couple of WithEffects, this should do it
|
#
4182:5b2c0d266107 |
|
14-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Make the predecoder an object with it's own switched header file. Start adding predecoding functionality to x86.
src/arch/SConscript: src/arch/alpha/utility.hh: src/arch/mips/utility.hh: src/arch/sparc/utility.hh: src/cpu/base.hh: src/cpu/o3/fetch.hh: src/cpu/o3/fetch_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/static_inst.hh: src/arch/alpha/predecoder.hh: src/arch/mips/predecoder.hh: src/arch/sparc/predecoder.hh: Make the predecoder an object with it's own switched header file.
|
#
4181:6edaeff44647 |
|
13-Mar-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Replaced makeExtMI with predecode. Removed the getOpcode function from StaticInst which only made sense for Alpha. Started implementing the x86 predecoder.
|
#
4172:141705d83494 |
|
07-Mar-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
*MiscReg->*MiscRegNoEffect, *MiscRegWithEffect->*MiscReg
|
#
4050:cf1daaef9109 |
|
12-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zeep.pool:/z/saidi/work/m5.newmem
src/cpu/simple/atomic.cc: merge steve's changes in.
|
#
4040:eb894f3fc168 |
|
12-Feb-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
rename store conditional stuff as extra data so it can be used for conditional swaps as well Add support for a twin 64 bit int load Add Memory barrier and write barrier flags as appropriate Make atomic memory ops atomic
src/arch/alpha/isa/mem.isa: src/arch/alpha/locked_mem.hh: src/cpu/base_dyn_inst.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: rename store conditional stuff as extra data so it can be used for conditional swaps as well src/arch/alpha/types.hh: src/arch/mips/types.hh: src/arch/sparc/types.hh: add a largest read data type for statically allocating read buffers in atomic simple cpu src/arch/isa_parser.py: Add support for a twin 64 bit int load src/arch/sparc/isa/decoder.isa: Make atomic memory ops atomic Add Memory barrier and write barrier flags as appropriate src/arch/sparc/isa/formats/mem/basicmem.isa: add post access code block and define a twinload format for twin loads src/arch/sparc/isa/formats/mem/blockmem.isa: remove old microcoded twin load coad src/arch/sparc/isa/formats/mem/mem.isa: swap.isa replaces the code in loadstore.isa src/arch/sparc/isa/formats/mem/util.isa: add a post access code block src/arch/sparc/isa/includes.isa: need bigint.hh for Twin64_t src/arch/sparc/isa/operands.isa: add a twin 64 int type src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: add support for twinloads add support for swap and conditional swap instructions rename store conditional stuff as extra data so it can be used for conditional swaps as well src/mem/packet.cc: src/mem/packet.hh: Add support for atomic swap memory commands src/mem/packet_access.hh: Add endian conversion function for Twin64_t type src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: Add support for atomic swap memory commands Rename sc code to extradata
|
#
4027:53292b42ee1c |
|
12-Feb-2007 |
Steve Reinhardt <stever@eecs.umich.edu> |
Move store conditional result checking from SimpleAtomicCpu write function into Alpha ISA description. write now just generically returns a result value if the res pointer is non-null (which means we can only provide a res pointer if we expect a valid result value).
|
#
3980:9bcb2a2e9bb8 |
|
27-Jan-2007 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer:/bk/newmem into zower.eecs.umich.edu:/eecshome/m5/newmem
src/arch/sparc/isa/formats/mem/util.isa: src/arch/sparc/isa_traits.hh: src/arch/sparc/system.cc: Hand Merge
|
#
3960:1dca397b2bab |
|
20-Dec-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Initial work to make remote gdb available in SE mode. This is completely untested.
|
#
3918:1f9a98d198e8 |
|
26-Jan-2007 |
Ali Saidi <saidi@eecs.umich.edu> |
make our code a little more standards compliant pretty close to compiling w/ suns compiler
briefly: add dummy return after panic()/fatal() split out flags by compiler vendor include cstring and cmath where appropriate use std namespace for string ops
SConstruct: Add code to detect compiler and choose cflags based on detected compiler Fix zlib check to work with suncc src/SConscript: split out flags by compiler vendor src/arch/sparc/isa/decoder.isa: use correct namespace for sqrt src/arch/sparc/isa/formats/basic.isa: add dummy return around panic src/arch/sparc/isa/formats/integerop.isa: use correct namespace for stringops src/arch/sparc/isa/includes.isa: include cstring and cmath where appropriate src/arch/sparc/isa_traits.hh: remove dangling comma src/arch/sparc/system.cc: dummy return to make sun cc front end happy src/arch/sparc/tlb.cc: src/base/compression/lzss_compression.cc: use std namespace for string ops src/arch/sparc/utility.hh: no reason to say something is unsigned unsigned int src/base/compression/null_compression.hh: dummy returns to for suncc front end src/base/cprintf.hh: use standard variadic argument syntax instead of gnuc specefic renaming src/base/hashmap.hh: don't need to define hash for suncc src/base/hostinfo.cc: need stdio.h for sprintf src/base/loader/object_file.cc: munmap is in std namespace not null src/base/misc.hh: use M5 generic noreturn macros use standard variadic macro __VA_ARGS__ src/base/pollevent.cc: we need file.h for file flags src/base/random.cc: mess with include files to make suncc happy src/base/remote_gdb.cc: malloc memory for function instead of having a non-constant in an array size src/base/statistics.hh: use std namespace for floor src/base/stats/text.cc: include math.h for rint (cmath won't work) src/base/time.cc: use suncc version of ctime_r src/base/time.hh: change macro to work with both gcc and suncc src/base/timebuf.hh: include cstring from memset and use std:: src/base/trace.hh: change variadic macros to be normal format src/cpu/SConscript: add dummy returns where appropriate src/cpu/activity.cc: include cstring for memset src/cpu/exetrace.hh: include cstring fro memcpy src/cpu/simple/base.hh: add dummy return for panic src/dev/baddev.cc: src/dev/pciconfigall.cc: src/dev/platform.cc: src/dev/sparc/t1000.cc: add dummy return where appropriate src/dev/ide_atareg.h: make define work for both gnuc and suncc src/dev/io_device.hh: add dummy returns where approirate src/dev/pcidev.hh: src/mem/cache/cache_impl.hh: src/mem/cache/miss/blocking_buffer.cc: src/mem/cache/tags/lru.hh: src/mem/cache/tags/split.hh: src/mem/cache/tags/split_lifo.hh: src/mem/cache/tags/split_lru.hh: src/mem/dram.cc: src/mem/packet.cc: src/mem/port.cc: include cstring for string ops src/dev/sparc/mm_disk.cc: add dummy return where appropriate include cstring for string ops src/mem/cache/miss/blocking_buffer.hh: src/mem/port.hh: Add dummy return where appropriate src/mem/cache/tags/iic.cc: cast hastSets to double for log() call src/mem/physical.cc: cast pmemAddr to char* for munmap src/sim/byteswap.hh: make define work for suncc and gnuc
|
#
3792:dae368e56d0e |
|
16-Dec-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Changes to the isa_parser and affected files to fix an indexing problem with split execute instructions and miscregs aliasing with integer registers.
src/arch/isa_parser.py: Rearranged things so that classes with more than one execute function treat operands properly. 1. Eliminated the CodeBlock class 2. Created a SubOperandList 3. Redefined how InstObjParams is constructed
To define an InstObjParam, you can either pass in a single code literal which will be named "code", or you can pass in a dictionary of code snippets which will be substituted into the Templates. In order to get this to work, there is a new restriction that each template has only one function in it. These changes should only affect memory instructions which have regular and split execute functions.
Also changed the MiscRegs so that they use the instrunctions srcReg and destReg arrays. src/arch/sparc/isa/formats/basic.isa: src/arch/sparc/isa/formats/branch.isa: src/arch/sparc/isa/formats/integerop.isa: src/arch/sparc/isa/formats/mem/basicmem.isa: src/arch/sparc/isa/formats/mem/blockmem.isa: src/arch/sparc/isa/formats/mem/util.isa: src/arch/sparc/isa/formats/nop.isa: src/arch/sparc/isa/formats/priv.isa: src/arch/sparc/isa/formats/trap.isa: Rearranged to work with new InstObjParam scheme. src/cpu/o3/sparc/dyn_inst.hh: Added functions to access the miscregs using the indexes from instructions srcReg and destReg arrays. Also changed the names of the other accessors so that they have the suffix "Operand" if they use those arrays. src/cpu/simple/base.hh: Added functions to access the miscregs using the indexes from instructions srcReg and destReg arrays.
|
#
3735:86a7cf4dcc11 |
|
12-Dec-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Rename the StaticInst-based (read|set)(Int|Float)Reg methods to (read|set)(Int|Float)RegOperand to distinguish from non-StaticInst version.
|
#
3521:0b0b3551def0 |
|
03-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Got rid of "inPalMode". Some places are still effectively checking if they are in PAL mode, however.
|
#
3479:4fbcaa81d105 |
|
01-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Merge zizzer.eecs.umich.edu:/bk/newmem/ into zeep.eecs.umich.edu:/home/gblack/m5/newmemmemops
|
#
3468:cf23ad1ceef2 |
|
01-Nov-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Adjustments for the AlphaTLB changing to AlphaISA::TLB and changing register file functions to not take faults
|
#
3454:26850ac19a39 |
|
31-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Move IntrFlag into the MiscRegFile and get rid of specialized accessor functions.
|
#
3453:c3ce58882751 |
|
31-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Put the Alpha tlb stuff into the AlphaISA namespace, and give the classes more neutral names.
|
#
3402:db60546818d0 |
|
31-Oct-2006 |
Kevin Lim <ktlim@umich.edu> |
Remove mem parameter. Now the translating port asks the CPU's dcache's peer for its MemObject instead of having to have a paramter for the MemObject.
configs/example/fs.py: configs/example/se.py: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: src/cpu/simple_thread.cc: src/cpu/simple_thread.hh: src/cpu/thread_state.cc: src/cpu/thread_state.hh: tests/configs/o3-timing-mp.py: tests/configs/o3-timing.py: tests/configs/simple-atomic-mp.py: tests/configs/simple-atomic.py: tests/configs/simple-timing-mp.py: tests/configs/simple-timing.py: tests/configs/tsunami-simple-atomic-dual.py: tests/configs/tsunami-simple-atomic.py: tests/configs/tsunami-simple-timing-dual.py: tests/configs/tsunami-simple-timing.py: No need for mem parameter any more. src/cpu/checker/cpu.cc: Use new constructor for simple thread (no more MemObject parameter). src/cpu/checker/cpu.hh: Remove MemObject parameter. src/cpu/memtest/memtest.hh: Ports now take in their MemObject owner. src/cpu/o3/alpha/cpu_builder.cc: Remove mem parameter. src/cpu/o3/alpha/cpu_impl.hh: Remove memory parameter and clean up handling of TranslatingPort. src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/mips/cpu_builder.cc: src/cpu/o3/mips/cpu_impl.hh: src/cpu/o3/params.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/cpu.hh: src/cpu/ozone/cpu_builder.cc: src/cpu/ozone/cpu_impl.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/ozone/simple_params.hh: src/cpu/ozone/thread_state.hh: src/cpu/simple/atomic.cc: Remove memory parameter.
|
#
3276:dc3cd126b479 |
|
15-Oct-2006 |
Gabe Black <gblack@eecs.umich.edu> |
Started implementing microcode.
|
#
2840:227f7c4f8c81 |
|
05-Jul-2006 |
Kevin Lim <ktlim@umich.edu> |
Remove sampler and serializer. Now they are handled through C++ interacting with Python.
src/SConscript: src/cpu/base.cc: src/cpu/base.hh: src/cpu/checker/cpu.hh: src/cpu/checker/cpu_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch.hh: src/cpu/ozone/cpu.hh: src/cpu/ozone/cpu_impl.hh: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/sim/pseudo_inst.cc: Remove sampler. src/sim/sim_object.cc: Remove serializer.
|
#
2683:d6b72bb2ed97 |
|
07-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Reorganization/renaming of CPUExecContext. Now it is called SimpleThread in order to clear up the confusion due to the many ExecContexts. It also derives from a common ThreadState object, which holds various state common to threads across CPU models.
Following with the previous check-in, ExecContext now refers only to the interface provided to the ISA in order to access CPU state. ThreadContext refers to the interface provided to all objects outside the CPU in order to access thread state. SimpleThread provides all thread state and the interface to access it, and is suitable for simple execution models such as the SimpleCPU.
src/SConscript: Include thread state file. src/arch/alpha/ev5.cc: src/cpu/checker/cpu.cc: src/cpu/checker/cpu.hh: src/cpu/checker/thread_context.hh: src/cpu/memtest/memtest.cc: src/cpu/memtest/memtest.hh: src/cpu/o3/cpu.cc: src/cpu/ozone/cpu_impl.hh: src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: Rename CPUExecContext to SimpleThread. src/cpu/base_dyn_inst.hh: Make thread member variables protected.. src/cpu/o3/alpha_cpu.hh: src/cpu/o3/cpu.hh: Make various members of ThreadState protected. src/cpu/o3/alpha_cpu_impl.hh: Push generation of TranslatingPort into the CPU itself. Make various members of ThreadState protected. src/cpu/o3/thread_state.hh: Pull a lot of common code into the base ThreadState class. src/cpu/ozone/thread_state.hh: Rename CPUExecContext to SimpleThread, move a lot of common code into base ThreadState class. src/cpu/thread_state.hh: Push a lot of common code into base ThreadState class. This goes along with renaming CPUExecContext to SimpleThread, and making it derive from ThreadState. src/cpu/simple_thread.cc: Rename CPUExecContext to SimpleThread, make it derive from ThreadState. This helps push a lot of common code/state into a single class that can be used by all CPUs. src/cpu/simple_thread.hh: Rename CPUExecContext to SimpleThread, make it derive from ThreadState. src/kern/system_events.cc: Rename cpu_exec_context to thread_context. src/sim/process.hh: Remove unused forward declaration.
|
#
2680:246e7104f744 |
|
06-Jun-2006 |
Kevin Lim <ktlim@umich.edu> |
Change ExecContext to ThreadContext. This is being renamed to differentiate between the interface used objects outside of the CPU, and the interface used by the ISA. ThreadContext is used by objects outside of the CPU and is specifically defined in thread_context.hh. ExecContext is more implicit, and is defined by files such as base_dyn_inst.hh or cpu/simple/base.hh.
Further renames/reorganization will be coming shortly; what is currently CPUExecContext (the old ExecContext from m5) will be renamed to SimpleThread or something similar.
src/arch/alpha/arguments.cc: src/arch/alpha/arguments.hh: src/arch/alpha/ev5.cc: src/arch/alpha/faults.cc: src/arch/alpha/faults.hh: src/arch/alpha/freebsd/system.cc: src/arch/alpha/freebsd/system.hh: src/arch/alpha/isa/branch.isa: src/arch/alpha/isa/decoder.isa: src/arch/alpha/isa/main.isa: src/arch/alpha/linux/process.cc: src/arch/alpha/linux/system.cc: src/arch/alpha/linux/system.hh: src/arch/alpha/linux/threadinfo.hh: src/arch/alpha/process.cc: src/arch/alpha/regfile.hh: src/arch/alpha/stacktrace.cc: src/arch/alpha/stacktrace.hh: src/arch/alpha/tlb.cc: src/arch/alpha/tlb.hh: src/arch/alpha/tru64/process.cc: src/arch/alpha/tru64/system.cc: src/arch/alpha/tru64/system.hh: src/arch/alpha/utility.hh: src/arch/alpha/vtophys.cc: src/arch/alpha/vtophys.hh: src/arch/mips/faults.cc: src/arch/mips/faults.hh: src/arch/mips/isa_traits.cc: src/arch/mips/isa_traits.hh: src/arch/mips/linux/process.cc: src/arch/mips/process.cc: src/arch/mips/regfile/float_regfile.hh: src/arch/mips/regfile/int_regfile.hh: src/arch/mips/regfile/misc_regfile.hh: src/arch/mips/regfile/regfile.hh: src/arch/mips/stacktrace.hh: src/arch/sparc/faults.cc: src/arch/sparc/faults.hh: src/arch/sparc/isa_traits.hh: src/arch/sparc/linux/process.cc: src/arch/sparc/linux/process.hh: src/arch/sparc/process.cc: src/arch/sparc/regfile.hh: src/arch/sparc/solaris/process.cc: src/arch/sparc/stacktrace.hh: src/arch/sparc/ua2005.cc: src/arch/sparc/utility.hh: src/arch/sparc/vtophys.cc: src/arch/sparc/vtophys.hh: src/base/remote_gdb.cc: src/base/remote_gdb.hh: src/cpu/base.cc: src/cpu/base.hh: src/cpu/base_dyn_inst.hh: src/cpu/checker/cpu.cc: src/cpu/checker/cpu.hh: src/cpu/checker/exec_context.hh: src/cpu/cpu_exec_context.cc: src/cpu/cpu_exec_context.hh: src/cpu/cpuevent.cc: src/cpu/cpuevent.hh: src/cpu/exetrace.hh: src/cpu/intr_control.cc: src/cpu/memtest/memtest.hh: src/cpu/o3/alpha_cpu.hh: src/cpu/o3/alpha_cpu_impl.hh: src/cpu/o3/alpha_dyn_inst_impl.hh: src/cpu/o3/commit.hh: src/cpu/o3/commit_impl.hh: src/cpu/o3/cpu.cc: src/cpu/o3/cpu.hh: src/cpu/o3/fetch_impl.hh: src/cpu/o3/regfile.hh: src/cpu/o3/thread_state.hh: src/cpu/ozone/back_end.hh: src/cpu/ozone/cpu.hh: src/cpu/ozone/cpu_impl.hh: src/cpu/ozone/front_end.hh: src/cpu/ozone/front_end_impl.hh: src/cpu/ozone/inorder_back_end.hh: src/cpu/ozone/lw_back_end.hh: src/cpu/ozone/lw_back_end_impl.hh: src/cpu/ozone/lw_lsq.hh: src/cpu/ozone/lw_lsq_impl.hh: src/cpu/ozone/thread_state.hh: src/cpu/pc_event.cc: src/cpu/pc_event.hh: src/cpu/profile.cc: src/cpu/profile.hh: src/cpu/quiesce_event.cc: src/cpu/quiesce_event.hh: src/cpu/simple/atomic.cc: src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: src/cpu/static_inst.cc: src/cpu/static_inst.hh: src/cpu/thread_state.hh: src/dev/alpha_console.cc: src/dev/ns_gige.cc: src/dev/sinic.cc: src/dev/tsunami_cchip.cc: src/kern/kernel_stats.cc: src/kern/kernel_stats.hh: src/kern/linux/events.cc: src/kern/linux/events.hh: src/kern/system_events.cc: src/kern/system_events.hh: src/kern/tru64/dump_mbuf.cc: src/kern/tru64/tru64.hh: src/kern/tru64/tru64_events.cc: src/kern/tru64/tru64_events.hh: src/mem/vport.cc: src/mem/vport.hh: src/sim/faults.cc: src/sim/faults.hh: src/sim/process.cc: src/sim/process.hh: src/sim/pseudo_inst.cc: src/sim/pseudo_inst.hh: src/sim/syscall_emul.cc: src/sim/syscall_emul.hh: src/sim/system.cc: src/cpu/thread_context.hh: src/sim/system.hh: src/sim/vptr.hh: Change ExecContext to ThreadContext.
|
#
2665:a124942bacb8 |
|
31-May-2006 |
Ali Saidi <saidi@eecs.umich.edu> |
Updated Authors from bk prs info
|
#
2662:f24ae2d09e27 |
|
30-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
Minor further cleanup & commenting of Packet class.
src/cpu/simple/atomic.cc: Make common ifetch setup based on Request rather than Packet. Packet::reset() no longer a separate function. sendAtomic() returns latency, not absolute tick. src/cpu/simple/atomic.hh: sendAtomic returns latency, not absolute tick. src/cpu/simple/base.cc: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: Make common ifetch setup based on Request rather than Packet. src/dev/alpha_console.cc: src/dev/ide_ctrl.cc: src/dev/io_device.cc: src/dev/isa_fake.cc: src/dev/ns_gige.cc: src/dev/pciconfigall.cc: src/dev/sinic.cc: src/dev/tsunami_cchip.cc: src/dev/tsunami_io.cc: src/dev/tsunami_pchip.cc: src/dev/uart8250.cc: src/mem/physical.cc: Get rid of redundant Packet time field. src/mem/packet.cc: Eliminate reset() method. src/mem/packet.hh: Fold reset() function into reinitFromRequest()... it was only ever called together with that function. Get rid of redundant time field. Cleanup/add comments. src/mem/port.hh: Document in comment that sendAtomic returns latency, not absolute tick.
|
#
2632:1bb2f91485ea |
|
22-May-2006 |
Steve Reinhardt <stever@eecs.umich.edu> |
New directory structure: - simulator source now in 'src' subdirectory - imported files from 'ext' repository - support building in arbitrary places, including outside of the source tree. See comment at top of SConstruct file for more details. Regression tests are temporarily disabled; that syetem needs more extensive revisions.
SConstruct: Update for new directory structure. Modify to support build trees that are not subdirectories of the source tree. See comment at top of file for more details. Regression tests are temporarily disabled. src/arch/SConscript: src/arch/isa_parser.py: src/python/SConscript: Update for new directory structure.
|