Searched hist:2015 (Results 901 - 925 of 1505) sorted by relevance
/gem5/src/dev/ | ||
H A D | dma_device.hh | 11169:44b5c183c3cd Mon Oct 12 04:08:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Add explicit overrides and fix other clang >= 3.5 issues This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 11010:034378be28a2 Fri Aug 07 04:59:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> dev: Add a simple DMA engine that can be used by devices Add a simple DMA engine that sits behind a FIFO. This engine can be used by devices that need to read large amounts of data (e.g., display controllers). Most aspects of the controller, such as FIFO size, maximum number of in-flight accesses, and maximum request sizes can be configured. The DMA copies blocks of data into its FIFO. Transfers are initiated with a call to startFill() command that takes a start address and a size. Advanced users can create a derived class that overrides the onEndOfBlock() callback that is triggered when the last request to a block has been issued. At this point, the DMA engine is ready to start fetching a new block of data, potentially from a different address range. The DMA engine stops issuing new requests while it is draining. Care must be taken to ensure that devices that are fed by a DMA engine are suspended while the system is draining to avoid buffer underruns. 10913:38dbdeea7f1f Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor and simplify the drain API The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects. 10912:b99a6662d7c2 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Decouple draining from the SimObject hierarchy Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains. 10713:eddb533708cb Mon Mar 02 04:00:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> mem: Split port retry for all different packet classes This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks. |
H A D | mc146818.cc | 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. 10777:a8a5eb637d72 Fri Apr 03 12:42:00 EDT 2015 Nikos Nikoleris <nikos.nikoleris@gmail.com> dev: (un)serialize fix for the RTC and RTC Timer Interrupt events Restoring from a checkpoint fails if either the RTC or the RTC Timer Interrrupt event is disabled. The restored machine tried incorrectly to schedule the next event with negative offset. Committed by: Nilay Vaish <nilay@cs.wisc.edu> 10631:6d6bfdb036ce Sat Jan 03 18:51:00 EST 2015 Cagdas Dirik <cdirik@micron.com> dev: prevent RTC events firing before startup This change includes edits to MC146818 timer to prevent RTC events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
/gem5/src/cpu/testers/directedtest/ | ||
H A D | RubyDirectedTester.hh | 11061:25b53a7195f7 Sat Aug 29 11:19:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: eliminate type uint64 and int64 These types are being replaced with uint64_t and int64_t. 11049:dfb0aa3f0649 Wed Aug 19 11:02:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: reverts to changeset: bf82f1f7b040 11031:3815437cb231 Fri Aug 14 20:28:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: eliminate type uint64 and int64 These types are being replaced with uint64_t and int64_t. 11017:6ec228f6c143 Tue Aug 11 12:39:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: drop some redundant includes 10920:58fbfddff18d Fri Jul 10 17:05:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> ruby: replace global g_abs_controls with per-RubySystem var This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation. The list of abstract controllers is per-RubySystem and should be represented that way, rather than as a global. Since this is the last remaining Ruby global variable, the src/mem/ruby/Common/Global.* files are also removed. 10713:eddb533708cb Mon Mar 02 04:00:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> mem: Split port retry for all different packet classes This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks. |
/gem5/src/mem/ | ||
H A D | packet_queue.hh | 11207:7b7e352f8d7f Mon Jul 20 10:15:00 EDT 2015 Brad Beckmann <Brad.Beckmann@amd.com> mem: add boolean to disable PacketQueue's size sanity check the sanity check, while generally useful for exposing memory system bugs, may be spurious with respect to GPU workloads, which may generate many more requests than typical CPU workloads. the large number of requests generated by the GPU may cause the req/resp queues to back up, thus queueing more than 100 packets. 11195:6f8b2a005abb Fri Nov 06 03:26:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> mem: Order packet queue only on matching addresses Instead of conservatively enforcing order for all packets, which may negatively impact the simulated-system performance, this patch updates the packet queue such that it only applies the restriction if there are already packets with the same address in the queue. The basic need for the order enforcement is due to coherency interactions where requests/responses to the same cache line must not over-take each other. We rely on the fact that any packet that needs order enforcement will have a block-aligned address. Thus, there is no need for the queue to know about the cacheline size. 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 10913:38dbdeea7f1f Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor and simplify the drain API The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects. 10722:886d2458e0d6 Mon Mar 02 04:00:00 EST 2015 Stephan Diestelhorst <stephan.diestelhorst@arm.com> mem: Add option to force in-order insertion in PacketQueue By default, the packet queue is ordered by the ticks of the to-be-sent packages. With the recent modifications of packages sinking their header time when their resposne leaves the caches, there could be cases of MSHR targets being allocated and ordered A, B, but their responses being sent out in the order B,A. This led to inconsistencies in bus traffic, in particular the snoop filter observing first a ReadExResp and later a ReadRespWithInv. Logically, these were ordered the other way around behind the MSHR, but due to the timing adjustments when inserting into the PacketQueue, they were sent out in the wrong order on the bus, confusing the snoop filter. This patch adds a flag (off by default) such that these special cases can request in-order insertion into the packet queue, which might offset timing slighty. This is expected to occur rarely and not affect timing results. 10713:eddb533708cb Mon Mar 02 04:00:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> mem: Split port retry for all different packet classes This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks. |
/gem5/src/mem/ruby/system/ | ||
H A D | Sequencer.cc | 11266:452e10b868ea Mon Jul 20 10:15:00 EDT 2015 Brad Beckmann <Brad.Beckmann@amd.com> ruby: more flexible ruby tester support This patch allows the ruby random tester to use ruby ports that may only support instr or data requests. This patch is similar to a previous changeset (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets. This current patch implements the support in a more straight-forward way. Since retries are now tested when running the ruby random tester, this patch splits up the retry and drain check behavior so that RubyPort children, such as the GPUCoalescer, can perform those operations correctly without having to duplicate code. Finally, the patch also includes better DPRINTFs for debugging the tester. 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 11118:75c1e564a725 Fri Sep 18 14:27:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: print addresses in hex Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to print statements. So the addresses which were being printed in hex earlier along with their line address, were now being printed in decimals. This patch adds a function printAddress(Addr) that can be used to print the address in hex along with the lines address. This function has been put to use in some of the places. At other places, change has been made to print just the address in hex. 11111:6da33e720481 Wed Sep 16 12:59:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: message buffer, timer table: significant changes This patch changes MessageBuffer and TimerTable, two structures used for buffering messages by components in ruby. These structures would no longer maintain pointers to clock objects. Functions in these structures have been changed to take as input current time in Tick. Similarly, these structures will not operate on Cycle valued latencies for different operations. The corresponding functions would need to be provided with these latencies by components invoking the relevant functions. These latencies should also be in Ticks. I felt the need for these changes while trying to speed up ruby. The ultimate aim is to eliminate Consumer class and replace it with an EventManager object in the MessageBuffer and TimerTable classes. This object would be used for scheduling events. The event itself would contain information on the object and function to be invoked. In hindsight, it seems I should have done this while I was moving away from use of a single global clock in the memory system. That change led to introduction of clock objects that replaced the global clock object. It never crossed my mind that having clock object pointers is not a good design. And now I really don't like the fact that we have separate consumer, receiver and sender pointers in message buffers. 11110:8647458d421d Wed Sep 16 12:59:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: remove unused function removeRequest() 11109:bf3d0f56a6ba Wed Sep 16 12:59:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: sequencer: remove commented out function printProgress() 11108:6342ddf6d733 Wed Sep 16 00:03:00 EDT 2015 David Hashe <david.hashe@amd.com> ruby: rename System.{hh,cc} to RubySystem.{hh,cc} The eventual aim of this change is to pass RubySystem pointers through to objects generated from the SLICC protocol code. Because some of these objects need to dereference their RubySystem pointers, they need access to the System.hh header file. In src/mem/ruby/SConscript, the MakeInclude function creates single-line header files in the build directory that do nothing except include the corresponding header file from the source tree. However, SLICC also generates a list of header files from its symbol table, and writes it to mem/protocol/Types.hh in the build directory. This code assumes that the header file name is the same as the class name. The end result of this is the many of the generated slicc files try to include RubySystem.hh, when the file they really need is System.hh. The path of least resistence is just to rename System.hh to RubySystem.hh. 11087:3c4bda5a2f66 Sat Sep 05 10:35:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: call setMRU from L1 controllers, not from sequencer Currently the sequencer calls the function setMRU that updates the replacement policy structures with the first level caches. While functionally this is correct, the problem is that this requires calling findTagInSet() which is an expensive function. This patch removes the calls to setMRU from the sequencer. All controllers should now update the replacement policy on their own. The set and the way index for a given cache entry can be found within the AbstractCacheEntry structure. Use these indicies to update the replacement policy structures. 11059:40e622551656 Thu Aug 27 01:51:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: handle llsc accesses through CacheEntry, not CacheMemory The sequencer takes care of llsc accesses by calling upon functions from the CacheMemory. This is unnecessary once the required CacheEntry object is available. Thus some of the calls to findTagInSet() are avoided. 11049:dfb0aa3f0649 Wed Aug 19 11:02:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: reverts to changeset: bf82f1f7b040 |
/gem5/src/arch/alpha/ | ||
H A D | system.hh | 11169:44b5c183c3cd Mon Oct 12 04:08:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Add explicit overrides and fix other clang >= 3.5 issues This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. |
H A D | decoder.hh | 11165:d90aec9435bd Fri Oct 09 15:50:00 EDT 2015 Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> isa: Add parameter to pick different decoder inside ISA The decoder is responsible for splitting instructions in micro operations (uops). Given that different micro architectures may split operations differently, this patch allows to specify which micro architecture each isa implements, so different cores in the system can split instructions differently, also decoupling uop splitting (microArch) from ISA (Arch). This is done making the decodification calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time. |
/gem5/src/mem/ruby/common/ | ||
H A D | Address.hh | 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 11118:75c1e564a725 Fri Sep 18 14:27:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: print addresses in hex Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to print statements. So the addresses which were being printed in hex earlier along with their line address, were now being printed in decimals. This patch adds a function printAddress(Addr) that can be used to print the address in hex along with the lines address. This function has been put to use in some of the places. At other places, change has been made to print just the address in hex. 11025:4872dbdea907 Fri Aug 14 01:04:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: replace Address by Addr This patch eliminates the type Address defined by the ruby memory system. This memory system would now use the type Addr that is in use by the rest of the system. |
/gem5/tests/configs/ | ||
H A D | tgen-simple-mem.py | 10996:d48fda705f4d Tue Aug 04 05:29:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> mem: Move trace functionality from the CommMonitor to a probe This changeset moves the access trace functionality from the CommMonitor into a separate probe. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet. This patch moves the dependency on Google's Protocol Buffers library from the CommMonitor to the MemTraceProbe, which means that the CommMonitor (including stack distance profiling) no long depends on it. 10995:a114e2712642 Tue Aug 04 05:29:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> mem: Redesign the stack distance calculator as a probe This changeset removes the stack distance calculator hooks from the CommMonitor class and implements a stack distance calculator as a memory system probe instead. The probe can be hooked up to any component that exports probe points of the type ProbePoints::Packet. 10720:67b3e74de9ae Mon Mar 02 04:00:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> mem: Move crossbar default latencies to subclasses This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios. Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden. |
/gem5/src/cpu/o3/ | ||
H A D | fu_pool.cc | 10814:46b6043bd32c Tue May 05 03:22:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> cpu: Work around gcc 4.9 issues with Num_OpClasses This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members. 10807:dac26eb4cb64 Wed Apr 29 23:35:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> cpu: o3: replace issueLatency with bool pipelined Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion. 10728:0fd6a08a7332 Mon Mar 09 10:39:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> cpu: o3: remove unused function annotateMemoryUnits() |
/gem5/tests/long/se/20.parser/ref/arm/linux/simple-atomic/ | ||
H A D | stats.txt | 10827:7f5467f2f8b8 Tue May 05 03:22:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect cache changes 10812:bacaefeb126a Thu Apr 30 15:17:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> stats: arm: updates 10726:8a20e2a1562d Mon Mar 02 05:04:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect cache and interconnect changes This is a bulk update of stats to match the changes to cache timing, interconnect timing, and a few minor changes to the o3 CPU. |
/gem5/tests/long/se/20.parser/ref/x86/linux/o3-timing/ | ||
H A D | simout | 11103:38f6188421e0 Tue Sep 15 09:14:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to recent changesets including d0934b57735a 10901:8cfa8dac39fe Sun Jul 05 21:26:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> stats: x86: update stats missed out on in preivous changeset 10798:74e3c7359393 Wed Apr 22 23:22:00 EDT 2015 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for previous changeset Very small differences in IQ-specific O3 stats. |
/gem5/tests/long/se/40.perlbmk/ref/arm/linux/o3-timing/ | ||
H A D | config.ini | 11103:38f6188421e0 Tue Sep 15 09:14:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to recent changesets including d0934b57735a 10900:ac6617bf9967 Sat Jul 04 11:43:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> stats: update stale config.ini files, eio and few other stats. 10798:74e3c7359393 Wed Apr 22 23:22:00 EDT 2015 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for previous changeset Very small differences in IQ-specific O3 stats. |
/gem5/src/base/ | ||
H A D | output.cc | 11359:b0b976a1ceda Fri Nov 27 09:41:00 EST 2015 Andreas Sandberg <andreas@sandberg.pp.se> base: Add support for changing output directories This changeset adds support for changing the simulator output directory. This can be useful when the simulation goes through several stages (e.g., a warming phase, a simulation phase, and a verification phase) since it allows the output from each stage to be located in a different directory. Relocation is done by calling core.setOutputDir() from Python or simout.setOutputDirectory() from C++. This change affects several parts of the design of the gem5's output subsystem. First, files returned by an OutputDirectory instance (e.g., simout) are of the type OutputStream instead of a std::ostream. This allows us to do some more book keeping and control re-opening of files when the output directory is changed. Second, new subdirectories are OutputDirectory instances, which should be used to create files in that sub-directory. Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se> [sascha.bischoff@arm.com: Rebased patches onto a newer gem5 version] Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> 11259:4006183015a1 Thu Nov 05 13:26:00 EST 2015 Sascha Bischoff <sascha.bischoff@ARM.com> sim: Disable gzip compression for writefile pseudo instruction The writefile pseudo instruction uses OutputDirectory::create and OutputDirectory::openFile to create the output files. However, by default these will check the file extention for .gz, and create a gzip compressed stream if the file ending matches. When writing out files, we want to write them out exactly as they are in the guest simulation, and never want to compress them with gzio. Additionally, this causes m5 writefile to fail when checking the error flags for the output steam. With this patch we add an additional no_gz argument to OutputDirectory::create and OutputDirectory::openFile which allows us to override the gzip compression. Therefore, for m5 writefile we disable the filename check, and always create a standard ostream. 10810:683ab55819fd Wed Apr 29 23:35:00 EDT 2015 Ruslan Bukin <br@bsdpad.com> arch, base, dev, kern, sym: FreeBSD support This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only) Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
/gem5/src/arch/power/ | ||
H A D | tlb.hh | 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. 10687:276da6265ab8 Wed Feb 11 10:23:00 EST 2015 Andreas Sandberg <Andreas.Sandberg@ARM.com> sim: Move the BaseTLB to src/arch/generic/ The TLB-related code is generally architecture dependent and should live in the arch directory to signify that. |
H A D | isa.hh | 10935:acd48ddd725f Tue Jul 28 02:58:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> revert 5af8f40d8f2c 10934:5af8f40d8f2c Sun Jul 26 11:21:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> cpu: implements vector registers This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now. 10698:829adc48e175 Mon Feb 16 03:33:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> arch: Make readMiscRegNoEffect const throughout Finally took the plunge and made this apply to all ISAs, not just ARM. |
H A D | decoder.hh | 11165:d90aec9435bd Fri Oct 09 15:50:00 EDT 2015 Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> isa: Add parameter to pick different decoder inside ISA The decoder is responsible for splitting instructions in micro operations (uops). Given that different micro architectures may split operations differently, this patch allows to specify which micro architecture each isa implements, so different cores in the system can split instructions differently, also decoupling uop splitting (microArch) from ISA (Arch). This is done making the decodification calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time. |
/gem5/src/dev/arm/ | ||
H A D | rv_ctrl.cc | 11421:74c1e6513bd0 Wed May 13 10:02:00 EDT 2015 David Guillen Fandos <david.guillen@arm.com> sim: Thermal support for Linux This patch enables Linux to read the temperature using hwmon infrastructure. In order to use this in your gem5 you need to compile the kernel using the following configs: CONFIG_HWMON=y CONFIG_SENSORS_VEXPRESS=y And a proper dts file (containing an entry such as): dcc { compatible = "arm,vexpress,config-bus"; arm,vexpress,config-bridge = <&v2m_sysreg>; temp@0 { compatible = "arm,vexpress-temp"; arm,vexpress-sysreg,func = <4 0>; label = "DCC"; }; }; 11011:2ca6c68fdd6c Fri Aug 07 04:59:00 EDT 2015 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Add support for programmable oscillators Add support for oscillators that can be programmed using the RealView / Versatile Express configuration interface. These oscillators are typically used for things like the pixel clock in the display controller. The default configurations support the oscillators from a Versatile Express motherboard (V2M-P1) with a CoreTile Express A15x2. 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. |
H A D | pl011.hh | 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. 10718:4ed87af2930f Mon Mar 02 04:00:00 EST 2015 Andreas Sandberg <Andreas.Sandberg@ARM.com> dev, arm: Clean up PL011 and rewrite interrupt handling The ARM PL011 UART model didn't clear and raise interrupts correctly. This changeset rewrites the whole interrupt handling and makes it both simpler and fixes several cases where the correct interrupts weren't raised or cleared. Additionally, it cleans up many other aspects of the code. |
/gem5/src/arch/arm/ | ||
H A D | stage2_mmu.cc | 10912:b99a6662d7c2 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Decouple draining from the SimObject hierarchy Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains. 10873:7c972b9aea16 Sun Jun 21 15:48:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> arm: Cleanup arch headers to remove dma_device.hh dependency Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header. 10717:4f8c1bd6fdb8 Mon Mar 02 04:00:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> arm: Share a port for the two table walker objects This patch changes how the MMU and table walkers are created such that a single port is used to connect the MMU and the TLBs to the memory system. Previously two ports were needed as there are two table walker objects (stage one and stage two), and they both had a port. Now the port itself is moved to the Stage2MMU, and each TableWalker is simply using the port from the parent. By using the same port we also remove the need for having an additional crossbar joining the two ports before the walker cache or the L2. This simplifies the creation of the CPU cache topology in BaseCPU.py considerably. Moreover, for naming and symmetry reasons, the TLB walker port is connected through the stage-one table walker thus making the naming identical to x86. Along the same line, we use the stage-one table walker to generate the master id that is used by all TLB-related requests. |
/gem5/src/arch/sparc/ | ||
H A D | isa.cc | 11150:a8a64cca231b Wed Sep 30 12:14:00 EDT 2015 Mitch Hayenga <mitch.hayenga@arm.com> isa,cpu: Add support for FS SMT Interrupts Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems. 10905:a6ca6831e775 Tue Jul 07 04:51:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> sim: Refactor the serialization base class Objects that are can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns to the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically: * Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section. * Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects. * Add a scoped ScopedCheckpointSection helper class. Some objects need to serialize data structures, that are not deriving from Serializable, into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections). * The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects. * Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code. 10698:829adc48e175 Mon Feb 16 03:33:00 EST 2015 Andreas Hansson <andreas.hansson@arm.com> arch: Make readMiscRegNoEffect const throughout Finally took the plunge and made this apply to all ISAs, not just ARM. |
/gem5/src/arch/x86/ | ||
H A D | decoder.hh | 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 11165:d90aec9435bd Fri Oct 09 15:50:00 EDT 2015 Rekai Gonzalez Alberquilla <Rekai.GonzalezAlberquilla@arm.com> isa: Add parameter to pick different decoder inside ISA The decoder is responsible for splitting instructions in micro operations (uops). Given that different micro architectures may split operations differently, this patch allows to specify which micro architecture each isa implements, so different cores in the system can split instructions differently, also decoupling uop splitting (microArch) from ISA (Arch). This is done making the decodification calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time. 10924:d02e9c239892 Fri Jul 17 12:31:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> x86: decode instructions with vex prefix This patch updates the x86 decoder so that it can decode instructions with vex prefix. It also updates the isa with opcodes from vex opcode maps 1, 2 and 3. Note that none of the instructions have been implemented yet. The implementations would be provided in due course of time. |
/gem5/src/cpu/pred/ | ||
H A D | bpred_unit.hh | 11169:44b5c183c3cd Mon Oct 12 04:08:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Add explicit overrides and fix other clang >= 3.5 issues This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables). 11168:f98eb2da15a4 Mon Oct 12 04:07:00 EDT 2015 Andreas Hansson <andreas.hansson@arm.com> misc: Remove redundant compiler-specific defines This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions. 10785:f56c10663a01 Mon Apr 13 18:33:00 EDT 2015 Dibakar Gope <gope@wisc.edu> cpu: re-organizes the branch predictor structure. Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
/gem5/src/cpu/trace/ | ||
H A D | trace_cpu.hh | 11253:daf9f91b11e9 Mon Dec 07 17:42:00 EST 2015 Radhika Jagtap <radhika.jagtap@ARM.com> cpu: Support virtual addr in elastic traces This patch adds support to optionally capture the virtual address and asid for load/store instructions in the elastic traces. If they are present in the traces, Trace CPU will set those fields of the request during replay. 11252:18bb597fc40c Mon Dec 07 17:42:00 EST 2015 Radhika Jagtap <radhika.jagtap@ARM.com> cpu: Create record type enum for elastic traces This patch replaces the booleans that specified the elastic trace record type with an enum type. The source of change is the proto message for elastic trace where the enum is introduced. The struct definitions in the elastic trace probe listener as well as the Trace CPU replace the boleans with the proto message enum. The patch does not impact functionality, but traces are not compatible with previous version. This is preparation for adding new types of records in subsequent patches. 11249:0733a1c08600 Mon Dec 07 17:42:00 EST 2015 Radhika Jagtap <radhika.jagtap@ARM.com> cpu: Add TraceCPU to playback elastic traces This patch defines a TraceCPU that replays trace generated using the elastic trace probe attached to the O3 CPU model. The elastic trace is an execution trace with data dependencies and ordering dependencies annoted to it. It also replays fixed timestamp instruction fetch trace that is also generated by the elastic trace probe. The TraceCPU inherits from BaseCPU as a result of which some methods need to be defined. It has two port subclasses inherited from MasterPort for instruction and data ports. It issues the memory requests deducing the timing from the trace and without performing real execution of micro-ops. As soon as the last dependency for an instruction is complete, its computational delay, also provided in the input trace is added. The dependency-free nodes are maintained in a list, called 'ReadyList', ordered by ready time. Instructions which depend on load stall until the responses for read requests are received thus achieving elastic replay. If the dependency is not found when adding a new node, it is assumed complete. Thus, if this node is found to be completely dependency-free its issue time is calculated and it is added to the ready list immediately. This is encapsulated in the subclass ElasticDataGen. If ready nodes are issued in an unconstrained way there can be more nodes outstanding which results in divergence in timing compared to the O3CPU. Therefore, the Trace CPU also models hardware resources. A sub-class to model hardware resources is added which contains the maximum sizes of load buffer, store buffer and ROB. If resources are not available, the node is not issued. The 'depFreeQueue' structure holds nodes that are pending issue. Modeling the ROB size in the Trace CPU as a resource limitation is arguably the most important parameter of all resources. The ROB occupancy is estimated using the newly added field 'robNum'. We need to use ROB number as sequence number is at times much higher due to squashing and trace replay is focused on correct path modeling. A map called 'inFlightNodes' is added to track nodes that are not only in the readyList but also load nodes that are executed (and thus removed from readyList) but are not complete. ReadyList handles what and when to execute next node while the inFlightNodes is used for resource modelling. The oldest ROB number is updated when any node occupies the ROB or when an entry in the ROB is released. The ROB occupancy is equal to the difference in the ROB number of the newly dependency-free node and the oldest ROB number in flight. If no node dependends on a non load/store node then there is no reason to track it in the dependency graph. We filter out such nodes but count them and add a weight field to the subsequent node that we do include in the trace. The weight field is used to model ROB occupancy during replay. The depFreeQueue is chosen to be FIFO so that child nodes which are in program order get pushed into it in that order and thus issued in the in program order, like in the O3CPU. This is also why the dependents is made a sequential container, std::set to std::vector. We only check head of the depFreeQueue as nodes are issued in order and blocking on head models that better than looping the entire queue. An alternative choice would be to inspect top N pending nodes where N is the issue-width. This is left for future as the timing correlation looks good as it is. At the start of an execution event, first we attempt to issue such pending nodes by checking if appropriate resources have become available. If yes, we compute the execute tick with respect to the time then. Then we proceed to complete nodes from the readyList. When a read response is received, sometimes a dependency on it that was supposed to be released when it was issued is still not released. This occurs because the dependent gets added to the graph after the read was sent. So the check is made less strict and the dependency is marked complete on read response instead of insisting that it should have been removed on read sent. There is a check for requests spanning two cache lines as this condition triggers an assert fail in the L1 cache. If it does then truncate the size to access only until the end of that line and ignore the remainder. Strictly-ordered requests are skipped and the dependencies on such requests are handled by simply marking them complete immediately. The simulated seconds can be calculated as the difference between the final_tick stat and the tickOffset stat. A CountedExitEvent that contains a static int belonging to the Trace CPU class as a down counter is used to implement multi Trace CPU simulation exit. |
/gem5/src/mem/ruby/slicc_interface/ | ||
H A D | RubyRequest.hh | 11305:78c1e4f5dfc5 Mon Jul 20 10:15:00 EDT 2015 Blake Hechtman <blake.hechtman@amd.com> mem: misc flags for AMD gpu model This patch add support to mark memory requests/packets with attributes defined in HSA, such as memory order and scope. 11025:4872dbdea907 Fri Aug 14 01:04:00 EDT 2015 Nilay Vaish <nilay@cs.wisc.edu> ruby: replace Address by Addr This patch eliminates the type Address defined by the ruby memory system. This memory system would now use the type Addr that is in use by the rest of the system. 11005:e7f403b6b76f Fri Aug 07 04:59:00 EDT 2015 Andreas Sandberg <andreas.sandberg@arm.com> base: Declare a type for context IDs Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs. |
Completed in 164 milliseconds