Searched hist:2012 (Results 576 - 600 of 1124) sorted by relevance
/gem5/tests/configs/ | ||
H A D | o3-timing-mp.py | diff 9315:2e00867b5001 Fri Oct 26 06:42:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Fix the cache class naming in regression scripts This patch unifies the naming of the default L1 and L2 caches in the regression configs to be in line with what is used in the se and fs scripts. diff 9311:227d19399b51 Thu Oct 25 13:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Use SimpleDRAM in full-system, and with o3 and inorder This patch favours using SimpleDRAM with the default timing instead of SimpleMemory for all regressions that involve the o3 or inorder CPU, or are full system (in other words, where the actual performance of the memory is important for the overall performance). Moving forward, the solution for FSConfig and the users of fs.py and se.py is probably something similar to what we use to choose the CPU type. I envision a few pre-set configurations SimpleLPDDR2, SimpleDDR3, etc that can be choosen by a dram_type option. Feedback on this part is welcome. This patch changes plenty stats and adds all the DRAM controller related stats. A follow-on patch updates the relevant statistics. The total run-time for the entire regression goes up with ~5% with this patch due to the added complexity of the SimpleDRAM model. This is a concious trade-off to ensure that the model is properly tested. diff 9310:aa7bf10e822a Thu Oct 25 04:32:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Use shared cache config for regressions This patch uses the common L1, L2 and IOCache configuration for the regressions that all share the same cache parameters. There are a few regressions that use a slightly different configuration (memtest, o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter are not changed in this patch. They will be updated in a future patch. The common cache configurations are changed to match the ones used in the regressions, and are slightly changed with respect to what they were. Hopefully this means we can converge on a common base configuration, used both in the normal user configurations and regressions. As only regressions that shared the same cache configuration are updated, no regressions are affected. diff 9288:3d6da8559605 Mon Oct 15 08:10:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions. diff 9263:066099902102 Tue Sep 25 12:49:00 EDT 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path. diff 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. diff 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. diff 8876:44f8e7bb7fdf Fri Mar 02 09:21:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> CPU: Check that the interrupt controller is created when needed This patch adds a creation-time check to the CPU to ensure that the interrupt controller is created for the cases where it is needed, i.e. if the CPU is not being switched in later and not a checker CPU. The patch also adds the "createInterruptController" call to a number of the regression scripts. diff 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. diff 8833:2870638642bd Sun Feb 12 17:07:00 EST 2012 Dam Sunwoo <dam.sunwoo@arm.com> mem: fix cache stats to use request ids correctly This patch fixes the cache stats to use the new request ids. Cache stats also display the requestor names in the vector subnames. Most cache stats now include "nozero" and "nonan" flags to reduce the amount of excessive cache stat dump. Also, simplified incMissCount()/incHitCount() functions. |
/gem5/src/arch/power/ | ||
H A D | SConscript | diff 9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change. diff 9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs. diff 9022:bb25e7646c41 Fri May 25 03:55:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Make the decode function part of the ISA's decoder. |
H A D | isa_traits.hh | diff 9329:3fe8438cbcfc Fri Nov 02 12:32:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ISA: generic Linux thread info support This patch takes the Linux thread info support scattered across different ISA implementations (currently in ARM, ALPHA, and MIPS), and unifies them into a single file. Adds a few more helper functions to read out TGID, mm, etc. ISA-specific information (e.g., ALPHA PCBB register) is now moved to the corresponding isa_traits.hh files. diff 9057:f5ee56466b91 Tue Jun 05 13:52:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ISA: Back-out NoopMachInst as a StaticInstPtr change. diff 9040:cdfe09f9bdee Mon Jun 04 13:57:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs. |
/gem5/src/arch/x86/ | ||
H A D | decoder.cc | diff 9024:5851586f399c Sat May 26 16:45:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA,CPU: Generalize and split out the components of the decode cache. This will allow it to be specialized by the ISAs. The existing caching scheme is provided by the BasicDecodeCache in the GenericISA namespace and is built from the generalized components. diff 9023:e9201a7bce59 Sat May 26 16:44:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> CPU: Merge the predecoder and decoder. These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. 9022:bb25e7646c41 Fri May 25 03:55:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Make the decode function part of the ISA's decoder. |
H A D | mmapped_ipr.hh | diff 9180:ee8d7a51651d Tue Aug 28 14:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes. diff 9179:666bc9df1e49 Tue Aug 28 14:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well. diff 8902:75b524b64c28 Mon Mar 19 06:36:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> gcc: Clean-up of non-C++0x compliant code, first steps This patch cleans up a number of minor issues aiming to get closer to compliance with the C++0x standard as interpreted by gcc and clang (compile with std=c++0x and -pedantic-errors). In particular, the patch cleans up enums where the last item was succeded by a comma, namespaces closed by a curcly brace followed by a semi-colon, and the use of the GNU-extension typeof (replaced by templated functions). It does not address variable-length arrays, zero-size arrays, anonymous structs, range expressions in switch statements, and the use of long long. The generated CPU code also has a large number of issues that remain to be fixed, mainly related to overflows in implicit constant conversion (due to shifts). |
H A D | X86LocalApic.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. diff 9162:019047ead23b Tue Aug 21 05:50:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Device: Remove overloaded pio_latency parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The PciConfigAll now also uses a Param.Latency rather than a Param.Tick. For backwards compatibility it still sets the pio_latency to 1 tick. All the comments have also been updated to not state that it is in simticks when it is not necessarily the case. diff 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. |
/gem5/src/dev/arm/ | ||
H A D | kmi.hh | diff 9525:0587c8983d47 Thu Oct 25 09:05:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Create a GIC base class and make the PL390 derive from it This patch moves the GIC interface to a separate base class and makes all interrupt devices use that base class instead of a pointer to the PL390 implementation. This allows us to have multiple GIC implementations. Future implementations will allow in-kernel GIC implementations when using hardware virtualization. diff 9330:4a3269a11230 Fri Nov 02 12:32:00 EDT 2012 Chander Sudanthi <chander.sudanthi@arm.com> base: split out the VncServer into a VncInput and Server classes This patch adds a VncInput base class which VncServer inherits from. Another class can implement the same interface and be used instead of the VncServer, for example a class that replays Vnc traffic. diff 9235:5aa4896ed55a Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. |
H A D | timer_cpulocal.cc | diff 9525:0587c8983d47 Thu Oct 25 09:05:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Create a GIC base class and make the PL390 derive from it This patch moves the GIC interface to a separate base class and makes all interrupt devices use that base class instead of a pointer to the PL390 implementation. This allows us to have multiple GIC implementations. Future implementations will allow in-kernel GIC implementations when using hardware virtualization. diff 9157:e0bad9d7bbd6 Tue Aug 21 05:49:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Move the clock and related functions to ClockedObject This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced). diff 8993:d5f9445010da Thu May 10 19:04:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Fix incorrect use of not operators in arm devices |
H A D | rtc_pl031.cc | diff 8993:d5f9445010da Thu May 10 19:04:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Fix incorrect use of not operators in arm devices diff 8904:1b6d79c9a603 Wed Mar 21 11:34:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Fix uninitialized value in ARM RTC model. 8869:fa8dcdd7e26c Thu Mar 01 18:26:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Add RTC device for ARM platforms. This change implements a PL031 real time clock. |
/gem5/src/arch/alpha/linux/ | ||
H A D | linux.hh | diff 9146:a61fdbbc1d45 Mon Aug 06 22:52:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Enabled getrlimit and getrusage for x86. Added/moved rlimit constants to base linux header file. This patch is a revised version of Vince Weaver's earlier patch. diff 9141:593fe25c86a6 Mon Aug 06 19:52:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Clean up ioctl handling, and implement for x86. Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch. diff 9112:6e854ea87bab Wed Jul 11 01:51:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Add the futex system call. |
/gem5/src/arch/arm/ | ||
H A D | decoder.cc | diff 9024:5851586f399c Sat May 26 16:45:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA,CPU: Generalize and split out the components of the decode cache. This will allow it to be specialized by the ISAs. The existing caching scheme is provided by the BasicDecodeCache in the GenericISA namespace and is built from the generalized components. diff 9023:e9201a7bce59 Sat May 26 16:44:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> CPU: Merge the predecoder and decoder. These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. 9022:bb25e7646c41 Fri May 25 03:55:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Make the decode function part of the ISA's decoder. |
H A D | stacktrace.cc | diff 9069:873634453ae0 Mon Jun 11 11:07:00 EDT 2012 Anthony Gutierrez <atgutier@umich.edu> ARM: implement the ProcessInfo methods diff 8852:c744483edfcf Fri Feb 24 11:45:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Make port proxies use references rather than pointers This patch is adding a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rathen than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear. The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out). Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies. diff 8706:b1838faf3bcc Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy |
H A D | table_walker.hh | diff 9342:6fec8f26e56d Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9258:baa17ba80e06 Tue Sep 25 12:49:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Squash outstanding walks when instructions are squashed. diff 9165:f9e3dac185ba Wed Aug 22 11:39:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up. diff 9152:86c0e6ca5e7c Wed Aug 15 10:38:00 EDT 2012 Anthony Gutierrez <atgutier@umich.edu> O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation. diff 9016:18093957a102 Wed May 23 09:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> DMA: Split the DMA device and IO device into seperate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it split the devices that are only slaves and the DMA device that also acts as a master. diff 9015:7f4d25789dc4 Wed May 23 09:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add a snooping DMA port subclass for table walker This patch makes the (device) DmaPort non-snooping and removes the recvSnoop constructor parameter and instead introduces a SnoopingDmaPort subclass for the ARM table walker. Functionality is unchanged, as are the stats, and the patch merely clarifies that the normal DMA ports are not snooping (although they may issue requests that are snooped by others, as done with PCI, PCIe, AMBA4 ACE etc). Currently this port is declared in the ARM table walker as it is not used anywhere else. If other ports were to have similar behaviour it could be moved in a future patch. diff 8922:17f037ad8918 Fri Mar 30 09:40:00 EDT 2012 William Wang <william.wang@arm.com> MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. diff 8902:75b524b64c28 Mon Mar 19 06:36:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> gcc: Clean-up of non-C++0x compliant code, first steps This patch cleans up a number of minor issues aiming to get closer to compliance with the C++0x standard as interpreted by gcc and clang (compile with std=c++0x and -pedantic-errors). In particular, the patch cleans up enums where the last item was succeded by a comma, namespaces closed by a curcly brace followed by a semi-colon, and the use of the GNU-extension typeof (replaced by templated functions). It does not address variable-length arrays, zero-size arrays, anonymous structs, range expressions in switch statements, and the use of long long. The generated CPU code also has a large number of issues that remain to be fixed, mainly related to overflows in implicit constant conversion (due to shifts). diff 8851:7e966326ef5b Fri Feb 24 11:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Move port creation to the memory object(s) construction This patch moves all port creation from the getPort method to be consistently done in the MemObject's constructor. This is possible thanks to the Swig interface passing the length of the vector ports. Previously there was a mix of: 1) creating the ports as members (at object construction time) and using getPort for the name resolution, or 2) dynamically creating the ports in the getPort call. This is now uniform. Furthermore, objects that would not be complete without a port have these ports as members rather than having pointers to dynamically allocated ports. This patch also enables an elaboration-time enumeration of all the ports in the system which can be used to determine the masterId. |
/gem5/src/cpu/o3/ | ||
H A D | O3CPU.py | diff 9341:a0eff1e9c773 Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> cpu: O3 add a header declaring the DerivO3CPU SWIG needs a complete declaration of all wrapped objects. This patch adds a header file with the DerivO3CPU class and includes it in the SWIG interface. diff 9184:a1a8f137b796 Fri Sep 07 00:34:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py. diff 9179:666bc9df1e49 Tue Aug 28 14:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well. diff 9132:c8d4b0595448 Fri Jul 27 16:08:00 EDT 2012 Anthony Gutierrez <atgutier@umich.edu> checker: make checker cpu id match its host's cpu id when using the checker i ran into problems where an instruction reading the cpu id register failed because the ids did not match, and hence, the result of the instruction did not match. this patch ensures that the ids match so this instruction does not fail. this problem only seemed to manifest itself when multiple cores were in the system, either multi-core, or extra switched- out cores present in the system. diff 8887:20ea02da9c53 Fri Mar 09 09:59:00 EST 2012 Geoffrey Blake <geoffrey.blake@arm.com> CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable Enables the CheckerCPU to be selected at runtime with the --checker option from the configs/example/fs.py and configs/example/se.py configuration files. Also merges with the SE/FS changes. diff 8809:bb10807da889 Wed Feb 01 01:40:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Merge with head, hopefully the last time for this batch. diff 8807:35e77c938919 Sun Jan 29 06:27:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Yet another merge with the main repository. diff 8799:dac1e33e07b0 Sat Jan 28 10:24:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Merge with the main repo. diff 8796:a2ae5c378d0a Sat Jan 07 05:15:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Merge with the main repository again. diff 8733:64a7bf8fa56c Tue Jan 31 10:46:00 EST 2012 Geoffrey Blake <geoffrey.blake@arm.com> CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5 Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification. |
/gem5/src/arch/mips/ | ||
H A D | stacktrace.cc | diff 8852:c744483edfcf Fri Feb 24 11:45:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Make port proxies use references rather than pointers This patch is adding a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rathen than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear. The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out). Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies. diff 8799:dac1e33e07b0 Sat Jan 28 10:24:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Merge with the main repo. diff 8706:b1838faf3bcc Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy |
/gem5/src/arch/x86/bios/ | ||
H A D | intelmp.cc | diff 8852:c744483edfcf Fri Feb 24 11:45:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Make port proxies use references rather than pointers This patch is adding a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rathen than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear. The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out). Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies. diff 8737:770ccf3af571 Tue Jan 31 00:05:00 EST 2012 Koan-Sin Tan <koansin.tan@gmail.com> clang: Enable compiling gem5 using clang 2.9 and 3.0 This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places. diff 8706:b1838faf3bcc Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy |
/gem5/src/cpu/testers/memtest/ | ||
H A D | MemTest.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. diff 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. diff 8832:247fee427324 Sun Feb 12 17:07:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> mem: Add a master ID to each request object. This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python. |
/gem5/src/mem/ | ||
H A D | AbstractMemory.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. diff 9236:c38988024f1f Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Remove the file parameter from AbstractMemory This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contigious host memory backing store, and the gem5 memory controllers. Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contigious ranges to the user. To the best of my knowledge there is no use of the parameter, so the change should not affect anyone. 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. |
H A D | addr_mapper.cc | diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9279:8b16c3804bda Mon Oct 15 08:07:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Use range operations in bus in preparation for striping This patch transitions the bus to use the AddrRange operations instead of directly accessing the start and end. The change facilitates the move to a more elaborate AddrRange class that also supports address striping in the bus by specifying interleaving bits in the ranges. Two new functions are added to the AddrRange to determine if two ranges intersect, and if one is a subset of another. The bus propagation of address ranges is also tweaked such that an update is only propagated if the bus received information from all the downstream slave modules. This avoids the iteration and need for the cycle-breaking scheme that was previously used. 9259:fc28f3ca5b21 Tue Sep 25 12:49:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> mem: Add a gasket that allows memory ranges to be re-mapped. For example if DRAM is at two locations and mirrored this patch allows the mirroring to occur. |
/gem5/src/mem/ruby/common/ | ||
H A D | SConscript | diff 9171:ae88ecf37145 Mon Aug 27 02:00:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Ruby: Remove RubyEventQueue This patch removes RubyEventQueue. Consumer objects now rely on RubySystem or themselves for scheduling events. diff 9012:6d64aa6a26af Tue May 22 12:35:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Ruby: Remove the unused src/mem/ruby/common/Driver.* files. diff 8913:8b223e308b08 Thu Mar 22 06:34:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Scons: Remove Werror=False in SConscript files This patch removes the overriding of "-Werror" in a handful of cases. The code compiles with gcc 4.6.3 and clang 3.0 without any warnings, and thus without any errors. There are no functional changes introduced by this patch. In the future, rather than ypassing "-Werror", address the warnings. |
/gem5/src/dev/ | ||
H A D | io_device.cc | diff 9342:6fec8f26e56d Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9095:0e6bd7082fac Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Align port names in C++ and Python This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus. diff 9090:e4e22240398f Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Make getAddrRanges const This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects. diff 9016:18093957a102 Wed May 23 09:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> DMA: Split the DMA device and IO device into seperate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it split the devices that are only slaves and the DMA device that also acts as a master. diff 9015:7f4d25789dc4 Wed May 23 09:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add a snooping DMA port subclass for table walker This patch makes the (device) DmaPort non-snooping and removes the recvSnoop constructor parameter and instead introduces a SnoopingDmaPort subclass for the ARM table walker. Functionality is unchanged, as are the stats, and the patch merely clarifies that the normal DMA ports are not snooping (although they may issue requests that are snooped by others, as done with PCI, PCIe, AMBA4 ACE etc). Currently this port is declared in the ARM table walker as it is not used anywhere else. If other ports were to have similar behaviour it could be moved in a future patch. diff 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. diff 8949:3fa1ee293096 Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Remove the Broadcast destination from the packet This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing. diff 8948:e95ee70f876c Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. diff 8922:17f037ad8918 Fri Mar 30 09:40:00 EDT 2012 William Wang <william.wang@arm.com> MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. |
H A D | io_device.hh | diff 9342:6fec8f26e56d Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9090:e4e22240398f Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Make getAddrRanges const This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects. diff 9016:18093957a102 Wed May 23 09:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> DMA: Split the DMA device and IO device into seperate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it split the devices that are only slaves and the DMA device that also acts as a master. diff 9015:7f4d25789dc4 Wed May 23 09:14:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add a snooping DMA port subclass for table walker This patch makes the (device) DmaPort non-snooping and removes the recvSnoop constructor parameter and instead introduces a SnoopingDmaPort subclass for the ARM table walker. Functionality is unchanged, as are the stats, and the patch merely clarifies that the normal DMA ports are not snooping (although they may issue requests that are snooped by others, as done with PCI, PCIe, AMBA4 ACE etc). Currently this port is declared in the ARM table walker as it is not used anywhere else. If other ports were to have similar behaviour it could be moved in a future patch. diff 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. diff 8948:e95ee70f876c Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. diff 8922:17f037ad8918 Fri Mar 30 09:40:00 EDT 2012 William Wang <william.wang@arm.com> MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. diff 8914:8c3bd7bea667 Thu Mar 22 06:36:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Split SimpleTimingPort into PacketQueue and ports This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication). As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implemention of recvTiming and recvFunctional through recvAtomic. The PioPort and MessagePort are cleaned up as part of the changes. diff 8851:7e966326ef5b Fri Feb 24 11:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Move port creation to the memory object(s) construction This patch moves all port creation from the getPort method to be consistently done in the MemObject's constructor. This is possible thanks to the Swig interface passing the length of the vector ports. Previously there was a mix of: 1) creating the ports as members (at object construction time) and using getPort for the name resolution, or 2) dynamically creating the ports in the getPort call. This is now uniform. Furthermore, objects that would not be complete without a port have these ports as members rather than having pointers to dynamically allocated ports. This patch also enables an elaboration-time enumeration of all the ports in the system which can be used to determine the masterId. |
/gem5/configs/common/ | ||
H A D | Caches.py | diff 9321:7f0464326b2b Tue Oct 30 07:44:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Unify caches used in regressions and adjust L2 MSHRs This patch unified the L1 and L2 caches used throughout the regressions instead of declaring different, but very similar, configurations in the different scripts. The patch also changes the default L2 configuration to match what it used to be for the fs and se scripts (until the last patch that updated the regressions to also make use of the cache config). The MSHRs and targets per MSHR are now set to a more realistic default of 20 and 12, respectively. As a result of both the aforementioned changes, many of the regression stats are changed. A follow-on patch will bump the stats. diff 9315:2e00867b5001 Fri Oct 26 06:42:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Fix the cache class naming in regression scripts This patch unifies the naming of the default L1 and L2 caches in the regression configs to be in line with what is used in the se and fs scripts. diff 9310:aa7bf10e822a Thu Oct 25 04:32:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> config: Use shared cache config for regressions This patch uses the common L1, L2 and IOCache configuration for the regressions that all share the same cache parameters. There are a few regressions that use a slightly different configuration (memtest, o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter are not changed in this patch. They will be updated in a future patch. The common cache configurations are changed to match the ones used in the regressions, and are slightly changed with respect to what they were. Hopefully this means we can converge on a common base configuration, used both in the normal user configurations and regressions. As only regressions that shared the same cache configuration are updated, no regressions are affected. diff 9288:3d6da8559605 Mon Oct 15 08:10:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Use cycles to express cache-related latencies This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions. diff 9263:066099902102 Tue Sep 25 12:49:00 EDT 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> Cache: add a response latency to the caches In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path. |
/gem5/configs/splash2/ | ||
H A D | run.py | diff 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. diff 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. diff 8847:ef8630054b5e Tue Feb 14 14:15:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Fix residual bus ports and make them master/slave This patch cleans up a number of remaining uses of bus.port which is now split into bus.master and bus.slave. The only non-trivial change is the memtest where the level building now has to be aware of the role of the ports used in the previous level. diff 8836:922edffe734d Sun Feb 12 18:18:00 EST 2012 Ali Saidi <saidi@eecs.umich.edu> configs: fix minor config bugs posted on the mailing list diff 8801:1a84c6a81299 Sat Jan 28 10:24:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> SE/FS: Make SE vs. FS mode a runtime parameter. |
/gem5/util/ | ||
H A D | regress | diff 8969:b9f4e3884951 Thu Apr 26 21:28:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> util/regress: Add the missing comma in the list of builds diff 8968:6d11b01e2c53 Wed Apr 25 23:43:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Regression: Add a test for x86 timing full system ruby simulation diff 8816:c80758736323 Tue Feb 07 07:43:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> m5=>gem5: Make the regression script build gem5.* instead of m5.* diff 8814:f0fcb53bed25 Sun Feb 05 04:23:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> Regressions: Fix the regress script when "all" is used. When the "all" test is specified, the "tests" list should have two elements in it, "quick" and "long", not a single element "quick,long". The later would be appropriate as the default for one of the command line options which are split at commas, but at that point "tests" should already be a list. diff 8811:be4990a2c764 Thu Feb 02 04:51:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> Regression: Update the regress script after SE/FS merge This patch updates the regress script to reflect the merge of the SE/FS builds and the new structure of the test directories. It adds a "mode" flag to the script, that defaults to both se and fs. |
Completed in 155 milliseconds