Searched hist:2012 (Results 651 - 675 of 1124) sorted by relevance
/gem5/src/mem/ruby/network/fault_model/ | ||
H A D | FaultModel.cc | diff 8946:fb6c89334b86 Sat Apr 14 05:43:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6 This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb". |
H A D | FaultModel.hh | diff 8946:fb6c89334b86 Sat Apr 14 05:43:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6 This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb". |
/gem5/src/mem/ruby/network/ | ||
H A D | BasicRouter.hh | diff 9274:ba635023d4bb Tue Oct 02 15:35:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> ruby: changes to simple network This patch makes the Switch structure inherit from BasicRouter, as is done in two other networks. |
H A D | BasicRouter.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. |
/gem5/src/arch/arm/linux/ | ||
H A D | system.cc | diff 9332:ae2a5329ce96 Fri Nov 02 12:32:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM). diff 9290:90dd57ca9a7e Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Fix: Address a few minor issues identified by cppcheck This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only get's called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used. diff 9261:f795ce1feb5b Tue Sep 25 12:49:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ARM: added support for flattened device tree blobs Newer Linux kernels require DTB (device tree blobs) to specify platform configurations. The input DTB filename can be specified through gem5 parameters in LinuxArmSystem. diff 8997:f07639e4b676 Thu May 10 19:04:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ARM: guard masked symbol tables by default Symbol tables masked with the loadAddrMask create redundant entries that could conflict with kernel function events that rely on the original addresses. This patch guards the creation of those masked symbol tables by default, with an option to enable them when needed (for early-stage kernel debugging, etc.) diff 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. diff 8885:52bbd95b31ed Fri Mar 09 09:59:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> System: Move code in initState() back into constructor whenever possible. The change to port proxies recently moved code out of the constructor into initState(). This is needed for code that loads data into memory, however for code that setups symbol tables, kernel based events, etc this is the wrong thing to do as that code is only called when a checkpoint isn't being restored from. diff 8870:f95c4042f2d0 Thu Mar 01 18:26:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> ARM: Add support for Versatile Express extended memory map Also clean up how we create boot loader memory a bit. diff 8852:c744483edfcf Fri Feb 24 11:45:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Make port proxies use references rather than pointers This patch is adding a clearer design intent to all objects that would not be complete without a port proxy by making the proxies members rathen than dynamically allocated. In essence, if NULL would not be a valid value for the proxy, then we avoid using a pointer to make this clear. The same approach is used for the methods using these proxies, such as loadSections, that now use references rather than pointers to better reflect the fact that NULL would not be an acceptable value (in fact the code would break and that is how this patch started out). Overall the concept of "using a reference to express unconditional composition where a NULL pointer is never valid" could be done on a much broader scale throughout the code base, but for now it is only done in the locations affected by the proxies. diff 8706:b1838faf3bcc Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy |
/gem5/configs/splash2/ | ||
H A D | cluster.py | diff 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. diff 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. diff 8847:ef8630054b5e Tue Feb 14 14:15:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Fix residual bus ports and make them master/slave This patch cleans up a number of remaining uses of bus.port which is now split into bus.master and bus.slave. The only non-trivial change is the memtest where the level building now has to be aware of the role of the ports used in the previous level. diff 8801:1a84c6a81299 Sat Jan 28 10:24:00 EST 2012 Gabe Black <gblack@eecs.umich.edu> SE/FS: Make SE vs. FS mode a runtime parameter. |
/gem5/src/arch/alpha/ | ||
H A D | AlphaTLB.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. |
/gem5/src/arch/mips/ | ||
H A D | MipsTLB.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. |
/gem5/src/arch/sparc/ | ||
H A D | SparcTLB.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. |
/gem5/src/cpu/o3/ | ||
H A D | FUPool.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. |
/gem5/src/dev/arm/ | ||
H A D | base_gic.cc | 9525:0587c8983d47 Thu Oct 25 09:05:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Create a GIC base class and make the PL390 derive from it This patch moves the GIC interface to a separate base class and makes all interrupt devices use that base class instead of a pointer to the PL390 implementation. This allows us to have multiple GIC implementations. Future implementations will allow in-kernel GIC implementations when using hardware virtualization. |
/gem5/src/kern/linux/ | ||
H A D | linux.hh | diff 9238:9fa13250abd8 Fri Sep 21 04:51:00 EDT 2012 Lluc Alvarez <lluc.alvarez@bsc.es> SE: Ignore FUTEX_PRIVATE_FLAG of sys_futex This patch ignores the FUTEX_PRIVATE_FLAG of the sys_futex system call in SE mode. With this patch, when sys_futex with the options FUTEX_WAIT_PRIVATE or FUTEX_WAKE_PRIVATE is emulated, the FUTEX_PRIVATE_FLAG is ignored and so their behaviours are the regular FUTEX_WAIT and FUTEX_WAKE. Emulating FUTEX_WAIT_PRIVATE and FUTEX_WAKE_PRIVATE as if they were non-private is safe from a functional point of view. The FUTEX_PRIVATE_FLAG does not change the semantics of the futex, it's just a mechanism to improve performance under certain circunstances that can be ignored in SE mode. diff 9146:a61fdbbc1d45 Mon Aug 06 22:52:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Enabled getrlimit and getrusage for x86. Added/moved rlimit constants to base linux header file. This patch is a revised version of Vince Weaver's earlier patch. diff 9141:593fe25c86a6 Mon Aug 06 19:52:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Clean up ioctl handling, and implement for x86. Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch. diff 9112:6e854ea87bab Wed Jul 11 01:51:00 EDT 2012 Marc Orr <marc.orr@gmail.com> syscall emulation: Add the futex system call. |
/gem5/src/python/m5/util/ | ||
H A D | terminal.py | diff 8947:217fbc57df05 Sat Apr 14 05:44:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Regression: Add ANSI colours to highlight test status This patch adds a very basic pretty-printing of the test status (passed or failed) to highlight failing tests even more: green for passed, and red for failed. The printing only uses ANSI it the target output is a tty and supports ANSI colours. Hence, any regression scripts that are outputting to files or sending e-mails etc should still be fine. |
/gem5/util/ | ||
H A D | cpt_upgrader.py | diff 9332:ae2a5329ce96 Fri Nov 02 12:32:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ARM: dump stats and process info on context switches This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM). diff 9293:df7c3f99ebca Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Separate the host and guest views of memory backing store This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contigous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store. diff 9056:0e38b529c387 Tue Jun 05 10:36:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> cpt: update some comments in the checkpoint migration script 9048:950298f29140 Tue Jun 05 01:23:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> sim: Provide a framework for detecting out of data checkpoints and migrating them. |
/gem5/src/dev/ | ||
H A D | dma_device.cc | diff 9342:6fec8f26e56d Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager. diff 9307:98e05d58f9eb Tue Oct 23 04:49:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> dev: Remove zero-time loop in DMA timing send This patch removes the zero-time loop used to send items from the DMA port transmit list. Instead of having a loop, the DMA port now uses an event to schedule sending of a single packet. Ultimately this patch serves to ease the transition to a blocking 4-phase handshake. A follow-on patch will update the regression statistics. diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9166:1d983855df2c Wed Aug 22 11:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> DMA: Refactor the DMA device and align timing and atomic This patch does a bunch of house-keeping updates on the DMA, including indentation, and formatting, but most importantly breaks out the response handling such that it can be shared between the atomic and timing modes. It also removes a potential bug caused by the atomic handling of responses only deleting the allocated request (pkt->req) once the DMA action completes instead of doing so for every packet. Before this patch, the handling of responses was near identical for atomic and timing, but the code was simply duplicated. With this patch, the handleResp method deals with the responses in both cases. There are further updates to make after removing the NACKs, but that will be part of a separate follow-up patch. This patch does not change the behaviour of any regression. diff 9165:f9e3dac185ba Wed Aug 22 11:39:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up. diff 9152:86c0e6ca5e7c Wed Aug 15 10:38:00 EDT 2012 Anthony Gutierrez <atgutier@umich.edu> O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation. diff 9133:82491f9ed266 Fri Jul 27 16:08:00 EDT 2012 Anthony Gutierrez <atgutier@umich.edu> dma: remove unused variable this patch removes the actionInProgress field from the DmaPort class. this variable is only defined and initiated in the ctor. it is never used. diff 9095:0e6bd7082fac Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Align port names in C++ and Python This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus. 9016:18093957a102 Wed May 23 09:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> DMA: Split the DMA device and IO device into seperate files This patch moves the DMA device to its own set of files, splitting it from the IO device. There are no behavioural changes associated with this patch. The patch also grabs the opportunity to do some very minor tidying up, including some white space removal and pruning some redundant parameters. Besides the immediate benefits of the separation-of-concerns, this patch also makes upcoming changes more streamlined as it split the devices that are only slaves and the DMA device that also acts as a master. |
/gem5/configs/ruby/ | ||
H A D | MOESI_CMP_directory.py | diff 9319:ab0a36d082bb Sat Oct 27 17:04:00 EDT 2012 Malek Musleh <malek.musleh@gmail.com> ruby: set the is_icache param for caches This patch sets the is_icache param for the L1 caches used in the MESI and the MOESI CMP directory protocols. diff 9232:3bb99fab80d4 Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> AddrRange: Simplify AddrRange params Python hierarchy This patch simplifies the Range object hierarchy in preparation for an address range class that also allows striping (e.g. selecting a few bits as matching in addition to the range). To extend the AddrRange class to an AddrRegion, the first step is to simplify the hierarchy such that we can make it as lean as possible before adding the new functionality. The only class using Range and MetaRange is AddrRange, and the three classes are now collapsed into one. diff 9154:198352d722e4 Fri Aug 17 00:39:00 EDT 2012 Jason Power <power.jg@gmail.com> Ruby: Add RubySystem parameter to MemoryControl This guarantees that RubySystem object is created before the MemoryController object is created. diff 9100:3caf131d7a95 Wed Jul 11 01:51:00 EDT 2012 Brad Beckmann <Brad.Beckmann@amd.com> ruby: changes how Topologies are created Instead of just passing a list of controllers to the makeTopology function in src/mem/ruby/network/topologies/<Topo>.py we pass in a function pointer which knows how to make the topology, possibly with some extra state set in the configs/ruby/<protocol>.py file. Thus, we can move all of the files from network/topologies to configs/topologies. A new class BaseTopology is added which all topologies in configs/topologies must inheirit from and follow its API. diff 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. diff 8929:4148f9af0b70 Thu Apr 05 12:09:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Config: corrects the way Ruby attaches to the DMA ports With recent changes to the memory system, a port cannot be assigned a peer port twice. While making use of the Ruby memory system in FS mode, DMA ports were assigned peer twice, once for the classic memory system and once for the Ruby memory system. This patch removes this double assignment of peer ports. diff 8923:820111f58fbb Fri Mar 30 09:42:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Ruby: Remove the physMemPort and instead access memory directly This patch removes the physMemPort from the RubySequencer and instead uses the system pointer to access the physmem. The system already keeps track of the physmem and the valid memory address ranges, and with this patch we merely make use of that existing functionality. The memory is modified so that it is possible to call the access functions (atomic and functional) without going through the port, and the memory is allowed to be unconnected, i.e. have no ports (since Ruby does not attach it like the conventional memory system). diff 8845:a230379caf65 Tue Feb 14 03:41:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Fix master/slave ports in Ruby and non-regression scripts This patch brings the Ruby and other scripts up to date with the introduction of the master/slave ports. diff 8717:5c253f1031d7 Mon Jan 23 12:07:00 EST 2012 Nilay Vaish <nilay@cs.wisc.edu> O3, Ruby: Forward invalidations from Ruby to O3 CPU This patch implements the functionality for forwarding invalidations and replacements from the L1 cache of the Ruby memory system to the O3 CPU. The implementation adds a list of ports to RubyPort. Whenever a replacement or an invalidation is performed, the L1 cache forwards this to all the ports, which is the LSQ in case of the O3 CPU. |
/gem5/src/arch/generic/linux/ | ||
H A D | threadinfo.hh | 9329:3fe8438cbcfc Fri Nov 02 12:32:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> ISA: generic Linux thread info support This patch takes the Linux thread info support scattered across different ISA implementations (currently in ARM, ALPHA, and MIPS), and unifies them into a single file. Adds a few more helper functions to read out TGID, mm, etc. ISA-specific information (e.g., ALPHA PCBB register) is now moved to the corresponding isa_traits.hh files. |
/gem5/src/cpu/pred/ | ||
H A D | tournament.cc | diff 9360:515891d9057a Thu Dec 06 10:31:00 EST 2012 Erik Tomusk <E.Tomusk@sms.ed.ac.uk> TournamentBP: Fix some bugs with table sizes and counters globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu> diff 9327:07a22ace275d Fri Nov 02 12:32:00 EDT 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> o3: Fix a couple of issues with the local predictor. Fix some issues with the local predictor and the way it's indexed. diff 8843:7d3ac6813147 Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BPred: Fix RAS to handle predicated call/return instructions. Change RAS to fix issues with predicated call/return instructions. Handled all cases in the life of a predicated call and return instruction. diff 8842:a02932e2e73d Mon Feb 13 01:26:00 EST 2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com> BP: Fix several Branch Predictor issues. 1. Updates the Branch Predictor correctly to the state just after a mispredicted branch, if a squash occurs. 2. If a BTB does not find an entry, the branch is predicted not taken. The global history is modified to correctly reflect this prediction. 3. Local history is now updated at the fetch stage instead of execute stage. 4. In the Update stage of the branch predictor the local predictors are now correctly updated according to the state of local history during fetch stage. This patch also improves performance by as much as 17% on some benchmarks |
/gem5/src/mem/ | ||
H A D | abstract_mem.cc | diff 9293:df7c3f99ebca Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Separate the host and guest views of memory backing store This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contigous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store. diff 9236:c38988024f1f Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Remove the file parameter from AbstractMemory This patch removes the unused file parameter from the AbstractMemory. The patch serves to make it easier to transition to a separation of the actual contigious host memory backing store, and the gem5 memory controllers. Without the file parameter it becomes easier to hide the creation of the mmap in the PhysicalMemory, as there are no longer any reasons to expose the actual contigious ranges to the user. To the best of my knowledge there is no use of the parameter, so the change should not affect anyone. diff 9235:5aa4896ed55a Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. diff 9203:939077a54014 Mon Sep 10 11:57:00 EDT 2012 Marco Elver <marco.elver@ed.ac.uk> Mem: Allow serializing of more than INT_MAX bytes Despite gzwrite taking an unsigned for length, it returns an int for bytes written; gzwrite fails if (int)len < 0. Because of this, call gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if data to be written is larger than INT_MAX. diff 9098:7909b6cf7188 Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Mem: Make members relating to range and size constant This patch makes the address-range related members const. The change is trivial and merely ensures that they can be called on a const memory. diff 9080:753fc1c3618c Fri Jun 29 11:19:00 EDT 2012 Matt Evans <matt.evans@arm.com> Mem: Fix a livelock resulting in LLSC/locked memory access implementation. Currently when multiple CPUs perform a load-linked/store-conditional sequence, the loads all create a list of reservations which is then scanned when the stores occur. A reservation matching the context and address of the store is sought, BUT all reservations matching the address are also erased at this point. The upshot is that a store-conditional will remove all reservations even if the store itself does not succeed. A livelock was observed using 7-8 CPUs where a thread would erase the reservations of other threads, not succeed, loop and put its own reservation in again only to have it blown by another thread that unsuccessfully now tries to store-conditional -- no forward progress was made, hanging the system. The correct way to do this is to only blow a reservation when a store (conditional or not) actually /occurs/ to its address. One thread always wins (the one that does the store-conditional first). diff 9053:9cad1c26c3b3 Tue Jun 05 01:23:00 EDT 2012 Dam Sunwoo <dam.sunwoo@arm.com> Mem: add per-master stats to physmem Added per-master stats (similar to cache stats) to physmem. diff 8992:e68dd2ba4fa4 Thu May 10 19:04:00 EDT 2012 Ali Saidi <Ali.Saidi@ARM.com> gem5: assert before indexing intro arrays to verify bounds 8931:7a1dfb191e3f Fri Apr 06 13:46:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Enable multiple distributed generalized memories This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. |
H A D | bridge.cc | diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9235:5aa4896ed55a Wed Sep 19 06:15:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> AddrRange: Transition from Range<T> to AddrRange This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. diff 9180:ee8d7a51651d Tue Aug 28 14:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes. diff 9164:d112473185ea Wed Aug 22 11:39:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bridge: Remove NACKs in the bridge and unify with packet queue This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be complete removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there was never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up. diff 9095:0e6bd7082fac Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Align port names in C++ and Python This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus. diff 9090:e4e22240398f Mon Jul 09 00:35:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Make getAddrRanges const This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects. diff 9029:120ba616606e Wed May 30 05:28:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bridge: Split deferred request, response and sender state This patch splits the PacketBuffer class into a RequestState and a DeferredRequest and DeferredResponse. Only the requests need a SenderState, and the deferred requests and responses only need an associated point in time for the request and the response queue. Besides the cleaning up, the goal is to simplify the transition to a new port handshake, and with these changes, the two packet queues are starting to look very similar to the generic packet queue, but currently they do a few unique things relating to the NACK and counting of requests/responses that the packet queue cannot be conveniently used. This will be addressed in a later patch. diff 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. diff 8949:3fa1ee293096 Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Remove the Broadcast destination from the packet This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing. diff 8948:e95ee70f876c Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. |
H A D | port_proxy.hh | diff 8922:17f037ad8918 Fri Mar 30 09:40:00 EDT 2012 William Wang <william.wang@arm.com> MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. diff 8861:56d011130987 Wed Feb 29 04:47:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Make all the port proxy members const This is a trivial patch that merely makes all the member functions of the port proxies const. There is no good reason why they should not be, and this change only serves to make it explicit that they are not modified through their use. diff 8853:0216ed80991b Fri Feb 24 11:46:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Move all read/write blob functions from Port to PortProxy This patch moves the readBlob/writeBlob/memsetBlob from the Port class to the PortProxy class, thus making a clear separation of the basic port functionality (recv/send functional/atomic/timing), and the higher-level functional accessors available on the port proxies. There are only a few places in the code base where the blob functions were used on ports, and they are all for peeking into the memory system without making a normal memory access (in the memtest, and the malta and tsunami pchip). The memtest also exemplifies how easy it is to create a non-translating proxy if desired. The malta and tsunami pchip used a slave port to perform a functional read, and this is now changed to rely on the physProxy of the system (to which they already have a pointer). 8706:b1838faf3bcc Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy |
/gem5/src/arch/x86/ | ||
H A D | pagetable_walker.cc | diff 9294:8fb03b13de02 Mon Oct 15 08:12:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Port: Add protocol-agnostic ports in the port hierarchy This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default. diff 9165:f9e3dac185ba Wed Aug 22 11:39:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up. diff 8975:7f36d4436074 Tue May 01 13:40:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate requests and responses for timing accesses This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself. diff 8953:488d45aeb672 Sun Apr 15 02:24:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> X86: Use the AddrTrie class to implement the TLB. This change also adjusts the TlbEntry class so that it stores the number of address bits wide a page is rather than its size in bytes. In other words, instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K, but it's a little harder going the other way. diff 8949:3fa1ee293096 Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Remove the Broadcast destination from the packet This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing. diff 8948:e95ee70f876c Sat Apr 14 05:45:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate snoops and normal memory requests/responses This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions. diff 8922:17f037ad8918 Fri Mar 30 09:40:00 EDT 2012 William Wang <william.wang@arm.com> MEM: Introduce the master/slave port sub-classes in C++ This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again. diff 8832:247fee427324 Sun Feb 12 17:07:00 EST 2012 Ali Saidi <Ali.Saidi@ARM.com> mem: Add a master ID to each request object. This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python. diff 8711:c7e14f52c682 Tue Jan 17 01:55:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Separate queries for snooping and address ranges This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges. |
/gem5/src/mem/ruby/system/ | ||
H A D | DMASequencer.hh | diff 9208:2451e60d4555 Tue Sep 11 10:23:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Ruby: Use uint8_t instead of uint8 everywhere diff 9117:49116b947194 Thu Jul 12 09:39:00 EDT 2012 Nilay Vaish <nilay@cs.wisc.edu> Ruby: remove config information from ruby.stats This patch removes printConfig() functions from all structures in Ruby. Most of the information is already part of config.ini, and where ever it is not, it would become in due course. diff 9104:27d56b644e78 Wed Jul 11 01:51:00 EDT 2012 Joel Hestness <hestness@cs.utexas.edu> ruby: tag and data cache access support Updates to Ruby to support statistics counting of cache accesses. This feature serves multiple purposes beyond simple stats collection. It provides the foundation for ruby to model the cache tag and data arrays as physical resources, as well as provide the necessary input data for McPAT power modeling. diff 8688:5ca9dd977386 Wed Jan 11 14:48:00 EST 2012 Nilay Vaish <nilay@cs.wisc.edu> Ruby: Resurrect Cache Warmup Capability This patch resurrects ruby's cache warmup capability. It essentially makes use of all the infrastructure that was added to the controllers, memories and the cache recorder. |
/gem5/src/arch/x86/insts/ | ||
H A D | badmicroop.cc | diff 8961:ff4762285f99 Mon Apr 23 03:00:00 EDT 2012 Gabe Black <gblack@eecs.umich.edu> ISA: Put parser generated files in a "generated" directory. This is to avoid collision with non-generated files. |
/gem5/src/cpu/ | ||
H A D | BaseCPU.py | diff 9338:97b4a2be1e5b Fri Nov 02 12:32:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it. diff 9284:f4ff625eae56 Mon Oct 15 08:08:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Regression: Use CPU clock and 32-byte width for L1-L2 bus This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intending to fit every single scenario, but rather serve as a better startingpoint than what we previously had. Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change. A separate patch will update the stats. diff 9254:f1b35c618252 Tue Sep 25 12:49:00 EDT 2012 Andreas Sandberg <Andreas.Sandberg@arm.com> sim: Move CPU-specific methods from SimObject to the BaseCPU class diff 9180:ee8d7a51651d Tue Aug 28 14:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes. diff 9161:e353c178fb36 Tue Aug 21 05:49:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> CPU: Remove overloaded function_trace_start parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The inorder CPU is particularly interesting as it uses a different name for the parameter, and never make any use of it internally. diff 9157:e0bad9d7bbd6 Tue Aug 21 05:49:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Clock: Move the clock and related functions to ClockedObject This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced). diff 9036:6385cf85bf12 Thu May 31 13:30:00 EDT 2012 Andreas Hansson <andreas.hansson@arm.com> Bus: Split the bus into a non-coherent and coherent bus This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. diff 8887:20ea02da9c53 Fri Mar 09 09:59:00 EST 2012 Geoffrey Blake <geoffrey.blake@arm.com> CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectable Enables the CheckerCPU to be selected at runtime with the --checker option from the configs/example/fs.py and configs/example/se.py configuration files. Also merges with the SE/FS changes. diff 8863:50ce4deacda9 Thu Mar 01 12:37:00 EST 2012 Nilay Vaish <nilay@cs.wisc.edu> x86: Fix switching of CPUs This patch prevents creation of interrupt controller for cpus that will be switched in later diff 8839:eeb293859255 Mon Feb 13 06:43:00 EST 2012 Andreas Hansson <andreas.hansson@arm.com> MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves. |
Completed in 161 milliseconds