Searched hist:2014 (Results 601 - 625 of 1681) sorted by relevance
/gem5/tests/quick/se/00.hello/ref/arm/linux/o3-timing/ | ||
H A D | stats.txt | diff 10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller. diff 10488:7c27480a5031 Mon Oct 20 17:48:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to previous mmap and exit_group patches. diff 10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output diff 10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes This patch bumps the stats to reflect the addition of the snoop filter and snoop stats, the change from bus to crossbar, and the updates to the ARM regressions that are now using a different CPU and cache configuration. Lastly, some minor changes are expected due to the activation cleanup of the CPUs. diff 10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes This patch updates the stats to reflect the fixes and changes to the CPU (mainly the o3), and the caches. diff 10242:cb4e86c17767 Sun Jun 22 17:33:00 EDT 2014 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for O3 changes Mostly small differences in total ticks, but O3 stall causes shifted significantly. 30.eon does speed up by ~6% on Alpha and ARM, and 50.vortex by 4.5% on ARM. At the other extreme, X86 70.twolf is 0.8% slower. diff 10229:aae7735450a9 Fri May 23 07:07:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes due to o3 cpu and ruby message buffer patches diff 10220:9eab5efc02e8 Fri May 09 18:58:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for the fixes, and mostly DRAM controller changes diff 10148:4574d5882066 Sun Mar 23 11:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for DRAM changes This patch updates the stats to reflect the changes to the DRAM controller. diff 10038:7eccd14e2610 Fri Jan 24 16:29:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for ARMv8 changes |
/gem5/tests/quick/se/00.hello/ref/arm/linux/o3-timing-checker/ | ||
H A D | stats.txt | diff 10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller. diff 10488:7c27480a5031 Mon Oct 20 17:48:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to previous mmap and exit_group patches. diff 10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output diff 10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes This patch bumps the stats to reflect the addition of the snoop filter and snoop stats, the change from bus to crossbar, and the updates to the ARM regressions that are now using a different CPU and cache configuration. Lastly, some minor changes are expected due to the activation cleanup of the CPUs. diff 10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes This patch updates the stats to reflect the fixes and changes to the CPU (mainly the o3), and the caches. diff 10242:cb4e86c17767 Sun Jun 22 17:33:00 EDT 2014 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for O3 changes Mostly small differences in total ticks, but O3 stall causes shifted significantly. 30.eon does speed up by ~6% on Alpha and ARM, and 50.vortex by 4.5% on ARM. At the other extreme, X86 70.twolf is 0.8% slower. diff 10229:aae7735450a9 Fri May 23 07:07:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes due to o3 cpu and ruby message buffer patches diff 10220:9eab5efc02e8 Fri May 09 18:58:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for the fixes, and mostly DRAM controller changes diff 10148:4574d5882066 Sun Mar 23 11:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for DRAM changes This patch updates the stats to reflect the changes to the DRAM controller. diff 10038:7eccd14e2610 Fri Jan 24 16:29:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for ARMv8 changes |
/gem5/tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/ | ||
H A D | stats.txt | diff 10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller. diff 10585:1c9d5d9417b3 Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for fixes, mostly TLB and WriteInvalidate diff 10549:6317351a288c Fri Nov 21 20:22:00 EST 2014 Gabe Black <gabeblack@google.com> x86: Update stats for the new Linux delay port. diff 10540:45204db420c0 Mon Nov 17 03:16:00 EST 2014 Gabe Black <gabeblack@google.com> x86: Update the stats for the x86 FS o3 boot test. diff 10535:4ccec5baf82c Wed Nov 12 09:05:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump regressions to match latest changes Updates after timezone hick-up and sorting of dictionary items in the SimObject. diff 10530:533ec854b2f1 Tue Nov 11 15:17:00 EST 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes to x86 o3 fs and sparc fs regression tests. diff 10513:ca4438b6e39a Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update regressions for the new kernels and various preceeding fixes. diff 10452:be23c690f8c0 Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Small bump of trailing stats Somehow these seem to have been missed. diff 10451:3a87241adfb8 Sat Oct 11 17:18:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to changes to x86, stale configs. diff 10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output |
/gem5/tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-dual/ | ||
H A D | stats.txt | diff 10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller. diff 10585:1c9d5d9417b3 Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for fixes, mostly TLB and WriteInvalidate diff 10576:de2979ff873a Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for o3 LSQ changes diff 10535:4ccec5baf82c Wed Nov 12 09:05:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump regressions to match latest changes Updates after timezone hick-up and sorting of dictionary items in the SimObject. diff 10517:ba51f8572571 Mon Nov 03 11:14:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update stats no match. Bootloader I had on my sytem was an older version with a couple of instruction differences. diff 10513:ca4438b6e39a Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update regressions for the new kernels and various preceeding fixes. diff 10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output diff 10419:28b31101d9e6 Sun Sep 28 16:53:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect ARM fixes As a result of the fixes, the full-system dual-core ARM regressions are slightly changed. Hopefully this also means there will no longer be any discrepancies between the results observed on different hosts. diff 10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes This patch bumps the stats to reflect the addition of the snoop filter and snoop stats, the change from bus to crossbar, and the updates to the ARM regressions that are now using a different CPU and cache configuration. Lastly, some minor changes are expected due to the activation cleanup of the CPUs. diff 10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes This patch updates the stats to reflect the fixes and changes to the CPU (mainly the o3), and the caches. |
/gem5/src/mem/ | ||
H A D | dram_ctrl.cc | diff 10620:74834c49fbbe Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> config: Expose the DRAM ranks as a command-line option This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding). diff 10619:6dd27a0e0d23 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Ensure DRAM controller is idle when in atomic mode This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller forces the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression require a minor stats bump as a result. diff 10618:bb665366cc00 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add rank-wise refresh to the DRAM controller This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank. diff 10617:471d390943f0 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Fix a bug in the DRAM controller arbitration Fix a minor issue that affects multi-rank systems. diff 10561:e1a853349529 Tue Dec 02 06:07:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add a GDDR5 DRAM config This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies. diff 10509:d5554f97c451 Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> arm, mem: Fix drain bug and provide drain prints for more components. diff 10492:59f9f18aae0c Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Fix DRAM activationlLimit bug Ensure that we do the proper event scheduling also when the activation limit is disabled. diff 10489:99d59caa4c8f Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Add DRAM device size and check against config This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match gem5 will print a warning. In order to do correct DRAM research the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device with a specific size and would differ for another device with a different size. diff 10466:73b7549d979e Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Dynamically determine page bytes in memory components This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA. diff 10432:da98b90b5df0 Tue Jul 29 12:22:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: DRAMPower integration for on-line DRAM power stats This patch takes the final step in integrating DRAMPower and adds the appropriate calls in the DRAM controller to provide the command trace and extract the power and energy stats. The debug printouts are still left in place, but will eventually be removed. At the moment the DRAM power calculation is always on when using the DRAM controller model. The run-time impact of this addition is around 1.5% when looking at the total host seconds of the regressions. We deem this a sensible trade-off to avoid the complication of adding an enable/disable mechanism. |
H A D | dram_ctrl.hh | diff 10619:6dd27a0e0d23 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Ensure DRAM controller is idle when in atomic mode This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller forces the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression require a minor stats bump as a result. diff 10618:bb665366cc00 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add rank-wise refresh to the DRAM controller This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank. diff 10489:99d59caa4c8f Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Add DRAM device size and check against config This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match gem5 will print a warning. In order to do correct DRAM research the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device with a specific size and would differ for another device with a different size. diff 10432:da98b90b5df0 Tue Jul 29 12:22:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: DRAMPower integration for on-line DRAM power stats This patch takes the final step in integrating DRAMPower and adds the appropriate calls in the DRAM controller to provide the command trace and extract the power and energy stats. The debug printouts are still left in place, but will eventually be removed. At the moment the DRAM power calculation is always on when using the DRAM controller model. The run-time impact of this addition is around 1.5% when looking at the total host seconds of the regressions. We deem this a sensible trade-off to avoid the complication of adding an enable/disable mechanism. diff 10394:70cfafa17653 Sat Sep 20 17:18:00 EDT 2014 Wendy Elsasser <wendy.elsasser@arm.com> mem: Add DDR4 bank group timing Added the following parameter to the DRAMCtrl class: - bank_groups_per_rank This defaults to 1. For the DDR4 case, the default is overridden to indicate bank group architecture, with multiple bank groups per rank. Added the following delays to the DRAMCtrl class: - tCCD_L : CAS-to-CAS, same bank group delay - tRRD_L : RAS-to-RAS, same bank group delay These parameters are only applied when bank group timing is enabled. Bank group timing is currently enabled only for DDR4 memories. For all other memories, these delays will default to '0 ns' In the DRAM controller model, applied the bank group timing to the per bank parameters actAllowedAt and colAllowedAt. The actAllowedAt will be updated based on bank group when an ACT is issued. The colAllowedAt will be updated based on bank group when a RD/WR burst is issued. At the moment no modifications are made to the scheduling. diff 10393:0fafa62b6c01 Sat Sep 20 17:17:00 EDT 2014 Wendy Elsasser <wendy.elsasser@arm.com> mem: Add memory rank-to-rank delay Add the following delay to the DRAM controller: - tCS : Different rank bus turnaround delay This will be applied for 1) read-to-read, 2) write-to-write, 3) write-to-read, and 4) read-to-write command sequences, where the new command accesses a different rank than the previous burst. The delay defaults to 2*tCK for each defined memory class. Note that this does not correspond to one particular timing constraint, but is a way of modelling all the associated constraints. The DRAM controller has some minor changes to prioritize commands to the same rank. This prioritization will only occur when the command stream is not switching from a read to write or vice versa (in the case of switching we have a gap in any case). To prioritize commands to the same rank, the model will determine if there are any commands queued (same type) to the same rank as the previous command. This check will ensure that the 'same rank' command will be able to execute without adding bubbles to the command flow, e.g. any ACT delay requirements can be done under the hoods, allowing the burst to issue seamlessly. diff 10287:4966471a1ba1 Tue Aug 26 10:13:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Update DRAM controller comments Update comments and add a reference for more information. diff 10286:e95a0ab1d368 Tue Aug 26 10:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Fix address interleaving bug in DRAM controller This patch fixes a bug in the DRAM controller address decoding. In cases where the DRAM burst size (e.g. 32 bytes in a rank with a single LPDDR3 x32) was smaller than the channel interleaving size (e.g. systems with a 64-byte cache line) one address bit effectively got used as a channel bit when it should have been a low-order column bit. This patch adds a notion of "columns per stripe", and more clearly deals with the low-order column bits and high-order column bits. The patch also relaxes the granularity check such that it is possible to use interleaving granularities other than the cache line size. The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as it is only used in the debug build for now. diff 10247:0ad233f0a77d Mon Jun 30 13:56:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: DRAMPower trace output This patch adds a DRAMPower flag to enable off-line DRAM power analysis using the DRAMPower tool. A new DRAMPower flag is added and a follow-on patch adds a Python script to post-process the output and order it based on time stamps. The long-term goal is to link DRAMPower as a library and provide the commands through function calls to the model rather than first printing and then parsing the commands. At the moment it is also up to the user to ensure that the same DRAM configuration is used by the gem5 controller model and DRAMPower. diff 10246:e0e3efe3b1d5 Mon Jun 30 13:56:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Add bank and rank indices as fields to the DRAM bank This patch adds the index of the bank and rank as a field so that we can determine the identity of a given bank (reference or pointer) for the power tracing. We also grab the opportunity of cleaning up the arguments used for identifying the bank when activating. |
/gem5/src/cpu/minor/ | ||
H A D | SConscript | 10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results: Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036 |
H A D | activity.hh | 10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results: Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036 |
H A D | trace.hh | 10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results: Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036 |
/gem5/src/cpu/o3/ | ||
H A D | SConsopts | diff 10319:4207f9bfcceb Wed Sep 03 07:42:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> arch, cpu: Factor out the ExecContext into a proper base class We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%. |
/gem5/src/cpu/simple/ | ||
H A D | SConsopts | diff 10319:4207f9bfcceb Wed Sep 03 07:42:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> arch, cpu: Factor out the ExecContext into a proper base class We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%. |
/gem5/src/doc/ | ||
H A D | inside-minor.doxygen | 10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results: Benchmark | Stat host_seconds (s) ---------------+--------v--------v-------- (on ARM, opt) | simple | o3 | minor | timing | timing | timing ---------------+--------+--------+-------- 10.linux-boot | 169 | 1883 | 1075 10.mcf | 117 | 967 | 491 20.parser | 668 | 6315 | 3146 30.eon | 542 | 3413 | 2414 40.perlbmk | 2339 | 20905 | 11532 50.vortex | 122 | 1094 | 588 60.bzip2 | 2045 | 18061 | 9662 70.twolf | 207 | 2736 | 1036 |
/gem5/src/mem/slicc/ast/ | ||
H A D | TransitionDeclAST.py | diff 10165:7e9edf4297a9 Sat Apr 19 10:00:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> ruby: slicc: slight change to rule for transitions It had an unnecessary pairs token which is being removed. |
/gem5/src/sim/ | ||
H A D | global_event.cc | diff 10361:280cc9b0794f Tue Sep 09 04:36:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> sim: Fix resource leak in BaseGlobalEvent Static analysis revealed that BaseGlobalEvent::barrier was never deallocated. This changeset solves this leak by making the barrier allocation a part of the BaseGlobalEvent instead of storing a pointer to a separate heap-allocated barrier. |
/gem5/src/unittest/ | ||
H A D | symtest.cc | diff 10055:6153b582c9b5 Thu Jan 30 01:21:00 EST 2014 Ola Jeppsson <ola.jeppsson@gmail.com> unittest: Fix build errors Committed by: Nilay Vaish <nilay@cs.wisc.edu> |
/gem5/ext/mcpat/ | ||
H A D | bus_interconnect.cc | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | bus_interconnect.h | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cachearray.cc | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cachearray.h | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cachecontroller.cc | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cachecontroller.h | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cacheunit.cc | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | cacheunit.h | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | common.h | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
H A D | system.cc | 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes This patch includes software engineering changes and some generic bug fixes Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known issues/concernts we did not have a chance to address in this patch. High-level changes in this patch include: 1) Making XML parsing modular and hierarchical: - Shift parsing responsibility into the components - Read XML in a (mostly) context-free recursive manner so that McPAT input files can contain arbitrary component hierarchies 2) Making power, energy, and area calculations a hierarchical and recursive process - Components track their subcomponents and recursively call compute functions in stages - Make C++ object hierarchy reflect inheritance of classes of components with similar structures - Simplify computeArea() and computeEnergy() functions to eliminate successive calls to calculate separate TDP vs. runtime energy - Remove Processor component (now unnecessary) and introduce a more abstract System component 3) Standardizing McPAT output across all components - Use a single, common data structure for storing and printing McPAT output - Recursively call print functions through component hierarchy 4) For caches, allow splitting data array and tag array reads and writes for better accuracy 5) Improving the usability of CACTI by printing more helpful warning and error messages 6) Minor: Impose more rigorous code style for clarity (more work still to be done) Overall, these changes greatly reduce the amount of replicated code, and they improve McPAT runtime and decrease memory footprint. |
Completed in 127 milliseconds