Searched hist:2014 (Results 601 - 625 of 1681) sorted by relevance

/gem5/tests/quick/se/00.hello/ref/arm/linux/o3-timing/
stats.txt
10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes

Changes due to speculative execution of an unaligned PC, introduction
of TLB stats, changes and re-work of the prefetcher, and the
introduction of rank-wise refresh in the DRAM controller.
10488:7c27480a5031 Mon Oct 20 17:48:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to previous mmap and exit_group patches.
10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output
10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes

This patch bumps the stats to reflect the addition of the snoop filter
and snoop stats, the change from bus to crossbar, and the updates to
the ARM regressions that are now using a different CPU and cache
configuration. Lastly, some minor changes are expected due to the
activation cleanup of the CPUs.
10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes

This patch updates the stats to reflect the fixes and changes to the
CPU (mainly the o3), and the caches.
10242:cb4e86c17767 Sun Jun 22 17:33:00 EDT 2014 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for O3 changes

Mostly small differences in total ticks, but O3 stall causes
shifted significantly.

30.eon does speed up by ~6% on Alpha and ARM, and 50.vortex
by 4.5% on ARM. At the other extreme, X86 70.twolf is 0.8%
slower.
10229:aae7735450a9 Fri May 23 07:07:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes due to o3 cpu and ruby message buffer patches
10220:9eab5efc02e8 Fri May 09 18:58:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for the fixes, and mostly DRAM controller changes
10148:4574d5882066 Sun Mar 23 11:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for DRAM changes

This patch updates the stats to reflect the changes to the DRAM
controller.
10038:7eccd14e2610 Fri Jan 24 16:29:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for ARMv8 changes
/gem5/tests/quick/se/00.hello/ref/arm/linux/o3-timing-checker/
stats.txt
10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes

Changes due to speculative execution of an unaligned PC, introduction
of TLB stats, changes and re-work of the prefetcher, and the
introduction of rank-wise refresh in the DRAM controller.
10488:7c27480a5031 Mon Oct 20 17:48:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to previous mmap and exit_group patches.
10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output
10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes

This patch bumps the stats to reflect the addition of the snoop filter
and snoop stats, the change from bus to crossbar, and the updates to
the ARM regressions that are now using a different CPU and cache
configuration. Lastly, some minor changes are expected due to the
activation cleanup of the CPUs.
10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes

This patch updates the stats to reflect the fixes and changes to the
CPU (mainly the o3), and the caches.
10242:cb4e86c17767 Sun Jun 22 17:33:00 EDT 2014 Steve Reinhardt <steve.reinhardt@amd.com> stats: update for O3 changes

Mostly small differences in total ticks, but O3 stall causes
shifted significantly.

30.eon does speed up by ~6% on Alpha and ARM, and 50.vortex
by 4.5% on ARM. At the other extreme, X86 70.twolf is 0.8%
slower.
10229:aae7735450a9 Fri May 23 07:07:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes due to o3 cpu and ruby message buffer patches
10220:9eab5efc02e8 Fri May 09 18:58:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for the fixes, and mostly DRAM controller changes
10148:4574d5882066 Sun Mar 23 11:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for DRAM changes

This patch updates the stats to reflect the changes to the DRAM
controller.
10038:7eccd14e2610 Fri Jan 24 16:29:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for ARMv8 changes
/gem5/tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/
stats.txt
10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes

Changes due to speculative execution of an unaligned PC, introduction
of TLB stats, changes and re-work of the prefetcher, and the
introduction of rank-wise refresh in the DRAM controller.
10585:1c9d5d9417b3 Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for fixes, mostly TLB and WriteInvalidate
10549:6317351a288c Fri Nov 21 20:22:00 EST 2014 Gabe Black <gabeblack@google.com> x86: Update stats for the new Linux delay port.
10540:45204db420c0 Mon Nov 17 03:16:00 EST 2014 Gabe Black <gabeblack@google.com> x86: Update the stats for the x86 FS o3 boot test.
10535:4ccec5baf82c Wed Nov 12 09:05:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump regressions to match latest changes

Updates after timezone hiccup and sorting of dictionary items in the
SimObject.
10530:533ec854b2f1 Tue Nov 11 15:17:00 EST 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: changes to x86 o3 fs and sparc fs regression tests.
10513:ca4438b6e39a Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update regressions for the new kernels and various preceding fixes.
10452:be23c690f8c0 Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Small bump of trailing stats

Somehow these seem to have been missed.
10451:3a87241adfb8 Sat Oct 11 17:18:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> stats: updates due to changes to x86, stale configs.
10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output
/gem5/tests/long/fs/10.linux-boot/ref/arm/linux/realview-o3-dual/
stats.txt
10628:c9b7e0c69f88 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for decoder, TLB, prefetcher and DRAM changes

Changes due to speculative execution of an unaligned PC, introduction
of TLB stats, changes and re-work of the prefetcher, and the
introduction of rank-wise refresh in the DRAM controller.
10585:1c9d5d9417b3 Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for fixes, mostly TLB and WriteInvalidate
10576:de2979ff873a Tue Dec 02 06:08:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for o3 LSQ changes
10535:4ccec5baf82c Wed Nov 12 09:05:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump regressions to match latest changes

Updates after timezone hiccup and sorting of dictionary items in the
SimObject.
10517:ba51f8572571 Mon Nov 03 11:14:00 EST 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update stats no match.

Bootloader I had on my system was an older version with a couple of
instruction differences.
10513:ca4438b6e39a Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> tests: Update regressions for the new kernels and various preceding fixes.
10433:821cbe4a183b Thu Oct 09 17:52:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Add DRAM power statistics to reference output
10419:28b31101d9e6 Sun Sep 28 16:53:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats to reflect ARM fixes

As a result of the fixes, the full-system dual-core ARM regressions
are slightly changed. Hopefully this also means there will no longer
be any discrepancies between the results observed on different hosts.
10409:8c80b91944c5 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Bump stats for filter, crossbar and config changes

This patch bumps the stats to reflect the addition of the snoop filter
and snoop stats, the change from bus to crossbar, and the updates to
the ARM regressions that are now using a different CPU and cache
configuration. Lastly, some minor changes are expected due to the
activation cleanup of the CPUs.
10352:5f1f92bf76ee Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> stats: Update stats for CPU and cache changes

This patch updates the stats to reflect the fixes and changes to the
CPU (mainly the o3), and the caches.
/gem5/src/mem/
dram_ctrl.cc
10620:74834c49fbbe Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> config: Expose the DRAM ranks as a command-line option

This patch gives the user direct influence over the number of DRAM
ranks to make it easier to tune the memory density without affecting
the bandwidth (previously the only means of scaling the device count
was through the number of channels).

The patch also adds some basic sanity checks to ensure that the number
of ranks is a power of two (since we rely on bit slices in the address
decoding).
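
The power-of-two requirement can be illustrated with a minimal sketch. The helper names and the `rank_shift` parameter are hypothetical and are not taken from the gem5 sources; the point is only that a bit slice of width log2(ranks) maps every bit pattern onto a valid rank index when the rank count is a power of two.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def rank_of(addr: int, rank_shift: int, ranks: int) -> int:
    """Extract the rank index as a bit slice of the address.

    rank_shift is a hypothetical position of the lowest rank bit.
    """
    if not is_power_of_two(ranks):
        raise ValueError(f"ranks must be a power of two, got {ranks}")
    # With a power-of-two rank count, (ranks - 1) is an all-ones mask.
    return (addr >> rank_shift) & (ranks - 1)
```

With three ranks the mask `ranks - 1` would be `0b10`, so rank 1 could never be selected, which is why the sketch rejects such a value up front.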
10619:6dd27a0e0d23 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Ensure DRAM controller is idle when in atomic mode

This patch addresses an issue seen with the KVM CPU where the refresh
events scheduled by the DRAM controller force the simulator to switch
out of the KVM mode, thus killing performance.

The current patch works around the fact that we currently have no
proper API to inform a SimObject of the mode switches. Instead we rely
on drainResume being called after any switch, and cache the previous
mode locally to be able to decide on appropriate actions.

The switcheroo regression requires a minor stats bump as a result.
10618:bb665366cc00 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add rank-wise refresh to the DRAM controller

This patch adds rank-wise refresh to the controller, as opposed to the
channel-wide refresh currently in place. In essence each rank can be
refreshed independently, and for this to be possible the controller
is extended with a state machine per rank.

Without this patch the data bus is always idle during a refresh, as
all the ranks are refreshing at the same time. With the rank-wise
refresh it is possible to use one rank while another one is
refreshing, and thus the data bus can be kept busy.

The patch introduces a Rank class to encapsulate the state per rank,
and also shifts all the relevant banks, activation tracking etc to the
rank. The arbitration is also updated to consider the state of the rank.
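
A rough sketch of the per-rank state idea follows. The class and state names are illustrative assumptions and much simpler than the controller's actual state machine; the sketch only shows how arbitration could skip a refreshing rank while other ranks keep the data bus busy.

```python
from enum import Enum, auto


class RefreshState(Enum):
    """Simplified, hypothetical per-rank refresh states."""
    IDLE = auto()
    REFRESHING = auto()


class Rank:
    """Holds the state that is tracked per rank in this sketch."""

    def __init__(self, index: int) -> None:
        self.index = index
        self.state = RefreshState.IDLE


def available_ranks(ranks):
    """Return the indices of ranks that arbitration may pick from."""
    return [r.index for r in ranks if r.state is RefreshState.IDLE]
```

With a channel-wide refresh every rank would be in `REFRESHING` at once and the returned list would be empty; with independent states at least some ranks can usually still be served.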
10617:471d390943f0 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Fix a bug in the DRAM controller arbitration

Fix a minor issue that affects multi-rank systems.
10561:e1a853349529 Tue Dec 02 06:07:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add a GDDR5 DRAM config

This patch adds a first cut GDDR5 config to accommodate the users
combining gem5 and GPUSim. The config is based on a SK Hynix
datasheet, and the Nvidia GTX580 specification. Someone from the
GPUSim user-camp should tweak the default page-policy and static
frontend and backend latencies.
10509:d5554f97c451 Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> arm, mem: Fix drain bug and provide drain prints for more components.
10492:59f9f18aae0c Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Fix DRAM activationLimit bug

Ensure that we do the proper event scheduling also when the activation
limit is disabled.
10489:99d59caa4c8f Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Add DRAM device size and check against config

This patch adds the size of the DRAM device to the DRAM config. It
also compares the actual DRAM size (calculated using information from
the config) to the size defined in the system. If these two values do
not match gem5 will print a warning. In order to do correct DRAM
research the size of the memory defined in the system should match the
size of the DRAM in the config. The timing and current parameters
found in the DRAM configs are defined for a DRAM device with a
specific size and would differ for another device with a different
size.
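
The size check described above amounts to simple arithmetic. The following sketch assumes the capacity is the product of device size, devices per rank, ranks per channel and channel count; the function and parameter names are chosen for illustration and are not claimed to match the gem5 code.

```python
import warnings


def dram_capacity(device_size, devices_per_rank, ranks_per_channel, channels=1):
    """Total DRAM capacity in bytes implied by the (assumed) config values."""
    return device_size * devices_per_rank * ranks_per_channel * channels


def capacity_matches(system_size, **dram_config):
    """Warn and return False if the config capacity differs from the system size."""
    actual = dram_capacity(**dram_config)
    if actual != system_size:
        warnings.warn(
            f"DRAM config provides {actual} bytes, "
            f"but the system defines {system_size} bytes"
        )
        return False
    return True
```

For example, eight 512 MiB devices per rank and two ranks per channel give 8 GiB on a single channel, so a system defined with 4 GiB would trigger the warning.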
10466:73b7549d979e Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Dynamically determine page bytes in memory components

This patch takes a step towards an ISA-agnostic memory
system by enabling the components to establish the page size after
instantiation. The swap operation in the memory is now also allowing
any granularity to avoid depending on the IntReg of the ISA.
10432:da98b90b5df0 Tue Jul 29 12:22:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: DRAMPower integration for on-line DRAM power stats

This patch takes the final step in integrating DRAMPower and adds the
appropriate calls in the DRAM controller to provide the command trace
and extract the power and energy stats. The debug printouts are still
left in place, but will eventually be removed.

At the moment the DRAM power calculation is always on when using the
DRAM controller model. The run-time impact of this addition is around
1.5% when looking at the total host seconds of the regressions. We
deem this a sensible trade-off to avoid the complication of adding an
enable/disable mechanism.
dram_ctrl.hh
10619:6dd27a0e0d23 Tue Dec 23 09:31:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Ensure DRAM controller is idle when in atomic mode

This patch addresses an issue seen with the KVM CPU where the refresh
events scheduled by the DRAM controller force the simulator to switch
out of the KVM mode, thus killing performance.

The current patch works around the fact that we currently have no
proper API to inform a SimObject of the mode switches. Instead we rely
on drainResume being called after any switch, and cache the previous
mode locally to be able to decide on appropriate actions.

The switcheroo regression requires a minor stats bump as a result.
10618:bb665366cc00 Tue Dec 23 09:31:00 EST 2014 Omar Naji <Omar.Naji@arm.com> mem: Add rank-wise refresh to the DRAM controller

This patch adds rank-wise refresh to the controller, as opposed to the
channel-wide refresh currently in place. In essence each rank can be
refreshed independently, and for this to be possible the controller
is extended with a state machine per rank.

Without this patch the data bus is always idle during a refresh, as
all the ranks are refreshing at the same time. With the rank-wise
refresh it is possible to use one rank while another one is
refreshing, and thus the data bus can be kept busy.

The patch introduces a Rank class to encapsulate the state per rank,
and also shifts all the relevant banks, activation tracking etc to the
rank. The arbitration is also updated to consider the state of the rank.
10489:99d59caa4c8f Mon Oct 20 18:03:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: Add DRAM device size and check against config

This patch adds the size of the DRAM device to the DRAM config. It
also compares the actual DRAM size (calculated using information from
the config) to the size defined in the system. If these two values do
not match gem5 will print a warning. In order to do correct DRAM
research the size of the memory defined in the system should match the
size of the DRAM in the config. The timing and current parameters
found in the DRAM configs are defined for a DRAM device with a
specific size and would differ for another device with a different
size.
10432:da98b90b5df0 Tue Jul 29 12:22:00 EDT 2014 Omar Naji <Omar.Naji@arm.com> mem: DRAMPower integration for on-line DRAM power stats

This patch takes the final step in integrating DRAMPower and adds the
appropriate calls in the DRAM controller to provide the command trace
and extract the power and energy stats. The debug printouts are still
left in place, but will eventually be removed.

At the moment the DRAM power calculation is always on when using the
DRAM controller model. The run-time impact of this addition is around
1.5% when looking at the total host seconds of the regressions. We
deem this a sensible trade-off to avoid the complication of adding an
enable/disable mechanism.
10394:70cfafa17653 Sat Sep 20 17:18:00 EDT 2014 Wendy Elsasser <wendy.elsasser@arm.com> mem: Add DDR4 bank group timing

Added the following parameter to the DRAMCtrl class:
- bank_groups_per_rank

This defaults to 1. For the DDR4 case, the default is overridden to indicate
bank group architecture, with multiple bank groups per rank.

Added the following delays to the DRAMCtrl class:
- tCCD_L : CAS-to-CAS, same bank group delay
- tRRD_L : RAS-to-RAS, same bank group delay

These parameters are only applied when bank group timing is enabled. Bank
group timing is currently enabled only for DDR4 memories.

For all other memories, these delays will default to '0 ns'

In the DRAM controller model, applied the bank group timing to the per bank
parameters actAllowedAt and colAllowedAt.
The actAllowedAt will be updated based on bank group when an ACT is issued.
The colAllowedAt will be updated based on bank group when a RD/WR burst is
issued.

At the moment no modifications are made to the scheduling.
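
As a hedged illustration of the per-bank updates described above: the sketch assumes that banks in the same bank group as the issued command get the longer `_L` delay, while the other banks get a shorter delay (`t_rrd` and `t_burst` here, both assumed names). Other constraints such as tRC are deliberately left out, and the `Bank` class is a stand-in, not the controller's own.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    """Minimal stand-in for the per-bank timing state."""
    bank_group: int
    act_allowed_at: int = 0
    col_allowed_at: int = 0


def on_activate(banks, issued, now, t_rrd, t_rrd_l):
    """Push out actAllowedAt after an ACT to banks[issued] at time now."""
    group = banks[issued].bank_group
    for bank in banks:
        delay = t_rrd_l if bank.bank_group == group else t_rrd
        bank.act_allowed_at = max(bank.act_allowed_at, now + delay)


def on_burst(banks, issued, now, t_burst, t_ccd_l):
    """Push out colAllowedAt after a RD/WR burst to banks[issued]."""
    group = banks[issued].bank_group
    for bank in banks:
        delay = t_ccd_l if bank.bank_group == group else t_burst
        bank.col_allowed_at = max(bank.col_allowed_at, now + delay)
```

Using `max` keeps any later constraint that is already recorded for a bank, so the bank group delay only ever tightens the schedule.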
10393:0fafa62b6c01 Sat Sep 20 17:17:00 EDT 2014 Wendy Elsasser <wendy.elsasser@arm.com> mem: Add memory rank-to-rank delay

Add the following delay to the DRAM controller:
- tCS : Different rank bus turnaround delay

This will be applied for
1) read-to-read,
2) write-to-write,
3) write-to-read, and
4) read-to-write
command sequences, where the new command accesses a different rank
than the previous burst.

The delay defaults to 2*tCK for each defined memory class. Note that
this does not correspond to one particular timing constraint, but is a
way of modelling all the associated constraints.

The DRAM controller has some minor changes to prioritize commands to
the same rank. This prioritization will only occur when the command
stream is not switching from a read to write or vice versa (in the
case of switching we have a gap in any case).

To prioritize commands to the same rank, the model will determine if there are
any commands queued (same type) to the same rank as the previous command.
This check will ensure that the 'same rank' command will be able to execute
without adding bubbles to the command flow, e.g. any ACT delay requirements
can be done under the hood, allowing the burst to issue seamlessly.
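
The two ideas in this message, the extra delay on a rank switch and the preference for commands to the same rank, can be sketched as below. The function names are hypothetical, times are plain integers (for instance picoseconds), and the sketch omits the check that the same-rank command can issue without bubbles.

```python
def burst_start(prev_burst_end, prev_rank, next_rank, t_ck):
    """Earliest start of the next burst, adding tCS on a rank switch."""
    t_cs = 2 * t_ck  # default stated in the commit message
    return prev_burst_end + (t_cs if next_rank != prev_rank else 0)


def choose_next(queue_ranks, prev_rank, switching_direction):
    """Pick the index of the next command from a non-empty queue.

    queue_ranks lists the target rank of each queued command in order.
    Same-rank commands are preferred only when the stream is not
    switching between reads and writes.
    """
    if not switching_direction:
        for i, rank in enumerate(queue_ranks):
            if rank == prev_rank:
                return i
    return 0
```

Preferring the same rank avoids paying `t_cs` at all, which is why the prioritization is skipped when a read/write switch already introduces a gap.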
10287:4966471a1ba1 Tue Aug 26 10:13:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Update DRAM controller comments

Update comments and add a reference for more information.
10286:e95a0ab1d368 Tue Aug 26 10:12:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Fix address interleaving bug in DRAM controller

This patch fixes a bug in the DRAM controller address decoding. In
cases where the DRAM burst size (e.g. 32 bytes in a rank with a single
LPDDR3 x32) was smaller than the channel interleaving size
(e.g. systems with a 64-byte cache line) one address bit effectively
got used as a channel bit when it should have been a low-order column
bit.

This patch adds a notion of "columns per stripe", and more clearly
deals with the low-order column bits and high-order column bits. The
patch also relaxes the granularity check such that it is possible to
use interleaving granularities other than the cache line size.

The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as
it is only used in the debug build for now.
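
The "columns per stripe" notion can be made concrete with a small sketch. It assumes a simple layout in which the low-order column bits sit below the channel bits and everything else sits above; the function name and defaults are illustrative and do not reproduce the controller's full address mapping.

```python
def split_address(addr, burst_size=32, interleave_size=64, channels=2):
    """Split an address into (low column, channel, remaining upper bits).

    columns_per_stripe is the number of DRAM bursts that fit in one
    channel-interleaving stripe.
    """
    columns_per_stripe = interleave_size // burst_size
    burst_index = addr // burst_size
    low_column = burst_index % columns_per_stripe
    stripe_index = burst_index // columns_per_stripe
    channel = stripe_index % channels
    upper = stripe_index // channels  # high-order column bits and beyond
    return low_column, channel, upper
```

With a 32-byte burst and 64-byte interleaving, address 32 stays on channel 0 and selects the second column of the stripe, whereas treating that bit as a channel bit would wrongly send it to channel 1.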
10247:0ad233f0a77d Mon Jun 30 13:56:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: DRAMPower trace output

This patch adds a DRAMPower flag to enable off-line DRAM power
analysis using the DRAMPower tool. A new DRAMPower flag is added
and a follow-on patch adds a Python script to post-process the output
and order it based on time stamps.

The long-term goal is to link DRAMPower as a library and provide the
commands through function calls to the model rather than first
printing and then parsing the commands. At the moment it is also up to
the user to ensure that the same DRAM configuration is used by the
gem5 controller model and DRAMPower.
10246:e0e3efe3b1d5 Mon Jun 30 13:56:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Add bank and rank indices as fields to the DRAM bank

This patch adds the index of the bank and rank as a field so that we can
determine the identity of a given bank (reference or pointer) for the
power tracing. We also grab the opportunity of cleaning up the
arguments used for identifying the bank when activating.
/gem5/src/cpu/minor/
SConscript
10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark      | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt)  | simple | o3     | minor
               | timing | timing | timing
---------------+--------+--------+--------
10.linux-boot  |    169 |   1883 |   1075
10.mcf         |    117 |    967 |    491
20.parser      |    668 |   6315 |   3146
30.eon         |    542 |   3413 |   2414
40.perlbmk     |   2339 |  20905 |  11532
50.vortex      |    122 |   1094 |    588
60.bzip2       |   2045 |  18061 |   9662
70.twolf       |    207 |   2736 |   1036
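
To put the table in relative terms, the o3 and minor host_seconds columns can be divided row by row. The numbers below are copied from the table; the variable names are only illustrative.

```python
# host_seconds from the table above: benchmark -> (o3 timing, minor timing)
HOST_SECONDS = {
    "10.linux-boot": (1883, 1075),
    "10.mcf": (967, 491),
    "20.parser": (6315, 3146),
    "30.eon": (3413, 2414),
    "40.perlbmk": (20905, 11532),
    "50.vortex": (1094, 588),
    "60.bzip2": (18061, 9662),
    "70.twolf": (2736, 1036),
}

# Ratio above 1.0 means the minor run took less host time than the o3 run.
speedup = {name: o3 / minor for name, (o3, minor) in HOST_SECONDS.items()}

for name, ratio in sorted(speedup.items(), key=lambda item: item[1]):
    print(f"{name:14s} o3/minor = {ratio:.2f}")
```

On these samples the ratio ranges from roughly 1.4 (30.eon) to roughly 2.6 (70.twolf).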
activity.hh
10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark      | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt)  | simple | o3     | minor
               | timing | timing | timing
---------------+--------+--------+--------
10.linux-boot  |    169 |   1883 |   1075
10.mcf         |    117 |    967 |    491
20.parser      |    668 |   6315 |   3146
30.eon         |    542 |   3413 |   2414
40.perlbmk     |   2339 |  20905 |  11532
50.vortex      |    122 |   1094 |    588
60.bzip2       |   2045 |  18061 |   9662
70.twolf       |    207 |   2736 |   1036
trace.hh
10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark      | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt)  | simple | o3     | minor
               | timing | timing | timing
---------------+--------+--------+--------
10.linux-boot  |    169 |   1883 |   1075
10.mcf         |    117 |    967 |    491
20.parser      |    668 |   6315 |   3146
30.eon         |    542 |   3413 |   2414
40.perlbmk     |   2339 |  20905 |  11532
50.vortex      |    122 |   1094 |    588
60.bzip2       |   2045 |  18061 |   9662
70.twolf       |    207 |   2736 |   1036
/gem5/src/cpu/o3/
SConsopts
10319:4207f9bfcceb Wed Sep 03 07:42:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> arch, cpu: Factor out the ExecContext into a proper base class

We currently generate and compile one version of the ISA code per CPU
model. This is obviously wasting a lot of resources at compile
time. This changeset factors out the interface into a separate
ExecContext class, which also serves as documentation for the
interface between CPUs and the ISA code. While doing so, this
changeset also fixes up interface inconsistencies between the
different CPU models.

The main argument for using one set of ISA code per CPU model has
always been performance as this avoids indirect branches in the
generated code. However, this argument does not hold water. Booting
Linux on a simulated ARM system running in atomic mode
(opt/10.linux-boot/realview-simple-atomic) is actually 2% faster
(compiled using clang 3.4) after applying this patch. Additionally,
compilation time is decreased by 35%.
/gem5/src/cpu/simple/
SConsopts
10319:4207f9bfcceb Wed Sep 03 07:42:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> arch, cpu: Factor out the ExecContext into a proper base class

We currently generate and compile one version of the ISA code per CPU
model. This is obviously wasting a lot of resources at compile
time. This changeset factors out the interface into a separate
ExecContext class, which also serves as documentation for the
interface between CPUs and the ISA code. While doing so, this
changeset also fixes up interface inconsistencies between the
different CPU models.

The main argument for using one set of ISA code per CPU model has
always been performance as this avoids indirect branches in the
generated code. However, this argument does not hold water. Booting
Linux on a simulated ARM system running in atomic mode
(opt/10.linux-boot/realview-simple-atomic) is actually 2% faster
(compiled using clang 3.4) after applying this patch. Additionally,
compilation time is decreased by 35%.
/gem5/src/doc/
inside-minor.doxygen
10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark      | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt)  | simple | o3     | minor
               | timing | timing | timing
---------------+--------+--------+--------
10.linux-boot  |    169 |   1883 |   1075
10.mcf         |    117 |    967 |    491
20.parser      |    668 |   6315 |   3146
30.eon         |    542 |   3413 |   2414
40.perlbmk     |   2339 |  20905 |  11532
50.vortex      |    122 |   1094 |    588
60.bzip2       |   2045 |  18061 |   9662
70.twolf       |    207 |   2736 |   1036
/gem5/src/mem/slicc/ast/
TransitionDeclAST.py
10165:7e9edf4297a9 Sat Apr 19 10:00:00 EDT 2014 Nilay Vaish <nilay@cs.wisc.edu> ruby: slicc: slight change to rule for transitions
It had an unnecessary pairs token which is being removed.
/gem5/src/sim/
global_event.cc
10361:280cc9b0794f Tue Sep 09 04:36:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> sim: Fix resource leak in BaseGlobalEvent

Static analysis revealed that BaseGlobalEvent::barrier was never
deallocated. This changeset solves this leak by making the barrier
allocation a part of the BaseGlobalEvent instead of storing a pointer
to a separate heap-allocated barrier.
/gem5/src/unittest/
symtest.cc
10055:6153b582c9b5 Thu Jan 30 01:21:00 EST 2014 Ola Jeppsson <ola.jeppsson@gmail.com> unittest: Fix build errors

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
/gem5/ext/mcpat/
bus_interconnect.cc
10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes

This patch includes software engineering changes and some generic bug fixes
Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known
issues/concerns we did not have a chance to address in this patch.

High-level changes in this patch include:
1) Making XML parsing modular and hierarchical:
   - Shift parsing responsibility into the components
   - Read XML in a (mostly) context-free recursive manner so that McPAT input
     files can contain arbitrary component hierarchies
2) Making power, energy, and area calculations a hierarchical and recursive
   process
   - Components track their subcomponents and recursively call compute
     functions in stages
   - Make C++ object hierarchy reflect inheritance of classes of components
     with similar structures
   - Simplify computeArea() and computeEnergy() functions to eliminate
     successive calls to calculate separate TDP vs. runtime energy
   - Remove Processor component (now unnecessary) and introduce a more abstract
     System component
3) Standardizing McPAT output across all components
   - Use a single, common data structure for storing and printing McPAT output
   - Recursively call print functions through component hierarchy
4) For caches, allow splitting data array and tag array reads and writes for
   better accuracy
5) Improving the usability of CACTI by printing more helpful warning and error
   messages
6) Minor: Impose more rigorous code style for clarity (more work still to be
   done)

Overall, these changes greatly reduce the amount of replicated code, and they
improve McPAT runtime and decrease memory footprint.
bus_interconnect.h
10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes

This patch includes software engineering changes and some generic bug fixes
Joel Hestness and Yasuko Eckert made to McPAT 0.8. There are still known
issues/concerns we did not have a chance to address in this patch.

High-level changes in this patch include:
1) Making XML parsing modular and hierarchical:
   - Shift parsing responsibility into the components
   - Read XML in a (mostly) context-free recursive manner so that McPAT input
     files can contain arbitrary component hierarchies
2) Making power, energy, and area calculations a hierarchical and recursive
   process
   - Components track their subcomponents and recursively call compute
     functions in stages
   - Make C++ object hierarchy reflect inheritance of classes of components
     with similar structures
   - Simplify computeArea() and computeEnergy() functions to eliminate
     successive calls to calculate separate TDP vs. runtime energy
   - Remove Processor component (now unnecessary) and introduce a more abstract
     System component
3) Standardizing McPAT output across all components
   - Use a single, common data structure for storing and printing McPAT output
   - Recursively call print functions through component hierarchy
4) For caches, allow splitting data array and tag array reads and writes for
   better accuracy
5) Improving the usability of CACTI by printing more helpful warning and error
   messages
6) Minor: Impose more rigorous code style for clarity (more work still to be
   done)

Overall, these changes greatly reduce the amount of replicated code, and they
improve McPAT runtime and decrease memory footprint.
cachearray.cc 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
cachearray.h 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
cachecontroller.cc 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
cachecontroller.h 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
cacheunit.cc 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
cacheunit.h 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
common.h 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
system.cc 10234:5cb711fa6176 Tue Jun 03 16:32:00 EDT 2014 Yasuko Eckert <yasuko.eckert@amd.com> ext: McPAT interface changes and fixes
