#
13876:1643f200987c |
|
27-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
config: Add flag options to set the hardware prefetchers to use
This patch adds three flag options to set the prefetcher class of the L1i cache, L1d cache and L2 cache.
Change-Id: I310fcd9c49f9554d98cd565a32bdb96a3e165486 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17709 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13811:88827de8fced |
|
26-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
config: Use the corresponding HPI Caches when using the HPI cpu
The HPI cpu comes with specific cache definitions, but they are ignored when using this cpu. This patch solves this in the same way it is done for the O3_ARM_v7a cpu.
Change-Id: Iabf763291099d9508e3c5eac00b1e233cb38ce6b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17708 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13806:e2ca4f169e82 |
|
23-Mar-2019 |
Javier Bueno <javier.bueno@metempsy.com> |
configs: fix class reference in CacheConfigs
One reference was not properly updated when changing to absolute import paths
Change-Id: Idf330487d5d08d92ebb4489f16d75429f882bd7a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17541 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13774:a1be2a0c55f2 |
|
25-Feb-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
configs: Use absolute import paths
Use absoluate import paths to be Python 3 compatible. This also imports absolute_import from __future__ to ensure that Python 2.7 behaves the same way as Python 3.
Change-Id: Ica06ed95814e9cd3e768b3e1785075e36f6e56d0 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16708 Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
#
13731:67cd980cb20f |
|
26-Jan-2019 |
Andreas Sandberg <andreas.sandberg@arm.com> |
configs: Fix Python 3 iterator and exec compatibility issues
Python 2.7 used to return lists for operations such as map and range, this has changed in Python 3. To make the configs Python 3 compliant, add explicit conversions from iterators to lists where needed, replace xrange with range, and fix changes to exec syntax.
This change doesn't fix import paths since that might require us to restructure the configs slightly.
Change-Id: Idcea8482b286779fc98b4e144ca8f54069c08024 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16002 Reviewed-by: Gabe Black <gabeblack@google.com>
|
#
12564:2778478ca882 |
|
06-Mar-2018 |
Gabe Black <gabeblack@google.com> |
config: Switch from the print statement to the print function.
Change-Id: I701fa58cfcfa2767ce9ad24da314a053889878d0 Reviewed-on: https://gem5-review.googlesource.com/8762 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
12097:77a3d2890ba6 |
|
26-Jun-2017 |
Andreas Sandberg <andreas.sandberg@arm.com> |
config: Move core timing models to config/common/cores
Change-Id: I189b6462cc64f7cc6c1b7a6c2af1abb60e1854de Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3943 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
12014:f973caaf935d |
|
08-May-2017 |
Gabe Black <gabeblack@google.com> |
config: Fix up some configs to not use CPU aliases.
Support for CPU aliases were removed recently.
Change-Id: I3c1173dc34170d8639d95e52bf660f248848f77f Reviewed-on: https://gem5-review.googlesource.com/3100 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
#
11604:b254396b7759 |
|
12-Aug-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add snoop filter to SystemXBar by default
This patch changes the default behaviour of the SystemXBar, adding a snoop filter. With the recent updates to the snoop filter allocation behaviour this change no longer causes problems for the regressions without caches.
Change-Id: Ibe0cd437b71b2ede9002384126553679acc69cc1 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
|
#
11539:de57dbf319d0 |
|
20-Jun-2016 |
Andreas Hansson <andreas.hansson@arm.com> |
config: Fix omission of walker cache in config scripts
This patch ensures a walker cache is instantiated if specfied.
Change-Id: I2c6b4bf3454d56bb19558c73b406e1875acbd986 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
|
#
11501:9345c4320477 |
|
27-May-2016 |
Stephan Diestelhorst <stephan.diestelhorst@arm.com> |
mem, config: Selective use of snoop filter
Disable the default snoop filter in the SystemXBar so that the typical membus does not have a snoop filter by default. Instead, add the snoop filter only when there are caches added to the system (with the caches / l2cache options).
The underlying problem is that the snoop filter grows without bounds (for now) if there are no caches to tell it that lines have been evicted. This causes slow regression runs for all the atomic regressions. This patch fixes this behaviour.
|
#
11320:42ecb523c64a |
|
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'.
|
#
11251:a15c86af004a |
|
07-Dec-2015 |
Radhika Jagtap <radhika.jagtap@ARM.com> |
config: Enable elastic trace capture and replay in se/fs
This patch adds changes to the configuration scripts to support elastic tracing and replay.
The patch adds a command line option to enable elastic tracing in SE mode and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out both instruction fetch and data dependency traces. The patch also enables configuring the TraceCPU to replay traces using the SE and FS script.
The replay run is designed to resume from checkpoint using atomic cpu to restore state keeping it consistent with FS run flow. It then switches to TraceCPU to replay the input traces.
|
#
10884:c60acdbdd6ad |
|
03-Jul-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Allow read-only caches and check compliance
This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data.
A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.
|
#
10780:46070443051e |
|
08-Apr-2015 |
Curtis Dunham <Curtis.Dunham@arm.com> |
config: Support full-system with SST's memory system
This patch adds an example configuration in ext/sst/tests/ that allows an SST/gem5 instance to simulate a 4-core AArch64 system with SST's memHierarchy components providing all the caches and memories.
|
#
10720:67b3e74de9ae |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Move crossbar default latencies to subclasses
This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios.
Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden.
|
#
10613:9d0aef7a9b2e |
|
23-Dec-2014 |
Marco Elver <Marco.Elver@ARM.com> |
config: Add --memchecker option
This patch adds the --memchecker option, to denote that a MemChecker should be instantiated for the system. The exact usage of the MemChecker depends on the system configuration.
For now CacheConfig.py makes use of the option, adding MemCheckerMonitor instances between CPUs and D-Caches.
Note, however, that currently this only provides limited checking on a running system; other parts of the system, such as I/O devices are not monitored, and may cause warnings to be issued by the monitor.
|
#
10405:7a618c07e663 |
|
20-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus.
As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet.
|
#
9815:3b3b94536547 |
|
18-Jul-2013 |
Andreas Hansson <andreas.hansson> |
config: Update script to set cache line size on system
This patch changes the config scripts such that they do not set the cache line size per cache instance, but rather for the system as a whole.
|
#
9793:6e6cefc1db1f |
|
27-Jun-2013 |
Akash Bagdia <akash.bagdia@arm.com> |
sim: Add the notion of clock domains to all ClockedObjects
This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains.
The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead).
The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains.
All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain.
The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
|
#
9789:233420718e61 |
|
27-Jun-2013 |
Akash Bagdia <akash.bagdia@arm.com> |
config: Add a CPU clock command-line option
This patch adds a 'cpu_clock' command-line option and uses the value to assign clocks to components running at the CPU speed (L1 and L2 including the L2-bus). The configuration scripts are updated accordingly.
The 'clock' option is left unchanged in this patch as it is still used by a number of components. In follow-on patches the latter will be disambiguated further.
|
#
9522:9290a0198c50 |
|
15-Feb-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
config: Remove O3 dependencies
The default cache configuration script currently import the O3_ARM_v7a model configuration, which depends on the O3 CPU. This breaks if gem5 has been compiled without O3 support. This changeset removes the dependency by only importing the model if it is requested by the user. As a bonus, it actually removes some code duplication in the configuration scripts.
|
#
9284:f4ff625eae56 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Regression: Use CPU clock and 32-byte width for L1-L2 bus
This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intending to fit every single scenario, but rather serve as a better startingpoint than what we previously had.
Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change.
A separate patch will update the stats.
|
#
9036:6385cf85bf12 |
|
31-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses.
A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses.
A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect.
The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses.
A bit of minor tidying up has also been done.
|
#
8863:50ce4deacda9 |
|
01-Mar-2012 |
Nilay Vaish <nilay@cs.wisc.edu> |
x86: Fix switching of CPUs This patch prevents creation of interrupt controller for cpus that will be switched in later
|
#
8846:2eaf1809c6c6 |
|
14-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Script: Fix the scripts that use the num_cpus cache parameter
This patch merely removes the use of the num_cpus cache parameter which no longer exists after the introduction of the masterIds. The affected scripts fail when trying to set the parameter. Note that this patch does not update the regression stats.
|
#
8839:eeb293859255 |
|
13-Feb-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port.
Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves.
|
#
8724:7b4d80b26e35 |
|
26-Jan-2012 |
Ronald Dreslinski <rdreslin@umich.edu> |
configs: A more realistic configuration of an ARM-like processor
|
#
8057:5a8208fa1600 |
|
23-Feb-2011 |
Korey Sewell <ksewell@umich.edu> |
configs: cache: add cache line size option
|
#
8056:8fe2d7ff1111 |
|
23-Feb-2011 |
Korey Sewell <ksewell@umich.edu> |
configs: set default cache params It's confusing (especially to new users), when you are setting some standard parameters (as defined in Options.py) and they aren't reflected in the simulations so we might as well link the settings in CacheConfig.py to those in Options.py
|
#
7876:189b9b258779 |
|
03-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
Config: Keep track of uncached and cached ports separately.
This makes sure that the address ranges requested for caches and uncached ports don't conflict with each other, and that accesses which are always uncached (message signaled interrupts for instance) don't waste time passing through caches.
|
#
7868:6029008db669 |
|
01-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add L1 caches for the TLB walkers.
Small L1 caches are connected to the TLB walkers when caches are used. This allows them to participate in the coherence protocol properly.
|
#
6981:aba5f7216636 |
|
25-Feb-2010 |
Lisa Hsu <Lisa.Hsu@amd.com> |
configs: pull out cache configuration code from se.py and fs.py. Most of these frontend configurations share cache configuration code, pull it out so that changes to caches don't have to require changing multiple config files.
|