History log of /gem5/tests/configs/o3-timing-mp.py
Revision Date Author Comments
# 11837:17b37f38944a 14-Feb-2017 Wendy Elsasser <wendy.elsasser@arm.com>

mem: Update DRAM configuration names

Names of DRAM configurations were updated to reflect both
the channel and device data width.

Previous naming format was:
<DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>

The following nomenclature is now used:
<DEVICE_TYPE>_<DATA_RATE>_<n>x<w>
where n = The number of devices per rank on the channel
x = Device width

Total channel width can be calculated by n*w

Example:
A 64-bit DDR4, 2400 channel consisting of 4-bit devices:
n = 16
w = 4
The resulting configuration name is:
DDR4_2400_16x4

Updated scripts to match new naming convention.

Added unique configurations for DDR4 for:
1) 16x4
2) 8x8
3) 4x16

Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>


# 9792:c02004c2cc5b 27-Jun-2013 Andreas Hansson <andreas.hansson@arm.com>

config: Add a BaseSESystem builder for re-use in regressions

This patch extends the existing system builders to also include a
syscall-emulation builder. This builder is deployed in all
syscall-emulation regressions that do not involve Ruby,
i.e. o3-timing, simple-timing and simple-atomic, as well as the
multi-processor regressions o3-timing-mp, simple-timing-mp and
simple-atomic-mp (the latter are only used by SPARC at this point).

The values chosen for the cache sizes match those that were used in
the existing config scripts (despite being on the large
side). Similarly, a mem_class parameter is added to the builder base
class to enable simple-atomic to use SimpleMemory and o3-timing to use
the default DDR3 configuration.

Due to the different order the ports are connected, the bus stats get
shuffled around for the multi-processor regressions. A separate patch
bumps the port indices. Besides this, all behaviour is exactly the
same.


# 9790:ccc428657233 27-Jun-2013 Akash Bagdia <akash.bagdia@arm.com>

config: Add a system clock command-line option

This patch adds a 'sys_clock' command-line option and use it to assign
clocks to the system during instantiation.

As part of this change, the default clock in the System class is
removed and whenever a system is instantiated a system clock value
must be set. A default value is provided for the command-line option.

The configs and tests are updated accordingly.


# 9728:7daeab1685e9 30-May-2013 Andreas Hansson <andreas.hansson@arm.com>

mem: More descriptive DRAM config names

This patch changes the class names of the variuos DRAM configurations
to better reflect what memory they are based on. The speed and
interface width is now part of the name, and also the alias that is
used to select them on the command line.

Some minor changes are done to the actual parameters, to better
reflect the named configurations. As a result of these changes the
regressions change slightly and the stats will be bumped in a separate
patch.


# 9489:172dbcb74a0e 31-Jan-2013 Andreas Hansson <andreas.hansson@arm.com>

mem: Add DDR3 and LPDDR2 DRAM controller configurations

This patch moves the default DRAM parameters from the SimpleDRAM class
to two different subclasses, one for DDR3 and one for LPDDR2. More can
be added as we go forward.

The regressions that previously used the SimpleDRAM are now using
SimpleDDR3 as this is the most similar configuration.


# 9381:ffec48040ac1 07-Jan-2013 Ali Saidi <Ali.Saidi@ARM.com>

tests: Always specify memory mode in every test system.

Previous to this change we didn't always set the memory mode which worked as
long as we never attempted to switch CPUs or checked that a CPU was in a
memory system with the correct mode. Future changes will make CPUs verify
that they're operating in the correct mode and thus we need to always set it.


# 9315:2e00867b5001 26-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

config: Fix the cache class naming in regression scripts

This patch unifies the naming of the default L1 and L2 caches in the
regression configs to be in line with what is used in the se and fs
scripts.


# 9311:227d19399b51 25-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

config: Use SimpleDRAM in full-system, and with o3 and inorder

This patch favours using SimpleDRAM with the default timing instead of
SimpleMemory for all regressions that involve the o3 or inorder CPU,
or are full system (in other words, where the actual performance of
the memory is important for the overall performance).

Moving forward, the solution for FSConfig and the users of fs.py and
se.py is probably something similar to what we use to choose the CPU
type. I envision a few pre-set configurations SimpleLPDDR2,
SimpleDDR3, etc that can be choosen by a dram_type option. Feedback on
this part is welcome.

This patch changes plenty stats and adds all the DRAM controller
related stats. A follow-on patch updates the relevant statistics. The
total run-time for the entire regression goes up with ~5% with this
patch due to the added complexity of the SimpleDRAM model. This is a
concious trade-off to ensure that the model is properly tested.


# 9310:aa7bf10e822a 25-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

config: Use shared cache config for regressions

This patch uses the common L1, L2 and IOCache configuration for the
regressions that all share the same cache parameters. There are a few
regressions that use a slightly different configuration (memtest,
o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter
are not changed in this patch. They will be updated in a future patch.

The common cache configurations are changed to match the ones used in
the regressions, and are slightly changed with respect to what they
were. Hopefully this means we can converge on a common base
configuration, used both in the normal user configurations and
regressions.

As only regressions that shared the same cache configuration are
updated, no regressions are affected.


# 9288:3d6da8559605 15-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

Mem: Use cycles to express cache-related latencies

This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.

The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.

As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.


# 9263:066099902102 25-Sep-2012 Mrinmoy Ghosh <mrinmoy.ghosh@arm.com>

Cache: add a response latency to the caches

In the current caches the hit latency is paid twice on a miss. This patch lets
a configurable response latency be set of the cache for the backward path.


# 9036:6385cf85bf12 31-May-2012 Andreas Hansson <andreas.hansson@arm.com>

Bus: Split the bus into a non-coherent and coherent bus

This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.


# 8931:7a1dfb191e3f 06-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Enable multiple distributed generalized memories

This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.


# 8876:44f8e7bb7fdf 02-Mar-2012 Andreas Hansson <andreas.hansson@arm.com>

CPU: Check that the interrupt controller is created when needed

This patch adds a creation-time check to the CPU to ensure that the
interrupt controller is created for the cases where it is needed,
i.e. if the CPU is not being switched in later and not a checker CPU.

The patch also adds the "createInterruptController" call to a number
of the regression scripts.


# 8839:eeb293859255 13-Feb-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Introduce the master/slave port roles in the Python classes

This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.


# 8833:2870638642bd 12-Feb-2012 Dam Sunwoo <dam.sunwoo@arm.com>

mem: fix cache stats to use request ids correctly

This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.


# 8801:1a84c6a81299 28-Jan-2012 Gabe Black <gblack@eecs.umich.edu>

SE/FS: Make SE vs. FS mode a runtime parameter.


# 8706:b1838faf3bcc 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Add port proxies instead of non-structural ports

Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy


# 8631:8c038d4cd210 01-Dec-2011 Chander Sudanthi <chander.sudanthi@arm.com>

O3: Remove hardcoded tgts_per_mshr in O3CPU.py.

There are two lines in O3CPU.py that set the dcache and icache
tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr.
This patch removes these hardcoded lines from O3CPU.py and sets the default
L1 cache mshr targets to 20.


# 8134:b01a51ff05fa 17-Mar-2011 Ali Saidi <Ali.Saidi@ARM.com>

Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.

This change fixes the problem for all the cases we actively use. If you want to try
more creative I/O device attachments (E.g. sharing an L2), this won't work. You
would need another level of caching between the I/O device and the cache
(which you actually need anyway with our current code to make sure writes
propagate). This is required so that you can mark the cache in between as
top level and it won't try to send ownership of a block to the I/O device.
Asserts have been added that should catch any issues.


# 7876:189b9b258779 03-Feb-2011 Gabe Black <gblack@eecs.umich.edu>

Config: Keep track of uncached and cached ports separately.

This makes sure that the address ranges requested for caches and uncached ports
don't conflict with each other, and that accesses which are always uncached
(message signaled interrupts for instance) don't waste time passing through
caches.


# 6978:ab05e20dc4a7 23-Feb-2010 Lisa Hsu <Lisa.Hsu@amd.com>

cache: Make caches sharing aware and add occupancy stats.
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode in order to not only
avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas
break when a bucket is 0).


# 6654:4c84e771cca7 22-Sep-2009 Nathan Binkert <nate@binkert.org>

python: Move more code into m5.util allow SCons to use that code.
Get rid of misc.py and just stick misc things in __init__.py
Move utility functions out of SCons files and into m5.util
Move utility type stuff from m5/__init__.py to m5/util/__init__.py
Remove buildEnv from m5 and allow access only from m5.defines
Rename AddToPath to addToPath while we're moving it to m5.util
Rename read_command to readCommand while we're moving it
Rename compare_versions to compareVersions while we're moving it.


# 4876:a18cedc19da5 30-Jun-2007 Steve Reinhardt <stever@eecs.umich.edu>

Get rid of remaining traces of obsolete CoherenceProtocol object.


# 4444:0648bdc8d1c9 10-May-2007 Ali Saidi <saidi@eecs.umich.edu>

remove hit_latency and make latency do the right thing
set the latency parameter in terms of a latency
add caches to tsunami-simple configs

configs/common/Caches.py:
tests/configs/memtest.py:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-timing-mp.py:
tests/configs/simple-timing.py:
set the latency parameter in terms of a latency
configs/common/FSConfig.py:
give the bridge a default latency too
src/mem/cache/cache_builder.cc:
src/python/m5/objects/BaseCache.py:
remove hit_latency and make latency do the right thing
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
add caches to tsunami-simple configs


# 4390:76bbcf725852 22-Apr-2007 Kevin Lim <ktlim@umich.edu>

Update configs to set the CPU clock properly.


# 3402:db60546818d0 31-Oct-2006 Kevin Lim <ktlim@umich.edu>

Remove mem parameter. Now the translating port asks the CPU's dcache's peer for its MemObject instead of having to have a paramter for the MemObject.

configs/example/fs.py:
configs/example/se.py:
src/cpu/simple/base.cc:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
src/cpu/simple_thread.cc:
src/cpu/simple_thread.hh:
src/cpu/thread_state.cc:
src/cpu/thread_state.hh:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-atomic.py:
tests/configs/simple-timing-mp.py:
tests/configs/simple-timing.py:
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
No need for mem parameter any more.
src/cpu/checker/cpu.cc:
Use new constructor for simple thread (no more MemObject parameter).
src/cpu/checker/cpu.hh:
Remove MemObject parameter.
src/cpu/memtest/memtest.hh:
Ports now take in their MemObject owner.
src/cpu/o3/alpha/cpu_builder.cc:
Remove mem parameter.
src/cpu/o3/alpha/cpu_impl.hh:
Remove memory parameter and clean up handling of TranslatingPort.
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
src/cpu/o3/fetch.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/mips/cpu_builder.cc:
src/cpu/o3/mips/cpu_impl.hh:
src/cpu/o3/params.hh:
src/cpu/o3/thread_state.hh:
src/cpu/ozone/cpu.hh:
src/cpu/ozone/cpu_builder.cc:
src/cpu/ozone/cpu_impl.hh:
src/cpu/ozone/front_end.hh:
src/cpu/ozone/front_end_impl.hh:
src/cpu/ozone/lw_lsq.hh:
src/cpu/ozone/lw_lsq_impl.hh:
src/cpu/ozone/simple_params.hh:
src/cpu/ozone/thread_state.hh:
src/cpu/simple/atomic.cc:
Remove memory parameter.


# 3230:e86a03911728 09-Oct-2006 Kevin Lim <ktlim@umich.edu>

Merge ktlim@zizzer:/bk/newmem
into zamp.eecs.umich.edu:/z/ktlim2/clean/o3-merge/newmem

src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
src/cpu/simple/timing.hh:
tests/configs/o3-timing-mp.py:
Hand merge.


# 3223:a2b6fa575c05 08-Oct-2006 Kevin Lim <ktlim@umich.edu>

Clean up configs.

configs/common/FSConfig.py:
configs/common/SysPaths.py:
configs/example/fs.py:
configs/example/se.py:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
Clean up configs by removing FullO3Config and instead using default values.
src/python/m5/objects/FUPool.py:
Add in default FUPool.
src/python/m5/objects/O3CPU.py:
Use defaults better. Also set checker parameters, and fix up a config bug.


# 3200:4b072dcc7a57 09-Oct-2006 Ron Dreslinski <rdreslin@umich.edu>

Update configs for cpu_id

tests/configs/o3-timing-mp.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-timing-mp.py:
Update config for cpu_id


# 3134:cf578b0dd70d 05-Oct-2006 Ron Dreslinski <rdreslin@umich.edu>

First pass at snooping stuff that compiles and doesn't break.

Still need:
-Handle NACK's on the recieve side
-Distinguish top level caches
-Handle repsonses from caches failing the fast path
-Handle BusError and propogate it
-Fix the invalidate packet associated with snooping in the cache

src/mem/bus.cc:
Make sure to snoop on functional accesses
src/mem/cache/base_cache.cc:
Wait to make a request into a response until it is ready to be issued
src/mem/cache/base_cache.hh:
Support range changes for snoops
Set up snoop responses for cache->cache transfers
src/mem/cache/cache_impl.hh:
Only access the cache if it wasn't satisfied by cache->cache transfer
Handle snoop phases (detect block, then snoop)
Fix functional access to work properly (still need to fix snoop path for functional accesses)