Searched hist:2014 (Results 1351 - 1375 of 1681) sorted by relevance

<<51525354555657585960>>

/gem5/src/arch/arm/isa/templates/
H A Dtemplates.isa10037:5cac77888310 Fri Jan 24 16:29:00 EST 2014 ARM gem5 Developers arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
/gem5/src/cpu/minor/
H A Dfetch2.hh10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
H A Dpipeline.cc10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
H A Ddyn_inst.hh10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
H A Ddyn_inst.cc10259:ebb376f73dd2 Wed Jul 23 17:09:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> cpu: `Minor' in-order CPU model

This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
/gem5/src/cpu/pred/
H A Dbi_mode.hh10244:d2deb51a4abf Mon Jun 30 13:50:00 EDT 2014 Anthony Gutierrez <atgutier@umich.edu> cpu: implement a bi-mode branch predictor
/gem5/src/kern/linux/
H A Dprintk.cc10468:8c1b836edc92 Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> dev: Use shared_ptr for Arguments::Data

This patch takes a first few steps in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr. There are no changes in
behaviour, and the code modifications are mainly introducing the
use of make_shared.

Note that the class could use unique_ptr rather than shared_ptr, was
it not for the postfix increment and decrement operators.
/gem5/src/mem/
H A Dabstract_mem.cc10583:d1e1e8588881 Tue Dec 02 06:08:00 EST 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: Support WriteInvalidate (again)

This patch takes a clean-slate approach to providing WriteInvalidate
(write streaming, full cache line writes without first reading)
support.

Unlike the prior attempt, which took an aggressive approach of directly
writing into the cache before handling the coherence actions, this
approach follows the existing cache flows as closely as possible.
10563:755b18321206 Tue Dec 02 06:07:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Add const getters for write packet data

This patch takes a first step in tightening up how we use the data
pointer in write packets. A const getter is added for the pointer
itself (getConstPtr), and a number of member functions are also made
const accordingly. In a range of places throughout the memory system
the new member is used.

The patch also removes the unused isReadWrite function.
10466:73b7549d979e Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Dynamically determine page bytes in memory components

This patch takes a step towards an ISA-agnostic memory
system by enabling the components to establish the page size after
instantiation. The swap operation in the memory is now also allowing
any granularity to avoid depending on the IntReg of the ISA.
10102:b5de69974a2e Fri Mar 07 15:56:00 EST 2014 Ali Saidi <ali.saidi@arm.com> mem: Wakeup sleeping CPUs without caches on LLSC

For systems without caches, the LLSC code does not get snoops for
wake-ups. We add the LLSC code in the abstract memory to do the job
for us.
H A DXBar.py10405:7a618c07e663 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Rename Bus to XBar to better reflect its behaviour

This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.
H A Dmem_checker_monitor.cc10612:6332c9d471a8 Tue Dec 23 09:31:00 EST 2014 Marco Elver <Marco.Elver@ARM.com> mem: Add MemChecker and MemCheckerMonitor

This patch adds the MemChecker and MemCheckerMonitor classes. While
MemChecker can be integrated anywhere in the system and is independent,
the most convenient usage is through the MemCheckerMonitor -- this
however, puts limitations on where the MemChecker is able to observe
read/write transactions.
H A Dnoncoherent_xbar.hh10405:7a618c07e663 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Rename Bus to XBar to better reflect its behaviour

This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.
H A Dsimple_mem.cc10509:d5554f97c451 Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> arm, mem: Fix drain bug and provide drain prints for more components.
10466:73b7549d979e Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Dynamically determine page bytes in memory components

This patch takes a step towards an ISA-agnostic memory
system by enabling the components to establish the page size after
instantiation. The swap operation in the memory is now also allowing
any granularity to avoid depending on the IntReg of the ISA.
10405:7a618c07e663 Sat Sep 20 17:18:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Rename Bus to XBar to better reflect its behaviour

This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.
10372:0a810481d511 Fri Sep 19 10:35:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> mem: Check return value of checkFunctional in SimpleMemory

Simple fix to ensure we only iterate until we are done.
H A Dpage_table.hh10558:426665ec11a9 Sun Nov 23 21:01:00 EST 2014 Alexandru Dutu <alexandru.dutu@amd.com> mem: Page Table map api modification

This patch adds uncacheable/cacheable and read-only/read-write attributes to
the map method of PageTableBase. It also modifies the constructor of TlbEntry
structs for all architectures to consider the new attributes.
10556:1e3b3c7a0cba Sun Nov 23 21:01:00 EST 2014 Alexandru Dutu <alexandru.dutu@amd.com> mem: Page Table long lines

Trimmed down all the lines greater than 78 characters.
10318:98771a936b61 Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> arch: Cleanup unused ISA traits constants

This patch prunes unused values, and also unifies how the values are
defined (not using an enum for ALPHA), aligning the use of int vs Addr
etc.

The patch also removes the duplication of PageBytes/PageShift and
VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical
values and the latter has been removed.
10298:77af86f37337 Tue Apr 01 01:18:00 EDT 2014 Alexandru <alexandru.dutu@amd.com> mem: adding a multi-level page table class
This patch defines a multi-level page table class that stores the page table in
system memory, consistent with ISA specifications. In this way, cpu models that
use the actual hardware to execute (e.g. KvmCPU), are able to traverse the page
table.
/gem5/src/mem/cache/
H A Dmshr.hh10582:c04dc66e4316 Tue Dec 02 06:08:00 EST 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: Remove WriteInvalidate support

Prepare for a different implementation following in the next patch
10503:94d58056729f Tue Oct 21 18:04:00 EDT 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: don't inhibit WriteInv's or defer snoops on their MSHRs

WriteInvalidate semantics depend on the unconditional writeback
or they won't complete. Also, there's no point in deferring snoops
on their MSHRs, as they don't get new data at the end of their life
cycle the way other transactions do.

Add comment in the cache about a minor inefficiency re: WriteInvalidate.
10502:f2f1dbfd505e Thu Oct 30 00:18:00 EDT 2014 Curtis Dunham <Curtis.Dunham@arm.com> mem: have WriteInvalidate obsolete MSHRs

Since WriteInvalidate directly writes into the cache, it can
create tricky timing interleavings with reads and writes to the
same cache line that haven't yet completed. This patch ensures
that these requests, when completed, don't overwrite the newer
data from the WriteInvalidate.
10028:fb8c44de891a Fri Jan 24 16:29:00 EST 2014 Giacomo Gabrielli <Giacomo.Gabrielli@arm.com> mem: Add support for a security bit in the memory system

This patch adds the basic building blocks required to support e.g. ARM
TrustZone by discerning secure and non-secure memory accesses.
/gem5/src/mem/cache/prefetch/
H A Dqueued.hh10623:b9646f4546ad Tue Dec 23 09:31:00 EST 2014 Mitch Hayenga <mitch.hayenga@arm.com> mem: Rework the structuring of the prefetchers

Re-organizes the prefetcher class structure. Previously the
BasePrefetcher forced multiple assumptions on the prefetchers that
inherited from it. This patch makes the BasePrefetcher class truly
representative of base functionality. For example, the base class no
longer enforces FIFO order. Instead, prefetchers with FIFO requests
(like the existing stride and tagged prefetchers) now inherit from a
new QueuedPrefetcher base class.

Finally, the stride-based prefetcher now assumes a custimizable lookup table
(sets/ways) rather than the previous fully associative structure.
/gem5/src/mem/ruby/common/
H A DSet.hh10348:c91b23c72d5e Wed Sep 03 07:42:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> base: Use the global Mersenne twister throughout

This patch tidies up random number generation to ensure that it is
done consistently throughout the code base. In essence this involves a
clean-up of Ruby, and some code simplifications in the traffic
generator.

As part of this patch a bunch of skewed distributions (off-by-one etc)
have been fixed.

Note that a single global random number generator is used, and that
the object instantiation order will impact the behaviour (the sequence
of numbers will be unaffected, but if module A calles random before
module B then they would obviously see a different outcome). The
dependency on the instantiation order is true in any case due to the
execution-model of gem5, so we leave it as is. Also note that the
global ranom generator is not thread safe at this point.

Regressions using the memtest, TrafficGen or any Ruby tester are
affected and will be updated accordingly.
/gem5/src/sim/
H A Darguments.hh10468:8c1b836edc92 Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> dev: Use shared_ptr for Arguments::Data

This patch takes a first few steps in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr. There are no changes in
behaviour, and the code modifications are mainly introducing the
use of make_shared.

Note that the class could use unique_ptr rather than shared_ptr, was
it not for the postfix increment and decrement operators.
/gem5/tests/configs/
H A Dx86_generic.py10003:459491344fcf Fri Jan 03 20:08:00 EST 2014 Steve Reinhardt <steve.reinhardt@amd.com> config, x86: move kernel specification from tests to FSConfig.py

For some reason, the default x86 kernel is specified in
tests/configs/x86_generic.py and not in configs/common/FSConfig.py,
where the kernels for all the other ISAs are. This means that
running configs/example/fs.py for x86 fails because no kernel
is specified. Moving the specification over fixes this problem.

There is another problem that this uncovers, which is that going
past the init stage (i.e., past where the regression test stops)
fails because the fsck test on the disk device fails, but that's
a separate issue.
/gem5/src/arch/arm/isa/formats/
H A Daarch64.isa10506:aa23216161fa Thu Oct 30 00:18:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> arm: Mark some miscregs (timer counter) registers at unverifiable.

The checker can't verify timer registers, so it should just grab the version
from the executing CPU, otherwise it could get a larger value and diverge
execution.
10337:85001c018d4c Wed Sep 03 07:42:00 EDT 2014 Andrew Bardsley <Andrew.Bardsley@arm.com> arm: ISA X31 destination register fix

This patch substituted the zero register for X31 used as a
destination register. This prevents false dependencies based on
X31.
10173:a6402a046e36 Wed Apr 23 05:18:00 EDT 2014 Mitchell Hayenga <Mitchell.Hayenga@ARM.com> arm: Don't use a stack allocated mnemonic

FailUnimplemented passed a stack created mnemonic as a const char * which
causes some grief when the stack goes away.
10037:5cac77888310 Fri Jan 24 16:29:00 EST 2014 ARM gem5 Developers arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
/gem5/src/arch/arm/insts/
H A Dmisc64.hh10037:5cac77888310 Fri Jan 24 16:29:00 EST 2014 ARM gem5 Developers arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
/gem5/src/arch/arm/
H A Dfaults.hh10537:47fe87b0cf97 Fri Nov 14 03:53:00 EST 2014 Andreas Hansson <andreas.hansson@arm.com> arm: Fixes based on UBSan and static analysis

Another churn to clean up undefined behaviour, mostly ARM, but some
parts also touching the generic part of the code base.

Most of the fixes are simply ensuring that proper intialisation. One
of the more subtle changes is the return type of the sign-extension,
which is changed to uint64_t. This is to avoid shifting negative
values (undefined behaviour) in the ISA code.
10417:710ee116eb68 Sat Sep 27 09:08:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> arch: Use const StaticInstPtr references where possible

This patch optimises the passing of StaticInstPtr by avoiding copying
the reference-counting pointer. This avoids first incrementing and
then decrementing the reference-counting pointer.
10205:3ca67d0e0e7e Thu Apr 17 17:56:00 EDT 2014 Ali Saidi <Ali.Saidi@ARM.com> arm: Make sure UndefinedInstructions are properly initialized
10037:5cac77888310 Fri Jan 24 16:29:00 EST 2014 ARM gem5 Developers arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
H A Dtlb.hh10474:799c8ee4ecba Thu Oct 16 05:49:00 EDT 2014 Andreas Hansson <andreas.hansson@arm.com> arch: Use shared_ptr for all Faults

This patch takes quite a large step in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr by adopting its use for all
Faults. There are no changes in behaviour, and the code modifications
are mostly just replacing "new" with "make_shared".
10463:25c5da51bbe0 Thu Oct 16 05:49:00 EDT 2014 Andreas Sandberg <Andreas.Sandberg@ARM.com> arm: Add TLB PMU probes

This changeset adds probe points that can be used to implement PMU
counters for TLB stats. The following probes are supported:

* ArmISA::TLB::ppRefills / TLB Refills (TLB insertions)
10194:e6d2e8083d9c Fri May 09 18:58:00 EDT 2014 Geoffrey Blake <Geoffrey.Blake@arm.com> arch, arm: Preserve TLB bootUncacheability when switching CPUs

The ARM TLBs have a bootUncacheability flag used to make some loads
and stores become uncacheable when booting in FS mode. Later the
flag is cleared to let those loads and stores operate as normal. When
doing a takeOverFrom(), this flag's state is not preserved and is
momentarily reset until the CPSR is touched. On single core runs this
is a non-issue. On multi-core runs this can lead to crashes on the O3
CPU model from the following series of events:
1) takeOverFrom executed to switch from Atomic -> O3
2) All bootUncacheability flags are reset to true
3) Core2 tries to execute a load covered by bootUncacheability, it
is flagged as uncacheable
4) Core2's load needs to replay due to a pipeline flush
3) Core1 core does an action on CPSR
4) The handling code for CPSR then checks all other cores
to determine if bootUncacheability can be set to false
5) Asynchronously set bootUncacheability on all cores to false
6) Core2 replays load previously set as uncacheable and notices
it is now flagged as cacheable, leads to a panic.
This patch implements takeOverFrom() functionality for the ARM TLBs
to preserve flag values when switching from atomic -> detailed.
10037:5cac77888310 Fri Jan 24 16:29:00 EST 2014 ARM gem5 Developers arm: Add support for ARMv8 (AArch64 & AArch32)

Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
/gem5/src/mem/ruby/profiler/
H A DAddressProfiler.hh10012:ec5a5bfb941d Fri Jan 10 17:19:00 EST 2014 Nilay Vaish <nilay@cs.wisc.edu> ruby: move all statistics to stats.txt, eliminate ruby.stats
/gem5/src/mem/ruby/network/
H A DTopology.hh10005:8c2b0dc16ccd Sat Jan 04 01:03:00 EST 2014 Nilay Vaish <nilay@cs.wisc.edu> ruby: add support for clusters

A cluster over here means a set of controllers that can be accessed only by a
certain set of cores. For example, consider a two level hierarchy. Assume
there are 4 L1 controllers (private) and 2 L2 controllers. We can have two
different hierarchies here:

a. the address space is partitioned between the two L2 controllers. Each L1
controller accesses both the L2 controllers. In this case, each L1 controller
is a cluster initself.

b. both the L2 controllers can cache any address. An L1 controller has access
to only one of the L2 controllers. In this case, each L2 controller
along with the L1 controllers that access it, form a cluster.

This patch allows for each controller to have a cluster ID, which is 0 by
default. By setting the cluster ID properly, one can instantiate hierarchies
with clusters. Note that the coherence protocol might have to be changed as
well.
/gem5/src/cpu/kvm/
H A Dbase.hh10407:a9023811bf9e Sat Sep 20 17:18:00 EDT 2014 Mitch Hayenga <mitch.hayenga@arm.com> alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate

activate(), suspend(), and halt() used on thread contexts had an optional
delay parameter. However this parameter was often ignored. Also, when used,
the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were
ever specified). This patch removes the delay parameter and 'Events'
associated with them across all ISAs and cores. Unused activate logic
is also removed.
10114:bd83b4f6a12e Sun Mar 16 12:40:00 EDT 2014 Andreas Sandberg <andreas@sandberg.pp.se> kvm: Clean up signal handling

KVM used to use two signals, one for instruction count exits and one
for timer exits. There is really no need to distinguish between the
two since they only trigger exits from KVM. This changeset unifies and
renames the signals and adds a method, kick(), that can be used to
raise the control signal in the vCPU thread. It also removes the early
timer warning since we do not normally see if the signal was
delivered.
10112:1a2f64842044 Sun Mar 16 12:28:00 EDT 2014 Andreas Sandberg <andreas@sandberg.pp.se> kvm: x86: Add support for x86 INIT and STARTUP handling

This changeset adds support for INIT and STARTUP IPI handling. We
currently handle both of these interrupts in gem5 and transfer the
state to KVM. Since we do not have a BIOS loaded, we pretend that the
INIT interrupt suspends the CPU after reset.
10073:2360411a16be Thu Feb 20 09:43:00 EST 2014 Andreas Sandberg <andreas@sandberg.pp.se> kvm: Add support for multi-system simulation

The introduction of parallel event queues added most of the support
needed to run multiple VMs (systems) within the same gem5
instance. This changeset fixes up signal delivery so that KVM's
control signals are delivered to the thread that executes the CPU's
event queue. Specifically:

* Timers and counters are now initialized from a separate method
(startupThread) that is scheduled as the first event in the
thread-specific event queue. This ensures that they are
initialized from the thread that is going to execute the CPUs
event queue and enables signal delivery to the right thread when
exiting from KVM.

* The POSIX-timer-based KVM timer (used to force exits from KVM) has
been updated to deliver signals to the thread that's executing KVM
instead of the process (thread is undefined in that case). This
assumes that the timer is instantiated from the thread that is
going to execute the KVM vCPU.

* Signal masking is now done using pthread_sigmask instead of
sigprocmask. The behavior of the latter is undefined in threaded
applications.

* Since signal masks can be inherited, make sure to actively unmask
the control signals when setting up the KVM signal mask.

There are currently no facilities to multiplex between multiple KVM
CPUs in the same event queue, we are therefore limited to
configurations where there is only one KVM CPU per event queue. In
practice, this means that multi-system configurations can be
simulated, but not multiple CPUs in a shared-memory configuration.

Completed in 159 milliseconds

<<51525354555657585960>>