Search

Home

Sort by

Full Search
Definition
Symbol
File Path
History
Type

In Project(s)

Searched hist:13 (Results 1151 - 1175 of 1864) sorted by relevance

<<41 42 43 44 45 464748 49 50 >>

/gem5/src/cpu/o3/
H A D	cpu.hh	13954:2f400a5f2627 Fri Jul 07 09:13:00 EDT 2017 Giacomo Gabrielli <giacomo.gabrielli@arm.com> cpu,mem: Add support for partial loads/stores and wide mem. accesses This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com> Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> 13652:45d94ac03a27 Mon Jan 22 13:12:00 EST 2018 Tuan Ta <qtt2@cornell.edu> cpu: support atomic memory request type with AtomicOpFunctor This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> 13590:d7e018859709 Mon Feb 13 04:41:00 EST 2017 Rekai Gonzalez-Alberquilla <rekai.gonzalezalberquilla@arm.com> cpu-o3: O3 LSQ Generalisation This patch does a large modification of the LSQ in the O3 model. The main goal of the patch is to remove the 'an operation can be served with one or two memory requests' assumption that is present in the LSQ and the instruction with the req, reqLow, reqHigh triplet, and generalising it to operations that can be addressed with one request, and operations that require many requests, embodied in the SingleDataRequest and the SplitDataRequest. This modification has been done mimicking the minor model to an extent, shifting the responsibilities of dealing with VtoP translation and tracking the status and resources from the DynInst to the LSQ via the LSQRequest. The LSQRequest models the information concerning the operation, handles the creation of fragments for translation and request as well as assembling/splitting the data accordingly. With this modifications, the implementation of vector ISAs, particularly on the memory side, become more rich, as the new model permits a dissociation of the ISA characteristics as vector length, from the microarchitectural characteristics that govern how contiguous loads are executing, allowing exploration of different LSQ to DL1 bus widths to understand the tradeoffs in complexity and performance. Part of the complexities introduced stem from the fact that gem5 keeps a large amount of metadata regarding, in particular, memory operations, thus, when an instruction is squashed while some operation as TLB lookup or cache access is ongoing, when the relevant structure communicates to the LSQ that the operation is over, it tries to access some pieces of data that should have died when the instruction is squashed, leading to asserts, panics, or memory corruption. To ensure the correct behaviour, the LSQRequest rely on assesing who is their owner, and self-destroying if they detect their owner is done with the request, and there will be no subsequent action. For example, in the case of an instruction squashed whal the TLB is doing a walk to serve the translation, when the translation is served by the TLB, the LSQRequest detects that the instruction was squashed, and as the translation is done, no one else expect to access its information, and therefore, it self-destructs. Having destroyed the LSQRequest earlier, would lead to wrong behaviour as the TLB walk may access some fields of it. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13516 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> 13557:fc33e6048b25 Sat Oct 13 03:54:00 EDT 2018 Gabe Black <gabeblack@google.com> cpu: dev: sim: gpu-compute: Banish some ISA specific register types. These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are some remaining types, specifically the vector registers and the CCReg. I'm less familiar with these new types of registers, and so will look at getting rid of them at some later time. Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b Reviewed-on: https://gem5-review.googlesource.com/c/13624 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> 11877:5ea85692a53e Mon Jul 20 10:15:00 EDT 2015 Brandon Potter <brandon.potter@amd.com> syscall_emul: [patch 13/22] add system call retry capability This changeset adds functionality that allows system calls to retry without affecting thread context state such as the program counter or register values for the associated thread context (when system calls return with a retry fault). This functionality is needed to solve problems with blocking system calls in multi-process or multi-threaded simulations where information is passed between processes/threads. Blocking system calls can cause deadlock because the simulator itself is single threaded. There is only a single thread servicing the event queue which can cause deadlock if the thread hits a blocking system call instruction. To illustrate the problem, consider two processes using the producer/consumer sharing model. The processes can use file descriptors and the read and write calls to pass information to one another. If the consumer calls the blocking read system call before the producer has produced anything, the call will block the event queue (while executing the system call instruction) and deadlock the simulation. The solution implemented in this changeset is to recognize that the system calls will block and then generate a special retry fault. The fault will be sent back up through the function call chain until it is exposed to the cpu model's pipeline where the fault becomes visible. The fault will trigger the cpu model to replay the instruction at a future tick where the call has a chance to succeed without actually going into a blocking state. In subsequent patches, we recognize that a syscall will block by calling a non-blocking poll (from inside the system call implementation) and checking for events. When events show up during the poll, it signifies that the call would not have blocked and the syscall is allowed to proceed (calling an underlying host system call if necessary). If no events are returned from the poll, we generate the fault and try the instruction for the thread context at a distant tick. Note that retrying every tick is not efficient. As an aside, the simulator has some multi-threading support for the event queue, but it is not used by default and needs work. Even if the event queue was completely multi-threaded, meaning that there is a hardware thread on the host servicing a single simulator thread contexts with a 1:1 mapping between them, it's still possible to run into deadlock due to the event queue barriers on quantum boundaries. The solution of replaying at a later tick is the simplest solution and solves the problem generally. 9448:569d1e8f74e4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Unify the serialization code for all of the CPU models Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models. 9444:ab47fe7f03f0 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty. 9436:4a0223da4924 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> o3 cpu: Remove unused variables 9433:34971d2e0019 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive. 9427:ddf45c1d54d4 Mon Jan 07 13:05:00 EST 2013 Andreas Sandberg <Andreas.Sandberg@ARM.com> cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.
/gem5/src/sim/
H A D	ClockDomain.py	10249:6bbb7ae309ac Mon Jun 30 13:56:00 EDT 2014 Stephan Diestelhorst <stephan.diestelhorst@arm.com> power: Add basic DVFS support for gem5 Adds DVFS capabilities to gem5, by allowing users to specify lists for frequencies and voltages in SrcClockDomains and VoltageDomains respectively. A separate component, DVFSHandler, provides a small interface to change operating points of the associated domains. Clock domains will be linked to voltage domains and thus allow separate clock, but shared voltage lines. Currently all the valid performance-level updates are performed with a fixed transition latency as specified for the domain. Config file example: ... vd = VoltageDomain(voltage = ['1V','0.95V','0.90V','0.85V']) tsys.cluster1.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz'] tsys.cluster2.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz'] tsys.cluster1.clk_domain.domain_id = 0 tsys.cluster2.clk_domain.domain_id = 1 tsys.cluster1.clk_domain.voltage_domain = vd tsys.cluster2.clk_domain.voltage_domain = vd tsys.dvfs_handler.domains = [tsys.cluster1.clk_domain, tsys.cluster2.clk_domain] tsys.dvfs_handler.enable = True
/gem5/configs/boot/
H A D	nat-spec-surge-client.rcS	`1645:4efe65fb0bce Thu Apr 28 16:13:00 EDT 2005 Ron Dreslinski <rdreslin@umich.edu> Make ip_conntrack table size larger`
H A D	natbox-spec-surge.rcS	`1645:4efe65fb0bce Thu Apr 28 16:13:00 EDT 2005 Ron Dreslinski <rdreslin@umich.edu> Make ip_conntrack table size larger`
H A D	surge-client.rcS	`1645:4efe65fb0bce Thu Apr 28 16:13:00 EDT 2005 Ron Dreslinski <rdreslin@umich.edu> Make ip_conntrack table size larger`
/gem5/src/arch/sparc/isa/formats/mem/
H A D	mem.isa	4040:eb894f3fc168 Mon Feb 12 13:06:00 EST 2007 Ali Saidi <saidi@eecs.umich.edu> rename store conditional stuff as extra data so it can be used for conditional swaps as well Add support for a twin 64 bit int load Add Memory barrier and write barrier flags as appropriate Make atomic memory ops atomic src/arch/alpha/isa/mem.isa: src/arch/alpha/locked_mem.hh: src/cpu/base_dyn_inst.hh: src/mem/cache/cache_blk.hh: src/mem/cache/cache_impl.hh: rename store conditional stuff as extra data so it can be used for conditional swaps as well src/arch/alpha/types.hh: src/arch/mips/types.hh: src/arch/sparc/types.hh: add a largest read data type for statically allocating read buffers in atomic simple cpu src/arch/isa_parser.py: Add support for a twin 64 bit int load src/arch/sparc/isa/decoder.isa: Make atomic memory ops atomic Add Memory barrier and write barrier flags as appropriate src/arch/sparc/isa/formats/mem/basicmem.isa: add post access code block and define a twinload format for twin loads src/arch/sparc/isa/formats/mem/blockmem.isa: remove old microcoded twin load coad src/arch/sparc/isa/formats/mem/mem.isa: swap.isa replaces the code in loadstore.isa src/arch/sparc/isa/formats/mem/util.isa: add a post access code block src/arch/sparc/isa/includes.isa: need bigint.hh for Twin64_t src/arch/sparc/isa/operands.isa: add a twin 64 int type src/cpu/simple/atomic.cc: src/cpu/simple/atomic.hh: src/cpu/simple/base.hh: src/cpu/simple/timing.cc: add support for twinloads add support for swap and conditional swap instructions rename store conditional stuff as extra data so it can be used for conditional swaps as well src/mem/packet.cc: src/mem/packet.hh: Add support for atomic swap memory commands src/mem/packet_access.hh: Add endian conversion function for Twin64_t type src/mem/physical.cc: src/mem/physical.hh: src/mem/request.hh: Add support for atomic swap memory commands Rename sc code to extradata
/gem5/util/stats/
H A D	display.py	`1547:1f0c266940d4 Tue Mar 15 13:22:00 EST 2005 Nathan Binkert <binkertn@umich.edu> get rid of issequence and just use the isinstance builtin`
/gem5/util/
H A D	tracediff	`4018:77a084a042bb Tue Feb 06 13:06:00 EST 2007 Steve Reinhardt <stever@eecs.umich.edu> Use perl FindBin package to set path to rundiff to the directory where tracediff is.`
/gem5/src/dev/alpha/
H A D	backdoor.hh	`8789:a8b63a0ee14c Sun Nov 13 05:05:00 EST 2011 Gabe Black <gblack@eecs.umich.edu> SE/FS: Get rid of FULL_SYSTEM in dev.`
/gem5/src/dev/x86/
H A D	i8237.hh	`9808:13ffc0066b76 Thu Jul 11 22:57:00 EDT 2013 Steve Reinhardt <stever@gmail.com> dev: make BasicPioDevice take size in constructor Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>`
H A D	speaker.hh	`9808:13ffc0066b76 Thu Jul 11 22:57:00 EDT 2013 Steve Reinhardt <stever@gmail.com> dev: make BasicPioDevice take size in constructor Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>`
/gem5/tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/
H A D	system.pc.com_1.terminal	`9901:13c5fea24be1 Wed Oct 02 05:03:00 EDT 2013 Andreas Sandberg <andreas@sandberg.pp.se> stats: Update x86 stats after x87 fixes The updates to the x87 caused the stats for several regressions to change. This was mainly caused by the addition of a working 32-bit and 80-bit FP load instruction and xsave support.`
/gem5/src/base/
H A D	cp_annotate.hh	`8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes`
/gem5/src/dev/mips/
H A D	malta.hh	`5222:bb733a878f85 Tue Nov 13 16:58:00 EST 2007 Korey Sewell <ksewell@umich.edu> Add in files from merge-bare-iron, get them compiling in FS and SE mode`
/gem5/src/dev/sparc/
H A D	dtod.hh	`3943:68e673d2db04 Sun Jan 28 13:26:00 EST 2007 Nathan Binkert <binkertn@umich.edu> Stick the conversion of python to unix time with all of the other param code so that other functions can use it as well.`
H A D	iob.hh	`8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes`
H A D	mm_disk.hh	`8229:78bf55f23338 Fri Apr 15 13:44:00 EDT 2011 Nathan Binkert <nate@binkert.org> includes: sort all includes`
/gem5/src/base/loader/
H A D	ecoff_object.hh	11392:5967db4cff04 Thu Mar 17 13:34:00 EDT 2016 Brandon Potter <brandon.potter@amd.com> base: add symbol support for dynamic libraries Libraries are loaded into the process address space using the mmap system call. Conveniently, this happens to be a good time to update the process symbol table with the library's incoming symbols so we handle the table update from within the system call. This works just like an application's normal symbols. The only difference between a dynamic library and a main executable is when the symbol table update occurs. The symbol table update for an executable happens at program load time and is finished before the process ever begins executing. Since dynamic linking happens at runtime, the symbol loading happens after the library is first loaded into the process address space. The library binary is examined at this time for a symbol section and that section is parsed for symbol types with specific bindings (global, local, weak). Subsequently, these symbols are added to the table and are available for use by gem5 for things like trace generation. Checkpointing should work just as it did previously. The address space (and therefore the library) will be recorded and the symbol table will be entirely recorded. (It's not possible to do anything clever like checkpoint a program and then load the program back with different libraries with LD_LIBRARY_PATH, because the library becomes part of the address space after being loaded.)
H A D	aout_object.hh	11392:5967db4cff04 Thu Mar 17 13:34:00 EDT 2016 Brandon Potter <brandon.potter@amd.com> base: add symbol support for dynamic libraries Libraries are loaded into the process address space using the mmap system call. Conveniently, this happens to be a good time to update the process symbol table with the library's incoming symbols so we handle the table update from within the system call. This works just like an application's normal symbols. The only difference between a dynamic library and a main executable is when the symbol table update occurs. The symbol table update for an executable happens at program load time and is finished before the process ever begins executing. Since dynamic linking happens at runtime, the symbol loading happens after the library is first loaded into the process address space. The library binary is examined at this time for a symbol section and that section is parsed for symbol types with specific bindings (global, local, weak). Subsequently, these symbols are added to the table and are available for use by gem5 for things like trace generation. Checkpointing should work just as it did previously. The address space (and therefore the library) will be recorded and the symbol table will be entirely recorded. (It's not possible to do anything clever like checkpoint a program and then load the program back with different libraries with LD_LIBRARY_PATH, because the library becomes part of the address space after being loaded.)
H A D	raw_object.hh	11392:5967db4cff04 Thu Mar 17 13:34:00 EDT 2016 Brandon Potter <brandon.potter@amd.com> base: add symbol support for dynamic libraries Libraries are loaded into the process address space using the mmap system call. Conveniently, this happens to be a good time to update the process symbol table with the library's incoming symbols so we handle the table update from within the system call. This works just like an application's normal symbols. The only difference between a dynamic library and a main executable is when the symbol table update occurs. The symbol table update for an executable happens at program load time and is finished before the process ever begins executing. Since dynamic linking happens at runtime, the symbol loading happens after the library is first loaded into the process address space. The library binary is examined at this time for a symbol section and that section is parsed for symbol types with specific bindings (global, local, weak). Subsequently, these symbols are added to the table and are available for use by gem5 for things like trace generation. Checkpointing should work just as it did previously. The address space (and therefore the library) will be recorded and the symbol table will be entirely recorded. (It's not possible to do anything clever like checkpoint a program and then load the program back with different libraries with LD_LIBRARY_PATH, because the library becomes part of the address space after being loaded.)
/gem5/tests/quick/fs/80.netperf-stream/ref/alpha/linux/twosys-tsunami-simple-atomic/
H A D	config.ini	`9449:56610ab73040 Mon Jan 07 13:05:00 EST 2013 Ali Saidi <Ali.Saidi@ARM.com> stats: update stats for previous changes.`
/gem5/tests/quick/se/00.hello/ref/alpha/linux/simple-atomic/
H A D	config.ini	`11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support Additional auxv entries leads to more instructions in start-up while walking the list, along with different cache conflicts wrt stack entries.`
/gem5/tests/quick/se/00.hello/ref/alpha/linux/simple-timing/
H A D	config.ini	`11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support Additional auxv entries leads to more instructions in start-up while walking the list, along with different cache conflicts wrt stack entries.`
/gem5/tests/quick/se/00.hello/ref/mips/linux/simple-timing/
H A D	config.ini	`11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support Additional auxv entries leads to more instructions in start-up while walking the list, along with different cache conflicts wrt stack entries.`
/gem5/tests/quick/se/00.hello/ref/mips/linux/simple-atomic/
H A D	config.ini	`11390:f40859930028 Thu Mar 17 13:32:00 EDT 2016 Steve Reinhardt <steve.reinhardt@amd.com> stats: update stats for ld.so support Additional auxv entries leads to more instructions in start-up while walking the list, along with different cache conflicts wrt stack entries.`

Completed in 115 milliseconds

<<41 42 43 44 45 464748 49 50 >>