History log of /gem5/src/gpu-compute/compute_unit.cc
Revision Date Author Comments
# 13892:0182a0601f66 22-Apr-2019 Gabe Black <gabeblack@google.com>

mem: Minimize the use of MemObject.

MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.

Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12799:b230ebe2f641 25-Jun-2018 Alexandru Dutu <alexandru.dutu@amd.com>

gpu-compute: Remove unneeded Request::setVirt call

This sets the members of a Request object to the values they
already hold, except the atomicOpFunctor which is set to
nullptr. This call introduces a bug for atomics and is not
useful for non-atomic requests. This changeset is also
adding the wave PC and instruction sequence number to the
Request object.

Change-Id: I62f7b4a597483b0aa848a0cfbc72181e1063f56a
Reviewed-on: https://gem5-review.googlesource.com/11549
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 12749:223c83ed9979 04-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Using smart pointers for memory Requests

This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12748:ae5ce8e42de7 03-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Substitute pointer to Request with aliased RequestPtr

Every usage of Request* in the code has been replaced with the
RequestPtr alias. This is a preparing patch for when RequestPtr will be
the typdefed to a smart pointer to Request rather then a raw pointer to
Request.

Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10995
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 12717:2e2c211644d2 27-Apr-2018 Brandon Potter <brandon.potter@amd.com>

gpu-compute: use X86ISA::TlbEntry over GpuTlbEntry

GpuTlbEntry was derived from a vanilla X86ISA::TlbEntry definition. It
wrapped the class and included an extra member "valid". This member was
intended to report on the validity of the entry, however it introduced
bugs when folks forgot to set field properly in the code. So, instead of
keeping the extra field which we might forget to set, we track validity by
using nullptr for invalid tlb entries (as the tlb entries are dynamically
allocated). This saves on the extra class definition and prevents bugs
creeping into the code since the checks are intrinsically tied into
accessing any of the X86ISA::TlbEntry members.

This changeset fixes the issues introduced by a8d030522, a4e722725, and
2a15bfd79.

Change-Id: I30ebe3ec223fb833f3795bf0403d0016ac9a8bc2
Reviewed-on: https://gem5-review.googlesource.com/10481
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 12697:cd71b966be1e 27-Apr-2018 Tony Gutierrez <anthony.gutierrez@amd.com>

style: fix amd license and style issues

Change-Id: I26136fb49f743c4a597f8021cfd27f78897267b5
Reviewed-on: https://gem5-review.googlesource.com/10463
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 12680:91f4d6668b4f 04-Apr-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

sim,cpu,mem,arch: Introduced MasterInfo data structure

With this patch a gem5 System will store more info about its Masters.
While it was previously keeping track of the Master name and Master ID
only, it is now adding a per-Master pointer to the SimObject related to
the Master.
This will make it possible for a client to query a System for a Master
using either the master's name or the master's pointer.

Change-Id: I8b97d328a65cd06f329e2cdd3679451c17d2b8f6
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/9781
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 12126:06c1fbaa5724 27-Jun-2017 Sean Wilson <spwilson2@wisc.edu>

gpu-compute: Refactor some Event subclasses to lambdas

Change-Id: Ic1332b8e8ba0afacbe591c80f4d06afbf5f04bd9
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3922
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>


# 11700:7d4d424c9f17 26-Oct-2016 Tony Gutierrez <anthony.gutierrez@amd.com>

gpu-compute: support in-order data delivery in GM pipe

this patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. the buffer is implemented as
a stl ordered map, which sorts the request in program order by
using their sequence ID. when requests return to the GM pipeline
they are marked as done. only the oldest request may be serviced
from the ordered buffer, and only if is marked as done.

the FIFO response buffers are kept and used in OoO delivery mode


# 11698:d1ad31187fa5 26-Oct-2016 Tony Gutierrez <anthony.gutierrez@amd.com>

gpu-compute: use System cache line size in the GPU


# 11695:0a65922d564d 26-Oct-2016 Tony Gutierrez <anthony.gutierrez@amd.com>

gpu-compute: add instruction mix stats for the gpu


# 11692:e772fdcd3809 26-Oct-2016 Tony Gutierrez <anthony.gutierrez@amd.com>

gpu-compute: remove inst enums and use bit flag for attributes

this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.

because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.


# 11657:5fad5a37d6fc 04-Oct-2016 Alexandru Dutu <alexandru.dutu@amd.com>

gpu-compute: Added method to compute the actual workgroup size
This patch adds a method to the Wavefront class to compute the actual workgroup
size. This can be different from the maximum workgroup size specified when
launching the kernel through the NDRange object. Current solution is still not
optimal, as we are computing these for each wavefront and the dispatcher also
needs to have this information and can't actually call
Wavefront::computeActuallWgSz before the wavefronts are being created. A long
term solution would be to have a Workgroup class that deals with all these
details.


# 11643:42a1873be45c 16-Sep-2016 Alexandru Dutu <alexandru.dutu@amd.com>

gpu-compute: Refactoring Wavefront::dynWaveId


# 11639:2e8d4bd8108d 16-Sep-2016 Alexandru Dutu <alexandru.dutu@amd.com>

gpu-compute: Wavefront refactoring
Renaming members of the Wavefront class in accordance with the style guide.


# 11638:b511733958d0 16-Sep-2016 Alexandru Dutu <alexandru.dutu@amd.com>

gpu-compute: Remove WFContext
WFContext struct is currently unused and it has been rendered not useful in
saving and restoring the context of a Wavefront. Wavefront class should be
sufficient for that purpose and the runtime can figure out the memory size
it will need to allocate for a Wavefront through an IOCTL.


# 11534:7106f550afad 09-Jun-2016 jkalamat <john.kalamatianos@amd.com>

gpu-compute: parametrize Wavefront size

Eliminate the VSZ constant that defined the Wavefront size (in numbers of work
items); replaced it with a parameter in the GPU.py configuration script.
Changed all data structures dependent on the Wavefront size to be dynamically
sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at
initialization time.


# 11523:81332eb10367 06-Jun-2016 David Guillen Fandos <david.guillen@arm.com>

stats: Fixing regStats function for some SimObjects

Fixing an issue with regStats not calling the parent class method
for most SimObjects in Gem5. This causes issues if one adds new
stats in the base class (since they are never initialized properly!).

Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>


# 11435:0f1b46dde3fa 07-Apr-2016 Mitch Hayenga <mitch.hayenga@arm.com>

mem: Remove threadId from memory request class

In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.


# 11364:1bd9f1b27438 04-Mar-2016 Andreas Hansson <andreas.hansson@arm.com>

base: Fix gpu-compute output stream creation

Match changes in output stream.


# 11345:b6a66a90e0a1 18-Feb-2016 John Kalamatianos <john.kalamatianos@amd.com>

gpu: fix bugs with MemFence, Flat Instrs and Resource utilization

Both Memory Fence is now flagged as Global Memory only to avoid resource
oversubscribing.
Flat instructions now check for Shared Memory resource busy to avoid
oversubscribing resources.
All WaitClass resources now use cycles (not ticks) to register the number
of pipe stages between Scoreboard and Execute to be consistent with
instruction scheduling logic which always used clock cycles.


# 11308:7d8836fd043d 19-Jan-2016 Tony Gutierrez <anthony.gutierrez@amd.com>

gpu-compute: AMD's baseline GPU model