History log of /gem5/src/gpu-compute/compute_unit.cc
Revision	Date	Author	Comments
# 13892:0182a0601f66	22-Apr-2019	Gabe Black <gabeblack@google.com>	mem: Minimize the use of MemObject. MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed. Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
# 12799:b230ebe2f641	25-Jun-2018	Alexandru Dutu <alexandru.dutu@amd.com>	gpu-compute: Remove unneeded Request::setVirt call This sets the members of a Request object to the values they already hold, except the atomicOpFunctor which is set to nullptr. This call introduces a bug for atomics and is not useful for non-atomic requests. This changeset is also adding the wave PC and instruction sequence number to the Request object. Change-Id: I62f7b4a597483b0aa848a0cfbc72181e1063f56a Reviewed-on: https://gem5-review.googlesource.com/11549 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
# 12749:223c83ed9979	04-Jun-2018	Giacomo Travaglini <giacomo.travaglini@arm.com>	misc: Using smart pointers for memory Requests This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers. Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
# 12748:ae5ce8e42de7	03-Jun-2018	Giacomo Travaglini <giacomo.travaglini@arm.com>	misc: Substitute pointer to Request with aliased RequestPtr Every usage of Request* in the code has been replaced with the RequestPtr alias. This is a preparatory patch for when RequestPtr will be typedef'd to a smart pointer to Request rather than a raw pointer to Request. Change-Id: I73cbaf2d96ea9313a590cdc731a25662950cd51a Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10995 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
# 12717:2e2c211644d2	27-Apr-2018	Brandon Potter <brandon.potter@amd.com>	gpu-compute: use X86ISA::TlbEntry over GpuTlbEntry GpuTlbEntry was derived from a vanilla X86ISA::TlbEntry definition. It wrapped the class and included an extra member "valid". This member was intended to report on the validity of the entry; however, it introduced bugs when folks forgot to set the field properly in the code. So, instead of keeping the extra field which we might forget to set, we track validity by using nullptr for invalid tlb entries (as the tlb entries are dynamically allocated). This saves on the extra class definition and prevents bugs creeping into the code since the checks are intrinsically tied into accessing any of the X86ISA::TlbEntry members. This changeset fixes the issues introduced by a8d030522, a4e722725, and 2a15bfd79. Change-Id: I30ebe3ec223fb833f3795bf0403d0016ac9a8bc2 Reviewed-on: https://gem5-review.googlesource.com/10481 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
# 12697:cd71b966be1e	27-Apr-2018	Tony Gutierrez <anthony.gutierrez@amd.com>	style: fix amd license and style issues Change-Id: I26136fb49f743c4a597f8021cfd27f78897267b5 Reviewed-on: https://gem5-review.googlesource.com/10463 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
# 12680:91f4d6668b4f	04-Apr-2018	Giacomo Travaglini <giacomo.travaglini@arm.com>	sim,cpu,mem,arch: Introduced MasterInfo data structure With this patch a gem5 System will store more info about its Masters. While it was previously keeping track of the Master name and Master ID only, it is now adding a per-Master pointer to the SimObject related to the Master. This will make it possible for a client to query a System for a Master using either the master's name or the master's pointer. Change-Id: I8b97d328a65cd06f329e2cdd3679451c17d2b8f6 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/9781 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
# 12126:06c1fbaa5724	27-Jun-2017	Sean Wilson <spwilson2@wisc.edu>	gpu-compute: Refactor some Event subclasses to lambdas Change-Id: Ic1332b8e8ba0afacbe591c80f4d06afbf5f04bd9 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3922 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
# 11700:7d4d424c9f17	26-Oct-2016	Tony Gutierrez <anthony.gutierrez@amd.com>	gpu-compute: support in-order data delivery in GM pipe this patch adds an ordered response buffer to the GM pipeline to ensure in-order data delivery. the buffer is implemented as an STL ordered map, which sorts the requests in program order by using their sequence ID. when requests return to the GM pipeline they are marked as done. only the oldest request may be serviced from the ordered buffer, and only if it is marked as done. the FIFO response buffers are kept and used in OoO delivery mode
# 11698:d1ad31187fa5	26-Oct-2016	Tony Gutierrez <anthony.gutierrez@amd.com>	gpu-compute: use System cache line size in the GPU
# 11695:0a65922d564d	26-Oct-2016	Tony Gutierrez <anthony.gutierrez@amd.com>	gpu-compute: add instruction mix stats for the gpu
# 11692:e772fdcd3809	26-Oct-2016	Tony Gutierrez <anthony.gutierrez@amd.com>	gpu-compute: remove inst enums and use bit flag for attributes this patch removes the GPUStaticInst enums that were defined in GPU.py. instead, a simple set of attribute flags that can be set in the base instruction class is used. this will help unify the attributes of HSAIL and machine ISA instructions within the model itself. because the static instruction now carries the attributes, a GPUDynInst must carry a pointer to a valid GPUStaticInst so a new static kernel launch instruction is added, which carries the attributes needed to perform the kernel launch.
# 11657:5fad5a37d6fc	04-Oct-2016	Alexandru Dutu <alexandru.dutu@amd.com>	gpu-compute: Added method to compute the actual workgroup size This patch adds a method to the Wavefront class to compute the actual workgroup size. This can be different from the maximum workgroup size specified when launching the kernel through the NDRange object. Current solution is still not optimal, as we are computing these for each wavefront and the dispatcher also needs to have this information and can't actually call Wavefront::computeActuallWgSz before the wavefronts are being created. A long term solution would be to have a Workgroup class that deals with all these details.
# 11643:42a1873be45c	16-Sep-2016	Alexandru Dutu <alexandru.dutu@amd.com>	gpu-compute: Refactoring Wavefront::dynWaveId
# 11639:2e8d4bd8108d	16-Sep-2016	Alexandru Dutu <alexandru.dutu@amd.com>	gpu-compute: Wavefront refactoring Renaming members of the Wavefront class in accordance with the style guide.
# 11638:b511733958d0	16-Sep-2016	Alexandru Dutu <alexandru.dutu@amd.com>	gpu-compute: Remove WFContext WFContext struct is currently unused and it has been rendered not useful in saving and restoring the context of a Wavefront. Wavefront class should be sufficient for that purpose and the runtime can figure out the memory size it will need to allocate for a Wavefront through an IOCTL.
# 11534:7106f550afad	09-Jun-2016	jkalamat <john.kalamatianos@amd.com>	gpu-compute: parametrize Wavefront size Eliminate the VSZ constant that defined the Wavefront size (in numbers of work items); replaced it with a parameter in the GPU.py configuration script. Changed all data structures dependent on the Wavefront size to be dynamically sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at initialization time.
# 11523:81332eb10367	06-Jun-2016	David Guillen Fandos <david.guillen@arm.com>	stats: Fixing regStats function for some SimObjects Fixing an issue with regStats not calling the parent class method for most SimObjects in Gem5. This causes issues if one adds new stats in the base class (since they are never initialized properly!). Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
# 11435:0f1b46dde3fa	07-Apr-2016	Mitch Hayenga <mitch.hayenga@arm.com>	mem: Remove threadId from memory request class In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit.
# 11364:1bd9f1b27438	04-Mar-2016	Andreas Hansson <andreas.hansson@arm.com>	base: Fix gpu-compute output stream creation Match changes in output stream.
# 11345:b6a66a90e0a1	18-Feb-2016	John Kalamatianos <john.kalamatianos@amd.com>	gpu: fix bugs with MemFence, Flat Instrs and Resource utilization Memory Fence is now flagged as Global Memory only to avoid oversubscribing resources. Flat instructions now check whether the Shared Memory resource is busy to avoid oversubscribing resources. All WaitClass resources now use cycles (not ticks) to register the number of pipe stages between Scoreboard and Execute, to be consistent with the instruction scheduling logic, which always used clock cycles.
# 11308:7d8836fd043d	19-Jan-2016	Tony Gutierrez <anthony.gutierrez@amd.com>	gpu-compute: AMD's baseline GPU model