#
14096:bde52fccbf0f |
|
12-Jul-2019 |
Matthew Poremba <matthew.poremba@amd.com> |
arch-x86: Don't free PTW state with inflight requests
If a page table walk is squashed, the walker state is being deleted in the squash code. If there are in flight requests, the deleted walker state values may be clobbered, leading to undefined behavior. This adds a squashed boolean to the walker state which is set if a walk is squashed while requests are still in flight. When packets for the in flight request return, we check if the walk was squashed and return that the walk is complete once the number of in flight requests reaches zero. The walker state is then freed by the PTW.
Change-Id: I57a64b1548b83a8a9e8441fc9d6f33e9842df2b3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19568 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
|
#
13892:0182a0601f66 |
|
22-Apr-2019 |
Gabe Black <gabeblack@google.com> |
mem: Minimize the use of MemObject.
MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed.
Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
13784:1941dc118243 |
|
07-Mar-2019 |
Gabe Black <gabeblack@google.com> |
arch, cpu, dev, gpu, mem, sim, python: start using getPort.
Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary.
Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
#
13229:b45254f2733a |
|
12-Oct-2018 |
Gabe Black <gabeblack@google.com> |
x86: Use little endian packet accessors.
We know data is little endian, so we can use those accessors explicitly.
Change-Id: I09aa7f1e525ad1346e932ce4a772b64bf59dc350 Reviewed-on: https://gem5-review.googlesource.com/c/13456 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Gabe Black <gabeblack@google.com>
|
#
12749:223c83ed9979 |
|
04-Jun-2018 |
Giacomo Travaglini <giacomo.travaglini@arm.com> |
misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request* to shared_ptr<Request>. Having memory requests being managed by smart pointers will simplify the code; it will also prevent memory leakage and dangling pointers.
Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10996 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
#
11793:ef606668d247 |
|
09-Nov-2016 |
Brandon Potter <brandon.potter@amd.com> |
style: [patch 1/22] use /r/3648/ to reorganize includes
|
#
11321:02e930db812d |
|
06-Feb-2016 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'.
|
#
11216:80e82ce1978d |
|
16-Nov-2015 |
Bjoern A. Zeeb <baz21@cam.ac.uk> |
x86: pagetable walker: fix typo in comment
|
#
10713:eddb533708cb |
|
02-Mar-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.
|
#
10694:1a6785e37d81 |
|
11-Feb-2015 |
Marco Balboni <Marco.Balboni@ARM.com> |
mem: Clarification of packet crossbar timings
This patch clarifies the packet timings annotated when going through a crossbar.
The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated to the delivery of the header of the packet.
The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to processing the payload of the packet.
For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.
|
#
10660:87f7b5a07584 |
|
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Remove unused Packet src and dest fields
This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed.
|
#
10654:e49bf4884c59 |
|
22-Jan-2015 |
Andreas Hansson <andreas.hansson@arm.com> |
x86: Delay X86 table walk on receiving walker response
This patch fixes a minor issue in the X86 page table walker where it ended up sending new request packets to the crossbar before the response processing was finished (recvTimingResp is directly calling sendTimingReq). Under certain conditions this caused the crossbar to see illegal combinations of request/response overlap, in turn causing problems with a slightly modified crossbar implementation.
|
#
10474:799c8ee4ecba |
|
16-Oct-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
arch: Use shared_ptr for all Faults
This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".
|
#
10405:7a618c07e663 |
|
20-Sep-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus.
As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet.
|
#
10299:bec0c5ffc323 |
|
28-Aug-2014 |
Alexandru <alexandru.dutu@amd.com> |
mem: adding architectural page table support for SE mode This patch enables the use of page tables that are stored in system memory and respect x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs page tables, giving the possibility for future implementation.
|
#
10241:1444f2ee67d7 |
|
21-Jun-2014 |
Binh Pham <binhpham@cs.rutgers.edu> |
x86: fix table walker assertion
In a cycle, we could see a R and W requests corresponding to the same page walk being sent to the memory. During the cycle that assertion happens, we have 2 responses corresponding to the R and W above. We also have a 'read' variable to keep track of the inflight Read request, this gets reset to NULL right after we send out any R request; and gets set to the next R in the page walk when a response comes back.
The issue we are seeing here is when we get a response for W request, assert(!read) fires because we got a response for R request right before this, hence we set 'read' to NOT NULL value, pointing to the next R request in the pagewalk!
This work was done while Binh was an intern at AMD Research.
|
#
10231:cb2e6950956d |
|
31-May-2014 |
Steve Reinhardt <steve.reinhardt@amd.com> |
style: eliminate equality tests with true and false
Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'.
It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up.
Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code.
|
#
10018:c9ef81684179 |
|
24-Jan-2014 |
Andreas Hansson <andreas.hansson@arm.com> |
x86: Fix memory leak in table walker
This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.
|
#
9701:f02f3b6562d5 |
|
21-May-2013 |
Gedare Bloom <gedare@rtems.org> |
x86: Squash outstanding walks when instructions are squashed. This is the x86 version of the ARM changeset baa17ba80e06. In case an instruction has been squashed by the o3 cpu, this patch allows page table walker to avoid carrying out a pending translation that the instruction requested for.
|
#
9579:2a13ddb8bd0d |
|
07-Mar-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
x86: Make the table walker reset the packet delay
This patch fixes an issue related to the table walker recycling packets that still have a bus delay that is not accounted for. For now, we simply ignore the values and reset them to zero.
|
#
9542:683991c46ac8 |
|
19-Feb-2013 |
Andreas Hansson <andreas.hansson@arm.com> |
mem: Add predecessor to SenderState base class
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses.
There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.
|
#
9524:d6ffa982a68b |
|
15-Feb-2013 |
Andreas Sandberg <Andreas.Sandberg@ARM.com> |
sim: Add a system-global option to bypass caches
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches.
To make memory mode tests cleaner, the following methods are added to the System class:
* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed.
The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.
|
#
9294:8fb03b13de02 |
|
15-Oct-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
|
#
9165:f9e3dac185ba |
|
22-Aug-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
Packet: Remove NACKs from packet and its use in endpoints
This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that).
The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe.
Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.
|
#
8975:7f36d4436074 |
|
01-May-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses.
For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
|
#
8953:488d45aeb672 |
|
15-Apr-2012 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Use the AddrTrie class to implement the TLB.
This change also adjusts the TlbEntry class so that it stores the number of address bits wide a page is rather than its size in bytes. In other words, instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K, but it's a little harder going the other way.
|
#
8949:3fa1ee293096 |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field.
Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class.
In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
|
#
8948:e95ee70f876c |
|
14-Apr-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses.
The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
|
#
8922:17f037ad8918 |
|
30-Mar-2012 |
William Wang <william.wang@arm.com> |
MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
|
#
8832:247fee427324 |
|
12-Feb-2012 |
Ali Saidi <Ali.Saidi@ARM.com> |
mem: Add a master ID to each request object.
This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.
|
#
8711:c7e14f52c682 |
|
17-Jan-2012 |
Andreas Hansson <andreas.hansson@arm.com> |
MEM: Separate queries for snooping and address ranges
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.
|
#
8232:b28d06a175be |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
|
#
8229:78bf55f23338 |
|
15-Apr-2011 |
Nathan Binkert <nate@binkert.org> |
includes: sort all includes
|
#
8096:021a0724c5c0 |
|
27-Feb-2011 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Use regular read requests in the walker instead of read exclusive.
|
#
7912:a9f05ab40763 |
|
07-Feb-2011 |
Joel Hestness <hestness@cs.utexas.edu> |
x86: Timing support for pagetable walker
Move page table walker state to its own object type, and make the walker instantiate state for each outstanding walk. By storing the states in a queue, the walker is able to handle multiple outstanding timing requests. Note that functional walks use separate state elements.
|
#
7087:fb8d5786ff30 |
|
24-May-2010 |
Nathan Binkert <nate@binkert.org> |
copyright: Change HP copyright on x86 code to be more friendly
|
#
6027:3d7c2fe13f6a |
|
13-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix minor bug in the page table walker from TLB shuffling.
|
#
6023:47b4fcb10c11 |
|
09-Apr-2009 |
Nathan Binkert <nate@binkert.org> |
tlb: More fixing of unified TLB
|
#
6022:410194bb3049 |
|
09-Apr-2009 |
Gabe Black <gblack@eecs.umich.edu> |
tlb: Don't separate the TLB classes into an instruction TLB and a data TLB
|
#
5904:5c61233cbd53 |
|
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Add a trace flag for the page table walker.
|
#
5897:29cecf4fe602 |
|
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Fix the timing mode of the page table walker.
|
#
5895:569e3b31a868 |
|
25-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults.
|
#
5881:73c0aaaaf186 |
|
23-Feb-2009 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Pass whether an access was a read/write/fetch so faults can behave accordingly.
|
#
5736:426510e758ad |
|
10-Nov-2008 |
Nathan Binkert <nate@binkert.org> |
mem: update stuff for changes to Packet and Request
|
#
5304:a685ea6cc8b8 |
|
02-Dec-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Make the page not present panic more descriptive.
|
#
5245:d94bb8af9f76 |
|
12-Nov-2007 |
Gabe Black <gblack@eecs.umich.edu> |
X86: Separate out the page table walker into it's own cc and hh.
|