History log of /gem5/src/arch/x86/pagetable_walker.cc
Revision Date Author Comments
# 14096:bde52fccbf0f 12-Jul-2019 Matthew Poremba <matthew.poremba@amd.com>

arch-x86: Don't free PTW state with inflight requests

If a page table walk is squashed, the walker state is being deleted
in the squash code. If there are in flight requests, the deleted
walker state values may be clobbered, leading to undefined behavior.
This adds a squashed boolean to the walker state which is set if a
walk is squashed while requests are still in flight. When packets
for the in flight request return, we check if the walk was squashed
and return that the walk is complete once the number of in flight
requests reaches zero. The walker state is then freed by the PTW.

Change-Id: I57a64b1548b83a8a9e8441fc9d6f33e9842df2b3
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19568
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>


# 13892:0182a0601f66 22-Apr-2019 Gabe Black <gabeblack@google.com>

mem: Minimize the use of MemObject.

MemObject doesn't provide anything beyond its base ClockedObject any
more, so this change removes it from most inheritance hierarchies.
Occasionally MemObject is replaced with SimObject when I was fairly
confident that the extra functionality of ClockedObject wasn't needed.

Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13784:1941dc118243 07-Mar-2019 Gabe Black <gabeblack@google.com>

arch, cpu, dev, gpu, mem, sim, python: start using getPort.

Replace the getMasterPort, getSlavePort, and getEthPort functions
with getPort, and remove extraneous mechanisms that are no longer
necessary.

Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 13229:b45254f2733a 12-Oct-2018 Gabe Black <gabeblack@google.com>

x86: Use little endian packet accessors.

We know data is little endian, so we can use those accessors
explicitly.

Change-Id: I09aa7f1e525ad1346e932ce4a772b64bf59dc350
Reviewed-on: https://gem5-review.googlesource.com/c/13456
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12749:223c83ed9979 04-Jun-2018 Giacomo Travaglini <giacomo.travaglini@arm.com>

misc: Using smart pointers for memory Requests

This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>


# 11793:ef606668d247 09-Nov-2016 Brandon Potter <brandon.potter@amd.com>

style: [patch 1/22] use /r/3648/ to reorganize includes


# 11321:02e930db812d 06-Feb-2016 Steve Reinhardt <steve.reinhardt@amd.com>

style: fix missing spaces in control statements

Result of running 'hg m5style --skip-all --fix-control -a'.


# 11216:80e82ce1978d 16-Nov-2015 Bjoern A. Zeeb <baz21@cam.ac.uk>

x86: pagetable walker: fix typo in comment


# 10713:eddb533708cb 02-Mar-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Split port retry for all different packet classes

This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.


# 10694:1a6785e37d81 11-Feb-2015 Marco Balboni <Marco.Balboni@ARM.com>

mem: Clarification of packet crossbar timings

This patch clarifies the packet timings annotated
when going through a crossbar.

The old 'firstWordDelay' is replaced by 'headerDelay' that represents
the delay associated to the delivery of the header of the packet.

The old 'lastWordDelay' is replaced by 'payloadDelay' that represents
the delay needed to processing the payload of the packet.

For now the uses and values remain identical. However, going forward
the payloadDelay will be additive, and not include the
headerDelay. Follow-on patches will make the headerDelay capture the
pipeline latency incurred in the crossbar, whereas the payloadDelay
will capture the additional serialisation delay.


# 10660:87f7b5a07584 22-Jan-2015 Andreas Hansson <andreas.hansson@arm.com>

mem: Remove unused Packet src and dest fields

This patch takes the final step in removing the src and dest fields in
the packet. These fields were rather confusing in that they only
remember a single multiplexing component, and pushed the
responsibility to the bridge and caches to store the fields in a
senderstate, thus effectively creating a stack. With the recent
changes to the crossbar response routing the crossbar is now
responsible without relying on the packet fields. Thus, these
variables are now unused and can be removed.


# 10654:e49bf4884c59 22-Jan-2015 Andreas Hansson <andreas.hansson@arm.com>

x86: Delay X86 table walk on receiving walker response

This patch fixes a minor issue in the X86 page table walker where it
ended up sending new request packets to the crossbar before the
response processing was finished (recvTimingResp is directly calling
sendTimingReq). Under certain conditions this caused the crossbar to
see illegal combinations of request/response overlap, in turn causing
problems with a slightly modified crossbar implementation.


# 10474:799c8ee4ecba 16-Oct-2014 Andreas Hansson <andreas.hansson@arm.com>

arch: Use shared_ptr for all Faults

This patch takes quite a large step in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr by adopting its use for all
Faults. There are no changes in behaviour, and the code modifications
are mostly just replacing "new" with "make_shared".


# 10405:7a618c07e663 20-Sep-2014 Andreas Hansson <andreas.hansson@arm.com>

mem: Rename Bus to XBar to better reflect its behaviour

This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.


# 10299:bec0c5ffc323 28-Aug-2014 Alexandru <alexandru.dutu@amd.com>

mem: adding architectural page table support for SE mode
This patch enables the use of page tables that are stored in system memory
and respect x86 specification, in SE mode. It defines an architectural
page table for x86 as a MultiLevelPageTable class and puts a placeholder
class for other ISAs page tables, giving the possibility for future
implementation.


# 10241:1444f2ee67d7 21-Jun-2014 Binh Pham <binhpham@cs.rutgers.edu>

x86: fix table walker assertion

In a cycle, we could see a R and W requests corresponding to the same
page walk being sent to the memory. During the cycle that assertion
happens, we have 2 responses corresponding to the R and W above. We
also have a 'read' variable to keep track of the inflight Read
request, this gets reset to NULL right after we send out any R
request; and gets set to the next R in the page walk when a response
comes back.

The issue we are seeing here is when we get a response for W request,
assert(!read) fires because we got a response for R request right
before this, hence we set 'read' to NOT NULL value, pointing to the
next R request in the pagewalk!

This work was done while Binh was an intern at AMD Research.


# 10231:cb2e6950956d 31-May-2014 Steve Reinhardt <steve.reinhardt@amd.com>

style: eliminate equality tests with true and false

Using '== true' in a boolean expression is totally redundant,
and using '== false' is pretty verbose (and arguably less
readable in most cases) compared to '!'.

It's somewhat of a pet peeve, perhaps, but I had some time
waiting for some tests to run and decided to clean these up.

Unfortunately, SLICC appears not to have the '!' operator,
so I had to leave the '== false' tests in the SLICC code.


# 10018:c9ef81684179 24-Jan-2014 Andreas Hansson <andreas.hansson@arm.com>

x86: Fix memory leak in table walker

This patch fixes a memory leak in the table walker, by ensuring that
the sender state is deleted again if the request packet cannot be
successfully sent.


# 9701:f02f3b6562d5 21-May-2013 Gedare Bloom <gedare@rtems.org>

x86: Squash outstanding walks when instructions are squashed.
This is the x86 version of the ARM changeset baa17ba80e06. In case an
instruction has been squashed by the o3 cpu, this patch allows page
table walker to avoid carrying out a pending translation that the
instruction requested for.


# 9579:2a13ddb8bd0d 07-Mar-2013 Andreas Hansson <andreas.hansson@arm.com>

x86: Make the table walker reset the packet delay

This patch fixes an issue related to the table walker recycling
packets that still have a bus delay that is not accounted for. For
now, we simply ignore the values and reset them to zero.


# 9542:683991c46ac8 19-Feb-2013 Andreas Hansson <andreas.hansson@arm.com>

mem: Add predecessor to SenderState base class

This patch adds a predecessor field to the SenderState base class to
make the process of linking them up more uniform, and enable a
traversal of the stack without knowing the specific type of the
subclasses.

There are a number of simplifications done as part of changing the
SenderState, particularly in the RubyTest.


# 9524:d6ffa982a68b 15-Feb-2013 Andreas Sandberg <Andreas.Sandberg@ARM.com>

sim: Add a system-global option to bypass caches

Virtualized CPUs and the fastmem mode of the atomic CPU require direct
access to physical memory. We currently require caches to be disabled
when using them to prevent chaos. This is not ideal when switching
between hardware virutalized CPUs and other CPU models as it would
require a configuration change on each switch. This changeset
introduces a new version of the atomic memory mode,
'atomic_noncaching', where memory accesses are inserted into the
memory system as atomic accesses, but bypass caches.

To make memory mode tests cleaner, the following methods are added to
the System class:

* isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'.
* isTimingMode() -- True if the memory mode is 'timing'.
* bypassCaches() -- True if caches should be bypassed.

The old getMemoryMode() and setMemoryMode() methods should never be
used from the C++ world anymore.


# 9294:8fb03b13de02 15-Oct-2012 Andreas Hansson <andreas.hansson@arm.com>

Port: Add protocol-agnostic ports in the port hierarchy

This patch adds an additional level of ports in the inheritance
hierarchy, separating out the protocol-specific and protocl-agnostic
parts. All the functionality related to the binding of ports is now
confined to use BaseMaster/BaseSlavePorts, and all the
protocol-specific parts stay in the Master/SlavePort. In the future it
will be possible to add other protocol-specific implementations.

The functions used in the binding of ports, i.e. getMaster/SlavePort
now use the base classes, and the index parameter is updated to use
the PortID typedef with the symbolic InvalidPortID as the default.


# 9165:f9e3dac185ba 22-Aug-2012 Andreas Hansson <andreas.hansson@arm.com>

Packet: Remove NACKs from packet and its use in endpoints

This patch removes the NACK frrom the packet as there is no longer any
module in the system that issues them (the bridge was the only one and
the previous patch removes that).

The handling of NACKs was mostly avoided throughout the code base, by
using e.g. panic or assert false, but in a few locations the NACKs
were actually dealt with (although NACKs never occured in any of the
regressions). Most notably, the DMA port will now never receive a NACK
and the backoff time is thus never changed. As a consequence, the
entire backoff mechanism (similar to a PCI bus) is now removed and the
DMA port entirely relies on the bus performing the arbitration and
issuing a retry when appropriate. This is more in line with e.g. PCIe.

Surprisingly, this patch has no impact on any of the regressions. As
mentioned in the patch that removes the NACK from the bridge, a
follow-up patch should change the request and response buffer size for
at least one regression to also verify that the system behaves as
expected when the bridge fills up.


# 8975:7f36d4436074 01-May-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate requests and responses for timing accesses

This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.


# 8953:488d45aeb672 15-Apr-2012 Gabe Black <gblack@eecs.umich.edu>

X86: Use the AddrTrie class to implement the TLB.

This change also adjusts the TlbEntry class so that it stores the number of
address bits wide a page is rather than its size in bytes. In other words,
instead of storing 4K for a 4K page, it stores 12. 12 is easy to turn into 4K,
but it's a little harder going the other way.


# 8949:3fa1ee293096 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Remove the Broadcast destination from the packet

This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.

Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).

The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.

In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.


# 8948:e95ee70f876c 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate snoops and normal memory requests/responses

This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.


# 8922:17f037ad8918 30-Mar-2012 William Wang <william.wang@arm.com>

MEM: Introduce the master/slave port sub-classes in C++

This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.


# 8832:247fee427324 12-Feb-2012 Ali Saidi <Ali.Saidi@ARM.com>

mem: Add a master ID to each request object.

This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.


# 8711:c7e14f52c682 17-Jan-2012 Andreas Hansson <andreas.hansson@arm.com>

MEM: Separate queries for snooping and address ranges

This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.


# 8232:b28d06a175be 15-Apr-2011 Nathan Binkert <nate@binkert.org>

trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help


# 8229:78bf55f23338 15-Apr-2011 Nathan Binkert <nate@binkert.org>

includes: sort all includes


# 8096:021a0724c5c0 27-Feb-2011 Gabe Black <gblack@eecs.umich.edu>

X86: Use regular read requests in the walker instead of read exclusive.


# 7912:a9f05ab40763 07-Feb-2011 Joel Hestness <hestness@cs.utexas.edu>

x86: Timing support for pagetable walker

Move page table walker state to its own object type, and make the
walker instantiate state for each outstanding walk. By storing the
states in a queue, the walker is able to handle multiple outstanding
timing requests. Note that functional walks use separate state
elements.


# 7087:fb8d5786ff30 24-May-2010 Nathan Binkert <nate@binkert.org>

copyright: Change HP copyright on x86 code to be more friendly


# 6027:3d7c2fe13f6a 13-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Fix minor bug in the page table walker from TLB shuffling.


# 6023:47b4fcb10c11 09-Apr-2009 Nathan Binkert <nate@binkert.org>

tlb: More fixing of unified TLB


# 6022:410194bb3049 09-Apr-2009 Gabe Black <gblack@eecs.umich.edu>

tlb: Don't separate the TLB classes into an instruction TLB and a data TLB


# 5904:5c61233cbd53 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Add a trace flag for the page table walker.


# 5897:29cecf4fe602 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Fix the timing mode of the page table walker.


# 5895:569e3b31a868 25-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Make the X86 TLB take advantage of delayed translations, and get rid of the fake TLB miss faults.


# 5881:73c0aaaaf186 23-Feb-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Pass whether an access was a read/write/fetch so faults can behave accordingly.


# 5736:426510e758ad 10-Nov-2008 Nathan Binkert <nate@binkert.org>

mem: update stuff for changes to Packet and Request


# 5304:a685ea6cc8b8 02-Dec-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Make the page not present panic more descriptive.


# 5245:d94bb8af9f76 12-Nov-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Separate out the page table walker into it's own cc and hh.