stack_dist_calc.hh revision 10614:da37aec3ed1a
1/*
2 * Copyright (c) 2014 ARM Limited
3 * All rights reserved
4 *
5 * The license below extends only to copyright in the software and shall
6 * not be construed as granting a license to any other intellectual
7 * property including but not limited to intellectual property relating
8 * to a hardware implementation of the functionality of the software
9 * licensed hereunder.  You may use the software subject to the license
10 * terms below provided that you ensure that this notice is replicated
11 * unmodified and in its entirety in all distributions of the software,
12 * modified or unmodified, in source code or in binary form.
13 *
14 * Redistribution and use in source and binary forms, with or without
15 * modification, are permitted provided that the following conditions are
16 * met: redistributions of source code must retain the above copyright
17 * notice, this list of conditions and the following disclaimer;
18 * redistributions in binary form must reproduce the above copyright
19 * notice, this list of conditions and the following disclaimer in the
20 * documentation and/or other materials provided with the distribution;
21 * neither the name of the copyright holders nor the names of its
22 * contributors may be used to endorse or promote products derived from
23 * this software without specific prior written permission.
24 *
25 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
26 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
27 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
28 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
29 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
30 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
31 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
32 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
33 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
34 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
35 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
36 *
37 * Authors: Kanishk Sugand
38 *          Andreas Hansson
39 */
40
41#ifndef __MEM_STACK_DIST_CALC_HH__
42#define __MEM_STACK_DIST_CALC_HH__
43
44#include <map>
45#include <vector>
46
47#include "base/types.hh"
48#include "mem/packet.hh"
49#include "params/StackDistCalc.hh"
50#include "sim/sim_object.hh"
51#include "sim/stats.hh"
52
53/**
54  * The stack distance calculator is a passive object that merely
55  * observes the addresses pass to it. It calculates stack distances
56  * of incoming addresses based on the partial sum hierarchy tree
57  * algorithm described by Alamasi et al.
58  * http://doi.acm.org/10.1145/773039.773043.
59  *
60  * A tree structure is maintained and updated at each transaction
61  * (unique or non-unique). The tree is implemented as an STL vector
62  * with layers of the form <map> Each layer in this tree is an
63  * ordered map <uint64_t, Node*>. Nodes are structs which take form
64  * of leaf, intermediate and root nodes. For example, in a tree with 3
65  * layers, tree[0][5] gives a leaf node pointer for key=5 tree[1][1]
66  * gives an intermediate node pointer for key=1 tree[2][0] gives the
67  * root node in the tree.
68  *
69  * At every transaction a hash-map (aiMap) is looked up to check if
70  * the address was already encountered before. Based on this lookup a
71  * transaction can be termed as unique or non-unique.
72  *
73  * In addition to the normal stack distance calculation, a feature to
74  * mark an old node in the tree is added. This is useful if it is
75  * required to see the reuse pattern. For example, BackInvalidates
76  * from a lower level (e.g. membus to L2), can be marked (isMarked
77  * flag of Node set to True). Then later if this same address is
78  * accessed (by L1), the value of the isMarked flag would be
79  * True. This would give some insight on how the BackInvalidates
80  * policy of the lower level affect the read/write accesses in an
81  * application.
82  *
83  * There are two functions provided to interface with the calculator:
84  * 1. pair<uint64_t, bool> calcStackDistAndUpdate(Addr r_address,
85  *                                                bool addNewNode)
86  * At every unique transaction a new leaf node is added at tree[0](leaf layer)
87  * and linked to the layer above (if addNewNode is True). The sums of all
88  * the intermediate nodes is updated till the root. The stack-distance is
89  * returned as a Constant representing INFINITY.
90  *
91  * At every non-unique transaction the tree is traversed from the
92  * leaf at the returned index to the root, the old node is deleted
93  * from the tree, and the sums (to the right are collected) and
94  * decremented. The collected sum represets the stack distance of the
95  * found node. If this node was marked then a bool flag set to True
96  * is returned with the stack_distance. During this operation a node
97  * is discarded at the leaf layer always. Moreover during the
98  * traversal upwards using the updateSum() method, if an intermediate
99  * node is found with no children connected to it, then that is
100  * discarded too.
101  *
102  * The return value of this function is a pair representing the
103  * stack_distance and the value of the marked flag.
104  *
105  * 2. pair<uint64_t , bool> calcStackDist(Addr r_address, bool mark)
106  * This is a stripped down version of the above function which is used to
107  * just inspect the tree, and mark a leaf node (if mark flag is set). The
108  * functionality to add a new node is removed.
109  *
110  * At every unique transaction the stack-distance is returned as a constant
111  * representing INFINITY.
112  *
113  * At every non-unique transaction the tree is traversed from the
114  * leaf at the returned index to the root, and the sums (to the right)
115  * are collected. The collected sum represets the stack distance of
116  * the found node.
117  *
118  * This function does NOT Modify the stack. (No node is added or
119  * deleted).  It is just used to mark a node already created and get
120  * its stack distance.
121  *
122  * The return value of this function is a pair representing the stack
123  * distance and the value of the marked flag.
124  *
125  * The table below depicts the usage of the Algorithm using the functions:
126  * pair<uint64_t Stack_dist, bool isMarked> calcStackDistAndUpdate
127  *                                      (Addr r_address, bool addNewNode)
128  * pair<uint64_t Stack_dist, bool isMarked> calcStackDist
129  *                                              (Addr r_address, bool mark)
130  *
131  * |   Function           |   Arguments   |Return Val |Use For|
132  * |calcStackDistAndUpdate|r_address, True|I/SD,False |A,GD,GM|
133  * |calcStackDistAndUpdate|r_address,False|SD,prevMark|D,GD,GM|
134  * |calcStackDist         |r_address,False|SD,prevMark|  GD,GM|
135  * |calcStackDist         |r_address, True|SD,prevMark|  GD,GM|
136  *
137  * (*A: Allocate an address in stack, if old entry present then it is deleted,
138  *  *U: Delete old-address from stack, no new entry is added
139  *  *GD: Get-Stack distance of an address,
140  *  *GM: Get value of Mark flag, indicates if that address has been touched
141  *                                                                  before,
142  *  *I: stack-distance = infinity,
143  *  *SD: Stack Distance
144  *  *r_address: address to be added, *prevMark: value of isMarked flag
145  *                                                              of the Node)
146  *
147  * Invalidates refer to a type of packet that removes something from
148  * a cache, either autonoumously (due-to cache's own replacement
149  * policy), or snoops from other caches which invalidate something
150  * inside our cache.
151  *
152  * Usage            |   Function to use    |Typical Use           |
153  * Add new entry    |calcStackDistAndUpdate|Read/Write Allocate   |
154  * Delete Old Entry |calcStackDistAndUpdate|Writebacks/Cleanevicts|
155  * Dist.of Old entry|calcStackDist         |Cleanevicts/Invalidate|
156  *
157  * Node Balancing: The tree structure is maintained by an
158  * updateTree() operation called when an intermediate node is
159  * required. The update operation is roughly categorized as a root
160  * update or intermediate layer update. When number of leaf nodes
161  * grow over a power of 2 then a new layer is added at the top of the
162  * tree and a new root node is initialized. The old node at the lower
163  * layer is connected to this.  In an intermediate node update
164  * operation a new intermediate node is added to the required layer.
165  *
166  * Debugging: Debugging can be enabled by setting the verifyStack flag
167  * true. Debugging is implemented using a dummy stack that behaves in
168  * a naive way, using STL vectors (i.e each unique address is pushed
169  * on the top of an STL vector stack, and SD is returned as
170  * Infinity. If a non unique address is encountered then the previous
171  * entry in the STL vector is removed, all the entities above it are
172  * pushed down, and the address is pushed at the top of the stack).
173  *
174  * A printStack(int numOfEntitiesToPrint) is provided to print top n entities
175  * in both (tree and STL based dummy stack).
176  */
177class StackDistCalc : public SimObject
178{
179
180  private:
181
182    struct Node;
183
184    typedef std::map<uint64_t, Node*> IndexNodeMap;
185    typedef std::map<Addr, uint64_t> AddressIndexMap;
186    typedef std::vector<IndexNodeMap> TreeType;
187
188    /**
189     * Gets sum from the node upwards recursively till the root.  This
190     * function is called first by getSumsLeavesToRoot, and then
191     * recursively calls itself.
192     *
193     * @param node pointer to the node which is updated
194     * @param from_left variable which says that the request arrived
195     *        from the left
196     * @param sum_from_below Sum of left and right children below
197     * @param level level in the tree the calling node is located
198     * @param stack_dist stack distance of the node below
199     * @return The stack distance of the current address.
200     *
201     */
202    uint64_t getSum(Node* node, bool from_left, uint64_t sum_from_below,
203                    uint64_t stack_dist, uint64_t level) const;
204
205    /**
206     * Gets the sum from the leaf node specified. This function
207     * is called by calcStackDist.
208     *
209     * @param node pointer to the node which is updated
210     * @return The stack distance of the current address.
211     *
212     */
213    uint64_t getSumsLeavesToRoot(Node* node) const;
214
215    /**
216     * Updates the nodes upwards recursively till the root.
217     * This function is first called by updateSumsLeavesToRoot,
218     * and then it recursively calls itself.
219     *
220     * @param node pointer to the node which is updated
221     * @param from_left variable which says that the request arrived
222     * from the left
223     * @param sum_from_below Sum of left and right children below
224     * @param level level in the tree the calling node is located
225     * @param stack_dist stack distance of the node below
226     * @param discard_node whether the calling node was discarded or not
227     * @return The stack distance of the current address.
228     *
229     */
230    uint64_t updateSum(Node* node,
231                       bool from_left, uint64_t sum_from_below, uint64_t level,
232                       uint64_t stack_dist, bool discard_node);
233
234    /**
235     * Updates the leaf nodes and nodes above. This function is
236     * called by the calcStackDistAndUpdate.
237     *
238     * @param node pointer to the node which is updated
239     * @param is_new_leaf is true if this is a newly added node
240     * @return The stack distance of the current address.
241     *
242     */
243    uint64_t updateSumsLeavesToRoot(Node* node, bool is_new_leaf);
244
245    /**
246     * updateTree is a tree balancing operation, which maintains the
247     * binary tree structure.
248     * This method is called whenever index%2 == 0 (i.e. every
249     * alternate cycle) The two main operation are :
250     * OP1. Moving the root node one layer up if index counter
251     *       crosses power of 2
252     * OP2. Addition of intermediate nodes as and when required
253     *      and linking them to their parents in the layer above.
254     */
255    void updateTree();
256
257    /**
258     * This method is used for verification purposes
259     * It recursively traverses upwards from the given node till
260     * the root to check if the ultimate parent node (root-node) points
261     * to null.
262     *
263     * @param node pointer to the node whose sanity is being checked
264     * @param level the level at which this node is located in the tree
265     *
266     */
267    void sanityCheckTree(const Node* node, uint64_t level = 0) const;
268
269    /**
270     * A convenient way of refering to infinity.
271     */
272    static constexpr uint64_t Infinity = std::numeric_limits<uint64_t>::max();
273
274    /**
275     * Process the given address. If Mark is true then set the
276     * mark flag of the leaf node.
277     * This function returns the stack distance of the incoming
278     * address and the previous status of the mark flag.
279     *
280     * @param r_address The current address to process
281     * @param mark set the mark flag for the address.
282     * @return The stack distance of the current address and the mark flag.
283     */
284    std::pair<uint64_t , bool> calcStackDist(const Addr r_address,
285                                             bool mark = false);
286
287    /**
288     * Process the given address:
289     *  - Lookup the tree for the given address
290     *  - delete old node if found in tree
291     *  - add a new node (if addNewNode flag is set)
292     * This function returns the stack distance of the incoming
293     * address and the status of the mark flag.
294     *
295     * @param r_address The current address to process
296     * @param addNewNode If true, a new node is added to the tree
297     * @return The stack distance of the current address and the mark flag.
298     */
299    std::pair<uint64_t, bool> calcStackDistAndUpdate(const Addr r_address,
300                                                     bool addNewNode = true);
301
302    /**
303     * Return the counter for address accesses (unique and
304     * non-unique). This is further used to dump stats at
305     * regular intervals.
306     *
307     * @return The stack distance of the current address.
308     */
309    uint64_t getIndex() const { return index; }
310
311    /**
312     * Query depth of the tree (tree[0] represents leaf layer while
313     * tree[treeDepth] represents the root layer, all layers in
314     * between contain intermediate nodes)
315     *
316     * @return Tree depth
317     */
318    uint64_t getTreeDepth() const { return tree.size() - 1; }
319
320    /**
321     * Print the last n items on the stack.
322     * This method prints top n entries in the tree based implementation as
323     * well as dummy stack.
324     * @param n Number of entries to print
325     */
326    void printStack(int n = 5) const;
327
328    /**
329     * This is an alternative implementation of the stack-distance
330     * in a naive way. It uses simple STL vector to represent the stack.
331     * It can be used in parallel for debugging purposes.
332     * It is 10x slower than the tree based implemenation.
333     *
334     * @param r_address The current address to process
335     * @param update_stack Flag to indicate if stack should be updated
336     * @return  Stack distance which is calculated by this alternative
337     * implementation
338     *
339     */
340    uint64_t verifyStackDist(const Addr r_address,
341                             bool update_stack = false);
342
343  public:
344
345    /**
346     * Convenience method to get the params when registering stats.
347     */
348    const StackDistCalcParams* params() const
349    { return reinterpret_cast<const StackDistCalcParams*>(_params); }
350
351    StackDistCalc(const StackDistCalcParams* p);
352
353    ~StackDistCalc();
354
355    void regStats();
356
357    /**
358     * Update the tree and the statistics.
359     *
360     * @param cmd Command from the packet
361     * @param addr Address to put on the stack
362     */
363    void update(const MemCmd& cmd, Addr addr);
364
365  private:
366
367    /**
368     * Node which takes form of Leaf, INode or Root
369     */
370    struct Node{
371        // Sum of the left children
372        uint64_t sumLeft;
373
374        // Sum of the right children
375        uint64_t sumRight;
376
377        // Flag to indicate that sumLeft has gone from non-zero value to 0
378        bool discardLeft;
379
380        // Flag to indicate that sumRight has gone from non-zero value to 0
381        bool discardRight;
382
383        // Index of the current element in the Map
384        uint64_t nodeIndex;
385
386        // Pointer to the parent
387        Node* parent;
388
389        // Flag to mark the node as the right/left child
390        bool isLeftNode;
391
392        /**
393         * Flag to indicate if this address is marked. Used in case
394         * where stack distance of a touched address is required.
395         */
396        bool isMarked;
397
398        /**
399         * The discard flags are false by default they become true if
400         * the node is reached again in a future lookup.
401         */
402        Node() : sumLeft(0), sumRight(0), discardLeft(false),
403                 discardRight(false), nodeIndex(0),
404                 parent(nullptr), isLeftNode(true), isMarked(false)
405        { }
406    };
407
408    /**
409     * Internal counter for address accesses (unique and non-unique)
410     * This counter increments everytime the calcStackDist() method is
411     * called. This counter is used as a key for the hash- map at the
412     * leaf layer. Practically at every call to the calculator this
413     * counter is incremented and a new leaf node is added in the tree
414     * at the leaf layer using this counter value as the key.
415     */
416    uint64_t index;
417
418    // Binary tree of partial sums
419    TreeType tree;
420
421    // Hash map which returns last seen index of each address
422    AddressIndexMap aiMap;
423
424    // Keeps count of number of the next unique index for each
425    // level in the tree
426    std::vector<uint64_t> nextIndex;
427
428    // Dummy Stack for verification
429    std::vector<uint64_t> stack;
430
431    // Flag to enable verification of stack. (Slows down the simulation)
432    const bool verifyStack;
433
434    // Disable the linear histograms
435    const bool disableLinearHists;
436
437    // Disable the logarithmic histograms
438    const bool disableLogHists;
439
440    // Reads linear histogram
441    Stats::Histogram readLinearHist;
442
443    // Reads logarithmic histogram
444    Stats::SparseHistogram readLogHist;
445
446    // Writes linear histogram
447    Stats::Histogram writeLinearHist;
448
449    // Writes logarithmic histogram
450    Stats::SparseHistogram writeLogHist;
451
452};
453
454#endif //__STACK_DIST_CALC_HH__
455