stack_dist_calc.hh revision 10614:da37aec3ed1a
1/* 2 * Copyright (c) 2014 ARM Limited 3 * All rights reserved 4 * 5 * The license below extends only to copyright in the software and shall 6 * not be construed as granting a license to any other intellectual 7 * property including but not limited to intellectual property relating 8 * to a hardware implementation of the functionality of the software 9 * licensed hereunder. You may use the software subject to the license 10 * terms below provided that you ensure that this notice is replicated 11 * unmodified and in its entirety in all distributions of the software, 12 * modified or unmodified, in source code or in binary form. 13 * 14 * Redistribution and use in source and binary forms, with or without 15 * modification, are permitted provided that the following conditions are 16 * met: redistributions of source code must retain the above copyright 17 * notice, this list of conditions and the following disclaimer; 18 * redistributions in binary form must reproduce the above copyright 19 * notice, this list of conditions and the following disclaimer in the 20 * documentation and/or other materials provided with the distribution; 21 * neither the name of the copyright holders nor the names of its 22 * contributors may be used to endorse or promote products derived from 23 * this software without specific prior written permission. 24 * 25 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 26 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 27 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 28 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 29 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 30 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 31 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 32 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 33 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 34 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 35 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 36 * 37 * Authors: Kanishk Sugand 38 * Andreas Hansson 39 */ 40 41#ifndef __MEM_STACK_DIST_CALC_HH__ 42#define __MEM_STACK_DIST_CALC_HH__ 43 44#include <map> 45#include <vector> 46 47#include "base/types.hh" 48#include "mem/packet.hh" 49#include "params/StackDistCalc.hh" 50#include "sim/sim_object.hh" 51#include "sim/stats.hh" 52 53/** 54 * The stack distance calculator is a passive object that merely 55 * observes the addresses pass to it. It calculates stack distances 56 * of incoming addresses based on the partial sum hierarchy tree 57 * algorithm described by Alamasi et al. 58 * http://doi.acm.org/10.1145/773039.773043. 59 * 60 * A tree structure is maintained and updated at each transaction 61 * (unique or non-unique). The tree is implemented as an STL vector 62 * with layers of the form <map> Each layer in this tree is an 63 * ordered map <uint64_t, Node*>. Nodes are structs which take form 64 * of leaf, intermediate and root nodes. For example, in a tree with 3 65 * layers, tree[0][5] gives a leaf node pointer for key=5 tree[1][1] 66 * gives an intermediate node pointer for key=1 tree[2][0] gives the 67 * root node in the tree. 68 * 69 * At every transaction a hash-map (aiMap) is looked up to check if 70 * the address was already encountered before. Based on this lookup a 71 * transaction can be termed as unique or non-unique. 72 * 73 * In addition to the normal stack distance calculation, a feature to 74 * mark an old node in the tree is added. This is useful if it is 75 * required to see the reuse pattern. For example, BackInvalidates 76 * from a lower level (e.g. membus to L2), can be marked (isMarked 77 * flag of Node set to True). Then later if this same address is 78 * accessed (by L1), the value of the isMarked flag would be 79 * True. This would give some insight on how the BackInvalidates 80 * policy of the lower level affect the read/write accesses in an 81 * application. 82 * 83 * There are two functions provided to interface with the calculator: 84 * 1. pair<uint64_t, bool> calcStackDistAndUpdate(Addr r_address, 85 * bool addNewNode) 86 * At every unique transaction a new leaf node is added at tree[0](leaf layer) 87 * and linked to the layer above (if addNewNode is True). The sums of all 88 * the intermediate nodes is updated till the root. The stack-distance is 89 * returned as a Constant representing INFINITY. 90 * 91 * At every non-unique transaction the tree is traversed from the 92 * leaf at the returned index to the root, the old node is deleted 93 * from the tree, and the sums (to the right are collected) and 94 * decremented. The collected sum represets the stack distance of the 95 * found node. If this node was marked then a bool flag set to True 96 * is returned with the stack_distance. During this operation a node 97 * is discarded at the leaf layer always. Moreover during the 98 * traversal upwards using the updateSum() method, if an intermediate 99 * node is found with no children connected to it, then that is 100 * discarded too. 101 * 102 * The return value of this function is a pair representing the 103 * stack_distance and the value of the marked flag. 104 * 105 * 2. pair<uint64_t , bool> calcStackDist(Addr r_address, bool mark) 106 * This is a stripped down version of the above function which is used to 107 * just inspect the tree, and mark a leaf node (if mark flag is set). The 108 * functionality to add a new node is removed. 109 * 110 * At every unique transaction the stack-distance is returned as a constant 111 * representing INFINITY. 112 * 113 * At every non-unique transaction the tree is traversed from the 114 * leaf at the returned index to the root, and the sums (to the right) 115 * are collected. The collected sum represets the stack distance of 116 * the found node. 117 * 118 * This function does NOT Modify the stack. (No node is added or 119 * deleted). It is just used to mark a node already created and get 120 * its stack distance. 121 * 122 * The return value of this function is a pair representing the stack 123 * distance and the value of the marked flag. 124 * 125 * The table below depicts the usage of the Algorithm using the functions: 126 * pair<uint64_t Stack_dist, bool isMarked> calcStackDistAndUpdate 127 * (Addr r_address, bool addNewNode) 128 * pair<uint64_t Stack_dist, bool isMarked> calcStackDist 129 * (Addr r_address, bool mark) 130 * 131 * | Function | Arguments |Return Val |Use For| 132 * |calcStackDistAndUpdate|r_address, True|I/SD,False |A,GD,GM| 133 * |calcStackDistAndUpdate|r_address,False|SD,prevMark|D,GD,GM| 134 * |calcStackDist |r_address,False|SD,prevMark| GD,GM| 135 * |calcStackDist |r_address, True|SD,prevMark| GD,GM| 136 * 137 * (*A: Allocate an address in stack, if old entry present then it is deleted, 138 * *U: Delete old-address from stack, no new entry is added 139 * *GD: Get-Stack distance of an address, 140 * *GM: Get value of Mark flag, indicates if that address has been touched 141 * before, 142 * *I: stack-distance = infinity, 143 * *SD: Stack Distance 144 * *r_address: address to be added, *prevMark: value of isMarked flag 145 * of the Node) 146 * 147 * Invalidates refer to a type of packet that removes something from 148 * a cache, either autonoumously (due-to cache's own replacement 149 * policy), or snoops from other caches which invalidate something 150 * inside our cache. 151 * 152 * Usage | Function to use |Typical Use | 153 * Add new entry |calcStackDistAndUpdate|Read/Write Allocate | 154 * Delete Old Entry |calcStackDistAndUpdate|Writebacks/Cleanevicts| 155 * Dist.of Old entry|calcStackDist |Cleanevicts/Invalidate| 156 * 157 * Node Balancing: The tree structure is maintained by an 158 * updateTree() operation called when an intermediate node is 159 * required. The update operation is roughly categorized as a root 160 * update or intermediate layer update. When number of leaf nodes 161 * grow over a power of 2 then a new layer is added at the top of the 162 * tree and a new root node is initialized. The old node at the lower 163 * layer is connected to this. In an intermediate node update 164 * operation a new intermediate node is added to the required layer. 165 * 166 * Debugging: Debugging can be enabled by setting the verifyStack flag 167 * true. Debugging is implemented using a dummy stack that behaves in 168 * a naive way, using STL vectors (i.e each unique address is pushed 169 * on the top of an STL vector stack, and SD is returned as 170 * Infinity. If a non unique address is encountered then the previous 171 * entry in the STL vector is removed, all the entities above it are 172 * pushed down, and the address is pushed at the top of the stack). 173 * 174 * A printStack(int numOfEntitiesToPrint) is provided to print top n entities 175 * in both (tree and STL based dummy stack). 176 */ 177class StackDistCalc : public SimObject 178{ 179 180 private: 181 182 struct Node; 183 184 typedef std::map<uint64_t, Node*> IndexNodeMap; 185 typedef std::map<Addr, uint64_t> AddressIndexMap; 186 typedef std::vector<IndexNodeMap> TreeType; 187 188 /** 189 * Gets sum from the node upwards recursively till the root. This 190 * function is called first by getSumsLeavesToRoot, and then 191 * recursively calls itself. 192 * 193 * @param node pointer to the node which is updated 194 * @param from_left variable which says that the request arrived 195 * from the left 196 * @param sum_from_below Sum of left and right children below 197 * @param level level in the tree the calling node is located 198 * @param stack_dist stack distance of the node below 199 * @return The stack distance of the current address. 200 * 201 */ 202 uint64_t getSum(Node* node, bool from_left, uint64_t sum_from_below, 203 uint64_t stack_dist, uint64_t level) const; 204 205 /** 206 * Gets the sum from the leaf node specified. This function 207 * is called by calcStackDist. 208 * 209 * @param node pointer to the node which is updated 210 * @return The stack distance of the current address. 211 * 212 */ 213 uint64_t getSumsLeavesToRoot(Node* node) const; 214 215 /** 216 * Updates the nodes upwards recursively till the root. 217 * This function is first called by updateSumsLeavesToRoot, 218 * and then it recursively calls itself. 219 * 220 * @param node pointer to the node which is updated 221 * @param from_left variable which says that the request arrived 222 * from the left 223 * @param sum_from_below Sum of left and right children below 224 * @param level level in the tree the calling node is located 225 * @param stack_dist stack distance of the node below 226 * @param discard_node whether the calling node was discarded or not 227 * @return The stack distance of the current address. 228 * 229 */ 230 uint64_t updateSum(Node* node, 231 bool from_left, uint64_t sum_from_below, uint64_t level, 232 uint64_t stack_dist, bool discard_node); 233 234 /** 235 * Updates the leaf nodes and nodes above. This function is 236 * called by the calcStackDistAndUpdate. 237 * 238 * @param node pointer to the node which is updated 239 * @param is_new_leaf is true if this is a newly added node 240 * @return The stack distance of the current address. 241 * 242 */ 243 uint64_t updateSumsLeavesToRoot(Node* node, bool is_new_leaf); 244 245 /** 246 * updateTree is a tree balancing operation, which maintains the 247 * binary tree structure. 248 * This method is called whenever index%2 == 0 (i.e. every 249 * alternate cycle) The two main operation are : 250 * OP1. Moving the root node one layer up if index counter 251 * crosses power of 2 252 * OP2. Addition of intermediate nodes as and when required 253 * and linking them to their parents in the layer above. 254 */ 255 void updateTree(); 256 257 /** 258 * This method is used for verification purposes 259 * It recursively traverses upwards from the given node till 260 * the root to check if the ultimate parent node (root-node) points 261 * to null. 262 * 263 * @param node pointer to the node whose sanity is being checked 264 * @param level the level at which this node is located in the tree 265 * 266 */ 267 void sanityCheckTree(const Node* node, uint64_t level = 0) const; 268 269 /** 270 * A convenient way of refering to infinity. 271 */ 272 static constexpr uint64_t Infinity = std::numeric_limits<uint64_t>::max(); 273 274 /** 275 * Process the given address. If Mark is true then set the 276 * mark flag of the leaf node. 277 * This function returns the stack distance of the incoming 278 * address and the previous status of the mark flag. 279 * 280 * @param r_address The current address to process 281 * @param mark set the mark flag for the address. 282 * @return The stack distance of the current address and the mark flag. 283 */ 284 std::pair<uint64_t , bool> calcStackDist(const Addr r_address, 285 bool mark = false); 286 287 /** 288 * Process the given address: 289 * - Lookup the tree for the given address 290 * - delete old node if found in tree 291 * - add a new node (if addNewNode flag is set) 292 * This function returns the stack distance of the incoming 293 * address and the status of the mark flag. 294 * 295 * @param r_address The current address to process 296 * @param addNewNode If true, a new node is added to the tree 297 * @return The stack distance of the current address and the mark flag. 298 */ 299 std::pair<uint64_t, bool> calcStackDistAndUpdate(const Addr r_address, 300 bool addNewNode = true); 301 302 /** 303 * Return the counter for address accesses (unique and 304 * non-unique). This is further used to dump stats at 305 * regular intervals. 306 * 307 * @return The stack distance of the current address. 308 */ 309 uint64_t getIndex() const { return index; } 310 311 /** 312 * Query depth of the tree (tree[0] represents leaf layer while 313 * tree[treeDepth] represents the root layer, all layers in 314 * between contain intermediate nodes) 315 * 316 * @return Tree depth 317 */ 318 uint64_t getTreeDepth() const { return tree.size() - 1; } 319 320 /** 321 * Print the last n items on the stack. 322 * This method prints top n entries in the tree based implementation as 323 * well as dummy stack. 324 * @param n Number of entries to print 325 */ 326 void printStack(int n = 5) const; 327 328 /** 329 * This is an alternative implementation of the stack-distance 330 * in a naive way. It uses simple STL vector to represent the stack. 331 * It can be used in parallel for debugging purposes. 332 * It is 10x slower than the tree based implemenation. 333 * 334 * @param r_address The current address to process 335 * @param update_stack Flag to indicate if stack should be updated 336 * @return Stack distance which is calculated by this alternative 337 * implementation 338 * 339 */ 340 uint64_t verifyStackDist(const Addr r_address, 341 bool update_stack = false); 342 343 public: 344 345 /** 346 * Convenience method to get the params when registering stats. 347 */ 348 const StackDistCalcParams* params() const 349 { return reinterpret_cast<const StackDistCalcParams*>(_params); } 350 351 StackDistCalc(const StackDistCalcParams* p); 352 353 ~StackDistCalc(); 354 355 void regStats(); 356 357 /** 358 * Update the tree and the statistics. 359 * 360 * @param cmd Command from the packet 361 * @param addr Address to put on the stack 362 */ 363 void update(const MemCmd& cmd, Addr addr); 364 365 private: 366 367 /** 368 * Node which takes form of Leaf, INode or Root 369 */ 370 struct Node{ 371 // Sum of the left children 372 uint64_t sumLeft; 373 374 // Sum of the right children 375 uint64_t sumRight; 376 377 // Flag to indicate that sumLeft has gone from non-zero value to 0 378 bool discardLeft; 379 380 // Flag to indicate that sumRight has gone from non-zero value to 0 381 bool discardRight; 382 383 // Index of the current element in the Map 384 uint64_t nodeIndex; 385 386 // Pointer to the parent 387 Node* parent; 388 389 // Flag to mark the node as the right/left child 390 bool isLeftNode; 391 392 /** 393 * Flag to indicate if this address is marked. Used in case 394 * where stack distance of a touched address is required. 395 */ 396 bool isMarked; 397 398 /** 399 * The discard flags are false by default they become true if 400 * the node is reached again in a future lookup. 401 */ 402 Node() : sumLeft(0), sumRight(0), discardLeft(false), 403 discardRight(false), nodeIndex(0), 404 parent(nullptr), isLeftNode(true), isMarked(false) 405 { } 406 }; 407 408 /** 409 * Internal counter for address accesses (unique and non-unique) 410 * This counter increments everytime the calcStackDist() method is 411 * called. This counter is used as a key for the hash- map at the 412 * leaf layer. Practically at every call to the calculator this 413 * counter is incremented and a new leaf node is added in the tree 414 * at the leaf layer using this counter value as the key. 415 */ 416 uint64_t index; 417 418 // Binary tree of partial sums 419 TreeType tree; 420 421 // Hash map which returns last seen index of each address 422 AddressIndexMap aiMap; 423 424 // Keeps count of number of the next unique index for each 425 // level in the tree 426 std::vector<uint64_t> nextIndex; 427 428 // Dummy Stack for verification 429 std::vector<uint64_t> stack; 430 431 // Flag to enable verification of stack. (Slows down the simulation) 432 const bool verifyStack; 433 434 // Disable the linear histograms 435 const bool disableLinearHists; 436 437 // Disable the logarithmic histograms 438 const bool disableLogHists; 439 440 // Reads linear histogram 441 Stats::Histogram readLinearHist; 442 443 // Reads logarithmic histogram 444 Stats::SparseHistogram readLogHist; 445 446 // Writes linear histogram 447 Stats::Histogram writeLinearHist; 448 449 // Writes logarithmic histogram 450 Stats::SparseHistogram writeLogHist; 451 452}; 453 454#endif //__STACK_DIST_CALC_HH__ 455