# Copyright (c) 2012-2013 ARM Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Copyright (c) 2015 The University of Bologna
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
# Authors: Erfan Azarkhish
#          Abdul Mutaal Ahmad

# A simplified model of a complete HMC device. Based on:
#  [1] http://www.hybridmemorycube.org/specification-download/
#  [2] High performance AXI-4.0 based interconnect for extensible smart memory
#      cubes (E. Azarkhish et al.)
#  [3] Low-Power Hybrid Memory Cubes With Link Power Management and Two-Level
#      Prefetching (J. Ahn et al.)
#  [4] Memory-centric system interconnect design with Hybrid Memory Cubes
#      (G. Kim et al.)
#  [5] Near Data Processing, Are we there yet? (M. Gokhale)
#      http://www.cs.utah.edu/wondp/gokhale.pdf
#  [6] openHMC - A Configurable Open-Source Hybrid Memory Cube Controller
#      (J. Schmidt)
#  [7] Hybrid Memory Cube performance characterization on data-centric
#      workloads (M. Gokhale)
#
# This script builds a complete HMC device composed of vault controllers,
# serial links, the main internal crossbar, and an external HMC controller.
#
# - VAULT CONTROLLERS:
#   Instances of the HMC_2500_1x32 class with their functionality specified in
#   dram_ctrl.cc
#
# - THE MAIN XBAR:
#   This component is simply an instance of the NoncoherentXBar class, and its
#   parameters are tuned to [2].
#
# - SERIAL LINKS CONTROLLER:
#   SerialLink is a simple variation of the Bridge class, with the ability to
#   account for the latency of packet serialization and controller latency. We
#   assume that the serializer component at the transmitter side does not need
#   to receive the whole packet to start the serialization. But the
#   deserializer waits for the complete packet to check its integrity first.
#
#   * Bandwidth of the serial links is not modeled in the SerialLink component
#     itself.
#
#   * Latency of the serial link controller is composed of SerDes latency +
#     link controller latency.
#
#   * It is inferred from the standard [1] and the literature [3] that serial
#     links share the same address range and packets can travel over any of
#     them, so a load distribution mechanism is required among them.
#
#   -----------------------------------------
#   | Host/HMC Controller                   |
#   |        ----------------------         |
#   |        |  Link Aggregator   |  opt    |
#   |        ----------------------         |
#   |        ----------------------         |
#   |        |  Serial Link + Ser | * 4     |
#   |        ----------------------         |
#   |---------------------------------------
#   -----------------------------------------
#   | Device
#   |        ----------------------         |
#   |        |       Xbar         | * 4     |
#   |        ----------------------         |
#   |        ----------------------         |
#   |        |  Vault Controller  | * 16    |
#   |        ----------------------         |
#   |        ----------------------         |
#   |        |     Memory         |         |
#   |        ----------------------         |
#   |---------------------------------------|
#
# In this version we present 3 different HMC architectures along with
# their corresponding test scripts.
#
# same: It has 4 crossbars in HMC memory. All the crossbars are connected
# to each other, providing the complete memory range. This architecture also
# covers the added latency for sending a request to a non-local vault (bridge
# between crossbars). All 4 serial links can access the complete memory, so
# each link can be connected to a separate processor.
#
# distributed: It has 4 crossbars inside the HMC. The crossbars are not
# connected. Through each crossbar only local vaults can be accessed. To
# support this architecture we need a crossbar between the serial links and
# the processor.
#
# mixed: This is a hybrid architecture. It has 4 crossbars inside the HMC.
# 2 crossbars are connected to only local vaults. From the other 2 crossbars,
# a request can be forwarded to any other vault.

|
125import optparse
| 125import argparse
|
126 127import m5 128from m5.objects import *
| 126 127import m5 128from m5.objects import *
|
| 129from m5.util import *
|
129
| 130
|
130# A single Hybrid Memory Cube (HMC) 131class HMCSystem(SubSystem): 132 #*****************************CROSSBAR PARAMETERS*************************
| 131 132def add_options(parser): 133 # *****************************CROSSBAR PARAMETERS*************************
|
133 # Flit size of the main interconnect [1]
| 134 # Flit size of the main interconnect [1]
|
134 xbar_width = Param.Unsigned(32, "Data width of the main XBar (Bytes)")
| 135 parser.add_argument("--xbar-width", default=32, action="store", type=int, 136 help="Data width of the main XBar (Bytes)")
|
135 136 # Clock frequency of the main interconnect [1] 137 # This crossbar is placed on the logic-base of the HMC and it has its 138 # own voltage and clock domains, different from the DRAM dies or from the 139 # host.
| 137 138 # Clock frequency of the main interconnect [1] 139 # This crossbar is placed on the logic-base of the HMC and it has its 140 # own voltage and clock domains, different from the DRAM dies or from the 141 # host.
|
140 xbar_frequency = Param.Frequency('1GHz', "Clock Frequency of the main " 141 "XBar")
| 142 parser.add_argument("--xbar-frequency", default='1GHz', type=str, 143 help="Clock Frequency of the main XBar")
|
142 143 # Arbitration latency of the HMC XBar [1]
| 144 145 # Arbitration latency of the HMC XBar [1]
|
144 xbar_frontend_latency = Param.Cycles(1, "Arbitration latency of the XBar")
| 146 parser.add_argument("--xbar-frontend-latency", default=1, action="store", 147 type=int, help="Arbitration latency of the XBar")
|
145 146 # Latency to forward a packet via the interconnect [1](two levels of FIFOs 147 # at the input and output of the interconnect)
| 148 149 # Latency to forward a packet via the interconnect [1](two levels of FIFOs 150 # at the input and output of the interconnect)
|
148 xbar_forward_latency = Param.Cycles(2, "Forward latency of the XBar")
| 151 parser.add_argument("--xbar-forward-latency", default=2, action="store", 152 type=int, help="Forward latency of the XBar")
|
149 150 # Latency to forward a response via the interconnect [1](two levels of 151 # FIFOs at the input and output of the interconnect)
| 153 154 # Latency to forward a response via the interconnect [1](two levels of 155 # FIFOs at the input and output of the interconnect)
|
152 xbar_response_latency = Param.Cycles(2, "Response latency of the XBar")
| 156 parser.add_argument("--xbar-response-latency", default=2, action="store", 157 type=int, help="Response latency of the XBar")
|
153 154 # number of crossbars which connect the 16 vaults to the serial links [7]
| 158 159 # number of crossbars which connect the 16 vaults to the serial links [7]
|
155 number_mem_crossbar = Param.Unsigned(4, "Number of crossbar in HMC" 156 )
| 160 parser.add_argument("--number-mem-crossbar", default=4, action="store", 161 type=int, help="Number of crossbar in HMC")
|
157
| 162
|
158 #*****************************SERIAL LINK PARAMETERS***********************
| 163 # *****************************SERIAL LINK PARAMETERS**********************
|
159 # Number of serial links controllers [1]
| 164 # Number of serial links controllers [1]
|
160 num_links_controllers = Param.Unsigned(4, "Number of serial links")
| 165 parser.add_argument("--num-links-controllers", default=4, action="store", 166 type=int, help="Number of serial links")
|
161 162 # Number of packets (not flits) to store at the request side of the serial 163 # link. This number should be adjusted to achieve the required bandwidth
| 167 168 # Number of packets (not flits) to store at the request side of the serial 169 # link. This number should be adjusted to achieve the required bandwidth
|
164 link_buffer_size_req = Param.Unsigned(10, "Number of packets to buffer " 165 "at the request side of the serial link")
| 170 parser.add_argument("--link-buffer-size-req", default=10, action="store", 171 type=int, help="Number of packets to buffer at the\ 172 request side of the serial link")
|
166 167 # Number of packets (not flits) to store at the response side of the serial 168 # link. This number should be adjusted to achieve the required bandwidth
| 173 174 # Number of packets (not flits) to store at the response side of the serial 175 # link. This number should be adjusted to achieve the required bandwidth
|
169 link_buffer_size_rsp = Param.Unsigned(10, "Number of packets to buffer " 170 "at the response side of the serial link")
| 176 parser.add_argument("--link-buffer-size-rsp", default=10, action="store", 177 type=int, help="Number of packets to buffer at the\ 178 response side of the serial link")
|
171 172 # Latency of the serial link composed by SER/DES latency (1.6ns [4]) plus 173 # the PCB trace latency (3ns Estimated based on [5])
| 179 180 # Latency of the serial link composed by SER/DES latency (1.6ns [4]) plus 181 # the PCB trace latency (3ns Estimated based on [5])
|
174 link_latency = Param.Latency('4.6ns', "Latency of the serial links")
| 182 parser.add_argument("--link-latency", default='4.6ns', type=str, 183 help="Latency of the serial links")
|
175 176 # Clock frequency of the each serial link(SerDes) [1]
| 184 185 # Clock frequency of the each serial link(SerDes) [1]
|
177 link_frequency = Param.Frequency('10GHz', "Clock Frequency of the serial" 178 "links")
| 186 parser.add_argument("--link-frequency", default='10GHz', type=str, 187 help="Clock Frequency of the serial links")
|
179 180 # Clock frequency of serial link Controller[6] 181 # clk_hmc[Mhz]= num_lanes_per_link * lane_speed [Gbits/s] / 182 # data_path_width * 10^6 183 # clk_hmc[Mhz]= 16 * 10 Gbps / 256 * 10^6 = 625 Mhz
| 188 189 # Clock frequency of serial link Controller[6] 190 # clk_hmc[Mhz]= num_lanes_per_link * lane_speed [Gbits/s] / 191 # data_path_width * 10^6 192 # clk_hmc[Mhz]= 16 * 10 Gbps / 256 * 10^6 = 625 Mhz
|
184 link_controller_frequency = Param.Frequency('625MHz', 185 "Clock Frequency of the link controller")
| 193 parser.add_argument("--link-controller-frequency", default='625MHz', 194 type=str, help="Clock Frequency of the link\ 195 controller")
|
186 187 # Latency of the serial link controller to process the packets[1][6] 188 # (ClockDomain = 625 Mhz ) 189 # used here for calculations only
| 196 197 # Latency of the serial link controller to process the packets[1][6] 198 # (ClockDomain = 625 Mhz ) 199 # used here for calculations only
|
190 link_ctrl_latency = Param.Cycles(4, "The number of cycles required for the" 191 "controller to process the packet")
| 200 parser.add_argument("--link-ctrl-latency", default=4, action="store", 201 type=int, help="The number of cycles required for the\ 202 controller to process the packet")
|
192 193 # total_ctrl_latency = link_ctrl_latency + link_latency 194 # total_ctrl_latency = 4(Cycles) * 1.6 ns + 4.6 ns
| 203 204 # total_ctrl_latency = link_ctrl_latency + link_latency 205 # total_ctrl_latency = 4(Cycles) * 1.6 ns + 4.6 ns
|
195 total_ctrl_latency = Param.Latency('11ns', "The latency experienced by" 196 "every packet regardless of size of packet")
| 206 parser.add_argument("--total-ctrl-latency", default='11ns', type=str, 207 help="The latency experienced by every packet\ 208 regardless of size of packet")
|
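The two derived defaults above (`625MHz` for the link controller clock and `11ns` for the total controller latency) can be re-derived from the other defaults. The following is a minimal sketch of that arithmetic, not code from the script: the function names are hypothetical, and the 256-bit data-path width is taken from the comment citing [6].

```python
# Hypothetical helpers; the constants are the defaults quoted in the script.
NUM_LANES_PER_LINK = 16        # --num-lanes-per-link
LANE_SPEED_GBPS = 10           # --serial-link-speed
DATA_PATH_WIDTH_BITS = 256     # from the clk_hmc comment (assumed width)
LINK_CTRL_LATENCY_CYCLES = 4   # --link-ctrl-latency
LINK_LATENCY_NS = 4.6          # --link-latency (1.6 ns SerDes + 3 ns PCB trace)


def link_controller_clock_mhz(num_lanes, lane_speed_gbps, data_path_bits):
    """Controller clock in MHz: aggregate link bit rate / data-path width."""
    bits_per_second = num_lanes * lane_speed_gbps * 1e9
    return bits_per_second / data_path_bits / 1e6


def total_ctrl_latency_ns(ctrl_cycles, clock_mhz, link_latency_ns):
    """Fixed per-packet latency: controller cycles plus link latency."""
    cycle_ns = 1e3 / clock_mhz  # one period of the controller clock in ns
    return ctrl_cycles * cycle_ns + link_latency_ns


clk_mhz = link_controller_clock_mhz(
    NUM_LANES_PER_LINK, LANE_SPEED_GBPS, DATA_PATH_WIDTH_BITS)
total_ns = total_ctrl_latency_ns(
    LINK_CTRL_LATENCY_CYCLES, clk_mhz, LINK_LATENCY_NS)
print("controller clock: %.0f MHz" % clk_mhz)
print("total controller latency: %.1f ns" % total_ns)
```

With the defaults, one controller cycle is 1.6 ns, so 4 cycles plus the 4.6 ns link latency gives the 11 ns used for `--total-ctrl-latency`. If any of the inputs is overridden on the command line, the derived values are not recomputed by the script and would need to be passed explicitly.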
197 198 # Number of parallel lanes in each serial link [1]
| 209 210 # Number of parallel lanes in each serial link [1]
|
199 num_lanes_per_link = Param.Unsigned( 16, "Number of lanes per each link")
| 211 parser.add_argument("--num-lanes-per-link", default=16, action="store", 212 type=int, help="Number of lanes per each link")
|
200 201 # Number of serial links [1]
| 213 214 # Number of serial links [1]
|
202 num_serial_links = Param.Unsigned(4, "Number of serial links")
| 215 parser.add_argument("--num-serial-links", default=4, action="store", 216 type=int, help="Number of serial links")
|
203 204 # speed of each lane of serial link - SerDes serial interface 10 Gb/s
| 217 218 # speed of each lane of serial link - SerDes serial interface 10 Gb/s
|
205 serial_link_speed = Param.UInt64(10, "Gbs/s speed of each lane of" 206 "serial link")
| 219 parser.add_argument("--serial-link-speed", default=10, action="store", 220 type=int, help="Gbs/s speed of each lane of serial\ 221 link")
|
207
| 222
|
208 #*****************************PERFORMANCE MONITORING************************
| 223 # address range for each of the serial links 224 parser.add_argument("--serial-link-addr-range", default='1GB', type=str, 225 help="memory range for each of the serial links.\ 226 Default: 1GB") 227 228 # *****************************PERFORMANCE MONITORING*********************
|
209 # The main monitor behind the HMC Controller
| 229 # The main monitor behind the HMC Controller
|
210 enable_global_monitor = Param.Bool(False, "The main monitor behind the " 211 "HMC Controller")
| 230 parser.add_argument("--enable-global-monitor", action="store_true", 231 help="The main monitor behind the HMC Controller")
|
212 213 # The link performance monitors
| 232 233 # The link performance monitors
|
214 enable_link_monitor = Param.Bool(False, "The link monitors" )
| 234 parser.add_argument("--enable-link-monitor", action="store_true", 235 help="The link monitors")
|
215 216 # link aggregator enable - put a cross between buffers & links
| 236 237 # link aggregator enable - put a cross between buffers & links
|
217 enable_link_aggr = Param.Bool(False, "The crossbar between port and " 218 "Link Controller")
| 238 parser.add_argument("--enable-link-aggr", action="store_true", help="The\ 239 crossbar between port and Link Controller")
|
219
| 240
|
220 enable_buff_div = Param.Bool(True, "Memory Range of Buffer is" 221 "divided between total range")
| 241 parser.add_argument("--enable-buff-div", action="store_true", 242 help="Memory Range of Buffer is divided between total\ 243 range")
|
222
| 244
|
223 #*****************************HMC ARCHITECTURE ************************
| 245 # *****************************HMC ARCHITECTURE **************************
|
224 # Memory chunk for 16 vault - numbers of vault / number of crossbars
| 246 # Memory chunk for 16 vault - numbers of vault / number of crossbars
|
225 mem_chunk = Param.Unsigned(4, "Chunk of memory range for each cross bar " 226 "in arch 0")
| 247 parser.add_argument("--mem-chunk", default=4, action="store", type=int, 248 help="Chunk of memory range for each cross bar in\ 249 arch 0")
|
227 228 # size of req buffer within crossbar, used for modelling extra latency 229 # when the request goes to a non-local vault
| 250 251 # size of req buffer within crossbar, used for modelling extra latency 252 # when the request goes to a non-local vault
|
230 xbar_buffer_size_req = Param.Unsigned(10, "Number of packets to buffer " 231 "at the request side of the crossbar")
| 253 parser.add_argument("--xbar-buffer-size-req", default=10, action="store", 254 type=int, help="Number of packets to buffer at the\ 255 request side of the crossbar")
|
232 233 # size of response buffer within crossbar, used for modelling extra latency 234 # when the response received from non-local vault
| 256 257 # size of response buffer within crossbar, used for modelling extra latency 258 # when the response received from non-local vault
|
235 xbar_buffer_size_resp = Param.Unsigned(10, "Number of packets to buffer " 236 "at the response side of the crossbar")
| 259 parser.add_argument("--xbar-buffer-size-resp", default=10, action="store", 260 type=int, help="Number of packets to buffer at the\ 261 response side of the crossbar") 262 # HMC device architecture. It affects the HMC host controller as well 263 parser.add_argument("--arch", type=str, choices=["same", "distributed", 264 "mixed"], default="distributed", help="same: HMC with\ 265 4 links, all with same range.\ndistributed: HMC with\ 266 4 links with distributed range.\nmixed: mixed with\ 267 same and distributed range.\nDefault: distributed") 268 # HMC device - number of vaults 269 parser.add_argument("--hmc-dev-num-vaults", default=16, action="store", 270 type=int, help="number of independent vaults within\ 271 the HMC device. Note: each vault has a memory\ 272 controller (vault controller)\nDefault: 16") 273 # HMC device - vault capacity or size 274 parser.add_argument("--hmc-dev-vault-size", default='256MB', type=str, 275 help="vault storage capacity in bytes. Default:\ 276 256MB") 277 parser.add_argument("--mem-type", type=str, choices=["HMC_2500_1x32"], 278 default="HMC_2500_1x32", help="type of HMC memory to\ 279 use. Default: HMC_2500_1x32") 280 parser.add_argument("--mem-channels", default=1, action="store", type=int, 281 help="Number of memory channels") 282 parser.add_argument("--mem-ranks", default=1, action="store", type=int, 283 help="Number of ranks to iterate across") 284 parser.add_argument("--burst-length", default=256, action="store", 285 type=int, help="burst length in bytes. Note: the\ 286 cache line size will be set to this value.\nDefault:\ 287 256")
|
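The new column replaces `Param.*` class attributes with command-line options, which the rest of the script then reads as `opt.<name>`. A minimal standalone sketch of that pattern follows, using only standard `argparse` and a handful of the options above; the parser instance and the argument list passed to `parse_args` are illustrative assumptions, not part of the script.

```python
import argparse


def add_options(parser):
    # A small subset of the options registered in the diff above.
    parser.add_argument("--xbar-width", default=32, type=int,
                        help="Data width of the main XBar (Bytes)")
    parser.add_argument("--link-latency", default="4.6ns", type=str,
                        help="Latency of the serial links")
    parser.add_argument("--enable-buff-div", action="store_true",
                        help="Memory Range of Buffer is divided between "
                             "total range")
    parser.add_argument("--arch", type=str, default="distributed",
                        choices=["same", "distributed", "mixed"],
                        help="HMC device architecture")


parser = argparse.ArgumentParser()
add_options(parser)

# Illustrative command line: override two options, keep the other defaults.
opt = parser.parse_args(["--xbar-width", "64", "--arch", "mixed"])

# argparse maps "--xbar-width" to the attribute "xbar_width".
print(opt.xbar_width, opt.arch, opt.link_latency, opt.enable_buff_div)
```

One consequence worth checking when reading the diff: a `store_true` flag defaults to `False` when it is absent, whereas the old `enable_buff_div` parameter defaulted to `True`. Whether that change in default is intended cannot be determined from this section.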
237
| 288
|
238# configure host system with Serial Links 239def config_host_hmc(options, system):
| |
240
| 289
|
241 system.hmc_host=HMCSystem()
| 290# configure HMC host controller 291def config_hmc_host_ctrl(opt, system):
|
242
| 292
|
243 try: 244 system.hmc_host.enable_global_monitor = options.enable_global_monitor 245 except: 246 pass;
| 293 # create HMC host controller 294 system.hmc_host = SubSystem()
|
247
| 295
|
248 try: 249 system.hmc_host.enable_link_monitor = options.enable_link_monitor 250 except: 251 pass;
| 296 # Create additional crossbar for arch1 297 if opt.arch == "distributed" or opt.arch == "mixed": 298 clk = '100GHz' 299 vd = VoltageDomain(voltage='1V') 300 # Create additional crossbar for arch1 301 system.membus = NoncoherentXBar(width=8) 302 system.membus.badaddr_responder = BadAddr() 303 system.membus.default = Self.badaddr_responder.pio 304 system.membus.width = 8 305 system.membus.frontend_latency = 3 306 system.membus.forward_latency = 4 307 system.membus.response_latency = 2 308 cd = SrcClockDomain(clock=clk, voltage_domain=vd) 309 system.membus.clk_domain = cd
|
252
| 310
|
253 # Serial link Controller with 16 SerDes links at 10 Gbps 254 # with serial link ranges w.r.t to architecture 255 system.hmc_host.seriallink = [SerialLink(ranges = options.ser_ranges[i], 256 req_size=system.hmc_host.link_buffer_size_req, 257 resp_size=system.hmc_host.link_buffer_size_rsp, 258 num_lanes=system.hmc_host.num_lanes_per_link, 259 link_speed=system.hmc_host.serial_link_speed, 260 delay=system.hmc_host.total_ctrl_latency) 261 for i in xrange(system.hmc_host.num_serial_links)]
| 311 # create memory ranges for the serial links 312 slar = convert.toMemorySize(opt.serial_link_addr_range) 313 # Memory ranges of serial link for arch-0. Same as the ranges of vault 314 # controllers (4 vaults to 1 serial link) 315 if opt.arch == "same": 316 ser_ranges = [AddrRange(0, (4*slar)-1) for i in 317 range(opt.num_serial_links)] 318 # Memory ranges of serial link for arch-1. Distributed range across 319 # links 320 if opt.arch == "distributed": 321 ser_ranges = [AddrRange(i*slar, ((i+1)*slar)-1) for i in 322 range(opt.num_serial_links)] 323 # Memory ranges of serial link for arch-2 'Mixed' address distribution 324 # over links 325 if opt.arch == "mixed": 326 ser_range0 = AddrRange(0, (1*slar)-1) 327 ser_range1 = AddrRange(1*slar, 2*slar-1) 328 ser_range2 = AddrRange(0, (4*slar)-1) 329 ser_range3 = AddrRange(0, (4*slar)-1) 330 ser_ranges = [ser_range0, ser_range1, ser_range2, ser_range3]
|
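To make the three address layouts concrete, the range computation above can be mirrored in plain Python without gem5. This is a sketch under stated assumptions: `serial_link_ranges` is a hypothetical helper, plain `(start, end)` tuples stand in for `AddrRange`, and `1GB` is assumed to convert to 2**30 bytes.

```python
GIB = 1 << 30  # assumed byte value of the '1GB' default


def serial_link_ranges(arch, slar, num_links=4):
    """Return one inclusive (start, end) byte range per serial link."""
    if arch == "same":
        # Every link sees the whole device (4 * slar, as in the script).
        return [(0, 4 * slar - 1) for _ in range(num_links)]
    if arch == "distributed":
        # Link i sees only its own slar-sized slice.
        return [(i * slar, (i + 1) * slar - 1) for i in range(num_links)]
    if arch == "mixed":
        # Links 0 and 1 are local-only; links 2 and 3 see everything.
        return [(0, slar - 1), (slar, 2 * slar - 1),
                (0, 4 * slar - 1), (0, 4 * slar - 1)]
    raise ValueError("unknown architecture: %r" % arch)


for arch in ("same", "distributed", "mixed"):
    print(arch)
    for link, (start, end) in enumerate(serial_link_ranges(arch, GIB)):
        print("  link %d: 0x%09x - 0x%09x" % (link, start, end))
```

Note that the `same` and `mixed` layouts hard-code a factor of 4 and four explicit ranges, so they only line up with `--num-serial-links` when that option is left at its default of 4.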
262
| 331
|
| 332 # Serial link Controller with 16 SerDes links at 10 Gbps with serial link 333 # ranges w.r.t to architecture 334 sl = [SerialLink(ranges=ser_ranges[i], 335 req_size=opt.link_buffer_size_req, 336 resp_size=opt.link_buffer_size_rsp, 337 num_lanes=opt.num_lanes_per_link, 338 link_speed=opt.serial_link_speed, 339 delay=opt.total_ctrl_latency) for i in 340 xrange(opt.num_serial_links)] 341 system.hmc_host.seriallink = sl 342
|
263 # enable global monitor
| 343 # enable global monitor
|
264 if system.hmc_host.enable_global_monitor: 265 system.hmc_host.lmonitor = [ CommMonitor() 266 for i in xrange(system.hmc_host.num_serial_links)]
| 344 if opt.enable_global_monitor: 345 system.hmc_host.lmonitor = [CommMonitor() for i in 346 xrange(opt.num_serial_links)]
|
267 268 # set the clock frequency for serial link
| 347 348 # set the clock frequency for serial link
|
269 for i in xrange(system.hmc_host.num_serial_links): 270 system.hmc_host.seriallink[i].clk_domain = SrcClockDomain(clock=system. 271 hmc_host.link_controller_frequency, voltage_domain= 272 VoltageDomain(voltage = '1V'))
| 349 for i in xrange(opt.num_serial_links): 350 clk = opt.link_controller_frequency 351 vd = VoltageDomain(voltage='1V') 352 scd = SrcClockDomain(clock=clk, voltage_domain=vd) 353 system.hmc_host.seriallink[i].clk_domain = scd
|
273 274 # Connect membus/traffic gen to Serial Link Controller for different HMC 275 # architectures
| 354 355 # Connect membus/traffic gen to Serial Link Controller for different HMC 356 # architectures
|
276 if options.arch == "distributed": 277 for i in xrange(system.hmc_host.num_links_controllers): 278 if system.hmc_host.enable_global_monitor: 279 system.membus.master = system.hmc_host.lmonitor[i].slave 280 system.hmc_host.lmonitor[i].master = \ 281 system.hmc_host.seriallink[i].slave
| 357 hh = system.hmc_host 358 if opt.arch == "distributed": 359 mb = system.membus 360 for i in xrange(opt.num_links_controllers): 361 if opt.enable_global_monitor: 362 mb.master = hh.lmonitor[i].slave 363 hh.lmonitor[i].master = hh.seriallink[i].slave
|
282 else:
| 364 else:
|
283 system.membus.master = system.hmc_host.seriallink[i].slave 284 if options.arch == "mixed": 285 if system.hmc_host.enable_global_monitor: 286 system.membus.master = system.hmc_host.lmonitor[0].slave 287 system.hmc_host.lmonitor[0].master = \ 288 system.hmc_host.seriallink[0].slave 289 290 system.membus.master = system.hmc_host.lmonitor[1].slave 291 system.hmc_host.lmonitor[1].master = \ 292 system.hmc_host.seriallink[1].slave 293 294 system.tgen[2].port = system.hmc_host.lmonitor[2].slave 295 system.hmc_host.lmonitor[2].master = \ 296 system.hmc_host.seriallink[2].slave 297 298 system.tgen[3].port = system.hmc_host.lmonitor[3].slave 299 system.hmc_host.lmonitor[3].master = \ 300 system.hmc_host.seriallink[3].slave
| 365 mb.master = hh.seriallink[i].slave 366 if opt.arch == "mixed": 367 mb = system.membus 368 if opt.enable_global_monitor: 369 mb.master = hh.lmonitor[0].slave 370 hh.lmonitor[0].master = hh.seriallink[0].slave 371 mb.master = hh.lmonitor[1].slave 372 hh.lmonitor[1].master = hh.seriallink[1].slave
|
301 else:
| 373 else:
|
302 system.membus.master = system.hmc_host.seriallink[0].slave 303 system.membus.master = system.hmc_host.seriallink[1].slave 304 system.tgen[2].port = system.hmc_host.seriallink[2].slave 305 system.tgen[3].port = system.hmc_host.seriallink[3].slave 306 if options.arch == "same" : 307 for i in xrange(system.hmc_host.num_links_controllers): 308 if system.hmc_host.enable_global_monitor: 309 system.tgen[i].port = system.hmc_host.lmonitor[i].slave 310 system.hmc_host.lmonitor[i].master = \ 311 system.hmc_host.seriallink[i].slave 312 else: 313 system.tgen[i].port = system.hmc_host.seriallink[i].slave
| 374 mb.master = hh.seriallink[0].slave 375 mb.master = hh.seriallink[1].slave
|
314
| 376
|
| 377 if opt.arch == "same": 378 for i in xrange(opt.num_links_controllers): 379 if opt.enable_global_monitor: 380 hh.lmonitor[i].master = hh.seriallink[i].slave 381
|
315 return system 316
| 382 return system 383
|
317# Create an HMC device and attach it to the current system 318def config_hmc(options, system, hmc_host):
| |
319
| 384
|
320 # Create HMC device 321 system.hmc_dev = HMCSystem()
| 385# Create an HMC device 386def config_hmc_dev(opt, system, hmc_host):
|
322
| 387
|
323 # Global monitor 324 try: 325 system.hmc_dev.enable_global_monitor = options.enable_global_monitor 326 except: 327 pass;
| 388 # create HMC device 389 system.hmc_dev = SubSystem()
|
328
| 390
|
329 try: 330 system.hmc_dev.enable_link_monitor = options.enable_link_monitor 331 except: 332 pass;
| 391 # create memory ranges for the vault controllers 392 arv = convert.toMemorySize(opt.hmc_dev_vault_size) 393 addr_ranges_vaults = [AddrRange(i*arv, ((i+1)*arv-1)) for i in 394 range(opt.hmc_dev_num_vaults)] 395 system.mem_ranges = addr_ranges_vaults
|
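Reviewer sketch (not part of the change): the new-side lines 392-395 build one inclusive address range per vault. The plain-Python version below reproduces only that arithmetic so it can be checked without gem5. `VAULT_SIZE`, `NUM_VAULTS` and `vault_ranges` are hypothetical stand-ins for `opt.hmc_dev_vault_size`, `opt.hmc_dev_num_vaults` and the `AddrRange` list; the concrete values are assumptions, not values taken from the diff.

```python
# Hypothetical stand-in values; the real script reads these from `opt`.
VAULT_SIZE = 256 * 1024 * 1024   # assumed vault size in bytes
NUM_VAULTS = 16                  # assumed number of vaults


def vault_ranges(vault_size, num_vaults):
    """Return one inclusive (start, end) byte range per vault.

    Mirrors the expression AddrRange(i*arv, ((i+1)*arv-1)) from the diff,
    using plain tuples instead of gem5 AddrRange objects.
    """
    return [(i * vault_size, (i + 1) * vault_size - 1)
            for i in range(num_vaults)]


ranges = vault_ranges(VAULT_SIZE, NUM_VAULTS)

# Consecutive vaults should tile the address space without gaps or overlap.
contiguous = all(ranges[i + 1][0] == ranges[i][1] + 1
                 for i in range(len(ranges) - 1))
```

With these assumed values the ranges tile the space from 0 up to `NUM_VAULTS * VAULT_SIZE - 1`, which is what the later `system.mem_ranges[...]` slices rely on.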
333
| 396
|
| 397     if opt.enable_link_monitor:
| 398         lm = [CommMonitor() for i in xrange(opt.num_links_controllers)]
| 399         system.hmc_dev.lmonitor = lm
|
334
| 400
|
335     if system.hmc_dev.enable_link_monitor:
336         system.hmc_dev.lmonitor = [ CommMonitor()
337             for i in xrange(system.hmc_dev.num_links_controllers)]
338
| |
339     # 4 HMC Crossbars located in its logic-base (LoB)
| 401     # 4 HMC Crossbars located in its logic-base (LoB)
|
340     system.hmc_dev.xbar = [ NoncoherentXBar(width=system.hmc_dev.xbar_width,
341         frontend_latency=system.hmc_dev.xbar_frontend_latency,
342         forward_latency=system.hmc_dev.xbar_forward_latency,
343         response_latency=system.hmc_dev.xbar_response_latency )
344         for i in xrange(system.hmc_host.number_mem_crossbar)]
| 402     xb = [NoncoherentXBar(width=opt.xbar_width,
| 403                           frontend_latency=opt.xbar_frontend_latency,
| 404                           forward_latency=opt.xbar_forward_latency,
| 405                           response_latency=opt.xbar_response_latency) for i in
| 406           xrange(opt.number_mem_crossbar)]
| 407     system.hmc_dev.xbar = xb
|
345
| 408
|
346     for i in xrange(system.hmc_dev.number_mem_crossbar):
347         system.hmc_dev.xbar[i].clk_domain = SrcClockDomain(
348             clock=system.hmc_dev.xbar_frequency,voltage_domain=
349             VoltageDomain(voltage='1V'))
| 409     for i in xrange(opt.number_mem_crossbar):
| 410         clk = opt.xbar_frequency
| 411         vd = VoltageDomain(voltage='1V')
| 412         scd = SrcClockDomain(clock=clk, voltage_domain=vd)
| 413         system.hmc_dev.xbar[i].clk_domain = scd
|
350
351     # Attach 4 serial link to 4 crossbar/s
| 414
| 415     # Attach 4 serial link to 4 crossbar/s
|
352     for i in xrange(system.hmc_dev.num_serial_links):
353         if system.hmc_dev.enable_link_monitor:
| 416     for i in xrange(opt.num_serial_links):
| 417         if opt.enable_link_monitor:
|
354             system.hmc_host.seriallink[i].master = \
355                 system.hmc_dev.lmonitor[i].slave
356             system.hmc_dev.lmonitor[i].master = system.hmc_dev.xbar[i].slave
357         else:
358             system.hmc_host.seriallink[i].master = system.hmc_dev.xbar[i].slave
359
360     # Connecting xbar with each other for request arriving at the wrong xbar,
361     # then it will be forwarded to correct xbar. Bridge is used to connect xbars
| 418             system.hmc_host.seriallink[i].master = \
| 419                 system.hmc_dev.lmonitor[i].slave
| 420             system.hmc_dev.lmonitor[i].master = system.hmc_dev.xbar[i].slave
| 421         else:
| 422             system.hmc_host.seriallink[i].master = system.hmc_dev.xbar[i].slave
| 423
| 424     # Connecting xbar with each other for request arriving at the wrong xbar,
| 425     # then it will be forwarded to correct xbar. Bridge is used to connect xbars
|
362     if options.arch == "same":
| 426     if opt.arch == "same":
|
363         numx = len(system.hmc_dev.xbar)
364
365         # create a list of buffers
| 427         numx = len(system.hmc_dev.xbar)
| 428
| 429         # create a list of buffers
|
366         system.hmc_dev.buffers = [ Bridge(
367             req_size=system.hmc_dev.xbar_buffer_size_req,
368             resp_size=system.hmc_dev.xbar_buffer_size_resp)
369             for i in xrange(numx * (system.hmc_dev.mem_chunk - 1))]
| 430         system.hmc_dev.buffers = [Bridge(req_size=opt.xbar_buffer_size_req,
| 431                                          resp_size=opt.xbar_buffer_size_resp)
| 432                                   for i in xrange(numx*(opt.mem_chunk-1))]
|
370
371         # Buffer iterator
372         it = iter(range(len(system.hmc_dev.buffers)))
373
374         # necessary to add system_port to one of the xbar
375         system.system_port = system.hmc_dev.xbar[3].slave
376
377         # iterate over all the crossbars and connect them as required
378         for i in range(numx):
379             for j in range(numx):
380                 # connect xbar to all other xbars except itself
381                 if i != j:
382                     # get the next index of buffer
383                     index = it.next()
384
385                     # Change the default values for ranges of bridge
386                     system.hmc_dev.buffers[index].ranges = system.mem_ranges[
| 433
| 434         # Buffer iterator
| 435         it = iter(range(len(system.hmc_dev.buffers)))
| 436
| 437         # necessary to add system_port to one of the xbar
| 438         system.system_port = system.hmc_dev.xbar[3].slave
| 439
| 440         # iterate over all the crossbars and connect them as required
| 441         for i in range(numx):
| 442             for j in range(numx):
| 443                 # connect xbar to all other xbars except itself
| 444                 if i != j:
| 445                     # get the next index of buffer
| 446                     index = it.next()
| 447
| 448                     # Change the default values for ranges of bridge
| 449                     system.hmc_dev.buffers[index].ranges = system.mem_ranges[
|
387                         j * int(system.hmc_dev.mem_chunk):
388                         (j + 1) * int(system.hmc_dev.mem_chunk)]
| 450                         j * int(opt.mem_chunk):
| 451                         (j + 1) * int(opt.mem_chunk)]
|
389
390                     # Connect the bridge between crossbars
391                     system.hmc_dev.xbar[i].master = system.hmc_dev.buffers[
392                         index].slave
393                     system.hmc_dev.buffers[
394                         index].master = system.hmc_dev.xbar[j].slave
395                 else:
396                     # Don't connect the xbar to itself
397                     pass
398
399     # Two crossbars are connected to all other crossbars-Other 2 vault
400     # can only direct traffic to its local vaults
| 452
| 453                     # Connect the bridge between crossbars
| 454                     system.hmc_dev.xbar[i].master = system.hmc_dev.buffers[
| 455                         index].slave
| 456                     system.hmc_dev.buffers[
| 457                         index].master = system.hmc_dev.xbar[j].slave
| 458                 else:
| 459                     # Don't connect the xbar to itself
| 460                     pass
| 461
| 462     # Two crossbars are connected to all other crossbars-Other 2 vault
| 463     # can only direct traffic to its local vaults
|
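Reviewer sketch (not part of the change): in the "same" architecture every crossbar gets a bridge to every other crossbar, and each bridge forwards only the slice of `mem_ranges` owned by the destination crossbar. The helper below, `plan_bridges`, is a hypothetical name introduced here to enumerate that wiring in plain Python; the values 4 and 4 for the crossbar count and `mem_chunk` are assumptions for illustration.

```python
def plan_bridges(num_xbars, mem_chunk):
    """Enumerate the bridges created by the nested i/j loop in the diff.

    Each entry is (bridge_index, src_xbar, dst_xbar, (lo, hi)), where
    (lo, hi) is the half-open slice of mem_ranges handed to that bridge,
    i.e. mem_ranges[j*mem_chunk:(j+1)*mem_chunk].
    """
    plan = []
    index = 0
    for i in range(num_xbars):
        for j in range(num_xbars):
            if i == j:
                continue  # a crossbar is never bridged to itself
            plan.append((index, i, j, (j * mem_chunk, (j + 1) * mem_chunk)))
            index += 1
    return plan


plan = plan_bridges(4, 4)          # assumed: 4 crossbars, 4 vaults each
bridges_used = len(plan)           # indices consumed by the loop
bridges_allocated = 4 * (4 - 1)    # numx * (mem_chunk - 1), as in the diff
```

The loop consumes `num_xbars * (num_xbars - 1)` bridge indices, while the list in the diff is sized `numx * (mem_chunk - 1)`. The two counts agree when `mem_chunk` equals the number of crossbars, as in the assumed 4-by-4 case.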
401     if options.arch == "mixed":
402
| 464     if opt.arch == "mixed":
|
403         system.hmc_dev.buffer30 = Bridge(ranges=system.mem_ranges[0:4])
404         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer30.slave
405         system.hmc_dev.buffer30.master = system.hmc_dev.xbar[0].slave
406
407         system.hmc_dev.buffer31 = Bridge(ranges=system.mem_ranges[4:8])
408         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer31.slave
409         system.hmc_dev.buffer31.master = system.hmc_dev.xbar[1].slave
410
411         system.hmc_dev.buffer32 = Bridge(ranges=system.mem_ranges[8:12])
412         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer32.slave
413         system.hmc_dev.buffer32.master = system.hmc_dev.xbar[2].slave
414
| 465         system.hmc_dev.buffer30 = Bridge(ranges=system.mem_ranges[0:4])
| 466         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer30.slave
| 467         system.hmc_dev.buffer30.master = system.hmc_dev.xbar[0].slave
| 468
| 469         system.hmc_dev.buffer31 = Bridge(ranges=system.mem_ranges[4:8])
| 470         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer31.slave
| 471         system.hmc_dev.buffer31.master = system.hmc_dev.xbar[1].slave
| 472
| 473         system.hmc_dev.buffer32 = Bridge(ranges=system.mem_ranges[8:12])
| 474         system.hmc_dev.xbar[3].master = system.hmc_dev.buffer32.slave
| 475         system.hmc_dev.buffer32.master = system.hmc_dev.xbar[2].slave
| 476
|
415
| |
416         system.hmc_dev.buffer20 = Bridge(ranges=system.mem_ranges[0:4])
417         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer20.slave
418         system.hmc_dev.buffer20.master = system.hmc_dev.xbar[0].slave
419
420         system.hmc_dev.buffer21 = Bridge(ranges=system.mem_ranges[4:8])
421         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer21.slave
422         system.hmc_dev.buffer21.master = system.hmc_dev.xbar[1].slave
423
424         system.hmc_dev.buffer23 = Bridge(ranges=system.mem_ranges[12:16])
425         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer23.slave
426         system.hmc_dev.buffer23.master = system.hmc_dev.xbar[3].slave
| 477         system.hmc_dev.buffer20 = Bridge(ranges=system.mem_ranges[0:4])
| 478         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer20.slave
| 479         system.hmc_dev.buffer20.master = system.hmc_dev.xbar[0].slave
| 480
| 481         system.hmc_dev.buffer21 = Bridge(ranges=system.mem_ranges[4:8])
| 482         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer21.slave
| 483         system.hmc_dev.buffer21.master = system.hmc_dev.xbar[1].slave
| 484
| 485         system.hmc_dev.buffer23 = Bridge(ranges=system.mem_ranges[12:16])
| 486         system.hmc_dev.xbar[2].master = system.hmc_dev.buffer23.slave
| 487         system.hmc_dev.buffer23.master = system.hmc_dev.xbar[3].slave
|
427
| |
| |
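Reviewer sketch (not part of the change): the hand-written "mixed" wiring shown so far can be tabulated and checked for consistency. The table below copies the six bridges visible in this hunk (from xbar 3 and xbar 2); `MIXED_BRIDGES` and `VAULTS_PER_XBAR` are hypothetical names, and the check assumes each crossbar owns four consecutive entries of `mem_ranges`, which is what the slices in the listing suggest.

```python
VAULTS_PER_XBAR = 4  # assumed: each crossbar owns 4 consecutive mem_ranges

# name -> (source xbar, destination xbar, mem_ranges slice as (lo, hi))
MIXED_BRIDGES = {
    "buffer30": (3, 0, (0, 4)),
    "buffer31": (3, 1, (4, 8)),
    "buffer32": (3, 2, (8, 12)),
    "buffer20": (2, 0, (0, 4)),
    "buffer21": (2, 1, (4, 8)),
    "buffer23": (2, 3, (12, 16)),
}


def slice_matches_destination(dst, span, chunk=VAULTS_PER_XBAR):
    """True if the slice covers exactly the vaults local to xbar `dst`."""
    return span == (dst * chunk, (dst + 1) * chunk)


consistent = all(slice_matches_destination(dst, span)
                 for _, dst, span in MIXED_BRIDGES.values())

# Destinations reachable from each source crossbar in this hunk.
reachable = {}
for src, dst, _ in MIXED_BRIDGES.values():
    reachable.setdefault(src, set()).add(dst)
```

Under that assumption every bridge forwards exactly the vault ranges local to its destination crossbar, and xbars 2 and 3 each reach the three other crossbars.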