# Master's Thesis

Title

# Proposal and evaluation of energy-efficient Network-on-Chip architecture with integrating packet and path switches

Supervisor Professor Masayuki Murata

> Author Takahide Ikeda

February 10th, 2015

Department of Information Networking Graduate School of Information Science and Technology Osaka University


## Abstract

Many-core chips, in which a large number of cores are implemented on a single integrated circuit chip, are being developed. Although traditional integrated circuit chips use buses for communication between cores, the bus can become a significant bottleneck in many-core chips because a large number of cores communicate simultaneously. Instead of the bus, packet-switched networks constructed on a chip, called *networks-on-chip* (NoCs), have been discussed. The NoC can accommodate multiple flows between cores simultaneously. However, as the traffic rate between cores increases, the NoC consumes more energy, which is one of the important problems in the NoC. One approach to reducing the energy consumption of the NoC is to establish short cut paths. The short cut paths reduce the number of packet switches traversed by each packet, which reduces the number of packets processed by packet switches. Two approaches to establishing the short cut paths have been discussed. One approach is to use wired path switches. A wired path switch connects its input port with one of its output ports based on its configuration. Thus, a short cut path between packet switches can be established by configuring the wired path switches along the route between the packet switches. Hereafter we call these short cut paths wired paths. The other approach is to use wireless communication. Wireless communication modules for packet switches in NoCs have been proposed. By using wireless communication, packet switches without direct wired links can communicate with each other directly. Thus, these wireless communication channels can be used to establish short cut paths. Hereafter we call the short cut paths established over wireless communication channels wireless paths. The existing work on short cut paths in NoCs assumes that the short cut paths are established when the application starts.
However, the cores that communicate with each other are difficult to predict before the application starts, because they may depend not only on the application but also on the data processed by the application. In this thesis, we propose a framework that dynamically configures the short cut paths based on periodically monitored traffic. In this framework, we deploy a central controller called the *resource manager*, which manages the resources of the path network. We also deploy a distributed controller called the *path request controller* at each packet switch. The path request controller counts the number of packets passing the switch for each destination core. Based on the counted number of packets, it calculates the candidate paths from the switch that are expected to reduce the energy consumption, and requests the establishment of the candidate paths. The resource manager selects the paths to be established from among the requested candidates so as to maximize the expected reduction of the energy consumption. In this thesis, we discuss two architectures based on the proposed framework: the first is the NoC with wired paths, and the other is the NoC with wireless paths. To clarify their advantages and disadvantages, we compare the energy consumption achieved by these architectures through numerical simulations. The results show that the NoC with wireless paths consumes less energy when the number of cores deployed in a chip or the number of flows generated by each core is small. However, as the number of flows generated by each core increases, the NoC with wireless paths can no longer reduce the energy consumption. On the other hand, the NoC with wired paths reduces the energy consumption by up to 60% compared with the NoC without path switches, even when 400 cores are deployed in a chip.

## Keywords

Network on Chip, Energy consumption, Wired path, Wireless path, Topology

## Contents

- 1 Introduction
- 2 Related Work
- 3 NoC Architecture with Integrated Packet and Path Switches
  - 3.1 Architecture
    - 3.1.1 Wired Path Network
    - 3.1.2 Wireless Path Network
  - 3.2 Routing for NoC with Integrating Packet and Path Switches
    - 3.2.1 Construction of Energy Efficient Short Cut Paths
      - 3.2.1.1 Calculation of Candidate Short Cut Paths
      - 3.2.1.2 Selection from Candidate Short Cut Paths
    - 3.2.2 Calculation of Routes over the Short Cut Paths
- 4 Evaluation
  - 4.1 Compared Architectures
  - 4.2 Models
    - 4.2.1 Energy Model
      - 4.2.1.1 Wireless Communication
      - 4.2.1.2 Wired Link
      - 4.2.1.3 Switch
    - 4.2.2 Traffic Model
    - 4.2.3 Metrics
  - 4.3 Results
    - 4.3.1 Impact of Traffic Patterns
      - 4.3.1.1 Impact of Number of Flows Generated by Each Core
      - 4.3.1.2 Impact of Location of Communicating Cores
    - 4.3.2 Impact of the Gate Density
      - 4.3.2.1 Impact of Number of Cores in a Chip
      - 4.3.2.2 Impact of Number of Path Switches in a Chip
    - 4.3.3 Impact of Energy Consumption of Devices
      - 4.3.3.1 Impact of the Area of the Chip
      - 4.3.3.2 Impact of the Energy Consumption Model
- 5 Conclusion
- Acknowledgments

## List of Figures

- Figure 1: Overview of the NoC architecture with path network
- Figure 2: Impact of number of flows generated by each core
- Figure 3: Impact of location of communicating cores
- Figure 4: The energy consumption normalized by that of the NoC without path network
- Figure 5: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path
- Figure 6: Impact of number of cores in a chip
- Figure 7: The energy consumption normalized by that of the NoC without path network
- Figure 8: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path
- Figure 9: Impact of number of path switches in a chip
- Figure 10: Impact of number of path switches in a chip when the number of cores is 25
- Figure 11: Impact of the Area of the Chip
- Figure 12: The energy consumption normalized by that of the NoC without path network
- Figure 13: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path
- Figure 14: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path

## 1 Introduction

Many-core chips, in which a large number of cores are implemented on a single integrated circuit chip, are being developed [1]. Although traditional integrated circuit chips use buses for communication between cores, the bus can become a significant bottleneck in many-core chips because a large number of cores communicate simultaneously. Instead of the bus, packet-switched networks constructed on a chip, called *networks-on-chip* (NoCs), have been discussed [2, 3]. The NoC can accommodate multiple flows between cores simultaneously. However, as the traffic rate between cores increases, the NoC consumes more energy, which is one of the important problems in the NoC.

One approach to reducing the energy consumption of the NoC is to establish short cut paths [4, 5]. The short cut paths reduce the number of packet switches traversed by each packet, which reduces the number of packets processed by packet switches. Two approaches to establishing the short cut paths have been discussed. One approach is to use wired path switches [6, 4, 7]. A wired path switch connects its input port with one of its output ports based on its configuration. Thus, a short cut path between packet switches can be established by configuring the wired path switches along the route between the packet switches. Hereafter we call these short cut paths wired paths. The other approach is to use wireless communication [5, 8, 9]. Wireless communication modules for packet switches in NoCs have been proposed. By using wireless communication, packet switches without direct wired links can communicate with each other directly. Thus, these wireless communication channels can be used to establish short cut paths. Hereafter we call the short cut paths established over wireless communication channels wireless paths.

The existing work on short cut paths in NoCs assumes that the short cut paths are established when the application starts. However, the cores that communicate with each other are difficult to predict before the application starts, because they may depend not only on the application but also on the data processed by the application. Therefore, dynamic control of the short cut paths is needed.

In this thesis, we propose a framework that dynamically configures the short cut paths based on periodically monitored traffic. In this framework, the layer of the path network is stacked on the layer of the packet network, and a short cut path between packet switches is established by configuring the path network. We deploy a central controller called the *resource manager*, which manages the resources of the path network. We also deploy a distributed controller called the *path request controller* at each packet switch. The path request controller counts the number of packets passing the switch for each destination core. Based on the counted number of packets, it calculates the candidate paths from the switch that are expected to reduce the energy consumption, and requests the establishment of the candidate paths. The resource manager selects the paths to be established from among the requested candidates so as to maximize the expected reduction of the energy consumption.

In this thesis, we discuss two architectures based on the proposed framework; the first one is the NoC with the wired paths, and the other one is the NoC with the wireless paths. To clarify their advantages and disadvantages, we compare the energy consumption achieved by these architectures through numerical simulations.

The rest of this thesis is organized as follows. Section 2 surveys related work on the NoC. Section 3 introduces the NoC architecture with integrated packet and path switches. Section 4 presents an evaluation of our NoC architecture. Section 5 concludes the thesis.

## 2 Related Work

The bus can be a significant bottleneck for many-core chips, because a large number of cores communicate simultaneously. Instead of the bus, packet-switched networks constructed on a chip, called *networks-on-chip (NoCs)*, have been discussed [2, 3]. In the NoC, a core that has data to send to another core splits the data into packets and sends them to the packet switch. The packet switch relays the packets based on the destination address included in the packet header. Unlike the bus, which can relay only one flow within each clock cycle, the NoC can accommodate multiple flows. Thus, the NoC is one of the promising approaches to solving the communication bottleneck in many-core chips. The structures of NoCs have also been discussed. One of them is the 3D NoC, which is constructed by stacking multiple 2D chip layers vertically [11-13]. The vertically stacked layers enable the number of cores to be increased without increasing the chip area.

One of the important problems in the NoC is the energy consumption, because the NoC consumes more energy as the traffic rate between cores increases. One approach to reducing the energy consumption of the NoC is to establish short cut paths [4, 14]. The short cut paths reduce the number of packet switches traversed by each packet, which reduces the number of packets processed by packet switches. Two approaches to establishing the short cut paths have been discussed. One approach is to use wired path switches [6, 4]. A wired path switch connects its input port with one of its output ports based on its configuration. After the ports are configured, all packets arriving at the input port are relayed to the output port connected to that input port. Thus, the wired path switches can be used to establish energy-efficient short cut paths. A path switch cannot relay flows from different input ports to the same output port, because each output port can be connected to at most one input port. However, the wired path switch consumes less energy than the packet switch, because it does not require complicated processing such as label processing and selection of the output port. Hereafter we call these short cut paths wired paths. Several NoC structures using wired path switches have been discussed [4, 2]. M. B. Stensgaard et al. have proposed a NoC structure where wired path switches are placed around each packet switch [4]. In this architecture, the wired path between packet switches can be established by configuring the wired path switches along the route between the packet switches. We have also investigated the NoC structure where the network constructed of wired path switches is vertically stacked over the layer of the packet switch network [6], and demonstrated that the vertically stacked wired path switch network reduces the energy consumption.

The other approach is to use wireless communication [5, 8]. An energy-efficient wireless communication module for packet switches in NoCs has been proposed [15]. By using wireless communication, packet switches without direct wired links can communicate with each other directly. This wireless communication module consumes only a small amount of energy, because the communication area is quite small. The module proposed by Dan Zhao [15] consumes only 0.19 pW/bit when the communication area is set to $0.1\,\mathrm{mm}^2$. Thus, all nodes can communicate with only a small amount of power. In addition, this module can send a packet within one clock cycle. Thus, these wireless communication channels can be used to establish energy-efficient short cut paths. Hereafter we call the short cut paths established over wireless communication channels wireless paths.

The existing work on short cut paths in NoCs assumes that the short cut paths are established when the application starts. However, the cores that communicate with each other are difficult to predict before the application starts, because they may depend not only on the application but also on the data processed by the application. Therefore, dynamic control of the short cut paths is needed.

In this thesis, we propose a framework that dynamically configures the short cut paths based on the periodically monitored traffic. Moreover we discuss two architectures based on the proposed framework; the first one is the NoC with the wired paths, and the other one is the NoC with wireless paths. To clarify their advantages and disadvantages, we compare the energy consumption achieved by these architectures through numerical simulations.

## 3 NoC Architecture with Integrated Packet and Path Switches

### 3.1 Architecture

In this thesis, we propose a NoC architecture using path and packet switches. This subsection proposes an architecture to control the path networks. This architecture can be applied to both types of paths, wired paths and wireless paths.

Figure 1 shows the proposed architecture. This architecture includes three kinds of layers, which are vertically stacked. The first layer includes the cores and the network constructed of packet switches. The second layer is the path network. The path network is constructed either of wired path switches or by adding a wireless communication module to each packet switch placed on the first layer. The third layer includes the central controller called the *resource manager*, which manages the resources of the path network. In this architecture, we also deploy a distributed controller called the *path request controller* at each packet switch. The path request controller counts the number of packets passing the switch for each destination core. Based on the counted number of packets, it calculates the candidate paths from the switch that are expected to reduce the energy consumption, and requests the establishment of the candidate paths. The resource manager selects the paths to be established from among the requested candidates so as to maximize the expected reduction of the energy consumption.

In this thesis, we use either the wired path network or the wireless path network. The rest of this subsection explains the details of each path network.



Figure 1: Overview of the NoC architecture with path network

#### 3.1.1 Wired Path Network

A wired path network is constructed of wired path switches. A wired path switch consumes less energy than a packet switch because it does not require any processing to relay traffic, though multiple flows from different input ports cannot share the same output port. The wired path between packet switches is established by configuring the path switches along the route of the wired path. The set of the packet switches and the established wired paths constructs the logical network topology. In this architecture, the logical network topology can be changed by reconfiguring the path switches.

In the wired path network, the links used by a path cannot be used by the other paths. Thus, the resource manager manages the resources of the links, and reserves a link when constructing the wired path using the link.

#### 3.1.2 Wireless Path Network

A wireless path network is constructed by adding a wireless communication module to each packet switch. Wireless communication is performed through these modules. By using wireless communication, packet switches without direct wired links can communicate with each other directly. The wireless communication module for the NoC consumes only a small amount of energy because the communication area is quite small. In addition, this module can send a packet within one clock cycle. Thus, wireless communication can be used to establish energy-efficient wireless paths.

Frequency division multiplexing (FDM) enables multiple wireless paths to be established simultaneously. However, the number of wireless paths that can be established is limited to the number of channels. Therefore, the resource manager manages the wireless channels. In our architecture, a wireless path is requested by the path request controller. The resource manager then checks whether a channel that can be assigned to the wireless path exists. If a channel can be assigned, the resource manager assigns it to the wireless path and notifies the path request controller of the assigned channel. Finally, the wireless path is established by using the channel. If no channel can be assigned, the request is rejected.
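The request handling described above can be sketched as follows. This is only a minimal illustration, not the implementation used in this thesis: the class name `WirelessChannelManager`, the integer channel indices, and the first-free assignment order are all assumptions made for the example.

```python
class WirelessChannelManager:
    """Hypothetical resource manager for the FDM channels of the wireless path network."""

    def __init__(self, num_channels):
        # Channels are identified by integer indices (an assumption of this sketch).
        self.free_channels = list(range(num_channels))
        self.assigned = {}  # (source switch, destination switch) -> channel

    def request(self, src, dst):
        """Handle one path request; return the assigned channel, or None if rejected."""
        if not self.free_channels:
            return None  # no assignable channel exists, so the request is rejected
        channel = self.free_channels.pop(0)
        self.assigned[(src, dst)] = channel
        return channel


# With two channels, the third request cannot be served.
manager = WirelessChannelManager(num_channels=2)
first = manager.request((0, 0), (3, 3))
second = manager.request((1, 0), (4, 2))
third = manager.request((2, 2), (0, 5))
```

In this sketch a rejected request simply returns `None`; the packets of the corresponding flows would then keep using the packet network.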

### 3.2 Routing for NoC with Integrating Packet and Path Switches

#### 3.2.1 Construction of Energy Efficient Short Cut Paths

##### 3.2.1.1 Calculation of Candidate Short Cut Paths

In our architecture, the short cut paths are established through the cooperation of the resource manager and the path request controllers. The path request controllers monitor the traffic passing the corresponding packet switches. Then, the path request controllers calculate the candidate paths from the corresponding packet switches and request them from the resource manager. In the rest of this subsection, we explain the details of the steps to establish the paths.

The path request controller counts the number of packets passing the switch for each destination core. We denote the set of flows passing the packet switch $p$ by $F_p$, and the monitored traffic rate of $f \in F_p$ by $B_f$. Based on the counted number of packets, the path request controller calculates the candidate paths from the switch $p$ that are expected to reduce the energy consumption, and requests the establishment of the candidate paths.

The reduction of the energy consumption $E_{cut}$ obtained when a path from the switch $(X, Y)$ to the switch $(X', Y')$ is established is calculated by

$$E_{cut} = \sum_{f \in F_p} B_f \left[ E_p \left( |X - X_f| + |Y - Y_f| \right) - E_p \left( |X' - X_f| + |Y' - Y_f| \right) - E_{short} \right]^+$$

where $F_p$ is the set of flows passing the switch $(X, Y)$, $X_f$ and $Y_f$ are the x and y coordinates of the packet switch nearest to the destination core of the flow $f$, and $[x]^+$ equals $x$ if $x$ is positive and 0 otherwise. $E_{short}$ is the energy consumption of the short cut path from $(X, Y)$ to $(X', Y')$. $E_p$ is the energy consumed when a packet is relayed by a packet switch.

After $E_{cut}$ is calculated, the path request controller selects the $N$ candidate paths with the largest $E_{cut}$ among the calculated candidate paths. Then, the path request controller requests the construction of these candidate paths. Each request includes not only the address of the destination packet switch of the path, but also the calculated reduction of the energy consumption.
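The calculation of $E_{cut}$ and the choice of the $N$ best candidates can be illustrated with the short sketch below. It is an illustration under stated assumptions rather than the code used in the evaluation: the function names, the representation of a flow as a pair of destination switch coordinates and traffic rate, and the callable `e_short_of` that returns $E_{short}$ for a given pair of switches are all hypothetical.

```python
def energy_reduction(src, dst, flows, e_p, e_short):
    """E_cut for a short cut path from switch src = (X, Y) to switch dst = (X', Y').

    flows: iterable of ((X_f, Y_f), B_f) for the flows passing the switch src.
    """
    x, y = src
    xd, yd = dst
    total = 0.0
    for (xf, yf), rate in flows:
        hops_now = abs(x - xf) + abs(y - yf)       # hops from (X, Y) to the destination
        hops_after = abs(xd - xf) + abs(yd - yf)   # hops from (X', Y') to the destination
        saving = e_p * hops_now - e_p * hops_after - e_short
        total += rate * max(saving, 0.0)           # the [x]^+ operator
    return total


def top_candidates(src, destinations, flows, e_p, e_short_of, n):
    """Return up to n (E_cut, destination) pairs with the largest positive E_cut."""
    scored = []
    for dst in destinations:
        if dst == src:
            continue
        e_cut = energy_reduction(src, dst, flows, e_p, e_short_of(src, dst))
        if e_cut > 0:  # keep only paths expected to reduce the energy consumption
            scored.append((e_cut, dst))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:n]


# One flow of rate 10 towards switch (3, 0), observed at switch (0, 0).
example = top_candidates(
    src=(0, 0),
    destinations=[(3, 0), (2, 0), (0, 1)],
    flows=[((3, 0), 10.0)],
    e_p=1.0,
    e_short_of=lambda s, d: 0.5,  # constant short cut cost, for illustration only
    n=2,
)
```

The pairs returned by `top_candidates` correspond to the content of the requests: the destination switch of each candidate path together with its calculated energy reduction.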

##### 3.2.1.2 Selection from Candidate Short Cut Paths

The resource manager receives the requests from all path request controllers. The resource manager selects the short cut paths to be established based on the reduction of the energy consumption included in the requests. Starting from the candidate short cut path with the largest energy reduction, the resource manager checks whether sufficient resources to establish the path exist. If sufficient resources exist, the resource manager selects the path as a path to be established, and reserves the resources. Finally, after all the candidate paths are checked, the resource manager establishes the selected paths by configuring the path switches or the wireless communication modules.
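The selection procedure above amounts to a greedy pass over the requests in descending order of energy reduction. The following sketch illustrates it for the wired case, under the assumption that each request lists the identifiers of the links its path would occupy; the function name and the request format are hypothetical. For wireless paths, the resource check would instead test whether a free channel remains.

```python
def select_paths(requests):
    """Greedy selection of short cut paths.

    requests: list of (e_cut, links), where links is a collection of the link
    identifiers that the requested path needs (hypothetical format).
    """
    reserved = set()
    selected = []
    # Examine the candidates from the largest energy reduction downwards.
    for e_cut, links in sorted(requests, key=lambda r: r[0], reverse=True):
        if reserved.isdisjoint(links):   # sufficient resources exist
            reserved.update(links)       # reserve the links for this path
            selected.append((e_cut, links))
    return selected


requests = [
    (5.0, ("a", "b")),
    (9.0, ("b", "c")),
    (3.0, ("d",)),
]
chosen = select_paths(requests)
```

In this example the request with reduction 5.0 is skipped because link `"b"` is already reserved by the request with reduction 9.0.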

#### 3.2.2 Calculation of Routes over the Short Cut Paths

In this thesis, we use a routing method based on XY routing. In XY routing, a packet is forwarded in the X direction until its X coordinate equals that of the destination packet switch. Then the packet is forwarded in the Y direction.

However, in our architecture, the established short cut paths can be regarded as links similar to the wired links, and the set of the wired links and the established short cut paths forms the virtual network topology. Thus, in our architecture, routes should be controlled over the virtual network topology, which may have a different structure from the grid topology. Therefore, our routing method extends XY routing to take the existence of the short cut paths into account. Our routing method works as follows.

1. If the packet switch connected by the short cut path is the nearest to the destination among the connected packet switches, use the short cut path.
2. Otherwise, if the X coordinate of the destination is different from the X coordinate of the current switch, forward the packet to the switch with the X coordinate whose difference from the X coordinate of the destination becomes smaller.
3. Otherwise, forward the packet to the switch with the Y coordinate whose difference from the Y coordinate of the destination becomes smaller.
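The three rules above can be sketched as a next-hop function. This is a minimal illustration with several assumptions that are not stated in the text: "nearest" is measured by the number of hops on the grid (Manhattan distance), a short cut path is taken only when its far end is strictly nearer than the neighbour chosen by XY routing, and the `shortcuts` dictionary is a hypothetical data structure.

```python
def manhattan(a, b):
    """Number of grid hops between switches a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def next_hop(current, dest, shortcuts):
    """Return the next switch for a packet at `current` heading to `dest`.

    shortcuts maps a switch (x, y) to the switches reachable from it over
    established short cut paths (hypothetical data structure).
    """
    if current == dest:
        return current
    x, y = current
    # Rules 2 and 3: plain XY routing, X direction first.
    if x != dest[0]:
        xy_next = (x + (1 if dest[0] > x else -1), y)
    else:
        xy_next = (x, y + (1 if dest[1] > y else -1))
    # Rule 1: use a short cut path if its far end is strictly nearer to the
    # destination than the XY neighbour (the tie-breaking is an assumption).
    best = min(shortcuts.get(current, ()), key=lambda s: manhattan(s, dest), default=None)
    if best is not None and manhattan(best, dest) < manhattan(xy_next, dest):
        return best
    return xy_next


# A short cut path from (0, 0) to (3, 1) is preferred for destination (3, 2).
hop = next_hop((0, 0), (3, 2), {(0, 0): [(3, 1)]})
```

On a grid, the XY neighbour is always one hop closer to the destination, so comparing against it is equivalent to comparing against all directly wired neighbours.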

## 4 Evaluation

### 4.1 Compared Architectures

In this thesis, we discuss two architectures: the first is the NoC with wired paths, and the other is the NoC with wireless paths. The NoC with wired paths is constructed of path switches, which are placed in a lattice. On the other hand, the NoC with wireless paths is constructed by adding a wireless communication module to each packet switch. The NoC with wireless paths uses FDM. Thus, the number of paths that the NoC with wireless paths can establish is limited by the number of FDM channels. In this thesis, the number of channels is 24.

To clarify their advantages and disadvantages, we compare the energy consumption achieved by these architectures through numerical simulations.

### 4.2 Models

#### 4.2.1 Energy Model

##### 4.2.1.1 Wireless Communication

The communication area of the wireless communication module for NoC proposed by Deb et al. is  $20 \text{ mm}^2$ , and the module consumes 2.3 pJ/bit [10]. We calculate the energy consumption based on this module.

If we change the communication area, the energy consumption of the wireless communication module may change. Thus, we model the energy consumption as the function of the diameter of the communication area. The relation between the energy consumption and the diameter of the communication area is

$$P_W = R \times L_{wireless}^2 \times 10^{-12} \tag{1}$$

where $L_{wireless}$ is the diameter of the communication area, and $R$ is a constant. We calculate $R$ based on the energy consumption of the wireless communication modules by Deb et al. [10]. By using the calculated $R$, the energy consumption $P_W$ (J/bit) is

$$P_W = 0.000825 L_{wireless}^2 \times 10^{-12} \tag{2}$$

##### 4.2.1.2 Wired Link

The packet switch and the path switch use wired links to send packets to the next switch. The energy consumption of the wired link used for the NoC is modeled by Wolkotte et al. [16]. The energy $P_L$ (J/bit) consumed by a wired link between the switches is

$$P_L = 0.12L_{link} \times 10^{-12} \tag{3}$$

where  $L_{link}$  is the length of the wired link.

##### 4.2.1.3 Switch

We use the model by Wolkotte et al. [16]. In this model, the energy consumption of the packet switch $P_P$ is

$$P_P = 0.98 \times 10^{-12}\ \mathrm{[J/bit]}.$$

The energy consumption of the path switch $P_C$ is

$$P_C = 0.37 \times 10^{-12}\ \mathrm{[J/bit]}.$$
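To make the relative magnitudes of the energy model concrete, the sketch below encodes the constants from Eqs. (2) and (3) and the switch model. It is only an illustration: the function and constant names are hypothetical, lengths are assumed to be given in mm, and the per-hop cost (one switch traversal plus one wired link) is an assumption of this example, not a definition taken from the model.

```python
PICO = 1e-12  # the models are stated in units of 10^-12 J/bit

PACKET_SWITCH_J_PER_BIT = 0.98 * PICO  # P_P
PATH_SWITCH_J_PER_BIT = 0.37 * PICO    # P_C


def wireless_j_per_bit(l_wireless):
    """P_W from Eq. (2); l_wireless is the diameter of the communication area."""
    return 0.000825 * l_wireless ** 2 * PICO


def link_j_per_bit(l_link):
    """P_L from Eq. (3); l_link is the length of the wired link."""
    return 0.12 * l_link * PICO


def hop_j_per_bit(switch_j_per_bit, l_link):
    """Assumed per-hop cost: one switch traversal plus one wired link."""
    return switch_j_per_bit + link_j_per_bit(l_link)


# Per-hop energy over a link of length 1 (assumed to be 1 mm).
packet_hop = hop_j_per_bit(PACKET_SWITCH_J_PER_BIT, l_link=1.0)
path_hop = hop_j_per_bit(PATH_SWITCH_J_PER_BIT, l_link=1.0)
```

Under these assumptions, a hop through a path switch costs less than half of a hop through a packet switch, which is why replacing packet switch hops with path switch hops can reduce the total energy.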

#### 4.2.2 Traffic Model

The energy consumption depends on the traffic generated between cores. In this thesis, we generate the following three kinds of traffic.

- **Near cores traffic** Traffic is generated only between near cores. For this traffic, we randomly select the pairs of communicating cores from the set of pairs of cores whose number of hops is less than $H/4$, where $H$ is the largest number of hops between cores in the chip.
- **Remote cores traffic** Traffic is generated only between remote cores. For this traffic, we randomly select the pairs of communicating cores from the set of pairs of cores whose number of hops is more than $3H/4$, where $H$ is the largest number of hops between cores in the chip.
- **Random cores traffic** Traffic is generated between random cores. For this traffic, we randomly select the pairs of communicating cores from all pairs of cores.

The traffic amount between each pair of communicating cores is selected randomly from 1 bit to 100 bits. The traffic amount may have an impact on the energy consumption. However, the energy consumption is proportional to the amount of relayed traffic. Thus, the discussion in this thesis remains applicable even if the traffic rate between cores changes.
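The three traffic patterns can be generated along the lines of the sketch below. It is a hedged illustration, not the simulator used in this thesis: it assumes a square lattice with one core per packet switch, ordered (source, destination) pairs, hop counts measured as Manhattan distance, and $H$ equal to the distance between opposite corners. All function names are hypothetical.

```python
import random


def hop_count(a, b):
    """Number of grid hops between cores a and b (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def candidate_pairs(side, pattern):
    """All ordered core pairs allowed by a traffic pattern on a side x side lattice."""
    cores = [(x, y) for x in range(side) for y in range(side)]
    h_max = 2 * (side - 1)  # H: largest hop count in the chip (corner to corner)
    pairs = [(a, b) for a in cores for b in cores if a != b]
    if pattern == "near":
        return [p for p in pairs if hop_count(*p) < h_max / 4]
    if pattern == "remote":
        return [p for p in pairs if hop_count(*p) > 3 * h_max / 4]
    if pattern == "random":
        return pairs
    raise ValueError(f"unknown traffic pattern: {pattern}")


def generate_flows(side, pattern, num_flows, rng):
    """Draw flows as (source, destination, bits) with 1 to 100 bits each."""
    pairs = candidate_pairs(side, pattern)
    flows = []
    for _ in range(num_flows):
        src, dst = rng.choice(pairs)
        flows.append((src, dst, rng.randint(1, 100)))
    return flows


flows = generate_flows(side=10, pattern="remote", num_flows=5, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the generated traffic reproducible across the compared architectures.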

#### 4.2.3 Metrics

In this thesis, we use the total energy consumed to relay all flows generated between cores as the metric.

### 4.3 Results

The energy consumption depends on the following items.

- Traffic Patterns
- The Gate Density
- Energy Consumption of Devices

In the rest of this section, we discuss the suitable NoC architecture, considering each of the above points.

#### 4.3.1 Impact of Traffic Patterns

The short cut paths to be established depend on the number of flows or the location of the communicating cores. Thus, the number of flows and the location of the communicating cores may have an impact on the energy consumption, and the suitable NoC architecture. In this subsection, we first clarify the suitable network architecture that can accommodate communication between cores with a small energy consumption, considering the traffic patterns.

##### 4.3.1.1 Impact of Number of Flows Generated by Each Core

The number of communicating cores depends on the application; a simple task can be allocated to a small number of cores, while a complex task may require many cores cooperating with each other. The suitable NoC architecture may depend on the number of flows generated by each core. In this paragraph, we consider the impact of the number of flows generated by each core, and clarify the suitable NoC architecture for each kind of application.

In this evaluation, we use the following environment. We set the number of cores to 100, and place them in a $10 \times 10$ lattice. We place the same number of packet switches as the number of cores. The length of the wired link between packet switches is 1 mm. For the NoC with wired paths, we place the same number of path switches as the number of packet switches. For the NoC with wireless paths, we deploy wireless communication modules at all of the packet switches. First, we discuss the impact of the number of flows generated by each core based on the random cores traffic; the number of flows generated by each core is selected randomly. We change the number of flows generated by each core from 1 to 20.





Figure 2: Impact of number of flows generated by each core. Panels: (a) energy consumption; (b) the energy consumption normalized by that of the NoC without path network; (c) the energy consumption of the NoC with wired path normalized by that of the NoC with wireless path.

Figure 2 shows the results. The NoCs with a path network consume less energy than the NoC without a path network. That is, establishing the short cut paths reduces the energy consumption.

Hereafter, we compare the NoC with the wired paths and the NoC with the wireless paths. Fig. 2 demonstrates that the NoC with the wired paths consumes more energy than the NoC with the wireless paths when the number of flows generated by each core is small. This is caused by the difference between the energy consumption of a wired path and that of a wireless path. A wired path consumes more energy than a wireless path, because it requires the processing of the path switches along the route from the source switch to the destination switch.

On the other hand, as the number of flows generated by each core increases, the energy consumption of the NoC with the wireless paths increases significantly. If the number of flows generated by each core becomes more than 15, the NoC with the wireless paths consumes more energy than the NoC with the wired paths. This is caused by the difference in the number of constructible paths. The NoC with the wired paths can establish a path unless a link on the route from the source to the destination of the path is already allocated to another path. In contrast, the NoC with the wireless paths can establish only 24 paths. Therefore, as the number of flows generated by each core increases, the NoC with the wireless paths cannot establish a sufficient number of paths. As a result, a large number of packets are relayed via the packet network, which increases the energy consumption.
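The difference in the number of constructible paths can be illustrated with a minimal sketch. It assumes, purely for illustration, that each wired link carries at most one path and that requests are served in arrival order; the class and method names (`WiredPathNetwork`, `WirelessPathNetwork`, `try_establish`) are hypothetical and are not taken from the controllers evaluated in this thesis. The limit of 24 wireless paths is the value stated above.

```python
from typing import Hashable, Iterable, Set, Tuple

# A directed link between two path switches (node labels are arbitrary).
Link = Tuple[Hashable, Hashable]


class WiredPathNetwork:
    """Admits a path unless one of its links is already allocated (assumption)."""

    def __init__(self) -> None:
        self.allocated: Set[Link] = set()

    def try_establish(self, route: Iterable[Link]) -> bool:
        links = list(route)
        if any(link in self.allocated for link in links):
            return False  # a link on the route already belongs to another path
        self.allocated.update(links)
        return True


class WirelessPathNetwork:
    """Admits paths until a fixed budget is used up (24 in this evaluation)."""

    def __init__(self, max_paths: int = 24) -> None:
        self.max_paths = max_paths
        self.established = 0

    def try_establish(self) -> bool:
        if self.established >= self.max_paths:
            return False  # the flow stays on the packet network
        self.established += 1
        return True
```

In this sketch the wired network keeps accepting paths as long as their routes do not share a link, whereas the wireless network rejects every request after the 24th regardless of where the endpoints are.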

#### 4.3.1.2 Impact of Location of Communicating Cores

We also compare the NoC architectures, considering the location of the communicating cores. In this evaluation, we generate three kinds of traffic patterns, which are explained in Section 4.2.2.

In this evaluation, we use the following environment. We place the same number of packet switches as the number of cores. The length of the wired link between packet switches is 1 mm. We place the same number of path switches as the number of packet switches for the NoC with the wired paths. For the NoC with the wireless paths, we deploy wireless communication modules to all of the packet switches. We change the number of flows generated by each core from 1 to 20.







Figure 3: Impact of location of communicating cores. Panels: (b) the energy consumption normalized by that of the NoC without path network; (c) the energy consumption of the NoC with wired path normalized by that of the NoC with wireless path.

Figure 4: The energy consumption normalized by that of the NoC without path network. Panels: (c) remote cores traffic.

Figure 5: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path

Figures 3, 4 and 5 show the results. In the case of near cores traffic, the NoC with the wireless paths consumes less energy than the NoC with the wired paths. This is caused by the energy consumption of the wired path switches. If the communicating cores are close, a wired path consumes more energy than relaying the packet via the packet network, because the wired path switches consume energy and the number of switches passed by a packet, including the wired path switches, may increase. In this case, the wired paths are not established. On the other hand, the wireless paths are established even in this case, since they consume less energy. As a result, the NoC with the wireless paths reduces the energy consumption, while the NoC with the wired paths cannot.

On the other hand, the NoC with the wired paths reduces the energy consumption when the communicating cores are far from each other. In this case, a large amount of energy is consumed if packets are relayed via only the packet network. Thus, a large number of paths must be established. The NoC with the wired paths can establish a sufficient number of paths, while the NoC with the wireless paths cannot. As a result, the energy consumption of the NoC with the wireless paths becomes larger than that of the NoC with the wired paths.

We summarize the above results below. The NoC with the wireless paths is suitable for the case where the number of flows generated by each core is small and the communicating cores are placed close to each other. However, if the number of flows generated by each core becomes large, or if traffic exists between cores placed far away from each other, the NoC with the wireless paths cannot reduce the energy consumption sufficiently. In this case, the NoC with the wired paths is suitable.
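The reason why wired paths are not established for near cores traffic can be sketched as a simple energy comparison. The cost model below is an assumption made for illustration only: a packet on a wired path is charged for the packet switches at both ends plus every wired path switch on the route, and all energy values are hypothetical per-packet costs in arbitrary units rather than values from Ref. [16]. The function name `wired_path_saves_energy` is likewise hypothetical.

```python
def wired_path_saves_energy(
    packet_hops: int,
    path_hops: int,
    e_packet_switch: float,
    e_path_switch: float,
    e_endpoint: float,
) -> bool:
    """Return True if the wired path is cheaper than the packet-network route.

    packet_hops: packet switches passed when using only the packet network.
    path_hops:   wired path switches passed when using the wired path.
    The three energy arguments are hypothetical per-packet costs.
    """
    e_via_packet_network = packet_hops * e_packet_switch
    # Assumed model: the packet still passes a packet switch at each end of
    # the path, and then every wired path switch along the route.
    e_via_wired_path = 2 * e_endpoint + path_hops * e_path_switch
    return e_via_wired_path < e_via_packet_network


# Near cores: the fixed cost at both ends outweighs the saving.
print(wired_path_saves_energy(2, 2, 1.0, 0.3, 1.0))
# Remote cores: many packet switches are bypassed, so the path pays off.
print(wired_path_saves_energy(10, 10, 1.0, 0.3, 1.0))
```

Under these assumed costs the rule rejects the two-hop path and accepts the ten-hop path, which mirrors the qualitative behavior described for near and remote cores traffic.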

### 4.3.2 Impact of the Gate Density

The gate density of the chips may have an impact on the energy consumption; as the gate density of the chip increases, the number of flows generated by each core increases, which may lead to a large energy consumption. The increase of the gate density can also increase the number of wired path switches on the chip, and thus the number of constructible wired paths. On the other hand, the number of wireless paths cannot be increased even when the gate density increases. That is, the suitable NoC architecture may depend on the gate density of the chips. In this subsection, we discuss the impact of the gate density on the suitable NoC architectures.

#### 4.3.2.1 Impact of Number of Cores in a Chip

We first discuss the impact of the number of cores in the chip. In this evaluation, we use the following environment. We change the number of cores from 25 to 400. We place the same number of packet switches as the number of cores. We set the size of the chip to 20 mm $\times$ 20 mm. We place the same number of path switches as the number of packet switches for the NoC with the wired paths. For the NoC with the wireless paths, we deploy wireless communication modules to all of the packet switches. We generate the random cores traffic by changing the number of flows generated by each core from 1 to 20.





Figure 6: Impact of number of cores in a chip. Panels: (a) energy consumption; (b) the energy consumption normalized by that of the NoC without path network; (c) the energy consumption of the NoC with wired path normalized by that of the NoC with wireless path.



Figure 7 panels: (c) the number of cores on the chip is 225; (d) the number of cores on the chip is 400.



| Number of flows generated by each core | 25 cores | 100 cores | 225 cores | 400 cores |
|----------------------------------------|----------|-----------|-----------|-----------|
| 1                                      | 1.346014 | 1.116541  | 0.958261  | 0.873122  |
| 5                                      | 1.325823 | 1.044335  | 0.930269  | 0.861325  |
| 10                                     | 1.281967 | 1.023401  | 0.907353  | 0.855714  |
| 15                                     | 1.248494 | 0.97504   | 0.835065  | 0.781595  |
| 20                                     | 1.233103 | 0.880099  | 0.788222  | 0.722103  |

Figure 8: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path

Figures 6, 7 and 8 show the results. When the number of cores in the chip is small, the NoC with the wireless paths consumes less energy than the NoC with the wired paths. When the number of cores is small, all cores are close to each other. In this case, the wired paths are not established, as discussed in Section 4.3.1.2. On the other hand, the wireless paths are established even in this case. In addition, since the number of cores is small, a small number of short cut paths are sufficient. As a result, the NoC with the wireless paths reduces the energy consumption significantly.

On the other hand, as the number of cores increases, the energy consumption of the NoC with the wireless paths increases significantly. This is because the NoC with the wireless paths can establish only 24 paths. Therefore, as the number of cores increases, the NoC with the wireless paths cannot establish a sufficient number of paths. In contrast, even if the number of cores becomes large, the NoC with the wired paths can establish a sufficient number of paths, because the number of wired path switches also increases, which increases the number of constructible paths.

We summarize the above results below. The NoC with the wireless paths is suitable for small chips, where the number of cores is less than 100. On the other hand, the NoC with the wired paths is suitable for large NoCs, where more than 225 cores are placed.
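The summary above can be checked directly against the ratios reported in Figure 8. The following minimal sketch copies those values and classifies each chip size; the names `FIGURE_8_RATIO`, `lower_energy_architecture`, and `SUMMARY`, and the three verdict labels, are hypothetical and introduced only for this illustration.

```python
# Ratios reported in Figure 8: energy of the NoC with wired paths divided by
# the energy of the NoC with wireless paths (20 mm x 20 mm chip).
# Outer key: number of cores. Inner key: flows generated by each core.
FIGURE_8_RATIO = {
    25: {1: 1.346014, 5: 1.325823, 10: 1.281967, 15: 1.248494, 20: 1.233103},
    100: {1: 1.116541, 5: 1.044335, 10: 1.023401, 15: 0.975040, 20: 0.880099},
    225: {1: 0.958261, 5: 0.930269, 10: 0.907353, 15: 0.835065, 20: 0.788222},
    400: {1: 0.873122, 5: 0.861325, 10: 0.855714, 15: 0.781595, 20: 0.722103},
}


def lower_energy_architecture(ratios_by_flows):
    """Classify one chip size; a ratio above 1 favors the wireless paths."""
    values = list(ratios_by_flows.values())
    if all(v > 1.0 for v in values):
        return "wireless"
    if all(v < 1.0 for v in values):
        return "wired"
    return "depends on the number of flows"


SUMMARY = {
    cores: lower_energy_architecture(ratios)
    for cores, ratios in FIGURE_8_RATIO.items()
}

for cores, verdict in SUMMARY.items():
    print(f"{cores:3d} cores: {verdict}")
```

With 100 cores the ratio crosses 1 between 10 and 15 flows per core, so that chip size sits on the boundary between the two architectures.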

#### 4.3.2.2 Impact of Number of Path Switches in a Chip

We can increase the number of wired path switches by stacking additional wired path network layers. In this paragraph, we discuss the impact of the number of wired path switches.

In this evaluation, we use the following environment. We evaluate two cases of the number of cores; we place 25 cores in the first case, and 100 cores in the other case. We place the same number of packet switches as the number of cores. The length of the wired link between packet switches is 1 mm. We change the number of wired path switches by changing the number of layers of the path networks from 1 to 3. Each layer of the wired path network includes the same number of path switches as the number of packet switches. For the NoC with the wireless paths, we deploy wireless communication modules to all of the packet switches. We generate the random cores traffic by setting the number of flows generated by each core to 10.





Figure 9 panels: (a) energy consumption; (b) the energy consumption normalized by that of the NoC without path network; (c) the energy consumption of the NoC with wired path normalized by that of the NoC with wireless path.

Figure 10: Impact of number of path switches in a chip when the number of cores is 25. Panels: (a) energy consumption; (b) the energy consumption of the packet switch is normalized as 1; (c) the energy consumption of the wireless network is normalized as 1.

Figures 9 and 10 show the results. Figure 10 shows that increasing the number of wired path switches does not reduce the energy consumption in the small chip with 25 cores. This is because all cores are close to each other. Thus, the wired paths are not established even if we increase the number of wired path switches.

On the other hand, increasing the number of layers of the wired path network significantly decreases the energy consumption in the NoC with 100 cores. The NoC with 100 cores requires the wired paths to achieve energy-efficient communication. By adding a layer to the wired path network, more short cut paths can be established, which reduces the energy consumption.

### 4.3.3 Impact of Energy Consumption of Devices

All of the above discussions are based on the devices in Refs. [10] and [16]. However, an NoC may use devices (i.e., packet switches, wired path switches, and wireless communication modules) whose energy consumption differs from that of the devices in Refs. [17] and [5]. In this subsection, we discuss the impact of the energy consumption of the devices in the NoC [10, 16] on the suitable NoC architectures.

#### 4.3.3.1 Impact of the Area of the Chip

The communication area has a large impact on the energy consumption of the wireless paths; the energy consumption of the wireless path becomes small as the communication area becomes small. The size of the chip may be reduced by the progress of the processing technologies, which may reduce the required communication area of the wireless communication. In this paragraph, we discuss the impact of the size of the chip.

In this evaluation, we use the following environment. We change the number of cores from 25 to 400. We place the same number of packet switches as the number of cores. We change the size of the chip from 5 mm  $\times$  5 mm to 20 mm  $\times$  20 mm. We place the same number of path switches as the number of packet switches for the NoC with the wired paths. For the NoC with the wireless paths, we deploy wireless communication modules to all of the packet switches.







Figure 11: Impact of the Area of the Chip. Panels: (b) the energy consumption normalized by that of the NoC without path network; (c) the energy consumption of the NoC with wired path normalized by that of the NoC with wireless path.

Figure 12: The energy consumption normalized by that of the NoC without path network. Panels: (c) the length of the chip is 20 mm.

| Number of flows generated by each core | 25 cores | 100 cores | 225 cores | 400 cores |
|----------------------------------------|----------|-----------|-----------|-----------|
| 1                                      | 1.682464 | 1.310573  | 1.037422  | 0.923387  |
| 5                                      | 1.576344 | 1.246988  | 1.017013  | 0.901639  |
| 10                                     | 1.523327 | 1.176909  | 0.988014  | 0.883031  |
| 15                                     | 1.462687 | 1.128028  | 0.966346  | 0.868778  |
| 20                                     | 1.409722 | 1.043956  | 0.938988  | 0.802168  |

(a) The length of the chip is 5 mm

| Number of flows generated by each core | 25 cores | 100 cores | 225 cores | 400 cores |
|----------------------------------------|----------|-----------|-----------|-----------|
| 1                                      | 1.5625   | 1.206     | 0.981785  | 0.894646  |
| 5                                      | 1.486974 | 1.167587  | 0.954237  | 0.889803  |
| 10                                     | 1.455408 | 1.129371  | 0.944708  | 0.869894  |
| 15                                     | 1.409091 | 1.081433  | 0.920354  | 0.814516  |
| 20                                     | 1.373377 | 0.96034   | 0.896409  | 0.750594  |

(b) The length of the chip is 10 mm

| Number of flows generated by each core | 25 cores | 100 cores | 225 cores | 400 cores |
|----------------------------------------|----------|-----------|-----------|-----------|
| 1                                      | 1.346014 | 1.116541  | 0.958261  | 0.873122  |
| 5                                      | 1.325823 | 1.044335  | 0.930269  | 0.861325  |
| 10                                     | 1.281967 | 1.023401  | 0.907353  | 0.855714  |
| 15                                     | 1.248494 | 0.97504   | 0.835065  | 0.781595  |
| 20                                     | 1.233103 | 0.880099  | 0.788222  | 0.722103  |

(c) The length of the chip is 20 mm

Figure 13: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path

Figures 11, 12 and 13 show the results. As the area of the chip decreases, the energy consumption of the NoC with the wireless paths decreases. This is because restricting the communication area reduces the energy consumption of each wireless path.

These figures also indicate that the NoC with the wireless path network consumes more energy than the NoC with the wired path network when the number of cores is large, even in the case of the 5 mm $\times$ 5 mm chip. That is, even if the area of the chip becomes small, the wired path network is suitable for the NoC with a large number of cores.
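Both observations can be verified against the ratios reported in Figure 13. The sketch below copies those values and checks them cell by cell; the identifiers `FIGURE_13_RATIO`, `ratio`, `SHRINKING_FAVORS_WIRELESS`, and `WIRED_WINS_AT_400` are hypothetical names introduced only for this illustration.

```python
CORES = (25, 100, 225, 400)
FLOWS = (1, 5, 10, 15, 20)

# Figure 13 ratios (wired / wireless), keyed by chip side length in mm.
# Rows follow FLOWS, columns follow CORES.
FIGURE_13_RATIO = {
    5: [
        [1.682464, 1.310573, 1.037422, 0.923387],
        [1.576344, 1.246988, 1.017013, 0.901639],
        [1.523327, 1.176909, 0.988014, 0.883031],
        [1.462687, 1.128028, 0.966346, 0.868778],
        [1.409722, 1.043956, 0.938988, 0.802168],
    ],
    10: [
        [1.562500, 1.206000, 0.981785, 0.894646],
        [1.486974, 1.167587, 0.954237, 0.889803],
        [1.455408, 1.129371, 0.944708, 0.869894],
        [1.409091, 1.081433, 0.920354, 0.814516],
        [1.373377, 0.960340, 0.896409, 0.750594],
    ],
    20: [
        [1.346014, 1.116541, 0.958261, 0.873122],
        [1.325823, 1.044335, 0.930269, 0.861325],
        [1.281967, 1.023401, 0.907353, 0.855714],
        [1.248494, 0.975040, 0.835065, 0.781595],
        [1.233103, 0.880099, 0.788222, 0.722103],
    ],
}


def ratio(length_mm, flows, cores):
    """Look up one cell of Figure 13."""
    return FIGURE_13_RATIO[length_mm][FLOWS.index(flows)][CORES.index(cores)]


# A smaller chip raises the ratio in every cell, i.e. the wireless paths
# become relatively cheaper as the communication area shrinks.
SHRINKING_FAVORS_WIRELESS = all(
    ratio(5, f, c) > ratio(10, f, c) > ratio(20, f, c)
    for f in FLOWS
    for c in CORES
)

# With 400 cores the ratio stays below 1 at every chip size.
WIRED_WINS_AT_400 = all(
    ratio(length, f, 400) < 1.0 for length in FIGURE_13_RATIO for f in FLOWS
)

print(SHRINKING_FAVORS_WIRELESS, WIRED_WINS_AT_400)
```

Both checks hold for the reported values: shrinking the chip helps the wireless paths in every cell, yet it does not change which architecture consumes less energy at 400 cores.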

#### 4.3.3.2 Impact of the Energy Consumption Model

In the previous paragraph, we discussed the impact of the energy consumption of the wireless communication modules. The energy consumption of the path switch may also be reduced in future NoCs. In this paragraph, we discuss the impact of the energy consumption of the wired path switches on the suitable NoC architectures.

In this evaluation, we change the energy consumption of the path switch from 25% to 100% of the model in Ref. [16]. We evaluate the chips with 25, 100, 225, and 400 cores. We place the same number of packet switches as the number of cores. We place the same number of path switches as the number of packet switches for the NoC with the wired paths. For the NoC with the wireless paths, we deploy wireless communication modules to all of the packet switches. We generate the random cores traffic, and change the number of flows generated by each core from 1 to 20.

| Number of flows generated by each core | Path switch energy 100% | Path switch energy 50% | Path switch energy 25% |
|----------------------------------------|-------------------------|------------------------|------------------------|
| 1                                      | 1.346014                | 1.201087               | 0.911232               |
| 5                                      | 1.325823                | 1.176776               | 0.904679               |
| 10                                     | 1.281967                | 1.136066               | 0.895082               |
| 15                                     | 1.248494                | 1.072289               | 0.858434               |
| 20                                     | 1.191724                | 0.997241               | 0.816552               |

(a) The number of cores on the chip is 25

| Number of flows generated by each core | Path switch energy 100% | Path switch energy 50% | Path switch energy 25% |
|----------------------------------------|-------------------------|------------------------|------------------------|
| 1                                      | 1.116541                | 0.926692               | 0.894737               |
| 5                                      | 1.044335                | 0.917898               | 0.824302               |
| 10                                     | 1.023401                | 0.900156               | 0.801872               |
| 15                                     | 0.97504                 | 0.848442               | 0.760623               |
| 20                                     | 0.880099                | 0.767614               | 0.690977               |

(b) The number of cores on the chip is 100

| Number of flows generated by each core | Path switch energy 100% | Path switch energy 50% | Path switch energy 25% |
|----------------------------------------|-------------------------|------------------------|------------------------|
| 1                                      | 0.958261                | 0.866087               | 0.784348               |
| 5                                      | 0.930269                | 0.827258               | 0.751189               |
| 10                                     | 0.907353                | 0.805882               | 0.733824               |
| 15                                     | 0.835065                | 0.744156               | 0.679221               |
| 20                                     | 0.788222                | 0.684032               | 0.620612               |

(c) The number of cores on the chip is 225

| Number of flows generated by each core | Path switch energy 100% | Path switch energy 50% | Path switch energy 25% |
|----------------------------------------|-------------------------|------------------------|------------------------|
| 1                                      | 0.873122                | 0.796327               | 0.716194               |
| 5                                      | 0.861325                | 0.770416               | 0.697997               |
| 10                                     | 0.855714                | 0.745714               | 0.671429               |
| 15                                     | 0.781595                | 0.672393               | 0.62454                |
| 20                                     | 0.722103                | 0.621245               | 0.575107               |

(d) The number of cores on the chip is 400

Figure 14: The energy consumption of the NoC with wired path normalized by that of the NoC with wireless path

Figure 14 shows the results. Even if the energy consumption of the path switch changes, the above discussion of the suitable NoC architecture remains applicable, although the energy consumption itself changes.
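One way to read Figure 14 is to check whether the trend with respect to the number of cores survives at each path switch energy level. The sketch below copies the reported ratios and tests that the ratio falls as the number of cores grows, for every number of flows; the names `FIGURE_14_RATIO`, `trend_holds`, and `TREND` are hypothetical and introduced only for this illustration.

```python
PATH_SWITCH_ENERGY_PERCENT = (100, 50, 25)
FLOWS = (1, 5, 10, 15, 20)

# Figure 14 ratios (wired / wireless). Outer key: number of cores.
# Rows follow FLOWS, columns follow PATH_SWITCH_ENERGY_PERCENT.
FIGURE_14_RATIO = {
    25: [
        [1.346014, 1.201087, 0.911232],
        [1.325823, 1.176776, 0.904679],
        [1.281967, 1.136066, 0.895082],
        [1.248494, 1.072289, 0.858434],
        [1.191724, 0.997241, 0.816552],
    ],
    100: [
        [1.116541, 0.926692, 0.894737],
        [1.044335, 0.917898, 0.824302],
        [1.023401, 0.900156, 0.801872],
        [0.975040, 0.848442, 0.760623],
        [0.880099, 0.767614, 0.690977],
    ],
    225: [
        [0.958261, 0.866087, 0.784348],
        [0.930269, 0.827258, 0.751189],
        [0.907353, 0.805882, 0.733824],
        [0.835065, 0.744156, 0.679221],
        [0.788222, 0.684032, 0.620612],
    ],
    400: [
        [0.873122, 0.796327, 0.716194],
        [0.861325, 0.770416, 0.697997],
        [0.855714, 0.745714, 0.671429],
        [0.781595, 0.672393, 0.624540],
        [0.722103, 0.621245, 0.575107],
    ],
}

CORE_COUNTS = sorted(FIGURE_14_RATIO)


def trend_holds(column):
    """True if the ratio falls as cores increase, for every flow count."""
    return all(
        FIGURE_14_RATIO[a][row][column] > FIGURE_14_RATIO[b][row][column]
        for row in range(len(FLOWS))
        for a, b in zip(CORE_COUNTS, CORE_COUNTS[1:])
    )


TREND = {pct: trend_holds(i) for i, pct in enumerate(PATH_SWITCH_ENERGY_PERCENT)}
print(TREND)
```

For the reported values the trend holds at 100%, 50%, and 25% of the path switch energy, so the wired paths become relatively more attractive as the number of cores grows at every evaluated energy level.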

# 5 Conclusion

We proposed the NoC architecture combining a packet network with a path network. In this architecture, the path request controllers monitor the traffic passing the corresponding packet switches, and request the establishment of the paths from the packet switch so as to reduce the energy consumption. Then, the resource controller constructs the paths based on the requests from the path request controllers.

In this thesis, we also discussed the suitable NoC architectures, focusing on the technology for the path networks: wired path networks and wireless path networks. The results indicate that (1) the NoC with the wireless path network is suitable for an NoC with a small number of cores or an NoC where communication occurs between a small number of cores, and (2) the NoC with the wired path network is required to accommodate communication with small energy consumption in an NoC with a large number of cores or an NoC where communication between cores occurs frequently. We also performed the evaluation while changing the area of the chip or the energy consumption of the path switches, and demonstrated that the above discussion is applicable even in such cases.

In this thesis, we focus on the short cut paths. However, for the reduction of the energy consumption, there remain other points to be discussed. For example, the task assignments may have a large impact on the energy consumption, which is one of our future research topics.

# Acknowledgments

Foremost, I would like to express my deepest gratitude to Professor Masayuki Murata of Osaka University for his precise guidance, encouragement, and insightful comments. Furthermore, I would like to show my sincere appreciation to Assistant Professor Yuichi Ohsita of Osaka University for his continuous support, helpful discussions, and insightful advice. I would also like to express my sincere appreciation to Associate Professor Shin'ichi Arakawa and Assistant Professor Daichi Kominami for their appropriate advice. Finally, I thank my friends and colleagues in the Department of Information Networking, Graduate School of Information Science and Technology of Osaka University for their kindness.

# References

- [1] S. Borkar, "Thousand core chips: a technology perspective," in *Proceedings of DAC*, pp. 746–749, June 2007.
- [2] A. Ankur, C. Iskander, and R. Shankar, "Survey of network on chip architectures & contributions," *Journal of Engineering, Computing and Architecture*, pp. 21–27, 2009.
- [3] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Salah, "Performance evaluation and design trade-offs for network-on-chip interconnect architecture," in *IEEE Transactions on Computers*, vol. 54, pp. 1025–1040, Aug. 2005.
- [4] M. B. Stensgaard and J. Sparso, "ReNoC: A network-on-chip architecture with reconfigurable topology," in *Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip*, pp. 55–64, Apr. 2008.
- [5] J. Murray, P. Pande, and B. Shirazi, "Sustainable multi-core architecture with on-chip wireless links," in *Proceedings of the great lakes symposium on VLSI*, pp. 263–266, 2012.
- [6] T. Ikeda, Y. Ohsita, and M. Murata, "3D network structures using circuit switches and packet switches for on-chip data centers," in *Proceedings of International Journal On Advances in Networks and Services*, pp. 73–84, June 2014.
- [7] M. Modarressi, H. Sarbazi-Azad, and M. Arjomand, "A hybrid packet-circuit switched on-chip network based on sdm," in *Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2009. DATE '09*, pp. 566–569, Apr. 2009.
- [8] A. Ganguly, S. Deb, and B. Belzer, "Scalable hybrid wireless network-on-chip architectures for multicore systems," in *IEEE Transactions on Computers*, pp. 1485–1502, Oct. 2011.
- [9] X. Yu, B. J, W. P, D. Heo, P. P. P, and M. S, "Architecture and design of multichannel millimeter-wave wireless noc," in *Proceedings of Design Test*, pp. 19–28, Dec. 2014.
- [10] S. Deb, A. Ganguly, P. P. Pande, B. Belzer, and D. Heo, "Wireless NoC as interconnection backbone for multicore chips: promises and challenges," in *Proceedings of Emerging and Selected Topics in Circuits and Systems*, pp. 228–239, June 2012.
- [11] F. Li, C. Nicopoulos, T. Richardson, and Y. Xie, "Design and management of 3D multiprocessors using network-in-memory," in *Proceedings of ISCA*, pp. 130–141, June 2006.

- [12] Abdallah and A. Ben, "3D network-on-chip," in Multicore Systems On-Chip: Practical Software/Hardware Design. Atlantis Press, pp. 89–125, 2013.
- [13] Viswanathan, N., K. Paramasivam, and K. Somasundaram, "Exploring optimal topology and routing algorithm for 3D network on chip," in *American Journal of Applied Sciences*, 2012.
- [14] H. Gu, Z. Chen, Y. Yang, and H. Ding, "RONoC: A reconfigurable architecture for application-specific optical network-on-chip," in *Proceedings of IEICE TRANSACTIONS on Information and Systems*, pp. 142–145, June 2014.
- [15] D. Zhao and Y. Wang, "SD-MAC: Design and synthesis of a Hardware-Efficient Collision-Free QoS-Aware MAC protocol for wireless Network-on-Chip," in *Computers, IEEE Transactions*, pp. 1230–1245, May 2008.
- [16] P. T. Wolkotte, G. J. M. Smit, N. Kavaldjiev, J. E. Becker, and J. Becker, "Energy model of networks-on-chip and a bus," in *Proceedings of IEEE International Symposium on System-on-Chip*, pp. 82–85, Nov. 2005.
- [17] M. Modarressi and H. Sarbazi-Azad, A High-performance and low-power on-chip network with reconfigurable topology. Dynamic reconfigurable network-on-chip design: innovations for computational processing and communication, June 2010.