

# VLSI IMPLEMENTATION OF CDMA ENCODER DECODER FOR NETWORK ON CHIP COMMUNICATIONS

Karri Naga Narasimha<sup>1</sup>, E. Sarva Rameswarudu<sup>2</sup>

<sup>1</sup> M.Tech, ES&VLSI, Kakinada Institute of Technology and Science, Divili

<sup>2</sup> Professor, Dept of ECE, Kakinada Institute of Technology and Science, Divili

**Abstract:** Code Division Multiple Access (CDMA) is proposed as the physical layer enabler of Network-on-Chip (NoC) interconnects for its prominent features such as fixed latency, guaranteed service, and reduced system complexity. The NoC community adopted CDMA interconnects directly from wireless communications, where each bit of a CDMA-encoded data word is transmitted on a separate channel to avoid interference. However, interference can be efficiently mitigated in on-chip interconnects, which eliminates the need to replicate the CDMA channel. Moreover, wireless channels are sequential by nature, whereas on-chip interconnects use parallel buses as the default means of communication.

Keywords: CDMA, WB, NoC, SB

### I. INTRODUCTION

Modern Systems-on-Chips (SoCs) are becoming massively parallel, with many interconnected Processing Elements (PEs). The PEs are commonly interconnected through buses and Networks-on-Chips (NoCs) [1]. In NoCs, exchanged data is bundled into packets that traverse several network layers down to the physical layer, which defines how packets are actually transmitted between NoC units. The physical layer of a NoC is implemented by routers employing crossbar switches. Code Division Multiple Access (CDMA) is a medium sharing technique that leverages orthogonal codes to enable simultaneous packet routing. Unlike time-shared channels, CDMA shares the channel in the code space. CDMA has been proposed as an on-chip interconnect technique for both bus and NoC interconnect architectures [2]. The advantages of using CDMA for on-chip interconnects include reduced power consumption, fixed communication latency, and reduced system complexity [3]. CDMA in NoC interconnects is adopted from the wireless communications literature, where the data is spread by orthogonal codes at the transmitters, the spread data are added on the wireless channel, and the received sum is decoded at the receivers. Classical CDMA systems rely on the Walsh orthogonal code family to enable medium sharing.

Many research groups have investigated several aspects of CDMA in NoCs, including our group, which presented the Overloaded CDMA for on-chip Interconnects (OCI) [4] [5] [6]. A 14-node CDMA-based network has been developed in [7]. The network utilizes 7 Walsh codes, and the Walsh codes are assigned to the network nodes dynamically, based on the request from each node. Two architectures have been introduced in [7]: a serial CDMA network, where each data chip in the spreading code is sent in one clock cycle, and a parallel CDMA network, where all data chips are sent in the same cycle. The serial and parallel CDMA-based networks have been compared to a conventional CDMA network, a mesh-based NoC, and a Time Division Multiple Access (TDMA) bus. For the same network area, the throughput of the parallel CDMA network is higher than that of the mesh-based NoC and the TDMA bus due to the simultaneous medium access nature of CDMA. Standard-basis codes are proposed as a replacement for Walsh CDMA codes in [8]. Standard-basis codes resemble TDMA signaling because each code consists of a single chip equal to one while the remaining chips are zeros. The orthogonality of these TDMA-like codes enables them to replace the Walsh codes as spreading and despreading CDMA codes, which reduces the complexity of the channel adder and decoder because the sum of the codes is limited to zero or one per clock cycle.

Fig. 1. Conventional CDMA crossbar



The conventional CDMA crossbar employed in the literature is depicted in Figure 1. The crossbar interconnects N transmit ports to N receive ports using Walsh spreading codes of length N chips. The binary data from each transmit port is encoded using an XOR encoder: the data bit is XORed with a unique N-chip spreading code assigned to the transmit-receive pair and transmitted in N clock cycles. The spread data from all encoders are added by the CDMA channel adder and sent to all receive ports. The decoder at each receive port extracts the data from the channel sum by correlating the channel sum with the assigned spreading code. The correlation operation is implemented using an accumulator and a multiplexer since the despreading code chips are unipolar ("0" or "1"). In all of the CDMA interconnect related work, each data bit in a data word is encoded and transmitted in a separate CDMA channel, and the encoding/decoding logic is replicated W times for data packets of width W, which is a direct application of the wireless CDMA principles to NoC interconnects. However, wireless communication channels are sequential by nature due to the interference problem. Multiple access and MIMO techniques can enable concurrent data transmission on the same wireless channel at the expense of increasing the transmitter/receiver complexity. In on-chip interconnects, on the other hand, a single channel can be efficiently utilized to enable parallel data transmission, as noise and interference effects can be efficiently mitigated [9]. In this work, we present a single-channel, multi-bit CDMA crossbar, namely the Aggregated CDMA (ACDMA) NoC crossbar.

### III. PROPOSED NOC CROSSBAR ARCHITECTURE

Direct sequence spread spectrum CDMA (DSSS-CDMA) is a leading approach for medium sharing in wireless communications. Each spreading code in a set of orthogonal codes is a stream of chips of length *N*, and each transmitted data bit is multiplied by its spreading code so that the bit is spread over *N* cycles. A unique spreading code is assigned to every TX-RX pair sharing the communication channel. Data streams of users sharing the channel are spread and simultaneously transmitted to an additive communication channel. Despreading is achieved by applying the correlation operation to the received sum, where each receiver can extract its data by correlating the sum with the assigned spreading code. Orthogonality between spreading codes guarantees unique identification of every code received in the channel sum by exploiting the associative and distributive properties of the addition operation carried out by the communication channel. In wireless communications, random effects such as noise, fading, and multipath arising in the communication channel affect proper identification of the received sum, which increases the bit error rate (BER) of the received data.
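The spreading, additive-channel, and correlation steps described above can be sketched as follows. This is a minimal illustration rather than the circuit used in this work: it assumes bipolar (+1/-1) signalling as in wireless DSSS-CDMA, a noise-free channel, and Walsh codes generated by the Sylvester construction. The function names `walsh_codes`, `spread_and_add`, and `despread` are hypothetical.

```python
def walsh_codes(n):
    """Return n bipolar (+1/-1) Walsh codes of length n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    codes = [[1]]
    while len(codes) < n:
        # Sylvester construction: [[H, H], [H, -H]]
        codes = [row + row for row in codes] + \
                [row + [-c for c in row] for row in codes]
    return codes


def spread_and_add(bits, codes):
    """Spread one bit per user and add all spread streams on the channel."""
    symbols = [2 * b - 1 for b in bits]          # map 0/1 to -1/+1
    n = len(codes[0])
    return [sum(s * code[i] for s, code in zip(symbols, codes))
            for i in range(n)]


def despread(channel_sum, code):
    """Correlate the channel sum with one code and decide the bit."""
    correlation = sum(r * c for r, c in zip(channel_sum, code))
    return 1 if correlation > 0 else 0           # correlation is +N or -N


codes = walsh_codes(4)
bits = [1, 0, 0, 1]                              # one data bit per TX-RX pair
channel = spread_and_add(bits, codes)
decoded = [despread(channel, code) for code in codes]
print(channel, decoded)
```

Because distinct codes have zero cross-correlation, every term except the receiver's own cancels in the correlation, which is why a simple sign decision suffices in the noise-free case.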

Unfortunately, the number of orthogonal codes in a spreading code set is usually limited to the spreading code length *N*, which reduces the channel utilization efficiency. Overloaded CDMA has been proposed in the wireless communication literature to increase the number of spreading codes by adding nonorthogonal codes that can be identified on the receiver side [13]. Increasing the channel utilization comes at the expense of relaxing the orthogonality requirements of the spreading codes and increasing multiple access interference (MAI), which consequently increases the BER. The overloaded CDMA spreading codes proposed in wireless communications are accompanied by complicated receiver structures that use multiuser detection instead of the simple correlator or matched filter receiver employed in basic DSSS-CDMA.



The basic structure of applying the CDMA technique to a NoC with a star topology is shown in Fig. 2. In this figure, a PE executes tasks of the application, and the network interface (NI) divides data flows from the PE into packets and reconstructs data flows from the packets received from the NoC. In the sender, packet flits from the NI are transformed into a sequential bit stream by a parallel-to-serial (P2S) module. This bit stream is encoded with an orthogonal code in the Encoding module (E in Fig. 2). The coded data from different encoding modules are added together in the Addition module (A in Fig. 2). Then, the sums of data chips are transmitted to the receivers. In the receiver, Decoding modules (D in Fig. 2) reconstruct the original data bits from the sums of data chips. These sequential bit streams are then transformed into packet flits by serial-to-parallel (S2P) modules. Finally, the packet flits are transferred to the NI.
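The P2S and S2P conversions at the two ends of the path can be sketched as below. The bit order (least significant bit first) and the function names are assumptions made for illustration; the text does not specify either.

```python
def parallel_to_serial(flit, width):
    """Serialize a `width`-bit flit into a list of bits, LSB first (assumed order)."""
    return [(flit >> i) & 1 for i in range(width)]


def serial_to_parallel(bits):
    """Reassemble a flit from a bit stream produced by parallel_to_serial."""
    flit = 0
    for i, bit in enumerate(bits):
        flit |= (bit & 1) << i
    return flit


stream = parallel_to_serial(0b10110010, 8)   # sender side (P2S)
restored = serial_to_parallel(stream)        # receiver side (S2P)
print(stream, bin(restored))
```

Any bit order works as long as both modules agree on it, since the encoder and decoder between them operate on one bit at a time.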

In the CDMA NoC, the network scheduler receives the transmit requests from the senders and assigns proper spreading codes to the senders and the requested receivers. Note that the all-zero codeword is assigned to nodes that have no data to transmit or receive. Moreover, when multiple senders request the same receiver, the scheduler applies an arbitration scheme, for example, round-robin. The chip counters count how many orthogonal chips are used in one encoding/decoding operation. Each node needs two chip counters, one for the sender and the other for the receiver. Note that packet flits from the NI can also be transformed into multiple bit streams in the P2S module to trade off power/area cost against packet transfer latency, and the scheduler should provide a bit-synchronous scheme to maintain the orthogonality of the transmitted channels, as discussed in [8]. In this work, we focus on the design and comparison of the Walsh-based (WB) and standard-basis (SB) CDMA encoding/decoding methods, which correspond to the E, A, and D modules in Fig. 2.
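A possible software model of the scheduling behaviour described above is sketched here. It is not the scheduler of this work: the data structures, the function name `schedule`, the use of code index 0 for the all-zero codeword, and the particular round-robin rule are all assumptions for illustration.

```python
ALL_ZERO = 0  # assumed index of the all-zero codeword given to idle nodes


def schedule(requests, num_nodes, num_codes, last_granted):
    """Grant at most one sender per receiver and assign spreading-code indices.

    requests:     dict mapping sender id -> requested receiver id
    num_codes:    number of usable non-zero codes, indexed 1..num_codes
    last_granted: dict mapping receiver id -> sender granted last time
                  (round-robin state, updated in place)
    Returns (tx_code, rx_code), the code index of every sender and receiver.
    """
    tx_code = [ALL_ZERO] * num_nodes
    rx_code = [ALL_ZERO] * num_nodes
    next_code = 1
    for receiver in range(num_nodes):
        contenders = sorted(s for s, r in requests.items() if r == receiver)
        if not contenders or next_code > num_codes:
            continue
        # Round-robin: first contender after the previous winner, else wrap around.
        previous = last_granted.get(receiver, -1)
        winner = next((s for s in contenders if s > previous), contenders[0])
        tx_code[winner] = rx_code[receiver] = next_code
        last_granted[receiver] = winner
        next_code += 1
    return tx_code, rx_code


state = {}
requests = {0: 2, 1: 2, 3: 0}                 # senders 0 and 1 both want receiver 2
first = schedule(requests, 4, 3, state)
second = schedule(requests, 4, 3, state)      # sender 1 wins receiver 2 this time
print(first, second)
```

In this sketch a granted sender and its receiver share the same code index, and a sender that loses arbitration keeps the all-zero codeword until a later round.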

Fig. 3(a) shows the WB encoder architecture. An original data bit is first encoded with a Walsh code through an XOR operation. The encoded data are then added arithmetically into a multibit sum signal. Each sender needs an XOR gate, and multiple wires are needed to carry the sum signal if there are two or more senders. Moreover, the number of wires increases as the number of senders increases. Fig. 3(b) shows our SB encoding scheme. An original data bit from a sender is fed into an AND gate in a chip-by-chip manner, and it is spread into n-chip encoded data with an orthogonal code of a standard basis. The relationship between a bit and a chip is shown in Fig. 4. Then, the encoded data from different senders are mixed together through an XOR operation, and a binary sum signal is generated. Therefore, the output signal is always a binary sequence that is transferred to the destination over a single wire. The progressions of both encoding schemes are depicted in Fig. 4. Fig. 4(a) and (b) illustrate the WB encoding process with four-chip Walsh codes and the SB encoding process with four-chip standard orthogonal codes, respectively.
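The two encoding schemes can be contrasted with a short behavioural model. This is only a sketch: the code assignments, the data values, and the function names `wb_encode` and `sb_encode` are assumptions chosen for illustration, not values taken from Fig. 4.

```python
def wb_encode(data_bits, walsh_codes):
    """WB scheme: XOR each data bit with its Walsh code, then add chips arithmetically."""
    n = len(walsh_codes[0])
    return [sum(d ^ code[i] for d, code in zip(data_bits, walsh_codes))
            for i in range(n)]


def sb_encode(data_bits, sb_codes):
    """SB scheme: AND each data bit with its standard-basis code, then XOR the chips."""
    n = len(sb_codes[0])
    sum_signal = []
    for i in range(n):
        chip = 0
        for d, code in zip(data_bits, sb_codes):
            chip ^= d & code[i]
        sum_signal.append(chip)
    return sum_signal


walsh = [[0, 1, 0, 1], [0, 0, 1, 1]]         # assumed four-chip Walsh codes
standard = [[1, 0, 0, 0], [0, 1, 0, 0]]      # four-chip standard-basis codes
data = [1, 1]                                # one data bit per sender

wb_sum = wb_encode(data, walsh)              # multibit values, needs several wires
sb_sum = sb_encode(data, standard)           # always 0 or 1, fits on one wire
print(wb_sum, sb_sum)
```

With two senders the WB sum already reaches 2 and therefore needs two wires per chip, while the SB sum stays binary because at most one standard-basis code is non-zero in any chip position.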



Fig. 2. Structure of CDMA NoC



Fig. 3. Block diagram of encoding scheme. (a) WB encoder. (b) SB encoder



|| Volume 5 || Issue 8 || August 2020 || ISSN (Online) 2456-0774

INTERNATIONAL JOURNAL OF ADVANCE SCIENTIFIC RESEARCH

AND ENGINEERING TRENDS



Fig. 4. Data encoding example. (a) WB encoding. (b) SB encoding





**CDMA Decoder** The WB decoding scheme is presented in Fig. 5(a). According to the chip value of the Walsh code, the received multibit sums are accumulated into the positive part (if the chip value is 0) or the negative part (if the chip value is 1). Each of the two accumulators in the WB decoder therefore contains a multibit adder to accumulate the incoming chips and a group of registers to hold the accumulated value. The comparison module after the two accumulators reconstructs the original data: if the value of the positive part is larger, the original data is 1; otherwise, the original data is 0. The SB decoding scheme is shown in Fig. 5(b). When the binary sum signal arrives at the receivers, an AND operation is taken between the binary sum and the corresponding orthogonal code in a chip-by-chip manner. The resulting chips are then sent to an accumulator. After m chips are accumulated (m is the length of the orthogonal code), the output value of the accumulator is the corresponding original data. Note that exactly one chip is equal to 1 and all other chips are equal to 0 in an orthogonal code of the standard basis. Hence, the maximal accumulated value in the SB accumulator is 1, and it can be stored in a 1-bit register. Therefore, the SB decoding module uses only one AND gate and an accumulator with one 1-bit register, resulting in fewer logic resources. An example of the decoding process is illustrated in Fig. 6. In Fig. 6(a), at the WB decoder of receiver 1, the accumulated value 3 in the positive part is larger than the accumulated value 1 in the negative part. By the WB decoding scheme, the decoded data is 1, which is equal to the source data bit from sender 1. In Fig. 6(b), at the SB decoder of receiver 1, the output value of the accumulator is 1, which is also equal to the source data bit from sender 1. The decoding results in receiver 2 are also correct, but are not shown in the figure. Hence, both methods can reconstruct the original data bit from the sum signal by using their respective spreading codes.
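Both decoders can be modelled in a few lines. In this sketch the channel sums and code assignments are assumed values (sender 1 transmits 1, sender 2 transmits 0), chosen so that the WB decoder of receiver 1 accumulates 3 in the positive part and 1 in the negative part, the values quoted above. The function names are hypothetical.

```python
def wb_decode(sums, walsh_code):
    """WB decoder: two accumulators followed by a comparator."""
    positive = sum(s for s, c in zip(sums, walsh_code) if c == 0)
    negative = sum(s for s, c in zip(sums, walsh_code) if c == 1)
    return (1 if positive > negative else 0), positive, negative


def sb_decode(binary_sum, sb_code):
    """SB decoder: one AND gate and a 1-bit accumulator."""
    accumulator = 0
    for s, c in zip(binary_sum, sb_code):
        accumulator += s & c      # at most one chip of sb_code is 1, so this never exceeds 1
    return accumulator


# Assumed WB channel sums for data bits (1, 0) with Walsh codes 0101 and 0011.
wb_sums = [1, 0, 2, 1]
bit1, pos1, neg1 = wb_decode(wb_sums, [0, 1, 0, 1])   # receiver 1
bit2, pos2, neg2 = wb_decode(wb_sums, [0, 0, 1, 1])   # receiver 2

# Assumed SB binary sum for the same data with standard-basis codes 1000 and 0100.
sb_sum = [1, 0, 0, 0]
sb_bit1 = sb_decode(sb_sum, [1, 0, 0, 0])             # receiver 1
sb_bit2 = sb_decode(sb_sum, [0, 1, 0, 0])             # receiver 2
print((bit1, pos1, neg1), (bit2, pos2, neg2), sb_bit1, sb_bit2)
```

The WB decoder needs two multibit accumulators and a comparator, whereas the SB decoder reduces to a single AND per chip and a 1-bit register, which is the source of the resource saving discussed above.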



Fig. 6. Data decoding example. (a) WB decoding at receiver 1. (b) SB decoding at receiver 1.

### IV. SIMULATION RESULTS




Simulation waveform (time axis 0 ns to 150 ns), final signal values:

| Name        | Value    |
|-------------|----------|
| out[7:0]    | 01100001 |
| a           | 1        |
| b           | 0        |
| walsh1[3:0] | 1010     |
| walsh2[3:0] | 1100     |





Fig. 8. WB decoder

Fig. 7 and Fig. 8 show the simulation of the WB scheme. In the WB encoder, the data bit of each of the two senders is XORed with its Walsh code, and the encoded chips are added arithmetically to produce a multibit sum signal. In the WB decoding scheme, the received multibit sums are accumulated into a positive part or a negative part according to the chip value of the Walsh code. A comparator then compares the two parts: if the positive part is greater than the negative part, the original data is 1; otherwise, the original data is 0.
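The simulated encoder output can be cross-checked with a small model. The packing of the `out[7:0]` bus is an assumption (two bits per chip sum, with chip 3 in the most significant position); under that assumption the model reproduces the waveform value 01100001 for a = 1, b = 0, walsh1 = 1010, and walsh2 = 1100. The function name is hypothetical.

```python
def wb_encoder_output(a, b, walsh1, walsh2, chips=4):
    """Return the packed sum bus as a bit string, two bits per chip, chip 3 first."""
    fields = []
    for i in range(chips - 1, -1, -1):            # chip 3 down to chip 0
        chip_a = a ^ ((walsh1 >> i) & 1)          # sender a: data XOR Walsh chip
        chip_b = b ^ ((walsh2 >> i) & 1)          # sender b: data XOR Walsh chip
        fields.append(format(chip_a + chip_b, "02b"))   # arithmetic sum, 0..2
    return "".join(fields)


out = wb_encoder_output(a=1, b=0, walsh1=0b1010, walsh2=0b1100)
print(out)
```

The per-chip sums are 1, 2, 0, and 1, so each chip needs two bits, which is consistent with an 8-bit output for a four-chip code and two senders.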



Fig. 9. SB encoder






Fig. 10. SB decoder

Fig. 9 and Fig. 10 show the simulation of the SB scheme. In the SB encoding scheme, the original data bit from a sender is fed into an AND gate in a chip-by-chip manner, and the encoded data from the different senders are mixed together by an XOR operation to generate a binary sum signal. In the SB decoding scheme, when the binary sum signal arrives at the receivers, an AND operation is taken between the binary sum and the corresponding orthogonal code, and the result is sent to an accumulator. The output of the accumulator is the corresponding original data.
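The SB encode/decode behaviour described above can be checked exhaustively in software. The sketch below assumes four senders with four-chip standard-basis codes and verifies every combination of data bits; it is a behavioural check under these assumptions, not the simulated RTL, and the function names are hypothetical.

```python
from itertools import product


def standard_basis(n):
    """Return n standard-basis codes of length n (a single 1 per code)."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]


def sb_round_trip(data_bits):
    """Encode all senders onto one binary sum signal, then decode every receiver."""
    n = len(data_bits)
    codes = standard_basis(n)
    channel = [0] * n
    for d, code in zip(data_bits, codes):
        # Encoder: AND the data bit with each code chip, XOR onto the shared wire.
        channel = [s ^ (d & c) for s, c in zip(channel, code)]
    # Decoder: AND the sum with the receiver's code and accumulate over n chips.
    return [sum(s & c for s, c in zip(channel, code)) for code in codes]


all_ok = all(sb_round_trip(list(bits)) == list(bits)
             for bits in product([0, 1], repeat=4))
print(all_ok)
```

The round trip succeeds for every input pattern because each standard-basis code owns exactly one chip position, so no sender can disturb another sender's chip.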

Device Utilization Summary (estimated values)

| Logic Utilization      | Used | Available | Utilization |
|------------------------|------|-----------|-------------|
| Number of Slices       | 4    | 5888      | 0%          |
| Number of 4 input LUTs | 8    | 11776     | 0%          |
| Number of bonded IOBs  | 18   | 372       | 4%          |



Timing constraint: Default path analysis

Total number of paths / destination ports: 32 / 8

Source: b (PAD)

Data Path: b to out<6>

| Cell:in->out | Fanout | Gate Delay (ns) | Net Delay (ns) | Logical Name (Net Name) |
|--------------|--------|-----------------|----------------|-------------------------|
| IBUF:I->O    |        |                 |                | b_IBUF (b_IBUF)         |
| LUT4:I0->O   | 1      | 0.648           | 0.420          | out<6>1 (out_6_OBUF)    |
| OBUF:I->O    |        | 4.520           |                | out_6_OBUF (out<6>)     |

Total: 7.337 ns, of which 1.320 ns (18.0%) is route delay

Fig. 12. Timing summary



### V. CONCLUSION

We propose a new CDMA encoding/decoding method for on-chip communication. It can be realized with simple logic and consumes less power and area. The standard basis, rather than the Walsh code, is used as the spreading code in our method, which decreases the encoding/decoding latency and increases the maximum throughput of NoCs. A mathematical proof establishes the correctness of our method. The experimental results show that our method outperforms the WB encoding/decoding scheme and that the CDMA NoC performance also improves when our method is applied.

### REFERENCES

[1] L. Wang, J. Hao, and F. Wang, "Bus-based and NoC infrastructure performance emulation and comparison," in Proc. 6th Int. Conf. Information Technology: New Generations (ITNG '09), Apr. 2009, pp. 855–858.

[2] R. H. Bell, C. Y. Kang, L. John, and E. E. Swartzlander, "CDMA as a multiprocessor interconnect strategy," in Conf. Rec. 35th Asilomar Conf. Signals, Systems and Computers, vol. 2, Nov. 2001, pp. 1246–1250.

[3] B. C. C. Lai, P. Schaumont, and I. Verbauwhede, "CT-bus: A heterogeneous CDMA/TDMA bus for future SOC," in Conf. Rec. 38th Asilomar Conf. Signals, Systems and Computers, vol. 2, Nov. 2004, pp. 1868–1872.

[4] K. E. Ahmed and M. M. Farag, "Overloaded CDMA bus topology for MPSoC interconnect," in Proc. Int. Conf. ReConFigurable Computing and FPGAs (ReConFig14), Dec. 2014, pp. 1–7.

[5] K. E. Ahmed and M. M. Farag, "Enhanced overloaded CDMA interconnect (OCI) bus architecture for on-chip communication," in Proc. IEEE 23rd Annu. Symp. High-Performance Interconnects, Aug. 2015, pp. 78–87.

[6] E. H. Dinan and B. Jabbari, "Spreading codes for direct sequence CDMA and wideband CDMA cellular networks," IEEE Commun. Mag., vol. 36, no. 9, pp. 48–54, Sep. 1998.

[7] M. Kim, D. Kim, and G. E. Sobelman, "MPEG-4 performance analysis for a CDMA network-on-chip," in Proc. Int. Conf. Commun., Circuits, Syst., May 2005, pp. 493–496.

[8] X. Wang, T. Ahonen, and J. Nurmi, "Applying CDMA technique to network-on-chip," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 10, pp. 1091–1100, Oct. 2007.

[9] W. Lee and G. E. Sobelman, "Mesh-star hybrid NoC architecture with CDMA switch," in Proc. IEEE Int. Symp. Circuits Syst., May 2009, pp. 1349–1352.

[10] M. Kim, D. Kim, and G. E. Sobelman, "Adaptive scheduling for CDMA-based networks-on-chip," in Proc. 3rd Int. IEEE-NEWCAS Conf., Jun. 2005, pp. 357–360.

[11] W. Lee and G. E. Sobelman, "Semi-distributed scheduling for flexible codeword assignment in a CDMA network-on-chip," in Proc. IEEE 8th Int. Conf. ASIC, Oct. 2009, pp. 431–434.

[12] S. Poddar, P. Ghosal, P. Mukherjee, S. Samui, and H. Rahaman, "Design of an NoC with on-chip photonic interconnects using adaptive CDMA links," in Proc. IEEE Int. Conf. SOC, Sep. 2012, pp. 352–357.

[13] A. Vidapalapati, V. Vijayakumaran, A. Ganguly, and A. Kwasinski, "NoC architectures with adaptive code division multiple access based wireless links," in Proc. IEEE Int. Symp. Circuits Syst., May 2012, pp. 636–639.

[14] X. Wang and J. Nurmi, "Modeling a code-division multiple-access network-on-chip using SystemC," in Proc. Norchip, Nov. 2007, pp. 1–5.