Adaptive space-time compression for efficient massive MIMO fronthauling

Paikun Zhu; Yuki Yoshida; Ken-ichi Kitayama

doi:10.1364/OE.26.024098

1. Introduction

The radio access network (RAN) is a key infrastructure of mobile systems that supports the fast growth of wireless traffic, and it is also a major source of total cost of ownership (TCO) and power consumption for mobile operators [1]. Recently, the centralized/cloud RAN (C-RAN) architecture has gained much attention because of its cost and power savings and other features such as facilitating coordinated multi-point (CoMP) transmission/reception [1,2,7]. In the C-RAN architecture, a centralized baseband unit (BBU) pool connects to a group of distributed remote radio heads (RRHs) via fronthaul (FH). Digital interfaces such as the common public radio interface (CPRI) [3] and the open radio equipment interface (ORI) [6] have been specified for FH data transmission in 4G. However, conventional FH interfaces fall short of supporting the forthcoming 5G communications because of two sources of bandwidth (BW) inefficiency: (1) a relatively large number of quantization bits is used to represent one radio signal sample, e.g., 16 (15 + 1) bits per I/Q sample in CPRI, which results in an FH rate of >1-Gb/s for a 20MHz (30.72MSa/s) wireless signal [3]; (2) more importantly, the required FH BW increases linearly with the number of antennas at the RRH, regardless of the actual mobile traffic volume. In 5G, a much larger number of antennas, namely massive MIMO, is expected in order to support ultra-high mobile traffic density. If 256 antennas are assumed, even a 20MHz wireless BW would require a CPRI rate of 259.5-Gb/s [2]. Currently, most efforts address issue (1) and are based on time-domain FH compression [4–6], while a practical solution to issue (2) may be a functional split in the PHY layer and above [7,8]. Note that spectrally efficient fronthaul based on analog radio-over-fiber (A-RoF) [9,10] is not discussed here. This paper focuses on fronthaul based on digital RoF (D-RoF), which may require minimal changes to existing RRH and BBU equipment and provides better networking capability.

In this work, we address both issues (1) and (2) by compressing FH data not only in the time domain but also in the spatial domain. Since the number of users is usually smaller than the number of antennas in massive MIMO scenarios [11,12], redundancy can be found among the received signals. This implies that BW can be reduced by removing the redundant channels in the spatial domain, i.e., by spatial compression. A major technical challenge of spatial compression is that it must be accomplished under the stringent FH requirements of low latency (e.g., <100 μs [6]) and low complexity (compared with the wireless (de)modulation process). In addition, because the wireless environment changes dynamically, compression may need to be adaptive and, preferably, blind.

Based on these considerations, we present an adaptive FH spatial compression approach that uses a subspace tracking filter to reduce the MIMO channel dimension, combined with time-domain compression based on adaptive quantization. Subspace tracking (e.g., projection approximation subspace tracking, PAST [14], and fast approximated power iteration, FAPI [15]) was originally developed in signal processing theory for low-complexity, online and blind matrix decomposition. We suggest and demonstrate, for the first time, its effective application to FH compression. Furthermore, we show the feasibility of jointly optimizing the spatial and time-domain compression schemes. With the adaptive space-time compression, the required FH BW depends only on the actual number of users and is no longer proportional to the number of RRH antennas. Subsequently, a combined optimization of the space-time compression technique and a BW-limited optical FH link is investigated. We experimentally demonstrate uplink fronthauling of 256 antennas and 32 1024QAM UEs over a 5-km FH link using a 10GBd PAM4 optical interface. In addition, we confirm the effectiveness of the compression technique for 256 antennas with an extended FH reach of 15-km.

2. Principle of the proposed space-time compression technique

Figure 1 shows the conceptual diagram of the proposed space-time compressor. Only the uplink is shown as an example; however, the technique is also applicable to the downlink FH. Suppose that uplink signals from P user equipment (UE) are received by M antennas (P<M). The M channels are first spatially compressed by an adaptive spatial filter (SF), which reduces the number of spatial channels from M to K (P≤K<M). The SF outputs are then compressed in the time domain by adaptive quantizers. Each I/Q sample in the i-th output is encoded to B_i bits (i = 1~K). K and {B_i} can be jointly adjusted by a joint compression ratio (CR) optimization module. The compressed bit streams are multiplexed and sent to the BBU via FH. At the BBU, the bit stream is first demultiplexed and de-quantized. The resulting K channels are then de-compressed by an inverse SF to reconstruct M channels for PHY processing.

Fig. 1 Concept of the space-time fronthaul compressor.


2.1 Spatial compression based on subspace tracking

Spatial compression of wireless MIMO signals can be formulated as the problem of finding a low-rank approximation of an M-by-N signal matrix X, N being the number of signal samples in each compression operation [13]. The best-known solution may be to compute the singular value decomposition (SVD) of the matrix X. In [12], a principal component analysis (PCA) algorithm has been applied to FH compression. However, the computational complexity of those approaches is on the order of O(MN × min{M, N}). The huge complexity and the excessive latency due to batch-type processing make them impractical in the context of FH. In this work, we propose an adaptive filter-based approach to find the low-rank matrix approximation with reduced complexity. Instead of extracting individual principal components [12], our adaptive SF tracks the signal subspace of interest, which is also called “subspace tracking” [14,15]. In our approach, the input of the SF is modeled as an M-by-1 vector x at each time index, and the SF is modeled as an M-by-K matrix W (K<M). The output of the SF is the compressed K-by-1 vector y, which is expressed by $y = W^{H} x$. W is chosen so as to minimize the cost function J(W) in Eq. (1), where E{·} and ||·|| denote expectation and 2-norm, respectively. Wy can be understood as the decompressed or reconstructed signal vector, and J(W) can be understood as the vector-version mean square error induced by compression.

$$\min_{W} J(W) = E\left\{ \left\| x - W y \right\|^{2} \right\} \tag{1}$$
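As a toy illustration of the forward path $y = W^{H} x$ and the reconstruction Wy, the sketch below uses a small, hand-picked, real-valued W with orthonormal columns (so $W^{H} = W^{T}$). The matrix, the test vectors and the function names are assumptions made for this example; they are not a tracked filter from this work.

```python
# Toy forward/inverse spatial filter with a fixed orthonormal W (illustrative only).

def compress(W, x):
    """y = W^T x: project the M-vector x onto the K columns of W."""
    M, K = len(W), len(W[0])
    return [sum(W[m][k] * x[m] for m in range(M)) for k in range(K)]

def decompress(W, y):
    """x_hat = W y: rebuild an M-vector from the K compressed values."""
    return [sum(W[m][k] * y[k] for k in range(len(y))) for m in range(len(W))]

def squared_error(x, x_hat):
    """||x - W y||^2, the quantity inside the expectation of Eq. (1)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

# M = 4 inputs, K = 2 outputs; the two columns are orthonormal.
W = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]

x_in = [2.5, -0.5, 2.5, -0.5]   # equals 2*col0 + 3*col1, inside the column space
x_out = [1.0, 0.0, 0.0, 0.0]    # partly outside the column space

y_in = compress(W, x_in)
err_in = squared_error(x_in, decompress(W, y_in))
err_out = squared_error(x_out, decompress(W, compress(W, x_out)))
```

A vector that lies in the span of the columns of W is recovered without error, while any component outside that span is lost; this is why W should span the signal subspace.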

The optimum SF W_Opt (which minimizes J(W)) equals UQ, where columns of U are K dominant eigenvectors of E{xx^H} and Q is unitary [14], i.e., columns of W_Opt span the signal subspace. To solve this unconstrained minimization problem and obtain the SF W_Opt, expectation in Eq. (1) needs to be calculated, which requires buffering a large set of signals and would induce unacceptable processing latency. For practical implementation, the cost function can be modified to an exponential-windowing weighted sum:

$$\min_{W(t)} J(W(t)) = \sum_{i = 1}^{t} \beta^{t - i} \left\| x(i) - W(t) W^{H}(t) x(i) \right\|^{2} \tag{2}$$

After modification of cost function and projection approximation [14,15], SF becomes adaptive (denoted by W(t)) and can be updated by low-complexity recursive implementation. The parameter β in Eq. (2) represents a forgetting factor, which guarantees that data in the distant past are down-weighted and enables channel tracking capability of the SF. The schematic of the adaptive SF is depicted in Fig. 2, which contains a forward path for spatial compression based on linear filter and a feedback path to update/adapt the weights in W(t).
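For orientation, the recursion of PAST [14], the simpler of the two subspace trackers named in the introduction, can be written in the notation of Eq. (2) as below. This is not the FAPI update employed in this work, and the auxiliary symbols h(t), g(t), e(t) and the K-by-K matrix P(t) are introduced here only for illustration. The projection approximation replaces $W^{H}(t) x(i)$ in Eq. (2) with $y(i) = W^{H}(i-1) x(i)$, so that the cost becomes quadratic in W(t) and can be minimized by a recursive least-squares update:

$$
\begin{aligned}
y(t) &= W^{H}(t-1)\, x(t) \\
h(t) &= P(t-1)\, y(t) \\
g(t) &= h(t) / \left[ \beta + y^{H}(t)\, h(t) \right] \\
P(t) &= \beta^{-1} \left[ P(t-1) - g(t)\, h^{H}(t) \right] \\
e(t) &= x(t) - W(t-1)\, y(t) \\
W(t) &= W(t-1) + e(t)\, g^{H}(t)
\end{aligned}
$$

Every step involves only matrix-vector products of size M-by-K or K-by-K, so no batch of past samples has to be buffered.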

Fig. 2 Schematic of the adaptive spatial filter.


The complexity of linear filtering on the forward path is MK flops per time index. Moreover, thanks to the recursive implementation, the complexity of updating the SF W(t) is 2MK + 5K² + 2M + O(K) flops per update [15]. Therefore, the latency and complexity of the SF are expected to be lower than those of the wireless demodulation process, which requires at least M serial-to-parallel conversions with data buffering and M discrete Fourier transforms (DFTs) with a complexity of M × O(N_DFT log N_DFT) flops (N_DFT is the DFT size). (Note that usually K<<M<<N_DFT.) Moreover, the complexity and latency of the filter weight update may be reduced by updating the SF only once every several time indices, since the filter weights should remain valid within the coherence time of the channel and/or the period of radio resource allocation. W(t) is periodically sent to the BBU to act as the inverse SF. The sending period depends on the coherence time of the wireless channel and the period of radio resource allocation, e.g., once per signal subframe, which indicates a small FH overhead. The inverse SF in the BBU is also a linear matrix operation. Therefore, an interesting future topic is to omit this overhead altogether by treating the inverse SF as part of MIMO equalization in PHY processing. In uplink FH, we suggest that the SF be designed to be robust against sudden signal changes. Among subspace tracking algorithms, fast approximated power iteration (FAPI) [15] is employed here. Details of the FAPI-based SF initialization and weight update procedure are given in the Appendix. The convergence of the SF W(t) (i.e., the subspace tracking process) may be illustrated by the evolution of the maximum principal angle (MPA) between the subspaces spanned by the columns of W(t) and W_Opt [13–15], or by the evolution of the normalized square error of the decompressed waveform. Example curves are also shown in the Appendix. The convergence speed depends on the dimension of the signal subspace, K of the spatial filter, and the forgetting factor β [15].
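The flop counts quoted above can be evaluated for concrete dimensions with a few lines of Python. This is a rough sketch: the O(K) remainder of the update cost is ignored, and `update_every` is a hypothetical parameter that models updating W(t) only once every several time indices.

```python
def sf_flops(M, K, update_every=1):
    """Flop counts per time index for the adaptive spatial filter.

    forward: M*K for the linear filtering on the forward path.
    update : 2*M*K + 5*K**2 + 2*M, the leading terms quoted from [15]
             (the O(K) remainder is dropped in this sketch).
    update_every: hypothetical knob, one weight update per this many indices.
    """
    forward = M * K
    update = 2 * M * K + 5 * K ** 2 + 2 * M
    return forward, update, forward + update / update_every

forward, update, per_index = sf_flops(M=256, K=64)
_, _, relaxed = sf_flops(M=256, K=64, update_every=16)
```

For M = 256 and K = 64 the weight update dominates when it runs at every time index, so spacing the updates out is the main lever for reducing the average cost.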

2.2 Time-domain compression based on adaptive quantizer

For time-domain compression, adaptivity is also desired in FH. In particular, if adaptive SFs are employed, the characteristics of the SF outputs may fluctuate over time, and conventional training-based quantizers may not be feasible. Moreover, adaptivity in both the space and time domains would facilitate the joint optimization. Here, we propose to employ adaptive differential pulse code modulation (ADPCM) [16–18], which has been commercialized in the audio field. The block diagrams of the ADPCM encoder and decoder are shown in Figs. 3(a) and 3(b), respectively. Both the scaling factor of the quantization table and the tap weights of the predictor are adaptive. The differential operation helps to suppress quantization noise, while backward adaptation removes the overhead of sending adaptation parameters to the BBU.

Fig. 3 Conceptual block diagram of (a) ADPCM encoder and (b) ADPCM decoder. Q: quantizer. Q⁻¹: inverse quantizer. Δ: quantization scale factor adaptor. A(z): pole-part adaptive filter in the predictor. B(z): zero-part adaptive filter in the predictor.


The ADPCM encoder is composed of a forward path and a feedback path. On the forward path, the estimate S_e(k) is first subtracted from each input sample In(k) to produce a difference sample d(k) = In(k) − S_e(k). Then d(k) is quantized to I(k), which is the output of the encoder. The quantization table is adaptively scaled by a scaling factor Δ(k), which is calculated based on the amplitude of the current sample and Δ(k−1). On the feedback path, I(k) is inversely quantized to d_q(k). In the adaptive predictor, multiple past values of d_q(k) and of the reconstructed samples S_r(k) are used to calculate the current estimate S_e(k) through the following equations:

$$S_{e}(k) = \sum_{i = 1}^{N_{A}} a_{i}(k - 1)\, S_{r}(k - i) + \sum_{j = 1}^{N_{B}} b_{j}(k - 1)\, d_{q}(k - j) \tag{3}$$

$$S_{r}(k - i) = S_{e}(k - i) + d_{q}(k - i) \tag{4}$$

where {a_i(k), i = 1~N_A} and {b_j(k), j = 1~N_B} are the tap weights of the predictor at time index k. The tap weights are adaptively updated based on previous values, e.g., using gradient algorithms [17].
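The encoder/decoder structure can be sketched in a few lines of Python. The sketch below is a deliberately simplified stand-in and not the algorithm of [16,17]: it assumes a fixed one-tap predictor (N_A = 1, N_B = 0, weight `a1`), a uniform mid-rise quantizer, and a simple multiplicative step-size rule (factors 1.5 and 0.9) driven only by the transmitted codes. All names and constants are hypothetical. Its purpose is to show backward adaptation: the decoder rebuilds exactly the state the encoder holds, so no side information is sent.

```python
import math

class BackwardAdaptiveCore:
    """State shared by encoder and decoder; it is driven only by the codes I(k)."""

    def __init__(self, bits, step=0.05, a1=0.9):
        self.half = 2 ** (bits - 1)   # codes lie in [-half, half - 1]
        self.step = step              # quantizer scale factor (role of Delta)
        self.a1 = a1                  # fixed one-tap predictor weight (assumption)
        self.s_r = 0.0                # previous reconstructed sample S_r(k-1)

    def predict(self):
        return self.a1 * self.s_r     # S_e(k), cf. Eq. (3) with N_A = 1, N_B = 0

    def absorb(self, code, s_e):
        d_q = (code + 0.5) * self.step           # inverse quantizer (mid-rise)
        self.s_r = s_e + d_q                     # Eq. (4)
        # Grow the step after outer codes, shrink it after inner ones.
        outer = abs(code + 0.5) > self.half / 2
        self.step = min(max(self.step * (1.5 if outer else 0.9), 1e-6), 1e3)
        return self.s_r

def encode(samples, bits):
    core, codes, local = BackwardAdaptiveCore(bits), [], []
    for x in samples:
        s_e = core.predict()
        d = x - s_e                                        # difference sample d(k)
        code = math.floor(d / core.step)                   # quantize to I(k)
        code = max(-core.half, min(core.half - 1, code))   # clip to the code range
        codes.append(code)
        local.append(core.absorb(code, s_e))               # encoder's own reconstruction
    return codes, local

def decode(codes, bits):
    core = BackwardAdaptiveCore(bits)
    return [core.absorb(code, core.predict()) for code in codes]

signal = [math.sin(2 * math.pi * 0.03 * k) for k in range(200)]
codes, local = encode(signal, bits=4)
decoded = decode(codes, bits=4)
```

Because `absorb` uses nothing but the code and the shared state, `decoded` is identical to the encoder's local reconstruction `local`, sample for sample.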

The quantization noise of ADPCM is discussed as follows. Error induced by the quantizer can be expressed as:

$$e(k) = d_{q}(k) - d(k) \tag{5}$$

Quantization noise can be characterized by difference-to-quantization-noise ratio (DQNR), expressed as:

$$\mathrm{DQNR} = \frac{E\{d^{2}(k)\}}{E\{e^{2}(k)\}} \tag{6}$$

With increased number of quantization bits, e(k) can be reduced and DQNR can be improved, at the cost of compromised CR. The signal-to-quantization-noise ratio (SQNR) can also be used to evaluate quantization noise, which is expressed as

$$\mathrm{SQNR} = \frac{E\{\mathrm{In}^{2}(k)\}}{E\{e^{2}(k)\}} = \mathrm{DQNR} \cdot \frac{E\{\mathrm{In}^{2}(k)\}}{E\{d^{2}(k)\}} = \mathrm{DQNR} \cdot G_{P} \tag{7}$$

where the ratio between the power of the incoming sample and that of the difference sample is defined as the predictor gain G_P. More accurate prediction corresponds to a smaller E{d²(k)} and a larger predictor gain. Accuracy may be improved by increasing the number of taps in the predictor, i.e., N_A and N_B in Eq. (3), which however increases the latency on the feedback path of ADPCM. In the context of FH, prediction accuracy and prediction-induced latency should be balanced. In this work, the numbers of predictor taps N_A and N_B are assumed to be 2 and 6, respectively [16,17].
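The relation between the three quantities in Eqs. (6) and (7) can be checked numerically. The sample values below are made up purely to exercise the identity SQNR = DQNR · G_P, and the function names are hypothetical.

```python
import math

def mean_square(seq):
    """Sample estimate of E{v^2}."""
    return sum(v * v for v in seq) / len(seq)

def noise_metrics(inp, d, e):
    """Return (DQNR, G_P, SQNR) as plain power ratios, following Eqs. (6) and (7)."""
    dqnr = mean_square(d) / mean_square(e)
    g_p = mean_square(inp) / mean_square(d)
    sqnr = mean_square(inp) / mean_square(e)
    return dqnr, g_p, sqnr

# Made-up values for In(k), d(k) and e(k); not simulation data.
inp = [1.0, -2.0, 3.0]
d = [0.5, -0.5, 1.0]
e = [0.1, -0.1, 0.2]

dqnr, g_p, sqnr = noise_metrics(inp, d, e)
sqnr_db = 10 * math.log10(sqnr)   # the same ratio expressed in dB
```

In dB the identity becomes additive, so every dB of predictor gain is a dB of SQNR obtained without spending additional quantization bits.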

At the ADPCM decoder, the input I(k) is inversely quantized to d_q(k). The decompressed/reconstructed signal S_r(k) is the sum of d_q(k) and the predictor output S_e(k). The principles of the inverse quantizer, adaptive predictor and quantization scaling factor calculator are the same as those in the encoder. In fact, the decoder is “embedded” in the encoder [16–18].

The computational complexity of ADPCM depends on the specific algorithm and hardware implementation. In this work, the main complexity of the encoder algorithm [16,17] lies in the (N_A + N_B) real multipliers in the adaptive predictor, provided that the logarithm in the quantizer and the exponent in the inverse quantizer are implemented by look-up tables. In total, the complexity of time-domain FH compression in the technique would be about 2K(N_A + N_B). On the other hand, the complexity of the ADPCM decoder is slightly lower than that of the encoder since there is no quantizer.

A notable advantage of ADPCM is its low-delay feature. Due to the backward adaptation structure, the algorithmic delay of ADPCM is equal to the sampling period of the input signal; for example, when the sample rate of input audio signal is 8-kHz, the delay is 0.125ms [19]. This implies that if wireless signals with MHz-level sample rate are processed, the delay of ADPCM-based compression is expected to be within 1μs. In practical implementation, this delay is also related to the specific design of digital ADPCM circuit (e.g., clock frequency and sequential logic) and precision of digitized signal.
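The delay figures follow directly from the statement that the algorithmic delay equals one sampling period. The short check below uses the 8-kHz audio rate from the text and, as an example of a MHz-level rate, the 30.72-MSa/s rate of a 20MHz signal; the function name is hypothetical.

```python
def algorithmic_delay_s(sample_rate_hz):
    """One sampling period, the ADPCM algorithmic delay stated in the text."""
    return 1.0 / sample_rate_hz

audio_delay = algorithmic_delay_s(8e3)        # 8-kHz audio: 0.125 ms
radio_delay = algorithmic_delay_s(30.72e6)    # 30.72-MSa/s radio signal: about 33 ns
```

Implementation-dependent contributions such as clocking and pipelining come on top of this figure.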

2.3 Joint optimization of space-time compression

A feature of our FH compression technique, distinct from previous compression approaches, is the feasibility of joint optimization of the spatial compressor and time-domain compressor in the technique. In the space-time compression, K of the SF and {B_i} of adaptive quantizers are parameters that can be adjusted. We aim at seeking optimum signal EVM for a given compression ratio (CR), and then minimizing required FH bit rate under a certain EVM threshold.

Equation (8) gives the CR of our compressor compared with CPRI. We assume the SF W is transmitted to BBU once per signal subframe with L samples, and B_SF denotes number of bits used to quantize each I or Q sample of the M-by-K SF W. B_i (i = 1~K) denotes the number of quantization bits for I or Q branch of the i-th SF output. 1-bit control word is assumed for each spatial channel, and B_CPRI = 15.

$$CR = \frac{L \sum_{i = 1}^{K} (B_{i} + 1) + M K B_{SF}}{M L (B_{CPRI} + 1)} \tag{8}$$

Required FH bit rate (BR) after compression is then expressed by Eq. (9), where f_S denotes the sample rate of signal before compression, and H_LC denotes overhead for line coding.

$$BR = CR \cdot BR_{CPRI} = CR \cdot 2 M f_{S} (B_{CPRI} + 1) H_{LC} \tag{9}$$

Assuming the overhead of W is negligible or omitted, CR can be approximated as Eq. (10), which can be explained as the product of spatial CR and time-domain CR.

$$CR \approx \underbrace{\frac{K}{M}}_{\text{Spatial CR}} \cdot \underbrace{\frac{\sum_{i = 1}^{K} (B_{i} + 1)}{K (B_{CPRI} + 1)}}_{\text{Time-domain CR}} \tag{10}$$

In this case, Eq. (9) can be re-written as Eq. (11)

$$BR = 2 f_{S} \sum_{i = 1}^{K} (B_{i} + 1) \cdot H_{LC} \tag{11}$$

For a given M, we can see that a certain CR can be achieved by multiple K-{B_i} combinations: larger number of SF outputs K corresponds to smaller quantization resolution {B_i}. Each combination could have different EVM performance. Therefore, for each CR, the K-{B_i} combination(s) that achieves lowest EVM can be chosen. Subsequently, we can seek the minimum required FH bit rate for a given EVM requirement. This joint optimization will be discussed through numerical simulations in Section 3.2 and experiments in Section 4.
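Equations (8)–(11) translate directly into a small calculator. In the sketch below, the choice of K = 32 outputs with 6 bits each, the subframe length L = 30720 samples and the filter word size B_SF = 16 are hypothetical values chosen only to exercise the formulas; f_S = 30.72-MSa/s and H_LC = 66/64 follow the values used elsewhere in the paper.

```python
def cr_exact(M, L, bits, b_sf, b_cpri=15):
    """Eq. (8): CR including the overhead of sending the M-by-K filter W."""
    K = len(bits)
    payload = L * sum(b + 1 for b in bits)
    overhead = M * K * b_sf
    return (payload + overhead) / (M * L * (b_cpri + 1))

def cr_approx(M, bits, b_cpri=15):
    """Eq. (10): CR with the filter overhead neglected."""
    return sum(b + 1 for b in bits) / (M * (b_cpri + 1))

def cpri_rate(M, f_s, h_lc, b_cpri=15):
    """Uncompressed rate BR_CPRI = 2 M f_S (B_CPRI + 1) H_LC from Eq. (9)."""
    return 2 * M * f_s * (b_cpri + 1) * h_lc

def bit_rate(f_s, bits, h_lc):
    """Eq. (11): required FH bit rate when the overhead is neglected."""
    return 2 * f_s * sum(b + 1 for b in bits) * h_lc

M, f_s, h_lc = 256, 30.72e6, 66 / 64
bits = [6] * 32            # hypothetical: K = 32 outputs, B_i = 6 bits each
L, b_sf = 30720, 16        # hypothetical subframe length and filter word size

baseline = cpri_rate(M, f_s, h_lc)    # about 259.5 Gb/s for 256 antennas
approx = cr_approx(M, bits)
exact = cr_exact(M, L, bits, b_sf)
rate = bit_rate(f_s, bits, h_lc)
```

With these assumed values the filter overhead adds roughly 0.1 percentage point to the CR, which is consistent with treating it as negligible in Eq. (10).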

3. Simulation setup and results

In this section, we conduct the uplink numerical simulations to evaluate the performance of proposed space-time FH compression technique. We investigate both cases with single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) wireless transmission. The performance metrics used include EVM and required FH bit rate under a certain EVM requirement. In order to accurately evaluate the signal distortion caused by the compression, we set wireless receiver noise and quantization error of A/D converter to be zero, and training sequences of wireless signals are not compressed. The FH is assumed to be error-free. In the compressor, ADPCM-based adaptive quantizer is developed mainly based on audio standards [16,17]. In addition, 7-bit ADPCM is developed based on 6-bit ADPCM and 1-bit instantaneous quantization noise coding [20].

3.1 SISO case

SISO case is investigated to show the adaptivity of ADPCM-based time-domain compression and also to show the backward compatibility of our technique to single-antenna scenario. In SISO FH compression case, only the adaptive quantizer in the compressor is activated. The performance of SISO FH compression is investigated by testing 3 different kinds of wireless technology or signal: localized single-carrier frequency division multiple access (SC-FDMA), OFDMA and filtered-OFDMA [21] (F-OFDMA). SC-FDMA emulates LTE-A uplink to investigate backward compatibility of the compression technique, while OFDMA and F-OFDMA may emulate candidate uplink formats in 5G. All 3 kinds of wireless signals have a sample rate of 30.72-MSa/s, IFFT size of 2048, 1200 data subcarriers and cyclic prefix size of 256. 64QAM is tested as modulation format. In all multiple access scenarios, we assume a single user with 100% and 50% occupied BW (i.e., 20MHz and 10MHz with localized BW allocation [28]). Figure 4 shows peak-to-average power ratio (PAPR) of SC-FDMA, OFDMA and F-OFDMA signals with 100% and 50% occupied BW. In PHY processing, the reference signal-to-noise ratio (SNR) of the minimum mean square error (MMSE) equalizer is set to be 40dB. Each EVM value is calculated from 1 SC-FDMA/OFDMA symbol and averaged from 500 random system realizations. We compare our compressor with the compressor described in the ORI standard, which consists of 3/4 downsampling followed by fitting-based nonlinear quantization (FBNQ) [6].

Fig. 4 PAPR of SC-FDMA, OFDMA and F-OFDMA signals with different bandwidth occupancy ratio.


Figures 5, 6, and 7 show EVM versus CR of the ADPCM-based adaptive quantizer and ORI’s compressor for SC-FDMA, OFDMA, and F-OFDMA signals, respectively. CR here is defined as (B + 1)/(B_CPRI + 1) (without downsampling) or 3/4 × (B + 1)/(B_CPRI + 1) (with downsampling), where B is the number of quantization bits in ADPCM or ORI’s nonlinear quantizer, and B_CPRI = 15. A 1-bit control word is assumed. We show the performance of ADPCM both with and without 3/4 downsampling. The simulation results show that the ADPCM-based adaptive quantizer automatically adapts to different wireless signals and different BW occupancy conditions, without requiring the time-consuming statistical fitting that ORI’s compressor needs. In the case of 100% occupied BW, ADPCM combined with 3/4 downsampling achieves performance very similar to that of the ORI standardized algorithm. In hardware implementation, such a downsampling filter (usually several tens of taps [22]) may considerably increase the complexity and latency of compression, particularly for the RRH in the uplink. In our proposal, the downsampling filter is not involved. In the case of 50% occupied BW, ADPCM (both with and without downsampling) outperforms ORI’s compressor. The reason is that when the BW is not fully occupied, the resulting spectral redundancy is adaptively exploited by ADPCM to improve EVM performance, while ORI’s compressor with fixed 3/4 downsampling cannot exploit it. Although the performance of the latter may be improved by reducing the downsampling ratio, the procedure cannot be made adaptive as in ADPCM.

Fig. 5 Performance of ADPCM compression for SC-FDMA signal with bandwidth occupancy ratio of (a) 100%, (b) 50%.


Fig. 6 Performance of ADPCM compression for OFDMA signal with bandwidth occupancy ratio of (a) 100%, (b) 50%.


Fig. 7 Performance of ADPCM compression for Filtered-OFDMA signal with bandwidth occupancy ratio of (a) 100%, (b) 50%.


In the remainder of this work, a 30.72-MSa/s OFDMA signal with 100% BW occupancy is assumed as the test signal to evaluate the performance of the compressor.

In conventional FH compression approaches, block scaling/normalization before compression is usually critical to reduce signal dynamic range, and this scaling information is sent from higher layers periodically as overhead [6,23]. Here we investigate the impact of input signal power on the compression performance of ADPCM-based FH compressor. Figure 8 shows EVM performance versus relative input power of 6-bit ADPCM (CR = 43.75%) and 10-bit ORI’s compressor (CR = 51.5625%). Relative input power of 0dB corresponds to the lowest EVM. Thanks to the adaptivity of ADPCM, ADPCM-based FH compressor exhibits much better input power tolerance than ORI’s compressor. Notably, EVM of ADPCM remains almost unchanged (0.9%~1.0%) even when the input signal power varies more than 20dB. Therefore, the requirement of block scaling/normalization accuracy and related overhead can be greatly relaxed, which is beneficial in reducing the complexity and latency of FH.
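The two CR values quoted for Fig. 8 follow from the SISO CR definition, (B + 1)/(B_CPRI + 1), optionally scaled by the 3/4 downsampling ratio. A minimal check (the function name and its keyword arguments are hypothetical):

```python
def siso_cr(bits, downsample=1.0, b_cpri=15, control_bits=1):
    """SISO compression ratio relative to CPRI, with an optional downsampling ratio."""
    return downsample * (bits + control_bits) / (b_cpri + 1)

adpcm_cr = siso_cr(6)                    # 6-bit ADPCM, no downsampling
ori_cr = siso_cr(10, downsample=3 / 4)   # 10-bit ORI compressor with 3/4 downsampling
```

The two settings evaluate to 43.75% and 51.5625%, respectively.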

Fig. 8 EVM performance of 6-bit ADPCM and 10-bit ORI’s compressor as a function of relative input power.


3.2 MIMO case

In the following, we conduct uplink MIMO FH simulations to investigate the combined performance of the spatial filter and the adaptive quantizer. Different numbers of antennas (i.e., M) and UE (i.e., P) are studied. An independent, identically distributed (i.i.d.) Rayleigh fading model is assumed for the massive MIMO wireless channel [26]. Spatial multiplexing MIMO technology is assumed. Nevertheless, as the proposed technique compresses the waveform at the PHY-RF split point [27], it is also compatible with other MIMO technologies (e.g., beamforming) at higher-layer split points. β in the FAPI-based SF is 0.9999.

In the MIMO case, {B_i} in Eqs. (8) and (9) may be calculated by a quantization bit allocation (QBA) algorithm [12], which increases the complexity and latency of compression, especially when K is large. Here we show that the QBA procedure can be simplified in our compression technique. Assuming P = 32, M = 256 and K = 64, Figs. 9(a) and 9(b) plot the histograms of the power variance among the K FAPI-based SF output channels, compared with the power variance among the M input channels and the K PCA output channels. Specifically, the power of each input or output channel is calculated over 1 OFDM symbol and normalized. Then, the power variance is calculated among the K (or M) channels. 500 random realizations were tested to obtain the histograms of power variance in Figs. 9(a)-9(b) and the average power variance in Fig. 9(c). Interestingly, the FAPI-based SF outputs exhibit considerably smaller power variance than those of the PCA-based spatial compressor. This difference may be explained by the fact that PCA extracts the principal components of the MIMO matrix one by one in descending order of power or eigenvalue, while the SF tracks the signal subspace as a whole. Moreover, we found that the power variance of the SF outputs is even smaller than that of the SF inputs.
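The power-variance metric can be sketched as follows. The text does not state how the channel powers are normalized; the sketch assumes normalization by the mean power across channels, and the two toy inputs are made up to contrast a flat power profile with a steeply graded one.

```python
from statistics import pvariance

def power_variance(channels):
    """Variance of the per-channel power across channels.

    channels: one list of samples per channel (real or complex values).
    Powers are normalized by their mean across channels (an assumption).
    """
    powers = [sum(abs(s) ** 2 for s in ch) / len(ch) for ch in channels]
    mean_power = sum(powers) / len(powers)
    return pvariance([p / mean_power for p in powers])

# Equal power in every channel.
flat = [[1.0, -1.0, 1.0, -1.0] for _ in range(4)]
# Power falling from channel to channel, as when components are ordered by power.
graded = [[a, -a, a, -a] for a in (2.0, 1.0, 0.5, 0.25)]

flat_var = power_variance(flat)
graded_var = power_variance(graded)
```

A flat profile gives zero variance, which is the situation in which one common word length for all channels is a natural choice.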

Fig. 9 Simulation results: (a) histogram of power variance among M input channels and K output channels of FAPI-based SF. (b) histogram of power variance among K output channels of PCA. (c) Average power variance versus number of UE.


We then fix M = 256 and K = 64, and vary the number of UE (i.e., P) to investigate the power variance among the spatial compressor outputs in a dynamic wireless access scenario. Figure 9(c) shows that the power variance among the outputs of our SF is always much smaller (by about two orders of magnitude) than that of the PCA outputs, and also smaller than that of the SF inputs. Such power uniformity could greatly simplify the quantization bit allocation process in the subsequent quantizer or time-domain compressor. For example, in PCA-based compression [12], bit allocation may need to be done frequently to adapt to time-varying eigenvalues. In contrast, in this work {B_i} (i = 1~K) in Eqs. (10)-(11) are simply set to be identical (denoted as B). In the following, all EVM results are obtained with this simple QBA. CR and the required FH bit rate can then be re-written as Eqs. (12) and (13).

$$CR \approx \underbrace{\frac{K}{M}}_{\text{Spatial CR}} \cdot \underbrace{\frac{B + 1}{B_{CPRI} + 1}}_{\text{Time-domain CR}} \tag{12}$$

$$BR = 2 f_{S} K (B + 1) \cdot H_{LC} \tag{13}$$

Finally, we show the results of the joint optimization suggested in Section 2.3. Figure 10(a) shows the contour plot of EVM versus spatial and time-domain CR, in the case of M = 256 and P = 16. Time-domain CR and spatial CR are calculated by Eq. (12), and EVM is averaged across all UE. It is possible to reduce the spatial CR down to P/M (6.25%), which means that the number of SF outputs (i.e., K) is reduced to the number of UE, while more diversity gain in EVM can be achieved by increasing K. The SF enables a new axis of design flexibility for FH compression. While the time-domain CR is more discrete (as B can only be an integer), the spatial CR can be tuned with finer granularity, which increases the resolution of CR adjustment and the flexibility of FH compression [24]. Moreover, quantization resolution (B) can be traded for spatial redundancy (K). For instance, an EVM of −42dB or 0.8% can be achieved by a time-domain CR of 50% (7-bit quantizer) with a spatial CR of 7%, or by a time-domain CR of 37.5% (5-bit quantizer) with a spatial CR of 41%. Another example is shown by the dashed line in Fig. 10(a), where a CR of 20% can be achieved by different combinations of spatial/time-domain CR with different EVM performance. Therefore, this provides the opportunity to jointly optimize the spatial and time-domain CR to achieve the lowest required FH bit rate for a given EVM requirement and/or margin. For example, Fig. 10(b) shows the required FH bit rate (with EVM<1%) versus the number of antennas after jointly optimizing the spatial and time-domain CR (i.e., K and B). As shown by the solid lines, the required FH bit rate no longer increases with the number of antennas, which greatly reduces the FH BW, e.g., 256-antenna FH with 32 UE requires only 15.1-Gb/s FH bit rate to achieve EVM<1%.
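The trade-off between K and B at a fixed CR can be enumerated from Eq. (12), and the bit rate of any pair follows from Eq. (13). The sketch below lists, for the 20% CR of the dashed line and M = 256, the largest K that fits for a few word lengths. The pair K = 34, B = 6 used at the end is a hypothetical operating point that happens to land near the quoted 15.1-Gb/s; the (K, B) pair actually selected by the optimization is not stated here. Function names are assumptions.

```python
import math

def evm_percent(evm_db):
    """Invert EVM(dB) = 20 log10(EVM) and return EVM in percent."""
    return 100 * 10 ** (evm_db / 20)

def outputs_for_cr(target_cr, M, B, b_cpri=15):
    """Largest K whose CR from Eq. (12) does not exceed target_cr."""
    return math.floor(target_cr * M * (b_cpri + 1) / (B + 1))

def bit_rate_uniform(f_s, K, B, h_lc):
    """Eq. (13): FH bit rate with the same word length B on all K outputs."""
    return 2 * f_s * K * (B + 1) * h_lc

# (B -> K) pairs that all give a CR of at most 20% for M = 256.
options = {B: outputs_for_cr(0.20, 256, B) for B in (4, 5, 6, 7)}

evm = evm_percent(-42)                                    # about 0.79%
rate = bit_rate_uniform(30.72e6, K=34, B=6, h_lc=66 / 64)  # hypothetical K and B
```

Each entry of `options` costs essentially the same FH bandwidth, so the choice among them is made on EVM alone.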

Fig. 10 Simulation results: (a) Contour plot of EVM (dB) vs. space and time-domain CR. EVM(dB) = 20log10(EVM). (b) Required FH bit rate vs. number of antennas.


4. Experimental setup and results

In this section, we further investigate the combined optimization of the proposed space-time compression technique and a BW-limited optical FH link through proof-of-concept experiments. The experimental setup of the uplink FH transmission is depicted in Fig. 11(a). UE signal generation and wireless channel propagation were emulated offline with the same parameters as in the previous MIMO simulation. At the RRH, the signals were received by 256 antennas and then compressed by a 256-by-K FAPI-based SF (β = 0.9999). The resulting K channels were adaptively quantized by B-bit ADPCM. The output streams were then multiplexed in the time domain, coded by 64b/66b line coding [3], and modulated to PAM4 symbols. An arbitrary waveform generator (AWG, Tektronix 70001A) output the PAM4 signal at 1 sample per symbol (i.e., non-return-to-zero, NRZ), which was amplified and modulated onto an optical carrier at ~1551.7nm (DFB laser NLK1554BTZ-A) via an LN Mach-Zehnder modulator (MZM, MXAN-LN-40). The peak-to-peak voltage (V_PP) of the driving signal for the MZM was about 2.0-V. Figure 11(b) shows examples of the optical spectra of 2.5GBd, 10GBd, and 25GBd PAM4 signals at the output of the MZM. The optical PAM4 signal was then transmitted over a single-mode fiber (SMF) FH link to the BBU. At the BBU, the PAM4 signal was received by a photodetector (PD). In 5G, the required FH reach would be up to 20km [29]. For high-capacity applications, the FH distance can be just a few kilometres [30]. In this work, 5km and 15km FH cases are studied. The received powers in the 5km and 15km FH cases were −0.2dBm and −2.4dBm, respectively. The electrical PAM4 signal was sent to a Tektronix oscilloscope operating at 50GSa/s. The V_PP of the signal detected on the oscilloscope was about 42-mV (5km FH case) or about 28-mV (15km FH case). The signal was then demodulated offline, including PAM4 recovery based on a decision feedback equalizer (DFE), 64b/66b decoding, and MIMO PHY processing. In this experiment, the FH BW was mainly limited by the available AWG. The demonstrated CPRI-equivalent FH rate is 259.5-Gb/s according to Eq. (9).

Fig. 11 (a) Experimental setup. PC: polarization controller. (b) Optical spectra (resolution: 0.02nm) of PAM4 signal at different baud rates, measured at the output of MZM.


We tested multiple cases of number of UE (i.e., P) up to 100 [11]. For each P, different PAM4 baud rates up to 25GBd were tested to investigate different CR. The CR and PAM4 baud rate are expressed by Eqs. (12) and (14) respectively (H_LC = 66/64).

$$\mathrm{BaudRate} = f_{S} K \cdot (B + 1) \cdot H_{LC} \tag{14}$$

For example, 5GBd and 10GBd correspond to CR of 3.9% and 7.8%, respectively. The AWG baud rates in the experiment were integer multiples of 2.5GBd, slightly larger than the values given by Eq. (14). A smaller CR indicates a smaller FH BW yet larger signal distortion caused by compression. A larger CR indicates smaller compression-induced distortion but requires a broader FH BW, and larger distortion would be caused if the actual FH BW is not adequate. As suggested in Section 2.3, we jointly tuned K and B to find the lowest signal EVM for a given CR, and then sought the minimum required FH baud rate (or bit rate) for a given EVM requirement.
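Equation (14) and the 2.5-GBd grid of the AWG can be combined in a short helper. The pair K = 32, B = 6 is a hypothetical example, not a setting reported for the experiment, and the function names are assumptions.

```python
import math

def pam4_baud(f_s, K, B, h_lc=66 / 64):
    """Eq. (14): PAM4 symbol rate; PAM4 carries 2 bits per symbol."""
    return f_s * K * (B + 1) * h_lc

def awg_setting(baud, grid=2.5e9):
    """Smallest multiple of the 2.5-GBd grid that is not below baud."""
    return math.ceil(baud / grid) * grid

needed = pam4_baud(30.72e6, K=32, B=6)   # hypothetical K and B, about 7.1 GBd
chosen = awg_setting(needed)             # next grid point, 7.5 GBd
```

The gap between `needed` and `chosen` is the sense in which the experimental baud rates are slightly larger than Eq. (14).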

Figure 12(a) shows FH link BER with different baud rates. After decision feedback equalization (DFE), no bit error was detected in 2.5GBd~10GBd cases, while from 12.5GBd to 25GBd BER gradually increased due to more severe inter-symbol interference caused by limited AWG BW. The eye diagrams measured at modulator output are shown in insets of Fig. 12(a). Figure 12(b) shows wireless EVM versus baud rate, after jointly optimizing spatial and time-domain CR. For each P, an optimum EVM and corresponding baud rate exists. When the baud rate is smaller than the optimum, signal distortion due to compression is dominant, which may be called “compression-limited regime”. When the baud rate is larger than the optimum, signal distortion due to link BER is dominant, which may be called “FH BW-limited regime”. If EVM thresholds of 8%, 3.5%, 1.68% and 0.7% for 64QAM, 256QAM, 1024QAM and 4096QAM [25,31] are assumed, the compressor supported 32 UE with 4096QAM using 10GBd, or 64 UE with 256QAM using 12.5GBd. On the other hand, for a given EVM requirement, we can choose lowest baud rates or BR to reduce the cost of FH link, e.g., 12 UE with 4096QAM can be supported using 5GBd.
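The mapping from a measured EVM to the densest supportable modulation format, under the EVM thresholds assumed above, can be written as a simple lookup. Treating the threshold as inclusive is an assumption of this sketch, as are the names used.

```python
# EVM thresholds in percent, as assumed in the text for each modulation format.
EVM_THRESHOLDS = [("4096QAM", 0.7), ("1024QAM", 1.68), ("256QAM", 3.5), ("64QAM", 8.0)]

def highest_supported_qam(evm_percent):
    """Return the densest format whose EVM threshold is met, or None."""
    for name, limit in EVM_THRESHOLDS:   # ordered from densest to sparsest
        if evm_percent <= limit:
            return name
    return None
```

For example, an EVM of 1.0% supports up to 1024QAM under these thresholds, while an EVM above 8% supports none of the listed formats.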

Fig. 12 Experimental results, after 5km FH. (a) BER versus PAM4 baud rate. (b) Wireless EVM vs. PAM4 baud rate, with different number of UE (i.e., P).


We also investigated the performance of the adaptive space-time compression technique with an extended FH reach of 15km. The experimental results are shown in Fig. 13. With 15km FH, degraded BER performance can be observed in Fig. 13(a) for the following reasons: (1) the received optical power was reduced and no optical pre-amplifier was used, which reduced the electrical signal-to-noise ratio; (2) no electrical RF amplifier was used, which increased the digitization noise of the oscilloscope; (3) at high baud rates, chromatic dispersion becomes another source of inter-symbol interference (ISI). To improve the optical link BER, we increased the number of feedforward and feedback taps in the DFE from {6, 3} to {14, 6} when the baud rate was equal to or larger than 10GBd. The results after jointly optimizing the spatial and time-domain CR are shown in Fig. 13(b). With 15km FH, the proposed space-time compression technique successfully supported 12 UE with 4096QAM using 5GBd PAM4, or 32 UE with 1024QAM using 7.5GBd PAM4, or 64 UE with 64QAM using 12.5GBd PAM4. Note that, compared with the 5km FH case, the EVM degradation in the 15km FH case is due to the higher link BER. If link errors can be reduced (e.g., by spectrally efficient pulse shaping, electrical amplification and/or more advanced DSP) or corrected by error correction codes, the EVM performance or FH capacity would improve, at the expense of increased FH latency.

Fig. 13 Experimental results, after 15km FH. (a) BER versus PAM4 baud rate. (b) Wireless EVM vs. PAM4 baud rate, with different number of UE (i.e., P).


5. Conclusion

In this paper, we have proposed a low-complexity adaptive space-time compression technique based on an adaptive subspace tracking filter and an adaptive quantizer for efficient FH delivery of massive MIMO signals. With this technique, the required FH BW depends only on the actual number of UE and is decoupled from the number of antennas. Furthermore, we have shown that time-domain quantization resolution can be traded against spatial redundancy, which adds a new axis of flexibility to the FH data compressor. Joint optimization across the space and time dimensions is also feasible. Experimental results with 5km FH have shown that, with 256 antennas (CPRI-equivalent rate of 259.5-Gb/s), 32 UE with 4096QAM can be supported by a 10GBd PAM4 optical interface, or 64 UE with 256QAM can be supported by a 12.5GBd PAM4 optical interface. Extended FH reach was also demonstrated by delivering compressed wireless signals over 15km FH. In this case, 32 UE with 1024QAM can be supported by a 7.5GBd PAM4 optical interface, or 64 UE with 64QAM can be supported by a 12.5GBd PAM4 optical interface. The proposed technique could be a promising candidate for the FH interface in 5G massive MIMO scenarios. In addition, as MIMO is becoming an integral part of most radio access systems, the compression technique could also be useful in broader contexts, such as distributed antenna systems (DAS) and MIMO fiber-wireless converged systems.

Appendix

The update procedure of the spatial filter W(t) based on the FAPI technique is given, in order, by Eqs. (15)-(25).

h(t) = Z(t-1) y(t)

g(t) = \frac{h(t)}{β + y(t)^{H} h(t)}

ε^{2}(t) = ‖x(t)‖^{2} - ‖y(t)‖^{2}

τ(t) = \frac{ε^{2}(t)}{1 + ε^{2}(t) ‖g(t)‖^{2} + \sqrt{1 + ε^{2}(t) ‖g(t)‖^{2}}}

η(t) = 1 - τ(t) ‖g(t)‖^{2}

y'(t) = η(t) y(t) + τ(t) g(t)

h'(t) = Z(t-1)^{H} y'(t)

λ(t) = \frac{τ(t)}{η(t)} [Z(t-1) g(t) - (h'(t)^{H} g(t)) g(t)]

Z(t) = \frac{1}{β} [Z(t-1) - g(t) h'(t)^{H} + λ(t) g(t)^{H}]

e(t) = η(t) x(t) - W(t-1) y'(t)

W(t) = W(t-1) + e(t) g(t)^{H}.

The initial values of the spatial filter W (M-by-K matrix) and the auxiliary filter Z (K-by-K matrix) should be chosen such that W(0) is an orthonormal matrix and Z(0) is a positive definite matrix [15]. In this work we simply choose

W(0) = \begin{bmatrix} I_{K \times K} \\ 0_{(M - K) \times K} \end{bmatrix}

Z (0) = I_{K \times K}

where I and 0 denote identity matrix and zero matrix respectively.
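The update and initialization above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the forgetting factor β = 0.99 is an arbitrary example value, and the compressed sample is assumed to be y(t) = W(t-1)^H x(t), as in the standard FAPI formulation [15].

```python
import numpy as np


def fapi_init(M, K):
    """W(0) = [I; 0] (M-by-K, orthonormal columns) and Z(0) = I (K-by-K)."""
    W = np.zeros((M, K), dtype=complex)
    W[:K, :K] = np.eye(K)
    Z = np.eye(K, dtype=complex)
    return W, Z


def fapi_update(W, Z, x, beta=0.99):
    """One FAPI step for a length-M sample x; returns W(t), Z(t) and y(t)."""
    y = W.conj().T @ x                       # assumed: y(t) = W(t-1)^H x(t)
    h = Z @ y
    g = h / (beta + np.vdot(y, h))           # np.vdot conjugates its first argument
    # Clip at zero to guard against rounding when x lies in the span of W.
    eps2 = max(np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2, 0.0)
    g2 = np.linalg.norm(g) ** 2
    tau = eps2 / (1 + eps2 * g2 + np.sqrt(1 + eps2 * g2))
    eta = 1 - tau * g2
    y_p = eta * y + tau * g
    h_p = Z.conj().T @ y_p
    lam = (tau / eta) * (Z @ g - np.vdot(h_p, g) * g)
    Z_new = (Z - np.outer(g, h_p.conj()) + np.outer(lam, g.conj())) / beta
    e = eta * x - W @ y_p
    W_new = W + np.outer(e, g.conj())
    return W_new, Z_new, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, K = 16, 4                             # toy sizes, not the paper's setup
    A = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
    W, Z = fapi_init(M, K)
    for _ in range(400):
        s = rng.standard_normal(K) + 1j * rng.standard_normal(K)
        noise = 1e-3 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
        W, Z, y = fapi_update(W, Z, A @ s + noise)
    # Deviation of W^H W from the identity (orthonormality check).
    print(np.linalg.norm(W.conj().T @ W - np.eye(K)))
```

In exact arithmetic the update keeps the columns of W orthonormal, so decompression in this sketch is simply W(t) y(t).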

Figure 14(a) shows an example of the evolution of the maximum principal angle (MPA) between the subspace spanned by the columns of W(t) and the subspace spanned by the columns of W_Opt (i.e., the true signal subspace) [13–15]. The MPA approaches zero as the two subspaces become nearly identical. Figure 14(b) shows the corresponding evolution of the normalized square error of the decompressed waveform (after averaging across M channels).
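One common way to compute the MPA between two column spaces is through the singular values of the product of their orthonormal bases, which are the cosines of the principal angles. The sketch below follows that textbook approach. The function name is hypothetical, and this is not necessarily how the curves in Fig. 14(a) were produced.

```python
import numpy as np


def max_principal_angle(W, W_opt):
    """Largest principal angle (radians) between span(W) and span(W_opt)."""
    Q1, _ = np.linalg.qr(W)        # orthonormal basis of span(W)
    Q2, _ = np.linalg.qr(W_opt)    # orthonormal basis of span(W_opt)
    # Singular values of Q1^H Q2 are the cosines of the principal angles,
    # so the smallest singular value gives the largest angle.
    cosines = np.linalg.svd(Q1.conj().T @ Q2, compute_uv=False)
    return float(np.arccos(np.clip(cosines.min(), 0.0, 1.0)))


if __name__ == "__main__":
    theta = 0.3
    a = np.array([[1.0], [0.0]])
    b = np.array([[np.cos(theta)], [np.sin(theta)]])
    print(max_principal_angle(a, b))  # close to 0.3
```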

Fig. 14 Simulation results. (a) Evolution of MPA between signal subspaces spanned by columns of FAPI-based W(t) and W_Opt. (b) Evolution of normalized square error of decompressed waveform. The simulation setup is the same as that of Fig. 9(a).


Funding

“Wired-and-Wireless Converged Radio Access Network for Massive IoT Traffic” Ministry of Internal Affairs and Communications R & D contract (FY2017~2020).

Acknowledgments

The authors would like to thank Dr. A. Kanno from NICT for his help on the experiment. Portions of this work will be presented at ECOC 2018, Tu3B.6.

References and links

1. Y. Yoshida, “Mobile xHaul evolution: enabling tools for a flexible 5G xHaul network,” in Optical Fiber Communication Conference (Optical Society of America, 2018), paper Tu2K.1 (Tutorial).

2. J. Kani, J. Terada, K.-I. Suzuki, and A. Otaka, “Solutions for future mobile fronthaul and access-network convergence,” J. Lightwave Technol. 35(3), 527–534 (2017).

3. CPRI Specification v7.0, Technical Report (2015).

4. M. Xu, X. Liu, N. Chand, F. Effenberger, and G.-K. Chang, “Fast statistical estimation in highly compressed digital RoF systems for efficient 5G wireless signal delivery,” in Optical Fiber Communication Conference (Optical Society of America, 2017), paper M3E.7.

5. L. Zhang, X. Pang, O. Ozolins, A. Udalcovs, R. Schatz, U. Westergren, G. Jacobsen, S. Popov, L. Wosinska, S. Xiao, W. Hu, and J. Chen, “Digital mobile fronthaul employing differential pulse code modulation with suppressed quantization noise,” Opt. Express 25(25), 31921–31936 (2017).

6. ETSI standard, GS Open Radio Interface (ORI) 002–1 V4.1.1, Oct. 2014.

7. C.-L. I, H. Li, J. Korhonen, J. Huang, and L. Han, “RAN revolution with NGFI (xhaul) for 5G,” J. Lightwave Technol. 36(2), 541–550 (2018).

8. K. Miyamoto, S. Kuwano, J. Terada, and A. Otaka, “Analysis of mobile fronthaul bandwidth and wireless transmission performance in split-PHY processing architecture,” Opt. Express 24(2), 1261–1268 (2016).

9. X. Liu, N. Chand, F. Effenberger, L. Zhou, and H. Lin, “Demonstration of bandwidth-efficient mobile fronthaul enabling seamless aggregation of 36 E-UTRA-like wireless signals in a single 1.1-GHz wavelength channel,” in Optical Fiber Communication Conference (Optical Society of America, 2015), paper M2J.2.

10. D. Che, F. Yuan, and W. Shieh, “High-fidelity angle-modulated analog optical link,” Opt. Express 24(15), 16320–16328 (2016).

11. E. Bjornson, E. G. Larsson, and M. Debbah, “Massive MIMO for maximal spectral efficiency: how many users and pilots should be allocated?” IEEE Trans. Wirel. Commun. 15(2), 1293–1308 (2016).

12. J. Choi, B. L. Evans, and A. Gatherer, “Space-time fronthaul compression of complex baseband uplink LTE signals,” in Proceedings of IEEE International Conference on Communications (ICC) (IEEE, 2016), pp. 1–6.

13. P. Comon and G. H. Golub, “Tracking a few extreme singular values and vectors in signal processing,” Proc. IEEE 78(8), 1327–1343 (1990).

14. B. Yang, “Projection approximation subspace tracking,” IEEE Trans. Signal Process. 43(1), 96–108 (1995).

15. R. Badeau, B. David, and G. Richard, “Fast approximated power iteration subspace tracking,” IEEE Trans. Signal Process. 53(8), 2931–2941 (2005).

16. ITU-T Recommendation G.722, 1988.

17. ITU-T Recommendation G.726, 1990.

18. H. Benvenuto, G. Bertocci, and W. Daumer, “The 32-kb/s ADPCM coding standard,” Bell Labs Tech. J. 65(5), 12–22 (1986).

19. ETSI standard, “Digital Enhanced Cordless Telecommunications (DECT); Common Interface (CI); Part 8: Speech coding and transmission,” ETSI EN 300 175–8 V2.0.1, Mar. 2007.

20. N. Jayant, “Variable rate ADPCM based on explicit noise coding,” Bell Syst. Tech. J. 62(3), 657–677 (1983).

21. J. Abdoli, M. Jia, and J. Ma, “Filtered OFDM: a new waveform for future wireless systems,” in Proceedings of IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) (IEEE, 2015), pp. 66–70.

22. H. Zeng, X. Liu, S. Megeed, N. Chand, and F. Effenberger, “Real-time demonstration of CPRI-compatible efficient mobile fronthaul using FPGA,” J. Lightwave Technol. 35(6), 1241–1247 (2017).

23. B. Guo, W. Cao, A. Tao, and D. Samardzija, “LTE/LTE-A signal compression on the CPRI interface,” Bell Labs Tech. J. 18(2), 117–133 (2013).

24. L. Ramalho, I. Freire, C. Lu, M. Berg, and A. Klautau, “Improved LPC-Based fronthaul compression with high rate adaptation resolution,” IEEE Commun. Lett. 22(3), 458–461 (2018).

25. J. Wang, Z. Yu, K. Ying, J. Zhang, F. Lu, M. Xu, L. Cheng, X. Ma, and G.-K. Chang, “Digital mobile fronthaul based on delta–sigma modulation for 32 LTE carrier aggregation and FBMC signals,” J. Opt. Commun. Netw. 9(2), A233–A244 (2017).

26. C.-X. Wang, S. Wu, L. Bai, X. You, and J. Wang, “Recent advances and future challenges for massive MIMO channel measurements and models,” Sci. China Inf. Sci. 59(2), 1–16 (2016).

27. 3GPP, “Radio access architecture and interfaces,” TR 38.801, V14.0.0, Mar. 2017.

28. A. Ghosh, J. Zhang, J. Andrews, and R. Muhamed, “Fundamentals of LTE,” Pearson Education, 2010.

29. ITU-T Technical Report, “Transport network support of IMT-2020/5G,” Feb. 2018.

30. X. Liu, H. Zeng, N. Chand, and F. Effenberger, “Efficient mobile fronthaul via DSP-based channel aggregation,” J. Lightwave Technol. 34(6), 1556–1564 (2016).

31. CableLabs, “DOCSIS 3.1 physical Layer Specification,” CM-SP-PHYv3.1–I14–180509, 2018.


Optics Express