Transmission rate Optimization by dynamic resource allocation algorithm for RF/VLC heterogeneous networks

Ruimin Gao; Ping Wang; Jingyu Wang; Ting Yang; Huili Shi; Zhao Wang; Sihui Chi; Hui Che; Lixin Guo

doi:10.1364/OE.433392

1. Introduction

In recent years, with the continuous emergence of high-data-rate services and the rapid growth of mobile internet users, the demand for data rate has grown exponentially. However, traditional radio frequency (RF) systems are subject to strict management of the available frequency bands, and the limited spectrum resources will struggle to meet this explosive data demand [1]. Among the emerging wireless communication technologies, visible light communication (VLC), which uses visible light for wireless access, is a highly promising solution [2–8]. VLC can use the spectrum from 400 THz to 800 THz, which is still license-free [9]. In addition, compared with RF, VLC offers advantages such as secure communication and feasibility in RF-restricted areas. However, the performance of VLC degrades sharply in the absence of line-of-sight (LOS) transmission between the VLC access point (AP) and users, and the coverage area of a VLC AP is limited. Hence, reliability is a major concern for VLC systems. By contrast, an RF system can provide wide coverage even in the absence of LOS transmission [10]. Therefore, VLC/RF heterogeneous networks that combine the advantages of both techniques have received a lot of attention in recent years [11–14]; specifically, a heterogeneous network architecture studied by a Euro-Chinese research consortium was reported in [15,16], which could provide a good solution to the broadband wireless access problem. A hybrid RF/VLC network combines RF and VLC networks into one system in which each user is associated with either a VLC AP or an RF AP for data transmission. The VLC APs provide high data rates, and the RF AP ensures communication when the VLC APs cannot serve users. In this way, the two networks compensate for each other [17]. Compared with traditional standalone RF and VLC networks, the hybrid RF/VLC network has better performance, especially in terms of high data transmission rate [18,19].

For the VLC/RF heterogeneous network, because two types of APs coexist, AP assignment and resource allocation (RA) must both be determined before user data transmission, and both strongly affect the transmission rate of the system. Up to now, there have been some studies focusing on AP assignment [20–23]. In [17], three different link usage strategies were proposed to enhance network performance under specific quality-of-service (QoS) constraints, and non-asymptotic bounds on the data backlog and buffering delay violation probability were obtained. In [18], a context-aware learning algorithm sensitive to traffic type, location and time information was proposed, which achieved better performance and faster convergence than traditional methods. In [19], a two-stage access point selection (APS) method using fuzzy logic was developed. In [20], a load balancing method was presented to restrain vertical handover. These methods improve the system rate and reduce the computational complexity of the hybrid VLC system. RA is likewise essential for improving the performance of hybrid RF/VLC networks. In [24], energy efficiency (EE) was studied, and a hybrid RF/VLC network was compared with an RF-only network and with a heterogeneous network composed of two RF communication systems. In [25], EE subchannel and power allocation were investigated in the context of software-defined VLC and RF small-cell networks using Dinkelbach's method and the alternating direction method. Both works optimized the EE of the hybrid system. Meanwhile, optimal resource allocation for maximizing the transmission rate in hybrid RF/VLC networks has become a research focus in recent years [26,27]. In [26], an iterative algorithm was proposed to distribute users among APs, and a new efficient algorithm that finds the optimal dual variables was developed to allocate the power of each AP to its connected users so as to maximize the total achievable data rate. In [27], a bandwidth aggregation protocol was demonstrated to optimize the system throughput.

However, challenges remain in jointly optimizing the performance of the hybrid system, subject to system stability, by dynamically allocating APs, bandwidth and transmit power. First, different types of APs bring their own design issues. In general, the binary allocation variables combined with power allocation make the resource allocation problem a mixed-integer nonlinear program (MINLP), which is non-convex and typically has high computational complexity. Second, RA must be carried out under stochastic channel conditions, and it becomes a joint optimization problem that requires optimizing several parameters simultaneously. These factors make the RA optimization difficult to solve. In [26], two approaches were tried to optimize load balancing and RA jointly. Simulation results showed that the system transmission rate and fairness were improved, but these two methods could not respond in a timely manner to dynamic changes in the network. In [28], a deep Q-network (DQN) learning-based algorithm was given, which could solve the multi-parameter optimization adaptively to maximize the system transmission rate. In [29], the resource allocation problem for hybrid RF/VLC networks with mobile users exhibiting multi-homing capability was studied, and multi-agent Q-learning was employed to develop an online two-timescale power allocation algorithm that identifies the transmit power at the RF and VLC APs needed to satisfy the QoS requirements of users. Nevertheless, to the best of the authors’ knowledge, a comprehensive optimization of AP assignment, subchannel allocation and transmit power allocation under a transmit power budget and the stability requirement of the RF/VLC network has not been reported so far, although it is essential for the hybrid VLC system.

Motivated by the above analysis, a novel and rapid dynamic resource allocation (DRA) algorithm is proposed to jointly optimize the AP assignment, subchannel allocation and power allocation for the APs in a downlink hybrid RF/VLC network. Based on the channel state information (CSI) and the queue state information (QSI), the DRA algorithm aims to maximize the time-averaged transmission rate while guaranteeing system stability. With the Lyapunov optimization technique, the time-averaged problem is converted into a per-timeslot problem, which is then solved within a single timeslot by dividing it into three subproblems. The DRA algorithm does not require iterations, which can greatly improve the optimization speed. Simulations show that, compared with two traditional resource allocation algorithms, the proposed DRA algorithm significantly enhances the transmission rate of the system while ensuring its stability.

2. Indoor VLC/RF heterogeneous network model

As shown in Fig. 1, an indoor VLC/RF system model is considered. The system consists of one RF AP, $N$ VLC APs and $M$ users. In this model, the VLC APs are installed on the ceiling. It is assumed that the RF AP covers the whole room and that all the APs are connected to a central controller, which is responsible for collecting user feedback, scheduling, association, and resource allocation. The $M$ users are distributed uniformly on the receiving plane and are equipped with both VLC receivers and RF receivers. The set of APs is denoted by ${{\cal N}} = {{{\cal N}}_{\textrm{RF}}} \cup {{{\cal N}}_{\textrm{VLC}}} = \{{0,1,2, \ldots ,N} \}$, where ${{{\cal N}}_{\textrm{RF}}} = \{0 \}$ and ${{{\cal N}}_{\textrm{VLC}}} = \{{1,2, \ldots ,N} \}$ represent the sets of the RF AP and the VLC APs, respectively. In addition, ${{\cal M}} = \{{1,2,3, \ldots ,M} \}$ denotes the set of users. The locations of all users are assumed to be unchanged within a timeslot $t$. Let $\alpha _{n,m}(t)$ be the network selection variable, where $\alpha _{n,m}(t) = 1$ means that user $m$ is associated with AP $n$ at timeslot $t$, and $\alpha _{n,m}(t) = 0$ otherwise. Since users do not have multi-homing capability, each user can be served by only one AP in a timeslot, which gives the following constraint

(1)$$\sum\limits_{n \in {{\cal N}}} {\alpha _{n,m}(t)} = 1,\alpha _{n,m}(t) \in \{{0,1} \},\forall m.$$

Fig. 1. System model of an indoor RF/VLC heterogeneous network.


2.1 RF channel model

The RF communication path loss is denoted as [29]

(2)$$\textrm{PL}\,[\textrm{dB}] = A{\log _{10}}({d_{0,m}}) + B + C{\log _{10}}\left( {\frac{{{f_c}}}{5}} \right) + X,$$

where ${d_{0,m}}$ denotes the distance between the RF AP and user $m$, ${f_c}$ is the carrier frequency in GHz, $A$, $B$ and $C$ are constants that depend on the propagation model, and $X$ is the environment-specific term. Suppose there are $I$ resource blocks (RBs) in the RF system, each a block of contiguous subcarriers in frequency, and let $x_{0,m}^i(t)$ be the RB selection variable: $x_{0,m}^i(t) = 1$ means that user $m$ is assigned RB $i$ of the RF AP at timeslot $t$, while $x_{0,m}^i(t) = 0$ means that it is not. Because there is only one RF AP in the hybrid VLC/RF network, interference in the RF network can be ignored. Thus, the signal-to-noise ratio (SNR) between user $m$ and the RF AP on RB $i$ can be expressed as [29]

(3)$$\zeta _{0,m}^i(t) = \frac{{p_0^i(t)g_{0,m}^i(t)}}{{\sigma _{RF}^2}},$$

where $p_0^i(t)$ is the transmit power of the RF AP on RB $i$, $\sigma _{RF}^2$ is the average noise power of the RF system, and $g_{0,m}^i(t)$ is the channel gain between user $m$ and the RF AP on RB $i$. According to the Shannon formula, the transmission rate between the RF AP and user $m$ on RB $i$ at timeslot $t$ is

(4)$$u_{0,m}^i(t)\textrm{ = }{W_{RF}}{\log _2}(1 + \zeta _{0,m}^i(t)),$$

where ${W_{RF}}$ is the bandwidth of RB. Therefore, the information transmission rate between RF AP and user m at timeslot t is

(5)$${R_{0,m}}(t )= {\alpha _{0,m}}(t ){u_{0,m}}(t )= {\alpha _{0,m}}(t )\sum\limits_{i \in I} {x_{0,m}^i(t )u_{0,m}^i(t )} .$$
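As a minimal sketch of Eqs. (3)–(5), the following Python functions compute the per-RB SNR, the per-RB Shannon rate and the resulting user rate. The function names and the numerical values are illustrative assumptions and are not taken from the paper.

```python
import math

def rf_snr(p_tx, gain, noise_power):
    """Eq. (3): SNR on one RB = transmit power * channel gain / noise power."""
    return p_tx * gain / noise_power

def rf_rb_rate(bandwidth_hz, snr):
    """Eq. (4): Shannon rate of one RB in bit/s."""
    return bandwidth_hz * math.log2(1.0 + snr)

def rf_user_rate(alpha_0m, x_0m, p_0m, g_0m, bandwidth_hz, noise_power):
    """Eq. (5): sum the rates of the RBs assigned to user m, gated by alpha_{0,m}."""
    total = 0.0
    for x, p, g in zip(x_0m, p_0m, g_0m):
        if x:  # RB i contributes only when x_{0,m}^i = 1
            total += rf_rb_rate(bandwidth_hz, rf_snr(p, g, noise_power))
    return alpha_0m * total

# Illustrative numbers only: three RBs of 1 MHz, user m holds RB 0 and RB 2.
rate = rf_user_rate(1, [1, 0, 1], [0.1, 0.1, 0.3], [1e-6, 2e-6, 1e-6], 1e6, 1e-7)
rate_unattached = rf_user_rate(0, [1, 0, 1], [0.1, 0.1, 0.3], [1e-6, 2e-6, 1e-6], 1e6, 1e-7)
```

With these numbers the two active RBs have SNRs of about 1 and 3, so the user rate is about 3 Mbit/s, and it drops to zero as soon as $\alpha_{0,m}(t) = 0$.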

Since each RB can only be allocated to one user at a timeslot, for every RB i, there is

(6)$$\sum\limits_{m \in {{\cal M}}} {\alpha _{0,m}(t)} x_{0,m}^i(t) \le 1.$$

2.2 VLC channel model

For the studied VLC system, the gain of LOS channel between the VLC AP$n$ and the user m can be modeled as [22]

(7)$${h_{n,m}} = \left\{ \begin{array}{l} \frac{{(ml + 1){A_p}}}{{2\pi d_{n,m}^2}}{\cos^{ml}}({{\theta_{n,m}}} )\cos ({{\phi_{n,m}}} ){T_n}{g_n},({\phi_{n,m}} \le FOV)\\ 0,({\phi_{n,m}} > FOV) \end{array} \right.,$$

where $ml = - \ln 2/\ln (\cos (FOV))$ is the Lambertian order, FOV is the semi-angle at half power of the light source emission pattern, ${A_p}$ is the effective area of the photodiode (PD) of the receiver, ${d_{n,m}}$ is the distance between AP $n$ and user $m$, and ${\theta _{n,m}}$ and ${\phi _{n,m}}$ represent the irradiance angle and the incidence angle, respectively. ${T_n}$ is the gain of the optical filter and ${g_n}$ is the concentrator gain. Suppose there are $J$ subcarriers in the VLC system, and let $w_{n,m}^j(t)$ be the subcarrier selection variable, where $w_{n,m}^j(t) = 1$ means that subcarrier $j$ of AP $n$ is assigned to user $m$ at timeslot $t$, and $w_{n,m}^j(t) = 0$ otherwise. The signal-to-interference-plus-noise ratio (SINR) between user $m$ and VLC AP $n$ on subcarrier $j$ can be calculated as [26]

(8)$$\zeta _{n,m}^j(t) = \frac{{{{|{\varepsilon \pi p_{n,m}^{j, \textrm{opt}}(t)h_{n,m}^j(t)} |}^2}}}{{\sigma _{\textrm{VLC}}^2\textrm{ + }\sum\limits_{\scriptstyle{n}^{\prime} \in {{{\cal N}}_{VLC}}\atop \scriptstyle{n}^{\prime} \ne n} {{{|{\varepsilon \pi p_{n^{\prime},m}^{j, \textrm{opt}}(t)h_{n^{\prime},m}^j(t)} |}^2}} }}\textrm{ = }\frac{{p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\sigma _{\textrm{VLC}}^2\textrm{ + }\sum\limits_{\scriptstyle{n}^{\prime} \in {{{\cal N}}_{VLC}}\atop \scriptstyle{n}^{\prime} \ne n} {p_{n^{\prime},m}^j(t){{|{\varepsilon h_{n^{\prime},m}^j(t)} |}^2}} }},$$

where $\varepsilon $ is the photodetector responsivity of the users, $p_{n,m}^j(t)$ represents the transmit power of VLC AP $n$ to user $m$ on subcarrier $j$, $h_{n,m}^j(t)$ is the channel gain between VLC AP $n$ and user $m$, and $\sigma _{VLC}^2$ is the noise power of the VLC system.
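The LOS gain of Eq. (7) and the second form of the SINR in Eq. (8) can be sketched in Python as follows. The function names, default gains and sample geometry are assumptions made for illustration; the angles are in radians.

```python
import math

def lambertian_order(half_angle_rad):
    """ml = -ln 2 / ln(cos(semi-angle)), as defined after Eq. (7)."""
    return -math.log(2.0) / math.log(math.cos(half_angle_rad))

def vlc_los_gain(d, theta, phi, fov, ml, area, t_filter=1.0, g_conc=1.0):
    """Eq. (7): LOS gain; zero when the incidence angle phi exceeds the FOV."""
    if phi > fov:
        return 0.0
    return ((ml + 1.0) * area / (2.0 * math.pi * d ** 2)
            * math.cos(theta) ** ml * math.cos(phi) * t_filter * g_conc)

def vlc_sinr(n, powers, gains, responsivity, noise_power):
    """Second form of Eq. (8) on one subcarrier; index n is the serving VLC AP."""
    received = [p * (responsivity * h) ** 2 for p, h in zip(powers, gains)]
    interference = sum(received) - received[n]  # all other VLC APs
    return received[n] / (noise_power + interference)

# Illustrative values: 60 degree semi-angle, user 2 m directly below the AP.
ml = lambertian_order(math.radians(60.0))
h = vlc_los_gain(d=2.0, theta=0.0, phi=0.0, fov=math.radians(60.0), ml=ml, area=1e-4)
h_blocked = vlc_los_gain(d=2.0, theta=0.0, phi=1.2, fov=math.radians(60.0), ml=ml, area=1e-4)
sinr = vlc_sinr(0, [1.0, 1.0], [2.0, 1.0], 1.0, 1.0)
```

A 60° semi-angle gives a Lambertian order of about 1, and the toy SINR evaluates to 4/(1 + 1) = 2, which shows how a single interfering AP enters the denominator.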

According to Shannon’s capacity formula, the achievable transmission rate between VLC AP $n$ and user $m$ on subcarrier $j$ at timeslot $t$ can be expressed as

(9)$$u_{n,m}^j(t) = {W_{VLC}}{\log _2}(1 + \zeta _{n,m}^j(t)),$$

where ${W_{VLC}}$ is the bandwidth of subcarrier. Since each subcarrier can only be allocated to one user at a timeslot, for every subcarrier j, the following constraint can be obtained

(10)$$\sum\limits_{m \in {{\cal M}}} {\alpha _{n,m}(t)w_{n,m}^j(t)} \le 1{\kern 1pt} {\kern 1pt} {\kern 1pt} ,{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \alpha _{n,m}(t) \in \{{0,1} \},{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} w_{n,m}^j(t) \in \{{0,1} \},\forall n \in {{{\cal N}}_{VLC}},$$

therefore, the information transmission rate between the VLC AP n and user m at timeslot t is given by

(11)$${R_{n,m}}(t )= {\alpha _{n,m}}(t ){u_{n,m}}(t )= {\alpha _{n,m}}(t )\sum\limits_{j \in J} {w_{n,m}^j(t )u_{n,m}^j(t )} .$$

2.3 Power consumption limiting condition

The transmit power between RF AP and user m can be expressed as

(12)$$P_{0,m}^{RF}(t) = \alpha _{0,m}(t)\sum\limits_{i \in I} {x_{0,m}^i(t)p_{0,m}^i(t)} ,$$

and the transmit power of the RF AP can be calculated as

(13)$$P_0^{RF}(t) = \sum\limits_{m \in {{\cal M}}} {\alpha _{0,m}(t)} P_{0,m}(t) = \sum\limits_{m \in {{\cal M}}} {\alpha _{0,m}(t)\sum\limits_{i \in I} {x_{0,m}^i(t)p_{0,m}^i(t)} } .$$

Then, the power between VLC AP n and user m can be expressed as

(14)$$P_{n,m}^{VLC}(t) = \sum\limits_{j \in J} {w_{n,m}^j(t)p_{n,m}^j(t)} ,$$

and the transmitted power of the VLC AP n at timeslot t is

(15)$$P_n^{VLC}(t) = \sum\limits_{m \in {{\cal M}}} {\alpha _{n,m}(t)} P_{n,m}^{VLC}(t).$$

In addition, to conserve power, an upper bound of time-averaged power consumption of AP n can be set to

(16)$$0 \le \mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {E[{{P_n}(t)} ]} < P_n^{ave},$$

where $P_n^{ave}$ is the maximum average power consumption of AP$n$ and it can be expressed as

(17)$$P_n^{ave} = \left\{ \begin{array}{ll} P_{RF}^{ave},&if\;n = 0,\\ P_{VLC}^{ave},&if\;n = 1, \ldots ,N. \end{array} \right.$$

2.4 Data queue model

In this work, it is assumed that the buffering queue of each user is maintained by the central controller. Let ${Q_m}(t)$ denote the queue length maintained by the system for user $m$ at the beginning of timeslot $t$. ${A_m}(t)$ denotes the amount of data arriving at timeslot $t$ for user $m$, which is independent and identically distributed over timeslots. The mean arrival rate of user $m$ is ${\lambda _m}$ [30], i.e., $E[{{A_m}(t)} ]= {\lambda _m},\forall m \in {{\cal M}}$, and ${\lambda _m}$ is assumed to be bounded to guarantee the stability of the system. Accordingly, ${Q_m}(t)$ evolves according to [30]

(18)$${Q_m}(t + 1) = {[{{Q_m}(t) - {R_m}(t),0} ]^\textrm{ + }} + A_m^{}(t),$$

where ${[{x,0} ]^\textrm{ + }} = \max [{x,0} ]$. For each user $m$, the queue process is stochastic because of the randomness of ${A_m}(t)$ and the time-varying nature of ${R_m}(t)$. If the queues are ignored and only the system transmission rate is maximized, a large amount of data may accumulate in the queues during transmission, which may prevent the system from operating. Therefore, a model of queue stability is required. The general definition of queue stability is given as

(19)$$\mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {E[{{Q_m}(t)} ]} < \infty ,$$

where $E[. ]$ represents the expectation operator.

In a practical hybrid VLC system, the average rate stability means that the time-averaged rate of departures from the queue is greater than or equal to the time-averaged rate of inputs into the queue. Thus, when the average rate of the queue is stable, the data in every queue will eventually be sent to its users.
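The queue recursion of Eq. (18) and a finite-horizon counterpart of the average in Eq. (19) can be sketched as follows. The helper names and the arrival and service traces are hypothetical and only illustrate the bookkeeping.

```python
def update_queue(q, served, arrived):
    """Eq. (18): Q(t+1) = max(Q(t) - R(t), 0) + A(t)."""
    return max(q - served, 0.0) + arrived

def time_average_backlog(arrivals, services, q0=0.0):
    """Empirical counterpart of the average in Eq. (19) over a finite trace."""
    q, total = q0, 0.0
    for a, r in zip(arrivals, services):
        total += q  # Q(t) is observed at the beginning of slot t
        q = update_queue(q, r, a)
    return total / len(arrivals), q

# Illustrative trace: mean arrival 3 per slot, mean service 3.5 per slot.
avg_backlog, final_q = time_average_backlog([3.0] * 4, [2.0, 5.0, 2.0, 5.0])
```

In this trace the backlog stays between 3 and 4 units, which is the behaviour one would expect when the average service rate is at least the average arrival rate.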

3. Problem formulation

3.1 Primal problem formulation

The purpose of this work is to obtain the maximum transmission rate under the condition of system stability. And at timeslot t, the transmission rate of the whole system can be expressed as

(20)$$R(t) = \sum\limits_{m \in {{\cal M}}} {{R_m}(t) = } \sum\limits_{n \in {{\cal N}}} {\sum\limits_{m \in {{\cal M}}} {{R_{n,m}}} } (t).$$

The objective is to maximize the average rate of the system. To this end, the objective is written as a time average of the rate in (20), and the following stochastic optimization problem is formulated

(21)$$\begin{aligned} &\max \;\;\;\mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {E[{R(t)} ]} \\ &\textrm{s.t.}\;\;C1:\mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {E[{{Q_m}(t)} ]} < \infty ,\forall m \in {{\cal M}},\\ &\quad C2:\sum\limits_{n \in {{\cal N}}} {\alpha _{n,m}(t)} = 1,\alpha _{n,m}(t) \in \{{0,1} \},\forall m \in {{\cal M}},\forall n \in {{\cal N}},\\ &\quad C3:\sum\limits_{m \in {{\cal M}}} {\alpha _{0,m}(t)} x_{0,m}^i(t) \le 1,\alpha _{n,m}(t) \in \{{0,1} \},x_{0,m}^i(t) \in \{{0,1} \},\forall m \in {{\cal M}},\\ &\quad C4:0 \le \mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {E[{{P_n}(t)} ]} < P_n^{ave},\\ &\quad C5:\sum\limits_{m \in {{\cal M}}} {\alpha _{n,m}(t)w_{n,m}^j(t)} \le 1,\;\alpha _{n,m}(t) \in \{{0,1} \},\;w_{n,m}^j(t) \in \{{0,1} \},\forall n \in {{{\cal N}}_{VLC}},\\ &\quad C6:p_{0,m}^i(t) \ge 0,p_{n,m}^j(t) \ge 0,\forall n \in {{{\cal N}}_{VLC}},\forall m \in {{\cal M}}. \end{aligned}$$

Here, $C1$ is the stability requirement. $C2$ ensures that, at each timeslot, a user is served by exactly one AP. $C3$ ensures that at most one user is served on each RB of the RF AP. $C4$ represents the average power budget constraint for each AP. $C5$ ensures that at most one user is served on each subcarrier of each VLC AP. $C6$ denotes the non-negativity of the transmit power.
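The integer constraints $C2$, $C3$ and $C5$ can be checked for a candidate allocation with a few lines of Python. This is only a sketch: the nested-list layout (`alpha[n][m]`, `select_n[m][k]`) and the function names are assumptions chosen for illustration.

```python
def satisfies_c2(alpha):
    """C2: every user m is attached to exactly one AP; alpha[n][m] is 0 or 1."""
    num_aps, num_users = len(alpha), len(alpha[0])
    return all(sum(alpha[n][m] for n in range(num_aps)) == 1
               for m in range(num_users))

def satisfies_exclusive_use(alpha_n, select_n):
    """C3 (RBs) or C5 (subcarriers) for one AP: each resource k serves at most
    one attached user. alpha_n[m] and select_n[m][k] are 0 or 1."""
    num_resources = len(select_n[0])
    return all(
        sum(alpha_n[m] * select_n[m][k] for m in range(len(alpha_n))) <= 1
        for k in range(num_resources)
    )

# Two APs and three users: user 0 on AP 0, users 1 and 2 on AP 1.
alpha = [[1, 0, 0], [0, 1, 1]]
ok_c2 = satisfies_c2(alpha)
# User 0 is not attached to AP 1, so its selection entries do not count.
ok_c5 = satisfies_exclusive_use(alpha[1], [[1, 1], [1, 0], [0, 1]])
bad_c5 = satisfies_exclusive_use(alpha[1], [[0, 0], [1, 0], [1, 0]])
```

The product $\alpha_{n,m}(t) w_{n,m}^j(t)$ means that a selection flag only matters for users actually attached to the AP, which is why the first selection matrix passes the check.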

By taking the dynamics of traffic arrivals and wireless channels into consideration, an RA scheme that maximizes the average throughput is proposed. To make the RA more flexible among users, a service contract is also considered in which the average resource provision is guaranteed for each user. Because future traffic arrivals and wireless channel information cannot be known before transmission, the average throughput maximization problem is formulated as a stochastic optimization problem, the Lyapunov optimization technique is adopted to solve it, and an online resource allocation algorithm is further developed.

3.2 Analysis based on Lyapunov optimization theory

As is known, Lyapunov optimization theory can deal with the stability of the system while dealing with the optimization of performance. This theory normally considers the random network environment and solves the problems of queue length minimization and network utility optimization simultaneously. It is noted that problem (21) has a time-averaged limitation on the transmit power consumption. To cope with the time-averaged limitation, Lyapunov optimization is applied to transform $C4$ into an evolutionary process for the virtual queue $\{{{Y_n}(t)} \}$ as follows

(22)$${Y_n}(t + 1) = {[{{Y_n}(t) + {P_n}(t) - P_n^{ave}(t),0} ]^\textrm{ + }}.$$

The virtual queues $\{{{Y_n}(t)} \}$ turn the time-averaged power constraint into a queue stability requirement. If a policy stabilizes the virtual queue $\{{{Y_n}(t)} \}$, it automatically satisfies the average power constraint $C4$ [30].
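A minimal sketch of the virtual power queue in Eq. (22) is given below, assuming a constant budget $P_n^{ave}$ and $Y_n(0) = 0$. The function name and the power trace are hypothetical. Since $Y_n(t+1) \ge Y_n(t) + P_n(t) - P_n^{ave}$, the average power over $T$ slots exceeds the budget by at most $Y_n(T)/T$, which is what the last lines check.

```python
def update_virtual_queue(y, power, power_budget):
    """Eq. (22): Y(t+1) = max(Y(t) + P(t) - P_ave, 0)."""
    return max(y + power - power_budget, 0.0)

# Illustrative trace: the AP overspends in slots 0 and 2 and saves in 1 and 3.
powers = [1.5, 0.5, 2.0, 0.0]
budget = 1.0
y = 0.0
history = []
for p in powers:
    y = update_virtual_queue(y, p, budget)
    history.append(y)

# Average power is bounded by the budget plus Y(T)/T.
avg_power = sum(powers) / len(powers)
bound_holds = avg_power <= budget + y / len(powers)
```

The backlog `history` grows whenever the AP spends more than its budget and drains when it spends less, so a bounded virtual queue indicates that the long-run average power respects the budget.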

Let ${\mathbf Q}(t)$ and ${\mathbf Y}(t)$ be the vectors of data queue states and virtual queue states at timeslot $t$, respectively, and let ${\mathbf G}(t) = [{{\mathbf Q}(t),{\mathbf Y}(t)} ]$ be the concatenated vector. Then, the quadratic Lyapunov function is defined as

(23)$$L({{\mathbf G}(t)} )= \frac{1}{2}\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}}^{} {Q_m^2(t} ) + \frac{1}{2}\sum\limits_{n \in {{\cal N}}}^{} {Y_n^2(t} ).$$

Lyapunov function $L({{\mathbf G}(t)} )$ is a scalar metric to measure the queue congestion state. The one-slot drift of the Lyapunov function at timeslot t can be given by

(24)$$\begin{aligned} &L({{\mathbf G}(t + 1)} )- L({{\mathbf G}(t)} )\\ &= \frac{1}{2}\left( {\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {Q_m^2(t + 1)} + \sum\limits_{n \in {{\cal N}}} {Y_n^2(t + 1)} } \right) - \frac{1}{2}\left( {\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {Q_m^2(t)} + \sum\limits_{n \in {{\cal N}}} {Y_n^2(t)} } \right)\\ &\le \mu + \sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{Q_m}(t)({{A_m}(t) - \alpha _{n,m}(t){u_{n,m}}(t)} )} \\ &\quad + \sum\limits_{n \in {{\cal N}}} {{Y_n}(t)({{P_n}(t) - P_n^{ave}(t)} )} , \end{aligned}$$

where $\mu = \frac{{(N + 1) \times M}}{2}(R_{\max }^2 + A_{\max }^2) + \frac{{(N + 1)}}{2}\left( {{{(P_{\max }^{ave})}^2} + P_{\max }^2} \right)$ is a constant.
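The inequality in (24) can be checked term by term. As a sketch, assuming non-negative queues, rates, arrivals and powers (standard assumptions that are not stated explicitly here), and writing ${R_m}(t)$ for the rate served to user $m$, squaring the updates in Eq. (18) and Eq. (22) gives

$$\begin{aligned} Q_m^2(t + 1) &= {({{[{{Q_m}(t) - {R_m}(t),0} ]}^ + }} )^2} + A_m^2(t) + 2{A_m}(t){[{{Q_m}(t) - {R_m}(t),0} ]^ + }\\ &\le {({{Q_m}(t) - {R_m}(t)} )^2} + A_m^2(t) + 2{A_m}(t){Q_m}(t)\\ &= Q_m^2(t) + R_m^2(t) + A_m^2(t) + 2{Q_m}(t)({{A_m}(t) - {R_m}(t)} ),\\ Y_n^2(t + 1) &\le {({{Y_n}(t) + {P_n}(t) - P_n^{ave}} )^2} \le Y_n^2(t) + P_n^2(t) + {(P_n^{ave})^2} + 2{Y_n}(t)({{P_n}(t) - P_n^{ave}} ). \end{aligned}$$

Halving both bounds, subtracting the terms at timeslot $t$, summing over the queues and replacing the squared rates, arrivals and powers by their maxima yields the constant $\mu$ together with the two weighted sums on the right-hand side of (24).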

Lyapunov drift at timeslot t is denoted as

(25)$$\Delta ({\mathbf G}(t)) = {\mathbb{E}}[{L({\mathbf G}(t + 1))|{\mathbf G}(t)} ]- {\mathbb{E}}[{L({\mathbf G}(t))|{\mathbf G}(t)} ],$$

and therefore the optimization objective is mapped to a penalty function

(26)$$\Delta ({\mathbf G}(t))\textrm{ - V}{\mathbb{E}}[{R(t)|{{\mathbf G}\textrm{(t)}} } ],$$

With the Lyapunov optimization method, the underlying objective of (21) is pursued by minimizing the upper bound of the Lyapunov drift-plus-penalty at each timeslot, and this upper bound is calculated by

(27)$$\begin{aligned} &\Delta ({\mathbf G}(t)) - V{\mathbb{E}}[{R(t)|{\mathbf G}(t)} ]\\ &= {\mathbb{E}}[{L({\mathbf G}(t + 1))|{\mathbf G}(t)} ]- {\mathbb{E}}[{L({\mathbf G}(t))|{\mathbf G}(t)} ]- V\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{\mathbb{E}}[{{R_{n,m}}(t)|{\mathbf G}(t)} ]} \\ &\le \mu + {\mathbb{E}}\left[ {\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{Q_m}(t)({{A_m}(t) - \alpha _{n,m}(t){u_{n,m}}(t)} )} + \sum\limits_{n \in {{\cal N}}} {{Y_n}(t)({{P_n}(t) - P_n^{ave}(t)} )} \,\middle|\, {\mathbf G}(t)} \right]\\ &\quad - V\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{\mathbb{E}}[{{R_{n,m}}(t)|{\mathbf G}(t)} ]} , \end{aligned}$$

where $V$ is a non-negative constant that controls the tradeoff between drift and reward. A larger $V$ gives higher priority to maximizing the throughput at the expense of longer queues, while a smaller $V$ prioritizes short queues to ensure system stability. By minimizing the drift and penalty terms together, the Lyapunov optimization method reduces the queue lengths while solving the network utility optimization problem. Thus, according to Lyapunov optimization theory, problem (21) can be transformed into the problem of minimizing the upper bound of the Lyapunov drift-plus-penalty at each timeslot as

(28)$$\begin{aligned} &\min \;\;\;\sum\limits_{n \in {{\cal N}}} {{Y_n}(t)({{P_n}(t) - P_n^{ave}(t)} )} - V\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {\alpha _{n,m}(t){u_{n,m}}(t)} \\ &\quad + \sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{Q_m}(t)({{A_m}(t) - \alpha _{n,m}(t){u_{n,m}}(t)} )} \\ &\textrm{s.t.}\;\;\;C2,C3,C5,C6. \end{aligned}$$

4. Design of resource allocation algorithms

The reformulated optimization problem (28) is a constrained mixed-integer programming problem, in which $\alpha _{n,m}(t)$, $x_{0,m}^i(t)$ and $w_{n,m}^j(t)$ take integer values while $p_{0,m}^i(t)$ and $p_{n,m}^j(t)$ take continuous values. In this section, the optimization problem in (28) is decomposed into three subproblems, and a corresponding algorithm is proposed for each of them.

4.1 Scheduling policy

By observing the form of the objective function in (28), a distributed resource allocation algorithm is presented. In the first stage, the AP assignment is determined on the basis of the system queue information. In the second stage, the resource allocation problem is solved separately in each network. When $n \in {{{\cal N}}_{\textrm{RF}}}$, the subproblem of joint RB and power allocation in the RF system is solved; when $n \in {{{\cal N}}_{\textrm{VLC}}}$, the subproblem of joint subcarrier and power allocation in the VLC system is solved. Thus, the resource optimization steps for the VLC/RF heterogeneous network are as follows.

Step 1: At the beginning of timeslot $t$, the queue length ${\mathbf Q}(t)$, the virtual power queue length ${\mathbf Y}(t)$ and the channel gains are observed.

Step 2: The AP assignment variables $\alpha _{n,m}(t)$ are obtained by solving the following subproblem

(29)$$\begin{aligned} &\min \;\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {{Q_m}(t)({{A_m}(t) - \alpha _{n,m}(t){u_{n,m}}(t)} )} - V\sum\limits_{{n} \in {{\cal N}}\atop m \in {{\cal M}}} {\alpha _{n,m}(t){u_{n,m}}(t)} \\ &\textrm{s.t.}\;\;\;\sum\limits_{n \in {{\cal N}}} {\alpha _{n,m}(t)} = 1,\alpha _{n,m}(t) \in \{{0,1} \},\forall m \in {{\cal M}},\forall n \in {{\cal N}}. \end{aligned}$$

Step 3: With the network selection variables $\alpha _{n,m}(t)$ obtained from subproblem (29), solve subproblem (30) for joint RB and power allocation in the RF system and subproblem (31) for joint subcarrier and power allocation in the VLC system. When user $m$ is associated with the RF AP, the subproblem is formulated as

(30)$$\begin{aligned} &\max \;\;\;\sum\limits_{m \in {{{\cal M}}_0}} {{u_{0,m}}(t)({V + {Q_m}(t)} )} - {Y_0}(t)({{P_0}(t) - P_{RF}^{ave}(t)} )\\ &\textrm{s.t.}\;\;\;\sum\limits_{m \in {{{\cal M}}_0}} {x_{0,m}^i(t)} \le 1,\alpha _{n,m}(t) \in \{{0,1} \},x_{0,m}^i(t) \in \{{0,1} \},\forall m \in {{{\cal M}}_0},\\ &\quad\quad p_{0,m}^i(t) \ge 0,\forall m \in {{{\cal M}}_0}. \end{aligned}$$

When user $m$ is associated with a VLC AP, the subproblem is formulated as

(31)$$\begin{aligned} &\max \;\;\;\sum\limits_{{n} \in {{{\cal N}}_{VLC}}\atop m \in {{\cal M}}} {\alpha _{n,m}(t){u_{n,m}}(t)({{Q_m}(t) + V} )} - \sum\limits_{n \in {{{\cal N}}_{VLC}}} {{Y_n}(t)({{P_n}(t) - P_{\textrm{VLC}}^{ave}(t)} )} \\ &\textrm{s.t.}\;\;\;\sum\limits_{m \in {{\cal M}}} {{\alpha _{n,m}}(t)} w_{n,m}^j(t) \le 1,\alpha _{n,m}(t) \in \{{0,1} \},w_{n,m}^j(t) \in \{{0,1} \},\forall m \in {{\cal M}},\\ &\quad\quad p_{n,m}^j(t) \ge 0,\forall m \in {{\cal M}}. \end{aligned}$$

Step 4: Update the queue length ${\mathbf Q}(t)$ according to Eq. (18).

Thus, the primary problem is decomposed into two steps and divided into three relatively independent subproblems. The procedure for solving these subproblems is given in the following parts.

4.2 AP assignment

From (29), it can be seen that the AP assignment decouples across users. Thus, (29) can be divided into $M$ subproblems, one per user $m$, each of which is given as

(32)$$\begin{aligned} &\max \;\;\;\sum\limits_{n \in {{\cal N}}} {\alpha _{n,m}(t){\textrm{v}_{n,m}}(t)} \\ &\textrm{s}\textrm{.t}\;\;\;\;\;\sum\limits_{n \in {{\cal N}}} {\alpha _{n,m}(t)} = 1,\alpha _{n,m}(t) \in \{{0,1} \},\forall m \in {{\cal M}},\forall n \in {{\cal N} }. \end{aligned}$$

where ${\textrm{v}_{n,m}}(t)\textrm{ = }{u_{n,m}}(t)(\textrm{V} + {Q_m}(t))$. This is a linear optimization problem, which is solved to obtain the optimal network access factor as

(33)$$\alpha _{n,m}(t) = \left\{ \begin{array}{ll} 1,&n = \mathop {\arg \max }\limits_{0 \le l \le N} {\textrm{v}_{l,m}}(t),\\ 0,&\textrm{otherwise}. \end{array} \right.$$

Under the above network selection strategy, ${\textrm{v}_{n,m}}(t)$ can be regarded as the network utility that balances ${u_{n,m}}(t)$ and ${Q_m}(t)$ during timeslot $t$. Eq. (33) then ensures that each user connects to the AP with the maximum network utility.
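The selection rule of Eq. (33) can be sketched as a per-user argmax. The function name, the nested-list layout `u[n][m]` and the sample rates are assumptions for illustration; ties are broken in favour of the lowest AP index, which the paper does not specify.

```python
def assign_aps(u, q, v):
    """Eq. (33): user m joins the AP n that maximizes v_{n,m} = u[n][m] * (V + Q_m)."""
    num_aps, num_users = len(u), len(u[0])
    alpha = [[0] * num_users for _ in range(num_aps)]
    for m in range(num_users):
        utilities = [u[n][m] * (v + q[m]) for n in range(num_aps)]
        best = max(range(num_aps), key=utilities.__getitem__)  # first max on ties
        alpha[best][m] = 1
    return alpha

# Illustrative rates: row 0 is the RF AP, rows 1 and 2 are VLC APs.
u = [[5.0, 5.0, 5.0],
     [9.0, 1.0, 0.0],
     [2.0, 8.0, 0.0]]
alpha = assign_aps(u, q=[1.0, 2.0, 3.0], v=10.0)
```

Here user 2 has no usable VLC link and falls back to the RF AP. Because $V + Q_m(t)$ does not depend on $n$, the rule as written picks, for each user, the AP with the largest candidate rate whenever $V + Q_m(t) > 0$.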

4.3 Joint RBs and power allocation of RF AP

Problem (30) is a function of the discrete variable $x_{0,m}^i(t)$ and the continuous variable $p_{0,m}^i(t)$; hence the joint RB and power allocation problem is a mixed combinatorial problem. The Lagrange relaxation method is applied to transform the original integer-programming problem into a concave optimization problem.

First, relax $x_{0,m}^i(t)$ to $\tilde{x}_{0,m}^i(t) \in [0,1]$. Then, problem (30) can be reformulated as a concave optimization problem

(34)$$\begin{aligned} &\max \;\;\;\sum\limits_{i \in I} {\sum\limits_{m \in {{{\cal M}}_0}} {\tilde{x}_{0,m}^i(t)({\textrm{V + }{Q_m}(t)} ){W_{RF}}{{\log }_2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\tilde{x}_{0,m}^i(t)\sigma _{RF}^2}})} } \\ &\quad\quad\quad\quad - {Y_0}\sum\limits_{i \in I} {\sum\limits_{m \in {{{\cal M}}_0}} {p_{0,m}^i} } \\ &\textrm{s}\textrm{.t}\;\;\;\;\sum\limits_{m \in {{{\cal M}}_0}} {\tilde{x}_{0,m}^i(t)} \le 1,\tilde{x}_{0,m}^i(t) \in [{0,1} ],\forall m \in {{{\cal M}}_0},\\ &\quad p_{0,m}^i(t) \ge 0,\forall m \in {{{\cal M}}_0}. \end{aligned}$$

Because $\tilde{x}_{0,m}^i(t) \in [0,1]$, $\tilde{x}_{0,m}^i(t){W_{RF}}{\log _2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\tilde{x}_{0,m}^i(t)\sigma _{RF}^2}}) \ge \tilde{x}_{0,m}^i(t){W_{RF}}{\log _2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\sigma _{RF}^2}})$, and the feasible region of (34) contains that of (30). Moreover, $\sum\limits_{i \in I} {\sum\limits_{m \in {{{\cal M}}_0}} {\tilde{x}_{0,m}^i(t){W_{RF}}{{\log }_2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\tilde{x}_{0,m}^i(t)\sigma _{RF}^2}})({V + {Q_m}(t)} )} }$ is jointly concave in $\tilde{x}_{0,m}^i(t)$ and $p_{0,m}^i(t)$, because each term is the perspective of the concave function ${\log _2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\sigma _{RF}^2}})$ multiplied by a non-negative weight. Thus, the objective of (34) is the sum of $I \times |{{{{\cal M}}_0}} |$ concave functions minus a linear term. In addition, the constraints of (34) are all linear, so they form a convex set, and the given optimization problem is a concave optimization problem.

Here, the objective function can be formulated as

(35)$$L(P) = \sum\limits_{i \in I} {\sum\limits_{m \in {{\cal M}}} {\tilde{x}_{0,m}^i(t)({\textrm{V + }{Q_m}(t)} ){W_{RF}}{{\log }_2}(1 + \frac{{p_{0,m}^i(t)g_{0,m}^i(t)}}{{\tilde{x}_{0,m}^i(t)\sigma _{RF}^2}})} } - {Y_0}\sum\limits_{i \in I} {\sum\limits_{m \in {{\cal M}}} {p_{0,m}^i} } .$$

Then, taking the partial derivative of $L(P)$ with respect to $p_{0,m}^i(t)$ gives

(36)$$\frac{{\partial L}}{{\partial p_{0,m}^i(t)}} = \frac{{\tilde{x}_{0,m}^i(t)({\textrm{V + }{Q_m}(t)} ){W_{RF}}g_{0,m}^i(t)}}{{(\sigma _{RF}^2\tilde{x}_{0,m}^i(t) + p_{0,m}^i(t)g_{0,m}^i(t))\ln 2}} - {Y_0}.$$

According to the KKT conditions, the optimal power allocation is achieved when it meets the following conditions

(37)$$\frac{{\partial \left[ {L - \left( {\sum\limits_{m \in {{{\cal M}}_0}} {\sum\limits_{i \in I} {{\mu_{i,m}}p_{0,m}^i(t)} } } \right)} \right]}}{{\partial p_{0,m}^i(t)}} = \frac{{\tilde{x}_{0,m}^i(t)({V + {Q_m}(t)} ){W_{RF}}g_{0,m}^i(t)}}{{(\sigma _{RF}^2\tilde{x}_{0,m}^i(t) + p_{0,m}^i(t)g_{0,m}^i(t))\ln 2}} - {Y_0} - {\mu _{i,m}} = 0,$$

(38)$${\mu _{i,m}}p_{0,m}^i(t)\textrm{ = }0,\;\;\;\forall i,m,$$

(39)$${\mu _{i,m}} \ge 0,\;\;\forall i,m,$$

where ${\mu _{i,m}}$ is the dual variable associated with constraint C4 in the Lagrange function. For the complementary slackness condition ${\mu _{i,m}}p_{0,m}^i(t)\textrm{ = }0$ to hold, either ${\mu _{i,m}}\textrm{ = }0$ or $p_{0,m}^i(t) = 0$ must be satisfied. Thus, a nonzero optimal power allocation is achieved when ${\mu _{i,m}}\textrm{ = }0$. By setting the derivative in Eq. (37) to zero with ${\mu _{i,m}}\textrm{ = }0$, the relationship between $p_{0,m}^{i{\ast }}(t)$ and $\tilde{x}_{0,m}^i(t)$ can be obtained as

(40)$$p_{0,m}^{i{\ast }}(t) = {\left[ {\frac{{({\textrm{V + }{Q_m}(t)} ){W_{RF}}}}{{{Y_0}\ln 2}} - \frac{{\sigma_{\textrm{RF}}^2}}{{g_{0,m}^i(t)}}} \right]^ + }\tilde{x}_{0,m}^i(t).$$
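The closed form in Eq. (40) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names `rf_power` and `dL_dp` and all numeric values are illustrative assumptions. It evaluates Eq. (40) and confirms that the derivative in Eq. (36) vanishes at the resulting power whenever the bracket is positive.

```python
import math

def rf_power(x, V, Q, W, Y0, sigma2, g):
    """Eq. (40): p* = [ (V + Q) W / (Y0 ln 2) - sigma2 / g ]^+ * x.

    x is the relaxed RB indicator, Q the queue length Q_m(t), Y0 the
    virtual power queue of the RF AP, g the channel gain (names assumed).
    """
    level = (V + Q) * W / (Y0 * math.log(2))  # level set by V, Q_m(t) and Y_0
    return max(level - sigma2 / g, 0.0) * x   # [.]^+ clips negative values to zero

def dL_dp(p, x, V, Q, W, Y0, sigma2, g):
    """Eq. (36): partial derivative of L(P) with respect to p."""
    return x * (V + Q) * W * g / ((sigma2 * x + p * g) * math.log(2)) - Y0
```

A user with a poor channel (large `sigma2 / g`) or an AP with a large virtual power queue `Y0` receives zero power, which is the clipping expressed by the $[\cdot]^+$ operator.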

Then, the result of power allocation can be adopted to obtain RB assignment. After substituting Eq. (40) into Eq. (35), $L(P)$ can be expressed as

(41)$$L(P) = \sum\limits_{i \in I} {\sum\limits_{m \in {{\cal M}}} {{l_{i,m}}(P)\tilde{x}_{0,m}^i(t)} } ,$$

where

(42)$$\begin{aligned} {l_{i,m}}(P) &= a_{0,m}^i({\textrm{V + }{Q_m}(t)} ){W_{RF}}\\ &\quad {\log _2}(1 + {\left[ {\frac{{a_{0,m}^i\tilde{x}_{0,m}^i(t)({\textrm{V + }{Q_m}(t)} ){W_{RF}}}}{{{Y_0}\ln 2}} - \frac{{{W_{RF}}\sigma_{RF}^2}}{{g_{0,m}^i(t)}}} \right]^ + }\frac{{g_{0,m}^i(t)}}{{{W_{RF}}\sigma _{RF}^2}})\;\\ &\quad- {Y_0}{\left[ {\frac{{a_{0,m}^i\tilde{x}_{0,m}^i(t)({\textrm{V + }{Q_m}(t)} ){W_{RF}}}}{{{Y_0}\ln 2}} - \frac{{{W_{RF}}\sigma_{RF}^2}}{{g_{0,m}^i(t)}}} \right]^ + }\;.\end{aligned}$$

It can be easily observed that $\max \;L(P)$ is a classical linear assignment problem. Hence, it can be equivalently decomposed into I subproblems, each of which is formulated as

(43)$${L_i}(P) = \sum\limits_{m \in {{\cal M}}} {{l_{i,m}}(P)\tilde{x}_{0,m}^i(t)} .$$

The objective is to maximize ${L_i}(P)$ for each RB i. For each RB i, the optimal assignment is to select the user with the largest ${l_{i,m}}(P)$. The solution can then be obtained as

(44)$$\tilde{x}_{0,m}^{i\ast }(t) = \left\{ \begin{array}{ll} 1, & m = \mathop {\arg \max }\limits_{m^{\prime} \in {\cal M}} {l_{i,m^{\prime}}}(P),\\ 0, & \textrm{otherwise}. \end{array} \right.$$
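The assignment rule of Eq. (44) amounts to a row-wise arg max over a metric matrix. A minimal sketch is given below, assuming the metrics $l_{i,m}(P)$ have already been computed and are passed in as a nested list `l[i][m]`. The function name `assign_rbs` and the tie-breaking rule (lowest user index wins) are assumptions and are not specified in the text.

```python
def assign_rbs(l):
    """Eq. (44): give each RB i to the user m with the largest l[i][m].

    Returns a 0/1 matrix x with exactly one 1 per row (one user per RB).
    Ties are broken in favor of the lowest user index (an assumption).
    """
    x = []
    for row in l:
        best = max(range(len(row)), key=lambda m: row[m])  # arg max over users
        x.append([1 if m == best else 0 for m in range(len(row))])
    return x
```

Because each RB is decided independently, the cost is linear in the number of RB-user pairs and no iteration is needed.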

4.4 Joint subcarriers and power allocation of VLC AP

To solve the problem presented in (31), the upper bound is required to be found and optimized. Thus, the problem (31) can be converted into

(45)$$\begin{aligned} &\max \;\;\;\sum\limits_{{n} \in {{{\cal N}}_{VLC}}\atop m \in {{\cal M}}}^{} {\alpha _{n,m}(t){W_{VLC}}{{\log }_2}(1 + \frac{{p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\sigma _{\textrm{VLC}}^2}})({{Q_m}(t) + \textrm{V}} )} \\ &\quad\quad - {Y_n}(t)({{P_n}(t) - P_{\textrm{VLC}}^{ave}(t)} )\\ &\textrm{s}\textrm{.t}\;\;\;\;\;\;\sum\limits_{m \in {{\cal M}}} {{\alpha _{n,m}}(t)} w_{n,m}^j(t) \le 1,\alpha _{n,m}(t) \in \{{0,1} \},w_{n,m}^j(t) \in \{{0,1} \},\forall m \in {{\cal M}},\\ &\quad\quad p_{n,m}^j(t) \ge 0,\forall m \in {{\cal M}}. \end{aligned}$$

It can be observed that the subcarrier selection is independent among different VLC APs. Therefore, problem (45) can be equivalently decomposed into $|{{{{\cal N}}_{VLC}}} |$ subproblems, each of which can be given as

(46)$$\begin{aligned} &\max \;\;\;\sum\limits_{m \in {{{\cal M}}_n}}^{} {{W_{VLC}}{{\log }_2}(1 + \frac{{p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\sigma _{VLC}^2}})({{Q_m}(t) + \textrm{V}} )} - {Y_n}(t)({{P_n}(t) - P_{\textrm{VLC}}^{ave}(t)} )\\ &\textrm{s}\textrm{.t}\;\;\;\;\;\;\sum\limits_{m \in {{\cal M}}} {{\alpha _{n,m}}(t)} w_{n,m}^j(t) \le 1,\alpha _{n,m}(t) \in \{{0,1} \},w_{n,m}^j(t) \in \{{0,1} \},\forall m \in {{{\cal M}}_n},\\ &\quad\quad p_{n,m}^j(t) \ge 0,\forall m \in {{{\cal M}}_n}. \end{aligned}$$

where ${{{\cal M}}_n}$ is the set of users connected with the VLC AP $n$.

Similar to problem (30), the joint subcarrier and power allocation problem of the VLC APs is also a mixed combinatorial optimization problem. Thus, the Lagrange relaxation method is used to transform the original mixed-integer programming problem into a convex optimization problem

(47)$$\begin{aligned} &\max \;\;\;\sum\limits_{j \in J} {\sum\limits_{m \in {{{\cal M}}_n}} {\tilde{w}_{n,m}^j(t)({\textrm{V + }{Q_m}(t)} ){W_{VLC}}{{\log }_2}(1 + \frac{{p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\tilde{w}_{n,m}^j(t)\sigma _{VLC}^2}})} } \\ &\quad\quad - {Y_n}(\sum\limits_{j \in J} {\sum\limits_{m \in {{\cal M}}} {p_{n,m}^j} } - P_{VLC}^{ave})\\ &\textrm{s}\textrm{.t}\;\;\;\;\;\sum\limits_{m \in {{{\cal M}}_n}} {\tilde{w}_{n,m}^j(t)} \le 1,\forall j \in J,\\ &\quad\quad 0 \le \tilde{w}_{n,m}^j(t) \le 1,\forall m \in {{{\cal M}}_n},\\ &\quad\quad p_{n,m}^j \ge 0. \end{aligned}$$

The objective function is formulated as

(48)$$\begin{aligned} L(P) &= \sum\limits_{j \in J} {\sum\limits_{m \in {{{\cal M}}_n}} {\tilde{w}_{n,m}^j(t)({\textrm{V + }{Q_m}(t)} ){W_{VLC}}{{\log }_2}(1 + \frac{{p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\tilde{w}_{n,m}^j(t)\sigma _{VLC}^2}})} } \\ &\quad - {Y_n}\sum\limits_{j \in J} {\sum\limits_{m \in {{\cal M}}} {p_{n,m}^j} } . \end{aligned}$$

Then, taking the partial derivative of $L(P)$ with respect to $p_{n,m}^j(t)$, the following expression can be obtained as

(49)$$\frac{{\partial L}}{{\partial p_{n,m}^j(t)}} = \frac{{\tilde{w}_{n,m}^j(t)({\textrm{V + }{Q_m}(t)} ){W_{VLC}}{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{(\sigma _{VLC}^2\tilde{w}_{n,m}^j(t) + p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2})\ln 2}} - {Y_n}.$$

According to the KKT conditions, the optimal power allocation is achieved when it meets the following conditions

(50)$$\begin{aligned} \frac{{\partial \left[ {L\textrm{ - }\left( {\sum\limits_{m \in {{{\cal M}}_n}} {\sum\limits_{j \in J} {{\mu_{j,m}}p_{n,m}^j(t)} } } \right)} \right]}}{{\partial p_{n,m}^j(t)}} &= \frac{{\tilde{w}_{n,m}^j(t)({\textrm{V + }{Q_m}(t)} ){W_{VLC}}{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{(\sigma _{VLC}^2\tilde{w}_{n,m}^j(t) + p_{n,m}^j(t){{|{\varepsilon h_{n,m}^j(t)} |}^2})\ln 2}}\\ &\quad - {Y_n} - {\mu _{j,m}} = 0, \end{aligned}$$

(51)$${\mu _{j,m}}p_{n,m}^j(t)\textrm{ = }0,\;\;\;\forall j,m,$$

(52)$${\mu _{j,m}} \ge 0,\;\;\forall j,m.$$

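Following the same argument as for the RF AP, a nonzero optimal power allocation is achieved when ${\mu _{j,m}}\textrm{ = }0$, and the relationship between $p_{n,m}^{j\ast }(t)$ and $\tilde{w}_{n,m}^j(t)$ can be obtained as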

(53)$$p_{n,m}^{j\ast }(t) = {\left[ {\frac{{({\textrm{V + }{Q_m}(t)} ){W_{VLC}}}}{{{Y_n}\ln 2}} - \frac{{\sigma_{VLC}^2}}{{{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}} \right]^ + }\tilde{w}_{n,m}^j(t).$$

Then, the result of the power allocation is used to obtain the subcarrier assignment. After substituting Eq. (53) into Eq. (48), $L(P)$ can be written as

(54)$$L(P) = \sum\limits_{j \in J} {\sum\limits_{m \in {{\cal M}}} {{l_{j,m}}(P)\tilde{w}_{n,m}^j(t)} } ,$$

where

(55)$$\begin{aligned} {l_{j,m}}({\mathbf P}) &= ({\textrm{V + }{Q_m}(t)} ){W_{VLC}}{\log _2}(1\textrm{ + }{\left[ {\frac{{({\textrm{V + }{Q_m}(t)} ){W_{VLC}}}}{{{Y_n}\ln 2}} - \frac{{\sigma_{VLC}^2}}{{{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}} \right]^ + }\frac{{{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}{{\sigma _{VLC}^2}})\;\\ &\quad - {Y_n}{\left[ {\frac{{({\textrm{V + }{Q_m}(t)} ){W_{VLC}}}}{{{Y_n}\ln 2}} - \frac{{\sigma_{VLC}^2}}{{{{|{\varepsilon h_{n,m}^j(t)} |}^2}}}} \right]^ + }.\end{aligned}$$

Then, the $\max \;L(P)$ problem can be equivalently decomposed into J subproblems, one per subcarrier, each of which is formulated as

(56)$${L_j}(P) = \sum\limits_{m \in {{\cal M}}} {{l_{j,m}}(P)\tilde{w}_{n,m}^j(t)} ,$$

For each subcarrier j, the optimal assignment is to select the user with the largest ${l_{j,m}}(P)$. The solution can be obtained as

(57)$$\tilde{w}_{n,m}^{j\ast }(t) = \left\{ \begin{array}{ll} 1, & m = \mathop {\arg \max }\limits_{m^{\prime} \in {{\cal M}_n}} {l_{j,m^{\prime}}}({\mathbf P})\;\textrm{and}\;{l_{j,m}}({\mathbf P}) > 0,\\ 0, & \textrm{otherwise}. \end{array} \right.$$
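The per-subcarrier procedure for one VLC AP, i.e., Eq. (53) followed by Eqs. (55) and (57), can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: `gain` stands for $|\varepsilon h_{n,m}^j(t)|^2$, the function names are hypothetical, and returning `None` for an unassigned subcarrier is a convention chosen here.

```python
import math

def vlc_metric(V, Q, W, Yn, sigma2, gain):
    """Eq. (55): per-user metric l_{j,m}(P) on one subcarrier.

    gain is |eps * h|^2 (assumed name); Yn is the virtual power queue of AP n.
    """
    # Power per unit of relaxed indicator, i.e. the bracket of Eq. (53).
    p_unit = max((V + Q) * W / (Yn * math.log(2)) - sigma2 / gain, 0.0)
    rate_term = (V + Q) * W * math.log2(1.0 + p_unit * gain / sigma2)
    return rate_term - Yn * p_unit

def assign_subcarrier(metrics):
    """Eq. (57): index of the user with the largest positive metric.

    Returns None when no metric is positive, so the subcarrier stays unused.
    """
    best = max(range(len(metrics)), key=lambda m: metrics[m])
    return best if metrics[best] > 0 else None
```

The extra positivity test distinguishes Eq. (57) from Eq. (44): when the bracket of Eq. (53) is clipped to zero for every user, all metrics are zero and the subcarrier carries no power.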

5. Simulation results

In this section, simulations are carried out to evaluate the performance of the proposed DRA algorithm in a room of size $8\,\textrm{m} \times 8\,\textrm{m} \times 3\,\textrm{m}$, as shown in Fig. 1. The receiver of each user is located on a receiving plane at a height of 0.25 m. Four VLC APs are installed on the ceiling in a square layout, with a separation of 2 m between the two closest APs. Each VLC AP has 25 subcarriers. The RF AP is deployed at $(0,0,0)$ of the room and has 4 RBs. The parameters adopted in the simulation can be found in [10], [13], and some key parameters are given in Table 1. For simplicity, the bandwidths of the RF and VLC APs are normalized. The arrival traffic of each user follows a Poisson distribution with mean arrival rate $\lambda $. The arrival rate and the conditions of the RF channel remain constant within one timeslot and change in the next timeslot.
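The Poisson traffic model can be reproduced with the standard library alone. The sketch below uses Knuth's multiplication method to draw per-timeslot arrivals. The text does not state how the arrivals were generated, so the sampling method, the function names, and the seed are assumptions made for illustration.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:      # multiply uniforms until the product drops below e^-lam
        k += 1
        prod *= rng.random()
    return k

def arrivals(num_users, num_slots, lam, seed=0):
    """Arrivals A[t][m] for every timeslot t and user m (illustrative layout)."""
    rng = random.Random(seed)
    return [[poisson_sample(lam, rng) for _ in range(num_users)]
            for _ in range(num_slots)]
```

For example, `arrivals(6, 30, 3.0)` gives one arrival count per user and timeslot for a run with 6 users, 30 timeslots and a mean arrival rate of 3.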

Table 1. Simulation Parameters

In Fig. 2, the impacts of the control parameter V and the average traffic arrival rate $\lambda $ on the tradeoff between the average queue length Q and the average transmission rate R are investigated. Since the bandwidth is normalized, the transmitted data volume is measured in packets. The average communication rate of a single RB of the RF AP and of a single subcarrier of the VLC APs is 3 packets and 5 packets, respectively. Thus, the average arrival rate is limited to 5 packets. The location of each user changes at every timeslot. Figure 2(a) shows Q versus the control parameter V. It can be observed that Q increases as V grows. This is because the DRA algorithm gives less weight to system stability when V increases, as can be seen from Eq. (26). In addition, as the traffic arrival rate increases, the average queue length also increases. Figure 2(b) shows the average transmission rate as a function of the control parameter V. The average transmission rate increases as V grows, because a larger V makes the algorithm focus more on optimizing the transmission rate. It can also be observed that the growth of the average transmission rate slows down as V increases. This is because with a lower value of V the system can cover the arrival rate, whereas with a higher value of V packets are rejected in order to ensure queue stability. According to Little's theorem [8], the average transmission delay is directly proportional to the average queue length. That is, at a given data arrival rate, a smaller average queue length indicates a smaller average transmission delay. Hence, the DRA algorithm achieves a tradeoff between transmission rate and delay, and the value of V determines the operating point of this tradeoff.
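The link between queue length and delay can be made concrete with Little's law, which states that the average queue length equals the arrival rate multiplied by the average delay. The helper below is a small illustrative sketch. The function name is an assumption, and the result is in timeslots only if the arrival rate is given in packets per timeslot.

```python
def average_delay(avg_queue_length, arrival_rate):
    """Little's law: average delay = average queue length / arrival rate."""
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return avg_queue_length / arrival_rate
```

For instance, at an arrival rate of 3 packets per timeslot, an average queue of 6 packets corresponds to an average delay of 2 timeslots, and halving the queue halves the delay.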

Fig. 2. Average queue length and transmission rate versus control parameter V

Figure 3(a) depicts the transmission rates of the 4 VLC APs and 1 RF AP in the system over 30 timeslots. The simulation is carried out with V = 200, an arrival rate of 3 packets, and 6 users distributed indoors. The result is averaged over 500 experiments. All VLC APs achieve an average transmission rate of 12 packets or more at each timeslot, whereas the RF AP provides a lower transmission rate. Furthermore, the transient behavior of the system is analyzed. Assuming that users change position once every 10 timeslots, Fig. 3(b) shows the instantaneous transmission rate of all the APs under the DRA algorithm in one experiment of 200 timeslots. The VLC APs offer higher transmission rates and more stable service than the RF AP when users are not moving. Figure 3(c) and Fig. 3(d) show the instantaneous transmission rate and queue length of the 6 users in one experiment of 30 timeslots for V = 50 and V = 200. It can be observed that, for each user, the system increases the transmission rate in response to an increase of Q. Concretely, the transmission rate of each user grows at the timeslot after the growth of Q. This is because the algorithm makes the system provide a higher transmission rate to reduce the buffered queues, thereby ensuring the stability of the system. Another conclusion is that the queue length in the system is higher for V = 200 than for V = 50, especially for user3, user3 and user6.

Fig. 3. Performance of DRA in indoor RF/VLC heterogeneous network

Figure 4 compares the system performance of different algorithms. The minimum distance (MD) algorithm and the best CSI (BC) algorithm [26,31] are taken as baselines, and the average transmission rate and queue length versus the number of users are presented in Fig. 4(a) and Fig. 4(b). As can be seen in Fig. 4(a), as the number of users in the system increases, both R and Q gradually increase, and the transmission rate of the DRA algorithm is significantly higher than that of the MD-RA and BC-RA algorithms. In Fig. 4(b), the queue length of DRA is the lowest of the three algorithms, which means that the system with the DRA algorithm has better stability than with the MD-RA or BC-RA algorithm. Figure 4(c) and Fig. 4(d) show the average transmission rate and queue length versus the number of users when the outage probability is considered. Let $op$ denote the outage probability of the LOS link. A uniform random number is generated between 0 and 1: if the number is larger than $op$, the LOS link of the VLC APs is available; otherwise, it is not available. The proposed DRA algorithm achieves a higher transmission rate and a lower queue length when the outage probability is considered. Overall, the results show that DRA provides better system transmission rate and system stability under general conditions than the MD-RA and BC-RA algorithms.
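The LOS availability test described above is a single Bernoulli draw. A minimal sketch is shown below, with `op` taken as the outage probability, so that the LOS link is available with probability $1 - op$. The function names and the seeded generator are assumptions introduced for illustration.

```python
import random

def los_available(op, rng):
    """Return True if the LOS link is available for this draw.

    A uniform number in [0, 1) is compared with op: larger than op means
    the link is available, so availability occurs with probability 1 - op.
    """
    return rng.random() > op

def empirical_availability(op, trials, seed=0):
    """Fraction of draws in which the LOS link is available."""
    rng = random.Random(seed)
    return sum(los_available(op, rng) for _ in range(trials)) / trials
```

With `op = 0.3`, roughly 70% of the draws leave the VLC link usable, and in the remaining draws a user has to rely on the RF AP.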

Fig. 4. Comparison of average queue length and transmission rate versus number of users

6. Conclusions

In this paper, dynamic resource optimization is studied for a hybrid RF/VLC network, and a novel and fast DRA algorithm is proposed. In DRA, network resources are optimized under the constraint of system stability to maximize the system transmission rate. Specifically, a stochastic optimization problem is formulated to maximize the time-averaged transmission rate subject to the network stability constraint and the average power budget of the hybrid RF/VLC network. Then, by using the Lyapunov optimization technique, the DRA algorithm obtains the optimal AP assignment and RA at each timeslot. In particular, the DRA algorithm divides the problem into three independent subproblems, and no iterations are required. The simulations show that the proposed DRA algorithm can significantly improve the transmission rate while stabilizing the system.

Funding

National Natural Science Foundation of China (62071365); Equipment Scientific Research Project (GK20202A050024).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. S. Jaffry, R. Hussain, X. Gui, and S. F. Hasan, “A Comprehensive Survey on Moving Networks,” IEEE Commun. Surv. Tutorials 23(1), 110–136 (2021).

2. B. Lin, Q. Lai, Z. Ghassemlooy, and X. Tang, “A Machine Learning Based Signal Demodulator in NOMA-VLC,” J. Lightwave Technol. 39(10), 3081–3087 (2021).

3. J. Deng, X. Jin, X. Ma, M. Jin, C. Gong, and Z. Xu, “Graph-based Multi-user Scheduling for Indoor Cooperative Visible Light Transmission,” Opt. Express 28(11), 15984–16002 (2020).

4. A. R. Ndjiongue, T. M. N. Ngatched, O. A. Dobre, and H. Haas, “Re-Configurable Intelligent Surface-Based VLC Receivers Using Tunable Liquid-Crystals: The Concept,” J. Lightwave Technol. 39(10), 3193–3200 (2021).

5. M. S. Demir and M. Uysal, “A Cross-Layer Design for Dynamic Resource Management of VLC Networks,” IEEE Trans. Commun. 69(3), 1858–1867 (2021).

6. C. Hu, C. Chen, M. Guo, Y. Yang, J. Luo, and L. Chen, “Optical-spatial-summing-based NOMA with Fine-grained Power Allocation for VLC-enabled IoT Applications,” Opt. Lett. 45(17), 4927–4930 (2020).

7. Y. Yang, C. Chen, P. Du, X. Deng, J. Luo, W.-D. Zhong, and L. Chen, “Low Complexity OFDM VLC System Enabled by Spatial Summing Modulation,” Opt. Express 27(21), 30788–30795 (2019).

8. J. C. Valencia-Estrada, J. García-Marquez, L. Chassagne, and S. Topsu, “Catadioptric Interfaces for Designing VLC Antennae,” Appl. Opt. 56(27), 7559–7566 (2017).

9. M. Nassiri, G. Baghersalimi, and Z. Ghassemlooy, “Optical OFDM Based on the Fractional Fourier Transform for an Indoor VLC System,” Appl. Opt. 60(9), 2664–2671 (2021).

10. W. Wu, F. Zhou, and Q. Yang, “Adaptive Network Resource Optimization for Heterogeneous VLC/RF Wireless Networks,” IEEE Trans. Commun. 66(11), 5568–5581 (2018).

11. D. A. Basnayaka and H. Haas, “Design and Analysis of a Hybrid Radio Frequency and Visible Light Communication System,” IEEE Trans. Commun. 65(10), 1 (2017).

12. J. Chen and Z. Wang, “Topology Control in Hybrid VLC/RF Vehicular Ad-Hoc Network,” IEEE T. Wirel. Commun. 19(3), 1965–1976 (2020).

13. X. Wu, M. D. Soltani, L. Zhou, M. Safari, and H. Haas, “Hybrid LiFi and WiFi Networks: A Survey,” IEEE Commun. Surv. Tutorials 23(2), 1398–1420 (2021).

14. M. Amjad, H. K. Qureshi, S. A. Hassan, A. Ahmad, and S. Jangsher, “Optimization of MAC Frame Slots and Power in Hybrid VLC/RF Networks,” IEEE Access 8, 21653–21664 (2020).

15. J. Cosmas, B. Meunier, K. Ali, N. Jawad, M. Salih, Y. Zhang, Z. Hadad, B. Globen, H. Gokmen, S. Malkos, M. Cakan, H. Koumaras, A. Kourtis, C. Sakkas, D. Negru, M. Lacaud, M. Ran, E. Ran, J. Garcia, W. Li, L. K. Huang, R. Zetik, K. Cabaj, W. Mazurczyk, X. Zhang, and A. Kapovits, “A 5G Radio-Light SDN Architecture for Wireless and Mobile Network Access in Buildings,” in 2018 IEEE 5G World Forum (5GWF) (2018), pp. 135–140.

16. J. Cosmas, B. Meunier, K. Ali, N. Jawad, H. Y. Meng, F. Goutagneux, E. Legale, M. Satta, P. Jay, X. Zhang, C. Huang, J. Garcia, M. Negru, Y. Zhang, T. Kourtis, C. Koumaras, C. Sakkas, L. K. Huang, R. Zetik, K. Cabaj, W. Mazurczyk, M. E. Cakan, and A. Kapovits, “5G Internet of Radio Light Services for Musée de la Carte à Jouer,” in 2018 Global LIFI Congress (GLC) (2018), pp. 1–6.

17. J. Al-Khori, G. Nauryzbayev, M. M. Abdallah, and M. Hamdi, “Joint Beamforming Design and Power Minimization for Friendly Jamming Relaying Hybrid RF/VLC Systems,” IEEE Photonics J. 11(2), 1–18 (2019).

18. M. Obeed, A. M. Salhab, M. S. Alouini, and S. A. Zummo, “On Optimizing VLC Networks for Downlink Multi-User Transmission: A Survey,” IEEE Commun. Surv. Tutorials 21(3), 2947–2976 (2019).

19. S. Ma, F. Zhang, H. Li, F. Zhou, M. S. Alouini, and S. Li, “Aggregated VLC-RF Systems: Achievable Rates, Optimal Power Allocation, and Energy Efficiency,” IEEE T. Wirel. Commun. 19(11), 7265–7278 (2020).

20. M. Hammouda, S. Akın, A. M. Vegni, H. Haas, and J. Peissig, “Link Selection in Hybrid RF/VLC Systems Under Statistical Queueing Constraints,” IEEE T. Wirel. Commun. 17(4), 2738–2754 (2018).

21. Z. Du, C. Wang, Y. Sun, and G. Wu, “Context-Aware Indoor VLC/RF Heterogeneous Network Selection: Reinforcement Learning With Knowledge Transfer,” IEEE Access 6, 33275–33284 (2018).

22. X. Wu, M. Safari, and H. Haas, “Access Point Selection for Hybrid Li-Fi and Wi-Fi Networks,” IEEE Trans. Commun. 65(12), 5375–5385 (2017).

23. X. Wu and H. Haas, “Mobility-Aware Load Balancing for Hybrid LiFi and WiFi Networks,” J. Opt. Commun. Netw. 11(12), 588–597 (2019).

24. M. Kashef, M. Ismail, M. Abdallah, K. A. Qaraqe, and E. Serpedin, “Energy Efficient Resource Allocation for Mixed RF/VLC Heterogeneous Wireless Networks,” IEEE J. Sel. Area. Comm. 34(4), 883–893 (2016).

25. H. Zhang, N. Liu, K. Long, J. Cheng, V. C. M. Leung, and L. Hanzo, “Energy Efficient Subchannel and Power Allocation for Software-defined Heterogeneous VLC and RF Networks,” IEEE J. Sel. Area. Comm. 36(3), 658–670 (2018).

26. M. Obeed, A. M. Salhab, S. A. Zummo, and M. Alouini, “Joint Optimization of Power Allocation and Load Balancing for Hybrid VLC/RF Networks,” J. Opt. Commun. Netw. 10(5), 553–562 (2018).

27. Y. S. M. Pratama and K. W. Choi, “Bandwidth Aggregation Protocol and Throughput-Optimal Scheduler for Hybrid RF and Visible Light Communication Systems,” IEEE Access 6, 32173–32187 (2018).

28. S. Shrivastava, B. Chen, C. Chen, H. Wang, and M. Dai, “Deep Q-Network Learning Based Downlink Resource Allocation for Hybrid RF/VLC Systems,” IEEE Access 8, 149412–149434 (2020).

29. J. Kong, Z. Y. Wu, M. Ismail, E. Serpedin, and K. A. Qaraqe, “Q-Learning Based Two-Timescale Power Allocation for Multi-Homing Hybrid RF/VLC Networks,” IEEE Wirel. Commun. Le. 9(4), 443–447 (2020).

30. M. J. Neely, “Stochastic Network Optimization with Application to Communication and Queueing Systems,” Synth. Lect. Commun. 3(1), 1–211 (2010).

31. S. A. AlQahtani and M. Alhassany, “Comparing Different LTE Scheduling Schemes,” in 2013 9th International Wireless Communications and Mobile Computing Conference (IWCMC) (2013), pp. 264–269.

Parameter	Value
Semi-angle at half power	$70^{\circ}$
Number of luminaires	$5$
Maximum power of luminaire	$4\ \textrm{W}$
Field of view	$60^{\circ}$
Physical area of a PD	$1\ \textrm{cm}^{2}$
Gain of optical filter	$1$
Refractive index	$1.5$
Photodetector responsivity	$0.52\ \textrm{A/W}$

Optics Express