Wireless Powered Cooperation-Assisted Mobile Edge Computing

This paper studies a mobile edge computing (MEC) system in which two mobile devices are energized by wireless power transfer (WPT) from an access point (AP) and can offload part or all of their computation-intensive latency-critical tasks to the AP, which is connected with an MEC server or an edge cloud. This harvest-then-offload protocol operates in an optimized time-division manner. To overcome the doubly near-far effect for the farther mobile device, cooperative communication in the form of relaying via the nearer mobile device is considered for offloading. Our aim is to minimize the AP's total transmit energy subject to the constraints of the computational tasks. We show that the optimization is equivalent to a min-max problem, which can be optimally solved by a two-phase method. The first phase obtains the optimal offloading decisions by solving a sum-energy-saving maximization problem for a given energy transmit power. In the second phase, the minimum energy transmit power is obtained by a bisection search. Numerical results demonstrate that the optimized MEC system utilizing cooperation achieves significant performance improvement over systems without cooperation.

Mobile edge computing (MEC) has emerged as a promising concept that promotes the use of cloud-computing facilities at the edge of mobile networks by integrating MEC servers at the wireless access points (APs). This paradigm of computation offloading is motivated by ultralow latency, high bandwidth, and real-time access to radio network information, and is widely considered an effective means to liberate mobile devices from heavy computation workloads, e.g., [1]-[3].

A. Prior Works
MEC, with proximate access, is a promising complementary counterpart of centralized mobile cloud computing. The cross-disciplinary nature of MEC gives resource management an important role in achieving energy-efficient or delay-optimal MEC. Recent years have witnessed encouraging progress on this topic for both single-user [4]-[8] and multiuser [9]-[14] MEC systems. For single-user MEC systems, energy-optimal mobile cloud computing under a stochastic wireless channel was considered in [4]. Later in [5], a dynamic offloading scheme with link selection was proposed to improve the energy efficiency. Another dynamic offloading scheme with energy harvesting was addressed in [7] to reduce the delay cost. In [8], a Markov decision process approach was adopted to handle a delay minimization problem. In multiuser MEC systems, joint radio-and-computational resource management becomes more complicated. A multi-cell MEC offloading system was considered in [9] in order to minimize the overall energy consumption of users. In [10], the distributed offloading decision-making problem was formulated as a multiuser computation offloading game. Optimal energy-efficient resource allocation for multiple users was addressed in [11] based on time-division multiple-access (TDMA) and orthogonal frequency-division multiple-access (OFDMA) systems. The cooperation among clouds was investigated in [12] to maximize the revenues of clouds. In [14], stochastic resource management resorting to Lyapunov optimization was considered to minimize the power consumption.
Taking full advantage of the powerful computational resources at the edge, nonetheless, faces several challenges. Insufficient power supply is one major limitation for battery-based devices, and mobile applications will be terminated if the battery runs out. It therefore makes sense to leverage the technology of wireless power transfer (WPT) so that mobile devices are not power-limited by their batteries but can be energized remotely, e.g., [15]-[19]. WPT, particularly in the form of wireless powered communication networks (WPCNs) [17]-[19], has recently been considered as an important paradigm to provide genuine sustainability for mobile communications. In addition, several works have seen the possible synergy of integrating MEC with WPT [6], [13]. An interesting work in [6] considered a wireless powered single-user MEC system, in which binary offloading was investigated, i.e., either local computing only or full offloading, so as to maximize the computing probability. More recently in [13], an energy-efficient wireless powered multiuser MEC system combined with a multi-antenna AP was considered. The optimal transmit energy beamforming of the AP, offloading decisions and resource allocation for minimizing the energy consumption at the AP were obtained.
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
However, WPCNs are susceptible to the so-called "doubly near-far" effect, which occurs because a user farther from the AP harvests less energy and is also required to communicate over a longer distance [17], [18]. User cooperation has been extensively investigated in wireless communications for its ability to enhance the data rate under unfavourable channel conditions, e.g., [18]-[22]. In particular, the efforts in [18]-[20] focused on the effect of cooperation between near-far users in order to improve the performance of WPCNs. Most recently, user cooperation was also considered in MEC [23], where a three-node MEC system was studied to exploit joint computation and communication cooperation for reducing the energy consumption.

B. Our Contributions
In this paper, we study a wireless powered MEC system that completes the computation-intensive latency-critical tasks of two near-far users (we use "device" and "user" interchangeably in this paper) by exploiting cooperative communication, where the entire process is solely powered by the AP. Note that as far as green computing is concerned, minimizing the carbon footprint of the AP is the priority in WPT-MEC systems. Hence, our objective is to minimize the total transmit energy of the AP with jointly optimal power and time allocation, so as to fully explore the benefits of user cooperation in enhancing the performance of the WPT-MEC system. Our contributions are summarized as follows:
• A harvest-then-offload protocol with a block-based time division mechanism, which leverages cooperative communications to overcome the doubly near-far effect in WPT-MEC systems, is proposed.
• We first formulate the AP's transmit energy minimization (APTEM) problem, and then transform it into an equivalent min-max optimization problem (which also turns out to be equivalent to the AP's transmit power minimization (APTPM) problem). The problem is optimally tackled by a two-phase approach. In the first phase, the inner sum-energy-saving maximization (SESM) problem based on a given energy transmit power is solved by the Lagrangian method, where the optimal offloading decisions with joint power and time allocation are found in closed or semi-closed form. Then in the second phase, a simple bisection search is adopted to obtain the minimum energy transmit power based on the solution of the SESM problem, resulting in the joint-optimal solution.
• Further, we prove that the optimal offloaded data sizes of the two users have threshold-based structures in relation to some offloading priority indicators, and that the thresholds are determined by the users' energy harvesting potentials, reflecting the effect of user cooperation. It is also verified that the optimal WPT time duration is a monotonic non-decreasing function of the AP's transmit power, which further shows the equivalence between the APTEM and APTPM problems. Moreover, we prove that at least one user makes no energy saving when the minimum energy transmit power is employed at the AP.
• A low-complexity algorithm is proposed to solve the APTEM problem, and we show that its complexity is at most of the order of $\mathcal{O}(1)\ln(1/\sigma)\ln(1/\delta)$, where $\sigma, \delta > 0$ respectively denote the computational accuracies of the two tiers of one-dimensional search in the algorithm.
• Simulation results verify the theoretical analysis of the proposed cooperative computation offloading scheme by comparison with two baselines. It is shown that the proposed scheme not only achieves significant performance improvement, but is also effective in handling computation-intensive latency-critical tasks and resisting the doubly near-far effect in WPCNs.
The rest of this paper is organized as follows. In Section II, we introduce the system model and the problem formulation. The proposed two-phase method for energy-efficient resource allocation with user cooperation is presented in Section III. Section IV provides the simulation results. Some possible extensions are discussed in Section V, and we conclude the paper in Section VI.

II. SYSTEM MODEL AND PROBLEM FORMULATION
Consider the wireless powered MEC system shown in Fig. 1, which consists of a single-antenna AP (with an integrated MEC server) and two single-antenna mobile devices, denoted by $D_1$ and $D_2$, both operating in the same frequency band and each having a computation-intensive latency-critical task to be completed. A block-based TDMA structure is adopted where each block has a duration of $T$ seconds. During each block, the AP energizes the mobile devices in the downlink via WPT. Using the harvested energy, the two devices accomplish their computation tasks in a partial offloading fashion [2], where the task-input bits are bit-wise independent and can be arbitrarily divided to facilitate parallel trade-offs between local computing at the mobile devices and computation offloading to the MEC server. After the AP computes the offloaded data, it sends the results back to the devices. Note that local computing and downlink WPT can be performed simultaneously, while wireless communications (for offloading) and WPT are non-overlapping in time, considering half-duplex transmission for both users. As a result, the harvest-then-transmit protocol proposed in [17] is employed in our model but for wireless powered computation offloading, which we refer to as the harvest-then-offload protocol.
Assuming that the AP has perfect knowledge of all the channels and task-related parameters, which can be obtained by feedback, the AP is designed to make offloading decisions and allocate both radio and computational resources optimally. Our aim is to minimize the total transmit energy of the AP for completing the computation tasks of the two users.

A. Computation Task Model
Each user $D_i$ ($i \in \{1, 2\}$) has a computation-intensive and latency-critical task in each block, fully characterized by a positive parameter tuple $\langle I_i, C_i, O_i, T_i \rangle$, where $I_i$ denotes the size (in bits) of the computation input data (e.g., the program codes and input parameters), $C_i$ is the amount of computational resource required for computing 1 bit of input data (i.e., the number of CPU cycles required), $O_i$ is the output data size, which is proportional to but much less than $I_i$, and $T_i$ is the maximum tolerable latency. A mobile user can apply the methods (e.g., call graph analysis) in [24] and [25] to obtain the information of $I_i$ and $C_i$. Note that this model allows rich task modelling flexibility in practice and can be easily extended to consider other kinds of resources by introducing more parameters in the tuple. In this paper, we assume that the maximum tolerable latency for the two users is one block length, i.e., $T_1 = T_2 = T$.

B. User Cooperation Model for Computation Offloading
For computation-intensive tasks with large input data size $I_i$, it would be difficult to rely upon local computing to satisfy the latency constraint, and thus computation offloading may be necessary. Considering the doubly near-far effect in our considered WPCN, cooperation amongst near-far users during offloading will help to improve the computation performance. Without loss of generality, it is assumed that $D_2$ is nearer to the AP than $D_1$, and we denote the distances between the AP and $D_1$, the AP and $D_2$, and $D_1$ and $D_2$ as $d_1$, $d_2$, and $d_{12}$, respectively, with $d_2 \le d_1$. We also assume that $d_{12} \le d_1$, and therefore it will be easier for $D_2$ than for the AP to decode the information sent by $D_1$, which makes such cooperative communications useful. For an arbitrary single block, the time division structure is shown in Fig. 2. During the first period $t_0$, the AP broadcasts wireless power to both $D_1$ and $D_2$ in the downlink with transmit power $P_0$. Assume that the two devices have enough battery storage, and thus the energy harvested by each device during the WPT period is given by
$$E_i = \nu_i g_i P_0 t_0, \quad i \in \{1, 2\}, \tag{1}$$
where $g_i$ is the downlink channel power gain from the AP to $D_i$ and $0 < \nu_i \le 1$ is the energy conversion efficiency for $D_i$. Note that no source of energy other than the WPT of the AP is available to carry out the computation tasks. After the WPT period, $D_1$ transmits its input-data-bearing information with average power $p_1$ from its harvested energy during the subsequent period $t_1$, and both the AP and $D_2$ decode their respective received signals from $D_1$. To overcome the doubly near-far effect, during the remaining time of the block, the nearer user $D_2$ first relays the farther user $D_1$'s information with average power $p_{21}$ over a period $t_{21}$ and then transmits its own input-data-bearing information to the AP with average power $p_{22}$ over a period $t_{22}$, all using its harvested energy.
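We denote the time allocation and power allocation vectors as $\mathbf{t} = [t_0, t_1, t_{21}, t_{22}]$ and $\mathbf{p} = [p_1, p_{21}, p_{22}]$, respectively. According to the results (Theorems 1-5) in [22], with a given pair of $\mathbf{t}$ and $\mathbf{p}$, the offloaded data size of $D_1$ for remote computation at the AP should be the smaller value between the decoded data sizes at the AP and $D_2$, i.e.,
$$L_1(\mathbf{t}, \mathbf{p}) = \min\{L_{1,1}(\mathbf{t}, \mathbf{p}) + L_{1,2}(\mathbf{t}, \mathbf{p}),\ L_{1,12}(\mathbf{t}, \mathbf{p})\}, \tag{2}$$
where $L_{1,1}(\mathbf{t}, \mathbf{p})$, $L_{1,2}(\mathbf{t}, \mathbf{p})$ and $L_{1,12}(\mathbf{t}, \mathbf{p})$ denote $D_1$'s offloaded data size from $D_1$ to the AP, from $D_2$ to the AP, and from $D_1$ to $D_2$, respectively, which are given by
$$L_{1,1}(\mathbf{t}, \mathbf{p}) = t_1 r_{1,1}(\mathbf{p}) = t_1 B \log_2\left(1 + \frac{p_1 h_1}{N_0}\right), \tag{3}$$
$$L_{1,2}(\mathbf{t}, \mathbf{p}) = t_{21} r_{1,2}(\mathbf{p}) = t_{21} B \log_2\left(1 + \frac{p_{21} h_2}{N_0}\right), \tag{4}$$
$$L_{1,12}(\mathbf{t}, \mathbf{p}) = t_1 r_{1,12}(\mathbf{p}) = t_1 B \log_2\left(1 + \frac{p_1 h_{12}}{N_2}\right), \tag{5}$$
where $r_{1,1}(\mathbf{p})$, $r_{1,2}(\mathbf{p})$, and $r_{1,12}(\mathbf{p})$ are the transmission rates, taken as the achievable channel rates, for offloading $D_1$'s input data. In the above expressions, $h_1$ and $h_2$ are the uplink channel power gains from $D_1$ and $D_2$ to the AP, respectively, and $h_{12}$ is the device-to-device channel power gain from $D_1$ to $D_2$. Also, $B$ is the channel bandwidth, and $N_0$ and $N_2$ are the receiver noise powers at the AP and $D_2$, respectively; we further assume that $N_2 = N_0$ without loss of generality. Similarly, the offloaded data size of $D_2$ for computing at the AP is described as
$$L_2(\mathbf{t}, \mathbf{p}) = t_{22} r_2(\mathbf{p}) = t_{22} B \log_2\left(1 + \frac{p_{22} h_2}{N_0}\right), \tag{6}$$
where $r_2(\mathbf{p})$ denotes the transmission rate for offloading $D_2$'s input data. According to the task model, the offloaded data size of each user should not be greater than its corresponding input data size, i.e., $L_i(\mathbf{t}, \mathbf{p}) \le I_i$, for $i \in \{1, 2\}$.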
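As an illustration of how the cooperative offloaded data size of the farther user is evaluated, the following minimal Python sketch computes the three link volumes and takes the smaller of the data decoded at the AP and at the relay. It assumes each rate takes the Shannon form B log2(1 + p h / N); the function names and the toy numbers in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def rate(power, gain, noise, bandwidth):
    """Achievable rate in bit/s, assuming the Shannon form B*log2(1 + p*h/N)."""
    return bandwidth * math.log2(1.0 + power * gain / noise)


def offloaded_bits_d1(t1, t21, p1, p21, h1, h2, h12, n0, n2, bandwidth):
    """Bits of D1's task delivered to the AP with relaying by D2 (hypothetical helper)."""
    l_direct = t1 * rate(p1, h1, n0, bandwidth)      # D1 -> AP during t1
    l_relay = t21 * rate(p21, h2, n0, bandwidth)     # D2 -> AP during t21
    l_d2d = t1 * rate(p1, h12, n2, bandwidth)        # D1 -> D2 during t1
    # The AP cannot receive more than D2 was able to decode from D1.
    return min(l_direct + l_relay, l_d2d)


# Toy usage: unit bandwidth and noise so that the rates are 1, 2 and 3 bit/s.
bits = offloaded_bits_d1(t1=2.0, t21=3.0, p1=1.0, p21=1.0,
                         h1=1.0, h2=3.0, h12=7.0, n0=1.0, n2=1.0, bandwidth=1.0)
print(bits)
```

In the toy call the device-to-device link is the bottleneck, which is the situation the assumption $d_{12} \le d_1$ is meant to make less likely in practice.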
In practice, the MEC-integrated AP will provide sufficient CPU capability and transmit power, while the computed results are usually much smaller than the input data, i.e., $O_i \ll I_i$. Hence, the decoding and computation time spent at the AP as well as the time consumed for delivering the computed results are negligible. For the nearer user $D_2$, the decoding time for $D_1$'s information is also negligible compared with the uplink offloading time for both $D_1$'s and $D_2$'s information. For these reasons, we only consider the WPT time and the uplink offloading time as the total latency of the WPT-MEC system, and thus we obtain a latency constraint given by
$$t_0 + t_1 + t_{21} + t_{22} \le T. \tag{7}$$
For each user, the energy required to receive its computed results from the AP is also considered negligible. Therefore, the energy consumption of $D_1$ and $D_2$ for computation offloading equals the energy consumed for wireless transmissions, given by
$$E_{\mathrm{off},1}(\mathbf{t}, \mathbf{p}) = p_1 t_1, \qquad E_{\mathrm{off},2}(\mathbf{t}, \mathbf{p}) = p_{21} t_{21} + p_{22} t_{22}.$$
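The bookkeeping for one block can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the harvested energy is the product of conversion efficiency, downlink gain, transmit power and WPT duration, and that offloading energy is transmit power times airtime; all function names and numbers are hypothetical.

```python
def harvested_energy(nu, g, p0, t0):
    """Energy (J) harvested during the WPT period, assuming E = nu * g * P0 * t0."""
    return nu * g * p0 * t0


def offload_energy_d1(p1, t1):
    """Transmit energy (J) spent by D1 on its own offloading."""
    return p1 * t1


def offload_energy_d2(p21, t21, p22, t22):
    """Transmit energy (J) spent by D2 on relaying plus its own offloading."""
    return p21 * t21 + p22 * t22


def latency_ok(t0, t1, t21, t22, block_s, eps=1e-12):
    """True if WPT and all uplink phases fit within one block of length block_s."""
    return t0 + t1 + t21 + t22 <= block_s + eps


# Toy usage with hypothetical values.
e1 = harvested_energy(nu=0.8, g=1e-5, p0=2.0, t0=0.1)
print(e1, latency_ok(0.1, 0.04, 0.03, 0.03, block_s=0.2))
```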

C. Local Computing Model
Given a pair of time and power allocation vectors $(\mathbf{t}, \mathbf{p})$, the offloaded data sizes $\{L_i(\mathbf{t}, \mathbf{p})\}$ are known, and hence the remaining input data of the corresponding computation tasks, i.e., $I_i - L_i(\mathbf{t}, \mathbf{p})$, should be computed locally at $D_i$, $i \in \{1, 2\}$. For local computing, we assume that the CPU frequency is fixed as $f_i$ for $D_i$, which means that the two mobile devices have limited computing resources. In order to satisfy the latency constraint, i.e., $(I_i - L_i(\mathbf{t}, \mathbf{p})) C_i / f_i \le T$, the offloaded data for $D_i$ should have a minimum size of
$$M_i^+ = \max\left\{ I_i - \frac{f_i T}{C_i},\ 0 \right\}.$$
Under the assumption of a low CPU voltage, which normally holds for low-power devices, the energy consumption per CPU cycle for local computing at $D_i$ can be denoted as
$$Q_i = \kappa_i f_i^2,$$
where $\kappa_i$ is the effective capacitance coefficient that depends on the chip architecture. Hence, the energy consumption of $D_i$ for local computing can be expressed as
$$E_{\mathrm{loc},i}(\mathbf{t}, \mathbf{p}) = \left(I_i - L_i(\mathbf{t}, \mathbf{p})\right) C_i Q_i.$$
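The local computing model can be illustrated with a short Python sketch. It assumes the common low-voltage CPU model in which the energy per cycle is the effective capacitance coefficient times the squared clock frequency; the function names and the toy values are hypothetical.

```python
def min_offload_bits(input_bits, cycles_per_bit, cpu_hz, block_s):
    """Smallest number of bits that must be offloaded so that the rest
    can be computed locally within one block of length block_s."""
    local_capacity_bits = cpu_hz * block_s / cycles_per_bit
    return max(input_bits - local_capacity_bits, 0.0)


def local_energy(input_bits, offloaded_bits, cycles_per_bit, cpu_hz, kappa):
    """Energy (J) for computing the non-offloaded bits locally,
    assuming energy per cycle = kappa * cpu_hz**2."""
    energy_per_cycle = kappa * cpu_hz ** 2
    return (input_bits - offloaded_bits) * cycles_per_bit * energy_per_cycle


# Toy usage: a 1000-bit task, 10 cycles/bit, a 100 Hz CPU and a 1 s block.
m = min_offload_bits(1000, 10, 100, 1.0)
print(m, local_energy(1000, m, 10, 100, 1e-6))
```

The first function makes the latency coupling explicit: a slow CPU or a short block forces a larger share of the task to be offloaded regardless of the channel quality.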

D. Problem Formulation
Based on the model, the energy saving of $D_i$, denoted by $E_{s,i}(\mathbf{t}, \mathbf{p})$, is the energy harvested by $D_i$ minus the energy it consumes for computation offloading and local computing. Furthermore, the APTEM problem for minimizing the AP's transmit energy can be formulated as problem (P1), where (11c) and (11d) represent the energy harvesting constraints for $D_1$ and $D_2$, respectively. Note that problem (P1) is a nonconvex optimization problem in this form because of the expressions of $L_1(\mathbf{t}, \mathbf{p})$ and $L_2(\mathbf{t}, \mathbf{p})$, and the product $P_0 t_0$. Problem (P1) can be equivalently transformed into a min-max problem (P2). However, problem (P2) is still nonconvex in this form. In order to make this problem solvable and facilitate further analysis, we propose a two-phase method. In the first phase, we solve the inner sub-problem with a given $P_0$, where the sum-energy-saving (SES), i.e., $E_{s,1}(\mathbf{t}, \mathbf{p}) + E_{s,2}(\mathbf{t}, \mathbf{p})$, is maximized under the constraints in (P1). This sub-problem is referred to as the SESM problem (P3), through which the optimal time and power allocation corresponding to the given $P_0$ can be obtained. In the second phase, we find the minimum $P_0$ by a bisection search. In the following section, we present the details of the problem-solving process with the two-phase method.

III. THE TWO-PHASE METHOD FOR COOPERATIVE MEC
Here, we focus on the two-phase method for joint power and time allocation for cooperative MEC. The first phase, which operates with a given $P_0$, is presented in Sections III-A to III-D, where the optimal offloaded data size, the power allocation of $D_1$ (in semi-closed form) and $D_2$ (in closed form) as well as the optimal time allocation are obtained for each sub-problem. Besides, the equivalence between problems (P1) and (P2) is given in Section III-E. Finally, the second phase is described in Section III-F, where the minimum $P_0$ is obtained.

A. Transforming the SESM Problem (P3) Into Convex
To make the nonconvex SESM problem (P3) solvable with a given $P_0$, we first introduce the variables $q_1 = \frac{p_1 t_1}{\nu_1 g_1 P_0}$ and $q_{21} = \frac{p_{21} t_{21}}{\nu_2 g_2 P_0}$. By denoting $\mathbf{q} = [q_1, q_{21}]$, the quantities $L_{1,1}(\mathbf{t}, \mathbf{p})$, $L_{1,2}(\mathbf{t}, \mathbf{p})$ and $L_{1,12}(\mathbf{t}, \mathbf{p})$ described in (3)-(5) can then be re-expressed as functions of $\mathbf{t}$ and $\mathbf{q}$ as
$$L_{1,1}(\mathbf{t}, \mathbf{q}) = t_1 B \log_2\left(1 + \beta_1 P_0 \frac{q_1}{t_1}\right),$$
$$L_{1,2}(\mathbf{t}, \mathbf{q}) = t_{21} B \log_2\left(1 + \beta_2 P_0 \frac{q_{21}}{t_{21}}\right),$$
$$L_{1,12}(\mathbf{t}, \mathbf{q}) = t_1 B \log_2\left(1 + \beta_{12} P_0 \frac{q_1}{t_1}\right),$$
where $\beta_1 = \frac{\nu_1 g_1 h_1}{N_0}$, $\beta_2 = \frac{\nu_2 g_2 h_2}{N_0}$, and $\beta_{12} = \frac{\nu_1 g_1 h_{12}}{N_2}$. Note that the above three functions equal 0 when $t_1 = 0$, $t_{21} = 0$ and $t_1 = 0$, respectively. Using the property of the perspective function [26], it is easily verified that $L_{1,1}(\mathbf{t}, \mathbf{q})$, $L_{1,2}(\mathbf{t}, \mathbf{q})$ and $L_{1,12}(\mathbf{t}, \mathbf{q})$ are all jointly concave functions of $\mathbf{t}$ and $\mathbf{q}$. Besides, they are monotonically increasing in each element of $(t_1, q_1)$, $(t_{21}, q_{21})$ and $(t_1, q_1)$, respectively. Next, we introduce a new variable $L_1$ to replace $L_1(\mathbf{t}, \mathbf{p})$ in problem (P3) with two additional convex constraints, $L_{1,1}(\mathbf{t}, \mathbf{q}) + L_{1,2}(\mathbf{t}, \mathbf{q}) \ge L_1$ and $L_{1,12}(\mathbf{t}, \mathbf{q}) \ge L_1$. Thus, the expression of the energy saving for $D_1$ in the objective function of (P3) becomes concave, and its corresponding constraints become convex. However, even though a similar variable-changing method can be used for $D_2$, the resulting problem remains nonconvex. To tackle this, we redefine the offloaded data size of $D_2$ as an independent variable $L_2$, and then, by defining a function $g(x)$ for $x \ge 0$, the offloading power $p_{22}$ can be described as a function of $L_2$ and $t_{22}$ according to (6). Hence, the energy savings for $D_1$ and $D_2$ with a given $P_0$ can be rewritten in terms of $(\mathbf{t}, \mathbf{q}, L_1, L_2)$, and the SESM problem (P3) can be equivalently reformulated as another SESM problem (P4). As $g(x)$ is a convex function, its perspective function $t_{22}\, g\!\left(\frac{L_2}{t_{22}}\right)$ is a jointly convex function of $t_{22}$ and $L_2$, considering both the cases of $t_{22} > 0$ and $t_{22} = 0$ [26]. Therefore, the objective function is concave and all the constraints are convex, making (P4) a convex optimization problem.
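The effect of the change of variables can be checked numerically. The Python sketch below assumes the Shannon-type rate B log2(1 + p h / N) and the normalization q_1 = p_1 t_1 / (ν_1 g_1 P_0); it verifies that both parameterizations give the same data volume and spot-checks joint concavity at one midpoint. All function names and numerical values are hypothetical, and the midpoint test is a sanity check, not a proof.

```python
import math


def l11_from_power(t1, p1, h1, n0, bandwidth):
    """Data volume (bits) of the D1 -> AP link in the (t, p) parameterization."""
    if t1 == 0:
        return 0.0
    return t1 * bandwidth * math.log2(1.0 + p1 * h1 / n0)


def l11_from_q(t1, q1, beta1, p0, bandwidth):
    """Same volume in the (t, q) parameterization: a perspective of log2(1 + c*x)."""
    if t1 == 0:
        return 0.0
    return t1 * bandwidth * math.log2(1.0 + beta1 * p0 * q1 / t1)


# Hypothetical parameters.
nu1, g1, h1, n0, p0, bandwidth = 0.8, 1e-5, 1e-5, 1e-9, 1.0, 1e7
t1, p1 = 0.05, 1e-6
q1 = p1 * t1 / (nu1 * g1 * p0)      # normalized transmit energy of D1
beta1 = nu1 * g1 * h1 / n0

same = math.isclose(l11_from_power(t1, p1, h1, n0, bandwidth),
                    l11_from_q(t1, q1, beta1, p0, bandwidth), rel_tol=1e-9)

# Midpoint concavity spot check between two (t1, q1) points.
a, b = (0.02, 0.001), (0.08, 0.01)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
f = lambda point: l11_from_q(point[0], point[1], beta1, p0, bandwidth)
concave_here = f(mid) >= 0.5 * (f(a) + f(b))
print(same, concave_here)
```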

B. Problem-Solving With Lagrange Method
To gain more insight into the solution, we next solve problem (P4) optimally by leveraging the Lagrange method [26]. In the partial Lagrange function of (P4), $\eta \ge 0$ and $\boldsymbol{\lambda} = [\lambda_1, \ldots, \lambda_4] \succeq 0$ ($\succeq$ denotes the componentwise inequality) consist of the Lagrange multipliers associated with the constraints (21b) and (21c)-(21f) in problem (P4), respectively. In order to facilitate the analysis in the sequel, we define another two functions in terms of $g'(x)$, the first-order derivative of $g(x)$, and thus the following two lemmas are established.
where $W_0(z)$ is the principal branch of the Lambert W function, defined as the solution of $W_0(z) e^{W_0(z)} = z$ [27], and $e$ is the base of the natural logarithm. Proof: See Appendix A.
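For readers who wish to evaluate the principal branch of the Lambert W function numerically, the following self-contained Python sketch uses Newton's method on w e^w = z. It is an illustrative helper only (the name `lambert_w0`, the starting guesses and the tolerances are assumptions of this sketch); a library routine such as SciPy's `scipy.special.lambertw` would normally be preferred where available.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch W0(z) for real z >= -1/e, via Newton's method on w*exp(w) = z."""
    if z < -1.0 / math.e:
        raise ValueError("W0 is real-valued only for z >= -1/e")
    if z >= 0.0:
        w = math.log1p(z)                 # upper bound on W0(z) for z >= 0
    else:
        p = math.sqrt(max(0.0, 2.0 * (1.0 + math.e * z)))
        if p == 0.0:
            return -1.0                   # branch point z = -1/e
        w = -1.0 + p                      # leading term of the branch-point expansion
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


# Usage: the defining identity W0(z) * exp(W0(z)) = z should hold.
w = lambert_w0(10.0)
print(w, w * math.exp(w))
```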
Proof: See Appendix B.
We first assume that problem (P4) is feasible with the given $P_0$ and let $(\mathbf{t}^*, \mathbf{q}^*, L_1^*, L_2^*)$ denote the optimal solution of (P4) and $\eta^*$, $\boldsymbol{\lambda}^*$ denote the optimal Lagrange multipliers. Then applying the Karush-Kuhn-Tucker (KKT) conditions [26] leads to a set of necessary and sufficient conditions. Note that $t_0^* + t_1^* + t_{21}^* + t_{22}^* = T$ must hold; otherwise, we can always allocate the remaining time to $t_0^*$ to further increase the energy saving of the two users, and thus $\eta^* > 0$ holds for sure. Furthermore, the following lemma describes an important result concerning $\mathbf{t}^*$, $\mathbf{q}^*$ and $L_1^*$.
Lemma 3: The optimal time and power allocation $(\mathbf{t}^*, \mathbf{q}^*)$ ensures the following property of $D_1$'s offloaded data size, $L_1^*$.

C. Optimal Offloading Decisions With Power Allocation
First, we define an offloading priority indicator $\mu_i$ for $D_i$. Note that $\mu_i$ depends on the variables quantifying the uplink channel ($h_i$) and the local computing overhead ($C_i Q_i$), and it is a monotonically increasing function of $h_i$, $C_i$ and $Q_i$. The relationship between the optimal offloaded data size and power allocation for each user and the corresponding offloading priority indicator is shown in the following theorem.
Proof: See Appendix E.
Remark 2 (Whether Computation Offloading is Necessary?): According to Theorem 1, the offloading decision and power allocation of each user depend on its offloading priority indicator $\mu_i$ as well as the minimum required offloaded data size $M_i^+$. If $M_1^+ = 0$ and $\mu_1 < (\beta_1 + \beta_2) P_0 / z^*$, then executing the whole computation task locally is optimal for $D_1$; otherwise computation offloading is required. Similarly, if $M_2^+ = 0$ and $\rho(\mu_2) < (\beta_1 + \beta_2) P_0$, then executing the whole computation task locally is optimal for $D_2$; otherwise computation offloading is necessary.

Remark 3 (Effects of Parameters on the Offloading Priority):
It is easy to note that $\rho(\mu_2)$ is a monotonically increasing function of $\mu_2$ for $\mu_2 > 1$ (as for $\mu_2 \le 1$, $L_2^* = M_2^+$), and thus it also monotonically increases with the parameters $C_2$, $Q_2$ and $h_2$ in this case, according to the monotonicity rule of composite functions. The results in Theorem 1 show that the optimal offloaded data sizes for the two cooperative users $D_i$, $i \in \{1, 2\}$, grow with increasing $\mu_i$, which is consistent with the intuition that more resources should be scheduled to computation offloading when users have good channels (i.e., large $h_i$) or endure high local computing energy consumption (i.e., large $C_i$ and $Q_i$), so as to save energy.
Remark 4 (Binary Structure of the Offloading Decisions for Two Cooperative Users): Theorem 1 reveals that the optimal offloading decisions for both $D_1$ and $D_2$ have a similar threshold-based structure when computation offloading saves energy. Moreover, since the exact cases of $\mu_1 = (\beta_1 + \beta_2) P_0 / z^*$ in (47) and $\rho(\mu_2) = (\beta_1 + \beta_2) P_0$ in (50) rarely occur in practice, the optimal offloading decisions have a binary structure for both cooperative users.

Remark 5 (Effects of Parameters on the Thresholds of the Offloading Decisions):
The common term in the thresholds of the offloading decisions for the two users in Theorem 1, i.e., $(\beta_1 + \beta_2) = (\nu_1 g_1 h_1 + \nu_2 g_2 h_2)/N_0$, reflects the energy harvesting potentials of the two users (i.e., $\nu_1 g_1$ and $\nu_2 g_2$) and the quality of their uplink offloading channels (i.e., $h_1$ and $h_2$). This demonstrates the effect of user cooperation: either user's offloading decision is affected by the other user's energy-harvesting ability and offloading-channel quality.
Lemma 5: For the case of $L_1^* > 0$, the optimal transmit rates of $D_1$ and $D_2$ for offloading $D_1$'s input data are the same.
Proof: The result in Lemma 5 is easily verified by substituting the optimal transmit powers in (48) and (49) into the expressions of $r_{1,1}(\mathbf{p})$ and $r_{1,2}(\mathbf{p})$ in (3) and (4), respectively.

Theorem 2 (Optimal Time Allocation for WPT and Cooperative Computation Offloading):
1) The optimal time allocation $t_{22}^*$ for offloading $D_2$'s input data;
2) The optimal WPT duration time $t_0^*$;
3) The optimal time allocation for offloading $D_1$'s input data, i.e., $(t_1^*, t_{21}^*)$, where $(t_1^*, t_{21}^*) = (0, 0)$ when $L_1^* = 0$.
Proof: See Appendix F.

E. The Equivalence Between Problem (P1) and (P2)
In this part, we proceed to show the equivalence between the original APTEM problem (P1) and the min-max problem (P2). First, an important property of the optimal WPT duration time $t_0^*$ is given in the following Lemma 6.
Lemma 6: The optimal WPT duration time $t_0^*$ is a monotonic non-decreasing function of $P_0$.
Proof: See Appendix G.

Remark 6 (The Effect of $P_0$ and $t_0$ on Maximizing SES):
The result of Lemma 6 shows that $t_0^*$ is small when $P_0$ is relatively small, since in this case the extra energy harvested by increasing $t_0$ cannot compensate for the extra energy consumed by reducing the time for computation offloading (i.e., $T - t_0$), leading to a smaller SES. On the contrary, when $P_0$ becomes large, $t_0^*$ increases accordingly to obtain more SES.
In the sequel, we first prove the equivalence between problems (P2) and (P5), and then show the equivalence of problems (P5) and (P1) to finally verify the theorem. Problem (P5) is a general problem for minimizing the WPT transmit power $P_0$ by jointly optimizing $P_0$, $\mathbf{t}$ and $\mathbf{p}$, while (P2) gives a specific method for obtaining the minimum $P_0$. (P2) is solved by a two-phase method where the minimum $P_0$ can be obtained through a one-dimensional (bisection) search by solving problem (P3) (or (P4)) with each given $P_0$. If the given $P_0$ is the minimum $P_0$, then the optimal $\mathbf{t}$ and $\mathbf{p}$ of (P5) can be obtained by solving the SESM problem (P3) with this $P_0$. If we find, through a bisection search, the minimum $P_0$ for which the sum-energy-saving is maximized with all the constraints satisfied, then the obtained $(P_0^*, \mathbf{t}^*, \mathbf{p}^*)$, i.e., the optimal solution of (P2), is actually the joint-optimal solution of (P5). Hence, problems (P2) and (P5) are equivalent for obtaining the joint-optimal $(P_0^*, \mathbf{t}^*, \mathbf{p}^*)$.
According to the result of Lemma 6, the optimal WPT duration time of the SESM problem (P3), i.e., $t_0^*$, is a monotonic non-decreasing function of $P_0$, which indicates that $P_0 t_0^*(P_0)$ is a monotonically increasing function of $P_0$. Hence, we can conclude that the minimum $P_0$ of the APTPM problem (P5) is the same as the optimal $P_0$ for minimizing $P_0 t_0$ in the original APTEM problem (P1), which means that (P1) and (P5) are equivalent, finally proving the equivalence between problems (P1) and (P2). This indicates that when the minimum feasible $P_0$ is used in (P3) (or (P4)), the obtained maximum SES reaches its minimum with respect to $P_0$.

F. Optimal Resource Allocation for Obtaining $P_0$
In this section, we discuss the second phase of solving problem (P2). It is easy to note that with a larger feasible $P_0$, as an extra $\Delta P_0 > 0$ is available, the feasible region of problem (P3) (or (P4)) will be larger as well, and thus extra energy of at least $\nu_1 g_1 \Delta P_0 t_0 + \nu_2 g_2 \Delta P_0 t_0$ will be saved. This means that the maximum SES obtained by (P3) (or (P4)) is a monotonically increasing function of $P_0$ as long as (P3) (or (P4)) is feasible. Hence, the minimum $P_0$ of the original APTEM problem (P1) can be obtained through a bisection search over $P_0$.
As a matter of fact, the optimal time allocation parameters should satisfy the latency constraint (7). Note that $t_{22}^*$ monotonically decreases with $P_0$, and thus a lower bound of $P_0$, denoted as $P_0^L$, can be obtained by solving the resulting equation. Based on this $P_0^L$, we can further obtain a proper upper bound of $P_0$, denoted as $P_0^U$, which should make problem (P4) feasible and lead to positive energy savings for both users. The optimal $P_0$ must be in the range $(P_0^L, P_0^U)$, and the following lemma shows a property of $P_0$ that gives a stopping criterion for the bisection search.
Lemma 7: When the minimum feasible $P_0$ is used in problem (P3) (or (P4)), at least one of the two users should use up all its harvested energy, i.e., $E_{s,1}^*(P_0) = 0$ or $E_{s,2}^*(P_0) = 0$.
Proof: The lemma can be proved by contradiction. If both $E_{s,1}^*(P_0) > 0$ and $E_{s,2}^*(P_0) > 0$ hold, then $P_0$ can be further reduced by some $\Delta P_0 > 0$, which will make $E_{s,1}^*(P_0 - \Delta P_0) = 0$ or $E_{s,2}^*(P_0 - \Delta P_0) = 0$.
The whole process of solving the original APTEM problem (P1) is summarized in Algorithm 1, where the final optimal $P_0^*$, the corresponding offloaded data sizes $(L_1^*, L_2^*)$, and the power-time allocation $(\mathbf{p}^*, \mathbf{t}^*)$ can all be obtained.
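The outer search of the second phase can be sketched as a standard bisection over the transmit power. The Python sketch below is not Algorithm 1 itself: the callable `is_feasible` is a hypothetical stand-in for solving the inner SESM problem at a given power and checking that both users end with non-negative energy savings, and the toy oracle in the usage lines is invented purely for illustration. The sketch relies only on the monotonicity argument that feasibility at some power implies feasibility at every larger power.

```python
def bisect_min_power(is_feasible, p_low, p_high, delta=1e-6):
    """Smallest feasible transmit power in (p_low, p_high], up to accuracy delta.

    is_feasible: hypothetical oracle, True if the inner problem is feasible at
                 the given power; assumed monotone (False below a threshold,
                 True at and above it).
    p_low:       a power known (or assumed) to be infeasible.
    p_high:      a power known to be feasible.
    """
    if not is_feasible(p_high):
        raise ValueError("upper bound must be feasible")
    while p_high - p_low > delta:
        mid = 0.5 * (p_low + p_high)
        if is_feasible(mid):
            p_high = mid      # keep the smallest feasible power found so far
        else:
            p_low = mid
    return p_high


# Toy usage: a made-up oracle that is feasible exactly when P0 >= 2.5 W.
p_min = bisect_min_power(lambda p0: p0 >= 2.5, p_low=0.0, p_high=10.0)
print(p_min)
```

Each iteration halves the search interval, so the number of inner problems solved grows only logarithmically with the required accuracy.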

Remark 7 (Low-Complexity Algorithm):
By implementing Algorithm 1, the optimal solutions of the original APTEM problem (P1) can be obtained in closed or semi-closed form by substituting the optimal $P_0$ into Theorem 1 and Theorem 2. At most two tiers of one-dimensional (bisection) search are needed to execute Algorithm 1. The inner tier is for obtaining $z^*$ in Theorem 1-1), and the outer tier is for acquiring the optimal $P_0$ following steps 2 to 15. Therefore, the complexity of Algorithm 1 is at most of the order of $\mathcal{O}(1)\ln(1/\sigma)\ln(1/\delta)$, where $\sigma, \delta > 0$ denote the computational accuracies of the two tiers of one-dimensional search. Compared with the traditional block-coordinate descent algorithm, which requires iterative optimization, the proposed Algorithm 1 is of much lower complexity.

IV. SIMULATION RESULTS
In this section, the performance of the proposed wireless powered computation offloading scheme with user cooperation, which jointly optimizes power and time allocation, is investigated by computer simulations. We refer to our scheme as "UC-JOPT" in the figures for comparison. We also include the results of the following two baselines:
1) A simplified wireless powered computation offloading scheme with user cooperation where $D_1$ and $D_2$ use the same transmit time to offload $D_1$'s input data ("UC-ET"). In this scheme, $\mathbf{p}$, $L_1$, $L_2$, $t_{22}$ and $t_1$ are assigned the optimal solutions obtained from Theorem 1 and Theorem 2. As for $t_{21}$, $D_2$ uses the same time duration as $t_1$ to relay $D_1$'s input-data information, and thus $t_0 = T - t_1 - t_{21} - t_{22}$, which is suboptimal compared with the optimal resource allocation in the proposed UC-JOPT scheme.
2) A wireless powered computation offloading scheme with inactive user cooperation, obtained by letting $t_{21} = 0$ and $t_1 = L_1^* / r_{1,1}(\mathbf{p}^*)$ ("IUC", i.e., cooperation is disabled).
The simulation settings are as follows unless specified otherwise. The bandwidth and the time block length are set as $B = 10$ MHz and $T = 0.2$ s, respectively. It is assumed that channel reciprocity holds for the downlink and uplink, and thus $g_1 = h_1$, $g_2 = h_2$. The channel power gain is modeled as $h_j = 10^{-3} d_j^{-\alpha} \varphi_j$, $j \in \{1, 2, 12\}$, where $\varphi_j$ represents the short-term fading, which is assumed to be an exponentially distributed random variable with unit mean (Rayleigh fading). For distance $d_j$ in meters with the same path-loss exponent $\alpha$, a 30 dB average signal power attenuation is assumed for all the channels at a reference distance of 1 m. We assume that $d_1 = 10$ m, $d_2 = 6$ m, $d_{12} = 6$ m and $\alpha = 2$. The noise power at the AP and $D_2$ is assumed to be $N_0 = 10^{-9}$ W. For each user $D_i$, $i \in \{1, 2\}$, the CPU frequency $f_i$ is uniformly selected from the set $\{0.1, 0.2, \ldots, 1.0\}$ GHz. We set $\nu_i = 0.8$ and $\kappa_i = 10^{-28}$.
As for the computation tasks, the input data size and the required number of CPU cycles per bit follow uniform distributions with $I_i \in [100, 500]$ KB and $C_i \in [1000, 2000]$ cycles/bit, respectively. The simulation figures in the following subsections are based on 1000 independent realizations, in which $h_j$, $f_i$, $I_i$ and $C_i$ are randomly drawn according to the above assumptions in each realization, modeling realistic heterogeneous computing scenarios.
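As a minimal sketch of how one random realization could be drawn under the settings above, the following Python snippet samples the channel gains, CPU frequencies and task parameters. The function and variable names are illustrative assumptions and do not come from the paper; the input size is kept in KB so that no unit conversion has to be assumed.

```python
import random

# Fixed settings quoted from the simulation setup.
ALPHA = 2.0                                   # path-loss exponent
DIST = {"1": 10.0, "2": 6.0, "12": 6.0}       # d_1, d_2, d_12 in meters
CPU_SET_GHZ = [round(0.1 * k, 1) for k in range(1, 11)]  # {0.1, ..., 1.0}


def draw_realization(rng):
    """Draw one realization of channels, CPU speeds and tasks (illustrative)."""
    # h_j = 1e-3 * d_j^(-alpha) * phi_j with phi_j ~ Exp(mean 1), i.e. the
    # power gain of a Rayleigh-faded channel.
    h = {j: 1e-3 * d ** (-ALPHA) * rng.expovariate(1.0) for j, d in DIST.items()}
    f_hz = {i: rng.choice(CPU_SET_GHZ) * 1e9 for i in (1, 2)}
    input_kb = {i: rng.uniform(100.0, 500.0) for i in (1, 2)}
    cycles_per_bit = {i: rng.uniform(1000.0, 2000.0) for i in (1, 2)}
    return {"h": h, "f": f_hz, "I_KB": input_kb, "C": cycles_per_bit}


def mean_gain(j, runs=1000, seed=0):
    """Empirical average of h_j over `runs` independent realizations."""
    rng = random.Random(seed)
    return sum(draw_realization(rng)["h"][j] for _ in range(runs)) / runs
```

Averaging `h["1"]` over many realizations should approach the mean gain $10^{-3} d_1^{-\alpha} = 10^{-5}$, which is a quick sanity check on the fading model.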

A. The Equivalence of Problems (P1) and (P2)
In this subsection, we verify the equivalence of problems (P1) and (P2) by simulations. The average minimum transmit energy (AMTE) and the corresponding average minimum transmit power (AMTP) at the AP are shown in Fig. 3 and Fig. 4 versus the block length $T$ and the common input data size $I = I_1 = I_2$, respectively. From Fig. 3, we observe that the AMTE and AMTP curves exhibit the same trends, verifying the equivalence of the two criteria in problems (P1) and (P2). The proposed UC-JOPT scheme clearly outperforms the baselines. Specifically, the curves of UC-JOPT are much lower than those of UC-ET, indicating the effectiveness of optimizing the time allocation. Moreover, the AMTE and AMTP of UC-JOPT are even less than half of those of IUC, which further demonstrates the significance of user cooperation in handling the doubly near-far effect. It is worth noting that the gaps in AMTE (AMTP) between different schemes become more significant for a shorter block length, demonstrating the superiority of the proposed UC-JOPT scheme in handling latency-critical tasks. Fig. 4 also shows the equivalence between problems (P1) and (P2) by depicting both AMTE and AMTP versus the common input data size $I$. The AMTE and AMTP of all the schemes increase gradually with $I$, as expected. The performance improvement of the proposed UC-JOPT scheme is again clear, and the results are similar to those reported in Fig. 3. Also, the gaps in AMTE (AMTP) between different schemes become more pronounced as $I$ increases, which further indicates the advantage of the proposed UC-JOPT scheme in completing computation-intensive tasks.
The above results verify that the proposed UC-JOPT scheme is highly capable of dealing with computation-intensive latency-critical tasks and resisting the doubly near-far effect in WPCNs by fully exploiting the benefits of jointly optimal resource allocation and user cooperation.

B. The Effect of Path Loss
From the expression of the channel power gain described above, the path-loss exponent $\alpha$ and the distances $d_1$, $d_2$ and $d_{12}$ strongly influence the values of $h_1$, $h_2$ and $h_{12}$, and thus affect the AMTE (AMTP) of each scheme. In this part of the simulations, we set the same short-term fading parameters for $D_1$ and $D_2$, i.e., $\varphi_1 = \varphi_2$, and focus on the effect of $\alpha$ and the distances on the AMTE. Setting $d_1 = 10$ m, $d_2 = \xi d_1$, and $d_{12} = (1 - \xi) d_1$, Fig. 5 depicts the AMTE with respect to $\xi$ for $\alpha = 1.5, 2, 2.5$.
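To make the geometry of this experiment concrete, the short Python sketch below computes the average channel power gains (with the fading averaged out) as a function of $\xi$ and $\alpha$. The helper name `mean_gains` is an illustrative assumption; the snippet only evaluates the path-loss model and does not reproduce the AMTE curves of Fig. 5.

```python
def mean_gains(xi, d1=10.0, alpha=2.0):
    """Average gains h1, h2, h12 when D2 sits at fraction xi of the AP-D1 distance."""
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie strictly between 0 and 1")
    d2 = xi * d1             # AP <-> D2 distance
    d12 = (1.0 - xi) * d1    # D1 <-> D2 distance

    def gain(d):
        # 30 dB attenuation at the 1 m reference, then d^(-alpha) path loss.
        return 1e-3 * d ** (-alpha)

    return {"h1": gain(d1), "h2": gain(d2), "h12": gain(d12)}


for alpha in (1.5, 2.0, 2.5):
    for xi in (0.2, 0.5, 0.8):
        g = mean_gains(xi, alpha=alpha)
        print(f"alpha={alpha} xi={xi}: h2={g['h2']:.2e} h12={g['h12']:.2e}")
```

As $\xi$ grows, the average $h_2$ falls while the average $h_{12}$ rises, which is the tradeoff between the AP-to-$D_2$ link and the $D_1$-to-$D_2$ link that the two cooperative schemes face.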
From the results in Fig. 5, we observe that the proposed UC-JOPT scheme outperforms the benchmarks, and the improvement is even more pronounced for a larger $\alpha$, indicating that the UC-JOPT scheme is highly effective in resisting the attenuation caused by path loss. The AMTE curves of the two cooperative schemes, i.e., UC-JOPT and UC-ET, first decrease and then increase with $\xi$, so that each curve has a turning point of $\xi$ at which the minimum AMTE is achieved. This is because, for the cooperative computation offloading schemes, the performance depends not only on $h_2$ but also on $h_{12}$, and there is a tradeoff between these two values. When $\xi$ is small, the performance is limited by $h_{12}$, and the AMTE curves decrease with $\xi$ since $h_{12}$ increases accordingly. Beyond the turning point, the performance of both cooperative schemes degrades with $\xi$, as the decreasing $h_2$ plays the dominant role. The figure also shows that when $\xi$ is below the turning point, the gap between the two cooperative schemes is small, whereas the gap widens noticeably as $\xi$ goes beyond the turning point. It is interesting to note that the performance of the proposed UC-JOPT scheme converges to that of IUC as $\xi$ tends to 1, since both $D_1$ and $D_2$ suffer from severe signal attenuation and $t_{21}$ gradually approaches 0. However, the performance of the UC-ET scheme becomes even worse than that of the IUC scheme as $\xi$ approaches 1, which shows the importance of optimizing the offloading time fraction.

V. EXTENSIONS
This work focuses on the wireless powered cooperation-assisted MEC model for a three-node scenario only, with an AP and two near-far mobile devices $D_1$ and $D_2$, each with a single antenna. However, extensions to more complex scenarios are possible. This section discusses some straightforward approaches to extend the proposed system to more general settings.
• Multi-antenna AP: In this case, the transmit energy beamforming and the receive signal combining at the AP are designed to improve the network performance, given the multi-antenna capability of the AP. Such a design can be easily achieved by using maximum ratio transmission for wireless power transfer and maximum ratio combining for data reception at the AP. The formulation and approach remain largely the same, except that the effective channel coefficients after the antenna processing are used.
• More Mobile Devices: Our proposed method in its current form addresses the near-far problem by pairing two mobile devices (one "near" user and one "far" user) for cooperation. Therefore, a natural approach would be to list, rank and then pair users according to their distances from the AP. Communications among different pairs can be handled over orthogonal channels within the same cell covered by the AP. By doing so, our proposed solution could be adopted directly. Not allowing different pairs to occupy the same radio channels makes sense because the intra-cell interference would be too severe unless advanced interference mitigation techniques are in place. If such techniques are in place and channels are shared, user pairing has to account for the interference levels, as these affect the energy consumption at the mobile devices and the AP. The same applies when extending the proposed work to a multi-cell scenario, where inter-cell interference is an important factor. After a proper user pairing that accounts for interference control and balancing, our proposed method can be directly applied, although the pairing will be more challenging.
• Computing Resource Sharing: Another possible extension is to allow users to share not only the radio resources (i.e., power and relaying cooperation as in our current work) but also the computing resources, where the users with stronger computation capacity can help weaker users complete their computational tasks. In this scenario, the required optimization will be much more complex, because the energy consumption for carrying out tasks for others and sending the results back will need to be evaluated and compared with that for simply relaying the decoded data to the AP. The overall optimization problem can be formulated in a similar manner, with the emphasis on minimizing the transmit energy of the AP, but the resulting problem is not believed to be convex. The exact way to tackle this will require further analysis and will be considered in our future work.
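The "list, rank and pair" idea for more mobile devices can be sketched as follows. The text does not specify a pairing rule, so the rule used here (the k-th nearest user relays for the k-th farthest user) is a hypothetical choice for illustration, as are the function name and the example distances.

```python
def pair_near_far(distances):
    """Pair users as (near, far) by rank of distance from the AP.

    `distances` maps a user id to its distance from the AP in meters.
    Hypothetical rule: the k-th nearest user is paired with the k-th
    farthest user. With an odd number of users, the median user is
    left unpaired.
    """
    ranked = sorted(distances, key=distances.get)  # nearest first
    pairs = []
    lo, hi = 0, len(ranked) - 1
    while lo < hi:
        pairs.append((ranked[lo], ranked[hi]))
        lo += 1
        hi -= 1
    unpaired = [ranked[lo]] if lo == hi else []
    return pairs, unpaired


# Example with made-up distances; each pair would then be served on its
# own orthogonal channel and handled by the two-user scheme.
users = {"a": 4.0, "b": 12.0, "c": 7.5, "d": 9.0, "e": 3.0}
pairs, unpaired = pair_near_far(users)
```

Other pairing rules, for example ones that account for the inter-user channel or for interference, would fit the same interface; the choice of rule is left open by the discussion above.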

VI. CONCLUSION
In this paper, we investigated the use of cooperative communications in computation offloading for a WPT-MEC system, in which an AP acts as an energy source via WPT and serves as an MEC server to assist two near-far mobile devices in completing their computation-intensive latency-critical tasks. Joint power and time allocation for cooperative computation offloading has been considered based on a block-based harvest-then-offload protocol, with the aim of minimizing the transmit energy of the AP for completing the computation tasks of the two users. A two-phase method was proposed to find the optimal solution. Simulation results revealed that the proposed scheme greatly outperforms the baselines.

APPENDIX A PROOF OF LEMMA 1
It is easy to verify that $f(x)$ is a monotonically increasing function for $x \ge 0$ with $f(0) = 0$ by deriving its first-order derivative. Hence, the equation $f(x) = C$ with $C > 0$ has a unique solution. Through derivation, $f(x) = C$ can be equivalently expressed in a form to which the Lambert function applies. By using the definition and property of the Lambert function [27], we obtain the solution stated in Lemma 1, in which $W_0(-e^{-(C+1)}) \in (-1, 0)$.
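The definition of $f(x)$ and the intermediate expressions are not reproduced in this appendix. Purely as a hedged illustration, assume $f(x) = \ln(1+x) - \frac{x}{1+x}$; this form is an assumption, chosen because it satisfies $f(0) = 0$, is increasing for $x \ge 0$, and leads to exactly the Lambert argument $-e^{-(C+1)}$ quoted above. Under that assumption, the steps of the proof could be sketched as:

```latex
\begin{align}
f'(x) &= \frac{1}{1+x} - \frac{1}{(1+x)^2} = \frac{x}{(1+x)^2} \ge 0,
        \qquad x \ge 0, \\
\ln(1+x) - \frac{x}{1+x} = C
  &\iff \ln(1+x) + \frac{1}{1+x} = C + 1 \\
  &\iff -\frac{1}{1+x}\, e^{-\frac{1}{1+x}} = -e^{-(C+1)} \\
  &\iff -\frac{1}{1+x} = W_0\!\left(-e^{-(C+1)}\right), \\
x^* &= -\frac{1}{W_0\!\left(-e^{-(C+1)}\right)} - 1 > 0 .
\end{align}
```

The principal branch $W_0$ is the relevant one because $-\frac{1}{1+x} \in [-1, 0)$ for $x \ge 0$, and $x^* > 0$ follows from $W_0(-e^{-(C+1)}) \in (-1, 0)$.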

APPENDIX B PROOF OF LEMMA 2
Similar to Lemma 1, by deriving the first-order derivative of $h(x)$, we can verify that $h(x)$ is a monotonically decreasing function of $x \ge 0$ with $h(0) = 0$. Hence, the equation $h(x) = G$ with $G < 0$ has a unique solution. Through derivation, $h(x) = G$ can be equivalently expressed in a form to which the Lambert function applies. Therefore, by using the definition and property of the Lambert function [27], we obtain $x^* = \frac{B}{\ln 2}\left[ W_0\!\left( \frac{G/N_0 + 1}{-e} \right) + 1 \right] > 0$.
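The closed form above can be checked numerically. The sketch below assumes $h(x) = g(x) - x\,g'(x)$ with $g(x) = N_0(2^{x/B} - 1)$; this definition is not stated in this appendix and is an assumption chosen to be consistent with the derivative of $g(x)$ used in the proof of Theorem 1 and with the closed form itself. The Lambert function is evaluated with a small Newton iteration written for this example rather than a library routine.

```python
import math


def lambert_w0(z, tol=1e-14, max_iter=200):
    """Principal branch W_0(z) for z > -1/e, via Newton's method on w*e^w = z."""
    if z <= -1.0 / math.e:
        raise ValueError("this sketch needs z > -1/e")
    # log1p(z) lies at or above the root, where w*e^w is increasing and
    # convex, so the Newton iterates decrease monotonically to W_0(z).
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def h(x, B, N0):
    """Assumed h(x) = g(x) - x g'(x) with g(x) = N0 * (2^(x/B) - 1)."""
    p = 2.0 ** (x / B)
    return N0 * (p - 1.0) - x * N0 * math.log(2.0) / B * p


def solve_h(G, B, N0):
    """Closed form x* = (B / ln 2) * [W_0(-(G/N0 + 1)/e) + 1] for G < 0."""
    return B / math.log(2.0) * (lambert_w0(-(G / N0 + 1.0) / math.e) + 1.0)


B, N0 = 10e6, 1e-9   # bandwidth and noise power from the simulation setup
for G in (-0.5e-9, -3e-9):
    x_star = solve_h(G, B, N0)
    print(G, x_star, h(x_star, B, N0))   # h(x*) should reproduce G
```

For $G < 0$ the Lambert argument $-(G/N_0 + 1)/e$ always exceeds $-1/e$, so the principal branch is well defined and $W_0 > -1$ guarantees $x^* > 0$.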

APPENDIX C PROOF OF LEMMA 3
According to the constraint (21d) and condition (30), we know that is a monotonically decreasing function of $t_{22}$. It is easy to prove that the inequality $L_{1,1}(t^*, q^*) < L_{1,12}(t^*, q^*)$ always holds for the considered case of $h_1 < h_{12}$. If $L_{1,1}(t^*, q^*) + L_{1,2}(t^*, q^*) > L_{1,12}(t^*, q^*)$ holds, we can always reallocate part of $t_{21}^*$ to $t_{22}^*$ while keeping $L_1^*$, $L_2^*$, $q^*$, $t_0^*$, $t_1^*$ and the sum of $t_{21}^*$ and $t_{22}^*$ unchanged, which decreases $L_{1,2}(t^*, q^*)$ until the equality holds. This operation results in an increased $E_{s,2}(t^*, q^*, L_2^*)$ by decreasing $t_{22}^*\, g(\cdot)$ without reducing $E_{s,1}(t^*, q^*, L_1^*)$, and thus increases the objective function of problem (P4). Hence, expression (40) always holds at the optimal solution of problem (P4).

APPENDIX D PROOF OF THEOREM 1
1) In order to prove the first result of Theorem 1, we need the following lemma.
For the case of $M_1^+ = 0$, $\mu_1 < (\beta_1 + \beta_2) P_0 / z^*$, it can be derived that $L_1^* = 0$ according to condition (33), which means that completing $D_1$'s computation task locally saves more energy, and thus we have $p_1^* = 0$, $p_{21}^* = 0$.
2) Next, we prove the second result of Theorem 1. Similarly, we first show that for the cases of $M_2^+ > 0$ or $\rho(\mu_2) \ge (\beta_1 + \beta_2) P_0$, computation offloading for $D_2$ is necessary, and thus $L_2^* > 0$, $t_{22}^* > 0$, $q_{22}^* > 0$. According to Lemma 2, the optimal transmission rate for offloading $D_2$'s input data, i.e., can be obtained through (30) as where (a) is obtained through the property of $\lambda_2^*$ in (44) and the definition of $\beta_2$. Based on the expression of $g(x)$, its first-order derivative can be expressed as $g'(x) = \frac{N_0 \ln 2}{B} 2^{x/B}$, which is a monotonically increasing function of $x$. Through the KKT condition (34), we can derive that the cases ∂L ∂ L *  [27] should be used. According to (18), the optimal transmit power for offloading , giving the result in (51).
For the case of $M_2^+ = 0$, $\rho(\mu_2) < (\beta_1 + \beta_2) P_0$, it can be derived that $L_2^* = 0$ according to (34), which means that completing $D_2$'s task locally saves more energy, and thus $p_{22}^* = 0$.
with the expression of $r_2^*$ in (64). With the result of $t_{22}^*$, we can further derive the optimal WPT duration $t_0^*$ as follows. For the case of $L_1^* = 0$, we know that $t_1^* = 0$ and $t_{21}^* = 0$, and thus $t_0^* = T - t_{22}^*$. For the case of $L_1^* > 0$, combining the results of Lemma 3, Lemma 5, and the active time-sharing constraint in (21b) establishes the following equation which leads to the results in (54).
As for the derivation of $(t_1^*, t_{21}^*)$ when $L_1^* > 0$, we resort to the results of Lemma 3 and Theorem 1, and further derive the following lemma.

APPENDIX G PROOF OF LEMMA 6
According to the expression of $t_0^*$ in (54), its monotonicity with respect to $P_0$ is determined by the monotonicity of $L_1^*/r_{1,1}(p^*)$ and $t_{22}^* = L_2^*/r_2^*$ when $L_1^* > 0$ or $L_2^* > 0$. From the expression of $r_2^*$ in (64), it is clear that $r_2^*$ is a monotonically increasing function of $P_0$, because the principal branch of the Lambert function $W_0(\cdot)$ is monotonically increasing. Next, we prove that $P_0/z^*$ is also a monotonically increasing function of $P_0$ in order to proceed with the proof.