Machine-type communication (MTC) devices support a broad range of data collection tasks, especially in environments that generate massive amounts of data, such as urban, industrial, and event-based areas. In dense deployments, the data collected by MTC devices located close to each other are spatially correlated. In this paper, we propose a k-means grouping technique that combines MTC devices based on their spatial correlation. The MTC devices collect data in an event-based area and transmit them to a centralized aggregator (CA) for processing and computing. Because computational resources at the centralized aggregator are limited, the data of some MTC device groups are offloaded to a nearby base station collocated with a mobile edge computing (MEC) server. Since the MTC devices have sensing capability, we use a power exponential function model to compute the correlation coefficient between the MTC devices. Based on this framework, we compare the energy consumption when all data are processed locally at the centralized aggregator or offloaded to the mobile edge computing server against the optimal solution obtained by the brute force method. The simulation results show that the proposed k-means grouping technique reduces the energy consumption at the centralized aggregator while satisfying the required completion time.
The market for Machine-Type Communication (MTC) device deployments is growing exponentially. This growth is driven by many applications and services, such as automated industrial monitoring, smart metering, surveillance cameras, environment monitoring, and tracking devices [
Therefore, computational resources are wasted at the CA when the data collected from different MTC devices are processed independently. A significant amount of computational resources can be saved by exploiting the spatial correlation between the devices.
However, the growth of the data traffic collected by MTC devices increases the pressure on mobile operators, specifically for delay-sensitive applications, which require very short processing times. Several approaches have been proposed to deal with this challenge, such as edge computing, data offloading, and data caching [
Generally, MTC devices are deployed to perform specific tasks collectively; the data collected from each device are not completely independent but rather correlated. Thus, to avoid wasting resources on processing each device individually at the aggregator, the CA combines the correlated devices to form a group [
• We define the correlation model to compute the correlation coefficient between the MTC devices by considering the device coordinate points on the hyperplane.
• We propose a k-means grouping technique to group MTC devices based on their spatial correlation within the data collection coverage area.
• We use the differential entropy framework to compute the size of the data in each group.
• Based on the size of the data, we introduce the concept of data offloading and define the optimization problem of minimizing the energy consumption at the CA for processing the data collected from MTC devices.
• We use the brute force method to find the optimal solution for total energy consumption on the centralized aggregator.
The remainder of this paper is structured as follows. Related work is discussed in Section 2. In Section 3, we describe the network model together with the details of the correlation model and the proposed k-means grouping technique. Section 4 presents the theoretical concepts of computation together with the optimization problem used to minimize the energy consumption. Section 5 discusses the results used to verify our approach. Finally, we conclude by summarizing the ideas presented in the paper.
Many emerging works address MTC communication and wireless networks. In [
Conventionally, mobile cloud computing (MCC) has been widely introduced to provide computational capability [
Moreover, since MTC devices have sensing capability, the correlation between them can be modeled as in wireless sensor networks. In [
In this section, we describe the network model together with the theory of the correlation model used and the proposed k-means grouping technique.
We consider a network model with a set of MTC devices $D = \{d_1, d_2, \cdots, d_M\}$ that have sensing capability, as shown in
The idea of data correlation in computing is to reduce the size of the data collected by MTC devices by evaluating the existence of data similarity or data dependency. Since the CA wastes computational resources when it independently processes individual devices that hold the same data, we adapt the data correlation model used in Wireless Sensor Networks to evaluate the correlation between spatially correlated MTC devices. The correlation model is verified by using a covariance function that decreases with Euclidean distance, from 1 at $l = 0$ to 0 at $l = \infty$, where $l$ represents the Euclidean distance between the locations of the MTC devices. Additionally, we assume that the data collected from the MTC devices, denoted as $Y = \{y_1, y_2, \cdots, y_M\}$, follow a multivariate Gaussian distribution with mean $\mu$ and variance $\sigma^2$. Hence, the covariance between the MTC devices $d_i$ and $d_j$ is calculated as in [
$$\operatorname{cov}(d_i, d_j) = \sigma_i \sigma_j \operatorname{corr}(d_i, d_j) \tag{1}$$
where $\sigma_i$ and $\sigma_j$ denote the standard deviations of devices $d_i$ and $d_j$, respectively. We can further write the correlation as
$$\operatorname{corr}(d_i, d_j) = \frac{E(d_i, d_j)}{\sigma_i \sigma_j} = K_\phi(\lVert P_i - P_j \rVert) = K_\phi(l_{i,j}) \tag{2}$$
where $K_\phi(\cdot)$ denotes a correlation function with correlation parameter $\phi$, and $l_{i,j}$ represents the distance between devices $d_i$ and $d_j$. Furthermore, we assume that the value of $K_\phi(\cdot)$ relates the data $y_i$ and $y_j$ collected from MTC devices $d_i$ and $d_j$, respectively. To determine the value of $K_\phi(\cdot)$, several covariance function models have been proposed, such as Rational Quadratic, Spherical, Power Exponential, and Matérn, based on the structure of the correlation [
$$K_\phi(l_{i,j}) = \exp\left\{ -\left( \frac{l_{i,j}}{\theta_1} \right)^{\theta_2} \right\} \tag{3}$$
where $\theta_1 > 0$ and $\theta_2 \in (0, 2]$ represent the control parameters for the correlation and the smoothness over a given random region. For simplicity, we use $\theta_2 = 2$, and the correlation coefficient between devices $d_i$ and $d_j$, denoted $\xi_{i,j}$ for short, becomes
$$K_\phi(l_{i,j}) = \xi_{i,j} = \exp\left\{ -\eta\, l_{i,j}^{2} \right\} \tag{4}$$
where $\eta = \theta_1^{-2}$ is the exponent that controls the correlation between the devices. Hence, we can determine the correlation matrix of the $M$ MTC devices as
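$$K_{corr} = \begin{pmatrix} \xi_{1,1} & \xi_{1,2} & \cdots & \xi_{1,M} \\ \xi_{2,1} & \xi_{2,2} & \cdots & \xi_{2,M} \\ \vdots & \vdots & \xi_{i,j} & \vdots \\ \xi_{M,1} & \xi_{M,2} & \cdots & \xi_{M,M} \end{pmatrix} \tag{5}$$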
The correlation matrix above gives the overall correlation between the devices, with values between 0 and 1. That is, when $\xi_{i,j} = 0$, the devices are located very far apart and are not correlated; similarly, when $\xi_{i,j} = 1$, the devices are very near each other and their data are highly correlated.
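As an illustration of Equations (4) and (5), the following minimal sketch builds the correlation matrix from device coordinates. The coordinates and the function name `correlation_matrix` are hypothetical, and the value `eta = 0.05` is only an assumed setting for the correlation exponent.

```python
import math


def correlation_matrix(points, eta):
    """Build the M x M matrix of xi_ij = exp(-eta * l_ij^2) from 2-D coordinates."""
    m = len(points)
    corr = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            # Squared Euclidean distance l_ij^2 between devices i and j
            corr[i][j] = math.exp(-eta * (dx * dx + dy * dy))
    return corr


# Hypothetical device coordinates (illustrative only, not from the paper)
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
K_corr = correlation_matrix(points, eta=0.05)
for row in K_corr:
    print(["%.4f" % value for value in row])
```

The diagonal entries equal 1 because each device is at distance zero from itself, and entries shrink toward 0 as the distance between two devices grows.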
To make use of the limited computational resources at the CA, we use a grouping technique to group all the MTC devices based on their coordinate points. Since all MTC devices in each group are assumed to be very close to each other, the data collected by the individual devices may be identical. Hence, the computational resources required are reduced by processing the MTC devices of one group together, either at the CA or at the MEC.
The proposed grouping technique for MTC devices consists of two parts: first, we compute the distance or similarity between each pair of devices, and then we group the MTC devices using a clustering technique. In this paper, we compute the distance between two devices according to the equations in the previous section and use the k-means clustering algorithm to group the MTC devices.
The k-means algorithm is frequently used to automatically partition a data set into K disjoint clusters or groups, as described in [
$$l_c(P_i - \mu_g) = \lVert P_i - \mu_g \rVert^{2} \tag{6}$$
where $P_i$ represents the coordinate point of device $d_i$ and $\mu_g$ denotes the initialized coordinate point of the cluster centroid. Then, the cluster centroid is updated iteratively from its associated devices as
$$\mu_g = \frac{1}{|\Omega_g|} \sum_{j \in \Omega_g} P_j \tag{7}$$
where $\Omega_g$ is the set of MTC devices contained in group (cluster) $g$. Algorithm 1 explains the procedure in detail.
Algorithm 1. K-means grouping technique.
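The grouping step can be sketched as follows, assuming standard k-means iterations that alternate the assignment rule of Equation (6) with the centroid update of Equation (7). This is an illustrative sketch, not the authors' exact Algorithm 1: the random centroid initialization, the stopping rule, the function name `kmeans_group`, and the sample coordinates are all assumptions.

```python
import random


def kmeans_group(points, k, max_iters=100, seed=0):
    """Group 2-D device coordinates into k clusters with plain k-means."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct devices as starting centroids
    centroids = list(rng.sample(points, k))
    labels = None
    for _ in range(max_iters):
        # Assignment step, Eq. (6): nearest centroid by squared distance
        new_labels = [
            min(
                range(k),
                key=lambda g: (p[0] - centroids[g][0]) ** 2
                + (p[1] - centroids[g][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the grouping converged
        labels = new_labels
        # Update step, Eq. (7): centroid = mean of the member coordinates
        for g in range(k):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if members:  # keep the old centroid if a cluster is empty
                centroids[g] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Hypothetical coordinates: two well-separated clusters of devices
devices = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans_group(devices, k=2)
print(labels)
```

Devices that share a label form one group, and the correlation matrix of each group can then be built from the coordinates of its members only.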
In this section, we describe the theoretical details of the computational model and define the energy consumption optimization problem with respect to the data processing decision, that is, whether the data are processed locally at the CA or offloaded to the remote MEC server.
The CA is located near the MTC devices and receives all the data collected by them. As we assume the MTC devices are randomly distributed, the CA uses the correlation framework to check whether correlation exists between the devices. After that, the devices are grouped. Then, the correlation matrix corresponding to the members of each group is determined according to Equation (5). Since the data collected by each MTC device follow a multivariate Gaussian distribution, we can use information entropy to find the size of the data in each group. Therefore, we adopt the differential entropy as explained in [
$$h(Y) = \frac{1}{2} \log\left[ (2\pi e)^{S} |K_S| \right] \tag{8}$$
where $|K_S|$ represents the determinant of the correlation matrix of any group with $S$ MTC devices. Then, the size of the data in each group, based on the correlation matrix, is modeled as the entropy $H(\Delta_g)$ corresponding to the data set $Y^{\Delta_g} = \{ y_1^{\Delta_1}, y_2^{\Delta_2}, \cdots, y_{S_g}^{\Delta_{S_g}} \}$ of each group, evaluated as
$$H(\Delta_g) = \frac{1}{2} \log\left[ \left( \frac{2\pi e}{\prod_{j=1}^{S_g} (\Delta_j)^{2}} \right)^{S_g} |K_{S_g}| \right] \tag{9}$$
where $S_g$ represents the number of MTC devices in group $g$. With the above expression, we obtain the size of the input data for each group.
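To make the entropy-based data size concrete, the sketch below evaluates the differential entropy of Equation (8) and then applies a quantization correction. Several assumptions are made here: the data have unit variance (so the correlation matrix also serves as the covariance matrix), all devices share one quantization step `delta`, the logarithm is base 2 so the result is in bits, and the correction is the textbook high-resolution approximation (differential entropy minus $S \log_2 \Delta$), which may group terms differently from Equation (9). The helper names are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-15:
            return 0.0  # singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def group_entropy_bits(corr, delta):
    """Approximate size in bits of one group's quantized data.

    Assumes a non-singular correlation matrix (determinant > 0).
    """
    s = len(corr)
    # Differential entropy of an S-dimensional Gaussian, Eq. (8), base 2
    h = 0.5 * math.log2((2 * math.pi * math.e) ** s * determinant(corr))
    # Assumed high-resolution quantization correction for step size delta
    return h - s * math.log2(delta)


independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.9], [0.9, 1.0]]
print(group_entropy_bits(independent, delta=1 / 256))
print(group_entropy_bits(correlated, delta=1 / 256))
```

Under these assumptions, the strongly correlated pair yields fewer bits than the independent pair, which is the redundancy that grouping is meant to remove.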
In local computing, we consider the computation time and energy consumption for processing each group. The computing process depends on the computation capability of the CA, denoted by $C_l$, and its CPU clock frequency, denoted by $f_l$ [
$$T_l^i = \frac{C_l\, \chi(i)}{f_l^i} \tag{10}$$
where $\chi(i)$ represents the size of the input data of group $i$ obtained from Equation (9). Furthermore, the energy consumption for each group $i$ is given by
$$E_l^i = \kappa \left( f_l^i \right)^{\tau} T_l^i \tag{11}$$
where $\kappa (f_l^i)^{\tau}$ represents the power coefficient at the CA and $\kappa$ is a constant determined by the chip architecture. The parameter $\tau$ is the frequency exponent, a constant with $\tau \geq 2$. A frequently used value is approximately 3, as illustrated by [
As we assume the CA has limited computational resources, some groups are offloaded to the MEC server for processing. In this scenario, the computation time at the MEC is obtained by considering the uplink transmission time, the downlink transmission time, and the execution time at the MEC. In this paper, we ignore the downlink transmission time because the data size remaining after processing is very small [
$$T_{tr}^i = \frac{\chi(i)}{R} \tag{12}$$
where $R$ represents the transmission rate between the CA and the MEC, which is obtained as
$$R = W \log\left( 1 + \lambda P_{tr} \right) \tag{13}$$
where $W$ denotes the channel bandwidth reserved for the CA, $P_{tr}$ is the transmission power the CA consumes to offload to the MEC, and $\lambda$ is the channel gain normalized by the power of the white Gaussian noise. Then, we can evaluate the energy consumed for transmission as
$$E_{tr}^i = P_{tr}\, T_{tr}^i \tag{14}$$
In addition, we evaluate the computation time at MEC as
$$T_s^i = \frac{C_s\, \chi(i)}{f_s} \tag{15}$$
where $C_s$ and $f_s$ represent the CPU capability and clock frequency of the MEC server, respectively, and $\chi(i)$ is the size of the input data of group $i$. Therefore, the total offloading time for the CA, $T_o^i$, is obtained as
$$T_o^i = T_{tr}^i + T_s^i \tag{16}$$
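The local and offloading cost models of Equations (10) to (16) can be collected into a few small functions, sketched below. The function names and the numeric values in the usage lines are hypothetical and chosen only to keep the arithmetic easy to follow; the base-2 logarithm in the rate (bits per second) is also an assumption, since the base is not stated in Equation (13).

```python
import math


def local_time(chi, c_l, f_l):
    """Eq. (10): local computation time for chi bits at C_l cycles per bit."""
    return c_l * chi / f_l


def local_energy(chi, c_l, f_l, kappa, tau=3):
    """Eq. (11): CPU power kappa * f_l^tau times the local computation time."""
    return kappa * f_l ** tau * local_time(chi, c_l, f_l)


def transmission_rate(w, lam, p_tr):
    """Eq. (13): rate from bandwidth, normalized gain and transmit power."""
    return w * math.log2(1 + lam * p_tr)  # base 2 is an assumption


def transmission_energy(chi, rate, p_tr):
    """Eqs. (12) and (14): transmit power times the uplink time chi / R."""
    return p_tr * chi / rate


def offloading_time(chi, rate, c_s, f_s):
    """Eq. (16): uplink time (12) plus MEC execution time (15)."""
    return chi / rate + c_s * chi / f_s


# Hypothetical values for one group (not the paper's simulation settings)
chi = 1000.0  # input data size in bits
t_l = local_time(chi, c_l=100.0, f_l=1e6)
e_l = local_energy(chi, c_l=100.0, f_l=1e6, kappa=1e-18)
rate = transmission_rate(w=1e6, lam=3.0, p_tr=1.0)
e_tr = transmission_energy(chi, rate, p_tr=1.0)
t_o = offloading_time(chi, rate, c_s=100.0, f_s=1e7)
print(t_l, e_l, rate, e_tr, t_o)
```

Comparing `e_l` with `e_tr` for each group gives the per-group energy trade-off, while `t_l` and `t_o` are the quantities that a completion-time constraint would bound.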
Using the equations from the previous section, we formulate the optimization problem for the total energy consumption at the CA, which comprises the energy consumed when groups are processed at the CA plus the transmission energy when groups are offloaded to the MEC, as
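$$\begin{aligned}
\min_{z_i} \quad & \sum_{i=1}^{K} \left( (1 - z_i) E_l^i + z_i E_{tr}^i \right) \\
\text{s.t.} \quad & C1: \sum_{i=1}^{K} (1 - z_i)\, T_l^i \leq T_{\max} \\
& C2: \sum_{i=1}^{K} z_i\, T_o^i \leq T_{\max} \\
& C3: z_i \in \{0, 1\}, \quad i = 1, 2, \cdots, K
\end{aligned} \tag{17}$$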
where constraints C1 and C2 limit the completion time for processing all groups, and C3 defines the binary decision variable that determines whether each group is processed at the CA ($z_i = 0$) or at the MEC ($z_i = 1$). $T_{\max}$ is the maximum completion time for processing all the MTC devices based on the number of groups.
The optimization problem formulated above is an integer programming problem, which can be solved using various approaches such as genetic programming, dynamic programming, brute force (exhaustive search), and variable relaxation (linear and semidefinite relaxation programming). To analyze the performance of the proposed grouping technique, we use the brute force approach, which determines the optimal solution in a finite number of iterations.
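A brute force search over the offloading decisions can be sketched as below. The sketch enumerates all $2^K$ binary vectors, so it is only practical for a small number of groups. It assumes that $z_i = 1$ means group $i$ is offloaded, that constraint C1 bounds the total local computation time of the groups kept at the CA, and that C2 bounds the total offloading time; the function name and the input values are hypothetical.

```python
from itertools import product


def brute_force_offloading(e_local, e_tx, t_local, t_offload, t_max):
    """Enumerate every z in {0,1}^K and keep the feasible minimum-energy one."""
    k = len(e_local)
    best_z, best_energy = None, float("inf")
    for z in product((0, 1), repeat=k):
        # Assumed C1: total time of the groups processed locally (z_i = 0)
        local_t = sum((1 - zi) * t for zi, t in zip(z, t_local))
        # Assumed C2: total time of the groups offloaded to the MEC (z_i = 1)
        offload_t = sum(zi * t for zi, t in zip(z, t_offload))
        if local_t > t_max or offload_t > t_max:
            continue  # violates a completion-time constraint
        energy = sum(
            (1 - zi) * el + zi * et for zi, el, et in zip(z, e_local, e_tx)
        )
        if energy < best_energy:
            best_z, best_energy = z, energy
    return best_z, best_energy  # (None, inf) when no decision is feasible


# Hypothetical per-group energies (J) and times (s) for K = 3 groups
e_local = [1.0, 2.0, 3.0]
e_tx = [0.5, 2.5, 1.0]
t_local = [1.0, 1.0, 1.0]
t_offload = [1.0, 1.0, 1.0]
z_opt, e_opt = brute_force_offloading(e_local, e_tx, t_local, t_offload, t_max=2.0)
print(z_opt, e_opt)
```

With these inputs, groups 1 and 3 are cheaper to transmit than to process locally, so the search offloads them and keeps group 2 at the CA.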
This section describes the performance and results of the proposed scheme. To evaluate the performance of the proposed scheme, we use the parameters referred to in [
$$\frac{0.9}{(600 \times 10^{6})^{3}} = 4.1667 \times 10^{-27}\ \text{J/cyc}$$
Then, the transmit power of the CA is 1.012 W, the normalized channel gain is 17 dB, and the transmission bandwidth is 0.185 MHz. In addition, the CPU frequency and capability of the MEC are 600 × 10^7 cycles/s and 960 × 10^7 cycles/s, respectively. The quantization levels range from 1/2 (1 bit) to 1/256 (8 bits), with a degree of correlation equal to 0.05. We evaluate the performance of the system by comparing the energy consumption, as a function of the number of MTC devices, when the data are processed in three different implementation scenarios; the energy consumption comprises both the computation energy and the transmission energy. We simply choose the number of groups (clusters) to be 4 and 8 to verify our proposed grouping technique.
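The parameter values above can be checked numerically with a short sketch. It assumes that the 0.9 and 600 × 10^6 in the expression for the constant are the CA power in watts and the CA clock frequency in hertz with $\tau = 3$, that the 17 dB gain is converted to linear scale before being used in Equation (13), and that the logarithm is base 2; none of these readings is confirmed by the text, and the variable names are hypothetical.

```python
import math

# Assumed reading: 0.9 W of CA power at a 600 MHz clock with tau = 3
p_ca = 0.9
f_ca = 600e6
tau = 3
kappa = p_ca / f_ca ** tau  # constant of the power model kappa * f^tau

# Stated link parameters; the dB-to-linear conversion is an assumption
bandwidth_hz = 0.185e6
gain_linear = 10 ** (17 / 10)
p_tr = 1.012
rate_bps = bandwidth_hz * math.log2(1 + gain_linear * p_tr)  # Eq. (13), base 2

print("kappa = %.4e" % kappa)
print("rate = %.0f bit/s" % rate_bps)
```

Under these assumptions the constant matches the value given above, and the CA-to-MEC rate comes out at roughly one megabit per second.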
The energy consumption achieved by the proposed grouping technique as the number of MTC devices increases is illustrated in
local computing is more favorable than offloading to remote computing (MEC). Moreover, for a small number of groups, the size of the data to be processed is much smaller than for a larger number of groups. The optimal BFM solution approaches CA computing when the MTC devices are grouped into a small number of groups and approaches MEC computing for a larger number of groups.
In this paper, we investigated the problem of correlated data in MTC devices under constrained resources at the centralized aggregator (CA) for computing and processing. We proposed a k-means grouping technique to group MTC devices according to their spatial correlation in an event-based area. By combining the MTC devices, we reduced the data redundancy caused by processing similar data and saved computational resources at the CA. We then used the differential entropy to measure the size of the data content in each group. Through extensive simulations, we illustrated the benefits of our proposed grouping technique compared with individual MTC device computation. Our simulation results indicate that, for the optimal BFM solutions, computation at the CA and at the MEC perform very closely in terms of energy consumption and the trade-off in computation time. In the future, we will investigate more grouping techniques together with multiple aggregators.
We thank the Editor and the referee for their comments. This work was partially supported by Natural Science Foundation of China (Grant No. 61461136002), Key Program of National Natural Science Foundation of China (Grant No. 61631018), Fundamental Research Funds for the Central Universities, and Huawei Innovation Research Program.
The authors declare no conflicts of interest regarding the publication of this paper.
Ally, J.S., Asif, M. and Ma, Q.L. (2019) Energy-Efficient MTC Data Offloading in Wireless Networks Based on K-Means Grouping Technique. Journal of Computer and Communications, 7, 47-61. https://doi.org/10.4236/jcc.2019.72004