Advances in mobile devices, public networks and the Internet of Things generate huge amounts of complex data. Both structured and unstructured data are being captured in order to allow organizations to make better business decisions, as data is now pivotal to an organization's success. These enormous amounts of data are referred to as *Big Data*, which provides a competitive advantage over rivals when processed and analyzed appropriately. However, Big Data analytics raises a few concerns, including data management, privacy and security, finding the optimal path for transporting data, and data representation. Moreover, the structure of the network does not completely match transportation demand, *i.e.*, a few bottlenecks still exist in the network. This paper presents a new approach, based on the knapsack problem, for finding the optimal path for moving valuable data through a given network. Each piece of data is assigned a value that depends on its importance (each piece of data is defined by two arguments, size and value), and the approach tries to find the optimal path from source to destination. Mathematical models based on the 0-1 knapsack problem are developed to adjust data flows between their shortest paths. We also carry out computational experiments using the commercial software Gurobi and a greedy algorithm (GA), respectively. The results indicate that the proposed models are effective and workable. This paper introduces two algorithms for studying shortest path problems: the first addresses shortest path problems in which the activities are stochastic and do not depend on weights; the second addresses shortest path problems that depend on weights.

Big data is a big deal. We read about it and its promises of insight. But we will need a network to collect and distribute big data connected to processing locations. Many big data applications require real-time communications. Plan for big data on your network now; don’t wait until issues arrive. Catch-up costs money and results in delayed implementations.

The only sure predictions around big data’s impact are that the network will be busier, need more capacity, and probably cost more. How much capacity will be needed is only an estimate. It could wind up being far more than estimated if the big data applications are very successful. Educated predictions on traffic may look good now, but conditions can change and render them inaccurate.

Real-time processing of big data will require real-time data delivery; otherwise the data will already be old and historical by the time it is processed. One of the advantages of big data, especially in regard to the Internet of Things (IoT), is that it enables a rapid response to changing business functions and conditions such as security alerts, building automation, location tracking, etc. Big data collected quickly fosters just-in-time decisions.

The United Nations Economic Commission for Europe predicts that data growth will be 350% higher in 2019 than it is in 2015;

This paper proposes a practical method for adjusting the optimal path for moving big data between a source and a destination, depending on the size and importance of the data. Suppose that a large amount of data needs to be distributed through a given network. According to the importance (value) of each piece of data and the total capacity of the network, the approach uses the knapsack problem to select a suitable amount of important data and to ensure that it does not exceed the total network capacity.

The proposed method is concerned with specific topics of resource allocation that have been studied in the related literature. In 1996, Yu Gang [

Effective lower and upper bounds were generated by surrogate relaxation. The ratio of these two bounds is shown to be bounded by a constant in situations where the data range is limited to within a fixed percentage of its mean. A branch-and-bound algorithm was implemented to solve the MNK problem to optimality efficiently. In 2009, Campegiani and Presti [

As a typical non-deterministic polynomial-time hard (NP-hard) problem, the unbounded knapsack problem (UKP) is defined as follows. We are given a set $O = \{o_1, o_2, \dots, o_n\}$ of $n$ item types without quantity restriction. Items of the same type share a common weight $w_i$ and a common value $\phi_i$. The problem is to choose a selection of these items that maximizes their overall value while their overall weight does not exceed a given capacity $c$. Without loss of generality, it is assumed that all values and weights are positive, all weights are smaller than the capacity $c$, and the overall weight of all items exceeds $c$. The UKP model can be formulated as follows:

$$\text{Maximize} \quad \sum_{i=1}^{n} \phi_i x_i, \tag{1}$$

$$\text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le c, \tag{2}$$

$$x_i \in \mathbb{Z}^{+}, \quad 1 \le i \le n, \tag{3}$$

where $x_i$ represents the number of items of type $o_i$ included in the knapsack.
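The formulation in Equations (1)-(3) can be illustrated with a minimal dynamic-programming sketch. It assumes integer weights and reads the count of each item type as a non-negative integer; the function name `unbounded_knapsack` and the sample data are hypothetical and are not taken from the paper.

```python
def unbounded_knapsack(weights, values, capacity):
    """Maximize sum(values[i] * x[i]) subject to
    sum(weights[i] * x[i]) <= capacity, with each x[i] a
    non-negative integer (assumption: integer weights)."""
    n = len(weights)
    best = [0] * (capacity + 1)      # best[c]: best value within capacity c
    choice = [-1] * (capacity + 1)   # item type added last to reach best[c]
    for c in range(1, capacity + 1):
        for i in range(n):
            if weights[i] <= c:
                candidate = best[c - weights[i]] + values[i]
                if candidate > best[c]:
                    best[c] = candidate
                    choice[c] = i
    # Walk back through the recorded choices to recover the counts x[i].
    counts = [0] * n
    c = capacity
    while c > 0 and choice[c] != -1:
        counts[choice[c]] += 1
        c -= weights[choice[c]]
    return best[capacity], counts


# Hypothetical data: three item types and capacity 10.
total_value, counts = unbounded_knapsack([3, 4, 5], [4, 5, 7], 10)
print(total_value, counts)
```

The table `best` has one entry per unit of capacity, so the running time is proportional to the number of item types times the capacity.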

In this section, the models of the 0-1 knapsack problem and the amoeboid organism are introduced.

Mathematical model of the amoeboid organism

From the experiments on the amoeboid organism as described in [

According to the mechanism, two rules describing the changes in the tubular structure of the amoeboid organism are: first, open-ended tubes, which are not connected between the two food sources, are likely to disappear; second, when two or more tubes connect the same two food sources, the longer tube is likely to disappear [

The variable $Q_{ij}$ expresses the flux through tube $M_{ij}$ from $N_i$ to $N_j$. Assuming that the flow along the tube is approximately a Poiseuille flow, the flux $Q_{ij}$ can be expressed as [

$$Q_{ij} = \frac{D_{ij}}{L_{ij}} (p_i - p_j), \tag{4}$$

where $p_i$ is the pressure at node $N_i$, $D_{ij}$ is the conductivity of edge $M_{ij}$, and $L_{ij}$ is its length.

Assume zero capacity at each node; hence by considering the conservation law of sol the following equation can be obtained see [

$$\sum_{i} Q_{ij} = 0, \quad j \ne 1, 2, \quad i = 1, \dots, n. \tag{5}$$

For the source node $N_1$ and the sink node $N_2$ the following two equations hold:

$$\sum_{i} Q_{i1} + I_0 = 0, \tag{6}$$

$$\sum_{i} Q_{i2} - I_0 = 0, \tag{7}$$

where $I_0$ is the flux flowing from the source node. It can be seen that $I_0$ is a constant value in this model.

In order to describe such an adaptation of tubular thickness, we assume that the conductivity $D_{ij}$ changes over time according to the flux $Q_{ij}$. The following equation for the evolution of $D_{ij}$ can be used:

$$\frac{d}{dt} D_{ij} = f(|Q_{ij}|) - r D_{ij}, \tag{8}$$

where $r$ is the decay rate of the tube. The equation implies that the conductivity tends to vanish if there is no flux along the edge, while it is enhanced by the flux. The function $f$ is a monotonically increasing continuous function satisfying $f(0) = 0$.

Then the network Poisson equation for the pressure can be obtained from the Equations (4)-(7) as follows [

$$\sum_{i} \frac{D_{ij}}{L_{ij}} (p_i - p_j) = \begin{cases} -1 & \text{for } j = 1, \\ +1 & \text{for } j = 2, \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$

By setting $p_2 = 0$ as a basic pressure level, all $p_i$ can be determined by solving Equation (9), and $Q_{ij}$ can also be obtained.

The function $f$ in Equation (8) is a monotonically increasing continuous function satisfying $f(0) = 0$; therefore, $f(Q) = |Q|$ is used in this paper. With the flux calculated, the conductivity can be updated using Equation (10), which is a discretized form of Equation (8) with $f(Q) = |Q|$ and the decay rate taken as $r = 1$:

$$\frac{D_{ij}^{n+1} - D_{ij}^{n}}{\delta t} = |Q_{ij}| - D_{ij}^{n+1}. \tag{10}$$
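The iteration defined by Equations (4), (9) and (10) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the source flux is fixed at $I_0 = 1$, the step size `dt`, the number of `steps`, the helper `solve_linear`, the greedy read-out of the path along the thickest tubes, and the 4-node test network are all hypothetical choices made here. The length matrix is assumed symmetric, with 0 meaning "no tube".

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def physarum_shortest_path(length, source, sink, steps=100, dt=0.5, i0=1.0):
    n = len(length)
    edges = [(i, j) for i in range(n) for j in range(n) if length[i][j] > 0]
    cond = {e: 1.0 for e in edges}             # initial conductivities D_ij
    free = [k for k in range(n) if k != sink]  # unknown pressures; p_sink = 0
    for _ in range(steps):
        # Equation (9): sum_i D_ij / L_ij * (p_i - p_j) = -I0 at the source,
        # 0 at interior nodes; the sink row is replaced by p_sink = 0.
        a = [[0.0] * len(free) for _ in free]
        b = [(-i0 if j == source else 0.0) for j in free]
        for row, j in enumerate(free):
            for i in range(n):
                if length[i][j] > 0:
                    g = cond[(i, j)] / length[i][j]
                    a[row][row] -= g
                    if i != sink:
                        a[row][free.index(i)] += g
        solution = solve_linear(a, b)
        p = [0.0] * n
        for k, j in enumerate(free):
            p[j] = solution[k]
        # Equation (4) for the flux, then the semi-implicit update (10):
        # D_new = (D_old + dt * |Q|) / (1 + dt).
        for (i, j) in edges:
            q = cond[(i, j)] / length[i][j] * (p[i] - p[j])
            cond[(i, j)] = (cond[(i, j)] + dt * abs(q)) / (1.0 + dt)
    # Assumption: read the path off by following the thickest unvisited tube.
    path, visited = [source], {source}
    while path[-1] != sink:
        node = path[-1]
        options = [j for j in range(n)
                   if length[node][j] > 0 and j not in visited]
        if not options:
            raise ValueError("no surviving tube leads to the sink")
        nxt = max(options, key=lambda j: cond[(node, j)])
        path.append(nxt)
        visited.add(nxt)
    return path, cond


# Hypothetical network: route 0-1-3 has length 2, route 0-2-3 has length 4.
lengths = [
    [0, 1, 2, 0],
    [1, 0, 0, 1],
    [2, 0, 0, 2],
    [0, 1, 2, 0],
]
path, cond = physarum_shortest_path(lengths, source=0, sink=3)
print(path)
```

In this sketch the conductivities on the longer route decay toward zero while those on the shorter route approach $I_0$, which is the behaviour described by the two rules for tube disappearance above.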

Consider an acyclic undirected network $G(N, A)$ consisting of a set of nodes $N = \{1, 2, \dots, n\}$ and a set of $m$ undirected arcs $A \subseteq N \times N$. Each arc is denoted by an ordered pair $(i, j)$, where $i, j \in N$. The weight of arc $t_i$ is given as interval data $w(t_i) = [t_i^-, t_i^+]$. Given two nodes $v_i$ and $v_t$, let $P$ be a path from node $v_i$ to node $v_t$ in the network $G$. The weight of path $P$, denoted $w(P)$, is the sum of the weights of the arcs on the path. As a result, the shortest path problem can be formulated as follows [

$$w(P_0) = \min_{P} \sum_{t_i \in P} w(t_i). \tag{11}$$

The following equation is defined to convert interval data into a crisp number using a parameter $\alpha$ [

$$w(t_i) = \alpha\, t_i^- + (1 - \alpha)\, t_i^+, \quad 0 \le \alpha \le 1. \tag{12}$$
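To make the role of $\alpha$ in Equation (12) concrete, the sketch below converts interval weights to crisp weights and then finds a shortest path. For brevity it uses Dijkstra's algorithm from the standard library's `heapq` rather than the amoeboid organism algorithm; the function names and the small 4-node network are hypothetical.

```python
import heapq


def crisp_weight(lower, upper, alpha):
    """Equation (12): alpha * lower + (1 - alpha) * upper."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * lower + (1.0 - alpha) * upper


def shortest_path(intervals, source, target, alpha):
    """intervals maps an undirected arc (u, v) to its (lower, upper) weight."""
    graph = {}
    for (u, v), (lower, upper) in intervals.items():
        w = crisp_weight(lower, upper, alpha)
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical interval weights: a wide interval on one route, a narrow one
# on the other, so the preferred route changes with alpha.
intervals = {(1, 2): (1, 5), (2, 4): (1, 5), (1, 3): (2, 3), (3, 4): (2, 3)}
for alpha in (0.0, 0.5, 1.0):
    print(alpha, shortest_path(intervals, 1, 4, alpha))
```

With $\alpha = 1$ only the lower bounds count and the route through node 2 is shortest; with $\alpha = 0$ only the upper bounds count and the route through node 3 wins.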

Dynamic Programming (DP) solves the problem by producing $f_1, f_2, \dots, f_n$ sequentially. As mentioned in the previous section, $f_i(x)$ is a monotone nondecreasing step function, so $f_i(x)$ can be represented by the set $SP_i$ of pairs giving the coordinates of the step points of $f_i(x)$.

The size of the set $SP_i$ (i.e. $|SP_i|$) is not greater than $C + 1$, and the pairs are kept in increasing order of $x$ and of $f_i(x)$. The sequence of sets $SP_0, SP_1, \dots, SP_n$ is the history of the DP, which is backtracked by Algorithm 1 to obtain the solution vector $x$; see [
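The step-point representation can be sketched as follows. This is an illustrative version and not the paper's Algorithm 1: each set is stored as a list of (capacity used, profit) pairs, dominated pairs are purged after each merge, and the history is backtracked to recover the 0-1 solution vector. The function name and the four-item data set with capacity 6 are hypothetical.

```python
def knapsack_step_sets(weights, profits, capacity):
    """0-1 knapsack via lists of step points (x, f(x)).
    Returns the best profit, the 0-1 solution vector and the history."""
    history = [[(0, 0)]]                      # SP_0
    for w, p in zip(weights, profits):
        prev = history[-1]
        # Step points obtained by adding the current item to each old point.
        shifted = [(x + w, f + p) for x, f in prev if x + w <= capacity]
        merged = sorted(prev + shifted, key=lambda pair: (pair[0], -pair[1]))
        current = []
        for x, f in merged:
            # Keep a pair only if it improves on every lighter pair.
            if not current or f > current[-1][1]:
                current.append((x, f))
        history.append(current)
    # Backtrack: a pair missing from the previous set must use item i.
    x, f = history[-1][-1]
    best = f
    chosen = [0] * len(weights)
    for i in range(len(weights), 0, -1):
        if (x, f) not in history[i - 1]:
            chosen[i - 1] = 1
            x, f = x - weights[i - 1], f - profits[i - 1]
    return best, chosen, history


# Hypothetical data: four items, capacity 6.
best, chosen, history = knapsack_step_sets([4, 2, 3, 1], [7, 4, 5, 1], 6)
print(best, chosen, len(history[-1]))
```

Because the pairs are strictly increasing in both coordinates and the weights are integers, no set can hold more than $C + 1$ pairs, which matches the bound stated above.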

In this section, a numerical example is used to show the efficiency of the proposed method.

As can be seen in

and $0 \le \alpha \le 1$, but assume that $\alpha$ is set to 0.5,

From

From Equation (12), assuming that $\alpha = 0.5$ is used for all the items’ values in the initial conductivity matrix, the shortest path from node 1 to node 12 can be found using the amoeboid organism algorithm, and the result is given in

It can be seen that different shortest paths are obtained when α has different values.

Consider the problem (X) with the following given data, in which each item $i$ has a weight ($W_i$) and a knapsack profit ($P_i$) obtained by allocating the required resource to the specified item $i$. All $P_i$ and $W_i$ are positive integers.

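$$L = \begin{bmatrix}
0 & 5 & 6.5 & 7.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
5 & 0 & 5 & 0 & 7 & 4.5 & 0 & 9 & 0 & 0 & 0 & 0 \\
6.5 & 5 & 0 & 3 & 0 & 10.5 & 0 & 0 & 0 & 0 & 0 & 0 \\
7.5 & 0 & 3 & 0 & 0 & 6.5 & 8.5 & 0 & 0 & 0 & 0 & 0 \\
0 & 7 & 0 & 0 & 0 & 6 & 0 & 7.5 & 0 & 6 & 0 & 8 \\
0 & 4.5 & 10.5 & 6.5 & 6 & 0 & 5 & 3 & 4 & 8 & 0 & 0 \\
0 & 0 & 0 & 8.5 & 0 & 5 & 0 & 0 & 5.5 & 0 & 0 & 0 \\
0 & 9 & 0 & 0 & 7.5 & 3 & 0 & 0 & 9.5 & 3.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 4 & 5.5 & 9.5 & 0 & 6 & 5 & 0 \\
0 & 0 & 0 & 0 & 6 & 8 & 0 & 3.5 & 6 & 0 & 0 & 4.5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 5 & 0 & 7.5 \\
0 & 0 & 0 & 0 & 8 & 0 & 0 & 0 & 0 & 4.5 & 7.5 & 0
\end{bmatrix}$$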

0 | 1.9929 | 0.0012 | 0 | 0 | 1.9929 | 0 | 0 | 0 | 0 | 0 |
---|---|---|---|---|---|---|---|---|---|---|

0.001 | 0 | 0.001 | 0 | 1.9839 | 0 | 0 | 0.0011 | 0 | 0 | 0 |

0.001 | 0.001 | 0 | 0.001 | 0 | 0.0012 | 0 | 0 | 0 | 0 | 0 |

0.001 | 0 | 0.001 | 0 | 0 | 0.0011 | 0.0011 | 0 | 0 | 0 | 2.0141 |

0 | 0.001 | 0 | 0 | 0 | 0.001 | 0 | 0.001 | 0.001 | 0 | 0 |

0 | 0.001 | 0.001 | 0.001 | 0.001 | 0 | 0.001 | 0.0014 | 0.001 | 0 | 0 |

0 | 0 | 0 | 0.001 | 0 | 0.0012 | 0 | 0 | 0 | 0 | 0 |

0 | 0.001 | 0 | 0 | 0.001 | 0.001 | 0 | 0 | 0.0027 | 0 | 0 |

0 | 0 | 0 | 0 | 0 | 0.001 | 0.001 | 0.001 | 0.001 | 0.0012 | 0 |

0 | 0 | 0 | 0 | 0.001 | 0.001 | 0 | 0.001 | 0 | 0 | 0.0106 |

0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0 | 0.0011 |

0 | 0 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0.001 | 0.001 | 0 |

Then, the subproblem SP[TN, Capcount] in Algorithm 4 is computed to find an optimal solution for the list (SP) of the n items.

The network in the formulation has several layers of nodes: one layer corresponding to each item, one layer corresponding to a source node $s$, and another corresponding to a sink node $t$. The layer corresponding to an item $i$ has $W + 1$ nodes, $i^0, i^1, \dots, i^W$.

Consider the knapsack problem in which the knapsack has a capacity of $W = 6$, $v_j$ is the value of item $j$, and $w_j$ is the weight of item $j$. The first number in each of the remaining circles denotes the item’s index, and the second number denotes the capacity of the knapsack that the solution has consumed. The number along each arc gives the value of the corresponding item. For example, the circle labeled (1, 4) in the first layer means that item 1 has used 4 units of the knapsack’s capacity. At the same time, each path from node S to node E represents a feasible solution to the problem. For example, the path S − (1, 4) − (2, 6) − (3, 6) − (4, 6) − E means that item 1 and item 2 are included in the knapsack, while item 3 and item 4 are omitted. It also shows that the solution to the knapsack problem corresponds to the longest path in the network.
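The correspondence between knapsack solutions and longest paths can be sketched as below. Node `(i, c)` stands for "items 1..i decided, c units of capacity used", with `(0, 0)` playing the role of S; E is reached by zero-value arcs from the last layer. Only the capacity $W = 6$ and the weights of items 1 and 2 (4 and 2, as implied by the path quoted above) come from the text; the remaining weights and all values are hypothetical, as is the function name.

```python
def knapsack_longest_path(weights, values, capacity):
    """Longest S -> E path in the layered knapsack network."""
    n = len(weights)
    # longest[node] = (length of the longest path from S, predecessor node)
    longest = {(0, 0): (0, None)}
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for (layer, c), (dist, _) in list(longest.items()):
            if layer != i - 1:
                continue
            # Two outgoing arcs: skip item i (value 0) or take it (value v).
            for used, gain in ((c, 0), (c + w, v)):
                if used > capacity:
                    continue
                if dist + gain > longest.get((i, used), (-1, None))[0]:
                    longest[(i, used)] = (dist + gain, (layer, c))
    # E collects zero-value arcs from every node in the last layer.
    last_layer = [node for node in longest if node[0] == n]
    end = max(last_layer, key=lambda node: longest[node][0])
    path, node = [], end
    while node is not None:
        path.append(node)
        node = longest[node][1]
    return longest[end][0], path[::-1]


# Weights of items 1 and 2 follow the text; everything else is assumed.
value, path = knapsack_longest_path([4, 2, 3, 1], [7, 4, 5, 1], 6)
print(value, path)
```

Because every arc points from one layer to the next, the network is acyclic and the longest path can be found in a single pass over the layers.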

The shortest path problem plays an important role in many applications. In this paper, based on an amoeboid organism algorithm, a new procedure is proposed to solve shortest path problems with interval data. A numerical example is given to show the efficiency of the proposed method. The 0-1 knapsack problem also plays an important role in real-life applications. In this paper, based on the amoeboid organism algorithm and the classic 0-1 knapsack algorithm, a new method is suggested to solve classical 0-1 knapsack problems. We have used the benchmark problems to test the amoeboid organism algorithm. The computational results demonstrate the efficiency of the presented approach. One direction of our ongoing work is to solve other 0-1 knapsack problems under more complex situations, such as the multi-objective shortest path problem and the knapsack problem with more criteria.

The authors declare no conflicts of interest regarding the publication of this paper.

Yosef, E., Salama, A. and Wahed, M.E. (2018) Big Data Flow Adjustment Using Knapsack Problem. Journal of Computer and Communications, 6, 30-39. https://doi.org/10.4236/jcc.2018.610003