Journal of Intelligent Learning Systems and Applications
Vol.07 No.01(2015), Article ID:54046,15 pages
10.4236/jilsa.2015.71003

Applying DNA Computation to Error Detection Problem in Rule-Based Systems

Behrouz Madahian1, Amin Salighehdar2, Reza Amini3

1Department of Mathematical Sciences, University of Memphis, Memphis, USA

2Financial Engineering Department, Stevens Institute of Technology, Hoboken, USA

3Department of Computer Science, Florida International University, Miami, USA

Received 25 January 2015; accepted 10 February 2015; published 13 February 2015

ABSTRACT

As rule-based systems (RBS) technology gains wider acceptance, the need to create and maintain large knowledge bases will assume greater importance. Demonstrating a rule base to be free from error remains one of the obstacles to the adoption of this technology. In the past several years, a vast body of research has been carried out in developing various graphical techniques such as utilizing Petri Nets to analyze structural errors in rule-based systems, which utilize propositional logic. Four typical errors in rule-based systems are redundancy, circularity, incompleteness, and inconsistency. Recently, a DNA-based computing approach to detect these errors has been proposed. That paper presents algorithms which are able to detect structural errors just for special cases. For a rule base, which contains multiple starting nodes and goal nodes, structural errors are not removed correctly by utilizing the algorithms proposed in that paper and algorithms lack generality. In this study algorithms mainly based on Adleman’s operations, which are able to detect structural errors, in any form that they may arise in rule base, are presented. The potential of applying our algorithm is auspicious giving the operational time complexity of O(n*(Max{q, K, z})), in which n is the number of fact clauses; q is the number of rules in the longest inference chain; K is the number of tubes containing antecedents which are comprised of distinct number of starting nodes; and z denotes the maximum number of distinct antecedents comprised of the same number of starting nodes.

Keywords:

DNA Computing, Rule-Based Systems, Rule Verification, Structural Errors

1. Introduction

Adoption of expert systems in real world applications has been greatly increased. In past years, much effort has been devoted to analyze different aspects of rule-based systems such as knowledge representation, reasoning, and verification of rule-based systems [1] -[5] . A rule base, which is the central part of an expert system codifies the knowledge from domain expert in the form of inference rules. Often these inference rules are built into a rule base incrementally over years and subject to frequent refinements. Due to the different and even conflicting views provided by domain experts besides the above construction process, a rule base can contain many structural errors. According to [1] - [4] [6] , and several other studies, four typical types of structural errors include inconsistency (conflict rules), incompleteness (missing rules), redundancy (redundant rules), and circularity (circular depending rules). Many different techniques have been developed to detect the above errors in rule-based systems. Earlier work mainly focused on detecting structural errors by checking rules pair-wisely [7] - [9] . Recent work aimed at detecting structural errors caused from applying multiple rules in longer inference chains. The majority of recent verification techniques involve using some graphical notations such as graphs [10] - [12] , and Petri nets [2] [3] [6] [13] - [15] .

DNA computation has emerged in recent years as an exciting new research field at the intersection of computer science, biology, engineering, and mathematics. There exist two main barriers to the continued development of traditional silicon-based computers [16] . Invention of silicon integrated circuits and advances in miniaturization has led to incredible increases in processors speed and memory access time. However, there is a limit to how far this miniaturization can go. Eventually chip fabrication will hit the wall imposed by the Heisenberg Uncertainty Principle (HUP) [16] . DNA computing based on its complementary characteristics and massive parallelism (when a step is performed in an experiment, the operation is performed in parallel on all molecules in the tube) has the potential to solve complex problems such as NP-Complete ones [16] . The physician Richard Feynman first proposed the idea of using living cells and molecular complexes to construct sub-microscopic computers [17] . After Feynman proposal, there has been an explosion of interest in performing computations at molecular level. Adleman who used DNA strands to solve a directed Hamiltonian path problem, indicated the feasibility of a molecular approach to solve combinatorial problems [18] . Subsequently, by solving satisfiability problem (SAT), Lipton demonstrated the advantage of using the massive parallelism inherent in DNA-based computing.

Authors in [1] proposed algorithms, which utilized DNA computing to render an error free rule base for rule- based systems. The algorithms proposed in [1] lack generality since just for special cases of rule base can work and there are cases that structural errors are not removed correctly by utilizing their algorithms. Rules in a Rule- Based system are typically formed as X → Y, in which X is an antecedent node (AN) and Y is a conclusion node (CN). Two nodes, atomic and compound, exist in rule bases. We are interested in the problem domain of Horn Clauses, as addressed in [2] - [4] , which allow only one conclusion part in each rule and compound antecedents in rules are only presented in the conjunction format. So, before we transform rules to their corresponding rule paths, a normalization should be carried out in order to obtain Horn Clause form of the rules [3] [4] . In this paper, we are interested in finding structural errors and the set of rules causing these errors. The reasons of structural errors may be due to rule conflicting, mismatched condition and conclusion, and circular and redundant rules [3] [13] [15] . Inconsistent rules result in conflict, which is the direct source of incorrect rule derivation. Redundant rules increase the size of rule base and cause non necessary reasoning. Incomplete rules prohibit the rule base from activating certain normal rule derivation. Circular dependent rules will force the rule base to run into an infinite loop of reasoning.

In this study, algorithms are developed to cope with all cases. Our algorithms have the ability to detect structural errors in any form that they may occur in rule base. As a result, DNA computing as an alternative to verify structural errors in rule-based systems gains more generality. The remainder of this paper is organized as follows. Structural errors are briefly described in Section 2. In Section 3, we outline DNA computation and introduce our DNA-based algorithms to detect structural errors in Rule-Based systems. We analyze the complexity of our algorithm and conclusion is represented in Section 4.

2. Typical Structural Errors in Rule Bases

・ Redundancy. When unnecessary rules exist in the rule base, redundancy occurs. These rules not only increase the size of the rule base but also may cause additional useless inferences. Redundancy is a potential source of inconsistency when knowledge is updated [4] . A rule is redundant with respect to the conclusion if two rules have identical conditions and conclusions (identical rules) or two rules have identical conclusions while the condition for one rule is either a generalization or special case of the condition for the other one [3] [4] . The rule, which has more general condition subsumes (is stronger than) the other rule. Logical redundancy implies operational redundancy [4] . Thus, subsumed (less general) rules can be eliminated, which has no influence on logical inference capability [4] .

・ Incompleteness. When there are missing rules in a rule base, incompleteness occurs. Except the rules for representing facts and goal nodes, a rule is called as a useless rule if the rule’s condition (conclusion) cannot be matched by other rules’ conclusion (condition). The unmatched condition are called dangling conditions, while the unmatched conclusions are called dead-end conclusions. Mostly, the reasons of useless rules are due to some missing rules.

・ Circularity. When two or several rules have circular dependency, Circularity occurs. Circular dependent rules can cause infinite reasoning and must be broken.

・ Inconsistency. Since inconsistent rules result in conflict facts, for correct functioning of an expert system, inconsistency must be resolved. Two rules r1 and r2 (that their conclusions are not compatible) are inconsistent if there exists a state, such that simultaneously both antecedents (pre-conditions) of r1 and r2 can be fired [4] .

2.1. The Structure of DNA and Basic Denaturing and Annealing Operations

DNA (deoxyribonucleic acid) encodes the genetic information of cellular organisms [16] . It consists of polymer chains, commonly referred to as DNA strands. Each strand may be viewed as a chain of nucleotides, or bases, attached to a sugar-phosphate backbone. The four DNA nucleotides are Adenine, Guanine, Cytosine, and Thymine, commonly abbreviated to “A”, “G”, “C”, and “T” respectively. Each strand, according to chemical convention, has a 5ʹ and 3ʹ end. Thus, any single strand has a natural orientation. This orientation is due to the fact that one end of the single strand has a free 5ʹ phosphate group, and the other end has a free 3ʹ deoxyribose hydroxyl group. “A” bonds with “T” and “G” bond with “C”. The pairs (A, T) and (G, C) are therefore known as complementary base pairs. The two pairs of bases form hydrogen bonds between each other. Double stranded DNA may be dissolved into single strands (denatured), by heating the solution to a temperature determined by the composition of the strand [19] . Heating breaks the hydrogen bonds between complementary strands. Annealing is reverse of denaturing, whereby a solution of single strands is cooled, allowing complementary strands to bind together (Figure 1).

Figure 1 demonstrate the annealing of 5ʹ end of a single strand DNA to the 3ʹ end of another DNA in presence of DNA ligase and a splint. In this figure splint has 20 base pairs and consists of the complement of the 10 nucleotides at the 3ʹ end of one strand and the complement of 10 nucleotides at the 5ʹ end of the other.

In order to anneal the 5ʹ end of a single strand DNA to the 3ʹ end of another DNA, in the presence of “DNA ligase” we hybridize a set of specific splint oligos of length 20. Each splint consists of the complement of the 10 nucleotides at the 3ʹ end of one strand and the complement of 10 nucleotides at the 5ʹ end of the other.

2.2. Initial Set Construction

Oligonucleotides uniquely encoding each node and splint are assigned. As proposed by Barich [20] , there are

Figure 1. Annealing and denaturing and using annealing to form a long single stranded DNA.

some constraints for strand library design indicating that sequences must be designed in a way that strands have little secondary structure in order to prevent unintended probe-library hybridization. Thus, short oligonucleotides uniquely encoding each node and splint must be used. Then, splints are used to join to their complement sequence. Hence, by defined polarity, short single stranded oligos which represent nodes covalently join and create longer single stranded DNA molecules. The above procedure enables us to encode each rule path, including appropriate nodes in form of long single stranded DNA.

2.3. Operations Description

We use the basic operations on strands that are defined by Adleman [18] , the Parallel Filtering Model of Amos [16] , and an alternative filtering-style model called the Stickers Model developed by Roweis [21] . In all filtering models a computation consists of a sequence of operations on finite multi-set of strands. It is normally the case that a computation begins and terminates with a single multi set of strands. An initial solution consists of strands which are of length O(n). Where n is the problem size. The initial solution should include all possible solutions (Each encoded by a strand) to the problem to be solved. The point here is that the initial set in any implementation of the model is assumed to be relatively easy to generate as a starting point for the computation. The computation then proceeds by filtering out strands which encode illegal solution and cannot be the result. The operations defined in parallel filtering models and Adleman experiments are as follows. The implementation of these operations can be found in detail in [16] [18] [21] and [22] .

・ Separate (T, s, k, Ton, Toff): This operation separates strands that contain sequence “s” starting from position “k”, into Ton; otherwise, into Toff [21] .

・ Extract (T, s, T+, T): Given a tube T and a sub-strand “s”, this operation creates two new sets T+, T, where T+ include all strands in T containing “s”, and T includes all strands in T that do not contain “s” [18] .

・ Union (T, T1, T2, …, Tn): This operation creates set T, which is the set union of the T1, T2, …, Tn [16] .

・ Copy (T, T1, T2, …, Tn): This operation produces copies T1, T2, …, Tn of the set T [16] .

・ Detect (T): Given a set T, this operation returns true (Y), if T contains at least one DNA strand; Otherwise, it returns false (N) [22] .

・ Read (T): This operation describes each DNA strand in set T [22] .

・ Remove (T): This operation removes all strands in tube T [22] .

2.4. Encoding Inference Rules by DNA Strands

For a general inference rule with compound antecedent (R:(X1Λ… ΛXn) → Y), each antecedent node (AN) and conclusion node (CN) are encoded by a 20-mer DNA strand. To encode the relations “Λ” and “→”, two tetra- nucleotide sequences, “AAAA” and “CCCC” (and their complements) are used respectively. Thus, by creating 24 nucleotide long splints, which contain the appropriate tetra-nucleotide, relations “Xi Λ Xj” and “Xi → Xj” are enforced [1] . The “Xi → Xj” is a splint whose sequence in the 3ʹ-5ʹ direction is the concatenation of the complement of 10 nucleotides at the 3ʹ-end of the strand node Xi, four nucleotides of “CCCC”, and the complement of 10 nucleotides at the 5ʹ-end of the strand Xj. Similarly, “Xi Λ Xj” is a splint whose sequence is the concatenation of complement of 10 nucleotides at the 3ʹ-end of the strand Xi, four nucleotides of “AAAA”, and the complement of 10 nucleotides at the 5ʹ-end of strand Xj. All strands representing starting nodes are designed in the way that, all of them have common sub-strand “TTTTTTTTTT” at the 5ʹ end of their strands and all strands representing the goal nodes are designed in the way that all of them contain the common sub-strand “GGGGGG GGGG” at the 3ʹ end of their strands. Thus, sequences “AAAAAAAAAA” and “CCCCCCCCCC” are needed in our algorithm to distinguish these nodes, as explained in Section 3.6.1. Finally, in order to make sure that strands representing the complements of “TTTTTTTTTT” and “GGGGGGGGGG” will bond to the starting nodes and goal nodes respectively and nowhere else, all strands representing the other nodes should be designed in the way that they do not have successive “G”s or “T”s at the 3ʹ or 5ʹ end of their strands. For instance Figure 2(a) shows three distinct strands representing nodes. Figure 2(b) shows two distinct strands representing “Λ” operator and two distinct strands representing “→” operator. The resulting strands are shown in Figure 2(c). Since permutations of k-node compound antecedent creates the same antecedent (k! strands representing the same AN); therefore, for each rule with compound antecedent, one of the CN for the k! strands is chosen randomly and poured into tube Tr1 in order to detect redundancy and circularity [1] . We encode all permutations of antecedent of the rules with compound AN and pour them into tube TΛ. In order to detect subsumed rules, special tubes TΛsk are

(a)(a)(c)

Figure 2. (a) Distinct strands representing each node; (b) Splints representing “Λ” and “→” operators; (c) The resulting long strands R1 and R2, both representing the same rule.

created as described below. The antecedent of all rules with compound AN, which are comprised of k starting nodes, are poured into tube TΛsk. The strands are labeled with ki. For instance, TΛs1 is comprised of antecedents of all rules with atomic AN of starting nodes and are labeled as 1i strands (ith strand of tube TΛs1) and TΛs2 is comprised of antecedents of all rules with compound AN of two starting node labeled as 2i strands. Thus tubes TΛskc, Comprised of many copies of complements of strands in tubes TΛsk are created. These are used in Detect Redundancy algorithm described in Section 3.6.4.

2.5. Generating the Solution Space

An essential difficulty in all filtering models is that initial multi sets of strands generally have quantity, which is exponential in the problem size [16] . What is done in practice is that an initial set is constructed containing a polynomial number of distinct strands. The design of these strands ensures that exponentially large initial set of the system (rule paths) can be generated automatically [16] . Sequence of pre-steps are carried out in order to create the initial solution space containing rule paths. An example of creating the initial set for a simple rule base is depicted in Figure 3. These pre-steps are alike what is presented in [1] , as follows.

Pre-Step 1. Tubes TΛ, T →, Tr1, Tr2, Tr3, Tc, and Ts are needed at this stage. Strands representing splints “XΛY” And “X → Y” are poured into tube TΛ and T→ respectively. Starting nodes and goal nodes are not conclusions and antecedent parts (conditions) of any rule respectively. Thus, in order eliminate these kinds of rules and prevent circularity from occurring to starting nodes (in our sample rule base, X1) and goal nodes (in our sample rule base, X6), copies of strands designed as “GGGG” followed by the 10 nucleotides at the 5ʹ end of starting nodes and 10 nucleotides at 3ʹ end of the goal nodes followed by “GGGG” are poured into tube T→. Thus any splint in T→ that anneals to the above strands, is removed. We may have rules in which the goal nodes are included in compound antecedents. In order to eliminate these kinds of rules, copies of strands designed as 10 nucleotides at 3ʹ end of goal nodes followed by “TTTT” and “TTTT” followed by 10 nucleotides at the 5ʹ end of the goal nodes are poured into tube TΛ. Thus, any splint in tube TΛ that anneals to the above strands is eliminated (assuming Y as goal node, splints formed as Xi Λ Y and Y Λ Xi are eliminated). Next the CN of rules with compound AN are poured into tube Tr1 and strands representing CN for each atomic AN is poured into tube Tr2. In order to identify which CN has more than one rule leading to it, strands in Tr2 are poured into tube Tr1 to make sure that Tr1 contains the CN for all rules. Then only one copy of the complements for all CN is poured into Tr1. Any CN in Tr1that does not anneal to its complement represents CN with more than one rule leading to it and is poured intoTr3. Next copies of all AN are poured into Tc.

Pre-Step 2. Splints in TΛ and copies of strands “TTTT” are poured into tube Tc and DNA ligation is allowed to occur. By means of splints, each set of compound AN would stick together and creates a double stranded DNA in length of the splint. Then single strands are separated from Tc to Ts.

Figure 3. Initial set generation.

Pre-Step 3. To process the situation that one of the compound AN of a rule is the CN of another rule, popula- tion of strands from T→ are poured into tube Tc. Splints in T→ bond to the strands mentioned above and long double stranded DNAs, which are subset of rule chains are formed.

Pre-Step 4. At this stage copies of “GGGG” and the strands from Ts are poured into Tc to form rule-chain subsets containing atomic AN. Each long strand created at this step corresponds to one set of possible inference rule paths that may contain any of typical structural errors.

Pre-Step 5. At this stage, double-stranded DNAs from Tc are denatured and poured into tube T. Finally, all possible rule sequences are represented in T.

2.6. Detection of Structural Errors

2.6.1. Removing Incomplete Rule Paths

In order to remove incomplete rule paths, which do not start with starting nodes or do not lead to the goal node the algorithm below is performed [1] .

Detect completeness algorithm

1) Input (T);

2) Extract (T, “TTTTTTTTTT”, Ty, Tincomp);

3) Extract (Ty, “GGGGGGGGGG”, T, Tincomp);

4) Remove (Tincomp).

All strands containing at least one starting node in their sequence are extracted from T into tube Ty; otherwise, into tube Tincomp at line 2 (multiple starting nodes can be at the beginning of rule paths in form of compound antecedent of the first rule). This extract operation is carried out by pouring many copies of strand “AAAAAAAA AA” into tube T. This strand only anneals to single strands containing “TTTTTTTTTT”. As explained in Section 3.4, only strands representing starting nodes are designed so that all of them have this sub-strand. At line 3, strands containing goal nodes are extracted from Ty and poured into tube T; otherwise, into tube Tincomp. This extract operation is carried out by pouring many copies of strand “CCCCCCCCCC” into tube T. This strand only anneals to single strands containing “GGGGGGGGGG”. As explained in Section 3.4, only strands representing goal nodes are designed so that all of them have this sub-strand. At the end of algorithm, strands in tube Tincomp represent incomplete rule paths and should be removed. By performing this algorithm just one time, all complete rule-paths with different starting nodes and goal nodes are extracted. Thus, there is no need to perform the algorithm for all starting nodes and goal nodes repeatedly. Assuming that Xi is a starting node and Yi is a goal node, it should be noted that there is no splint to complement the 10 nucleotides at the 5ʹ end of Xi; therefore, in all strands containing Xi, it has to be located in front of the strands (in case of starting nodes in form of compound AN there is no splint to complement 10 nucleotides at the 5ʹ end of first starting node). Similarly, there is no splint to complement the 10 nucleotides at the 3ʹ end of Yi. Thus, in all strands containing Yi, it has to be located at the end of the strand.

2.6.2. General Algorithm to Detect Circularity

Algorithm proposed in [1] is aimed at removing all the circularity depending rules that may exist, by removing strands, in which the same node appears in at least two location. Only strands containing the nodes that exist in Tr3 are likely to have circularity problem. Assuming z to represent the number of nodes in Tr3. We modify the algorithm presented in [1] and call it “Detect Circularity Part 0” algorithm as follows.

Detect Circularity Part 0

Tube Tif is composed of strands containing any goal node at position q*24 and thus there is no need to be compared. At the end of this algorithm, all rule chains, in which node Xi appears at least twice in the strand are poured into tubes Ticir. We remove these strands from T. There are some cases that this algorithm is unable to remove circularity error and after applying the algorithm, circularity error will remain in rule base. Suppose that after performing above algorithm, Xi and Xj are found to be circular nodes and there exist at least two paths between nodes Xi and Xj or more precisely there exist two rules or chains of rules acting reverse between these two nodes (e.g. Xi → Xj and Xj → Xi) besides one or more distinct chains of rules from a starting node (starting nodes can be distinct) leading to nodes Xi and Xj in addition to distinct chains of rules from these nodes leading to the goal node. In such a situation, this algorithm is unable to remove circularity error. That is, by removing rule paths which have at least two occurrences of nodes Xi and Xj, circularity error will not be removed. In order to clarify these situations, take simple rule base shown in Figure 4 as an example. Assuming X1 and X4 are starting and goal nodes respectively. The resulting directed graph made by these rules and all the paths starting from node X1 and leading to X4 is depicted in Figure 4.

By performing the “Detect Circularity Part 0” for nodes X2 and X3 strands number 5 and 6 are detected to have two occurrence of these nodes respectively. These strands are poured into tubes T1cir and T2cir. Next, we remove these strands from T. Now, if we establish the directed graph made by rules embedded in remainder paths, we see that removing paths 4 and 5 will not result in elimination of any of rules causing circularity error. That is, these rules (R4: X2 → X3, R5: X3 → X2) exist in other rule paths. Thus, this error remains and the algorithm is unable to remove it. As a matter of fact this is the case for all rule base with rules or chains of rules acting reverse between circular nodes, in addition to rules or chains of rules from a starting node leading to each one of these circular nodes and having from each circular node, paths or more precisely chains of rules, leading to the goal node. Thus, by means of these algorithm circularity error for these cases cannot be eliminated. To make these statements more clear, another instance of rule base and its corresponding directed graph is depicted in Figure 5. Assume nodes X1, X5 to be starting node and goal node respectively. As it is obvious from the directed graph of this rule base, there exists a circle in this rule base comprised of rules R5, R6, and R7. These rules are circularly dependent. Similar to the previous section we make an attempt to remove this error by means of above algorithm. All the complete paths start from node X1 leading to X5 are as follows.

By executing the algorithm that explained above, for nodes, which are present in Tr3 (X4, X3, and X4), circular paths {4, 8, 12} are found in which nodes X2, X3, and X4 appear twice in these strands respectively. We take into account the remainder paths and establish the directed graph constructed by rules embedded in them (rule paths {1, 2, 3, 5, 6, 7, 9, 10, 11}). Obviously the directed graph made by these rules is the same as the directed graph of original rule base. We notice that all circularly dependent rules (R5, R6, R7) still exist in remainder paths (and consequently in rule base) and none of them is removed. Thus, after performing the algorithm, circularity error still exists in rule base. It should be noted that, as it is obvious from our examples, in such a situation there exist complete rule paths having two occurrence of circular node Xi, in which circular node Xj is located between two position of Xi. And there exist complete rule paths having two occurrence of circular node Xj in which circular node Xi is located between two position of Xj (in our example paths 4, 8, and 12 have this property for nodes X4, X3, and X4).

In this section we propose an algorithm, which can perfectly remove circularity errors. This algorithm comprises three parts performed for each pair from circular nodes (Xi, Xj) that has been found in the previous

Figure 4. Rule base with circularity error.

Figure 5. Rule-base with circularity error.

algorithm. It should be noted that all paths at this stage start from starting nodes and there are no rules in which starting nodes are inferred from them. Thus, in performing different parts of our Detect Circularity algorithm, we don’t need to check position 0 to 24 of the paths. That is, in all parts of Detect Circularity algorithm, q is initially equal to one (q = 1). First, Detect Circularity Part 0 is performed. This algorithm results in separating paths, in which node Xi appears at least in two positions (i.e. circular nodes are found) in the strands and pouring them into tube Ticir. Next, for instance, if we assume that three circular nodes X2, X3, and X4 are found, three pairs (X2, X3), (X2, X4), and (X3, X4) should be analyzed in subsequent parts of our algorithms. Using k to represent the number of pairs from circular nodes other parts ofour algorithm is represented in Table 1.

Detect Circularity Part 1: At this part of our algorithm for each pair from circular nodes, the algorithm is performed in parallel as follows. Three extra tubes (Tija, Tijb, Tijc) are necessary for each pair (Xi, Xj). Initially, k copies of Ti cir is created as tubes Tij in parallel. Lines 4 to 9 are carried out until there is no strand in tube Tij. At line 6, strands from Tij having node Xi at position q*24 are extracted and poured into tube Tija. At line 7, strands from Tijb that all of them has occurrence of Xi are extracted and poured into tube Tijc if they have Xj located after the first position of Xi. Strands in Tijc represent paths having Xj located between first and last position of Xi (strands in Tij include at least two occurrence of node Xi). Within the qth iteration of the algorithm, line 5 checks whether Tijc contains any strands. If it is so, the algorithm makes sure that there exist paths that Xj is located between first and last position of node Xi and part one of the algorithm finishes; otherwise, it will continue until tube Tij contains no strands. If tube Tijc contains no strands. After performing this part, it means that there is no rules or chains of rules between these two circular nodes acting reverse (which cause circularity still remains in the rule base after performing Detect Circularity Part 0). Thus by removing strands in Ticir and Tjcir from tube T, the remainder paths contain no circularity dependent rules caused by these two nodes. And performing Detect Circularity part 0 is enough in order to remove circularity. The next part of the algorithm is performed. If tube Tijc contains any strand. The second part of the algorithm is performed in parallel as follows.

Detect Circularity Part 2: At this part of our algorithm for each pair from circular nodes, the algorithm is performed in parallel as follows. Three extra tubes (Tjia, Tjib, Tjic) are necessary for each pair (Xi, Xj). Initially, k copies of Tjcir is created as tubes Tji in parallel. In this part, between lines 4 to 9, for all pairs of circular nodes in parallel, if there exist a strand in Tji, which has sub-strand representing Xi located between two position of sub- strand representing Xj, is poured into tube Tjic. There is no rules or chains of rules acting reverse between circular nodes Xi and Xj If there is no strand in Tjic. In this situation, by removing strands in tubes Ticir and Tjcirfrom T, we will be sure that there is no circularity error considering nodes Xi and Xj; otherwise, we put into practice the third part of our algorithm for each pair from circular nodes, which both previous parts has been fulfilled for them as follows.

Table 1. Detect circularity algorithm.

Detect Circularity Part 3: At this part of the algorithm four extra tubes are necessary for each pair (Xi, Xj) from circular nodes. Initially, k copies of T are generated as tubes Ti. At the final part of our Detect Circularity algorithm for each pair from circular nodes (Xi, Xj), Xi or Xj is selected (here Xi is chosen). Then between lines 4 to 9, all paths in which node Xj is located after Xiare extracted from Ti and poured into Ti2 in parallel. At the end of this part of our algorithm, tubes Ti2 are merged into tube T2. We then extract strands in T2 from tube T. Thus, all complete paths from T, in which node Xj is located after node Xi will be removed from T. Thus, we make sure that after performing this part, there are no rules or chains of rules from Xi leading to Xj in tube T. Consequently one of rules or one of chains of rules causing circularity between Xi and Xj are removed. In the end, the resulting rule base will be free of any form of circularly dependent rules.

In order to demonstrate effectiveness of the algorithm, we perform it for rule base shown in Figure 5. As stated above, in this rule base, rules {R5, R6, R7} cause circularity between nodes {X2, X3, X4}. After performing Detect Circularity Part 0, each tube Ticir has strands shown below.

T2cir = {R1R4R5R6R7} T3cir = {R3R5R6R4R9} T4cir = {R2R6R4R5R8}

These strands are removed from T. In order to clarify the procedure of our algorithm, we perform it for each pair of circular nodes {(X2, X3), (X2, X4), (X3, X4)} successively. In Detect Circularity Part 1, tubes T23, T24, and T34 are created for these pairs of circular nodes in parallel at line 3. Tubes T32, T42, and T43 are created for these pairs of circular nodes in parallel at Detect Circularity Part 2, line3. First consider nodes (X2, X3). By performing Detect Circularity Part 1, In T2cir(T23), we find X3located between two position of X2. Thus, Detect circularity part 2 is performed and X2 is found to be located between two positions of X3 in T3cir(T32). Thus, there exists rules (chains of rules) acting reverse between these nodes (i.e. {R5, (R6, R7)}). At Detect circularity part 3, we choose X2 and extract all strands, in which X2 is located before X3 from T1 and pour them into T12. We remove these strands from T. Paths 2, 3, and 7 are removed from T and the remainder paths are as follows.

1: R2R8: X1→X2→X5

5: R3R9: X1→X4→X5

6: R3R7R8: X1→X4→X2→X5

9: R4R10: X1→X3→X5

10: R4R6R9: X1→X3→X4→X5

11: R4R6R7R8: X1→X3→X4→X2→X5

Now, we perform the algorithm for nodes (X2, X4). We perform Detect circularity part 1 for node X2 and in tube T2cir(T24), we find X4located between two position of node X2 besides finding (in tube T4cir(T42)) X2 located between two position of node X4, in Detect circularity part 2. In Detect circularity part 3, we choose node X2 and extract from tube T2 all strands having X2 located before X4 in their sequence. These strands (if there exist any) should be removed from T. In the end, we perform the algorithm for nodes (X3, X4). Since in Detect Circularity Part 1, we find paths in which (in tube T3cir(T34)), X4 is located between two position of node X3 in addition to finding paths, in which X3 is located between two position of X4 in tube T4cir(T43) in Detect Circularity Part 2, we choose one of these nodes (here we choose X4) and remove all strands, in which X4 is located before X3. Consequently the remainder paths and the directed graph made by them are shown in Figure 6.

As it is obvious, rule R4 is removed and circularity error is eliminated from the rule base. Consequently, there is no circle between rules in the resultant rule base. It should be noted that Detect Circularity Part 3 (in lines 5 and 6), depending on the selection of the node that is located before the other in strands, extracts strands in which selected circular node is located before the other circular node (for instance, X2 is located before X3 in our example). Therefore, at least one of rule chains (rules) causing circularity between circular nodes is removed and

Figure 6. Resultant rule base after removing circularity.

at the most, all the rule chains (rules) causing circularity between circular nodes are removed (in our example all rules {R4, R5, R6} at the most). This removal is dependent on the node selection in Detect Circularity Part 3. however, all pairs from circular nodes, which fulfill Detect Circularity Part 1 and Part 2, has distinct rules or chains of rules from starting nodes leading to each one of them. Thus, our algorithm does not cause incompleteness in the rule base in all cases.

2.6.3. Inconsistency Detection of Rule Bases

Conflicts are known conditions in the system. Thus, we define conflicting nodes as pair (Xi, Xj). If there exists a physical state for which the rules resulting in conflicting nodes can be fired simultaneously, one can say that the rules are physically (practically) conflicting [4] . Our Detect Conflict algorithm is performed for each pairs of conflicting nodes in parallel as follows (assuming k is the number of conflicting pair nodes).

Detect Inconsistency

For each pair of conflicting nodes (Xi, Xj) in parallel, strands containing node Xi are extracted from Tz and poured into tube Tz+; otherwise, poured into tube Tz (line 3) .Strands containing Xj are extracted from Tz+ and poured into Tube Tz2; otherwise, into tube Tz1 (line 4). Strands containing Xj are extracted from Tz and poured into Tz3; otherwise, into Tz4 (line 5). Strands in Tz1 and Tz3 contain Xi and Xj within their sequences respectively. Strands in tube Tz2 contain conflicting nodes Xi and Xj in their chain and are invalid. These tubes are merged into tube Tr to be discarded. According to the definition of inconsistency [4] , two rules resulting in inconsistent nodes (Xi, Xj) (and consequently their corresponding rule paths in tubes Tz1, Tz3) are inconsistent, if there exists a state, such that simultaneously both rules can be fired. Although, this is a potential inconsistency, such a possibility exists. These kinds of rules should be further analyzed by domain experts. If there exist a physical state for which the rules can be fired simultaneously, they should be analyzed, modified, or one of them be eliminated. Modification is carried out to arrive at pre-conditions (antecedents) for these rules that cannot be fired at the same time. It is the problem of so called conflict resolution mechanism to select a single rule to be fired [4] . Conflict situations can be solved with appropriate inference control mechanism [4] . If one wants to keep both of these rules, this can be done by controlling the facts and inference control and avoid generating inconsistency by selection of one of the rules (never fire the second rule if the other was fired (e.g. by priority mechanism) [4] . In the event that if, it is decided to eliminate one of the rules resulting in inconsistent nodes (Xi, Xj), this can be done by removing strands in one of these tubes (strands in tube Tz1 or Tz3) from tube T (based on domain experts decision). In this way, inconsistency can be eliminated and the resultant rule base will be free of inconsistency error.

2.6.4. Redundancy Detection of Rule Bases

According to definition of redundancy, a rule is redundant with respect to the conclusion, in the even that if two rules have identical conditions and conclusions identical rules) or two rules have identical conclusions while the condition for one rule is either a generalization or special case of the condition for the other one [3] [4] . The rule, which has more general condition subsumes (is stronger than) the other rule. Suppose that we have more than one starting node in rule base and starting nodes are logically independent and none of them implies the other. For instance, consider the simple rule base and directed graph established for it in Figure 7. Assume nodes X1, X2, X3, and X4 are starting nodes and X8 is the goal node.

Complete paths from starting nodes leading to goal node are as follows.

According to the definition of redundancy, paths 1, 2, 4 are not redundant from paths 3. Thus, in order to maintain the completeness of rule base, both rule paths 1 and 3 (rules R5 and R8) are system required and should remain in rule base (since these rules are not subsumed by any other rule in this rule base). However, algorithm proposed in [1] considers these rule paths redundant from each other. In the subsequent section we propose an

Figure 7. Resultant rule base after removing circularity.

algorithm to detect redundancy in which we have taken into account starting nodes and assumed independency of starting nodes in detecting redundancy error. In our algorithm we will find redundant rules in forms of identical and subsumed rules as follows. The detect redundancy algorithm is represented in Table 2.

Initially, z copies of tube T are generated as tubes Ti. At line 2, in parallel for each redundant node, we extract all strands containing redundant node Xi. Next strands containing “→Xi” are extracted from Ti+ and poured into tube TiRed; otherwise, into tube TiReq at line 3. It should be noted that if a rule path contains the redundant node Xi, but it lacks a node that directly infers Xi, then Xi must be part of a compound AN in the rule path. Thus, this rule path needs to exist when considering redundant node Xi and is poured into tube TiReq (line 3). Next, we categorize strands in tube TiRed in terms of length of the antecedent of the first rules in strands. To do so, many copies of strands representing complements of strands in tube TΛsk are poured into tube TiRed. First, for correct extraction of strands, this line is carried out for tube which contains paths with longest AN of first rules and is repeated until the last tube which contains paths with shortest AN of first rules. Thus, any strand in tube TiRed that anneals to the above strands are extracted and poured into tube TikRed (tube TΛsk is comprised of strands representing k starting nodes in conjunction form). Thus, the AN of first rules of strands in tube TikRed is comprised of k starting nodes, if there exist any strand of this form.

Strands in each tube TikRed, which the AN of first rules of them, are the same (or permutation of one another) are redundant from each other (antecedent of the first rule in all strands in tube TikRed have k starting nodes). Other strands in tubes TikRed, represent rule paths with distinct starting nodes, or strands in which at least one of starting nodes of the first rules are distinct from those of other rule paths. These strands are not redundant from each other.

In order to determine subsumed rules, we compare strands in each tube TikRed with strands in all other tubes {TijRed,…} for all j > k, generated at Detect Redundancy Part 1. Our Detect Redundancy Algorithm Part 2 is described below. For each tube TikRed, this algorithm is performed in parallel as follows. Each strand in tube TΛsk is represented by ki. Thus, kz denotes strand number z of tube TΛsk (explained in Section 3.4). At line 3, strands containing kz are extracted from Tik Red and poured into tube Tik+. At line 4, If there exists any strand in tube Tik+, the first rule of this strand subsumes all rule paths in tubes TijRed (j > k), which contain kz. Thus, all rule paths containing kz are extracted from all TijRed (j>k) in parallel and poured into tube TijR+ (line 5). Consequently all strands in tubes TijR+, are subsumed rule paths. In order to show the effectiveness of our algorithm, we perform it for the rule base depicted in Figure 7. Node X6 has more than one rule leading to it thus it is candidate for redundancy error. Paths {1, 2, 3, 4} contain sub-strand “→X6” and are extracted from T6 and poured into T6Red at Detect Redundancy Part 1. Next, strands in tube T6Red, which contain, strands in tube TΛs3 (i.e. X1 Λ X2 Λ X4) are extracted from T6Red and poured into tube T63Red. At next iteration of while-loop, strands in tube T6Red, which contain sub-strands in tube TΛs2 (i.e. X1 Λ X2 and X2 Λ X3) are extracted from T6Red and poured into tube T62Red. Finally, strands containing the sub-strand in tube TΛs1 (i.e. X1) are extracted and poured into tube T61Red. At Detect Redundancy Part 2, lines 3 to 5 is performed in parallel for all tubes T61Red, T62Red, T63Red. We describe the process for tube T61Red. At line 3, strands in tube T61Red, which contains sub-strand in tube TΛs1 (X1) are extracted from T61Red and poured into tube T61+. Since tube T61+ is not empty, line 5 of the algorithm is performed and strands containing Sub-strand X1 are extracted from T62Red and T63Red and poured into tubes T62R+, T63R+ respectively. Finally, strands in tubes T62R+ and T63R+ are redundant (subsumed) rule paths. It should be noted that the algorithm is performed for all tubes TikRed (in our example, T61Red, T62Red, and T63Red). Thus, between lines 2 to 5, the algorithm will find other redundant rule paths by checking tubes T62Red and T63Red, if there exists any. The process of our algorithm for the example explained above is depicted in Figure 8(a).

Table 2. Detect redundancy algorithm.

(a) (b)

Figure 8. (a) Redundancy detection; (b) Example of redundant rule base.

It should be noted that, rules in the middle of the rule paths with atomic or compound AN does not influence the redundancy detection procedure even if there exist nodes in the middle of the paths that are inferred from distinct starting nodes. To make these statements more clear, take the rule base depicted in Figure 8(b) as an example. Assume that X1 and X2 are starting nodes and X7is the goal node. The complete rule paths for this rule base is shown in Figure 8(b).

By performing detect redundancy algorithms, rule paths (1, 2) and (3, 4) are redundant from each other. By removing one of the redundant rule paths, incompleteness will not arise in the rule base. The reason is that, a system required rule that is eliminated by removing one of the paths, remains in other rule paths. For such a special case depicted in Figure 8(b), we should be careful not to eliminate redundant rule paths in a way that a system required rule be removed from the rule base, as a result of elimination of redundant rule paths considering different starting nodes (for example, rule X3 Λ X4 → X7 and X4 Λ X3 → X7 will be removed by elimination of rule paths 1 and 3).

3. Conclusions

Different techniques have been developed in order to represent rule-based systems and detect structural errors in them. Nazareth proposed an approach based on Petri nets to verify rule-based systems [13] . Zhang and Nguyen proposed a tool based on Pr/T net to automatically detect potential errors in a rule-based system [23] . Agarwal and Tanniru utilized incidence matrix of Petri nets for detecting structural errors in rule base [14] . An approach based on hyper-graph to verify rule-based systems is proposed by Ramaswamy et al. [10] , which utilizes directed hyper-graphs to model rule-based systems’ graph and transform the hyper-graph into adjacency matrix. He et al. [6] utilized a special class of low level Petri nets (ω-nets) in order to detect the structural errors in rule based systems. In their approach, transitions were used to represent rules, and places of Petri-nets represent conditions and conclusions. Thus, rules that their firing does not result in marking of w-net are considered either redundant or circular rules. The algorithms presented in [6] do not distinguish between redundancy and circularity problem. Algorithms based on DNA computing are proposed in [1] in order to detect the four structural errors in rule-base systems. That paper presents algorithms, which are able to detect structural errors just for special cases with one starting node and one goal node. For rule base, which contains more than one starting node and goal node, structural errors are not removed correctly by utilizing these algorithm. By virtue of the special strand design that we used to encode starting nodes and goal nodes, all complete paths can be extracted from initial solution by performing our detect incompleteness algorithm just one time.

Thus, in case of multiple starting nodes and goal nodes, there is no need to repeat the algorithm. Our algorithm efficiently removes all circularity errors in chains of rules in any form that they may occur. In our approach for each pair of inconsistent nodes (Xi, Xj), rule paths which lead to these nodes are extracted from T and categorized in distinct tubes in parallel. Then, these rule paths and rules resulting in inconsistent nodes should be further analyzed, modified or eliminated based on experts domain decision. If there exists a state for which these rules resulting in inconsistent nodes (Xi, Xj) can be fired simultaneously, and it is decided that one of the rules resulting in these nodes should be removed, this can be done by removing strands in one of the tubes containing Xi or Xj based on experts domain decision. The Detect Redundancy algorithm considers starting nodes, which are logically independent and there is no implied relationship between them. Hence, two rules with the same conclusions, which are inferred under different conditions (different starting nodes) are not considered redundant from each other. And Detect Redundancy algorithm is able to detect subsumed rule paths.

Efforts utilizing traditional measures of complexity such as time and space have been made to characterize DNA computation. Most existing models determine the time complexity of DNA-based algorithms by counting the number of biological steps it take to solve the given problem. We use the strong model of DNA computation for parallel filtering models. This model considers that a basic operator actually needs a time dependent on the problem size rather than taking constant time to be carried out [16] . The operation time of some of the operators utilized in this paper is presented in [16] . For instance, Union (T, T1, …, Tn) and Copy (T, T1, …, Tn) take O(n) time rather than taking constant time , which n is the problem size.

Our algorithm comprises eight parts: Detect Completeness, Detect Circularity Part 0 to Part 3, and Detect Conflict, Detect Redundancy Part 1, and Detect Redundancy Part 2. We assume that the initial library (solution) is already constructed. Issues about constructing the initial library can be found in [16] . Operations of our algorithm are as follows. The Detect Completeness algorithm includes 2 Extract and one Remove operations and the time it takes is from the order of O(n). Detect Circularity Part 0 includes one Copy, (q − 1) Detect, (3*(q − 1)) Separate, (q − 1) Union, and one Remove operations and takes O(n*q) time. Detect Circularity Part 1, consists of one parallel Copy, (2*(q − 1)) parallel Detect, (2*(q − 1)) parallel Separate, (q − 1) parallel Union, and one Remove and takes O(n*q) time. By the same token, number of operations included in Detect Circularity Part 2, is exactly the same as Detect Circularity Part 1 and takes O(n*q) time. Detect Circularity Part 3, consists of one Copy, (q − 1) parallel Detect, (3*(q − 1)) parallel Separate, (q − 1) parallel Union, one Remove, and one Union operations and takes O(n*q) time. Detect Conflict algorithm consists of one parallel Copy, 3 parallel extract, one parallel Union and takes O(n) time. Detect Redundancy Part 1 algorithm consists of one Copy, (K + 2) parallel Extract, and takes O(K*n) time, in which “K” is the number of tubes TΛsk. Detect Redundancy Part 2 algorithm consists of (2*z) parallel Extract and (z) parallel detect operations and takes O(z*n) time, in which “z” denotes the maximum number of distinct sub-strands in tubes TΛsk. In the end, one Read operation is performed which takes O(1) time.

Using the strong parallel model of DNA computation, according to the above complexity analysis, the biological operations of our algorithm in the worst case is O (20q + K + 3z + 3), in “q” is the number of rules in the longest inference chain, “K” is the number of tubes {TΛs1, …, TΛsk}, and “z” denotes the maximum number of distinct sub-strands in tubes TΛsk. If we assume that just Detect circularity Part 0 and Part 1 are performed, then the complexity of our algorithm would be O(10q + K + 3z − 1). The time complexity of our algorithm in all cases is O(n*(Max{q, K, z})). Applicability of utilizing DNA computing to verification of rule-based systems first shown in [1] . But proposed algorithms were not general and there are lots of cases, in which these errors cannot be removed correctly by utilizing their algorithms. In this paper, the deficiencies of the algorithms presented in [1] are outlined. We have proposed algorithms in which these deficiencies are eliminated. The proposed algorithms are able to detect the four structural errors in rule base in any forms that they may occur. Our algorithms utilize an entirely linear increase of computation and for a rule base with n nodes, the time complexity of our algorithm is O(n*(Max{q, K, z})), almost the same as the time complexity that has been achieved in [1] . The number of biological operations used in our algorithm at the worst case of complexity, for which all parts of the Detect Circularity algorithm is performed is O(22q + K + 3z + 3). Our algorithm utilizes some more operations than the algorithm proposed in [1] . However, this small number of added operations results in efficient performance of our algorithms for different cases and consequently, supplements DNA computing approach to verify Rule-Based Systems. In future, we plan to investigate the application of sparse Bayesian models in classification of errors in rule-bases systems [24] - [26] . Additionally we plan to investigate application of system dynamics modeling in implementation and verification of rule based systems [27] .

References

1. Yeh, C.-W. and Chu, C.-P. (2008) Molecular Verification of Rule-Based Systems Based on DNA Computation. IEEE Transactions on Knowledge and Data Engineering, 20, 965-975. http://dx.doi.org/10.1109/TKDE.2007.190743
2. Jeffrey, J., Lobo, J. and Murata, T. (1996) A High-Level Petri Net for Goal-Directed Semantics of Horn Clause Logic. IEEE Transactions on Knowledge and Data Engineering, 8, 241-259. http://dx.doi.org/10.1109/69.494164
3. Yang, S.J.H., Tsai, J.P. and Chen, C.C. (2003) Fuzzy Rule Base Systems Verification Using High-Level Petri Nets. IEEE Transactions on Knowledge and Data Engineering, 15, 457-473. http://dx.doi.org/10.1109/TKDE.2003.1185845
4. Ligeza, A. (2006) Logical Foundations for Rule-Based Systems. Springer, New York, 189-211. http://dx.doi.org/10.1007/3-540-32446-1_12
5. Tan, Z.-H. (2006) Fuzzy Metagraph and Its Combination with the Indexing Approach in Rule-Base Systems. IEEE Transactions on Knowledge and Data Engineering, 18, 829-841. http://dx.doi.org/10.1109/TKDE.2006.96
6. He, X., Chu, W.C., Yang, H. and Yang, S.J.H. (1999) A New Approach to Verify Rule-Based Systems Using Petri Nets. The 23rd Annual International Computer Software and Applications Conference, Phoenix, 27-29 October 1999, 462-467.
7. Nguyen, T.A. (1987) Verifying Consistency of Production Systems. 3rd IEEE Conference on Artificial Intelligence and Applications, Orlando, 4-8.
8. Nguyen, T.A., Perkins, W.A., Laffey, T.J. and Pecora, D. (1985) Checking Expert System Knowledge Bases for Consistency and Completeness. Ninth International Joint Conferences on Artificial Intelligence, Los Angeles, August 1985, 375-378.
9. Cragun, B.J. and Steudel, H.J. (1987) A Decision-Table-Based Processor for Checking Completeness and Consistency in Rule-Based Expert Systems. International Journal of Man-Machine Studies, 26, 633-648. http://dx.doi.org/10.1016/S0020-7373(87)80076-7
10. Ramaswamy, M., Sarkar, S. and Chen, Y.S. (1997) Using Directed Hypergraphs to Verify Rule-Based Expert Systems. IEEE Transactions on Knowledge and Data Engineering, 9, 221-237. http://dx.doi.org/10.1109/69.591448
11. Nazareth, D.L. and Kennedy, M.H. (1991) Verification of Rule-Based Knowledge Using Directed Graphs. Knowledge Acquisition, 3, 339-360. http://dx.doi.org/10.1016/S1042-8143(05)80024-X
12. Valiente, G. (1993) Verification of Knowledge Based Redundancy and Subsumption Using Graph Transformations. International Journal of Expert Systems, 6, 341-355.
13. Nazareth, D.L. (1993) Investigating the Applicability of Petri Nets for Rule-Based System Verification. IEEE Transactions on Knowledge and Data Engineering, 4, 402-415. http://dx.doi.org/10.1109/69.224193
14. Agarwal, R. and Tanniru, M. (1992) A Petri Net Based Approach for Verifying the Integrity of Production Systems. International Journal of Man-Machine Studies, 36, 447-468. http://dx.doi.org/10.1016/0020-7373(92)90043-K
15. Wu, C.H. and Lee, S.J. (1997) Knowledge Verification with an Enhanced High-Level Petri-Net Model. IEEE Expert, 12, 73-80.
16. Amos, M. (2004) Theoretical and Experimental DNA Computation. Springer, New York, 46-77.
17. Feynman, R.P. and Gilbert, D. (1960) There’s Plenty of Room at the Bottom. Engineering and Science Magazine, 23, 22-36.
18. Adleman, L.M. (1994) Molecular Computation of Solutions to Combinatorial Problems. Science, 266, 1021-1024. http://dx.doi.org/10.1126/science.7973651
19. Breslauer, K.J., Frank, R., Blocker, H. and Marky, L.A. (1986) Predicting DNA Duplex Stability from the Base Sequence. Proceedings of the National Academy of Sciences, 83, 3746-3750. http://dx.doi.org/10.1073/pnas.83.11.3746
20. Braich, R.S., Johnson, C., Rothemund, P.W.K., Hwang, D., Chelyapov, N. and Adleman, L.M. (2001) Solution of a Satisfiability Problem on a Gel-Based DNA Computer. DNA Computing Lecture Notes in Computer Science, 2054, 27- 42.
21. Roweis, S., Winfree, E., Burgoyne, R., Chelyapov, N.V., Goodman, M.F., Rothemund, P.W.K. and Adleman, L.M. (1998) A Sticker-Based Model for DNA Computing. Journal of Computational Biology, 5, 615-629. http://dx.doi.org/10.1089/cmb.1998.5.615
22. Chang, W.-L. and Guo, M. (2004) Molecular Solutions for the Subset-Sum Problem on DNA-Based Supercomputing. Biosystems, 73, 117-130. http://dx.doi.org/10.1016/j.biosystems.2003.11.001
23. Zhang, D. and Nguyen, D. (1994) PREPARE: A Tool for Knowledge Base Verification. IEEE Transactions on Knowledge and Data Engineering, 6, 983-989.
24. Madahian, B., Deng, L. and Homayouni, R. (2014) Application of Sparse Bayesian Generalized Linear Model to Gene Expression Data for Classification of Prostate Cancer Subtypes. Open Journal of Statistics, 4, 518-526. http://dx.doi.org/10.4236/ojs.2014.47049
25. Madahian, B. and Faghihi, U. (2014) A Fully Bayesian Sparse Probit Model for Text Categorization. Open Journal of Statistics, 4, 611-619. http://dx.doi.org/10.4236/ojs.2014.48057
26. Madahian, B., Deng, L. and Homayouni, R. (2014) Development of Sparse Bayesian Multinomial Generalized Linear Model for Multi-Class Prediction. BMC Bioinformatics, 15, P14. http://dx.doi.org/10.1186/1471-2105-15-S10-P14
27. Madahian, B., Klesges, R.C., Klesges, L. and Homayouni, R. (2012) System Dynamics Modeling of Childhood Obesity. BMC Bioinformatics, 13, A13. http://dx.doi.org/10.1186/1471-2105-13-S12-A13