J. Software Engineering & Applications, 2008, 1: 83-87
Published Online December 2008 in SciRes (www.SciRP.org/journal/jsea)
Copyright © 2008 SciRes JSEA

Workflow Mining of More Perspectives of Workflow

Peng Liu 1, Bosheng Zhou 2

1 School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191, China, 2 School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
Email: childbiggo@hotmail.com

Received November 27th, 2008; revised November 30th, 2008; accepted December 1st, 2008.

ABSTRACT

The goal of workflow mining is to obtain objective and valuable information from event logs. Research on workflow mining is of great significance both for deploying new business processes and for analyzing and improving those already deployed. Many information systems log event data about executed tasks, and workflow mining is concerned with deriving a graphical process model from this data. Currently, workflow mining research is narrowly focused on the rediscovery of control flow models. In this paper, we present workflow mining of more perspectives of workflow to broaden the scope of workflow mining. The mining model is described with the VPML of GBMS, and we show how the entire model is mined in terms of that VPML.

Keywords: Workflow Mining, GBMS, VPML

1. Introduction

Workflow technology continues to be subject to ongoing development in its traditional application areas of business process modeling and business process coordination, and now in the emergent areas of component frameworks and inter-workflow, business-to-business interaction. Addressing this broad and rather ambitious reach, a large number of workflow products are commercially available, offering a large variety of languages and concepts based on different paradigms. In 1993, the Workflow Management Coalition (WFMC), the standardization organization for workflow technology, was founded.
The WFMC defines workflow as follows: workflow is concerned with the automation of procedures where documents, information or tasks are passed between participants according to a defined set of rules to achieve, or contribute to, an overall business goal [1]. Thus, all kinds of enterprise activities are organized and coordinated through business process modeling and defined business logic relations. A workflow model is a process model that can be executed on a computer, and its performance can be analyzed from the execution results. When selecting a workflow engine, one must therefore review its ability to analyze the process model.

Workflow mining means knowledge discovery in workflow systems, a notion that follows from the definition of data mining. The ultimate target of workflow mining is to mine the transactional logs of a workflow system and to discover knowledge about the workflow, including workflow model mining, workflow performance mining and workflow model improvement.

Most workflow mining research is aimed at the rediscovery of explicit control flow models, which are used to specify the behavior of a process. We believe that this approach limits the scope and utility of workflow mining by neglecting the important notion that workflow is much more than control flow. In fact, the behavior of a process is but one of its perspectives. In [2], Wang Lei and Zhou Bosheng present four major perspectives (Behavior Model, Information Model, Resource Model, and Coordination Model) of a process model that have emerged from the disciplines that helped shape the workflow area.

2. The Research Basis

2.1 GBMS and VPML

VPML [3] (Visual Process Modeling Language) is a graphical language that supports a special process definition: a process is a set of transformations of input elements into products with specific properties, characterized by transformation parameters. VPML describes the process with a visual process diagram and a relevant text specification.
The diagram shows the structure of the process, and the specification gives the attributes of all items in the diagram. The process description is therefore highly visual and formal. A process model built in VPML can be simulated. It has been shown that VPML not only describes a complete process model, but is also a process modeling language with rich functions, visual diagrams, easy learning and flexible application.

GBMS [4] (Government Business Modeling System) is a modeling system oriented to E-government. GBMS not only supports the sorting out, modeling, simulation and optimized restructuring of government business processes, which implements interdepartmental business integration and the sharing of information resources, but also organically integrates government business modeling with the requirement extraction and analysis of the E-government Business Application System. GBMS describes the business model completely from the views of process, organization, resource, information and collaboration. It is process-centered, organically integrates the five views, and keeps each view consistent and complete. GBMS lays the foundation for the design and implementation of Business Application Systems in the E-government area.

2.2 The Background and the System Architecture

The research background of this paper is the project "The Construction of Oriented E-government Software Component Library". First, using GBMS, we build the process model, organization model, information model, collaboration model, resource model and behavior model. The process model is the core of these models; it clearly describes the government business and identifies the shared resources.
Meanwhile, by accumulating plenty of government business models, the project not only gradually establishes an E-government business pattern base and designs and builds an E-government-oriented component library shared across departments, but also unifies the standard of government business processes and avoids duplicate investment in E-government. On the basis of the component library, the models are executed on jBPM's workflow engine. To respond rapidly to new requirements in the E-government area, the components of the E-government component library are assembled and the relevant E-government business application system is built rapidly. The system can thus be built on demand, which reduces production costs. The system's architecture is given in Figure 1. The whole project has 5 parts: (1) GBMS; (2) Component System; (3) jBPM Workflow Engine System; (4) Automatically Generating System; (5) Workflow Mining System.

Figure 1. The system architecture

This paper extends the paper "The Research on the Modeling Transformation From GBMS to jBPM" [5], and presents how to mine the entire model from the E-government business system in order to validate the business model built in GBMS.

2.3 Semantics of GBMS and VPML

DEFINITION 1. A GBMS process model is a 7-tuple PM = {P, A, R, Control, Support, Input, Output}, where
P = {Product_1, Product_2, …, Product_j} is a set of products;
A = {a_1, a_2, …, a_m} is a set of activities;
R = {r_1, r_2, …, r_k} is a set of resources;
Control is a set of control relations;
Support is the relation of resources supporting activities, Support ⊆ A×R;
Input is the relation between activities and input products, Input ⊆ A×P;
Output is the relation between activities and output products, Output ⊆ A×P.

We need to define some auxiliary objects based on the above definition.
State(i), where i ∈ P ∪ R, represents the state of element i. It is an enumeration type: enum = {able, disabled} means that the state can be either able or disabled.
Status(a), where a ∈ A, represents the status of the activity a.
In the GBMS model, an activity has 2 statuses, START and END.
Source(c), where c ∈ Input ∪ Output, is the set of source objects related by c.
Target(c), where c ∈ Input ∪ Output, is the set of target objects related by c.
Prod_Source(a), where a ∈ A, is the set of input products related to activity a.
Prod_Target(a), where a ∈ A, is the set of all output products related to activity a.
Resource(a), where a ∈ A, is the set of all resources related to activity a.
Role(a), where a ∈ A, is the set of all roles related to activity a.

DEFINITION 2. The GBMS Data Model is used to describe the information perspective of workflow. The GBMS Data Model is the 4-tuple (P, A, Input, Output). Input(a, p) means that Activity a needs Product p to run. Output(a, p) means that running Activity a produces Product p.

DEFINITION 3. The GBMS Organization Model is used to describe the organization perspective of workflow. The GBMS Organization Model is the triple (A, R, Support): A is the set of GBMS activities; R is the set of resources in GBMS; Support is the relation of resources supporting activities; Support(a, r) means that Resource r supports Activity a to run.

DEFINITION 4. S is a subset of P that represents the set of all source products:

$$S = \{\, S \subset P \mid \forall c \in \mathrm{Input} \cup \mathrm{Output},\ \mathrm{Source}(c) \cap S = \Phi \,\}$$

DEFINITION 5. E is a subset of P that represents the set of all final products:

$$E = \{\, E \subset P \mid \forall c \in \mathrm{Input} \cup \mathrm{Output},\ \mathrm{Target}(c) \cap E = \Phi \,\}$$

DEFINITION 6. O(a) represents the constraints:

$$O(a) = \{\, a \in A,\ p \in \mathrm{Prod\_Source}(a),\ r \in \mathrm{Resource}(a),\ u \in \mathrm{Role}(a) \mid \mathrm{State}(p) == \mathrm{able} \;\&\&\; \mathrm{State}(r) == \mathrm{able} \;\&\&\; \mathrm{State}(u) == \mathrm{able} \,\}$$

3. The Workflow Mining Algorithm

It has been proved theoretically that VPML is equivalent to Petri nets [6]. This paper therefore uses a mining algorithm based on van der Aalst's α-algorithm [7,8,9].
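As background for the α-algorithm, the following is a minimal sketch of its four log-based ordering relations (direct succession, causality, parallelism and choice), computed directly from traces under the assumption of a complete, noise-free log. The function name `ordering_relations` and the list-of-strings trace format are assumptions made for illustration. The sketch applies the textbook relation definitions and does not build the D/F-table used in this paper. The three traces are those of the sample in Section 4.1.

```python
from itertools import product


def ordering_relations(traces):
    """Derive log-based ordering relations from a list of traces.

    Illustrative sketch only: assumes a complete, noise-free log.
    """
    activities = sorted({a for trace in traces for a in trace})
    # Direct succession a > b: b immediately follows a in some trace.
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    causal, parallel, choice = set(), set(), set()
    for a, b in product(activities, repeat=2):
        forward = (a, b) in follows
        backward = (b, a) in follows
        if forward and not backward:
            causal.add((a, b))      # a -> b
        elif forward and backward:
            parallel.add((a, b))    # a || b
        elif not forward and not backward:
            choice.add((a, b))      # a # b
    return follows, causal, parallel, choice


# Traces of the sample in Section 4.1, one string per case.
TRACES = ["ABCDE", "ABDCE", "AFG"]
follows, causal, parallel, choice = ordering_relations(TRACES)
print(sorted(causal))
print(sorted(parallel))
```

Because C and D follow each other in both orders, they are classified as parallel rather than causal, while B and F never follow each other and therefore fall into the choice relation.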
(1) Constructing the dependence/frequency table (D/F-table). For given activities a and b, the table records:
I) the appearance frequencies #a and #b of activity a and activity b;
II) b is directly ahead of a: #b<#a;
III) b directly succeeds a: #a>#b;
IV) b is ahead of a: #b<<<#a;
V) a is ahead of b: #a>>>#b;
VI) the degree of dependence between a and b: #a→#b;
(2) Mining the activity relation table (R-table) from the D/F-table. Based on the D/F-table, the basic activity relations (a >_w b, a →_w b, a #_w b, a //_w b) are mined [8];
(3) Reconstructing the workflow net from the R-table with the α-algorithm.

4. More Perspectives of Workflow Mining

In this section, we present a sample of workflow mining of more perspectives. The workflow event log is shown in Table 1. First, we make some reasonable assumptions required by the workflow mining algorithm: events are logged in temporal order, the event logs do not contain noise, and the event logs are theoretically complete.

4.1 Workflow Mining in the Behavior Model

The behavior model shows the behavior perspective of the workflow. This perspective is the most basic and important one; it reflects the performance of the workflow system. The workflow event log table contains 3 cases. We use the algorithm of Section 3 to mine the behavior model of GBMS.
(1) Determine the basic relations between the activities using the D/F-table. The log contains 3 cases, σ_1 = {A,B,C,D,E}, σ_2 = {A,B,D,C,E}, σ_3 = {A,F,G}. We then obtain the relations between activities:
a >_w b: A >_w B, A >_w F, B >_w C, B >_w D, C >_w D, D >_w C, C >_w E, D >_w E, F >_w G;
a →_w b: A →_w B, B →_w C, B →_w D, A →_w F, F →_w G, C →_w E, D →_w E;
a #_w b: B #_w F;
a //_w b: C //_w D, D //_w C;
(2) Using the α-algorithm to mine the behavior model.
1) T_w = {A, B, C, D, E, F, G};
2) T_i = {A};
3) T_o = {E, G};
4) X_w = {(A,B), (B,C), (B,D), (C,E), (D,E), (A,F), (F,G)};
5) Y_w = X_w;
6) P_w = {p(A,B), p(B,C), p(B,D), p(C,E), p(D,E), p(A,F), p(F,G)} ∪ {i_w, o_w};
7) F_w = {(A, p(A,B)), (p(A,B), B), (B, p(B,C)), (p(B,C), C), (B, p(B,D)), (p(B,D), D), (C, p(C,E)), (p(C,E), E), (D, p(D,E)), (p(D,E), E), (A, p(A,F)), (p(A,F), F), (F, p(F,G)), (p(F,G), G)} ∪ {(i_w, A), (E, o_w), (G, o_w)};
8) α(W) = {T_w, P_w, F_w}.
(3) In GBMS, the behavior model is described in Figure 2.

Table 1. Workflow event log

| Case | Activity | Status | Need | Produced | Resource |
|------|----------|--------|------|----------|----------|
| 1 | A | START | DOC1 | | ROLE1 |
| 1 | A | END | | DOC2 | |
| 2 | A | START | DOC1 | | ROLE1 |
| 2 | A | END | | DOC3 | |
| 3 | A | START | DOC1 | | ROLE1 |
| 3 | A | END | | DOC2 | |
| 1 | B | START | DOC2 | | ROLE2 |
| 1 | B | END | | DOC4, DOC5 | |
| 3 | B | START | DOC2 | | ROLE2 |
| 3 | B | END | | DOC4, DOC5 | |
| 1 | C | START | DOC4 | | ROLE4 |
| 3 | D | START | DOC5 | | ROLE4 |
| 1 | D | START | DOC5 | | ROLE5 |
| 3 | C | START | DOC4 | | ROLE5 |
| 1 | C | END | | DOC6 | |
| 3 | D | END | | DOC7 | |
| 1 | D | END | | DOC7 | |
| 3 | C | END | | DOC6 | |
| 1 | E | START | DOC6, DOC7 | | ROLE6 |
| 1 | E | END | | DOC9 | |
| 3 | E | START | DOC6, DOC7 | | ROLE6 |
| 3 | E | END | | DOC9 | |
| 2 | F | START | DOC3 | | ROLE3 |
| 2 | F | END | | DOC8 | |
| 2 | G | START | DOC8 | | MACHINE1 |
| 2 | G | END | | DOC9 | |

Figure 2. The GBMS behavior model

4.2 Workflow Mining in the Information Model

In the workflow mining area, there has been little research on the informational perspective. In this section, we introduce workflow mining in the information model. The information model describes the products consumed and produced in the business process and the relations between the products. In GBMS, an activity can be executed only when the products and roles it requires are in the state able. Mining the information model is therefore very important, as it reveals the data flow of the process. The workflow event log in Table 1 contains 7 activities and 9 kinds of products. In its START status, an activity needs products to run; in its END status, it can produce new products.
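The START/END rule just described can be sketched in a few lines: products listed on START rows are collected into Prod_Source, products listed on END rows into Prod_Target, and the Input and Output relations follow as (activity, product) pairs. This is a minimal sketch under assumptions: the tuple layout of `LOG` and the function name `mine_information_model` are hypothetical, and only the first rows of Table 1 for activities A and B are included to keep the example short.

```python
from collections import defaultdict

# Rows shaped like Table 1:
# (case, activity, status, needed products, produced products)
LOG = [
    (1, "A", "START", ["DOC1"], []),
    (1, "A", "END", [], ["DOC2"]),
    (2, "A", "START", ["DOC1"], []),
    (2, "A", "END", [], ["DOC3"]),
    (1, "B", "START", ["DOC2"], []),
    (1, "B", "END", [], ["DOC4", "DOC5"]),
]


def mine_information_model(log):
    """Collect Prod_Source/Prod_Target per activity and the Input/Output relations."""
    prod_source = defaultdict(set)
    prod_target = defaultdict(set)
    for _case, activity, status, need, produced in log:
        if status == "START":
            # A START event records the products the activity needs.
            prod_source[activity].update(need)
        elif status == "END":
            # An END event records the products the activity produced.
            prod_target[activity].update(produced)
    input_rel = {(a, p) for a, ps in prod_source.items() for p in ps}
    output_rel = {(a, p) for a, ps in prod_target.items() for p in ps}
    return dict(prod_source), dict(prod_target), input_rel, output_rel


prod_source, prod_target, input_rel, output_rel = mine_information_model(LOG)
print(prod_source["A"], prod_target["A"])
```

Merging the sets across cases is what lets a single activity such as A end up with more than one possible output product.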
So from the event log in Table 1, we obtain the following formulas:
9) Prod_Source(A) = {DOC1}; Prod_Target(A) = {DOC2, DOC3}
10) Prod_Source(B) = {DOC2}; Prod_Target(B) = {DOC4, DOC5}
11) Prod_Source(C) = {DOC4}; Prod_Target(C) = {DOC6}
12) Prod_Source(D) = {DOC5}; Prod_Target(D) = {DOC7}
13) Prod_Source(E) = {DOC6, DOC7}; Prod_Target(E) = {DOC9}
14) Prod_Source(F) = {DOC3}; Prod_Target(F) = {DOC8}
15) Prod_Source(G) = {DOC8}; Prod_Target(G) = {DOC9}
Input = {Input(A,DOC1), Input(B,DOC2), Input(C,DOC4), Input(D,DOC5), Input(E,DOC6), Input(E,DOC7), Input(F,DOC3), Input(G,DOC8)}
Output = {Output(A,DOC2), Output(A,DOC3), Output(B,DOC4), Output(B,DOC5), Output(C,DOC6), Output(D,DOC7), Output(E,DOC9), Output(F,DOC8), Output(G,DOC9)}

Figure 3 illustrates the integration of the GBMS behavior model and data model.

Figure 3. The rediscovered GBMS behavior model and data model

4.3 Workflow Mining in the Organization Model

In GBMS, the organization model is used to define the government organization; it includes all kinds of resources: persons, machines, places, etc. By mining this aspect, a workflow analyst can discover whether the participants of a business process are being used efficiently and effectively.

The workflow event log in Table 1 contains 7 activities, 6 roles and 1 machine. In the workflow system, an activity needs the support of resources to run. From Table 1, we obtain the formal representation of the rediscovered organization model:
1) Role(A) = {ROLE1}
2) Role(B) = {ROLE2}
3) Role(C) = {ROLE4}
4) Role(D) = {ROLE5}
5) Role(E) = {ROLE6}
6) Role(F) = {ROLE3}
7) Resource(G) = {MACHINE1}
Support = {Support(A,ROLE1), Support(B,ROLE2), Support(C,ROLE4), Support(D,ROLE5), Support(E,ROLE6), Support(F,ROLE3), Support(G,MACHINE1)}

Figure 4 shows the integration of the GBMS behavior model, data model and organization model. This sample is a business process of the Beijing Xuanwu government, shown in Figure 5. It is a business application process.
First, the applicant fills in the application form and sends it for approval. If the application is of the common service type, it is assigned to the window transaction, distributed to different leaders for handling, and finally reaches the director for disposal. If the application is of the administration permission type, it is sent directly to the administrator for handling and recorded in the government database. The process was built with GBMS.

Figure 4. The rediscovered GBMS behavior model, data model and organization model

Figure 5. The application business process of GBMS

5. Conclusions and Future Work

Currently, most workflow mining algorithms have the goal of rediscovering a control flow model. We feel that this approach limits the scope and utility of workflow mining; it neglects the important point that workflow is much more than control flow. This paper has therefore presented workflow mining of more perspectives of workflow and given a full sample to show how to mine the entire GBMS model from the event log. In the future, we will study incomplete logs with noise and improve the mining algorithm. This work is one part of the whole project. We also need to validate the mined model against the model designed with GBMS.

6. Acknowledgement

This project is funded by the Ministry of Science and Technology, P.R.C. We would like to thank Beijing Cyber Technology for supporting the project.

REFERENCES

[1] Workflow Management Coalition, “Workflow management coalition terminology and glossary,” Technical Report, WFMC-TC-1011, Brussels: Workflow Management Coalition, 1996.
[2] L. Wang and B. S. Zhou, “The study of enterprise model,” Computer Engineering and Application, 1002–8331–(2001)12–0005–05.
[3] B. S. Zhou, H. X, and L. Zhang, “The Principle of Process Engineering and an Introduction to Process Engineering Environments,” Journal of Software (supplement), pp. 519–534, August 1997.
[4] B. S. Zhou and S. Y. Zhang, “Visual Process Modeling Language VPML,” Journal of Software (supplement), pp. 535–545, August 1997.
[5] P. Liu and B. S. Zhou, “The research on the modeling transformation from GBMS to jBPM,” 2008 International Conference on Computer Science and Software Engineering.
[6] A. H. Ren, “Research on the Concurrent Software Developing Method Based on Object Oriented Petri Nets,” BHU, The school of computer science, pp. 116–128, 2001.
[7] W. M. P. van der Aalst, T. Weijters, and L. Maruster, “Workflow mining: Discovering process models from event logs,” IEEE Transactions on Knowledge and Data Engineering, 16(9), pp. 1128–1142, 2004.
[8] A. J. M. M. Weijters and W. M. P. van der Aalst, “Process mining: Discovering workflow models from event-based data,” in: B. Kröse, M. de Rijke, G. Schreiber, M. van Someren (Eds.), Proceedings of the 13th Belgium-Netherlands Conference on Artificial Intelligence (BNAIC 2001), pp. 283–290, 2001.
[9] A. J. M. M. Weijters and W. M. P. van der Aalst, “Workflow mining: Discovering workflow models from event-based data,” in: C. Dousson, F. Höppner, R. Quiniou (Eds.), Proceedings of the ECAI Workshop on Knowledge Discovery and Spatial Data, pp. 78–84, 2002.