In this paper, we evaluate an assortment of reverse engineering tools and investigate the characteristics of the Imagix-4D reverse engineering tool in particular. Based on this investigation, we identify its inadequacies: it illustrates only an abstract class diagram and has no support for ER diagrams or sequence diagrams. We then propose a Reverse Engineering Tool based on Unified Mapping Method (RETUM) that emphasizes class diagram visualization and overcomes the limitation of Imagix-4D, whose class diagrams are intricate to read.
Understanding the intricate relationships that exist between the source code components of a software system can be an arduous task. In recent years, several tools [
When assessing the quality and maintainability of large C, C++ and Java source code bases, tools are needed for extracting several facts [
In this paper, we present our experience in the architecting of Imagix-4D, a source code analysis tool from Imagix Corporation [
In this section, we describe the tool selection criteria applied, the reasons why we selected particular tools for the study, and their basic characteristics [
RE Tools | Key Attributes | Source Environment | Merits | Demerits |
---|---|---|---|---|
Rigi [ | Fault-tolerance Completeness Correctness Performance Extensible Scalability Portability Availability Usability | C, C++ | 1) The major advantage of the tool is that it features new technologies (e.g. layered views, SHriMP view, layout algorithms). 2) The tool provides supporting capabilities (e.g. filters, metrics, groups) and is extensible to some degree. 3) It is the only tool that allows the generated views and representations to be saved. | 1) The major drawback of Rigi is the provided parser, which can only parse functions and structured data types. 2) This limits the views that can be generated mainly to functional views (call graph). 3) Because the tool is a research prototype, it is not very stable. |
Doclike Viewer [ | Performance Scalability Portability Usability | C, C++ | 1) Doclike Viewer is best used within the software life cycle. | 1) It does not have its own parser; it uses the Rigi parser. |
SNIFF+ [ | Fault-tolerance Scalability Portability Usability | C, C++ | 1) SNIFF+ provides an efficient and portable environment with a comfortable user interface. 2) SNIFF+ also provides good printing capabilities. 3) SNIFF+ is the only tool that also supports browsing between all generated views, which is sometimes handy. | 1) In a view, connections can only go in one direction from an entity, which results in many representations of one item (e.g. a function) if the item is referenced elsewhere. 2) Because of this limitation of the views, highly connected entities cannot be identified, and reading the views can get complicated with large graphs. |
Shrimp [ | Performance Extensible Scalability Portability Availability | C, C++ | 1) The tool provides a customizable and interactive environment for navigating and browsing complex information spaces. 2) It employs a fully zoomable interface for exploring software. | 1) Adapting SHriMP to new data domains within Eclipse, and applying the idea of terminals to program visualization [ |
Code crawler [ | Extensible Scalability Portability Availability Usability | C, C++, Java, Smalltalk | 1) It supports reverse engineering through the combination of metrics and software visualization. 2) Screenshots can also be viewed in it. | 1) Configuring the visualization involves some complexity: for every node and edge, the user must choose from a selection of metrics. |
CSV [ | Correctness Performance Extensible Scalability Portability Availability Usability | C, C++ | 1) The user can choose colours for syntax elements, such as if statements in C++. 2) It supports gradual zooming up to the point where a line of text becomes 1 pixel. | 1) It excludes an option for lexical highlighting. |
Solidsx [ | Available Portable Usable Scalable Performance Completeness Fault-tolerance | C, C++, .NET/C# and Java code bases | 1) It tightly integrates several visual techniques (HEBs, tree maps, table lenses) with reverse engineering and analysis in a single environment. 2) The most important feature for user acceptance of Solidsx is ease of integration. 3) Solidsx has been used in several industrial reverse engineering and program comprehension projects. | 1) The tool is too generic; it needs customized wizards that address specific questions [ |
Dalli [ | Compliance Full coverage Completeness Scalability Portability | Language independence | 1) Dalli is recoverable because its parsing and lexical techniques are highly versatile. 2) Dalli is more versatile and lightweight than other base techniques. | 1) It provides low accuracy. 2) Dalli itself cannot extract the complete source code, as no single tool can successfully extract the complete source code or architecture model. 3) Dalli requires preprocessing, as it allows an analyst to interact with the recovered information by accessing the results of the reconstruction effort. |
GUPRO [ | Compliance Crossref completeness Scalability Portability Availability | C, C++, Java, and RDBMS | 1) It uses a schema-independent querying mechanism. 2) The conceptual model determines the structure of the graph-based GUPRO repository. Source code is extracted into the repository, and the repository graphs can be viewed with an integrated querying and browsing facility. 3) GUPRO has a complete treatment of preprocessor facilities [ | 1) For large software systems, all facts from the source cannot be loaded at once because of the limited repository size; fact extractors for multi-language systems follow a four-step parsing approach [ |
DEFCTO [ | Fault tolerance Completeness Compliance Crossref Preprocessor Completeness Availability Portability | Language independent | 1) Arbitrary factual annotations can be added to the grammar; the method is independent of any preconceived analysis model and is fully general. 2) The method is succinct, and its notational efficiency has been demonstrated by comparison with other methods. | 1) This technique does not rely on a specified grammar formalism or parser. |
COLUMBS [ | Fault tolerance Completeness Compliance Crossref Preprocessor Completeness Portability | C/C++ projects; extracts their UML class model and call graph | 1) It supports project handling, data extraction, data representation and data storage. Furthermore, client entering methods can be used to produce comprehensible (clear-cut) diagrams from the extracted information. 2) Its fault tolerance is recoverable because data extraction is preprocessed. 3) It is compliant because it is highly adoptable by users, as it is a professional tool covering reverse engineering in a single package [ | 1) It is costly and not easily available. |
Imagix-4D [ | Availability Portability Usability Scalability Performance | 1) It is used primarily for understanding, documenting and evolving existing C, C++ and Java software. 2) It is also used to compute software metrics that measure design quality and identify potential testing and maintenance issues. | 1) It provides views to rapidly check and systematically study software. 2) It presents key information on software in a 3D graphical format, which enables the user to quickly focus on particular areas of interest. 3) It helps software developers comprehend complex or legacy C, C++ and Java source code. 4) Using Imagix-4D to reverse engineer and analyze code speeds up development, enhancement, reuse, and testing. 5) It eliminates bugs due to faulty understanding. 6) It enables the user to rapidly check or systematically study software on any level, from its high-level architecture to the details of its build, class and function dependencies. 7) The user can visually explore a wide range of aspects of the software (control structures, data usage, and inheritance), all based on precise static analysis of the source. 8) Using its querying capabilities, the user can find and focus on the relevant portions of the source code [ | 1) A disadvantage of the graph views is that highly connected graphs become complicated and unreadable. 2) Hand-designed class and function diagrams sometimes do not match the tool-generated diagrams (Class Diagram). 3) The parser lacks important information about method/function calls because it cannot interpret template parameters (Sequence Diagram). 4) It is unable to resolve the function to which an invocation resolves at compile time (Sequence Diagram). 5) Imagix-4D requires many hours of analysis for larger code bases. 6) Imagix-4D does not produce a full executable slice, since it does not perform analysis of relevant conditions for the identified statements. 7) Imagix-4D draws class diagrams, but they are limited in nature and do not show all relationships (Association, Aggregation, Dependency, Generalization, Realization). |
Reveal Tool [ | Classes, Relationship Dependencies Associations Generalization Realization Aggregation | Input from C++ code and output as Class Diagram | 1) The method is based on Keystone. 2) The mechanism uses a bottom-up and backtracking parse algorithm with token decoration. 3) Detection/mapping attributes by ambiguity level: classes have low ambiguity. 4) Semantic accuracy of C++ to UML plotting: more accurate for classes and associations. 5) Easy and sufficient generation of reverse models. | 1) Detection/mapping attributes by ambiguity level: relationships have high ambiguity (dependencies: high ambiguity; associations: high ambiguity; generalization: contains ambiguity; realization: medium ambiguity; aggregation: high ambiguity). |
Rational Rose Tool [ | Classes, Relationship Dependencies Associations | Input from C++ code and output as UML Diagram. | 1) The method is based on parsing. 2) The mechanism uses a disassembler. 3) Detection/mapping attributes by ambiguity level: classes have low ambiguity. | 1) Detection/mapping attributes by ambiguity level: relationships have high ambiguity (dependencies and associations both have high ambiguity). 2) Exact mapping is not performed, and the result is less accurate. 3) UML does not include internal dependencies such as method invocations and variable accesses. Those dependencies are necessary in the problem detection and reorganization phases of the re-engineering life cycle. Thus, choosing UML would violate the requirement of being a sufficient basis for re-engineering operations. |
Super Womble [ | Classes | Input from C++ code and output as Class Diagram. | 1) The method is based on parsing. 2) The mechanism uses an abstract syntax tree, a token stream and a lexical analyzer. 3) Detection/mapping attributes by ambiguity level: classes have low ambiguity and object diagrams have low ambiguity. | Exact mapping is not performed, and the result is less accurate. |
Pilfer [ | Classes Relations Dependencies Association, Generalization Realization Aggregation | Input from C++ code and output as Class Diagram. | 1) The method is based on parsing. 2) Detection/mapping attributes by ambiguity level: classes have low ambiguity. 3) Lightweight detection. 4) More accurate in graph generation. | Detection/mapping attributes by ambiguity level: relationships have high ambiguity (dependencies: high ambiguity; associations: high ambiguity; generalization: contains ambiguity; realization: medium ambiguity; aggregation: high ambiguity). |
Tool Selection Criteria
Because there are numerous tools for reverse engineering purposes, it is not possible to analyze all of them in a single study. We have decided to focus on some properties of those tools.
The C programming language is still very important in this context since it is used in numerous important legacy systems which are under maintenance. It is also the only language for which there exist multiple empirical studies on information needs [
The below architecture in
After these components are correctly extracted by the UML mining module, a local parse tree is generated and the information is stored in a repository for further use. Direct mapping is possible after this phase, but to meet customized requirements the proposed work adds further features, such as a code annotation module in which the identified results are refined using two specific methods: Filtering and Multi-View. The result is then forwarded to an exporter, which plots the extracted patterns as a class diagram, sequence diagram or call graph.
Our analysis suggests that the proposed tool will prove efficient and usable in terms of language support (C++/C# and Java), supported diagram types (Class and Activity), and its detection and mapping mechanism (various parameters for accurate mapping). After applying the updated concepts at the initial stage of the work, we find that the approach should provide unambiguous UML generation from source code that is more accurate, easier and more complete.
We propose an algorithm for the design of the RETUM reverse engineering tool.
Step 1: Legacy code (object-oriented or procedure-oriented) is taken as input.
Step 2: The legacy code samples are passed to the code analysis module. This module accepts code in various languages, separates it according to the type of keyword used, and stores the result in temporary storage; a symbol tree is constructed so that tokens are analyzed correctly according to their use in the code.
Step 3: The next step takes the output of the code analysis phase and generates tokens with the help of a token generator (it generates the various tokens used for mapping).
Step 4: These tokens act as data extraction components from the source code. The components must be extracted by UML mining of the different entity relationships, classes and object instances so that the mapping is accurate.
Step 5: After these components are correctly extracted by the UML mining module, a local parse tree is generated and the information is stored in a repository for further use.
Step 6: To meet customized requirements, the proposed work adds further features, such as a code annotation module in which the identified results are refined using two specific methods: Filtering and Multi-View.
Step 7: The result is then forwarded to an exporter, which plots the extracted patterns as an object-oriented or procedure-oriented diagram.
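The seven steps can be sketched in a few lines of Java. This is only a minimal illustration under stated assumptions, not the RETUM implementation: the names `RetumPipelineSketch`, `ClassFact`, `tokenize`, `extract`, `filter` and `export` are hypothetical, the "repository" is an in-memory list, only class declarations and `extends` clauses are recognized, and the exporter emits plain text lines instead of a drawn diagram.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RetumPipelineSketch {

    /** Hypothetical repository entry: a class and, if present, its superclass. */
    public static final class ClassFact {
        public final String name;
        public final String superName; // null when there is no "extends" clause

        public ClassFact(String name, String superName) {
            this.name = name;
            this.superName = superName;
        }
    }

    // Steps 2-3: split the source text into identifier/keyword tokens and single symbols.
    public static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|\\S").matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Steps 4-5: scan the tokens for "class X" and "class X extends Y" and
    // store each finding in a list that stands in for the repository.
    public static List<ClassFact> extract(List<String> tokens) {
        List<ClassFact> repository = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String name = tokens.get(i + 1);
            if (!tokens.get(i).equals("class")
                    || !Character.isJavaIdentifierStart(name.charAt(0))) {
                continue;
            }
            String superName = null;
            if (i + 3 < tokens.size() && tokens.get(i + 2).equals("extends")) {
                superName = tokens.get(i + 3);
            }
            repository.add(new ClassFact(name, superName));
        }
        return repository;
    }

    // Step 6: Filtering keeps only the facts a given view is interested in.
    public static List<ClassFact> filter(List<ClassFact> facts, Predicate<ClassFact> keep) {
        List<ClassFact> view = new ArrayList<>();
        for (ClassFact f : facts) {
            if (keep.test(f)) {
                view.add(f);
            }
        }
        return view;
    }

    // Step 7: a stand-in exporter that writes one text line per class.
    public static String export(List<ClassFact> facts) {
        StringBuilder sb = new StringBuilder();
        for (ClassFact f : facts) {
            sb.append("class ").append(f.name);
            if (f.superName != null) {
                sb.append(" --|> ").append(f.superName);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Step 1: legacy code as input (a tiny inline sample).
        String legacy = "class Person { String Name; } class Employee extends Person { }";
        List<ClassFact> facts = extract(tokenize(legacy));
        System.out.print(export(facts));
        // A second view (Multi-View): inheritance relationships only.
        System.out.print(export(filter(facts, f -> f.superName != null)));
    }
}
```

Keeping each step as a separate function mirrors the module boundaries described above; a real tool would replace the token scan with a proper parser and symbol tree.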
We realize the above algorithm in a simplified form adapted to class diagrams.
Step 1: Start with legacy code or source code as input.
Step 2: Take the specific Java file as input.
Step 3: The UML Doclet API processes the Java file. Any additional UMLGraph or javadoc arguments can be added at the end of the command line. The command reads the specification file (e.g. Test.java) and directly generates a diagram of the appropriate type.
This option provides the maximum flexibility. In order to run, javadoc needs access to tools.jar. This can be arranged in two ways:
1. Specify the location of tools.jar as a part of Java's classpath and specify the full name of the UMLGraph doclet as an argument to Java. This is an invocation example under Windows:
java -classpath "lib/UmlGraph.jar;c:\program files\java\jdk1.6.0_02\lib\Tools.jar" org.umlgraph.doclet.UmlGraph -package Test.java
and under Unix:
java -classpath '/usr/share/lib/UmlGraph.jar:/opt/java-1.6/lib/tools.jar' org.umlgraph.doclet.UmlGraph -package Test.java
2. Place the UmlGraph.jar file in a directory that also contains the Java SDK tools.jar file, then run:
java -jar /path/to/UmlGraph.jar yourfile1.java ...
Step 4: The UMLGraph and UML tool API extract the relevant data from the Java file:
javadoc -docletpath UmlGraph.jar -doclet org.umlgraph.doclet.UmlGraph -private Simple.java
4.1 Add command line option umlgen (generates UML diagrams if the source documentation contains) and umltypegen (generates UML diagrams for all documented classes and interfaces).
4.2 Add command line option umlpackagegen (generates UML diagrams for all documented packages).
4.3 Add command line option umloverviewgen (generates project overview UML diagrams).
4.4 Add command line option umlautogen (generates all types of UML diagrams).
Step 5: After step 4, the Maven API is added by UML Doclet.
Step 6: The class diagram is generated and displayed to the user.
The above algorithm is used specifically for class diagram generation: it takes a Java file as input and produces graphical output. Details are given in the appendix.
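The javadoc invocation from Step 4 can also be assembled programmatically. The sketch below is an assumption about how a tool might wrap that command, not part of RETUM or UMLGraph: the class `UmlGraphCommandSketch`, the method `buildCommand` and the `--run` switch are hypothetical, and only the arguments already quoted in Step 4 are used. Launching the process requires a JDK whose javadoc still accepts this doclet, with UmlGraph.jar present in the working directory.

```java
import java.util.ArrayList;
import java.util.List;

public class UmlGraphCommandSketch {

    // Builds the argument list for the javadoc call quoted in Step 4.
    public static List<String> buildCommand(String docletJar, String sourceFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("javadoc");
        cmd.add("-docletpath");
        cmd.add(docletJar);
        cmd.add("-doclet");
        cmd.add("org.umlgraph.doclet.UmlGraph");
        cmd.add("-private");
        cmd.add(sourceFile);
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildCommand("UmlGraph.jar", "Simple.java");
        System.out.println(String.join(" ", cmd));

        // Hypothetical switch: only launch javadoc when explicitly requested,
        // because the jar and source file must exist for the call to succeed.
        if (args.length > 0 && args[0].equals("--run")) {
            int exitCode = new ProcessBuilder(cmd).inheritIO().start().waitFor();
            System.out.println("javadoc exited with code " + exitCode);
        }
    }
}
```

Passing the arguments as a list rather than one string avoids quoting problems with paths that contain spaces, such as the Windows JDK path shown earlier.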
In this paper, we investigated various features of Imagix-4D and concentrated on its class diagram visualization. The class diagrams that Imagix-4D produces are complex and not easy to understand. The proposed tool, RETUM, addresses this inadequacy by illustrating a simple, comprehensible class diagram. We will also propose an extension of the Imagix-4D reverse engineering tool that draws sequence diagrams and ER diagrams as extended features.
Appendix 1: Discussion and Enlightenment of Class Diagram Tool Phase: Class diagrams characterize the static data and class structure of Java source code. To achieve such a diagrammatic representation, translation rules are defined that transform Java syntax into a class diagram.
This diagram is showing
This dialog box will appear
This dialog box will appear
Similarly, this dialog box will appear
When we click the Convert button after attaching the required file (.java), we get a dialog box, shown in
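// Base class of the sample hierarchy used as input to the tool.
class Person {
    String Name;

    public static void main(String a[]) {
    }
}

// Employee is a Person (generalization).
class Employee extends Person {
    public static void main(String a[]) {
    }
}

// Client is a Person (generalization).
class Client extends Person {
    public static void main(String a[]) {
    }
}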
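For the three sample classes above, one possible translation rule maps each class declaration to a node and each `extends` clause to a generalization edge. The following sketch writes the result as Graphviz DOT text. It is an illustration under stated assumptions rather than the rule set RETUM actually uses: the class `ClassDiagramDotSketch`, the method `toDot` and the regular expression are hypothetical, and attributes, methods and all relationships other than generalization are ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassDiagramDotSketch {

    // Matches "class X" with an optional "extends Y" clause.
    private static final Pattern CLASS_DECL =
            Pattern.compile("\\bclass\\s+(\\w+)(?:\\s+extends\\s+(\\w+))?");

    // Rule 1: every class becomes a box-shaped node.
    // Rule 2: every "extends" becomes an edge with a hollow arrowhead,
    //         the usual UML notation for generalization.
    public static String toDot(String javaSource) {
        StringBuilder dot = new StringBuilder("digraph classes {\n  node [shape=box];\n");
        Matcher m = CLASS_DECL.matcher(javaSource);
        while (m.find()) {
            dot.append("  ").append(m.group(1)).append(";\n");
            if (m.group(2) != null) {
                dot.append("  ").append(m.group(1)).append(" -> ").append(m.group(2))
                   .append(" [arrowhead=empty];\n");
            }
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        String source = "class Person { String Name; }\n"
                + "class Employee extends Person { }\n"
                + "class Client extends Person { }\n";
        System.out.print(toDot(source));
    }
}
```

The printed text can be rendered with any Graphviz installation; a regular expression is used here only for brevity and would mis-handle comments, string literals and generic type parameters that a real parser resolves.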
Standard Class Diagram Generated by RETUM Tool
In Imagix-4D generated class diagram