The massive growth of biological data has created a need for user-friendly bioinformatics tools that can be used for routine biological data manipulation. Bioanalyzer is a simple analytical software package that implements a variety of tools for common data analysis on different biological data types and databases. It covers general aspects of data analysis such as handling nucleotide data, retrieving information from different data formats, NGS quality control, data visualization, multiple sequence alignment and sequence BLAST. These tools accept common biological data formats and produce human-readable output files that can be stored on local machines. Bioanalyzer has a user-friendly graphical user interface that simplifies the analysis of massive biological data while consuming little memory and processing power. Its source code was written in the Python programming language, which keeps memory usage and initial startup time low. Bioanalyzer is free and open source software, so its code can be modified, extended or integrated into different bioinformatics pipelines. Bioinformatics produces huge amounts of data in FASTA and GenBank formats, from which a great deal of annotation information can be derived. Python opens the door to such bioinformatics tools because of its flexibility in data analysis and its simplicity, which inspired us to develop a new multi-tool software package able to manipulate FASTA and GenBank files. The goal was to develop new software that uses genomic data files to produce annotated data. The software was written using the Python programming language and Biopython packages.
Bioinformatics has evolved and expanded continuously over the past years and has become a basic requirement in life science research. Biological data on networks and in databases are growing enormously because of the massive amount of research done daily. Public databases are growing exponentially; for example, the NCBI Gene, Protein and Nucleotide databases reached 24, 300 and 210 million records in 2016, with annual growth rates of 13.8%, 37.7% and 5.2%, respectively [
The analysis and interpretation of biological data is becoming a major bottleneck in bioinformatics [
Tens of different general-use bioinformatics software packages are publicly available. DNASTAR (Lasergene) is a commercial bioinformatics software suite that comprises different applications such as gene discovery, genomic visualization, NGS assembly with Sanger validation, primer design, Sanger sequence assembly, sequence alignment and others [
In this study, we introduce Bioanalyzer, a bioinformatics tool that comprises simple and common data analysis applications with a user-friendly GUI. Bioanalyzer is open source and freely available, so its code can be modified, extended or integrated into different bioinformatics pipelines. It is a simple analytical software package that implements a variety of tools for performing common data analysis on different biological data types and databases.
Bioanalyzer was developed using Python libraries for data manipulation and the Tkinter package for the interface design. It uses about forty modules and functions from Biopython, integrated with open source scripts and scripts written by the authors.
Biopython is a Python library for genomic data analysis and annotation that provides a plethora of functionality, such as reading and extracting data from FASTA and GenBank files, multiple sequence alignment, BLAST searches against the NCBI database and even access to the NCBI database itself [
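To make the idea of reading records from a FASTA file concrete, the following is a minimal sketch that uses only the Python standard library. It is a stand-in for illustration, not the Biopython parser and not Bioanalyzer's own code; the function name `parse_fasta` and the two example records are hypothetical.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record FASTA text used only for demonstration.
example = StringIO(">seq1 hypothetical record\nATGGCC\nATTGTA\n>seq2\nGGGCCC\n")
records = list(parse_fasta(example))
```

In practice a library parser also handles GenBank annotations and malformed input, which this sketch does not attempt.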
To maximize the accuracy of protein alignment, PAM and BLOSUM matrices are used to score accepted mutations and find functional domains.
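As a rough illustration of how a substitution matrix scores a pair of aligned protein sequences, the sketch below sums per-position scores from a small lookup table. The three-residue table `TOY_MATRIX` holds placeholder values chosen for demonstration only; a real analysis would load a complete published PAM or BLOSUM matrix. The function names are hypothetical.

```python
# Placeholder scores for three residues (illustrative values only,
# to be replaced by a full published PAM or BLOSUM matrix).
TOY_MATRIX = {
    ("A", "A"): 4, ("G", "G"): 6, ("S", "S"): 4,
    ("A", "G"): 0, ("A", "S"): 1, ("G", "S"): 0,
}


def pair_score(a, b, matrix=TOY_MATRIX):
    """Return the substitution score for residues a and b (symmetric lookup)."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def ungapped_score(seq1, seq2, matrix=TOY_MATRIX):
    """Sum substitution scores over two equal-length, gap-free sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b, matrix) for a, b in zip(seq1, seq2))


total = ungapped_score("AGS", "ASS")  # A/A + G/S + S/S with the toy values
```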
Matplotlib is used for data visualization. In the software it draws chromosomes, restriction sites and dot plot graphs.
The Tkinter library is used to build the GUI, which consists of frames, buttons, text boxes, etc. Tkinter makes it possible to link scripts and functions to button presses and to display the resulting text in a text viewer [
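The pattern of linking a function to a button and showing its result in a text widget can be sketched as follows. This is an assumed minimal layout, not Bioanalyzer's actual interface; `reverse_complement` and `build_gui` are hypothetical names, and `tkinter` is imported inside `build_gui` so that the analysis function stays usable on machines without a display.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def build_gui():
    """Build a small window: one entry box, one button, one text viewer."""
    import tkinter as tk  # imported lazily; requires a graphical display

    root = tk.Tk()
    root.title("Reverse complement (sketch)")

    entry = tk.Entry(root, width=50)
    entry.pack()

    output = tk.Text(root, height=5, width=50)

    def on_click():
        # Clear the viewer, then display the result of the linked function.
        output.delete("1.0", tk.END)
        output.insert(tk.END, reverse_complement(entry.get()))

    tk.Button(root, text="Run", command=on_click).pack()
    output.pack()
    return root
```

Calling `build_gui().mainloop()` opens the window; keeping the analysis function separate from the widgets also makes it easy to test without the GUI.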
We used PyInstaller (http://www.pyinstaller.org/) to convert the Python file into a standalone executable application. PyInstaller collects the packages used by the Python software and bundles them as local copies in the software's directory, so that the software retrieves any function from the packages in this directory instead of calling packages installed on the system.
Bioanalyzer provides general aspects of data analysis such as handling nucleotide data, retrieving information from different data formats, NGS quality control, data visualization, multiple sequence alignment and sequence BLAST. The following sections describe each part of the software with sample results.
Nucleotide tools accept nucleotide sequence(s) or NCBI accessions as input. These tools provide DNA translation, GC%, reverse complement, transcription (
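The four operations named above can be sketched in plain Python as shown below. This is an illustrative implementation under simple assumptions (unambiguous A/C/G/T input, the standard genetic code, translation in the first reading frame), not the code used in Bioanalyzer; all function names are hypothetical.

```python
# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def transcribe(seq):
    """Coding-strand DNA to mRNA: replace T with U."""
    return seq.upper().replace("T", "U")


def translate(seq):
    """Translate frame 1; '*' marks stop codons, 'X' unknown codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))
```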
Data Extraction can be used to extract specific targeted information from GenBank sequence(s), with the option of choosing the output file content and name (
[
Database tools are useful for handling specific NCBI accessions in different databases for sequence retrieval in FASTA or GenBank format. This tool can help in discovering new mutations responsible for diseases by comparing different database records for the same gene in a specific gene family [
Alignment tools are among the most frequently used tools in bioinformatics. They perform local, global, Needleman-Wunsch (needle) or Smith-Waterman (water) alignment of nucleotide or protein sequences (
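For readers unfamiliar with global alignment, the following is a compact sketch of the Needleman-Wunsch dynamic programming algorithm with a linear gap penalty. The scoring values (match 1, mismatch -1, gap -1) are arbitrary assumptions chosen for the example, and the function is an illustration rather than the routine that Bioanalyzer calls.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the score matrix; the first row and column hold pure gap penalties.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


result = needleman_wunsch("ACGT", "AGT")
```

A local (Smith-Waterman) variant differs mainly in clamping cell scores at zero and starting the traceback from the highest-scoring cell.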
Visualization tools draw massive nucleotide sequences such as chromosome files, illustrating gene/CDS positions (
The weblogo tool illustrates the consensus sequence of the given record(s), which reflects the presence of functional domains in the protein, such as an active site or ligand binding site (
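The quantities behind a sequence logo can be sketched as follows: a consensus residue per alignment column, and the per-column information content in bits, computed as log2(alphabet size) minus the Shannon entropy. The sketch assumes gap-free, equal-length aligned sequences and omits the small-sample correction that logo software may apply; the function names and the four example sequences are hypothetical.

```python
import math
from collections import Counter


def column_counts(aligned):
    """Count residues in each column of equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must have the same length")
    return [Counter(s[i] for s in aligned) for i in range(length)]


def consensus(aligned):
    """Most frequent residue in each column."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(aligned))


def information_content(aligned, alphabet_size=4):
    """Bits per column; use alphabet_size=20 for protein alignments."""
    max_bits = math.log2(alphabet_size)
    bits = []
    for counts in column_counts(aligned):
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical aligned DNA sequences used only for demonstration.
alignment = ["ACGT", "ACGA", "ACGT", "ACCT"]
cons = consensus(alignment)
bits = information_content(alignment)
```

Fully conserved columns reach the maximum (2 bits for DNA), which is why conserved functional positions stand out as tall letters in a logo.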
Quality control tools handle FASTQ files for post-sequencing processing, such as primer and adapter trimming, to prepare the reads for downstream analyses such as genome assembly, mapping or any other application [
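Two common trimming steps on a single FASTQ read can be sketched as below: removing an adapter and everything after it, then removing a low-quality tail. The sketch assumes Phred+33 quality encoding, exact adapter matching and a quality threshold of 20, all of which are illustrative choices rather than Bioanalyzer's documented settings; the read, adapter string and function names are examples only.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual  # adapter not present; read unchanged
    return seq[:idx], qual[:idx]


def trim_low_quality_tail(seq, qual, threshold=20, offset=33):
    """Remove trailing bases whose Phred score is below the threshold."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]


# Example read: 8 genomic bases followed by an example adapter sequence.
adapter = "AGATCGGAAGAGC"
read_seq = "ACGTACGT" + adapter
read_qual = "IIIIII##" + "I" * len(adapter)  # 'I' = Q40, '#' = Q2

seq, qual = trim_adapter(read_seq, read_qual, adapter)
seq, qual = trim_low_quality_tail(seq, qual)
```

Dedicated trimming tools additionally tolerate mismatches and partial adapter matches at the read end, which this exact-match sketch does not.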
Bioanalyzer was written in the Python programming language (version 3.4+). It provides a set of new functions and tools, together with already available tools that were slightly modified to improve their functionality and present their output in a more ordered way, so that data analysis, extraction and visualization are all gathered in one software package. Additional Python code was written to provide the new tools. The source code, installer and manual are publicly available at http://www.ageri.sci.eg/index.php/facilities-services/ageri-softwares/bioanalyzer or https://github.peterhabib/com/bioanalyzer.
This research was funded by the corresponding author, Aladdin Hamweih, senior scientist in the Department of Biotechnology at the International Center for Agricultural Research in the Dry Areas (ICARDA).
The authors declare that there is no conflict of interest regarding the publication of this paper.
Habib, P.T., Alsamman, A.M. and Hamwieh, A. (2019) BioAnalyzer: Bioinformatic Software of Routinely Used Tools for Analysis of Genomic Data. Advances in Bioscience and Biotechnology, 10, 33-41. https://doi.org/10.4236/abb.2019.103003