What is BioShell
BioShell is a general bioinformatics toolkit, focused on biomolecular structures. It provides:
- Command-line applications that have been distributed since the original 1.0 version of the package. Some of them have been renamed (e.g. HCPM is now clust).
- Many (currently over a hundred) small applications that also serve as integration tests. They come with example input data and expected output.
- A Python library: the majority of BioShell classes can be used directly in Python.
- A C++ library, which offers highly optimized implementations of frequently used bioinformatics algorithms and protocols.
BioShell is a set of command-line programs for easy data manipulation from a UNIX-like terminal or a shell script. The programs read and write standard file formats and handle protein sequences and structures. The tools also help with simple calculations, such as sequence alignment, Phi/Psi angles, crmsd and many more. See the Programs page for details.
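To make one of the calculations mentioned above concrete, here is a minimal pure-Python sketch of a dihedral angle computation, the quantity behind Phi/Psi angles. It is not BioShell code and does not use the BioShell API; the function name `dihedral` and its helpers are hypothetical and chosen only for illustration.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four points.

    The angle is measured around the p1 -> p2 axis; the result lies
    in the range [-180, 180].
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal to the plane (p0, p1, p2)
    n2 = _cross(b1, b2)  # normal to the plane (p1, p2, p3)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))
```

For a residue i, Phi would be obtained as `dihedral(C[i-1], N[i], CA[i], C[i])` and Psi as `dihedral(N[i], CA[i], C[i], N[i+1])`, where each argument is the (x, y, z) coordinate tuple of the named backbone atom.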
BioShell tests & examples
Since the most recently published version 3.0, the BioShell package comes with an extensive set of example applications, which have been created to reach three goals simultaneously:
- to extend the set of BioShell command-line tools. Programs with names starting with ap_ are in fact further applications. The difference between these tests and the standard apps is that the latter perform only a single action and their command line is simplified. These programs also serve as integration tests.
- to provide high-quality code snippets that help BioShell users write their own programs. Small programs that show how to use a particular class or function are named ex_*. They also serve as unit tests.
- to test the code. Both ap_* and ex_* tests are automatically executed by a test server to ensure the quality and integrity of the package. The input data and the curated output of these tests are versioned in the git repository alongside the source code.
All the examples are included in the respective API documentation pages. Since the tests are run continuously, they serve as a source of validated snippets for creating future programs.
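The testing scheme described in this section, running a program on versioned input and comparing its output with a curated expected output, can be sketched in a few lines of Python. This is only an illustration of the general idea, not the BioShell test server: the function `run_and_compare` is hypothetical, and the command used below is a trivial stand-in for an ap_* or ex_* program.

```python
import difflib
import subprocess
import sys


def run_and_compare(command, expected_text):
    """Run `command`, capture its standard output and compare it with
    `expected_text`.

    Returns a unified diff as a list of lines; an empty list means that
    the actual output matches the expected output exactly.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    expected_lines = expected_text.splitlines()
    actual_lines = result.stdout.splitlines()
    return list(difflib.unified_diff(expected_lines, actual_lines,
                                     fromfile="expected", tofile="actual",
                                     lineterm=""))


if __name__ == "__main__":
    # Stand-in command (an assumption for this sketch): a real test would
    # run a tested program here and read the expected text from a file
    # kept under version control.
    command = [sys.executable, "-c", "print('hello')"]
    diff = run_and_compare(command, "hello\n")
    print("PASS" if not diff else "\n".join(diff))
```

Returning the diff rather than a plain boolean keeps the sketch useful for debugging: when a test fails, the differing lines can be reported directly.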
BioShell library for Python (aka PyBioShell)
The BioShell distribution also provides bindings to the Python scripting language; that is, BioShell is also a versatile library for Python scripting. BioShell objects can be imported like any other Python modules. Example scripts are also included in the repository.
A precompiled library (a single .so file) for Unix distributions can be downloaded from this page. The compilation process is described here.
BioShell C++ library
Finally, BioShell is a C++ software library. Both ap_* and ex_* BioShell tests are included in the respective API documentation pages. Since the tests are run continuously, they serve as a source of validated snippets for creating future programs.
BioShell versions 1.x
The original BioShell package was a suite of programs designed for pre- and post-processing in protein structure modeling protocols. The package provided a convenient set of tools for conversion between various sequence and structure formats. It was also possible to calculate simple properties of protein conformations. The very first commands (e.g. HCPM for clustering protein structures) were implemented in C; later, development switched to C++.
BioShell versions 2.x
Around 2006/07 BioShell was reimplemented in Java and designed as a library for scripting languages running on the Java Virtual Machine, most notably Python, but also Scala, Ruby, Groovy and many others. The most recent stable release of the 2.x line is 2.2. API docs and example scripts can be found in the documentation. All programs from the 1.x versions were also ported to Java.
- BioShell - the third version: J.M. Macnar, N.A. Szulc, A.E. Badaczewska-Dawid and D. Gront “Exhaustive tests set of BioShell 3.0 suite for structural bioinformatics” Bioinformatics, submitted
- Three-dimensional protein threading: D. Gront, M. Blaszczyk, P. Wojciechowski, A. Kolinski “Bioshell Threader: protein homology detection based on sequence profiles and secondary structure profiles.” Nucleic Acids Research 2012 doi:10.1093/nar/gks555
- One-dimensional protein threading: P. Gniewek, A. Kolinski, D. Gront “Optimization of profile-to-profile alignment parameters for one-dimensional threading.” J. Computational Biology 2012 19(7):879-886
- BioShell - the second version: D. Gront and A. Kolinski “Utility library for structural bioinformatics” Bioinformatics 2008 24(4):584-585
- BBQ - program for backbone reconstruction: D. Gront, S. Kmiecik, A. Kolinski “Backbone Building from Quadrilaterals. A fast and accurate algorithm for protein backbone reconstruction from alpha carbon coordinates.” J. Comput. Chemistry 2007 28(9):1593-1597
- BioShell - the first version: D. Gront and A. Kolinski “BioShell - a package of tools for structural biology computations” Bioinformatics 2006 22(5):621-622
- Program for clustering protein structures (currently named clust): D. Gront and A. Kolinski “HCPM - program for hierarchical clustering of protein models” Bioinformatics 2005 21(14):3179-3180