ap_rescore_alignmentΒΆ

Reads sequence alignment(s) in the FASTA format and recalculates scores. The input file may contained more than two sequences, i.e. may provide a Multiple Sequence Alignment. Every pair of aligned sequences is rescored in this case. Output values are printed on the screen. The default scoring parameters are: BLOSUM62, -10, -1

USAGE:

./ap_rescore_alignment input.fasta  [substitution-matrix  [gap_open [gap_extend] ] ]

EXAMPLE:

./ap_rescore_alignment test_inputs/2azaA_2pcyA-ali.fasta

Keywords:

Categories:

  • core/alignment/on_alignment_computations.cc

Input files:

Output files:

Program source:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include <iostream>
#include <utils/exit.hh>

#include <core/data/io/fasta_io.hh>
#include <core/alignment/PairwiseSequenceAlignment.hh>
#include <core/alignment/on_alignment_computations.hh>
#include <core/alignment/scoring/NcbiSimilarityMatrixFactory.hh>

using namespace core::data::io;
using namespace core::alignment::scoring;
using namespace core::data::sequence;

std::string program_info = R"(

Reads sequence alignment(s) in the FASTA format and recalculates scores. The input file
may contained more than two sequences, i.e. may provide a Multiple Sequence Alignment.
Every pair of aligned sequences is rescored in this case. Output values are printed on the screen.
The default scoring parameters are: BLOSUM62, -10, -1

USAGE:
./ap_rescore_alignment input.fasta  [substitution-matrix  [gap_open [gap_extend] ] ]

EXAMPLE:
./ap_rescore_alignment test_inputs/2azaA_2pcyA-ali.fasta

)";

/** @brief Estimates pairwise sequence similarity for a set of sequences given in a FASTA format
 *
 * CATEGORIES: core/alignment/on_alignment_computations.cc;
 * KEYWORDS:   sequence alignment; FASTA input; sequence alignment score
 * GROUP:      Alignments
 */
int main(const int argc, const char *argv[]) {

  if (argc < 2) utils::exit_OK_with_message(program_info); // --- complain about missing program parameter

  // --- read input database fasta file
  std::vector<Sequence_SP> ali_fasta;  // --- stores sequences (should be already aligned)
  core::data::io::read_fasta_file(argv[1], ali_fasta);

  std::string matrix_name = (argc>2) ? argv[2] : "BLOSUM62";
  short gap_open = (argc > 3) ? atoi(argv[3]) : -10;
  short gap_extend = (argc > 4) ? atoi(argv[4]) : -1;
  NcbiSimilarityMatrix_SP sim_m = NcbiSimilarityMatrixFactory::get().get_matrix(matrix_name);
  // --- prints both fasta sequences names and their recalculated score
  for (size_t i = 1; i < ali_fasta.size(); ++i)
    for (size_t j = 0; j < i; ++j)
      std::cout << std::setw(10) << ali_fasta[i]->header().substr(0, 10) << " "
                << std::setw(10) << ali_fasta[j]->header().substr(0, 10)
                << " " << core::alignment::calculate_score(*ali_fasta[i], *ali_fasta[j], *sim_m, gap_open, gap_extend)
                << "\n";
}
../_images/file_icon.png