Profile

A position-specific scoring matrix (PSSM) is used to score the alignment between a protein sequence (or, analogously, a DNA or RNA sequence) and a profile, which represents a multiple sequence alignment (MSA). Each score indicates how likely a specific nucleotide or amino acid is to occur at a particular position in the sequence. Several models exist for calculating it.
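
To make the scoring idea concrete, here is a minimal sketch of how an ungapped sequence could be scored against a PSSM by summing one score per position. The four-letter alphabet, the scores, and the function name `score_sequence` are made up for illustration and are not part of pypropel.

```python
# Toy PSSM: one row per profile position, one column per residue.
# The alphabet and the scores are invented for illustration only.
ALPHABET = "ACDE"
toy_pssm = [
    [2.0, -1.0, 0.0, -3.0],   # position 1
    [-2.0, 4.0, -1.0, 0.0],   # position 2
    [0.0, -1.0, 3.0, 1.0],    # position 3
]


def score_sequence(seq, pssm, alphabet=ALPHABET):
    """Sum the position-specific scores of an ungapped sequence."""
    if len(seq) != len(pssm):
        raise ValueError("sequence and profile must have the same length")
    col = {aa: i for i, aa in enumerate(alphabet)}
    return sum(row[col[aa]] for aa, row in zip(seq, pssm))


print(score_sequence("ACD", toy_pssm))  # 2.0 + 4.0 + 3.0 = 9.0
```

A higher total means the sequence fits the profile better at these positions.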

Blast

The Basic Local Alignment Search Tool (BLAST) [1] can be used to generate one kind of PSSM. The resulting PSSM file can be read as shown below.

Python

import pypropel as pp

# `to` is assumed to be a helper, defined elsewhere, that resolves the path to the example data.
pssm = pp.fpmsa.pssm(
    fpn=to('data/pssm/1aigL.pssm'),
    mode='blast',
)

Output

{1: [-5.0, 1.0, 3.0, 2.0, 3.0, 1.0, 3.0, 2.0, 2.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, -1.0, -1.0, 1.0, 4.0, 3.0], 2: [2.0, 2.0, 4.0, 3.0, 1.0, 2.0, 3.0, -2.0, 2.0, -3.0, -6.0, 3.0, 3.0, 1.0, 1.0, 2.0, -0.0, -1.0, 3.0, 2.0], ...}
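
The returned dictionary maps each residue position to a list of 20 scores. As a small sketch of how it might be inspected, the snippet below finds the highest-scoring column at each position. It uses only the two rows printed above. Mapping a column index back to an amino acid depends on the column order, which is not stated here, so the example reports indices only.

```python
# First two rows copied from the output above; the real dictionary
# holds one entry per residue position.
pssm = {
    1: [-5.0, 1.0, 3.0, 2.0, 3.0, 1.0, 3.0, 2.0, 2.0, 3.0,
        2.0, 2.0, 2.0, 2.0, 2.0, -1.0, -1.0, 1.0, 4.0, 3.0],
    2: [2.0, 2.0, 4.0, 3.0, 1.0, 2.0, 3.0, -2.0, 2.0, -3.0,
        -6.0, 3.0, 3.0, 1.0, 1.0, 2.0, -0.0, -1.0, 3.0, 2.0],
}

# Zero-based index of the highest-scoring column at each position.
best_column = {
    pos: max(range(len(scores)), key=scores.__getitem__)
    for pos, scores in pssm.items()
}

print(best_column)  # {1: 18, 2: 2}
```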

HHM

An HHM profile is built from a profile hidden Markov model (HMM). The .hhm file is generated by HHblits [2]. Like a PSSM, it describes how conserved each amino acid is at each column of an MSA.

Python

import pypropel as pp

# `to` is assumed to be a helper, defined elsewhere, that resolves the path to the example data.
hhm = pp.fpmsa.pssm(
    fpn=to('data/hhm/1aigL.hhm'),
    mode='hhm',
)

Output

{1: array([1. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
       0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. ,
       0. , 0.5, 1. , 1. ]), 2: array([0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. ,
       0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. , 0. , 0. ,
       0. , 0.5, 1. , 1. ]), ...}
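
The values 0, 0.5 and 1 in the output above are consistent with the encoding used in .hhm files, where a probability p is stored as the integer -1000 * log2(p) and `*` marks a probability of zero. The sketch below shows that conversion on its own. Whether pypropel applies exactly this transformation is an assumption, and the function name and sample fields are hypothetical.

```python
def hhm_value_to_prob(token):
    """Convert one .hhm field to a probability.

    The file stores -1000 * log2(p); '*' stands for p = 0.
    """
    if token == "*":
        return 0.0
    return 2.0 ** (-int(token) / 1000.0)


# Hypothetical fields, as they might appear on one line of an .hhm file.
fields = ["0", "*", "1000", "2000"]
print([hhm_value_to_prob(f) for f in fields])  # [1.0, 0.0, 0.5, 0.25]
```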

Evolutionary profile

The evolutionary profile is calculated as the frequency of each amino acid at each MSA column [2].

Python

import pypropel as pp

# `msa` is assumed to be a multiple sequence alignment that was loaded beforehand.
ep = pp.fpmsa.pssm(
    msa=msa,
    mode='ep',
)

Output

{'A': array([8.25272552e-01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
       0.00000000e+00, 0.00000000e+00, 5.30380817e-03, 0.00000000e+00,
       0.00000000e+00, 1.06076163e-03, ..., 3.18228490e-03, 1.16683780e-02,
       1.06076163e-03]), 'C': array([0.00000000e+00, ...
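
To illustrate the per-column frequency idea, here is a minimal standalone sketch that produces a result of the same shape, keyed by amino acid with one value per MSA column. The toy alignment, the function name `column_frequencies`, and the choice to divide by the total number of sequences (so gaps still count in the denominator) are assumptions for illustration, not pypropel's implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa, alphabet=AMINO_ACIDS):
    """Return {residue: [frequency at column 0, column 1, ...]}."""
    n_seqs = len(msa)
    # zip(*msa) walks the alignment column by column.
    counts = [Counter(column) for column in zip(*msa)]
    return {aa: [c[aa] / n_seqs for c in counts] for aa in alphabet}


# Toy alignment of four sequences and three columns; '-' is a gap.
toy_msa = ["ACD", "ACE", "A-D", "GCD"]
freqs = column_frequencies(toy_msa)

print(freqs["A"])  # [0.75, 0.0, 0.0]
print(freqs["C"])  # [0.0, 0.75, 0.0]
print(freqs["D"])  # [0.0, 0.0, 0.75]
```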

To get the normalised evolutionary profile, set mode to 'ep_norm'.

mode='ep_norm'

  1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403-410. doi: 10.1016/S0022-2836(05)80360-2.

  2. Hönigschmid P, Frishman D. Accurate prediction of helix interactions and residue contacts in membrane proteins. J Struct Biol. 2016 Apr;194(1):112-23. doi: 10.1016/j.jsb.2016.02.005. Epub 2016 Feb 3. PMID: 26851352.