
JSD

1. Download the JSD package

The standalone JSD package can be downloaded from https://compbio.cs.princeton.edu/conservation/. It uses the Jensen-Shannon divergence to score residue conservation in each column of a multiple sequence alignment (MSA).

Tip

The JSD output can be used as a feature that describes the evolutionary profile of the residues at each MSA column.
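To make the per-column score concrete, here is a minimal sketch of the Jensen-Shannon divergence between the residue distribution of one alignment column and a background distribution. It is a simplified illustration, not the package's implementation: the uniform background, the function names, and the omission of sequence weighting, pseudocounts, and gap handling are all assumptions made for brevity.

```python
import math
from collections import Counter


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two distributions given as dicts."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero probability contribute nothing
        return sum(a[k] * math.log2(a[k] / b[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def column_distribution(column):
    """Residue frequencies of one MSA column, ignoring gap characters."""
    residues = [c for c in column if c != '-']
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}


# Hypothetical background: uniform over the 20 standard amino acids.
background = {aa: 1 / 20 for aa in 'ACDEFGHIKLMNPQRSTVWY'}

conserved = column_distribution('LLLLLL')  # fully conserved column
variable = column_distribution('LAGKDE')   # highly variable column

score_conserved = js_divergence(conserved, background)
score_variable = js_divergence(variable, background)
```

With this definition, a conserved column diverges more from the background than a variable one, so higher scores indicate stronger conservation.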

2. Running the JSD package

We wrote a wrapper for running the package on computer clusters to batch-process MSA files. It takes MSAs in Clustal format as input.

Before using it, we need to set a few parameters.

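# Note: `to` is not defined in this snippet. It is assumed to be a path helper
# that turns a project-relative path into a full path.

# Command-line flags of score_conservation.py, keyed by parameter name.
param_config = {
    'method': '-s',  # scoring method
    # 'window': '-w',  # window size
    # 'distance': '-d',  # background distribution file
    'sv_fp': '-o',  # output file
    'clustal_fp': '',  # the alignment is passed without a flag
}

# Values paired with the flags above, plus the interpreter and script to run.
value_config = {
    'tool_fp': 'python',
    'method': 'js_divergence',
    'window': '3',
    'distance': 'swissprot.distribution',
    'script_fpn': to('prot/feature/alignment/external/jsd/score_conservation.py'),
    'clustal_fp': to('data/msa/clustal/wild/SR24_AtoI/'),  # input directory (trailing slash required)
    'sv_fp': to('data/jsd/SR24_AtoI/'),  # output directory (trailing slash required)
}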

The file prot.txt lists 7 proteins:

ATAD2_LOC113841329
CAMK1G
CYP2W1_LOC101804267
KIF27
KIF27_LOC113841629
LOC119718710
RBBP8NL
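As an illustration of how such a list can be turned into the set of protein names to process, the sketch below reads one name per line and keeps each name once, in order of first appearance. It uses only the standard library; the function name `read_protein_names` is hypothetical, and the one-name-per-line layout of prot.txt is an assumption based on the listing above.

```python
import io


def read_protein_names(handle):
    """Return unique, non-empty names from a text handle, in order of first appearance."""
    seen = {}
    for line in handle:
        name = line.strip()
        if name:
            seen.setdefault(name, None)  # dicts preserve insertion order
    return list(seen)


# In-memory stand-in for prot.txt, so the sketch runs without any files.
example = io.StringIO(
    "ATAD2_LOC113841329\nCAMK1G\nCYP2W1_LOC101804267\nKIF27\n"
    "KIF27_LOC113841629\nLOC119718710\nRBBP8NL\n"
)
prot_names = read_protein_names(example)
```

For a real file, the same function can be called on a handle opened with `open(path)`.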

Then, we can use pp.external.jsd to run JSD for the set of proteins.

import pypropel as pp
from pypropel.util.Reader import Reader as pfreader

# Read the protein names; column 0 of the table holds the names.
df = pfreader().generic(df_fpn=to('data/msa/clustal/wild/SR24_AtoI/prot.txt'))
prots = df[0].unique()

# Build one command per protein and pass it to the wrapper.
for key, prot in enumerate(prots):
    order_list = [
        value_config['tool_fp'],
        value_config['script_fpn'],
        param_config['method'], value_config['method'],
        # param_config['window'], value_config['window'],
        # param_config['distance'], value_config['distance'],
        # Output file: <sv_fp><protein>.jsd
        param_config['sv_fp'], value_config['sv_fp'] + prot + '.jsd',
        # Input alignment: <clustal_fp><protein>.clustal
        param_config['clustal_fp'], value_config['clustal_fp'] + prot + '.clustal',
    ]
    pp.external.jsd(
        order_list=order_list,
        job_fp='./',
        job_fn=str(key),  # the job is named after the loop index
    )
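To see what a single entry of the loop amounts to, the sketch below assembles the same flag and value pairs for one protein and prints them as a shell command. It is an illustration only: the helper `build_command` is hypothetical, the paths are written as plain relative strings rather than resolved through `to`, and how pp.external.jsd actually turns order_list into a job is not shown in this document. The empty flag for the alignment is dropped here so that the file is passed as a positional argument.

```python
import shlex

# Flags and values mirroring the configuration above (paths shortened for illustration).
params = {'method': '-s', 'sv_fp': '-o', 'clustal_fp': ''}
values = {
    'tool_fp': 'python',
    'script_fpn': 'score_conservation.py',
    'method': 'js_divergence',
    'sv_fp': 'data/jsd/SR24_AtoI/',
    'clustal_fp': 'data/msa/clustal/wild/SR24_AtoI/',
}


def build_command(prot, params, values):
    """Return the argument list for scoring one protein's alignment."""
    parts = [
        values['tool_fp'],
        values['script_fpn'],
        params['method'], values['method'],
        params['sv_fp'], values['sv_fp'] + prot + '.jsd',
        params['clustal_fp'], values['clustal_fp'] + prot + '.clustal',
    ]
    # Drop empty flags so they do not become empty arguments.
    return [part for part in parts if part]


command = build_command('CAMK1G', params, values)
command_line = shlex.join(command)
print(command_line)
```

Keeping flags and values in two parallel dictionaries makes it easy to switch an option on or off by commenting out a single pair, as done for the window and distance options above.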