Entropy#
After generating LIPS results for protein 1xqf
chain A
, we can utilise them to annotate the protein at both the surface level and per-residue level.
In this tutorial, we demonstrate how to use TMKit to annotate each residue with entropy scores of the LIPS results that it belongs to.
Reminder of data
Please make sure that the build-in example dataset has been downloaded before you walk through the tutorial.
Example usage#
We can first read the file of a helix surface ID id=1
using the following code. Please remember that we have put the results for protein 1xqf
chain A
in folder ./data/lips/
as in the last tutorial.
import tmkit as tmk
df = tmk.feature.get_surf_entropy(
fp='./data/lips/',
prot_name='1xqf',
file_chain='A',
id=1,
)
print(df)
aa_ids ents
0 2 1.158
1 5 6.798
2 6 4.896
3 9 4.852
4 12 3.694
.. ... ...
150 352 4.846
151 355 4.551
152 356 3.724
153 359 3.276
154 362 4.539
[155 rows x 2 columns]
The output is a dataframe containing 2 columns, that is, aa_ids and ents, as shown below.
Attribute |
Description |
---|---|
aa_ids |
amino acid ID |
ents |
entropy score |
If we want to annotate all amino acids with the entropy scores, we need to use all 7 surface data. In TMKit, we can do it this way.
import tmkit as tmk
_, _, entropy_dict, _ = tmk.feature.read(
fp='./data/lips/',
prot_name='1xqf',
file_chain='A',
)
print(entropy_dict)
{1.0: 1.125, 4.0: 5.71, 5.0: 6.798, ..., 357.0: 5.976, 360.0: 1.749}
Actually, we can use TMKit to check the summary about all 7 surfaces, which shows the overall entropy score at the surface level.
import tmkit as tmk
df = tmk.feature.get_helix_all_surf_entropy(
fp='./data/lips/',
prot_name='1xqf',
file_chain='A',
)
print(df)
surfs ents
0 5 4.846
1 0 4.912
2 3 4.852
3 1 4.885
4 2 4.749
5 6 4.746
6 4 4.948
Attributes#
Attribute |
Description |
---|---|
prot_name |
name of a protein in the prefix of a PDB file name (e.g., 1xqf in 1xqfA.pdb) |
file_chain |
chain of a protein in the prefix of a PDB file name (e.g., A in 1xqfA.pdb) |
fp |
path where the LIPS results for a protein are placed |
id |
surface id, 0-6 |
See also
Please see here for better understanding the file-naming system.
Output#
Please check the output in each vignette above.