Lipophilicity score

Lipophilicity score#

After generating LIPS results for protein 1xqf chain A, we can utilize them to annotate the protein at both the surface level and per-residue level.

In this tutorial, we demonstrate how to use TMKit to annotate each residue with its corresponding lipophilicity scores derived from the LIPS results.

Example usage#

We can first read the file of a helix surface ID id=1 using the following code. Please remember that we have put the results for protein 1xqf chain A in folder ./data/lips/ as in the last tutorial.

Code

import tmkit as tmk

df = tmk.feature.get_surf_lips(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
    id=1,
)
print(df)

Output

    aa_ids  lipos
       2  0.026
       5  0.804
       6  0.573
       9  0.697
      12  0.973
..      ...    ...
   352  0.828
   355  0.688
   356  0.535
   359  0.984
   362  0.615

[155 rows x 2 columns]

The output is a dataframe containing 2 columns, that is, aa_ids and lipos, as shown below.

Attribute	Description
aa_ids	amino acid ID
lipos	lipophilicity score

If we want to annotate all amino acids with the lipophilicity scores, we need to use all 7 surface data. In TMKit, we can do it this way.

Code

import tmkit as tmk

_, lipos_dict, _, _ = tmk.feature.read(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
)
print(lipos_dict)

Output

{1.0: 0.018, 4.0: 0.74, 5.0: 0.804, ..., 357.0: 0.727, 360.0: 1.174}

Actually, we can use TMKit to check the summary about all 7 surfaces, which shows the overall lipophilicity score at the surface level.

Code

import tmkit as tmk

df = tmk.feature.get_helix_all_surf_lips(
    prot_name='1xqf',
    file_chain='A',
    fp='./data/lips/',
)
print(df)

Output

   surfs  lipos
    5  1.834
    0  1.770
    3  1.729
    1  1.815
    2  1.791
    6  1.777
    4  1.767

Additionally, there are average lipophilicity scores at the surface level, which can be accessed as follows. The column lxe represents the average lipophilicity scores.

Code

import tmkit as tmk

df = tmk.feature.get_helix_all_surf_avelips(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
)
print(df)

Output

   surfs    lxe
    5  8.889
    0  8.694
    3  8.389
    1  8.865
    2  8.507
    6  8.435
    4  8.741

We can continue to make most of the average lipophilicity scores to annotate helix surfaces.

Code

import tmkit as tmk

_, _, _, lips_dict = tmk.feature.read(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
)
print(lips_dict)

Output

{5.0: [1.834, 4.846, 8.889], 0.0: [1.77, 4.912, 8.694], 3.0: [1.729, 4.852, 8.389], 1.0: [1.815, 4.885, 8.865], 2.0: [1.791, 4.749, 8.507], 6.0: [1.777, 4.746, 8.435], 4.0: [1.767, 4.948, 8.741]}

We finally sort out an overall dataframe containing the LIPS results for all amino acids using the code below.

Code

import tmkit as tmk

df_surf = tmk.feature.read_helix_all_surf(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
)

df = tmk.feature.torosseta(
    fp='./data/lips/',
    prot_name='1xqf',
    file_chain='A',
    df_surf_lips=df_surf,
)
print(df)

Output

======>reading surface 5
======>reading surface 0
======>reading surface 3
======>reading surface 1
======>reading surface 2
======>reading surface 6
======>reading surface 4
      aa_ids  mean_lipo  lipos   ents
0          6      8.435  0.573  4.896
1          9      8.435  0.697  4.852
2         10      8.435  1.621  2.735
3         13      8.435  0.838  5.327
4         16      8.435  0.722  4.914
...      ...        ...    ...    ...
1080     358      8.507  0.522  2.980
1081     359      8.507  0.984  3.276
1082     362      8.507  0.615  4.539
1083       1      8.507  0.018  1.125
1084       2      8.507  0.026  1.158

[1085 rows x 4 columns]

Attributes#

Attribute	Description
prot_name	name of a protein in the prefix of a PDB file name (e.g., 1xqf in 1xqfA.pdb)
file_chain	chain of a protein in the prefix of a PDB file name (e.g., A in 1xqfA.pdb)
fp	path where the LIPS results for a protein are placed
id	surface id, 0-6

Output#

Please check the output in each vignette above.

Lipophilicity score

Contents

Lipophilicity score#

Example usage#

Attributes#

Output#