Single site

We need to install TMKit [1] to read a protein sequence and generate single-site positions, each extended by a sliding window. These steps initialize the feature vector so that it is ready to be filled with site-wise features.

After installation, we first read 1aigL.fasta.

Python

import tmkit as tmk

sequence = tmk.seq.read_from_fasta(
    fasta_fpn='./data/fasta/1aigL.fasta'
)
print(sequence)

Output

ALLSFERKYRVPGGTLVGGNLFDFWVGPFYVGFFGVATFFFAALGIILIAWSAVLQGTWNPQLISVYPPALEYGLGGAPLAKGGLWQIITICATGAFVSWALREVEICRKLGIGYHIPFAFAFAILAYLTLVLFRPVMMGAWGYAFPYGIWTHLDWVSNTGYTYGNFHYNPAHMIAISFFFTNALALALHGALVLSAANPEKGKEMRTPDHEDTFFRDLVGYSIGTLGIHRLGLLLSLSAVFFSALCMIITGTIWFDQWVDWWQWWVKLPWWANIPGGING
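
For readers without the example file at hand, the following is a minimal sketch of what reading a single-record FASTA file amounts to. It is not TMKit's implementation: the helper name parse_single_fasta and the toy record are hypothetical, and the sketch works on FASTA text rather than a file path.

```python
def parse_single_fasta(text):
    """Return the sequence of the first record in FASTA-formatted text.

    Hypothetical helper for illustration only; not part of TMKit.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    residues = []
    for line in lines[1:]:
        if line.startswith(">"):  # stop at the next record
            break
        residues.append(line)
    return "".join(residues)


# Toy record (made up for this sketch): a header line plus two sequence lines.
toy_fasta = ">toy_record\nALLSFERKYR\nVPGGTLVGGN\n"
toy_sequence = parse_single_fasta(toy_fasta)
print(toy_sequence)
```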

Generating the positions of all possible residues.

Python

pos_list = tmk.seq.pos_list_single(
    len_seq=len(sequence),
    seq_sep_superior=None,
    seq_sep_inferior=0,
)
print(pos_list)

Output

[
    [1],
    [2],
    [3],
    ...,
    [280],
    [281],
]
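
The output above is consistent with one single-element list per residue, numbered from 1. A minimal sketch of that layout, assuming nothing about how TMKit builds it internally; the helper single_site_positions is hypothetical and does not model the seq_sep_superior and seq_sep_inferior arguments.

```python
def single_site_positions(len_seq):
    """Return one [position] entry per residue, numbered from 1.

    Hypothetical helper for illustration only; not part of TMKit.
    """
    return [[i] for i in range(1, len_seq + 1)]


toy_pos_list = single_site_positions(len_seq=5)
print(toy_pos_list)
```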

Adding amino acid types and IDs to the positions of all possible residues.

Python

positions = tmk.seq.pos_single(sequence=sequence, pos_list=pos_list)
print(positions)

Output

[
    [1, 'A', 1, 0],
    [2, 'L', 2, 0],
    [3, 'L', 3, 0],
    ...,
    [280, 'N', 280, 0],
    [281, 'G', 281, 0],
]
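
Each row in the output above holds a position, the residue at that position, an ID that equals the position, and a trailing 0. The sketch below reproduces only that layout. It assumes the ID is simply the 1-based position, and it copies the trailing 0 as a constant because its meaning is not stated in this section; annotate_positions is a hypothetical name, not a TMKit function.

```python
def annotate_positions(sequence, pos_list):
    """Build [position, residue, id, 0] rows from single-site positions.

    Hypothetical helper for illustration only; not part of TMKit.
    """
    rows = []
    for (pos,) in pos_list:
        residue = sequence[pos - 1]  # positions are 1-based, strings are 0-based
        # Assumption: the ID equals the position; the final 0 is copied as-is.
        rows.append([pos, residue, pos, 0])
    return rows


toy_rows = annotate_positions("ALLS", [[1], [2], [3], [4]])
print(toy_rows)
```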

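Applying a sliding window to each residue. With window_size=1, each window covers the residue itself and one neighbor on either side; positions that fall outside the sequence are reported as None.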

Python

win_aa_ids = tmk.seq.win_id_single(
    sequence=sequence,
    position=positions,
    window_size=1,
)
print(win_aa_ids)

Output

[
    [None, 1, 2],
    [1, 2, 3],
    [2, 3, 4],
    ...,
    [279, 280, 281],
    [280, 281, None],
]
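
The windowing shown in the output above can be sketched in a few lines of plain Python. This is an illustration of the idea under the assumption that out-of-range neighbors are padded with None, as the output suggests; window_ids is a hypothetical helper and not TMKit's code.

```python
def window_ids(len_seq, position, window_size):
    """IDs of the residues within window_size of a 1-based position.

    Hypothetical helper for illustration only; not part of TMKit.
    Neighbors that fall off either end of the sequence become None.
    """
    return [
        i if 1 <= i <= len_seq else None
        for i in range(position - window_size, position + window_size + 1)
    ]


# One window per residue of a toy 5-residue sequence.
toy_windows = [window_ids(len_seq=5, position=p, window_size=1) for p in range(1, 6)]
print(toy_windows)
```

Each window has 2 * window_size + 1 entries, so a larger window_size adds more None padding at both ends of the sequence.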

Retrieving the amino acid types for each window.

Python

win_aas = tmk.seq.win_name_single(
    sequence=sequence,
    position=positions,
    window_size=1,
    mids=win_aa_ids,
)
print(win_aas)

Output

[
    [None, 'A', 'L'],
    ['A', 'L', 'L'],
    ['L', 'L', 'S'],
    ...,
    ['G', 'I', 'N'],
    ['I', 'N', 'G'],
    ['N', 'G', None],
]
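
The residue windows above line up one-to-one with ID windows: each ID is replaced by the residue at that position, and None stays None. A minimal sketch of that mapping, with a hypothetical helper name and toy inputs rather than TMKit's own code:

```python
def window_residues(sequence, id_windows):
    """Replace each 1-based residue ID in a window by its amino acid letter.

    Hypothetical helper for illustration only; not part of TMKit.
    None entries (window padding) are kept as None.
    """
    return [
        [None if i is None else sequence[i - 1] for i in window]
        for window in id_windows
    ]


toy_seq = "ALLSF"
toy_id_windows = [[None, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, None]]
toy_names = window_residues(toy_seq, toy_id_windows)
print(toy_names)
```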

Initializing the feature vector.

Python

features = [[] for _ in range(len(sequence))]
print(features)
print(len(features))

Output

[[], [], [], ..., [], [], []]
281
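
As one possible illustration of feeding site-wise features into such a vector, the sketch below appends a one-hot encoding of every residue in a window to that site's feature list. The encoding, the alphabet order, and the names AMINO_ACIDS and one_hot are assumptions made for this example; they are not prescribed by TMKit.

```python
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """20-dimensional one-hot vector; all zeros for None (window padding)."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


# Toy residue windows in the same layout as win_aas above.
toy_win_aas = [[None, 'A', 'L'], ['A', 'L', 'L'], ['L', 'L', None]]

# One independent, initially empty feature list per site.
toy_features = [[] for _ in range(len(toy_win_aas))]
for i, window in enumerate(toy_win_aas):
    for residue in window:
        toy_features[i].extend(one_hot(residue))

print(len(toy_features), len(toy_features[0]))
```

The list comprehension matters here: writing [[]] * n would make every site share the same inner list, so extending one site's features would change all of them.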


  1. Jianfeng Sun, Arulsamy Kulandaisamy, Jinlong Ru, M Michael Gromiha, Adam P Cribbs, TMKit: a Python interface for computational analysis of transmembrane proteins, Briefings in Bioinformatics, Volume 24, Issue 5, September 2023, bbad288, https://doi.org/10.1093/bib/bbad288