TMKit: transmembrane protein analysis
TMKit is an open-source, modular, and scalable Python programming interface designed specifically for processing transmembrane protein data. It lets users perform database wrangling, engineer features at the mutational, domain, and topological levels, and visualise protein-protein interaction interfaces. TMKit also includes seqNetRR, a high-performance computing library for customised construction and rewiring of residue connections, which is particularly well suited to assigning coevolutionary features quickly.

Features
- handling multiple kinds of transmembrane protein data
- fast speed
- structural visualisation
Functionalities
TMKit provides 9 function classes for transmembrane protein sequence and structure analysis: visualisation, sequence, quality control, topology, mapping, annotation, connectivity, edge extraction, and feature.

Sequence: a fundamental component that handles sequence reading in diverse formats, sequence retrieval from various sources, and MSA generation.

Topology: TM and non-TM topologies (side 1, side 2, strand, coil, inside, loop, and interfacial), either structure-derived (TOPDB) or predicted (TMHMM and Phobius).
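Per-residue topology annotations like those above are often easier to work with as contiguous segments. The following is a minimal sketch, not TMKit's own API: the function name `topology_segments` and the label alphabet (`i` for inside, `M` for membrane, `o` for outside) are assumptions chosen for illustration.

```python
from itertools import groupby


def topology_segments(labels):
    """Collapse a per-residue label string into (label, start, end) runs.

    Positions are 1-based and inclusive. The label alphabet is whatever
    the caller supplies; nothing here is specific to one predictor.
    """
    segments = []
    position = 1
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, position, position + length - 1))
        position += length
    return segments


# Hypothetical label string: 2 inside, 3 membrane, 2 outside residues.
example = "iiMMMoo"
for label, start, end in topology_segments(example):
    print(label, start, end)
```

Keeping the segments as plain tuples makes it straightforward to filter for one topology type, for example only the membrane-spanning runs.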

Annotation: annotation of amino acid residues with biological functions through the MutHTP, Pred-MutHTP, and CATH databases.

Edge extraction: a high-performance computing library (seqNetRR) that extracts connections between residues by building graphs and assigns features quickly.
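To make the idea of residue connections concrete, the sketch below enumerates residue pairs as graph edges and collects a sequence window around a residue. It is written in plain Python and does not use or reproduce seqNetRR; the function names `residue_pairs` and `window`, and the parameters `min_separation` and `size`, are hypothetical.

```python
from itertools import combinations


def residue_pairs(sequence, min_separation=1):
    """Return 1-based (i, j) pairs with j - i >= min_separation."""
    n = len(sequence)
    return [
        (i, j)
        for i, j in combinations(range(1, n + 1), 2)
        if j - i >= min_separation
    ]


def window(position, length, size):
    """Return 1-based positions within `size` residues of `position`,
    clipped to the range 1..length."""
    start = max(1, position - size)
    end = min(length, position + size)
    return list(range(start, end + 1))


sequence = "ACDE"  # toy sequence for illustration only
pairs = residue_pairs(sequence, min_separation=2)
print(pairs)
print(window(1, len(sequence), 1))
```

Features for a residue pair could then be assembled from the windows around both residues, which is one common way to attach coevolutionary or positional features to an edge.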

Visualisation: identification of protein-protein interaction (PPI) interfaces, which are critical to understanding biological processes.

Quality control: evaluation criteria, including the experimental method used, resolution, subclass, and sequence length, for qualifying proteins.
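A quality-control step of this kind can be pictured as a filter over structure records. The following is a minimal sketch under stated assumptions: the `StructureRecord` class, the `passes_quality_control` function, the record identifiers, and all threshold values are hypothetical and are not taken from TMKit.

```python
from dataclasses import dataclass


@dataclass
class StructureRecord:
    identifier: str       # placeholder ID, not a real PDB entry
    method: str           # experimental method, e.g. "X-RAY DIFFRACTION"
    resolution: float     # in angstroms
    sequence_length: int  # number of residues


def passes_quality_control(
    record,
    allowed_methods=("X-RAY DIFFRACTION",),
    max_resolution=3.5,   # assumed cut-off for illustration
    min_length=30,        # assumed cut-off for illustration
    max_length=1000,      # assumed cut-off for illustration
):
    """Return True if the record meets every criterion."""
    return (
        record.method in allowed_methods
        and record.resolution <= max_resolution
        and min_length <= record.sequence_length <= max_length
    )


records = [
    StructureRecord("demo1", "X-RAY DIFFRACTION", 2.1, 250),
    StructureRecord("demo2", "SOLUTION NMR", 0.0, 120),
    StructureRecord("demo3", "X-RAY DIFFRACTION", 4.2, 400),
]
qualified = [r.identifier for r in records if passes_quality_control(r)]
print(qualified)
```

Exposing the thresholds as keyword arguments keeps the criteria explicit, so a dataset can be rebuilt later with different cut-offs.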

Mapping: identifier mapping between structural and sequence data (e.g., FASTA IDs and PDB IDs) to ensure that biological findings are interpreted correctly.

Connectivity: analysis of a protein's connections to others in a PPI network, which is crucial to understanding its biological role.

Feature: a set of transmembrane protein-specific and general-purpose features in support of machine learning modelling.
Easy to use
After installation, you can import TMKit by putting the following code in a Python script or a Jupyter notebook.
import tmkit as tmk
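Before running the import above, it can help to confirm that the package is visible to the current interpreter. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of TMKit.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("tmkit"):
    import tmkit as tmk  # same import as shown above
else:
    print("tmkit not found; install it first (see the install page).")
```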
Modules
You can access the 14 modules covering 9 function classes.
See also
install
No. | Module name | Function class | Description |
---|---|---|---|
1 | | Quality control | fetch example data |
2 | | Quality control | generate and extract metrics of sequences and structures |
3 | | Sequence | parse sequences and structures |
4 | | Sequence | produce commands for generating multiple sequence alignment |
5 | | Feature | protein biological features |
6 | | Mapping | seek difference between RCSB and PDBTM structures |
7 | | Topology | transmembrane protein topologies |
8 | | Feature | performance evaluation of residue contact prediction |
9 | | Connectivity | protein connectivity |
10 | | Annotation | transmembrane protein's mutation data processing |
11 | | Visualisation | visualise protein structures |
12 | | Annotation | access protein domains and families |
13 | | Mapping | conversion between protein identifiers |
14 | | Edge extraction | rewiring of connections between residues |
Developer
Jianfeng Sun, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), Headington, Oxford OX3 7LD, University of Oxford.
Citation
Jianfeng Sun, Arulsamy Kulandaisamy, Jinlong Ru, M Michael Gromiha, Adam P Cribbs, TMKit: a Python interface for computational analysis of transmembrane proteins, Briefings in Bioinformatics, Volume 24, Issue 5, September 2023, bbad288, https://doi.org/10.1093/bib/bbad288