Data - deeptminter

Input

External tools

Protein sequences are the only input needed to run DeepTMInter. Before DeepTMInter is run, they are also used to generate a set of intermediate files with the external tools listed in Table 1.

Table 1: External tools used to generate intermediate files before running DeepTMInter.

Tool	Role	Function	Source
HHblits	Input	generating multiple sequence alignments	Remmert et al. (2011)
Gaussian DCA	Input	predictor of residue contacts	Baldassi et al. (2014)
Freecontact	Input	predictor of residue contacts	Kaján et al. (2014)
Phobius	Input	predictor of transmembrane topologies	Käll et al. (2004)
Uniclust30 database	Intermediate	sequence database	Mirdita et al. (2016)

Output files

DeepTMInter returns an output file with one of the following suffixes: .m1, .m2, .m3, .m4, .m5, or .deeptminter.
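As a minimal sketch, output files can be collected from a results directory by matching the suffixes listed above. The function name `find_output_files` and the directory layout are assumptions made for illustration; they are not part of the DeepTMInter API.

```python
from pathlib import Path

# Output suffixes named in the text above.
OUTPUT_SUFFIXES = {".m1", ".m2", ".m3", ".m4", ".m5", ".deeptminter"}


def find_output_files(directory):
    """Return the sorted files in `directory` that carry a DeepTMInter output suffix.

    Hypothetical helper: it only inspects file names, not file contents.
    """
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix in OUTPUT_SUFFIXES
    )
```

Calling `find_output_files("../data")` would then list every prediction file in that folder, assuming the results were written there.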

Predictions of interaction sites in transmembrane proteins are written to the output file in three columns:

positions of amino acids in the input sequence
amino acids
probabilities of being interaction sites
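The three-column layout described above can be read with a few lines of Python. This is a sketch under stated assumptions: the columns are taken to be whitespace-separated, and lines that do not fit the pattern (for example a header) are skipped. The names `SitePrediction` and `parse_predictions` are hypothetical and not part of DeepTMInter.

```python
from typing import List, NamedTuple


class SitePrediction(NamedTuple):
    position: int       # position of the amino acid in the input sequence
    amino_acid: str     # amino acid at that position
    probability: float  # probability of being an interaction site


def parse_predictions(text: str) -> List[SitePrediction]:
    """Parse three-column prediction text.

    Assumption: columns are separated by whitespace (spaces or tabs).
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or non-conforming lines
        try:
            record = SitePrediction(int(fields[0]), fields[1], float(fields[2]))
        except ValueError:
            continue  # skip lines whose columns are not numeric, such as a header
        records.append(record)
    return records


# Illustrative values only, not real DeepTMInter output.
sample = "1 M 0.12\n2 A 0.87\n"
predictions = parse_predictions(sample)
```

To read a real file, pass its contents to the function, for example `parse_predictions(open(path).read())`, after checking that the delimiter assumption holds for that file.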

To obtain predictions that are not restricted to the preferred regions predicted by Phobius, set -r to combined when running the program. This returns predictions for the whole FASTA sequence, which you can then tailor to your needs.
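One way to tailor whole-sequence predictions is to keep only the residues that fall inside regions of interest. The sketch below assumes the predictions have already been loaded as (position, amino acid, probability) tuples and that regions are given as inclusive (start, end) position pairs. The function name `tailor_predictions` and the example values are hypothetical.

```python
def tailor_predictions(rows, regions):
    """Keep rows whose position lies in at least one inclusive (start, end) region.

    `rows` is an iterable of (position, amino_acid, probability) tuples.
    """
    return [
        row
        for row in rows
        if any(start <= row[0] <= end for start, end in regions)
    ]


# Illustrative values only, not real DeepTMInter output.
whole_sequence = [(1, "M", 0.10), (2, "A", 0.80), (3, "L", 0.65), (4, "K", 0.05)]
selected = tailor_predictions(whole_sequence, regions=[(2, 3)])
```

The same pattern extends to other criteria, such as keeping only residues above a chosen probability threshold.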

Example data

Users can download some example data and check an assortment of input files.

Code

import deeptminter

# Download the example data archive and save it to the given local path.
deeptminter.predict.download_data(
    url='https://github.com/2003100127/deeptminter/releases/download/example_data/example_data.zip',
    sv_fpn='../data/example_data.zip',
)

Output

 ____                _____ __  __ ___       _            
|  _ \  ___  ___ _ _|_   _|  \/  |_ _|_ __ | |_ ___ _ __ 
| | | |/ _ \/ _ \ '_ \| | | |\/| || || '_ \| __/ _ \ '__|
| |_| |  __/  __/ |_) | | | |  | || || | | | ||  __/ |   
|____/ \___|\___| .__/|_| |_|  |_|___|_| |_|\__\___|_|   
                |_|                                      

05/04/2025 15:27:20 logger: =>Downloading starts...
05/04/2025 15:27:21 logger: =>downloaded.

References

Remmert, M., Biegert, A., Hauser, A., & Söding, J. (2011). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods, 9(2), 173–175. https://doi.org/10.1038/nmeth.1818
Baldassi, C., Zamparo, M., Feinauer, C., Procaccini, A., Zecchina, R., Weigt, M., & Pagnani, A. (2014). Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners. PLoS ONE, 9(3), e92721. https://doi.org/10.1371/journal.pone.0092721
Kaján, L., Hopf, T. A., Kalaš, M., Marks, D. S., & Rost, B. (2014). FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics, 15(1). https://doi.org/10.1186/1471-2105-15-85
Käll, L., Krogh, A., & Sonnhammer, E. L. L. (2004). A Combined Transmembrane Topology and Signal Peptide Prediction Method. Journal of Molecular Biology, 338(5), 1027–1036. https://doi.org/10.1016/j.jmb.2004.03.016
Mirdita, M., von den Driesch, L., Galiez, C., Martin, M. J., Söding, J., & Steinegger, M. (2016). Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Research, 45(D1), D170–D176. https://doi.org/10.1093/nar/gkw1081
Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A., Zecchina, R., & Sander, C. (2011). Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE, 6(12), e28766. https://doi.org/10.1371/journal.pone.0028766
Sun, J., & Frishman, D. (2020). Data for DeepTMInter: Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning. Mendeley Data. https://doi.org/10.17632/2t8kgwzp35.2