Input
External tools
Protein sequences are the only input needed to run DeepTMInter. They are also used to generate a set of intermediate files before running DeepTMInter, using the external tools listed in Table 1.
Fasta
Protein sequences must be provided in FASTA format. The file extension must be .fasta so that the software can recognize the file.
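As a minimal sketch of checking an input file before running the tool, the helper below verifies the .fasta extension and reads the records into a dictionary. The function name `read_fasta` is hypothetical and is not part of DeepTMInter; it only illustrates the requirement stated above.

```python
from pathlib import Path


def read_fasta(path):
    """Read a FASTA file into a {header: sequence} dictionary.

    Hypothetical helper for illustration; not part of the DeepTMInter API.
    """
    path = Path(path)
    # The documentation requires the .fasta extension.
    if path.suffix != ".fasta":
        raise ValueError(f"expected a .fasta file, got {path.name!r}")

    records = {}
    header = None
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequences may be wrapped over several lines; join them.
            records[header] += line.upper()
    return records
```

Calling `read_fasta("my_protein.fasta")` (a placeholder file name) returns the sequences keyed by header, or raises `ValueError` if the extension or layout is not as expected.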
Table 1: External tools for generating intermediate files before running DeepTMInter.
Tool | Role | Function | Source |
---|---|---|---|
HHblits | Input | generating multiple sequence alignments | Remmert et al. (2011) |
Gaussian DCA | Input | predictor of residue contacts | Baldassi et al. (2014) |
Freecontact | Input | predictor of residue contacts | Kaján et al. (2014) |
Phobius | Input | predictor of transmembrane topologies | Käll et al. (2004) |
Uniclust30 database | Intermediate | sequence database | Mirdita et al. (2016) |
Output files
DeepTMInter returns an output file with one of the suffixes .m1, .m2, .m3, .m4, .m5, or .deeptminter.
Predictions of interaction sites in transmembrane proteins are shown in the output file, with three columns:
- positions of amino acids in the input sequence
- amino acids
- probabilities of being interaction sites
- probabilities of being interaction sites
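The three-column layout can be read with a few lines of Python. This is a sketch under assumptions: the columns are taken to be whitespace-separated and the file is taken to have no header row, neither of which is stated here. The function name `parse_predictions` and the example values are made up for illustration.

```python
def parse_predictions(text):
    """Parse three-column prediction text into (position, residue, probability) tuples.

    Assumes whitespace-separated columns; lines that do not fit are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        position, residue, probability = fields
        try:
            rows.append((int(position), residue, float(probability)))
        except ValueError:
            continue  # skip lines whose columns are not numeric where expected
    return rows


# Made-up example content, not real DeepTMInter output.
example = "1 M 0.12\n2 K 0.87\n3 T 0.45\n"
for position, residue, probability in parse_predictions(example):
    print(position, residue, probability)
```

For a real output file, pass the file's contents, for example `parse_predictions(open(path).read())`, and adjust the split if the file uses a different separator.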
If you want results that are not limited to the regions predicted by Phobius, set -r to combined when running the program. This returns predictions for the whole FASTA sequence, which you can then trim to whichever regions you need.
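Trimming whole-sequence predictions to regions of interest could look like the sketch below. It assumes the predictions have already been loaded as (position, residue, probability) tuples and that regions are given as inclusive (start, end) position pairs in the same numbering as the output file. The function name `keep_regions` and the sample values are hypothetical.

```python
def keep_regions(rows, regions):
    """Keep only rows whose position falls inside any inclusive (start, end) region."""
    return [
        row
        for row in rows
        if any(start <= row[0] <= end for start, end in regions)
    ]


# Made-up predictions for a four-residue sequence.
rows = [(1, "M", 0.1), (2, "K", 0.8), (3, "T", 0.4), (4, "L", 0.9)]

# Keep residues 2 to 3 only.
print(keep_regions(rows, [(2, 3)]))
```

Several regions can be passed at once, for example `[(10, 30), (55, 75)]`, to keep more than one segment.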
Example data
Users can download some example data and check an assortment of input files.
import deeptminter

# Download the example archive and save it to a local path.
deeptminter.predict.download_data(
    url='https://github.com/2003100127/deeptminter/releases/download/example_data/example_data.zip',
    sv_fpn='../data/example_data.zip',
)
05/04/2025 15:27:20 logger: =>Downloading starts...
05/04/2025 15:27:21 logger: =>downloaded.
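Once the archive has been downloaded, it can be unpacked with the Python standard library to inspect the example input files. This is a sketch only: the helper name `extract_archive` and the output directory are assumptions, and whether DeepTMInter offers its own extraction step is not covered here.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_fpn, out_dir):
    """Extract a zip archive into out_dir and return the sorted member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_fpn) as archive:
        archive.extractall(out_dir)
        return sorted(archive.namelist())
```

For example, `extract_archive('../data/example_data.zip', '../data/')` would unpack the file saved by the download call above and return the names of the files it contains.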
- Remmert, M., Biegert, A., Hauser, A., & Söding, J. (2011). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods, 9(2), 173–175. 10.1038/nmeth.1818
- Baldassi, C., Zamparo, M., Feinauer, C., Procaccini, A., Zecchina, R., Weigt, M., & Pagnani, A. (2014). Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners. PLoS ONE, 9(3), e92721. 10.1371/journal.pone.0092721
- Kaján, L., Hopf, T. A., Kalaš, M., Marks, D. S., & Rost, B. (2014). FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics, 15(1). 10.1186/1471-2105-15-85
- Käll, L., Krogh, A., & Sonnhammer, E. L. L. (2004). A Combined Transmembrane Topology and Signal Peptide Prediction Method. Journal of Molecular Biology, 338(5), 1027–1036. 10.1016/j.jmb.2004.03.016
- Mirdita, M., von den Driesch, L., Galiez, C., Martin, M. J., Söding, J., & Steinegger, M. (2016). Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Research, 45(D1), D170–D176. 10.1093/nar/gkw1081
- Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A., Zecchina, R., & Sander, C. (2011). Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE, 6(12), e28766. 10.1371/journal.pone.0028766
- Sun, J., & Frishman, D. (2020). Data for DeepTMInter: Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning. Mendeley Data. https://doi.org/10.17632/2t8kgwzp35.2