
Data

Input

External tools

A protein sequence is the only input needed to run DeepTMInter. It is also used to generate a set of intermediate files, with the external tools listed in Table 1, before DeepTMInter is run.

Table 1: External tools for generating intermediate files before running DeepTMInter.

| Tool | Role | Function | Source |
| --- | --- | --- | --- |
| HHblits | Input | generating multiple sequence alignments | Remmert et al. (2011) |
| Gaussian DCA | Input | predictor of residue contacts | Baldassi et al. (2014) |
| Freecontact | Input | predictor of residue contacts | Kaján et al. (2014) |
| Phobius | Input | predictor of transmembrane topologies | Käll et al. (2004) |
| Uniclust30 database | Intermediate | sequence database | Mirdita et al. (2016) |

Output files

DeepTMInter returns an output file with one of the following suffixes: .m1, .m2, .m3, .m4, .m5, or .deeptminter.

The output file lists the predicted interaction sites in transmembrane proteins in three columns:

  1. positions of amino acids in the input sequence
  2. amino acids
  3. probabilities of being interaction sites
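
The three-column layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the columns are whitespace-separated and that any row without a numeric position and probability (such as a header) can be skipped; neither detail is confirmed here. `parse_predictions` is a hypothetical helper, not part of the deeptminter package, and the demo rows are made up.

```python
import io


def parse_predictions(handle):
    """Parse 'position amino_acid probability' rows into a list of tuples.

    Assumption: columns are separated by whitespace (hypothetical layout).
    """
    records = []
    for line in handle:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            position = int(fields[0])
            probability = float(fields[2])
        except ValueError:
            continue  # skip a header row, if the file has one
        records.append((position, fields[1], probability))
    return records


# Made-up rows standing in for the contents of an output file.
demo = io.StringIO("1 M 0.12\n2 K 0.87\n3 L 0.55\n")
records = parse_predictions(demo)
for position, residue, probability in records:
    print(position, residue, probability)
```

For a real file, pass an open file handle instead of the `io.StringIO` object.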

To obtain results that are not restricted to the preferred regions predicted by Phobius, set -r to combined when running the program. This returns predictions for the whole FASTA sequence, which you can then trim to any region of interest.
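
Trimming whole-sequence predictions to chosen regions can be sketched as a simple position filter. The example assumes predictions are already loaded as (position, amino acid, probability) tuples with 1-based positions, and that regions are given as inclusive (start, end) pairs. `select_regions` and the sample values are hypothetical and only illustrate the idea.

```python
def select_regions(records, regions):
    """Keep records whose position lies in any inclusive (start, end) region.

    Assumption: positions are 1-based, matching the first output column.
    """
    return [
        record
        for record in records
        if any(start <= record[0] <= end for start, end in regions)
    ]


# Made-up whole-sequence predictions.
whole = [
    (1, "M", 0.12),
    (2, "K", 0.87),
    (3, "L", 0.55),
    (4, "A", 0.31),
    (5, "G", 0.90),
]

# Keep residues 2-3 and residue 5 only.
subset = select_regions(whole, regions=[(2, 3), (5, 5)])
print([record[0] for record in subset])
```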

Example data

Users can download some example data and check an assortment of input files.

import deeptminter

deeptminter.predict.download_data(
    url='https://github.com/2003100127/deeptminter/releases/download/example_data/example_data.zip',
    sv_fpn='../data/example_data.zip',
)
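
Once the archive has been downloaded, it can be unpacked with Python's standard `zipfile` module. This is a minimal sketch that assumes the archive was saved to the path used in the download call above; `extract_archive` is a hypothetical helper, and the layout of the extracted files is not described here.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_fpn, out_dir):
    """Extract a zip archive into out_dir and return the member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_fpn) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


# Path taken from the download call; nothing happens if the file is absent.
zip_fpn = Path("../data/example_data.zip")
if zip_fpn.exists():
    for name in extract_archive(zip_fpn, zip_fpn.parent):
        print(name)
```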
References
  1. Remmert, M., Biegert, A., Hauser, A., & Söding, J. (2011). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods, 9(2), 173–175. https://doi.org/10.1038/nmeth.1818
  2. Baldassi, C., Zamparo, M., Feinauer, C., Procaccini, A., Zecchina, R., Weigt, M., & Pagnani, A. (2014). Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners. PLoS ONE, 9(3), e92721. https://doi.org/10.1371/journal.pone.0092721
  3. Kaján, L., Hopf, T. A., Kalaš, M., Marks, D. S., & Rost, B. (2014). FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics, 15(1). https://doi.org/10.1186/1471-2105-15-85
  4. Käll, L., Krogh, A., & Sonnhammer, E. L. L. (2004). A Combined Transmembrane Topology and Signal Peptide Prediction Method. Journal of Molecular Biology, 338(5), 1027–1036. https://doi.org/10.1016/j.jmb.2004.03.016
  5. Mirdita, M., von den Driesch, L., Galiez, C., Martin, M. J., Söding, J., & Steinegger, M. (2016). Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Research, 45(D1), D170–D176. https://doi.org/10.1093/nar/gkw1081
  6. Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A., Zecchina, R., & Sander, C. (2011). Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE, 6(12), e28766. https://doi.org/10.1371/journal.pone.0028766
  7. Sun, J., & Frishman, D. (2020). Data for DeepTMInter: Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning. Mendeley Data. https://doi.org/10.17632/2t8kgwzp35.2