Usage
Once you are familiar with the concept of LocRRCs, you can use the following example to generate them and perform the feature assignment.
Example code
We can directly generate LocRRCs and assign co-evolutionary features to them. The final results are saved in the file ./data/output.txt, where each row holds the co-evolutionary features of the LocRRCs for one residue pair.
```python
import tmkit as tmk

tmk.edge.extract(
    method='unipartite',
    fasta_fpn='data/fasta/1xqfA.fasta',
    net_fpn='data/rrc/tool/1xqfA.evfold',
    window_size=5,
    seq_sep_inferior=0,
    seq_sep_superior=None,
    pair_mode='patch',
    assign_mode='hash',
    input_kind='freecontact',
    is_sv=True,  # save the result to sv_fpn
    sv_fpn='./data/output.txt',
)
```
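To make the role of `window_size` more concrete, here is a minimal sketch of how a square window around a residue pair could be enumerated. It is an illustration only and not tmkit's implementation: the helper `window_patch`, the 1-based indexing, and the choice to drop positions that fall outside the sequence are all assumptions made for this example.

```python
from itertools import product


def window_patch(i, j, window_size, seq_len):
    """Hypothetical helper: list the pairs (i + a, j + b) for all offsets
    a, b in [-window_size, window_size].

    Positions are 1-based. Offsets that fall outside 1..seq_len are dropped
    (an assumption; a real implementation might pad instead).
    """
    offsets = range(-window_size, window_size + 1)
    patch = []
    for a, b in product(offsets, repeat=2):
        p, q = i + a, j + b
        if 1 <= p <= seq_len and 1 <= q <= seq_len:
            patch.append((p, q))
    return patch


# A pair far from both ends of a 362-residue sequence, window size 5.
patch = window_patch(i=50, j=120, window_size=5, seq_len=362)
print(len(patch))
```

Under these assumptions, a pair far from the sequence ends is surrounded by (2 × 5 + 1)² neighbouring pairs including itself, and pairs close to either end get a smaller patch.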
| Name | Description |
|---|---|
| input_kind | Feature format defined by a co-evolution method, such as CCMPred, FreeContact, GDCA, or Plmc. |
| | Path to the covariance matrix (co-evolutionary features). The matrix should be generated by CCMPred, FreeContact, GDCA, or plm-DCA. Alternatively, a file with three columns (the first two for residue pair IDs and the third for co-evolutionary strengths) is also accepted. |
| method | Method used to generate cumuCCs. The example above passes 'unipartite'. |
| window_size | Window size. |
| | List of residues after applying a window. |
| | Molecular sequence. |
| seq_sep_inferior | Lower bound on the sequence separation between the two residues of a pair. |
| seq_sep_superior | Upper bound on the sequence separation between the two residues of a pair. |
| cumu_ratio | Selects the top-ranked int(len_seq*cumu_ratio) co-evolutionary strengths that a residue of interest is involved in. |
| is_sv | Whether to save the result, True or False. If False, sv_fpn can be set to None. |
| sv_fpn | Path to save the result. |
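The three-column alternative mentioned in the table (two residue pair IDs followed by a co-evolutionary strength) can be produced with a few lines of standard-library Python. The sketch below is an assumption-laden illustration: the helper names `write_three_column` and `read_three_column`, the tab delimiter, the absence of a header row, and the example values are all hypothetical, since the exact layout tmkit expects is not specified here.

```python
import csv
import os
import tempfile


def write_three_column(pairs, path):
    """Write (residue_i, residue_j, strength) rows, one pair per line.

    The tab delimiter and the lack of a header are assumptions.
    """
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        for res_i, res_j, strength in pairs:
            writer.writerow([res_i, res_j, strength])


def read_three_column(path):
    """Read the file back as a list of (int, int, float) tuples."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")
        return [(int(a), int(b), float(s)) for a, b, s in reader]


# Made-up example values, written to a temporary directory.
example_pairs = [(1, 5, 0.82), (1, 9, 0.10), (2, 7, 0.45)]
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "pairs.tsv")
    write_three_column(example_pairs, example_path)
    print(read_three_column(example_path))
```

Reading the file back right after writing it is a cheap way to confirm that the column order and types are what you intended before passing the path on.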
Please see here for a better understanding of the file-naming system.
Output
Finally, you will see the following output.
```
===>Molecular sequence: AVADKADNAFMMICTALVLFMTIPGIALFYGGLIRGKNVLSMLTQVTVTFALVCILWVVYGYSLAFGEGNNFFGNINWLMLKNIELTAVMGSIYQYIHVAFQGSFACITVGLIVGALAERIRFSAVLIFVVVWLTLSYIPIAHMVWGGGLLASHGALDFAGGTVVHINAAIAGLVGAYLPHNLPMVFTGTAILYIGWFGFNAGSAGTANEIAALAFVNTVVATAAAILGWIFGEWALRGKPSLLGACSGAIAGLVGVTPACGYIGVGGALIIGVVAGLAGLWGVTMPCDVFGVHGVCGIVGCIMTGIFAASSLGGVGFAEGVTMGHQLLVQLESIAITIVWSGVVAFIGYKLADLTVGLRVP
===>Molecule length: 362
===>Window size: 5
===>Sequence separation inferior: 0
===>Sequence separation superior: None
===>Mode: internal
===>Input kind: freecontact
===>pair number: 65341
=========>Window molecule generation: 0.5069675445556641s.
======>unipartite pair assignment: 3.2129993438720703s.
===>total time: 3.8409998416900635s.
===>saved!
```
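The reported pair number can be sanity-checked by hand: with a molecule length of 362, a lower separation bound of 0, and no upper bound, counting every unordered residue pair gives 362 × 361 / 2 = 65341. The sketch below reproduces that count. The function `count_pairs` is hypothetical, and treating the lower bound as exclusive and the upper bound as inclusive is an assumption; tmkit's actual convention may differ.

```python
def count_pairs(seq_len, seq_sep_inferior=0, seq_sep_superior=None):
    """Count residue pairs (i, j) with i < j whose separation j - i lies
    in (seq_sep_inferior, seq_sep_superior].

    Exclusive lower bound and inclusive upper bound are assumptions.
    """
    count = 0
    for i in range(1, seq_len + 1):
        for j in range(i + 1, seq_len + 1):
            sep = j - i
            if sep <= seq_sep_inferior:
                continue
            if seq_sep_superior is not None and sep > seq_sep_superior:
                continue
            count += 1
    return count


# Same settings as in the log: length 362, inferior 0, superior None.
print(count_pairs(362))  # 65341
```

Tightening either bound reduces the count, which is a quick way to estimate how many rows to expect in the saved file for other settings.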