Usage

After understanding the concept of LocRRCs, you can use the following example to generate per-residue cumuCCs and perform the feature assignment.

Example code

We can directly generate LocRRCs and assign co-evolutionary features to them. The final results are saved to the file ./data/output.txt, in which each row holds the co-evolutionary features of the LocRRCs for one residue pair.

import tmkit as tmk

tmk.edge.extract(
    method='unipartite',
    fasta_fpn='data/fasta/1xqfA.fasta',
    net_fpn='data/rrc/tool/1xqfA.evfold',
    window_size=5,
    seq_sep_inferior=0,
    seq_sep_superior=None,
    pair_mode='patch',
    assign_mode='hash',
    input_kind='freecontact',
    is_sv=True,
    sv_fpn='./data/output.txt',
)

| Name | Description |
| --- | --- |
| input_kind | Feature format, which is determined by the co-evolution method used, such as CCMpred, FreeContact, GDCA, or Plmc. |
| fpn | Path to the covariance matrix (co-evolutionary features). The covariance matrix should be generated by CCMpred, FreeContact, GDCA, or plm-DCA. Alternatively, a file with three columns (the first two for the residue pair IDs and the third for the co-evolutionary strength) is also accepted. |
| assign_mode | Method used to generate cumuCCs. It can be hash, hash_rf, hash_ori, pandas, or numpy. |
| window_size | Window size. |
| window_m_ids | List of residues after applying a window. |
| sequence | Molecular sequence. |
| seq_sep_inferior | Lower bound on the sequence separation between the two residues of a pair. |
| seq_sep_superior | Upper bound on the sequence separation between the two residues of a pair. |
| cumu_ratio | Ratio that selects the top-ranked int(len_seq*cumu_ratio) co-evolutionary strengths that a residue of interest is involved in. |
| is_sv | Whether to save the result (True or False). If False, sv_fpn can be set to None. |
| sv_fpn | Path to save the result. |
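As a rough illustration of the three-column input and of cumu_ratio described above, the sketch below parses such a file and keeps, for one residue of interest, the top-ranked int(len_seq*cumu_ratio) strengths. It uses only the Python standard library and is not tmkit's implementation. The helper names (parse_three_column, top_ranked_for_residue), the whitespace-separated layout with integer residue IDs, and the toy values are all assumptions made for this example.

```python
import io


def parse_three_column(handle):
    """Read rows of 'id_1 id_2 strength' (assumed whitespace-separated layout)."""
    pairs = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        id1, id2, strength = line.split()[:3]
        pairs.append((int(id1), int(id2), float(strength)))
    return pairs


def top_ranked_for_residue(pairs, residue, len_seq, cumu_ratio):
    """Return the top int(len_seq * cumu_ratio) pairs involving `residue`, strongest first."""
    involved = [p for p in pairs if residue in (p[0], p[1])]
    involved.sort(key=lambda p: p[2], reverse=True)
    return involved[: int(len_seq * cumu_ratio)]


# Toy input (hypothetical values, not taken from a real prediction).
toy_file = io.StringIO(
    "1 2 0.10\n"
    "1 3 0.85\n"
    "1 4 0.40\n"
    "2 3 0.55\n"
    "2 4 0.05\n"
    "3 4 0.70\n"
)

pairs = parse_three_column(toy_file)
top = top_ranked_for_residue(pairs, residue=1, len_seq=4, cumu_ratio=0.5)
print(top)
```

With a sequence length of 4 and a cumu_ratio of 0.5, int(4 * 0.5) = 2, so only the two strongest pairs involving residue 1 are kept.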

Please see here for a better understanding of the file-naming system.

Output

Finally, you will see the following output.

===>Molecular sequence: AVADKADNAFMMICTALVLFMTIPGIALFYGGLIRGKNVLSMLTQVTVTFALVCILWVVYGYSLAFGEGNNFFGNINWLMLKNIELTAVMGSIYQYIHVAFQGSFACITVGLIVGALAERIRFSAVLIFVVVWLTLSYIPIAHMVWGGGLLASHGALDFAGGTVVHINAAIAGLVGAYLPHNLPMVFTGTAILYIGWFGFNAGSAGTANEIAALAFVNTVVATAAAILGWIFGEWALRGKPSLLGACSGAIAGLVGVTPACGYIGVGGALIIGVVAGLAGLWGVTMPCDVFGVHGVCGIVGCIMTGIFAASSLGGVGFAEGVTMGHQLLVQLESIAITIVWSGVVAFIGYKLADLTVGLRVP
===>Molecule length: 362
===>Window size: 5
===>Sequence separation inferior: 0
===>Sequence separation superior: None
===>Mode: internal
===>Input kind: freecontact
===>pair number: 65341
=========>Window molecule generation: 0.5069675445556641s.
======>unipartite pair assignment: 3.2129993438720703s.
===>total time: 3.8409998416900635s.
===>saving...
===>saved!
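The reported pair number is consistent with counting every unordered residue pair of the 362-residue sequence: 362 × 361 / 2 = 65341. The sketch below reproduces that count and shows how the two sequence-separation bounds could narrow it. It is not tmkit's code: count_pairs is a hypothetical helper, and treating both bounds as strict inequalities (seq_sep_inferior < j - i < seq_sep_superior) is an assumption. With seq_sep_inferior=0 and seq_sep_superior=None, that choice agrees with the count in the log.

```python
def count_pairs(len_seq, seq_sep_inferior=0, seq_sep_superior=None):
    """Count unordered residue pairs (i, j), i < j, within the separation bounds.

    Assumption: both bounds are exclusive; None means no upper bound.
    """
    count = 0
    for i in range(1, len_seq + 1):
        for j in range(i + 1, len_seq + 1):
            sep = j - i
            if sep <= seq_sep_inferior:
                continue  # residues too close along the sequence
            if seq_sep_superior is not None and sep >= seq_sep_superior:
                continue  # residues too far apart along the sequence
            count += 1
    return count


n_all = count_pairs(362)                        # same settings as the example
n_far = count_pairs(362, seq_sep_inferior=4)    # drop pairs fewer than 5 residues apart
print(n_all, n_far)
```

Raising seq_sep_inferior removes short-range pairs, so fewer rows would be expected in the saved result.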