Conversion between PDB and UniProt

Conversion between PDB and UniProt#

TMKit provides a function to convert between a PDB ID to an UniProt accession code.

Example usage#

First, we can convert from a PDB ID to an UniProt accession code. The PDB ID that will be recognized by TMKit should be a protein name concatenated with a chain name by _, e.g., 1xqf.A. In our example dataset, there is a file that can be found in ./data/map/pdb_chain_uniprot.csv, which needs to be specified during the conversion.

import tmkit as tmk

res = tmk.mapping.pdb2uniprot(
    id='1qxf.A',
    ref_fpn='data/map/pdb_chain_uniprot.csv',
)
print(res)

It outputs O28935. Then, we can convert from an UniProt accession code to a PDB ID.

import tmkit as tmk

res = tmk.mapping.uniprot2pdb(
    id='O28935',
    ref_fpn='data/map/pdb_chain_uniprot.csv',
)
print(res)

It outputs 1qxf.A.

If there is a list of protein IDs to be converted, we can do it like below.

import tmkit as tmk
import pandas as pd

prot_series = pd.Series(['6e3y', '6rfq', '6t0b'])
for prot in prot_series.index:
    res = tmk.mapping.pdb2uniprot(
        id=prot_series.iloc[prot],
        ref_fpn='data/map/pdb_chain_uniprot.csv',
    )
    print(res)

It outputs P63092, Q9B6E8, and P07256.

Attributes#

Attribute

Description

id

a PDB ID (e.g., 1qxf.A) or a UniProt accession code (e.g., O28935)

ref_fpn

reference file for conversion between PDB IDs and UniProt accession codes

See also

Please see here for better understanding the file-naming system.

Output#

Please check the output in each vignette above.