Welcome to the homepage of mclUMI!

What is it?

mclUMI performs UMI collapsing (deduplication) to improve the accuracy of molecular quantification.

mclUMI is a toolkit built on the Markov clustering (MCL) algorithm, a network-based method, for correcting UMI errors and thereby counting unique UMIs precisely. Dynamic counting results are a distinguishing feature of the tool. mclUMI is implemented in Python.
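To make the idea of MCL-based UMI error correction concrete, here is a minimal, dependency-free sketch of textbook Markov clustering applied to a small UMI graph. It is not taken from the mclUMI code base: the function names (`hamming`, `normalize_columns`, `mcl_clusters`), the parameter defaults, and the use of Hamming distance to build the graph are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl_clusters(umis, max_dist=1, inflation=2.0, iterations=30):
    """Cluster distinct UMIs with a plain Markov clustering loop (illustrative only)."""
    n = len(umis)
    # Adjacency matrix with self-loops: UMIs within max_dist mismatches are linked.
    m = [[1.0 if hamming(umis[i], umis[j]) <= max_dist else 0.0
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix to spread flow along longer walks.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalize to sharpen strong flows.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Rows that keep non-negligible mass belong to attractors; the columns
    # they cover form one cluster.
    clusters = set()
    for row in m:
        members = frozenset(umis[j] for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


umis = ["AAAA", "AAAT", "AATA", "CCCC", "CCCG"]
clusters = mcl_clusters(umis)
print(len(clusters), clusters)
```

Under these assumptions, the number of clusters returned serves as the deduplicated UMI count. The inflation value controls how aggressively the graph is split into clusters.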

Fig 1. Schematic of mclUMI for UMI deduplication

Features

The technical features of mclUMI are summarized below.

Technical features

mclUMI provides four modules for UMI deduplication:

dedup_basic, dedup_pos, dedup_gene, and dedup_sc

Each module offers seven deduplication algorithms:

mcl, mcl_ed, mcl_val, unique, cluster, adjacency, and directional

Each module takes an alignment in a BAM file as input and writes a UMI-deduplicated alignment to a new BAM file, along with two summary files.
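The names unique and cluster in the algorithm list match the method names used by UMI-tools. Assuming they carry the same meaning here (this page does not confirm it), the sketch below contrasts the two counts on a toy set of reads at one position. The helpers `count_unique` and `count_cluster` are hypothetical and are not part of the mclUMI API.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def count_unique(umis):
    """'unique' (assumed meaning): every distinct UMI sequence counts once."""
    return len(set(umis))


def count_cluster(umis, max_dist=1):
    """'cluster' (assumed meaning): count connected components of the UMI graph."""
    nodes = sorted(set(umis))
    seen = set()
    n_components = 0
    for start in nodes:
        if start in seen:
            continue
        # A new, unvisited UMI starts a new component; walk it depth-first.
        n_components += 1
        seen.add(start)
        stack = [start]
        while stack:
            cur = stack.pop()
            for other in nodes:
                if other not in seen and hamming(cur, other) <= max_dist:
                    seen.add(other)
                    stack.append(other)
    return n_components


reads = ["AAAA", "AAAA", "AAAT", "AATA", "CCCC", "CCCG"]
print(count_unique(reads), count_cluster(reads))
```

With these toy reads, the unique count treats every sequencing error as a separate molecule, while the cluster count merges UMIs that are linked by single mismatches.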

  • Algorithm category
    • Graph-based UMI collapsing
    • Euclidean distance-based UMI collapsing
  • Installation package
    • PyPI
    • Conda
    • Docker
    • GitHub
  • Sequencing level
    • Single genomic locus
    • Bulk RNA-seq
    • Single-cell (sc) RNA-seq

Programming

mclUMI provides two user-friendly interfaces: it can be run internally (inline in Python) or externally (from the command line).

  • language - Python
  • module - Object-oriented programming (OOP)
  • command - Python and Shell

In Python

import mclumi as mu

mu.onepos
mu.multipos
mu.gene
mu.sc
...

In Shell

$ mclumi [module | str] \
-m [method | str] \
-ed [edit distance | int]  \
-pfpn [yaml file | str] \
-bfpn [bam file | str] \
-wd [output path | str] \
-vb [if verbose | boolean]
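As an illustration of how the template above might be filled in, the following sketch assembles the argument list from Python without running anything. The flags are the ones listed above. The helper `build_mclumi_command`, the file paths, and the way the boolean is passed are assumptions, so check the tool's own help output before relying on them.

```python
import shlex


def build_mclumi_command(module, method, edit_distance,
                         params_yaml, bam_file, out_dir, verbose=True):
    """Assemble an mclumi argument list (hypothetical helper, not part of mclUMI)."""
    return [
        "mclumi", module,
        "-m", method,
        "-ed", str(edit_distance),
        "-pfpn", params_yaml,
        "-bfpn", bam_file,
        "-wd", out_dir,
        "-vb", str(verbose),  # assumption: the flag accepts a literal True/False
    ]


# Placeholder paths; replace them with real files before running the command.
cmd = build_mclumi_command(
    module="dedup_basic",
    method="mcl",
    edit_distance=1,
    params_yaml="./params.yml",
    bam_file="./example.bam",
    out_dir="./out/",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once mclUMI is installed and the placeholder paths point to real files.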