Read conversion
UMI-tools
UMI-tools provides a bundle method for filtering high-quality reads. Users can access this method through the mu.prep.run function.
The following vignette shows how this function filters reads.
# Parameter().bundle_umi_tools
{
'stats': 'deduplicated',
'get_umi_method': 'read_id',
'umi_sep': '_',
'umi_tag': 'RX',
'umi_tag_split': None,
'umi_tag_delim': None,
'cell_tag': None,
'cell_tag_split': '-',
'cell_tag_delim': None,
'filter_umi': None,
'umi_whitelist': None,
'umi_whitelist_paired': None,
'method': 'directional',
'threshold': 1,
'spliced': False,
'soft_clip_threshold': 4,
'read_length': False,
'per_gene': False,
'gene_tag': None,
'assigned_tag': None,
'skip_regex': '^(__|Unassigned)',
'per_contig': False,
'gene_transcript_map': None,
'per_cell': False,
'whole_contig': False,
'detection_method': None,
'mapping_quality': 0,
'output_unmapped': False,
'unmapped_reads': 'discard',
'chimeric_pairs': 'use',
'unpaired_reads': 'use',
'ignore_umi': False,
'ignore_tlen': False,
'chrom': None,
'subset': None,
'in_sam': False,
'paired': False,
'out_sam': False,
'no_sort_output': False,
'stdin': "<_io.TextIOWrapper name='example.bam' mode='r' encoding='UTF-8'>",
'stdlog': "<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>",
'log2stderr': False,
'compresslevel': 6,
'timeit_file': None,
'timeit_name': 'all',
'timeit_header': None,
'loglevel': 1,
'short_help': None,
'random_seed': None
}
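To make a few of these settings concrete, here is a minimal sketch of what 'get_umi_method': 'read_id' with 'umi_sep': '_' and 'method': 'directional' with 'threshold': 1 imply. It is not the UMI-tools implementation. It assumes the UMI is the last separator-delimited field of the read name, and it uses the directional rule as described in the UMI-tools documentation: UMI a absorbs UMI b when their Hamming distance is at most the threshold and count(a) >= 2 * count(b) - 1. The helper names and the toy read names are hypothetical.

```python
from collections import Counter


def umi_from_read_id(read_id: str, sep: str = "_") -> str:
    """Return the last sep-delimited field of a read name as the UMI."""
    return read_id.split(sep)[-1]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def directional_edge(a: str, b: str, counts: Counter, threshold: int = 1) -> bool:
    """True if UMI a may absorb UMI b under the directional rule."""
    return (
        a != b
        and hamming(a, b) <= threshold
        and counts[a] >= 2 * counts[b] - 1
    )


# Toy read names (hypothetical), all assumed to map to the same position.
read_ids = ["r1_ACGT", "r2_ACGT", "r3_ACGT", "r4_ACGA", "r5_TTTT"]
counts = Counter(umi_from_read_id(r) for r in read_ids)

# ACGT (3 reads) is one mismatch away from ACGA (1 read), so it may absorb it;
# TTTT is three mismatches away and stays separate.
absorbs_acga = directional_edge("ACGT", "ACGA", counts)
absorbs_tttt = directional_edge("ACGT", "TTTT", counts)
```

The rule is asymmetric: with these counts, ACGA cannot absorb ACGT, because 1 is less than 2 * 3 - 1.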
As an example, the example.bam file can be downloaded here. After conversion, the output BAM file is named example_bundle.bam; it contains 1,175,027 reads with 20,683 raw unique UMI sequences observed at 12,047 genomic positions.
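The three figures above (reads, raw unique UMIs, genomic positions) can be illustrated with a toy tally. This is a sketch under stated assumptions, not how mclUMI or UMI-tools compute them: the records are hypothetical, the UMI is taken from the read name after the last '_', and unique UMIs are counted per position and then summed. Real data would come from a BAM reader instead of a hard-coded list.

```python
from collections import defaultdict

# Hypothetical (read name, contig, start) records standing in for BAM alignments.
records = [
    ("r1_ACGT", "chr1", 100),
    ("r2_ACGT", "chr1", 100),
    ("r3_ACGA", "chr1", 100),
    ("r4_ACGT", "chr2", 250),
]


def summarise(records, sep="_"):
    """Return (number of reads, unique UMIs summed over positions, number of positions)."""
    umis_by_position = defaultdict(set)
    for name, contig, start in records:
        # Assumption: the UMI is the last sep-delimited field of the read name.
        umis_by_position[(contig, start)].add(name.split(sep)[-1])
    n_reads = len(records)
    n_positions = len(umis_by_position)
    n_unique_umis = sum(len(umis) for umis in umis_by_position.values())
    return n_reads, n_unique_umis, n_positions


n_reads, n_unique_umis, n_positions = summarise(records)
```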
Info
The bundle function from UMI-tools is currently the only method in mclUMI for filtering reads. More methods will be added as development of mclUMI continues.