Graph cluster
We extracted a UMI graph from simulated data observed at a single locus.
Markov clustering partitions the graph into three clusters, each given as a list of UMI node IDs.
dedup_cluster = [
[69, 72, 838, 1221, 97, 210, 249, 315, 324, 374, 457, 658, 727, 760, 771, 933, 1073, 1126, 1198, 1260, 1271, 1307, 1498, 1505, 1541, 1563, 914, 946, 1083, 684, 1288, 1543, 822],
[1174, 119, 290, 303, 204, 218, 289, 302, 404, 545, 586, 633, 674, 709, 720, 802, 884, 943, 980, 1355, 1436, 1488, 1553, 786, 537, 867, 1649, 1255, 701, 1080, 347, 251],
[1315, 1549]
]
The connected components are stored as an adjacency mapping from each UMI node to its neighbours:
ccs = {69: [72, 838, 1221],
72: [69, 97, 210, 249, 315, 324, 374, 457, 658, 727, 760, 771, 838, 933, 1073, 1126, 1198, 1221, 1260,
1271, 1307, 1498, 1505, 1541, 1563], 838: [69, 72, 1221], 1221: [69, 72, 838, 914, 946],
97: [72, 658, 1505], 210: [72, 1083, 1271], 249: [72, 684, 727, 1073], 315: [72, 374, 946, 1126],
324: [72, 771], 374: [72, 315, 1126], 457: [72, 1288, 1307], 658: [72, 97, 1505], 727: [72, 249, 1073],
760: [72, 933, 1543], 771: [72, 324], 933: [72, 760], 1073: [72, 249, 727], 1126: [72, 315, 374],
1198: [72, 1260, 1541], 1260: [72, 1083, 1198, 1541, 1543], 1271: [72, 210], 1307: [72, 457],
1498: [72, 684, 1563], 1505: [72, 97, 658, 822], 1541: [72, 1198, 1260], 1563: [72, 914, 1498],
914: [1221, 1563], 946: [315, 1221], 1083: [210, 1260], 684: [249, 1498], 1288: [457],
1543: [760, 1174, 1260], 822: [1505], 1174: [119, 290, 303, 1315, 1543],
119: [204, 218, 289, 290, 302, 303, 404, 545, 586, 633, 674, 709, 720, 802, 884, 943, 980, 1174, 1355,
1436, 1488, 1553], 290: [119, 303, 786, 1174], 303: [119, 290, 1174], 1315: [1174, 1549],
204: [119, 537, 720, 1488], 218: [119, 302, 674, 867, 1649], 289: [119, 980],
302: [119, 218, 674, 1255], 404: [119, 802, 943], 545: [119, 701, 1436], 586: [119, 709, 884],
633: [119, 1080, 1553], 674: [119, 218, 302], 709: [119, 586, 884], 720: [119, 204, 347, 1488],
802: [119, 404, 537, 943], 884: [119, 586, 701, 709], 943: [119, 404, 802], 980: [119, 251, 289],
1355: [119, 251, 867], 1436: [119, 545], 1488: [119, 204, 720], 1553: [119, 633, 786], 786: [290, 1553],
1549: [1315], 537: [204, 802], 867: [218, 1355, 1649], 1649: [218, 867], 1255: [302], 701: [545, 884],
1080: [633], 347: [720], 251: [980, 1355]}
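A mapping of this shape is enough to recover connected components by graph traversal. In the data above, node 1543 lists 1174 as a neighbour even though the two sit in different Markov clusters, so cluster membership is not the same thing as reachability. The sketch below is a plain breadth-first search, not UMIche's own implementation. The function name `connected_components` and the toy node IDs are hypothetical and chosen only for illustration.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph.

    `adj` maps each node to a list of its neighbours, in the same
    shape as `ccs` above. Each component is returned as a sorted list.
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nbr in adj.get(node, []):
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(sorted(component))
    return components


# Toy graph with made-up node IDs: two separate components.
toy_adj = {1: [2, 3], 2: [1], 3: [1], 4: [5], 5: [4]}
toy_components = connected_components(toy_adj)
print(toy_components)
```

Calling `connected_components(ccs)` on the full mapping would group the UMI nodes by reachability alone, which can then be compared with the clusters in `dedup_cluster`.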
The edge list of the UMI graph is
edge_list=[(97, 72), (658, 97), (1355, 119), (204, 119), (1649, 218), (1541, 72), (933, 72), (914, 1563),
(1505, 72), (1221, 69), (404, 119), (1649, 867), (1307, 72), (302, 119), (802, 119), (1543, 760),
(1126, 72), (709, 586), (786, 1553), (374, 315), (946, 315), (1073, 72), (720, 204), (933, 760),
(1174, 1543), (210, 72), (771, 72), (537, 204), (727, 249), (1563, 72), (684, 1498), (822, 1505),
(1549, 1315), (1505, 97), (545, 119), (374, 72), (1553, 119), (1436, 119), (867, 218), (457, 72),
(1563, 1498), (709, 119), (720, 119), (1260, 1198), (1543, 1260), (218, 119), (251, 1355),
(701, 545), (1541, 1260), (1488, 204), (1553, 633), (1083, 210), (980, 289), (119, 1174),
(727, 72), (290, 119), (1073, 727), (1541, 1198), (884, 586), (802, 404), (303, 119), (347, 720),
(674, 218), (1221, 838), (1221, 72), (914, 1221), (943, 119), (586, 119), (943, 802), (838, 72),
(1271, 210), (980, 119), (674, 302), (786, 290), (251, 980), (1488, 720), (1488, 119),
(946, 1221), (1073, 249), (884, 709), (1288, 457), (1126, 315), (943, 404), (537, 802),
(302, 218), (249, 72), (315, 72), (303, 290), (290, 1174), (884, 119), (633, 119), (771, 324),
(658, 72), (1505, 658), (1307, 457), (303, 1174), (72, 69), (289, 119), (838, 69), (684, 249),
(1498, 72), (1315, 1174), (1126, 374), (701, 884), (1255, 302), (674, 119), (1080, 633),
(1083, 1260), (1436, 545), (1260, 72), (1271, 72), (867, 1355), (760, 72), (1198, 72), (324, 72)]
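The edge list and the adjacency mapping `ccs` describe the same undirected graph. As a minimal sketch, an edge list can be converted to an adjacency mapping as follows. The helper name `edges_to_adjacency` is hypothetical and not part of UMIche. The three sample edges are copied from `edge_list` above.

```python
from collections import defaultdict


def edges_to_adjacency(edges):
    """Build an undirected adjacency mapping {node: sorted neighbours}."""
    adj = defaultdict(set)
    for u, v in edges:
        # Each edge is undirected, so record it in both directions.
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


# Three edges taken from edge_list above.
sample_edges = [(1549, 1315), (1315, 1174), (822, 1505)]
sample_adj = edges_to_adjacency(sample_edges)
print(sample_adj[1315])  # [1174, 1549], matching the entry for 1315 in ccs
```

Sets are used while collecting neighbours so that an edge listed twice does not produce a duplicate entry.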
The counts of the unique UMIs are stored in the file umi_uniq_val_cnt-simu.txt.
UMI IDs are listed in the left column and their counts in the right column.
27 122
119 23.8
15 118
13 115
16 112
126 109
32 106
72 20.8
80 104
7 104
14 103
47 103
10 101
58 100
...
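For inspecting the counts outside of UMIche, a file like this can be read into a dictionary. The sketch below assumes what the excerpt suggests: two whitespace-separated columns and no header row. The function name `parse_umi_counts` is hypothetical. Counts are parsed as floats because some values in the excerpt are not integers.

```python
def parse_umi_counts(text):
    """Parse 'UMI_ID count' lines into a {int: float} dictionary.

    Lines that do not have exactly two fields (for example an
    elision marker such as '...') are skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        umi_id, count = fields
        counts[int(umi_id)] = float(count)
    return counts


# First rows of the excerpt above, followed by the elision marker.
sample_text = "27 122\n119 23.8\n15 118\n..."
sample_counts = parse_umi_counts(sample_text)
print(sample_counts[119])  # 23.8
```

To read the real file, pass its contents to the function, for example `parse_umi_counts(open(path).read())` with `path` pointing at umi_uniq_val_cnt-simu.txt.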
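import umiche as uc
# `to` resolves a path relative to the package data; it is assumed here
# to be the helper exposed by umiche.path.
from umiche.path import to

# Draw the UMI graph with nodes grouped by the clusters defined above.
uc.plot.graph_cluster(
    umi_uniq_val_cnt_fpn=to('data/simu/mclumi/seq_errs/umi_uniq_val_cnt-simu.txt'),
    dedup_cluster=dedup_cluster,
    ccs=ccs,
    edge_list=edge_list,
).draw()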
