Development of a Comprehensive Genetic Tool for Identification of Cannabis sativa Samples for Forensic and Intelligence Purposes






Cannabis sativa L. (marijuana) is the most commonly used illicit drug in the United States. Due to partial legalization, law enforcement faces a unique challenge in tracking and preventing the flow of legal marijuana to states where it is still prohibited. Moreover, significant illegal C. sativa traffic from Mexico exists at the US border. To date, no DNA method for Cannabis using short tandem repeat (STR) markers that follows International Society for Forensic Genetics (ISFG) or Scientific Working Group on DNA Analysis Methods (SWGDAM) recommendations (e.g., use of a sequenced allelic ladder, use of tetranucleotide STR markers) has been reported. In addition, there is no existing Cannabis STR reference population database that can be used for forensic purposes (e.g., a population in Hardy-Weinberg and linkage equilibrium, with parameters of forensic interest). There have been very few chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA) studies investigating C. sativa haplotypes in the Americas. Lastly, massively parallel sequencing (MPS) technology has not yet been applied to targeted sequencing of C. sativa for forensic purposes. This project explores the use of genetic tools to identify and determine the origin of C. sativa for forensic purposes. The results provide the forensic DNA community with a comprehensive genetic tool (STR, cpDNA, mtDNA, and MPS) that allows for the individualization of Cannabis samples, the association of different cases, and the origin determination of samples for forensic and intelligence purposes. First, a previously reported 15-locus STR multiplex was evaluated. The evaluation indicated that this STR system is not suitable for forensic identification because of several issues: high heterozygote peak imbalance in some markers, overlapping alleles between two closely located STR markers, high stutter peaks in dinucleotide markers, inter-locus peak imbalance, and the presence of null alleles in four of the markers.
Therefore, a novel 13-locus STR multiplex was developed and optimized for C. sativa identification on the 3500 Genetic Analyzer, according to ISFG and SWGDAM recommendations, using primer and multiplex STR design software and a gradient PCR approach to determine the optimal annealing temperature. This STR multiplex was validated according to SWGDAM guidelines. Case-to-case comparisons were performed by phylogenetic analysis using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and parsimony analysis, with statistically significant differences detected using pairwise genetic-distance comparisons. Homogeneous subpopulations (low FST) were determined by phylogenetic analysis and confirmed by bootstrap analysis (95% confidence interval). The results revealed a homogeneous subpopulation that could be used as a Cannabis reference STR population database (N=101), with parameters of population genetics (observed heterozygosity, expected heterozygosity, Hardy-Weinberg equilibrium, and linkage disequilibrium) and of forensic interest (allele frequencies, power of discrimination, etc.). Another previously reported multilocus system was modified and optimized to genotype five chloroplast and two mitochondrial markers. For this purpose, two assays were designed: a homopolymeric STR pentaplex and a SNP triplex, with one chloroplast marker (Cscp001) shared by both assays for quality control. For successful mitochondrial and chloroplast typing, a novel real-time PCR quantitation method was developed and validated to accurately estimate the quantity of chloroplast DNA (cpDNA) using a synthetic DNA standard. Moreover, a sequenced allelic ladder was designed for accurate genotyping of the homopolymeric STR pentaplex. Finally, as a proof of concept, a custom MPS panel was designed to interrogate 12 Cannabis-specific STR loci by sequence.
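As an illustration of the single-locus parameters named above (allele frequencies, observed and expected heterozygosity, power of discrimination), the following is a minimal sketch using the standard textbook definitions. It is not the software or procedure used in this study; the function names and the four example genotypes are hypothetical, and expected heterozygosity is computed without a small-sample correction.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Relative frequency of each allele at one locus (diploid genotypes)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Fraction of samples carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2): heterozygosity expected under Hardy-Weinberg proportions."""
    freqs = allele_frequencies(genotypes)
    return 1.0 - sum(p * p for p in freqs.values())

def power_of_discrimination(genotypes):
    """1 - sum(G_j^2), where G_j is the frequency of each observed genotype."""
    # Sort each pair so (10, 12) and (12, 10) count as the same genotype.
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical genotypes (allele = repeat number) at a single STR locus.
example = [(10, 12), (10, 10), (12, 14), (12, 10)]
ho = observed_heterozygosity(example)
he = expected_heterozygosity(example)
pd = power_of_discrimination(example)
```

A real database would repeat these calculations per locus across all samples and add formal tests for Hardy-Weinberg and linkage equilibrium, which this sketch does not attempt.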
A simple workflow was designed to make the custom PCR multiplex compatible with the Ion Plus Fragment Library Kit, Ion Chef, and Ion S5 system. For data sorting and sequence analysis, a custom configuration file was designed for STRait Razor v3 to parse and extract STR sequence data. The study resulted in a preliminary investigation of sequence variation for 12 autosomal STR loci in 16 Cannabis samples. The results revealed intra-repeat variation in eight loci, where the nominal (size-based) allele was identical but differences were revealed by sequencing. In addition, full concordance was observed between the MPS and capillary electrophoresis (CE) data. Although the panel was not fully optimized and only a small number of samples were evaluated, this study demonstrated that more informative STR typing of Cannabis samples can be performed successfully on an MPS platform.
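The distinction between nominal (size-based) alleles and sequence-based alleles can be sketched as follows. This is an illustrative example only: it is not STRait Razor and does not reflect its configuration format, the helper names are hypothetical, the sequences are invented, and a fixed tetranucleotide motif length is assumed.

```python
from collections import defaultdict

def nominal_allele(repeat_region, motif_len=4):
    """Size-based allele call: number of full repeat units, plus leftover bases.

    A region of 42 bases with a 4-base motif is reported as "10.2",
    following the usual partial-repeat notation.
    """
    full, remainder = divmod(len(repeat_region), motif_len)
    return str(full) if remainder == 0 else f"{full}.{remainder}"

def group_by_nominal(repeat_regions, motif_len=4):
    """Map each nominal allele to the set of distinct sequences behind it."""
    groups = defaultdict(set)
    for seq in repeat_regions:
        groups[nominal_allele(seq, motif_len)].add(seq)
    return dict(groups)

# Invented repeat regions: the first two have the same length (same CE allele)
# but differ by one base inside the repeat block.
sequences = [
    "AGAT" * 10,
    "AGAT" * 4 + "AGAC" + "AGAT" * 5,
    "AGAT" * 11,
]
groups = group_by_nominal(sequences)

# Nominal alleles whose underlying sequences differ would be
# indistinguishable by fragment length alone.
hidden_variation = {a: seqs for a, seqs in groups.items() if len(seqs) > 1}
```

In this toy data set, length-based typing sees two alleles while sequencing distinguishes three, which is the kind of added resolution the MPS results describe.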



Cannabis sativa, Forensic DNA, Forensic plant science, Massively parallel sequencing, Short tandem repeats