Predicting haplogroups using a versatile machine learning program (PredYMaLe) on a new mutationally balanced 32 Y-STR multiplex (CombYplex): unlocking the full potential of the human STR mutation rate spectrum to estimate forensic parameters

2020 
Abstract We developed a new mutationally well-balanced 32 Y-STR multiplex (CombYplex) together with a machine learning (ML) program, PredYMaLe, to assess the impact of STR mutability on haplogroup prediction while respecting forensic community criteria (high DC/HD). We designed CombYplex around two sub-panels, M1 and M2, characterized by average- and high-mutation STRs, respectively. Using these two sub-panels, we tested how PredYMaLe reacts to mutability when considering basal branches and, moving down, terminal branches. We first tested the discrimination capacity of CombYplex on 996 human samples using various forensic and statistical parameters and showed that its resolution is sufficient to separate haplogroup classes. In parallel, PredYMaLe was designed and used to test whether an ML approach can predict haplogroup classes from Y-STR profiles. Applied to our kit, SVM and Random Forest classifiers perform very well (average 97%), better than Neural Network (average 91%) and Bayesian methods (
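As an illustration of the forensic parameters mentioned above, the following minimal Python sketch computes two of them from a list of Y-STR haplotypes. It assumes that DC denotes discrimination capacity (distinct haplotypes divided by sample size) and HD denotes haplotype diversity in Nei's unbiased form; the abstract does not state the exact formulas used, so both definitions are assumptions. The function name `forensic_parameters` and the three-marker toy profiles are hypothetical and are not taken from CombYplex data.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (DC, HD) for a list of hashable haplotypes.

    Assumed definitions (not confirmed by the source):
      DC = number of distinct haplotypes / number of samples
      HD = n / (n - 1) * (1 - sum of squared haplotype frequencies)
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")

    counts = Counter(haplotypes)

    # Discrimination capacity: share of the samples that are distinct profiles.
    dc = len(counts) / n

    # Haplotype diversity with the n / (n - 1) small-sample correction.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    hd = n / (n - 1) * (1.0 - sum_sq)

    return dc, hd


# Toy profiles: each tuple holds repeat counts at three hypothetical markers.
profiles = [
    (14, 13, 29),
    (14, 13, 29),
    (15, 12, 30),
    (13, 14, 28),
]

dc, hd = forensic_parameters(profiles)
print(f"DC = {dc:.3f}, HD = {hd:.3f}")
```

A panel in which every sample carries a unique haplotype gives DC = 1 and HD = 1 under these definitions, which is the sense in which a high DC/HD is a desirable forensic property.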