MapReduce distributed parallel computing framework for diagnosis and treatment of knee joint Kashin-Beck disease

2021 
This study aimed to improve the accuracy and computational efficiency of the MapReduce distributed parallel computing framework for mining diagnosis and treatment data on Kashin-Beck disease (KBD) of the knee joint. To address the shortcomings of the traditional K-means clustering algorithm (KCA), a simplified distance calculation was proposed in which the Manhattan distance replaced the Euclidean distance. Further improvement strategies were proposed, and the MapReduce-based KCA (MR-KCA) and the improved MR-KCA (IMR-KCA) were implemented and compared. On the same data, the sum of squared errors of both MR-KCA and IMR-KCA decreased as the number of center points increased. IMR-KCA produced higher-quality results than MR-KCA, and the difference was most evident at a data capacity of 8 GB. The total execution time of both algorithms increased with the number of center points. Compared with MR-KCA, the total execution time of IMR-KCA was significantly shorter, especially at a data capacity of 8 GB; with 5000 center points, IMR-KCA reduced the total execution time by 50%. The experiments showed that IMR-KCA better presented the diagnosis and treatment data of patients with knee joint KBD. The scalability rates of MR-KCA and IMR-KCA decreased as the number of nodes increased, but both remained above 0.80, indicating good scalability, and the scalability of IMR-KCA was significantly higher than that of MR-KCA. The IMR-KCA proposed in this study achieved high accuracy and computing efficiency and could be used for the visualization of KBD diagnosis and treatment data.
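The core idea in the abstract, K-means expressed as map and reduce steps with the Manhattan distance in place of the Euclidean distance, can be illustrated with a small single-process sketch. This is not the authors' IMR-KCA: the abstract does not describe the further improvement strategies, so none are shown here. The function names (`manhattan`, `map_phase`, `reduce_phase`, `kmeans`, `sse`), the mean-based center update, and the way the sum of squared errors is computed are all assumptions made for illustration, and plain Python generators stand in for a real MapReduce cluster.

```python
from collections import defaultdict


def manhattan(p, q):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def map_phase(points, centers):
    """Map step: emit (index of nearest center, point) for every point."""
    for p in points:
        idx = min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
        yield idx, p


def reduce_phase(pairs, centers):
    """Reduce step: group points by center index and average each group.

    Assumption: centers are updated with the coordinate-wise mean.
    A center that receives no points keeps its previous position.
    """
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)
    for idx, members in groups.items():
        dim = len(members[0])
        new_centers[idx] = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
    return new_centers


def kmeans(points, centers, max_iter=20):
    """Repeat map and reduce until the centers stop changing."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centers), centers)
        if updated == centers:
            break
        centers = updated
    return centers


def sse(points, centers):
    """Sum of squared errors (assumed here to be the squared Euclidean
    error of each point to the center nearest under Manhattan distance)."""
    total = 0.0
    for p in points:
        nearest = min(centers, key=lambda c: manhattan(p, c))
        total += sum((a - b) ** 2 for a, b in zip(p, nearest))
    return total


if __name__ == "__main__":
    # Hypothetical toy data; not taken from the study.
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    final_centers = kmeans(data, [(0, 0), (10, 10)])
    print(final_centers, sse(data, final_centers))
```

The Manhattan distance avoids the squaring and square root of the Euclidean distance, which is one plausible reason it simplifies the per-point work in the map step. On a real cluster the map and reduce functions would run on separate nodes over partitions of the data rather than in one process.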