A Structurally Validated Sequence Alignment of All 497 Typical Human Protein Kinase Domains

2019 
Abstract: Protein kinases are important in a large number of signaling pathways. Their dysregulation is involved in a number of human diseases, especially cancer. Studies on the structures of individual kinases have been used to understand the functions and phenotypes of mutations in other kinases that do not yet have experimental structures. The key factor in accurate inference by homology is an accurate sequence alignment. We present a parsimonious structure-based sequence alignment of 497 human protein kinase domains excluding atypical kinases, even those with related but somewhat different folds. Starting with a computed multiple sequence alignment, the alignment was manually refined in Jalview based on pairwise structural superposition onto a single kinase (Aurora A), followed by sequence alignment of the remaining kinases to their closest relatives with known structures. The alignment is arranged in 17 blocks of conserved regions and unaligned blocks in between that contain insertions of varying lengths present in only a subset of kinases. The aligned blocks contain well-conserved elements of secondary structure and well-known functional motifs, such as the DFG and HRD motifs. We validated the multiple sequence alignment by a pairwise, all-against-all alignment of 272 human kinases with known crystal structures. Our alignment has true-positive rate (TPR) and positive predictive value (PPV) accuracies of 97%. The remaining inaccuracy in our alignment comes from a few structures with shifted elements of secondary structure, and from the boundaries of aligned and unaligned regions, where compromises need to be made to encompass the majority of kinases. A new phylogeny of the protein kinase domains in the human genome based on our alignment indicates that 14 kinases previously labeled as “OTHER” can be confidently placed into the CAMK group. These kinases comprise the Aurora kinases, Polo kinases, ULK kinases, Calcium/calmodulin-dependent kinase kinases, and STK36.
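To make the TPR and PPV figures concrete, the following is a minimal sketch of how such scores could be computed for one pair of kinases. It assumes the common residue-pair definition: TPR is the fraction of residue pairs in the structure-based reference alignment that the sequence alignment reproduces, and PPV is the fraction of residue pairs in the sequence alignment that also appear in the reference. The abstract does not spell out the exact definition used, and the function names `aligned_pairs` and `tpr_ppv` and the toy sequences are hypothetical, chosen only for illustration.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue-index pairs that share a column.

    row_a and row_b are two gapped rows of equal length taken from an
    alignment; i and j count residues (not columns) in each sequence.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        a_is_residue = a != gap
        b_is_residue = b != gap
        if a_is_residue and b_is_residue:
            pairs.add((i, j))
        # Advance each residue counter only when that row holds a residue.
        i += a_is_residue
        j += b_is_residue
    return pairs


def tpr_ppv(test_pairs, reference_pairs):
    """Score a test alignment against a reference (assumed definition).

    TPR = correct pairs / pairs in the reference alignment
    PPV = correct pairs / pairs in the test alignment
    """
    correct = len(test_pairs & reference_pairs)
    tpr = correct / len(reference_pairs) if reference_pairs else float("nan")
    ppv = correct / len(test_pairs) if test_pairs else float("nan")
    return tpr, ppv


# Toy example (made-up sequences, not real kinase data).
msa_pairs = aligned_pairs("ACD-EFG", "AC-KEFG")   # rows from the sequence alignment
ref_pairs = aligned_pairs("ACDEFG", "ACKEFG")     # rows from a structural alignment
tpr, ppv = tpr_ppv(msa_pairs, ref_pairs)
print(f"TPR = {tpr:.3f}, PPV = {ppv:.3f}")
```

In this toy case the sequence alignment leaves one structurally equivalent pair unaligned, so it recovers five of the six reference pairs (lower TPR) while every pair it does align is correct (PPV of 1). An all-against-all validation would repeat this scoring over every pair of kinases with known structures and aggregate the counts.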