In molecular biology, kataegis describes a pattern of localized hypermutation identified in some cancer genomes, in which a large number of highly patterned base-pair mutations occur in a small region of DNA. The mutational clusters are usually several hundred base pairs long and alternate between long runs of C→T substitutions and long runs of G→A substitutions. This suggests that kataegis acts on only one of the two template strands of DNA during replication. Compared with other cancer-related mutational events, such as chromothripsis, kataegis is more commonly seen; it is not an accumulative process but likely happens during a single cycle of replication.

The term kataegis (καταιγίς) is derived from the ancient Greek word for 'thunderstorm'. It was first used by scientists at the Wellcome Trust Sanger Institute to describe their observations in breast cancer cells. While mapping mutation clusters across the genome, they used a visualization tool called a 'rainfall plot', as shown in the picture on the right, in which kataegis appears as a distinct clustering pattern.

Regions of kataegis have been shown to be colocalised with regions of somatic genome rearrangements.
In these regions, known as breakpoints, base pairs are more prone to being deleted, substituted, or translocated. Most hypotheses about kataegis involve errors during the frequent DNA repair that takes place at the breakpoints. A collection of enzymes from the DNA repair system is recruited to excise the mismatched base pair. When these enzymes try to mend the mutational damage, they unwind the DNA into single strands and create abasic lesions, that is, sites lacking a purine or pyrimidine base. Across the lesion, the bases in the unpaired, single-stranded DNA (ssDNA) are more accessible to modifying enzymes that can cause further damage to the sequence, thus forming the mutational clusters seen in kataegis.

Two enzyme families are thought to be related to kataegis. The APOBEC ('apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like') enzyme family causes predominantly C→T mutations, and translesion DNA synthesis (TLS) DNA polymerases cause C→G or C→T mutations.

The APOBEC family is a group of cytidine deaminase enzymes that plays an important role in the immune system. Its major function is to induce genetic mutations in antibodies, which need a huge variety of genes in order to bind to different antigens. The APOBEC family can also protect against infection by retroviruses and against retrotransposons. In ssDNA, APOBEC can remove an amine group from a cytosine (C), turning it into a uracil (U); such deamination can mutate the viral genome and terminate reverse transcription, the process that copies RNA back into DNA. As shown in Figure 1, the base mutations in kataegis regions were found to be almost exclusively cytosine to thymine in the context of a TpC dinucleotide (p denotes the phosphate that links the two nucleotides in the backbone). At DNA lesion sites, APOBEC enzymes can gain access to long stretches of ssDNA and induce C→U mutations. The APOBEC family is processive and can continue to induce multiple mutations in a small region.
If this part of the DNA is replicated before the mutation is repaired, the mutation is passed on to the subclones. Because uracil pairs with adenine, the original C:G pair becomes a T:A pair after replication, hence the predominance of C→T mutations in kataegis.

Within the APOBEC family, the APOBEC3 subfamily is responsible for protection against retroviruses such as HIV (whose genome is known to be edited by APOBEC3F and APOBEC3G). Since their original functions include editing ssDNA, these enzymes are more likely to be responsible for causing large numbers of mutations in human ssDNA. Direct evidence linking the APOBEC deaminases to kataegis-like clusters of mutations was obtained by expressing a hyperactive deaminase in yeast cells. Recent evidence has linked over-expression of the family member APOBEC3B with multiple human cancers, highlighting its possible contribution to genomic instability and kataegis.

Meanwhile, activation-induced cytidine deaminase (AID) has been shown to facilitate kataegis formation in human lymphomas. AID's major function is to diversify the genes of immune cells. Recent research shows that AID is involved in site-specific mutations in B cell tumors, while the APOBEC3 subfamily causes non-specific, genome-wide mutations in non-B cell tumors.

The translesion DNA synthesis (TLS) DNA polymerase family inserts nucleotides to bridge the abasic sites in a DNA lesion. Owing to the nature of this function, TLS DNA polymerases have a high error rate. They can slip on the sequence or insert A or C bases into a distorted region of the DNA strand; as shown in Figure 3, TLS DNA polymerases may cause mutations in many different ways.
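
The rainfall-plot idea mentioned earlier can be illustrated with a short Python sketch. A rainfall plot places each mutation at its genomic position and plots the distance to the previous mutation, usually on a logarithmic axis, so that tightly clustered mutations appear as a sharp drop. The function names and the two thresholds below (`max_gap` and `min_mutations`) are illustrative assumptions, not values taken from this article, and the simple gap-based grouping is a simplification rather than the method used in any particular study.

```python
from typing import List, Tuple


def intermutation_distances(positions: List[int]) -> List[Tuple[int, int]]:
    """Return (position, distance to previous mutation) pairs for one chromosome.

    These pairs are the values a rainfall plot would display
    (position on the x axis, distance on a log-scaled y axis).
    """
    ordered = sorted(positions)
    return [(cur, cur - prev) for prev, cur in zip(ordered, ordered[1:])]


def find_clusters(
    positions: List[int],
    max_gap: int = 1000,     # assumed threshold, in base pairs
    min_mutations: int = 6,  # assumed minimum cluster size
) -> List[Tuple[int, int, int]]:
    """Group mutations into runs whose consecutive gaps are all <= max_gap.

    Returns (start, end, mutation count) for each run that contains
    at least min_mutations mutations.
    """
    clusters = []
    run: List[int] = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_mutations:
                clusters.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    # Do not forget the run that is still open at the end.
    if len(run) >= min_mutations:
        clusters.append((run[0], run[-1], len(run)))
    return clusters


# Hypothetical mutation positions on a single chromosome.
example_positions = [
    5_000, 120_000, 120_150, 120_300, 120_420,
    120_700, 120_950, 121_100, 480_000, 910_000,
]
print(intermutation_distances(example_positions))
print(find_clusters(example_positions))
```

With these made-up positions, the seven mutations between 120,000 and 121,100 form the only candidate cluster; the isolated mutations elsewhere are ignored. A real analysis would also handle multiple chromosomes and would typically check the substitution type and strand of each mutation.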