Preliminary profile of the Cryptosporidium parvum genome: an expressed sequence tag and genome survey sequence analysis.

2000 
Abstract: Cryptosporidium parvum is a protozoan enteropathogen that infects humans and animals and causes a pronounced diarrheal disease that can be life-threatening in immunocompromised hosts. No specific chemo- or immunotherapies exist to treat cryptosporidiosis, and little molecular information is available to guide development of such therapies. To accelerate gene discovery and identify genes encoding potential drug and vaccine targets, we constructed sporozoite cDNA and genomic DNA sequencing libraries from the Iowa isolate of C. parvum and determined ∼2000 sequence tags by single-pass sequencing of random clones. Together, the 567 expressed sequence tags (ESTs) and 1507 genome survey sequences (GSSs) totaled one megabase (1 Mb) of unique genomic sequence, indicating that ∼10% of the 10.4 Mb C. parvum genome has been sequence tagged in this gene discovery expedition. The tags were used to search the public nucleic acid and protein databases via BLAST analyses, and 180 ESTs (32%) and 277 GSSs (18%) exhibited similarity with database sequences at smallest sum probabilities P(N) ≤ 10^-8. Some tags encoded proteins with clear therapeutic potential, including S-adenosylhomocysteine hydrolase, histone deacetylase, polyketide/fatty-acid synthases, various cyclophilins, thrombospondin-related cysteine-rich protein and ATP-binding-cassette transporters. Several anonymous ESTs encoded proteins predicted to contain signal peptides or multiple transmembrane spanning segments, suggesting they were destined for membrane-bound compartments, the cell surface or extracellular secretion. One hundred four simple sequence repeats were identified within the nonredundant sequence tag collection, with (TAA)≥6/(TTA)≥6 and (TA)≥10/(AT)≥10 being the most prevalent, occurring 40 and 15 times, respectively.
Various cellular RNAs and their genes were also identified including the small and large ribosomal RNAs, five tRNAs, the U2 small nuclear RNA, and the small and large virus-like, double-stranded RNAs. This investigation has demonstrated that survey sequencing is an efficient procedure for gene discovery and genome characterization and has identified and sequence tagged many C. parvum genes encoding potential therapeutic targets.
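
The simple sequence repeat classes named in the abstract, (TAA)≥6/(TTA)≥6 and (TA)≥10/(AT)≥10, can be illustrated with a minimal Python sketch. The abstract does not say which tool or criteria the authors used to find repeats, so everything here is an assumption made for illustration: the names `SSR_CLASSES` and `count_ssrs` are hypothetical, only perfect tandem repeats on the given strand are counted, and each class is scanned with a single alternation pattern so that one repeat run is not counted twice in shifted frames.

```python
import re

# Repeat classes and minimum unit counts as quoted in the abstract.
# Grouping the motifs into two classes is an illustrative assumption.
SSR_CLASSES = {
    "(TAA)/(TTA)": (("TAA", "TTA"), 6),
    "(TA)/(AT)": (("TA", "AT"), 10),
}


def count_ssrs(sequences):
    """Count perfect tandem repeat runs per class across a list of sequences."""
    counts = {name: 0 for name in SSR_CLASSES}
    for name, (motifs, min_repeats) in SSR_CLASSES.items():
        # One alternation per class; finditer yields non-overlapping matches,
        # so a run such as (TA)x11 is not also reported as (AT)x10.
        pattern = re.compile(
            "|".join("(?:%s){%d,}" % (motif, min_repeats) for motif in motifs)
        )
        for seq in sequences:
            counts[name] += sum(1 for _ in pattern.finditer(seq.upper()))
    return counts


if __name__ == "__main__":
    # Made-up toy sequences, not data from the study.
    toy_tags = [
        "GG" + "TAA" * 6 + "CC",
        "TA" * 11,
        "TAA" * 5,  # below the six-unit threshold, so not counted
        "C" + "TTA" * 7 + "G" + "AT" * 10,
    ]
    print(count_ssrs(toy_tags))
```

A real survey would read the nonredundant tag collection from a FASTA file and would probably also consider the reverse strand and imperfect repeats; this sketch leaves those out to keep the counting rule visible.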