Development and characterization of a pooled Haemophilus influenzae genomic library for the evaluation of gene expression changes associated with mucosal biofilm formation in otitis media

2003 
Abstract
Haemophilus influenzae is one of the most important human respiratory pathogens. It has been etiologically associated with otitis media, otorrhea, and chronic obstructive pulmonary disease. Identification of new genomic elements will provide novel targets to fight chronic infections caused by this organism.
Objective: The new paradigm that chronic infections are caused by bacterial biofilms prompted us to study the relationship between bacterial pathogenicity, biofilm formation and bacterial communal cooperation. To do this, it is essential to determine the virulence gene sets that are involved in these processes and whether they are present in every bacterial cell or distributed in a “communal gene-pool”, as proposed by the distributed genome hypothesis (DGH). We designed, constructed and characterized a highly redundant genomic DNA library comprising the genomes of ten low-passage clinical isolates of H. influenzae, which carry large numbers of genes that are not present in the laboratory strains of H. influenzae.
Methods: Genomic DNA from each of the ten clinical strains was hydrodynamically sheared to produce a mean fragment size of 1.5–2.5 kb. The ten sheared DNAs were then pooled and used to construct a genomic library of 76 800 clones.
Results: Our restriction endonuclease and sequence analyses of 800 clones demonstrate that 75% of the clones carry an insert larger than 0.5 kb. The library has an average insert size of ∼1.5 kb and therefore better than 4.5× redundancy for each of the genomes of the ten clinical isolates. Our sequencing effort (∼1 million nucleotides to date) reveals that a substantial fraction of the sequenced clones (75 of 686, or 11%) carry sequences that are not represented in the genome of the reference strain H. influenzae Rd.
Conclusions: Based on the above results, the library has better than 4.5× coverage for each of the ten constituent genomes. Our preliminary sequencing data (∼1 million nucleotides) indicate that the library lacks highly repeated sequences; therefore, the expected genome coverage (4.5×) is not degraded. Using the prevalence of non-Rd-like sequences (11%) detected during characterization of the genomic library, we estimate that the library contains ∼2 million bp of DNA sequence that is not represented in the reference genome of the H. influenzae Rd strain. This amount exceeds the size of the reference genome itself and provides ample targets for innovative drug design.
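The abstract does not spell out how the 4.5× coverage and ∼2 million bp figures were derived. The following Python sketch shows one reading that reproduces them. It rests on assumptions that are not in the abstract: that each clinical isolate has roughly the size of the published Rd genome (about 1.83 Mb), that only the 75% of clones with inserts larger than 0.5 kb contribute, and that the 11% non-Rd fraction applies to the combined length of the ten genomes. All function and constant names are illustrative.

```python
# Back-of-envelope check of the library figures quoted in the abstract.
# Values marked "assumption" do not come from the abstract itself.

TOTAL_CLONES = 76_800          # library size (from the abstract)
USABLE_FRACTION = 0.75         # clones with inserts > 0.5 kb (from the abstract)
MEAN_INSERT_BP = 1_500         # ~1.5 kb average insert (from the abstract)
N_STRAINS = 10                 # pooled clinical isolates (from the abstract)
ASSUMED_GENOME_BP = 1_830_000  # assumption: each isolate is about the size of Rd


def per_genome_coverage(total_clones, usable_fraction, mean_insert_bp,
                        n_strains, genome_bp):
    """Fold coverage of each genome, assuming the ten strains are pooled equally."""
    cloned_bp = total_clones * usable_fraction * mean_insert_bp
    return cloned_bp / (n_strains * genome_bp)


def non_rd_fraction(non_rd_clones, sequenced_clones):
    """Fraction of sequenced clones with no match in the Rd reference genome."""
    return non_rd_clones / sequenced_clones


def estimated_non_rd_bp(fraction, n_strains, genome_bp):
    """Non-Rd sequence across the pool, assuming the fraction applies genome-wide."""
    return fraction * n_strains * genome_bp


if __name__ == "__main__":
    coverage = per_genome_coverage(TOTAL_CLONES, USABLE_FRACTION, MEAN_INSERT_BP,
                                   N_STRAINS, ASSUMED_GENOME_BP)
    fraction = non_rd_fraction(75, 686)
    novel_bp = estimated_non_rd_bp(fraction, N_STRAINS, ASSUMED_GENOME_BP)
    print(f"coverage per genome: {coverage:.2f}x")
    print(f"non-Rd clone fraction: {fraction:.1%}")
    print(f"estimated non-Rd sequence: {novel_bp / 1e6:.2f} Mb")
```

Under these assumptions the coverage comes out slightly above 4.5× and the non-Rd estimate close to 2 Mb, in line with the stated values. A different assumed genome size or usable-clone fraction would shift both numbers.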