Protein Information Resource: a community resource for expert annotation of protein data

2001 
The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high-quality annotation and promote database interoperability, PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature, and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies, with their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP.
For over three decades, the Protein Information Resource (PIR) has been a community resource that provides protein databases and analysis tools to support research on molecular evolution, functional genomics and computational biology. The PIR, along with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), maintains and distributes the PIR-International Protein Sequence Database, the most comprehensive, well-annotated and non-redundant protein sequence database in the public domain.
To further support genomic and proteomic research, we have greatly improved our bioinformatics infrastructure in the last 2 years. This allows us: (i) to continue to provide high-quality protein sequence data and annotation while keeping pace with the large influx of data generated by genome sequencing projects; (ii) to develop an integrated system of protein databases and analytical tools for expert annotation and knowledge discovery; and (iii) to improve the accessibility of our resource and the interoperability of our databases. Key developments include: highly automated protein sequence classification and annotation; an enhanced web site with many new search engines and functions for protein data mining and analysis; a new integrated classification database that provides comprehensive descriptions of family relationships and functional/structural annotations; migration of the databases into the Oracle 8i object-relational database system; and database distribution in XML format.