Management of Molecular Simulation Database

2014 
A large amount of molecular simulation (MS) data produced in scientific studies is stored in flat files. Any information requested by a user (or program) is served by the operating system. Often, the same files are accessed multiple times when different information is requested or a different analytical query is executed, and multiple users accessing the same file result in multiple accesses to that file. The I/O bandwidth requirement is huge, and repeated accesses delay requests that arrive in sequence. Therefore, MS data needs to be managed effectively so that access to it is efficient.

In this work, we propose storing MS data in a database management system (DBMS) and develop novel indexing strategies to help optimize the processing of a wide range of queries. The query-plan generation feature of the DBMS minimizes the number of accesses to the file system. Multiple functions (queries) accessing the same simulation file can share a single access request made to the file system.

We also propose a website to host such a DBMS, so that researchers can upload their data and perform analysis efficiently. Users can analyze their data using efficient functions implemented to access the data. Index structures are generated to store all results of analysis that may be interesting to other users, so that these results are readily available without the need to duplicate the analysis. The data upload feature can be made available through application programming interfaces (APIs). The DBMS takes care of generating indexes, storing the results of analysis, and retrieving them efficiently whenever users run an analysis query or request information.
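The ideas of a shared access and of stored analysis results can be illustrated with a minimal sketch. It is not the system proposed in this work. It assumes SQLite as a stand-in DBMS, and the table names (`atoms`, `analysis_cache`), the column layout, and the functions `build_db` and `frame_summary` are hypothetical names chosen for illustration. One query computes two analyses (total mass and center of mass) in a single pass over a frame, and the result is stored so that later requests do not repeat the analysis.

```python
import sqlite3


def build_db(rows):
    """Load atom records into an in-memory database (SQLite is an assumed stand-in).

    rows: iterable of (frame, atom_id, x, y, z, mass) tuples (hypothetical layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE atoms ("
        "frame INTEGER, atom_id INTEGER, x REAL, y REAL, z REAL, mass REAL)"
    )
    # Index on frame so that per-frame queries avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_atoms_frame ON atoms(frame)")
    # Stored analysis results, keyed by frame, reusable by any later request.
    conn.execute(
        "CREATE TABLE analysis_cache ("
        "frame INTEGER PRIMARY KEY, total_mass REAL, cx REAL, cy REAL, cz REAL)"
    )
    conn.executemany("INSERT INTO atoms VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def frame_summary(conn, frame):
    """Return ((total_mass, cx, cy, cz), served_from_cache) for one frame."""
    cached = conn.execute(
        "SELECT total_mass, cx, cy, cz FROM analysis_cache WHERE frame = ?",
        (frame,),
    ).fetchone()
    if cached is not None:
        return cached, True

    # Two analyses (total mass, center of mass) share one pass over the frame.
    result = conn.execute(
        "SELECT SUM(mass), "
        "SUM(mass * x) / SUM(mass), "
        "SUM(mass * y) / SUM(mass), "
        "SUM(mass * z) / SUM(mass) "
        "FROM atoms WHERE frame = ?",
        (frame,),
    ).fetchone()
    conn.execute(
        "INSERT INTO analysis_cache VALUES (?, ?, ?, ?, ?)", (frame, *result)
    )
    conn.commit()
    return result, False
```

Calling `frame_summary` twice for the same frame computes the summary once and serves the second request from `analysis_cache`. A production design would also need to invalidate stored results when the underlying data changes, which this sketch does not address.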