An Architecture for Unstructured Data Management

2012 
With the arrival of the information age, a vast amount of information is available on the Internet, and most data on the Web are unstructured. Significant data should nevertheless be organized and stored in a suitable way for future use. The management of unstructured data remains an unsolved problem: unstructured data such as presentations, spreadsheets, text documents, memos, images, and web pages become difficult to manage as the data grow in scale and as users have different requirements and interests. In this paper, we propose an architecture for unstructured data management that integrates source query, data collection, and data management to address these problems. The data collection layer extracts the data of interest, either automatically with existing tools or by adding data to the repository manually. The data management layer manages all the collected data by classifying it, selecting nodes on which to store it, and maintaining a centralized index. The source query layer allows users to query and retrieve diverse data through an adaptive query service and a recommendation service. Finally, we implemented a prototype system, OCourse, based on this architecture to show that it is feasible and efficient.
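The three layers described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not the OCourse implementation: the names Item, collect, classify, and Manager, the extension-based classifier, the least-loaded node choice, and the word-level index are all assumptions introduced here, since the abstract does not specify how each step is carried out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    """One unstructured item (hypothetical record shape)."""
    name: str
    text: str


def collect(raw_sources):
    """Data collection layer: build items from (name, text) pairs, skipping empty text."""
    return [Item(name, text) for name, text in raw_sources if text.strip()]


def classify(item):
    """Toy classifier by file extension (an assumption, not the paper's method)."""
    ext = item.name.rsplit(".", 1)[-1].lower() if "." in item.name else ""
    return {"ppt": "presentation", "xls": "spreadsheet", "txt": "text"}.get(ext, "other")


class Manager:
    """Data management layer: classify, select a storage node, keep a central index."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]  # simulated storage nodes
        self.index = defaultdict(set)                     # word -> item names
        self.location = {}                                # name -> (node id, category)

    def store(self, item):
        # Assumed policy: place the item on the node currently holding the fewest items.
        node_id = min(range(len(self.nodes)), key=lambda i: len(self.nodes[i]))
        self.nodes[node_id][item.name] = item
        self.location[item.name] = (node_id, classify(item))
        for word in item.text.lower().split():
            self.index[word].add(item.name)

    def query(self, keyword):
        """Source query layer: keyword lookup through the central index."""
        return sorted(self.index.get(keyword.lower(), set()))
```

In this sketch, a caller would pass the output of collect to Manager.store one item at a time and then call Manager.query with a keyword. The adaptive query and recommendation services mentioned in the abstract are not modeled.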