As Web pages grow structurally more complex, constructing wrapper induction rules becomes more difficult and time-consuming. The main problem in most wrapper induction methods is the difficulty of discriminating the meaningful blocks that contain the target information from the noise blocks that contain irrelevant information such as advertisements, menus, or copyright statements. To solve this problem, this paper proposes the RIPB (recognizing informative page blocks) algorithm, which detects the informative blocks in a Web page by exploiting a visual block segmentation scheme. RIPB uses a visual page segmentation algorithm to analyze and partition a Web page into a set of logical blocks, groups related blocks with similar structures into block clusters, and recognizes the informative block clusters by applying heuristic rules to the cluster information. The results of a series of experiments indicate that RIPB improves the accuracy of information extraction by allowing the wrapper induction module to focus only on the informative blocks and to ignore noise when building extraction rules.
Personalized information retrieval and recommendation systems have been proposed to deliver the right information to users with different interests. However, most previous systems use keyword frequencies as the main factor for personalization and, as a result, cannot analyze semantic relations between words. Previous methods also often fail to provide documents that are semantically related to the query words. To solve these problems, we propose a recommendation system that provides relevant documents to users by identifying semantic relations between the user behavior history and an ontology that semantically represents the documents crawled by a Web robot. Recommendation is mainly based on content-based similarity, semantic similarity, and preference weights.
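The three recommendation factors named above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not say how the factors are combined, so the linear blend, the `alpha` parameter, the function names, and the example semantic similarity value of 0.8 are all hypothetical. Content-based similarity is shown as plain cosine similarity over term counts, and the ontology-based semantic similarity is treated as a number computed elsewhere.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors (content-based similarity)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommendation_score(content_sim, semantic_sim, preference, alpha=0.5):
    """Hypothetical blend: preference weight times a mix of the two similarities."""
    return preference * (alpha * content_sim + (1 - alpha) * semantic_sim)

# Term counts for a user profile and a candidate document.
profile = Counter("mobile web adaptation".split())
document = Counter("web content adaptation".split())

content_sim = cosine(profile, document)
semantic_sim = 0.8  # assumed value; would come from an ontology in a real system
score = recommendation_score(content_sim, semantic_sim, preference=1.0)
print(round(content_sim, 3), round(score, 3))
```

A keyword-only system would rank by `content_sim` alone; the blend lets a document with few shared keywords still score well when its concepts are close to the user's interests.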
Mobile content adaptation is a technology for effectively presenting content originally built for the desktop PC on wireless mobile devices. Previous approaches to Web content adaptation are mostly device-dependent, and the transformation of content to suit a smaller device is done manually. Furthermore, the same content is provided to different users regardless of their individual preferences. As a result, the user has difficulty selecting relevant information from a large volume of content, since context information related to the content is not provided. To resolve these problems, this paper proposes an enhanced method of Web content adaptation for mobile devices. In our system, the process of Web content adaptation consists of four stages: block filtering, block title extraction, block content summarization, and personalization through learning. Learning is initiated when the user selects the full content menu from the content summary page. As a result of learning, personalization is realized by showing the information for the relevant block at the top of the content list. A series of experiments is performed to evaluate the content adaptation on a number of Web sites, including online newspapers. The evaluation results are satisfactory in both block filtering accuracy and user satisfaction with personalization.
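The personalization stage described above can be pictured with a minimal sketch. It assumes that learning amounts to counting how often the user opens the full content of each block and that the list is reordered by that count; the abstract does not specify the learning method, so the `BlockRanker` class and its method names are hypothetical.

```python
from collections import Counter

class BlockRanker:
    """Hypothetical helper: ranks content blocks by how often the user opened them."""

    def __init__(self):
        self.clicks = Counter()

    def record_full_view(self, block_title):
        # Learning step: the user chose the full-content menu for this block.
        self.clicks[block_title] += 1

    def rank(self, block_titles):
        # Most-opened blocks first; sorted() is stable, so ties keep page order.
        return sorted(block_titles, key=lambda title: -self.clicks[title])

ranker = BlockRanker()
for title in ["Sports", "Politics", "Sports"]:
    ranker.record_full_view(title)

ranked = ranker.rank(["Politics", "Weather", "Sports"])
print(ranked)
```

Blocks the user has never opened keep their original page order at the end of the list, so new content is still reachable.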
This paper describes an enhanced method of Web content adaptation to mobile devices for providing online news articles in ubiquitous environments. Our system exploits a visual block segmentation scheme for Web pages that filters out unnecessary blocks and extracts useful article information from content blocks. This method resolves the problems of previous approaches to Web content adaptation, in which the transformation of content to suit a smaller device is device-dependent and manual. Our method also employs a learning module that is initiated when the user chooses to view the full content from the content summary page. As a result of learning, personalization is realized by showing the information for the relevant block at the top of the content list. A series of experiments is performed to evaluate our mobile content adaptation method on a number of well-known Web news sites, and the evaluation results are satisfactory in both block filtering accuracy and user satisfaction with personalization.
It is generally difficult for researchers to obtain information related to their own fields and to novel technologies from the huge volume of data on the World Wide Web. Furthermore, they often try to apply such technologies to fields different from their own. The main motivation for doing so is to solve existing problems or to improve the performance of their systems. Hence, it is important to detect collaborative fields, in which technologies of one field are applied to another, in order to find various trends. In this paper, we propose a method for detecting collaborative fields using social networks that represent the relations among authors of papers, and we describe experimental results that show the effectiveness of the proposed method.
Clustering is an essential way in data mining to extract meaningful information from massive data without human intervention. Clustering algorithms can be divided into four types: partitioning, hierarchical, grid-based, and locality-based algorithms. Each type, however, has problems that are not easily solved. K-means, for example, suffers from the problem of setting initial centroids when the distribution of the data is not hyper-ellipsoidal. The chain effect, outliers, and varying degrees of density in the data are problems that occur in other types of algorithms. Various algorithms have been proposed to solve these problems. In this paper, we propose a novel grid-based clustering algorithm that builds clusters in each cell, and we show how it solves the problems mentioned above.
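For readers unfamiliar with the grid-based family, the sketch below shows the generic idea only: points are binned into cells, sparse cells are discarded as outliers, and neighbouring dense cells are merged. It is not the algorithm proposed in the paper, which builds clusters within each cell; the function name `grid_cluster` and the parameters `cell_size` and `min_points` are assumptions made for illustration.

```python
from collections import defaultdict, deque

def grid_cluster(points, cell_size=1.0, min_points=2):
    """Generic grid-based clustering for 2-D points (illustrative only)."""
    # Step 1: assign every point to the grid cell that contains it.
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    # Step 2: keep only dense cells; points in sparse cells are treated as outliers.
    dense = {cell for cell, pts in cells.items() if len(pts) >= min_points}

    # Step 3: merge dense cells that touch (8-neighbourhood) with a breadth-first search.
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in dense and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        clusters.append(sorted(members))
    return clusters

points = [(0.1, 0.2), (0.4, 0.9), (1.2, 0.5), (1.7, 0.1),
          (5.0, 5.1), (5.5, 5.9), (9.0, 0.0)]
clusters = grid_cluster(points)
print(clusters)
```

No initial centroids are needed, and the isolated point at (9.0, 0.0) is dropped as an outlier rather than distorting a cluster. The result does depend on `cell_size` and `min_points`, which is the usual trade-off of this family.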
Sample training data for machine learning often contain irrelevant information or redundant concepts, and the original data may also include noise. If the information collected for constructing a learning model is not reliable, it is difficult to obtain accurate results. The system therefore attempts to find relations or rules between features and categories in the learning phase. Feature selection removes irrelevant or redundant information before the learning model is constructed in order to improve its performance. Existing feature selection methods assume that the distribution of documents is balanced in terms of the number of documents in each class and the length of each document. In practice, however, it is difficult not only to prepare a set of documents of almost equal length, but also to define classes that each contain a fixed number of documents. In this paper, we propose a new feature selection method that considers the impurity of words and the unbalanced distribution of documents across categories. We obtain feature candidates using word impurity and then select the final features by taking the unbalanced distribution of documents into account. Experiments demonstrate that our method performs better than existing methods.
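One way to picture a word impurity measure that accounts for unbalanced classes is sketched below. The abstract does not define its impurity measure, so the use of Gini impurity, the normalisation by class size, and the name `word_gini` are all assumptions made for illustration, as are the example counts.

```python
def word_gini(doc_counts, class_sizes):
    """Gini impurity of a word's class distribution, normalised by class size.

    doc_counts:  number of documents containing the word, per class
    class_sizes: total number of documents, per class
    """
    # Use per-class rates rather than raw counts so large classes do not dominate.
    rates = {c: doc_counts.get(c, 0) / class_sizes[c] for c in class_sizes}
    total = sum(rates.values())
    if total == 0:
        # A word seen in no class carries no signal; treat it as maximally impure.
        return 1.0 - 1.0 / len(class_sizes)
    return 1.0 - sum((rate / total) ** 2 for rate in rates.values())

# Unbalanced corpus: 100 sports documents, 10 politics documents (made-up numbers).
class_sizes = {"sports": 100, "politics": 10}

election = word_gini({"sports": 0, "politics": 8}, class_sizes)
today = word_gini({"sports": 50, "politics": 5}, class_sizes)
print(election, today)
```

With raw counts, "today" (50 versus 5 documents) would look strongly tied to sports. After normalisation it appears in half of each class and gets the maximum two-class impurity of 0.5, while "election" scores 0 and would be kept as a feature candidate.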