Public health is a major factor in reducing disease around the world. Today, most governments recognise the importance of public health surveillance in monitoring and clarifying the epidemiology of health problems. As part of public health surveillance, public health professionals use the results of epidemiological analysis to reform health care policy and health service plans. Many epidemiological analysis reports exist within government departments, but the public cannot access them because of commercial software restrictions. Although governments do publish many epidemiological analysis reports, these are couched in epidemiological terminology and are almost impossible for the public to fully understand.
To improve public awareness, there is an urgent need for governments to produce more easily understandable epidemiological analyses and to provide an open-access reporting system at minimum cost. Inevitably, this poses a challenge to IT professionals: to develop a simple, easily understandable and freely accessible system for public use. It requires not only identifying a data analysis algorithm that makes epidemiological analysis reports easy to understand, but also choosing a platform that supports the visualisation of those reports at minimum cost. This thesis pursued two major research objectives: clustering analysis of epidemiological data and geospatial visualisation of the clustering results. Three clustering algorithms commonly used for health data analysis, SOM, FCM and k-means, were investigated. After a number of experiments, k-means was identified, on the basis of Davies-Bouldin index validation, as the best clustering algorithm for epidemiological data. The geospatial visualisation requires a Geo-Mashups engine and geospatial layer customisation. Because of its capacity and the many successful applications built on free geospatial web services, Google Maps was chosen as the geospatial visualisation platform for epidemiological reporting.
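As a rough sketch of how such a comparison can be run, the snippet below uses scikit-learn to score k-means partitions with the Davies-Bouldin index (lower is better) and pick the best cluster count; the synthetic blob data stands in for real encoded epidemiological feature vectors and is not the thesis's dataset.

```python
# Minimal sketch: selecting a k-means cluster count via the Davies-Bouldin index.
# The synthetic blob data below is a stand-in for encoded epidemiological features.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

X, _ = make_blobs(n_samples=500, centers=4, n_features=4, random_state=0)

best_k, best_score = None, float("inf")
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = davies_bouldin_score(X, labels)  # lower means better-separated clusters
    if score < best_score:
        best_k, best_score = k, score

print(f"Best k by Davies-Bouldin index: {best_k} (score = {best_score:.3f})")
```

The same loop could score SOM or FCM partitions by passing their cluster labels to the same index, which is one plausible way the three algorithms could be compared on common ground.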
Verifying the facts alleged by prosecutors before a trial requires judges to retrieve evidence from the massive accompanying materials. Existing Legal AI applications often assume the facts are already determined and overlook the difficulty of reconstructing them. To build a practical Legal AI application and free judges from this manual search work, we introduce the task of Legal Evidence Retrieval, which aims to automatically retrieve the precise fact-related verbal evidence within a single case. We formulate the task in a dense retrieval paradigm and jointly learn contrastive representations and alignments between facts and evidence. To avoid tedious annotation, we construct an approximated positive vector for a given fact by aggregating a set of evidence from the same case. An entropy-based denoising technique is further applied to mitigate the impact of false positive samples. We train our models on tens of thousands of unlabeled cases and evaluate them on a labeled dataset containing 919 cases and 4,336 queries. Experimental results indicate that our approach is effective and outperforms other state-of-the-art representation and retrieval models. The dataset and code are available at https://github.com/yaof20/LER.
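A minimal PyTorch sketch of the approximated-positive construction follows; the embedding dimension, temperature, aggregation by mean, and InfoNCE-style loss are generic illustrative assumptions, not the released LER implementation.

```python
# Sketch of the approximated-positive idea: the positive vector for a fact is an
# aggregate (here, the mean) of encoded evidence sentences from the same case.
# Dimensions, temperature, and loss shape are illustrative assumptions.
import torch
import torch.nn.functional as F

def approximated_positive(evidence_embs: torch.Tensor) -> torch.Tensor:
    # evidence_embs: (num_evidence, dim) encoded evidence from one case
    return evidence_embs.mean(dim=0)

def contrastive_loss(fact_emb, pos_emb, neg_embs, tau=0.05):
    # fact_emb: (dim,); pos_emb: (dim,); neg_embs: (num_neg, dim)
    fact = F.normalize(fact_emb, dim=-1)
    cands = F.normalize(torch.cat([pos_emb.unsqueeze(0), neg_embs]), dim=-1)
    logits = cands @ fact / tau                 # similarity of fact to each candidate
    target = torch.zeros(1, dtype=torch.long)   # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)

torch.manual_seed(0)
fact = torch.randn(768)           # encoded fact description
evidence = torch.randn(5, 768)    # encoded evidence sentences from the same case
negatives = torch.randn(32, 768)  # negatives drawn from other cases
loss = contrastive_loss(fact, approximated_positive(evidence), negatives)
print(loss.item())
```

The entropy-based denoising step described above would act on top of such a construction, down-weighting aggregated evidence likely to be a false positive; it is omitted here.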
Objective
To detect adverse drug reaction (ADR) signals using data mining algorithms and to explore their application value.
Methods
Reports on adverse reactions induced by anti-infective drugs, collected by the National Center for ADR Monitoring from January 2009 to December 2013, were gathered, and potential ADR risk signals were detected using the proportional reporting ratio (PRR) method, the reporting odds ratio (ROR) method, the Medicines and Healthcare products Regulatory Agency (MHRA) method, the Bayesian confidence propagation neural network (BCPNN) method, and the multi-item gamma Poisson shrinker (MGPS) method. The detection results of the five signal detection methods were compared.
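Of the five methods, PRR and ROR are the simplest to state: from the standard 2x2 contingency table of report counts, PRR = [a/(a+b)] / [c/(c+d)] and ROR = (a*d)/(b*c). The sketch below computes both for hypothetical counts; the MHRA, BCPNN, and MGPS methods add further statistics (chi-square thresholds and Bayesian shrinkage estimates) not shown here.

```python
# Disproportionality measures from a 2x2 table of spontaneous-report counts.
# The counts a-d below are hypothetical, not figures from this study:
#   a: target drug, target reaction      b: target drug, other reactions
#   c: other drugs, target reaction      d: other drugs, other reactions

def prr(a, b, c, d):
    # Proportional reporting ratio
    return (a / (a + b)) / (c / (c + d))

def ror(a, b, c, d):
    # Reporting odds ratio
    return (a * d) / (b * c)

a, b, c, d = 28, 942, 613, 34176  # hypothetical report counts
print(f"PRR = {prr(a, b, c, d):.2f}, ROR = {ror(a, b, c, d):.2f}")
```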
Results
A total of 35 807 ADR reports induced by anti-infective drugs were collected, of which 35 759 effective reports were entered, involving 834 suspected drugs. In the 35 759 reports, 464 kinds of ADR were defined according to the lowest level term and 21 kinds according to system/organ classification. After data cleaning, splitting, and encoding, 6 620 reports containing a suspected drug-adverse reaction combination were acquired. There were 3 966 reports (59.91%) in which a suspected drug-adverse reaction combination appeared once, 937 reports (14.15%) in which it appeared twice, and 1 717 reports (25.94%) in which it appeared three or more times. The numbers of ADR signals detected using PRR, ROR, MGPS, BCPNN, and MHRA were 651, 614, 306, 75, and 57, respectively; the numbers of drug categories were 194, 168, 124, 34, and 40, respectively; the numbers of ADR types were 139, 139, 121, 35, and 40, respectively. Among the top ten risk signals, the azithromycin-nausea combination was detected by all five signal detection methods, and the levofloxacin-pruritus combination was detected by PRR, ROR, MHRA, and BCPNN. The top ten signals detected by PRR were identical to those detected by ROR, whereas the signals detected by the other methods varied.
Conclusions
Potential risk signals in ADR reports can be detected systematically and automatically using PRR, ROR, MGPS, BCPNN, and MHRA. However, each method has its own advantages and disadvantages and should be applied according to the actual situation and demand.
Key words:
Data mining; Adverse drug reactions
Interval graphs provide a natural model for a vast number of scheduling and VLSI problems. A variety of interval graph problems have been solved on the PRAM family. Recently, a powerful architecture called the reconfigurable mesh has been proposed: in essence, a reconfigurable mesh consists of a mesh-connected architecture augmented by a dynamically reconfigurable bus system. It has been argued that the regular structure of the reconfigurable mesh is suitable for VLSI implementation. The authors develop a set of tools and show how they can be used to devise constant-time algorithms for a number of interval-related problems on reconfigurable meshes. These problems include finding a maximum independent set, a minimum clique cover, a minimum dominating set, and a minimum coloring, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, an algorithm to find the center of an interval graph. More precisely, with an arbitrary family of n intervals as input, all of their algorithms run in constant time on a reconfigurable mesh of size n*n.
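The constant-time mesh algorithms themselves are beyond a short snippet, but the combinatorial core of one of the listed problems is easy to illustrate: in an interval graph an independent set is a family of pairwise disjoint intervals, so a greedy sweep by right endpoint yields a maximum independent set. The sequential sketch below, with hypothetical intervals, shows only that core and makes no claim about the authors' mesh implementation.

```python
# Sequential sketch of maximum independent set on an interval family: sort by
# right endpoint and greedily keep each interval disjoint from the last one kept.
# The input intervals are hypothetical closed intervals (left, right).

def max_independent_set(intervals):
    chosen, last_end = [], float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_end:        # disjoint from everything chosen so far
            chosen.append((left, right))
            last_end = right
    return chosen

# Picks (1, 4), (5, 7), (8, 10): a largest set of pairwise disjoint intervals.
print(max_independent_set([(1, 4), (2, 6), (5, 7), (6, 9), (8, 10)]))
```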