SMITH, A. B. & BATTEN, D. J. (eds). 2002. Fossils of the Chalk, 2nd ed., revised and enlarged. Field Guides to Fossils Series no. 2. ix + 374 pp. London: The Palaeontological Association. Price £14.00 (paperback). ISBN 0 901702 78 1; ISSN 0962-5321. Volume 140, Issue 1.
As machine learning becomes more prominent in the military, we are faced with a different take on the old problem of how to collect data relevant to a military mission need. We can now embrace the paradigm of too much data, where previously we needed to focus on data reduction because humans can only process a finite amount of information. Commanders, analysts, and intelligence officers are often tasked with understanding the current situation in a mission area and creating a common operating picture in order to complete their mission objectives. Data pertaining to missions can often be scraped from multiple domains, including patrol reports, newswire, RF sensors, image sensors, and various other sensor types in the field. In this paper, we describe a system called the Multi-Domain Integration and Correlation Engine (MD-ICE), which ingests data from two domains, textual open-source information (newswire and social media) and sensor network information, and processes it using tools from various machine learning research areas. MD-ICE transforms the resulting data into a machine-readable unified format to allow for labelling and inference of inter-domain correlations. The goal of MD-ICE is to utilize these information domains to better understand situational context: open-source information provides semantic context (i.e. what type of event occurred, who is involved, etc.), while sensor network information provides fine-grained detail (how many people were involved, the exact area of the event, etc.). With further research, this understanding of situational context can in turn help commanders reach their mission objectives faster through improved situational understanding and prediction of future needs.
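The inter-domain correlation step described above can be illustrated with a minimal sketch. This is not the MD-ICE implementation (its schema and matching logic are not given in the abstract); the `Event` record, the field names, and the time/distance thresholds are all placeholder assumptions, showing only the idea of pairing text-derived events with sensor detections that agree in time and place.

```python
from dataclasses import dataclass

# Hypothetical unified record; MD-ICE's actual machine-readable format is not public.
@dataclass
class Event:
    source: str   # "text" or "sensor"
    kind: str     # semantic label, e.g. "protest" or "crowd"
    time: float   # epoch seconds
    lat: float
    lon: float

def correlate(text_events, sensor_events, max_dt=3600.0, max_deg=0.05):
    """Pair each text-derived event with sensor detections that fall
    within a time window (max_dt seconds) and a coarse lat/lon box
    (max_deg degrees). Thresholds are illustrative only."""
    pairs = []
    for t in text_events:
        for s in sensor_events:
            if (abs(t.time - s.time) <= max_dt
                    and abs(t.lat - s.lat) <= max_deg
                    and abs(t.lon - s.lon) <= max_deg):
                pairs.append((t, s))
    return pairs
```

In this toy setup the text event supplies the semantic label ("what happened") while the matched sensor detection supplies the fine-grained detail, mirroring the division of labour the abstract describes.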
Following the technological advances that have enabled genome-wide analysis in most model organisms over the last decade, there has been unprecedented growth in genomic and post-genomic science, with concomitant generation of an exponentially increasing volume of data and material resources. As a result, numerous repositories have been created to store and archive data, organisms, and material, which are of substantial value to the whole community. Sustained access, facilitating re-use of these resources, is essential not only for validation but also for re-analysis, testing of new hypotheses, and developing new technologies and platforms. A common challenge for most data resources and biological repositories today is finding financial support for maintenance and development to best serve the scientific community. In this study, we examine the problems that currently confront the data and resource infrastructure underlying the biomedical sciences. We discuss financial sustainability issues and potential business models that could be adopted by biological resources, and consider long-term preservation issues within the context of mouse functional genomics efforts in Europe.
Molecular and genetic approaches now provide powerful tools for investigating the origin of human populations and the evolution of genes affecting both complex and monogenic traits. This allows a biologically appropriate classification of individuals and population groups based on genotypes and gene frequencies rather than appearance; this information will facilitate disease risk assessment. Many of the observed differences in gene frequencies between human populations may be accounted for by population movements, and it is difficult to assess how large a role natural selection has played in population differentiation. For several disease genes there is evidence that environmental agents, such as infectious pathogens, dietary factors, and environmental toxins, have been responsible for an increased frequency in some populations. Further examples of such selection may emerge from current efforts to define the genetic basis of many polygenic common diseases. We need more information on molecular genetic differences between human populations; we could obtain that information from the successful completion of the Human Genome Project and the proposed Human Genome Diversity Project.
Motivation: Conventional phylogenetic analysis for characterizing the relatedness between taxa typically assumes that a single relationship exists between species at every site along the genome. This assumption fails to account for recombination, a fundamental process for generating diversity, and can lead to spurious results. Recombination induces a localized phylogenetic structure that may vary along the genome. Here, we generalize a hidden Markov model (HMM) to infer changes in phylogeny along multiple sequence alignments while accounting for rate heterogeneity; the hidden states refer to the unobserved phylogenetic topology underlying the relatedness at a genomic location. The number of hidden states (topologies) and their structure are random (not known a priori) and are sampled using Markov chain Monte Carlo algorithms. The HMM structure allows us to analytically integrate over all possible changepoints in topology as well as all the unknown branch lengths. Results: We demonstrate our approach on simulated data, and apply it to the genome of a suspected HIV recombinant strain as well as to an investigation of recombination in the sequences of 15 laboratory mouse strains sequenced by Perlegen Sciences. Our findings indicate that our method can distinguish between rate heterogeneity and variation in phylogeny caused by recombination without being restricted to four-taxon data. Availability: The method has been implemented in Java and is available, along with the data studied here, from http://www.stats.ox.ac.uk/~webb. Contact: cholmes@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
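The analytic integration over all possible changepoints that the abstract describes is what the standard HMM forward recursion provides: summing over hidden-state paths sums over every configuration of topology switches. A minimal sketch follows; this is not the authors' Java implementation, and the topology set, transition matrix, and per-site log-likelihoods are placeholder inputs assumed for illustration.

```python
import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a))) for a 1-D array."""
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

def forward_loglik(log_emission, log_trans, log_init):
    """HMM forward recursion in log space.

    log_emission: (L, K) per-site log-likelihood of each alignment column
                  under each of K candidate topologies
    log_trans:    (K, K) log transition probabilities between topologies
    log_init:     (K,)   log initial distribution over topologies

    Returns the total log-likelihood, marginalized over all hidden
    topology paths (i.e. all possible changepoint configurations).
    """
    L, K = log_emission.shape
    alpha = log_init + log_emission[0]
    for t in range(1, L):
        alpha = np.array([logsumexp(alpha + log_trans[:, k]) for k in range(K)])
        alpha += log_emission[t]
    return logsumexp(alpha)
```

As a sanity check, with uniform transitions and flat emissions (probability 1 at every site), the path probabilities sum to 1, so the returned log-likelihood is 0.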