Abstract: From birth to 5 years of age, brain structure matures and evolves alongside emerging cognitive and behavioral abilities. A major challenge that has impeded prior investigation of the time-dynamic relationship between concurrent cognitive functioning and measures of brain structure is the sparse and irregular nature of most longitudinal neuroimaging data. We demonstrate how this problem can be addressed by applying functional concurrent regression models (FCRMs) to longitudinal cognitive and neuroimaging data. The application of FCRMs in neuroimaging is illustrated with longitudinal neuroimaging and cognitive data acquired from a large cohort (n = 210) of healthy children, 2–48 months of age. With white matter myelination quantified by the myelin water fraction (MWF), an imaging metric derived from MRI scans, this methodology reveals an early period (200–500 days) during which whole-brain and regional white matter structure is positively associated with cognitive ability, whereas we found no such association for whole-brain white matter volume. Adjusting for baseline covariates, including socioeconomic status as measured by maternal education (SES-ME), infant feeding practice, gender, and birth weight, further reveals an association between SES-ME and cognitive development that increases with child age. These results shed new light on the emerging patterns of brain and cognitive development, indicating that FCRMs provide a useful tool for investigating these evolving relationships.
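As a rough illustration of the concurrent model behind an FCRM, the response at time t is related to the covariate at the same time through time-varying coefficients, Y_i(t) = beta0(t) + beta1(t) X_i(t) + error. The sketch below shows one simple way to estimate such coefficients from sparse, irregularly timed visits: pool all observations and solve a kernel-weighted least squares problem at each target age. It is not the estimator used in the study. The function name `concurrent_fit`, the Gaussian kernel, the bandwidth, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def concurrent_fit(t, x, y, t_grid, bandwidth):
    """Estimate beta0(t) and beta1(t) in y = beta0(t) + beta1(t) * x + noise.

    t, x, y are 1-D arrays pooling every visit of every subject, so each
    subject may contribute a different number of irregularly timed visits.
    Returns an array of shape (len(t_grid), 2) with columns [beta0, beta1].
    """
    t, x, y = (np.asarray(a, dtype=float) for a in (t, x, y))
    design = np.column_stack([np.ones_like(x), x])
    betas = np.empty((len(t_grid), 2))
    for k, t0 in enumerate(t_grid):
        # Gaussian kernel weights: visits near t0 count most (assumed kernel).
        w = np.exp(-0.5 * ((t - t0) / bandwidth) ** 2)
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on scaled rows.
        betas[k] = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)[0]
    return betas

# Simulated example (hypothetical data, not from the study).
rng = np.random.default_rng(0)
t_obs = rng.uniform(0.0, 1.0, size=600)      # irregular visit times
x_obs = rng.normal(size=600)                 # concurrent covariate
y_obs = 1.0 + (2.0 * t_obs) * x_obs + 0.1 * rng.normal(size=600)
grid = np.linspace(0.1, 0.9, 5)
beta_hat = concurrent_fit(t_obs, x_obs, y_obs, grid, bandwidth=0.1)
print(np.round(beta_hat, 2))  # slope column should rise roughly like 2 * t
```

The bandwidth controls how far in age the fit borrows information. In practice it would be chosen from the data, for example by cross-validation, and a real analysis would also need to account for within-subject correlation when computing uncertainty.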
A combination of advanced sequencing and mapping techniques is used to produce a reference genome of Aegilops tauschii, progenitor of the wheat D genome, providing a valuable resource for comparative genetic studies. Sequencing the genomes of crop plants provides useful resources for crop improvement and breeding. Jan Dvořák, Katrien Devos, Steven Salzberg and colleagues report a reference genome for Aegilops tauschii, the diploid progenitor of the D genome of hexaploid wheat. They use a combination of ordered-clone genome sequencing, whole-genome shotgun sequencing and BioNano optical genome mapping to assemble this large and highly repetitive genome. This provides a useful resource for comparative genomics studies of wheat. Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat [1] (Triticum aestivum, genomes AABBDD) and an important genetic resource for wheat [2,3,4]. The large size and highly repetitive nature of the Ae. tauschii genome have until now precluded the development of a reference-quality genome sequence [5]. Here we use an array of advanced technologies, including ordered-clone genome sequencing, whole-genome shotgun sequencing, and BioNano optical genome mapping, to generate a reference-quality genome sequence for Ae. tauschii ssp. strangulata accession AL8/78, which is closely related to the wheat D genome. We show that, compared to other sequenced plant genomes, including a much larger conifer genome, the Ae. tauschii genome contains unprecedented amounts of very similar repeated sequences. Our genome comparisons reveal that the Ae. tauschii genome has a greater number of dispersed duplicated genes than other sequenced genomes and that its chromosomes have been structurally evolving an order of magnitude faster than those of other grass genomes. The decay of colinearity with other grass genomes correlates with recombination rates along chromosomes.
We propose that the vast amounts of very similar repeated sequences cause frequent errors in recombination and lead to gene duplications and structural chromosome changes that drive fast genome evolution.
Data depth is a powerful nonparametric tool originally proposed to rank multivariate data from the center outward. In this context, one of the most archetypical depth notions is Tukey's halfspace depth. In the last few decades, notions of depth have also been proposed for functional data. However, Tukey's depth cannot be extended to handle functional data because of its degeneracy. Here, we propose a new halfspace depth for functional data that avoids degeneracy by regularization: the halfspace projection directions are constrained to have a small reproducing kernel Hilbert space norm. Desirable theoretical properties of the proposed depth, such as isometry invariance, maximality at center, monotonicity relative to a deepest point, upper semi-continuity, and consistency, are established. Moreover, the regularized halfspace depth can rank functional data with varying emphasis on shape or magnitude, depending on the regularization. A new outlier detection approach is also proposed, which is capable of detecting both shape and magnitude outliers. It is applicable to trajectories in L^2, a very general space of functions that includes non-smooth trajectories. Based on extensive numerical studies, our methods are shown to perform well in detecting outliers of different types. Three real data examples showcase the proposed depth notion.
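For orientation, the classical multivariate Tukey halfspace depth of a point is the smallest fraction of the sample lying in any closed halfspace whose boundary passes through that point. The sketch below approximates it by minimizing over a finite set of random unit directions, so it returns an upper bound on the exact sample depth. It is the multivariate notion only, not the regularized functional depth proposed here. The function name `halfspace_depth`, the number of directions, and the simulated data are assumptions for illustration.

```python
import numpy as np

def halfspace_depth(point, data, n_dirs=2000, seed=0):
    """Approximate Tukey halfspace depth of `point` within the rows of `data`.

    For each random unit direction u, compute the fraction of sample points
    whose projection onto u is at least that of `point`, then take the
    minimum over directions.
    """
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((n_dirs, data.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj_data = data @ dirs.T        # shape (n_samples, n_dirs)
    proj_point = point @ dirs.T      # shape (n_dirs,)
    fractions = (proj_data >= proj_point).mean(axis=0)
    return fractions.min()

# Simulated example (hypothetical data): central points are deep,
# remote points have depth near zero.
rng = np.random.default_rng(42)
sample = rng.standard_normal((500, 2))
print(halfspace_depth(np.zeros(2), sample))
print(halfspace_depth(np.array([3.0, 3.0]), sample))
```

With finitely many curves in an infinite-dimensional space, an unconstrained search over directions can typically find one that separates any given curve from the rest, which is the degeneracy mentioned above. Restricting the admissible directions is what prevents this.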
Many scientific applications and signal processing algorithms require complete satellite images. However, missing data are common in satellite images for reasons such as cloud cover and sensor-specific problems. This paper introduces a general spatiotemporal satellite image imputation method based on sparse functional data analytic techniques. To handle observations consisting of a few longitudinally repeated satellite images that are themselves partially observed and noise-contaminated, we propose a multistep imputation method that follows the best linear unbiased prediction principle and pools information across all available locations and time points. Theoretical properties are established for the proposed approach under a new observation model for functional data that covers the dataset in question as a special case. Analyses of Landsat data illustrate and validate our algorithm and show that the proposed method considerably outperforms existing algorithms in terms of prediction accuracy. An efficient implementation using R and Rcpp is made available in the R package stfit.
Data from NASA's Orbiting Carbon Observatory-2 (OCO-2) satellite are essential to many carbon management strategies. A retrieval algorithm is used to estimate CO2 concentration from the radiance data measured by OCO-2. However, due to factors such as cloud cover and cosmic rays, the spatial coverage of the retrieval algorithm is limited in some areas of critical importance for carbon cycle science. Mixed land/water pixels along the coastline are also excluded from the retrieval processing because valid ancillary variables, including land fraction, are lacking. We propose an approach to modeling spatial spectral data that addresses these two problems through radiance imputation and land fraction estimation. The spectral observations are modeled as spatially indexed functional data with footprint-specific parameters and are reduced to much lower dimensions by functional principal component analysis. The principal component scores are modeled as random fields to account for the spatial dependence, and the missing spectral observations are imputed by kriging the principal component scores. The proposed method is shown to impute spectral radiance with high accuracy for observations over the Pacific Ocean. An unmixing approach based on this model provides much more accurate land fraction estimates in our validation study along the Greek coastline.
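The two-stage idea described above (reduce each spectrum to a few principal component scores, then predict the scores at an unobserved location by kriging) can be sketched as follows. This is a minimal illustration under strong assumptions, not the model used in the study: the spectra are fully observed on a common grid, the footprint-specific parameters are ignored, each score is a zero-mean field with an exponential correlation of known range, and all function names and the simulated data are hypothetical.

```python
import numpy as np

def fpca(curves, n_components):
    """FPCA for curves sampled on a common dense grid (rows = curves)."""
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Right singular vectors act as discretized eigenfunctions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return mean, components, scores

def simple_krige(coords, values, target, corr_range=1.0, nugget=1e-8):
    """Zero-mean simple kriging with an exponential correlation (assumed)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    corr = np.exp(-dists / corr_range) + nugget * np.eye(len(coords))
    corr_target = np.exp(-np.linalg.norm(coords - target, axis=1) / corr_range)
    # Kriging weights depend only on correlations, not on the variance scale.
    return corr_target @ np.linalg.solve(corr, values)

def impute_curve(coords, curves, target, n_components=2, corr_range=1.0):
    """Predict the whole curve at `target` by kriging each FPCA score."""
    mean, components, scores = fpca(curves, n_components)
    predicted = np.array([
        simple_krige(coords, scores[:, k], target, corr_range)
        for k in range(n_components)
    ])
    return mean + predicted @ components

# Simulated example (hypothetical data): smooth spatial variation in two modes.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)                  # stand-in wavelength grid
locations = rng.uniform(0.0, 5.0, size=(60, 2))   # stand-in footprint coordinates
mode1, mode2 = np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid)
spectra = (1.0 + np.outer(np.sin(locations[:, 0]), mode1)
           + np.outer(np.cos(locations[:, 1]), mode2))
held_out = 0
keep = np.arange(len(locations)) != held_out
estimate = impute_curve(locations[keep], spectra[keep], locations[held_out])
print(np.max(np.abs(estimate - spectra[held_out])))  # imputation error
```

Kriging the scores rather than the raw radiances keeps the spatial model low-dimensional: only a handful of scalar fields need a covariance model, instead of one per wavelength.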