Impaired electrical conduction has been shown to play an important role in the development of heart rhythm disorders. The ability to determine the conductivity is important for localizing the arrhythmogenic substrate that causes abnormalities in atrial tissue. In this work, we present an algorithm to estimate the conductivity from epicardial electrograms (EGMs) recorded with a high-resolution electrode array. With these arrays, the propagation of the extracellular potential of the cardiac tissue can be measured at multiple positions simultaneously. Given these data, it is in principle possible to estimate the tissue conductivity. However, this is an ill-posed problem due to the large number of unknown parameters in the electrophysiological data model. In this paper, we make use of an effective method called confirmatory factor analysis (CFA), which we apply to the cross-correlation matrix of the data to estimate the tissue conductivity. CFA comes with identifiability conditions that need to be satisfied to solve the problem, which is, in this case, the estimation of the tissue conductivity. These identifiability conditions can be used to find the relationship between the desired resolution and the required amount of data. Numerical experiments on simulated data demonstrate that the proposed method can localize conduction blocks in the tissue and can also estimate smooth variations in the conductivity. The conductivity values estimated from clinical data are in line with the values reported in the literature, and the EGMs reconstructed from the estimated parameters match well with the clinical EGMs.
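The method operates on the cross-correlation matrix of the multichannel recordings rather than on the raw signals. A minimal sketch of forming that matrix from an electrode-array data matrix (all sizes and the random stand-in data are hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 16, 1000                    # electrodes, time samples (hypothetical sizes)
X = rng.standard_normal((M, N))    # stand-in for the multichannel EGM data matrix

# Sample cross-correlation matrix across the electrodes; the CFA model is
# then fitted to this M x M matrix instead of to the raw recordings.
R = (X @ X.T) / N
```

Working with the M x M correlation matrix rather than the M x N data matrix is what makes the factor-analysis formulation, and its identifiability conditions, applicable.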
A pair of properly functioning ears is of great importance for many common daily tasks. While the perception of speech is perhaps the most obvious example, being alerted by certain sounds, such as (fire) alarms or traffic, can even be lifesaving. It is therefore not surprising that hearing impairment can have a strong (social) impact. Hearing problems typically affect the elderly. However, due to the growing popularity of portable music players, such as the iPod, permanent hearing loss has also become a problem among younger people.
This paper considers techniques for single-channel speech enhancement based on the discrete Fourier transform (DFT). Specifically, we derive minimum mean-square error (MMSE) estimators of speech DFT coefficient magnitudes as well as of complex-valued DFT coefficients based on two classes of generalized gamma distributions, under an additive Gaussian noise assumption. The resulting generalized DFT magnitude estimator has as a special case the existing scheme based on a Rayleigh speech prior, while the complex DFT estimators generalize existing schemes based on Gaussian, Laplacian, and Gamma speech priors. Extensive simulation experiments with speech signals degraded by various additive noise sources verify that significant improvements are possible with the more recent estimators based on super-Gaussian priors. The increase in perceptual evaluation of speech quality (PESQ) over the noisy signals is about 0.5 points for street noise and about 1 point for white noise, nearly independent of input signal-to-noise ratio (SNR). The assumptions made for deriving the complex DFT estimators are less accurate than those for the magnitude estimators, leading to a higher maximum achievable speech quality with the magnitude estimators.
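As a point of reference, the Gaussian-speech-prior special case of the complex DFT estimator reduces to the classical Wiener gain. A minimal sketch of that special case only (not of the generalized-gamma estimators derived in the paper):

```python
import numpy as np

def mmse_complex_dft(Y, sigma_s2, sigma_n2):
    """MMSE estimate of the complex speech DFT coefficient S from the noisy
    coefficient Y = S + N, assuming Gaussian priors for both speech and
    noise: the classical Wiener gain applied per frequency bin."""
    xi = sigma_s2 / sigma_n2          # a priori SNR in this bin
    return (xi / (1.0 + xi)) * Y      # Wiener gain times the noisy coefficient

# Example: at 0 dB a priori SNR the gain is 0.5
Y = np.array([1.0 + 1.0j, -2.0j])
S_hat = mmse_complex_dft(Y, sigma_s2=1.0, sigma_n2=1.0)
```

The super-Gaussian (e.g., Laplacian or Gamma) priors studied in the paper yield more aggressive suppression of small coefficients than this linear gain.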
Multi-microphone speech enhancement methods typically require a reference position with respect to which the target signal is estimated. Often, this reference position is arbitrarily chosen as one of the microphones. However, it has been shown that the choice of the reference microphone can have a significant impact on the final noise reduction performance. In this paper, we therefore theoretically analyze the impact of selecting a reference on the noise reduction performance, with near-end noise taken into account. Following the generalized eigenvalue decomposition (GEVD) based optimal variable span filtering framework, we find that for any linear beamformer, the output signal-to-noise ratio (SNR) taking both the near-end and far-end noise into account is reference dependent. Only when the near-end noise is neglected does the output SNR of rank-1 beamformers not depend on the reference position. However, in general, for rank-r beamformers with r > 1 (e.g., the multichannel Wiener filter) the performance does depend on the reference position. Based on this analysis, we propose an optimal algorithm for microphone reference selection that maximizes the output SNR. In addition, we propose a lower-complexity algorithm that is still optimal for rank-1 beamformers, but sub-optimal for general rank-r beamformers with r > 1. Experiments using a simulated microphone array validate the effectiveness of both proposed methods and show that, in terms of quality, several dB can be gained by selecting the proper reference microphone.
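The rank-1 claim (reference independence of the output SNR when near-end noise is neglected) can be checked numerically. A toy sketch with a random steering vector and noise covariance (all quantities hypothetical), where the MVDR beamformer is re-derived for each choice of reference microphone:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # acoustic transfer function (toy)
L = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Rn = L @ L.conj().T + np.eye(M)                            # far-end noise covariance (Hermitian PD)
phi_s = 2.0                                                # source PSD (toy)

def mvdr_output_snr(ref):
    # Rank-1 (MVDR) beamformer estimating the target as received at mic `ref`,
    # i.e., with distortionless constraint w^H a = a[ref]
    d = a.conj() @ np.linalg.solve(Rn, a)                  # a^H Rn^-1 a (real, positive)
    w = np.linalg.solve(Rn, a) * np.conj(a[ref]) / d
    sig = phi_s * abs(w.conj() @ a) ** 2                   # output target power
    noi = (w.conj() @ Rn @ w).real                         # output noise power
    return sig / noi

snrs = [mvdr_output_snr(r) for r in range(M)]              # identical for all references
```

Every reference yields the same output SNR, phi_s * a^H Rn^-1 a, which is exactly the rank-1 case analyzed in the paper; for rank-r filters with r > 1, or with near-end noise included, this invariance no longer holds.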
Existing objective speech-intelligibility measures are suitable for several types of degradation; however, they turn out to be less appropriate for methods where noisy speech is processed by a time-frequency (TF) weighting, e.g., noise reduction and speech separation. In this paper, we present an objective intelligibility measure which shows high correlation (rho = 0.95) with the intelligibility of both noisy and TF-weighted noisy speech. The proposed method shows significantly better performance than three other, more sophisticated, objective measures. Furthermore, it is based on an intermediate intelligibility measure for short-time (approximately 400 ms) TF-regions, and uses a simple DFT-based TF-decomposition. In addition, a free Matlab implementation is provided.
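The core of such a measure is a correlation coefficient computed over short segments and averaged. A simplified sketch of that short-time correlation step on envelope sequences (the segment length and frame rate are illustrative assumptions, not the paper's exact parameters, and the full method additionally includes a TF-decomposition and clipping):

```python
import numpy as np

def short_time_intelligibility(clean, degraded, seg_len=96):
    """Average short-time correlation between clean and degraded envelope
    sequences; seg_len = 96 frames would correspond to ~400 ms at a
    (hypothetical) frame rate of 240 Hz."""
    scores = []
    for i in range(0, len(clean) - seg_len + 1, seg_len):
        a = clean[i:i + seg_len] - np.mean(clean[i:i + seg_len])
        b = degraded[i:i + seg_len] - np.mean(degraded[i:i + seg_len])
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        if denom > 0:                               # skip silent segments
            scores.append(float(a @ b) / denom)
    return float(np.mean(scores))
```

Because each segment is normalized separately, the score is insensitive to slowly varying gains but penalizes TF-weightings that distort the short-time envelope structure.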
For multi-channel noise reduction algorithms like the minimum variance distortionless response (MVDR) beamformer, or the multi-channel Wiener filter, an estimate of the noise correlation matrix is needed. For its estimation, it is often proposed in the literature to use a voice activity detector (VAD). However, with a VAD, the estimated matrix can only be updated during speech absence. As a result, during speech presence the noise correlation matrix estimate does not follow changing noise fields with an appropriate accuracy. This effect is aggravated by the fact that voice activity detection in nonstationary noise is a rather difficult task, making false alarms likely. In this paper, we present and analyze an algorithm that estimates the noise correlation matrix without using a VAD. This algorithm is based on measuring the correlation of the noisy input and a noise reference, which can be obtained, e.g., by steering a null towards the target source. When applied in combination with an MVDR beamformer, it is shown that the proposed noise correlation matrix estimate results in a more accurate beamformer response, a larger signal-to-noise ratio improvement and a larger instrumentally predicted speech intelligibility when compared to competing algorithms such as the generalized sidelobe canceler, a VAD-based MVDR beamformer, and an MVDR based on the noisy correlation matrix.
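The key identity behind a VAD-free estimate is that the cross-correlation between the noisy input x = a·s + n and a target-blocking reference u = bᴴx is target-independent whenever bᴴa = 0. A toy numerical check with made-up covariances (not the paper's actual update rules):

```python
import numpy as np

M = 3
a = np.array([1.0, 1.0, 1.0])                # steering vector toward the target (toy)
b = np.array([1.0, -1.0, 0.0])               # blocking vector: b @ a == 0
phi_s = 4.0                                  # target PSD (toy)
Rn = np.array([[1.0, 0.3, 0.1],
               [0.3, 1.2, 0.2],
               [0.1, 0.2, 0.9]])             # true noise covariance (toy)

# Covariance of the noisy input x = a*s + n (s uncorrelated with n):
Rx = phi_s * np.outer(a, a) + Rn

# Cross-correlation between the input x and the noise reference u = b^H x.
# Because b blocks the target (b^H a = 0), the target term drops out:
r_xu = Rx @ b                                # equals Rn @ b exactly
```

Since r_xu carries information about Rn even while the target is active, the noise correlation matrix can be updated during speech presence, which is precisely what a VAD-gated estimator cannot do.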
Binaural hearing aids (HAs) can potentially perform advanced noise reduction algorithms, leading to an improvement over monaural/bilateral HAs. Given the limited transmission capacity between the HAs, and assuming knowledge of the complete joint statistics of the noisy signals, the optimal rate-constrained beamforming strategy is known from the literature. However, as these joint statistics are unknown in practice, sub-optimal strategies have been presented. In this paper, we present a unified framework to study the performance of these existing optimal and sub-optimal rate-constrained beamforming methods for binaural HAs. Moreover, we propose to use an asymmetric sequential coding scheme to estimate the joint statistics between the microphones in the two HAs. We show that, under certain assumptions, this leads to sub-optimal performance in one HA but makes it possible to obtain the truly optimal performance in the second HA. Based on the mean square error distortion measure, we evaluate the performance improvement between monaural beamforming (no communication) and the proposed scheme, as well as the optimal and the existing sub-optimal strategies, in terms of the information bit-rate. The results show that the proposed method outperforms existing practical approaches in most scenarios, especially at medium and high rates, without requiring prior knowledge of the joint statistics.
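The rate-distortion tradeoff underlying such schemes can be illustrated with the classical distortion-rate function of a memoryless Gaussian source, D(R) = σ²·2^(−2R). This is only a textbook sketch of how the MSE floor of a rate-constrained link decays with the bit rate, not the paper's actual beamforming or coding scheme:

```python
import numpy as np

def gaussian_distortion_rate(sigma2, rate_bits):
    """Distortion-rate function of a memoryless Gaussian source under MSE:
    D(R) = sigma^2 * 2^(-2R). Each extra bit per sample quarters the
    achievable distortion."""
    return sigma2 * 2.0 ** (-2.0 * rate_bits)

# Each additional bit reduces the MSE floor by a factor of 4 (~6 dB):
d0 = gaussian_distortion_rate(1.0, 0.0)   # 1.0, no communication
d1 = gaussian_distortion_rate(1.0, 1.0)   # 0.25
```

This 6 dB-per-bit behavior explains why the gap between the practical schemes and the jointly optimal strategy matters most at medium and high rates, where the distortion floor, rather than the rate itself, limits performance.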
Speech intelligibility prediction of noisy and processed noisy speech is important in a number of application domains such as hearing instruments and forensics. Most available objective intelligibility measures employ either a signal-to-noise ratio (SNR)-based or a correlation-based comparison between frequency bands of the clean and the processed speech. In this paper, we approach speech intelligibility prediction from the angle of information theory and show that an information-theoretic concept provides a unified viewpoint on both the SNR-based and the correlation-based approaches. Two objective intelligibility measures are introduced based on the estimated mutual information between the clean speech and the processed speech in the time and the frequency subband domain. Our proposed measures show high correlation with subjective intelligibility results (i.e., word-correct scores) and perform comparably to the short-time objective intelligibility (STOI) measure.
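For jointly Gaussian variables, the unification of the SNR-based and correlation-based views is explicit: for Y = X + N, the mutual information can be written either as I = ½·log(1 + SNR) or as I = −½·log(1 − ρ²), since ρ² = SNR/(1 + SNR). A minimal check of this equivalence (a textbook identity, used here only to illustrate the unified viewpoint):

```python
import numpy as np

def mi_from_snr(snr):
    # Mutual information (nats) between clean X and noisy Y = X + N,
    # with X and N independent Gaussians: the SNR-based form
    return 0.5 * np.log1p(snr)

def mi_from_corr(rho):
    # The same quantity expressed through the correlation coefficient
    # of the pair (X, Y): the correlation-based form
    return -0.5 * np.log(1.0 - rho ** 2)

snr = 3.0
rho = np.sqrt(snr / (1.0 + snr))   # rho^2 = SNR / (1 + SNR) for this model
```

Both expressions give the same number, so an SNR-based and a correlation-based band comparison estimate the same underlying information quantity under the Gaussian model.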
One of the biggest challenges in multimicrophone applications is the estimation of the parameters of the signal model, such as the power spectral densities (PSDs) of the sources, the early (relative) acoustic transfer functions of the sources with respect to the microphones, the PSD of the late reverberation, and the PSDs of the microphone self-noise. Typically, existing methods estimate subsets of the aforementioned parameters and assume some of the other parameters to be known a priori. This may result in inconsistent and inaccurate parameter estimates, and potentially degraded performance in the applications that use them. So far, there is no method to jointly estimate all the aforementioned parameters. In this paper, we propose a robust method for jointly estimating all the aforementioned parameters using confirmatory factor analysis. The estimation accuracy of the signal-model parameters thus obtained outperforms existing methods in most cases. We experimentally show significant performance gains in several multimicrophone applications over state-of-the-art methods.
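A toy single-source case shows why the factor-analysis structure makes joint estimation identifiable: in R = φ·aaᵀ + diag(ψ), the off-diagonal entries depend only on the source PSD φ, while the diagonal additionally carries the self-noise PSDs ψ. A closed-form sketch under that assumed structure (the actual method fits all parameters jointly via CFA; the numbers below are illustrative):

```python
import numpy as np

M = 4
a = np.array([1.0, 0.8, -0.5, 1.2])          # early ATF of the single source (toy)
phi = 2.0                                     # true source PSD
psi = np.array([0.10, 0.20, 0.15, 0.05])      # true microphone self-noise PSDs
R = phi * np.outer(a, a) + np.diag(psi)       # factor-model covariance

# Off-diagonal entries R[i, j] = phi * a[i] * a[j] (i != j) are untouched by
# the diagonal "uniqueness" term, so phi is identifiable from any such pair:
phi_hat = R[0, 1] / (a[0] * a[1])
# The self-noise PSDs then follow from the diagonal:
psi_hat = np.diag(R) - phi_hat * a ** 2
```

With multiple sources, late reverberation, and unknown ATFs, this simple separation no longer yields closed forms, which is exactly where the CFA identifiability conditions and joint fitting come in.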