The largely successful method of training neural networks is to learn their weights using some variant of stochastic gradient descent (SGD). Here, we show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning. At the end of learning, we recover a single model by taking a spatial average in weight space. To avoid incurring increased computational costs, we investigate a family of low-dimensional late-phase weight models which interact multiplicatively with the remaining parameters. Our results show that augmenting standard models with late-phase weights improves generalization on established benchmarks such as CIFAR-10/100, ImageNet, and enwik8. These findings are complemented by a theoretical analysis of a noisy quadratic problem, which provides a simplified picture of the late phases of neural network learning.
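To make the core mechanism concrete, below is a minimal NumPy sketch (our own illustration, not the paper's implementation): a small set of multiplicative "late-phase" gains shares a common base weight matrix, and the ensemble is collapsed back into a single model by averaging the late-phase weights in weight space.

```python
import numpy as np

# Minimal sketch (not the authors' code): K low-dimensional multiplicative
# "late-phase" gains that share the remaining base weights.
rng = np.random.default_rng(0)

def forward(x, W, gain):
    # The base weights W are shared; the per-unit gain is the late-phase
    # parameter and interacts multiplicatively with the pre-activation.
    return np.tanh((x @ W) * gain)

d_in, d_out, K = 8, 4, 5
W = rng.normal(size=(d_in, d_out))                                 # shared base weights
gains = np.ones((K, d_out)) + 0.01 * rng.normal(size=(K, d_out))   # K late-phase members

# During the late phase of training, each step would update the shared W
# together with one sampled gain vector (training loop omitted here).

# At the end of learning, collapse the ensemble back to a single model
# by averaging the late-phase weights in weight space.
gain_avg = gains.mean(axis=0)
x = rng.normal(size=(2, d_in))
y = forward(x, W, gain_avg)
print(y.shape)  # (2, 4)
```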
Although neural networks are powerful function approximators, the underlying modelling assumptions ultimately define the likelihood and thus the hypothesis class they parameterize. In classification, these assumptions are minimal, as the commonly employed softmax is capable of representing any categorical distribution. In regression, however, restrictive assumptions are typically placed on the type of continuous distribution to be realized, as in the dominant choice of training via mean-squared error with its underlying Gaussianity assumption. Recent modelling advances make it possible to remain agnostic to the type of continuous distribution being modelled, granting regression the flexibility of classification models. While past studies stress the benefit of such flexible regression models in terms of performance, here we study the effect of the model choice on uncertainty estimation. We highlight that under model misspecification, aleatoric uncertainty is not properly captured, and that a Bayesian treatment of a misspecified model leads to unreliable epistemic uncertainty estimates. Overall, our study provides an overview of how modelling choices in regression may influence uncertainty estimation and thus any downstream decision-making process.
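The misspecification point can be illustrated with a small NumPy example (a toy illustration of ours, not taken from the paper): fitting a Gaussian, the implicit assumption behind mean-squared-error training, to skewed noise yields a symmetric predictive interval that misrepresents the aleatoric uncertainty.

```python
import numpy as np

# Illustrative sketch: a misspecified (Gaussian) likelihood fitted to
# asymmetric data distorts the reported aleatoric uncertainty.
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=10_000)   # true noise is skewed, not Gaussian

# Maximum-likelihood Gaussian fit (the assumption underlying MSE training).
mu_hat, sigma_hat = y.mean(), y.std()

# The Gaussian's symmetric 95% interval does not match the true quantiles of
# the data-generating distribution.
gauss_low, gauss_high = mu_hat - 1.96 * sigma_hat, mu_hat + 1.96 * sigma_hat
true_low, true_high = np.quantile(y, [0.025, 0.975])
print(f"Gaussian 95% interval:  ({gauss_low:.2f}, {gauss_high:.2f})")
print(f"Empirical 95% interval: ({true_low:.2f}, {true_high:.2f})")
```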
Pyramidal neurons of layer 5A are a major neocortical output type and are clearly distinguished from layer 5B pyramidal neurons with respect to morphology, in vivo firing patterns, and connectivity; yet knowledge of their dendritic properties is scant. We used a combination of whole-cell recordings and Ca(2+) imaging techniques in vitro to explore the specific dendritic signaling role of physiological action potential patterns recorded in vivo in layer 5A pyramidal neurons of the whisker-related 'barrel cortex'. Our data provide evidence that the temporal structure of physiological action potential patterns is crucial for effective invasion of the main apical dendrites up to the major branch point. Both the critical frequency enabling action potential trains to invade efficiently and the dendritic calcium profile changed during postnatal development. In contrast to the main apical dendrite, the more passive properties of the short basal and apical tuft dendrites prevented efficient back-propagation. Various Ca(2+) channel types contributed to the enhanced calcium signals during high-frequency firing activity, whereas A-type K(+) and BK(Ca) channels strongly suppressed them. Our data support models in which the interaction of synaptic input with action potential output is a function of the timing, rate, and pattern of action potentials, and of dendritic location.
Multiphoton microscopy (MPM) has emerged as one of the most powerful and widespread technologies for monitoring the activity of neuronal networks in awake, behaving animals over long periods of time. MPM development spanned decades and crucially depended on the concurrent improvement of calcium indicators that report neuronal activity, as well as on surgical protocols, head fixation approaches, and innovations in optics and microscopy technology. Here we review the last decade of MPM development and highlight how in vivo imaging has matured and diversified, making it now possible to concurrently monitor thousands of neurons across connected brain areas or, alternatively, small local networks with sampling rates in the kilohertz range. This review covers different laser scanning approaches, such as multibeam technologies, as well as recent developments for imaging deeper into neuronal tissue using new, long-wavelength laser sources. As future development will critically depend on our ability to resolve and discriminate individual neuronal spikes, we also describe a simple framework that allows quantitative comparisons between the reviewed MPM instruments. Finally, we provide our own opinion on how the most recent MPM developments can be leveraged at scale to enable the next generation of discoveries in brain function.
The ability to sequentially learn multiple tasks without forgetting is a key skill of biological brains, whereas it represents a major challenge for the field of deep learning. To avoid catastrophic forgetting, various continual learning (CL) approaches have been devised. However, these usually require discrete task boundaries. This requirement seems biologically implausible and often limits the application of CL methods in the real world, where tasks are not always well defined. Here, we take inspiration from neuroscience, where sparse, non-overlapping neuronal representations have been suggested to prevent catastrophic forgetting. As in the brain, we argue that these sparse representations should be chosen on the basis of feed-forward (stimulus-specific) as well as top-down (context-specific) information. To implement such selective sparsity, we use a biologically plausible form of hierarchical credit assignment known as Deep Feedback Control (DFC) and combine it with a winner-take-all sparsity mechanism. In addition to sparsity, we introduce lateral recurrent connections within each layer to further protect previously learned representations. We evaluate the new sparse-recurrent version of DFC on the split-MNIST computer vision benchmark and show that only the combination of sparsity and intra-layer recurrent connections improves CL performance with respect to standard backpropagation. Our method achieves performance similar to well-known CL methods, such as Elastic Weight Consolidation and Synaptic Intelligence, without requiring information about task boundaries. Overall, we showcase the idea of adopting computational principles from the brain to derive new, task-free learning algorithms for CL.
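For illustration, the winner-take-all ingredient can be sketched in a few lines of NumPy (a toy example of ours, not the DFC code): a k-winner-take-all mask keeps only the k most active units of a layer, producing the sparse, largely non-overlapping representations referred to above.

```python
import numpy as np

# Toy sketch of a k-winner-take-all mask: keep only the k largest
# activations per sample and zero out the rest.
def k_winner_take_all(h, k):
    mask = np.zeros_like(h)
    top_k = np.argsort(h, axis=-1)[..., -k:]       # indices of the k largest activations
    np.put_along_axis(mask, top_k, 1.0, axis=-1)
    return h * mask

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 10))                        # batch of 3 hidden-layer activations
h_sparse = k_winner_take_all(h, k=2)
print((h_sparse != 0).sum(axis=-1))                 # -> [2 2 2] active units per sample
```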
Continual Learning (CL) algorithms have recently received considerable attention as they attempt to overcome the need to train with an i.i.d. sample from some unknown target data distribution. Building on prior work, we study principled ways to tackle the CL problem by adopting a Bayesian perspective and focus on continually learning a task-specific posterior distribution via a shared meta-model, a task-conditioned hypernetwork. This approach, which we term Posterior-replay CL, is in sharp contrast to most Bayesian CL approaches, which focus on the recursive update of a single posterior distribution. The benefits of our approach are (1) an increased flexibility to model solutions in weight space and therewith less susceptibility to task dissimilarity, (2) access to principled task-specific predictive uncertainty estimates that can be used to infer task identity at test time and to detect task boundaries during training, and (3) the ability to revisit and update task-specific posteriors in a principled manner without requiring access to past data. The proposed framework is versatile, which we demonstrate using simple posterior approximations (such as Gaussians) as well as powerful, implicit distributions modelled via a neural network. We illustrate the conceptual advance of our framework on low-dimensional problems and show performance gains on computer vision benchmarks.
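The role of the shared meta-model can be sketched as follows (hypothetical shapes and a single linear hypernetwork, purely for illustration and not the paper's architecture): a learned task embedding is mapped to the mean and log-variance of a Gaussian posterior over the main network's weights, so each task has its own posterior while all posteriors are generated from the same shared parameters.

```python
import numpy as np

# Conceptual sketch of a task-conditioned hypernetwork producing a Gaussian
# posterior over the main network's weights (toy sizes, linear map only).
rng = np.random.default_rng(0)

n_main_params = 50          # size of the (toy) main network's weight vector
emb_dim = 8                 # dimensionality of each task embedding

# Shared hypernetwork weights.
H_mu = rng.normal(scale=0.1, size=(emb_dim, n_main_params))
H_logvar = rng.normal(scale=0.1, size=(emb_dim, n_main_params))

def posterior_params(task_emb):
    # One posterior per task, all generated by the same shared meta-model.
    return task_emb @ H_mu, task_emb @ H_logvar

def sample_main_weights(task_emb):
    # Draw a weight sample from the task-specific Gaussian posterior.
    mu, logvar = posterior_params(task_emb)
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

task_embeddings = [rng.normal(size=emb_dim) for _ in range(3)]   # three tasks
w_task0 = sample_main_weights(task_embeddings[0])
print(w_task0.shape)        # (50,) -- a weight sample for task 0
```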