Recent advances in understanding the auditory cortex

[version 1; peer review: 2 approved]
PUBLISHED 26 Sep 2018

Abstract

Our ability to make sense of the auditory world results from neural processing that begins in the ear, goes through multiple subcortical areas, and continues in the cortex. The specific contribution of the auditory cortex to this chain of processing is far from understood. Although many of the properties of neurons in the auditory cortex resemble those of subcortical neurons, they show somewhat more complex selectivity for sound features, which is likely to be important for the analysis of natural sounds, such as speech, in real-life listening conditions. Furthermore, recent work has shown that auditory cortical processing is highly context-dependent, integrates auditory inputs with other sensory and motor signals, depends on experience, and is shaped by cognitive demands, such as attention. Thus, in addition to being the locus for more complex sound selectivity, the auditory cortex is increasingly understood to be an integral part of the network of brain regions responsible for prediction, auditory perceptual decision-making, and learning. In this review, we focus on three key areas that are contributing to this understanding: the sound features that are preferentially represented by cortical neurons, the spatial organization of those preferences, and the cognitive roles of the auditory cortex.

Keywords

auditory cortex, receptive field, model, map, cognition, plasticity

Introduction

Our seemingly effortless ability to localize, distinguish, and recognize a vast array of natural sounds, including speech and music, results from the neural processing that begins in the inner ear and continues through a complex sequence of subcortical and cortical brain areas. Many aspects of hearing—such as computation of the cues that enable sound sources to be localized or their pitch to be extracted—rely on the processing that takes place in the brainstem and other subcortical structures. It is widely thought to be the case, however, that the auditory cortex plays a critical role in the perception of complex sounds. Although this partly reflects the emergence of response properties, such as sensitivity to combinations of sound features, it is striking how similar many of the properties of neurons in the primary auditory cortex (A1) are to those of subcortical neurons1. Equally important is the growing realization that auditory cortical processing, in particular, is highly context-dependent and integrates auditory (and other sensory) inputs with information about an individual’s current internal state, including their arousal level, focus of attention, and motor planning, as well as their past experience2. The auditory cortex is therefore an integral part of the network of brain regions responsible for generating meaning from sounds, auditory perceptual decision-making, and learning.

In this review, we focus on three key areas of auditory cortical processing where there has been progress in the last few years. First, we consider what sound features A1 neurons represent, highlighting recent attempts to improve receptive field models that can predict the responses of neurons to natural sounds. Second, we examine the distribution of those stimulus preferences within A1 and across the hierarchy of auditory cortical areas, focusing on the extent to which this conforms to canonical principles of columnar organization and functional specialization that are the hallmark of visual and somatosensory processing. Finally, we look at the cognitive role of auditory cortex, highlighting its involvement in prediction, learning, and decision making.

Spectrotemporal receptive fields

The tuning properties of sensory neurons are defined by their receptive fields, which describe the stimulus features to which they are most responsive. Reflecting the spectral analysis that begins in the inner ear, auditory neurons are most commonly characterized by their sensitivity to sound frequency. The spectrotemporal receptive field3,4 (STRF) is the dominant computational tool for characterizing the responses of auditory neurons. Most widely used as part of a linear-nonlinear (LN) model, an STRF comprises a set of coefficients that describe how the response of the neuron at each moment in time can be modelled as a linear weighted sum of the recent history of the stimulus power in different spectral channels. Owing to their simplicity, STRFs can be reliably estimated by using randomly chosen structured stimuli, such as ripples5–7, and relatively small amounts of data. They can then be used, in combination with a static output nonlinearity, to describe and predict neural responses to arbitrary sounds8, making them invaluable tools in understanding the computational roles of single neurons and whole neuronal populations. However, STRFs do not accurately capture the full complexity of the behavior of auditory neurons; for example, multiple studies have shown that there are differences between STRFs estimated by using standard synthetic stimuli and those estimated by using natural sounds9–11, suggesting that these models systematically fail to fully describe neural responses to complex stimuli.
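To make the LN formulation concrete, the following is a minimal NumPy sketch of how an STRF-based LN model generates a predicted firing rate: the linear stage is a weighted sum of the recent stimulus history in each spectral channel, followed by a static output nonlinearity (here a simple rectifier). All array shapes, variable names, and the toy stimulus are illustrative rather than taken from any of the studies cited above.

```python
import numpy as np

def ln_response(spectrogram, strf, n_lags, nonlinearity=lambda x: np.maximum(x, 0.0)):
    """Predict a neuron's firing rate with a linear-nonlinear (LN) model.

    spectrogram : (n_freqs, n_times) stimulus power in each spectral channel over time
    strf        : (n_freqs, n_lags) spectrotemporal weighting coefficients
    n_lags      : number of time bins of stimulus history spanned by the STRF
    """
    n_freqs, n_times = spectrogram.shape
    rate = np.zeros(n_times)
    for t in range(n_lags, n_times):
        history = spectrogram[:, t - n_lags:t]     # recent stimulus history
        linear_drive = np.sum(strf * history)      # linear stage: weighted sum
        rate[t] = nonlinearity(linear_drive)       # static output nonlinearity
    return rate

# toy usage with a random stimulus and a random STRF
rng = np.random.default_rng(0)
spec = rng.random((32, 1000))         # 32 frequency channels, 1000 time bins
strf = rng.standard_normal((32, 20))  # 20 bins of stimulus history
predicted_rate = ln_response(spec, strf, n_lags=20)
```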

Multiple stimulus dimensions

Standard STRF models contain a single receptive field—one set of spectrotemporal weighting coefficients that describes a single time-varying spectral pattern to which the neuron is sensitive. There is no reason, however, to believe that the spectrotemporal selectivity of auditory neurons is so simple. Artificial neural networks are built on the principle that arbitrarily complex computations can be constructed from simple neuron-like computing elements, and cortical neurons are likely to have similarly complex selectivity based on nonlinear combination of the responses of afferent neurons with simpler selectivity. Capturing this complexity requires STRF models that nonlinearly combine the responses of neurons to multiple stimulus dimensions.

With a maximally informative dimensions approach, it has been found that multiple stimulus dimensions are required to describe the responses of neurons in A112,13, whereas a similar approach shows that a single dimension suffices to describe most neurons in the inferior colliculus (IC) in the midbrain14. This suggests that neuronal complexity is higher in the cortex and that this complexity can be captured by nonlinear combination of the responses of multiple simpler units, a finding that also applies to high-level neurons in songbirds15,16. Two recent articles have shown that neural networks are an effective way to model these interactions. Harper et al.17 used a two-layer perceptron to describe the behavior of neurons in ferret auditory cortex, and Kozlov and Gentner16 used a similar network to describe high-level auditory neurons in the starling. To achieve this, both studies used neural networks of intermediate complexity, thus avoiding an explosion in the number of parameters that must be fitted. They also used careful regularization, in which the model is optimized subject to a penalty on parameter values. Regularization effectively reduces the number of parameters that must be fitted, meaning that models can be fitted accurately using smaller datasets18. This approach enabled both studies to show that auditory neurons can be better described by a model that takes into account the tuning of multiple afferent neurons (hidden units), suggesting that the selectivity of individual neurons is the result of a combination of information about multiple stimulus dimensions.
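The sketch below illustrates the kind of two-layer architecture these studies point to: several STRF-like hidden units whose outputs are combined through nonlinearities, fitted with a weight-decay (L2) penalty. It is a toy example in the spirit of these network models, not a reproduction of any published architecture; the layer sizes, nonlinearities, and penalty are illustrative assumptions.

```python
import numpy as np

def multi_filter_response(history, hidden_strfs, hidden_bias, output_w, output_bias):
    """Two-stage (LN-LN) sketch: several STRF-like hidden units combined nonlinearly.

    history      : (n_freqs, n_lags) recent stimulus history at one time point
    hidden_strfs : (n_hidden, n_freqs, n_lags) one STRF-like filter per hidden unit
    """
    drive = np.tensordot(hidden_strfs, history, axes=([1, 2], [0, 1])) + hidden_bias
    hidden = 1.0 / (1.0 + np.exp(-drive))                    # sigmoid hidden nonlinearity
    return np.maximum(output_w @ hidden + output_bias, 0.0)  # rectified output firing rate

def regularized_loss(predicted, observed, weights, penalty=1e-3):
    """Mean squared error plus an L2 (weight-decay) penalty on the parameters."""
    mse = np.mean((np.asarray(predicted) - np.asarray(observed)) ** 2)
    return mse + penalty * sum(np.sum(w ** 2) for w in weights)
```

In practice, such a model would be fitted by gradient descent on the penalized loss, with the penalty limiting the effective number of parameters so that the network can be estimated from realistic amounts of neural data.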

Other studies using synthetic stimuli also indicate that cortical neurons integrate different sound features, including frequency, spectral bandwidth, level, amplitude modulation over time, and spatial location19,20. By using an online optimization procedure (see earlier work by deCharms et al.21) to dynamically generate sounds that varied along these dimensions, the authors of these studies were able to efficiently search stimulus space and constrain their model within a few minutes of stimulus presentation. They also found that individual neurons multiplex information about multiple stimulus dimensions and that neuronal responses to multidimensional stimuli could not be predicted simply from responses to low-dimensional stimuli.

Taken together, these studies suggest that the successor of the STRF will be a model that allows multiple STRF-like elements to be combined in nonlinear ways. The development of such models will require large datasets that include neuronal responses to complex, natural stimuli.

Dynamic changes in tuning

Static multidimensional models capture complex selectivity for the recent spectral structure of sounds. However, several studies have highlighted the importance of dynamic changes in the response properties of auditory cortical neurons22–26. Most real-life soundscapes are characterized by constant changes in the statistics of the sounds that reach the ears. Neurons rapidly adapt to stimulus statistics—for example, stimulus probability, contrast, and correlation structure—and adjust their sensitivity in response to behavioral requirements27–29. This can produce representations that are invariant to some changes in sound features and robustly selective for other features30.

Several authors have incorporated such adaptation by adding a nonlinear input stage to the standard LN model25,31–33. Willmore et al.33 explicitly incorporated the behavior of afferent neurons into a model of A1 spectrotemporal tuning. In both the auditory nerve34 and IC35, dynamic coding occurs in the form of adaptation to mean sound level. When the mean sound level is high, neurons shift their dynamic ranges upwards, so that neuronal responses are relatively invariant to changes in background level. Because these structures are precursors of the auditory cortex, it makes sense to incorporate IC adaptation into the input stage of a model of auditory cortex, and Willmore et al.33 found that this improved the performance of their model of cortical neurons.

Neurons in the A1 of the ferret show compensatory adaptation to sound contrast (that is, the variance of the sound level distribution)24,36. When the contrast of the input to a given neuron is high, the gain of the neuron is reduced, thereby making it relatively insensitive to changes in sound level. When the contrast of the input is low, the gain of the neuron rises, increasing its sensitivity. This adaptive coding therefore tends to compensate for changes in sound contrast. Adaptation to the mean and variance of sound level is specifically beneficial for the representation of dynamic sounds against a background of constant noise. If the noise is statistically stationary (that is, the mean and variance are fixed), then the effect of adaptation is to minimize the responses of cortical neurons to the background. Thus, the neuronal responses depend mainly on the dynamic foreground sound and are relatively invariant to its contrast. This enables cortical neurons to represent complex sounds using a code that is relatively robust to the presence of background noise. Similar noise robustness has been shown independently in the songbird37 and in decoding studies applied to populations of mammalian cortical neural responses30,38.
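A minimal sketch of this kind of adaptive coding is given below: the recent mean of the input is subtracted (adaptation to mean sound level) and the residual is scaled by the inverse of the recent standard deviation (contrast gain control). The windowed running estimates and all constants are illustrative assumptions, not parameters from the studies cited above.

```python
import numpy as np

def adaptive_code(levels, window=50, eps=1e-6):
    """Sketch of mean- and contrast-adaptive coding of sound level.

    levels : 1-D array of sound level over time (e.g. dB in one frequency channel)
    Returns a normalized signal: the recent mean is subtracted (level adaptation)
    and the residual is divided by the recent standard deviation (contrast gain).
    """
    levels = np.asarray(levels, dtype=float)
    adapted = np.zeros_like(levels)
    for t in range(len(levels)):
        recent = levels[max(0, t - window):t + 1]
        mean, contrast = recent.mean(), recent.std()
        # high recent contrast -> low gain; low recent contrast -> high gain
        adapted[t] = (levels[t] - mean) / (contrast + eps)
    return adapted
```

Because a statistically stationary background contributes a constant mean and variance, its influence on the adapted output is largely removed, leaving the response dominated by the dynamic foreground.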

Normative models

To fully understand how sounds are represented by neurons in the auditory cortex, we need to ask why this particular representation has been selected, whether by evolution or by developmental processes. One approach to addressing this question is to build normative computational models. Normative models embody certain constraints that are hypothesized to be important for determining the neural representation. By building normative models and comparing their representations with real physiological representations, it is possible to test hypotheses about which constraints determine the structure of the neural codes used in the brain. For example, Olshausen and Field39 built a neural network and trained it to produce a sparse, generative representation of natural visual scenes. Once trained, the network’s units exhibited receptive fields with many similarities to those found in neurons in the primary visual cortex (V1), suggesting that V1 itself may be optimized to produce sparse representations of natural visual scenes.

Carlson et al.40 trained a neural network so that it produced a sparse, generative representation of natural auditory stimuli. They found that the model learned STRFs whose spectral structure resembled that of real neurons in the auditory system, suggesting that the auditory system may be optimized for sparse representation. Carlin and Elhilali41 showed that optimizing model neural responses to produce a code based on sustained firing rates also revealed feature preferences similar to those of auditory neurons. Recently, Singer et al.42 observed that the temporal structure of the STRFs in sparse coding models (which generally have envelopes that are symmetrical in time) is very different from those of real neurons in both V1 and A1, where neurons are typically most selective for recent stimulation and have envelopes that decay into the past. These authors trained networks to predict the immediate future of either natural visual or auditory stimuli and found that the networks developed receptive fields that closely matched those of real visual and auditory cortical neurons, including their asymmetric temporal envelopes (Figure 1). This suggests that neurons in these areas may be optimized to predict the immediate future of sensory stimulation. Such a representation may be advantageous for acting efficiently in the world using neurons, which introduce inevitable delays in information transmission.
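The sparse-coding objective at the heart of these normative models can be written as minimizing reconstruction error plus a sparsity penalty on the activations. The sketch below infers sparse codes for a fixed dictionary using a simple iterative shrinkage scheme; it is in the spirit of Olshausen and Field's approach rather than their actual implementation, the dictionary-learning step is omitted for brevity, and all parameter values are illustrative.

```python
import numpy as np

def sparse_codes(patches, dictionary, sparsity=0.1, steps=200, lr=0.01):
    """Infer coefficients a minimizing ||x - D a||^2 + sparsity * |a|_1.

    patches    : (n_samples, n_inputs) image patches or cochleagram snippets
    dictionary : (n_inputs, n_units) basis functions (the model 'receptive fields')
    """
    codes = np.zeros((patches.shape[0], dictionary.shape[1]))
    for _ in range(steps):
        residual = patches - codes @ dictionary.T
        grad = -residual @ dictionary            # gradient of the reconstruction error
        codes = codes - lr * grad
        # soft thresholding: proximal step that enforces the L1 sparsity penalty
        codes = np.sign(codes) * np.maximum(np.abs(codes) - lr * sparsity, 0.0)
    return codes
```

Training the dictionary itself (by gradient descent on the same objective) is what yields basis functions resembling V1 or auditory receptive fields.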


Figure 1. Neuronal selectivity in the auditory cortex is optimized to represent sound features in the recent sensory past that best predict immediate future inputs.

A feedforward artificial neural network was trained to predict the immediate future of natural sounds (represented as cochleagrams describing the spectral content over time) from their recent past. This temporal prediction model developed spectrotemporal receptive fields that closely matched those of real auditory cortical neurons. A similar correspondence was found between the receptive fields produced when the model was trained to predict the next few video frames in clips of natural scenes and the receptive field properties of neurons in primary visual cortex. Model nomenclature: s_j, hidden unit output; u_i, input (the past); v_k, target output (the true future); v̂_k, output (the predicted future); w_ji, input weights (analogous to cortical receptive fields); w_kj, output weights. Reprinted from Singer et al.42.
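A compact sketch of the forward pass and training objective of such a temporal prediction network, using the nomenclature of the figure caption, is given below. The hidden nonlinearity, layer sizes, and optimization details are assumptions for illustration and may differ from the published model.

```python
import numpy as np

def predict_future(u, w_ji, w_kj):
    """One forward pass of a temporal prediction network (cf. Figure 1).

    u    : flattened recent past of the cochleagram (inputs u_i)
    w_ji : (n_hidden, n_inputs) input weights, analogous to cortical receptive fields
    w_kj : (n_outputs, n_hidden) output weights
    Returns v_hat, the predicted immediate future of the stimulus.
    """
    s = np.tanh(w_ji @ u)    # hidden unit outputs s_j
    return w_kj @ s          # predicted future v_hat_k

def prediction_loss(v_hat, v_true):
    """Training objective: mean squared error between predicted and true future."""
    return np.mean((v_hat - v_true) ** 2)
```

After training on natural sounds, the rows of w_ji play the role of receptive fields and can be compared directly with measured STRFs.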

As discussed above, STRFs do not provide a complete description of the stimulus preferences of auditory neurons. It is therefore important to extend the normative approach into more complex models in order to generate hypotheses about what nonlinear combinations of features we might expect to find in the auditory system. To address this question, Młynarski and McDermott43 trained a hierarchical, generative neural network to represent natural sounds and found that some units in the second layer of the model developed sensitivity to combinations of elementary sound features but that others developed opponency between sound features. By providing insights into how different parameters interact to determine their response properties, these modelling approaches can generate testable hypotheses about the possible roles played by auditory neurons at different levels of the processing hierarchy.

Origin of auditory cortical tuning properties

The receptive field properties of auditory cortical neurons are derived from their ascending thalamic inputs and are shaped by recurrent synaptic interactions between excitatory neurons as well as by local inhibitory inputs. The importance of ascending inputs in determining the functional organization of the auditory pathway is illustrated by the presence of tonotopic maps at each processing level, which have their origin in the decomposition of sounds into their individual frequencies along the length of the cochlea in the inner ear. The presence of tonotopic maps in the cortex can be demonstrated using a variety of methods, including functional brain imaging in humans44. However, this is a relatively coarse approach, and the much finer spatial resolution provided by in vivo two-photon calcium imaging suggests that the spatial organization varies across the layers of A1 and that frequency tuning in the main thalamorecipient layer 4 is more homogeneous than in the upper cortical layers45. Whilst this laminar transformation is consistent with a possible integration of inputs in the superficial layers, which might contribute to the emergence of sensitivity to multiple sound features, it turns out that the thalamic input map to A1 is surprisingly imprecise46. The significance of this for the generation of cortical response properties remains to be investigated, but heterogeneity in the frequency selectivity of ascending inputs might provide a substrate for the contextual modulation of cortical response properties.

In the last few years, progress has been made in determining how the local circuitry of excitatory and inhibitory neurons contributes to auditory cortical response properties47–50. For example, both parvalbumin-expressing and somatostatin-expressing inhibitory interneurons in layers 2/3 have been implicated in regulating the frequency selectivity of A1 neurons48,50, while manipulating the activity of parvalbumin-expressing neurons, the most common type of cortical interneuron, has been shown to alter the behavioral performance of mice on tasks that rely on their ability to discriminate different sound frequencies49. Furthermore, local inhibitory interneurons are involved in mediating the way A1 responses change according to the recent history of stimulation51–54, and it is likely that dynamic interactions between different cell types are responsible for much of the context-dependent modulation that characterizes the way sounds are processed in the auditory cortex. Thus, in addition to being present at subcortical levels, adaptive coding of auditory information can arise de novo from local circuit interactions in the cortex.

Beyond tonotopy

A tonotopic representation is defined by a systematic variation in the frequency selectivity of the neurons from low to high values. A long-standing and controversial question regarding the functional organization of A1 and other brain areas that contain frequency maps is how neuronal sensitivity to other stimulus features is represented across the isofrequency axes. In the visual cortex of carnivores and primates, neuronal preferences for different stimulus features—orientation, spatial frequency and eye of stimulation—are overlaid so that they are all represented at each location within a two-dimensional map of the visual field55. In a similar vein, functional magnetic resonance imaging (fMRI) studies in humans56 and macaque monkeys57 have revealed that core auditory fields, including A1, contain a gradient in sensitivity to the rate of amplitude modulation, which is arranged orthogonally to the tonotopic map. In other words, a temporal map seems to exist within the cortical representation of each sound frequency.

By contrast, analysis of the activity of large samples of neurons within the upper layers of mouse auditory cortex has failed to provide evidence for such topography. Instead, neurons displaying similar frequency tuning bandwidth58, preferred sound level58, and sensitivity to changes in sound level59 or to differences in level between the ears60—an important cue to localizing sound sources—form intermingled clusters across the auditory cortex. Within the isofrequency domain, clearer evidence for segregated processing modules with distinct spectral integration properties or preferences for other sound features has been obtained in monkeys61 and cats62. It therefore remains possible that the more diffuse spatial arrangement observed in mice may be a general property of sensory cortex in this species63 or more generally of animals with relatively small brains. Although multiple, interleaved processing modules for behaviorally relevant sound features may exist to varying degrees in different species, our understanding of how they map onto the tonotopic organization of A1 or other auditory areas is far from complete.

Vertical clustering of neurons with similar tuning properties also represents a potentially important aspect of the functional organization of the cerebral cortex. Compared with other sensory systems, there has been less focus on the columnar organization of auditory cortex. As previously mentioned, there is evidence for layer-specific differences in the frequency representation in auditory cortex45 and this has been shown to extend to other response properties64,65. At the same time, ensemble activity within putative cortical columns may play a role in representing auditory information66,67.

Representing sound features in multiple cortical areas

As with other sensory modalities, the auditory cortex is subdivided into a number of separate areas, which can be distinguished on the basis of their connections, response properties, and (to some extent) the auditory perceptual deficits that result from localized damage. That these areas are organized hierarchically is clearly illustrated by neuroimaging evidence in humans for the involvement of sequential cortical regions in transforming the spectral features of speech into its semantic content68,69. Furthermore, by training a neural network model on speech and music tasks, Kell et al.70 found that the best performing model architecture separated speech and music into separate pathways, in keeping with fMRI responses in human non-A171. Neuroimaging evidence in monkeys also suggests that analysis of auditory motion may involve different pathways from those engaged during the processing of static spatial information72. In both human and non-human primates, the concept of distinct ventral and dorsal processing streams is widely accepted. These were initially assigned to “what” and “where” functions, respectively, but their precise functions, and the extent to which they interact, continue to be debated73–76.

Evidence for a division of labor among the multiple areas that comprise the auditory cortex extends to other species, as illustrated, for example, by the finding that anatomically segregated regions of mouse auditory cortex can be distinguished by differences in the frequency selectivity of the neurons and in their sensitivity to frequency-modulated sweeps77. Nevertheless, the question of whether a particular aspect of auditory perception is localized to one or more of those areas is, in some ways, ill posed. Given the extensive subcortical processing that takes place and the growing evidence for multidimensional receptive field properties in early cortical areas19,20,78, it is likely that neurons convey information about multiple sound attributes. Indeed, recent neuroimaging studies have demonstrated widespread areas of activation in response to variations in spatial79 or non-spatial80 parameters, adding to earlier electrophysiological recordings which indicated that sensitivity to pitch, timbre, and location is distributed across several cortical areas81.

Decoding auditory cortical activity

Even where different sound features are apparently encoded by the same cortical regions, differences in the evoked activity patterns may enable them to be distinguished. For example, auditory cortical neurons can unambiguously represent more than one stimulus parameter by independently modulating their spike rates within distinct time windows78. Furthermore, a recent fMRI study demonstrated that although variations in pitch or timbre—two key aspects of sound identity—activate largely overlapping cortical areas, they can be distinguished by using multivoxel pattern analysis80. Studies like these hint at how the brain might solve the problem of perceptual invariance—recognizing, for example, the melody of a familiar tune even when it is played on different musical instruments irrespective of where they are located.
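A minimal sketch of multivoxel pattern analysis is shown below: a linear classifier is cross-validated on trial-by-trial voxel response patterns, and above-chance decoding indicates that overlapping activation nonetheless carries information that distinguishes the two sound attributes. The data, labels, and region of interest are placeholders (random numbers here), and scikit-learn is assumed to be available; this is not the analysis pipeline of the cited study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

# voxel_patterns: (n_trials, n_voxels) responses from an auditory region of interest
# labels:         (n_trials,) condition per trial, e.g. 0 = pitch change, 1 = timbre change
rng = np.random.default_rng(0)
voxel_patterns = rng.standard_normal((80, 500))   # placeholder data
labels = np.repeat([0, 1], 40)                    # placeholder condition labels

classifier = LinearSVC(C=1.0, max_iter=10000)
accuracy = cross_val_score(classifier, voxel_patterns, labels, cv=5).mean()
print(f"cross-validated decoding accuracy: {accuracy:.2f}")
```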

This type of approach is part of a growing trend to investigate the decoding or reconstruction of sound features from the measured responses of populations of neurons (or other multidimensional measures of brain activity, such as fMRI voxel responses82 or electroencephalography signals). Indeed, Yildiz et al.83 found that a normative decoding model of auditory cortex described neuronal response dynamics better than some classic encoding models that define the relationship between the stimulus and the neural response. It is therefore possible that the responses of populations of cortical neurons can be more accurately understood in terms of decoding rather than encoding.

The application of decoding techniques has provided valuable insights into how neural representations change under different sensory conditions, such as in the presence of background noise30,38,84,85, or when specific stimuli are selectively attended86–89. Such techniques can also help to identify the size of the neural populations within the cortex from which auditory information needs to be read out in order to account for behavior90–92 and the way in which stimulus features are represented there. For example, evidence from electrophysiological93,94, two-photon calcium imaging60, and fMRI95,96 studies is all consistent with an opponent-channel model in which the location of sounds in the horizontal plane, at least based on interaural level differences, is decoded from the relative activity of contralaterally and ipsilaterally tuned neurons within each hemisphere. Furthermore, based on the accuracy with which spectrotemporal modulations in a range of natural sounds could be reconstructed from high-resolution fMRI signals, Santoro et al.97 concluded that even early stages of the human auditory cortex may be optimized for processing speech and voices.
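The opponent-channel readout mentioned above can be illustrated with a very small sketch: the lateral position of a sound is estimated from the normalized difference between the summed activity of contralaterally and ipsilaterally tuned populations in one hemisphere. The decoders used in the cited studies differ in detail; this is only the core idea.

```python
def opponent_channel_readout(rate_contra, rate_ipsi):
    """Decode lateral position from two opponent population responses.

    rate_contra, rate_ipsi : summed firing rates of contralaterally and
                             ipsilaterally tuned neurons in one hemisphere.
    Returns a signed estimate in [-1, 1]: positive values indicate a source on
    the contralateral side, negative values a source on the ipsilateral side.
    """
    total = rate_contra + rate_ipsi
    return (rate_contra - rate_ipsi) / total if total > 0 else 0.0
```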

Auditory scene analysis

Separating a sound source of interest from a dynamic mixture of potentially competing sounds is a fundamental function of the auditory system. In order to perceptually isolate a sound source as a distinct auditory object98, its constituent acoustic features, such as pitch, timbre, intensity, and location, need to be individually analyzed and then grouped together to form a coherent perceptual representation. Although neural computations underlying scene analysis have been demonstrated at different subcortical levels99–101, the cortex is the hub where stimulus-driven feature segregation and top-down attentional selection mechanisms converge102. Indeed, several recent studies suggest a critical role for the auditory cortex in forming stable perceptual representations based on grouping and segregation of spectral, spatial, and temporal regularities in the acoustic environment103–115.

Whilst cortical spike-based accounts of auditory segregation of narrowband signals rely on mechanisms of tonotopicity, adaptation, and forward suppression116, recent work has highlighted the importance of temporal coherence113,117–119 and oscillatory sampling120–122 in driving dynamic segregation of attended broadband stimuli. Results from human electrocorticography experiments suggest that low-level auditory cortex encodes spectrotemporal features of selectively attended speech86,87. Speech representations derived from high-gamma (75- to 150-Hz) local field potential activity in non-A1 of subjects listening to two speakers talking simultaneously are dominated by the spectrotemporal features of whichever speaker attention is drawn to86. Moreover, the attention-modulated responses were found to predict the accuracy with which target words were correctly detected in the two-speaker speech mixture, revealing a neural correlate of “cocktail party” listening86. Attentional modulation of the representation of multiple speech sources increases along the cortical hierarchy114, and a similar finding has been reported for the emergence of task-dependent spectrotemporal tuning in neurons in ferret cortex123.
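Stimulus-reconstruction analyses of this kind are typically based on a lagged linear decoder that maps neural activity back onto the attended speech envelope. The sketch below fits such a decoder with ridge-regularized least squares; the lag range, regularization, and array names are illustrative assumptions rather than the pipeline of any specific study.

```python
import numpy as np

def fit_envelope_decoder(neural, envelope, n_lags=20, ridge=1.0):
    """Fit a lagged linear decoder mapping neural activity to a speech envelope.

    neural   : (n_times, n_channels) e.g. high-gamma power on each electrode
    envelope : (n_times,) amplitude envelope of the attended speech stream
    Returns the decoder weights; applying them to held-out data reconstructs
    an envelope that can be compared with each speaker in the mixture.
    """
    n_times, n_channels = neural.shape
    X = np.zeros((n_times - n_lags, n_channels * n_lags))   # lagged design matrix
    for t in range(n_lags, n_times):
        X[t - n_lags] = neural[t - n_lags:t].ravel()
    y = envelope[n_lags:]
    # ridge-regularized least-squares solution
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)
```

A reconstruction that correlates more strongly with the attended than the ignored speaker provides a quantitative index of attentional selection.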

Prediction

The predictability of sounds plays an important role in processing natural sound sequences like speech and music124. A prominent generative model of perception is based on the concept of predictive coding; that is, the brain learns to minimize the prediction error between internal predictions of sensory input and the external sensory input125,126. This model has been successfully applied to explain the encoding of pitch in a hierarchical fashion in distributed areas of the auditory cortex127. Furthermore, distinct oscillatory signatures for the key variables of predictive coding models, including surprise, prediction error, prediction change, and prediction precision, have all been demonstrated in the auditory cortex128. Most of the work in this area at the level of individual neurons has focused on stimulus-specific adaptation, a phenomenon in which particular sounds elicit stronger responses when they are rarely encountered than when they are common22. Recent work has indicated that A1 neurons exhibit deviance or surprise sensitivity129 and encode prediction error124 and that prediction error signals increase along the auditory pathway130. Furthermore, the auditory cortex encodes predictions for not only “what” kind of sensory event is to occur but also “when” it may occur131.
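A toy illustration of the prediction-error idea in an oddball sequence is sketched below: an internal prediction tracks the recent stimulus, so frequently repeated standards generate small errors while rare deviants generate large ones. The leaky update rule and the example sequence are illustrative assumptions, not a model from any of the cited studies.

```python
import numpy as np

def prediction_errors(stimuli, learning_rate=0.2):
    """Toy predictive-coding unit applied to a stimulus sequence.

    stimuli : 1-D array of stimulus values (e.g. tone frequencies in kHz)
    Returns the absolute prediction error at each step; rare (deviant) stimuli
    yield larger errors than frequently repeated (standard) stimuli.
    """
    stimuli = np.asarray(stimuli, dtype=float)
    prediction = stimuli[0]
    errors = np.zeros(len(stimuli))
    for t, s in enumerate(stimuli):
        errors[t] = abs(s - prediction)
        prediction += learning_rate * (s - prediction)   # update the internal model
    return errors

# oddball sequence: mostly 1 kHz standards with occasional 2 kHz deviants
sequence = [1.0] * 9 + [2.0] + [1.0] * 9 + [2.0]
print(prediction_errors(sequence))
```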

Behavioral engagement and auditory decision-making

Further evidence for dynamic processing in the auditory cortex has been obtained from studies demonstrating that engagement in auditory tasks modulates both spontaneous132,133 and sound-evoked92,133–135 activity, as well as correlations between the activity of different neurons92,136, in ways that enhance behavioral performance. Furthermore, changes in task reward structure can alter A1 responses to otherwise identical sounds28, and other studies suggest that the auditory cortex is part of the network of brain regions involved in maintaining perceptual representations during memory-based tasks137,138. Although the source of these modulatory signals is not yet fully understood, accumulating evidence suggests the involvement of top-down inputs from areas such as the parietal and frontal cortices88,139–141 as well as the neuromodulatory systems discussed in the next section. In addition, inputs from motor cortex have been shown to suppress both spontaneous and evoked activity in the auditory cortex during movement142,143.

The notion of the auditory cortex as a high-level cognitive processor is also supported by investigations into perceptual decision-making. Traditionally, decision-making has been attributed to the parietal and frontal cortices, acting in association with subcortical structures such as the basal ganglia. However, recent work suggests that the auditory cortex not only encodes physical attributes of task-relevant stimuli but also represents behavioral choice and decision-related signals92,144–146. Tsunada et al.147 provided direct evidence for a role for the auditory cortex in decision-making: they found that microstimulation of the anterolateral but not the mediolateral belt region of the macaque auditory cortex biased behavioral responses toward the choice associated with the preferred sound frequency of the neurons at the site of stimulation.

The auditory cortex and learning

An ability to learn and remember features in complex acoustic scenes is crucial for adaptive behavior, and plasticity of sound processing in the auditory cortex is an integral part of the circuitry responsible for these essential functions. Robust learning often occurs without conscious awareness and persists for a long time. Agus et al.148 demonstrated that human participants can rapidly learn to detect a particular token of repeated white noise and remember the same pattern for weeks149 in a completely unsupervised fashion. Functional brain imaging experiments have implicated the auditory cortex as well as the hippocampus in encoding memory representations for such complex scenes. The underlying mechanisms of this rapid formation of robust acoustic memories are not clear, but recent psychoacoustical experiments suggest a computation based on encoding summary statistics150.

Recognition of structure in complex sequences based on passive exposure is also imperative for acquiring knowledge of the various rules manifest in speech and language151. Electrophysiological recordings from songbird forebrain areas that correspond to the mammalian auditory cortex have revealed evidence for statistical learning, expressed as a decrease in the spike rate for familiar versus novel sequences152. Following passive learning with sequences of nonsense speech sounds that contained an artificial grammar structure, exposure to violations in this sequence activated homologous cortical areas153 and modulated hierarchically nested low-frequency phase and high-gamma amplitude coupling in the auditory cortex in a very similar way in humans and monkeys154. Interestingly, this form of oscillatory coupling is thought to underlie speech processing in the human auditory cortex155, suggesting that it may represent an evolutionarily conserved strategy for analyzing sound sequences.

Not all learning, however, occurs without supervision, and reinforcement in the form of reward or punishment is also key for successful learning outcomes. As in other sensory modalities, training can produce improvements in auditory detection and discrimination, including linguistic and musical abilities156. A number of studies have reported that auditory perceptual learning is associated with changes in the stimulus-encoding response properties of A1 neurons157. This often entails an expansion in the representation of the stimuli on which subjects are trained, and the extent of the representational plasticity is thought to encode both the behavioral importance of these stimuli and the strength of the associative memory158. Nevertheless, there have been few attempts to show that A1 plasticity is required for auditory perceptual learning. Indeed, training-induced changes in the functional organization of A1 have been found to disappear over time, even though improvements in behavioral performance are retained159,160. However, a recent study in which gerbils were trained on an amplitude-modulation detection task found a close correlation between the magnitude and time course of cortical and behavioral plasticity, and showed that inactivation of the auditory cortex reduced learning without affecting detection thresholds161. Cortical inactivation has also been shown to impair training-dependent adaptation to the altered spatial cues resulting from plugging of one ear162. This is consistent with physiological evidence that the encoding of different spatial cues in the auditory cortex is experience-dependent, changing in ways that can explain the recovery of localization accuracy following exposure to abnormal inputs94,163,164.

The alterations in cortical response properties that accompany perceptual learning are likely to be driven by top-down inputs that determine which stimulus features to attend to165. Several neuromodulatory systems, including the cholinergic basal forebrain166,167 and the noradrenergic locus coeruleus168,169, play an important role in auditory processing and plasticity and appear to provide reinforcement signals and information about behavioral context to the auditory cortex. It has recently been shown that cholinergic inputs primarily target inhibitory interneurons in the auditory cortex170,171, which suggests a possible mechanism by which the balance of cortical excitation and inhibition is transiently altered during learning172.

Although its involvement in learning is probably one of the most important functions of the auditory cortex, there is growing evidence for a specific role for its outputs to other brain areas. For instance, selective strengthening of auditory corticostriatal synapses has been observed over the course of learning an auditory discrimination task173, while the integrity of A1 neurons that project to the IC is required for adaptation to hearing loss in one ear174. Indeed, it is likely that descending corticofugal pathways play a more general role in auditory learning175,176 as well as other aspects of auditory perception and behavior177–180.

Conclusions

The studies outlined in this article show how neurons in the auditory cortex encode sounds in ways that are directly relevant to behavior. Auditory cortical processing depends not only on the sounds themselves but also on the individual’s internal state, such as the level of arousal, and the sensory and behavioral context in which sounds are detected. Therefore, to understand how cortical neurons process specific sound features, we have to consider the complexity of the auditory scene and the presence of other sensory cues as well as factors such as motor activity, experience, and attention. Indeed, the auditory cortex seems to play a particularly important role in learning and in constructing memory-dependent perceptual representations of the auditory world.

We are now beginning to understand the computations performed by auditory cortical neurons as well as the role of long-range inputs and local cortical circuits, including the participation of specific cell types, in those computations. Research over the last few years has also provided insights into the way information flows within and between different auditory cortical areas as well as the complex interplay between the auditory cortex and other brain areas. In particular, through its extensive network of descending projections, cortical activity can influence almost every subcortical processing stage in the auditory pathway, but we are only beginning to understand how those pathways contribute to auditory function. Further progress in this field will require the application of coordinated computational and experimental approaches, including the increased use of methods for measuring, manipulating, and decoding activity patterns from populations of neurons.
