A Foreword On The Context Of This Question
The question of "the nature of the neural code" is obviously an old one, but its particular formulation with alternatives called "rate coding" and "temporal coding" is of a more recent vintage – the early 90s. The attempt to resolve this question kick-started the careers of many a famous neuroscientist, including Clay Reid and Yang Dan. Of course, in its most straightforward interpretation, the question has no single answer: presumably, the nervous system uses different codes when they are called for by the statistics of the stimulus or the nature of the task.
The question has, however, taken on something of a "litmus test" quality. Staunch defenders of the 'rate code' model consider those damn temporal-code nutjobs to be dilettantish gadflies (or worse, theorists!) who "can't have ever done physiology" (JL Gallant, personal correspondence). Partisans of temporal coding think the sclerotic orthodoxy of rate coding is based on insufficient appreciation of the precision and capability of neural systems (ask, e.g., Tony Bell) and a child-like or perhaps neurotic desire for the world to be simple, understandable, and even linear.
The debate has taken on this quasi-political character in part because the evidence is inconclusive, so its resolution comes down to a matter of faith – a question less of what data tells us about the structure of the world as of what world you believe we live in or what motivates you to do neuroscience. This kind of scientific debate is precisely the kind that philosopher of science Thomas Kuhn identifies as the breeding ground of scientific revolutions in his classic work, The Structure of Scientific Revolutions.
This debate over a major neuroscientific paradigm could be, at least partially, resolved within the decade: the attraction of spatiotemporally-modulated optogenetic stimulation is that it will allow for coding theories to be causally tested, rather than merely inferred from correlations. What a time to be alive.
The Rate Code
As stated above, the rate code is the orthodox view of neural coding. Each cell has a stimulus or task feature that it "likes", and when that feature is present or salient, the neuron fires spikes with a (generally stochastic, Poisson) rate that is monotonically (e.g., proportionally) related to the intensity or salience of that feature. This model works well with stimuli that vary slowly in time, since the noisiness of spikes requires integration over some time window.
Since it is the prevailing view, empirical examples of this coding scheme are abundant: it is the basic model of Hubel and Wiesel, it comprises the output of sensory transduction pathways in many systems, and it is the view that (most) simulated neural networks operate with.
One of my favorite objections to this model comes from Bruno Olshausen's ass-kicking paper What is the Other 85% of V1 Doing?, which methodically demonstrates that Hubel-Wiesel rate coding can only explain, at most, what 15% of the cells in V1 are doing. Central to the argument is a metabolic fact: if the population parameters derived from electrophysiological experiments are correct, a rate-coding V1 would consume an order of magnitude more energy. This is in agreement with more careful estimations of the distribution of baseline firing rates of cortical populations, which put even the slowest-firing neurons of the H-W rate model in the top 15% of cells.
The Temporal Code(s)
The basic idea of the temporal code is as simply stated as that of the rate code: information about the stimulus or action is contained in the relative timing of spikes, not just in, or instead of in, the rate of those spikes. The meaning, however, of "relative timing" is flexible and so there are actually multiple "temporal codes", which range from merely extreme examples of rate coding to incompatible schemes.
The more incompatible temporal codes are also, generally, more complex than rate coding (hence temporal coding's popularity with theorists and computationalists), leading the evidence to lean more heavily on correlation (more precisely, calculations of mutual information) than on causation. One other avenue of argument against rate coding is to point to extremely low jitter, or variance around a mean, in spike timings across multiple presentations of the same stimulus, in systems like the retina. Achieving such precise timings is expensive and unnecessary for rate coding, the argument goes, so it can only have evolved to support a temporal code.
"True" Temporal Coding
The original formulation of temporal coding took inspiration from digital computers and the all-or-none nature of the action potential: what if neurons encoded information as bitstreams of timed spikes (…010110011…) or perhaps with short binary 'words' of the same (…010 110 011 …)? The potential information-carrying capacity of such a code was as much a draw as its philosophically-appealing connection to human methods for information transmission and concomitant mathematical pedigree.
Unfortunately for Shannon and Turing, human brains are not computers – there is no central clock to sync processing and no obvious way to chunk time so that such a coding scheme can operate. Furthermore, such codes operate most efficiently when 1's and 0's appear equally frequently – an expensive proposition, metabolically speaking, especially with the time windows necessary to achieve high-bandwidth, low-latency transmission of information.
Weaker versions of this "true" or "vanilla" temporal coding scheme are possible: against a background of silence, cells could transmit information as bursts of spikes with precisely-timed inter-spike intervals. There are also three alternative, simpler temporal codes, listed below. This view of temporal coding, however, is what is usually meant when the term is used in isolation and without further explanation.
EXAMPLE: Low jitter and high specificity for binary words in the LGN. Reinagel and Reid, 2000. Temporal Coding of Visual Information in the Thalamus. J Neurosci.
Sparse Temporal Coding
In this framework, cells fire a single spike, or a small burst, to represent the presence of a transient stimulus. If your definition of "rate" is fluid enough, this can be considered a rate code with a very small time window and a binary function mapping the rate to the stimulus.
EXAMPLE: Rodent S1 represents whisker "stick-slip" events with bursts of action potentials. Jadhav, Wolfe, and Feldman, 2009. Sparse Temporal Coding of Elementary Tactile Features During Active Whisker Sensation. Nat Neurosci.
Phase-Locked Temporal Coding
One answer to the question "relative to what?", which arises when we define temporal coding in terms of "relative spike timings" is "relative to the phase of a network oscillation". This view is obviously controversial, since many neuroscientists think that oscillations are epiphenomenal – i.e. the result, not the cause, of critical computational processes.
One of the most prominent phase coding hypotheses tied grid cells to theta rhythms. The proposal was that how far away an animal is from the place field of a neuron was encoded by the the time at which the place cell fires relative to the beginning of the theta cycle; this phenomenon is known as theta phase precession (Skaggs, W. E., & McNaughton, B. L. (1996). Theta Phase Precession in Hippocampal Neurons. Hippocampus, 6, 149-172.). As a general theory of place cell function in all animals, this idea contested; see Grid cells without theta oscillations in the entorhinal cortex of bats, but in rodents, the finding is robust.
A similar, but more general, idea has also been proposed for temporal coding by the phase of the cortical gamma cycle during which a neuron fires: neurons fire during the same phase if the features that they represent are properties of a coherent object; see The gamma cycle.
EXAMPLE: Rich information is present in phase-locked gamma oscillations in retina and LGN. Koepsell, Wang, Sommer et al., 2009. Retinal Oscillations Carry Visual Information to Cortex. Front Syst Neurosci.
Population Temporal Coding
All the other codes discussed here have been single-cell codes, or answers to the question: how does one cell represent information through the pattern of its spiking? How multiple cells represent information is a whole separate question (literally, it's the next one on the list!). Population temporal coding is an answer to that general question. Any of the models discussed under that question, considered generally, could be combined with temporal coding, though they are generally discussed in terms of rate codes. The combination looks something like this: the relative timing of pairs of neurons (or pairs of assemblies) in a population encodes information. This is an especially compelling theory for how neurons encode information about stimulus timing.
EXAMPLE: Spikes in the retina can be used to rapidly discriminate flickered stimuli. Gollisch and Meister, 2008. Rapid Neural Coding in the Retina with Relative Spike Latencies. Science.