=biology =machine learning =neural networks
Any type of brain must have a way to transfer signals across distances, and it's desirable to transmit those signals quickly. The potential ways that biological systems could transmit signals quickly that come to mind for me are:

1) bioluminescence -> biological optical fiber -> photoreceptors
2) electron transfer to a conductive polymer
3) mechanically pulling a fiber (something like a hair with lubricin on its surface, inside a lubricating sheath) linked to a mechanoreceptor
4) ion channels triggered by other ions, causing a propagating electrostatic charge when triggered

Of those, (4) seems the easiest to evolve, and it's what developed on Earth.
A neuron firing this way by opening sodium channels is all-or-nothing, so information must be transmitted in the timing. Neurons have no global absolute clock, so information must be contained in "time since last spike" rather than the absolute time of spikes, unless a spike means that something has just happened.
For some current theories of neuron firing, see
this page.
From artificial neural network (ANN) research, we know that linear
representations of activations are worse at low resolution than some
nonlinear ones. I would expect a spike to typically represent an activation
of approximately:
Formula 1: a + b * e^(-c * time_since_last_spike)
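As a numeric sketch of Formula 1 (the constants a, b, c here are made-up illustrative values, not measured ones):

```python
import math

def spike_activation(dt, a=0.1, b=1.0, c=2.0):
    """Activation value represented by a spike arriving dt seconds
    after the previous spike, per Formula 1.
    a, b, c are arbitrary illustrative constants."""
    return a + b * math.exp(-c * dt)

# Shorter inter-spike intervals represent larger activations:
print(spike_activation(0.01))  # ~1.08
print(spike_activation(1.0))   # ~0.24
```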
Each neuron is connected to many synapses, typically ~7000 in a human brain. When a neuron fires, at each connected synapse, some neurotransmitters are released. Most neurotransmitters have no immediate effect on neuron firing, and instead regulate slower processes, but let's consider just the short-term behavior of neurons. When a neuron N1 fires, it releases some neurotransmitters at a synapse, and they bind to receptors at neuron N2 which have some net effect on the electric potential of N2.

That net short-term effect on N2 potential is analogous to a "weight" in an ANN, but it's not a constant, it's a function: spike timing -> change in potential. (Also, the neurotransmitter receptors at synapses can be disabled or added over a longer timescale.) The N2 potential is analogous to an ANN accumulator, but it decays towards a baseline over time.
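A minimal sketch of that accumulator analogy, with illustrative constants: the potential jumps when a spike arrives and decays exponentially toward a baseline between spikes.

```python
import math

TAU = 0.020  # decay time constant, 20 ms (illustrative, not measured)

def step_potential(v, dt, baseline=0.0):
    """Decay the potential toward its baseline over dt seconds,
    like a leaky ANN accumulator."""
    return baseline + (v - baseline) * math.exp(-dt / TAU)

def apply_spike(v, weight):
    """An arriving spike shifts the potential by the synapse's
    short-term 'weight', which may be positive or negative."""
    return v + weight

# A spike raises the potential; it then leaks back toward baseline:
v = apply_spike(0.0, 0.6)    # v = 0.6
v = step_potential(v, TAU)   # one time constant later: 0.6/e ~ 0.22
```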
The represented weights can be both positive and negative. It's also common for a single synapse to have receptors with positive and negative effects on cell potential active at the same time.
I wrote above that I'd expect the activation represented by a spike to often approximately follow Formula 1. An obvious way to accomplish that is to have a synapse with one channel type with an approximately constant value on firing, and another channel type whose ions are transported away as time passes after a spike, asymptotically approaching an approximately opposite value.
That being the case, I'd expect two spikes in rapid succession
at the same synapse to usually represent a large activation value. Depending
on the "weights" at that synapse, that could either strongly inhibit neuron
firing, or lead to immediate firing. But of course, there's no need for all
synapses to use timing representations with the same shape. At some
synapses, longer times between spikes probably represent larger
values; that would be useful for making fast simple reactions, where you
want a single pulse to propagate through some paths quickly.
When neurons modify receptors at synapses, how much internal data can they draw upon? Memories are partly stored by DNA methylation patterns, so potentially quite a bit.
A typical human brain has about 7*10^14 synapses. GPT-3 has about 1.7*10^11 weights. Does this mean that GPT-3 has about 1/4000th the effective weights of a human brain? No.
1) Synapse connections are sparse, which makes them equivalent to at least 10x as many dense ANN weights.
2) Neurons can shift between receptor patterns at synapses, so at timescales long enough for that, we should multiply by at least 10x again.
I feel confident in saying a
human brain has >10^5x the effective weights of GPT-3. Does this mean that
scaling up GPT-3 to 10^5 as many parameters would produce a human-level
intelligence? No.
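The arithmetic behind that estimate, treating the two multipliers above as rough lower bounds:

```python
synapses = 7e14         # typical human brain
gpt3_weights = 1.7e11   # GPT-3 weight count

raw_ratio = synapses / gpt3_weights   # ~4000x
sparsity_factor = 10    # sparse connections vs dense ANN weights (lower bound)
receptor_factor = 10    # slow receptor-pattern changes (lower bound)

effective_ratio = raw_ratio * sparsity_factor * receptor_factor
print(f"~{effective_ratio:.2g}x")  # ~4.1e+05x, i.e. >10^5
```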
Transistors are much faster than neurons. Thanks to
that advantage, GPT-3 was trained on more text than a human can read in a
lifetime - yet it's still widely considered "undertrained"
relative to its parameter count. A "human-level AI" wouldn't be a normal
human - it would be closer to a human that spent thousands of years reading
the internet.
On the other hand, transformers scale quadratically with input context size. In that sense, they're a brute-force solution that only works well for small contexts. (Dense ANNs are also a brute-force solution - they're less efficient than sparse ones, but easier to implement and still useful.)
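A sketch of why that scaling is quadratic: in self-attention, each token is compared against every token in the context, so the comparison count grows with the square of context length (constant factors like head count and dimension are ignored here).

```python
def attention_pair_count(context_len):
    """Token-to-token comparisons in one self-attention layer,
    ignoring constant factors."""
    return context_len * context_len

# Doubling the context quadruples the work:
print(attention_pair_count(2048))  # 4194304
print(attention_pair_count(4096))  # 16777216
```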
Humans think more efficiently, and at least some humans can operate on
higher conceptual levels than something like GPT-3. That said, the remaining
insights to bridge that gap could be fairly simple.