So, I'll try to write this as a series of informal comments.
First of all, one should think about recurrent neural networks (RNNs) and about DMMs as "two-stroke engines". On the "up movement", the "activation functions" built into the neurons are applied to the inputs of the neurons, producing the outputs of the neurons. On the "down movement", the neuron inputs are recomputed from the neuron outputs using the network matrix. This "up movement"/"down movement" cycle is repeated indefinitely.
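To make the two-stroke picture concrete, here is a minimal Python sketch of the cycle (the names `run_two_stroke`, `W`, `activation`, `inputs` are mine, just for illustration, not from any particular library):

```python
import numpy as np

def run_two_stroke(W, activation, inputs, n_cycles=10):
    """Iterate the 'up movement' / 'down movement' cycle n_cycles times.

    W          -- network matrix; rows are indexed by neuron inputs,
                  columns by neuron outputs
    activation -- elementwise activation function (applied on the 'up movement')
    inputs     -- initial vector of neuron inputs
    """
    for _ in range(n_cycles):
        outputs = activation(inputs)  # up movement: neurons fire
        inputs = W @ outputs          # down movement: the matrix recomputes the inputs
    return inputs

# A toy run: 3 neurons, tanh activations, small random weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3)) * 0.5
final_inputs = run_two_stroke(W, np.tanh, np.full(3, 0.1))
```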
The network matrix defines the topology and the weights of the network. The columns of the network matrix are indexed by the neuron outputs, and the rows of the network matrix are indexed by the neuron inputs. If a particular weight is zero, this means that the corresponding output and input are not connected, so the sparsity structure of the matrix defines the topology of the network.
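For instance, in a hypothetical 3-neuron network (the numbers below are arbitrary, just to illustrate the indexing), a zero entry means that the corresponding output does not feed the corresponding input:

```python
import numpy as np

# Hypothetical 3-neuron network matrix: W[i, j] is the weight with which
# the output of neuron j contributes to the input of neuron i.
W = np.array([
    [0.0,  0.7, 0.0],   # input of neuron 0 listens only to the output of neuron 1
    [0.3,  0.0, 0.0],   # input of neuron 1 listens only to the output of neuron 0
    [0.0, -1.2, 0.5],   # input of neuron 2 listens to neuron 1 and to itself
])

# The sparsity pattern *is* the topology: the nonzero entries are the edges.
edges = [(j, i) for i in range(3) for j in range(3) if W[i, j] != 0.0]
# [(1, 0), (0, 1), (1, 2), (2, 2)]  -- edges (from output j, to input i)
```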
It was traditional to have the same activation function for all neurons in the network, and it was traditional to use some kind of sigmoid for this purpose, but these days it is commonplace to mix a few different kinds of activation functions within the same network. Besides sigmoid functions such as the logistic ("soft step") function and the hyperbolic tangent, a particularly popular function in recent years is ReLU (the rectifier, y = max(0, x)). The table in the following Wikipedia article compares some of the activation functions people are trying.
https://en.wikipedia.org/wiki/Activation_function
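As a small illustration of mixing activations within one network (just a sketch, not tied to any particular framework), one can simply assign a function per neuron and apply each neuron's own function on the "up movement":

```python
import numpy as np

# Three of the activation functions mentioned above.
def logistic(x):          # "soft step" sigmoid
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):              # rectifier, y = max(0, x)
    return np.maximum(0.0, x)

# One activation function per neuron (tanh is just np.tanh).
activations = [np.tanh, logistic, relu]

def up_movement(inputs):
    """Apply each neuron's own activation function to its input."""
    return np.array([f(x) for f, x in zip(activations, inputs)])

print(up_movement(np.array([-1.0, 0.0, 2.0])))  # [tanh(-1), logistic(0), relu(2)]
```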