Is there any difference between an activation function and a transfer function?
There seems to be some confusion between the terms activation function and transfer function. From the Wikipedia article on artificial neural networks:

It seems that the transfer function calculates the net input, while the activation function calculates the neuron's output. But the MATLAB documentation for an activation function states:
satlin(N, FP) is a neural transfer function. Transfer functions calculate a layer's output from its net input.
So which is right? And can the terms activation function and transfer function be used interchangeably?
Solution 1:[1]
After some research, I found in "Survey of Neural Transfer Functions" by Duch and Jankowski (1999) that:
transfer function = activation function + output function
And IMO the terminology makes sense now, since we need a value (signal strength) to check whether the neuron will be activated, and then compute an output from it. What the whole process does is transfer a signal from one layer to another.
Two functions determine the way signals are processed by neurons. The activation function determines the total signal a neuron receives. The value of the activation function is usually scalar and the arguments are vectors. The second function determining neuron’s signal processing is the output function o(I), operating on scalar activations and returning scalar values. Typically a squashing function is used to keep the output values within specified bounds. These two functions together determine the values of the neuron outgoing signals. The composition of the activation and the output function is called the transfer function o(I(x)).
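Duch and Jankowski's decomposition can be sketched in a few lines of Python. This is a minimal illustration, not their implementation: the weighted-sum activation, the logistic-sigmoid output function, and the example values are all assumptions chosen for clarity.

```python
import math

def activation(x, w):
    # Activation function I(x): maps an input vector to a scalar net signal.
    # Here it is the common weighted sum (dot product) of inputs and weights.
    return sum(xi * wi for xi, wi in zip(x, w))

def output(i):
    # Output function o(I): squashes the scalar activation into (0, 1).
    # Here it is the logistic sigmoid, one typical squashing function.
    return 1.0 / (1.0 + math.exp(-i))

def transfer(x, w):
    # Transfer function o(I(x)): the composition of the two functions above.
    return output(activation(x, w))

x = [0.5, -1.0, 2.0]   # example inputs (assumed values)
w = [0.4, 0.3, 0.1]    # example weights (assumed values)
print(transfer(x, w))
```

With these values the activation is 0.5·0.4 − 1.0·0.3 + 2.0·0.1 = 0.1, and the transfer function returns the sigmoid of that scalar.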
Solution 2:[2]
I think the diagram is correct but not terminologically accurate.
The transfer function includes both the activation and transfer functions in your diagram. What your diagram calls the transfer function is usually referred to as the net input function. The net input function only applies weights to the inputs and calculates the net input, which is usually the sum of the inputs multiplied by their weights. The activation function, which can be a sigmoid, step, etc., is then applied to the net input to generate the output.
Solution 3:[3]
The transfer function takes its name from transformation and is used for transformation purposes. An activation function, on the other hand, checks whether the output meets a certain threshold and outputs either zero or one. Some examples of non-linear transfer functions are softmax and sigmoid.
For example, suppose we have a continuous input signal x(t). This input signal is transformed into an output signal y(t) through a transfer function H(s).
Y(s) = H(s)X(s)
As can be seen above, the transfer function H(s) transforms the input state X(s) into a new output state Y(s).
A closer look at H(s) shows that it can represent a weight in a neural network. Therefore, H(s)X(s) is simply the multiplication of the input signal by its weight. Several of these input-weight pairs in a given layer are then summed up to form the input to another layer. This means that the input to any layer of a neural network is simply the transfer function of its inputs and weights, i.e. a linear transformation, because the input is transformed by the weights. But real-world problems are non-linear in nature. Therefore, to make the incoming data non-linear, we use a non-linear mapping called an activation function. An activation function is a decision-making function that determines the presence of a particular neural feature. It is mapped between 0 and 1, where zero means the feature is absent and one means it is present. Unfortunately, small changes in the weights cannot be reflected in the activation value if it can only take either 0 or 1. Therefore, non-linear functions must be continuous and differentiable over this range.
In a real sense, before outputting an activation, you calculate the sigmoid first, since it is continuous and differentiable, and then use it as an input to an activation function which checks whether the output of the sigmoid is higher than its activation threshold. A neural network must be able to take any input from -infinity to +infinity, but it should map it to an output that ranges between {0,1}, or between {-1,1} in some cases - hence the need for an activation function.
Solution 4:[4]
I am also a newbie in the machine learning field. From what I understand...
Transfer function: The transfer function calculates the net input, so any changes to your code or calculation need to be made before the transfer function. You can use various transfer functions, as suits your task.
Activation function: This is used to apply a threshold value, i.e. to decide when your network will give an output. If your calculated result is greater than the threshold value, the network produces an output; otherwise it does not.
Hope this helps.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Bran |
| Solution 3 | |
| Solution 4 | Sabah Shariq |
