New Feature in DeepTrainer – Per-Layer Granularity for Activation Functions
After implementing a set of new activation functions in DeepTrainer (see my previous post), I came to the conclusion that most modern activation functions require different treatment than the almighty Hyperbolic Tangent. With TanH I could conveniently use the same activation function for every layer and neuron in the network, and it worked […]
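The excerpt is cut off here, but the core idea of per-layer activation granularity can be sketched independently of DeepTrainer's actual API. The snippet below is a minimal, hypothetical illustration (none of the names come from DeepTrainer): instead of one global activation, each layer pairs its weight matrix with its own activation function.

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical per-layer configuration: each layer carries its own
# activation function, rather than sharing a single global one.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 8)), relu),  # hidden layer 1: ReLU
    (rng.standard_normal((8, 8)), relu),  # hidden layer 2: ReLU
    (rng.standard_normal((8, 2)), tanh),  # output layer: TanH
]

def forward(x, layers):
    # Apply each layer's weights, then that layer's own activation.
    for weights, activation in layers:
        x = activation(x @ weights)
    return x

out = forward(rng.standard_normal((1, 4)), layers)
print(out.shape)  # (1, 2)
```

With a single shared TanH, the `layers` list would collapse to weight matrices only; making the activation a per-layer attribute is what lets, say, ReLU hidden layers coexist with a TanH output layer.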