New Feature in DeepTrainer – Per-Layer Granularity for Activation Functions
After implementing a set of new activation functions in DeepTrainer (see my previous post), I came to the conclusion that most modern activation functions require different treatment than the almighty Hyperbolic Tangent. With TanH I could conveniently use the same activation function for every layer and neuron in the network, and it worked […]
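The excerpt is cut off here, but the core idea of per-layer activation granularity can be sketched independently of DeepTrainer's actual API. The snippet below is a minimal, hypothetical illustration (none of the names come from DeepTrainer): instead of one global activation, each layer pairs its weight matrix with its own activation function.

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical per-layer configuration: each layer carries its own
# activation function, rather than sharing a single global one.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 8)), relu),  # hidden layer 1: ReLU
    (rng.standard_normal((8, 8)), relu),  # hidden layer 2: ReLU
    (rng.standard_normal((8, 2)), tanh),  # output layer: TanH
]

def forward(x, layers):
    # Apply each layer's weights, then that layer's own activation.
    for weights, activation in layers:
        x = activation(x @ weights)
    return x

out = forward(rng.standard_normal((1, 4)), layers)
print(out.shape)  # (1, 2)
```

With a single shared TanH, the `layers` list would collapse to weight matrices only; making the activation a per-layer attribute is what lets, say, ReLU hidden layers coexist with a TanH output layer.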