I am now making a demo version of the DeepTrainerTester app available for download. This is the same application I have been using for my benchmarks. It uses SSE/AVX2 instructions for matrix multiplication; AVX512 requires a different executable. However, since I did not see any performance improvement with AVX512, I am not sharing that build.
You can download the zipped installer from here.
Although the application is able to load arbitrary data files, I would recommend using the built-in datasets for demo and testing purposes.
Saving the network is not yet properly implemented in this version.