In the last weeks we have made substantial progress on the tooling for our project.
We have added flake8 as a linter and black as a formatter check. This keeps our code tidy, and by running these checks after every push, before the code is sent off to run tests on a GPU instance, we are potentially saving lots of GPU compute.
I have also included a simple README for those who want to venture out and try the functionality developed so far; you can check it out in the repo.
There has also been progress on the documentation side. After some hours of cursing Sphinx, I finally solved the enigma: I was missing the additional `.rst` files, which can be written manually or generated with `sphinx-apidoc`. Sphinx will not discover the module structure without these files.
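For illustration, here is a minimal sketch of the kind of `.rst` stub that `sphinx-apidoc` generates per module (the module name below is hypothetical, not necessarily the one in our repo):

```rst
qutip_cupy.dense module
=======================

.. automodule:: qutip_cupy.dense
   :members:
   :undoc-members:
   :show-inheritance:
```

Once one of these exists for each module (and is reachable from a `toctree`), Sphinx's autodoc picks up the docstrings as expected.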
Benchmarking is getting interesting. This may well deserve an entry of its own, explaining the ins and outs of pytest-benchmark. Asier (my GSoC colleague, who is currently working on a TensorFlow backend for QuTiP) has done a great job using pytest-benchmark to produce plots and graphs comparing different data layers (see the PR). My ambition is to reuse as much of his code as I can, but also to provide separate CPU and GPU timings; this will be of paramount importance later on for detecting code bottlenecks as well as excessive memory transfers. The way to do this is to dynamically inherit from the pytest-benchmark fixture and override its timing functions; you can read more about it here.
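To give a flavor of the idea, here is a stdlib-only sketch of a benchmark wrapper that records two timing streams per call. In the real version one would subclass pytest-benchmark's `BenchmarkFixture` and swap its timer, with the GPU clock backed by CuPy CUDA events; the class and attribute names below are made up for illustration:

```python
import time


class DualTimingBenchmark:
    """Sketch: record separate "CPU" and "GPU" timings per benchmarked call.

    A stand-in for overriding the timer of pytest-benchmark's fixture.
    The gpu_timer argument would wrap cupy.cuda.Event in the real code;
    here it defaults to perf_counter so the sketch runs anywhere.
    """

    def __init__(self, gpu_timer=None):
        self.gpu_timer = gpu_timer or time.perf_counter
        self.cpu_times = []  # wall-clock samples
        self.gpu_times = []  # device-clock samples (emulated here)

    def __call__(self, func, *args, **kwargs):
        gpu_start = self.gpu_timer()
        cpu_start = time.perf_counter()
        result = func(*args, **kwargs)
        # store one sample per stream for this call
        self.cpu_times.append(time.perf_counter() - cpu_start)
        self.gpu_times.append(self.gpu_timer() - gpu_start)
        return result


bench = DualTimingBenchmark()
out = bench(sum, range(1000))  # benchmark a trivial function
```

Keeping the two clocks separate is what lets us later tell apart kernel time from host overhead and host–device transfers.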
I will soon finish a PR that includes most of the functionality needed for CuPyDense objects. This will allow us to use CuPyDense as a data backend for Qobj.
Some functions are still missing, like expect, which I think may warrant writing a new CUDA kernel. There are also improvements to other functions that may come from directly calling the “hidden” cupy.cublas module functions, which appear to be faster than naively calling CuPy’s @ operator in the case of large square matrices. More on this to come soon.
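As a reference for what an expect kernel would have to compute, here is a NumPy stand-in (the function name is mine, not QuTiP's dispatch entry): the expectation value ⟨ψ|A|ψ⟩ for a ket, or Tr(Aρ) for a density matrix. The CuPyDense version would do the same arithmetic on device arrays, ideally fused into a single kernel:

```python
import numpy as np


def expect_dense(oper, state):
    """Reference expectation value that a dedicated CUDA kernel would compute.

    oper:  square matrix A.
    state: column vector (ket) or square matrix (density matrix rho).
    """
    if state.shape[1] == 1:
        # ket: <psi| A |psi>
        return (state.conj().T @ oper @ state).item()
    # density matrix: Tr(A rho)
    return np.trace(oper @ state)


# Pauli-Z on the |0> state gives expectation value +1
A = np.array([[1.0, 0.0], [0.0, -1.0]])
psi = np.array([[1.0], [0.0]])
```

Written this way, the ket case is a matrix–vector product followed by a dot product, which is exactly the kind of reduction that benefits from a custom fused kernel instead of two separate library calls.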