During my upcoming postdoctoral stay, I’ll shift my work to applying deep learning to biological data. So I thought I’d get a head start and teach myself as much deep learning as possible beforehand! One of my pet projects for this has been generating images of malaria-infected cells with a variational autoencoder (VAE), built with the deep learning framework PyTorch, which you can read about here. The (human) cells in this case have been stained for the intracellular parasite that causes malaria, Plasmodium falciparum, so infected cells can be clearly distinguished from healthy ones. So, if you take enough cells from both states (healthy / diseased), a model should be able to learn the relevant features and generate new images of both categories.
That’s exactly what I did. For this, I used a VAE, which pushes your data through a bottleneck and then tries to reconstruct the original image from it. This forces the model to learn the most essential features of your images, as it has to represent each image with very few numbers. The bottleneck layer in a VAE consists of a vector of means and a vector of variances, which together define a distribution over latent features that we can sample from. Decoding such samples gives you as many new images, from both classes, as you like! If you want to learn more, head to the linked article.
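To make the bottleneck idea a bit more concrete, here is a minimal PyTorch sketch of a VAE. It is not the model from the linked article: the class name `TinyVAE`, the fully connected layers, the 64x64 RGB image size, the 16 latent dimensions and the random stand-in batch are all assumptions made for illustration. Like many implementations, it predicts the log of the variances rather than the variances themselves, which is simply a numerically convenient choice.

```python
import torch
from torch import nn
import torch.nn.functional as F


class TinyVAE(nn.Module):
    """Hypothetical minimal VAE for flattened images with pixel values in [0, 1]."""

    def __init__(self, n_pixels=64 * 64 * 3, n_hidden=256, n_latent=16):
        super().__init__()
        self.n_latent = n_latent
        self.enc = nn.Linear(n_pixels, n_hidden)
        # The bottleneck: one vector of means, one of log-variances.
        self.to_mu = nn.Linear(n_hidden, n_latent)
        self.to_logvar = nn.Linear(n_hidden, n_latent)
        self.dec = nn.Sequential(
            nn.Linear(n_latent, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_pixels),
            nn.Sigmoid(),  # keep reconstructed pixels in (0, 1)
        )

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu, logvar):
        # Draw z ~ N(mu, sigma^2) in a way that stays differentiable.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar


def vae_loss(recon, x, mu, logvar):
    # Reconstruction error plus KL divergence to a standard normal prior.
    recon_term = F.binary_cross_entropy(recon, x, reduction="sum")
    kl_term = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_term + kl_term


model = TinyVAE()
batch = torch.rand(8, 64 * 64 * 3)  # random stand-in for real cell images
recon, mu, logvar = model(batch)
loss = vae_loss(recon, batch, mu, logvar)

# Generating new images: sample latent vectors and decode them.
with torch.no_grad():
    new_images = model.dec(torch.randn(4, model.n_latent))
```

Sampling from a standard normal, as in the last lines, gives images without any control over the class. One possible way to target healthy or infected cells with a model like this would be to encode an example of that class and sample around its means instead.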