🔌 Toolbox of short, reusable pieces of code and knowledge.
In this snippet thread, I’ll be posting a bunch of ML techniques.
Same goal as transfer learning, but with explicit feature alignment to make it work better. Essentially, pre-train on a large dataset (source domain) and fine-tune your architecture on a target distribution. But… first, learn a feature extractor network (via ML too!… domain-adversarial training) to apply to both domains so their features end up “in the same distribution”.
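The adversarial step above can be sketched in a few lines of NumPy. This is a minimal sketch, not the DANN implementation: the feature extractor and domain classifier are single linear layers, the usual source-label classification loss is omitted, and all shapes and values are made up. The key move is gradient reversal: the domain classifier descends its loss while the shared extractor ascends the same loss, making the two domains hard to tell apart in feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: source domain and a shifted target domain (unlabeled).
x_src = rng.normal(0.0, 1.0, size=(8, 4))
x_tgt = rng.normal(2.0, 1.0, size=(8, 4))          # different input distribution
x_all = np.concatenate([x_src, x_tgt])
d_all = np.concatenate([np.zeros(8), np.ones(8)])  # 0 = source, 1 = target

W = rng.normal(size=(4, 3)) * 0.1  # shared feature extractor (one linear layer here)
v = rng.normal(size=3) * 0.1       # domain-classifier head

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

lr = 0.05
for _ in range(50):
    f = x_all @ W                 # features for both domains
    p = sigmoid(f @ v)            # predicted probability of "target"
    g = (p - d_all) / len(d_all)  # dBCE/dlogit

    grad_v = f.T @ g                      # classifier gradient
    grad_W = x_all.T @ np.outer(g, v)     # extractor gradient (before reversal)

    v -= lr * grad_v  # classifier DESCENDS the domain loss
    W += lr * grad_W  # gradient REVERSAL: extractor ASCENDS it
```

In a real network the sign flip is implemented as a gradient-reversal layer between the extractor and the domain head, so one backward pass trains both adversaries at once.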
Say you have a dataset consisting of many unlabeled samples but few labeled samples. A good idea would be some sort of pseudo-labeling process (semi-supervised learning). But right now that is really only feasible for simple tasks like classification. For dense tasks like detection or segmentation, it might be a better idea to pre-train the weights using the unlabeled part of the data. But how? Masked image modeling provides a method.
Let’s say that…
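The masked-image-modeling recipe can be sketched as follows. This is a hedged sketch, not any particular paper's implementation: the 75% mask ratio is an MAE-style assumption, and the per-image-mean `pred` is a stub standing in for a real encoder/decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

img = rng.random((32, 32))  # toy grayscale image
patch = 8
n = 32 // patch             # 4 x 4 grid of patches

# Split the image into flattened patches ("tokens").
patches = img.reshape(n, patch, n, patch).transpose(0, 2, 1, 3).reshape(n * n, -1)

# Mask a random 75% of the patches (assumed ratio).
mask = np.zeros(n * n, dtype=bool)
mask[rng.choice(n * n, size=int(0.75 * n * n), replace=False)] = True

inp = patches.copy()
inp[mask] = 0.0  # masked tokens zeroed out (placeholder embedding)

# A real model would predict the masked patches from the visible ones;
# here a stub predicts the mean of the visible patches.
pred = np.tile(inp[~mask].mean(axis=0), (mask.sum(), 1))

# The reconstruction loss is computed ONLY on the masked patches.
loss = np.mean((pred - patches[mask]) ** 2)
```

Pre-training on this reconstruction objective uses every unlabeled image; the resulting encoder weights then initialize the downstream detection or segmentation model.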
Instead of classifying images with a black-box neural network trained on manual labels, one newer approach is to leverage free-text annotations. This is common in clinical workflows, where text reports accompany images naturally. One can then embed both the image and its text annotation into a shared high-dimensional space and use contrastive learning: matched image-text pairs are positives, mismatched pairs are negatives.
This method is described in the paper “Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning”.
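The contrastive setup can be sketched as a symmetric InfoNCE loss over paired embeddings. A minimal sketch under stated assumptions: the random vectors below stand in for real image- and text-encoder outputs, and the 0.07 temperature is a common choice in contrastive learning, not a value taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Stand-ins for encoder outputs over 4 paired (image, report) examples;
# the text embedding is a noisy copy of its image partner, so pairs correlate.
img_emb = l2norm(rng.normal(size=(4, 16)))
txt_emb = l2norm(img_emb + 0.1 * rng.normal(size=(4, 16)))

temperature = 0.07
logits = img_emb @ txt_emb.T / temperature  # pairwise cosine similarities

def cross_entropy(logits, targets):
    z = logits - logits.max(axis=1, keepdims=True)  # stabilized log-softmax
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(targets)), targets].mean()

# Symmetric InfoNCE: the matched (i, i) pairs are positives, all others negatives.
targets = np.arange(4)
loss = 0.5 * (cross_entropy(logits, targets) + cross_entropy(logits.T, targets))
```

At inference time, classification falls out for free: embed the image, embed candidate text prompts (e.g. “pneumonia” vs. “no pneumonia”), and pick the closest, with no class-labeled training images needed.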