snippets

🔌 Toolbox of short, reusable pieces of code and knowledge.


Toolbox of ML Techniques

In this snippet thread, I’ll be posting a bunch of ML techniques.

Domain adaptation

Same goal as transfer learning, but with an explicit feature-alignment step to make it work better. Essentially: pre-train on a large dataset (the source domain) and fine-tune your architecture on the target distribution. But… first, learn a feature extractor network (via ML too!… domain-adversarial training) that maps both domains into the same feature distribution, so the downstream head can't tell which domain a sample came from.

(Figure: domain-adversarial training setup, with a shared feature extractor feeding both a label classifier and a domain classifier. Image source.)
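The core trick in domain-adversarial training is a gradient reversal layer: it is the identity on the forward pass, but on the backward pass it flips (and scales) the gradient flowing from the domain classifier into the feature extractor, so the extractor learns domain-*invariant* features. A minimal numpy sketch of just that backward rule (the scaling factor `lam` is a standard hyperparameter; everything here is a toy stand-in, not a full training loop):

```python
import numpy as np

def grad_reversal_forward(x):
    # Forward pass is the identity: features pass through unchanged.
    return x

def grad_reversal_backward(grad, lam=1.0):
    # Backward pass multiplies the incoming gradient by -lam, so the
    # feature extractor ascends the domain classifier's loss (domain
    # confusion) while the domain classifier itself still descends it.
    return -lam * grad

# Toy check: a gradient that would make features more domain-separable
# gets flipped before reaching the feature extractor.
g = np.array([0.5, -0.2])
flipped = grad_reversal_backward(g, lam=0.1)
```

In a real framework you would implement this as a custom autograd op (e.g. a `torch.autograd.Function` in PyTorch) sitting between the feature extractor and the domain classifier.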

Good references

Masked image modeling for self-supervised pre-training

Say you have a dataset consisting of many unlabeled samples but few labeled samples. A good idea would be some sort of pseudo-labeling process (semi-supervised learning). But right now that is really only feasible for simple tasks like classification. For dense tasks like detection or segmentation, it might be a better idea to pre-train the weights on the unlabeled part of the data. But how? Masked image modeling provides a method: mask out random patches of each unlabeled image, train the network to reconstruct the missing patches, and then fine-tune the resulting weights on the small labeled subset.
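A minimal sketch of the masking step, assuming square images whose side is divisible by the patch size (patch size and mask ratio here are arbitrary toy choices; real setups like MAE mask ~75% of 16×16 patches):

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(img, patch=4, mask_ratio=0.5):
    # img: (H, W) array with H, W divisible by `patch`.
    # Returns the image with a random subset of patches zeroed out,
    # plus the boolean patch-grid mask. The self-supervised objective
    # is then to reconstruct the original pixels inside masked patches.
    H, W = img.shape
    gh, gw = H // patch, W // patch
    n_patches = gh * gw
    mask_flat = np.zeros(n_patches, dtype=bool)
    chosen = rng.choice(n_patches, int(n_patches * mask_ratio), replace=False)
    mask_flat[chosen] = True
    mask = mask_flat.reshape(gh, gw)

    masked = img.copy()
    pixel_mask = np.kron(mask, np.ones((patch, patch), dtype=bool))
    masked[pixel_mask] = 0.0
    return masked, mask

img = rng.standard_normal((8, 8))
masked, mask = mask_patches(img)
# Reconstruction loss would be e.g. MSE between the model's prediction
# and `img`, computed only over the masked pixel positions.
```

The pre-trained encoder weights are then reused for the downstream dense task (detection, segmentation) with a fresh task head.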

Good references

Deep metric learning

Let’s say that…

References:

Image classification using CLIP

Instead of classifying images with a traditional black-box neural network trained on hard labels, one newer approach is to leverage NLP annotations. This is common in clinical workflows, where free-text annotations (e.g., radiology reports) are produced naturally. One can embed both the image and its text annotation into a shared high-dimensional space and train with a contrastive objective so that matching image–text pairs score higher than mismatched ones. At inference time, an image is classified by comparing its embedding against text embeddings of candidate class descriptions.

This method is described in the paper “Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning”.
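The inference step can be sketched in a few lines. Assuming you already have an image embedding and one text embedding per candidate class (the toy vectors below are hypothetical stand-ins for real CLIP encoder outputs), classification is just a cosine-similarity argmax:

```python
import numpy as np

def classify_with_embeddings(image_emb, class_text_embs):
    # CLIP-style zero-shot classification: L2-normalize the image
    # embedding and each class's text embedding, then pick the class
    # whose text embedding has the highest cosine similarity.
    img = image_emb / np.linalg.norm(image_emb)
    txt = class_text_embs / np.linalg.norm(class_text_embs, axis=1, keepdims=True)
    sims = txt @ img                    # one cosine similarity per class
    return int(np.argmax(sims)), sims

# Toy 3-d embeddings; row 0 ~ "normal chest X-ray", row 1 ~ "pneumonia".
image_emb = np.array([1.0, 0.0, 0.2])
class_text_embs = np.array([
    [0.9, 0.1, 0.1],
    [0.0, 1.0, 0.0],
])
pred, sims = classify_with_embeddings(image_emb, class_text_embs)
```

Note that no labeled images are needed for this step; the classes are specified purely through the text prompts.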