ICML 2019 Meta-Learning Tutorial (Video part 1)

Types

There are three common approaches to meta-learning: metric-based, model-based, and optimization-based. Lilian Weng’s Meta-Learning: Learning to Learn Fast covers them in great detail. As Finn states in Learning to Learn (The Berkeley Artificial Intelligence Research Blog), there are two optimizations at play: the learner, which learns new tasks, and the meta-learner, which trains the learner. In Finn's wording, methods for meta-learning have typically fallen into one of three categories: recurrent models, metric learning, and learning optimizers. These correspond to the model-based, metric-based, and optimization-based approaches, respectively.
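The two nested optimizations can be made concrete with a toy sketch. Everything here is an assumption for illustration: the tasks (fitting `y = slope * x`), the constants, and the function names are hypothetical, and the outer update is a Reptile-style first-order rule chosen only because it is short, not because the sources above prescribe it.

```python
# Toy setup (all values hypothetical): each task is "fit y = slope * x".
TASK_SLOPES = [1.0, 2.0, 3.0]
XS = [1.0, 2.0]

def task_gradient(w, slope):
    # Gradient of the mean squared error of y_hat = w * x against y = slope * x.
    return sum(2.0 * (w * x - slope * x) * x for x in XS) / len(XS)

def learner(w_init, slope, steps=3, inner_lr=0.05):
    # Inner optimization: adapt to one task, starting from the shared initialization.
    w = w_init
    for _ in range(steps):
        w -= inner_lr * task_gradient(w, slope)
    return w

def meta_learner(epochs=20, meta_lr=0.5):
    # Outer optimization: nudge the shared initialization toward each adapted solution
    # (Reptile-style first-order update, an assumption made for brevity).
    w_init = 0.0
    for _ in range(epochs):
        for slope in TASK_SLOPES:
            adapted = learner(w_init, slope)
            w_init += meta_lr * (adapted - w_init)
    return w_init

meta_w = meta_learner()
adapted_from_meta = learner(meta_w, slope=3.0)
adapted_from_zero = learner(0.0, slope=3.0)
```

With the same three inner steps, `adapted_from_meta` ends up closer to the task's slope than `adapted_from_zero`, which is the sense in which the meta-learner trains the learner.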

Model-based, e.g. recurrent models

The meta-learner uses gradient descent, whereas the learner simply rolls out the recurrent network. This is one of the most general approaches and has been used for few-shot classification, regression, and meta-reinforcement learning. Because of this flexibility, it also tends to be less (meta-)efficient than other methods: the learner network needs to come up with its learning strategy from scratch.
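To illustrate what "rolling out" means for the learner, here is a minimal sketch with a scalar recurrent cell. The cell, the weight values, and the names `rnn_step` and `rollout` are hypothetical; in a real system the weights would be produced by the meta-learner's gradient descent, which is not shown.

```python
import math

def rnn_step(hidden, x, prev_label, weights):
    # One recurrent update: the hidden state absorbs the current input
    # together with the label of the previous example.
    w_h, w_x, w_y = weights
    return math.tanh(w_h * hidden + w_x * x + w_y * prev_label)

def rollout(support, query_x, weights, w_out):
    # The learner: no gradient steps, only a forward pass over the task's examples.
    hidden, prev_label = 0.0, 0.0
    for x, y in support:
        hidden = rnn_step(hidden, x, prev_label, weights)
        prev_label = y
    hidden = rnn_step(hidden, query_x, prev_label, weights)
    return w_out * hidden

# Placeholder weights standing in for meta-learned parameters (assumption).
weights, w_out = (0.5, 1.0, 1.0), 1.0
pred_a = rollout([(0.1, 1.0), (0.2, 1.0)], 0.3, weights, w_out)
pred_b = rollout([(0.1, -1.0), (0.2, -1.0)], 0.3, weights, w_out)
```

The weights are identical in both calls, yet the predictions differ because the hidden state carries the task's examples. All adaptation happens in the activations, so the network itself has to encode the learning strategy.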

Metric-based

Meta-learning is performed using gradient descent, whereas the learner corresponds to a comparison scheme, e.g. nearest neighbors, in the meta-learned metric space. These approaches work quite well for few-shot classification, though they have yet to be demonstrated in other meta-learning domains such as regression or reinforcement learning.
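As a rough sketch of the learner side, the snippet below classifies a query by its nearest neighbor in an embedding space. The `embed` function is a hypothetical stand-in for the meta-learned metric (a fixed per-feature scaling rather than a trained network), and the data points are made up.

```python
import math

def embed(x, scale):
    # Hypothetical stand-in for a meta-learned embedding: per-feature scaling.
    return [s * v for s, v in zip(scale, x)]

def nearest_neighbor(query, support, scale):
    # The learner: compare the query with every support example in embedding space
    # and return the label of the closest one.
    z = embed(query, scale)
    _, label = min(support, key=lambda pair: math.dist(z, embed(pair[0], scale)))
    return label

# One labelled example per class (a 2-way, 1-shot task).
support = [([0.0, 4.0], "a"), ([1.0, 0.0], "b")]
query = [0.8, 4.0]

raw_prediction = nearest_neighbor(query, support, scale=[1.0, 1.0])
learned_prediction = nearest_neighbor(query, support, scale=[1.0, 0.1])
```

The same comparison scheme gives `"a"` under the raw metric and `"b"` once the second feature is down-weighted, so the prediction depends entirely on the metric. Shaping that metric is the meta-learner's job.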

Motivations

Human motivation

“Humans have a remarkable ability to quickly grasp new concepts from a very small number of examples or a limited amount of experience, leveraging prior knowledge and context.”

Comparison to supervised DL

Goal

Two views of meta-learning

Applying Probabilistic view to existing algos

Perhaps like ML-PIP (Jonathan Gordon et al., 2019)?

Terminology

Datasets

Meta-learning vs multitask learning, transfer learning

Furthermore, transfer learning and multi-task learning typically have large dataset sizes for each task, whereas meta-learning has small task-specific datasets. Meta-learning also tends to have a larger number of tasks.


Gordon, J., Bronskill, J., Bauer, M., Nowozin, S., & Turner, R., Meta-learning probabilistic inference for prediction, In International Conference on Learning Representations (2019).

Wei, Y., Zhang, Y., Huang, J., & Yang, Q., Transfer learning via learning to transfer, In J. Dy, & A. Krause, Proceedings of the 35th International Conference on Machine Learning (pp. 5085–5094) (2018). Stockholmsmässan, Stockholm, Sweden: PMLR.