TY - JOUR
T1 - Learning to Learn Task-Adaptive Hyperparameters for Few-Shot Learning
AU - Baik, Sungyong
AU - Choi, Myungsub
AU - Choi, Janghoon
AU - Kim, Heewon
AU - Lee, Kyoung Mu
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2024/3/1
Y1 - 2024/3/1
N2 - The objective of few-shot learning is to design a system that can adapt to a given task with only a few examples while achieving generalization. Model-agnostic meta-learning (MAML), which has recently gained popularity for its simplicity and flexibility, learns a good initialization for fast adaptation to a task in the few-data regime. However, its performance has been relatively limited, especially when novel tasks differ from the tasks seen during training. In this work, instead of searching for a better initialization, we focus on designing a better fast adaptation process. Consequently, we propose a new task-adaptive weight update rule that greatly enhances the fast adaptation process. Specifically, we introduce a small meta-network that can generate per-step hyperparameters for each given task: learning rate and weight decay coefficients. The experimental results validate that learning a good weight update rule for fast adaptation is an equally important component that has drawn relatively little attention in recent few-shot learning approaches. Surprisingly, fast adaptation from random initialization with ALFA can already outperform MAML. Furthermore, the proposed weight update rule is shown to consistently improve the task-adaptation capability of MAML across diverse problem domains: few-shot classification, cross-domain few-shot classification, regression, visual tracking, and video frame interpolation.
AB - The objective of few-shot learning is to design a system that can adapt to a given task with only a few examples while achieving generalization. Model-agnostic meta-learning (MAML), which has recently gained popularity for its simplicity and flexibility, learns a good initialization for fast adaptation to a task in the few-data regime. However, its performance has been relatively limited, especially when novel tasks differ from the tasks seen during training. In this work, instead of searching for a better initialization, we focus on designing a better fast adaptation process. Consequently, we propose a new task-adaptive weight update rule that greatly enhances the fast adaptation process. Specifically, we introduce a small meta-network that can generate per-step hyperparameters for each given task: learning rate and weight decay coefficients. The experimental results validate that learning a good weight update rule for fast adaptation is an equally important component that has drawn relatively little attention in recent few-shot learning approaches. Surprisingly, fast adaptation from random initialization with ALFA can already outperform MAML. Furthermore, the proposed weight update rule is shown to consistently improve the task-adaptation capability of MAML across diverse problem domains: few-shot classification, cross-domain few-shot classification, regression, visual tracking, and video frame interpolation.
KW - Few-shot learning
KW - MAML
KW - Meta-learning
KW - Video frame interpolation
KW - Visual tracking
UR - http://www.scopus.com/inward/record.url?scp=85151570545&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2023.3261387
DO - 10.1109/TPAMI.2023.3261387
M3 - Article
C2 - 37030677
AN - SCOPUS:85151570545
SN - 0162-8828
VL - 46
SP - 1441
EP - 1454
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
ER -