
Heuristic Weight Initialization for Transfer Learning in Classification Problems

  • Kyungpook National University

Research output: Contribution to journal › Article › peer-review

Abstract

Transfer learning is the predominant method for adapting deep neural network models pre-trained on one task to new domains: the internal architecture is preserved and the model is augmented with the requisite new layers. Training intricate pre-trained models on a sizable dataset requires significant resources, including careful hyperparameter fine-tuning. Most existing initialization methods focus on gradient-flow problems, such as vanishing or exploding gradients, while other approaches require extra models and do not address our more practical setting. To address these problems, we propose gradient-free heuristic methods that initialize the weights of the newly added final fully connected layer from a small set of training data with fewer classes. The approach partitions the output values that the pre-trained model produces for this small set into two separate intervals determined by the targets. This partitioning is framed as an optimization problem for each output neuron and class. The optimization selects the highest values as weights, taking into account their direction towards the respective classes. In 145 empirical experiments covering a variety of neural network models across multiple benchmarks and domains, the method occasionally yields accuracies comparable to those achieved with gradient descent while using only small subsets.
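
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that "partitioning into two intervals" can be read as a per-feature threshold search, and that the "highest value" is the largest-magnitude separation score, signed by the side on which the target class falls. The function names (split_score, init_final_layer, predict), the scoring rule, and the bias term are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def split_score(values: Sequence[float], is_target: Sequence[bool]) -> Tuple[float, float]:
    """Search for the threshold that best splits `values` into two intervals.

    Returns (score, threshold). The score lies in [-1, 1]: positive when the
    target class mostly lies above the threshold, negative when below.
    This scoring rule is an assumption, not taken from the paper.
    """
    n_pos = sum(is_target)
    n_neg = len(values) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0, 0.0  # class absent or alone: no usable split
    distinct = sorted(set(values))
    best_score, best_thr = 0.0, 0.0
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2.0
        pos_above = sum(1 for v, t in zip(values, is_target) if t and v > thr)
        neg_above = sum(1 for v, t in zip(values, is_target) if not t and v > thr)
        # Rate of targets above the threshold minus rate of non-targets above it.
        score = pos_above / n_pos - neg_above / n_neg
        if abs(score) > abs(best_score):
            best_score, best_thr = score, thr
    return best_score, best_thr


def init_final_layer(
    features: Sequence[Sequence[float]], labels: Sequence[int], n_classes: int
) -> Tuple[List[List[float]], List[float]]:
    """Gradient-free initialization of a final fully connected layer.

    `features` are outputs of a frozen pre-trained model on a small subset.
    One threshold search is run per (class, feature) pair.
    """
    n_features = len(features[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for c in range(n_classes):
        is_target = [y == c for y in labels]
        for j in range(n_features):
            column = [row[j] for row in features]
            score, thr = split_score(column, is_target)
            weights[c][j] = score
            # Hypothetical bias: centres each feature on its threshold.
            biases[c] -= score * thr
    return weights, biases


def predict(
    features: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[int]:
    """Apply the initialized layer and return the arg-max class per sample."""
    preds = []
    for row in features:
        logits = [sum(w * x for w, x in zip(w_c, row)) + b
                  for w_c, b in zip(weights, biases)]
        preds.append(max(range(len(logits)), key=logits.__getitem__))
    return preds
```

In practice the resulting weights would be copied into the new layer of the network before any optional gradient-based fine-tuning. The sketch uses plain Python lists to stay dependency-free; an array library would be the natural choice at realistic feature counts.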

Original language: English
Pages (from-to): 4155-4171
Number of pages: 17
Journal: Computers, Materials and Continua
Volume: 85
Issue number: 2
State: Published - 2025

Keywords

  • Transfer learning
  • gradient descent
  • gradient free
  • heuristics
