Abstract
Recent advances in large language models (LLMs) have produced impressive fluency, yet their application to specialized scientific domains such as wood science remains limited. This study introduces WoodLLaMA, a domain-specific LLM fine-tuned on metadata from 16,929 wood science research articles, and examines the effects of fine-tuning and retrieval-augmented generation (RAG) on model performance. Evaluation used two datasets not included in the training data: a Journal question–answer (QA) set representing domain-specific expertise and a Wood Handbook QA set reflecting fundamental wood science knowledge. Evaluation with an intrinsic metric (perplexity) and QA-based metrics (cosine similarity, keyword matching, and BERTScore), along with qualitative case studies, showed that fine-tuning enhanced linguistic fluency while RAG improved semantic alignment. Combining fine-tuning and RAG yielded the most robust and consistent performance. These results demonstrate the complementary value of fine-tuning and RAG for building domain-specific LLMs. The study offers a methodological framework for LLM evaluation and identifies future directions (leveraging full-text data, enabling multilingual support, integrating multimodal resources, and incorporating human-in-the-loop learning methods) for improving WoodLLaMA's performance and broadening its applicability across a diverse range of domains.
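To make the metrics named in the abstract concrete, the following is a minimal sketch of textbook definitions of perplexity, cosine similarity, and keyword matching. It is not the authors' implementation: the function names, the simple tokenizer, the use of bag-of-words counts in place of sentence embeddings, and the example sentences are all illustrative assumptions. BERTScore is omitted because it requires a pretrained model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities: exp(-mean log p)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors.

    Assumption: word counts stand in for the sentence embeddings that a
    real evaluation would more likely use.
    """
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_match(answer, keywords):
    """Fraction of reference keywords that appear as tokens in the answer."""
    if not keywords:
        return 0.0
    tokens = set(tokenize(answer))
    hits = sum(1 for k in keywords if k.lower() in tokens)
    return hits / len(keywords)


if __name__ == "__main__":
    # Hypothetical reference/generated answer pair, for illustration only.
    reference = "Wood shrinks as moisture content falls below the fiber saturation point."
    generated = "Below the fiber saturation point, wood shrinks as it loses moisture."
    print("perplexity:", perplexity([math.log(0.25)] * 4))
    print("cosine similarity:", cosine_similarity(reference, generated))
    print("keyword match:", keyword_match(generated, ["shrinks", "moisture", "fiber", "density"]))
```

Lower perplexity indicates that the model assigns higher probability to the reference text, which is why it is read as a fluency measure, whereas the similarity and keyword scores compare a generated answer against a reference answer.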
| Original language | English |
|---|---|
| Article number | 13 |
| Journal | Journal of Wood Science |
| Volume | 72 |
| Issue number | 1 |
| State | Published - Dec 2026 |
Keywords
- Domain-specific modeling
- Fine-tuning
- Large language model
- Retrieval-augmented generation
- Wood science
- WoodLLaMA
Title
Evaluating fine-tuning and retrieval-augmented generation for domain-specific language modeling in wood science