MEOW: A Metadata-Driven, End-to-End LLM Framework for Academic Survey Outline Generation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)
We propose MEOW, the first metadata-driven, end-to-end LLM framework for academic survey outline generation. Our pipeline combines supervised fine-tuning (SFT) with reinforcement learning (GRPO) under the RLHF paradigm and uses Chain-of-Thought annotations to guide structured taxonomy reasoning. We curate more than 20k surveys from arXiv, bioRxiv, and medRxiv and design multi-dimensional metrics (structure, content, pragmatics) for outline evaluation. We optimize Qwen3-8B (full fine-tuning and LoRA) with reward functions that target structural distance and format compliance. On our benchmark, MEOW achieves state-of-the-art performance, surpassing baselines such as SurveyX and outperforming strong LLMs (e.g., GPT-5 Nano, DeepSeek-R1).
- LLM
- SFT + GRPO (RLHF)
- Qwen3-8B
- vLLM
- Survey Generation
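
To make the reward design concrete, here is a minimal Python sketch of how a reward that targets structural distance and format compliance could be scored and then normalized within a GRPO sampling group. It is an illustration under stated assumptions, not the MEOW implementation: the numbered-heading format, the edit distance over heading depths, the 0.7/0.3 weights, and all function names are hypothetical choices made for this example.

```python
import re
from statistics import mean, pstdev

# Assumed outline format: numbered headings such as "1. Intro" or "2.1 Data".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+\S")


def heading_depths(outline: str) -> list[int]:
    """Return the nesting depth of every numbered heading, in order."""
    depths = []
    for line in outline.splitlines():
        match = HEADING.match(line.strip())
        if match:
            depths.append(match.group(1).count(".") + 1)
    return depths


def format_compliance(outline: str) -> float:
    """Fraction of non-empty lines that look like numbered headings."""
    lines = [line.strip() for line in outline.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return sum(bool(HEADING.match(line)) for line in lines) / len(lines)


def edit_distance(a: list[int], b: list[int]) -> int:
    """Levenshtein distance between two depth sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def structural_similarity(pred: str, ref: str) -> float:
    """1 minus the length-normalized edit distance (a stand-in metric)."""
    p, r = heading_depths(pred), heading_depths(ref)
    longest = max(len(p), len(r))
    if longest == 0:
        return 0.0
    return 1.0 - edit_distance(p, r) / longest


def outline_reward(pred: str, ref: str,
                   w_struct: float = 0.7, w_format: float = 0.3) -> float:
    """Weighted sum of the two terms; the weights are illustrative only."""
    return (w_struct * structural_similarity(pred, ref)
            + w_format * format_compliance(pred))


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style normalization: standardize rewards within one sampled group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]
```

In this sketch, each outline sampled for the same prompt would be scored with `outline_reward` against a reference outline, and the resulting scores passed to `group_relative_advantages` to obtain per-sample advantages. The metrics and rewards actually used in the paper may differ.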