PhD Student, University of Michigan, Ann Arbor, Michigan, United States
Purpose: Limited availability of labeled brain MRI, particularly for less common tumor subtypes, constrains development and validation of neuroimaging AI tools. While conditional Generative Adversarial Networks (GANs) are used for medical image synthesis, their traditional CNN-only discriminators emphasize local texture and often fail to enforce long-range anatomic consistency, leading to incoherent samples and unstable training under data scarcity. We develop and evaluate a class-conditional GAN framework that improves global anatomic realism in synthetic brain MRI by combining convolutional and transformer-based discrimination with progressive transfer learning.
Methods/Materials: We propose a conditional Hybrid-cGAN with a dual-branch discriminator that integrates a spectrally normalized CNN for local texture modeling and a pretrained Vision Transformer (ViT-Base/16) for global context. ViT layers were progressively unfrozen when validation metrics plateaued, to stabilize training dynamics. Experiments were conducted on 3561 axial 2D slices from a four-class public brain MRI dataset (glioma, meningioma, pituitary tumor, no-tumor), split 70/15/15 into training, validation, and test sets. Generation quality was evaluated using Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and multi-scale structural similarity (MS-SSIM), alongside an auxiliary tumor classifier to assess class-specific fidelity.
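The metric-guided unfreezing rule can be illustrated with a minimal sketch. It is not the implementation used in this work: the class name `PlateauUnfreezer`, the `patience` and `min_delta` thresholds, the choice of validation FID as the monitored metric, and the grouping of ViT blocks into stages are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PlateauUnfreezer:
    """Hypothetical scheduler: release one more group of ViT blocks
    each time the monitored validation metric (here FID) plateaus."""
    stages: list             # groups of block indices, in the order they are unfrozen (assumed)
    patience: int = 3        # evaluations without improvement before unfreezing (assumed)
    min_delta: float = 1.0   # smallest FID decrease counted as an improvement (assumed)
    best: float = float("inf")
    stale: int = 0
    next_stage: int = 0
    unfrozen: list = field(default_factory=list)

    def step(self, val_fid):
        """Record one validation FID; return the newly unfrozen group, or None."""
        if val_fid < self.best - self.min_delta:
            # Still improving: remember the new best and reset the plateau counter.
            self.best = val_fid
            self.stale = 0
            return None
        self.stale += 1
        if self.stale >= self.patience and self.next_stage < len(self.stages):
            # Plateau reached and stages remain: unfreeze the next group.
            group = self.stages[self.next_stage]
            self.next_stage += 1
            self.stale = 0
            self.unfrozen.extend(group)
            return group
        return None
```

In a PyTorch training loop, the group returned by `step` would typically be acted on by setting `requires_grad = True` on the parameters of the corresponding ViT blocks; that wiring is omitted here so the sketch stays dependency-free.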
Results: Progressive ViT unfreezing produced stepwise improvements. Validation FID decreased from >380 in early training to 153 after unfreezing later ViT blocks, KID fell from ≈0.34 to 0.09, and MS-SSIM remained stable at ≈0.90. On the test set, the best-FID checkpoint achieved FID = 149.8, KID = 0.086, and MS-SSIM = 0.906, while macro-averaged precision and recall of the auxiliary classifier remained stable, indicating preserved class-conditional structure without mode collapse.
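For readers unfamiliar with macro-averaging, the sketch below shows how macro-averaged precision and recall are computed over the four classes. It assumes, for illustration only, that `y_true` holds the class label each synthetic image was conditioned on and `y_pred` holds the auxiliary classifier's prediction for that image; the function name and label strings are hypothetical.

```python
def macro_precision_recall(y_true, y_pred, classes):
    """Return (macro precision, macro recall): the unweighted mean of the
    per-class values, so every class counts equally regardless of size."""
    precisions, recalls = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # correctly assigned to c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # wrongly assigned to c
        fn = sum(1 for t, p in pairs if t == c and p != c)  # missed members of c
        # A class with no predictions (or no members) contributes 0.0 by convention here.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n
```

Because each class is weighted equally, a generator that collapsed onto one or two tumor types would pull the macro averages down even if overall accuracy stayed high, which is why these scores are a useful check on class-conditional fidelity.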
Conclusions: A hybrid CNN-ViT discriminator with metric-guided progressive unfreezing improves training stability and global anatomic coherence in class-conditional brain MRI synthesis. This approach may support synthetic data generation for neuroimaging AI development, particularly in low-data or restricted-sharing settings relevant to clinical radiology.