Validates and curates fine-tuning datasets by detecting duplicates, class imbalance, and quality issues.
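
As an illustration only, the three checks named above could be sketched as follows. This is not the tool's actual implementation: the function name `audit_dataset`, the `text` and `label` record keys, the `min_chars` threshold, and the choice of exact-match deduplication after case and whitespace normalization are all assumptions made for this example.

```python
from collections import Counter


def audit_dataset(records, min_chars=5):
    """Hypothetical helper: report duplicates, class imbalance, and short texts.

    Assumes each record is a dict with "text" and "label" keys.
    """
    seen = {}                 # normalized text -> index of its first occurrence
    duplicates = []           # (index, index_of_first_occurrence) pairs
    too_short = []            # indices whose normalized text is under min_chars
    label_counts = Counter()  # number of records per label

    for i, rec in enumerate(records):
        # Lowercase and collapse whitespace so trivially different copies match.
        text = " ".join(rec["text"].lower().split())
        if text in seen:
            duplicates.append((i, seen[text]))
        else:
            seen[text] = i
        if len(text) < min_chars:
            too_short.append(i)
        label_counts[rec["label"]] += 1

    # Imbalance ratio: largest class count divided by smallest class count.
    if label_counts:
        ratio = max(label_counts.values()) / min(label_counts.values())
    else:
        ratio = 1.0

    return {
        "duplicates": duplicates,
        "too_short": too_short,
        "label_counts": dict(label_counts),
        "imbalance_ratio": ratio,
    }


if __name__ == "__main__":
    sample = [
        {"text": "Great product", "label": "pos"},
        {"text": "great   product", "label": "pos"},
        {"text": "Bad", "label": "neg"},
        {"text": "Works fine", "label": "pos"},
    ]
    print(audit_dataset(sample))
```

Exact matching after normalization only catches near-verbatim copies; detecting paraphrased or near-duplicate examples would need a similarity measure, which this sketch leaves out.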