Non-contact acoustic screening for sleep apnea: a subject-aware deep learning approach


Aygün Çakıroğlu M., KIZILKAYA AYDOĞAN E., Bolattürk Ö. F., Aydoğan S., İSMAİLOĞULLARI S., DELİCE Y.

Sleep and Breathing, vol.30, no.1, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 30 Issue: 1
  • Publication Date: 2026
  • DOI Number: 10.1007/s11325-026-03594-2
  • Journal Name: Sleep and Breathing
  • Indexed in: Science Citation Index Expanded (SCI-EXPANDED), Scopus, EMBASE, MEDLINE
  • Keywords: Contactless (non-contact) audio, Deep learning, Model calibration and explainability, Polysomnography, Sleep apnea
  • Affiliated with Hatay Mustafa Kemal Üniversitesi: Yes

Abstract

Purpose: To explore the feasibility of using camera-derived, non-contact audio synchronized with polysomnography (PSG) for clinically relevant sleep-apnea classification, and to benchmark compact deep models under a subject-aware design on a previously unstudied, real-world dataset.

Methods: Thirty-two adults underwent simultaneous PSG and camera-based non-contact audio recording. The synchronized audio segments were used to train and compare three compact deep-learning architectures (convolutional, attention-augmented, and transformer-based) under a subject-aware evaluation design that prevented identity leakage. Model performance and calibration were assessed at both the segment and subject levels using standard statistical tests.

Results: Subject-level evaluation was based on a very small, imbalanced test set of six subjects (one positive). Within this limited yet previously unstudied local dataset, the CNN_trans model achieved apparently perfect ranking performance (AUC = 1.00; 95% CI 0.00–1.00), with recall = 1.00 and precision = 0.55, although this likely reflects the small, imbalanced test cohort. The wide confidence interval reflects substantial statistical uncertainty, and DeLong comparisons showed no significant AUC difference between CNN_trans and CNN_att (ΔAUC = −0.042; p = 0.43).

Conclusion: PSG-synchronized, non-contact audio supports accurate, well-calibrated sleep-apnea classification with compact deep models. The subject-aware evaluation suggests that contactless acoustic monitoring may be clinically relevant, motivating larger, multi-site validation.
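The subject-aware evaluation described in Methods can be illustrated with a short sketch. This is not the authors' code: the arrays `segment_features`, `segment_labels`, `subject_ids`, and the mean-score aggregation rule are hypothetical placeholders. It uses scikit-learn's GroupShuffleSplit so that no subject contributes segments to both the training and test folds, then pools segment scores to a subject-level score.

```python
# Minimal sketch of a subject-aware split with subject-level aggregation.
# NOT the authors' implementation: data, feature dimensions, and the mean-score
# aggregation rule are all hypothetical placeholders.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical data: one feature row per synchronized audio segment, with the
# recording subject's ID attached to every segment (32 subjects, as in the study).
n_subjects, n_segments, n_features = 32, 2000, 64
subject_label = rng.integers(0, 2, size=n_subjects)        # per-subject apnea status
subject_ids = rng.integers(0, n_subjects, size=n_segments)  # which subject each segment came from
segment_features = rng.normal(size=(n_segments, n_features))
segment_labels = subject_label[subject_ids]                 # segments inherit the subject label

# Subject-aware split: every segment of a given subject lands entirely in either
# the training fold or the test fold, preventing identity leakage between folds.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(segment_features, segment_labels, groups=subject_ids))
assert set(subject_ids[train_idx]).isdisjoint(set(subject_ids[test_idx]))

# A segment-level classifier (convolutional, attention-augmented, or transformer-based)
# would be trained on the training fold here; random scores stand in for model outputs.
segment_scores = rng.uniform(size=len(test_idx))

# Subject-level evaluation: pool each test subject's segment scores (mean here)
# and score the pooled value against that subject's label.
subject_scores, subject_true = [], []
for sid in np.unique(subject_ids[test_idx]):
    mask = subject_ids[test_idx] == sid
    subject_scores.append(segment_scores[mask].mean())
    subject_true.append(subject_label[sid])

if len(set(subject_true)) > 1:  # AUC is undefined on a single-class test set
    print("Subject-level AUC:", roc_auc_score(subject_true, subject_scores))
```

Mean-pooling segment scores into a subject-level score is only one possible aggregation rule; the abstract does not state which rule the study used, nor how calibration or the DeLong comparisons were computed.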