ASRM 2018 · Poster presentation

High-accuracy machine-learning predictive model for embryo selection in IVF PGT cycles with single embryo transfers

O. O. Barash, K. A. Ivani, S. P. Willman, M. D. Hinckley, M. V. Homer, L. N. Weckstein

Reproductive Science Center of the San Francisco Bay Area, USA

ASRM 2018 Annual Scientific Congress

1,828 IVF PGT cycles (1,363 SETs · 1,017 patients)

9,501 embryos analyzed (5.07 ± 3.53 per cycle)

Ensemble ROC AUC 0.841 (LogLoss 0.536 · 73 base models)

6,470 synthetic features (from 320 originals · 357 selected)

Background

Why this study

Recent publications report significantly improved IVF outcomes when preimplantation genetic testing (PGT) based on next-generation sequencing (NGS) is used while fewer embryos are transferred. The ultimate goal is to transfer a single euploid embryo in every IVF PGT cycle.

Objective. Build a machine-learning model that accurately predicts the potential of an individual euploid embryo to establish a viable pregnancy in PGT cycles with single embryo transfer (SET).

Materials & Methods

Dataset and learning pipeline

Design. Retrospective study of NGS PGT outcome data; supervised and unsupervised learning to identify differences in clinical pregnancy rate (PR).
Period. January 2013 – July 2018; 1,828 cycles / 9,501 embryos; 1,363 single embryo transfers; mean patient age 37.14 ± 4.11 years.
Features. 175 clinical (age, BMI, AMH, FSH, AFC, stimulation protocol, duration, gonadotropin dose, gravidity, diagnosis…) + 145 morphological / kinetic (eggs retrieved, 2PN, blastulation rate, morphology, euploidy rate, day-5 vs day-6 biopsy, fertilization rate…).
Pipeline. Weight-of-evidence, categorical and cross-validation target encoding → 6,470 synthetic features, 357 selected. 73 base classifiers (GLM, kNN, RF, XGBoost, AdaBoost…) stacked via Generalized Model Stacking. Evaluated by 10-fold cross-validation.
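
As an illustration of the weight-of-evidence step named in the pipeline, here is a minimal sketch, not the authors' implementation. The toy biopsy-day data, the function name `woe_encode`, and the additive smoothing constant are all assumptions made for the example; the poster does not say how the encoder was configured.

```python
import math
from collections import defaultdict


def woe_encode(categories, outcomes, smoothing=0.5):
    """Map each category level to ln(share of positives / share of negatives).

    `smoothing` is a hypothetical additive constant that keeps the
    logarithm finite when a level has no positives or no negatives.
    """
    pos = defaultdict(float)
    neg = defaultdict(float)
    for cat, y in zip(categories, outcomes):
        if y == 1:
            pos[cat] += 1
        else:
            neg[cat] += 1

    levels = set(pos) | set(neg)
    total_pos = sum(pos.values()) + smoothing * len(levels)
    total_neg = sum(neg.values()) + smoothing * len(levels)

    woe = {}
    for lvl in levels:
        share_pos = (pos[lvl] + smoothing) / total_pos
        share_neg = (neg[lvl] + smoothing) / total_neg
        woe[lvl] = math.log(share_pos / share_neg)
    return woe


# Invented toy data: biopsy day per embryo and a binary pregnancy outcome.
biopsy_day = ["day5"] * 6 + ["day6"] * 4
outcome = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

woe = woe_encode(biopsy_day, outcome)
for level in sorted(woe):
    print(level, round(woe[level], 3))
```

A positive value marks a level that is over-represented among positive outcomes, a negative value one that is over-represented among negative outcomes. In a real pipeline the mapping would be fitted on training folds only, so that the outcome does not leak into the encoded feature.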

Results · Variable importance

History, morphology, and biopsy day dominate the signal

The number of previous biochemical pregnancies, blastocyst morphology, and the day embryos became available for biopsy had the highest variable importance in the ensemble model: 0.82, 0.52, and 0.43, respectively.

Top-3 variable importances from the stacked ensemble.

Results · Ensemble performance

Stacking lifts AUC to 0.84 with calibrated per-cycle probabilities

Per-cycle probability of a positive clinical outcome ranged from 0.1299 to 0.8959 (baseline prediction 0.6598). Predictions from multiple weak learners (generalized logistic regression, random forest, gradient boosting…), combined by Generalized Model Stacking, achieved AUC = 0.8407 and LogLoss = 0.5356.
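
For readers less familiar with the two metrics, the sketch below computes ROC AUC and LogLoss from scratch on a handful of invented predictions. The labels, probabilities, and function names are assumptions for illustration only. The last function reads the reported baseline prediction of 0.6598 as both a constant prediction and the observed positive rate, which is an interpretation, not something the poster states.

```python
import math


def roc_auc(y_true, y_score):
    """Share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the observed binary labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def constant_log_loss(rate):
    """LogLoss of always predicting `rate` when `rate` is also the base rate."""
    return -(rate * math.log(rate) + (1 - rate) * math.log(1 - rate))


# Invented toy predictions (1 = clinical pregnancy, 0 = no pregnancy).
y_true = [1, 1, 1, 0, 1, 0]
y_prob = [0.90, 0.80, 0.55, 0.60, 0.70, 0.13]

auc = roc_auc(y_true, y_prob)      # 7 of 8 pairs ordered correctly
loss = log_loss(y_true, y_prob)
baseline = constant_log_loss(0.6598)

print(f"toy AUC = {auc:.3f}, toy LogLoss = {loss:.3f}")
print(f"constant-prediction LogLoss at 0.6598 = {baseline:.3f}")
```

Under that reading, a model that always predicts 0.6598 would score a LogLoss of about 0.641, which gives a reference point for the reported 0.5356.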

Univariate analysis confirmed the feature ranking. Ongoing PR differed significantly between patients with and without a history of previous biochemical pregnancies: 71.97 % (701/974) vs 46.77 % (174/372), χ² = 75.13, OR = 2.922, 95 % CI 2.282 – 3.741.

Blastocyst morphology also affected ongoing PR after SET: good (AA/AB/BA), fair (BB), and borderline-fair embryos yielded 70.18 % (626/892), 59.44 % (211/355), and 41.38 % (48/116), respectively. The reported test statistics (χ² = 13.279, OR = 1.606, 95 % CI 1.244 – 2.074, p < 0.05) match the 2×2 comparison of good vs fair embryos.
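
The univariate statistics above can be checked directly from the reported counts. The sketch below assumes an uncorrected Pearson χ² on a 2×2 table and a Wald interval on the log odds ratio; the poster does not state which tests were used, and the function name `two_by_two` is hypothetical.

```python
import math


def two_by_two(a, b, c, d, z=1.96):
    """Pearson chi-square, odds ratio and Wald CI for a 2x2 table.

    a, b = successes and failures in group 1
    c, d = successes and failures in group 2
    z    = normal quantile for the interval (1.96 assumed for 95 %)
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return chi2, odds_ratio, low, high


# History of biochemical pregnancy: 701/974 vs 174/372 ongoing pregnancies.
history = two_by_two(701, 974 - 701, 174, 372 - 174)

# Morphology, good vs fair only: 626/892 vs 211/355 ongoing pregnancies.
morphology = two_by_two(626, 892 - 626, 211, 355 - 211)

for name, (chi2, oratio, low, high) in [("history", history),
                                        ("good vs fair", morphology)]:
    print(f"{name}: chi2={chi2:.3f} OR={oratio:.3f} CI={low:.3f}-{high:.3f}")
```

With these assumptions both sets of reported values are reproduced to the printed precision. The morphology figures are recovered from the good and fair groups alone, without the borderline-fair group.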

Table I · Multiple logistic regression

Confirmatory GLM — prior biochemical, biopsy day, morphology

Coeff. = coefficient expressed in logits; CI = 95 % confidence interval for the odds ratio. AIC: 1,637.5. Number of Fisher scoring iterations: 4.

Feature	Coeff. (logit)	Std. err.	p value	Odds ratio	95 % CI (odds ratio)
(Intercept)	4.255	0.809	< 0.001	70.422	14.536 – 346.833
Prior biochemical	-0.664	0.092	< 0.001	0.515	0.429 – 0.615
Biopsy day 6	-0.604	0.143	< 0.001	0.546	0.412 – 0.723
Embryo morph. fair	-0.329	0.141	0.020	0.720	0.546 – 0.950
Borderline-fair embryo	-0.768	0.225	0.001	0.464	0.298 – 0.719
Total day-5 biopsies	-0.052	0.023	0.022	0.949	0.908 – 0.993
Prior full-term	0.323	0.074	< 0.001	1.382	1.199 – 1.600
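
Each odds ratio in Table I is the exponential of its logit coefficient, and the intervals are consistent with a Wald construction, exp(coeff ± 1.96 × std. err.). The sketch below recomputes them from the rounded table values. The 1.96 multiplier and the helper name `odds_ratio_with_ci` are assumptions, and differences in the last digit are expected because the published coefficients are themselves rounded.

```python
import math

# (feature, coefficient in logits, std. err., published OR, published CI)
TABLE_I = [
    ("Prior biochemical",      -0.664, 0.092, 0.515, (0.429, 0.615)),
    ("Biopsy day 6",           -0.604, 0.143, 0.546, (0.412, 0.723)),
    ("Embryo morph. fair",     -0.329, 0.141, 0.720, (0.546, 0.950)),
    ("Borderline-fair embryo", -0.768, 0.225, 0.464, (0.298, 0.719)),
    ("Total day-5 biopsies",   -0.052, 0.023, 0.949, (0.908, 0.993)),
    ("Prior full-term",         0.323, 0.074, 1.382, (1.199, 1.600)),
]


def odds_ratio_with_ci(coef, se, z=1.96):
    """Convert a logit coefficient and its standard error to OR and Wald CI."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))


recomputed = {}
for feature, coef, se, pub_or, pub_ci in TABLE_I:
    oratio, low, high = odds_ratio_with_ci(coef, se)
    recomputed[feature] = (oratio, low, high)
    print(f"{feature:24s} OR={oratio:.3f} (published {pub_or:.3f})  "
          f"CI={low:.3f}-{high:.3f} (published {pub_ci[0]:.3f}-{pub_ci[1]:.3f})")
```

Read this way, an odds ratio below 1 (for example 0.546 for a day-6 biopsy) means lower odds of the modeled outcome per unit increase in that feature, holding the other features fixed.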

Conclusion

Summary of findings

Machine-learning algorithms applied to large clinical datasets can predict the outcome of IVF treatment with high accuracy (ensemble AUC = 0.84 in this dataset). Factors affecting PGT outcomes were defined, ranked, and evaluated. Ensemble methods of statistical learning outperform their single-model counterparts and help transform medical records into medical knowledge.

References

Cited works

1. Breiman L. Stacked regressions. Machine Learning. 1996;24(1):49–64.
2. Kotsiantis SB, Zaharakis I, Pintelas P. Supervised machine learning: a review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering. 2007;160:3–24.
3. LeDell E, van der Laan MJ, Petersen M. AUC-maximizing ensembles through metalearning. Int J Biostat. 2016;12(1):203–218.
4. Fragouli E, Alfarawati S, Spath K, et al. The origin and impact of embryonic aneuploidy. Hum Genet. 2013;132:1001–1013.
5. Schoolcraft WB, Katz-Jaffe MG. Comprehensive chromosome screening of trophectoderm with vitrification facilitates elective single-embryo transfer for infertile women with advanced maternal age. Fertil Steril. 2013;100:615–619.

Reprint requests

Oleksii Barash, Ph.D. · ivfbigdata@gmail.com