IVF BIG DATA

ESHRE 2018 · Poster presentation

Model ensembling and synthetic features in PGS cycles with single embryo transfer

O. O. Barash, K. A. Ivani, S. P. Willman, N. Huen, M. V. Homer, L. N. Weckstein

Reproductive Science Center of the San Francisco Bay Area, USA

ESHRE 2018 Annual Meeting

  • 1,029 IVF PGS cycles with SET, Jan 2013 – Nov 2017
  • 320 → 3,247 features engineered; 609 selected per cycle
  • 104 stacked classifiers (level-0 + metalearner)
  • Final ROC AUC 0.841; sensitivity 0.80, specificity 0.78

Background

Why this study

Comprehensive chromosomal screening has been established as the best option for improving clinical outcomes in autologous IVF cycles for patients of advanced maternal age. Recent advances in statistical learning, together with a steady increase in the number of PGS cycles, have created the conditions for effective, unbiased, and reproducible data analysis through model ensembling.

Objective. Evaluate the capabilities of model ensembling and synthetic features to assess the impact of a wide range of clinical, morphological, and kinetic parameters of in vitro embryo development on clinical pregnancy rates in autologous IVF PGS cycles with SET.

Materials & Methods

Dataset and pipeline

  • Cohort. 1,029 IVF PGS cycles with SET, January 2013 – November 2017; mean patient age 35.5 ± 4.8 years.
  • Features. 175 clinical + 145 morphological/kinetic parameters per cycle (320 in total). Weight-of-evidence, categorical, and cross-validated target encoding expanded these into 3,247 synthetic features, of which 609 were selected.
  • Modelling. 104 base classifiers stacked via Generalized Model Stacking; evaluated by 10-fold cross-validation.
  • Endpoint. Clinical PR defined by fetal heartbeat at 6–7 weeks.
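The two target-based encodings named in the Features bullet can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the smoothing constant, the interleaved fold assignment, and the toy `biopsy_day` / `pregnant` data are all assumptions made for illustration. Weight of evidence maps a category level to the log ratio of its share of positive outcomes to its share of negative outcomes; out-of-fold target encoding replaces a level with the outcome rate observed in the other folds only, so a row never sees its own label.

```python
import math
from collections import defaultdict


def weight_of_evidence(categories, outcomes, smoothing=0.5):
    """Map each level to ln(share of positives / share of negatives).

    `smoothing` is an illustrative additive constant that avoids log(0).
    """
    pos, neg = defaultdict(float), defaultdict(float)
    for level, y in zip(categories, outcomes):
        (pos if y == 1 else neg)[level] += 1
    levels = set(categories)
    total_pos = sum(pos.values()) + smoothing * len(levels)
    total_neg = sum(neg.values()) + smoothing * len(levels)
    return {
        level: math.log(((pos[level] + smoothing) / total_pos)
                        / ((neg[level] + smoothing) / total_neg))
        for level in levels
    }


def out_of_fold_target_encoding(categories, outcomes, n_folds=5):
    """Encode each row with its level's outcome rate from the other folds."""
    n = len(categories)
    prior = sum(outcomes) / n          # fallback for levels unseen in training
    encoded = [prior] * n
    for fold in range(n_folds):
        sums, counts = defaultdict(float), defaultdict(int)
        for i in range(n):
            if i % n_folds != fold:    # training part of this split
                sums[categories[i]] += outcomes[i]
                counts[categories[i]] += 1
        for i in range(fold, n, n_folds):   # held-out rows of this split
            level = categories[i]
            if counts[level]:
                encoded[i] = sums[level] / counts[level]
    return encoded


# Hypothetical toy data, not taken from the study.
biopsy_day = ["day5", "day5", "day5", "day6", "day6", "day6", "day5", "day6"]
pregnant = [1, 1, 0, 0, 1, 0, 1, 0]

woe = weight_of_evidence(biopsy_day, pregnant)
encoded = out_of_fold_target_encoding(biopsy_day, pregnant, n_folds=4)
```

In the toy data, row 4 is a day-6 cycle with a positive outcome, yet its encoded value is computed only from the other day-6 rows. This is the property that keeps the encoded feature from leaking the label into the model.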

Results · Key drivers

Prior biochemical pregnancies, morphology, and biopsy day dominate

The number of previous biochemical pregnancies, blastocyst morphology, and the time when the embryo became available for biopsy had the highest variable importance — 0.82, 0.52, and 0.43, respectively.

Univariate analysis confirmed these rankings: ongoing PR differed between patients with and without a history of previous biochemical pregnancies — 72.36 % (479/662) vs 43.88 % (136/310), χ² = 73.72, OR = 3.349, 95 % CI 2.527 – 4.438.

Significant differences in ongoing PR were also detected between embryos biopsied on day 5 vs day 6 — 69.34 % (405/584) vs 54.28 % (241/444), χ² = 24.53, OR = 1.906, 95 % CI 1.475 – 2.463 — and between good-quality vs fair-quality embryos — 67.95 % vs 57.62 %, χ² = 10.61, OR = 1.560, 95 % CI 1.193 – 2.040.

[Bar chart of ongoing PR (%), axis 0 to 100: day 5 biopsy 69.3 %; day 6 biopsy 54.3 %; good morphology 68.0 %; fair morphology 57.6 %.]
Fig. 6 — Ongoing PR by biopsy day and by embryo morphology.
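The univariate statistics above can be recomputed from the reported counts. The sketch below assumes a Pearson χ² test without continuity correction and a Wald 95 % interval on the log odds ratio. The poster does not name the methods; these choices are inferred only because they reproduce the published figures. The function name `two_by_two` is illustrative.

```python
import math


def two_by_two(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio, Wald CI, and Pearson chi-square for two groups.

    Assumes no continuity correction and a normal approximation
    on the log odds ratio (both are assumptions, see the text).
    """
    a, b = events_a, total_a - events_a      # group A: events, non-events
    c, d = events_b, total_b - events_b      # group B: events, non-events
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci = (math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or))
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, ci, chi2


# Counts as reported: biochemical-pregnancy history groups, then day 5 vs day 6.
history_or, history_ci, history_chi2 = two_by_two(479, 662, 136, 310)
biopsy_or, biopsy_ci, biopsy_chi2 = two_by_two(405, 584, 241, 444)

for name, (odds, ci, chi2) in {
    "history": (history_or, history_ci, history_chi2),
    "biopsy day": (biopsy_or, biopsy_ci, biopsy_chi2),
}.items():
    print(f"{name}: OR={odds:.3f}, CI=({ci[0]:.3f}, {ci[1]:.3f}), chi2={chi2:.2f}")
```

Under these assumptions, both comparisons reproduce the reported odds ratios, intervals, and χ² values to the printed precision.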

Results · Ensemble performance

Stacked predictions reach AUC 0.841 with high sensitivity and specificity

Per-cycle probability of a positive clinical outcome ranged from 0.106 to 0.936 (baseline prediction 0.63). The ROC curve and its AUC were used to evaluate predictions across cutoffs. Combined predictions from multiple weak learners (GLM, RF, gradient boosting…), processed by Generalized Model Stacking, produced AUC = 0.8412. The best operating point on the ROC curve was at a cutoff of 0.598, with sensitivity 0.805 and specificity 0.782.
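The stacking idea can be shown on a small scale. The sketch below is not the study's pipeline. It replaces the 104 base classifiers with three logistic regressions on different feature subsets, uses a logistic-regression metalearner, runs on synthetic data, and picks the cutoff by Youden's J (sensitivity + specificity − 1). The poster does not state its cutoff criterion, so that choice is an assumption, as are all function names. For brevity the metalearner is scored on the same out-of-fold predictions it was fit on, which makes the AUC optimistic.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, epochs=150, lr=0.5):
    """Batch gradient-descent logistic regression; returns [bias, w1, ...]."""
    w = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict(w, x) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def stack(rows, labels, feature_sets, n_folds=5):
    """Level 0: out-of-fold base predictions. Level 1: logistic metalearner."""
    n = len(rows)
    oof = [[0.0] * len(feature_sets) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        for m, cols in enumerate(feature_sets):
            w = fit_logistic([[rows[i][c] for c in cols] for i in train],
                             [labels[i] for i in train])
            for i in range(fold, n, n_folds):        # held-out rows only
                oof[i][m] = predict(w, [rows[i][c] for c in cols])
    meta = fit_logistic(oof, labels)
    return [predict(meta, p) for p in oof]


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Cutoff maximising Youden's J; returns (cutoff, sensitivity, specificity)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(scores)):
        sens = sum(s >= cut and y == 1 for s, y in zip(scores, labels)) / n_pos
        spec = sum(s < cut and y == 0 for s, y in zip(scores, labels)) / n_neg
        if best is None or sens + spec - 1 > best[0]:
            best = (sens + spec - 1, cut, sens, spec)
    return best[1:]


# Synthetic data: three features, the third is pure noise.
random.seed(0)
rows, labels = [], []
for _ in range(300):
    x = [random.gauss(0, 1) for _ in range(3)]
    p = sigmoid(1.2 * x[0] - 0.8 * x[1] + 0.5)
    rows.append(x)
    labels.append(1 if random.random() < p else 0)

stacked = stack(rows, labels, feature_sets=[[0], [1], [0, 1, 2]])
auc = roc_auc(stacked, labels)
cutoff, sensitivity, specificity = best_cutoff(stacked, labels)
print(f"AUC={auc:.3f} cutoff={cutoff:.3f} sens={sensitivity:.3f} spec={specificity:.3f}")
```

The AUC summarises ranking quality over all cutoffs, whereas sensitivity and specificity describe one chosen cutoff. That is why a single AUC is reported alongside a cutoff-specific sensitivity and specificity pair.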

Table I · Multiple logistic regression

Confirmatory GLM — prior biochemical, morphology, biopsy day

Coeff. = coefficient expressed in logits; CI = 95 % confidence interval for the odds ratio. AIC: 1,397.3. Number of Fisher scoring iterations: 4.

Feature | Coeff. | Std. err. | p value | Odds ratio | 95 % CI
(Intercept) | 3.530 | 0.957 | 2.25 × 10⁻⁴ | 34.126 | 5.259 – 224.449
Prior biochemical | -0.507 | 0.073 | 4.33 × 10⁻¹² | 0.602 | 0.520 – 0.693
Embryo morphology | 0.723 | 0.266 | 6.67 × 10⁻³ | 2.060 | 1.223 – 3.482
Biopsy day | -0.523 | 0.135 | 1.04 × 10⁻⁴ | 0.592 | 0.455 – 0.772
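The odds-ratio column of Table I follows from the coefficients by exponentiation, since a logit coefficient β corresponds to an odds ratio of exp(β). The sketch below checks that relation and also computes Wald intervals, exp(β ± 1.96 · SE), as an assumed approximation. The Wald bounds come close to the published ones but do not match exactly, so the poster's intervals were presumably obtained with a different method or from unrounded estimates. The dictionary and function names are illustrative.

```python
import math

# (coefficient in logits, standard error), copied from Table I.
TABLE_I = {
    "(Intercept)": (3.530, 0.957),
    "Prior biochemical": (-0.507, 0.073),
    "Embryo morphology": (0.723, 0.266),
    "Biopsy day": (-0.523, 0.135),
}


def odds_ratio_with_wald_ci(coeff, std_err, z=1.96):
    """Return (odds ratio, lower, upper) using a Wald interval (an assumption)."""
    return (
        math.exp(coeff),
        math.exp(coeff - z * std_err),
        math.exp(coeff + z * std_err),
    )


results = {name: odds_ratio_with_wald_ci(*values) for name, values in TABLE_I.items()}

for name, (odds, lower, upper) in results.items():
    print(f"{name}: OR={odds:.3f}, Wald 95% CI=({lower:.3f}, {upper:.3f})")
```

Read this way, an odds ratio below 1 for prior biochemical pregnancies and for biopsy day means that higher values of those predictors are associated with lower odds of clinical pregnancy, while better morphology is associated with higher odds.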

Conclusion

Summary of findings

A history of previous biochemical pregnancies, embryo morphology, and timing of blastocyst biopsy have the biggest impact on clinical PR in IVF PGS cycles with SET. Ensemble methods of statistical learning offer superior performance over their singleton counterparts and will help transform medical records into medical knowledge.


Reprint requests

Oleksii Barash, Ph.D. · ivfbigdata@gmail.com