IVF big data

Morphology cohort

Factors affecting clinical pregnancy

One of the most common questions from both patients and physicians is simple on the surface, yet complex in practice: which embryo should be selected for transfer?

Answering this requires weighing several key factors: blastocyst expansion (grades 1–6), inner cell mass (ICM) quality (A, B, or C), trophectoderm (TE) quality (A, B, or C), and the day of development at freezing (Day 5, 6, or 7). These four variables alone generate 162 possible embryo profiles (6 × 3 × 3 × 3), which makes intuitive decision-making both difficult and inconsistent.
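The size of this grade space can be checked by enumerating every combination. This is a minimal sketch; the variable names are illustrative and not taken from the analysis code.

```python
from itertools import product

expansion_grades = range(1, 7)   # Gardner expansion 1-6
icm_grades = "ABC"               # inner cell mass quality
te_grades = "ABC"                # trophectoderm quality
freeze_days = (5, 6, 7)          # day of development at freezing

# One tuple per distinct embryo profile: 6 * 3 * 3 * 3 combinations.
profiles = list(product(expansion_grades, icm_grades, te_grades, freeze_days))
print(len(profiles))
```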

To move beyond subjective assessment and address this question rigorously, we applied a data-driven approach. Using binary logistic regression, we modeled clinical pregnancy as the outcome, incorporating maternal age, biopsy day, number of embryos transferred, and standardized morphology metrics (expansion, ICM, and TE derived from Gardner grading). In parallel, principal component analysis (PCA) was used to better understand the structure and relative contribution of these correlated features.
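To make the "morphology metrics derived from Gardner grading" concrete, here is one possible way to split a Gardner code such as 4AB into its three components. The regular expression, the function name, and the ordinal coding (A = 3, B = 2, C = 1) are assumptions for illustration; the page does not state how the letters were scored.

```python
import re

# Assumed ordinal coding (not stated in the source): higher score = better grade.
LETTER_SCORE = {"A": 3, "B": 2, "C": 1}
GARDNER_RE = re.compile(r"^\s*([1-6])\s*([ABC])\s*([ABC])\s*$", re.IGNORECASE)

def parse_gardner(code):
    """Return (expansion, icm_score, te_score), or None if the code is not parseable."""
    match = GARDNER_RE.match(code or "")
    if match is None:
        return None
    expansion, icm, te = match.groups()
    return int(expansion), LETTER_SCORE[icm.upper()], LETTER_SCORE[te.upper()]

print(parse_gardner("4AB"))
```

Rows for which such a parser returns None would be the ones excluded from a cohort that requires a parseable Gardner code.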

All predictors were standardized (z-scores), allowing direct comparison of effect sizes. Results are presented as odds ratios per one standard deviation, providing a clear interpretation of each variable’s impact.
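A minimal sketch of the z-scoring step, assuming NumPy and a population standard deviation (the page does not say whether ddof=0 or ddof=1 was used). The example grades are made up.

```python
import numpy as np

def zscore(column):
    """Mean-center a numeric column and scale it by its standard deviation."""
    column = np.asarray(column, dtype=float)
    sd = column.std(ddof=0)  # assumption: population SD
    if sd == 0:
        raise ValueError("constant column cannot be standardized")
    return (column - column.mean()) / sd

expansion = [3, 4, 4, 5, 6, 4, 5]   # made-up expansion grades for illustration
z = zscore(expansion)
print(z.round(2))
```

After this transformation a coefficient β refers to a one-SD change, so exp(β) is the odds ratio per standard deviation.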

When reviewing the results, use both the tables and charts together:

  • Tables provide precise estimates and uncertainty
  • Bar charts highlight the relative magnitude and direction of each effect

This combined approach helps translate a complex, multi-dimensional decision into something quantifiable, reproducible, and clinically actionable.

Feature importance for clinical pregnancy (logistic regression)

Observational model. Coefficients describe association with intrauterine pregnancy (IUP) in this historical cohort after standardizing predictors. They do not prove causation. Odds ratios are per one standard deviation increase in each predictor (after z-scoring). Ridge regularization reduces overfitting when categories are sparse.
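For readers who want to see what a ridge-penalized logistic fit looks like, the sketch below uses iteratively reweighted least squares (Newton steps) on synthetic data. It is not the code behind this page: the penalty strength lam, the choice to leave the intercept unpenalized, and the synthetic coefficients are all assumptions.

```python
import numpy as np

def fit_ridge_logistic(Z, y, lam=1.0, max_iter=100, tol=1e-8):
    """Ridge-penalized logistic regression fitted by IRLS (Newton steps).

    Z: (n, p) z-scored predictors; y: (n,) 0/1 outcomes.
    lam is a hypothetical penalty strength; the intercept is not penalized.
    Returns (beta, n_iter) with beta[0] the intercept.
    """
    n, p = Z.shape
    X = np.column_stack([np.ones(n), Z])
    penalty = lam * np.eye(p + 1)
    penalty[0, 0] = 0.0
    beta = np.zeros(p + 1)
    for iteration in range(1, max_iter + 1):
        prob = 1.0 / (1.0 + np.exp(-X @ beta))
        weights = prob * (1.0 - prob)
        gradient = X.T @ (y - prob) - penalty @ beta
        hessian = X.T @ (X * weights[:, None]) + penalty
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta, iteration
    return beta, max_iter

# Synthetic data only; these coefficients are not the cohort's.
rng = np.random.default_rng(0)
Z = rng.standard_normal((4000, 3))
true_beta = np.array([0.9, -0.6, 0.5, 0.0])
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + Z @ true_beta[1:])))
y = (rng.random(4000) < p_true).astype(float)
beta_hat, n_iter = fit_ridge_logistic(Z, y)
print(beta_hat.round(3), n_iter)
```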

FET cycles (model)

4,596

Clinical pregnancies (IUP)

3,329

IUP rate

72.4%

ROC AUC (discrimination)

0.621

Cohort definition: rows require a parseable Gardner morphology (expansion + ICM + TE). Excluded without Gardner code: 0. Model fit: McFadden pseudo‑R² = 0.035, IRLS iterations = 33 (converged).
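McFadden's pseudo-R² compares the model log-likelihood with that of an intercept-only model. The sketch below shows the calculation and reproduces the IUP rate from the cohort counts; the per-cycle predicted probabilities are not published here, so the function is shown without the cohort's own predictions. Whether the page's value uses the penalized or unpenalized likelihood is not stated.

```python
import numpy as np

def log_likelihood(y, prob):
    """Bernoulli log-likelihood of 0/1 outcomes y under predicted probabilities."""
    y = np.asarray(y, dtype=float)
    prob = np.clip(np.asarray(prob, dtype=float), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(prob) + (1 - y) * np.log(1 - prob)))

def mcfadden_r2(y, prob):
    """1 - LL(model) / LL(intercept-only model)."""
    y = np.asarray(y, dtype=float)
    null_prob = np.full_like(y, y.mean())
    return 1.0 - log_likelihood(y, prob) / log_likelihood(y, null_prob)

# Cohort counts from the summary above.
n_cycles, n_iup = 4596, 3329
iup_rate = n_iup / n_cycles
y = np.r_[np.ones(n_iup), np.zeros(n_cycles - n_iup)]
print(f"IUP rate = {iup_rate:.1%}")
```

A model that predicts the cohort rate for every cycle scores 0 by construction, which is why a value of 0.035 indicates only a small improvement over the base rate.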

Most important factors for embryo selection: relative importance (|β| on z-scale)

After z-scoring every predictor, the fitted logistic model assigns each one a coefficient β on the log-odds scale. Because the predictors then share a common scale, the absolute value |β| is a simple index of how strongly the predicted log-odds of IUP change when that variable increases by one cohort standard deviation.

How to read the figure: vertical axis = |β| (larger → stronger association in this model). Bars are ordered by |β| (largest on the left). This ranks contributors within this regression; it does not by itself prove which factors are clinically most important.
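The ordering of the bars can be reproduced from the coefficients in the regression table on this page. A small sketch, with the predictor labels shortened for display:

```python
# Standardized coefficients copied from the regression table (intercept excluded).
beta = {
    "Embryo age (years)": -0.0024,
    "Biopsy day": -0.2604,
    "Embryos for ET (count)": 0.0296,
    "Expansion grade": -0.0817,
    "ICM grade": 0.2175,
    "TE grade": 0.1204,
}

# Rank by magnitude; the sign is reported separately as the direction of association.
ranked = sorted(beta.items(), key=lambda item: abs(item[1]), reverse=True)
for name, coef in ranked:
    direction = "higher odds" if coef > 0 else "lower odds"
    print(f"{name:24s} |beta| = {abs(coef):.4f}  ({direction} of IUP per +1 SD)")
```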

Figure: magnitude of standardized logistic coefficients (|β|) for each predictor; same estimates as in the regression table below.

ROC curve (discrimination)

Each point corresponds to a threshold on the model's predicted probability of IUP (scores sorted from high to low). The blue curve is this logistic model; the dashed gray line is chance (no discrimination, AUC = 0.5). AUC = 0.621 is the area under the blue curve (trapezoidal rule, consistent with the rank-sum AUC reported above).

Square plot: both axes are 0–1. Curves above the diagonal indicate better-than-random ranking of IUP vs non-IUP by predicted probability.
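The two AUC calculations mentioned above (trapezoidal area under the ROC curve and the rank-sum form) can be sketched as follows. The function names and the eight example scores are illustrative only.

```python
import numpy as np

def roc_curve_points(y, score):
    """FPR and TPR at each distinct threshold, scores sorted from high to low."""
    y = np.asarray(y, dtype=float)
    score = np.asarray(score, dtype=float)
    order = np.argsort(-score, kind="mergesort")
    y, score = y[order], score[order]
    # Keep only the last index of each run of tied scores.
    last_of_run = np.r_[np.nonzero(np.diff(score))[0], y.size - 1]
    tp = np.cumsum(y)[last_of_run]
    fp = np.cumsum(1 - y)[last_of_run]
    tpr = np.r_[0.0, tp / y.sum()]
    fpr = np.r_[0.0, fp / (y.size - y.sum())]
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def auc_rank_sum(y, score):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    y = np.asarray(y, dtype=bool)
    score = np.asarray(score, dtype=float)
    pos, neg = score[y], score[~y]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return float(wins / (pos.size * neg.size))

y = [1, 0, 1, 1, 0, 1, 0, 0]                        # made-up outcomes
score = [0.9, 0.8, 0.7, 0.7, 0.6, 0.4, 0.4, 0.2]    # made-up predicted probabilities
fpr, tpr = roc_curve_points(y, score)
print(auc_trapezoid(fpr, tpr), auc_rank_sum(y, score))
```

Both functions return the same value when tied scores are handled as a single threshold, which is the sense in which the two AUC definitions are consistent.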

Regression table

Predictors are z-scored before fitting. Odds ratios ("OR") are for a one standard-deviation increase in the predictor. Wald z and two-sided p-values are approximate.

Predictor                    β (log-odds)  SE      OR     95% CI       z      p
Intercept                    0.9296        0.0333  n/a    n/a          27.96  <0.0001
Embryo age (years)           -0.0024       0.0330  0.998  0.935–1.064  -0.07  0.942
Biopsy day (day 5 or 6)      -0.2604       0.0321  0.771  0.724–0.821  -8.10  <0.0001
Embryos for ET (count)       0.0296        0.0334  1.030  0.965–1.100  0.89   0.375
Expansion grade (1–6)        -0.0817       0.0331  0.922  0.864–0.983  -2.47  0.014
ICM grade (ordinal A/B/C)    0.2175        0.0315  1.243  1.169–1.322  6.91   <0.0001
TE grade (ordinal A/B/C)     0.1204        0.0332  1.128  1.057–1.204  3.62   0.0003
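The OR, confidence interval, z, and p columns follow from β and SE under the usual Wald approximation. The sketch below recomputes two rows of the table above; it assumes a 1.96 critical value and a normal reference distribution, and small differences from the printed values are expected because β and SE are themselves rounded.

```python
import math

def wald_summary(beta, se, z_crit=1.96):
    """Odds ratio, 95% CI, Wald z, and two-sided normal p-value for one coefficient."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided tail of the standard normal
    return {
        "or": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "z": z,
        "p": p,
    }

icm = wald_summary(0.2175, 0.0315)          # ICM grade row
expansion = wald_summary(-0.0817, 0.0331)   # Expansion grade row
print({key: round(value, 3) for key, value in icm.items()})
```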

Odds ratios (per 1 SD)

Each bar is the odds ratio (OR) for a one standard-deviation increase in that predictor, holding the other predictors in the model fixed. OR = 1 means no association with the odds of IUP; OR > 1 means higher predicted odds of IUP per SD increase; OR < 1 means lower.

Reference line: the horizontal dashed line at OR = 1 marks “no effect” on the odds scale. Bars entirely above that line suggest a positive association with IUP on average in this cohort; interpret together with confidence intervals in the table.
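An odds ratio is easier to read once it is converted to probabilities. As a worked example using the reported intercept and ICM coefficient, the sketch below compares an embryo with every predictor at the cohort mean to one whose ICM score is one SD higher. This describes model predictions only and assumes all other predictors stay at their means.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

intercept = 0.9296   # log-odds of IUP when every z-scored predictor is at its mean
beta_icm = 0.2175    # log-odds change per +1 SD of ICM grade

p_mean = sigmoid(intercept)
p_icm_plus_1sd = sigmoid(intercept + beta_icm)
odds_ratio = (p_icm_plus_1sd / (1 - p_icm_plus_1sd)) / (p_mean / (1 - p_mean))
print(f"{p_mean:.3f} -> {p_icm_plus_1sd:.3f}, OR = {odds_ratio:.3f}")
```

The predicted probability moves from about 0.72 to about 0.76, which is the same change that the OR of 1.243 expresses on the odds scale.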

Figure: odds ratios per 1 SD (same point estimates as OR column in the regression table); vertical scale fits the cohort’s OR spread.

How to read this

  • Outcome: Clinical pregnancy vs non-pregnancy
  • Predictors: numeric embryo age, biopsy day, embryos for ET, and Gardner expansion plus ordinal ICM/TE scores from Gardner letter grades.
  • Standardization: each predictor is mean-centered and scaled by its standard deviation so coefficients are comparable in magnitude.
  • AUC / ROC: the ROC plot shows sensitivity vs false positive rate for all classification thresholds; the numeric AUC summarizes how well predicted probabilities rank IUP vs non-IUP (1 = perfect, 0.5 = random).

Principal components analysis of features affecting clinical pregnancy

Features: embryo age, biopsy day (5/6/7), embryos for transfer, Gardner expansion, morphology quality (ordinal), and ICM / TE letter scores from Gardner grades. All features are z-scored; orthogonal factors are derived from the correlation structure (variance-maximizing, ordered). Outcome (IUP) is used only for coloring and correlation with factor scores. A written interpretation of variance, loadings, and outcome alignment appears below the tables.

Exploratory analysis. Correlated features are summarized as orthogonal factors (ordered by how much variance they explain). This describes structure in the cohort; it does not prove causation. Clinical pregnancy (IUP) was not used to build the factors—it is shown only for overlay and correlation with factor scores.
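A minimal sketch of how orthogonal, variance-ordered factors can be obtained from the correlation matrix, assuming NumPy and an eigendecomposition. The synthetic three-column data set is illustrative; the outcome plays no part in the computation.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA of the correlation matrix of X (rows = records, columns = features)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    corr = (Z.T @ Z) / Z.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(corr)   # returned in ascending order
    order = np.argsort(eigenvalues)[::-1]              # largest variance first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    explained = eigenvalues / eigenvalues.sum()
    scores = Z @ eigenvectors                          # factor scores per record
    return explained, eigenvectors, scores

# Synthetic data: two correlated columns plus one independent column.
rng = np.random.default_rng(1)
shared = rng.standard_normal(500)
X = np.column_stack([
    shared + 0.3 * rng.standard_normal(500),
    shared + 0.3 * rng.standard_normal(500),
    rng.standard_normal(500),
])
explained, loadings, scores = pca_from_correlation(X)
print(explained.round(3))
```

The sign of each eigenvector is arbitrary, so a factor and its mirror image (all loadings negated) describe the same direction.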

What each factor represents (this cohort)

Names list the two features with largest absolute loadings on that factor after standardizing all inputs. They describe which variables move together—not a clinical diagnosis name.

  • Morphology quality · ICM (factor 1)

    This is the factor that explains the most variance in the cohort. Standardized loadings for the strongest contributors (in order): Morphology quality (+0.59), ICM (+0.58), TE (+0.41). A positive sign means that variable tends to increase when scores on this factor increase.

  • Expansion · Biopsy day (factor 2)

    This factor is orthogonal to all previous factors. Standardized loadings for the strongest contributors (in order): Expansion (-0.75), Biopsy day (-0.54), TE (-0.31). A negative sign means that variable tends to decrease when scores on this factor increase.

  • Embryos · Age (factor 3)

    This factor is orthogonal to all previous factors. Standardized loadings for the strongest contributors (in order): Embryos (+0.70), Age (-0.69), Expansion (+0.15). Embryos and Age load with opposite signs on this factor.

  • Embryos · Age (factor 4)

    This factor is orthogonal to all previous factors. Standardized loadings for the strongest contributors (in order): Embryos (+0.70), Age (+0.70), ICM (+0.10). Embryos and Age load with the same sign here, which is what distinguishes this factor from factor 3 despite the shared name.
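The naming rule (the two features with the largest absolute loadings) can be applied directly to the loadings shown in the heatmap on this page. The sketch below does that; the short feature labels follow the heatmap.

```python
features = ["Age", "Biopsy day", "Embryos", "Expansion", "Morphology quality", "ICM", "TE"]
# Loadings for factors 1-4, copied from the loadings heatmap on this page.
loadings = {
    1: [-0.085, -0.362, -0.078, -0.100, 0.587, 0.575, 0.412],
    2: [-0.110, -0.538, 0.084, -0.751, -0.165, -0.087, -0.306],
    3: [-0.695, 0.013, 0.701, 0.154, -0.001, -0.002, 0.042],
    4: [0.700, -0.022, 0.702, -0.039, 0.076, 0.097, 0.005],
}

def factor_name(column, top=2):
    """Join the names of the `top` features with the largest absolute loadings."""
    ranked = sorted(zip(features, column), key=lambda pair: abs(pair[1]), reverse=True)
    return " · ".join(name for name, _ in ranked[:top])

names = {factor: factor_name(column) for factor, column in loadings.items()}
print(names)
```

Because the rule ignores signs, factors 3 and 4 receive the same name even though Age loads negatively on one and positively on the other.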

Sample & variance

4,596 FET records, 7 standardized features.

Factor                      Variance explained  Cumulative  r (factor, IUP)
Morphology quality · ICM    32.8%               32.8%       0.183
Expansion · Biopsy day      18.2%               51.0%       0.080
Embryos · Age (factor 3)    15.1%               66.1%       0.006
Embryos · Age (factor 4)    13.2%               79.3%       0.039

Variance explained (scree)

Each bar is the share of total variance across standardized features captured by that factor (ordered by importance). Steep drops mean later factors add little.

Figure: % of total variance per factor. Morphology quality · ICM 33%; Expansion · Biopsy day 18%; Embryos · Age (factor 3) 15%; Embryos · Age (factor 4) 13%.

Cumulative variance

How much of the total variance is retained when including factors in order. With seven inputs, all factors together reach 100%.
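The cumulative column is a running sum of the per-factor shares. A short sketch using the four reported shares; the eigenvalue conversion assumes a correlation-matrix PCA, in which the eigenvalues sum to the number of features.

```python
import numpy as np

explained_pct = np.array([32.8, 18.2, 15.1, 13.2])   # first four factors, from the table
cumulative_pct = np.cumsum(explained_pct)
remaining_pct = 100.0 - cumulative_pct[-1]            # share left for factors 5-7

# Assumption: correlation-matrix PCA with 7 features, so eigenvalue = share * 7.
eigenvalues = explained_pct / 100.0 * 7
print(cumulative_pct.round(1), round(remaining_pct, 1), eigenvalues.round(2))
```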

Figure: cumulative % of variance. Morphology quality · ICM 33%; Expansion · Biopsy day 51%; Embryos · Age (factor 3) 66%; Embryos · Age (factor 4) 79%.

Loadings heatmap

Each cell is the loading of one standardized feature on one factor (eigenvector element). Blue = negative, red = positive; stronger color = larger magnitude. Compare with the numeric table below.

Figure: loadings heatmap (feature × factor). Columns are factor 1 (Morphology quality · ICM), factor 2 (Expansion · Biopsy day), factor 3 (Embryos · Age), and factor 4 (Embryos · Age). The color scale runs from −1 (negative) through 0 (neutral) to +1 (positive).

Feature               Factor 1  Factor 2  Factor 3  Factor 4
Age                   -0.085    -0.110    -0.695    0.700
Biopsy day            -0.362    -0.538    0.013     -0.022
Embryos               -0.078    0.084     0.701     0.702
Expansion             -0.100    -0.751    0.154     -0.039
Morphology quality    0.587     -0.165    -0.001    0.076
ICM                   0.575     -0.087    -0.002    0.097
TE                    0.412     -0.306    0.042     0.005
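Because each column of loadings is an eigenvector, the columns should have unit length and be mutually orthogonal, up to the three-decimal rounding of the displayed values. A quick check on the heatmap numbers:

```python
import numpy as np

# Rows: Age, Biopsy day, Embryos, Expansion, Morphology quality, ICM, TE.
# Columns: factors 1-4, as displayed in the heatmap (3 decimals).
L = np.array([
    [-0.085, -0.110, -0.695,  0.700],
    [-0.362, -0.538,  0.013, -0.022],
    [-0.078,  0.084,  0.701,  0.702],
    [-0.100, -0.751,  0.154, -0.039],
    [ 0.587, -0.165, -0.001,  0.076],
    [ 0.575, -0.087, -0.002,  0.097],
    [ 0.412, -0.306,  0.042,  0.005],
])

gram = L.T @ L                                   # close to identity if columns are orthonormal
max_deviation = np.max(np.abs(gram - np.eye(4)))
print(gram.round(3))
```

The result is close to the identity matrix, which is consistent with the statement that each factor is orthogonal to the previous ones.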

Factor scores vs clinical pregnancy (IUP)

Point-biserial r between each factor score and IUP (1 = pregnancy, 0 = not). Bars show direction: to the right = higher scores associate with IUP on average.
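The point-biserial coefficient is numerically the Pearson correlation between a continuous score and a 0/1 outcome. The sketch below computes it from the group means and checks it against NumPy's Pearson r; the eight factor scores are made up.

```python
import numpy as np

def point_biserial(score, outcome):
    """Point-biserial r between a continuous score and a 0/1 outcome."""
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    mean_iup = score[outcome == 1].mean()
    mean_non_iup = score[outcome == 0].mean()
    p = outcome.mean()                     # proportion of IUP cycles
    return float((mean_iup - mean_non_iup) / score.std() * np.sqrt(p * (1 - p)))

score = np.array([1.2, 0.4, -0.3, 0.9, -1.1, 0.1, -0.6, 0.7])   # made-up factor scores
iup = np.array([1, 1, 0, 1, 0, 1, 0, 0])
r = point_biserial(score, iup)
print(round(r, 3), round(np.corrcoef(score, iup)[0, 1], 3))
```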

Figure: point-biserial r by factor. Morphology quality · ICM r = 0.183; Expansion · Biopsy day r = 0.080; Embryos · Age (factor 3) r = 0.006; Embryos · Age (factor 4) r = 0.039.

Score distributions by outcome

Overlaid histograms for the full cohort: green = IUP, gray = not IUP. Both groups use the same bins; a taller bar means more cycles in that score range. The overlap shows that outcomes are not separated by a single linear factor.

Figure: Morphology quality · ICM; horizontal axis from -6.10 to 1.87 (factor score).

n = 3,329 IUP · 1,267 not IUP (same bins; bar height ∝ count in bin)

Figure: Expansion · Biopsy day; horizontal axis from -2.78 to 3.72 (factor score).

n = 3,329 IUP · 1,267 not IUP (same bins; bar height ∝ count in bin)
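Using the same bins for both groups is what makes the two histograms comparable. A minimal sketch on synthetic scores; the bin count of 30 and the function name are assumptions.

```python
import numpy as np

def histograms_by_outcome(score, outcome, n_bins=30):
    """Counts per bin for IUP and non-IUP cycles, using one shared set of bin edges."""
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome).astype(bool)
    edges = np.linspace(score.min(), score.max(), n_bins + 1)
    iup_counts, _ = np.histogram(score[outcome], bins=edges)
    non_iup_counts, _ = np.histogram(score[~outcome], bins=edges)
    return edges, iup_counts, non_iup_counts

# Synthetic scores with a small shift between groups, for illustration only.
rng = np.random.default_rng(2)
outcome = rng.random(1000) < 0.72
score = rng.standard_normal(1000) + 0.4 * outcome
edges, iup_counts, non_iup_counts = histograms_by_outcome(score, outcome)
print(iup_counts.sum(), non_iup_counts.sum())
```

Because bar height is a raw count, the larger group produces taller bars everywhere; dividing each set of counts by its group size would compare shapes instead.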

Loadings (first two factors)

Each column is the contribution of that standardized feature to the factor (direction in feature space).

Feature                         Morphology quality · ICM  Expansion · Biopsy day
Embryo age                      -0.085                    -0.110
Biopsy day                      -0.362                    -0.538
Embryos for ET                  -0.078                    0.084
Gardner expansion               -0.100                    -0.751
Morphology quality (ordinal)    0.587                     -0.165
ICM grade (score)               0.575                     -0.087
TE grade (score)                0.412                     -0.306

Interpretation of findings

Data-driven narrative for this cohort

This analysis derived orthogonal factors from 4,596 FET records and 7 standardized inputs: embryo age, biopsy day, embryos transferred, Gardner expansion, morphology quality (ordinal), and ICM and TE letter scores. The method finds orthogonal directions along which the cohort varies the most and ranks them by variance explained. Clinical pregnancy (IUP) was not used to build those factors; they summarize patterns among the predictors only. Associations with IUP are computed afterward, as the correlation of factor scores with the outcome.

The factor summarized as Morphology quality · ICM accounts for about 32.8% of total variance across standardized features (largest share). The features with the largest absolute loadings on this factor are Morphology quality (ordinal) (+0.587), ICM grade (score) (+0.575), and TE grade (score) (+0.412). A positive loading means that variable tends to increase when moving in the positive direction along this factor in this cohort, and vice versa. In this cohort the factor is dominated by the three morphology grades, which co-vary; embryo age (-0.085) and Gardner expansion (-0.100) load only weakly on it.

The factor summarized as Expansion · Biopsy day is orthogonal to Morphology quality · ICM and explains about 18.2% of variance separately. It is driven most by Gardner expansion (-0.751), biopsy day (-0.538), and TE grade (-0.306). Because all three loadings are negative, higher scores on this factor correspond to lower expansion grades and earlier biopsy days. It captures a pattern that Morphology quality · ICM does not already explain, such as differences in expansion and biopsy day among embryos with similar morphology grades.

Together, the 4 factors in the table (each labeled by its short summary) explain 79.3% of total variance. With only seven input variables, at most seven dimensions exist; factors with smaller variance shares are often harder to interpret.

  • Morphology quality · ICM vs clinical pregnancy. Point-biserial correlation r ≈ 0.183 (modest linear association). On average, cycles with a clinical pregnancy (IUP) have higher scores on Morphology quality · ICM than cycles without (linear summary only).
  • Expansion · Biopsy day vs clinical pregnancy. r ≈ 0.080. Expansion · Biopsy day can align with outcome differently from Morphology quality · ICM because it summarizes a distinct part of predictor space.
  • Scatter plot (Morphology quality · ICM vs Expansion · Biopsy day). Overlap between green (IUP) and gray (not IUP) points indicates that these two linear summaries do not separate outcomes cleanly; pregnancy still occurs across many regions of the plot. This analysis is not optimized for prediction.
  • Caveats. Results are exploratory and cohort-specific. They do not prove causation. Confounding factors are not adjusted here. Use these findings to generate hypotheses, not to counsel individual patients from this page alone.

Scores: Morphology quality · ICM (horizontal) vs Expansion · Biopsy day (vertical)

Subsampled points for display (even spacing along the sorted row index). Green = clinical pregnancy (IUP).
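One way to pick display points with even spacing along the row index is shown below. The display budget of 800 points and the function name are hypothetical; the page does not state how many points are plotted.

```python
import numpy as np

def evenly_spaced_indices(n_rows, n_keep):
    """Indices of at most n_keep rows, evenly spaced from the first row to the last."""
    if n_rows <= n_keep:
        return np.arange(n_rows)
    return np.unique(np.round(np.linspace(0, n_rows - 1, n_keep)).astype(int))

idx = evenly_spaced_indices(4596, 800)   # 800 is a hypothetical display budget
print(idx[:5], idx[-1])
```

Even spacing is deterministic and keeps the first and last rows, but it is only representative if the row order is unrelated to the plotted scores.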
