Model Architecture

The G×E×M model fuses three specialized encoders through a novel trilinear attention mechanism, capturing pairwise and three-way interaction tensors between genomic, environmental, and management spaces.

Model Overview

[Architecture diagram: three inputs — G (SNP graph), E (climate sequence), and M (management vector) — feed a GAT encoder, a Transformer, and an MLP encoder respectively; the three embeddings meet in a trilinear attention fusion block that outputs a yield prediction with a confidence interval.]

Total params: 8.2M · Embedding dim: 256 · Training: 8×A100 GPUs · Framework: PyTorch 2.4

Encoder Modules

🧬 GAT Genotype Encoder (Graph Attention Network)

Input: SNP marker graph (V nodes = SNPs, E = LD edges)
Output: g ∈ ℝ^256
Parameters: 3.1M

Models epistatic interactions as a graph problem. Each SNP is a node; linkage disequilibrium thresholds define edges. Four stacked GAT layers with 8 attention heads each capture long-range genomic interactions across chromosomes.

1. Graph Embedding: node features are allele dosage, MAF, and position
2. GAT Layer × 4: multi-head attention (8 heads), residual connections
3. Global Mean Pool: aggregate over all SNP nodes into a fixed-dim vector
4. Linear Projection: → 256-dim genotype embedding
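The four stages above can be sketched in plain PyTorch (without torch_geometric). This is an illustrative skeleton under stated assumptions, not the exact module: a single attention head per layer stands in for the 8 heads in the text, and the hidden width is kept small for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention over a dense adjacency mask (sketch)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, h, adj):                        # h: (N, dim), adj: (N, N)
        z = self.proj(h)
        n = z.size(0)
        # pairwise attention logits e_ij = a([z_i ; z_j])
        zi = z.unsqueeze(1).expand(n, n, -1)
        zj = z.unsqueeze(0).expand(n, n, -1)
        e = self.attn(torch.cat([zi, zj], dim=-1)).squeeze(-1)
        # attend only along LD edges; self-loops keep softmax well-defined
        mask = adj + torch.eye(n, device=adj.device)
        e = e.masked_fill(mask == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)
        return F.elu(h + alpha @ z)                   # residual connection

class GATGenotypeEncoder(nn.Module):
    def __init__(self, node_feats=3, hidden=32, out_dim=256, layers=4):
        super().__init__()
        self.embed = nn.Linear(node_feats, hidden)    # dosage, MAF, position
        self.layers = nn.ModuleList(GATLayer(hidden) for _ in range(layers))
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x, adj):
        h = self.embed(x)
        for layer in self.layers:
            h = layer(h, adj)
        g = h.mean(dim=0)                             # global mean pool over SNPs
        return self.out(g)                            # g ∈ R^256
```

The dense (N, N) attention shown here is fine for a sketch; the real model would use a sparse edge list since SNP graphs have far more nodes than this toy setting.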
🌦️ Transformer Environment Encoder (temporal Transformer)

Input: climate sequence (T = 365 days, D = 7 variables)
Output: e ∈ ℝ^256
Parameters: 3.6M

A 6-layer Transformer ingests 365 daily climate vectors (T_max, T_min, precipitation, solar radiation, VPD, wind speed, humidity). The CLS token learns to aggregate seasonal patterns relevant to yield formation.

1. Positional Encoding: sinusoidal day-of-year encoding
2. Transformer Block × 6: 8 heads, d_model = 256, FFN = 1024, dropout = 0.1
3. CLS Token Pooling: classification token aggregates the sequence
4. Linear Projection: → 256-dim environment embedding
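A compact sketch of this encoder with the stated hyperparameters (6 layers, 8 heads, d_model = 256, FFN = 1024, dropout 0.1), using PyTorch's built-in nn.TransformerEncoder. The learned CLS parameter and the final projection layer are assumptions about details the text does not pin down.

```python
import math
import torch
import torch.nn as nn

class EnvironmentEncoder(nn.Module):
    def __init__(self, n_vars=7, d_model=256, n_heads=8, n_layers=6,
                 ffn=1024, dropout=0.1, seq_len=365):
        super().__init__()
        self.input_proj = nn.Linear(n_vars, d_model)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))  # learned CLS token
        # sinusoidal day-of-year positional encoding (fixed, not learned)
        pos = torch.arange(seq_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(seq_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, ffn, dropout,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, climate):                       # (B, 365, 7)
        x = self.input_proj(climate) + self.pe        # add day-of-year encoding
        x = torch.cat([self.cls.expand(x.size(0), -1, -1), x], dim=1)
        return self.out(self.encoder(x)[:, 0])        # CLS token → e ∈ R^256
```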
🚜 MLP Management Encoder (deep MLP)

Input: management vector m ∈ ℝ^12
Output: m_enc ∈ ℝ^256
Parameters: 1.5M

Encodes 12 agronomic management variables, including nitrogen rate, planting density, irrigation amount and timing, tillage type, pesticide applications, cover-crop use, planting date, row spacing, and seed treatment.

1. Input Normalization: BatchNorm + feature-wise scaling
2. FC Layer × 4: 12 → 128 → 512 → 512 → 256
3. ReLU + Dropout: p = 0.15 between each layer
4. Embedding Output: → 256-dim management embedding
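The four stages map directly onto a Sequential stack. A minimal sketch assuming the stated 12 → 128 → 512 → 512 → 256 widths and p = 0.15 dropout; the separate "feature-wise scaling" step is folded into BatchNorm's affine parameters here, which is an assumption.

```python
import torch
import torch.nn as nn

class ManagementEncoder(nn.Module):
    """12 → 128 → 512 → 512 → 256 MLP with input BatchNorm and dropout."""
    def __init__(self, in_dim=12, dropout=0.15):
        super().__init__()
        dims = [in_dim, 128, 512, 512, 256]
        layers = [nn.BatchNorm1d(in_dim)]             # input normalization
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:                     # ReLU + dropout between layers
                layers += [nn.ReLU(), nn.Dropout(dropout)]
        self.net = nn.Sequential(*layers)

    def forward(self, m):                             # (B, 12)
        return self.net(m)                            # m_enc ∈ R^256
```

Note this stack alone is well under the 1.5M parameters quoted above, so the real encoder presumably carries extra width or layers not spelled out in the text.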

⚡ Trilinear Attention Fusion

The three embeddings g, e, m ∈ ℝ^256 are fused through a learned trilinear attention mechanism that captures all pairwise and three-way interaction tensors.

G×E Interaction
A_ge = softmax((g W_Q)(e W_K)ᵀ / √d)
Cross-attention between genotype and environment embeddings
G×M Interaction
A_gm = softmax((g W_Q)(m W_K)ᵀ / √d)
How management modifies genotypic potential
E×M Interaction
A_em = softmax((e W_Q)(m W_K)ᵀ / √d)
Environment-management co-adaptation tensor
Trilinear Fusion
z = FFN([g; e; m; A_ge⊗m; A_gm⊗e; A_em⊗g])
Full G×E×M interaction vector via outer products
Yield Head
ŷ = MLP(z); σ² = MLP(z) via MC-Dropout
Point estimate + aleatoric uncertainty
Training Objective
L = MSE(ŷ, y) + λ KL[q(w)||p(w)]
Squared-error data term with variational weight regularization (KL between approximate posterior q(w) and prior p(w))
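A minimal PyTorch sketch of the fusion block and yield head, under two assumptions the equations leave open: each 256-dim embedding is split into 8 tokens of 32 dims so the softmax in A_ge, A_gm, A_em runs over a token axis, and A⊗x is read as applying the attention map to the third embedding's tokens. The FFN widths and the head names (mu_head, logvar_head) are illustrative.

```python
import torch
import torch.nn as nn

class TrilinearFusion(nn.Module):
    """Pairwise cross-attention maps between g, e, m, each applied to the
    remaining modality, then a joint FFN and a two-headed yield predictor."""
    def __init__(self, dim=256, n_tok=8):
        super().__init__()
        self.n_tok, self.d = n_tok, dim // n_tok
        self.wq = nn.Linear(self.d, self.d, bias=False)
        self.wk = nn.Linear(self.d, self.d, bias=False)
        self.ffn = nn.Sequential(nn.Linear(6 * dim, 512), nn.ReLU(),
                                 nn.Dropout(0.1),  # kept active for MC-Dropout
                                 nn.Linear(512, dim))
        self.mu_head = nn.Linear(dim, 1)       # point estimate of yield
        self.logvar_head = nn.Linear(dim, 1)   # aleatoric sigma^2 (log scale)

    def _attn(self, a, b):
        # A_ab = softmax((a W_Q)(b W_K)^T / sqrt(d)) over the token axis
        q, k = self.wq(a), self.wk(b)
        return torch.softmax(q @ k.transpose(-1, -2) / self.d ** 0.5, dim=-1)

    def forward(self, g, e, m):                # each (B, 256)
        B = g.size(0)
        tok = lambda x: x.view(B, self.n_tok, self.d)
        gt, et, mt = tok(g), tok(e), tok(m)
        ge = (self._attn(gt, et) @ mt).reshape(B, -1)   # A_ge ⊗ m
        gm = (self._attn(gt, mt) @ et).reshape(B, -1)   # A_gm ⊗ e
        em = (self._attn(et, mt) @ gt).reshape(B, -1)   # A_em ⊗ g
        z = self.ffn(torch.cat([g, e, m, ge, gm, em], dim=-1))
        return self.mu_head(z).squeeze(-1), self.logvar_head(z).squeeze(-1)
```

At inference, repeating the forward pass with dropout enabled and pooling the sampled (ŷ, σ²) pairs yields the confidence interval referenced in the overview.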

Ablation & Benchmark Results

| Model Variant | RMSE (t/ha) ↓ | R² ↑ | Pearson r ↑ | Parameters | Notes |
|---|---|---|---|---|---|
| G×E×M (Ours) | 0.41 | 0.93 | 0.96 | 8.2M | Full trilinear fusion (best) |
| G×E Only | 0.53 | 0.88 | 0.94 | 6.7M | No management encoder |
| G×M Only | 0.71 | 0.82 | 0.90 | 4.6M | No environment encoder |
| E×M Only | 0.65 | 0.84 | 0.91 | 5.1M | No genotype encoder |
| Concat Fusion | 0.52 | 0.88 | 0.94 | 8.2M | Concatenation baseline |
| GBLUP | 0.89 | 0.74 | 0.86 | n/a | Classical genomic |
| DeepGS | 0.71 | 0.81 | 0.90 | 2.1M | CNN genomic selection |

Evaluated on the G2F (Genomes-to-Fields) maize dataset, 2014–2022, with leave-environment-out cross-validation.
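The leave-environment-out protocol can be sketched as a simple group-based splitter: each environment (site-year) is held out once, so no plot from the test environment ever appears in training. The record layout and the "env" key are hypothetical, for illustration only.

```python
from collections import defaultdict

def leave_environment_out_splits(records):
    """records: iterable of dicts with an 'env' key (e.g. a site-year code).
    Yields (held_out_env, train_indices, test_indices) per fold."""
    by_env = defaultdict(list)
    for i, rec in enumerate(records):
        by_env[rec["env"]].append(i)
    for held_out in sorted(by_env):
        test_idx = by_env[held_out]
        train_idx = [i for env, idx in by_env.items()
                     if env != held_out for i in idx]
        yield held_out, train_idx, test_idx

# Toy example: 4 plots across 3 environments → 3 folds.
records = [{"env": "IA_2016"}, {"env": "IA_2016"},
           {"env": "NE_2017"}, {"env": "TX_2018"}]
splits = list(leave_environment_out_splits(records))
```

scikit-learn's LeaveOneGroupOut implements the same idea if the environment codes are passed as the `groups` array.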