The Ohio State University
Manganese
Direct atomic transitions
[Figure: d flux / d abundance vs. wavelength [Å]]
YST, Conroy, Rix+ 18
Magnesium
Atomic transitions
Alteration of stellar atmosphere
[Figure: d flux / d abundance vs. wavelength [Å]]
YST, Conroy, Rix+ 18
Low resolution
[Figure: normalized spectrum (0.4 to 1.0) vs. wavelength [Å] (6900 to 6940 Å)]
Task
The ML tool with the relevant assumptions
Blindly applying machine learning
Classical tool
Convolutional Neural Networks
Local receptive field
Transformer
Long-range information
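The contrast between a local receptive field and long-range information can be made concrete with a small calculation. The sketch below assumes stride-1, undilated 1D convolutions; the function names and the 8000-pixel spectrum length are illustrative choices, not values from the talk.

```python
def conv_receptive_field(num_layers: int, kernel_size: int) -> int:
    """Receptive field (in pixels) of stacked stride-1, undilated 1D convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def conv_layers_to_span(num_pixels: int, kernel_size: int) -> int:
    """Smallest number of stacked conv layers whose receptive field covers num_pixels."""
    layers = 0
    while conv_receptive_field(layers, kernel_size) < num_pixels:
        layers += 1
    return layers


# Hypothetical spectrum length, chosen only for illustration.
n_pixels = 8000
print(conv_receptive_field(num_layers=4, kernel_size=5))  # 17
print(conv_layers_to_span(n_pixels, kernel_size=5))       # 2000
```

A self-attention layer, by contrast, lets every pixel (or token) attend to every other one within a single layer, which is how a Transformer can link features of the same element that sit far apart in wavelength.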
[Figure: generative residual [dex] vs. wavelength [Å] (4500 to 5000 Å), three panels]
TransformerPayne (< 0.1%)
Convolutional Neural Networks (~ 10%)
Multi-Layer Perceptron ("The Payne", 1% error)
Rozanski, YST+ 2024
YST, ARAA, 2026
https://tingyuansen.github.io/NASA_AI_ML_STIG/
OpenAI, 2023
"Model Size"
Performance
The more computing power we can allocate, the more the models will continue to improve
[Figure: emulation error (MAE) vs. number of parameters; tick labels: 1%, 100K, 1M, 10M; additional label: Model Width]
Rozanski & YST, 2025
In the "old" Payne
Photon noise
Model systematics
Emulator systematics
With Transformers
Photon noise
Model systematics
Emulator systematics (can in principle be arbitrarily small)
Note: larger models, especially Transformers, can be slow
APOGEE M-Dwarfs
[Figure: normalized spectrum (0.5 to 1.0) vs. wavelength [Å] (15880 to 15960 Å); curves: observation, best-fit model]
RAVE-on, "data-driven" abundances
Nyx (accreted dwarf system ??)
Zucker,..,YST+ 21
GALAH high-res "ab initio" measurements
Not so fast!
[Figure: [Mg/Fe] vs. [Fe/H] (-1.5 to 0.3), two panels]
[Diagram: [Fe/H], [Mg/Fe], log g, spectrum]
[Diagram: [Fe/H], [Mg/Fe], log g, spectrum, distance, selection function, chemical evolution]
True "response function", Kurucz models
Data-driven
[Figure: effective temperature (4400 to 5000) vs. wavelength [Å] (4000 to 8000 Å), two panels. Color code: the "gradient" spectrum]
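The "gradient" spectrum (d flux / d abundance) can be estimated for any forward model by finite differences. The sketch below is a minimal illustration under stated assumptions: `toy_spectrum` is a hypothetical single-line model with a made-up depth-abundance relation, standing in for a real emulator or a Kurucz synthesis, and the line center near 5183.6 Å is used only as a familiar magnesium feature.

```python
import numpy as np


def toy_spectrum(wavelength, abundance, line_center=5183.6, sigma=0.5):
    """Toy normalized spectrum: one Gaussian absorption line whose depth grows with abundance.

    Illustrative stand-in only, not a stellar atmosphere model.
    """
    depth = 0.5 * 10.0 ** (0.5 * abundance)  # hypothetical depth-abundance relation
    return 1.0 - depth * np.exp(-0.5 * ((wavelength - line_center) / sigma) ** 2)


def gradient_spectrum(model, wavelength, abundance, step=0.01):
    """Central finite-difference estimate of d flux / d abundance at each pixel."""
    upper = model(wavelength, abundance + step)
    lower = model(wavelength, abundance - step)
    return (upper - lower) / (2.0 * step)


wavelength = np.linspace(5170.0, 5200.0, 601)
response = gradient_spectrum(toy_spectrum, wavelength, abundance=0.0)

# The response is confined to the pixels of the line itself.
peak_wavelength = wavelength[np.argmin(response)]
print(f"strongest response at {peak_wavelength:.1f} A")
```

Applying the same finite difference to a data-driven model is one way to check whether its response sits on the element's own lines or on correlated features elsewhere in the spectrum.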
Inferring Eu from other elements, not measuring Eu
YST+ 2017
Measuring magnesium from the magnesium lines
Synthetic spectra
Observed spectra
Xiang, YST, Rix+ 2019
Zhang, Xiang, YST+ 2024
This Study
DESI Data Release
[Figure: [Mg/Fe] (-0.2 to 0.8) vs. [Fe/H] (-3 to 0), multiple panels; labeled populations: Thick Disk, Thin Disk, GSE]
Zhang, Xiang, YST+ 2024
[Figure: [X/Fe] vs. [Fe/H] for Ti, Cr, Mn, Al, Si, Ca, O, Ni, C, with a chemical evolution model]
Zhang, Xiang, YST+ 2024
https://pair.withgoogle.com/explorables/grokking/
Neural Network Weights
YST, ARAA, 2026
Few-shot learning is the key
Zhao, YST+, 2025
Pre-training with LAMOST
DESI Pipeline
APOGEE (target)
NN from scratch
NN from LAMOST + fine-tune
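Why starting from a pre-trained network helps when target labels are scarce can be illustrated with a toy problem. The sketch below is not the pipeline from the talk: it replaces the neural network with a linear model, assumes the pre-trained weights (`w_source`) are already available and close to the target mapping, and uses noise-free synthetic data. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_target = 20, 5  # far fewer target examples than parameters

# Hypothetical "source survey" mapping, and a target mapping that differs slightly.
w_source = rng.normal(size=n_features)
w_target = w_source + 0.1 * rng.normal(size=n_features)

# A handful of noise-free labeled examples from the target survey.
x_target = rng.normal(size=(n_target, n_features))
y_target = x_target @ w_target


def gradient_descent(w_init, x, y, n_steps=500):
    """Minimize half the mean squared error with plain gradient descent from w_init."""
    w = w_init.copy()
    learning_rate = len(y) / np.linalg.norm(x, 2) ** 2  # 1 / largest Hessian eigenvalue
    for _ in range(n_steps):
        grad = x.T @ (x @ w - y) / len(y)
        w -= learning_rate * grad
    return w


w_scratch = gradient_descent(np.zeros(n_features), x_target, y_target)
w_finetuned = gradient_descent(w_source, x_target, y_target)

err_scratch = np.linalg.norm(w_scratch - w_target)
err_finetuned = np.linalg.norm(w_finetuned - w_target)
print(f"from scratch: {err_scratch:.2f}, pre-trained + fine-tune: {err_finetuned:.2f}")
```

With five examples and twenty parameters the problem is underdetermined, so the from-scratch fit cannot recover the directions the data never probe, while the fine-tuned fit inherits them from the pre-trained starting point.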
Synthetic spectra
Observed spectra
METR.com
Kim & YST, 2026
Dedicated to the memory of Robert L. Kurucz
(1944–2025)
An excellent replica!
Fortran vs Python
github.com/tingyuanse/kurucz
Star a =
Star b =
Individual bad lines can dominate the systematics
e.g., Liu, YST, Yong+, Nature, 2024
precision ~ 0.01 dex
Canceling out line-by-line systematics
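The cancellation can be sketched with a toy simulation. It assumes each line's abundance estimate equals the true abundance plus a line-specific systematic offset that is identical for both stars, plus small independent noise. The numbers (50 lines, 0.10 dex systematics, 0.01 dex noise, one bad line offset by 1 dex) are invented for illustration and are not taken from the cited work.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lines = 50

# Hypothetical true abundances (dex) for two similar stars.
true_a, true_b = 0.00, 0.03

# Per-line systematic offsets (e.g. imperfect atomic data), shared by both stars.
line_systematics = rng.normal(scale=0.10, size=n_lines)
line_systematics[0] = 1.0  # one "bad line" with a large offset

# Small independent measurement noise per line and per star.
abund_a = true_a + line_systematics + rng.normal(scale=0.01, size=n_lines)
abund_b = true_b + line_systematics + rng.normal(scale=0.01, size=n_lines)

# Line-to-line scatter of the absolute abundances is dominated by the systematics.
scatter_absolute = abund_a.std()

# Differencing line by line removes the shared offsets, including the bad line.
line_differences = abund_b - abund_a
scatter_differential = line_differences.std()
differential_error = abs(line_differences.mean() - (true_b - true_a))

print(f"absolute scatter: {scatter_absolute:.3f} dex")
print(f"differential scatter: {scatter_differential:.3f} dex")
```

The cancellation is exact here only because the offsets are identical by construction. For real stars it holds to the extent that the two stars have similar atmospheric parameters, so the same line behaves the same way in both.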
Not enough graduate students...
Proposing actions
Execute actions
State evolution
Knowledge distillation
YST+, 2025
Liu, YST, Yong+, Nature, 2024
precision ~ 0.01 dex
One of the 250 stars
David Yong's right hand
Equivalent width
Egent with GPT-5-mini
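For reference, the equivalent width is the integral of the fractional flux deficit over wavelength, EW = ∫ (1 - F / F_continuum) dλ. The sketch below evaluates this textbook definition for a toy Gaussian line and checks it against the closed form. It says nothing about how Egent itself measures lines; the line parameters and function name are illustrative assumptions.

```python
import numpy as np


def equivalent_width(wavelength, normalized_flux):
    """Equivalent width: integral of (1 - F / F_continuum) over wavelength (trapezoid rule)."""
    absorbed = 1.0 - normalized_flux
    return float(np.sum(0.5 * (absorbed[1:] + absorbed[:-1]) * np.diff(wavelength)))


# Toy Gaussian absorption line; center, depth, and width are illustrative values.
wavelength = np.linspace(6690.0, 6710.0, 2001)  # Angstrom
depth, sigma = 0.3, 0.1
flux = 1.0 - depth * np.exp(-0.5 * ((wavelength - 6700.0) / sigma) ** 2)

ew_numeric = equivalent_width(wavelength, flux)
ew_analytic = depth * sigma * np.sqrt(2.0 * np.pi)  # closed form for a Gaussian profile
print(f"EW = {1000.0 * ew_numeric:.1f} mA")
```

In practice the hard part is not the integral but the judgment calls around it (continuum placement, blends, which lines to reject).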