Modeling stellar spectra with unsupervised machine learning generative models

@TingAstro

Yuan-Sen Ting 

Australian National University

Science team : Jo Ciuca (Astro-3D Fellow, ANU), Yuan-Sen Ting (me, ANU)

Animation team : Fabio Albertelli, Jakub Misiek

Disclaimer : We reserve all rights to the video productions in this talk. The videos should not be used or adapted in any way without the consent of the authors (Yuan-Sen Ting, Jakub Misiek, Fabio Albertelli).

Why machine learning + spectroscopy ?

Typically ~10,000 pixels

= ImageNet

~ O(10,000) pixels

20-50 labels

= multi-class labels

Temperature, gravity, stellar ages, evolutionary stages, chemical abundances

Most ML spectroscopy studies have focused on supervised learning

Input

labels / spectra

Output

labels / spectra

Magic happens

Spectra

Chemical composition

The danger of inferring abundances from supervised learning

RAVE-on, data-driven abundances

Nyx (accreted dwarf system ??)

Zucker+ 21

[Figure: [Fe/H] and [Mg/Fe] abundance panels; axis ticks run from -1.5 to 0.3]

GALAH high-res "ab initio" measurements

Not so fast !

Correlation is not causality

[Diagram nodes: [Fe/H], [Mg/Fe], log g, spectrum, distance, selection function, chemical evolution]

Measurement? Or indirect inference?
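A toy illustration of the point above, as a minimal sketch rather than anything taken from the talk: the mock spectra below respond only to [Fe/H], yet a supervised regression still "predicts" [Mg/Fe], because a made-up chemical-evolution relation ties the two labels together. All numbers (slopes, scatter, pixel responses) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_star, n_pix = 4000, 30

# Toy labels: chemical evolution ties [Mg/Fe] to [Fe/H] (assumed relation).
feh = rng.normal(-0.5, 0.4, size=n_star)
mgfe = -0.3 * feh + rng.normal(0.0, 0.03, size=n_star)

# Toy spectra respond ONLY to [Fe/H]; no pixel carries any Mg information.
response = rng.uniform(0.05, 0.2, size=n_pix)
spectra = (1.0 - 0.1 * np.outer(feh + 2.0, response)
           + rng.normal(0.0, 0.002, size=(n_star, n_pix)))

# Supervised "label transfer": linear regression from spectra to [Mg/Fe].
train, test = slice(0, 3000), slice(3000, None)
design = np.column_stack([spectra, np.ones(n_star)])
coef, *_ = np.linalg.lstsq(design[train], mgfe[train], rcond=None)
pred = design[test] @ coef

# Correlation between predicted and true [Mg/Fe] on held-out stars.
r = np.corrcoef(pred, mgfe[test])[0, 1]
print(f"held-out correlation for [Mg/Fe]: {r:.2f}")
```

The regression scores well on held-out stars even though nothing was measured: it is an indirect inference through [Fe/H], and it would fail for any population that does not follow the training relation.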

Unsupervised machine learning - generative models

Generative models : two "dreamed up" human faces

Karras+ 18

thispersondoesnotexist.com

Generative models : training with unlabelled data set

Real human

Fake human

All rights reserved - Y.-S. Ting, J. Misiek, F. Albertelli

Generative models : a high-dimensional density estimation problem

x : human face

Ensemble : \{ x_i \}

Distribution : p(x)

Drawing the "contour" in a high-dimensional image space

~10,000 dimensions

Generating samples within the "contour"

Classical methods such as Gaussian mixture models will not do the job

~10,000 dimensions

p(x)
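One commonly cited reason, not spelled out on the slide, is the number of free parameters: a full-covariance Gaussian component in D dimensions needs D means plus D(D+1)/2 covariance entries. The helper below (a hypothetical name, written only for this illustration) counts them for a spectrum-sized problem.

```python
def gmm_parameter_count(n_dim: int, n_components: int) -> int:
    """Free parameters of a full-covariance Gaussian mixture model."""
    # Each component: a mean vector plus a symmetric covariance matrix.
    per_component = n_dim + n_dim * (n_dim + 1) // 2
    # Mixture weights sum to one, so only n_components - 1 are free.
    return n_components * per_component + (n_components - 1)

n_single = gmm_parameter_count(10_000, 1)   # 50,015,000
n_ten = gmm_parameter_count(10_000, 10)     # 500,150,009
print(n_single, n_ten)
```

With ~10,000 pixels, even a single component has about 5 x 10^7 parameters, which is far more than typical spectroscopic training sets can constrain.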

Not all unsupervised generative models are suitable for science

Why not Generative Adversarial Networks (GANs) ?

Zakharov+ 19

Training data set: real human

Generator (machine 1)

Discriminator (machine 2)

Lack of diversity

GANs generate "good-looking" results but suffer from mode dropping

Normalizing Flows

see YST & Weinberg 21

Ciuca & YST, in prep.

Adopting neural networks as a change of variables

Neural Network
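For reference, the change of variables can be written out as below. This is the standard textbook identity; the symbols f_\theta (the neural network), z (the transformed variable) and p_z (a simple base density such as a unit Gaussian) are notation assumed here, not taken from the slides.

```latex
z = f_\theta(x), \qquad
\log p(x) = \log p_z\big( f_\theta(x) \big)
          + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|
```

Evaluating p(x) therefore requires f_\theta to be invertible and its Jacobian determinant to be cheap to compute.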

Normalizing flows : neural networks that satisfy two criteria

(1) Invertibility

(2) Jacobian can be calculated easily

Jacobian = Area 2 / Area 1
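The two criteria can be made concrete with a minimal sketch. The slides do not say which flow architecture is used; the example below assumes an affine coupling layer (RealNVP-style), with fixed random matrices `W_s` and `W_t` standing in for trained networks. It checks that the layer inverts exactly and that its cheap analytic log-Jacobian matches a brute-force numerical one.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 6            # toy dimensionality (a real spectrum has ~10,000 pixels)
d = D // 2       # the first half of x passes through unchanged

# Hypothetical stand-ins for the scale and shift networks.
W_s = rng.normal(scale=0.5, size=(d, D - d))
W_t = rng.normal(scale=0.5, size=(d, D - d))

def forward(x):
    """x -> z, returning the analytic log|det Jacobian| as well."""
    x1, x2 = x[:d], x[d:]
    s = np.tanh(x1 @ W_s)          # log-scale, depends only on x1
    t = x1 @ W_t                   # shift, depends only on x1
    z2 = x2 * np.exp(s) + t
    return np.concatenate([x1, z2]), np.sum(s)

def inverse(z):
    """z -> x; criterion (1): the layer is exactly invertible."""
    z1, z2 = z[:d], z[d:]
    s = np.tanh(z1 @ W_s)
    t = z1 @ W_t
    return np.concatenate([z1, (z2 - t) * np.exp(-s)])

def log_prob(x):
    """log p(x) from the change of variables with a unit-Gaussian base."""
    z, log_det = forward(x)
    log_pz = -0.5 * np.sum(z ** 2) - 0.5 * D * np.log(2.0 * np.pi)
    return log_pz + log_det

def numerical_log_det(x, eps=1e-6):
    """Brute-force log|det J| by central finite differences (for checking)."""
    J = np.zeros((D, D))
    for j in range(D):
        dx = np.zeros(D)
        dx[j] = eps
        J[:, j] = (forward(x + dx)[0] - forward(x - dx)[0]) / (2.0 * eps)
    return np.linalg.slogdet(J)[1]

x = rng.normal(size=D)
z, log_det = forward(x)
x_back = inverse(z)
print(np.allclose(x_back, x), log_det, numerical_log_det(x))
```

Because the Jacobian of this layer is triangular, criterion (2) reduces to a sum over the log-scales, which stays cheap at any dimensionality; the brute-force check would not.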

GAN

Normalizing Flows

~10,000 dimensions

Applications of unsupervised machine learning

( 1 ) Outlier detection

A statistically robust way to extract spectral outliers

Normal spectra : p\,({\rm spectra}) \gg 1

Baby Yoda spectra : p\,({\rm spectra}) \ll 1

Baby Yoda model created by NestaEric on Sketchfab, licensed under CC BY

Detecting spectral outliers without the need to analyze the spectra

Mock APOGEE Test

Ciuca & Ting, In prep.

[Figure: probability density versus log-likelihood, log p(x), from -40,000 to 20,000. Legend entries: normal spectra, core of the distribution, periphery, metal-poor, extreme magnesium, extreme aluminium, random masking]
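The workflow behind this test can be sketched as follows. This is a toy under stated assumptions, not the Ciuca & Ting pipeline: a single multivariate Gaussian stands in for the normalizing flow as the density estimator, the mock "spectra" have 20 pixels, and the 1st-percentile threshold is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)
n_pix = 20

def draw_normal(n):
    """Toy normal spectra: shared continuum offset plus per-pixel noise."""
    continuum = rng.normal(scale=0.05, size=(n, 1))
    noise = rng.normal(scale=0.02, size=(n, n_pix))
    return 1.0 + continuum + noise

train = draw_normal(5000)
test_normal = draw_normal(500)
test_odd = draw_normal(50)
test_odd[:, 5:8] -= 0.3            # inject an anomalously deep feature

# Stand-in density estimator: one Gaussian fitted to the training set.
mu = train.mean(axis=0)
cov = np.cov(train, rowvar=False)
cov_inv = np.linalg.inv(cov)
logdet = np.linalg.slogdet(cov)[1]

def log_p(x):
    """Gaussian log-density of each row of x."""
    r = x - mu
    maha = np.einsum("ni,ij,nj->n", r, cov_inv, r)
    return -0.5 * (maha + logdet + n_pix * np.log(2.0 * np.pi))

# Flag anything below the 1st percentile of the training log-likelihoods.
threshold = np.percentile(log_p(train), 1.0)
flag_normal = log_p(test_normal) < threshold
flag_odd = log_p(test_odd) < threshold
print(flag_normal.mean(), flag_odd.mean())
```

No line list or spectral fitting is involved: the only quantity inspected is log p(x), which is the sense in which outliers are found without analyzing the spectra.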

Applications of unsupervised machine learning

( 2 ) Finding correlation between pixels

Modeling the conditional distribution of APOGEE spectra

p\,({\rm spectra} | \, {\rm Teff , logg} \, )

p\,({\rm spectra})

Energy Level

p\,(x_i | {\rm Teff}, {\rm logg})
{\rm Corr} (x_i, x_j)
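A minimal sketch of how Corr(x_i, x_j) could be estimated once samples at fixed Teff and logg are available. In the talk the samples would come from the conditional generative model; here they are mocked, with a hypothetical abundance `a` driving two pixels (3 and 7) while the rest carry only noise.

```python
import numpy as np

rng = np.random.default_rng(7)
n_sample, n_pix = 5000, 12

# Mock samples of p(x | Teff, logg): Teff and logg are held fixed, so the
# remaining pixel-to-pixel variation comes from abundances and noise.
a = rng.normal(scale=0.2, size=n_sample)        # hypothetical abundance
samples = 1.0 + rng.normal(scale=0.005, size=(n_sample, n_pix))
samples[:, 3] -= 0.20 * a                       # first line of the element
samples[:, 7] -= 0.15 * a                       # second line, same element

# Pearson correlation between every pair of pixels.
corr = np.corrcoef(samples, rowvar=False)
print(f"Corr(x_3, x_7) = {corr[3, 7]:.2f}, Corr(x_3, x_0) = {corr[3, 0]:.2f}")
```

Pixels that respond to the same element light up together in the correlation matrix, which is how an unlisted atomic feature could be associated with a known one.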

Recovering missing atomic features via the empirical correlations

Ciuca & Ting, In prep.

Mock APOGEE Test

[Figure: correlation matrix among spectral features of Ni, Co, Cu, Mn, Cr, V, Ti, S, and Na; colour scale from 0.0 to 1.0]

Applications of unsupervised machine learning

( 3 ) Bridging the gap between theory and data

Synthetic spectral models are imperfect

Fitting an observed stellar spectrum

[Figure: normalized APOGEE spectrum versus wavelength [Å], 15880 - 15960 Å; observation compared with the best-fit model]

Overcoming model imperfections using domain adaptation

Observed spectra

Synthetic spectra

"Observed spectra"

"Synthetic spectra"

Domain adaptation : finding commonalities + merging the distributions
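As a deliberately crude illustration of "merging the distributions", the sketch below aligns two unpaired sets of mock spectra by matching their per-pixel mean and scatter. This moment matching is a swapped-in stand-in chosen for brevity; it is not the Cycle-StarNet method, and the line depths and noise levels are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 40
pix = np.arange(n_pix)
line = np.exp(-0.5 * ((pix - 20) / 3.0) ** 2)   # toy absorption profile

def make_spectra(n, mean_depth, depth_scatter, noise):
    """Mock spectra with one line whose depth varies from star to star."""
    depth = mean_depth + depth_scatter * rng.normal(size=(n, 1))
    return 1.0 - depth * line + rng.normal(scale=noise, size=(n, n_pix))

# Unpaired domains: the synthetic models systematically under-predict the line.
observed = make_spectra(3000, 0.30, 0.05, 0.01)
synthetic = make_spectra(3000, 0.22, 0.03, 0.01)

mu_obs, sd_obs = observed.mean(axis=0), observed.std(axis=0)
mu_syn, sd_syn = synthetic.mean(axis=0), synthetic.std(axis=0)

# Map the synthetic domain onto the observed one, pixel by pixel.
adapted = (synthetic - mu_syn) / sd_syn * sd_obs + mu_obs

gap_before = np.abs(mu_syn - mu_obs).max()
gap_after = np.abs(adapted.mean(axis=0) - mu_obs).max()
print(gap_before, gap_after)
```

No star needs to appear in both sets: only the two distributions are compared, which is the property that makes domain adaptation applicable when matched synthetic-observed pairs do not exist.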

Closing the synthetic-observation gap with Cycle-StarNet

[Figure: normalized spectrum versus wavelength [Å], 15900 - 15950 Å; the observation compared with the old model and with the Cycle-StarNet calibrated model]

O'Briain, YST, Fabbro+ 2021

Summary :

Supervised (data-driven) machine learning is subject to many caveats because it needs unbiased stellar labels

Unsupervised machine learning performs statistically robust density estimation in high-dimensional spaces (1,000 - 10,000 D)

For science, normalizing flows are much more robust than GANs

We used unsupervised ML to (1) perform outlier detection, (2) study high-order moments, and (3) autocalibrate models