A Deep Learning Framework for Amplitude Generation of Generic EMRIs

Author(s)

Zeng, Yan-bo; Zhang, Jian-dong; Hu, Yi-Ming; Mei, Jianwei

Abstract

One of the main targets for space-borne gravitational wave detectors is the detection of Extreme Mass Ratio Inspirals (EMRIs). The data analysis of EMRIs requires waveform models that are both accurate and fast. The major bottleneck in generating such waveforms quickly is computing the Teukolsky amplitudes for generic (eccentric and inclined) Kerr orbits. Modeling $\sim10^5$ harmonic modes across a four-dimensional parameter space makes traditional approaches, including direct computation and dense interpolation, computationally prohibitive. To overcome this issue, we introduce a convolutional encoder-decoder architecture for fast, end-to-end global fitting of the Teukolsky amplitudes. We also adopt a transfer learning strategy to reduce the size of the training dataset: the model is trained step by step, from the simplest Schwarzschild circular orbits to generic Kerr orbits. Within this framework, we obtain a surrogate model based on a semi-analytical Post-Newtonian dataset. The full set of harmonic amplitudes can be generated within milliseconds, while the median mode-distribution error for generic orbits is $\sim10^{-3}$. This result indicates that the framework is viable for constructing efficient waveform models for EMRIs.

Figures

The neural network employs an encoder-decoder architecture to predict the modulus and phase of the Teukolsky amplitude in parallel branches. The encoder, a 10-layer residual MLP with Swish activations, maps the four orbital parameters $(a, p, e, x_I)$ to a latent vector. This vector is then upsampled to the full 4D mode-space dimensions using trilinear interpolation along the $(m, n, k)$ axes. A series of 10 residual CNN blocks, featuring 3D convolutions with anisotropic kernels and attention gates, refines the structural tensor. Independent output heads with physically motivated activations (Softplus for the modulus and Tanh for the phase) produce the predictions, which are passed through a final layer that enforces physical constraints (e.g., $|m| \le \ell$).
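The output stage described in this caption can be illustrated with a minimal, framework-free sketch. It only shows the last two steps (the activation heads and the constraint $|m| \le \ell$) on a small $(\ell, m)$ grid; the function names, the nested-list layout, and the scaling of the Tanh output by $\pi$ are assumptions made for illustration and are not taken from the paper.

```python
import math

def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); always non-negative,
    # which suits an amplitude modulus.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def phase_head(x):
    # tanh bounds the output to (-1, 1); multiplying by pi to obtain a phase
    # in (-pi, pi) is an assumption of this sketch.
    return math.pi * math.tanh(x)

def mode_allowed(ell, m):
    # Physical constraint quoted in the caption: |m| <= ell.
    return abs(m) <= ell

def apply_heads(raw_modulus, raw_phase, ell_values, m_values):
    """Apply the two heads, then zero out modes that violate |m| <= ell.

    raw_modulus[i][j] and raw_phase[i][j] are hypothetical pre-activation
    values for the mode (ell_values[i], m_values[j]).
    """
    modulus, phase = [], []
    for i, ell in enumerate(ell_values):
        mod_row, ph_row = [], []
        for j, m in enumerate(m_values):
            keep = mode_allowed(ell, m)
            mod_row.append(softplus(raw_modulus[i][j]) if keep else 0.0)
            ph_row.append(phase_head(raw_phase[i][j]) if keep else 0.0)
        modulus.append(mod_row)
        phase.append(ph_row)
    return modulus, phase

# Synthetic pre-activations on a tiny grid: ell in {2, 3}, m in -3..3.
ell_values = [2, 3]
m_values = list(range(-3, 4))
raw = [[0.0 for _ in m_values] for _ in ell_values]
modulus, phase = apply_heads(raw, raw, ell_values, m_values)
```

In this toy grid the entries with $\ell = 2$, $m = \pm 3$ are forced to zero, while every allowed mode receives a strictly positive modulus.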
Distribution of orbital parameters in the \textbf{training dataset}. The corner plot shows 1D marginalized histograms (diagonal) and 2D projected distributions (off-diagonal) for the spin $a$, semi-latus rectum $p$, eccentricity $e$, and inclination cosine $x_I$. The color map distinguishes the different orbital geometries as defined in Table~\ref{tab:orbit_classification}. The plot visualizes the dataset's stratified nature, with dense populations corresponding to specific classes like Schwarzschild (SC/SE), Kerr Equatorial (KEC/KEE), Kerr Inclined Circular (KIC), and Kerr Generic (KG) orbits.
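The class labels in this caption suggest a simple rule for assigning an orbit to a geometry. The sketch below is one plausible reading of those labels; the table that defines them is not reproduced here, so the thresholds, the tolerance `tol`, and the function name `classify_orbit` are assumptions. The semi-latus rectum $p$ does not enter the rule, and for a non-spinning (Schwarzschild) black hole the inclination is ignored because of spherical symmetry.

```python
def classify_orbit(a, e, x_I, tol=1e-12):
    """Return a class label from spin a, eccentricity e, inclination cosine x_I.

    Labels follow the caption: SC, SE, KEC, KEE, KIC, KG.
    The decision rule itself is an assumption of this sketch.
    """
    spinning = abs(a) > tol
    eccentric = e > tol
    # Equatorial orbits have |x_I| = 1 (prograde or retrograde).
    inclined = abs(abs(x_I) - 1.0) > tol

    if not spinning:
        # Schwarzschild: orbital plane orientation is physically irrelevant.
        return "SE" if eccentric else "SC"
    if not inclined:
        return "KEE" if eccentric else "KEC"
    return "KG" if eccentric else "KIC"

# Example: an eccentric, inclined orbit around a spinning black hole.
label = classify_orbit(a=0.30, e=0.20, x_I=-0.50)
```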
Distribution of orbital parameters in the \textbf{validation dataset}. This dataset consists of randomly drawn samples that were held out from the training process. It covers all orbital classes, providing a robust test of the model's ability to generalize to unseen data.
The mode-distribution error ($\mathcal{M}_\text{amp}$) categorized by orbital geometry on a logarithmic scale. The shape of each violin shows the probability density of the error, while the inner box plot marks the median and interquartile range.
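The median and interquartile range marked by the inner box plot can be computed as in the following sketch. It takes the quartiles of $\log_{10}$ of the errors, which matches a logarithmic axis but is an assumption about how the figure was produced; the function name `box_summary` and the input values are synthetic placeholders, not results from the paper.

```python
import math
import statistics

def box_summary(errors):
    """Median and quartiles of positive error values, computed in log10 space."""
    logs = sorted(math.log10(err) for err in errors)
    # n=4 returns the three quartile cut points (Q1, median, Q3).
    q1, med, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    return {"q1": 10.0 ** q1, "median": 10.0 ** med, "q3": 10.0 ** q3}

# Synthetic error values, for illustration only.
summary = box_summary([1e-4, 1e-3, 1e-2])
```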
Comparison of the predicted and true log-magnitudes for the top 20 dominant modes of a representative Kerr Generic (KG) orbit. The specific orbital parameters are $(a, p, e, x_I) = (0.30, 16.26, 0.20, -0.50)$. The x-axis lists the mode indices $(\ell, m, n, k)$ for each of the 20 modes, ordered by their true amplitude. The y-axis shows the amplitude modulus, $|A_{\ell m n k}|$, on a logarithmic scale. Blue bars represent the true amplitudes from the PN dataset (Reference), while orange bars show the corresponding predictions from our surrogate model (NN). The Mean Absolute Percentage Error (MAPE) for this specific sample is 3.95\%.
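A per-sample MAPE over the dominant modes, as quoted in this caption, could be computed along the following lines. This is a sketch under stated assumptions: the error is taken on the amplitude moduli (the caption does not say whether moduli or log-magnitudes are used), modes are stored in dictionaries keyed by $(\ell, m, n, k)$, and the function name `top_k_mape` and the numerical values are hypothetical.

```python
def top_k_mape(true_amp, pred_amp, k=20):
    """Mean absolute percentage error over the k modes with largest |true amplitude|.

    true_amp and pred_amp map a mode tuple (ell, m, n, k) to an amplitude
    (real or complex). Modes with zero true amplitude are skipped.
    """
    ranked = sorted(true_amp, key=lambda mode: abs(true_amp[mode]), reverse=True)
    selected = [mode for mode in ranked if abs(true_amp[mode]) > 0.0][:k]
    errors = [
        abs(abs(pred_amp[mode]) - abs(true_amp[mode])) / abs(true_amp[mode])
        for mode in selected
    ]
    return 100.0 * sum(errors) / len(errors)

# Synthetic amplitudes, for illustration only.
true_amp = {(2, 2, 0, 0): 1.0, (2, 1, 0, 0): 0.5, (3, 3, 0, 0): 0.1}
pred_amp = {(2, 2, 0, 0): 1.1, (2, 1, 0, 0): 0.45, (3, 3, 0, 0): 0.2}
mape_top2 = top_k_mape(true_amp, pred_amp, k=2)
```

With `k=2` only the two strongest modes enter the average, so the large relative error on the weak $(3, 3, 0, 0)$ mode does not affect the result.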