Inpainting over the cracks: challenges of applying pre-merger searches for massive black hole binaries to realistic LISA datasets

Author(s)

Gareth Cabourn Davies; Ian Harry

Abstract

A key science target of the Laser Interferometer Space Antenna (LISA) is to carry out multi-messenger observations of massive black hole binaries, observing the merger simultaneously in gravitational waves and with electromagnetic observatories. Identifying that a merger is happening, and providing a continually updated estimate of the sky location in the hours, days and weeks before the merger, is critical to enable electromagnetic observations of the merger event. In this work we demonstrate and compare two methods for premerger identification of massive black hole binaries: a zero-latency filter approach and, for the first time, an approach using an "inpainting" technique. We apply these methods to the LISA Data Challenge dataset 2a (Sangria-HM) and demonstrate the successful recovery of the 14 signals in the dataset that we expected to be identifiable at least half a day before merger. We show that the inpainting method can identify premerger signals even when gaps are present in the data, demonstrating the recovery of a signal even when three day-long data gaps are added to the 14 days preceding merger. Finally, we explore the challenge of overlapping signals, using a region of overlapping signals in the Sangria-HM dataset, all of which merge within a 10-day window, and show how removing signals that have been confidently identified from the data allows us to identify quieter signals in the same period.

Figures

Comparison of power spectral densities used in this work. The power spectral density is estimated from the Sangria-HM dataset using the Welch method with a segment duration of 18.25 days. The model shown is the noise model used to produce the Sangria-HM dataset, and is shown with and without unresolvable Galactic binary sources. We show the power spectral density model and estimate smoothed using the method described in Section~\ref{subsec:smoothing}. We also show the characteristic strain for a representative massive black hole binary (MBHB) signal from Sangria-HM (Signal 2), which shows that at 14, 7 and 4 days before merger the MBHB does not overlap the Galactic binary confusion noise in frequency, but when the signal reaches around one day before merger (blue line), the Galactic binary confusion noise starts to overlap with the signal.
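As a rough illustration of the estimation step named in this caption, the following Python sketch applies SciPy's Welch estimator with a segment length given in days. Only the method and the 18.25-day segment duration come from the caption; the Hann window, the 50% overlap, the function name `estimate_psd`, and the white-noise toy data are assumptions made for the example and are not the authors' configuration.

```python
import numpy as np
from scipy.signal import welch

SECONDS_PER_DAY = 86400.0


def estimate_psd(strain, sample_rate_hz, segment_days=18.25):
    """One-sided Welch PSD estimate using segments of the given duration.

    The window and overlap below are illustrative assumptions.
    """
    nperseg = int(round(segment_days * SECONDS_PER_DAY * sample_rate_hz))
    nperseg = min(nperseg, len(strain))  # cannot exceed the data length
    freqs, psd = welch(
        strain,
        fs=sample_rate_hz,
        window="hann",
        nperseg=nperseg,
        noverlap=nperseg // 2,
    )
    return freqs, psd


# Toy data: unit-variance white noise, with a short segment so it runs quickly.
rng = np.random.default_rng(0)
sample_rate_hz = 0.2
strain = rng.normal(size=200_000)
freqs, psd = estimate_psd(strain, sample_rate_hz, segment_days=0.5)

# For unit-variance white noise the one-sided PSD should be close to 2 / fs.
print(np.mean(psd[1:-1]), 2.0 / sample_rate_hz)
```

Longer segments give finer frequency resolution but fewer averages, so the estimate at each frequency is noisier; the segment duration is the main tuning choice in this kind of estimate.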
The method used to smooth over the dip in the power-spectral density at 60 mHz. We draw a straight line between the two peaks on either side of this point, and smoothly transition between the estimated power-spectral density and the straight line using a Hann window. This ensures that there are no abrupt jumps in the power-spectral density that would cause spectral leakage. The smoothing is applied to both the estimated and modeled power-spectral densities used in later analysis.
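The smoothing described in this caption can be sketched in Python as follows. This is a minimal interpretation and not the authors' code: the function name `smooth_over_dip`, the choice to draw the straight line in log-log space, the half-Hann ramps occupying a quarter of the span at each end, and the toy power-law spectrum are all assumptions for illustration. The peak frequencies are passed in by hand.

```python
import numpy as np


def smooth_over_dip(freqs, psd, f_left, f_right, ramp_fraction=0.25):
    """Bridge a PSD dip with a straight line between two peaks.

    The blend weight is 0 at the peaks and rises to 1 through half-Hann
    ramps, so the output joins the original PSD without a jump.
    Assumption: the line is straight in log-log space.
    """
    if not 0.0 < ramp_fraction <= 0.5:
        raise ValueError("ramp_fraction must be in (0, 0.5]")
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)
    i_left = int(np.argmin(np.abs(freqs - f_left)))
    i_right = int(np.argmin(np.abs(freqs - f_right)))
    n = i_right - i_left + 1
    if n < 5:
        raise ValueError("too few samples between the two peaks")
    span = slice(i_left, i_right + 1)

    # Straight line joining the PSD values at the two peaks.
    log_line = np.interp(
        np.log(freqs[span]),
        [np.log(freqs[i_left]), np.log(freqs[i_right])],
        [np.log(psd[i_left]), np.log(psd[i_right])],
    )

    # Half-Hann ramps at both ends, flat weight of 1 in between.
    n_ramp = max(2, int(ramp_fraction * n))
    ramp = np.hanning(2 * n_ramp)[:n_ramp]  # rises from 0 to nearly 1
    weight = np.ones(n)
    weight[:n_ramp] = ramp
    weight[-n_ramp:] = ramp[::-1]

    smoothed = psd.copy()
    smoothed[span] = np.exp((1.0 - weight) * np.log(psd[span]) + weight * log_line)
    return smoothed


# Toy spectrum: a power law with a narrow dip at 60 mHz.
freqs = np.linspace(0.01, 0.1, 901)
baseline = 1e-40 * (freqs / 0.01) ** 2
psd = baseline * (1.0 - 0.99 * np.exp(-((freqs - 0.06) / 0.002) ** 2))
smoothed = smooth_over_dip(freqs, psd, f_left=0.05, f_right=0.07)
```

Because the weight vanishes at the two peaks, the result is continuous with the untouched spectrum outside the span, which is the property the caption relies on to avoid spectral leakage.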
Template bank sizes assuming different power-spectral densities. Inclusion of unresolvable Galactic binary foreground in the model increases bank size by a factor between 2.1 and 2.6. Using the estimated rather than modeled power-spectral density increases bank size by a factor between 1.0 and 1.9.
Optimal signal-to-noise ratio (SNR) build-up and signal-to-noise ratio of premerger-truncated signals matched-filtered against the full signal for Signal 10. We see that the signal-to-noise ratio of each premerger-truncated signal is the product of its optimal signal-to-noise ratio and the match between the full and truncated signals. Equivalent plots for all signals are included in the data release.
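The product relation stated in this caption can be checked numerically with a toy example. The Python sketch below assumes unit-variance white noise, so that the noise-weighted inner product reduces to a plain dot product, and uses the overlap at fixed time and phase in place of a maximised match. The chirp-like waveform, the cut fractions, and all names are hypothetical and chosen only for illustration.

```python
import numpy as np


def inner(a, b):
    """Noise-weighted inner product, assuming unit-variance white noise."""
    return float(np.dot(a, b))


# Toy chirp whose amplitude and frequency grow toward "merger" at t = 1.
t = np.linspace(0.0, 1.0, 4096, endpoint=False)
full = t**2 * np.sin(2.0 * np.pi * (20.0 * t + 60.0 * t**3))

results = []
for cut in (0.5, 0.8, 0.95):
    truncated = np.where(t < cut, full, 0.0)  # zero everything after the cut

    # Optimal SNR: the truncated signal filtered with itself.
    optimal_snr = np.sqrt(inner(truncated, truncated))
    # Overlap between the truncated and full signals (no maximisation).
    overlap = inner(truncated, full) / np.sqrt(
        inner(truncated, truncated) * inner(full, full)
    )
    # SNR recovered when the truncated signal is filtered with the full one.
    filtered_snr = inner(truncated, full) / np.sqrt(inner(full, full))

    results.append((cut, optimal_snr, overlap, filtered_snr))
    print(cut, filtered_snr, optimal_snr * overlap)
```

In this simplified setting the relation is an algebraic identity, so the two printed values agree to rounding error at every cut; the recovered SNR is lower than the optimal SNR because the full template spends much of its normalisation on the part of the signal that has not yet arrived.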
Signal 10 results, shown with and without the secondary peaks. For the removed results, the signal is removed from the data two hours before merger. We also see peaks in the data before merger; these are the post-signal peaks for Signal 9, which occurs 21 days before Signal 10. Equivalent plots for all signals are included in the data release.
Search results for the zero-latency filter search
Signal-to-noise ratio (SNR) as a function of the forecast merger time and the time before merger, produced using the inpainting technique to analyze the Sangria-HM dataset. Signals have been removed from the data once they reach 2 hours before merger.
Timeline of results for Signal Zero, demonstrating the inclusion of gaps in the data in the days before merger. Timeline plots for all signals are included in the data release.
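To make the handling of gaps more concrete, the sketch below shows one common formulation of inpainting: the missing samples are filled so that the inverse-covariance-weighted data vanish inside the gap, which makes a matched-filter statistic insensitive to whatever was originally stored there. The text here does not spell out the authors' implementation, so this is an assumption-laden toy: the AR(1) noise covariance, the dense matrix inverse, the array sizes, and the function name `inpaint` are all hypothetical.

```python
import numpy as np


def inpaint(data, inv_cov, gap):
    """Fill `gap` samples so that (inv_cov @ filled)[gap] is zero."""
    weighted = inv_cov @ data
    # Solve the small system restricted to the gap for the correction.
    correction = np.linalg.solve(inv_cov[np.ix_(gap, gap)], weighted[gap])
    filled = data.copy()
    filled[gap] -= correction
    return filled


# Toy stationary noise with covariance C_ij = rho^|i-j| (an assumption).
n_samples, rho = 128, 0.9
idx = np.arange(n_samples)
cov = rho ** np.abs(np.subtract.outer(idx, idx))
inv_cov = np.linalg.inv(cov)

rng = np.random.default_rng(1)
data = np.linalg.cholesky(cov) @ rng.normal(size=n_samples)

gap = np.arange(50, 70)
zero_filled = data.copy()
zero_filled[gap] = 0.0   # gap stored as zeros
junk_filled = data.copy()
junk_filled[gap] = 1e3   # gap stored as arbitrary junk

filled_a = inpaint(zero_filled, inv_cov, gap)
filled_b = inpaint(junk_filled, inv_cov, gap)

# Both fillings agree, and the weighted data vanish inside the gap.
print(np.max(np.abs(filled_a - filled_b)))
print(np.max(np.abs((inv_cov @ filled_a)[gap])))
```

The filled values depend only on the samples outside the gap, so the two starting points produce the same result. For long data streams a dense inverse would be impractical; this toy uses one only to keep the algebra visible.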
No removal of signals
Signals are removed 2 hours before merger
Timeline of results in the congested period of days 110-140 of the Sangria-HM dataset. These are inpainting results filtered to be at the 0.5, 1, 4, 7 and 14 days-before-merger points that were used in the zero-latency search. We see that Signal 4, the loudest signal, dominates the SNR and can actually be found before Signals 2 and 3. We remove Signal 4 at 14 days before merger. Signal 3 can be detected almost immediately after Signal 4 is removed from the data, and is therefore removed 11 days before merger. Signal 2 can then be found around a day before merger. Peaks remain in the 1 and 0.5 days-before-merger results for Signals 3 and 4; this is due to imperfect removal of the signals from the data.