The effects of multi-echo fMRI combination and rapid T2*-mapping on offline and real-time BOLD sensitivity

A variety of strategies are used to combine multi-echo functional magnetic resonance imaging (fMRI) data, yet recent literature lacks a systematic comparison of the available options. Here we compare six different approaches derived from multi-echo data and evaluate their influences on BOLD sensitivity for offline and in particular real-time use cases: a single-echo time series (based on Echo 2), the real-time T2*-mapped time series (T2*FIT) and four combined time series (T2*-weighted, tSNR-weighted, TE-weighted, and a new combination scheme termed T2*FIT-weighted). We compare the influences of these six multi-echo derived time series on BOLD sensitivity using a healthy participant dataset (N=28) with four task-based fMRI runs and two resting state runs. We show that the T2*FIT-weighted combination yields the largest increase in temporal signal-to-noise ratio across task and resting state runs. We demonstrate additionally for all tasks that the T2*FIT time series consistently yields the largest offline effect size measures and real-time region-of-interest based functional contrasts. These improvements show the possible utility of multi-echo fMRI for studies employing real-time paradigms, while caution is still advised due to decreased tSNR of the T2*FIT time series. We recommend the use and continued exploration of T2*FIT for offline task-based and real-time fMRI analysis. Supporting information includes: a data repository (https://dataverse.nl/dataverse/rt-me-fmri), an interactive web-based application to explore the data (https://rt-me-fmri.herokuapp.com/), and further materials and code for reproducibility (https://github.com/jsheunis/rt-me-fMRI).


Abstract 1
A variety of strategies are used to combine multi-echo functional magnetic resonance imaging 2 (fMRI) data, yet recent literature lacks a systematic comparison of the available options. Here 3 we compare six different approaches derived from multi-echo data and evaluate their 4 influences on BOLD sensitivity for offline and in particular real-time use cases: a single-echo 5 time series (based on Echo 2), the real-time T2*-mapped time series (T2*FIT) and four 6 combined time series (T2*-weighted, tSNR-weighted, TE-weighted, and a new combination 7 scheme termed T2*FIT-weighted). We compare the influences of these six multi-echo derived 8 time series on BOLD sensitivity using a healthy participant dataset (N=28) with four task-based 9 fMRI runs and two resting state runs. We show that the T2*FIT-weighted combination yields 10 the largest increase in temporal signal-to-noise ratio across task and resting state runs. We 11 demonstrate additionally for all tasks that the T2*FIT time series consistently yields the largest 12 offline effect size measures and real-time region-of-interest based functional contrasts. These 13 improvements show the possible utility of multi-echo fMRI for studies employing real-time 14 paradigms, while caution is still advised due to decreased tSNR of the T2*FIT time series. We 15 recommend the use and continued exploration of T2*FIT for offline task-based and real-time 16 fMRI analysis. Supporting information includes: a data repository 17 (https://dataverse.nl/dataverse/rt-me-fmri), an interactive web-based application to explore the 18 data (https://rt-me-fmri.herokuapp.com/), and further materials and code for reproducibility 19 (https://github.com/jsheunis/rt-me-fMRI). Task; Resting state. 25

Introduction 1
In functional magnetic resonance imaging (fMRI), T2*-weighted MRI sequences use the blood 2 oxygen level-dependent (BOLD) signal as a proxy for neuronal activity. Our ability to infer 3 accurate information about neuronal processes is influenced by the sensitivity with which we 4 can capture these BOLD changes and subsequently delineate its sources of variance. 5 Improved sensitivity is particularly important for real-time use cases, such as adaptive 6 experimental paradigms, real-time quality control, or fMRI neurofeedback, where BOLD 7 changes are quantified and used as they are acquired without the benefit of a full dataset or 8 the requisite amount of post-processing time. It is well known that optimum sensitivity of single-9 echo fMRI is achieved at an echo time (TE) close to the apparent tissue T2*-value at baseline 10 (Menon et al., 1993), which also underlies an inherent drawback of T2*-weighted sequences. 11 Location-specific BOLD sensitivity is suboptimal since T2* varies across tissue types and brain 12 regions (Peters et al., 2007), which can result in spatial variability in the detection of task-13 related activation patterns. Furthermore, magnetic susceptibility gradients on a macroscopic 14 level result in image defects such as signal dropout and distortion, which is pronounced in the 15 ventromedial prefrontal, orbitofrontal, the medial temporal and the inferior temporal lobes 16 (Devlin et al., 2002). Additionally, the complex interplay of blood flow, blood volume and 17 magnetic susceptibility effects can be influenced strongly by system-and participant-level 18 noise sources, thus confounding the BOLD signal. 19 20 An advancement that has shown promise in making inroads into these drawbacks is multi- summation is a critical step in multi-echo post-processing that has been reported to increase 26 temporal signal-to-noise ratio, decrease signal drop-out, and improve activation extent for 27 task-analysis (Poser et al., 2006). Posse et al. (1999) proposed several echo combination 28 schemes, including simple echo summation (i.e. equal weights) and weighting echoes by their 29 relative expected BOLD contrast contribution (i.e. T2*), which would require a numerical or 30 fitted estimation of T2*. Other possible weighting schemes include optimised scalar weights, 31 TE-weighted combination, and tSNR-weighted combination (also termed the PAID method) 32 proposed by Poser et al. (2006). A theoretical framework for optimizing multi-echo combination 33 has also been proposed by Gowland and Bowtell (2007). However, the relative benefits of all 34 available combination schemes remain unclear. 35 36 With access to multiple data samples along the decay curve, multi-echo allows quantification 37 of the effective transverse relaxation parameter T2* (decay time) or R2* (its inverse, decay 38 rate), and S0 (initial net magnetization). This form of quantitative T2*-mapping (such as 39 described by Weiskopf et al., 2013) acquires multiple closely spaced echoes followed by a 40 data fitting procedure that yields a static, baseline T2*-map. In the context of functional 41 imaging, however, temporal or per-volume T2*-mapping is also feasible, with the core benefit 42 being the separation and quantification of T2* and S0 changes (from baseline) during 43 stimulated neuronal activation. Such real-time use cases of multi-echo data have been 44 reported, starting with Posse et al.'s (1998) single-shot, multi-echo spectroscopic imaging 1 sequence that quantified region-specific T2* changes during olfactory and visual tasks, and 2 which reported a larger functional contrast (up to 20% increase in the visual cortex) compared 3 to standard EPI data. Several developments followed, including measuring single-event 4 related brain activity (Posse et al., 2001), whole brain T2*-mapping at 1.5T using a linear 5 combination of echoes (Hagberg et al., 2002), later with added gradient compensation (Posse 6 et al., 2003), and a multi-echo EPI sequence at 3T with real-time distortion correction 7 (Weiskopf et al., 2005). Rapid T2*-mapping has also been a useful tool in studying the interplay 8 between cerebral blood flow, blood volume and blood oxygenation, particularly in combination 9 with contrast agents (see, TE-dependent weights pre-selected to yield an average T2*-value of 30ms in the amygdala. 14 15 Although methodological studies have reported the benefits of multi-echo fMRI combination, 16 a comprehensive evaluation of its practical benefits is lacking. Specifically, a variety of 17 combination methods exist that can lead to both offline and real-time improvements in BOLD 18 sensitivity, but there has been no systematic comparison between such methods. Additionally, 19 per-volume T2*-mapping forms a necessary step in established multi-echo-based methods, 20 but recent literature has not explored its value for task fMRI analysis. Consequently, this study 21 has two main goals: (1) to explore the differences in BOLD sensitivity, both offline and per-22 volume, between time series of standard single echo EPI, per-volume estimated T2*FIT, and 23 multi-echo-combined time series (including tSNR-weighted, T2*, TE-weighted, and T2*FIT-24 weighted); and (2) to explore the T2*FIT time series as an alternative to single-echo or multi-25 echo-combined time series for offline and real-time fMRI analysis. We investigate these aims 26 for whole brain data in separate task paradigms, eliciting responses to motor and emotion 27 processing tasks and mental versions thereof, and during resting state. To quantify 28 differences, we employ several metrics such as tSNR, task activity effect size, region-of-29 interest based temporal percentage signal change (tPSC), functional contrast, and temporal 30 contrast-to-noise ratio (tCNR). 31

Multi-echo fMRI relaxation and combination 32
Multi-echo fMRI sequences acquire multiple whole brain functional images at discrete echo 33 times (TE) after a single transverse excitation pulse of the scanner. This is done within the 34 standard repetition time (TR) which then yields multiple echoes per volume. The relaxation of 35 the fMRI signal in a given voxel after transverse excitation, assuming a mono-exponential decay 36 model, is given as: 37 Eq. 1.
with S(t) being the time-decaying fMRI signal, S0 being the tissue magnetization directly after 38 transverse excitation, and T2* being the local tissue transverse relaxation (i.e. decay time) 39 constant (the inverse of the decay rate, R2*). Per-voxel estimates of S0 and T2* (depicted below 40 in Fig. 1) can be derived using a log-linear regression estimation and the available echo times 1 (t1 to tn, where pinv is the pseudo-inverse log the natural logarithm): 2 Eq. 2.

Figure 1. A representation of mono-exponential signal decay showing diminishing image intensity along
4 three echoes. The second echo is sampled at the optimum echo time equal to average grey matter T2*, standard 5 for single echo fMRI. The equation for the red, mono-exponential decay curve is provided (Eq. 1). 6 7 The mathematics of all widely used multi-echo combination schemes are based on the 8 underlying concepts of data weighting, summation and averaging. In the supplementary online 9 material, we provide a thorough background of these concepts along with explanatory 10 equations S1 through S6. Importantly, the multi-echo combination schemes presented below 11 use the convention of weighted summation with normalized weights. This implies that (1) all 12 weights are normalized such that their sum equals 1, then (2) each normalized weight is 13 multiplied by its corresponding data point, then (3) these products are summed to produce the 14 weighted summation. 15 16 Simple echo summation 17 Simple echo summation assumes equal weights for all echoes (totalling N), which is calculated 18 for an individual echo n as: 19 Eq. 3. 20 The T2*-weighted combination scheme used by Posse et al. (1999) and termed "optimal 21

TE-weighted combination 4
Purely using each echo's echo time, TEn, as the weight for that echo has also been suggested 5 (Posse et al, 1999). In this case, the same scalar value is used as the weighting factor for all 6 voxels of a specific echo: 7  13 Finally, as proposed in the introduction, real-time T2*-mapping is made possible when using 14 multi-echo fMRI. Here, the per-volume estimation of T2* at each voxel, termed T2*FIT(t), can 15 also be used as the weighting factor in a per-volume echo combination scheme: 16

Eq. 8.
The per-volume nature of this echo combination scheme makes it ideal for use in both offline 17 and real-time applications, when an priori T2*-map is not available or not preferred. To the best 18 of our knowledge, this T2*FIT-weighted combination approach has not been described 19 previously in the literature 20 21 In the methods and results presented in this work, we compare metrics derived from standard 22 single echo fMRI analysis to metrics derived from analysing T2*-weighted, tSNR-weighted, TE-23 weighted, T2*FIT-weighted, and the T2*FIT parameter time series, in both offline and per-24 volume scenarios. 25

1
In-depth descriptions of the participant details, ethics approval, experimental design, MRI 2 protocol, preprocessing, and data quality can be accessed in the related data article (Heunis 3 et al., 2020a). Summarising statements are provided below for the sake of completeness. 4

5
MRI and physiology data were collected from N=28 participants (male=20; female=8; age = 6 24.9 ± 4.6 mean + standard deviation). The study was approved by the local ethics review 7 board and all participants gave written consent for their data to be collected, processed and 8 shared in accordance with a GDPR-compliant procedure. 2. rest_run-1: the first resting state run, eyes fixated on a white cross 14 3. fingerTapping: a right hand finger tapping functional task 15 4. emotionProcessing: a matching-shapes-and-faces functional task 16 5. rest_run-2: the second resting state run, eyes fixated on a white cross 17 6. fingerTappingImagined: an imagined finger tapping functional task 18 7. emotionProcessingImagined: a functional task to recall an emotional memory 19 20 All four task paradigms followed an ON/OFF boxcar design, starting with the OFF condition, 21 with both conditions lasting 10 volumes (= 20 s at TR = 2 s). The control (i.e. OFF) condition 22 for the fingerTapping task was to focus on a small white cross on a black screen; for the 23 emotionProcessing task the control condition was the shape-matching block; and for the 24 fingerTappingImagined and emotionProcessingImagined tasks the control conditions were 25 counting backwards, respectively, in multitudes of 7 and 9. 3.5×3.5×3.5 mm; in-plane matrix size = 64×64; number of slices = 34; slice thickness = 3.5 38 mm; interslice gap = 0 mm; slice orientation = oblique; slice order/direction = 1 sequential/ascending; phase-encoding direction = A/P; SENSE acceleration factor = 2.5. Parts 2 of the cerebellum and brainstem were excluded for some participants to ensure full coverage 3 of the cortex and subcortical areas of interest. Echo times, spatial resolution, and the SENSE 4 factor were tuned with the aim of improving spatial resolution and coverage while limiting the 5 TR to maximum 2000 ms, including a maximum number of echoes, and keeping the SENSE 6 factor low to prevent SENSE artefacts. 7 8 In addition, cardiac and respiratory fluctuations were recorded during the functional scans, 9 respectively using a pulse oximeter fixed to the participant's left index finger, and a pressure-10 based breathing belt strapped around the participant's upper abdomen. These were sampled 11 at 500 Hz. unless otherwise stated, to describe the effects and facilitate the use of these methods in real-17 time fMRI use cases.

19
All All data analysis scripts can be accessed for reproducibility or reuse with attribution at 36 https://github.com/jsheunis/rt-me-fMRI. 37

38
The basic anatomical and functional preprocessing pipeline applied to all data is described in 39 detail in the data article, and in included: 40 1. Defining a functional template from Echo 2 of the first volume of the first resting state 41 run. 42 2. Mapping prior data to the subject functional space, including: 1 a. Coregistration of the anatomical image and atlas-based regions of interest to 2 the functional template space, and resampling these to the functional 3 resolution. 4 b. Tissue-based segmentation the coregistered anatomical image and definition 5 of binary maps for grey matter, white matter, cerebrospinal fluid (CSF) and the 6 whole brain. 7 3. Basic functional preprocessing steps, including: estimating realignment parameters 8 from the Echo 2 time series, running slice timing correction on all echo time series, 9 applying realignment parameters to all echo time series, and applying spatial 10 smoothing (7 mm isotropic, i.e. twice the voxel width) to all echo time series. 11 4. Generating data quality control metrics and visualizations to allow inspection of the 12 quality of anatomical and functional data and their derivatives.

14
Two aspects of the preprocessing and analyses pipelines are worth highlighting in the context 15 of this study. Firstly, while an important focus for this work is its application and utility in real-16 time scenarios, all processing was done offline, either on the full dataset or on a per-volume 17 (i.e. simulated real-time) basis. This was viable since the study did not include any 18 neurofeedback or real-time adaptive paradigms that would have required real-time 19 computation and interaction. Secondly, the concept of "minimally processed" data was 20 approached in accordance with suggestions from the tedana pipeline and its developers (see 21 https://tedana.readthedocs.io/en/latest/index.html; Dupre et al., 2020), which states that 22 minimal steps (including slice timing correction and 3D volume) should be applied to multi-23 echo data before decay parameter estimation or multi-echo combination. 24

Data quality control 25
The fMRwhy toolbox has a BIDS-compatible data quality pipeline for functional and anatomical 26 MRI, fmrwhy_bids_workflowQC, that can be run automatically for a full BIDS-compliant 27 dataset. After running minimal preprocessing steps it generates a subject-specific HTML-28 report with quality control metrics and visualizations to allow inspection of the data and its 29 derivatives. Individual reports can be accessed in the derivatives directory of the shared BIDS-30 compliant dataset of this study (see Heunis et al., 2020a for details). Additionally, a web-31 application named rt-me-fMRI is provided along with this work and accessible at: https://rt-32 me-fmri.herokuapp.com/. It can be used interactively to explore various summaries of data 33 quality metrics, including distributions of framewise displacement (FD) and tSNR, and 34 physiology recordings, as well as the results of this study. 35 36 None of the participant datasets were excluded after inspection of the included quality metrics, 37 even in cases of more than average or severe motion (specifically sub-010, sub-020, and sub-38 021), since such data could still be useful for data quality related insights or for future denoising 39 methods validation. Existing weighting parameters or parameter maps are required to allow both offline and per-2 volume combination of multi-echo data. Of the previously reported options for weighting 3 schemes given in Section 2.2, the simplest option used in this study is the echo time (Eq. 6) 4 derived from the functional MRI protocol, which yields a TE-weighted combination. Other prior 5 weighting parameters are calculated using the first resting state functional scan. For each 6 minimally preprocessed echo time series of the resting state run, the time series mean and 7 standard deviation are calculated. The mean divided by the standard deviation yields the 8 temporal signal-to-noise ratio (tSNR), per echo, that is used as another weighting parameter 9 (Eq. 5) described as the PAID method by Poser et al. (2006). Additionally, the mean images 10 from the three echo time series are used to derive the per-voxel estimates of S0 and T2* 11 assuming a mono-exponential decay model and using a log-linear regression estimation (Eq. 12 2). This baseline T2* map can be used for T2*-weighted combination (Eq. 4), described as 13 optimal combination by Kundu et al. (2012). Lastly, the same log-linear regression that is 14 applied to the time series mean images can also be applied to a single volume of any multi-15 echo data. This implies that the three echo images of any volume can be used as data points 16 to estimate per-volume and per-voxel parameter maps, S0FIT(t) and T2*FIT(t), which in turn 17 can be used for per-volume multi-echo combination (Eq. 8), hereinafter referred to as T2*FIT-18 combination.

20
Multi-echo combination schemes are applied to all functional data excluding the first resting 21 state run, from which prior baseline weight maps are derived. In sum, six time series are 22 computed per functional run (as described in Fig. 2): Echo 2, T2*-combined, tSNR-combined, 23 TE-combined, T2*FIT-combined, and the T2*FIT time series. First, the tSNR of each time series is calculated prior to any further processing. Then, each 29 time series is spatially smoothed using a Gaussian kernel with FWHM at 7 mm (i.e. double 30 the voxel size). This is followed by participant-level GLM-based analysis of the four task runs. 31 Task regressors included the main "ON" blocks for the fingerTapping, fingerTappingImagined 32 and emotionProcessingImagined tasks, and both the separate "SHAPES" and "FACES" trials 33 for the emotionProcessing task. Regressors not-of-interest for all runs included six realignment 34 parameter time series and their derivatives, the CSF compartment time series, and 35 RETROICOR regressors (both cardiac and respiratory to the 2nd order, excluding interaction 36 regressors). Additional steps executed by SPM12 before beta parameter estimation include 37 high-pass filtering using a cosine basis set and AR(1) autoregressive filtering of the data and 38 GLM design matrix. Contrasts are applied to the task-related beta maps for the fingerTapping, 39 fingerTappingImagined and emotionProcessingImagined tasks, and to the FACES, SHAPES, 40 and FACES>SHAPES beta maps for the emotionProcessing task. In order to yield a standard 41 measure of effect size, the parameter estimates or contrast maps are then used to calculate 42 percentage signal change (PSC) using the method described by Pernet (2014) and given by: 43 1 2     where βcondition and βconstant are parameter estimates corresponding to the relevant GLM regressors 1 that are scaled with regards to the actual BOLD magnitude. To account for this the scaling factor, 2 SF, is determined as the maximum value of a reference trial taken at the resolution of the super-3 sampled design matrix Xss (where supersampling is typically done before convolution with the 4 hemodynamic response function): 5

Eq. 10.
Statistical thresholding was applied to identify task-related clusters of activity by controlling 6 the familywise error rate (FWE), with p < 0.05, and a voxel extent threshold of 0. 7 3.4.5. Real-time analysis 8 Minimally processed time series are also analysed per-volume (using data acquired up to each 9 volume in time) in order to explore multi-echo related BOLD sensitivity changes for real-time 10 applications. Real-time analysis typically involves minimal processing (including 3D 11 realignment), spatially averaging the signal within given ROIs, and additional per-volume 12 denoising steps on the averaged signal. Here, we run a per-volume denoising process 13 adapted from OpenNFT (Koush et al., 2017) on all task time series. This process is depicted 14 in the bottom row of Fig. 2 and includes, in order: 1) Spatial smoothing using a Gaussian kernel 15 with FWHM at 7 mm, 2) Spatial averaging of voxel signals with defined ROIs, and 3) 16 Cumulative GLM-based detrending of the ROI signals, including linear and quadratic trend 17 regressors. This then yields per-volume minimally denoised ROI-signals from which 18 percentage signal change or another calculation can be used as the basis for the 19 neurofeedback or real-time ROI signal. 20

Comparison metrics 21
To explore the differences between various multi-echo combinations and standard single echo 22 data, and to investigate the usefulness of the former over the latter, we employ several 23 comparison metrics: 24 25 $ Temporal signal-to-noise ratio (tSNR), calculated as the voxel-wise time series 26 mean divided by voxel-wise time series standard deviation. tSNR is an indicator of the 27 amount of signal available from which to extract potentially useful BOLD fluctuations. 28 Additionally, tSNR maps can be a robust visual indicator of increases or decreases in 29 signal dropout.

30
$ Percentage signal change (PSC), of task-based contrast maps resulting from 31 participant-level GLM analysis. PSC represents a standardised measure of effect size 32 (which beta or contrast values are not) and is an indicator of the BOLD sensitivity of 33 the data based on GLM analysis.

34
$ T-statistic values, related to the task based contrast maps resulting from participant-35 level GLM analysis. 36 $ Temporal percentage signal change (tPSC), of the single echo, combined-echo and 1 derived time series data of the task runs. This is calculated per voxel on minimally 2 processed task data as the per-volume signal's percentage signal change from the 3 time series mean (or, for real-time scenarios, from the mean of the preceding baseline 4 "OFF" block or the cumulative mean). These are then spatially averaged within the 5 regions listed below to yield ROI-based time series. These time courses are similar to 6 what would be calculated in real-time as the ROI-based neurofeedback signal, and 7 their amplitudes can be an indicator of BOLD sensitivity. 8 $ Carpet plots for the whole brain and within regions of interest. These plots show 9 the voxel intensity over time in terms of tPSC and can be a valuable visual indicator of 10 quality issues (Power, 2017) and also of amplitude differences (i.e. functional contrast) 11 in ROIs. data.

19
$ Temporal contrast-to-noise ratio (tCNR) of the single echo, combined-echo and 20 derived time series data of the task runs. To calculate the tCNR, the functional contrast 21 in an ROI is divided by the time series standard deviation of the tPSC signal in the 22 same ROI. This is related to both the tSNR and BOLD sensitivity. Where tPSC consists 23 of time courses, tCNR provides a single summary value per voxel or region.

25
Extracting and spatially averaging voxel time series from specific regions is a common 26 approach to exploring patterns of task-based activity in fMRI (Poldrack, 2006). This can be 27 done both offline on a full dataset, and in real-time on the data as they are acquired. In this 28 work, we explore and compare the above-mentioned metrics on both whole-brain and region-29 specific levels. Regions include: 30 31 $ Grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF) compartments. 32 This allows quantifying, for example, whether combined multi-echo data changes a 33 given metric similarly or differently across tissue types.

34
$ Binary task-based clusters of voxels surviving conservative statistical thresholding 35 (FWE). These clusters vary spatially per time series of a given task run and they 36 represent the clusters of functionally most responsive voxels based on the underlying 37 data but assuming a shared criteria (i.e. statistical threshold).

38
$ A binary cluster of voxels that is a logical OR of the conservatively thresholded task 39 based clusters of all six time series of a given task run (FWE-OR). This allows the 40 comparison of metrics in a region that includes the voxels that are judged to be 41 significantly active in any time series, thus removing time series-specific spatial bias.

42
$ Atlas-based anatomical regions of interest (Atlas-based ROI) that have been mapped 43 to individual anatomical scans and coregistered and resampled to the individual 44 functional space. This allows quantification of the above metrics within an a priori 45 defined ROI, thus excluding spatial bias introduced by statistical thresholding. The 46 1 bilateral amygdala (for emotion processing). 2

3
A web-application named rt-me-fMRI is provided alongside this work and accessible at: 4 https://rt-me-fmri.herokuapp.com/. This browser-based application can be used interactively 5 to explore the summary and participant-specific results presented below, and is intended to 6 serve as supplementary material to this work. slices of a single subject is given In Fig. 3. Signal decay can be seen clearly as the signal 10 intensity diminishes from Echo 1 to Echo 3 (top to bottom) in all displayed slices. Signal 11 dropout from Echo 1 through Echo 3 is particularly evident in the areas of the orbitofrontal and 12 ventromedial prefrontal cortices, and the inferior and anterior temporal lobe (magenta arrows; 13 slices 8, 10, 12) and the cerebellum (light blue arrows; slices 2, 4, 6). 14 15

14
It is evident that most echo combination schemes, with the exception of TE-weighted 15 combination, recover some signal lost due to dropout in the orbitofrontal and ventromedial 16 prefrontal regions (magenta arrows; slices 8, 10) and inferior and anterior temporal regions 17 (light blue arrows; slices 6,8). This signal recovery is further demonstrated in the tSNR maps 18 provided in Fig. 4B, particularly by the magenta arrows showing areas of signal dropout in 19 Echo 2 and subsequent recovery in combined and derived time series tSNR maps. Even the 20 T2*FIT, for which the tSNR is evidently much lower than all other time series including Echo 2, 1 recovers some of the signal that is lost due to low BOLD sensitivity in the affected areas, 2 although signal loss is also more evident (slice 10). Additionally, tSNR in areas close to the 3 bilateral temporal-occipital junction and towards the occipital lobe (Fig. 4B, green arrows) 4 appears to increase substantially for all combined time series vs. Echo 2. This is more 5 pronounced in the T2*FIT-weighted compared to the T2*-weighted and tSNR-weighted 6 combinations, and less so in the TE-weighted combination. 7 8 To provide a more quantified view than these visualisations of signal intensity (Fig. 4A) and 9 tSNR (Fig. 4B), distribution plots were created for grey matter tSNR values, both for subjects 10 individually and for the whole group. Fig. 5 shows these distributions together as a ridge plot 11 for sub-001_task-rest_run-2, from which we can see the mean tSNR increase for all combined 12 time series compared to Echo 2, with the T2*FIT-weighted combination showing the largest 13 increase (36.14%) and the T2*FIT time series showing a substantial decrease (-55.89%). 14 15 16 17

23
The mean grey matter tSNR values are summarised on a group level for all subjects and all 24 functional runs in Fig. 6. Fig. 6A displays mean grey matter tSNR for the whole brain, while 25 Figs. 6B and 6C show the same for the left motor cortex and the bilateral amygdala, 26 respectively. Fig. 6A shows the same relationship between tSNR values for single echo, 27 combined echo, and derived time series as was seen for a single run of the single subject in 28

12
This increase in the tSNR of T2*FIT-weighted combination replicates results that we previously 1 reported on a different dataset (Heunis et al., 2019). This relationship also repeats for different 2 regions, as can be seen for the left motor cortex (Fig. 6B) and the bilateral amygdala (Fig. 6C).

4
Note, however, that the mean tSNR values increase differentially based on the region. For the 5 T2*FIT-weighted combination, for example, whole brain data show a mean tSNR increase of 6 36.95%; the left motor cortex shows a mean tSNR increase of 31.63%; and the bilateral 7 amygdala shows a mean tSNR increase of 53.35%. Other combined time series show 8 percentage increases following the same pattern, i.e. a larger tSNR increase for the amygdala 9 than for the whole brain or motor cortex. This could be explained by the baseline T2*-values 10 in the motor cortex and the whole brain being closer to the time Echo 2 (28 ms) than the T2*-11 values in the amygdala, i.e. that the T2*-weighting of Echo 2 in those regions is already closer 12 to optimal than the weighting of Echo 2 in the amygdala. This suggests that the amygdala and 13 similarly affected areas with T2*-values that are different from the average have more to gain 14 from the multi-echo combination process.

16
Another noteworthy aspect is the low signal intensity and low tSNR of the T2*FIT time series.

17
The low signal intensity is explained by the fact that T2*FIT values correspond to quantified 18 units (ms) that are expected to be in a certain range (~ 0 to 120 ms for the human brain at 3T, 19 Peters   Temporal percentage signal change is useful to inspect the per-volume fluctuations of signal 10 in task-related regions, both for offline and real-time scenarios. tPSC in the offline scenario is 11 calculated per volume from minimally processed data, yielding a per-voxel tPSC time series 12 that can be depicted in a carpet plot or used for ROI analysis. tPSC for realtime scenarios is 13 calculated from real-time minimally denoised ROI-averaged signal (with regards to the mean 14 of the preceding baseline "OFF" block or with regards to the cumulative total or baseline mean) 15 yielding the real-time ROI-signal typically used in region-based neurofeedback. 16 17 Offline temporal percentage signal change 18 19 Fig. 9 (A through C) shows per-voxel tPSC as atlas-based carpet plots for three time series 20 of the fingerTapping task in a single subject (sub-001_task-fingerTapping): (A) Echo 2, (B) 21 T2*-weighted combination, and (C) T2*FIT. T2*-weighted combination was included as a 22 representative combined time series, since all multi-echo combined time series showed similar 23 intensity fluctuations. Visual inspection indicates that the T2*FIT time series has larger intensity 24 differences in the ON/OFF bands than other signals, suggesting increased functional contrast 25 for T2*FIT compared to single-echo and combined multi-echo time series. This is reflected 26 further in Fig. 9D, which shows averaged tPSC signals within the same atlas-based ROI. The T2*FIT signal clearly has a larger functional contrast (higher tPSC during task blocks and 1 lower tPSC during resting blocks) than all other signals, for which the amplitude differences 2 are near-identical. Functional contrast in tPSC signals increase as the regions change from 3 atlas-based, to FWE-OR, to FWE (i.e. as regions become more spatially matched to the 4 participant's functional activation localisation), as can be seen in the supplementary browser-5 based application. Note that the T2*FIT time series exhibits a noisier tPSC signal, as can be 6 seen from the larger volume-to-volume fluctuations compared to the overall signal amplitude, 7 visible in both Figs. 9C and 9D. 8 9 However, there is a disparity between regions and activity types when assessing the tendency 10 of T2*FIT to increase functional contrast in offline tPSC. Specifically in the amygdala, 11 influenced by increased physiological noise and signal dropout, we are less likely to find the atlas-based cluster, the T2*FIT time series yields a substantial increase in negative task 20 contrasts.

Real-time temporal percentage signal change 1 2
While offline tPSC is useful for post-hoc inspection of signal quality and task activity, it does 3 not accurately reflect the effects seen for real-time scenarios where per-volume calculations 4 can only use information available up to the most recently acquired volume. For that purpose, 5 minimally processed data are cumulatively detrended and real-time tPSC is then calculated 6 with regards to a cumulative baseline mean, which also allows closer comparison with offline 7 tPSC which underwent similar steps. Fig 11 shows functional contrast calculated from real-8 time tPSC signals for the same tasks and clusters as in Fig. 10. series is substantially increased compared to its offline counterpart (Fig. 10). Functional contrast is presented as 20 differences in percentage signal change (y-axes).

22
Note the increased functional contrast for the real-time T2*FIT time series (Fig. 11) compared 23 to its offline counterpart (Fig. 10). This is evident for all time series and clusters shown in Fig.  24 11, where the minimum percentage increase of T2*FIT functional contrast over Echo 2 25 functional contrast is 260.61% (from 0.33 to 1.19) in the FWE-OR cluster of the 26 emotionProcessing task. A caveat here, as with the offline tPSC and functional contrast 27 calculations, is that the T2*FIT time series has large overall fluctuations compared to the other 28 time series. It is conceivable that these fluctuations could drive an increase in the estimated 29 functional contrast, regardless of the variability of such an estimate. To take this into account, 30 the functional contrasts are divided by the standard deviation of the tPSC time series to yield 31 the temporal contrast-to-noise ratio (tCNR). This is shown for the FWE-OR clusters in the 1 fingerTapping and emotionProcessing tasks in Fig. 12   The tCNR distributions clearly show the effect that high signal fluctuations (i.e. increased time 1 series standard deviation) can have on the tCNR. In both Fig. 12A and Fig. 12B, the tCNR of 2 the real-time T2*FIT time series is substantially lower than that of Echo 2 (decreasing by about 3 70% in both cases) as well as that of all other time series. In Fig. 12C and 12D, tPSC signals 4 in the FWE-OR clusters show higher amplitude differences for the T2*FIT time series 5 compared to all other time series, echoing the increased functional contrast seen for the group 6 in Fig. 11, although an increase in volume-to-volume fluctuations relative to the signal 7 amplitude is also evident. This pattern of increasing T2*FIT time series functional contrast and 8 decreasing tCNR repeats for all tasks and clusters, which shows the importance of taking into 9 account both signal fluctuations and amplitude differences. 10 series. These differences were explored in terms of: temporal signal-to-noise ratio, percentage 19

Discussion
signal change as task-based effect size measure, offline and real-time temporal percentage 20 signal change, functional contrast, and temporal contrast-to-noise ratio.

22
Our results, across 28 participants, are summarised as follows. Dropout recovery is more 23 pronounced (in orbitofrontal, ventromedial prefrontal regions as well as inferior and anterior 24 temporal regions) for the T2*-weighted, tSNR-weighted, and T2*FIT-weighted combinations 25 than for the TE-weighted combination. All multi-echo combined time series yield increases in 26 tSNR compared to Echo 2, with the newly-proposed T2*FIT-weighted combination resulting in 27 the largest increase in mean tSNR. For the T2*FIT-weighted combination, increases in mean 28 tSNR are larger for the amygdala than for the left motor cortex or the whole brain. In contrast, 29 the T2*FIT time series results in a substantial mean decrease in tSNR from Echo 2.

30
Alternatively, the T2*FIT time series yields the largest effect size measures across all 31 investigated functional tasks and clusters, whereas the effect size measures derived from 32 combined echo time series tend to decrease slightly from those of Echo 2, for all functional 33 tasks. Based on temporal percentage signal change calculated offline from minimally 34 processed data, the T2*FIT time series yields the highest functional contrast for all tasks. 35 Similarly, based on temporal percentage signal change calculated in simulated real-time from 36 cumulatively denoised data, the T2*FIT time series also yields the highest functional contrast 37 for all tasks, although this increase is substantially more than the increase seen for its offline 38 counterpart. For both offline and real-time scenarios, the temporal contrast-to-noise ratio of 39 the T2*FIT time series shows a substantial decrease from Echo 2, while that of all other time 40 series are comparable to Echo 2.

42
The fact that multi-echo combined time series yields increased tSNR compared to single echo 1 data has been widely demonstrated in previous research, and has been repeated here for all 2 combined time series with respect to Echo 2. Additionally, we show that the novel T2*FIT-3 weighted combination yields the largest increase, replicating our previous results from a 4 different dataset (Heunis et al., 2019). In the amygdala, a mean increase in tSNR of 53.35% 5 was calculated across participants, while the mean increases for the left motor cortex and the 6 whole brain were respectively 31.63% and 36.95%. These differences suggest that multi-echo 7 combination, and in particular T2*FIT-weighted combination, could prove more useful in terms 8 of tSNR for areas traditionally suffering from suboptimal BOLD sensitivity due to their lower 9 local baseline T2*-values. On the other hand, improving tSNR in individual regions could also 10 benefit whole-brain methods where spatially distributed ROIs or networks are used as the 11 neurofeedback substrate (e.g. connectivity-based neurofeedback employed by Megumi  this approach has proven reduced BOLD sensitivity than the rest of combination approaches 17 investigated here.

19
While not novel, an important aspect demonstrated here was the decrease in tSNR for the 20 T2*FIT time series. Importantly, the fitting procedure used to estimate per-volume T2*-and S0-21 values (assuming a mono-exponential decay curve) yields noisy results that influence the 22 amplitude of the signal fluctuations with respect to the mean, thus increasing the standard 23 deviation and decreasing tSNR. The pitfalls of assuming mono-exponential (as opposed to 24 multi-compartment) decay and using a fitting procedure with few data points (3 in this case) 25 have been described before (Whittall et al., 1999) and remain applicable here. Future work 26 should aim to exploit technical advances such as simultaneous multi-slice imaging to increase 27 the number of echoes acquired per volume, while investigating more robust models of T2*-28 decay.

30
While tSNR is a useful quantifier of relative spatial signal increases and dropout recovery, it 31 does not paint the complete picture of influences on BOLD sensitivity. To investigate how 32 multi-echo derived data could improve our ability to link BOLD changes to neuronal effects, 33 we The T2*FIT time series also consistently yielded the largest functional contrasts in terms of 5 differences in task vs. baseline amplitudes in tPSC signals calculated from offline and real-6 time data. As an example, we observe a 87.91% increase in mean PSC (T2*FIT compared to 7 Echo 02) for the FWE-OR cluster of the fingerTapping task, and increases in functional 8 contrast for the same task and cluster of 100% and 293%, respectively for offline and real-9 time scenarios. Interestingly, functional contrast for the real-time calculated tPSC signal 10 showed an increase above the functional contrast calculated from offline data. The main 11 mathematical difference in real-time vs offline approaches that this could be ascribed to is the 12 cumulative calculation vs offline calculation, especially as regards the mean (cumulative 13 baseline mean vs full time series mean). 14 15 Theoretically, we should expect an increase in BOLD sensitivity when analysing quantified T2* 16 fluctuations versus fluctuations in single echo image intensity, since the separation of T2*-and 17 S0 should remove (to a considerable extent) system-level, inflow, and subject-motion effects 18 from the T2*-signal. What is left in the form of voxel-based T2*FIT-values would then 19 theoretically be more indicative of local neuronal activity than information derived from single 20 echo data, assuming noise from the fitting procedure and other confounding factors do not 21 attenuate this contrast substantially. Having said that, this highlights an important caveat 22 related to the interpretation of increased functional contrast of the T2*FIT signal compared to 23 that of other signals: the size of signal fluctuations should be taken into account. This is 24 evidenced by the large decrease in temporal contrast-to-noise ratio of the offline and real-time 25 calculated tPSC signals. Temporal smoothing or statistical filtering (e.g. a Kalman filter) are 26 methods that have been employed in neurofeedback tools to smooth out ROI signals 27 calculated in real-time, and this could be investigated as a way to diminish detrimental effects 28 of T2*FIT fluctuations. 29 30 In terms of practical applicability to real-time fMRI research, we have shown the usefulness of 31 multi-echo for real-time use cases in a 28-person dataset with several functional task designs. 32 We demonstrate that real-time T2*FIT-weighted combination yields brain wide mean tSNR 33 increases and improves signal recovery in regions affected by dropout, compared to single 34 echo and other combined multi-echo time series. We show additionally that the real-time 35 T2*FIT time series yields large functional contrasts compared to single echo or combined multi-36 echo time series. These improvements could benefit both real-time brain wide connectivity 37 measures and real-time region-based signals, respectively, showing the possible utility for 38 studies on adaptive paradigms and neurofeedback.

40
It remains important to consider caveats before implementation and in order to drive future 41 improvements. We noted, as did Clare et al. (2001), that the selection of the region of interest 42 within which to investigate activation effects, functional contrast, tSNR and more, can increase 43 the variability of results and subsequent inferences. This issue was evaluated here considering 44 three different ways of delineating the region of interest: FWE, FWE-OR, and atlas-based, 45 and we observed attenuation of effect sizes, T-values and functional contrast as regions 46 become less spatially matched to participants' functional activation localisation. This is 1 particularly important for the real-time neurofeedback context, where a predefined subject-2 specific region of interest is often required to enable real-time region-based signal extraction.

4
This concern about variability in the performance due to ROI definition extends to the 5 implementation of real-time denoising steps as well, as noted in our previous work on 6 denoising steps in neurofeedback studies (Heunis et al., 2020b). In the current study, we 7 intentionally implemented a minimal real-time processing pipeline to avoid confounding the 8 results. While the presented benefits of multi-echo fMRI for real-time experiments are 9 promising, further work is necessary to quantify the effects of a full multi-echo and denoising 10 pipeline on BOLD sensitivity and data quality. Taking into consideration the caveats discussed 11 here, we advise researchers planning real-time fMRI studies to design and conduct effective 12 pilot studies and to evaluate the effects robustly before deciding on the optimal multi-echo 13 implementation settings. 14 15 Lastly, we have shown that real-time multi-echo processing, specifically rapid T2*-mapping 16 and subsequent multi-echo combination is technically viable and practically supported. The 17 software tools generated through this work (and shared with the community) support several 18 per-volume or real-time multi-echo processing operations, including real-time 3D realignment 19 of multi-echo data, real-time estimation of multi-echo decay parameters, real-time multi-echo 20 combination using several weighting schemes, and multiple standard real-time preprocessing 21 steps. It provides a practical toolkit for exploring real-time multi-echo fMRI data and for 22 comparing the effects of acquisition and processing settings on BOLD sensitivity in individuals. 23 Additionally, the interactive browser application allows easy access to the results (https://rt-24 me-fmri.herokuapp.com/), while the provision of supporting material and code 25 (https://github.com/jsheunis/rt-me-fMRI) allows the presented results to be reproduced and 26 allows replication attempts to be conducted on future datasets. The Netherlands. The other authors have declared that no further competing interests exist. 30