GEOMETRIC MODE DECOMPOSITION

. We propose a new decomposition algorithm for seismic data based on a band-limited a priori knowledge on the Fourier or Radon spectrum. This decomposition is called geometric mode decomposition (GMD), as it decomposes a 2D signal into components consisting of linear or parabolic features. Rather than using a predeﬁned frame, GMD adaptively obtains the geometric parameters in the data, such as the dominant slope or curvature. GMD is solved by alternatively pursuing the geometric parameters and the corresponding modes in the Fourier or Radon domain. The geometric parameters are obtained from the weighted center of the corresponding mode’s energy spectrum. The mode is obtained by applying a Wiener ﬁlter, the design of which is based on a certain band-limited property. We apply GMD to seismic events splitting, noise attenuation, interpolation, and demultiple. The results show that our method is a promising adaptive tool for seismic signal processing, in comparisons with the Fourier and curvelet transforms, empirical mode decom- position (EMD) and variational mode decomposition (VMD) methods.


1.
Introduction. Many image processing tasks take advantage of a priori knowledge of sparsity. The discrete cosine transform and wavelet transform were the transforms most frequently selected as sparse transforms [10]. The 1D wavelet transform is a multi-scale transform and shows a good performance for representing piece-wise smooth 1D signals. The 2D tensor wavelet transform also can represent 2D images with dot-like features sparsely. However, in fact, 2D images usually contain edge-like features. [8] proposed a new system of representations, named ridgelets, which effectively handles line singularities in 2D, and was followed by a large number of studies on multi-scale and multi-directional wavelet transforms, such as the curvelet [28,35], shearlet [14], contourlet [11], and surfacelet transform [27].
The sparse transforms mentioned above are all predefined, which does not ensure that the bases are the optimal ones for representing an image in a sparse manner. There are two techniques for seeking the optimal basis: dictionary learning in the data space and mode decomposition in the frequency space.
Dictionary learning methods that take advantage of the self-similarities inside the data, among which K-SVD is the most well known, have been proposed for image processing [1,29,4]. These methods first decompose the 2D data into overlapping patches, which are used as samples to train the dictionary. The dictionary is trained such that an optimal sparse representation is achieved, and then used for image processing. Dictionary learning methods achieve improved results as compared to fixed basis methods. A new dictionary learning method, called data driven tight frame (DDTF), was recently proposed for image denoising [6,2]. This method constructs a tight frame rather than an over-complete dictionary, and as a result, it is very computationally efficient and has been used for seismic data interpolation [24,39].
In contrast to dictionary learning methods, mode decomposition methods are aimed at finding a tight support of the data in the frequency domain. For instance, if a texture lies between two scales, it may be separated by a conventional wavelet transform. Therefore, one idea is to adaptively decompose the data into some 'modes'. [18] and [19] proposed the 1D and 2D empirical wavelet transform (EWT) to explicitly build an adaptive wavelet basis to decompose a given signal into adaptive sub-bands. However, the EWT strongly depends on boundary detection methods in the Fourier domain and it may fail when two modes are overlapping. [12,13] proposed the 1D and 2D variational mode decomposition (VMD) based on the band-limited properties of the modes. VMD is fully adaptive and robust to noise. 2D VMD is suitable for analyzing oscillation patterns, such as crystal lattices. However, like the 2D wavelet transform, 2D VMD cannot optimally represent geometric features, such as lines or parabolas. Inspired by the framework of VMD, we have designed an adaptive geometric mode decomposition for 2D data.
1.1. Mode and geometric mode. A different approach, called empirical mode decomposition (EMD), was proposed by [21], which decomposes a signal into principal 'modes'. The mode here is defined as a signal whose number of extrema and the number of zero-crossings must differ at most by one. In most later works, the modes were defined as intrinsic mode functions (IMFs): where φ k (t) is the phase such that φ k (t) ≥ 0, and A k (t) ≥ 0 is the envelope. φ k (t) and A k (t) vary considerably more slowly than φ k (t) ( [18]). The immediate consequence of the new IMF definition is that the mode is an oscillation signal in the time domain and band-limited in the frequency domain. The extension of the definition to two dimensions is straightforward. However, in the 2D situation, in addition to the oscillation patterns, we are also interested in geometric information, such as lines, parabolic features, and hyperbolic features, in particular when handling seismic data. We name the signal consisting of certain geometric information the geometric mode. We provide two descriptions of the geometric mode based on the band-limited principle.
The first description is in the Fourier domain. The frequency spectra of the lines are not band-limited in all directions; they are band-limited in one direction, but not in the perpendicular direction, as shown in Figure 1. In this figure, the signal is band-limited only in the direction of the arrow. However, this definition of bandlimited signals in the frequency domain is not suitable for parabolic or hyperbolic features. The second description is in the Radon domain. For parabolic/hyperbolic features, we can use a parabolic/hyperbolic Radon transform, which results in bandlimited support in the Radon spectrum, as shown in Figure 2. We can also define other geometric modes, provided that they can be described with finite parameters, such as polynomial-like curves.
1.2. Seismic data processing and role of mode decomposition. Seismic exploration is an efficient method for exploring underground structures, which facilitates the identification of the locations of oil reservoirs. Seismic record processing includes interpolation, noise attenuation, etc. for increasing the resolution of the migrated seismic images [32]. The interpolation of seismic data based on sparse transforms constitutes the most popular methods, such as the Fourier [34,32,25], Radon [38], and curvelet transform [30]. Noise attenuation can be classified into random noise attenuation [33,9] and correlated noise attenuation, such as multiple [37,16] and ground-roll attenuation [31]. Multiples are caused by the multiple reflections between the intersections of the medium. Primaries and multiples are approximately hyperbolic with different curvatures, and therefore, a hyperbolic Radon transform was used for demultiple [17,26]. The normal moveout (NMO) technique results in seismic records that approximately consist of flat events and parabolic events, and therefore, the parabolic Radon transform was also used for demultiple on NMO-corrected traces, followed by inverse NMO, to achieve higher efficiency [22]. Usually, for demultiple with a Radon transform, a manual muting method in the Radon spectrum is used to separate primaries and multiples. Mode decomposition plays an important role in seismic data processing. EMD was used to analyze the time-frequency relationship [36], seismic reflection data [3], the instantaneous attributes of seismic data [23], and for random and coherent noise attenuation [5]. In [5], the authors first transformed the data into the frequencyspace (F X) domain by using fast Fourier transform (FFT) along the time axis and then applied EMD on each frequency slice of the Fourier spectrum. Next, the first IMF was treated as noise and the sum of the remaining IMFs was treated as a useful signal. The advantages of EMD over other methods are that (1) it is easily implemented and no parameter is required, and (2) it is adaptive to the data and can handle non-stationary signals. However, when used in noise attenuation, EMD attacks all the energy at high wave numbers, and therefore, fails to preserve the energy of high-dip-angle events. As an improvement, [9] proposed EMDPEF, which can preserve the energy of high-dip-angle events better than the EMD method. In our previous work [40], we introduced complex-valued VMD of seismic data and showed that it can achieve a higher resolution in the wave-number axis and a higher denoising quality than the EMD and F X deconvolution methods [34]. However, the relationship between different frequency slices was not considered. Under the assumption of linear events, the energy support in different frequency slices should remain on the same line. Considering this a priori information, we can significantly remove more noise while retaining useful energy.
1.3. Our work. Three characteristics of 2D VMD inspired us. First, the framework of 2D VMD allows one to explore the directional modes of the data adaptively. Second, the support of 2D IMFs in the Fourier domain is dot-like, which is designed for global oscillation patterns. Usually, practical data are combined with linear features, such as the edge of different regions, or the events in the seismic data. Third, the mode decomposition in the Fourier spectrum can also occur in the Radon spectrum.
Inspired by 2D VMD, we propose a theoretical framework for adaptively decomposing data into different geometric modes according to a priori information assumed in a specific application. For example, in this paper, in order to represent the modes with directional and line-like geometric features, we present a geometric mode decomposition (GMD) method based on the Fourier transform (GMD-F). We assume the frequency domain is line-supported in GMD-F, unlike in 2D VMD. Therefore, we design a new objective function and derive a new direction Wiener filter.
Another example of GMD is based on the Radon transform. The common midpoint portions (CMP) and common shot gathers (CSG) are approximately constituted by a combination of hyperbolic events. NMO-corrected traces are approximately constituted by a combination flat events and parabolic events. The NMOparabolic approximation is computationally more efficient, and therefore, we focus on the parabolic situation in this paper. We design a second version of GMD based on the parabolic Radon spectrum, denoted by GMD-R. We apply GMD for seismic data event separation, denoising, interpolation, and adaptive demultiple, and achieve results that are better than those of the F X deconvolution method.
The rest of this paper is arranged as follows. In the second part, we introduce the theory of VMD and GMD. In the third part, we derive the new Wiener filter and provide the algorithm for solving GMD. In the fourth part, we present some applications of GMD to seismic data. Finally, we conclude this paper and give possible directions of future work.

Theory.
2.1. Variational mode decomposition. VMD was proposed to decompose data into an ensemble of band-limited IMFs. The IMFs are extracted concurrently instead of recursively to achieve high efficiency. VMD is achieved by solving the optimization problem [12] (2) min The optimization problem (2) can be physically described as follows. (1) 1D data f are decomposed as a combination of the band-limited modes u k . (2) If u k is real-valued, it should be transformed to an analytic signal for a single-sided spectrum. 1 πx is the impulse response of the Hilbert transform H. Therefore the analytical signal is The center frequency of (δ(x) + j πt ) * u k (x) is shifted to zero frequency by the term e −jω k x , where ω k denotes the weighted center of the spectrum of mode u k (x). (4) The shifted signal should be smooth along the x-axis, and therefore, we minimize the norm of the derivation along the x direction. [13] proposed 2D VMD for real-valued data where u AS,k is the 2D analytic signal of interest, defined in the frequency domain as (4)û AS,k ( ω) = (1 + sgn( ω · ω k )) u k ( ω).

2.2.
Geometric mode decomposition based on the Fourier spectrum. Both 1D and 2D VMD assume the modes are band-limited. 2D VMD can be applied in situations where oscillation patterns exist, such as in crystal fracture analysis. However, there are situations where the modes are band-limited only in one direction, but not in the perpendicular direction, such as where the 2D seismic data consist of linear events. Suppose f (x, t) is a seismic section consisting of one linear event: where a is the function of a seismic wavelet, p controls the slope the linear event.
The 2D Fourier transform of f (x, t) is: where δ(·) is the Dirac function. Therefore thef fulfills the one-direction bandlimited property. Since the 2D Fourier transform is a linear operation, the Fourier spectrum of a seismic section consisting of more than one linear event still fulfills the one-direction band-limited property corresponding to each event.
In our previous work [40], we addressed the 2D seismic data on each frequency slice of their Fourier spectrum, which is 1D band-limited, whereas in this study, we address 2D directly. We use the directional derivation in the objective function of GMD-F where θ k is the main direction of the mode u k in the data space and n θ k = (cos θ k , sin θ k ). The objective function (7) can be described by the following steps.
(1) The 2D data f are decomposed as a combination of the modes u k . (2) The mode u k should be smooth along the direction θ k . (3) The norm of the derivation is minimized on the direction θ k .
Step 2 can be split into two steps for consistency with the 1D situation. First, instead of shifting the center frequency of u k to the original point, we rotate u k by θ k clockwise. Second, the rotated mode should be smooth on the x-axis.

2.3.
The relationship between GMD-F and 2D VMD. In equation (20) and (21), we show the difference between GMD-F and 2D VMD. In this section, we apply the Fourier transform to the terms inside the norm in equation (3) and (7) to explore the relationship between GMD-F and 2D VMD. In equation (7), the norm is: and in equation (3), the norm is: If we set: ω − ω k = ω, n θ k · n θ k then GMD-F and 2D VMD are the same. Now we get: where n ⊥ θ k is a vector perpendicular to n θ k . So in the Fourier domain, GMD-F is a special case of 2D VMD when setting ω k = ω, n ⊥ θ k · n ⊥ θ k . Figure 3 shows this relationship in the Fourier domain. ω, ω, n θ k · n θ k and ω k are the three sides of a right triangle. Figure 3. The relationship between GMD and 2D VMD.

2.4.
Geometric mode decomposition based on the Radon spectrum. GMD-F can handle only linear features. For data with complex structure, a window method can be used to retain consistency with the linear event assumption. As mentioned, prestack seismic events can be approximated by hyperbolic or parabolic and the NMO-parabolic approximation is computationally more efficient,such that we focus on the parabolic situation in this paper. The definition of the parabolic Radon transform is where d(x, t) is the original seismograph, x is a spatial variable such as the offset, q is the slope of the parabolic feature (for consistency, we use the term 'slope' instead of 'curvature'), and τ is the intercept time. The realization of GMD-R is introduced in the 'Algorithm' section. It is easy to generalize GMD-R to the situation of M order polynomials with the following generalized Radon transform: The generalized GMD-R is easily solved by a M + 1 dimensional clustering algorithm. However, we omit the details since it is beyond the scope of the applications in this paper.
2.5. GMD-R with one parameter. There exists a situation where we are more concerned with the slope parameter than the intercept parameter, since velocity is more important in seismic data processing.Motivated by this idea, we present an objective function inspired by VMD for GMD-R with one parameter, p, denoted by GMD-R1: where NMO(u, p) represents the NMO (normal moveout) of seismic data u with a parameter p, which is related to normal moveout velocity v. Accurate NMO is defined as: where t 0 denotes the travel time at zero offset. The NMO with parabolic approximation is: The NMO of seismic data flattens the events. The objective function (8) can be described as follows. (1) The 2D data f are decomposed as a combination of the modes u k . (2) NMO is applied to u k . (3) the image of u k with respect to NMO should be smooth along the x-axis. (4) The norm of the derivation along the x-axis is minimized after NMO is applied. Note that in (8) we have only the slope parameter p k , without the intercept parameter τ k . As mentioned, we focus on the parabolic situation in this paper, and therefore, NMO is approximately achieved with the parabolic transform in equation 10.
3. Algorithm. Problem (7) is first written in the form of an augmented Lagrangian: Alternating minimization method of multipliers (ADMM) [20] is used for solving GMD-F and is summarized in Algorithm (1) Update θ k : end for 9: Dual ascent: The minimization of (12) can be written as (15) u n+1 k = arg min It can be solved in the Fourier domain to obtain an explicit solution: The minimization of (13) can be written as It is nontrivial to find the analytic solution for θ k . As an approximation, we first solve for the weighted center of the corresponding mode's power spectrum, i.e., ω k . Then, we approximate θ k with the direction of ω k . ω k is solved analytically by [12] The direction of θ k should be perpendicular to ω k . Therefore, θ k = atan(− ω k,1 ω k,2 ). Algorithm (1) can be summarized as a clustering problem in the Fourier spectrum, which is similar to k-means. In the first step, we randomly initialize K center frequencies. In the second, a Wiener filter (analogous to the distance criteria in k-means) is used to cluster the modes. In the third, the new center frequencies (directions) are calculated by the weighted center of the power spectrum of the new modes. The final two steps are repeated until some convergence criteria are met. The algorithm of EMD is given in Appendix A for comparison.
The main difference between VMD and GMD-F is the Wiener filters, as we write here (20) H Figure 4 shows the Wiener filters used in 2D VMD and GMD-F. It is clear that 2D VMD is point-supported and GMD-F is line-supported. Therefore 2D VMD is designed for data with point-like spectrum support, such as crystal images, while GMD-F is designed for line-like spectrum support, such as seismic data with linear events. The role of parameter α is also clear. When the value of α is small, the mode support becomes wider and therefore is more tolerant to noise. This is consistent with the optimization function. A small value of α means less regularity, and therefore, the result is more tolerant to noise. Figure 5 shows GMD-F for synthetic seismic data consisting of three linear events (K = 3), shown in Figure 5(a). Its frequency spectrum is shown in Figure 5  is set as 2000. Figure 5(e) also shows the evolution of the center frequencies of the three modes. Here, we use the center frequencies to indicate the main directions. The proposed optimization problem 7 is nonconvex, such that instead of giving theoretical analysis of the convergence, we test the numerical convergence of GMD-F for the data in Figure 5(a) with a random initialization of the center frequencies ω k . In Figure 6, the x-axis represents ω x and the y-axis the iterations in logarithmic scale. The different lines represent the evolution of ω x of the different modes. One hundred experiments were performed. As we can see, in a noise-free situation, all the randomly initialized center frequencies in our test converge to the desired points within 1000 iterations. If the data are corrupted by weak noise, our algorithm still converges. However, when the noise energy becomes sufficiently strong, the convergence success ratio decreases. Readers can refer to [12] for similar results.
Same as Algorithm (1), GMD-R can be summarized as a clustering problem on the Radon spectrum. In the first step, we randomly initialize K of (τ, p) pairs. In the second, a Wiener filter is used to cluster the modes in the Radon spectrum. In the third, the new (τ, p) pairs are calculated by the weighted center of the new modes' power spectrum. The final two steps are repeated until some convergence  Figure 7 shows GMD-R for synthetic seismic data consisting of three parabolic events, shown in Figure 7(a). Its Radon spectra is shown in Figure 7(e). Number of modes is set as 3 and α = 0.002. Note that the order of parameter α is different from the choice in the previous example, since the units used in the Fourier transform and Radon transform are different. Figure 7(e) also shows the evolution of the (τ, p) centers of the three modes. Figure 7  Update u k : end for 6: for k = 1 : K do 7: Update β k :  The solution of the optimization problem (8) is different from that in the previous GMD-R in that the Wiener filter is a function only of p, not the pair of (p, τ ). GMD-R1 is more suitable for data containing many events with only a few different slopes but many different intercepts. GMD-R1 helps reduce the number of the modes needed significantly and increases the efficiency of GMD-R. Figure 9 shows GMD-R1 for the data in Figure 8(a). Number of modes is set as 2 and α = 0.01. We show two decomposed modes in Figure 9(a) and Figure 9(b). The mode in Figure  9(a) contains two events with similar slopes. 4. Applications of GMD. In this section, we first present some applications of GMD to synthetic data. GMD-F is used for denoising poststack seismic data consisting of linear events, as compared with the VMD method and the F X deconvolution method [7]. GMD-R is used for simultaneous denoising and interpolation of prestack seismic data consisting of parabolic events, as compared with the Spitz method [34]. Then, we use GMD-F to denoise field data and compare this method with the F X deconvolution method and the curvelet method. For field data with a complex structure, we operate on small temporal and spatial windows based on the assumption of linear events. Finally, GMD-R1 is used for demultiple, as compared with the directly muting method. In the experiments, the mode centers are randomly initialized and the modes are initialized as zeros. Figure 10 shows the results of testing GMD-F for seismic noise attenuation. For denoising purposes, the parameter λ is set to zero. Figure 10(a) and Figure 10(b) show the noise corrupted data (SNR = -1.61) and their F K spectra. The data consist of three linear events, each of which represents an interface of underground. Note that the energy of one event is weaker than that of the other two. Number of modes is set as 3 and α = 2000. Figure 10(c) - Figure 10(e) show the denoised results of the GMD-F, 1D VMD, and F X deconvolution methods. It is clear that the F X deconvolution method removes useful energy of all three events. The 1D VMD method cannot preserve the energy of the weak event. On the other hand, the proposed GMD-F method can preserve the energy more effectively. Figure 10(i) - Figure 10(k) show the corresponding F K spectra. The Wiener filter in GMD-F remains on a line, and therefore, keeps the energy on a line. However, the noise in high frequency is retained, and therefore, we truncate high frequency energy to zero. 1D VMD handles the frequency slice separately, and therefore, weak energy may be merged with noise. (i)-(k) F K spectra of (c)-(e). Figure 11 shows the results of testing GMD-R for seismic anti-aliased interpolation. The data in Figure 8(a) are regularly sub-sampled by 1/4 and contaminated with random noise, shown in Figure 11(a). Its F K spectra is shown in Figure 11(b). Sub-sampled seismic data is caused by environmental or economic reasons, which restrict the distribution of the receivers. Interpolation of seismic data is essential for improving the resolution of migration and inversion. Number of modes is set as 3 and α = 0.002. Figure 11(c) and Figure 11(d) show the interpolated data and the F K spectra with GMD-R. The achieved results are considerably better than those of the Spitz method [34], shown in Figure 11(e).   Figure 12(b). The dip events in the data show the dip structures of the underground, whichare important for locating the oil reservoirs and should be well preserved while denoising. Number of modes is set as 5 and α = 5000. Figure 13 illustrates the zoomed versions of the denoised results of the GMD-F, curvelet, and F X deconvolution methods with the corresponding noise sections. The GMD-F method significantly preserves more useful energy than the F X deconvolution method by taking advantage of the line-supported a priori knowledge. The curvelet method tends to over-smooth the denoising result.  Figure 14 shows a demultiple test based on GMD-R1. Multiples are caused by the multiple reflections between the intersections of the medium and may lead to low resolution of migration or inversion. Multiple suppression is an important processing step for marine data seismic data, especially in shallow water environment [15]. Figure 14(a) shows the NMO-corrected traces. Figure 14(b) shows the parabolic Radon spectrum. Number of modes is set as 2. The primary and the multiple are located on different slopes and are detected with GMD-R1 adaptively, indicated by the blue and the red line, respectively. Figures 14(c) and 14(d) show the separated multiple and primary with GMD-R1, with α = 0.005. Our method can achieve demultiple and random noise attenuation simultaneously, as a Wiener filter is applied in the multiple and primary modes. Figures 15(a) and 15(b) show the demultiple results with GMD-R1, with α = 10 −5 . A small value of α makes the result less smooth, but the primary is still contaminated with notable multiples. Figures 15(c) and 15(d) show the separated multiple and primary when the direct muting method is applied. We manually mute a region of the Radon spectrum to zero to obtain the multiple. The primary is obtained by subtracting the multiple from the original data. As we can see, Figure 15(c) still shows some flat events and Figure 15(d) is contaminated with both notable multiple and random noise.

5.
Conclusion. We propose a new decomposition algorithm for seismic data based on a band-limited a priori knowledge on the Fourier or Radon spectrum. Different from the existing predefined multi-directional frames, GMD allows adaptive identification of the directional or other dominant geometric information in the data themselves. Our method is also different from EMD and VMD in that EMD is based on an oscillated and symmetric mode definition, VMD is based on a pointsupported assumption in the Fourier domain and GMD is based on a line-supported assumption in the Fourier domain or point-supported assumption in the Radon domain. We have applied GMD to seismic event splitting, denoising, interpolation, and demultiple. The results show that our method is a promising adaptive tool for seismic signal processing. GMD provides a framework for feature separation, for example, the hyperbolic Radon transform can also be integrated. Future work will focus on the theoretical analysis of the GMD algorithm and applications of GMD involving field seismic data.
Appendix A. EMD was designed to analyze a non-stationary signal, by decomposing the signal into different 'modes' of oscillations, named intrinsic mode functions (IMFs). An IMF satisfies two conditions: (1) the number of extrema and the number of zero crossings must be equal or differ by at most one, and (2) at any point the mean value of the envelope defined by the local maxima and the envelope defined by the local minima must be zero [21]. IMFs are extracted recursively by using a sifting algorithm: The sifting process: 3: repeat 4: Find the local maxima and minima of f (t).

5:
Fit these local maxima and minima by cubic spline interpolation in turn to generate the upper and lower envelopes.

6:
The mean of the upper and lower envelopes is calculated and subtracted from the initial data and the same interpolation scheme is reiterated on the remainder.

7:
until The mean envelope is reasonably close to zero everywhere 8: The resultant signal is designated as the kth IMF.

9:
Subtract the kth IMF from f (t) and set the difference as new f (t).