# Sparse tensor phase space Galerkin approximation for radiative transport

## Abstract

We develop, analyze, and test a sparse tensor product phase space Galerkin discretization framework for the stationary monochromatic radiative transfer problem with scattering. The mathematical model describes the transport of radiation on a phase space of the Cartesian product of a typically three-dimensional physical domain and two-dimensional angular domain. Known solution methods such as the discrete ordinates method and a spherical harmonics method are derived from the presented Galerkin framework. We construct sparse versions of these well-established methods from the framework and prove that these sparse tensor discretizations break the “curse of dimensionality”: essentially (up to logarithmic factors in the total number of degrees of freedom) the solution complexity increases only as in a problem posed in the physical domain alone, while asymptotic convergence orders in terms of the discretization parameters remain essentially equal to those of a full tensor phase space Galerkin discretization. Algorithmically we compute the sparse tensor approximations by the combination technique. In numerical experiments on 2+1 and 3+2 dimensional phase spaces we demonstrate that the advantages of sparse tensorization can be leveraged in applications.

## Introduction

In this paper, we consider the numerical solution of the radiative transfer problem (RTP). This transport problem is stated on the phase space $\Omega =D×\mathcal{S}$, the Cartesian product of a bounded physical domain $D\subset {\mathbb{R}}^{d}$, where $d=2,3$, and the unit ${d}_{\mathcal{S}}$-sphere as the parameter domain $\mathcal{S}$ of dimension ${d}_{\mathcal{S}}=d-1=1,2$. The RTP (see e.g. Modest 2003) is then the task of finding the unknown radiative intensity $u:\Omega \to \mathbb{R}$, a real function over the phase space satisfying

$\mathbit{s}\cdot {\nabla }_{x}u(\mathbit{x},\mathbit{s})+\left(\kappa (\mathbit{x})+\sigma (\mathbit{x})\right)u(\mathbit{x},\mathbit{s})=\kappa (\mathbit{x}){I}_{b}(\mathbit{x})+\sigma (\mathbit{x}){\int }_{\mathcal{S}}\Phi (\mathbit{s},{\mathbit{s}}^{\prime })u(\mathbit{x},{\mathbit{s}}^{\prime })\,\mathrm{d}{\mathbit{s}}^{\prime },$
(1a)
$u(\mathbit{x},\mathbit{s})=g(\mathbit{x},\mathbit{s}),\quad \mathbit{x}\in \partial D,\quad \mathbit{s}\cdot \mathbit{n}(\mathbit{x})<0.$
(1b)

We refer to Eq. (1a) as the stationary monochromatic radiative transfer equation (RTE), while Eq. (1b) constitutes the inflow boundary conditions. A ray of light of direction $\mathbit{s}$ is attenuated by absorption and by scattering in the medium. In (1a), κ≥0 is the absorption coefficient, σ≥0 the scattering coefficient, and Φ>0 the scattering kernel or scattering phase function. The scattering phase function is normalized to ${\int }_{\mathcal{S}}\Phi \left(\mathbit{s},{\mathbit{s}}^{\prime }\right)\,\mathrm{d}{\mathbit{s}}^{\prime }=1$ for each direction $\mathbit{s}$. Sources inside the domain $D$ are modeled by the blackbody intensity ${I}_{b}\ge 0$; radiation from sources outside of the domain or from its enclosure is prescribed by the boundary data g≥0. The vector $\mathbit{n}(\mathbit{x})$ denotes the outward unit normal vector, which is defined at almost every point $\mathbit{x}$ on the boundary $\partial D$ of the physical domain.

Due to the high dimension of the phase space, the nonlocality of the scattering operator, and the hyperbolic nature of the PDE, the efficient numerical simulation of radiative transfer is a challenging computational task even today. Still, radiative transfer as such or as a means of energy transfer among others is of interest in many applications, e.g. in the fields of heat transfer (Modest 2003), neutron transport (Hébert 2010), atmospheric sciences (Evans 1998), medical imaging (e.g. Peng et al. 2011), or other areas where transported particles interact with a background medium, but only negligibly with each other.

In this paper, we extend the range of sparse tensor product discretization methods for the RTP investigated before (Grella and Schwab 2011a; Grella and Schwab 2011b; Grella 2013; Widmer et al. 2008) by a new phase space Galerkin framework.

Apart from Monte Carlo methods for raytracing, the most popular deterministic approaches for the radiative transfer problem are the discrete ordinates method and the spherical harmonics approximation. We quote a brief overview of their properties from (Grella 2013).

In the discrete ordinates method (DOM) or ${S}_{N}$-approximation, the angular domain is discretized by a number of fixed directions, which are inserted into Eq. (1) so that a system of spatial PDEs results. Without scattering, the equations for the single directions are independent of each other; with scattering, however, they are coupled through the scattering integral. After the straightforward discretization of the angular domain, the spatial PDEs are typically solved using finite differences, finite elements, or finite volume methods.

The DOM is popular as it is simple to implement, offers straightforward parallelization, and can capture directed radiation relatively well as some of the ordinates can usually be chosen freely.

On the downside, the method can suffer from so-called “ray effects” (Lathrop 1968): due to the point evaluation in the angular domain, the scalar flux or incident radiation from small isotropic sources may appear star-like with rays emanating from the source into the chosen angular directions (Stone 2007, p. 2 and following). These effects are especially pronounced in settings with low scattering and absorption, i.e. in optically thin media.

An example of a truncated series expansion is the method of spherical harmonics or ${P}_{N}$-approximation. The solution of Eq. (1) is replaced by a series of spherical harmonics up to some order $N$ with spatially dependent coefficients. Due to orthogonality relations, the scattering part often decouples or couples only a few terms, depending on the scattering kernel. However, the system of PDEs for the spatial coefficient functions is always coupled by the transport part $\mathbit{s}\cdot {\nabla }_{x}u$.

As low order series expansions in spherical harmonics do not permit a very localized resolution of the angular variable, the method performs best when the solutions are nearly isotropic in angle, which is the case in diffusive, so-called “optically thick” media. Then, rather low order spherical harmonics approximations might suffice for a good approximation. Indeed, the ${P}_{1}$ method can be formulated as a diffusion equation for the incident radiation (Modest 2003, Sec. 15.4). For smooth solutions, the spherical harmonics method exhibits spectral convergence in angle (Grella and Schwab 2011a).

On the other hand, beam-like solutions require a high spectral order to be resolved appropriately, leading to high computational complexity. In general, higher spectral orders also lead to a sharp increase in computational complexity when boundary conditions are to be satisfied (Modest and Yang 2008).

When combined with a standard finite element or finite volume discretization in the physical domain $D$, the deterministic numerical ${S}_{N}$- and ${P}_{N}$-approximations exhibit the so-called “curse of dimensionality”: the error (typically the ${L}^{2}$-error of the solution) with respect to the numbers of degrees of freedom (DoF) ${M}_{D}$ and ${M}_{\mathcal{S}}$ on the component domains $D$ and $\mathcal{S}$ scales with the dimensions $d$ and ${d}_{\mathcal{S}}$ of the application problem as $O\left({M}_{D}^{-s/d}+{M}_{\mathcal{S}}^{-t/{d}_{\mathcal{S}}}\right)$ with constants $s$ and $t$.
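The scaling above can be made concrete by balancing the two error contributions. The following sketch is only an illustration: the function names and the sample orders s = t = 1 are assumptions, not values taken from the analysis. It computes the exponent r for which the error behaves like O(M^(-r)) in the total number of degrees of freedom M = M_D · M_S of a full tensor discretization, and compares it with the exponent s/d of a problem posed on the physical domain alone.

```python
def full_tensor_rate(d, d_s, s, t):
    """Exponent r with error = O(M**(-r)), where M = M_D * M_S.

    A target error eps requires M_D ~ eps**(-d/s) and M_S ~ eps**(-d_s/t),
    hence M ~ eps**(-(d/s + d_s/t)).
    """
    return 1.0 / (d / s + d_s / t)


def physical_only_rate(d, s):
    """Exponent of a discretization of the physical domain alone."""
    return s / d


# Hypothetical convergence orders s = t = 1 for both phase space settings.
for d in (2, 3):
    d_s = d - 1
    print(d, d_s, full_tensor_rate(d, d_s, 1, 1), physical_only_rate(d, 1))
```

Under these assumed orders the full tensor exponent is visibly smaller than the physical-only exponent in both the 2+1 and the 3+2 dimensional setting, which is the gap that sparse tensorization aims to close.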

The first sparse finite element approximation method was proposed in (Zenger 1991) for the solution of the Laplace equation in the unit square and cube. In that paper, Zenger developed the (direct) sparse grid approximation method which alleviates this curse of dimensionality: the computational complexity is reduced, up to logarithmic terms, to that of a one-dimensional problem.

The idea of sparse tensorization of finite element and finite difference methods was generalized by (Bungartz and Griebel 2004; Hegland 2003; Garcke 2007), and others, for the numerical solution of PDEs as well as for other applications where standard numerical methods are obstructed by the curse of dimensionality.

Sparse tensor methods were first applied to radiative transfer by (Widmer et al. 2008). In that paper, the authors formulated a least squares phase space Galerkin sparse tensor approximation with hierarchical finite elements as discretization of the physical domain and wavelets in the angular domain. They proved that, provided the solution is sufficiently regular, their method breaks the curse of dimensionality: the problem complexity reduces to log-linear in the number of degrees of freedom, while convergence rates deteriorate only by a logarithmic factor. However, the discretization of the scattering operator was not addressed in that work.

In earlier work (Grella and Schwab 2011a), we showed that the sparse tensor product method of (Widmer et al. 2008) can also be combined with a spectral discretization involving spherical harmonics, resulting in a sparse ${P}_{N}$-method which also treats scattering. Boundary conditions were satisfied in a strong sense by introducing piecewise spectral functions in angle.

Secondly, we presented a sparse tensor version of the DOM as a sparse collocation method with a Galerkin ansatz in the physical domain and strong enforcement of the boundary conditions, while not yet accounting for scattering (Grella and Schwab 2011b). This sparse tensor ${S}_{N}$-method was realized computationally with the sparse grid combination technique (Griebel et al. 1992) to construct a sparse approximation to the radiative transfer solution.

The sparse DOM was subsequently reformulated as a phase space Galerkin method with quadrature in angle (Grella 2013) in order to treat the sparse ${P}_{N}$- and sparse ${S}_{N}$-methods in a more uniform manner. In this reformulation, we employed streamline upwind Petrov Galerkin (SUPG) stabilization and weak satisfaction of boundary conditions. Sparse ${S}_{N}$-methods were derived as a direct sparse tensor method and implemented algorithmically via the combination technique.

In the present paper, we derive a sparse ${P}_{N}$- and a sparse ${S}_{N}$-method from the same phase space Galerkin framework with transport stabilization and scattering. Boundary conditions are satisfied in a weak sense. In doing so we close a gap in the list of conceivable combinations of formulations regarding stabilization and type of angular approximation. In contrast to the previous approach (Grella 2013), we stabilize the formulation in a different way and the analytical focus will be on the direct sparse approach. With transport stabilization and direct sparse approach we follow (Widmer et al. 2008) more closely, extending their work by treatment of scattering and weak satisfaction of the boundary conditions.

Similar savings in computational effort are realized with other variational formulations, such as Petrov-Galerkin saddle point formulations (see e.g. Dahmen et al. (2012) and the references there).

The outline of this paper is as follows. In Section ‘Phase space Galerkin method’ we formulate the phase space Galerkin framework in operator form and outline how ${P}_{N}$- and ${S}_{N}$-methods can be derived from it. Then we develop full tensor and sparse tensor discretizations based on the framework and analyze and compare their convergence properties.

Section ‘Numerical experiments’ presents several basic numerical experiments designed with the purpose of validating and illustrating the theoretical convergence results.

Finally we conclude this work in Section ‘Conclusion’ by summarizing and reviewing the results.

## Phase space Galerkin method

We begin by introducing the radiative transfer problem in operator form. Using this compact notation we then state the variational formulation of our phase space Galerkin method and proceed to discretizations of the method.

### Operator formulation

Problem (1) reads in operator form: Find the intensity $u\left(\mathbit{x},\mathbit{s}\right):D×\mathcal{S}\to \mathbb{R}$ such that

$\mathrm{A}u=f,\quad u{|}_{\partial {\Omega }_{-}}=g.$
(2)

In this, $\partial {\Omega }_{-}$ represents the inflow part of the boundary $\mathrm{\partial \Omega }=\mathrm{\partial D}×\mathcal{S}$ of the computational domain or phase space $\Omega =D×\mathcal{S}$. The inflow boundary is defined by

$\partial {\Omega }_{-}:=\left\{\left(\mathbit{x},\mathbit{s}\right)\in \Omega :\mathbit{x}\in {\Gamma }_{-}\left(\mathbit{s}\right)\right\}$
(3)

with the physical part of the inflow boundary

${\Gamma }_{-}\left(\mathbit{s}\right):=\left\{\mathbit{x}\in \mathrm{\partial D}:\mathbit{s}·\mathbit{n}\left(\mathbit{x}\right)<0\right\}.$
(4)

Correspondingly we define the physical part of the outflow boundary as

${\Gamma }_{+}\left(\mathbit{s}\right):=\left\{\mathbit{x}\in \mathrm{\partial D}:\mathbit{s}·\mathbit{n}\left(\mathbit{x}\right)>0\right\}.$
(5)
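As a small illustration of definitions (4) and (5), the following sketch classifies the four edges of the unit square by the sign of the inner product of the direction with the outward normal. The choice of domain, the edge names, and the function `classify_edges` are assumptions made only for this example.

```python
import numpy as np

# Outward unit normals of the four edges of the unit square D = [0, 1]^2
# (an illustrative choice of domain).
EDGE_NORMALS = {
    "left": np.array([-1.0, 0.0]),
    "right": np.array([1.0, 0.0]),
    "bottom": np.array([0.0, -1.0]),
    "top": np.array([0.0, 1.0]),
}


def classify_edges(s):
    """Return (inflow, outflow) edge names for a direction s.

    An edge belongs to Gamma_-(s) if s.n < 0 and to Gamma_+(s) if s.n > 0.
    Edges with s.n == 0 belong to neither set.
    """
    inflow, outflow = [], []
    for name, normal in EDGE_NORMALS.items():
        sn = float(np.dot(s, normal))
        if sn < 0.0:
            inflow.append(name)
        elif sn > 0.0:
            outflow.append(name)
    return sorted(inflow), sorted(outflow)


theta = np.pi / 4.0
s = np.array([np.cos(theta), np.sin(theta)])
inflow, outflow = classify_edges(s)
print("inflow:", inflow, "outflow:", outflow)
```

For a direction parallel to an edge, that edge is assigned to neither part, in line with the strict inequalities in (4) and (5).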

The radiative transfer operator A=T+Q consists of the transport operator T,

$\mathrm{T}u:=\left(\mathbit{s}·{\nabla }_{x}+\kappa \right)u,$
(6)

and the scattering operator Q,

$\begin{array}{ll}\mathrm{Q}u& :=\sigma {Q}_{1}u:=\sigma \left(\text{Id}-\Sigma \right)u\\ & :=\sigma (\mathbit{x})u(\mathbit{x},\mathbit{s})-\sigma (\mathbit{x}){\int }_{\mathcal{S}}\Phi (\mathbit{s},{\mathbit{s}}^{\prime })u(\mathbit{x},{\mathbit{s}}^{\prime })\,\mathrm{d}{\mathbit{s}}^{\prime }.\end{array}$
(7)

Here, Q1=Id−Σ is the unity scattering operator, and Σ is the scattering integral operator, the integral of Φ and u. The source function f contains the sources of radiation in the domain,

$f:=\kappa {I}_{b},$
(8)

and $g$ is the incoming radiation on the inflow boundary $\partial {\Omega }_{-}$, as in Sec. ‘Introduction’.

### Properties of the scattering operator

Aside from the positivity and normalization requirements already mentioned in Sec. ‘Introduction’, we assume an isotropic medium, i.e. Φ does not depend on x. As Φ models the type of scattering, this assumption can safely be made for most applications (cf. Modest 2003, p. 268). Variations in the strength of scattering due to e.g. varying spatial density of the medium are modeled by the scattering coefficient σ. As long as the following properties hold for almost every x, the complexity and convergence analysis later on could also be conducted without this assumption.

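Furthermore, if spherical scatterers are assumed, the scattering phase function does not vary with the azimuthal angle so that Φ only depends on the inner product of $\mathbit{s}$ and ${\mathbit{s}}^{\prime }$. From this it follows immediately that $\Phi (\mathbit{s},{\mathbit{s}}^{\prime })=\Phi (\mathbit{s}\cdot {\mathbit{s}}^{\prime })=\Phi ({\mathbit{s}}^{\prime },\mathbit{s})$.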

From here on, we shall take Φ to be forward dominant (cf. Kanschat 2008, Def. 1), i.e. $\Phi \left(\mathbit{s},{\mathbit{s}}^{\prime }\right)=\sum _{k=0}^{\infty }{a}_{k}\cos \left(k\arccos \left(\mathbit{s}\cdot {\mathbit{s}}^{\prime }\right)\right)$ with all ${a}_{k}\ge 0$. Then, one can show that Σ is positive semi-definite (Kanschat 2008, Lemmata 2 and 3), i.e.

${\left(v,\mathrm{\Sigma v}\right)}_{{L}^{2}\left(\mathcal{S}\right)}\ge 0\phantom{\rule{1em}{0ex}}\forall v\in {L}^{2}\left(\mathcal{S}\right).$
(9)

Normalization and symmetry of Φ with respect to its arguments lead to normalization of the operator norm ${∥\Sigma ∥}_{{L}^{2}\left(\mathcal{S}\right)\to {L}^{2}\left(\mathcal{S}\right)}=1$ (Kanschat 2008, Lemma 5).

From these properties and a Hilbert-Schmidt theorem for integral operators (e. g. Knapp 2005, Thm. 2.4), one can derive that the spectrum of ${Q}_{1}$ lies in $[0,1]$ with an isolated eigenvalue ${\lambda }_{0}=0$, from which the next largest eigenvalue ${\lambda }_{1}$ differs by a positive constant (Ávila et al. 2011, Sec. 2.2).
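These spectral properties can be observed numerically in the simplest angular setting. The sketch below assumes a circular angular domain (the case of a two-dimensional physical domain), a hypothetical forward dominant kernel with three nonnegative cosine coefficients, and an equispaced quadrature rule in angle. None of these choices are taken from the references; they only serve to show that the discretized Σ is symmetric positive semi-definite with largest eigenvalue 1, and that Id−Σ has the eigenvalue 0 separated from the rest of its spectrum.

```python
import numpy as np


def scattering_matrix(coeffs, n_angles):
    """Quadrature discretization of the scattering integral on the circle.

    coeffs[k] is the coefficient a_k of cos(k * (theta - theta')).
    Equispaced angles with weight 2*pi/n_angles are used.
    """
    theta = 2.0 * np.pi * np.arange(n_angles) / n_angles
    weight = 2.0 * np.pi / n_angles
    diff = theta[:, None] - theta[None, :]
    kernel = sum(a * np.cos(k * diff) for k, a in enumerate(coeffs))
    return weight * kernel


# Hypothetical kernel (1 + 0.8 cos(phi) + 0.1 cos(2 phi)) / (2 pi):
# positive for all phi and normalized to integrate to one over the circle.
coeffs = np.array([1.0, 0.8, 0.1]) / (2.0 * np.pi)
Sigma = scattering_matrix(coeffs, 64)

eig_sigma = np.linalg.eigvalsh(Sigma)   # Sigma is symmetric
eig_q1 = np.sort(1.0 - eig_sigma)       # spectrum of Id - Sigma

print("row sums equal one:", np.allclose(Sigma.sum(axis=1), 1.0))
print("smallest eigenvalues of Id - Sigma:", eig_q1[:3])
```

For this kernel the gap between the zero eigenvalue and the next one is determined by the first cosine coefficient, so a more strongly forward peaked kernel leads to a smaller gap.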

With the previous considerations, one arrives at the following properties of Q:

#### Lemma 1.

For any $u\in {L}^{2}\left(\Omega \right)$, the scattering operator Q as defined by Eq. (7) satisfies (cf. Ávila et al. 2011, Eq. (11))

$\begin{array}{ll}{\lambda }_{1}{∥\sigma {P}^{\perp }u∥}_{{L}^{2}\left(\Omega \right)}& \le \parallel \mathrm{Q}u{\parallel }_{{L}^{2}\left(\Omega \right)}\le {∥\sigma ∥}_{{L}^{\infty }\left(\Omega \right)}{∥u∥}_{{L}^{2}\left(\Omega \right)},\phantom{\rule{2em}{0ex}}\end{array}$
(10)
$\begin{array}{ll}{\left(u,\mathrm{Q}u\right)}_{{L}^{2}\left(\Omega \right)}& \ge {∥\mathrm{Q}u∥}_{{L}^{2}\left(\Omega \right)}^{2}\ge 0,\phantom{\rule{2em}{0ex}}\end{array}$
(11)

in which the projector ${P}^{\perp }$ maps $u\left(\mathbit{x},·\right)\in {L}^{2}\left(\mathcal{S}\right)$ to ${\left(\ker \mathrm{Q}\right)}^{\perp }$, the space orthogonal to the kernel of Q, and ${\lambda }_{1}\in (0,1]$ is the smallest nonzero eigenvalue of ${Q}_{1}$.

For a proof of (11) we refer to (Grella 2013).

### Variational formulation

Our variational formulation will be based on a Galerkin finite element framework over the phase space Ω with stabilization applied to the operator RTP (2).

#### A generic stabilized phase space variational formulation

To begin with, we define the Hilbert space

$\mathcal{V}:=\left\{u\in {L}^{2}\left(\Omega \right):\phantom{\rule{2.22144pt}{0ex}}\mathbit{s}·{\nabla }_{x}u\in {L}^{2}\left(\Omega \right)\right\}$
(12)

with the usual ${L}^{2}\left(\Omega \right)$ inner product

${\left(u,v\right)}_{{L}^{2}\left(\Omega \right)}:={\int }_{\mathcal{S}}{\int }_{D}u\left(\mathbit{x},\mathbit{s}\right)v\left(\mathbit{x},\mathbit{s}\right)\mathrm{d}xd\phantom{\rule{0.3em}{0ex}}\mathbit{s},$
(13)

and the triple bar norm

$|\!|\!|v|\!|\!|^{2}:={∥v∥}_{{L}^{2}\left(\Omega \right)}^{2}+{∥\mathbit{s}\cdot {\nabla }_{x}v∥}_{{L}^{2}\left(\Omega \right)}^{2}+{∥{Q}_{1}v∥}_{{L}^{2}\left(\Omega \right)}^{2},\quad v\in \mathcal{V}.$
(14)

For the weak enforcement of boundary conditions, we define the boundary form

$b\left(u,v\right):={\left(v,|\mathbit{s}\cdot \mathbit{n}|u\right)}_{{L}^{2}\left(\partial {\Omega }_{-}\right)}={\int }_{\mathcal{S}}{\int }_{{\Gamma }_{-}\left(\mathbit{s}\right)}|\mathbit{s}\cdot \mathbit{n}|\,uv\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s},$
(15)

in which we have omitted the dependence of the outward unit normal $\mathbit{n}$ on the position $\mathbit{x}$. This boundary form was introduced by Manteuffel et al. (2000, Eq. (2.16)). It is well-defined for functions $v\in {L}^{2}\left(\Omega \right)$ with finite inflow norm

${∥v∥}_{-}:=b{\left(v,v\right)}^{1/2}.$
(16)

Combining (14) and (16) yields the new norm

${∥v∥}_{1}:={\left(|\!|\!|v|\!|\!|^{2}+{∥v∥}_{-}^{2}\right)}^{1/2},$
(17)

which gives rise to the closed, linear subspace of $\mathcal{V}$,

${\mathcal{V}}_{1}:=\left\{v\in \mathcal{V}:{∥v∥}_{1}<\infty \right\}$
(18)

which, with the inner product related to ${∥·∥}_{1}$, is a Hilbert space. For functions $u,v\in {\mathcal{V}}_{1}$, we define the bilinear form

$\begin{array}{c}a\left(u,v\right):={\left(\mathrm{R}v,\phantom{\rule{1em}{0ex}}\mathrm{A}u\right)}_{{L}^{2}\left(\Omega \right)}+2b\left(u,v\right),\end{array}$
(19)

where R is a stabilization operator on the test function side yet to be specified. Together with the linear form

$\begin{array}{c}l\left(v\right):={\left(\mathrm{R}v,\phantom{\rule{1em}{0ex}}f\right)}_{{L}^{2}\left(\Omega \right)}+2b\left(g,v\right),\end{array}$
(20)

the bilinear form constitutes the following variational problem: Find $u\in {\mathcal{V}}_{1}$ such that

$a\left(u,v\right)=l\left(v\right)\phantom{\rule{1em}{0ex}}\forall v\in {\mathcal{V}}_{1}.$
(21)

Different ways of stabilization are conceivable and have been used in the literature, e. g. the least squares approach by (Manteuffel et al. 2000), or SUPG introduced by (Brooks and Hughes 1982). For our purposes here, we will choose the T-stabilized formulation (Grella and Schwab 2011a) to avoid mesh-dependent quantities and the square of the scattering operator. More precisely, we set R=ε T with a stabilization parameter ε that depends on the absorption coefficient κ.
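To see the T-stabilized formulation at work in the simplest possible setting, the following sketch reduces problem (21) to a toy case that is not treated in the text: a one-dimensional slab D = (0,1), a single fixed direction s = +1, no scattering (σ = 0), constant κ, and ε = 1/κ. The inflow boundary is then the point x = 0, and the bilinear form becomes (εTv, Tu) + 2u(0)v(0). The function name `solve_slab`, the data, and the use of piecewise linear elements on a uniform mesh are assumptions for illustration only.

```python
import numpy as np


def solve_slab(n_elem, kappa=1.0, blackbody=1.0, g=0.0):
    """P1 Galerkin solution of a(u, v) = l(v) with R = eps * T on (0, 1).

    T u = u' + kappa * u, f = kappa * blackbody, inflow data u(0) = g.
    """
    h = 1.0 / n_elem
    eps = 1.0 / kappa
    f = kappa * blackbody

    # Element matrices for linear shape functions on an element of length h.
    K = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h          # int phi_i' phi_j'
    M = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])      # int phi_i  phi_j
    C = 0.5 * np.array([[-1.0, -1.0], [1.0, 1.0]])        # int phi_i' phi_j

    # (eps * T phi_i, T phi_j) and (eps * T phi_i, f) on one element.
    elem_mat = eps * (K + kappa * (C + C.T) + kappa**2 * M)
    elem_rhs = eps * f * (np.array([-1.0, 1.0]) + kappa * h / 2.0)

    A = np.zeros((n_elem + 1, n_elem + 1))
    rhs = np.zeros(n_elem + 1)
    for e in range(n_elem):
        idx = [e, e + 1]
        A[np.ix_(idx, idx)] += elem_mat
        rhs[idx] += elem_rhs

    # Weak inflow condition: 2 * b(u, v) and 2 * b(g, v) with |s.n| = 1 at x = 0.
    A[0, 0] += 2.0
    rhs[0] += 2.0 * g

    x = np.linspace(0.0, 1.0, n_elem + 1)
    return x, np.linalg.solve(A, rhs)


x, u_h = solve_slab(64)
u_exact = 1.0 - np.exp(-x)   # solves u' + u = 1, u(0) = 0
max_err = np.max(np.abs(u_h - u_exact))
print("maximal nodal error:", max_err)
```

In this reduced setting the form is symmetric, so the discrete solution is the best approximation of the exact intensity in the energy norm induced by the bilinear form; the inflow value is only imposed weakly and is therefore matched approximately rather than exactly.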

#### Properties of the variational formulation

At this point, we introduce the anisotropic or mixed Sobolev spaces ${H}^{s,t}\left(\Omega \right)$ as

${H}^{s,t}\left(\Omega \right)={H}^{s}\left(D\right)\otimes {H}^{t}\left(\mathcal{S}\right)$
(22)

with the corresponding mixed Sobolev norms ${∥·∥}_{{H}^{s,t}\left(\Omega \right)}$, given by

$\parallel v{\parallel }_{{H}^{s,t}\left(\Omega \right)}^{2}:=\sum _{0\le |\mathbit{\alpha }|\le s}\sum _{0\le |\mathbit{\beta }|\le t}{∥{\mathrm{D}}_{\mathbit{s}}^{\mathbit{\beta }}{\mathrm{D}}_{\mathbit{x}}^{\mathbit{\alpha }}v∥}_{{L}^{2}\left(\Omega \right)}^{2}.$
(23)

Here, ${\mathrm{D}}_{\mathbit{s}}^{\mathbit{\beta }}{\mathrm{D}}_{\mathbit{x}}^{\mathbit{\alpha }}v$ denotes the weak derivative of $v:D×\mathcal{S}\to \mathbb{R}$ of order $|\mathbit{\alpha }|$ w. r. t. $\mathbit{x}\in D$ and order $|\mathbit{\beta }|$ w. r. t. $\mathbit{s}\in \mathcal{S}$, with the multi-indices $\mathbit{\alpha }\in {ℕ}_{0}^{d}$ and $\mathbit{\beta }\in {ℕ}_{0}^{{d}_{\mathcal{S}}+1}$.

The following lemma collects auxiliary results which will become helpful later.

##### Lemma 2 (Auxiliary results).

1. Let $v\in \mathcal{V}$. Then ${\left(v,\mathbit{s}\cdot {\nabla }_{x}v\right)}_{{L}^{2}\left(\Omega \right)}\ge \frac{1}{2}{\int }_{\mathcal{S}}{\int }_{{\Gamma }_{-}\left(\mathbit{s}\right)}{v}^{2}\,\mathbit{s}\cdot \mathbit{n}\left(\mathbit{x}\right)\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s}$. If furthermore $v\in {\mathcal{V}}_{0}$, then ${\left(v,\mathbit{s}\cdot {\nabla }_{x}v\right)}_{{L}^{2}\left(\Omega \right)}\ge 0$.

2. For $v\in {H}^{1,0}\left(\Omega \right)$, it holds that $∥\mathbit{s}\cdot {\nabla }_{x}v∥\le \sqrt{d}\,{∥v∥}_{{H}^{1,0}\left(\Omega \right)}$.

##### Proof

1. A proof is given by (Manteuffel et al. 2000, Thm. 2.1). It uses the divergence theorem and exploits the fact that $v{|}_{\partial {\Omega }_{-}}=0$ for $\mathbit{s}\cdot \mathbit{n}(\mathbit{x})<0$ if $v\in {\mathcal{V}}_{0}$, where $\mathbit{n}(\mathbit{x})$ is the outward unit normal on the boundary $\partial D$:

$\begin{array}{ll}{\left(v,\mathbit{s}\cdot {\nabla }_{x}v\right)}_{{L}^{2}\left(\Omega \right)}& =\frac{1}{2}{\int }_{\mathcal{S}}{\int }_{D}{\nabla }_{x}\cdot \left(\mathbit{s}{v}^{2}\right)\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s}\\ & =\frac{1}{2}{\int }_{\mathcal{S}}{\int }_{\partial D}{v}^{2}\,\mathbit{s}\cdot \mathbit{n}\left(\mathbit{x}\right)\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s}\\ & =\frac{1}{2}{\int }_{\mathcal{S}}{\int }_{{\Gamma }_{-}\left(\mathbit{s}\right)}{v}^{2}\,\mathbit{s}\cdot \mathbit{n}\left(\mathbit{x}\right)\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s}\\ & \quad +\frac{1}{2}{\int }_{\mathcal{S}}{\int }_{{\Gamma }_{+}\left(\mathbit{s}\right)}{v}^{2}\,\mathbit{s}\cdot \mathbit{n}\left(\mathbit{x}\right)\,\mathrm{d}\mathbit{x}\,\mathrm{d}\mathbit{s}.\end{array}$

As s·n≥0 in the second integral, we obtain the first assertion. If additionally $v\in {\mathcal{V}}_{0}$, then the first integral vanishes, and the second assertion follows.

2. We again quote Manteuffel et al. (2000, Lemma 4.1 (i)):

$\begin{array}{ll}{∥\mathbit{s}\cdot {\nabla }_{x}v∥}_{{L}^{2}\left(\Omega \right)}^{2}& \le {\int }_{D}{\int }_{\mathcal{S}}{\left(\sum _{i=1}^{d}{s}_{i}{D}_{{x}_{i}}v\right)}^{2}\,\mathrm{d}\mathbit{s}\,\mathrm{d}\mathbit{x}\\ & \le d{\int }_{D}{\int }_{\mathcal{S}}\sum _{i=1}^{d}{\left({s}_{i}{D}_{{x}_{i}}v\right)}^{2}\,\mathrm{d}\mathbit{s}\,\mathrm{d}\mathbit{x}\\ & \le d\sum _{i=1}^{d}{∥{D}_{{x}_{i}}v∥}^{2}\le d\,{∥v∥}_{{H}^{1,0}\left(\Omega \right)}^{2}.\end{array}$
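The two pointwise inequalities behind this chain of estimates can be checked on random data. In the following sketch, the array `grad` is merely a stand-in for the values of the spatial gradient of v at sample points, and the sample size and seed are arbitrary; the check covers the Cauchy-Schwarz step and the step that uses that each component of a unit direction is bounded by one in modulus.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
n_samples = 1000

# Random unit directions s and random stand-in gradient values.
s = rng.normal(size=(n_samples, d))
s /= np.linalg.norm(s, axis=1, keepdims=True)
grad = rng.normal(size=(n_samples, d))

lhs = np.sum(s * grad, axis=1) ** 2          # (sum_i s_i D_i v)^2
mid = d * np.sum((s * grad) ** 2, axis=1)    # d * sum_i (s_i D_i v)^2
rhs = d * np.sum(grad**2, axis=1)            # d * sum_i (D_i v)^2

tol = 1e-12
print("first inequality holds:", bool(np.all(lhs <= mid + tol)))
print("second inequality holds:", bool(np.all(mid <= rhs + tol)))
```

Integrating these pointwise bounds over the phase space gives the stated estimate.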

In order to establish well-posedness of the variational formulation (21), we prove continuity and coercivity of the bilinear form (19) and continuity of the linear form (20) in the following.

##### Lemma 3

(Continuity of bilinear form). Let $\sigma ,\kappa ,\epsilon \in {L}^{\infty }\left(D\right)$ with ${∥\sigma ∥}_{{L}^{\infty }\left(D\right)}=:{\sigma }_{max}$, ${∥\kappa ∥}_{{L}^{\infty }\left(D\right)}=:{\kappa }_{max}$, ${∥\epsilon ∥}_{{L}^{\infty }\left(D\right)}=:{\epsilon }_{max}$. Then there is a constant $0<{c}_{c}<\infty $ such that for all $u,v\in {\mathcal{V}}_{1}$

$|a\left(u,v\right)|\le {c}_{c}{∥u∥}_{1}\parallel v{\parallel }_{1}.$
##### Proof

We proceed analogously to (Manteuffel et al. 2000, Thm. 3.3). To begin with, we estimate for all $u,v\in \mathcal{V}$

$\begin{array}{ll}∥\mathrm{R}v∥& =∥\epsilon \kappa v+\epsilon \mathbit{s}\cdot {\nabla }_{x}v∥\le {\epsilon }_{max}{\kappa }_{max}∥v∥+{\epsilon }_{max}∥\mathbit{s}\cdot {\nabla }_{x}v∥\\ & \le \max \left\{{\epsilon }_{max}{\kappa }_{max},1\right\}{∥v∥}_{1},\end{array}$
(24)
$\begin{array}{ll}\phantom{\rule{-15.0pt}{0ex}}∥\mathrm{A}u∥& \le {\kappa }_{max}∥u∥+∥\mathbit{s}·{\nabla }_{x}u∥+{\sigma }_{max}∥{\mathrm{Q}}_{1}u∥\phantom{\rule{2em}{0ex}}\\ \le max\left\{{\kappa }_{max},1,{\sigma }_{max}\right\}|\phantom{\rule{0.3em}{0ex}}\parallel u|\phantom{\rule{0.3em}{0ex}}\parallel .\phantom{\rule{2em}{0ex}}\end{array}$
(25)

Using the Cauchy-Schwarz inequality as well as estimates (24) and (25) it holds

$\begin{array}{ll}\phantom{\rule{-13.0pt}{0ex}}|a\left(u,v\right)|& \le |∥\mathrm{R}v∥∥\mathrm{A}u∥+2\parallel v{\parallel }_{-}{∥u∥}_{-}|\phantom{\rule{2em}{0ex}}\\ \le 2{\left({∥\mathrm{R}v∥}^{2}+{∥v∥}_{-}^{2}\right)}^{1/2}{\left({∥\mathrm{A}u∥}^{2}+{∥u∥}_{-}^{2}\right)}^{1/2}\phantom{\rule{2em}{0ex}}\\ \le 2max\left\{1,{\epsilon }_{max}{\kappa }_{max}\right\}max\left\{{\kappa }_{max},1,{\sigma }_{max}\right\}\phantom{\rule{0.3em}{0ex}}{∥u∥}_{1}{∥v∥}_{1}.\phantom{\rule{2em}{0ex}}\end{array}$
##### Lemma 4

(Continuity of linear form). Given the assumptions of Lemma 3 on κ, σ, ε, and additionally $f\in {L}^{2}\left(\Omega \right)$ and $g:\partial {\Omega }_{-}\to \mathbb{R}$ with ${∥g∥}_{-}<\infty $, there is a constant $0<{c}_{l}<\infty $ such that for all $v\in {\mathcal{V}}_{1}$ it holds

$|l\left(v\right)|\le {c}_{l}\parallel v{\parallel }_{1}\phantom{\rule{2.83795pt}{0ex}}.$
##### Proof

The proof is analogous to that of Lemma 3:

$\begin{array}{ll}|l\left(v\right)|& \le |∥\mathrm{R}v∥∥\phantom{\rule{0.3em}{0ex}}f∥+2\parallel v{\parallel }_{-}{∥g∥}_{-}|\phantom{\rule{2em}{0ex}}\\ \le 2{\left({∥\mathrm{R}v∥}^{2}+{∥v∥}_{-}^{2}\right)}^{1/2}{\left({∥\phantom{\rule{0.3em}{0ex}}f∥}^{2}+{∥g∥}_{-}^{2}\right)}^{1/2}\phantom{\rule{2em}{0ex}}\\ \le 2max\left\{1,{\epsilon }_{max}{\kappa }_{max}\right\}\left(∥\phantom{\rule{0.3em}{0ex}}f∥+{∥g∥}_{-}\right){∥v∥}_{1}.\phantom{\rule{2em}{0ex}}\end{array}$

Next, we show coercivity of the bilinear form. For ease of exposition we shall assume ε and κ to be constant on the physical domain. Coercivity can also be obtained for non-constant coefficients (see Widmer 2009, Thm. 2.2, for an example). Coercivity of the SUPG variational formulation for the RTP has also been proved by (Ávila et al. 2011, Lemma 2), although in a different norm. Previously, we had proved coercivity of the T-stabilized variational formulation without the boundary form b(·,·) (Grella 2013, Lemma 4.1). Here we include this boundary form in the formulation, which will motivate the choice of the stabilization parameter ε.

##### Lemma 5

(Coercivity of bilinear form). Let κ, ε be positive functions which are constant on the physical domain $D$. Assume ${\min }_{\mathbit{x}\in D}\sigma =:{\sigma }_{min}>0$ and ${\sigma }_{max}={∥\sigma ∥}_{{L}^{\infty }\left(D\right)}$, and additionally that

${\sigma }_{max}^{2}<4\kappa {\sigma }_{min}^{2},\phantom{\rule{1em}{0ex}}\epsilon <\frac{2}{\kappa }.$
(26)

Then the bilinear form a(·,·) from (19) is coercive on ${\mathcal{V}}_{1}×{\mathcal{V}}_{1}$: there is a constant ${c}_{e}>0$ such that for all $v\in {\mathcal{V}}_{1}$ it holds

$a\left(v,v\right)\ge {c}_{e}{∥v∥}_{1}^{2}.$
##### Proof

For an overview of the involved terms we split the bilinear form into separate inner products:

$\begin{array}{ll}a\left(v,v\right)& =\left(\epsilon \kappa v+\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\mathrm{A}v\right)+2b\left(v,v\right)\\ & =\left(\epsilon \kappa v+\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\mathbit{s}\cdot {\nabla }_{x}v+\kappa v+\mathrm{Q}v\right)+2b\left(v,v\right)\\ & =\left(\epsilon \kappa v,\mathbit{s}\cdot {\nabla }_{x}v\right)+\left(\epsilon \kappa v,\kappa v\right)+\left(\epsilon \kappa v,\mathrm{Q}v\right)\\ & \quad +\left(\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\mathbit{s}\cdot {\nabla }_{x}v\right)+\left(\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\kappa v\right)\\ & \quad +\left(\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\mathrm{Q}v\right)+2b\left(v,v\right).\end{array}$
(27)

As we assumed ε and κ to be constant, we can factor these coefficients out of the inner products. We begin by analyzing the sum of first and fifth inner product.

Applying statement 1 of Lemma 2 yields

$2\mathrm{\epsilon \kappa }\left(v,\mathbit{s}·{\nabla }_{x}v\right)\ge -\mathrm{\epsilon \kappa }\parallel v{\parallel }_{-}^{2}.$

Together with the boundary term, we obtain

$\left(\epsilon \kappa v,\mathbit{s}\cdot {\nabla }_{x}v\right)+\left(\epsilon \mathbit{s}\cdot {\nabla }_{x}v,\kappa v\right)+2b\left(v,v\right)\ge \left(2-\epsilon \kappa \right){∥v∥}_{-}^{2}.$

The second inner product is bounded from below by

$\left(\epsilon \kappa v,\kappa v\right)\ge \epsilon {\kappa }^{2}{∥v∥}^{2}.$

To estimate the third inner product, we use property (11) of the scattering operator:

$\mathrm{\epsilon \kappa }\left(v,\mathrm{Q}v\right)\ge \mathrm{\epsilon \kappa }{∥\mathrm{Q}v∥}^{2}\ge \mathrm{\epsilon \kappa }{\sigma }_{min}^{2}{∥{\mathrm{Q}}_{1}v∥}^{2}.$

The fourth inner product in Eq. (27) is

$\begin{array}{l}\left(\epsilon \mathbit{s}·{\nabla }_{x}v,\mathbit{s}·{\nabla }_{x}v\right)=\epsilon {∥\mathbit{s}·{\nabla }_{x}v∥}^{2}.\end{array}$

For the sixth inner product we apply Cauchy-Schwarz inequality and Young’s inequality with a parameter θ>0:

$\begin{array}{ll}\left(\epsilon \mathbit{s}·{\nabla }_{x}v,\mathrm{Q}v\right)& \ge -\epsilon {\sigma }_{max}∥\mathbit{s}·{\nabla }_{x}v∥∥{\mathrm{Q}}_{1}v∥\phantom{\rule{2em}{0ex}}\\ \ge -\epsilon {\sigma }_{max}\left(\frac{\theta }{2}{∥\mathbit{s}·{\nabla }_{x}v∥}^{2}+\frac{1}{2\theta }{∥{\mathrm{Q}}_{1}v∥}^{2}\right)\phantom{\rule{2em}{0ex}}\end{array}$

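Combining all estimates yields the result:

$\begin{array}{ll}a\left(v,v\right)& \ge \epsilon {\kappa }^{2}{∥v∥}^{2}+\epsilon \left(1-\frac{{\sigma }_{max}\theta }{2}\right){∥\mathbit{s}\cdot {\nabla }_{x}v∥}^{2}+\epsilon \left(\kappa {\sigma }_{min}^{2}-\frac{{\sigma }_{max}}{2\theta }\right){∥{\mathrm{Q}}_{1}v∥}^{2}+\left(2-\epsilon \kappa \right){∥v∥}_{-}^{2}\\ & \ge \min \left\{\epsilon {\kappa }^{2},\ \epsilon \left(1-\frac{{\sigma }_{max}\theta }{2}\right),\ \epsilon \left(\kappa {\sigma }_{min}^{2}-\frac{{\sigma }_{max}}{2\theta }\right),\ 2-\epsilon \kappa \right\}{∥v∥}_{1}^{2}.\end{array}$

The second and third terms in the minimum are positive if θ satisfies ${\sigma }_{max}/\left(2\kappa {\sigma }_{min}^{2}\right)<\theta <2/{\sigma }_{max}$. By eliminating θ we obtain the condition ${\sigma }_{max}^{2}<4\kappa {\sigma }_{min}^{2}$. The condition ε<2/κ results from the last of the terms over which the minimum is taken.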

The previous condition on the stabilization parameter leads to a choice of ε=1/κ. Well-posedness of the variational formulation now follows directly.
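The interplay between condition (26) and the Young parameter θ can be explored numerically. The sketch below collects the lower bounds from the proof of Lemma 5 term by term, checks condition (26), and evaluates the resulting lower bound for the coercivity constant at the midpoint of the admissible interval for θ. The function name and the midpoint choice are assumptions made for illustration; the midpoint is not an optimized value.

```python
def coercivity_constant(kappa, sigma_min, sigma_max, eps):
    """Lower bound for c_e from the estimates in the proof of Lemma 5.

    Returns None if condition (26) is violated. kappa, eps are positive
    constants and 0 < sigma_min <= sigma_max.
    """
    if not (sigma_max**2 < 4.0 * kappa * sigma_min**2 and 0.0 < eps < 2.0 / kappa):
        return None

    # Admissible Young parameters: theta_lo < theta < theta_hi.
    theta_lo = sigma_max / (2.0 * kappa * sigma_min**2)
    theta_hi = 2.0 / sigma_max
    theta = 0.5 * (theta_lo + theta_hi)   # arbitrary interior point

    terms = [
        eps * kappa**2,                                        # L2 term
        eps * (1.0 - sigma_max * theta / 2.0),                 # transport term
        eps * (kappa * sigma_min**2 - sigma_max / (2.0 * theta)),  # scattering term
        2.0 - eps * kappa,                                     # inflow boundary term
    ]
    return min(terms)


# Example with eps = 1 / kappa.
print(coercivity_constant(kappa=1.0, sigma_min=1.0, sigma_max=1.0, eps=1.0))
# Condition (26) fails when the scattering coefficient varies too strongly.
print(coercivity_constant(kappa=1.0, sigma_min=0.1, sigma_max=1.0, eps=1.0))
```

With ε = 1/κ the boundary term contributes the constant 1 regardless of κ, which is one way to read the suggested choice of the stabilization parameter.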

##### Theorem 6

(Existence and uniqueness of solution to variational formulation). Provided that $f\in {L}^{2}\left(\Omega \right)$ and ${∥g∥}_{-}<\infty $, there exists a unique solution $u\in {\mathcal{V}}_{1}$ to the variational formulation (21).

##### Proof

Since $\left({\mathcal{V}}_{1},{∥·∥}_{1}\right)$ is a Hilbert space and Lemmata 3–5 guarantee continuity of the T-stabilized bilinear form and of the linear form as well as coercivity of the bilinear form, the Lax-Milgram theorem (Brenner and Scott 2008, Thm. 2.7.7) ensures existence and uniqueness of the solution to (21).

### Discretization

To discretize the variational problem (21), we restrict the space ${\mathcal{V}}_{1}$ to tensor products of hierarchical, finite-dimensional approximation spaces over the component domains D and $\mathcal{S}$.

#### Full tensor discretization

In the standard full tensor approximation, we choose a full tensor product space ${V}^{L,N}$ to approximate ${\mathcal{V}}_{1}$:

${\mathcal{V}}_{1}\approx {V}^{L,N}:={V}_{D}^{L}\otimes {V}_{\mathcal{S}}^{N}.$
(28)

As ${H}^{1,0}\left(\Omega \right)\cong {H}^{1}\left(D\right)\otimes {L}^{2}\left(\mathcal{S}\right)$ is a dense subspace of ${\mathcal{V}}_{1}$, we define the family of physical approximation spaces as

${V}_{D}^{{l}_{D}}:={S}^{0,1}\left(D,{\mathcal{T}}_{D}^{{l}_{D}}\right)\subset {H}^{1}\left(D\right),\phantom{\rule{1em}{0ex}}{l}_{D}=1,\dots ,L,$
(29)

the spaces of continuous, piecewise linear functions on a dyadically refined mesh ${\mathcal{T}}_{D}^{{l}_{D}}$ over D. Here, the parameter ${l}_{D}$ stands for the physical resolution. It is related to the mesh width h in ${\mathcal{T}}_{D}^{{l}_{D}}$ by $h=O\left({2}^{-{l}_{D}}\right)$. With respect to the resolution ${l}_{D}=0,\dots ,L$, the spaces ${V}_{D}^{{l}_{D}}$ form a nested sequence

${V}_{D}^{0}\subset {V}_{D}^{1}\subset \dots \subset {V}_{D}^{L}\subset {H}^{1}\left(D\right).$

Let ${M}_{D}:=\dim {V}_{D}^{L}$ denote the number of degrees of freedom for the FE space ${V}_{D}^{L}$ in the physical domain D. Then

${M}_{D}=O\left({2}^{dL}\right)$
(30)

where d is the dimension of the physical domain. The exact number depends on the geometry of the domain. For the unit square or cube $D={\left[0,1\right]}^{d}$, we obtain

${M}_{D}={\left({2}^{L}+1\right)}^{d}.$
(31)

In the angular domain, we distinguish between the ${P}_{N}$-method and the ${S}_{N}$-method.

##### ${S}_{N}$-method

Here, the family of approximation spaces is given by

${V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}:={S}^{-1,0}\left(\mathcal{S},{\mathcal{T}}_{\mathcal{S}}^{{l}_{\mathcal{S}}}\right)\subset {L}^{2}\left(\mathcal{S}\right),\phantom{\rule{1em}{0ex}}{l}_{\mathcal{S}}=1,\dots ,N,$
(32)

the spaces of piecewise constant functions on a dyadically refined mesh ${\mathcal{T}}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$. Like the physical spaces, these spaces are nested. The angular resolution N and the dimension of ${V}_{\mathcal{S}}^{N}$ are related by

${M}_{\mathcal{S}}:=\dim {V}_{\mathcal{S}}^{N}=O\left({2}^{{d}_{\mathcal{S}}N}\right).$
(33)

##### ${P}_{N}$-method

To define the angular approximation spaces of the ${P}_{N}$-method, we first introduce the spaces of spectral functions of the ${d}_{\mathcal{S}}$-sphere,

(34)

where ${Y}_{n,m}^{\left({d}_{\mathcal{S}}\right)}$ are the spherical harmonics of the ${d}_{\mathcal{S}}$-sphere, and ${m}_{n,{d}_{\mathcal{S}}}$ is the largest value of the secondary index m depending on the value of the primary index n and the dimension. These spaces are inherently nested. To obtain the same relation (33) between resolution level and degrees of freedom as in the ${S}_{N}$-method, we connect the resolution level N and $Ñ$ by $Ñ={2}^{N}-1$.

Then, the angular approximation spaces are

${V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}:={ℙ}_{{2}^{{l}_{\mathcal{S}}}-1}^{{d}_{\mathcal{S}}},\phantom{\rule{1em}{0ex}}{l}_{\mathcal{S}}=1,\dots ,N,$
(35)

and relation (33) also holds here. Up to the index relabeling and the additional boundary form, we obtain the spherical harmonics method already analyzed by Grella and Schwab (2011a).

In both methods, the full tensor approximation space consequently has the dimension

${M}_{L,N}:=\dim {V}^{L,N}={M}_{D}·{M}_{\mathcal{S}}=O\left({2}^{dL+{d}_{\mathcal{S}}N}\right).$
(36)
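To give a feeling for the growth in (36), the following minimal Python sketch evaluates the full tensor dimension on the unit cube. It uses (31) for the physical count; the angular count assumes a single angular cell at level zero (the hypothetical parameter `cells_at_level_zero`), since the text only states the order $O\left({2}^{{d}_{\mathcal{S}}N}\right)$. The function names are illustrative and not taken from the original work.

```python
def physical_dofs(L: int, d: int) -> int:
    """Number of mesh nodes on [0,1]^d with 2^L cells per axis, Eq. (31)."""
    return (2**L + 1) ** d


def angular_dofs(N: int, d_s: int, cells_at_level_zero: int = 1) -> int:
    """Piecewise constant dofs after N dyadic refinements of the angular mesh.

    cells_at_level_zero is an assumption; the text only gives O(2^(d_s * N)).
    """
    return cells_at_level_zero * 2 ** (d_s * N)


def full_tensor_dofs(L: int, N: int, d: int) -> int:
    """Dimension of the full tensor space, Eq. (36), with d_s = d - 1."""
    return physical_dofs(L, d) * angular_dofs(N, d - 1)


if __name__ == "__main__":
    for level in range(1, 7):
        print(level, full_tensor_dofs(level, level, d=3))
```

Each increment of both levels multiplies the count by roughly ${2}^{d+{d}_{\mathcal{S}}}$, which is the exponential growth that the sparse construction is designed to avoid.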

The full tensor approximate solution can be expressed by means of a physical basis ${\left\{{\alpha }_{i}\left(\mathbit{x}\right)\right\}}_{i=1}^{{M}_{D}}$ of ${V}_{D}^{L}$ and an angular basis ${\left\{{\beta }_{j}\right\}}_{j=1}^{{M}_{\mathcal{S}}}$ of ${V}_{\mathcal{S}}^{N}$ as

${u}_{L,N}\left(\mathbit{x},\mathbit{s}\right):=\sum _{i=1}^{{M}_{D}}\sum _{j=1}^{{M}_{\mathcal{S}}}{u}_{ij}{\alpha }_{i}\left(\mathbit{x}\right){\beta }_{j}\left(\mathbit{s}\right)$
(37)

with solution coefficients ${u}_{ij}\in \mathbb{R}$. The discrete variational formulation finally reads: Find ${u}_{L,N}\in {V}^{L,N}$ such that

$a\left({u}_{L,N},{v}_{L,N}\right)=l\left({v}_{L,N}\right)\phantom{\rule{1em}{0ex}}\forall {v}_{L,N}\in {V}^{L,N},$
(38)

with the bilinear form a(·,·) from (19) and the linear form l(·) from (20). As ${V}^{L,N}$ is a subspace of ${\mathcal{V}}_{1}$, the well-posedness ensured by Thm. 6 for the continuous problem carries over to this discrete problem.

By choosing a subset of ${H}^{1}\left(D\right)\otimes {L}^{2}\left(\mathcal{S}\right)$ as trial space we effectively assume a slightly higher regularity of the solution than what is guaranteed by the definition (12) of ${\mathcal{V}}_{1}$. For instance, solutions with line discontinuities due to the transport of discontinuous boundary data into the domain are not included in ${V}^{L,N}$. However, since the spaces ${V}^{L,N}$ become dense in ${\mathcal{V}}_{1}$ as $L,N\to \infty$, even discontinuous solutions will be approximated with increasing resolution. Furthermore, in order to leverage the advantages of a sparse tensor approximation, a higher regularity of the solution will be required in any case.

#### Equivalence of collocation DOM and phase space Galerkin DOM with quadrature

Ordinarily the discrete ordinates method is presented as a collocation method in angle: Fixed directions ${\mathbit{s}}_{j}\in \mathcal{S}$, $j=1,\dots ,{M}_{\mathcal{S}}$, are inserted into the RTE (1a), and for each direction, the intensity ${u}_{j}\left(\mathbit{x}\right):=u\left(\mathbit{x},{\mathbit{s}}_{j}\right)\in {V}_{D}^{L}$ is sought as the solution to a purely spatial PDE. In these PDEs, the scattering integral is replaced by a quadrature rule

${\int }_{\mathcal{S}}\Phi \left({\mathbit{s}}_{j},{\mathbit{s}}^{\prime }\right)u\left(\mathbit{x},{\mathbit{s}}^{\prime }\right)d\phantom{\rule{0.3em}{0ex}}{\mathbit{s}}^{\prime }\approx \sum _{m=1}^{{M}_{\mathcal{S}}}{w}_{m}\Phi \left({\mathbit{s}}_{j},{\mathbit{s}}_{m}\right){u}_{m}$
(39)

with weights ${w}_{m}>0$. By applying a Galerkin ansatz with stabilization in the physical domain to the PDEs, a system of coupled variational formulations

$\begin{array}{l}{\left({\mathrm{R}}_{j}v,\phantom{\rule{1em}{0ex}}{\mathrm{T}}_{j}{u}_{j}+\sigma {u}_{j}-\sum _{m=1}^{{M}_{\mathcal{S}}}{w}_{m}\Phi \left({\mathbit{s}}_{j},{\mathbit{s}}_{m}\right){u}_{m}\right)}_{{L}^{2}\left(D\right)}\\ \phantom{\rule{3em}{0ex}}+2{\left(v,|{\mathbit{s}}_{j}·\mathbit{n}|{u}_{j}\right)}_{{L}^{2}\left({\Gamma }_{-}\left({\mathbit{s}}_{j}\right)\right)}\\ \phantom{\rule{2em}{0ex}}={\left({\mathrm{R}}_{j}v,f\right)}_{{L}^{2}\left(D\right)}+2{\left(v,|{\mathbit{s}}_{j}·\mathbit{n}|{g}_{j}\right)}_{{L}^{2}\left({\Gamma }_{-}\left({\mathbit{s}}_{j}\right)\right)}\phantom{\rule{1em}{0ex}}\forall v\in {V}_{D}^{L}\end{array}$
(40)

results with directional stabilization and transport operators

${\mathrm{R}}_{j}:=\mathrm{R}{|}_{\mathbit{s}={\mathbit{s}}_{j}},\phantom{\rule{1em}{0ex}}{\mathrm{T}}_{j}:=\mathrm{T}{|}_{\mathbit{s}={\mathbit{s}}_{j}},\phantom{\rule{1em}{0ex}}j=1,\dots ,{M}_{\mathcal{S}}.$
(41)

In the phase space Galerkin approach, variational formulation (38) is discretized further by substituting the angular quadrature rule (39) for all angular integrals so that the bilinear form (19) is approximated by

If the linear functional l(·) from (20) is correspondingly approximated by a functional $\tilde{l}\left(·\right)$ with angular quadrature, then the directional solutions ${u}_{j}$, $j=1,\dots ,{M}_{\mathcal{S}}$, are determined from the variational formulation with angular quadrature

$ã\left(u,v\right)=\tilde{l}\left(v\right)\phantom{\rule{1em}{0ex}}\forall v\in {V}^{L,N}.$
(42)

Since this formulation has to hold for all $v\in {V}^{L,N}$, it follows that for test functions which vanish at every angular quadrature node ${\mathbit{s}}_{i}$, $i=1,\dots ,{M}_{\mathcal{S}}$, except one node ${\mathbit{s}}_{j}$, formulation (42) reduces to the variational formulation (40) from the collocation discretization. This condition on the test functions is satisfied, e.g., for a basis of the test space consisting of characteristic functions on the angular mesh if each mesh cell contains exactly one angular quadrature node. With such a one-point quadrature rule and characteristic basis functions of ${V}_{\mathcal{S}}^{N}$, the phase space Galerkin DOM is therefore equivalent to the collocation DOM after discretization.
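The quadrature rule (39) can be illustrated for ${d}_{\mathcal{S}}=1$ by a short Python sketch. It assumes a midpoint rule on equal arcs of the unit circle with one node per angular cell and the isotropic phase function $\Phi =1/|\mathcal{S}|$; the function names are hypothetical, and directions are represented by their polar angles.

```python
import math


def midpoint_rule_circle(M):
    """Angles and weights of the midpoint rule on M equal arcs of the unit circle."""
    h = 2.0 * math.pi / M
    angles = [(m + 0.5) * h for m in range(M)]
    weights = [h] * M
    return angles, weights


def isotropic_phase(angle_out, angle_in):
    """Isotropic scattering on the circle: Phi = 1 / |S| with |S| = 2*pi."""
    return 1.0 / (2.0 * math.pi)


def scattering_sum(j, u_vals, angles, weights, phase=isotropic_phase):
    """Right hand side of (39) at one point x: sum_m w_m Phi(s_j, s_m) u_m."""
    return sum(w * phase(angles[j], a) * u
               for w, a, u in zip(weights, angles, u_vals))


if __name__ == "__main__":
    angles, weights = midpoint_rule_circle(8)
    # A direction-independent intensity is reproduced by isotropic scattering.
    print(scattering_sum(0, [1.0] * 8, angles, weights))
```

Because each arc carries exactly one node, the vector of nodal values `u_vals` can equally be read as the coefficients of a piecewise constant angular function, which is the setting in which the equivalence above holds.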

#### Sparse tensor discretization

The full tensor approach presented above shows the typical complexity of full tensor approximations: by (36), the number of degrees of freedom increases exponentially with the dimension and the resolution levels in a dyadically refined scheme.

A way to counter this exponential increase is found in sparse tensorization. Using the same approximation spaces ${V}_{D}^{{l}_{D}}$ and ${V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$ on the component domains as for the full tensor approximation, we define a sparse tensor approximation space ${\hat{V}}^{L,N}$ by

${\mathcal{V}}_{1}\approx {\hat{V}}^{L,N}:=\sum _{0\le f\left({l}_{D},{l}_{\mathcal{S}}\right)\le L}{V}_{D}^{{l}_{D}}\otimes {V}_{\mathcal{S}}^{{l}_{\mathcal{S}}},$
(43)

where the sparsity profile $f:\left\{0,\dots ,L\right\}×\left\{0,\dots ,N\right\}\to \mathbb{R}$ determines which tensor product subspaces ${V}_{D}^{{l}_{D}}\otimes {V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$ are included in the approximation. The sparsity profile usually depends on N as well. Here, we employ a linear profile

$f\left({l}_{D},{l}_{\mathcal{S}}\right)={l}_{D}+L{l}_{\mathcal{S}}/N,$
(44)

which is normally chosen if the component complexities ${M}_{D}$ and ${M}_{\mathcal{S}}$ depend on the resolution parameters L and N in the same way and an identical order of approximation is sought over both component domains (cf. Zenger 1991; Bungartz and Griebel 2004; Griebel and Harbrecht 2013a).
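As a small illustration of the linear profile (44), the following Python sketch lists the level pairs that enter the sum (43). The condition ${l}_{D}+L{l}_{\mathcal{S}}/N\le L$ is tested in the equivalent integer form $N{l}_{D}+L{l}_{\mathcal{S}}\le LN$ to avoid rounding; restricting to integer levels with $L,N\ge 1$ and the function names are assumptions of this sketch.

```python
def sparse_index_set(L: int, N: int):
    """All integer pairs (l_D, l_S) with 0 <= l_D + L*l_S/N <= L, profile (44)."""
    return [(l_d, l_s)
            for l_d in range(L + 1)
            for l_s in range(N + 1)
            if N * l_d + L * l_s <= L * N]  # integer form of f(l_D, l_S) <= L


def max_angular_level(l_d: int, L: int, N: int) -> int:
    """Largest integer l_S admitted by the profile for a given l_D."""
    return (N * (L - l_d)) // L


if __name__ == "__main__":
    L, N = 4, 2
    for l_d in range(L + 1):
        print(l_d, max_angular_level(l_d, L, N))
```

For $L=N$ the admissible pairs are exactly those with ${l}_{D}+{l}_{\mathcal{S}}\le L$, i.e. the lower left triangle of the full index square.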

If direct sum decompositions of the component approximation spaces ${V}_{D}^{{l}_{D}}$ and ${V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$ into detail spaces ${W}_{D}^{{l}_{D}}$ and ${W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$, i.e.

${V}_{D}^{{l}_{D}}={V}_{D}^{{l}_{D}-1}\oplus {W}_{D}^{{l}_{D}},\phantom{\rule{1em}{0ex}}{l}_{D}=1,\dots ,L$

are available (correspondingly in the angular domain), then the sparse tensor approximation space ${\hat{V}}^{L,N}$ can also be written as

${\hat{V}}^{L,N}=\sum _{0\le f\left({l}_{D},{l}_{\mathcal{S}}\right)\le L}{W}_{D}^{{l}_{D}}\otimes {W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}.$
(45)

By choosing hierarchical bases for ${V}_{D}^{{l}_{D}}$ and ${V}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$, each degree of freedom ${u}_{ij}$ can directly be associated with a tensor product detail space ${W}_{D}^{{l}_{D}}\otimes {W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$. The sparse solution is then given by

$\begin{array}{ll}{û}_{L,N}& =\sum _{0\le f\left({l}_{D},{l}_{\mathcal{S}}\right)\le L}{u}_{{l}_{D},{l}_{\mathcal{S}}},\\ {u}_{{l}_{D},{l}_{\mathcal{S}}}& =\sum _{i=1}^{\dim {W}_{D}^{{l}_{D}}}\sum _{j=1}^{\dim {W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}}{u}_{ij}{\alpha }_{i}^{{l}_{D}}\left(\mathbit{x}\right){\beta }_{j}^{{l}_{\mathcal{S}}}\left(\mathbit{s}\right)\in {W}_{D}^{{l}_{D}}\otimes {W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}.\end{array}$

Thus, the sparse discrete variational problem reads: Find ${û}_{L,N}\in {\hat{V}}^{L,N}$ such that

$a\left({û}_{L,N},{\hat{v}}_{L,N}\right)=l\left({\hat{v}}_{L,N}\right)\phantom{\rule{1em}{0ex}}\forall {\hat{v}}_{L,N}\in {\hat{V}}^{L,N}.$
(46)

The dimension of the sparse tensor product space ${\hat{V}}^{L,N}$ depends on the sparsity profile $f\left({l}_{D},{l}_{\mathcal{S}}\right)$. For a linear sparsity profile as in (44), the following complexity estimate is known (e.g. Bungartz and Griebel 2004, Lemma 3.6, or Griebel and Harbrecht 2013a, Thm. 4.1).

##### Lemma 7.

Assume that the dimensions of the detail spaces ${W}_{D}^{{l}_{D}}$ and ${W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$ scale as $\dim {W}_{i}^{{l}_{i}}\le {c}_{i}{2}^{{d}_{i}{l}_{i}}$ with constants ${c}_{i}>0$ and dimensions ${d}_{i}$, $i=D,\mathcal{S}$, where ${d}_{D}=d$, and let a linear sparsity profile $f\left({l}_{D},{l}_{\mathcal{S}}\right)$ as in (44) be given. Then the dimension ${\hat{M}}_{L,N}$ of the sparse tensor product approximation space ${\hat{V}}^{L,N}$ as defined by (45) satisfies

${\hat{M}}_{L,N}\lesssim {L}^{\theta }{2}^{\max \left\{dL,{d}_{\mathcal{S}}N\right\}}\lesssim {\left(\log {M}_{D}\right)}^{\theta }\max \left\{{M}_{D},{M}_{\mathcal{S}}\right\},$
(47)

where θ=1 if $dL={d}_{\mathcal{S}}N$ and θ=0 otherwise. The relation “$\lesssim$” denotes an inequality up to constants with respect to the relevant scaling parameters L, N: $a\lesssim b$ iff $a\le Cb$ with a constant C independent of L and N.
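The statement of Lemma 7 can be explored numerically with the following Python sketch, which sums the dimensions of the detail spaces over the sparse index set and compares the result with the full tensor dimension. The concrete dimensions are assumptions made for illustration only: $\dim {V}_{D}^{l}={\left({2}^{l}+1\right)}^{d}$ as in (31), $\dim {V}_{\mathcal{S}}^{l}={2}^{{d}_{\mathcal{S}}l}$, and detail spaces obtained as differences of consecutive levels.

```python
def dim_v_phys(l: int, d: int) -> int:
    """dim V_D^l on the unit cube, Eq. (31)."""
    return (2**l + 1) ** d


def dim_v_ang(l: int, d_s: int) -> int:
    """dim V_S^l, assuming one angular cell at level zero."""
    return 2 ** (d_s * l)


def detail_dim(dim_v, l: int, dim: int) -> int:
    """dim W^l = dim V^l - dim V^(l-1), with W^0 = V^0."""
    return dim_v(l, dim) - (dim_v(l - 1, dim) if l > 0 else 0)


def sparse_dofs(L: int, N: int, d: int) -> int:
    """Dimension of the sparse space (45) for the linear profile (44)."""
    total = 0
    for l_d in range(L + 1):
        for l_s in range(N + 1):
            if N * l_d + L * l_s <= L * N:
                total += (detail_dim(dim_v_phys, l_d, d)
                          * detail_dim(dim_v_ang, l_s, d - 1))
    return total


def full_dofs(L: int, N: int, d: int) -> int:
    """Dimension of the full tensor space, Eq. (36)."""
    return dim_v_phys(L, d) * dim_v_ang(N, d - 1)


if __name__ == "__main__":
    for level in range(1, 8):
        print(level, sparse_dofs(level, level, 2), full_dofs(level, level, 2))
```

Under these assumptions, $L=N=2$ and $d=2$ give 42 sparse versus 100 full tensor degrees of freedom, and the gap widens rapidly with the level.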

### Error analysis

In this section, we shall show that the convergence rates of the full tensor and sparse tensor Galerkin methods differ only by a logarithmic factor in the degrees of freedom, provided that somewhat stronger regularity requirements are met for the exact solution.

The analysis proceeds in the usual fashion, cf. Bungartz and Griebel (2004). We define the Galerkin projector ${\mathrm{P}}^{L,N}:{\mathcal{V}}_{1}\to {V}^{L,N}$ onto the full tensor product approximation space by

$a\left({\mathrm{P}}^{L,N}u,v\right)=a\left(u,v\right)\phantom{\rule{1em}{0ex}}\forall v\in {V}^{L,N}.$
(48)

Letting $L\to \infty$ ($N\to \infty$), the fact that the subspaces are closed and dense implies that in the respective limits we obtain the semidiscrete Galerkin projectors ${\mathrm{P}}_{\mathcal{S}}^{N}:={lim}_{L\to \infty }{\mathrm{P}}^{L,N}$ $\left({\mathrm{P}}_{D}^{L}:={lim}_{N\to \infty }{\mathrm{P}}^{L,N}\right)$, which discretize only the angular (physical) domain, since the Galerkin projector is stable in the ${∥·∥}_{1}$-norm:

#### Lemma 8

(Stability of the Galerkin projector). Let $v\in {\mathcal{V}}_{1}$. Then there is a constant ${c}_{P}>0$ independent of L and N so that

${∥{\mathrm{P}}^{L,N}v∥}_{1}\le {c}_{P}\parallel v{\parallel }_{1}.$

#### Proof

With continuity (Lemma 3) of the bilinear form we obtain

$\begin{array}{ll}|a\left({\mathrm{P}}^{L,N}v,{v}_{L,N}\right)|& =|a\left(v,{v}_{L,N}\right)|\\ & \le {c}_{c}\parallel v{\parallel }_{1}{∥{v}_{L,N}∥}_{1}\phantom{\rule{1em}{0ex}}\forall {v}_{L,N}\in {V}^{L,N}.\end{array}$

Since this holds for all ${v}_{L,N}\in {V}^{L,N}$, we can set ${v}_{L,N}={\mathrm{P}}^{L,N}v$ and exploit coercivity of the bilinear form (Lemma 5):

$\begin{array}{ll}{c}_{e}{∥{\mathrm{P}}^{L,N}v∥}_{1}^{2}& \le |a\left({\mathrm{P}}^{L,N}v,{\mathrm{P}}^{L,N}v\right)|\\ & =|a\left(v,{\mathrm{P}}^{L,N}v\right)|\le {c}_{c}\parallel v{\parallel }_{1}\phantom{\rule{0.3em}{0ex}}{∥{\mathrm{P}}^{L,N}v∥}_{1}.\end{array}$

If ${\mathrm{P}}^{L,N}v\ne 0$, dividing by ${∥{\mathrm{P}}^{L,N}v∥}_{1}$ yields the result with ${c}_{P}={c}_{c}/{c}_{e}$; for ${\mathrm{P}}^{L,N}v=0$ the estimate holds trivially.

#### Error estimates on the physical domain

To begin with, we require some approximation results in the ${H}^{1}\left(D\right)$-norm on the physical domain. With a Clément-type quasi-interpolation operator ${\mathrm{P}}_{\mathrm{I}}^{L}$ (Scott and Zhang 1990, Thm. 4.1 and Cor. 4.1) we obtain the following two results.

##### Lemma 9

(Approximation of quasi-interpolation). For polyhedral $D\subset {\mathbb{R}}^{d}$ and a shape-regular triangulation ${\mathcal{T}}_{D}^{L}$ on D with mesh width $h={2}^{-L}$, the quasi-interpolation ${\mathrm{P}}_{\mathrm{I}}^{L}v$ of a function $v\in {H}^{s+1}\left(D\right)$, $s\in \left[0,1\right]$, in the space ${V}_{D}^{L}={S}^{0,1}\left(D,{\mathcal{T}}_{D}^{L}\right)$ of piecewise affine functions on ${\mathcal{T}}_{D}^{L}$ satisfies the error estimate

$\parallel v-{\mathrm{P}}_{\mathrm{I}}^{L}v{\parallel }_{{H}^{1}\left(D\right)}\le {c}_{H}{2}^{-sL}\parallel v{\parallel }_{{H}^{s+1}\left(D\right)},$

where ${c}_{H}>0$ is a constant independent of L.

##### Lemma 10

(Stability of quasi-interpolation). Under the assumptions of Lemma 9, quasi-interpolation is ${H}^{1}$-stable, i.e. there exists a constant ${c}_{B}>0$ independent of L such that for all $v\in {H}^{1}\left(D\right)$ it holds

$\parallel {\mathrm{P}}_{\mathrm{I}}^{L}v{\parallel }_{{H}^{1}\left(D\right)}\le {c}_{B}\parallel v{\parallel }_{{H}^{1}\left(D\right)}.$

Next we derive an error estimate for the Galerkin approximation on the physical domain. At this point, the approximation is semidiscrete.

##### Lemma 11

(Error estimate for Galerkin projection on physical domain). Let $u\in {H}^{s+1,0}\left(\Omega \right)$, $s\in \left\{0,1\right\}$, be the exact solution to problem (21) and ${u}_{L}:={\mathrm{P}}_{D}^{L}u\in {V}_{D}^{L}\otimes {L}^{2}\left(\mathcal{S}\right)$ the Galerkin projected solution to

$a\left({u}_{L},{v}_{L}\right)=l\left({v}_{L}\right)\phantom{\rule{1em}{0ex}}\forall {v}_{L}\in {V}_{D}^{L}\otimes {L}^{2}\left(\mathcal{S}\right)$
(49)

with a(·,·) from (19) and l(·) from (20). Then, there is a constant ${c}_{p}>0$ independent of L such that

${∥u-{u}_{L}∥}_{1}\le {c}_{p}\phantom{\rule{0.3em}{0ex}}{2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}.$

##### Proof

The proof is standard and is based on coercivity and Galerkin orthogonality. We proceed analogously to Ávila et al. (2011, Lemma 3 and Theorem 1). After inserting the quasi-interpolated solution ${û}_{L}:=\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes {\text{Id}}_{\mathcal{S}}\right)u$ with ${\mathrm{P}}_{\mathrm{I}}^{L}$ from Lemma 9, the triangle inequality permits us to write

${∥u-{u}_{L}∥}_{1}\le {∥u-{û}_{L}∥}_{1}+{∥{û}_{L}-{u}_{L}∥}_{1}.$
(50)

For the first part, we use the fact that there is a constant ${c}_{n}>0$ such that for all $v\in {H}^{1}\left(D\right)\otimes {L}^{2}\left(\mathcal{S}\right)$

$\parallel v{\parallel }_{1}\le {c}_{n}\parallel v{\parallel }_{{H}^{1,0}\left(\Omega \right)}.$

Thus, we can apply Lemma 9:

${∥u-{û}_{L}∥}_{1}\le {c}_{n}{∥u-{û}_{L}∥}_{{H}^{1,0}\left(\Omega \right)}\le {c}_{n}{c}_{H}{2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}.$

For the second part in (50), we use coercivity of the bilinear form, then in a second step Galerkin orthogonality, and finally continuity of the bilinear form to write

$\begin{array}{ll}{∥{u}_{L}-{û}_{L}∥}_{1}^{2}& \le {c}_{e}^{-1}a\left({u}_{L}-{û}_{L},{u}_{L}-{û}_{L}\right)\\ & ={c}_{e}^{-1}a\left(u-{û}_{L},{u}_{L}-{û}_{L}\right)\\ & \le {c}_{c}{c}_{e}^{-1}{∥u-{û}_{L}∥}_{1}\phantom{\rule{0.3em}{0ex}}{∥{u}_{L}-{û}_{L}∥}_{1}\\ & \le {c}_{c}{c}_{e}^{-1}{c}_{n}{∥u-{û}_{L}∥}_{{H}^{1,0}\left(\Omega \right)}\phantom{\rule{0.3em}{0ex}}{∥{u}_{L}-{û}_{L}∥}_{1},\end{array}$

and therefore with Lemma 9

${∥{u}_{L}-{û}_{L}∥}_{1}\le {c}_{c}{c}_{e}^{-1}{c}_{n}{c}_{H}{2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}.$

By inserting into (50) we arrive at the result

${∥u-{u}_{L}∥}_{1}\le {c}_{n}{c}_{H}\left(1+{c}_{c}{c}_{e}^{-1}\right){2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}.$

#### Error estimates on the angular domain

On the angular domain, the considerations in the following require an approximation result for ${L}^{2}$-projections.

##### Lemma 12

For functions $v\in {H}^{t}\left(\mathcal{S}\right)$, $t\in \left\{0,1\right\}$, the ${L}^{2}$-projection onto the space ${V}_{\mathcal{S}}^{N}$ satisfies the error estimate

${∥v-{\mathrm{P}}_{{L}^{2}\left(\mathcal{S}\right)}^{N}v∥}_{{L}^{2}\left(\mathcal{S}\right)}\le {c}_{l}{2}^{-tN}\parallel v{\parallel }_{{H}^{t}\left(\mathcal{S}\right)},$
(51)

where the constant ${c}_{l}>0$ is independent of N.

This result can be obtained for approximation by spectral functions as in the spherical harmonics method (in which case t≥0 is arbitrary), for instance, as well as for approximation by piecewise constants as in the discrete ordinates method (in which case 0≤t≤1). It allows the derivation of the same approximation rate for the semidiscrete Galerkin projection on the angular domain.

##### Lemma 13

(Error estimate for angular Galerkin projection). Let $u\in {H}^{1,t}\left(\Omega \right)$, $t\in \left\{0,1\right\}$, be the exact solution to problem (21) and ${u}_{N}:={\mathrm{P}}_{\mathcal{S}}^{N}u\in {H}^{1}\left(D\right)\otimes {V}_{\mathcal{S}}^{N}$ the Galerkin projected solution with angular part from the subspace ${V}_{\mathcal{S}}^{N}$ of ${L}^{2}\left(\mathcal{S}\right)$. Then there is a constant ${c}_{a}>0$ independent of N such that

${∥u-{u}_{N}∥}_{1}\le {c}_{a}\phantom{\rule{0.3em}{0ex}}{2}^{-tN}{∥u∥}_{{H}^{1,t}\left(\Omega \right)}.$

##### Proof

The proof proceeds analogously to that of Lemma 11, with the ${L}^{2}$-projected solution and Lemma 12 taking the place of the quasi-interpolated solution and Lemma 9; the details are therefore omitted here.

#### Error estimate for the full tensor phase space Galerkin method

The following theorem gives an error estimate for the full tensor approximation.

##### Theorem 14

(Error estimate full tensor Galerkin method). The full tensor Galerkin approximation ${u}_{L,N}={\mathrm{P}}^{L,N}u$ of a solution $u\in {H}^{s+1,0}\left(\Omega \right)\cap {H}^{1,t}\left(\Omega \right)$, $s\in \left\{0,1\right\}$, $t\in \left\{0,1\right\}$, to the variational problem (21) satisfies the asymptotic error estimate

${∥u-{u}_{L,N}∥}_{1}\lesssim {2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}+{2}^{-tN}{∥u∥}_{{H}^{1,t}\left(\Omega \right)},$
(52)

with relation “$\lesssim$” as in Lemma 7.

##### Proof

By Céa’s Lemma (Brenner and Scott 2008, Thm. 2.8.1) the Galerkin approximation is quasi-optimal in ${V}^{L,N}$. Its error can therefore be bounded (up to constants) by the error of any other approximation to u in ${V}^{L,N}$, for example the quasi-interpolated and ${L}^{2}$-projected approximation $\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes {\mathrm{P}}_{{L}^{2}}^{N}\right)u$:

$\begin{array}{ll}{∥u-{\mathrm{P}}^{L,N}u∥}_{1}& \lesssim {∥u-\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes {\mathrm{P}}_{{L}^{2}}^{N}\right)u∥}_{1}\\ & \le {∥u-\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes \text{Id}\right)u∥}_{1}+{∥\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes \text{Id}\right)u-\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes {\mathrm{P}}_{{L}^{2}}^{N}\right)u∥}_{1}\\ & \lesssim {2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}+{∥\left(\text{Id}-\text{Id}\otimes {\mathrm{P}}_{{L}^{2}}^{N}\right)\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes \text{Id}\right)u∥}_{1}\\ & \lesssim {2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}+{2}^{-tN}{∥\left({\mathrm{P}}_{\mathrm{I}}^{L}\otimes \text{Id}\right)u∥}_{{H}^{1,t}\left(\Omega \right)}\\ & \lesssim {2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}+{2}^{-tN}{∥u∥}_{{H}^{1,t}\left(\Omega \right)}.\end{array}$

Here, we used the approximation properties of the quasi-interpolant from Lemma 9 and of the angular ${L}^{2}$-projection from Lemma 12. The last step is a consequence of the ${H}^{1}$-stability of the quasi-interpolation asserted in Lemma 10.

#### Error estimate for the sparse tensor phase space Galerkin method

After the full tensor approximation properties, we consider the convergence properties of a direct sparse tensor approximation on the sparse tensor product space ${\hat{V}}^{L,N}$ as defined in (45).

In analogy to the full tensor Galerkin projector ${\mathrm{P}}^{L,N}$, we can define a sparse tensor Galerkin projector ${\hat{\mathrm{P}}}^{L,N}$ by the orthogonality relation

$a\left({\hat{\mathrm{P}}}^{L,N}u,v\right)=a\left(u,v\right)\phantom{\rule{1em}{0ex}}\forall v\in {\hat{V}}^{L,N}.$

The error of the sparse tensor solution ${û}_{L,N}={\hat{\mathrm{P}}}^{L,N}u$ is estimated in the following theorem (see also Widmer 2009, Thm. 2.6, and Griebel and Harbrecht 2013a, Thms. 4.3 and 7.1).

##### Theorem 15

(Error estimate of direct sparse tensor solution). Let the linear sparsity profile (44) be given. Assume further that L and N vary such that $-s+tN/L=\zeta =\text{const}$. Then the direct sparse tensor approximation ${û}_{L,N}$ of a function $u\in {H}^{s+1,t}\left(\Omega \right)$, $s,t\in \left\{0,1\right\}$, satisfies the error estimate

${∥u-{û}_{L,N}∥}_{1}\lesssim L\left({2}^{-sL}+{2}^{-tN}\right){∥u∥}_{{H}^{s+1,t}\left(\Omega \right)},$

where relation “$\lesssim$” is defined as in Lemma 7.

##### Proof

We follow the proof of Thm. 2.6 by Widmer (2009). First we introduce so-called difference projectors ${\Delta }_{\mathrm{I}}^{{l}_{D}}:={\mathrm{P}}_{\mathrm{I}}^{{l}_{D}}-{\mathrm{P}}_{\mathrm{I}}^{{l}_{D}-1}$ and ${\Delta }_{{L}^{2}}^{{l}_{\mathcal{S}}}:={\mathrm{P}}_{{L}^{2}}^{{l}_{\mathcal{S}}}-{\mathrm{P}}_{{L}^{2}}^{{l}_{\mathcal{S}}-1}$ as the differences between projections onto two consecutive resolution levels, with the convention ${\mathrm{P}}_{\mathrm{I}}^{-1}=0={\mathrm{P}}_{{L}^{2}}^{-1}$. They project onto the detail spaces ${W}_{D}^{{l}_{D}}$ and ${W}_{\mathcal{S}}^{{l}_{\mathcal{S}}}$, respectively.

With these difference projectors, a sparse quasi-interpolated and ${L}^{2}$-projected approximation ${ū}_{L,N}\in {\hat{V}}^{L,N}$ to u can be expressed as

${ū}_{L,N}=\sum _{{l}_{D}=0}^{L}\sum _{{l}_{\mathcal{S}}=0}^{{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}{\Delta }_{\mathrm{I}}^{{l}_{D}}\otimes {\Delta }_{{L}^{2}}^{{l}_{\mathcal{S}}}u,$

where ${l}_{\mathcal{S}}^{max}\left({l}_{D}\right)$ is the largest feasible angular resolution index which results from solving $f\left({l}_{D},{l}_{\mathcal{S}}\right)\le L$ with respect to ${l}_{\mathcal{S}}$.

Now we exploit quasi-optimality of the Galerkin approximation on the sparse tensor product space to replace the Galerkin approximation error by the error of the quasi-interpolated and ${L}^{2}$-projected approximation. Additionally applying the norm estimate $\parallel v{\parallel }_{1}\lesssim \parallel v{\parallel }_{{H}^{1,0}\left(\Omega \right)}$ yields

${∥u-{û}_{L,N}∥}_{1}\lesssim {∥u-\sum _{{l}_{D}=0}^{L}\sum _{{l}_{\mathcal{S}}=0}^{{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}{\Delta }_{\mathrm{I}}^{{l}_{D}}\otimes {\Delta }_{{L}^{2}}^{{l}_{\mathcal{S}}}u∥}_{{H}^{1,0}\left(\Omega \right)}.$
(53)

The error is split into two terms:

$\begin{array}{ll}{∥u-{ū}_{L,N}∥}_{{H}^{1,0}\left(\Omega \right)}& \le \underbrace{{∥\sum _{{l}_{D}=0}^{L}\sum _{{l}_{\mathcal{S}}={l}_{\mathcal{S}}^{max}\left({l}_{D}\right)+1}^{\infty }{\Delta }_{\mathrm{I}}^{{l}_{D}}\otimes {\Delta }_{{L}^{2}}^{{l}_{\mathcal{S}}}u∥}_{{H}^{1,0}\left(\Omega \right)}}_{=:I}\\ & \phantom{\le }+\underbrace{{∥\sum _{{l}_{D}=L+1}^{\infty }\sum _{{l}_{\mathcal{S}}=0}^{\infty }{\Delta }_{\mathrm{I}}^{{l}_{D}}\otimes {\Delta }_{{L}^{2}}^{{l}_{\mathcal{S}}}u∥}_{{H}^{1,0}\left(\Omega \right)}}_{=:II}.\end{array}$
(54)

The second term on the right hand side can be estimated by Lemma 9:

$II={∥\left(\text{Id}-{\mathrm{P}}_{\mathrm{I}}^{L}\right)\otimes \text{Id}\phantom{\rule{0.3em}{0ex}}u∥}_{{H}^{1,0}\left(\Omega \right)}\le {c}_{H}{2}^{-sL}{∥u∥}_{{H}^{s+1,0}\left(\Omega \right)}.$
(55)

This term will not contribute to the asymptotic terms.

The first term on the right hand side of (54) is split up further:

(56)

Both norms on the right hand side of (56) can be estimated by Lemma 9 and Lemma 12:

$\begin{array}{l}{∥\left(\text{Id}-{\mathrm{P}}_{\mathrm{I}}^{{l}_{D}}\right)\otimes \left(\text{Id}-{\mathrm{P}}_{{L}^{2}}^{{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}\right)u∥}_{{H}^{1,0}\left(\Omega \right)}\\ \phantom{\rule{2em}{0ex}}\le {c}_{H}{2}^{-s{l}_{D}}{∥\text{Id}\otimes \left(\text{Id}-{\mathrm{P}}_{{L}^{2}}^{{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}\right)u∥}_{{H}^{s+1,0}\left(\Omega \right)}\\ \phantom{\rule{2em}{0ex}}\le {c}_{H}{c}_{l}{2}^{-s{l}_{D}-t{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}{∥u∥}_{{H}^{s+1,t}\left(\Omega \right)}.\end{array}$

Inserting back into (56) yields

$I\le 2{c}_{H}{c}_{l}{∥u∥}_{{H}^{s+1,t}\left(\Omega \right)}\sum _{{l}_{D}=0}^{L}{2}^{-s{l}_{D}-t{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}.$
(57)

The task is now to estimate the series. Solving $f\left({l}_{D},{l}_{\mathcal{S}}\right)\le L$ for the linear profile (44) gives ${l}_{\mathcal{S}}^{max}\left({l}_{D}\right)=N\left(L-{l}_{D}\right)/L$, where rounding to an integer level is disregarded. Using the assumption $\zeta =-s+tN/L$, we obtain

$\begin{array}{ll}\sum _{{l}_{D}=0}^{L}{2}^{-s{l}_{D}-tN\left(L-{l}_{D}\right)/L}& ={2}^{-tN}\sum _{{l}_{D}=0}^{L}{2}^{\left(-s+tN/L\right){l}_{D}}\\ & ={2}^{-tN}\sum _{{l}_{D}=0}^{L}{2}^{\zeta {l}_{D}}.\end{array}$
(58)

We estimate the sum on the right hand side of (58) by the number of its summands times its largest summand. Two cases can be distinguished here:

1. If ζ≤0, the largest summand occurs for ${l}_{D}=0$:

   ${2}^{-tN}\sum _{{l}_{D}=0}^{L}{2}^{\zeta {l}_{D}}\le \left(L+1\right){2}^{-tN}.$

2. If ζ>0, the largest summand occurs for ${l}_{D}=L$:

   ${2}^{-tN}\sum _{{l}_{D}=0}^{L}{2}^{\zeta {l}_{D}}\le {2}^{-tN}\left(L+1\right){2}^{-sL+tN}=\left(L+1\right){2}^{-sL}.$

In summary, we may write

$\sum _{{l}_{D}=0}^{L}{2}^{-s{l}_{D}-t{l}_{\mathcal{S}}^{max}\left({l}_{D}\right)}\lesssim L\max \left\{{2}^{-sL},{2}^{-tN}\right\}\le L\left({2}^{-sL}+{2}^{-tN}\right).$

By combining this estimate with relations (53) to (57), we finally arrive at

${∥u-{û}_{L,N}∥}_{1}\lesssim L\left({2}^{-sL}+{2}^{-tN}\right){∥u∥}_{{H}^{s+1,t}\left(\Omega \right)}.$

In conclusion, we find that the convergence rate $O\left({2}^{-sL}+{2}^{-tN}\right)$ of the full tensor approximation (Theorem 14) is maintained up to an additional factor L, which by ${M}_{D}=O\left({2}^{dL}\right)$ is logarithmic in the number of degrees of freedom. This result in conjunction with the greatly reduced complexity of the sparse tensor method (Lemma 7) shows its superior efficiency provided that the function u to be approximated is at least in ${H}^{s+1,t}\left(\Omega \right)$, with $s,t\in \left\{0,1\right\}$.
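The series estimate in the proof can be checked numerically. The Python sketch below evaluates the sum with the real-valued level ${l}_{\mathcal{S}}^{max}\left({l}_{D}\right)=N\left(L-{l}_{D}\right)/L$ and compares it with the number of summands times the larger of the two endpoint summands ${2}^{-sL}$ and ${2}^{-tN}$. The function names are illustrative; the check is a sanity test, not a replacement for the argument above.

```python
def sparse_error_sum(s: float, t: float, L: int, N: int) -> float:
    """Sum over l_D = 0..L of 2^(-s*l_D - t*N*(L - l_D)/L)."""
    return sum(2.0 ** (-s * l_d - t * N * (L - l_d) / L)
               for l_d in range(L + 1))


def largest_summand_bound(s: float, t: float, L: int, N: int) -> float:
    """(L + 1) times the larger endpoint summand, 2^(-t*N) or 2^(-s*L)."""
    return (L + 1) * max(2.0 ** (-s * L), 2.0 ** (-t * N))


if __name__ == "__main__":
    for L in (2, 4, 8):
        print(L, sparse_error_sum(1, 1, L, L), largest_summand_bound(1, 1, L, L))
```

Since the exponent is affine in ${l}_{D}$, every summand lies between the two endpoint values, so the bound holds for all admissible parameters; for $s=t=1$ and $N=L$ it is attained with equality.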

## Numerical experiments

### Algorithms

For the numerical experiments we compute a sparse tensor solution with the help of the combination technique. The sparse solution is constructed according to the formula

${\check{u}}_{L,N}=\sum _{{\ell }_{D}=0}^{L}\left({u}_{{\ell }_{D},{\ell }_{\mathcal{S}}^{max}\left({\ell }_{D}\right)}-{u}_{{\ell }_{D},{\ell }_{\mathcal{S}}^{max}\left({\ell }_{D}+1\right)}\right)$

from a number of solutions ${u}_{{\ell }_{D},{\ell }_{\mathcal{S}}}\in {V}^{{\ell }_{D},{\ell }_{\mathcal{S}}}$ to the full tensor discrete variational formulation (38) of reduced physical resolution ${\ell }_{D}$ and angular resolution ${\ell }_{\mathcal{S}}$.

Clearly ${\check{u}}_{L,N}$ is in the space ${\hat{V}}^{L,N}=\sum _{{\ell }_{D}=0}^{L}{V}^{{\ell }_{D},{\ell }_{\mathcal{S}}^{max}\left({\ell }_{D}\right)}$, which is identical to the sparse tensor approximation space from (43). However, in general the combination approximation differs from a direct sparse approximation ${û}_{L,N}$ (see also Grella 2013, Sec. 2.3.1). Due to the quasi-optimality of the direct sparse solution as an approximation in ${\hat{V}}^{L,N}$, the error of the combination approximation can serve as an upper bound (up to constant factors) for the error ${∥u-{û}_{L,N}∥}_{1}$ of the direct sparse approximation.
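The bookkeeping of the combination formula can be sketched in Python as follows. The callback `full_solution` is a hypothetical stand-in for a full tensor solver on levels $\left({\ell }_{D},{\ell }_{\mathcal{S}}\right)$, a solution with negative angular level is taken as zero, and ${\ell }_{\mathcal{S}}^{max}$ is rounded down to an integer; all three are assumptions of this sketch. The toy solver assembles each full tensor solution from fixed level-wise contributions, an idealization in which the combination reproduces the sparse sum exactly. For actual Galerkin solutions this is generally not the case, as noted above.

```python
def max_angular_level(l_d: int, L: int, N: int) -> int:
    """Largest integer l_S with l_D + L*l_S/N <= L; negative if l_D > L."""
    return (N * (L - l_d)) // L


def combine(full_solution, L: int, N: int):
    """Sum over l_D of u[l_D, m(l_D)] - u[l_D, m(l_D + 1)], m = max_angular_level."""
    total = 0
    for l_d in range(L + 1):
        hi = max_angular_level(l_d, L, N)
        lo = max_angular_level(l_d + 1, L, N)
        total += full_solution(l_d, hi)
        if lo >= 0:  # a negative angular level contributes nothing
            total -= full_solution(l_d, lo)
    return total


def toy_detail(i: int, j: int) -> int:
    """Arbitrary, pairwise distinct contribution of the detail level (i, j)."""
    return 3**i * 5**j


def toy_full_solution(l_d: int, l_s: int) -> int:
    """Idealized full tensor solution: sum of all details up to (l_d, l_s)."""
    return sum(toy_detail(i, j)
               for i in range(l_d + 1) for j in range(l_s + 1))


def toy_sparse_solution(L: int, N: int) -> int:
    """Sum of the details over the sparse index set."""
    return sum(toy_detail(i, j)
               for i in range(L + 1)
               for j in range(max_angular_level(i, L, N) + 1))


if __name__ == "__main__":
    print(combine(toy_full_solution, 4, 4), toy_sparse_solution(4, 4))
```

In an actual computation, the values returned by `full_solution` would be coefficient vectors prolongated to a common basis rather than integers; the loop over ${\ell }_{D}$ then also marks the natural granularity for parallel execution.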

Note that the convergence of the combination technique for the radiative transfer problem has not been shown formally yet. A recent proof for elliptic operators by Griebel and Harbrecht (2013b) would be applicable under certain stability assumptions on the semidiscrete Galerkin projectors (for details we refer to Grella 2013, Sec. 5.3.7). Nevertheless, the use of the combination technique approximation has practical advantages over the direct sparse approximation. First, to construct the subproblem solutions of lower resolution, an existing full tensor solver with standard nonhierarchical FEM bases can be reused, so that no direct sparse solver needs to be implemented. Second, the splitting into subproblems entails a natural level of parallelism in the algorithm, which can still be combined with parallel solution procedures at the level of each subproblem (an implementation is described in Grella 2013, Chap. 7).

Each of the full tensor subproblems is solved by a phase space Galerkin finite element method with nonhierarchical affine hat functions as physical basis and piecewise constants as angular basis. In the experiment of Sec. ‘Experiment 2’, the midpoint rule is used for angular quadrature, which corresponds to the ${S}_{N}$-method. However, in situations where ray effects (Lathrop 1968) pollute the results, adaptive quadrature may help (Stone 2007). As a simple adaptive rule we link the number of quadrature points ${n}_{q}$ per dimension and per mesh element to the resolution levels ${l}_{D}$, ${l}_{\mathcal{S}}$ of the subproblem by ${n}_{q}=\max \left\{{l}_{D}/{l}_{\mathcal{S}},1\right\}$ in the experiment of Sec. ‘Experiment 1’. Even though the overall computational effort is then not bounded by Lemma 7, the total number of degrees of freedom still is. As the iterative, approximate solution of the linear system constitutes the most time-consuming part, the sparse tensor method is, in practice, more efficient than the full tensor method.

### Quantities of interest

In applications, the radiative intensity is often coupled to other modes of energy transport via the net emission (e. g. Larsen et al. 2002, Eq. (1.1a)). The net emission can be computed in turn from the incident radiation

$G\left(\mathbit{x}\right)={\int }_{\mathcal{S}}u\left(\mathbit{x},\mathbit{s}\right)d\phantom{\rule{0.3em}{0ex}}\mathbit{s}.$
(59)

For this reason, we choose the incident radiation as a lower-dimensional variable to visualize results and to analyze errors. The relative ${L}^{2}$- or ${H}^{1}$-error of the incident radiation is given by

$\phantom{\rule{-12.0pt}{0ex}}\mathit{\text{err}}{\left({G}_{L,N}\right)}_{X}=\parallel G-{G}_{L,N}{\parallel }_{X}/\parallel G{\parallel }_{X},\phantom{\rule{1em}{0ex}}X={L}^{2}\left(D\right),{H}^{1}\left(D\right).$
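The two quantities above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: it treats the case ${d}_{\mathcal{S}}=1$ (unit circle), approximates the angular integral with an equal-weight midpoint rule, and replaces the ${L}^{2}\left(D\right)$-norm by a discrete norm over equally weighted sample points. The function names are hypothetical.

```python
import math


def incident_radiation_circle(u, x, n_angles=64):
    """Midpoint-rule approximation of G(x) = int_{S^1} u(x, s) ds."""
    weight = 2.0 * math.pi / n_angles
    total = 0.0
    for j in range(n_angles):
        theta = (j + 0.5) * weight
        s = (math.cos(theta), math.sin(theta))
        total += u(x, s)
    return weight * total


def relative_l2_error(g_exact, g_approx, points):
    """Discrete relative L2 error ||G - G_h|| / ||G|| over sample points."""
    num = sum((g_exact(x) - g_approx(x)) ** 2 for x in points)
    den = sum(g_exact(x) ** 2 for x in points)
    return math.sqrt(num / den)


if __name__ == "__main__":
    # Hypothetical intensity with known incident radiation G(x) = 2*pi*x_1.
    u = lambda x, s: x[0] * (1.0 + s[0])
    print(incident_radiation_circle(u, (0.5, 0.5)))
```

An ${H}^{1}$-type error would additionally require the gradients of $G$ and ${G}_{L,N}$, which this sketch does not cover.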

### Numerical experiments

All experiments are set on the domains $D={\left[0,1\right]}^{d}$, $\mathcal{S}={\mathcal{S}}^{{d}_{\mathcal{S}}}$, with $d={d}_{\mathcal{S}}+1$. We solve the RTP with isotropic scattering $\Phi \left(\mathbit{s},{\mathbit{s}}^{\prime }\right)=1/|\mathcal{S}|$ and zero inflow boundary conditions $g=0$.

#### Experiment 1

We seek the solution for the Gaussian blackbody radiation

${I}_{b}\left(\mathbit{x}\right)=2\exp \left(-{3}^{2}{\left(\mathbit{x}-\mathbit{c}\right)}^{2}\right),\phantom{\rule{1em}{0ex}}\mathbit{c}={\left(0.5,0.5\right)}^{\top },$

with absorption and scattering coefficient κ=σ=1.
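For reference, the source term of this experiment can be evaluated as in the short sketch below. It assumes that ${\left(\mathbit{x}-\mathbit{c}\right)}^{2}$ denotes the squared Euclidean distance; the function and parameter names are hypothetical.

```python
import math

CENTER = (0.5, 0.5)  # c from the text


def blackbody_gaussian(x, center=CENTER, amplitude=2.0, sharpness=3.0):
    """Evaluate I_b(x) = amplitude * exp(-sharpness^2 * |x - c|^2).

    Assumption: (x - c)^2 is read as the squared Euclidean distance.
    """
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return amplitude * math.exp(-(sharpness ** 2) * r2)


# Coefficients of Experiment 1 as stated in the text.
KAPPA = 1.0
SIGMA = 1.0
```

The source attains its maximum value 2 at the center of the unit square and has decayed by a factor of $1/e$ at distance $1/3$ from it.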

The ${H}^{1}$-error of the incident radiation indeed converges faster in the sparse approximation than in the full approximation (Figure 1). Note that the ${L}^{2}$-error of the sparse approximation can be larger than the error of the full approximation because the sparsity profile $f\left({l}_{D},{l}_{\mathcal{S}}\right)$ has been optimized for essentially undeteriorated convergence in the ${\parallel \cdot \parallel }_{1}$-norm of the error in the radiative intensity, which is more closely represented by the ${H}^{1}$-error than the ${L}^{2}$-error of the incident radiation.

#### Experiment 2

A blackbody radiation ${I}_{b}\left(\mathbit{x},\mathbit{s}\right)$ corresponding to the exact solution

$u\left(\mathbit{x},\mathbit{s}\right)=\frac{3}{16\pi }\left(1+{\left(\mathbit{s}·{\mathbit{s}}^{\prime }\right)}^{2}\right)\prod _{i=1}^{3}\left(-4{x}_{i}\left({x}_{i}-1\right)\right),$

with fixed ${\mathbit{s}}^{\prime }={\left(1/\sqrt{3},1/\sqrt{3},1/\sqrt{3}\right)}^{\top }$ is inserted into the right hand side functional in (38) (Grella 2013, Sec. 8.2, Exp. 1). The absorption is set to κ=1, the scattering coefficient to σ=0.5.
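Because ${\int }_{{\mathcal{S}}^{2}}{\left(\mathbit{s}·{\mathbit{s}}^{\prime }\right)}^{2}d\mathbit{s}=4\pi /3$ for any unit vector ${\mathbit{s}}^{\prime }$, the angular factor of this exact solution integrates to one, so its incident radiation is $G\left(\mathbit{x}\right)=\prod _{i=1}^{3}4{x}_{i}\left(1-{x}_{i}\right)$. The following sketch evaluates the exact solution and checks this closed form with a simple product midpoint rule on the sphere. The quadrature and the function names are assumptions chosen for the illustration, not the rule used by the solver.

```python
import math

# Fixed direction s' from the text.
S_PRIME = (1.0 / math.sqrt(3.0),) * 3


def u_exact(x, s):
    """Exact solution of Experiment 2 at x in [0,1]^3 and unit direction s."""
    dot = sum(si * spi for si, spi in zip(s, S_PRIME))
    spatial = math.prod(-4.0 * xi * (xi - 1.0) for xi in x)
    return 3.0 / (16.0 * math.pi) * (1.0 + dot * dot) * spatial


def incident_radiation_sphere(u, x, n_mu=16, n_phi=16):
    """Product midpoint rule on S^2 in (mu, phi), with mu = cos(theta)."""
    d_mu = 2.0 / n_mu
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_mu):
        mu = -1.0 + (i + 0.5) * d_mu
        rho = math.sqrt(1.0 - mu * mu)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            s = (rho * math.cos(phi), rho * math.sin(phi), mu)
            total += u(x, s)
    return total * d_mu * d_phi  # surface measure ds = d(mu) d(phi)


def incident_radiation_exact(x):
    """Closed form G(x): the angular factor integrates to one over S^2."""
    return math.prod(4.0 * xi * (1.0 - xi) for xi in x)
```

A closed-form $G$ of this kind is convenient as a reference when measuring the errors of the incident radiation defined in Sec. ‘Quantities of interest’.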

For this experiment we employed a discrete ordinates solver in which the angular resolution ${N}^{\prime }$ is related to the number of angular degrees of freedom by ${M}_{\mathcal{S}}={\left({N}^{\prime }+1\right)}^{2}$, so that $N$ corresponds to ${\log }_{2}\left({N}^{\prime }+1\right)$, where $N$ is the angular resolution used otherwise in this paper.

Figure 2 shows the superior efficiency of the sparse approach with respect to number of degrees of freedom vs. achieved error. The convergence rates indicate that the curse of dimensionality is mitigated by the sparse discrete ordinates method.

For a comparison to other sparse tensor approaches we refer to the numerical experiment section of (Grella 2013), which features a sparse tensor spherical harmonics approximation and a sparse collocation discrete ordinates method realized via the combination technique. We observed that the approach presented here performs similarly to the sparse collocation DOM combination technique as the methods are similar from the point of view of implementation, even though their theoretical derivation is different. The presented approach is somewhat less susceptible to ray effects at the expense of slightly longer computational times as the angular quadrature is adapted to the resolution of the angular mesh. The spherical harmonics method is most effective for solutions with highly regular angular part because of its regularity requirements for spectral convergence. In general, at the same resolution levels L and N, the combination technique approach realizes approximately the same error as the direct sparse approach, while the number of degrees of freedom in the combination technique is larger than in the direct sparse approach because the approximation spaces of different subproblems in the combination technique overlap in the degrees of freedom. It is therefore slightly less efficient than the direct sparse approach, but considerably more efficient than the full tensor approach and advantageous in practice due to faster and simpler implementation and parallelization.

## Conclusion

We have shown that a direct sparse tensor phase space Galerkin approximation of the radiative intensity in the stationary monochromatic radiative transfer problem can be computed with only $O\left(\log {M}_{D}\left({M}_{D}+{M}_{\mathcal{S}}\right)\right)$ degrees of freedom as opposed to $O\left({M}_{D}{M}_{\mathcal{S}}\right)$ degrees of freedom for a standard full tensor approximation. Here, ${M}_{D}$ is the number of physical degrees of freedom and ${M}_{\mathcal{S}}$ the number of angular degrees of freedom. At the same time, the error of the sparse approximation in the ${\parallel \cdot \parallel }_{1}$-norm still decreases essentially as the error of the full approximation, namely with the order $O\left(\log {M}_{D}\left({M}_{D}^{-s/d}+{M}_{\mathcal{S}}^{-t/{d}_{\mathcal{S}}}\right)\right)$ as compared to $O\left({M}_{D}^{-s/d}+{M}_{\mathcal{S}}^{-t/{d}_{\mathcal{S}}}\right)$ in the full tensor approximation. The parameters $s,t\in \left\{0,1\right\}$ indicate the regularity of the exact solution, which is required to be in the space of mixed smoothness ${H}^{s+1,t}\left(D×\mathcal{S}\right)$ to achieve the sparse convergence rate, whereas ${H}^{s+1,0}\left(D×\mathcal{S}\right)\cap {H}^{1,t}\left(D×\mathcal{S}\right)$ is sufficient in the full tensor approximation.
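To give a feeling for the size of this reduction, the following sketch compares the two degree-of-freedom estimates. It is only an order-of-magnitude illustration: the hidden constants are set to one and the logarithm is taken to base 2, both of which are assumptions, and the sample values of ${M}_{D}$ and ${M}_{\mathcal{S}}$ are arbitrary.

```python
import math


def full_tensor_dofs(m_d, m_s):
    """Degrees of freedom of the full tensor space: M_D * M_S."""
    return m_d * m_s


def sparse_tensor_dofs_estimate(m_d, m_s):
    """Estimate log(M_D) * (M_D + M_S); constant 1 and log base 2 assumed."""
    return math.log2(m_d) * (m_d + m_s)


if __name__ == "__main__":
    for m_d, m_s in [(2 ** 6, 2 ** 4), (2 ** 9, 2 ** 6), (2 ** 12, 2 ** 8)]:
        full = full_tensor_dofs(m_d, m_s)
        sparse = sparse_tensor_dofs_estimate(m_d, m_s)
        print(f"M_D={m_d}, M_S={m_s}: full={full}, sparse~{sparse:.0f}, "
              f"ratio~{full / sparse:.1f}")
```

The ratio between the two estimates grows as both resolutions are refined, which is the practical content of the complexity statement above.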

To simplify implementation, we realized the sparse tensor approximation algorithmically via the combination technique. Together with suitable quadrature rules, we demonstrated in numerical experiments that this sparse tensor combination approximation retains the analyzed theoretical advantages of the direct sparse tensor method while allowing for straightforward parallelization also at the level of subproblems.

The proposed specialization of the phase space Galerkin framework investigated here has the advantage that both discrete ordinates and spherical harmonics method can be derived from it so that the sparse tensorization benefits hold for the sparse variants of both methods alike.

Therefore, for problems whose solutions exhibit so-called mixed regularity, the sparse tensor product phase space Galerkin approximations realize a significant increase in efficiency, i. e. achievable error per number of degrees of freedom. Even in applications where high numerical accuracy is the main objective, a sparse tensor product approximation might be of value as an initial value for an iterative solver or in a problem-adapted preconditioning scheme.

## References

• Ávila M, Codina R, Principe J: Spatial approximation of the radiation transport equation using a subgrid-scale finite element method. Comput Meth Appl Mech Eng 2011, 200: 425-438. doi:10.1016/j.cma.2010.11.003

• Brenner SC, Scott LR: The mathematical theory of finite element methods, volume 15 of Texts in Applied Mathematics. New York: Springer; 2008. doi:10.1007/978-0-387-75934-0

• Brooks A, Hughes TJR: Streamline upwind/Petrov-Galerkin formulation for convection dominated flows with particular emphasis on the incompressible Navier-Stokes equations. Comput Methods Appl Mech Engrg 1982, 32(1–3):199-259. doi:10.1016/0045-7825(82)90071-8

• Bungartz H-J, Griebel M: Sparse grids. In Acta numerica, volume 13. Edited by: Iserles A. Cambridge University Press; 2004:147-269.

• Dahmen W, Huang C, Schwab C, Welper G: Adaptive Petrov-Galerkin methods for first order transport equations. SIAM J Numer Anal 2012, 50(5):2420-2445. ISSN 0036-1429. doi:10.1137/110823158.

• Evans KF: The spherical harmonic discrete ordinate method for three-dimensional atmospheric radiative transfer. J Atmos Sci 1998, 55(3):429-446. doi:10.1175/1520-0469(1998)055<0429:TSHDOM>2.0.CO;2.

• Garcke J: A dimension adaptive sparse grid combination technique for machine learning. In Proceedings of the 13th Biennial Computational Techniques and Applications Conference, CTAC-2006, volume 48 of ANZIAM J. Edited by: Read W, Larson JW, Roberts AJ. 2007, C725-C740. http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/70

• Grella K: Sparse tensor approximation for radiative transport. 2013. PhD thesis ETH Zurich, No.21388. doi:10.3929/ethz-a-009970281.

• Grella K, Schwab C: Sparse tensor spherical harmonics approximation in radiative transfer. J Comput Phys 2011a, 230(23):8452-8473. ISSN 0021-9991. doi:10.1016/j.jcp.2011.07.028.

• Grella K, Schwab C: Sparse discrete ordinates method in radiative transfer. Comput Meth Appl Math 2011b, 11(3):305-326. ISSN 1609-9389. doi:10.2478/cmam-2011-0017.

• Griebel M, Schneider M, Zenger C: Iterative Methods in Linear Algebra, chapter A combination technique for the solution of sparse grid problems. North-Holland: Amsterdam; 1992.

• Griebel M, Harbrecht H: On the construction of sparse tensor product spaces. Math Comp 2013a, 82: 975-994. doi:10.1090/S0025-5718-2012-02638-X.

• Griebel M, Harbrecht H: On the convergence of the combination technique. Technical Report 1304, Institut für Numerische Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn, March 2013b. http://wissrech.ins.uni-bonn.de/research/pub/griebel/CombiTechniqueConvergence.pdf

• Hegland M: Adaptive sparse grids. In Proc. of 10th Computational Techniques and Applications Conference CTAC-2001, volume 44 of ANZIAM J. Edited by: Burrage K, Sidje RB. 2003, C335-C353. http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/685

• Hébert A: Handbook of nuclear engineering, chapter multigroup neutron transport and diffusion computations. Springer; 2010. doi:10.1007/978-0-387-98149-9_8.

• Kanschat G: Solution of radiative transfer problems with finite elements. In Numerical methods in multidimensional radiative transfer. Edited by: Kanschat G, Meinköhn E, Rannacher R, Wehrse R. Springer; 2008:49-98. doi:10.1007/978-3-540-85369-5.

• Knapp AW: Advanced Real Analysis. Birkhäuser Boston: Cornerstones; 2005. doi:10.1007/0-8176-4442-3. ISBN 978-0-8176-4382-9

• Larsen EW, Thömmes G, Klar A, Seaïd M, Götz T: Simplified ${P}_{N}$ approximations to the equations of radiative heat transfer and applications. J Comput Phys 2002, 183(2):652-675. ISSN 0021-9991. doi:10.1006/jcph.2002.7210.

• Lathrop KD: Ray effects in discrete ordinates equations. Nucl Sci Eng 1968, 32(3):357.

• Manteuffel TA, Ressel KJ, Starke G: A boundary functional for the least-squares finite-element solution of neutron transport problems. SIAM J Numer Anal 2000, 37(2):556-586. doi:10.1137/S0036142998344706.

• Modest MF: Radiative heat transfer, 2nd edition. Amsterdam: Elsevier; 2003.

• Modest MF, Yang J: Elliptic PDE formulation and boundary conditions of the spherical harmonics method of arbitrary order for general three-dimensional geometries. J Quant Spectrosc Radiative Transf 2008, 109: 1641-1666. doi:10.1016/j.jqsrt.2007.12.018.

• Peng K, Gao X, Qu X, Ren N, Chen X, He X, Wang X, Liang J, Tian J: Graphics processing unit parallel accelerated solution of the discrete ordinates for photon transport in biological tissues. Appl Opt 2011, 50(21):3808-3823. doi:10.1364/AO.50.003808.

• Scott LR, Zhang S: Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math Comp 1990, 54: 483-493. doi:10.1090/S0025-5718-1990-1011446-7.

• Stone JC: Adaptive discrete-ordinates algorithms and strategies. Texas A&M University: PhD thesis; 2007. http://repository.tamu.edu//handle/1969.1/85857

• Widmer G, Hiptmair R, Schwab C: Sparse adaptive finite elements for radiative transfer. J Comput Phys 2008, 227: 6071-6105. doi:10.1016/j.jcp.2008.02.025.

• Widmer G: Sparse finite elements for radiative transfer. ETH Zürich: PhD thesis; 2009. No. 18420. doi:10.3929/ethz-a-005916456. http://e-collection.ethbib.ethz.ch/view/eth:374

• Zenger C: Sparse grids. In Parallel algorithms for partial differential equations, number 31 in notes on numerical fluid mechanics. Edited by: Hackbusch W. Vieweg; 1991. http://www5.in.tum.de/pub/zenger91sg.pdf

## Acknowledgments

The author wishes to thank Prof. Dr. Ch. Schwab for helpful discussions and valuable suggestions to improvements of this article. Financial support for this work by Schweizerischer Nationalfonds (SNF) under project no. 121892, by Deutsche Forschungsgemeinschaft (DFG) within SPP1324, and by the European Research Council (ERC) under ERC Advanced Grant 247277 is gratefully acknowledged.

## Author information

### Corresponding author

Correspondence to Konstantin Grella.

## Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
