arrays (general info);
maps;
submaps;
harmonic coefficients;
full-size maps and alm arrays;
associated legendre functions, plm.
Top of the page
- The map values are always assumed to be ordered ring by ring from one pole (usually identified as North Pole) to the other. The pixel numbering scheme follows the same assumptions (i.e., corresponds to the Healpix ring ordering).
- In the language used hereafter, the object map[ npix*ncomp*nmaps] contains nmaps maps (or submaps) composed of
ncomp components (e.g., maps of different Stokes parameters), each containing npix pixels.
- Note that the full-size, non-distributed maps are always assumed to contain only one set of maps (i.e., nmaps = 1).
(With an exception of a "do-it-all" driver distribute_local_data_objects_map2alm.)
See full-size arrays for more info.
Top of the page
submaps ("local_map") -- all maps considered here are divided into submaps which are distributed over
the processors, with every submap assigned to a different processor.
The submap arrays are essentially of the same form as all the maps considered here as described above.
Each of them consists of a number of complete iso-latitude rings and includes the
corresponding rings (i.e., symmetric with respect to the equator) of the Northern and Southern
hemispheres. Moreover, for each hemisphere the submap is composed of the consecutive rings, which are therefore defined by
two integer numbers defining the first and the last ring of the Northern hemisphere (plus the equator, if present)
denoted hereafter as first_ring and last_ring. Note that both of the limiting rings are always assumed to belong to
the submap. The ring numbers always start from 1.
The first half of the submap corresponds to the Northern hemisphere starting from the ring first_ring and ending
with the ring last_ring. The second half of the submap stores the info about the Southern Hemisphere, starting with the
ring nringstot-last_ring+1 and ending with the ring nringstot-first_ring+1. (nringstot is a total number
of rings in a full sky map.)
If however the pixelization includes an equatorial ring of pixels and last_ring = nringsall for a given proc then
the equatorial pixel values are stored only once. The submap for this proc is then made first of all the rings of pixels of the
Norther hemisphere, followed
by the equatorial ring, and the Southern Hemisphere pixels stored as above.
For cut sky experiments, unobserved rings, i.e., those of which none of the pixels has been observed, do not have to be
taken into account while performing the distribution of a map over the processors. However, each processor needs to receive a
cosecutive subset of all rings defined by two extreme rings, first_ring and last_ring.
As a consequence some of the submaps may include unobserved rings, if those happen to be in the middle of the ring range assigned
to a given proc. This may lead to some memory inefficiency for
weirdly shaped maps (e.g., made of disconnected patches), as the allocated submaps, and other pixel domain object, will also
include those superfluous rings. The unnecessary operations can be however avoided by setting unobserved flags for the respective
rings in the scan structure.
We also note that in some cases of the cut-sky map distribution, as for example those performed by the routine
get_local_data_sizes,
the actual rings included in all the submaps may in general depend on the number of processors used for the calculations.
That may result in some total memory variations as a function of a number of procs, (but not FLOPS). In any realistic case
such a variation is expected to be irrelevant.
The inverse transforms (map2alm) thus utilize the values assigned to all the pixels in all the
rings included in the submaps and marked as observed. Those may include the pixels which are in fact never observed.
Hence, the map values for the
unobserved pixels need to be set explicitly to zero prior to calling the routines.
For the direct transforms the map values will be computed for all the pixels of all the rings marked as observed,
including pixels in fact unobserved and which therefore thus will need to be rejected (or marked as unobserved)
in post-processing.
In all cases full rings are always processed. This is done for the sake of the phi-transforms, which are performed with help
of the FFTs. The extra computational cost resulting from this is in fact negligible. The memory
overhead can be however occasionally significant (e.g., for the narrow sky patches elongated along the meridian)
that can be overcome by simple distributing the run over more processors and/or rotating the pixelization coordinate
system. There seems not to be any general way around, performing well for the nearly full sky and small sky coverages.
The solution adopted here seems to be the "lesser evil".
Top of the page
harmonic coefficients ("alm") -- are stored in a vector of values of the type, s2hat_dcomplex. The latter
is a derived type with a two double precision fields, .re and .im. The vector contains (lmax+1)*nmvals*ncomp*nmaps
numbers corresponding to a four dimensional array of which two different layouts
are supported to provide some level of compatibility with the serial Healpix routines. (Healpix alm arrays are 3-dimensional objects
as they do not allow for a multiple map option. How to combine Healpix serial and S2HAT libraries is suggested here.)
The permitted 4-dim layouts are:
S2HAT native layout: alm[ 0:lmax, 0:mvals-1, 1:ncomp, 1:nmaps]
HEALPIX-like layout: alm[ 1:ncomp, 0:lmax, 0:mvals-1, 1:nmaps]
and are stored in a column-wise order. Indices of the vector elements run from 0 to its size-1.
Here
lmax -- a maximal l value (harmonic polar index) to be recovered;
nmvals -- a number of different m (harmonic azimuthal index) values as assigned to a given proc. The actual m associated
with a given proc are stored in a vector mvals;
ncomp -- a number of components;
= 1 - I coefficients only, (the scalar transform only);
= 2 - E and B coefficients (for the spin transforms);
= 3 - I, E, and B coefficients (for polarized transforms only);
nmaps -- a number of the "alm" sets (i.e., maps) to be processed simultaneously.
The choice of the input matrix layout is decided via an extra input parameter, lda (4-byte integer), in a calling sequence of all the
S2HAT routines for which the ordering matters (thus including the transofrms themsleves.) It defines the leading dimension of the
array, which is either equal to lmax -- the S2HAT convention or ncomp for the HEALPIX convention.
- In the language used hereafter the array, alm defined here contains nmaps sets of alm coefficients
each including ncomp components (e.g., corresponding to I, E, B ).
- Note that full-size, non-distributed alm array appear explicitly only as 3 dimensional array with the fourth index (nmaps) suppressed.
(With an exception of a "do-it-all" driver distribute_local_data_objects_alm2map.) See full-size arrays for more info.
Top of the page
full-size maps and alm arrays --
in the S2HAT library full-size, non-distributed alm arrays and full-size non-distributed maps are usually assuming explicitly that nmaps = 1
(*).
This is dictated by the assumption that storing multiple complete sets of the alm coefficients
in a memory of a single processor will be generally prohibitively expensive. (Clearly, if not the case, the multiple full-size can be straightforwardly
stored in a user-defined array of a sufficient size.) In the distributed form a mutliple version of the maps and harmonic coefficients can be stored in a
single object. The provided gather and scatter routines allow distributing/collecting any full map to/from a
selected entry of the local arrays.
Note the full-size arrays of alm or maps are only used as by some of the gather/scatter routines and provided, for example, to facilitate interface with
the I/O part of a user code. The underlying assumption of the S2HAT library is that all operations can be performed on the memory-distributed objects bypassing
any memory bottlenecks.
The full-size arrays provide also a level of compatibility with serial HEALpix routines. (See Compatibility with Healpix).
(*)With an exception of two "do-it-all" drivers distribute_local_data_objects_map2alm
and distribute_local_data_objects_alm2map which are provided only for completeness.
Top of the page
associated Legendre functions -- are stored in a vector corresponding to a double precision two-dimensional array stored in a column-wise
order. Two different underlying 2-dim layouts are
supported, corresponding to the S2HAT native (plm[0:nplm-1,1:ncomp]) and HEALPIX (plm[1:ncomp,0:nplm-1]) choices.
Here
nplm -- a number of (l,m) modes assigned to a given processor multiplied by a number of constant elevation rings as
also assigned to it.
(This number can be estimated by the get_local_data_sizes or, more directly, by
the nummmodes routines.)
ncomp -- a number of map components in this case meaning specifically Stokes parameters and thus equal either to 1 or 3.
Unlike the harmonic coefficient arrays the plm ones are compressed to avoid any redundancies.
Therefore the values are assigned to the elements of the plm array as follows,
index=0
for every ring r ( = 0 ,...,nringsall-1)
for every m ( = mvals[0],...,mvals[nmvals-1])
for every l ( = m, ..., nlmax)
plm[ index] = P_{lm}(r)
index++
The plm coefficients are pressumed to be directly computed and stored only in their distributed form. No relevant gather/scatter routines are
therefore provided.
Top of the page
|