MIDAPACK - MIcrowave Data Analysis PACKage 1.0beta
Parallel software tools for high performance CMB data analysis
multithreaded/sequential routines

Functions

int tpltz_init (int n, int lambda, int *nfft, int *blocksize, fftw_complex **T_fft, double *T, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b)
 Initialize block size and all the fftw arrays and plans needed for the computation.
int tpltz_cleanup (fftw_complex **T_fft, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b)
 Clean up the fftw workspace used in the computation of the Toeplitz matrix-matrix product.
int stmm_core (double **V, int n, int m, fftw_complex *T_fft, int blocksize, int lambda, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f, fftw_plan plan_b, int flag_offset)
 Perform the stand-alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.
int stmm (double **V, int n, int m, int id0, int l, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft)
 Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping.
int reset_gaps (double **V, int id0, int local_V_size, int m, int nrow, int *id0gap, int *lgap, int ngap)
 Set the data to zero at the gap locations.
int stbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int idp, int local_V_size)
 Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way.
int gstbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int id0p, int local_V_size, int *id0gap, int *lgap, int ngap)
 Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which mark data that are not needed for the computation. The gap indices are defined in the global time space, like the generalized Toeplitz matrix, that is, along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block.

Detailed Description

These are shared-memory routines.


Function Documentation

int gstbmm ( double **  V,
int *  n,
int  m,
int  nrow,
double *  T,
int  nb_blocks_local,
int  nb_blocks_all,
int *  lambda,
int *  idv,
int  id0p,
int  local_V_size,
int *  id0gap,
int *  lgap,
int  ngap 
)

Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which mark data that are not needed for the computation. The gap indices are defined in the global time space, like the generalized Toeplitz matrix, that is, along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block.

We first rebuild the Toeplitz block matrix structure to reduce the computational cost and skip the computation of the values in the defined gaps. Then each process performs the multiplication sequentially for each of the gappy blocks, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. The parameters are:

Parameters:
V                [input] distributed data matrix (with the convention V(i,j)=V[i+j*n]); [out] result of the product TV
n                number of rows for each Toeplitz block as stored in T
m                number of columns of the global data matrix V
nrow             number of rows of the global data matrix V
T                Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps.
nb_blocks_all    total number of Toeplitz blocks on the diagonal of the full Toeplitz matrix
nb_blocks_local  number of Toeplitz blocks as stored in T
lambda           half-bandwidth of each Toeplitz block stored in T
idv              for each Toeplitz block as stored in T, the global row index of the first element of the interval to which that Toeplitz block is applied
id0p             global index of the first element of the local part of V
local_V_size     number of elements in the local part of V
id0gap           index of the first element of each defined gap
lgap             length of each defined gap
ngap             number of defined gaps

Definition at line 253 of file toeplitz_seq.c.

int reset_gaps ( double **  V,
int  id0,
int  local_V_size,
int  m,
int  nrow,
int *  id0gap,
int *  lgap,
int  ngap 
)

Set the data to zero at the gap locations.

The data located within the gaps are set to zero. The gaps are defined in the time domain, meaning that their indices are defined along the row dimension.

Definition at line 1680 of file toeplitz.c.
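
The gap convention described above can be illustrated with a minimal sketch. It is not the library's reset_gaps: the helper name zero_gaps_sketch, the flat double* argument and the returned count are assumptions made for the example. It also assumes that the global index of element (i,j) is i + j*nrow and that the local part of V covers the global indices id0 to id0 + local_size - 1.

```c
#include <assert.h>

/* Hypothetical helper (not the MIDAPACK reset_gaps): zero every gap sample
   that falls inside the local slice [id0, id0 + local_size) of a
   column-major nrow x m matrix. Gaps are intervals of rows (time samples)
   and therefore apply to every column. Returns the number of zeroed
   samples. */
static int zero_gaps_sketch(double *v_local, int id0, int local_size,
                            int m, int nrow,
                            const int *id0gap, const int *lgap, int ngap)
{
    assert(v_local && id0 >= 0 && local_size >= 0 && nrow > 0);
    int nzeroed = 0;
    for (int j = 0; j < m; j++) {
        for (int k = 0; k < ngap; k++) {
            int first = j * nrow + id0gap[k];   /* global start of the gap */
            int last = first + lgap[k];         /* one past its end */
            /* Clip the gap to the local slice. */
            int lo = first > id0 ? first : id0;
            int hi = last < id0 + local_size ? last : id0 + local_size;
            for (int g = lo; g < hi; g++) {
                v_local[g - id0] = 0.0;
                nzeroed++;
            }
        }
    }
    return nzeroed;
}
```

With nrow=5, m=2 and a local slice starting at global index 3 with 5 elements, the gaps {rows 1-2, row 4} touch the local offsets 1, 3 and 4 only; the remaining gap samples belong to other slices.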

int stbmm ( double **  V,
int *  n,
int  m,
int  nrow,
double *  T,
int  nb_blocks_local,
int  nb_blocks_all,
int *  lambda,
int *  idv,
int  idp,
int  local_V_size 
)

Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way.

Each process performs the multiplication sequentially for each diagonal block, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. Each of the diagonal blocks is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block. The parameters are:

Parameters:
V                [input] distributed data matrix (with the convention V(i,j)=V[i+j*n]); [out] result of the product TV
n                number of rows for each Toeplitz block as stored in T
m                number of columns of the global data matrix V
nrow             number of rows of the global data matrix V
T                Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps.
nb_blocks_all    total number of Toeplitz blocks on the diagonal of the full Toeplitz matrix
nb_blocks_local  number of Toeplitz blocks as stored in T
lambda           half-bandwidth of each Toeplitz block stored in T
idv              for each Toeplitz block as stored in T, the global row index of the first element of the interval to which that Toeplitz block is applied
idp              global index of the first element of the local part of V
local_V_size     number of elements in the local part of V

Definition at line 71 of file toeplitz_seq.c.
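
To make the storage layout of T, n, lambda and idv concrete, here is a serial reference sketch that applies the block-diagonal matrix to a single column directly, without FFTs or MPI. The function name block_toeplitz_apply_sketch is hypothetical. The sketch assumes that block b contributes exactly lambda[b] entries to T, and it sets rows not covered by any block to zero, which is a choice of the example and not a statement about the library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical serial reference (not the MIDAPACK stbmm): apply a
   symmetric, band Toeplitz block-diagonal matrix to one column x of
   length nrow and store the result in y.
   T      : first-row entries of all blocks, concatenated
   n[b]   : size of block b
   lambda[b] : half-bandwidth of block b (entries it contributes to T)
   idv[b] : global row index where block b starts */
static void block_toeplitz_apply_sketch(const double *T, const int *n,
                                        const int *lambda, const int *idv,
                                        int nb_blocks,
                                        const double *x, double *y, int nrow)
{
    for (int r = 0; r < nrow; r++)
        y[r] = 0.0;               /* assumption: uncovered rows give zero */

    int off = 0;                  /* start of block b's entries inside T */
    for (int b = 0; b < nb_blocks; b++) {
        assert(idv[b] >= 0 && idv[b] + n[b] <= nrow);
        for (int i = 0; i < n[b]; i++) {
            double acc = 0.0;
            for (int k = 0; k < n[b]; k++) {
                int d = abs(i - k);            /* distance to the diagonal */
                if (d < lambda[b])
                    acc += T[off + d] * x[idv[b] + k];
            }
            y[idv[b] + i] = acc;
        }
        off += lambda[b];
    }
}
```

Such a direct product costs O(n*lambda) per block and is only meant for checking results on small inputs.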

int stmm ( double **  V,
int  n,
int  m,
int  id0,
int  l,
fftw_complex *  T_fft,
int  lambda,
fftw_complex *  V_fft,
double *  V_rfft,
fftw_plan  plan_f,
fftw_plan  plan_b,
int  blocksize,
int  nfft 
)

Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping.

The input matrix is reshaped into an optimized matrix whose shape depends on the block size and on the number of simultaneous FFTs (defined by the variable nfft). The resulting number of columns is the number of vectors whose FFTs are computed simultaneously. The multiplication is then performed block by block, with the chosen block size, using the core routine. The parameters are:

Parameters:
V          [input] data matrix (with the convention V(i,j)=V[i+j*n]); [out] result of the product TV
n          number of rows of V
m          number of columns of V
id0        first index of V
l          length of V
T_fft      complex array used for FFTs
lambda     Toeplitz band width
V_fft      complex array used for FFTs
V_rfft     real array used for FFTs
plan_f     fftw plan forward (r2c)
plan_b     fftw plan backward (c2r)
blocksize  block size
nfft       number of simultaneous FFTs

Definition at line 833 of file toeplitz.c.
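
The in/out convention for V (column-major storage, result overwriting the input) can be sketched with a direct reference product. This is not how stmm computes the product: it replaces the FFT-based sliding window with a plain band sum. The name toeplitz_matmul_inplace_sketch, the flat double* argument and the array t holding the lambda first-row entries are assumptions of the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference (not the MIDAPACK stmm): overwrite the n x m
   column-major matrix V with T V, where T is the symmetric band Toeplitz
   matrix whose first row starts with t[0..lambda-1].
   Returns 0 on success, -1 if the scratch column cannot be allocated. */
static int toeplitz_matmul_inplace_sketch(double *V, int n, int m,
                                          const double *t, int lambda)
{
    assert(V && t && n > 0 && m > 0 && lambda >= 1);
    double *col = malloc((size_t)n * sizeof *col);
    if (!col)
        return -1;

    for (int j = 0; j < m; j++) {
        /* Keep a copy of the input column, since V is overwritten. */
        memcpy(col, V + (size_t)j * n, (size_t)n * sizeof *col);
        for (int i = 0; i < n; i++) {
            /* Only the band |i - k| < lambda contributes. */
            int k0 = i - (lambda - 1) < 0 ? 0 : i - (lambda - 1);
            int k1 = i + (lambda - 1) >= n ? n - 1 : i + (lambda - 1);
            double acc = 0.0;
            for (int k = k0; k <= k1; k++)
                acc += t[abs(i - k)] * col[k];
            V[i + (size_t)j * n] = acc;    /* V(i,j) = V[i + j*n] */
        }
    }
    free(col);
    return 0;
}
```

A reference of this kind is mainly useful for validating the output of the FFT-based routines on small matrices.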

int stmm_core ( double **  V,
int  n,
int  m,
fftw_complex *  T_fft,
int  blocksize,
int  lambda,
fftw_complex *  V_fft,
double *  V_rfft,
int  nfft,
fftw_plan  plan_f,
fftw_plan  plan_b,
int  flag_offset 
)

Perform the stand-alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.

The product is performed block by block, with either a user-defined block size or a computed, optimized block size that reflects a trade-off between the cost of a single FFT of length blocksize and the number of blocks needed to perform the multiplication. The latter determines how many spurious extra values are computed because of the overlaps between the blocks. Use flag_offset=0 for the "classic" algorithm and flag_offset=1 to apply an offset that skips the first and last lambda terms. This is useful when the data were reshaped beforehand with the optimal number of columns for a given nfft. Better be inside the arguments of the routine. The parameters are:

Parameters:
V            [input] data matrix (with the convention V(i,j)=V[i+j*n]); [out] result of the product TV
n            number of rows of V
m            number of columns of V
T_fft        complex array used for FFTs
blocksize    block size used in the sliding window algorithm
lambda       Toeplitz band width
V_fft        complex array used for FFTs
V_rfft       real array used for FFTs
nfft         number of simultaneous FFTs
plan_f       fftw plan forward (r2c)
plan_b       fftw plan backward (c2r)
flag_offset  flag to avoid the extra 2*lambda zero padding on the edges

Definition at line 515 of file toeplitz.c.
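
The trade-off between block size and overlap can be seen in a minimal sliding-window sketch. It is not the library's stmm_core: the circular product inside each window is computed by a direct sum instead of FFTs, the overlap is taken as lambda-1 samples on each side (the minimum needed for exact results), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Direct product y = T x, used as the reference. T is the symmetric band
   Toeplitz matrix whose first row starts with t[0..lambda-1]. */
static void toeplitz_direct(const double *t, int lambda,
                            const double *x, double *y, int n)
{
    for (int i = 0; i < n; i++) {
        double acc = 0.0;
        for (int k = 0; k < n; k++) {
            int d = abs(i - k);
            if (d < lambda)
                acc += t[d] * x[k];
        }
        y[i] = acc;
    }
}

/* Hypothetical sliding-window product (not the MIDAPACK stmm_core).
   Each window of length blocksize is multiplied by the circulant
   embedding of T; only the interior outputs, which are not affected by
   the wrap-around, are kept. Returns the number of windows, or -1 if the
   window buffer cannot be allocated. */
static int toeplitz_sliding_sketch(const double *t, int lambda,
                                   const double *x, double *y, int n,
                                   int blocksize)
{
    int pad = lambda - 1;             /* spoiled samples at each window edge */
    int step = blocksize - 2 * pad;   /* valid outputs per window */
    assert(lambda >= 1 && step > 0);

    double *buf = malloc((size_t)blocksize * sizeof *buf);
    if (!buf)
        return -1;

    int nwin = 0;
    for (int s = -pad; s + pad < n; s += step, nwin++) {
        /* Load the window, padding with zeros outside [0, n). */
        for (int p = 0; p < blocksize; p++) {
            int g = s + p;
            buf[p] = (g >= 0 && g < n) ? x[g] : 0.0;
        }
        /* Circular product (an FFT in a real implementation). For the
           interior points the modulo never wraps, which is why they are
           exact. */
        for (int p = pad; p < pad + step && s + p < n; p++) {
            double acc = 0.0;
            for (int d = -pad; d <= pad; d++) {
                int q = (p + d + blocksize) % blocksize;
                acc += t[abs(d)] * buf[q];
            }
            y[s + p] = acc;
        }
    }
    free(buf);
    return nwin;
}
```

For n=10, lambda=3 and blocksize=8, each window yields 4 valid outputs, so 3 windows (24 transformed samples) are needed for 10 results. A larger block size lowers this overhead but makes each transform more expensive.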

int tpltz_cleanup ( fftw_complex **  T_fft,
fftw_complex **  V_fft,
double **  V_rfft,
fftw_plan *  plan_f,
fftw_plan *  plan_b 
)

Clean up the fftw workspace used in the computation of the Toeplitz matrix-matrix product.

Destroys the fftw plans, frees the memory and resets the fftw workspace.

See also:
tpltz_init
Parameters:
T_fft   complex array used for FFTs
V_fft   complex array used for FFTs
V_rfft  real array used for FFTs
plan_f  fftw plan forward (r2c)
plan_b  fftw plan backward (c2r)

Definition at line 324 of file toeplitz.c.

int tpltz_init ( int  n,
int  lambda,
int *  nfft,
int *  blocksize,
fftw_complex **  T_fft,
double *  T,
fftw_complex **  V_fft,
double **  V_rfft,
fftw_plan *  plan_f,
fftw_plan *  plan_b 
)

Initialize block size and all the fftw arrays and plans needed for the computation.

Initializing the fftw arrays and plans is necessary before any computation of the Toeplitz matrix-matrix product. Use tpltz_cleanup afterwards.

See also:
tpltz_cleanup
Parameters:
n          row size of the matrix used for the later product
lambda     Toeplitz band width
nfft       maximum number of FFTs you want to compute at the same time
blocksize  optimal block size used in the sliding window algorithm (computed as an optimized value)
T_fft      complex array used for FFTs
T          Toeplitz matrix
V_fft      complex array used for FFTs
V_rfft     real array used for FFTs
plan_f     fftw plan forward (r2c)
plan_b     fftw plan backward (c2r)

Definition at line 186 of file toeplitz.c.
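
The relation between T and T_fft is not spelled out above. A common way to prepare a band Toeplitz matrix for FFT-based products is circulant embedding: the lambda first-row entries are placed at the start of a length-blocksize array and mirrored at its end, and that array is then transformed. The sketch below shows only this embedding step, under the assumption that tpltz_init does something equivalent; the name circulant_kernel_sketch is hypothetical and no fftw call is made.

```c
#include <assert.h>

/* Hypothetical helper (not part of MIDAPACK): fill c[0..blocksize-1] with
   the first column of the circulant matrix that embeds the symmetric band
   Toeplitz matrix whose first row starts with t[0..lambda-1]. */
static void circulant_kernel_sketch(const double *t, int lambda,
                                    double *c, int blocksize)
{
    /* The two mirrored halves must not collide. */
    assert(lambda >= 1 && blocksize >= 2 * lambda - 1);

    for (int p = 0; p < blocksize; p++)
        c[p] = 0.0;
    for (int d = 0; d < lambda; d++) {
        c[d] = t[d];                  /* entries below the diagonal */
        if (d > 0)
            c[blocksize - d] = t[d];  /* mirrored entries, wrapped around */
    }
}
```

For t = {4, 2, 1} and blocksize = 8 the kernel is {4, 2, 1, 0, 0, 0, 1, 2}. Its real-to-complex transform would play the role of T_fft in a product of the kind performed by stmm_core.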