MIDAPACK - MIcrowave Data Analysis PACKage 1.0beta
Parallel software tools for high performance CMB data analysis
Functions | |
int | tpltz_init (int n, int lambda, int *nfft, int *blocksize, fftw_complex **T_fft, double *T, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b) |
Initialize block size and all the fftw arrays and plans needed for the computation. | |
int | tpltz_cleanup (fftw_complex **T_fft, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b) |
Clean up the fftw workspace used to compute the Toeplitz matrix-matrix product. | |
int | stmm_core (double **V, int n, int m, fftw_complex *T_fft, int blocksize, int lambda, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f, fftw_plan plan_b, int flag_offset) |
Perform the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm. | |
int | stmm (double **V, int n, int m, int id0, int l, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft) |
Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping. | |
int | reset_gaps (double **V, int id0, int local_V_size, int m, int nrow, int *id0gap, int *lgap, int ngap) |
Set the data to zero at the gap locations. | |
int | stbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int idp, int local_V_size) |
Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. | |
int | gstbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int id0p, int local_V_size, int *id0gap, int *lgap, int ngap) |
Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which represent data that are useless for the computation. The gap indices are defined in the global time space, like those of the generalized Toeplitz matrix, meaning along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block. |
These are shared-memory routines.
int gstbmm | ( | double ** | V, |
int * | n, | ||
int | m, | ||
int | nrow, | ||
double * | T, | ||
int | nb_blocks_local, | ||
int | nb_blocks_all, | ||
int * | lambda, | ||
int * | idv, | ||
int | id0p, | ||
int | local_V_size, | ||
int * | id0gap, | ||
int * | lgap, | ||
int | ngap | ||
) |
Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which represent data that are useless for the computation. The gap indices are defined in the global time space, like those of the generalized Toeplitz matrix, meaning along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block.
We first rebuild the Toeplitz block matrix structure to reduce the computation cost and skip the computation of the values in the defined gaps. Then each process performs the multiplication sequentially for each gappy block, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. The parameters are:
V | [input] distributed data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows for each Toeplitz block as stored in T |
m | number of columns of the global data matrix V |
nrow | number of rows of the global data matrix V |
T | Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps. |
nb_blocks_all | total number of Toeplitz blocks on the diagonal of the full Toeplitz matrix |
nb_blocks_local | number of Toeplitz blocks as stored in T |
lambda | half bandwidth of each Toeplitz block stored in T |
idv | for each Toeplitz block stored in T, the global row index of the first element of the interval to which that block is applied |
id0p | global index of the first element of the local part of V |
local_V_size | number of all elements in local V |
id0gap | index of the first element of each defined gap |
lgap | length of each defined gap |
ngap | number of defined gaps |
Definition at line 253 of file toeplitz_seq.c.
int reset_gaps | ( | double ** | V, |
int | id0, | ||
int | local_V_size, | ||
int | m, | ||
int | nrow, | ||
int * | id0gap, | ||
int * | lgap, | ||
int | ngap | ||
) |
Set the data to zero at the gap locations.
The data located within the gaps are set to zero. The gaps are defined in the time domain, meaning their indexes are defined in the row dimension.
Definition at line 1680 of file toeplitz.c.
int stbmm | ( | double ** | V, |
int * | n, | ||
int | m, | ||
int | nrow, | ||
double * | T, | ||
int | nb_blocks_local, | ||
int | nb_blocks_all, | ||
int * | lambda, | ||
int * | idv, | ||
int | idp, | ||
int | local_V_size | ||
) |
Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way.
Each process performs the multiplication sequentially for each diagonal block, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. Each of the diagonal blocks is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block. The parameters are:
V | [input] distributed data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows for each Toeplitz block as stored in T |
m | number of columns of the global data matrix V |
nrow | number of rows of the global data matrix V |
T | Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps. |
nb_blocks_all | total number of Toeplitz blocks on the diagonal of the full Toeplitz matrix |
nb_blocks_local | number of Toeplitz blocks as stored in T |
lambda | half bandwidth of each Toeplitz block stored in T |
idv | for each Toeplitz block stored in T, the global row index of the first element of the interval to which that block is applied |
idp | global index of the first element of the local part of V |
local_V_size | number of all elements in local V |
Definition at line 71 of file toeplitz_seq.c.
int stmm | ( | double ** | V, |
int | n, | ||
int | m, | ||
int | id0, | ||
int | l, | ||
fftw_complex * | T_fft, | ||
int | lambda, | ||
fftw_complex * | V_fft, | ||
double * | V_rfft, | ||
fftw_plan | plan_f, | ||
fftw_plan | plan_b, | ||
int | blocksize, | ||
int | nfft | ||
) |
Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping.
The input matrix is reshaped into an optimized matrix depending on the block size and the number of simultaneous FFTs (defined with the variable nfft). The resulting number of columns is the number of vectors whose FFTs are computed simultaneously. The multiplication is then performed block-by-block with the chosen block size using the core routine. The parameters are:
V | [input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows of V |
m | number of columns of V |
id0 | first index of V |
l | length of V |
T_fft | complex array used for FFTs |
lambda | Toeplitz band width |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
blocksize | block size |
nfft | number of simultaneous FFTs |
Definition at line 833 of file toeplitz.c.
int stmm_core | ( | double ** | V, |
int | n, | ||
int | m, | ||
fftw_complex * | T_fft, | ||
int | blocksize, | ||
int | lambda, | ||
fftw_complex * | V_fft, | ||
double * | V_rfft, | ||
int | nfft, | ||
fftw_plan | plan_f, | ||
fftw_plan | plan_b, | ||
int | flag_offset | ||
) |
Perform the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.
The product is performed block-by-block with either a user-defined block size or a computed, optimized block size. The optimized size reflects a trade-off between the cost of a single FFT of length blocksize and the number of blocks needed to perform the multiplication. The latter determines how many spurious extra values are computed because of the overlaps between the blocks. Use flag_offset=0 for the "classic" algorithm and flag_offset=1 to apply an offset that skips the first and last lambda terms. This is useful when the data were reshaped beforehand with an optimal number of columns for a given nfft. It is better kept among the arguments of the routine. The parameters are:
V | [input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows of V |
m | number of columns of V |
T_fft | complex array used for FFTs |
blocksize | block size used in the sliding window algorithm |
lambda | Toeplitz band width |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
nfft | number of simultaneous FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
flag_offset | flag to avoid the extra 2*lambda zero padding on the edges |
Definition at line 515 of file toeplitz.c.
int tpltz_cleanup | ( | fftw_complex ** | T_fft, |
fftw_complex ** | V_fft, | ||
double ** | V_rfft, | ||
fftw_plan * | plan_f, | ||
fftw_plan * | plan_b | ||
) |
Clean up the fftw workspace used to compute the Toeplitz matrix-matrix product.
Destroy fftw plans, free memory and reset fftw workspace.
T_fft | complex array used for FFTs |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
Definition at line 324 of file toeplitz.c.
int tpltz_init | ( | int | n, |
int | lambda, | ||
int * | nfft, | ||
int * | blocksize, | ||
fftw_complex ** | T_fft, | ||
double * | T, | ||
fftw_complex ** | V_fft, | ||
double ** | V_rfft, | ||
fftw_plan * | plan_f, | ||
fftw_plan * | plan_b | ||
) |
Initialize block size and all the fftw arrays and plans needed for the computation.
Initializing the fftw arrays and plans is necessary before any computation of the Toeplitz matrix-matrix product. Use tpltz_cleanup afterwards.
n | row size of the matrix used for later product |
lambda | Toeplitz band width |
nfft | maximum number of FFTs you want to compute at the same time |
blocksize | block size used in the sliding window algorithm (an optimized value can be computed) |
T_fft | complex array used for FFTs |
T | Toeplitz matrix |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
Definition at line 186 of file toeplitz.c.