MIDAPACK - MIcrowave Data Analysis PACKage
1.1b
Parallel software tools for high performance CMB DA analysis
Functions | |
int | define_blocksize (int n, int lambda, int bs_flag, int fixed_bs) |
Defines an optimal size of the block used in the sliding windows algorithm. | |
int | define_nfft (int n_thread, int flag_nfft, int fixed_nfft) |
Defines the number of simultaneous ffts for the Toeplitz matrix product computation. | |
int | fftw_init_omp_threads (int fftw_n_thread) |
Initialize omp threads for fftw plans. | |
int | rhs_init_fftw (int *nfft, int fft_size, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b, int fftw_flag) |
Initializes fftw array and plan for the right hand side, general matrix V. | |
int | circ_init_fftw (double *T, int fft_size, int lambda, fftw_complex **T_fft) |
Initializes fftw array and plan for the circulant matrix T_circ obtained from T. | |
int | scmm_direct (int fft_size, int nfft, fftw_complex *C_fft, int ncol, double *V_rfft, double **CV, fftw_complex *V_fft, fftw_plan plan_f_V, fftw_plan plan_b_CV) |
Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans. | |
int | scmm_basic (double **V, int blocksize, int m, fftw_complex *C_fft, double **CV, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f_V, fftw_plan plan_b_CV) |
Performs the product of a circulant matrix by a matrix using FFTs (an INTERNAL routine) | |
int | stmm_core (double **V, int n, int m, double *T, fftw_complex *T_fft, int blocksize, int lambda, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f, fftw_plan plan_b, int flag_offset, int flag_nofft) |
Performs the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm. (an INTERNAL routine) | |
int | stmm_main (double **V, int n, int m, int id0, int l, double *T, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft, Flag flag_stgy) |
Performs the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping. (an INTERNAL routine) | |
int | build_gappy_blocks (int nrow, int m, Block *tpltzblocks, int nb_blocks_local, int nb_blocks_all, int64_t *id0gap, int *lgap, int ngap, Block *tpltzblocks_gappy, int *nb_blocks_gappy_final, int flag_param_distmin_fixed) |
Build the gappy Toeplitz block structure to optimise the product computation at gap locations. | |
int | stmm_simple_basic (double **V, int n, int m, double *T, int lambda, double **TV) |
Perform the product of a Toeplitz matrix by a matrix without using FFTs. | |
int | stmm_simple_core (double **V, int n, int m, double *T, int blocksize, int lambda, int nfft, int flag_offset) |
Perform the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm. | |
int | flag_stgy_init_auto (Flag *flag_stgy) |
Set the flag to automatic parameters. | |
int | flag_stgy_init_zeros (Flag *flag_stgy) |
Set the flag parameters to zeros. This is almost the same as automatic. | |
int | flag_stgy_init_defined (Flag *flag_stgy) |
Set the flag parameters to the defined ones. | |
int | print_flag_stgy_init (Flag flag_stgy) |
Print the flag parameters values. |
These are low-level routines.
int define_blocksize (int n, int lambda, int bs_flag, int fixed_bs)
Defines an optimal size of the block used in the sliding windows algorithm.
The optimal block size is computed as the smallest power of two above 3*lambda, i.e. the smallest value 2^x, with x an integer, that is above 3*lambda. If bs_flag is set to one, a different formula is used to compute the optimal block size (see MADmap: A Massively Parallel Maximum Likelihood Cosmic Microwave Background Map-Maker, C. M. Cantalupo, J. D. Borrill, A. H. Jaffe, T. S. Kisner, and R. Stompor, The Astrophysical Journal Supplement Series, 187:212–227, 2010 March). To avoid using a block size much bigger than the matrix, the block size is set to 3*lambda when the previously computed size is bigger than the matrix size n. This mostly happens for matrices that are small compared to their bandwidth.
n | matrix row dimension |
lambda | half bandwidth of the Toeplitz matrix |
bs_flag | flag to use a different formula for optimal block size computation |
fixed_bs | fixed blocksize value if needed |
Definition at line 130 of file toeplitz.c.
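The default rule described above can be sketched in a few lines. This is only an illustration, not the code from toeplitz.c: the function name sketch_blocksize is hypothetical, "above" is read as "greater than or equal to" (an assumption), and the alternative formula selected by bs_flag as well as the handling of fixed_bs are left out because they are not described here.

```c
#include <assert.h>

/* Hypothetical helper (not part of MIDAPACK): default block size rule.
   Assumption: "above 3*lambda" is read as ">= 3*lambda". */
int sketch_blocksize(int n, int lambda)
{
    assert(n > 0 && lambda > 0);

    int min_bs = 3 * lambda;   /* lower bound on the block size */
    int bs = 1;

    /* smallest power of two that reaches min_bs */
    while (bs < min_bs)
        bs *= 2;

    /* small matrix compared to its bandwidth: fall back to 3*lambda */
    if (bs > n)
        bs = min_bs;

    return bs;
}
```

With n=1000 and lambda=10 the lower bound is 30 and the sketch returns 32. With n=50 and lambda=20 the power of two (64) exceeds n, so the sketch falls back to 60.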
int define_nfft (int n_thread, int flag_nfft, int fixed_nfft)
Defines the number of simultaneous ffts for the Toeplitz matrix product computation.
n_thread | number of omp threads |
flag_nfft | flag to set the strategy to define nfft |
fixed_nfft | fixed nfft value if needed (used for the case where flag_nfft=1) |
Definition at line 220 of file toeplitz.c.
int fftw_init_omp_threads (int fftw_n_thread)
Initialize omp threads for fftw plans.
The number of threads used for FFTs (defined by the variable n_thread) is read from the OMP_NUM_THREADS environment variable. The fftw multithreaded option is controlled by the fftw_MULTITHREADING macro.
Definition at line 333 of file toeplitz.c.
int rhs_init_fftw (int *nfft, int fft_size, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b, int fftw_flag)
Initializes fftw array and plan for the right hand side, general matrix V.
Initialize fftw array and plan for the right hand side matrix V.
nfft | maximum number of FFTs you want to compute at the same time |
fft_size | effective FFT size for the general matrix V (usually equal to blocksize) |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
fftw_flag | fftw plan allocation flag |
Definition at line 365 of file toeplitz.c.
int circ_init_fftw (double *T, int fft_size, int lambda, fftw_complex **T_fft)
Initializes fftw array and plan for the circulant matrix T_circ obtained from T.
Builds the circulant matrix T_circ from T and initializes its fftw arrays and plans. Use tpltz_cleanup afterwards.
T | Toeplitz matrix. |
fft_size | effective FFT size for the circulant matrix (usually equal to blocksize) |
lambda | Toeplitz band width. |
T_fft | complex array used for FFTs. |
Definition at line 392 of file toeplitz.c.
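As an illustration of how a circulant matrix can be obtained from T, the sketch below builds the first column of a circulant embedding of size fft_size. It assumes a symmetric band Toeplitz matrix whose non-zero first-row entries are T[0..lambda-1]; the function name build_circulant_column is hypothetical and the actual layout used by circ_init_fftw may differ. The FFT of this column, which the real routine stores in T_fft, is not computed here.

```c
#include <assert.h>

/* Hypothetical helper (not part of MIDAPACK): first column of the circulant
   embedding of a symmetric band Toeplitz matrix.
   Assumption: T[0..lambda-1] holds the diagonal and the lambda-1 off-diagonals. */
void build_circulant_column(const double *T, int lambda, int fft_size, double *c)
{
    assert(lambda >= 1 && fft_size >= 2 * lambda - 1);

    for (int i = 0; i < fft_size; i++)
        c[i] = 0.0;

    c[0] = T[0];                    /* diagonal */
    for (int k = 1; k < lambda; k++) {
        c[k] = T[k];                /* lower band */
        c[fft_size - k] = T[k];     /* upper band, wrapped around */
    }
}
```

For T = {3, 2, 1}, lambda = 3 and fft_size = 8, the column is {3, 2, 1, 0, 0, 0, 1, 2}.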
int scmm_direct (int fft_size, int nfft, fftw_complex *C_fft, int ncol, double *V_rfft, double **CV, fftw_complex *V_fft, fftw_plan plan_f_V, fftw_plan plan_b_CV)
Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans.
Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans: forward (plan_f_V) and backward (plan_b_CV). C_fft is the Fourier (complex) representation of the circulant matrix, of length fft_size/2+1; V_rfft is a matrix with ncol columns and fft_size rows; V_fft is a workspace of fft_size/2+1 complex numbers, as required by the backward FFT (plan_b_CV); CV is the output matrix, of the same size as the input V_rfft. The FFTs transform ncol vectors simultaneously.
fft_size | row dimension | |
nfft | number of simultaneous FFTs | |
C_fft | complex array used for FFTs | |
ncol | column dimension | |
V_rfft | real array used for FFTs | |
[out] | CV | product of the circulant matrix C_fft by the matrix V_rfft |
V_fft | complex array used for FFTs | |
plan_f_V | fftw plan forward (r2c) | |
plan_b_CV | fftw plan backward (c2r) |
Definition at line 509 of file toeplitz.c.
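For checking results on small inputs, it can help to have a reference that computes the same circulant product without any FFT. The sketch below does this with a direct circular convolution, column by column. It is not how scmm_direct works internally (that routine uses the fftw plans); the function name circulant_times_matrix is hypothetical, and it takes the first column c of the circulant matrix in real space rather than its Fourier representation C_fft.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference (not part of MIDAPACK): CV = C * V where C is the
   circulant matrix with first column c[0..n-1], i.e. C(i,k) = c[(i-k) mod n].
   V and CV hold ncol columns of n rows, stored column-wise: V(i,j) = V[i + j*n]. */
void circulant_times_matrix(const double *c, int n, int ncol,
                            const double *V, double *CV)
{
    assert(n > 0 && ncol > 0);

    for (int j = 0; j < ncol; j++) {
        for (int i = 0; i < n; i++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += c[(i - k + n) % n] * V[k + (size_t)j * n];
            CV[i + (size_t)j * n] = s;
        }
    }
}
```

The cost is n*n operations per column, so this is only suitable for validation on small sizes.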
int scmm_basic (double **V, int blocksize, int m, fftw_complex *C_fft, double **CV, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f_V, fftw_plan plan_b_CV)
Performs the product of a circulant matrix by a matrix using FFTs (an INTERNAL routine)
This routine multiplies a circulant matrix, represented by C_fft, by a general matrix V, and stores the output as a matrix CV. In addition, the routine requires two workspace objects, V_fft and V_rfft, to be allocated before it is called, as well as two fftw plans: one forward (plan_f_V) and one backward (plan_b_CV). The input general matrix V and the output CV both have blocksize rows and m columns. They are stored as vectors in column-wise order. The circulant matrix, which is assumed to be band-diagonal with a band-width lambda, is represented by its Fourier transform, with the coefficients stored in a vector C_fft (length blocksize). blocksize also defines the size of the FFTs to be performed, and is therefore the value to use when creating the fftw plans and allocating the workspaces. The workspace sizes are nfft*(blocksize/2+1) for V_fft and nfft*blocksize for V_rfft. The fftw plans should correspond to transforming nfft vectors simultaneously. Typically, the parameters of this routine are fixed by a preceding call to Toeplitz_init(). The parameters are:
V | matrix (with the convention V(i,j)=V[i+j*n]) | |
blocksize | row dimension of V | |
m | column dimension of V | |
C_fft | complex array used for FFTs (FFT of the Toeplitz matrix) | |
[out] | CV | product of the circulant matrix C_fft by the matrix V |
V_fft | complex array used for FFTs | |
V_rfft | real array used for FFTs | |
nfft | number of simultaneous FFTs | |
plan_f_V | fftw plan forward (r2c) | |
plan_b_CV | fftw plan backward (c2r) |
Definition at line 587 of file toeplitz.c.
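The workspace sizes quoted above follow from the real-to-complex layout, where a real vector of length blocksize transforms into blocksize/2+1 complex coefficients. A minimal sketch of the size computation is given below; the struct WorkspaceLen and the function workspace_len are hypothetical names used only for illustration, and the lengths are counted in elements (complex numbers for V_fft, doubles for V_rfft), not in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MIDAPACK): element counts of the two
   workspaces for nfft simultaneous transforms of size blocksize. */
typedef struct {
    size_t v_fft_len;   /* number of complex elements in V_fft */
    size_t v_rfft_len;  /* number of real elements in V_rfft   */
} WorkspaceLen;

WorkspaceLen workspace_len(int nfft, int blocksize)
{
    assert(nfft > 0 && blocksize > 0);

    WorkspaceLen w;
    w.v_fft_len  = (size_t)nfft * (size_t)(blocksize / 2 + 1);
    w.v_rfft_len = (size_t)nfft * (size_t)blocksize;
    return w;
}
```

For nfft = 4 and blocksize = 64 this gives 132 complex elements and 256 real elements.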
int stmm_core (double **V, int n, int m, double *T, fftw_complex *T_fft, int blocksize, int lambda, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f, fftw_plan plan_b, int flag_offset, int flag_nofft)
Performs the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm. (an INTERNAL routine)
The product is performed block-by-block, with either a defined block size or a computed optimized block size that reflects a trade-off between the cost of a single FFT of length blocksize and the number of blocks needed to perform the multiplication. The latter determines how many spurious values are computed due to the overlaps between the blocks. Use flag_offset=0 for the "classic" algorithm and flag_offset=1 to apply an offset that avoids the first and last lambda terms. This is useful when the data were reshaped beforehand with the optimal number of columns for a given nfft. Better be inside the arguments of the routine. The parameters are:
V | [input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows of V |
m | number of columns of V |
T | Toeplitz matrix data composed of the non-zero entries of its first row |
T_fft | complex array used for FFTs |
blocksize | block size used in the sliding window algorithm |
lambda | Toeplitz band width |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
nfft | number of simultaneous FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
flag_offset | flag to avoid extra 2*lambda padding to zeros on the edges |
flag_nofft | flag to do product without using fft |
Definition at line 659 of file toeplitz.c.
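The sliding window idea can be illustrated on a single vector without FFTs. In the sketch below, each window of blocksize samples is multiplied by the circulant embedding of T, and only the interior rows, which are free of wrap-around contamination, are kept. Several points are assumptions made for the illustration: the function names are hypothetical, T is taken to be symmetric with non-zero first-row entries T[0..lambda-1], the overlap is lambda samples on each side of a window, the output goes to a separate array rather than overwriting V, and a direct circular product stands in for the FFT-based one.

```c
#include <assert.h>
#include <stdlib.h>

/* Reference: y = T v for a symmetric band Toeplitz matrix (hypothetical helper). */
void toeplitz_direct(const double *T, int lambda, const double *v, double *y, int n)
{
    for (int i = 0; i < n; i++) {
        double s = T[0] * v[i];
        for (int k = 1; k < lambda; k++) {
            if (i - k >= 0) s += T[k] * v[i - k];
            if (i + k < n)  s += T[k] * v[i + k];
        }
        y[i] = s;
    }
}

/* Sliding window product (hypothetical sketch). Returns 0 on success. */
int toeplitz_sliding(const double *T, int lambda, const double *v, double *y,
                     int n, int blocksize)
{
    assert(lambda >= 1 && blocksize > 2 * lambda);

    int step = blocksize - 2 * lambda;  /* valid output rows per window */
    double *win = malloc(sizeof *win * (size_t)blocksize);
    if (win == NULL)
        return -1;

    for (int start = 0; start < n; start += step) {
        /* window covers samples [start-lambda, start-lambda+blocksize),
           with zeros outside the data range */
        for (int i = 0; i < blocksize; i++) {
            int g = start - lambda + i;
            win[i] = (g >= 0 && g < n) ? v[g] : 0.0;
        }
        /* circulant product restricted to the interior rows of the window */
        for (int r = 0; r < step && start + r < n; r++) {
            int i = lambda + r;  /* local row index inside the window */
            double s = T[0] * win[i];
            for (int k = 1; k < lambda; k++) {
                s += T[k] * win[(i - k + blocksize) % blocksize];
                s += T[k] * win[(i + k) % blocksize];
            }
            y[start + r] = s;
        }
    }

    free(win);
    return 0;
}
```

Each window recomputes 2*lambda overlapping samples, which is the overhead that the block size choice trades against the cost of a larger transform.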
int stmm_main (double **V, int n, int m, int id0, int l, double *T, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft, Flag flag_stgy)
Performs the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping. (an INTERNAL routine)
The input matrix is reformatted into an optimized matrix whose shape depends on the block size and the number of simultaneous FFTs (defined by the variable nfft). The resulting number of columns is the number of vectors whose FFTs are computed simultaneously. The multiplication is then performed block-by-block with the chosen block size using the core routine. The parameters are:
V | [input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows of V |
m | number of columns of V |
id0 | first index of V |
l | length of V |
T | Toeplitz matrix data composed of the non-zero entries of its first row |
T_fft | complex array used for FFTs |
lambda | Toeplitz band width |
V_fft | complex array used for FFTs |
V_rfft | real array used for FFTs |
plan_f | fftw plan forward (r2c) |
plan_b | fftw plan backward (c2r) |
blocksize | block size |
nfft | number of simultaneous FFTs |
flag_stgy | flag strategy for the product computation |
Definition at line 803 of file toeplitz.c.
int build_gappy_blocks (int nrow, int m, Block *tpltzblocks, int nb_blocks_local, int nb_blocks_all, int64_t *id0gap, int *lgap, int ngap, Block *tpltzblocks_gappy, int *nb_blocks_gappy_final, int flag_param_distmin_fixed)
Build the gappy Toeplitz block structure to optimise the product computation at gap locations.
For each significant gap, the block that contains it is cut and split at the gap's edges, which reduces the total row size of the floating blocks. The routine takes into account the minimum correlation length and a parameter that controls the minimum gap size for which a block is split. In some cases, the gap can be partially reduced to fit the minimum block size needed for the computation, or simply for performance reasons. This relies on the gaps having been set to zero before this routine is called.
nrow | number of rows of the global data matrix V |
m | number of columns for the data matrix V in the global rowwise order |
tpltzblocks | list of the Toeplitz block structures, each with its own parameters (idv, n, T_block, lambda). |
nb_blocks_local | number of Toeplitz blocks as stored in T |
nb_blocks_all | number of all Toeplitz blocks on the diagonal of the full Toeplitz matrix |
id0gap | index of the first element of each defined gap |
lgap | length of each defined gap |
ngap | number of defined gaps |
tpltzblocks_gappy | list of the gappy Toeplitz block structures, each with its own parameters |
nb_blocks_gappy_final | real number of obtained gappy Toeplitz blocks |
flag_param_distmin_fixed | flag to define the minimum gap value allowed to split a Toeplitz block |
Definition at line 231 of file toeplitz_gappy.c.
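The splitting step can be pictured with one block and one gap. The sketch below is a strong simplification and not the logic of build_gappy_blocks: the struct SketchBlock keeps only the idv and n fields, the function name split_block_at_gap and the threshold distmin are hypothetical, and the partial reduction of a gap to respect a minimum block size is not modelled. A gap that lies strictly inside the block and is at least distmin long yields two sub-blocks on either side of it; otherwise the block is returned unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified block descriptor (not the MIDAPACK Block type). */
typedef struct {
    int64_t idv;  /* global index of the first row of the block */
    int n;        /* number of rows in the block */
} SketchBlock;

/* Splits block b around the gap [id0gap, id0gap+lgap) when the gap lies
   strictly inside the block and is at least distmin long (assumed rule).
   Writes 1 or 2 blocks to out and returns how many were written. */
int split_block_at_gap(SketchBlock b, int64_t id0gap, int lgap, int distmin,
                       SketchBlock out[2])
{
    assert(b.n > 0 && lgap >= 0);

    int64_t block_end = b.idv + b.n;
    int64_t gap_end = id0gap + lgap;
    int inside = (id0gap > b.idv) && (gap_end < block_end);

    if (!inside || lgap < distmin) {
        out[0] = b;  /* gap too short or not interior: keep the block whole */
        return 1;
    }

    out[0].idv = b.idv;
    out[0].n = (int)(id0gap - b.idv);     /* rows before the gap */
    out[1].idv = gap_end;
    out[1].n = (int)(block_end - gap_end);  /* rows after the gap */
    return 2;
}
```

Because the samples inside a gap are zero, dropping those rows does not change the product on the remaining rows as long as the gap is at least as long as the correlation length.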
int stmm_simple_basic (double **V, int n, int m, double *T, int lambda, double **TV)
Perform the product of a Toeplitz matrix by a matrix without using FFTs.
This routine multiplies the values directly. It exploits the fact that the bandwidth is small compared to the matrix size. The number of operations is then no more than (lambda*2-1) multiplications and (lambda*2-1)-1 additions per row.
Definition at line 73 of file toeplitz_nofft.c.
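A direct band product of this kind can be sketched as follows. This is an illustration rather than the code of stmm_simple_basic: the function name toeplitz_times_matrix_direct is hypothetical, plain pointers replace the double ** arguments, and T is assumed symmetric with non-zero first-row entries T[0..lambda-1]. Each output row uses one diagonal term plus at most 2*(lambda-1) off-diagonal terms, which matches the (lambda*2-1) multiplications quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch (not part of MIDAPACK): TV = T * V without FFTs.
   V and TV hold m columns of n rows, stored column-wise: V(i,j) = V[i + j*n]. */
void toeplitz_times_matrix_direct(const double *V, int n, int m,
                                  const double *T, int lambda, double *TV)
{
    assert(n > 0 && m > 0 && lambda >= 1);

    for (int j = 0; j < m; j++) {
        const double *v = V + (size_t)j * n;
        double *tv = TV + (size_t)j * n;
        for (int i = 0; i < n; i++) {
            double s = T[0] * v[i];             /* diagonal term */
            for (int k = 1; k < lambda; k++) {  /* k-th off-diagonals */
                if (i - k >= 0) s += T[k] * v[i - k];
                if (i + k < n)  s += T[k] * v[i + k];
            }
            tv[i] = s;
        }
    }
}
```

For n = 4, lambda = 2, T = {2, 1} and the column {1, 2, 3, 4}, the result is {4, 8, 12, 11}.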
int stmm_simple_core (double **V, int n, int m, double *T, int blocksize, int lambda, int nfft, int flag_offset)
Perform the stand alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.
The product is performed block-by-block with a defined block size or a computed optimized block size. This routine is not used by the API.
V | [input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV |
n | number of rows of V |
m | number of columns of V |
T | Toeplitz matrix data composed of the non-zero entries of its first row |
blocksize | block size used in the sliding window algorithm |
lambda | Toeplitz band width |
nfft | number of simultaneous FFTs |
flag_offset | flag to avoid extra 2*lambda padding to zeros on the edges |
Definition at line 128 of file toeplitz_nofft.c.
int flag_stgy_init_auto (Flag *flag_stgy)
Set the flag to automatic parameters.
flag_stgy | flag strategy for the product computation |
Definition at line 73 of file toeplitz_params.c.
int flag_stgy_init_zeros (Flag *flag_stgy)
Set the flag parameters to zeros. This is almost the same as automatic.
flag_stgy | flag strategy for the product computation |
Definition at line 91 of file toeplitz_params.c.
int flag_stgy_init_defined (Flag *flag_stgy)
Set the parameters flag to the defined ones.
flag_stgy | flag strategy for the product computation |
Definition at line 106 of file toeplitz_params.c.
int print_flag_stgy_init (Flag flag_stgy)
Print the flag parameters values.
flag_stgy | flag strategy for the product computation |
Definition at line 131 of file toeplitz_params.c.