Functions
int	optimal_blocksize (int n, int lambda, int bs_flag)
	Compute an optimal block size value used in the sliding windows algorithm.
int	fftw_init_omp_threads ()
	Initialize omp threads for fftw plans.
int	rhs_init_fftw (int *nfft, int fft_size, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b, int fftw_flag)
	Initialize fftw array and plan for the right hand side matrix V.
int	circ_init_fftw (double *T, int fft_size, int lambda, fftw_complex **T_fft)
	Initialize fftw array and plan for the circulant matrix T_circ obtained from T.
int	scmm_direct (int fft_size, fftw_complex *C_fft, int ncol, double *V_rfft, double **CV, fftw_complex *V_fft, fftw_plan plan_f_V, fftw_plan plan_b_CV)
	Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans.
int	scmm_basic (double **V, int blocksize, int m, fftw_complex *C_fft, int lambda, double **CV, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f_V, fftw_plan plan_b_CV)
	Perform the product of a circulant matrix by a matrix using FFTs.
int	stmm_reshape (double **V, int n, int m, int id0, int l, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft)
	Reshape the data structure to optimize the Toeplitz matrix-matrix product computed by the sliding window algorithm, and compute the product using the core routine.
int	build_gappy_blocks (int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int *id0gap, int *lgap, int ngap, int *nb_blocks_gappy_final, double *Tgappy, int *idvgappy, int *ngappy, int *lambdagappy, int flag_param_distmin_fixed)
	Build the gappy Toeplitz block structure to optimise the product computation at gap locations.

Detailed Description

These are low-level routines.

Function Documentation

int build_gappy_blocks	(	int *	n,
		int	m,
		int	nrow,
		double *	T,
		int	nb_blocks_local,
		int	nb_blocks_all,
		int *	lambda,
		int *	idv,
		int *	id0gap,
		int *	lgap,
		int	ngap,
		int *	nb_blocks_gappy_final,
		double *	Tgappy,
		int *	idvgappy,
		int *	ngappy,
		int *	lambdagappy,
		int	flag_param_distmin_fixed
	)

Build the gappy Toeplitz block structure to optimise the product computation at gap locations.

Considering the significant gaps, the blocks to which they belong are cut and split at the gap edges to reduce the total row size of the floating blocks. The routine takes the minimum correlation length into account, and a parameter controls the minimum gap size allowed for block splitting. In some cases, a gap can be partially reduced to fit the minimum block size needed for the computation, or simply for performance reasons. This relies on the fact that the gaps are set to zero in the main routine.

Definition at line 1707 of file toeplitz.c.

int circ_init_fftw	(	double *	T,
		int	fft_size,
		int	lambda,
		fftw_complex **	T_fft
	)

Initialize fftw array and plan for the circulant matrix T_circ obtained from T.

Build the circulant matrix T_circ from T and initialize its fftw arrays and plans. Call tpltz_cleanup afterwards.

See also: tpltz_cleanup

Parameters:

T	Toeplitz matrix.
fft_size	effective FFT size for the circulant matrix (usually equal to blocksize)
lambda	Toeplitz band width.
T_fft	complex array used for FFTs.
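
To make the construction of T_circ concrete, here is a minimal sketch of one common way to embed a band Toeplitz matrix into a circulant one. It assumes a symmetric Toeplitz matrix whose first row starts with T[0..lambda-1] and an fft_size of at least 2*lambda-1. The helper name build_circulant_column and the output array t_circ are hypothetical and are not part of toeplitz.c; the library's own layout may differ.

```c
#include <assert.h>

/* Hypothetical helper, not part of toeplitz.c.
 * Fills t_circ (length fft_size) with the first column of a circulant
 * matrix that embeds a symmetric band Toeplitz matrix. T holds the
 * lambda non-zero coefficients T[0..lambda-1]. */
void build_circulant_column(const double *T, int lambda, int fft_size,
                            double *t_circ)
{
    /* Assumption: the two mirrored bands must not overlap. */
    assert(lambda >= 1 && fft_size >= 2 * lambda - 1);

    for (int i = 0; i < fft_size; i++)
        t_circ[i] = 0.0;

    t_circ[0] = T[0];
    for (int i = 1; i < lambda; i++) {
        t_circ[i] = T[i];            /* lower band */
        t_circ[fft_size - i] = T[i]; /* upper band, wrapped around */
    }
}
```

With T = {4, 2, 1}, lambda = 3 and fft_size = 8, the column is {4, 2, 1, 0, 0, 0, 1, 2}. The Fourier transform of such a column is what a complex array like T_fft would hold.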

Definition at line 281 of file toeplitz.c.

int fftw_init_omp_threads ( )

Initialize omp threads for fftw plans.

The number of threads used for FFTs (defined by the variable n_thread) is read from the OMP_NUM_THREADS environment variable. The fftw multithreaded option is controlled by the fftw_MULTITHREADING macro.
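
As an illustration of reading a thread count from the environment, the sketch below parses an environment variable and falls back to a default when it is unset or invalid. The function threads_from_env is hypothetical; the actual routine may obtain the value differently (for example through the OpenMP runtime).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of toeplitz.c.
 * Returns the positive integer stored in the environment variable
 * `name`, or `fallback` if the variable is unset, empty or invalid. */
int threads_from_env(const char *name, int fallback)
{
    assert(fallback >= 1);

    const char *s = getenv(name);
    if (s == NULL || *s == '\0')
        return fallback;

    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (end == s || v < 1 || v > 65536) /* not a number or out of range */
        return fallback;

    return (int)v;
}
```

A call such as threads_from_env("OMP_NUM_THREADS", 1) would then give a value that could be assigned to a variable like n_thread.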

Definition at line 217 of file toeplitz.c.

int optimal_blocksize	(	int	n,
		int	lambda,
		int	bs_flag
	)

Compute an optimal block size value used in the sliding windows algorithm.

The optimal block size is computed as the smallest power of two above 3*lambda, i.e. the smallest value 2^x, with x an integer, that exceeds 3*lambda. If bs_flag is set to one, a different formula is used to compute the optimal block size (see "MADmap: A Massively Parallel Maximum Likelihood Cosmic Microwave Background Map-Maker", C. M. Cantalupo, J. D. Borrill, A. H. Jaffe, T. S. Kisner, and R. Stompor, The Astrophysical Journal Supplement Series, 187:212–227, 2010 March). To avoid using a block size much bigger than the matrix, the block size is set to 3*lambda when the previously computed size is bigger than the matrix size n. This case occurs mostly for matrices that are small compared to their bandwidth.

Parameters:

n	matrix row dimension
lambda	half bandwidth of the Toeplitz matrix
bs_flag	flag to use a different formula for optimal block size computation
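
The default rule described above (bs_flag not set) can be sketched as follows. This is a minimal illustration under the stated description, not the library's code; the name optimal_blocksize_sketch is hypothetical, and the alternative MADmap formula selected by bs_flag is not reproduced here.

```c
#include <assert.h>

/* Hypothetical sketch of the default block size rule, not the
 * implementation found in toeplitz.c. */
int optimal_blocksize_sketch(int n, int lambda)
{
    assert(n > 0 && lambda > 0);

    int target = 3 * lambda;
    int bs = 1;

    /* Smallest power of two above 3*lambda. Since 3*lambda is never a
     * power of two, "above" and "at least" give the same result. */
    while (bs < target)
        bs *= 2;

    /* Avoid a block much larger than the matrix itself. */
    if (bs > n)
        bs = target;

    return bs;
}
```

For example, n = 1000 and lambda = 10 give a block size of 32, while n = 20 and lambda = 10 fall back to 3*lambda = 30.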

Definition at line 144 of file toeplitz.c.

int rhs_init_fftw	(	int *	nfft,
		int	fft_size,
		fftw_complex **	V_fft,
		double **	V_rfft,
		fftw_plan *	plan_f,
		fftw_plan *	plan_b,
		int	fftw_flag
	)

Initialize fftw array and plan for the right hand side matrix V.

Parameters:

nfft	maximum number of FFTs you want to compute at the same time
fft_size	effective FFT size for the general matrix V (usually equal to blocksize)
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)
fftw_flag	fftw plan allocation flag

Definition at line 254 of file toeplitz.c.

int scmm_basic	(	double **	V,
		int	blocksize,
		int	m,
		fftw_complex *	C_fft,
		int	lambda,
		double **	CV,
		fftw_complex *	V_fft,
		double *	V_rfft,
		int	nfft,
		fftw_plan	plan_f_V,
		fftw_plan	plan_b_CV
	)

Perform the product of a circulant matrix by a matrix using FFTs.

This routine multiplies a circulant matrix, represented by C_fft, by a general matrix V, and stores the output as a matrix CV. In addition, the routine requires two workspace objects, V_fft and V_rfft, to be allocated prior to the call, as well as two fftw plans: one forward (plan_f_V) and one backward (plan_b_CV). The input general matrix V and the output CV both have blocksize rows and m columns. They are stored as vectors in column-wise order.

The circulant matrix, which is assumed to be band-diagonal with a bandwidth lambda, is represented by its Fourier transform, with the coefficients stored in a vector C_fft (length blocksize). blocksize also defines the size of the FFTs to be performed, and is therefore the value that has to be used when creating the fftw plans and allocating the workspaces. The workspace sizes are nfft*(blocksize/2+1) for V_fft and nfft*blocksize for V_rfft. The fftw plans should correspond to transforming nfft vectors simultaneously. Typically, the parameters of this routine are fixed by a preceding call to Toeplitz_init().

Parameters:

	V	matrix (with the convention V(i,j)=V[i+j*n])
	blocksize	row dimension of V
	m	column dimension of V
	C_fft	complex array used for FFTs (FFT of the Toeplitz matrix)
	lambda	half bandwidth of the Toeplitz matrix
[out]	CV	product of the circulant matrix C_fft by the matrix V_rfft
	V_fft	complex array used for FFTs
	V_rfft	real array used for FFTs
	nfft	number of simultaneous FFTs
	plan_f_V	fftw plan forward (r2c)
	plan_b_CV	fftw plan backward (c2r)
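
The workspace sizes and the column-wise storage convention quoted above can be written out as small helpers. This is only a sketch of the arithmetic; the function names are hypothetical, and the actual allocation in the library is done with fftw types that are not used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of toeplitz.c. */

/* Number of complex elements needed for V_fft. */
size_t v_fft_len(int nfft, int blocksize)
{
    assert(nfft > 0 && blocksize > 0);
    return (size_t)nfft * (size_t)(blocksize / 2 + 1);
}

/* Number of real elements needed for V_rfft. */
size_t v_rfft_len(int nfft, int blocksize)
{
    assert(nfft > 0 && blocksize > 0);
    return (size_t)nfft * (size_t)blocksize;
}

/* Column-wise access: element (i,j) of a matrix with nrow rows. */
double mat_get(const double *V, int nrow, int i, int j)
{
    assert(i >= 0 && i < nrow && j >= 0);
    return V[(size_t)i + (size_t)j * (size_t)nrow];
}
```

With nfft = 4 and blocksize = 64, V_fft needs 4*(32+1) = 132 complex elements and V_rfft needs 256 real elements.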

Definition at line 449 of file toeplitz.c.

int scmm_direct	(	int	fft_size,
		fftw_complex *	C_fft,
		int	ncol,
		double *	V_rfft,
		double **	CV,
		fftw_complex *	V_fft,
		fftw_plan	plan_f_V,
		fftw_plan	plan_b_CV
	)

Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans.

Performs the product of a circulant matrix C_fft by a matrix V_rfft using fftw plans: forward (plan_f_V) and backward (plan_b_CV). C_fft is the Fourier (complex) representation of the circulant matrix, of length fft_size/2+1; V_rfft is a matrix with ncol columns and fft_size rows; V_fft is a workspace of fft_size/2+1 complex numbers, as required by the backward FFT (plan_b_CV); CV is the output matrix, of the same size as the input V_rfft. The FFTs transform ncol vectors simultaneously.

Parameters:

	fft_size	row dimension
	C_fft	complex array used for FFTs
	ncol	column dimension
	V_rfft	real array used for FFTs
[out]	CV	product of the circulant matrix C_fft by the matrix V_rfft
	V_fft	complex array used for FFTs
	plan_f_V	fftw plan forward (r2c)
	plan_b_CV	fftw plan backward (c2r)
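
For checking results on small sizes, the same product can be computed directly, without FFTs. The sketch below is a plain O(n^2) reference per column and is not how scmm_direct works internally. It assumes the circulant matrix is given by its first column c in real space (rather than by C_fft) and that matrices are stored column-wise; the name circulant_product_ref is hypothetical.

```c
#include <assert.h>

/* Hypothetical reference routine, not part of toeplitz.c.
 * Computes CV = C * V, where C is the n-by-n circulant matrix with
 * first column c, and V, CV are n-by-ncol matrices stored column-wise. */
void circulant_product_ref(const double *c, int n, const double *V,
                           int ncol, double *CV)
{
    assert(n > 0 && ncol > 0);

    for (int j = 0; j < ncol; j++) {
        for (int i = 0; i < n; i++) {
            double acc = 0.0;
            for (int k = 0; k < n; k++) {
                int idx = (i - k + n) % n; /* C(i,k) = c[(i-k) mod n] */
                acc += c[idx] * V[k + j * n];
            }
            CV[i + j * n] = acc;
        }
    }
}
```

With c = {1, 2, 3}, multiplying by the unit vector {1, 0, 0} returns the first column {1, 2, 3}, and multiplying by {1, 1, 1} returns {6, 6, 6}.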

Definition at line 397 of file toeplitz.c.

int stmm_reshape	(	double **	V,
		int	n,
		int	m,
		int	id0,
		int	l,
		fftw_complex *	T_fft,
		int	lambda,
		fftw_complex *	V_fft,
		double *	V_rfft,
		fftw_plan	plan_f,
		fftw_plan	plan_b,
		int	blocksize,
		int	nfft
	)

Reshape the data structure to optimize the Toeplitz matrix-matrix product computed by the sliding window algorithm, and compute the product using the core routine.

The input matrix is reformatted into an optimized matrix whose shape depends on the chosen block size and the number of simultaneous FFTs (the variable nfft). The resulting number of columns is the number of vectors whose FFTs are computed simultaneously. The product is then performed block by block, with the chosen block size, using the core routine.

Parameters:

V	[input] data matrix (with the convention V(i,j)=V[i+j*n]); [out] result of the product TV
n	number of rows of V
m	number of columns of V
id0	first index of V
l	length of V
T_fft	complex array used for FFTs
lambda	Toeplitz band width
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)
blocksize	block size used in the sliding window algorithm
nfft	number of simultaneous FFTs

Note: flag_offset=0 must be passed as a parameter to the stmm_core routine.

Definition at line 635 of file toeplitz.c.
