Functions
int	tpltz_init (int n, int lambda, int *nfft, int *blocksize, fftw_complex **T_fft, double *T, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b)
	Initialize block size and all the fftw arrays and plans needed for the computation.
int	tpltz_cleanup (fftw_complex **T_fft, fftw_complex **V_fft, double **V_rfft, fftw_plan *plan_f, fftw_plan *plan_b)
	Clean the fftw workspace used in the computation of the Toeplitz matrix-matrix product.
int	stmm_core (double **V, int n, int m, fftw_complex *T_fft, int blocksize, int lambda, fftw_complex *V_fft, double *V_rfft, int nfft, fftw_plan plan_f, fftw_plan plan_b, int flag_offset)
	Perform the stand-alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.
int	stmm (double **V, int n, int m, int id0, int l, fftw_complex *T_fft, int lambda, fftw_complex *V_fft, double *V_rfft, fftw_plan plan_f, fftw_plan plan_b, int blocksize, int nfft)
	Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping.
int	reset_gaps (double **V, int id0, int local_V_size, int m, int nrow, int *id0gap, int *lgap, int ngap)
	Set the data to zero at the gap locations.
int	stbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int idp, int local_V_size)
	Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way.
int	gstbmm (double **V, int *n, int m, int nrow, double *T, int nb_blocks_local, int nb_blocks_all, int *lambda, int *idv, int id0p, int local_V_size, int *id0gap, int *lgap, int ngap)
	Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which mark data that are not needed for the computation. The gap indexes are defined in the global time space, like those of the generalized Toeplitz matrix, that is, along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, and the blocks can differ from one another.

Detailed Description

These are shared-memory routines.

Function Documentation

int gstbmm	(	double **	V,
		int *	n,
		int	m,
		int	nrow,
		double *	T,
		int	nb_blocks_local,
		int	nb_blocks_all,
		int *	lambda,
		int *	idv,
		int	id0p,
		int	local_V_size,
		int *	id0gap,
		int *	lgap,
		int	ngap
	)

Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way. The matrix V contains defined gaps, which mark data that are not needed for the computation. The gap indexes are defined in the global time space, like those of the generalized Toeplitz matrix, that is, along the row dimension. Each diagonal block of T is a symmetric, band-diagonal Toeplitz matrix, and the blocks can differ from one another.

We first rebuild the Toeplitz block matrix structure to reduce the computation cost and to skip the computation of the values inside the defined gaps. Then each process performs the multiplication sequentially for each gappy block, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. The parameters are:

Parameters:

V	[input] distributed data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV
n	number of rows for each Toeplitz block as stored in T
m	number of columns of the global data matrix V
nrow	number of rows of the global data matrix V
T	Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps.
nb_blocks_all	number of all Toeplitz blocks on the diagonal of the full Toeplitz matrix
nb_blocks_local	number of Toeplitz blocks as stored in T
lambda	half bandwidth size for each Toeplitz block stored in T
idv	global row index giving, for each Toeplitz block stored in T, the first element of the interval to which that Toeplitz block is applied
id0p	global index of the first element of the local part of V
local_V_size	number of all elements in local V
id0gap	index of the first element of each defined gap
lgap	length of each defined gap
ngap	number of defined gaps

Definition at line 253 of file toeplitz_seq.c.

int reset_gaps	(	double **	V,
		int	id0,
		int	local_V_size,
		int	m,
		int	nrow,
		int *	id0gap,
		int *	lgap,
		int	ngap
	)

Set the data to zero at the gap locations.

The data located within the gaps are set to zero. The gaps are defined in the time domain, meaning their indexes are defined in the row dimension.

Definition at line 1680 of file toeplitz.c.
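
The gap convention described for reset_gaps can be illustrated with a small stand-alone sketch. It is not the library routine: the helper name zero_gaps_sketch and its simplified signature (a plain double * instead of double **, no m argument) are hypothetical, and it assumes that V is stored column-major with nrow rows per column and that the local chunk holds the global elements id0 to id0+local_size-1.

```c
#include <stddef.h>

/* Hypothetical helper, not the library's reset_gaps.
 * Assumptions: V is column-major with nrow rows per column; the local chunk
 * holds global elements id0 .. id0+local_size-1; gap k covers the rows
 * id0gap[k] .. id0gap[k]+lgap[k]-1 of every column. */
int zero_gaps_sketch(double *V_local, int id0, int local_size, int nrow,
                     const int *id0gap, const int *lgap, int ngap)
{
    if (V_local == NULL || nrow <= 0 || id0 < 0)
        return 1;

    for (int i = 0; i < local_size; i++) {
        int row = (id0 + i) % nrow;      /* global row of local element i */
        for (int k = 0; k < ngap; k++) {
            if (row >= id0gap[k] && row < id0gap[k] + lgap[k]) {
                V_local[i] = 0.0;        /* element lies inside gap k */
                break;
            }
        }
    }
    return 0;
}
```

Because the gaps are expressed as row indexes, the same gap list applies to every column of V; only the mapping from the local offset to the global row changes from one process to the next.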

int stbmm	(	double **	V,
		int *	n,
		int	m,
		int	nrow,
		double *	T,
		int	nb_blocks_local,
		int	nb_blocks_all,
		int *	lambda,
		int *	idv,
		int	idp,
		int	local_V_size
	)

Performs the multiplication of a symmetric, Toeplitz block-diagonal matrix, T, by an arbitrary matrix, V, distributed over processes in the generalized column-wise way.

Each process performs the multiplication sequentially for each diagonal block, based on the sliding window algorithm. Prior to that, MPI calls are used to exchange data between neighboring processes. Each of the diagonal blocks is a symmetric, band-diagonal Toeplitz matrix, which can be different for each block. The parameters are:

Parameters:

V	[input] distributed data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV
n	number of rows for each Toeplitz block as stored in T
m	number of columns of the global data matrix V
nrow	number of rows of the global data matrix V
T	Toeplitz matrix data: the non-zero entries of the first row of each Toeplitz block, concatenated together. The blocks have to be arranged in increasing order of n, without repetitions or overlaps.
nb_blocks_all	number of all Toeplitz blocks on the diagonal of the full Toeplitz matrix
nb_blocks_local	number of Toeplitz blocks as stored in T
lambda	half bandwidth size for each Toeplitz block stored in T
idv	global row index giving, for each Toeplitz block stored in T, the first element of the interval to which that Toeplitz block is applied
idp	global index of the first element of the local part of V
local_V_size	number of all elements in local V

Definition at line 71 of file toeplitz_seq.c.
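
To make the storage conventions of T, n, lambda and idv concrete, here is a naive reference product written as a stand-alone sketch. It is not the library's stbmm: the function stbmm_naive_sketch is hypothetical, it runs sequentially on a whole V held in memory (no MPI, no sliding window), and it assumes that the coefficients of block k start at the offset lambda[0]+...+lambda[k-1] in T, with the diagonal entry first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical O(n*lambda) reference, not the library's stbmm.
 * Assumptions: V is column-major with nrow rows and m columns; block k acts
 * on rows idv[k] .. idv[k]+n[k]-1 of every column; its lambda[k] first-row
 * coefficients are stored contiguously in T, diagonal entry first. */
int stbmm_naive_sketch(double *V, const int *n, int m, int nrow,
                       const double *T, int nb_blocks,
                       const int *lambda, const int *idv)
{
    size_t total = (size_t)nrow * (size_t)m;
    double *out = malloc(total * sizeof *out);
    if (out == NULL)
        return 1;
    /* Choice made for this sketch: rows covered by no block stay unchanged. */
    memcpy(out, V, total * sizeof *out);

    int offset = 0;                       /* start of block k inside T */
    for (int k = 0; k < nb_blocks; k++) {
        const double *t = T + offset;
        for (int j = 0; j < m; j++) {
            const double *col = V + (size_t)j * nrow + idv[k];
            double *ocol = out + (size_t)j * nrow + idv[k];
            for (int i = 0; i < n[k]; i++) {
                double s = t[0] * col[i];
                for (int d = 1; d < lambda[k]; d++) {
                    if (i - d >= 0)
                        s += t[d] * col[i - d];   /* lower band */
                    if (i + d < n[k])
                        s += t[d] * col[i + d];   /* upper band (symmetry) */
                }
                ocol[i] = s;
            }
        }
        offset += lambda[k];
    }
    memcpy(V, out, total * sizeof *out);
    free(out);
    return 0;
}
```

A reference of this kind is mainly useful for checking the output of a fast implementation on small inputs.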

int stmm	(	double **	V,
		int	n,
		int	m,
		int	id0,
		int	l,
		fftw_complex *	T_fft,
		int	lambda,
		fftw_complex *	V_fft,
		double *	V_rfft,
		fftw_plan	plan_f,
		fftw_plan	plan_b,
		int	blocksize,
		int	nfft
	)

Perform the product of a Toeplitz matrix by a general matrix using the sliding window algorithm with optimized reshaping.

The input matrix is reshaped into an optimized matrix that depends on the block size and on the number of simultaneous FFTs (set by the variable nfft). The resulting number of columns is the number of vectors whose FFTs are computed simultaneously. The multiplication is then performed block by block with the chosen block size using the core routine. The parameters are:

Parameters:

V	[input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV
n	number of rows of V
m	number of columns of V
id0	first index of V
l	length of V
T_fft	complex array used for FFTs
lambda	Toeplitz band width
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)
blocksize	block size
nfft	number of simultaneous FFTs

Definition at line 833 of file toeplitz.c.

int stmm_core	(	double **	V,
		int	n,
		int	m,
		fftw_complex *	T_fft,
		int	blocksize,
		int	lambda,
		fftw_complex *	V_fft,
		double *	V_rfft,
		int	nfft,
		fftw_plan	plan_f,
		fftw_plan	plan_b,
		int	flag_offset
	)

Perform the stand-alone product of a Toeplitz matrix by a matrix using the sliding window algorithm.

The product is performed block by block, with either a defined block size or a computed, optimized block size that reflects a trade-off between the cost of a single FFT of length blocksize and the number of blocks needed to perform the multiplication. The latter determines how many spurious extra values are computed because of the overlaps between the blocks. Use flag_offset=0 for the "classic" algorithm and flag_offset=1 to apply an offset that avoids the first and last lambda terms. The offset is useful when the data were reshaped beforehand with the optimal number of columns for a given nfft. Better be inside the arguments of the routine. The parameters are:

Parameters:

V	[input] data matrix (with the convention V(i,j)=V[i+j*n]) ; [out] result of the product TV
n	number of rows of V
m	number of columns of V
T_fft	complex array used for FFTs
blocksize	block size used in the sliding window algorithm
lambda	Toeplitz band width
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
nfft	number of simultaneous FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)
flag_offset	flag to avoid extra 2*lambda padding to zeros on the edges

Definition at line 515 of file toeplitz.c.
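
The overlap between blocks mentioned above can be seen in a minimal stand-alone sketch of a sliding window product for a single column. It is not the library's stmm_core: the names circ_apply_sketch and sliding_window_sketch are hypothetical, the circular product inside each window is computed directly instead of with FFTs, and the sketch assumes that lambda-1 values are spoiled by wrap-around at each edge of a window.

```c
#include <stdlib.h>

/* Circular product of one window with a symmetric banded kernel. An
 * FFT-based product yields the same values; it is computed directly here
 * to keep the sketch free of external dependencies. */
void circ_apply_sketch(const double *t, int lambda,
                       const double *w, double *r, int bs)
{
    for (int i = 0; i < bs; i++) {
        double s = t[0] * w[i];
        for (int d = 1; d < lambda; d++)
            s += t[d] * (w[(i - d + bs) % bs] + w[(i + d) % bs]);
        r[i] = s;
    }
}

/* Hypothetical sliding window product v <- T v for one column of length n.
 * Assumption: each window of length bs loses lambda-1 values at each edge
 * to wrap-around, so it yields bs - 2*(lambda-1) valid outputs. */
int sliding_window_sketch(double *v, int n, const double *t,
                          int lambda, int bs)
{
    int ov = lambda - 1;        /* spoiled entries at each window edge */
    int eff = bs - 2 * ov;      /* valid outputs per window */
    if (n <= 0 || lambda < 1 || eff <= 0)
        return 1;

    double *w = malloc((size_t)bs * sizeof *w);
    double *r = malloc((size_t)bs * sizeof *r);
    double *out = malloc((size_t)n * sizeof *out);
    if (w == NULL || r == NULL || out == NULL) {
        free(w); free(r); free(out);
        return 1;
    }

    for (int start = 0; start < n; start += eff) {
        /* The window covers inputs start-ov .. start-ov+bs-1, padded with
         * zeros outside [0, n). */
        for (int i = 0; i < bs; i++) {
            int g = start - ov + i;
            w[i] = (g >= 0 && g < n) ? v[g] : 0.0;
        }
        circ_apply_sketch(t, lambda, w, r, bs);
        /* Keep only the central part, which is free of wrap-around. */
        for (int i = 0; i < eff && start + i < n; i++)
            out[start + i] = r[ov + i];
    }

    for (int i = 0; i < n; i++)
        v[i] = out[i];
    free(w); free(r); free(out);
    return 0;
}
```

A larger window wastes a smaller fraction of its length on the discarded edges but makes each transform more expensive, which is the trade-off the block size has to balance.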

int tpltz_cleanup	(	fftw_complex **	T_fft,
		fftw_complex **	V_fft,
		double **	V_rfft,
		fftw_plan *	plan_f,
		fftw_plan *	plan_b
	)

Clean the fftw workspace used in the computation of the Toeplitz matrix-matrix product.

Destroy fftw plans, free memory and reset fftw workspace.

See also: tpltz_init

Parameters:

T_fft	complex array used for FFTs
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)

Definition at line 324 of file toeplitz.c.

int tpltz_init	(	int	n,
		int	lambda,
		int *	nfft,
		int *	blocksize,
		fftw_complex **	T_fft,
		double *	T,
		fftw_complex **	V_fft,
		double **	V_rfft,
		fftw_plan *	plan_f,
		fftw_plan *	plan_b
	)

Initialize block size and all the fftw arrays and plans needed for the computation.

Initializing the fftw arrays and plans is necessary before any computation of the Toeplitz matrix-matrix product. Use tpltz_cleanup afterwards.

See also: tpltz_cleanup

Parameters:

n	row size of the matrix used for later product
lambda	Toeplitz band width
nfft	maximum number of FFTs you want to compute at the same time
blocksize	optimal block size used in the sliding window algorithm (computed as an optimized value)
T_fft	complex array used for FFTs
T	Toeplitz matrix
V_fft	complex array used for FFTs
V_rfft	real array used for FFTs
plan_f	fftw plan forward (r2c)
plan_b	fftw plan backward (c2r)

Definition at line 186 of file toeplitz.c.
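
How a block size might be chosen can be sketched with a simple cost model. This is only an illustration of the trade-off between the cost of one FFT and the number of blocks: the function choose_blocksize_sketch, the restriction to powers of two and the cost estimate (number of blocks times blocksize times log2(blocksize)) are assumptions, not the formula used by tpltz_init.

```c
/* Hypothetical block size selection, not the library's formula.
 * Assumptions: block sizes are powers of two, each block of length bs yields
 * bs - 2*(lambda-1) valid outputs, and one FFT costs about bs*log2(bs). */
int choose_blocksize_sketch(int n, int lambda)
{
    if (n <= 0 || lambda < 1)
        return 0;

    int ov = lambda - 1;            /* spoiled entries at each block edge */
    int best_bs = 0;
    long long best_cost = -1;

    for (int lg = 1; lg <= 30; lg++) {
        int bs = 1 << lg;
        int eff = bs - 2 * ov;      /* valid outputs per block */
        if (eff <= 0)
            continue;               /* block too small for this bandwidth */

        long long nblocks = ((long long)n + eff - 1) / eff;
        long long cost = nblocks * bs * lg;
        if (best_cost < 0 || cost < best_cost) {
            best_cost = cost;
            best_bs = bs;
        }
        if (nblocks == 1)
            break;                  /* larger blocks would only add padding */
    }
    return best_bs;                 /* 0 if no admissible size was found */
}
```

Small blocks spend most of their length on the overlap, while very large blocks pay for long transforms, so under this model the minimum usually sits at a block size a few times larger than lambda.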
