
Description and usage


Input data and preprocessing

You have generated a Light Field Microscope (LFM) video of neuronal tissue labelled with a fluorescent calcium indicator. If the video has not yet been converted into a series of TIF images (one image per frame, with filenames ending in the zero-padded frame number), do this before you start. Transfer the TIF files to your desired input folder and start building the Input struct: set the field Input.LFM_folder to the path of the input folder.

Download the Levoy group's LFDisplay tool and use it to determine x_offset, y_offset and dx (SID only considers the case dx == dy). If the images still need to be rectified, set Input.rectify to true; if they have already been rectified, set it to false. In the former case, also fill in the parameters as Input.x_offset, Input.y_offset and Input.dx.
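As an illustration, the input and rectification fields might be set like this in MATLAB (the folder path and rectification values below are placeholders, not values recommended by this page):

```matlab
Input = struct();
Input.LFM_folder = '/data/my_experiment/tif_frames/';  % folder containing the zero-padded TIF frames (placeholder path)

% Rectification parameters as determined with LFDisplay (placeholder values)
Input.rectify  = true;    % set to false if the TIFs are already rectified
Input.x_offset = 1023.5;
Input.y_offset = 1011.2;
Input.dx       = 22.7;    % SID assumes dx == dy
```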

Set the field Input.psf_filename_ballistic to the path of the file containing the point spread function (PSF) generated by the lfrecon package (TODO).

Next, specify the frames that you wish to include in the SID optimization procedure. Take into account that SID can only detect neurons that are active in some of those frames and inactive in others. The number of frames should not exceed about 3000, since larger numbers become computationally prohibitive for the SID algorithm. Before specifying the frames, you should also take a look at the raw movie. A good way to do this is to look at the average signal (mean over all pixels for each frame). If there are motion artifacts or other spurious patterns in the video (for example, someone opened a door during imaging), it is best not to include those frames. If you have already chosen specific frames, you can list them in Input.frames.list as a vector of frame indices. Otherwise you can choose a starting frame Input.frames.start, an incremental step size between included frames, Input.frames.step, and a final frame Input.frames.end.

After specifying the frames you wish to include, there is also the option of, for each selected frame, loading all frames from that frame up to the next selected one, computing their average, and including that averaged image instead. You can enable this by setting Input.frames.mean to true. It will significantly increase the time it takes to load the video, but it will also reduce noise contributions in your sample video.
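A sketch of the two ways to select frames (the indices and step sizes below are placeholders, not recommendations):

```matlab
% Either give an explicit list of frame indices ...
% Input.frames.list = 100:5:2500;          % placeholder selection

% ... or specify start / step / end, optionally averaging the skipped frames
Input.frames.start = 1;
Input.frames.step  = 10;
Input.frames.end   = 3000;
Input.frames.mean  = true;   % average the frames between selected ones into each included image
```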

Choose an output folder accessible to your MATLAB and set Input.output_folder to the output folder path.

Set Input.axial to the ratio of the physical voxel size in the axial direction to that in the lateral direction.

Set Input.neur_rad to the expected radius (in units of pixels) of a neuron. In more strongly scattering samples, choose a larger value; normally 30% larger is sufficient.

Set Input.native_focal_plane to the z-index of the native focal plane of your point spread function. This z-slice typically contains the strongest reconstruction artifacts, or has the strongest artifacts around it.

If your workstation offers GPU support, set Input.gpu_ids to the indices of the GPU devices you wish to make available to the algorithm; otherwise set it to an empty array [].
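The remaining preprocessing fields from this section might then look as follows (all paths and numeric values are placeholders chosen only for illustration):

```matlab
Input.psf_filename_ballistic = '/data/psf/psf_ballistic.mat';  % placeholder path to the PSF file
Input.output_folder          = '/data/my_experiment/sid_out/'; % placeholder output path

Input.axial              = 4;    % axial voxel size / lateral voxel size (placeholder)
Input.neur_rad           = 6;    % expected neuron radius in pixels (placeholder)
Input.native_focal_plane = 26;   % z-index of the native focal plane (placeholder)

Input.gpu_ids = [1 2];           % GPU device indices, or [] to run on CPU
```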

Background estimation

Next, specify whether you wish to perform background subtraction. If so, set Input.bg_sub to true and set Input.bg_iter, the number of iterations for the background estimation, to 2 (more iterations are usually not necessary). In a first step, the algorithm calculates the standard deviation image of the residual video, which is included in the estimation of neuronal centers. The algorithm also includes the spatial background component as the last component of SID_output.forward_model_ini in the SID optimization of the spatial filters of individual neurons. This continues the optimization of the background while taking local neuronal patterns into account, and it also significantly improves the performance of the non-negative matrix factorization (NNMF). It is recommended to set Input.bg_sub to true.
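A minimal sketch of the background-estimation settings described above:

```matlab
Input.bg_sub  = true;   % recommended: estimate and subtract a background component
Input.bg_iter = 2;      % two iterations are usually sufficient
```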

De-trending and bleaching correction

If effects of photobleaching are visible in your video, you can de-trend it; this is performed by multiplying each pixel trace by a common estimate of the bleaching signal. Since the offset of the exponential decay is not the same for all pixels, this method is of course not perfect. Set Input.detrend to true if you wish to perform this operation; it is recommended to do so. Also set Input.delta (TODO).
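Sketched in MATLAB (note that a value for Input.delta is not documented on this page, so none is suggested here):

```matlab
Input.detrend = true;   % recommended if photobleaching is visible
% Input.delta = ...;    % must also be set; see the TODO above for its meaning and value
```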

Microlens pattern geometry

In order to prevent overfitting during the NNMF step, the algorithm estimates the part of the images containing the round shapes located behind the microlenses, since this is the only part containing the desired information. Two parameters help the algorithm produce a good estimate, and this is essential. You can choose standard values for Input.crop_params, a 2×1 double array of values between zero and one. Otherwise the algorithm will prompt you to choose them according to the current estimate, which it will provide as a figure output during the run. The first parameter is a cut-off value for estimating which part of the image contains active microlenses at all; the second parameter is a cut-off for an estimate of the microlens pattern.

Figure 1: Microlens pattern

If you wish to analyze only a certain rectangle inside your movie, specify a binary image that is true inside the desired rectangle and false outside. Include this binary array as Input.crop_mask and set Input.do_crop to true. If Input.do_crop is true but no crop_mask is specified, the algorithm will use the estimate of the active part of the images described in the previous paragraph.
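As a sketch, the crop-related fields might be set like this (the sensor size, rectangle, and cut-off values are placeholders assumed for illustration):

```matlab
Input.crop_params = [0.2 0.5];        % placeholder cut-off values in [0, 1]

% Optional: restrict the analysis to a rectangle (placeholder geometry)
img_size = [1080 1920];               % assumed sensor image size in pixels
mask = false(img_size);
mask(200:900, 400:1500) = true;       % the rectangle to analyze
Input.crop_mask = mask;
Input.do_crop   = true;
```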

Core SID algorithm

Low-rank-NNMF

An important part of the initial estimation of neuronal positions is the output of a low-rank NNMF. In the case of very large images this can be a problem, and you can choose to turn off this part of the algorithm by setting the rank (Input.nnmf_opts.rank) of the NNMF to zero; in that case you also have to set Input.optimize_kernel to false and must not set a value for Input.recon_opts.ker_shape.

First, set the rank (Input.nnmf_opts.rank) to an appropriate value. Usually about 30 components are enough, but if the reconstruction is very costly you should reduce it; ten components still give good results. If you increase the number of components, you increase the chance that neighboring neurons fall into different components, and thereby the chance of separating them from each other. Increasing the rank therefore usually decreases the density of neurons in the corresponding reconstructed spatial filters of the NNMF. Set the number of iterations the algorithm should perform (Input.nnmf_opts.max_iter) to 600; this is usually sufficient.

Initialization methods for the low-rank NNMF

A number of options are available for initializing the low-rank NNMF. The first option is to set Input.nnmf_opts.ini_method to 'rand', which initializes the temporal components T with smoothed random traces and the spatial components S with the non-negative least-squares solution. The second option is to set Input.nnmf_opts.ini_method to 'pca', which initializes the temporal and spatial components with the absolute values of the first n (rank = n) principal components of the sample video. It is recommended to use the PCA initializer, since it makes different runs of the NNMF with different parameters comparable and leads to an inherent background subtraction by encouraging the first component to become background.
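Putting these choices together, a sketch of the low-rank NNMF settings (using the typical values discussed above; adjust the rank to your data and hardware):

```matlab
Input.nnmf_opts.rank       = 30;      % ~30 components is usually enough; 10 can still give good results
Input.nnmf_opts.max_iter   = 600;     % usually sufficient
Input.nnmf_opts.ini_method = 'pca';   % recommended initializer; 'rand' is the alternative
```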

Regularizers

The NNMF implementation included with the SID package allows a variety of regularizers to be configured, in order to increase the separation performance as well as to clean up the images and prevent artifacts in the subsequent reconstruction of the spatial components of the NNMF. The NNMF algorithm solves the problem

argmin_{T ≥ 0, S ≥ 0} ( ‖Y − S·T‖₂² + λ · R(S, T) )

where λ is the vector of the Lagrange multipliers and R(S, T) is a vector of possible regularizers. Each component of λ corresponds to a specific field of Input.nnmf_opts:

| Parameter | Description | Recommended |
| --- | --- | --- |
| Input.nnmf_opts.lamb_spat | L1-regularization of S | Yes |
| Input.nnmf_opts.lamb_temp | L1-regularization of T | No |
| Input.nnmf_opts.lamb_corr | L2-norm of the covariance matrix | No |
| Input.nnmf_opts.lamb_orth_L1 | L1-norm of the Gram matrix of S | Yes |
| Input.nnmf_opts.lamb_orth_L2 | L2-norm of the Gram matrix of S | Yes |
| Input.nnmf_opts.lamb_spat_TV | L2-norm of the total variation in space | No |
| Input.nnmf_opts.lamb_temp_TV | L2-norm of the total variation in T | No |

You can combine all of these regularizers, but usually one is sufficient, and using multiple regularizers can lead to undesired effects. It is recommended to use only Input.nnmf_opts.lamb_orth_L1 or Input.nnmf_opts.lamb_spat, with a value of 1.
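Following that recommendation, a minimal sketch (only one regularizer set; the alternative shown commented out):

```matlab
Input.nnmf_opts.lamb_orth_L1 = 1;     % recommended regularizer and value
% Input.nnmf_opts.lamb_spat  = 1;     % alternative recommended regularizer
```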

Choosing a good Lagrange multiplier / Cross-validation

The question of how to choose the Lagrange multipliers can be addressed by manually evaluating the outputs of the fast_nmf() function (SID's low-rank NNMF implementation). This involves starting with an initial guess, running the NNMF algorithm, and visually evaluating the patterns in the spatial filters. If you see nicely separated single-neuron LFM patterns (case 2), the Lagrange multiplier was a good choice. If the pattern looks very noisy or contains incomplete single-neuron LFM patterns (case 1), the Lagrange multiplier was too large. Finally, if the components are rather full (as opposed to sparse), the Lagrange multiplier was probably too small (case 3). Keep in mind that one of the components (normally the first) will always be full, since it is used as the background component.


Figure 2: Case 1 (too noisy / sparse)


Figure 3: Case 2 (just right)


Figure 4: Case 3 (too dense, not sparse enough)

Alternatively, the fast_nmf() function includes an option to perform cross-validation over a selection of values of one of the Lagrange multipliers. Cross-validation partitions the data into Input.nnmf_opts.xval.num_part parts; for a given Lagrange multiplier, it runs the NNMF algorithm on the video minus each part in turn, then generates an estimate of T on the held-out part and computes the two-norm of the residual. These values are averaged over all parts and used as an estimate of how well that parameter (Lagrange multiplier) works when you try to train the NNMF to pick up spatial components that generalize well. Of course, this is only a very indirect measure, and since it would require an exceedingly long computation time to run the NNMF algorithm that many times on the whole video, the cross-validation algorithm only takes into account the 100×100 sub-video with the greatest variance.

This nonetheless turns out to be very useful, and it is recommended to activate it when dealing with new data. To do so, you first must set the field (sub-struct) xval inside Input.nnmf_opts. Set Input.nnmf_opts.xval.num_part (the number of partitions) to a value that keeps each partition longer in time than the typical decay time of a calcium transient. In the case of our data, something between 5 and 10 was usually good. Then set Input.nnmf_opts.xval.std_image = SID_output.std_image; this is necessary to provide the algorithm with the dimensions of the video.

Finally, set Input.nnmf_opts.xval.multiplier to the Lagrange multiplier whose value you want to determine (e.g. lamb_orth_L1) and set Input.nnmf_opts.xval.param to a vector of the parameter values (Lagrange multiplier values) that you want to check. If you don't set the last two fields, the standard multiplier is lamb_orth_L1 and the default parameter values are opts.xval.param = lambda * exp(-2 * [0:4]).
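A sketch of the cross-validation settings (the number of partitions and candidate values are placeholders; whether the multiplier is passed as a string name is an assumption made here):

```matlab
Input.nnmf_opts.xval.num_part   = 8;                     % placeholder within the 5-10 range that worked for our data
Input.nnmf_opts.xval.std_image  = SID_output.std_image;  % provides the video dimensions
Input.nnmf_opts.xval.multiplier = 'lamb_orth_L1';        % the multiplier to scan (assumed to be given by name)
Input.nnmf_opts.xval.param      = 1 * exp(-2 * (0:4));   % candidate values (placeholder, mirrors the default)
```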

LFM reconstruction

The next step in the SID pipeline is the reconstruction of the 3D information encoded in the spatial components of the NNMF. The basic reconstruction is a non-negative least-squares (NNLS) solver. However, since the LFM volume does not have constant resolution while the point spread function is based on a homogeneous discretization of the volume, we encounter reconstruction artifacts near the native focal plane, where the resolution reaches a minimum of one microlens diameter. This can be remedied by various regularization techniques.

The first technique is, as before, modifying the objective function. In this case, the SID reconstruction algorithm accepts the following regularizers:

| Parameter | Description |
| --- | --- |
| Input.recon_opts.lamb_L1 | L1-regularization |
| Input.recon_opts.lamb_L2 | L2-regularization |
| Input.recon_opts.lamb_TV_L2 | L2-norm of the total variation |

Input.recon_opts.lamb_TV_L2 is a 1×3 vector and allows a different regularization strength for each spatial direction. The total variation regularization helps a lot in cases with big neurons (radius greater than 10 pixels), since the artifacts we usually encounter are of high spatial frequency and located around the native focal plane. In fact, if we ran the algorithm without regularization for long enough, we would end up with a volume consisting only of high-frequency artifacts; we therefore normally regularize by early stopping. The L1-regularization increases the artifacts in some cases, but helps in others where there are undesired low-spatial-frequency structures. Combined with the next technique it can produce very nice results. You do not need to worry too much about choosing Input.recon_opts.lamb_L1 too high, since the algorithm corrects it down to a proper value if it was chosen too high (so high that the zero volume is a local minimum).

The second technique builds on the fact that the artifacts are of high spatial frequency and enforces sparsity on the problem by modifying the forward-projection function (the convolution of the volume with the LFM point spread function to generate the sensor image). This is done by choosing a kernel that should resemble the shape of a neuron in your normal reconstruction, and convolving the volume with that kernel before performing the convolution with the PSF. The SID reconstruction algorithm offers the following options:

| Parameter | Description | Possible values |
| --- | --- | --- |
| Input.recon_opts.ker_shape | Shape of the kernel | `gaussian` – Gaussian kernel<br>`lorentz` – Lorentzian kernel<br>`ball` – binary ball-shaped kernel<br>`user` – kernel predefined by the user |
| Input.recon_opts.ker_param | Additional parameters for the kernel | Depending on ker_shape:<br>`gaussian` – 1×2 vector: first component is the standard deviation in the lateral direction, second is the standard deviation in the axial direction<br>`lorentz` – same<br>`ball` – same (radius instead of standard deviation)<br>`user` – ker_param is the kernel itself |

Choosing an appropriate kernel and L1-regularization leads to very good results.

There are two good strategies for a precise estimation of the neuronal centers in the next step of the pipeline:

Strategy 1

Perform an LFM reconstruction with the regular forward-projection function and activate the L1 and total variation regularizations. Set Input.optimize_kernel to false, Input.recon_opts.lamb_L1 to 0.1, Input.recon_opts.lamb_TV_L2 to [0.1 0.1 4], and Input.filter to true. The last setting means that a band-pass filter will be applied to the messy basic reconstructed volume, which should get rid of artifacts and leave only neuronal shapes in the volume.
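In MATLAB, Strategy 1 corresponds to the following settings (values as stated above; the mapping of the three lamb_TV_L2 entries to spatial directions is not spelled out on this page):

```matlab
Input.optimize_kernel       = false;
Input.recon_opts.lamb_L1    = 0.1;
Input.recon_opts.lamb_TV_L2 = [0.1 0.1 4];   % one value per spatial direction
Input.filter                = true;          % band-pass filter the reconstructed volume
```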

Strategy 2

Perform an LFM reconstruction with a modified forward-projection function and activate the L1 regularization. The proper kernel will be estimated from the expected radius of the neurons (Input.neur_rad) and by applying Strategy 1 to a single NNMF component with subsequent segmentation (see the next part). Set Input.optimize_kernel to true, Input.recon_opts.lamb_L1 to 1, and Input.filter to true. The band-pass filter at the end is not strictly necessary but helps in some cases. The expected neuronal radius will of course be adjusted before the band-pass filter, since the reconstructed neurons are more sharply defined when using this strategy; as a result, the band-pass filter is very fast. This strategy is only advised if your NNMF generated quite sparse spatial components, like in case 2 of the low-rank NNMF section above; it is not recommended otherwise, since the algorithm tends to overfit in those cases.
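Strategy 2, sketched with the values stated above:

```matlab
Input.optimize_kernel    = true;   % kernel estimated from Input.neur_rad and one NNMF component
Input.recon_opts.lamb_L1 = 1;
Input.filter             = true;   % optional band-pass filter; fast with this strategy
```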

Segmentation

Given that everything has worked as it should up to this point, you don't have to change any of the standard values for this section. It is, however, recommended to check the segmentation result for over-segmentation. You can do so by looking in the output folder during runtime and checking the _segmm_segmentation_ files. If you see too many red dots at places where you cannot see neuronal shapes, you should adjust the variable Input.segmentation.threshold; its standard value is 0.01, which was determined experimentally, and its maximal value is 1.

There are two more fields associated with the segmentation sub-struct Input.segmentation (see the sketch after this list):

  • Input.segmentation.top_cutoff: The segmentation ignores all neurons with a smaller z-coordinate than this threshold.
  • Input.segmentation.bottom_cutoff: The segmentation ignores all neurons with a larger z-coordinate than this threshold.
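For illustration, the segmentation fields might be set like this (the threshold is the documented default; the cutoff z-indices are placeholders):

```matlab
Input.segmentation.threshold     = 0.01;  % default, experimentally determined; maximum is 1
Input.segmentation.top_cutoff    = 5;     % ignore neurons with a smaller z-coordinate (placeholder)
Input.segmentation.bottom_cutoff = 45;    % ignore neurons with a larger z-coordinate (placeholder)
```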

LFM footprint dictionary generation

The generation of the dictionary of LFM patterns for each of the putative neurons found by the segmentation may be performed by one of two algorithms.

The first algorithm, generate_LFM_library_CPU, produces the LFM pattern corresponding to a binary sphere of radius SID_output.neur_rad at the location of each neuron, using a sparse forward-projection algorithm.

The second algorithm, generate_LFM_library_GPU, is a little more involved. It needs to know in which of the low-rank NNMF components each putative neuron was found, and produces for each neuron the LFM forward projection of the volume containing the average of the reconstructed NNMF components in which the neuron was found, restricted to a cube of side length Input.neur_rad. This results in a better initial guess for the LFM library, but needs GPU support to complete within an acceptable amount of time.

The first algorithm is used automatically if no GPU support is selected. If you wish to use the first algorithm but want to use GPU support in the rest of the SID algorithm, set Input.use_std_GLL to true.
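A minimal sketch of that choice:

```matlab
% With GPUs available (Input.gpu_ids non-empty), the GPU library generator is used by default.
% To keep GPU support elsewhere but use the first (sphere-based) generator for the library:
Input.use_std_GLL = true;
```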

Template generation

The template generation step is straightforward. For each z-slice, the algorithm generates a circle based on the LFM PSF and a threshold Input.template_threshold. The default value is 0.01; if you wish to increase the template sizes, you have to decrease this value. Its range is [0 1].
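As a one-line sketch (using the default value stated above):

```matlab
Input.template_threshold = 0.01;   % default; decrease to obtain larger templates (range [0 1])
```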

Alternating bi-convex optimization

The core of the SID algorithm, and the last step before the final extraction of the neuronal time series, only requires three parameters to be set. First you can choose additional regularizers by setting their Lagrange multipliers:

  • `Input.SID_optimization.spatial`: Lagrange multipliers of the spatial update

    | Parameter | Description |
    | --- | --- |
    | lamb_L1 | L1-regularization |
    | lamb_L2 | L2-regularization |
    | lamb_orth_L1 | L1-norm of the Gramian matrix |

  • `Input.SID_optimization.temporal`: Lagrange multiplier of the temporal update

    | Parameter | Description |
    | --- | --- |
    | lambda | L1-regularization |

    Usually a value of 1e-4 is a good choice. Should the number of neurons decrease significantly during the optimization procedure, the value may be too large; in that case, either turn it off entirely or determine a better value by visual inspection of the components of SID_output.forward_model_iterated after the first iteration.

    Hint: Set the value to zero, find a component of SID_output.forward_model_iterated that clearly contains a neuronal signature, use a successive-approximation procedure to find the value at which the noise patterns outside of the clearly defined neuronal signature vanish, then halve this value and test it over multiple SID iterations.

Next, there is the total number of iterations the SID algorithm should perform. If you turned off the L1-regularizer of the spatial SID update, it is advisable not to exceed four iterations, since otherwise a form of over-fitting occurs. It is also a matter of computational cost, and usually not much is gained beyond five iterations. Set `Input.num_iter` to the appropriate value.

Finally, if you wish to increase the size of your templates with each run of the algorithm, you can set Input.update_template to true. This helps to merge components and thereby get rid of fragments that would otherwise be picked up as neurons.
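Putting this section together, a possible configuration sketch (the spatial regularizer values are not specified on this page and are left unset; the exact spelling of the temporal multiplier field follows the table above and should be treated as an assumption):

```matlab
% Regularizers for the alternating bi-convex optimization
% Input.SID_optimization.spatial.lamb_L1      = ...;   % values not documented here
% Input.SID_optimization.spatial.lamb_orth_L1 = ...;
Input.SID_optimization.temporal.lambda = 1e-4;   % usually a good choice; reduce or disable if neurons disappear

Input.num_iter        = 4;      % stay at a few iterations; little is gained beyond ~5
Input.update_template = true;   % grow templates each iteration to help merge fragments
```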

Note: All solvers include the additional option to use the standard deviation instead of the two-norm of the residual in the objective function. You can do so by setting opts.use_std in the corresponding opts struct, or, if you wish to do it for all instances of that optimization, by setting Input.use_std to true. Keep in mind to turn off the background if there is no inherent background in the movie.
