If you’ve been using your MiSeq for any length of time, you know that cluster density is crucial to getting both the best quality and the highest yield from an NGS run. Under-clustering and over-clustering produce similar errors, since in either case your machine will not be able to find the best cluster focus. To get the most usable data out of each run, you’ll need to find your “sweet spot.”
Start with a Clean Machine
Before you start, make sure both your MiSeq and your library are clean. If the machine wasn’t properly washed after the previous run, or the library itself wasn’t cleaned up, primer dimers, adapter dimers or stray library fragments may still be present in your library. These minor contaminants can have a major impact on your library quantification, which will in turn throw off your cluster density.
Different Sweet Spots for Different Methods
Next, choose the appropriate quantification method for your library prep type. Some methods work better than others depending on whether the library is double-stranded DNA, single-stranded DNA or RNA, and on the library insert size. It’s also important to take into account which version of the chemistry you will be running, since the optimal cluster density differs between versions.
Signs of Over-Clustering
It is pretty easy to tell if your run is over-clustered because the clusters will actually overlap each other, resulting in less template generation. The overlapping clusters cause a lower background-to-cluster percentage and lower data quality in the process.
Pro tip: Libraries with large inserts also cluster less efficiently than libraries with small inserts.
Signs of Under-Clustering
Under-clustering is also an issue. It can be caused by old NaOH that does not denature the library effectively, or by a high NaOH concentration remaining after pH neutralization. You’ll want to take this into account while you’re trying to find your sweet spot. Library input concentration also affects cluster density, and the right concentration can differ from lab to lab. The usual loading concentration for optimal cluster density is about 5-15 pM.
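To make the concentration arithmetic concrete, here is a minimal Python sketch. It assumes a double-stranded DNA library and the common approximation of 660 g/mol per base pair. The function names and the example numbers are hypothetical and are not taken from any Illumina protocol.

```python
# Approximate molar mass of one base pair of dsDNA (assumption: dsDNA library).
AVG_BP_MASS_G_PER_MOL = 660.0


def ng_per_ul_to_nM(conc_ng_per_ul, avg_fragment_bp):
    """Convert a dsDNA concentration in ng/uL to nM for a given mean fragment length."""
    if conc_ng_per_ul < 0 or avg_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * avg_fragment_bp)


def dilution_volumes(stock_pM, target_pM, final_volume_ul):
    """Return (stock_ul, diluent_ul) needed to reach target_pM in final_volume_ul."""
    if stock_pM <= 0 or target_pM <= 0 or final_volume_ul <= 0:
        raise ValueError("all inputs must be positive")
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    # C1 * V1 = C2 * V2, solved for V1.
    stock_ul = target_pM * final_volume_ul / stock_pM
    return stock_ul, final_volume_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical library: 2 ng/uL with a 400 bp mean fragment size.
    print(round(ng_per_ul_to_nM(2.0, 400), 2), "nM")
    # Hypothetical dilution: 20 pM denatured stock down to 10 pM in 600 uL.
    print(dilution_volumes(20.0, 10.0, 600.0))
```

Because the result depends directly on the measured concentration and fragment size, any error in quantification carries straight through to the loading concentration.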
Library Nucleotide Diversity
Library nucleotide diversity, meaning how evenly the four nucleotides are represented at each cycle, is another factor to consider if you want maximum template generation and data output. If you are running a low-diversity library, you should lower your library input concentration and spike in at least 5% PhiX to increase the nucleotide diversity so that the sequencer can accurately map out clusters and call the bases.
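As a rough illustration of what “balanced” means here, the Python sketch below computes the fraction of each base at every cycle from a handful of reads, and splits a loading volume between library and PhiX. It assumes the library and PhiX have already been diluted to the same loading concentration, so a 5% spike is simply 5% of the volume. The function names and example values are hypothetical.

```python
from collections import Counter


def per_cycle_base_fractions(reads):
    """Return one dict per cycle giving the fraction of A, C, G and T at that position.

    Only the cycles covered by every read are reported. Other symbols such as N
    count toward the total, so the four fractions may sum to less than 1.
    """
    if not reads:
        return []
    n_cycles = min(len(read) for read in reads)
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle].upper() for read in reads)
        total = sum(counts.values())
        fractions.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return fractions


def phix_spike_volumes(total_ul, phix_fraction=0.05):
    """Split total_ul into (library_ul, phix_ul) for a given PhiX volume fraction.

    Assumes library and PhiX are at the same concentration (an assumption).
    """
    if total_ul <= 0 or not 0 <= phix_fraction <= 1:
        raise ValueError("total_ul must be > 0 and phix_fraction within [0, 1]")
    phix_ul = total_ul * phix_fraction
    return total_ul - phix_ul, phix_ul


if __name__ == "__main__":
    # Toy amplicon-like reads: the first three cycles are identical (low diversity).
    reads = ["ACGT", "ACGA", "ACGC", "ACGG"]
    for cycle, fracs in enumerate(per_cycle_base_fractions(reads), start=1):
        print(cycle, fracs)
    print(phix_spike_volumes(600.0, 0.05))
```

In the toy reads, the first three cycles are dominated by a single base, which is the kind of imbalance a PhiX spike-in is meant to offset.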
In the end, you’ll have to play with your reagents a little bit to figure out exactly how to get the most data out of your sample runs. It can be tricky to find the correct mixture, but doing so can help you save money down the road. If you’re looking for more information on cluster densities, or you feel that your machine could be malfunctioning because of this issue, give us a call and we’ll walk you through your problems.