hicCorrectMatrix¶
Iterative correction for a Hi-C matrix (see Imakaev et al. 2012, Nature Methods, for details). For the method to work correctly, bins with zero reads assigned to them should be removed, as they cannot be corrected. Bins with a low number of reads should also be removed; otherwise, the correction step amplifies the counts associated with those bins (zero- and low-coverage bins usually contain repetitive regions). Bins with an extremely high number of reads can also be removed from the correction, as they may represent copy number variations.
To aid in the identification of bins with low and high read coverage, the histogram of the number of reads can be plotted together with the Median Absolute Deviation (MAD).
It is recommended to run hicCorrectMatrix as follows:
$ hicCorrectMatrix diagnostic_plot --matrix hic_matrix.h5 -o plot_file.png
Then, after reviewing the plot and deciding on the threshold values:
$ hicCorrectMatrix correct --matrix hic_matrix.h5 --filterThreshold <lower threshold> <upper threshold> -o corrected_matrix
For a more in-depth review of how to determine the threshold values, please visit: http://hicexplorer.readthedocs.io/en/latest/content/example_usage.html#correctionofhicmatrix
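The idea behind the iterative correction can be illustrated with a minimal sketch. This is not the hicCorrectMatrix implementation: it assumes a small, dense, symmetric matrix stored as a list of lists, with zero-coverage bins already removed, and the function name `iterative_correction` and its `tol` parameter are hypothetical. In each iteration, every bin is scaled by its coverage relative to the mean coverage until all bins have nearly the same total.

```python
def iterative_correction(matrix, n_iter=500, tol=1e-5):
    """Balance a symmetric contact matrix so that all row sums become equal.

    Illustrative sketch only (hypothetical helper, not HiCExplorer code).
    Assumes every bin has a non-zero row sum.
    """
    n = len(matrix)
    corrected = [row[:] for row in matrix]  # work on a copy
    bias = [1.0] * n  # accumulated per-bin correction factors

    for _ in range(n_iter):
        row_sums = [sum(row) for row in corrected]
        mean_sum = sum(row_sums) / n
        # Coverage of each bin relative to the mean coverage.
        scale = [s / mean_sum for s in row_sums]
        # Divide each entry by the factors of its row bin and column bin.
        for i in range(n):
            for j in range(n):
                corrected[i][j] /= scale[i] * scale[j]
        bias = [b * s for b, s in zip(bias, scale)]
        # Stop early once every factor is close to 1.
        if max(abs(s - 1.0) for s in scale) < tol:
            break
    return corrected, bias


# Toy 3-bin matrix with made-up counts.
raw = [
    [10.0, 4.0, 2.0],
    [4.0, 20.0, 6.0],
    [2.0, 6.0, 30.0],
]
balanced, bias = iterative_correction(raw)
row_sums = [sum(row) for row in balanced]
```

A bin with zero coverage would produce a division by zero in this sketch, and a bin with very low coverage would receive a very large scaling factor, which is why such bins are filtered before the correction.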
usage: hicCorrectMatrix [-h] [--version] ...
Named Arguments¶
--version  show program's version number and exit
Options¶
Possible choices: diagnostic_plot, correct
To get detailed help on each of the options:
$ hicCorrectMatrix diagnostic_plot -h
$ hicCorrectMatrix correct -h

Subcommands:¶
diagnostic_plot¶
Plots a histogram of the coverage per bin together with the modified z-score based on the median absolute deviation method (see Boris Iglewicz and David Hoaglin, 1993, Volume 16: How to Detect and Handle Outliers, The ASQC Basic References in Quality Control: Statistical Techniques, Edward F. Mykytka, Ph.D., Editor).
$ hicCorrectMatrix diagnostic_plot --matrix hic_matrix.h5 -o file.png
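As a rough illustration of the quantity shown in the plot, the sketch below computes the modified z-score of Iglewicz and Hoaglin, 0.6745 * (x - median) / MAD, for a list of per-bin coverage values. The coverage numbers are made up, the function name `modified_z_scores` is hypothetical, and any transformation hicCorrectMatrix may apply to the coverage before this step is not covered here.

```python
from statistics import median


def modified_z_scores(values):
    """Return the modified z-score (Iglewicz and Hoaglin) of each value.

    Hypothetical helper for illustration, not part of HiCExplorer.
    """
    med = median(values)
    # Median absolute deviation (MAD) around the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


# Made-up coverage per bin: two low-coverage bins and one very high one.
coverage = [0, 3, 95, 100, 102, 98, 105, 101, 99, 900]
scores = modified_z_scores(coverage)
```

Bins with typical coverage get scores near zero, low-coverage bins get strongly negative scores, and high-coverage bins get strongly positive scores, which is why the lower threshold is usually negative and the upper threshold positive.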
Required arguments¶
--matrix, -m  Name of the Hi-C matrix to correct in .h5 format.
--plotName, -o  File name to save the diagnostic plot.
Optional arguments¶
--chromosomes  List of chromosomes to be included in the iterative correction. The order of the given chromosomes will then be kept for the resulting corrected matrix.
--xMax  Max value for the x-axis in counts per bin.
--perchr  Compute histogram per chromosome. For samples from cells with an uneven number of chromosomes and/or translocations, it is advisable to check the histograms per chromosome to find the most conservative filterThreshold. Default: False
--verbose  Print processing status. Default: False
correct¶
Run the iterative correction.
$ hicCorrectMatrix correct --matrix hic_matrix.h5 --filterThreshold -1.2 5 --out corrected_matrix.h5
Required arguments¶
--matrix, -m  Name of the Hi-C matrix to correct in .h5 format.
--outFileName, -o
File name to save the resulting matrix. The output is a .h5 file.
--filterThreshold, -t
Removes bins of low or large coverage. Usually these bins do not contain valid Hi-C data or represent regions that accumulate reads and thus must be discarded. Use hicCorrectMatrix diagnostic_plot to identify the modified z-value thresholds. A lower and an upper threshold are required, separated by a space, e.g. --filterThreshold -1.5 5
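The effect of the two thresholds can be sketched as a simple mask over per-bin modified z-values. The scores below are made up, the function name `bins_to_keep` is hypothetical, and treating the threshold values themselves as inclusive is an assumption of this sketch, not documented behaviour of hicCorrectMatrix.

```python
def bins_to_keep(z_scores, lower, upper):
    """Return the indices of bins whose modified z-value lies within the thresholds.

    Hypothetical helper; boundary values are kept (an assumption of this sketch).
    """
    return [i for i, z in enumerate(z_scores) if lower <= z <= upper]


# Made-up modified z-values for five bins.
z_scores = [-19.2, -0.9, 0.1, 1.1, 154.3]
kept = bins_to_keep(z_scores, lower=-1.5, upper=5)
```

Here the first bin (very low coverage) and the last bin (very high coverage) are dropped, and the three bins in between are kept for the correction.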
Optional arguments¶
--iterNum, -n  Number of iterations to compute. Default: 500
--inflationCutoff
Value corresponding to the maximum number of times a bin can be scaled up during the iterative correction. For example, an inflation cutoff of 3 will filter out all bins that were expanded 3 times or more during the iterative correction.
--transCutoff, -transcut
Clip high counts in the top -transcut trans regions (i.e. between chromosomes). A usual value is 0.05
--sequencedCountCutoff
Each bin receives a value indicating the fraction that is covered by reads. A cutoff of 0.5 will discard all those bins that have less than half of the bin covered.
--chromosomes  List of chromosomes to be included in the iterative correction. The order of the given chromosomes will then be kept for the resulting corrected matrix.
--skipDiagonal, -s
If set, diagonal counts are not included. Default: False
--perchr  Normalize each chromosome separately. This is useful for samples from cells with an uneven number of chromosomes and/or translocations. Default: False
--verbose  Print processing status. Default: False
--version  show program's version number and exit