hicNormalize

Normalizes the given matrices either to the smallest read count among them or to the 0 to 1 range. However, it does NOT compute the contact probability.

usage: hicNormalize --matrices MATRICES [MATRICES ...] --normalize
                    {norm_range,smallest} --outFileName FILENAME
                    [FILENAME ...] [--help] [--version]

Required arguments

--matrices, -m

The matrix (or multiple matrices) to normalize. HiCExplorer supports the following file formats: h5 (native HiCExplorer format) and cool.

--normalize, -n

Possible choices: norm_range, smallest

Normalize a) each matrix to the 0 to 1 range (norm_range), or b) all matrices to the lowest read count among the given matrices (smallest).

Default: “smallest”

--outFileName, -o

Output file name(s) for the normalized Hi-C matrices, one per input matrix.

Optional arguments

--version

show program’s version number and exit

Background

To compare different Hi-C interaction matrices, the matrices need to be normalized to an equal level of read coverage or to an equal value range. hicNormalize accomplishes this by offering two modes: 0-1 range normalization and read count normalization.

Usage example

Normalize to 0-1 range

The norm_range mode of hicNormalize scales all values to the 0 to 1 range, i.e. the maximum value of the interaction matrix becomes 1 and the minimum value becomes 0.

$ hicNormalize -m matrix.cool --normalize norm_range -o matrix_0_1_range.cool
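The effect of the 0 to 1 scaling can be illustrated with a minimal sketch in plain Python. This is not the HiCExplorer implementation: it assumes ordinary min-max scaling, which matches the description above, and the function name scale_to_unit_range and the sample counts are hypothetical.

```python
def scale_to_unit_range(values):
    """Min-max scale a list of interaction counts to the 0 to 1 range.

    Hypothetical helper for illustration only; not part of HiCExplorer.
    """
    lowest = min(values)
    highest = max(values)
    span = highest - lowest
    if span == 0:
        # All values are equal. How hicNormalize treats this case is not
        # stated in this document, so this sketch simply returns zeros.
        return [0.0 for _ in values]
    # Shift so the minimum becomes 0, then divide so the maximum becomes 1.
    return [(v - lowest) / span for v in values]


counts = [2, 4, 10, 6]  # made-up interaction counts
scaled = scale_to_unit_range(counts)
print(scaled)  # [0.0, 0.25, 1.0, 0.5]
```

The smallest count maps to 0 and the largest to 1; every other value keeps its relative position in between.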

Normalize to smallest read count

All matrices are normalized so that the total read count of each matrix equals that of the input matrix with the smallest total read count.

Example

  • matrix.cool with a read count of 10000

  • matrix2.cool with a read count of 12010

  • matrix3.cool with a read count of 11000

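In this example, each entry in matrix2.cool is divided by a factor of 12010 / 10000 and each entry in matrix3.cool by a factor of 11000 / 10000, so that every matrix ends up with a total read count of 10000. matrix.cool already has the smallest read count and is left unchanged.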

$ hicNormalize -m matrix.cool matrix2.cool matrix3.cool --normalize smallest \
  -o matrix_normalized.cool matrix2_normalized.cool matrix3_normalized.cool
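The arithmetic behind the smallest mode can be checked with a short sketch in plain Python, using the read counts from the example. It assumes that each matrix is divided by the ratio of its own total read count to the smallest total; the function name scaling_divisors is hypothetical and this is not the HiCExplorer code itself.

```python
def scaling_divisors(read_counts):
    """Return, per matrix, the factor its entries would be divided by.

    Hypothetical helper for illustration only; not part of HiCExplorer.
    """
    smallest = min(read_counts.values())
    # A matrix with more reads than the smallest gets a divisor above 1;
    # the smallest matrix gets a divisor of exactly 1 (unchanged).
    return {name: total / smallest for name, total in read_counts.items()}


# Total read counts taken from the example above.
read_counts = {
    "matrix.cool": 10000,
    "matrix2.cool": 12010,
    "matrix3.cool": 11000,
}

divisors = scaling_divisors(read_counts)

# Dividing every entry by the divisor divides the total by the same amount,
# so each normalized total should come out at (approximately) 10000.
normalized_totals = {
    name: read_counts[name] / divisors[name] for name in read_counts
}

for name in read_counts:
    print(name, round(divisors[name], 4), round(normalized_totals[name], 2))
```

Because the scaling is linear, applying the divisor to the total gives the same result as applying it to every entry and summing afterwards, which is why the sketch only needs the totals.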