chicQualityControl

Computes the sparsity of each viewpoint to assess its quality. A viewpoint is considered of bad quality if it is too sparse, i.e. if too many locations have no recorded interactions.

This script outputs three files: a plot of the sparsity distribution per matrix, a plot of the sparsity distribution as histograms, and a filtered reference points file.

An example usage is:

$ chicQualityControl -m matrix1.h5 matrix2.h5 -rp referencePointsFile.bed --range 20000 40000 --sparsity 0.01 -o referencePointFile_QC_passed.bed

usage: chicQualityControl --matrices MATRICES [MATRICES ...] --referencePoints
                          REFERENCEPOINTS --sparsity SPARSITY
                          [--outFileName OUTFILENAME]
                          [--outFileNameHistogram OUTFILENAMEHISTOGRAM]
                          [--outFileNameSparsity OUTFILENAMESPARSITY]
                          [--threads THREADS] [--fixateRange FIXATERANGE]
                          [--dpi DPI] [--help] [--version]

Required arguments

--matrices, -m

The input matrices to apply the QC on.

--referencePoints, -rp

BED file containing all reference points, which are checked for a sufficient number of interactions.

--sparsity, -s

Viewpoints with a sparsity value below this threshold are considered of bad quality. If multiple matrices are given, a viewpoint is removed as soon as it is of bad quality in at least one matrix.
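
The threshold rule can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: it assumes that the sparsity value is the fraction of bins with at least one recorded interaction (so a lower value means a sparser viewpoint), and the function names `nonzero_fraction` and `passes_qc` as well as the toy data are hypothetical.

```python
def nonzero_fraction(counts):
    """Fraction of bins with at least one recorded interaction (assumed definition)."""
    if not counts:
        return 0.0
    return sum(1 for c in counts if c > 0) / len(counts)


def passes_qc(viewpoint_counts_per_matrix, threshold):
    """Keep a viewpoint only if it reaches the threshold in every matrix."""
    return all(
        nonzero_fraction(counts) >= threshold
        for counts in viewpoint_counts_per_matrix
    )


# Toy data: interaction counts per bin for one viewpoint in two matrices.
dense = [3, 0, 1, 2, 0, 4, 1, 1]   # 6 of 8 bins are non-zero
sparse = [0, 0, 0, 1, 0, 0, 0, 0]  # 1 of 8 bins is non-zero

print(passes_qc([dense, dense], threshold=0.5))   # True
print(passes_qc([dense, sparse], threshold=0.5))  # False
```

In the second call the viewpoint fails in one of the two matrices, so it would be dropped from the filtered reference points file under this reading of the rule.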

Optional arguments

--outFileName, -o

The output file name for the reference points that passed QC. It is also used as a prefix for the plots.

Default: “new_referencepoints.bed”

--outFileNameHistogram, -oh

The output file for the histogram plot.

Default: “histogram.png”

--outFileNameSparsity, -os

The output file for the sparsity distribution plot.

Default: “sparsity.png”

--threads, -t

Number of threads.

Default: 4

--fixateRange, -fs

Fixate the score of the background model starting at distance x. E.g. all values at distances greater than 500 kb are set to the value of the 500 kb bin.
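
As a rough illustration of what fixating means here, the following Python sketch clamps scores beyond a given distance. It is a sketch under assumptions, not the tool's code: the background model is represented as a plain dictionary mapping distance in base pairs to a score, a bin is assumed to exist exactly at the fixate distance, and the function name `fixate_background` is hypothetical.

```python
def fixate_background(scores_by_distance, fixate_range):
    """Set every score beyond fixate_range to the score of the fixate_range bin.

    scores_by_distance: dict mapping distance in bp to a score (assumed layout).
    """
    fixed_value = scores_by_distance[fixate_range]
    return {
        distance: fixed_value if distance > fixate_range else score
        for distance, score in scores_by_distance.items()
    }


# Toy background model with made-up scores.
background = {100000: 0.9, 300000: 0.5, 500000: 0.2, 700000: 0.1, 900000: 0.05}
fixed = fixate_background(background, fixate_range=500000)

print(fixed[700000])  # 0.2
print(fixed[300000])  # 0.5
```

Distances up to and including 500 kb keep their own score, while the 700 kb and 900 kb bins take over the value of the 500 kb bin.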

Default: 500000

--dpi

Resolution of the image in case the output is a raster graphics image (e.g. png, jpg).

Default: 300

--version

Show program’s version number and exit.