R/process_dnase_atac_data.R
get_sites_counts.Rd
Extracts read counts around candidate binding sites on both strands from the genome counts data (BigWig files generated by count_genome_cuts()). It uses the extract bed function of the bwtool software to extract the read counts, then combines them into one matrix: the first half of the columns holds the read counts on the forward strand, and the second half holds the read counts on the reverse strand.
get_sites_counts(
  sites,
  genomecount_dir,
  genomecount_name,
  tmpdir = genomecount_dir,
  bedGraphToBigWig_path = "bedGraphToBigWig",
  bwtool_path = "bwtool"
)
sites
A data frame containing the candidate sites.
genomecount_dir
Directory for genome counts, the same as outdir in count_genome_cuts().
genomecount_name
File prefix for genome counts, the same as outname in count_genome_cuts().
tmpdir
Temporary directory to save intermediate files.
bedGraphToBigWig_path
Path to the UCSC bedGraphToBigWig executable.
bwtool_path
Path to the bwtool executable.
A count matrix. The first half of the columns are the read counts on the forward strand, and the second half of the columns are the read counts on the reverse strand.
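Because the two strands share one matrix, downstream code usually needs to split the columns back apart. The sketch below shows one way to do that on a small made-up matrix. It assumes the matrix has an even number of columns and no extra metadata columns; the variable names are illustrative and not part of the package.

```r
# Toy stand-in for the returned matrix: 2 sites, 4 positions per strand.
# The values are made up; real output would come from get_sites_counts().
count_matrix <- matrix(1:16, nrow = 2, ncol = 8)

# Split the columns into the forward-strand and reverse-strand halves.
half <- ncol(count_matrix) / 2
fwd_counts <- count_matrix[, seq_len(half), drop = FALSE]
rev_counts <- count_matrix[, half + seq_len(half), drop = FALSE]

# Total counts per site on each strand.
fwd_total <- rowSums(fwd_counts)
rev_total <- rowSums(rev_counts)
```

Using drop = FALSE keeps both halves as matrices even when only one site is present.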
if (FALSE) {
  # Extract ATAC-seq count matrices around candidate sites
  count_matrix <- get_sites_counts(sites,
                                   genomecount_dir = 'processed_data',
                                   genomecount_name = 'K562.ATAC')
}
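Since the function depends on two external executables, it can help to confirm they are reachable before running it. The following is a minimal sketch using base R's Sys.which(); check_tools() is a hypothetical helper, not part of the package, and the default tool names are taken from the usage above.

```r
# Hypothetical helper (not part of the package): report which executables
# can be located, either on the PATH or at an explicit path.
check_tools <- function(paths) {
  found <- unname(Sys.which(paths))  # "" for anything that cannot be located
  ok <- found != ""
  names(ok) <- paths
  ok
}

tools_ok <- check_tools(c("bwtool", "bedGraphToBigWig"))
if (!all(tools_ok)) {
  message("Missing executables: ",
          paste(names(tools_ok)[!tools_ok], collapse = ", "))
}
```

If a tool is installed outside the PATH, its full path can be passed through bwtool_path or bedGraphToBigWig_path instead.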