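This series uses seven sequencing runs (SRR56~62), so read counting produces a separate count matrix for each run. Handling the count files one at a time in downstream analysis is tedious, so the seven count files need to be merged. Because read counting used the same genome index file for every run, the Ensembl IDs in the resulting count files are necessarily identical, which makes it easy to merge the files using the ID in the first column as the join key. The function used...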
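The merge described above can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: each count file is taken to be tab-separated with the gene ID in column 1 and the count in column 2, summary rows are assumed to start with `__` (as in htseq-count output), and the helper names `read_counts`, `merge_counts` and `write_matrix` are hypothetical, not the function the original post goes on to name.

```python
import csv
from pathlib import Path


def read_counts(path):
    """Read a two-column, tab-separated count file into {gene_id: count}."""
    counts = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and summary rows such as __no_feature (assumed format).
            if len(row) < 2 or row[0].startswith("__"):
                continue
            counts[row[0]] = int(row[1])
    return counts


def merge_counts(paths):
    """Merge count files on the gene ID column.

    Returns (sample_names, rows), where each row is [gene_id, count_1, count_2, ...].
    """
    samples = [Path(p).stem for p in paths]
    tables = [read_counts(p) for p in paths]
    # Keep only IDs present in every file (the intersection), in first-file order.
    shared = [g for g in tables[0] if all(g in t for t in tables[1:])]
    rows = [[g] + [t[g] for t in tables] for g in shared]
    return samples, rows


def write_matrix(samples, rows, out_path):
    """Write the merged matrix as a tab-separated file with a header row."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["gene_id"] + samples)
        writer.writerows(rows)
```

Taking the intersection of IDs is a safety net: if all files really share the same ID set, it keeps every gene, and if one file is missing rows, the mismatch shows up as a shorter matrix rather than as misaligned columns.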
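HTSeq: a Python package for processing high-throughput sequencing data. HTSeq provides many classes, so anyone comfortable with Python scripting can write their own data-processing scripts. HTSeq also ships two scripts that process data directly: htseq-qa (data quality assessment) and htseq-count (read counting). Usage: htseq-count [options] <alignment_file> <gff_file> <alignmen...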
• It supports GTF and SAF format annotation
• It supports strand-specific read counting
• It can count reads at feature (e.g. exon) or meta-feature (e.g. gene) level
• Highly flexible in counting multi-mapping and multi-overlapping reads. Such reads can be excluded, fully counted ...
If your RNA-Seq data has not been made with a strand-specific protocol, this causes half of the reads to be lost. Hence, make sure to set the option --stranded=no unless you have strand-specific data! Options -f <format>, --format=<format> Format of the input data. Possible values...
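-s yes/no/reverse: whether the data come from a strand-specific assay. DNA is double-stranded, so the strand a read originated from has to be determined. With no, every read is compared against features on both the sense and the antisense strand. With the default, yes, paired-end data are interpreted so that the first read must lie on the same strand as the feature and the second read on the opposite strand.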
Important: The default for strandedness is yes. If your RNA-Seq data has not been made with a strand-specific protocol, this causes half of the reads to be lost. Hence, make sure to set the option --stranded=no unless you have strand-specific data!
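To make the strandedness warning concrete, here is a minimal sketch that assembles an htseq-count command line with `--stranded` set explicitly instead of relying on the default. The function name `build_htseq_count_cmd` and the file names are hypothetical; the options used (`--format`, `--stranded`, `--type`, `--idattr`) are the long forms of `-f`, `-s`, `-t` and `-i`. The sketch only builds the argument list and does not run anything.

```python
import shlex


def build_htseq_count_cmd(alignment_file, gff_file, stranded="no",
                          fmt="bam", feature_type="exon", id_attr="gene_id"):
    """Return an htseq-count argument list with strandedness spelled out."""
    if stranded not in {"yes", "no", "reverse"}:
        raise ValueError("stranded must be 'yes', 'no' or 'reverse'")
    return [
        "htseq-count",
        f"--format={fmt}",
        f"--stranded={stranded}",   # explicit, so the default 'yes' never applies silently
        f"--type={feature_type}",
        f"--idattr={id_attr}",
        alignment_file,
        gff_file,
    ]


# Hypothetical file names, for illustration only.
cmd = build_htseq_count_cmd("sample.bam", "annotation.gtf", stranded="no")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, ...)` with standard output redirected to a count file, since htseq-count writes its table to standard output.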
It supports strand-specific read counting. It can count reads at feature (e.g. exon) or meta-feature (e.g. gene) level. It is highly flexible in counting multi-mapping and multi-overlapping reads: such reads can be excluded, fully counted or fractionally counted (this differs from HTSeq-count, which for multi-mapping...
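The three treatments of multi-mapping reads mentioned above (excluded, fully counted, fractionally counted) can be illustrated with a toy counter. This is a simplified sketch rather than featureCounts' actual implementation: each read is represented only by the list of genes its alignments hit, and the function name `count_reads` and the policy labels are hypothetical.

```python
from collections import defaultdict


def count_reads(read_hits, policy="exclude"):
    """Count reads per gene under one of three multi-mapping policies.

    read_hits: iterable of lists; each inner list holds the gene hit by
               each alignment of a single read (length > 1 = multi-mapping).
    policy:    'exclude'  - drop multi-mapping reads entirely
               'full'     - every alignment contributes 1
               'fraction' - every alignment of a read with n alignments contributes 1/n
    """
    if policy not in {"exclude", "full", "fraction"}:
        raise ValueError("unknown policy: " + policy)
    counts = defaultdict(float)
    for genes in read_hits:
        n = len(genes)
        if n == 0:
            continue  # unaligned read
        if n > 1 and policy == "exclude":
            continue
        weight = 1.0 / n if policy == "fraction" else 1.0
        for gene in genes:
            counts[gene] += weight
    return dict(counts)


# One uniquely mapped read on gene A, one read mapping to both A and B.
hits = [["A"], ["A", "B"]]
for policy in ("exclude", "full", "fraction"):
    print(policy, count_reads(hits, policy))
```

With fractional counting the total over all genes equals the number of aligned reads, whereas full counting inflates the total by one for every extra alignment.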
>>> for p in tsspos:
...     window = HTSeq.GenomicInterval( p.chrom, p.pos - halfwinwidth, p.pos + halfwinwidth, "." )
...     wincvg = numpy.fromiter( coverage[window], dtype='i', count=2*halfwinwidth )
...     if p.strand == "+":
...         profile += wincvg
...     else:
...         ...
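The idea behind the loop above, summing fixed-width coverage windows centred on each TSS into one aggregate profile, can be reproduced without HTSeq or numpy. The sketch below swaps in plain Python lists for the coverage vector, and its handling of the minus strand (reversing the window so that all windows read 5' to 3') is an assumption, because the corresponding branch is cut off in the snippet. The names `tss_profile`, `coverage`, and the tuple layout of `tss_positions` are hypothetical.

```python
def tss_profile(coverage, tss_positions, halfwinwidth):
    """Sum coverage in windows of width 2*halfwinwidth around each TSS.

    coverage:      dict mapping chromosome name -> list of per-base coverage values
    tss_positions: iterable of (chrom, pos, strand) tuples, strand being '+' or '-'
    """
    width = 2 * halfwinwidth
    profile = [0] * width
    for chrom, pos, strand in tss_positions:
        start, end = pos - halfwinwidth, pos + halfwinwidth
        chrom_cov = coverage[chrom]
        if start < 0 or end > len(chrom_cov):
            continue  # window runs off the chromosome; skip it
        window = chrom_cov[start:end]
        if strand == "-":
            # Assumption: flip minus-strand windows so every window is oriented 5' -> 3'.
            window = window[::-1]
        for i, value in enumerate(window):
            profile[i] += value
    return profile


# Tiny made-up coverage vector, for illustration only.
cov = {"chr1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
print(tss_profile(cov, [("chr1", 5, "+"), ("chr1", 4, "-")], halfwinwidth=2))
```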
strand - defined as + (forward) or - (reverse).
frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
attribute - A semicolon-separated list ...
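As an illustration of how these fields sit in a record, the following sketch splits one GTF line into its nine tab-separated columns and turns the attribute column into a dictionary. It is a deliberately naive parser: it assumes attribute values contain no semicolons, the function name `parse_gtf_line` is hypothetical, and the example record is made up.

```python
def parse_gtf_line(line):
    """Parse one GTF record into a dict; the attribute column becomes a nested dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF record has 9 tab-separated columns")
    seqname, source, feature, start, end, score, strand, frame, attribute = fields

    attrs = {}
    for item in attribute.split(";"):
        item = item.strip()
        if not item:
            continue  # trailing semicolon leaves an empty piece
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "score": score,
        "strand": strand,                               # '+' (forward) or '-' (reverse)
        "frame": None if frame == "." else int(frame),  # 0, 1 or 2; '.' when not applicable
        "attributes": attrs,
    }


# Made-up record, for illustration only.
record = parse_gtf_line(
    'chr1\texample\tCDS\t100\t200\t.\t-\t1\tgene_id "GENE0001"; gene_name "demo";'
)
print(record["strand"], record["frame"], record["attributes"]["gene_id"])
```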