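bcftools +count XXX.vcf.gz # prints the number of samples, SNPs, indels, etc. [runs very slowly when the VCF is large] OK, let us use vcftools. My first requirement is to extract a subset of samples from a VCF: vcftools --vcf in.vcf --recode --recode-INFO-all --stdout --keep id.txt > out.vcf where: --vcf is the VCF file name ## if the file is a compressed vcf.gz...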
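To make the effect of --keep concrete, here is a minimal Python sketch of sample subsetting on a tab-separated VCF. It is an illustration only, not how vcftools is implemented; the function name `keep_samples` and the toy records are assumptions made for this example, and the sketch does not touch the INFO column.

```python
def keep_samples(vcf_lines, keep_ids):
    """Keep the 9 fixed VCF columns plus the sample columns named in keep_ids.

    Hypothetical helper for illustration; assumes a well-formed VCF in which
    the #CHROM header line comes before any data line.
    """
    keep = set(keep_ids)
    cols = None
    out = []
    for line in vcf_lines:
        if line.startswith("##"):
            out.append(line)  # meta-information lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed; sample columns start at index 9.
            cols = list(range(9)) + [
                i for i, name in enumerate(fields) if i >= 9 and name in keep
            ]
        out.append("\t".join(fields[i] for i in cols))
    return out


# Toy input (made-up records, three samples).
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tG\tA\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1",
]
subset = keep_samples(vcf, ["S1", "S3"])
print("\n".join(subset))
```

Here the list `["S1", "S3"]` plays the role of id.txt, which holds one sample ID per line.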
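--non-ref-af, --non-ref-ac... keep sites based on the frequency or count of their ALT (non-reference) alleles. --mac INT, --max-mac: keep sites whose Minor Allele Count is at least INT (or, for --max-mac, at most the given value). --min-alleles 2 --max-alleles 2: keep only biallelic sites, i.e. sites with exactly two alleles (REF plus one ALT). Commonly used. Filtering by GENOTYPE values: --min-meanDP, --max-meanDP filter by mean coverage depth. --min-meanDP 3 -...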
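As a rough illustration of what --min-alleles 2 --max-alleles 2 selects, the following Python sketch counts the alleles listed at a site from its REF and ALT columns. The helper names are assumptions made for this example, and treating an ALT of "." as "no alternate allele" is a convention of the sketch.

```python
def n_alleles(alt):
    """Number of alleles at a site: REF plus each comma-separated ALT.

    Hypothetical helper; an ALT of "." is treated as no alternate allele.
    """
    if alt in (".", ""):
        return 1
    return 1 + len(alt.split(","))


def passes_allele_filter(alt, min_alleles=2, max_alleles=2):
    """True if the allele count lies in [min_alleles, max_alleles]."""
    return min_alleles <= n_alleles(alt) <= max_alleles


print(passes_allele_filter("A"))    # one ALT: biallelic site
print(passes_allele_filter("A,T"))  # two ALTs: multiallelic site
```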
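--hwe: section 2.5, on computing the Hardy-Weinberg p-value, explained how the p-value is obtained; this option filters on that p-value, and sites whose p-value is below the threshold are removed. --max-missing: see the earlier example; --max-missing-count: remove a site if the number of samples with missing data exceeds the threshold. --phased: remove a site if it contains any unphased genotype. --minQ: filter on the QUAL column of the VCF file, for example vcftools --gzvcf ...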
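The --max-missing-count rule described above can be sketched in a few lines of Python. This follows the description given here (count the samples whose genotype is missing, then compare against a threshold); the function names are assumptions, and counting a genotype as missing when any of its alleles is "." is a choice made for this sketch that may differ from how vcftools counts internally.

```python
import re


def missing_count(genotypes):
    """Count sample fields whose GT contains a missing allele ('.')."""
    n = 0
    for field in genotypes:
        gt = field.split(":")[0]  # GT is the first FORMAT subfield
        if "." in re.split(r"[/|]", gt):
            n += 1
    return n


def passes_max_missing_count(genotypes, max_missing):
    """Keep the site only if at most max_missing samples are missing."""
    return missing_count(genotypes) <= max_missing


site = ["0/0", "./.", "0|1", ".:5"]
print(missing_count(site), passes_max_missing_count(site, 1))
```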
This option will generate a histogram file of the length of all indels (including SNPs). It shows both the count and the percentage of all indels for indel lengths that occur at least once in the input file. SNPs are considered indels with length zero. The output file has the suffix “....
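The histogram described above can be approximated with a short Python sketch. It assumes one ALT allele per record and takes the length as len(ALT) - len(REF), so deletions come out negative and SNPs get length zero; the signed convention and the function name are assumptions of this sketch, not a statement about the layout of the vcftools output file.

```python
from collections import Counter


def indel_length_hist(variants):
    """Return (length, count, percent) rows for every length seen at least once.

    variants: iterable of (ref, alt) pairs with a single ALT allele each.
    """
    counts = Counter(len(alt) - len(ref) for ref, alt in variants)
    total = sum(counts.values())
    return [(length, n, 100.0 * n / total) for length, n in sorted(counts.items())]


# Toy input: two SNPs, one 1-bp insertion, one 2-bp deletion.
rows = indel_length_hist([("G", "A"), ("G", "GA"), ("GAT", "G"), ("C", "T")])
for length, n, pct in rows:
    print(length, n, pct)
```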
Include only sites with Minor Allele Count greater than or equal to the "--mac" value and less than or equal to the "--max-mac" value. One of these options may be used without the other. Allele count is simply the number of times that allele appears over all individuals at that site...
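A small Python sketch may help make the allele-count definition concrete. It assumes a biallelic site with GT alleles coded 0 and 1, skips missing alleles, and takes the minor allele count as the smaller of the two counts; the function names are hypothetical and this is not the vcftools implementation.

```python
import re
from collections import Counter


def allele_counts(genotypes):
    """Count how often each allele index appears over all individuals."""
    counts = Counter()
    for field in genotypes:
        for allele in re.split(r"[/|]", field.split(":")[0]):
            if allele != ".":  # missing alleles are not counted
                counts[allele] += 1
    return counts


def minor_allele_count(genotypes):
    """Smaller of the REF ('0') and ALT ('1') counts; biallelic sites only."""
    counts = allele_counts(genotypes)
    return min(counts.get("0", 0), counts.get("1", 0))


def passes_mac(genotypes, mac=None, max_mac=None):
    """Apply either bound on its own, or both together."""
    m = minor_allele_count(genotypes)
    return (mac is None or m >= mac) and (max_mac is None or m <= max_mac)


gts = ["0/0", "0/1", "0/0", "./."]
print(minor_allele_count(gts), passes_mac(gts, mac=2))
```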
The --counts option outputs a similar file with the suffix '.frq.count', that contains the raw allele counts at each site. The --freq2 and --counts2 options are used to suppress allele information in the output file. In this case, the order of the freqs/counts depends on the ...
Added experimental --hapcount function to determine the number of unique haplotypes in user-defined bins. Addition of "any" options to filter by frequency or count of any alternate allele (instead of requiring all of them to pass). Improvements to temporary file handling for all LD functions ...
2 --max-alleles 2 \ --remove-indels 2>vcftools.log| gzip - >myresult/nohup1.vcf.gz & ...
If you think of a theoretical scenario where two populations have fixed the derived allele 1, but in one population that particular position was not called, then you're going to set it to 0. You will then count this SNP as perfectly differentiated between the two pops when it's not. ...
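The scenario above can be reproduced with a toy Python example. Both populations are fixed for allele 1, but population B has no calls at the site; the `missing_as_ref` switch is a hypothetical parameter of this sketch that mimics recoding missing genotypes as 0.

```python
def alt_freq(genotypes, missing_as_ref=False):
    """ALT allele frequency at one site; None when no allele was called."""
    alt = total = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                if not missing_as_ref:
                    continue  # leave missing alleles out of the frequency
                allele = "0"  # the problematic recoding: missing becomes REF
            total += 1
            alt += allele == "1"
    return alt / total if total else None


pop_a = ["1/1", "1/1"]  # called, fixed for the derived allele
pop_b = ["./.", "./."]  # also fixed in reality, but not called here

print(alt_freq(pop_a), alt_freq(pop_b, missing_as_ref=True))  # looks fully differentiated
print(alt_freq(pop_a), alt_freq(pop_b))                       # no data for pop B
```

With the recoding, the two frequencies are 1.0 and 0.0, which reads as a fixed difference; treating the calls as missing instead leaves population B without an estimate at this site.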
2. Calculate AC (Allele Count) values with "vcftools --counts" on the first VCF sample file provided in the previous tutorial.

herong$ more sample.vcf
...
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMP001 SAMP002
20 1291018 rs11449 G A . PASS . GT 0/0 0/1
...
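For the record shown above (REF G, ALT A, genotypes 0/0 and 0/1), the allele counts can be worked out by hand or with a short Python sketch. The function name and its return shape are assumptions made for this example, not the layout of the file that vcftools writes.

```python
def count_alleles(ref, alt, genotypes):
    """Return (number of called chromosomes, {allele: count}) for one site."""
    alleles = [ref] + alt.split(",")
    counts = {allele: 0 for allele in alleles}
    n_chr = 0
    for field in genotypes:
        for idx in field.split(":")[0].replace("|", "/").split("/"):
            if idx == ".":
                continue  # skip missing alleles
            counts[alleles[int(idx)]] += 1
            n_chr += 1
    return n_chr, counts


# SAMP001 = 0/0, SAMP002 = 0/1 from the sample record above.
n_chr, counts = count_alleles("G", "A", ["0/0", "0/1"])
print(n_chr, counts)
```

Across the two diploid samples there are four called chromosomes: G appears three times and A once.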