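Randomly subsample reads from a FASTQ file:
seqtk sample -s100 read1.fq 10000 > sub1.fq
seqtk sample -s100 read2.fq 10000 > sub2.fq
The value after -s is the random seed. For paired-end reads, both files must be sampled with the same seed; otherwise the resulting samples will not pair correctly.
Trim the start/end of reads in a fq/fa file:
seqtk trimfq -b 5 -e 10 in.fa > out...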
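To illustrate why the same seed keeps R1 and R2 paired, here is a minimal sketch of seeded reservoir sampling. It is an illustration only: `reservoir_sample` is a hypothetical helper, Python's `random.Random` is not the RNG seqtk uses, and the sketch assumes both files list their reads in the same order.

```python
import random

def reservoir_sample(records, k, seed):
    """Keep k records uniformly at random (hypothetical helper, not seqtk's code)."""
    rng = random.Random(seed)      # the seed fully determines every random draw
    kept = []                      # list of (position, record) pairs
    for i, rec in enumerate(records):
        if i < k:
            kept.append((i, rec))  # fill the reservoir with the first k records
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                kept[j] = (i, rec) # replace a slot; depends only on i and the RNG
    return kept

def positions(kept):
    return [i for i, _ in kept]

# Two mock "files" with the same reads in the same order.
r1 = [f"read{i}/1" for i in range(1000)]
r2 = [f"read{i}/2" for i in range(1000)]

same_a = reservoir_sample(r1, 10, seed=100)
same_b = reservoir_sample(r2, 10, seed=100)
diff_b = reservoir_sample(r2, 10, seed=7)

print(positions(same_a) == positions(same_b))  # same seed: identical positions
print(positions(same_a) == positions(diff_b))  # different seed: positions generally differ
```

Because the keep-or-replace decisions never look at the record contents, two inputs of equal length sampled with the same seed select the same positions, which is what keeps the mates together.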
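seqtk sample name_1.fq.gz 0.014 > name_new_L1_1.fq
You can also subsample by number of reads. This is recommended for small FASTQ files: requesting a large number of reads uses correspondingly more memory, so for large FASTQ files sampling by fraction is the better choice.
seqtk sample name_1.fq.gz 10000 > name_new_1.fq
The -s option (the seed value) can be used to control whether read1 and read2 are ...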
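The memory difference between the two modes can be sketched as follows. Sampling by fraction can decide on each record as it streams past, so nothing needs to be held in memory. This is a hedged illustration, not seqtk's implementation: `sample_fraction` is a hypothetical name, and the default seed of 11 simply mirrors the default shown in the seqtk usage text.

```python
import random

def sample_fraction(records, frac, seed=11):
    """Yield each record with probability frac (hypothetical streaming sketch)."""
    rng = random.Random(seed)
    for rec in records:
        if rng.random() < frac:
            yield rec              # emitted immediately, never stored

total = 100_000                    # mock record count
kept = list(sample_fraction(range(total), 0.014))
again = list(sample_fraction(range(total), 0.014))

print(len(kept))                   # close to 0.014 * total, not exactly equal
print(kept == again)               # same seed gives the same selection
```

The trade-off is that a fraction yields only approximately the expected number of reads, whereas sampling an exact count has to keep that many records in memory until the input ends.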
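For example, to extract 10000 reads from the raw paired-end FASTQ files, you can use the command below. Here -s is the seed that controls the random selection. Note that R1 and R2 must be sampled with the same seed; only then are the extracted R1 and R2 reads guaranteed to remain paired, otherwise they may fall out of step. The trailing 10000 is the number of reads to extract.
seqtk sample -s100 read1.fq 10000 > sub...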
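$ seqtk sample
Usage: seqtk sample [-2] [-s seed=11] <in.fa> <frac>|<number>   # randomly subsample sequences; usage: seqtk sample <fq/fa> <num>
Options: -s INT   RNG seed [11]   # sets the random seed (default 11)
         -2       2-pass mode: twice as slow but with much reduced memory   # lowers memory use at the cost of speed
...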
Subsample 10000 read pairs from two large paired FASTQ files (remember to use the same random seed to keep pairing):
seqtk sample -s100 read1.fq 10000 > sub1.fq
seqtk sample -s100 read2.fq 10000 > sub2.fq
Trim low-quality bases from both ends using the Phred algorithm:...
Subsample 10000 read pairs from two large paired FASTQ files (remember to use the same random seed to keep pairing); the key point is that this performs random subsampling of sequences:
seqtk sample -s100 read1.fq 10000 > sub1.fq
seqtk sample -s100 read2.fq 10000 > sub2.fq
Trim low-quality bases from both ends using the Phred algorithm:...
        fprintf(stderr, "Usage: seqtk sample [-s seed=11] <in.fa> <frac>|<number>\n\n");
        fprintf(stderr, "Warning: Large memory consumption for large <number>.\n");
        return 1;
    }
    frac = atof(argv[optind+1]);
    if (frac > 1.) num = (uint64_t)(frac + .499), frac = 0.;
    if ...
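The visible part of this C fragment shows how a single positional argument serves as either a fraction or a read count. A small Python mirror of just those two statements may make the rule easier to see. `parse_sample_arg` is a hypothetical name, the initial `num = 0` is an assumption about the surrounding code, and whatever follows the truncated `if ...` is not reproduced.

```python
def parse_sample_arg(arg):
    """Mirror the visible C logic: values above 1 are treated as a read count."""
    frac = float(arg)             # stands in for atof() on well-formed input
    num = 0                       # assumed initial value; not shown in the fragment
    if frac > 1.0:
        num = int(frac + 0.499)   # like the C cast, int() truncates toward zero
        frac = 0.0
    return frac, num

print(parse_sample_arg("0.014"))  # (0.014, 0)   -> sample by fraction
print(parse_sample_arg("10000"))  # (0.0, 10000) -> sample by count
```

Under this rule an argument of exactly 1 does not exceed the threshold, so it stays a fraction rather than becoming a count of one read.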
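For example, to extract 10000 reads from the raw paired-end FASTQ files, you can use the command below. Here -s is the seed that controls the random selection. Note that R1 and R2 must be sampled with the same seed; only then are the extracted R1 and R2 reads guaranteed to remain paired, otherwise they may fall out of step. The trailing 10000 is the number of reads to extract.
seqtk sample -s100 read1.fq 10000 > sub1.fq ...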
{r1_files}
do
    # Find the paired R2 file
    r2_file=$(echo ${r1_file} | sed 's/R1/R2/')
    # Subsample the R1 and R2 files with the same seed
    zcat ${r1_file} | seqtk sample -s${seed} - ${target_size} | gzip > subset_${r1_file}
    zcat ${r2_file} | seqtk sample -s${seed} - ${target_size} | gzip > ...
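After a batch run like the loop above, it can be worth confirming that each subsampled pair is still in step. The following is a rough sketch of such a check, not part of seqtk: `read_ids` and `is_paired` are hypothetical helpers, the mock file names are made up, and the code assumes plain 4-line FASTQ records whose read names may end in /1 or /2.

```python
import gzip
import os
import tempfile

def read_ids(path):
    """Return read names from a FASTQ file, assuming 4 lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    ids = []
    with opener(path, "rt") as fh:
        for line_no, line in enumerate(fh):
            if line_no % 4 == 0:                 # header line of each record
                name = line[1:].split()[0]       # drop '@' and any description
                if name.endswith(("/1", "/2")):
                    name = name[:-2]             # drop the mate suffix
                ids.append(name)
    return ids

def is_paired(r1_path, r2_path):
    """True when both files list the same read names in the same order."""
    return read_ids(r1_path) == read_ids(r2_path)

# Tiny mock files to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    def write_fastq(name, ids):
        path = os.path.join(tmp, name)
        with gzip.open(path, "wt") as fh:
            for read_id in ids:
                fh.write(f"@{read_id}\nACGT\n+\nIIII\n")
        return path

    r1 = write_fastq("sub_R1.fq.gz", ["a/1", "b/1"])
    r2 = write_fastq("sub_R2.fq.gz", ["a/2", "b/2"])
    r2_shifted = write_fastq("bad_R2.fq.gz", ["b/2", "c/2"])
    ok = is_paired(r1, r2)
    bad = is_paired(r1, r2_shifted)

print(ok, bad)  # True False
```

Loading every name into a list is fine for a subsample of a few thousand reads; for very large files a streaming comparison would be the gentler choice.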
1. Install via conda: fast.
2. Usage (most important):
Subsample 10000 read pairs from two large paired FASTQ files (remember to use the same random seed to keep pairing); the key point is that this performs random subsampling of sequences:
seqtk sample -s100 read1.fq 10000 > sub1.fq
seqtk sample -s100 read2.fq 10000 > sub2.fq ...