Set the random seed (default 11) so that results are reproducible: seqkit sample A1_1.fq.gz -n 20000 -s 11 -o A1_1_20000...
sample    sample sequences by number or proportion
sana      sanitize broken single line FASTQ files
scat      real time recursive concatenation and streaming of fastx files
seq       transform sequences (extract ID, filter by length, remove gaps...)
shuffle   shuffle sequences
sliding   extract subsequences in sliding win...
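seqtk subseq -l 40 B1_NR100nl.fasta name.list > out.fa
5. Randomly sample sequences, by proportion or by number. -s sets the random seed so the result can be reproduced.
seqtk sample -s 10 test.fq 0.4 # proportion
seqtk sample -s 10 test.fq 100 # number
6. Rename: sequence IDs become 1 to n...
seqtk rename in.fa <prefix> > out.fa
7. Convert .fastq to fast...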
(start:end)
rename    rename duplicated IDs
replace   replace name/sequence by regular expression
restart   reset start position for circular genome
rmdup     remove duplicated sequences by id/name/sequence
sample    sample sequences by number or proportion
seq       transform sequences (reverse, complement, extract ID....
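5. Randomly sample sequences, by proportion or by number. -s sets the random seed so the result can be reproduced.
seqtk sample -s 10 test.fq 0.4 # proportion
seqtk sample -s 10 test.fq 100 # number
6. Rename: sequence IDs become 1 to n...
seqtk rename in.fa <prefix> > out.fa
7. Convert .fastq to fasta; compressed input is supported
seqtk...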
| seqkit head -n 1000 -o sample.fa.gz
# Set the random seed so the result is reproducible: -s 11
zcat hairpin.fa.gz \
  | seqkit sample -p 0.1 -s 11 | head
# Shuffle the sequences after sampling: seqkit shuffle
zcat hairpin.fa.gz \
  | seqkit sample -p 0.1 \
  | seqkit shuffle -o sample.fa.gz ...
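seqkit grep -s -d -i -p TTSAA # use degenerate bases: S stands for C or G
seqkit grep -s -R 1:30 -i -r -p GCTGG # restrict matching to a region
5. Motif location
An extension of grep: it can match on both the forward and reverse strands and reports the positions of the matches.
seqkit locate [flags]
Parameters: -d, --degenerate ...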
sample by number (result may not exactly match)
-p, --proportion float   sample by proportion
-s, --rand-seed int      rand seed for shuffle (default 11)
-2, --two-pass           2-pass mode, lower memory
Example: randomly sample sequences
seqkit sample -n 10000 -s 11 test1_1.fq -o sample.fq ...
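The help text above notes that sampling by number "may not exactly match" the requested count. As a rough illustration of why reading the file twice can help, here is a minimal sketch of selection sampling with plain awk: the first pass counts the records, and the second pass keeps exactly the requested number. This is a generic technique, not a description of how seqkit implements `-n` or `--two-pass`. The file names, the demo reads, and the variable names (`demo_dir`, `total`, `need`, `left`) are assumptions made up for the example.

```shell
# Build a small demo FASTQ file with 50 records (hypothetical data, 4 lines per record).
demo_dir=$(mktemp -d)
for i in $(seq 1 50); do
  printf '@r%d\nACGT\n+\nIIII\n' "$i"
done > "$demo_dir/in.fq"

# Pass 1: count the records.
total=$(awk 'END { print NR / 4 }' "$demo_dir/in.fq")

# Pass 2: selection sampling. Each record is kept with probability
# (records still needed) / (records still unread), which yields exactly
# `need` records in total. The fixed seed makes the choice repeatable
# with the same awk implementation.
awk -v seed=11 -v need=10 -v left="$total" '
  BEGIN { srand(seed) }
  NR % 4 == 1 {                      # header line: decide once per record
    keep = (rand() * left < need)
    if (keep) need--
    left--
  }
  keep                               # print all 4 lines of a kept record
' "$demo_dir/in.fq" > "$demo_dir/exact.fq"
```

For real data, `seqkit sample -n` with a fixed `-s` value, as in the example above, remains the practical choice; the sketch only shows the idea behind an exact-count second pass.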
sample    sample sequences by number or proportion
split     split sequences into files by id/seq region/size/parts (mainly for FASTA)
split2    split sequences into files by size/parts (FASTA, PE/SE FASTQ)
Commands for Edit:
concat    concatenate sequences with the same ID from multiple files ...
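Sequence sampling (-n sets the number of sequences to sample, -s sets the random seed, -p sets the sampling proportion, -o sets the output file)
seqkit sample -n 10000 -s 10 test_1.fq -o sample.fq # randomly sample 10000 sequences
seqkit sample -p 0.1 -s 10 test_1.fq -o sample.fq # randomly sample 10% of all sequences
Last edited: 2022.10.12 14:56:49 ...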
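To make the role of the seed concrete, the following is a minimal sketch of proportion-based sampling written with plain awk, so it runs without seqkit or seqtk installed. It keeps each FASTQ record with probability `p` and shows that the same seed gives the same subset on repeated runs. The helper name `sample_fq`, the demo file, and the read names are hypothetical, and the sketch does not claim to reproduce the exact records that `seqkit sample` or `seqtk sample` would pick.

```shell
# Build a small demo FASTQ file with 200 records (hypothetical data, 4 lines per record).
tmpdir=$(mktemp -d)
for i in $(seq 1 200); do
  printf '@read%d\nACGTACGT\n+\nIIIIIIII\n' "$i"
done > "$tmpdir/demo.fq"

# sample_fq SEED PROPORTION FILE
# Keep each record with probability PROPORTION, using SEED for the random generator.
sample_fq() {
  awk -v seed="$1" -v p="$2" '
    BEGIN { srand(seed) }
    NR % 4 == 1 { keep = (rand() < p) }   # decide once per record, on the header line
    keep                                  # print all 4 lines of a kept record
  ' "$3"
}

# Two runs with the same seed select the same records.
sample_fq 11 0.1 "$tmpdir/demo.fq" > "$tmpdir/run1.fq"
sample_fq 11 0.1 "$tmpdir/demo.fq" > "$tmpdir/run2.fq"
cmp -s "$tmpdir/run1.fq" "$tmpdir/run2.fq" && echo "same seed, same sample"
```

Because every record is an independent draw, the number of records kept is only approximately `p` times the total. Paired-end files are usually sampled with the same seed on both files so that mates stay matched.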