Transform Split Values
The Transform Split Values tool extracts and manipulates values from existing sample fields in a VCF and outputs the results to a TSV file. The field to manipulate is chosen via the second positional argument (`format_field`); the operations to apply to it are given as the remaining positional arguments.
Supported operations are the following:
ref
: Extract the first value in an R-number field (the reference value).

alt
: Extract the second value in an R-number field (the alt value).

sum
: Calculate the sum of all the numbers in the field.

min
: Calculate the minimum of all the numbers in the field.

max
: Calculate the maximum of all the numbers in the field.

mean
: Calculate the mean of all the numbers in the field.

median
: Calculate the median of all the numbers in the field.

stdev
: Calculate the standard deviation of all the numbers in the field.

ref_ratio
: The first value in an R-number field divided by the sum of all the numbers (the reference ratio).

alt_ratio
: The second value in an R-number field divided by the sum of all the numbers (the alt ratio).
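The operations listed above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the names `OPERATIONS` and `apply_operation` are hypothetical, the field is assumed to arrive as a comma-separated string such as `10,30`, and `stdev` is assumed here to mean the sample standard deviation.

```python
import statistics

# Hypothetical lookup table (not part of the tool): maps each documented
# operation name to a calculation over the numbers of one split field.
OPERATIONS = {
    "ref": lambda v: v[0],                 # first value of an R-number field
    "alt": lambda v: v[1],                 # second value of an R-number field
    "sum": sum,
    "min": min,
    "max": max,
    "mean": statistics.mean,
    "median": statistics.median,
    "stdev": statistics.stdev,             # assumption: sample standard deviation
    "ref_ratio": lambda v: v[0] / sum(v),  # raises ZeroDivisionError if the sum is 0
    "alt_ratio": lambda v: v[1] / sum(v),
}


def apply_operation(raw_value, operation):
    """Split a comma-separated field value and apply one named operation."""
    values = [float(x) for x in raw_value.split(",")]
    return OPERATIONS[operation](values)


# Example: a two-value field with 10 reference and 30 alt observations.
for name in ("ref", "alt", "sum", "ref_ratio", "alt_ratio"):
    print(name, apply_operation("10,30", name))
```

How the tool itself treats missing values or a sum of zero is not described here, so the sketch simply lets Python raise an error in those cases.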
If your VCF is a multi-sample VCF, you have to pick one of the samples in your VCF by setting the `--sample-name` option. This is the sample whose field values will be extracted and written to the output.
By default, the output TSV will be written to a `.tsv` file next to your input VCF file. You can set a different output file using the `--output-tsv` parameter.
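As a rough illustration of how these options combine with the positional arguments, the following Python sketch assembles a command line without running it. The helper `build_command` is hypothetical, and the file names, the sample name `TUMOR`, and the field `AD` are placeholder assumptions rather than values taken from this documentation.

```python
import shlex


def build_command(input_vcf, format_field, operations,
                  sample_name=None, output_tsv=None):
    """Assemble a transform-split-values argument list (hypothetical helper)."""
    command = ["transform-split-values"]
    # Options go first so they cannot be mistaken for extra operations.
    if sample_name is not None:
        command += ["--sample-name", sample_name]
    if output_tsv is not None:
        command += ["--output-tsv", output_tsv]
    # Positional arguments: VCF, format field, then one or more operations.
    command += [input_vcf, format_field, *operations]
    return command


command = build_command(
    "input.vcf", "AD", ["ref", "alt_ratio"],
    sample_name="TUMOR", output_tsv="ad_values.tsv",
)
print(shlex.join(command))
# prints: transform-split-values --sample-name TUMOR --output-tsv ad_values.tsv input.vcf AD ref alt_ratio
```

The resulting list could be passed to `subprocess.run` once the tool is installed; omitting `sample_name` and `output_tsv` leaves the defaults described above in effect.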
Usage
usage: transform-split-values [-h] [-t INPUT_TSV] [-s SAMPLE_NAME]
[-o OUTPUT_TSV]
input_vcf format_field
{ref,alt,sum,min,max,mean,median,stdev,ref_ratio,alt_ratio}
[{ref,alt,sum,min,max,mean,median,stdev,ref_ratio,alt_ratio} ...]
A tool that extracts and manipulates values from existing sample fields and
outputs the results to a TSV file.
positional arguments:
input_vcf The VCF file from which to extract information. Multi-
allelic sites must be decomposed.
format_field The multi-value format field to report.
{ref,alt,sum,min,max,mean,median,stdev,ref_ratio,alt_ratio}
The operation to execute on the chosen field. ref:
Extract the first value in a R-number field (the
reference value). alt: Extract the second value in a
R-number field (the alt value). sum: Calculate the sum
of all the numbers in the field. min: Calculate the
minimum of all the numbers in the field. max:
Calculate the maximum of all the numbers in the field.
mean: Calculate the mean of all the numbers in the
field. median: Calculate the median of all the numbers
in the field. stdev: Calculate the standard deviation
of all the numbers in the field. ref_ratio: The first
value in a R-number field divided by the sum of all
the numbers (the reference ratio). alt_ratio: The
second value in a R-number field divided by the sum of
all the numbers (the alt ratio).
optional arguments:
-h, --help show this help message and exit
-t INPUT_TSV, --input_tsv INPUT_TSV
A TSV report file to add information to. Required
columns are CHROM, POS, REF, ALT. These are used to
match each TSV entry to a VCF entry. Must be tab-
delimited.
-s SAMPLE_NAME, --sample-name SAMPLE_NAME
If the input_vcf contains multiple samples, the name
of the sample to extract information for.
-o OUTPUT_TSV, --output-tsv OUTPUT_TSV
Path to write the output report TSV file. If not
provided, the output TSV will be written next to the
input VCF with a .tsv file ending.
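The `--input_tsv` option matches each TSV entry to a VCF entry using the CHROM, POS, REF, and ALT columns. The sketch below shows one way such a match could work in Python; it is not the tool's code. The function `add_column`, the new column name `AD_alt_ratio`, and the `NA` placeholder for unmatched rows are all assumptions made for illustration.

```python
import csv
import io

# Columns the help text lists as required for matching TSV rows to VCF entries.
KEY_COLUMNS = ("CHROM", "POS", "REF", "ALT")


def add_column(tsv_text, values_by_variant, column_name):
    """Append one column to a tab-delimited report, keyed by variant.

    values_by_variant maps (CHROM, POS, REF, ALT) string tuples to the value
    computed for that variant. Rows without a match get "NA" (an assumption).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + [column_name]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        key = tuple(row[column] for column in KEY_COLUMNS)
        row[column_name] = values_by_variant.get(key, "NA")
        writer.writerow(row)
    return out.getvalue()


report = "CHROM\tPOS\tREF\tALT\nchr1\t100\tA\tT\nchr2\t200\tG\tC\n"
computed = {("chr1", "100", "A", "T"): "0.75"}
result = add_column(report, computed, "AD_alt_ratio")
print(result)
```

Because the key includes both REF and ALT, each row identifies a single allele, which fits the requirement that multi-allelic sites be decomposed before running the tool.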