Lilac

Introduction

See the Lilac Google doc

Configuration

To install, download the latest compiled jar file from the download links.

Resource Files

Lilac requires the following resource files:

  • Ref Genome FASTA
  • HLA allele nucleotide and amino acid definitions, generated by Lilac (see below)

Reference files are available from HMFTools-Resources.

Generate HLA Definitions

Nucleotide and amino acid allele definitions are available here:

Convert these definition files to the reference files used by Lilac with the following command:

java -cp lilac.jar com.hartwig.hmftools.lilac.utils.GenerateReferenceSequences \
   -resource_dir /path_to_definition_files/ \
   -output_dir /output_dir/

Mandatory Arguments

| Argument | Description |
| --- | --- |
| sample | Sample ID |
| ref_genome | Reference genome FASTA file |
| reference_bam | Sample's germline BAM |

NOTE: Lilac handles BAMs which have been sliced for the HLA gene regions.

If a sample's tumor BAM is provided in place of the reference BAM, then Lilac will determine the allele solution from it instead.

Optional Inputs

| Argument | Description |
| --- | --- |
| ref_genome_version | V37 (default), V38 or HG19 (i.e. 37 with 'chr' prefix) |
| tumor_bam | Sample's tumor BAM |
| rna_bam | Sample's RNA BAM, if available |
| gene_copy_number_file | Sample's gene copy number file from Purple |
| somatic_variants_file | Sample's somatic variant VCF file, for annotation of HLA gene variants |
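The ref_genome_version values differ in contig naming, which matters when slicing BAMs for the HLA gene regions on chromosome 6. A minimal Python sketch of that mapping follows. The function name is hypothetical, and treating V38 as 'chr'-prefixed is an assumption based on common GRCh38 conventions rather than something stated in the table above.

```python
def hla_contig_name(ref_genome_version: str = "V37") -> str:
    """Return the chromosome 6 contig name for a given ref_genome_version.

    Assumption: V38 references use the 'chr' prefix, as HG19 is
    documented to ("37 with 'chr' prefix").
    """
    version = ref_genome_version.upper()
    if version == "V37":
        return "6"
    if version in ("V38", "HG19"):
        return "chr6"
    raise ValueError(f"unsupported ref_genome_version: {ref_genome_version}")
```

If a sliced BAM uses a different naming scheme from the reference FASTA, the mismatch is worth checking before running Lilac.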

Optional Arguments

| Argument | Default | Description |
| --- | --- | --- |
| min_base_qual | 30 | Min base quality for BAM reads, see documentation for details |
| min_evidence | 2 | Min absolute required fragments for evidence phase |
| min_evidence_factor | 0.00075 | Min relative required fragments for evidence phase, as a fraction of total fragments |
| min_high_qual_evidence_factor | 0.000375 | Min relative required high base-quality fragments for evidence phase, as a fraction of total fragments |
| min_fragments_per_allele | 7 | See documentation for details |
| min_fragments_to_remove_single | 40 | See documentation for details |
| top_score_threshold | 5 | Maximum difference in candidate solution score vs top score, as a percentage of total fragments |
| log_debug | Off (logs at INFO) | Logs in verbose mode |
| debug_phasing | Off | Logs phasing evidence construction |
| expected_alleles | Not applied | List of alleles separated by ';'. These alleles will have their coverage and ranking reported even if not in the winning solution |
| restricted_alleles | Not applied | List of alleles separated by ';'. Restricts evaluation to only these alleles |
| threads | 1 | Number of threads to use for complex evaluation |
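To illustrate how the absolute and relative evidence settings could interact, here is a minimal Python sketch. It assumes the effective threshold is the larger of min_evidence and min_evidence_factor multiplied by the total fragment count; that combination rule and the function name are assumptions for illustration, not taken from the Lilac source.

```python
def evidence_threshold(total_fragments: int,
                       min_evidence: float = 2,
                       min_evidence_factor: float = 0.00075) -> float:
    """Fragments required for evidence, under the assumed 'larger of the two' rule."""
    relative = total_fragments * min_evidence_factor  # fraction of total fragments
    return max(min_evidence, relative)


# Shallow sample: the absolute minimum applies.
print(evidence_threshold(1000))
# Deeper sample: the relative minimum applies.
print(evidence_threshold(10000))
```

Under this assumption and the default values, the relative threshold takes over once total fragments exceed roughly 2,667 (2 / 0.00075).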

Example Usage

java -jar lilac.jar \
   -sample COLO829T \
   -ref_genome /path_to_ref_genome_fasta_file/ \
   -resource_dir /path_to_lilac_resource_files/ \
   -reference_bam /sample_data_path/COLO829R.bam \
   -output_dir /output_dir/

Or with tumor BAM, somatic variants VCF and Purple gene copy number inputs:

java -jar lilac.jar \
   -sample COLO829T \
   -ref_genome /path_to_ref_genome_fasta_file/ \
   -resource_dir /path_to_lilac_resource_files/ \
   -reference_bam /sample_data_path/COLO829R.bam \
   -tumor_bam /sample_data_path/COLO829T.bam \
   -somatic_vcf /sample_data_path/COLO829T.purple.somatic.vcf.gz \
   -gene_copy_number /sample_data_path/COLO829T.purple.cnv.gene.tsv \
   -output_dir /output_dir/
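When Lilac is launched from a pipeline script, building the argument list programmatically avoids shell quoting and line-continuation mistakes. The following is a minimal Python sketch that uses only the flags shown in the examples above. The helper name build_lilac_command and the placeholder paths are illustrative assumptions.

```python
import shlex


def build_lilac_command(jar, sample, ref_genome, resource_dir,
                        reference_bam, output_dir, optional=None):
    """Return the Lilac command as an argument list (hypothetical helper)."""
    cmd = [
        "java", "-jar", jar,
        "-sample", sample,
        "-ref_genome", ref_genome,
        "-resource_dir", resource_dir,
        "-reference_bam", reference_bam,
        "-output_dir", output_dir,
    ]
    # Optional inputs are appended as "-name value" pairs.
    for name, value in (optional or {}).items():
        cmd += [f"-{name}", str(value)]
    return cmd


cmd = build_lilac_command(
    jar="lilac.jar",
    sample="COLO829T",
    ref_genome="/path_to_ref_genome_fasta_file/",
    resource_dir="/path_to_lilac_resource_files/",
    reference_bam="/sample_data_path/COLO829R.bam",
    output_dir="/output_dir/",
    optional={"tumor_bam": "/sample_data_path/COLO829T.bam"},
)
# Print a shell-safe rendering for logging; nothing is executed here.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.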

Output

The following files are written:

Solution Summary

| Field | Description |
| --- | --- |
| Allele | Allele ID |
| RefTotal | Total assigned fragments from reference BAM |
| RefUnique | Fragments uniquely assigned to allele |
| RefShared | Fragments assigned to allele and others in this solution |
| RefWild | Fragments matched to a wildcard allele |
| TumorCopyNumber | Copy number from Tumor/Ref fragment ratio and Purple copy number |
| TumorTotal | As above for tumor BAM |
| TumorUnique | As above for tumor BAM |
| TumorShared | As above for tumor BAM |
| TumorWild | As above for tumor BAM |
| RnaTotal | As above for RNA BAM |
| RnaUnique | As above for RNA BAM |
| RnaShared | As above for RNA BAM |
| RnaWild | As above for RNA BAM |
| SomaticMissense | Matched missense variants |
| SomaticNonsenseOrFrameshift | Matched nonsense or frameshift variants |
| SomaticSplice | Matched splice variants |
| SomaticSynonymous | Matched synonymous variants |
| SomaticInframeIndel | Matched inframe indels |

QC Metrics

| Field | Description |
| --- | --- |
| Status | PASS, otherwise one or more warnings |
| HlaY | true if HLA-Y fragments exceed the HLA-Y threshold (10% of total fragments), otherwise false |
| ScoreMargin | Difference in score to second-top solution |
| NextSolutionAlleles | Allele difference in second-top solution |
| MedianBaseQuality | Median base quality across all coding bases from all fragments |
| DiscardedIndels | Discarded fragments due to unknown INDELs |
| DiscardedIndelMaxFrags | Maximum fragments assigned to any particular unknown INDEL |
| DiscardedAlignmentFragments | |
| A_LowCoverageBases | Number of coding bases with depth below 15; also reported for the B and C genes |
| ATypes | Number of HLA-A alleles |
| BTypes | Number of HLA-B alleles |
| CTypes | Number of HLA-C alleles |
| TotalFragments | Total fragments used to fit candidate solutions |
| FittedFragments | Total fragments used in top solution |
| UnmatchedFragments | Fragments fitted to other solutions |
| UninformativeFragments | Fragments covering only homozygous or wildcard regions |
| HlaYFragments | Fragments assigned to HLA-Y |
| PercentUnique | Percent of unique fragments in solution |
| PercentShared | Percent of shared fragments in solution |
| PercentWildcard | Percent of wildcard-allele-only fragments in solution |
| UnusedAminoAcids | Distinct unmatched amino acids |
| UnusedAminoAcidMaxFrags | Maximum fragments assigned to a distinct unmatched amino acid |
| UnusedHaplotypes | Distinct unmatched haplotypes |
| UnusedHaplotypeMaxFrags | Maximum fragments assigned to a distinct unmatched haplotype |
| SomaticVariantsMatched | Somatic variants supported by solution allele |
| SomaticVariantsUnmatched | Somatic variants not supported by solution allele |
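A downstream script might read the QC metrics and flag samples that need review. The Python sketch below assumes the QC file is a delimited text file with one header row and one data row. The comma delimiter, the example values and the helper names are all illustrative assumptions; only the field names and the 10% HLA-Y threshold come from the table above.

```python
import csv
import io


def read_single_row(text, delimiter=","):
    """Parse a header line plus one data line into a dict (assumed layout)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return next(reader)


def qc_passed(row):
    """True only when Status is exactly PASS; anything else is a warning."""
    return row["Status"] == "PASS"


def hla_y_exceeded(hla_y_fragments, total_fragments, threshold=0.10):
    """True when HLA-Y fragments exceed the given fraction of total fragments."""
    if total_fragments <= 0:
        return False
    return hla_y_fragments / total_fragments > threshold


# Made-up values for illustration only.
example = "Status,TotalFragments,HlaYFragments\nPASS,1000,20\n"
row = read_single_row(example)
print(qc_passed(row))
print(hla_y_exceeded(int(row["HlaYFragments"]), int(row["TotalFragments"])))
```

For a real file, pass its contents to read_single_row and set delimiter to match the file's actual format.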

Additional output files

| File | Description |
| --- | --- |
| SAMPLE_ID.lilac.somatic.vcf.gz | Annotation of HLA gene somatic variants, if a somatic VCF is provided |
| SAMPLE_ID.fragments.csv | Read details for all BAM fragments |
| SAMPLE_ID.candidate.coverage.csv | Coverage for all candidate solutions within X% of the top solution's score |
| SAMPLE_ID.candidate.fragments.csv | Allocation of each fragment to one or more solutions and which alleles they support |
| SAMPLE_ID.HLA-A.aminoacids.txt | Fragment support for each amino acid by HLA gene |
| SAMPLE_ID.HLA-A.nucleotides.txt | Fragment support for each nucleotide by HLA gene |
| SAMPLE_ID.candidates.aminoacids.txt | Fragment support for amino acids in the candidate alleles |
| SAMPLE_ID.candidates.nucleotides.txt | Fragment support for nucleotides in the candidate alleles |
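The candidate coverage file keeps solutions whose score is close to the top score. The Python sketch below shows one way such a filter could work, assuming that X% corresponds to top_score_threshold (a percentage of total fragments, default 5) and that the boundary is inclusive. Both points, the function name and the example scores are assumptions for illustration.

```python
def candidates_within_threshold(scores, total_fragments, top_score_threshold=5.0):
    """Keep solutions whose score is within the allowed margin of the top score.

    scores: dict mapping a solution label to its score (hypothetical input shape).
    """
    top_score = max(scores.values())
    # The margin is a percentage of total fragments, not of the top score.
    max_difference = total_fragments * top_score_threshold / 100.0
    return {
        name: score
        for name, score in scores.items()
        if top_score - score <= max_difference
    }


# Made-up scores: with 1000 total fragments the allowed margin is 50.
kept = candidates_within_threshold(
    {"solution_1": 1000.0, "solution_2": 960.0, "solution_3": 900.0},
    total_fragments=1000,
)
print(sorted(kept))
```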

Version History and Download Links