See the Lilac Google doc
To install, download the latest compiled jar file from the download links.
Lilac requires the following resource files:
- Ref Genome FASTA
- HLA allele nucleotide and amino acid definitions, generated by Lilac (see below)
Reference files are available from HMFTools-Resources.
Nucleotide and amino acid allele definitions are available here:
- https://www.ebi.ac.uk/ipd/imgt/hla/download.html
- ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/
Convert these definition files to the reference files used by Lilac with the following command:
java -cp lilac.jar com.hartwig.hmftools.lilac.utils.GenerateReferenceSequences \
-resource_dir /path_to_definition_files/ \
-output_dir /output_dir/
Argument | Description |
---|---|
sample | Sample ID |
ref_genome | Reference genome fasta file |
reference_bam | Sample's germline BAM |
NOTE: Lilac handles BAMs which have been sliced for the HLA gene regions.
If a sample's tumor BAM is provided in place of the reference BAM, then Lilac will determine the allele solution from it instead.
Argument | Description |
---|---|
ref_genome_version | V37 (default), V38 or HG19 (i.e. V37 with 'chr' prefix) |
tumor_bam | Sample's tumor BAM |
rna_bam | Sample's RNA BAM if available |
gene_copy_number_file | Sample gene copy number file from Purple |
somatic_variants_file | Sample's somatic variant VCF file, for annotation of HLA gene variants |
Argument | Default | Description |
---|---|---|
min_base_qual | 30 | Min base quality for BAM reads, see documentation for details |
min_evidence | 2 | Min absolute required fragments for evidence phase |
min_evidence_factor | 0.00075 | Min relative required fragments for evidence phase, as a fraction of total fragments |
min_high_qual_evidence_factor | 0.000375 | Minimum relative required high base-quality fragments for evidence phase, as a fraction of total fragments |
min_fragments_per_allele | 7 | See documentation for details |
min_fragments_to_remove_single | 40 | See documentation for details |
top_score_threshold | 5 | Maximum difference in candidate solution score vs top score as a percentage of total fragments |
log_debug | Off (logs at INFO) | Logs in verbose mode |
debug_phasing | Off | Logs phasing evidence construction |
expected_alleles | Not applied | List of alleles separated by ';'. These alleles will have their coverage and ranking reported even if not in the winning solution |
restricted_alleles | Not applied | List of alleles separated by ';'. Restrict evaluation to only these alleles. |
threads | 1 | Number of threads to use for complex evaluation |
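As a rough illustration of how the absolute and relative evidence thresholds relate, the sketch below computes the relative threshold from `min_evidence_factor` and compares it with `min_evidence`. Taking the larger of the two is an assumption made for illustration, since the table only states that one is absolute and the other a fraction of total fragments. The helper name `evidence_threshold` and the fragment counts are hypothetical.

```shell
# Hypothetical helper, not part of Lilac: evidence threshold for a total fragment count.
# Assumption: the effective threshold is the larger of the absolute minimum
# (min_evidence) and the relative minimum (min_evidence_factor * total fragments).
evidence_threshold() {
    total_fragments=$1
    min_evidence=${2:-2}               # default from the table above
    min_evidence_factor=${3:-0.00075}  # default from the table above
    awk -v t="$total_fragments" -v m="$min_evidence" -v f="$min_evidence_factor" \
        'BEGIN { rel = t * f; print (rel > m ? rel : m) }'
}

evidence_threshold 1000    # relative threshold 0.75 is below the absolute minimum, prints 2
evidence_threshold 20000   # relative threshold exceeds the absolute minimum, prints 15
```

Under this assumption, the absolute minimum only matters for samples with fewer than about 2,667 total fragments (2 / 0.00075).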
java -jar lilac.jar \
-sample COLO829T \
-ref_genome /path_to_ref_genome_fasta_file/ \
-resource_dir /path_to_lilac_resource_files/ \
-reference_bam /sample_data_path/COLO829R.bam \
-output_dir /output_dir/
Or with tumor BAM, somatic variants VCF and Purple gene copy number inputs:
java -jar lilac.jar \
-sample COLO829T \
-ref_genome /path_to_ref_genome_fasta_file/ \
-resource_dir /path_to_lilac_resource_files/ \
-reference_bam /sample_data_path/COLO829R.bam \
-tumor_bam /sample_data_path/COLO829T.bam \
-somatic_variants_file /sample_data_path/COLO829T.purple.somatic.vcf.gz \
-gene_copy_number_file /sample_data_path/COLO829T.purple.cnv.gene.tsv \
-output_dir /output_dir/
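When scripting runs across many samples, the optional inputs can be appended only when they are available. The sketch below is a hypothetical wrapper (`build_lilac_cmd` is not part of Lilac) that assembles the command from the arguments documented above and prints it as a dry run rather than executing it. The paths are placeholders.

```shell
# Hypothetical wrapper: assemble the Lilac command line, adding -tumor_bam only when given.
# Usage: build_lilac_cmd SAMPLE REF_GENOME RESOURCE_DIR REFERENCE_BAM OUTPUT_DIR [TUMOR_BAM]
build_lilac_cmd() {
    sample=$1; ref_genome=$2; resource_dir=$3; reference_bam=$4; output_dir=$5
    tumor_bam=${6:-}

    # Reuse the function's positional parameters as the argument list.
    set -- java -jar lilac.jar \
        -sample "$sample" \
        -ref_genome "$ref_genome" \
        -resource_dir "$resource_dir" \
        -reference_bam "$reference_bam" \
        -output_dir "$output_dir"

    if [ -n "$tumor_bam" ]; then
        set -- "$@" -tumor_bam "$tumor_bam"
    fi

    # Dry run: print the command instead of executing it.
    printf '%s\n' "$*"
}

build_lilac_cmd COLO829T /path/ref.fasta /path/resources /path/COLO829R.bam /path/out
build_lilac_cmd COLO829T /path/ref.fasta /path/resources /path/COLO829R.bam /path/out /path/COLO829T.bam
```

To execute rather than print, the last `printf` line could be replaced with `"$@"`, which keeps each argument intact even if a path contains spaces.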
The following files are written:
Field | Description |
---|---|
Allele | Allele ID |
RefTotal | Total assigned fragments from reference BAM |
RefUnique | Fragments uniquely assigned to allele |
RefShared | Fragments assigned to allele and others in this solution |
RefWild | Fragments matched to a wildcard allele |
TumorCopyNumber | Copy number from Tumor/Ref fragment ratio and Purple copy number |
TumorTotal | As above for tumor BAM |
TumorUnique | As above for tumor BAM |
TumorShared | As above for tumor BAM |
TumorWild | As above for tumor BAM |
RnaTotal | As above for RNA BAM |
RnaUnique | As above for RNA BAM |
RnaShared | As above for RNA BAM |
RnaWild | As above for RNA BAM |
SomaticMissense | Matched missense variants |
SomaticNonsenseOrFrameshift | Matched nonsense or frameshift variants |
SomaticSplice | Matched splice variants |
SomaticSynonymous | Matched synonymous variants |
SomaticInframeIndel | Matched inframe indels |
Field | Description |
---|---|
Status | PASS, otherwise 1 or more warnings |
HlaY | true/false, indicating whether HLA-Y fragments exceed the threshold (10% of total fragments) |
ScoreMargin | Difference in score to second-top solution |
NextSolutionAlleles | Allele difference in second-top solution |
MedianBaseQuality | Median base quality across all coding bases from all fragments |
DiscardedIndels | Discarded fragments due to unknown INDELs |
DiscardedIndelMaxFrags | Maximum fragments assigned to any particular unknown INDEL |
DiscardedAlignmentFragments | |
A_LowCoverageBases | Number of HLA-A coding bases with depth below 15; equivalent fields are reported for HLA-B and HLA-C |
ATypes | Number of HLA-A alleles |
BTypes | Number of HLA-B alleles |
CTypes | Number of HLA-C alleles |
TotalFragments | Total fragments used to fit candidate solutions |
FittedFragments | Total fragments used in top solution |
UnmatchedFragments | Fragments fitted to other solutions |
UninformativeFragments | Fragments covering only homozygous or wildcard regions |
HlaYFragments | Fragments assigned to HLA-Y |
PercentUnique | Percent of unique fragments in solution |
PercentShared | Percent of shared fragments in solution |
PercentWildcard | Percent of wildcard-allele-only fragments in solution |
UnusedAminoAcids | Distinct unmatched amino acids |
UnusedAminoAcidMaxFrags | Maximum fragments assigned to a distinct unmatched amino acid |
UnusedHaplotypes | Distinct unmatched haplotypes |
UnusedHaplotypeMaxFrags | Maximum fragments assigned to a distinct unmatched haplotype |
SomaticVariantsMatched | Somatic variants supported by solution allele |
SomaticVariantsUnmatched | Somatic variants not supported by solution allele |
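The QC fields above lend themselves to a simple automated check after each run. The sketch below assumes the QC output is comma-separated with a single header row and a single data row, and that field values contain no commas; these are assumptions, not something this document confirms. The sample values are made up for illustration and `qc_field` is a hypothetical helper.

```shell
# Made-up QC content for illustration only; the real file layout is an assumption.
qc_data='Status,ScoreMargin,TotalFragments,FittedFragments
PASS,12.5,10000,9500'

# Hypothetical helper: read QC data on stdin and print the value of the named column
# from the first data row. Prints nothing if the column is absent.
qc_field() {
    awk -F',' -v name="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
        NR == 2 && col { print $col }
    '
}

status=$(printf '%s\n' "$qc_data" | qc_field Status)
if [ "$status" = "PASS" ]; then
    echo "QC passed"
else
    echo "QC warnings: $status"
fi
```

Looking columns up by header name rather than by position keeps the check working if fields are added or reordered.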
Additional output files
File | Description |
---|---|
SAMPLE_ID.lilac.somatic.vcf.gz | Annotation of HLA gene somatic variants if somatic VCF provided |
SAMPLE_ID.fragments.csv | Read details for all BAM fragments |
SAMPLE_ID.candidate.coverage.csv | Coverage for all candidate solutions within X% of the top solution's score |
SAMPLE_ID.candidate.fragments.csv | Allocation of each fragment to one or more solutions and which alleles they support |
SAMPLE_ID.HLA-A.aminoacids.txt | Fragment support for each amino acid by HLA gene |
SAMPLE_ID.HLA-A.nucleotides.txt | Fragment support for each nucleotide by HLA gene |
SAMPLE_ID.candidates.aminoacids.txt | Fragment support for amino acids in the candidate alleles |
SAMPLE_ID.candidates.nucleotides.txt | Fragment support for nucleotides in the candidate alleles |