- update 2025 footer
- added support for parquet
- added quick datatype conversion when loading sumstats
- fixed N datatype to np.int64 instead of pd.Int64 for ldsc
- updated requirement for matplotlib version
- organized examples
- added plot pipcs (under development)
- gl.plot_miami2: fixed error for titles_pad
- to_format(): added parquet format.
- to_format(): added separate output for each chromosome
- to_format(): updated log
- plot_mqq(mode="r"): added region_legend_marker=True
- plot_mqq(mode="r"): fixed legend marker size
- plot_mqq(mode="r"): fixed coloring for single reference variant
- plot_mqq(): added anno_xshift=0
- plot_stacked_mqq(): fixed error when saving as pdf
- fixed size error for variants with very low MLOG10P when plotting region plots.
- fixed error when lead variant was not available in stacked regional plot.
- fixed legend title error in trumpet plot.
- updated functions for automatically extracting kwargs.
- added marker to indicate reference variant to LD colorbar in regional plots
- fixed errors when plotting y ticks in plot_mqq()
- added additional fix_id() in harmonization workflow (credit to @joshchiou)
- fixed errors due to liftover version change
- restructured compare_effect()
- fixed error when getting hapmap3 variants
- added plot_gwheatmap()
- added genename annotation in compare_effect()
- fixed annotation error in compare_effect()
- fixed is_q error in compare_effect()
- fixed error for calculating arm length for annotation.
- removed some outdated code
- fixed the suffix for yaml in to_format()
- implemented .flip_snpid and .strip_snpid
- implemented support for chm13 (13) in .liftover
- updated reference url list (added chain files)
- implemented chrom_pat and snpid_pat filters for gl.Sumstats()
- fixed bug in .plot_miami2
- updated reference url
- fixed bug for is_q_mc in compare_effect()
- added anno_height for plot_mqq()
- added xtight for plot_mqq()
- fixed bug for xpad in plot_mqq()
- supported baseline model for ldsc outputs
- updated stacked regional plot
- supported user-provided plink path
- fixed a bug in column order for to_format()
- added version requirements for numpy (<2) and matplotlib (<3.9)
- changed python version requirements to >=3.9, <3.11
- updated the version of pysam to v0.22.1
- fixed bug for status code when flipping statistics
- fixed bug when POS > sequence length for check_ref()
- vectorized normalize_allele()
- Added cache to speed up strand inference (credit to @sup3rgiu Mr. Andrea)
- fixed error for .plot_mqq(m="qq")
- Added h5py==3.10.0 to dependencies.
- fast implementation of check_ref and to_format. (credit to @sup3rgiu Mr. Andrea)
- added highlight and pinpoint for plot_trumpet()
- fixed typo (credit to @sup3rgiu Mr. Andrea)
- changed **args to **kwargs (credit to @sup3rgiu Mr. Andrea)
- implemented munge-like filters munge=True for ldsc in gwaslab
- fixed error in region_ref_second
- fixed color issue for regional plots: the color assigned to each variant was the color for the next lower LD r2 category. For example, variants with LD>0.8 were colored with the color for 0.8>LD>0.6.
- integrated LDSC (partitioned h2/h2-cts)
- added wc_correction for get_lead()
- updated log
- integrated LDSC h2/rg
- updated LICENSE from MIT to GPL-3.0 license
- added check_novel_set / check_cis
- added filter_snp/palindromic/indel and filter_hapmap3
- updated log system
- fixed bug when gene name is empty in GTF in regional plot
- fixed error in harmonization when there are no palindromic SNPs or indels to check.
- fixed bug when saving plot as pdf with matplotlib>3.6
- fixed typos
- fixed seaborn version
- replaced array_split
- fixed warnings due to pandas upgrade
- renamed get_flanking to filter_flanking
- fixed error in cbar for regional plot
- restructured log system
- restructured sanity checking
- restructured stats flipping
- updated tutorial and examples
- fixed error in log
- added data consistency check
- added "s","r","n" mode for remove_dup()
- updated fix_id
- added sample data
- added datatype check for rsID and SNPID
- added memory usage check
- added extreme P value check
- removed statsmodels
- updated functions in basic_check()
- added a test example for basic_check()
- fixed fontsize and font_family errors in plot_mqq() for ax4
- added new options for plot_mqq(mode="r"); cbar_fontsize, cbar_font_family, cbar_title
- added new options for plot_mqq(mode="r"); track_n, track_n_offset, track_fontsize_ratio, track_exon_ratio, track_text_offset, track_font_family
- fixed version number
- get_lead will use MLOG10P first instead of P (scaled is deprecated in get_lead)
- added datatype check for fill_data
- added datatype verification
- updated sanity check default values (BETA, OR, HR)
- fix_allele() now prints out allele information
- updated sanity check default values (added tolerance for floats)
- fixed error in loading pickle created by older versions
- fixed bug in flipping OR and HR
- added version information to the start line of each checking function
- updated reference VCF
- updated clump() / to_finemapping() / run_susie_rss() (beta)
- updated Manhattan-like plotting system
- updated datatype for certain statistics
- updated file naming system (beta)
- added stacked mqq plot (beta)
- plot_miami2() can iteratively call plot_mqq to create miami plot, which supports more functions (beta)
- fixed bug in sanity_check : N = N_CASE + N_CONTROL
- fixed the logic chain for normalization
- added extra log in compare_effect
- fixed a few typos (credit to @gmauro Mr. Gianmauro Cuccuru)
- added clump() for beta testing
- added to_finemapping() for beta testing
- added run_susie_rss() for beta testing
- updated version requirements
- updated random seed range (0,2^32-1).
- fixed error for suggestive_sig_line in miami plot.
- loosened version requirements for python, pandas and matplotlib.
- set scatter_kwargs={"rasterized":True} as default for miami plot
- fixed a bug in compare_effect
- update reference book
- added extra parameters for tabix_index()
- update random variants
- added infer_af()
- updated reference datasets
- supported highlight and pinpoint for multiple sets of loci and variants in .plot_mqq()
- added overwrite option for gl.download_ref()
- update references
- fixed bug in get_lead(): in some rare cases, the last variant was not counted when it was a new lead variant.
- added plot_power() and plot_power_x()
- added support for multiple EFO IDs
- updated built-in formatbook
- fixed regex error in read_ldsc for numbers in the format of 1e-01
- fixed typos
- fixed error in gc calculation for mqq plot when expected_min_mlog10p is not 0 and stratified=True.
- reimplemented and unified the module for saving figures
- added trumpet plot document page
- added CHR range check in .fix_chr() and .plot_mqq() (remove variants with CHR<=0) (#42)
- fixed error in column headers for plot_mqq "b" mode (#40)
- fixed many typos (#41)
- fixed error in plot_mqq logging (warning for genome build)
- fixed error in annotation for miami plot
- added annotation using custom column for miami plot
- updated algorithm for extracting lead variants (using scaled P)
- fixed bugs for miami plot
- added expected_min_mlog10p for qq mode in plot_mqq()
- added trumpet plot
- added support for gwaslab Sumstats object for gl.compare_effect()
- added saving options for gl.compare_effect()
- fixed bug for is_q=False in gl.compare_effect().
- fixed bug for qq mode in plot_mqq()
- LDSC-rg genetic correlation heatmap
- Allele frequency correlation plot
- Miami plot using gl.Sumstats Object pickle files
- Auto-check for VCF chromosome prefix (chr1 or 1)
- fill_data() is now implemented iteratively
- Downloaded files auto detection.
- fixed bug
- added two-reference-variant mode for regional plot
- auto-check for genome build version
- added additional_line and additional_line_color for plot_mqq
- fixed bugs for sig_level_lead
- added highlight_anno_args in plot_mqq()
- added GWAS-SSF style metadata yaml output support
- added region_ref for regional plots
- updated default formatbook
- fixed anno=None in miami plot
- fixed variant_id sep for ssf
- update tutorials
- update variant matching criteria for regional plot
- fixed default values for mqqplot
- added qq_scatter_args
- added connection timeout, status code check and md5sum verification for download_ref
- fixed regional plot lead variant line error
- added tutorial for v3.4
- fixed anno_alias priority
- fixed font_family for annotation
- get_lead can now use mlog10p to extract lead variants
- added sig_level_lead in plot_mqq
- updated yticklabel fontsize
- updated drop_chr_start
- added rr_lim : input a tuple like (0,100) or "max"
- fixed error in mqqplot (chr > 26)
- added jagged y-axis
- added cut_log
- fixed y tick labels
- added font_family
- added ylabels
- added sc_linewidth
- fixed use_rank
- fixed ystep
- restructured plot functions
- added dtype conversion for input pd.DataFrame
- reimplemented gtfparse and revised requirements
- update reference datasets
- fixed bugs when no variants were selected for mqqplot
- fixed bugs in gl.check_downloaded_ref()
- added suggestive significance line
- fixed bugs in __init__.py
- updated annotation arrow style anno_style for .plot_mqq
- fixed effect size comparison bugs (added sorting)
- added allele check for effect size comparison
- update large number selection algorithm
- fixed dtype errors in fix_chr and fix_pos
- added perSNPh2 .get_per_snp_r2() and F statistics
- implemented save in Miami plots
- fixed annotation error in Miami plots
- added checks for duplicates and NAs in compare_effect()
- implement package info gl.show_version()
- fixed rsid_to_chrpos()
- updated get_density() to calculate the signal density for sumstats
- implemented winner's curse correction for effect size comparison
- updated fontsize options for plot_mqq(). Added anno_fontsize, title_fontsize.
- updated default values and optimized methods for remove_dup(), fix_chr(), basic_check(), check_sanity().
- added dump_pickle() and load_pickle() to save half-finished sumstats objects.
- updated config and downloading system
- added xtick_chr_dict
- fixed bugs in miami plot
- added hg38 recombination rate file
- fixed bugs in get_novel
- fixed bugs for reading gtf files
- updated requirements for dependencies
  - pandas>=1.3,<1.5
  - pyensembl==2.2.3
- support customized gtf/vcf/recombination_rate files
- fixed bugs for regional plot
- calculate_gc()
- fill MAF: .fill_data(to_fill=["MAF"])
- specified python engine for query
- fixed bugs for matplotlib v3.6.x
- added method chain for filter_xxx functions
- updated requirements for dependencies
  - pySAM>=0.18.1,<0.20
  - matplotlib>=3.5
  - pyensembl>=2.2.3
- updated requirements for dependencies
- fixed bugs for mqqplot
- updated download system
- included recombination data
- updated packaging methods. Now when installing gwaslab, pip will install all dependencies as well.
- added download function:
  - now you can download reference files from a predefined list via gwaslab
  - gl.check_available_ref(): list available reference files
  - gl.check_downloaded_ref(): list downloaded reference files
  - gl.download_ref(name): download reference files
  - gl.remove_file(name): remove the local reference files
  - gl.get_path(name): get the local path for the reference data name
- implemented parsing gwas-vcf (fmt="vcf")
- implemented Sumstats.filter_value(expr)
- fixed bugs for check_allele
- optimized functions for sorting columns
- removed outdated code in Sumstats
- added filter_value
- integrated gwascatalog into get_novel
- optimized remove_dup
- fixed bugs
- added gwascatalog_trait()
- optimized check_sanity()
- optimized the logic for removing duplicated and multiallelic variants
- added update_formatbook()
- added functions to read vcf.gz: gl.Sumstats("myvcf.vcf.gz",fmt="vcf")
- gwaslab is now able to read chromosome-separated files
- fixed bugs
- added Miami plot
- added Brisbane plot
- updated tutorials
- mqq plot annotation: new customization options
- added forcefixid for fix_id()
- fixed bugs for plotting gene tracks
- extract novel loci given a list of known lead variants
- fixed bugs in fill_data()
- fixed path for hapmap3 snps for infer_build()
- added forest plot
- fixed options for mqqplot
- supported vcf
- incorporated pyensembl and scikit-allel.
- get_lead() : support automatic gene name annotation (using pyensembl)
- to_format():
  - support common sumstats formats
  - support 1-based bed-like formats for VEP
  - support 0-based bed-like formats
- manhattan plot:
  - optimized plotting logic
  - annotate gene names
  - added regional plot feature using a user-provided reference panel
- comparison effect plot:
  - fix using OR
- implemented formatbook: easily import sumstats and output sumstats in certain formats (support for commonly used formats including ldsc, plink, plink2, gwas-ssf, saige, regenie, fastgwa, metal, mrmega, pgscatalog, pgscatalog_hm, gwascatalog, gwascatalog_hm and gwaslab)
- added .filter_region_in/out using bed files (or in-built regions like high-ld or hla)
- implemented .summary() methods.
- optimized rsID annotation pipeline. Support annotation using curated chr:pos:ref:alt - rsID tsv for quick annotation.
- changed some datatypes and optimized memory usage.
- replaced pyVCF with pySAM
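
The regional-plot color fix recorded in this changelog (each variant receiving the color of the next lower LD r2 category) is a typical off-by-one indexing error. The sketch below reproduces that pattern in plain Python. The thresholds, color names, and function names are illustrative assumptions and are not taken from gwaslab's source.

```python
from bisect import bisect_right

# Hypothetical LD r2 category boundaries and colors (illustrative only,
# not gwaslab's actual values).
THRESHOLDS = [0.2, 0.4, 0.6, 0.8]
COLORS = ["blue", "cyan", "green", "orange", "red"]  # one color per category


def ld_category(r2):
    """Return the index (0..4) of the LD category that r2 falls into."""
    return bisect_right(THRESHOLDS, r2)


def color_buggy(r2):
    """Off-by-one lookup: returns the color of the next lower category."""
    return COLORS[max(ld_category(r2) - 1, 0)]


def color_fixed(r2):
    """Correct lookup: returns the color of the variant's own category."""
    return COLORS[ld_category(r2)]


if __name__ == "__main__":
    for r2 in (0.1, 0.5, 0.7, 0.9):
        print(r2, color_buggy(r2), color_fixed(r2))
```

With these assumed categories, a variant with r2 = 0.9 gets the 0.6 to 0.8 color from the buggy lookup and the >0.8 color from the fixed one, which matches the behavior described in the changelog entry.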