Skip to content

A parser for Google Scholar, written in Python

Notifications You must be signed in to change notification settings

corymcox/scholar.py

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 

Repository files navigation

scholar.py

scholar.py is a Python module that implements a querier and parser for Google Scholar's output. Its classes can be used independently, but it can also be invoked as a command-line tool.

The script used to live at http://icir.org/christian/scholar.html, and I've moved it here so I can more easily manage the various patches and suggestions I'm receiving for scholar.py. Thanks guys, for all your interest! If you'd like to get in touch, email me at [email protected] or ping me on Twitter.

Cheers,
Christian

Features

  • Extracts publication title, main online URL, number of citations, number of online versions, link to Google Scholar's main cluster for the work, and Google Scholar's cluster of all works referencing the publication.
  • Supports the full range of advanced query options provided by Google Scholar, such as title-only search or publication date timeframes.
  • Supports retrieval of citation details in standard external formats as provided by Google Scholar, including BibTeX and EndNote.
  • Command-line tool prints entries in CSV format, simple plain text, or in the citation export format.

Example

Try scholar.py --help for all available options. Note, the command line arguments changed considerably in version 2.0! A few examples:

Retrieve one article written by Einstein on quantum theory:

$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory"
         Title On the quantum theory of radiation
           URL http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
     Citations 184
      Versions 3
Citations list http://scholar.google.com/scholar?cites=17749203648027613321&as_sdt=2005&sciodt=0,5&hl=en&num=1
 Versions list http://scholar.google.com/scholar?cluster=17749203648027613321&hl=en&num=1&as_sdt=0,5
          Year 1917
 Citation link None

Retrieve a BibTeX entry for that quantum theory paper:

$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory" --citation bt
@article{einstein1917quantum,
  title={On the quantum theory of radiation},
  author={Einstein, Albert},
  journal={Phys. Z},
  volume={18},
  pages={121--128},
  year={1917}
}

License

scholar.py is using the standard BSD license.

About

A parser for Google Scholar, written in Python

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published