carrerasrodrigo edited this page Mar 11, 2013 · 1 revision

By default, Ghost waits until the page fires the "onLoad" event before returning control to the user. This means it loads the main page, JS files, CSS files, etc. before the onLoad event fires. That is fine when we need to scrape a complex web page, but if we just need the HTML of the page we end up waiting longer than necessary. To solve this problem I added the fast open option (wait_onload_event=False). It returns control to the user as soon as the DOM is ready (before the onLoad event). Let's see an example:

from ghost import Ghost

url = "http://news.ycombinator.com/"
# We enable the cache and set the maximum size to 10 MB
# We don't want to load images, CSS or JS files
gh = Ghost(cache_size=10)

# We create a new page
page, page_name = gh.create_page(download_images=False,
        prevent_download=["css", "js"])

# wait_onload_event=False tells Ghost to return from the open method
# as soon as the DOM ready event has fired on the web page
page_resource = page.open(url, wait_onload_event=False)

# We retrieve the links from the web page
links = page.evaluate("""
                        var links = document.querySelectorAll("a");
                        var listRet = [];
                        for (var i=0; i
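When only the HTML is needed, the links can also be pulled out of the markup without running any JavaScript in the page. Here is a minimal sketch of that idea, assuming the HTML is already available as a string. It uses only Python's standard html.parser module and does not touch Ghost at all; the names LinkCollector, extract_links and sample_html are made up for this illustration and are not part of Ghost's API.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # We only care about anchor tags
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href (or with an empty one)
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return the href values found in the given HTML string, in order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample markup, only for this example
sample_html = (
    '<p><a href="item?id=1">one</a> '
    '<a name="x">no href</a> '
    '<a href="http://example.com/">two</a></p>'
)
print(extract_links(sample_html))
```

This only sees links that are present in the HTML itself. Links that the page adds later through JavaScript would still need a real browser engine such as the one Ghost drives.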