Hi, really cool project. I've been looking for a shell-like language that's better at dealing with structured data than the standard Unix shells, for a project I'm currently working on. I've tried out xonsh, but the coercion between shell mode and Python mode just got in the way. (Nushell and Ion both looked promising too, but felt more like proofs-of-concept.)
Basically, I've become pretty frustrated with the state of the modern web and would like to automate and script my daily interactions with a web browser as much as possible.
I see that there is already the http command, but not much in the way of actually dealing with the HTML data. I think the language has a lot of potential for what I ultimately want my project to do: deconstruct web pages into JSON-like objects that could be fed into other scripts (or even expose methods that could be called on them directly).
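To make "deconstruct a page into a JSON-like object" concrete, here's a minimal sketch in Python (since crush syntax for this doesn't exist yet) that turns raw HTML into a list of plain records using only the standard library. The sample markup is invented for illustration:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect every <a href> on a page as a list of dicts."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs arrives as a list of (name, value) pairs
            self._current = {"href": dict(attrs).get("href", ""), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None

# Hypothetical page fragment
page = ('<ul><li><a href="/watch?v=abc">First video</a></li>'
        '<li><a href="/watch?v=def">Second video</a></li></ul>')
parser = LinkExtractor()
parser.feed(page)
print(parser.links)
# Each link is now a JSON-like record that downstream scripts can filter and sort.
```

Something shaped like this, but expressed natively in crush's type system, is roughly what I'm after.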
A simple example of what I'm trying to do would be performing a YouTube search, displaying the results in a human-readable format, then selecting a video to play in mpv. Or fetching tickets from Jira, sorting them by apparent recency/urgency, then opening them in Firefox in that order to present during the daily stand-up call.
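The Jira half of that pipeline becomes trivial once tickets exist as structured records. A hypothetical Python sketch (all field names and values are made up for illustration):

```python
from datetime import date

# Pretend these were scraped from the Jira board
tickets = [
    {"key": "PROJ-12", "priority": 3, "updated": date(2020, 3, 1)},
    {"key": "PROJ-40", "priority": 1, "updated": date(2020, 3, 4)},
    {"key": "PROJ-33", "priority": 1, "updated": date(2020, 3, 2)},
]

# Most urgent first (lowest priority number), then most recently updated
stand_up_order = sorted(
    tickets,
    key=lambda t: (t["priority"], -t["updated"].toordinal()),
)

for t in stand_up_order:
    # In the real script this would shell out to something like
    # `firefox https://jira.example.com/browse/<key>` per ticket
    print(t["key"])
```

The scraping is the hard part; everything after it is just ordinary structured-data plumbing, which is exactly where a shell like crush should shine.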
I know all of those services have their own respective APIs, but having to find and learn a new library for each one is too much work when I can just extract the data I want from a web page. And I could do the same for websites which do not have a corresponding API.
I have already experimented with doing this using Rust and evcxr/Papyrus as the interface, but as much as I like it, Rust isn't really suited for scripting and interactive usage. This project seems like it could ultimately be a better fit.
Like I mentioned, the state of the modern web is pretty grim, and with the abundance of single-page apps that rely heavily on JavaScript, driving a headless web browser via WebDriver is IMO the only viable option for what I would call "modern" web scraping. The advantage of WebDriver is that it gives you a standard, language-agnostic API, and all of the HTML parsing/selector magic happens in the browser, on the actual DOM as rendered by React/Angular/jQuery/DHTML/whatever the flavour-of-the-month JS framework is.
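To illustrate how language-agnostic it is: under the hood, WebDriver is just HTTP plus JSON against a local driver process (geckodriver/chromedriver). Here's a sketch of the payloads for the two core requests per the W3C spec, opening a headless Firefox session and locating an element in the rendered DOM. Nothing is actually sent; the CSS selector is a made-up example:

```python
import json

# POST /session -- negotiate a headless Firefox session
new_session = json.dumps({
    "capabilities": {
        "alwaysMatch": {
            "browserName": "firefox",
            "moz:firefoxOptions": {"args": ["-headless"]},
        }
    }
})

# POST /session/{session id}/element -- locate a node in the *rendered* DOM,
# so JS-generated content is visible to the selector
find_element = json.dumps({
    "using": "css selector",
    "value": "div.search-result a.title",  # hypothetical selector
})

print(new_session)
print(find_element)
```

Since it's all plain HTTP and JSON, crush could speak the protocol directly with its existing http command as a starting point, no Selenium client library required.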
Would WebDriver support be something you'd be interested in integrating directly into the language? I'm willing to collaborate on this if we can find a way to stay in touch; otherwise I could develop it as a separate library. But I think first-class support for web automation might be the "killer app" that could drive crush's adoption in the "real world", or at least the world of terminal nerds who really hate web browsers. 😜
Hi. I will need to look into WebDriver a bit and get back to you before I can answer your question. Is this the WebDriver Selenium subproject you're talking about?
Yeah, it's what started off as Selenium, became Selenium WebDriver, and is now a W3C standard (or "recommendation") for browser automation implemented by all the major web browsers.