parserOption to keep HTML Entities #140
My TheOldReader feed is here:
The code:
Sorry that I was offline for a few days. This feed content is quite standard. You can detect TheOldReader feeds and handle them separately:

    if (isTheOldReader(feedUrl)) {
      doSpecialFeedHandler(feedUrl)
    } else {
      doRegularFeedHandler(feedUrl)
    }

Here is my tested solution, which you can adapt to your workflow:

    import { extract } from '@extractus/feed-extractor'

    // TheOldReader feeds are recognized by their URL prefix
    const isTheOldReader = (rss) => {
      return rss.startsWith('https://theoldreader.com/')
    }

    // Regular feeds go through the default extraction
    const doRegularFeedHandler = (rss) => {
      return extract(rss)
    }

    // For TheOldReader, return the description taken from the raw
    // feed entry as an extra field, so it is not normalized
    const doSpecialFeedHandler = async (rss) => {
      const result = await extract(rss, {
        getExtraEntryFields: (feedEntry) => {
          const { description } = feedEntry
          return {
            description,
          }
        },
      })
      return result
    }

    const runParse = async (rss) => {
      return isTheOldReader(rss) ? doSpecialFeedHandler(rss) : doRegularFeedHandler(rss)
    }

    // Top-level await: run this file as an ES module
    const feedUrl = 'https://theoldreader.com/profile/JpEncausse.rss'
    const data = await runParse(feedUrl)
    console.log(data)
Hello,
I'm parsing RSS feeds from a TheOldReader shared feed. It seems they rewrite each RSS entry and rebuild the RSS items. But for HTML content stored in content:encoded, they encode the text into <description> tags:
An item from ArsTechnica:
When I decode the RSS item with FeedParser, all the HTML entities in the description field are stripped. Which option should I set to keep the HTML entities?
I tried normalization = false, but it fails and parses nothing.
Thanks
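For anyone landing here with the same problem: once the description is kept with its entities intact, the escaped markup may still need to be turned back into HTML. Below is a minimal sketch of that step, assuming the description arrives as a plain string. `decodeEntities` and `NAMED` are hypothetical names and not part of any library, the sample string is made up, and only five named entities plus numeric references are handled.

```javascript
// Hypothetical helper (not a library API): decodes a small set of named
// entities plus decimal and hexadecimal numeric character references.
const NAMED = new Map([
  ['lt', '<'],
  ['gt', '>'],
  ['quot', '"'],
  ['apos', "'"],
  ['amp', '&'],
])

const decodeEntities = (text) => {
  // A single pass over the string, so '&amp;lt;' becomes '&lt;' and is
  // not decoded a second time.
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X'
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10)
      // Leave out-of-range references untouched
      return code <= 0x10ffff ? String.fromCodePoint(code) : match
    }
    // Unknown names are left as they are
    return NAMED.has(body) ? NAMED.get(body) : match
  })
}

// Made-up sample of an entity-encoded description
const escaped = '&lt;p&gt;Tom &amp;amp; Jerry&lt;/p&gt;'
console.log(decodeEntities(escaped))
```

A full HTML entity table is much larger than this, so a dedicated decoding library is probably the better choice for real feeds.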