parserOption to keep HTML Entities

Hello,
I'm parsing RSS feeds from a TheOldReader shared feed. It seems they rewrite each RSS entry and rebuild the RSS items. For HTML content originally stored in `<content:encoded>`, they entity-encode the text and put it into `<description>` tags:

An item from Ars Technica:
```
<description>&lt;div&gt;
&lt;figure&gt;
  &lt;img src="https://cdn.arstechnica.net/wp-content/uploads/2024/04/image-2-800x533.jpeg" alt="Image of a chip with a device on it that is shaped like two triangles connected by a bar."&gt;
      &lt;p&gt;&lt;a href="https://cdn.arstechnica.net/wp-content/uploads/2024/04/image-2-scaled.jpeg"&gt;Enlarge&lt;/a&gt; / Quantinuum's H2 "racetrack" quantum processor. (credit: Quantinuum)&lt;/p&gt;  &lt;/figure&gt;
&lt;div&gt;&lt;a&gt;&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;On Tuesday, Microsoft made a series of announcements related to its Azure Quantum Cloud service. Among them was a demonstration of logical operations using the largest number of error-corrected qubits yet.&lt;/p&gt;
&lt;p&gt;"&lt;a href="https://arstechnica.com/science/2024/04/quantum-error-correction-used-to-actually-correct-errors/"&gt;Since April&lt;/a&gt;, we've tripled the number of logical qubits here," said Microsoft Technical Fellow Krysta Svore. "So we are accelerating toward that hundred-logical-qubit capability." The company has also lined up a new partner in the form of Atom Computing, which uses neutral atoms to hold qubits and has already demonstrated hardware with over 1,000 hardware qubits.&lt;/p&gt;
&lt;p&gt;Collectively, the announcements are the latest sign that quantum computing has emerged from its infancy and is rapidly progressing toward the development of systems that can reliably perform calculations that would be impractical or impossible to run on classical hardware. We talked with people at Microsoft and some of its hardware partners to get a sense of what's coming next to bring us closer to useful quantum computing.&lt;/p&gt;
&lt;/div&gt;&lt;p&gt;&lt;a href="https://arstechnica.com/?p=2048754#p3"&gt;Read 20 remaining paragraphs&lt;/a&gt; | &lt;a href="https://arstechnica.com/?p=2048754&amp;amp;comments=1"&gt;Comments&lt;/a&gt;&lt;/p&gt;</description>
```

When I decode the RSS item with FeedParser, all the HTML entities in the description field are stripped. What option should I set to avoid stripping the HTML entities? Either of these would work for me:
- Convert them back to HTML in a String
- or keep them as is

I tried `normalization = false`, but it fails and parses nothing.
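
To make the two desired outcomes concrete, here is a minimal sketch in Python using only the standard library. It does not use FeedParser or any of its options; the shortened sample item and the helper names `description_as_html` and `description_as_is` are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Shortened, hypothetical item modeled on the feed excerpt above.
ITEM_XML = (
    '<item><description>'
    '&lt;p&gt;&lt;a href="https://arstechnica.com/?p=2048754&amp;amp;comments=1"&gt;'
    'Comments&lt;/a&gt;&lt;/p&gt;'
    '</description></item>'
)


def description_as_html(item_xml: str) -> str:
    """Outcome 1: decode one level of entities, so the String holds HTML."""
    item = ET.fromstring(item_xml)
    # An XML parser resolves &lt; &gt; &amp; while reading the text node.
    return item.findtext("description", default="")


def description_as_is(item_xml: str) -> str:
    """Outcome 2: return the text still entity-encoded, as in the feed."""
    # Re-escape only &, < and > (quote=False leaves double quotes alone).
    return html.escape(description_as_html(item_xml), quote=False)


if __name__ == "__main__":
    print(description_as_html(ITEM_XML))
    print(description_as_is(ITEM_XML))
```

In both cases the markup survives; only the level of encoding differs. Note that the link still contains `&amp;` after decoding, because the feed encoded that ampersand twice.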
Thanks

parserOption to keep HTML Entities #140

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

parserOption to keep HTML Entities #140

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions