About Parsing Nodes with Namespaces in XML
If you use defusedxml (or lxml) to parse RSS or other XML documents, you will often need to read values from namespaced nodes such as <content:encoded>. You can do that by passing a dictionary that maps prefixes to namespace URIs to the find() or findall() methods, like this:
from defusedxml.ElementTree import fromstring

# Map each prefix used in your queries to its namespace URI.
namespaces = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# xml_string holds the raw XML of the feed.
xml_doc = fromstring(xml_string)
for item in xml_doc.findall("channel/item"):
    print(item.find("content:encoded", namespaces).text)
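Note that find() returns None when an item has no matching node, so calling .text on the result can fail for incomplete feeds. As a minimal sketch of one way to handle that, the example below uses findtext() with a default value. The sample feed, the NAMESPACES constant, and the read_items() helper are made up for illustration, and the fallback to the standard library parser is only there so the snippet runs without defusedxml installed.

```python
try:
    # defusedxml is a third-party package; it is the parser used in this article.
    from defusedxml.ElementTree import fromstring
except ImportError:
    # Fallback for illustration only: the stdlib parser is not hardened
    # against malicious XML.
    from xml.etree.ElementTree import fromstring

# Hypothetical sample feed, invented for this example.
SAMPLE_FEED = """<rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>First post</title>
      <dc:creator>Alice</dc:creator>
      <content:encoded><![CDATA[<p>Hello</p>]]></content:encoded>
    </item>
    <item>
      <title>Second post</title>
    </item>
  </channel>
</rss>
"""

NAMESPACES = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_items(xml_string):
    """Return one dict per <item>, with empty strings for missing nodes."""
    root = fromstring(xml_string)
    items = []
    for item in root.findall("channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "creator": item.findtext(
                "dc:creator", default="", namespaces=NAMESPACES
            ),
            "content": item.findtext(
                "content:encoded", default="", namespaces=NAMESPACES
            ),
        })
    return items


items = read_items(SAMPLE_FEED)
for entry in items:
    print(entry)
```

The second item has neither <dc:creator> nor <content:encoded>, so both values come back as empty strings instead of raising an AttributeError.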
XML namespaces are usually declared on the root node of the XML document using the xmlns prefix, for example:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>
    <!-- ... -->
</rss>
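Each xmlns declaration binds a prefix to a URI, and the parser identifies nodes by the URI rather than by the prefix. The following sketch, which assumes a tiny made-up document and a hypothetical "dublin" prefix, shows that the prefixes in your dictionary do not have to match the ones in the document, and that you can also query with the expanded {uri}tag form directly.

```python
try:
    # defusedxml is a third-party package; it is the parser used in this article.
    from defusedxml.ElementTree import fromstring
except ImportError:
    # Fallback for illustration only: the stdlib parser is not hardened
    # against malicious XML.
    from xml.etree.ElementTree import fromstring

DC_URI = "http://purl.org/dc/elements/1.1/"

# Hypothetical minimal document, invented for this example.
DOC = (
    '<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<channel><item><dc:creator>Alice</dc:creator></item></channel>"
    "</rss>"
)

root = fromstring(DOC)
item = root.find("channel/item")

# The parsed tag carries the full URI in braces instead of the "dc" prefix.
tag = item[0].tag
print(tag)

# Any prefix works in the lookup dictionary as long as the URI matches.
via_other_prefix = item.find("dublin:creator", {"dublin": DC_URI})
print(via_other_prefix.text)

# The same lookup without a dictionary, using the expanded name.
via_expanded = item.find("{" + DC_URI + "}creator")
print(via_expanded.text)
```

The dictionary form is usually easier to read when a feed mixes several namespaces; the expanded form can be handy for a one-off lookup.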