and structure webpages into articles, comment threads, products, events, and more. It can also process HTML from individual webpages, or even an entire website, and produce structured output as JSON objects.
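As a rough illustration of turning HTML into a structured JSON object, here is a minimal sketch using only the Python standard library. It is not the tool's own API: the class `ArticleExtractor`, the function `html_to_json`, and the output fields (`type`, `title`, `paragraphs`) are all hypothetical names chosen for this example.

```python
import json
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Hypothetical extractor: collects <title> text and <p> paragraphs."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose text is currently being captured
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = tag
        elif tag == "p":
            self._current = tag
            self.paragraphs.append("")  # start a new paragraph buffer

    def handle_endtag(self, tag):
        # Stop capturing only when the captured tag itself closes,
        # so inline tags such as <b> inside <p> do not end the paragraph.
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def html_to_json(html: str) -> str:
    """Parse an HTML string and return an article-like record as JSON."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    record = {
        "type": "article",  # assumed label, for illustration only
        "title": parser.title.strip(),
        "paragraphs": [p.strip() for p in parser.paragraphs if p.strip()],
    }
    return json.dumps(record)
```

Calling `html_to_json(page_source)` on a page's HTML returns a JSON string. A real extractor would need far more than this to tell articles, products, and events apart, so treat the sketch only as a picture of the input and output shapes.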
social: support new URL formats for Facebook, YouTube and X (#2758) (4c95847), closes #525

Features
tieredProxyUrls accept null for switching the proxy off (#2743) (82f4ea9), closes #2740

3.12.0 (2024-11-04)

Bug Fixes
.trim() urls from pretty-printed sitemap.xml files (#2709) (802...
    ("https://www.facebook.com/robots.txt", False),  # Social media
    ("https://www.google.com/search", False),  # Search pages
    # Edge cases
    ("https://api.github.com", True),  # API service
    ("https://raw.githubusercontent.com", True),  # Content delivery
    # Non-existent/error cases
    (...