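Filtering an RSS feed to build a new full-text RSS feed
Problem: Suppose a website you like only offers a summary-style RSS feed, but you want to read the full articles in your RSS reader, and you only want certain articles to be delivered.
Solution: Use Huginn to build a filtered full-text RSS feed from the following agents:
- RSSAgent: fetches and parses the RSS feed provided by the website;
- TriggerAgent: filters the items in the feed;
- WebsiteAgent: fetches the full content of each article from the item's link;
- DataOutputAgent: outputs the full-text RSS feed.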
1. RSSAgent
Name: Example RSS In
{
"expected_update_period_in_days": "14",
"clean": "false",
"url": "http://www.businesscat.happyjar.com/feed/"
}
2. TriggerAgent
Name: Example filter
Event sources: Example RSS In
Propagate immediately: Yes
{
"expected_receive_period_in_days": "14",
"keep_event": "true",
"rules": [
{
"type": "regex",
"value": ".*\\/comic\\/.*",
"path": "url"
}
]
}
Note: set keep_event to true so that the parsed item fields are passed on to the next agent.
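The regex rule above can be tried out on a few URLs before it is deployed. The following is a minimal Python sketch, not Huginn's own code: it assumes the rule is applied as an unanchored, case-insensitive search against the event's url field, and the sample events are hypothetical.

```python
import re

# The rule value after JSON unescaping: "\\/" in the config becomes "\/".
RULE = re.compile(r".*\/comic\/.*", re.IGNORECASE)


def passes_filter(event: dict) -> bool:
    """Return True if the event's "url" field matches the rule."""
    return RULE.search(str(event.get("url", ""))) is not None


# Hypothetical feed items, for illustration only.
events = [
    {"title": "A comic", "url": "http://www.businesscat.happyjar.com/comic/example/"},
    {"title": "A blog post", "url": "http://www.businesscat.happyjar.com/blog/example/"},
]

# Only items whose URL contains "/comic/" survive the filter.
kept = [e["title"] for e in events if passes_filter(e)]
print(kept)
```

Events without a url field are treated as non-matching in this sketch, so they would be dropped rather than raising an error.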
3. WebsiteAgent
Name: Example page fetch
Event sources: Example filter
Propagate immediately: Yes
{
"expected_update_period_in_days": "14",
"url": "{{url}}",
"type": "html",
"mode": "merge",
"extract": {
"imgurl": {
"css": "#comic img",
"value": "@src"
}
}
}
Note: set mode to merge so that the fields of the incoming item are passed on to the next agent together with the extracted values.
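What merge means for the outgoing event can be illustrated with a short Python sketch. It assumes the extracted values are layered on top of the incoming event's fields, with extracted keys winning on conflict; the payloads shown are hypothetical.

```python
def merge_event(incoming: dict, extracted: dict) -> dict:
    """Combine the incoming event with newly extracted fields.

    In this sketch, extracted keys overwrite incoming keys of the same name.
    """
    merged = dict(incoming)  # copy so the incoming event is not mutated
    merged.update(extracted)
    return merged


# Hypothetical event from the filter step and a hypothetical extraction result.
incoming = {
    "title": "A comic",
    "url": "http://www.businesscat.happyjar.com/comic/example/",
}
extracted = {"imgurl": "http://www.businesscat.happyjar.com/example.png"}

outgoing = merge_event(incoming, extracted)
print(sorted(outgoing))
```

Without merging, only the extracted imgurl would reach the next agent, and the item's title and link would no longer be available for the output feed.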
4. DataOutputAgent
Name: Example Rss out
Event sources: Example page fetch
Propagate immediately: Yes
{
"secrets": [
"examplerss"
],
"expected_receive_period_in_days": "14",
"template": {
"title": "Business Cat full comic feed",
"description": "This is a feed of recent Business Cat comics generated by Huginn",
"item": {
"title": "{{title}}",
"description": "<img src=\"{{imgurl}}\" />",
"link": "{{url}}",
"pubDate": "{{date_published}}"
}
}
}
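Once the DataOutputAgent is saved, the feed is served at a URL that ends in one of the configured secrets. The Python sketch below builds that URL, assuming the path pattern /users/<user id>/web_requests/<agent id>/<secret>.xml that Huginn displays in the agent's description; the host name, user id and agent id are hypothetical placeholders, so check the agent's page in your own instance for the real values.

```python
def feed_url(host: str, user_id: int, agent_id: int, secret: str, fmt: str = "xml") -> str:
    """Build the public feed URL for a DataOutputAgent (assumed path pattern)."""
    return f"https://{host}/users/{user_id}/web_requests/{agent_id}/{secret}.{fmt}"


# Hypothetical host and ids; "examplerss" is the secret from the config above.
print(feed_url("huginn.example.com", 1, 42, "examplerss"))
# prints https://huginn.example.com/users/1/web_requests/42/examplerss.xml
```

This is the address to subscribe to in the RSS reader. Anyone who knows the secret can read the feed, so choose a value that is hard to guess.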
This article was translated by Huginn 中文网 with the permission of the project author. Original article: Generating a filtered full-text RSS feed from an existing RSS feed