--- date: '2023-02-23' title: 'Automating RSS syndication and sharing with Next.js and GitHub' draft: false tags: ['Next.js', 'RSS', 'automation', 'GitHub'] --- I wrote a basic syndication tool in Next.js to automate sharing items from configured RSS feeds to Mastodon. This tool works by leveraging a few basic configurations, the Mastodon API and a (reasonably) lightweight script that creates a JSON cache when initialized and posts new items on an hourly basis. The script that handles this functionality lives at `lib/syndicate/index.ts`: ```typescript import { toPascalCase } from '@/utils/formatters' import { extract, FeedEntry } from '@extractus/feed-extractor' import { SERVICES, TAGS } from './config' import createMastoPost from './createMastoPost' export default async function syndicate(init?: string) { const TOKEN_CORYDDEV_GISTS = process.env.TOKEN_CORYDDEV_GISTS const GIST_ID_SYNDICATION_CACHE = '406166f337b9ed2d494951757a70b9d1' const GIST_NAME_SYNDICATION_CACHE = 'syndication-cache.json' const CLEAN_OBJECT = () => { const INIT_OBJECT = {} Object.keys(SERVICES).map((service) => (INIT_OBJECT[service] = [])) return INIT_OBJECT } async function hydrateCache() { const CACHE_DATA = CLEAN_OBJECT() for (const service in SERVICES) { const data = await extract(SERVICES[service]) const entries = data?.entries entries.map((entry: FeedEntry) => CACHE_DATA[service].push(entry.id)) } await fetch(`https://api.github.com/gists/${GIST_ID_SYNDICATION_CACHE}`, { method: 'PATCH', headers: { Authorization: `Bearer ${TOKEN_CORYDDEV_GISTS}`, 'Content-Type': 'application/vnd.github+json', }, body: JSON.stringify({ gist_id: GIST_ID_SYNDICATION_CACHE, files: { 'syndication-cache.json': { content: JSON.stringify(CACHE_DATA), }, }, }), }) .then((response) => response.json()) .catch((err) => console.log(err)) } const DATA = await fetch(`https://api.github.com/gists/${GIST_ID_SYNDICATION_CACHE}`).then( (response) => response.json() ) const CONTENT = DATA?.files[GIST_NAME_SYNDICATION_CACHE].content // rewrite the sync data if init is reset if (CONTENT === '' || init === 'true') hydrateCache() if (CONTENT && CONTENT !== '' && !init) { const existingData = await fetch( `https://api.github.com/gists/${GIST_ID_SYNDICATION_CACHE}` ).then((response) => response.json()) const existingContent = JSON.parse(existingData?.files[GIST_NAME_SYNDICATION_CACHE].content) for (const service in SERVICES) { const data = await extract(SERVICES[service], { getExtraEntryFields: (feedEntry) => { return { tags: feedEntry['cd:tags'], } }, }) const entries: (FeedEntry & { tags?: string })[] = data?.entries if (!existingContent[service].includes(entries[0].id)) { let tags = '' if (entries[0].tags) { entries[0].tags .split(',') .forEach((a, index) => index === 0 ? (tags += `#${toPascalCase(a)}`) : (tags += ` #${toPascalCase(a)}`) ) tags += ` ${TAGS[service]}` } else { tags = TAGS[service] } existingContent[service].push(entries[0].id) createMastoPost(`${entries[0].title} ${entries[0].link} ${tags}`) await fetch(`https://api.github.com/gists/${GIST_ID_SYNDICATION_CACHE}`, { method: 'PATCH', headers: { Authorization: `Bearer ${TOKEN_CORYDDEV_GISTS}`, 'Content-Type': 'application/vnd.github+json', }, body: JSON.stringify({ gist_id: GIST_ID_SYNDICATION_CACHE, files: { 'syndication-cache.json': { content: JSON.stringify(existingContent), }, }, }), }) .then((response) => response.json()) .catch((err) => console.log(err)) } } } } ``` We start off with an optional `init` parameter that can be passed into our `syndicate` function to hydrate our syndication cache — the structure of this cache is essentially `SERIVCE_KEY: string[]` where `string[]` contains RSS post IDs. Now, given that Vercel is intended as front end hosting, I needed a reasonably simple and reliable solution for hosting a simple JSON object. I explored and didn't want to involve a full-fledged database or storage solution and wasn't terribly interested in dealing with S3 or B2 for this purpose, so I, instead, went with a "secret" GitHub gist[^1] and leveraged the GitHub API for storage. At each step of the [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) process in this script we make a call to the GitHub API using a token for authentication, deal with the returned JSON and go on our merry way. Once the cache is hydrated the script will check the feeds available in `lib/syndicate/config.ts` and post the most recent item if it does not exist in the cache and then add it to said cache. The configured services are simply: ```typescript export const SERVICES = { 'coryd.dev': 'https://coryd.dev/feed.xml', glass: 'https://glass.photo/coryd/rss', letterboxd: 'https://letterboxd.com/cdme/rss/', } ``` As we iterate through this object we also attach tags specific to each service using an object shaped exactly like `SERVICES` in `config.ts`: ```typescript export const TAGS = { 'coryd.dev': '#Blog', glass: '#Photo #Glass', letterboxd: '#Movie #Letterboxd', } ``` This is partly for discovery and partly a consistent way for folks to filter my automated nonsense should they so choose. The format of Glass and Letterboxd are consistent and the tags are as well — for posts from my site (like this one 👋🏻) I start with `#Blog` and have also modified the structure of my RSS feed to expose the tags I add to each post. The feed is generated by a script that runs at build time called `generate-rss.ts` which looks like: ```typescript import { escape } from '@/lib/utils/htmlEscaper' import siteMetadata from '@/data/siteMetadata' import { PostFrontMatter } from 'types/PostFrontMatter' const generateRssItem = (post: PostFrontMatter) => ` ${siteMetadata.siteUrl}/blog/${post.slug} ${escape(post.title)} ${siteMetadata.siteUrl}/blog/${post.slug} ${post.summary && `${escape(post.summary)}`} ${new Date(post.date).toUTCString()} ${siteMetadata.email} (${siteMetadata.author}) ${post.tags && post.tags.map((t) => `${t}`).join('')} ${post.tags} ` const generateRss = (posts: PostFrontMatter[], page = 'feed.xml') => ` ${escape(siteMetadata.title)} ${siteMetadata.siteUrl}/blog ${escape(siteMetadata.description.default)} ${siteMetadata.language} ${siteMetadata.email} (${siteMetadata.author}) ${siteMetadata.email} (${siteMetadata.author}) ${new Date(posts[0].date).toUTCString()} ${posts.map(generateRssItem).join('')} ` export default generateRss ``` I've added a new namespace to the parent `` tag called `cd`[^2] — the declaration points to a page at this site that (very) briefly explains the purpose, I then created a `` field that exposes a comma delimited list of post tags. Back in `syndicate/index.ts`, this field is accessed when the RSS feed is parsed: ```typescript const data = await extract(SERVICES[service], { getExtraEntryFields: (feedEntry) => { return { tags: feedEntry['cd:tags'], } }, }) ... let tags = '' if (entries[0].tags) { entries[0].tags .split(',') .forEach((a, index) => index === 0 ? (tags += `#${toPascalCase(a)}`) : (tags += ` #${toPascalCase(a)}`) ) tags += ` ${TAGS[service]}` } else { tags = TAGS[service] } ``` Tags get transformed to Pascal case, prepended with `#` and sent off to be posted to Mastodon along with the static service-specific tags. The function that posts content to Mastodon is as simple as the following: ```typescript import { MASTODON_INSTANCE } from './config' const KEY = process.env.API_KEY_MASTODON const createMastoPost = async (content: string) => { const formData = new FormData() formData.append('status', content) const res = await fetch(`${MASTODON_INSTANCE}/api/v1/statuses`, { method: 'POST', headers: { Accept: 'application/json', Authorization: `Bearer ${KEY}`, }, body: formData, }) return res.json() } export default createMastoPost ``` Back at GitHub, this is all kicked off every hour on the hour using the following workflow: ```yaml name: scheduled-cron-job on: schedule: - cron: '0 * * * *' jobs: cron: runs-on: ubuntu-latest steps: - name: scheduled-cron-job run: | curl -X POST 'https://coryd.dev/api/syndicate' \ -H 'Authorization: Bearer ${{ secrets.VERCEL_SYNDICATE_KEY }}' ``` Now, as I post things elsewhere, they'll make their way back to Mastodon with a simple title, link and tag set. Read them if you'd like, or filter them out altogether. [^1]: It's secret inasmuch as it's obscured and, hence, not secured (which is also why `syndicate.ts` includes the gist ID directly) — it's all public post IDs, so peruse as one sees fit. [^2]: Not very creative, I know.