Python RSS Reader for Automating Blog Updates
A Python tool to parse RSS feeds and automate Hugo blog updates to social media, with support for Docker and Google Cloud services.
Shipped January 2026
A Python-based tool designed to parse and process RSS feeds, primarily aimed at automating the posting of Hugo blog updates to social media or other platforms. This project includes utilities for interacting with Google Cloud services and supports deployment via Docker and Google Cloud Build.
Features
- Parses RSS feeds to extract and process new blog entries.
- Converts published dates to standardized formats for comparison and processing (see the sketch after this list).
- Updates a backend service or database with new or updated feed entries via HTTP requests.
- Integrates with Google Cloud services including BigQuery, Cloud Storage, and Cloud Logging through reusable client utilities.
- Supports containerized deployment with Docker and automated builds using Google Cloud Build.
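The date-conversion step is straightforward with feedparser, which exposes each entry's published date as a UTC time.struct_time. A minimal sketch of the normalization, assuming the script standardizes to ISO 8601 (the actual implementation may differ):

```python
# Sketch of the date-normalization step; assumes ISO 8601 as the target format.
import calendar
from datetime import datetime, timezone

def normalize_published(entry):
    """Convert a feedparser entry's published date to an ISO 8601 UTC string."""
    parsed = entry.get("published_parsed")  # time.struct_time in UTC, or None
    if parsed is None:
        return None
    ts = calendar.timegm(parsed)  # struct_time (UTC) -> epoch seconds
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
```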
Tech Stack
- Python 3
- feedparser (for RSS parsing)
- requests (for HTTP requests)
- Google Cloud SDKs (BigQuery, Storage, Logging)
- Docker
- Google Cloud Build
Getting Started
Prerequisites
- Python 3.7 or higher
- Docker (optional, for containerized deployment)
- Google Cloud account and appropriate permissions
Installation
- Clone the repository:
git clone https://github.com/justin-napolitano/python-rss-reader.git
cd python-rss-reader
- (Optional) Set up a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment variables and credentials:
  - Place your Google Cloud service account JSON in secret.json, or set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point at it.
  - Configure any other required environment variables as needed.
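For example:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/secret.json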
Running the RSS Scraper
python rss-scraper.py
This will parse the RSS feed (default: https://jnapolitano.com/index.xml) and attempt to update the backend service with new entries.
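The script itself is not reproduced here, but the flow it implements can be sketched as follows. The backend endpoint (BACKEND_URL) and the last_run.txt format are assumptions for illustration, not the script's actual values:

```python
# Minimal sketch of the scraper flow; BACKEND_URL and the last_run.txt
# format are assumptions, not the script's actual values.
import calendar
from datetime import datetime, timezone
from pathlib import Path

import feedparser
import requests

FEED_URL = "https://jnapolitano.com/index.xml"
BACKEND_URL = "https://example.com/api/posts"  # hypothetical endpoint
LAST_RUN_FILE = Path("last_run.txt")

def load_last_run():
    """Read the previous run's timestamp, or fall back to the epoch."""
    if LAST_RUN_FILE.exists():
        return datetime.fromisoformat(LAST_RUN_FILE.read_text().strip())
    return datetime.fromtimestamp(0, tz=timezone.utc)

def main():
    last_run = load_last_run()
    feed = feedparser.parse(FEED_URL)
    for entry in feed.entries:
        published = datetime.fromtimestamp(
            calendar.timegm(entry.published_parsed), tz=timezone.utc
        )
        if published <= last_run:
            continue  # already processed on a previous run
        payload = {
            "title": entry.title,
            "link": entry.link,
            "published": published.isoformat(),
        }
        resp = requests.post(BACKEND_URL, json=payload, timeout=30)
        resp.raise_for_status()
    LAST_RUN_FILE.write_text(datetime.now(timezone.utc).isoformat())

if __name__ == "__main__":
    main()
```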
Using Docker
Build the Docker image:
docker build -t python-rss-reader .
Run the container:
docker run --env GOOGLE_APPLICATION_CREDENTIALS=/path/to/secret.json -v /local/path/to/secret.json:/path/to/secret.json python-rss-reader
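The repository's Dockerfile is not shown here; a minimal image definition that would support the commands above might look like this (a sketch, not the actual file):

```dockerfile
# Hypothetical Dockerfile sketch; the repository's actual file may differ.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "rss-scraper.py"]
```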
Google Cloud Build
The cloudbuild.yaml file defines steps to build and push the Docker image to Google Container Registry. Uncomment and configure additional steps to deploy to Cloud Run or set up Cloud Scheduler jobs.
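The file's exact contents aren't reproduced here; a build-and-push configuration of the shape described would look roughly like:

```yaml
# Hypothetical cloudbuild.yaml sketch; the repository's actual steps may differ.
steps:
  - name: gcr.io/cloud-builders/docker
    args: ["build", "-t", "gcr.io/$PROJECT_ID/python-rss-reader", "."]
  - name: gcr.io/cloud-builders/docker
    args: ["push", "gcr.io/$PROJECT_ID/python-rss-reader"]
images:
  - gcr.io/$PROJECT_ID/python-rss-reader
```

A Cloud Run deploy step or Cloud Scheduler job definition would be appended as additional steps, as noted above.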
Run Cloud Build:
gcloud builds submit --config cloudbuild.yaml .
Project Structure
python-rss-reader/
├── cloudbuild.yaml # Google Cloud Build configuration
├── Dockerfile # Docker image definition
├── gcputils/ # Google Cloud utility submodule
│ ├── BigQueryClient.py # BigQuery client wrapper
│ ├── GCSClient.py # Google Cloud Storage client wrapper
│ ├── GoogleCloudLogging.py # Cloud Logging client wrapper
│ ├── index.md # Documentation
│ └── readme.md # Documentation
├── images/ # Image assets
├── index.md # Project notes and thoughts
├── last_run.txt # Stores last run timestamp
├── readme.md # Project notes and thoughts (similar to index.md)
├── requirements.txt # Python dependencies
├── rss-scraper.py # Main RSS parsing and update script
└── secret.json # Google Cloud service account credentials (sensitive)
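The gcputils wrappers aren't documented in detail here; presumably they are thin conveniences over the official Google Cloud clients, along these lines (class and method names below are assumptions, not the module's actual interface):

```python
# Hypothetical illustration of a wrapper like BigQueryClient.py;
# method names are assumptions, not the module's actual interface.
from google.cloud import bigquery

class BigQueryClient:
    """Thin convenience wrapper around google.cloud.bigquery.Client."""

    def __init__(self, project_id):
        self.client = bigquery.Client(project=project_id)

    def insert_rows(self, table_id, rows):
        """Stream rows into a table, returning any per-row insert errors."""
        table = self.client.get_table(table_id)
        return self.client.insert_rows(table, rows)
```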
Future Work / Roadmap
- Implement a dedicated API or batch processor for handling feed updates instead of a monolithic script.
- Add more robust error handling and retry mechanisms.
- Extend support for publishing parsed posts to various social media platforms.
- Enhance configuration management for cloud deployments.
- Add automated tests and CI/CD pipelines.
- Improve documentation and usage examples.
Note: This README is based on available source files and inferred project goals. Some assumptions were made regarding deployment and usage.