SpeasyTTS

In today's information-driven world, professionals and lifelong learners face a critical challenge: an overwhelming flood of valuable content competing for their limited attention.

Richard Simms


How I consume content

Making the most of free moments to learn and grow.

I faced this challenge myself, and I assume others do too. In today's information-driven world, professionals and lifelong learners face an overwhelming flood of valuable content competing for their limited attention. Despite the best intentions, many people struggle with:

  • Mounting "read later" queues that become sources of guilt and missed opportunities
  • Difficulty finding time to stay informed amid demanding professional responsibilities
  • The constant tension between personal growth and daily work pressures

The core problem was clear: how could busy individuals transform their content consumption from a stressful obligation into an effortless, enriching experience?


speasy.replit.app/

Speasy

The objective was to create a user-centric platform that would:

  • Eliminate the friction of content consumption
  • Maximise learning potential during otherwise unproductive moments
  • Provide a seamless, intuitive experience that integrates naturally into existing podcast apps

Innovative product development

Problem identification

I have personally used read-it-later apps for many years. I’ve tried many of them, but Instapaper and Omnivore were my favourites. Both were optimised for the reading experience and also offered a listen feature.

Omnivore used ElevenLabs TTS, which was the reason I switched to it from Instapaper. Omnivore was an open-source product that was later acquired and sunsetted by ElevenLabs.

With the abundance of TTS now available, I thought that I could build something that could serve my needs and validate the content consumption pain point.

The concepts

I wanted Speasy to be almost invisible in the user experience. As a content consumer, I mostly interact with articles and emails, so my goal was to connect the iOS share sheet to the Speasy API. The share sheet would send the content to the API, which would scrape it and pass it to a Text-to-Speech (TTS) API. The TTS API would convert the content into audio, which would then be published to an RSS (XML) feed. My podcast app would pick up this feed automatically, making the content available for me to listen to later.
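The flow above can be pictured as a small state machine that each saved article moves through. This is only a minimal sketch: the status names, the `ArticleJob` shape, and the helper functions are hypothetical and are not taken from Speasy's actual code.

```typescript
// Hypothetical status names; this write-up does not show Speasy's real schema.
type Status =
  | "queued"
  | "scraped"
  | "chunked"
  | "synthesised"
  | "merged"
  | "published"
  | "failed";

// Happy-path order of the stages described in the paragraph above.
const ORDER: readonly Status[] = [
  "queued",
  "scraped",
  "chunked",
  "synthesised",
  "merged",
  "published",
];

interface ArticleJob {
  url: string;
  status: Status;
  error?: string; // set only when a stage fails
}

// A newly shared URL starts at the beginning of the pipeline.
function createJob(url: string): ArticleJob {
  return { url, status: "queued" };
}

// Move a job to the next stage. Published or failed jobs are returned unchanged.
function advance(job: ArticleJob): ArticleJob {
  const i = ORDER.indexOf(job.status);
  if (i === -1 || i === ORDER.length - 1) return job;
  return { ...job, status: ORDER[i + 1] };
}

// Record a failure so a status page can show where processing stopped.
function fail(job: ArticleJob, reason: string): ArticleJob {
  return { ...job, status: "failed", error: reason };
}
```

Storing one status value per article like this is one way to make each stage visible in a web interface, at the cost of not recording how long each stage took.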

However, there were several challenges that needed to be overcome:

  • Converting saved markdown into audio required splitting it into chunks small enough to fit the character limit that TTS APIs accept.
  • The resulting audio would then need to be reassembled.
  • I required a web interface to query the database to verify that each stage of the process was completed correctly.
  • Additionally, I had to ensure accurate processing of the XML feed.
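The first challenge, splitting markdown into TTS-sized chunks, can be sketched as follows. This is an illustration under assumptions rather than Speasy's implementation: the function names are hypothetical, and the default limit of 3,999 characters simply keeps each chunk under the 4,000-character figure used in the build prompt. It packs whole paragraphs greedily and only falls back to sentence or hard splits when a single paragraph is too long.

```typescript
// Split text after sentence-ending punctuation followed by whitespace.
function splitSentences(text: string): string[] {
  const out: string[] = [];
  const boundary = /[.!?]+\s+/g;
  let start = 0;
  let m: RegExpExecArray | null;
  while ((m = boundary.exec(text)) !== null) {
    const end = m.index + m[0].length;
    out.push(text.slice(start, end).trim());
    start = end;
  }
  if (start < text.length) out.push(text.slice(start).trim());
  return out.filter((s) => s.length > 0);
}

// Break one oversized paragraph into pieces that each fit the limit.
function splitOversized(paragraph: string, maxChars: number): string[] {
  const pieces: string[] = [];
  for (const sentence of splitSentences(paragraph)) {
    // Hard-split any sentence that is still longer than the limit.
    for (let i = 0; i < sentence.length; i += maxChars) {
      pieces.push(sentence.slice(i, i + maxChars));
    }
  }
  return pieces;
}

// Greedily pack paragraphs into chunks of at most maxChars characters.
function chunkMarkdown(markdown: string, maxChars: number = 3999): string[] {
  const paragraphs = markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Units are whole paragraphs, or pieces of a paragraph that was too long.
  const units: string[] = [];
  for (const p of paragraphs) {
    if (p.length <= maxChars) units.push(p);
    else for (const piece of splitOversized(p, maxChars)) units.push(piece);
  }

  const chunks: string[] = [];
  let current = "";
  for (const unit of units) {
    const candidate = current ? current + "\n\n" + unit : unit;
    if (candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current); // current is non-empty here because every unit fits on its own
      current = unit;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on paragraph boundaries first keeps the pauses in the generated audio at natural points, so the reassembled file is less likely to break mid-sentence.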
Podcast players playing Speasy

Rapid prototyping and technical implementation

I worked on a PRD in ChatGPT, and after many chats and a few failed attempts, my AI-CTO (ChatGPT) had a pretty good idea of what I was trying to achieve.

I created a prompt based on the core requirements:

A web-based text-to-speech content aggregator that converts web articles into audio podcasts. The system will:

Core Features:
- Accept URLs through API endpoints or manual input
- Scrape and convert webpage content to markdown format
- Store content and processing status in PostgreSQL database
- Split markdown into appropriate chunks (<4000 characters)
- Generate audio using OpenAI's Text-to-Speech API
- Combine audio segments using FFmpeg
- Generate RSS podcast feed for audio content
- Provide playback interface for generated audio content
- Provide a list of the converted articles and their status
- Track and display processing status for each article
What do you want to build today?

I developed a minimum viable product (MVP) with the no-code tool Replit to speed up development. It integrates OpenAI’s Text-to-Speech technology for high-quality audio conversion and is built with TypeScript, Tailwind CSS, and PostgreSQL.
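To make the two audio steps from the prompt more concrete, here is a minimal sketch of what a speech request and an FFmpeg merge could look like. It is not the code Replit generated. The helper names are hypothetical, and the `tts-1` model and `alloy` voice are assumed choices. The endpoint is OpenAI's speech endpoint, and the FFmpeg arguments use its concat demuxer, which reads a list file of segment paths.

```typescript
// Plain description of an HTTP request, so the sketch does not depend on a fetch implementation.
interface SpeechRequest {
  url: string;
  method: "POST";
  headers: { Authorization: string; "Content-Type": string };
  body: string;
}

// Build the request for one text chunk. Model and voice are assumptions.
function buildSpeechRequest(chunk: string, apiKey: string): SpeechRequest {
  return {
    url: "https://api.openai.com/v1/audio/speech",
    method: "POST",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "tts-1",
      voice: "alloy",
      input: chunk,
      response_format: "mp3",
    }),
  };
}

// Contents of the list file for FFmpeg's concat demuxer: one "file '<path>'" line per segment.
// A single quote inside a path is written as '\'' (close quote, escaped quote, reopen quote).
function buildConcatList(segmentPaths: string[]): string {
  return (
    segmentPaths
      .map((p) => "file '" + p.replace(/'/g, "'\\''") + "'")
      .join("\n") + "\n"
  );
}

// Arguments for: ffmpeg -f concat -safe 0 -i <list> -c copy <output>
// "-c copy" joins the segments without re-encoding, which assumes they share one format.
function buildConcatArgs(listPath: string, outputPath: string): string[] {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", outputPath];
}
```

In a real service the request object would be passed to an HTTP client and the arguments to a child process. Keeping both as plain data makes them easy to log alongside each article's processing status.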

The MVP comprises over 18,000 lines of code, none of which I wrote.

Additionally, I created an iOS shortcut for the share sheet that saves the current URL or clipboard content. The MVP integrates with popular podcast platforms such as Spotify, Apple Podcasts, and Overcast. Users can play the audio and reach the original article content through the show notes.


www.icloud.com/shortcuts

iOS shortcut

Speasy is a service that converts saved articles into audio files, creating a personalised podcast feed. It allows users to listen to articles anytime, anywhere. The service uses high-quality Text-to-Speech technology and is compatible with major podcast players.
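A podcast feed is an RSS 2.0 document in which each item carries an `<enclosure>` pointing at the audio file. The sketch below shows one way such a feed could be generated. The `Episode` shape and function names are hypothetical, and a production feed would likely add further tags (for example the iTunes podcast extensions) that are omitted here.

```typescript
// Hypothetical shape of one converted article.
interface Episode {
  title: string;
  articleUrl: string; // original article, linked from the show notes
  audioUrl: string;   // location of the merged MP3
  audioBytes: number; // file size, required by the enclosure's length attribute
  publishedAt: Date;
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render one episode as an RSS <item>.
function buildItem(ep: Episode): string {
  return [
    "    <item>",
    "      <title>" + escapeXml(ep.title) + "</title>",
    "      <link>" + escapeXml(ep.articleUrl) + "</link>",
    "      <description>" + escapeXml("Original article: " + ep.articleUrl) + "</description>",
    '      <guid isPermaLink="false">' + escapeXml(ep.audioUrl) + "</guid>",
    "      <pubDate>" + ep.publishedAt.toUTCString() + "</pubDate>",
    '      <enclosure url="' + escapeXml(ep.audioUrl) + '" length="' + ep.audioBytes + '" type="audio/mpeg"/>',
    "    </item>",
  ].join("\n");
}

// Render the whole feed as an RSS 2.0 document.
function buildFeed(title: string, siteUrl: string, episodes: Episode[]): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    "    <title>" + escapeXml(title) + "</title>",
    "    <link>" + escapeXml(siteUrl) + "</link>",
    "    <description>" + escapeXml(title) + "</description>",
    ...episodes.map(buildItem),
    "  </channel>",
    "</rss>",
  ].join("\n");
}
```

Putting the original article URL in both `<link>` and `<description>` is what lets a podcast player surface it in the show notes.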


speasy.replit.app/api

Podcast RSS XML feed
