I built an AI audio briefing app and moved it to AWS

July 2, 2026
7 min read
By Rahat Kabir


What I Built

I built Briefcast as a daily audio briefing system.

The idea was simple: instead of opening many websites every morning, I wanted the app to research the topics I care about, understand what is actually useful, turn that into a short briefing with the option to see more details, and make it available as audio at the same scheduled time every day.

I wanted more than “call an LLM and summarize news.” I wanted to build a real research loop: the system should search, read sources, use tools through MCP (Model Context Protocol), collect useful context, avoid repeating information, and then generate a briefing that feels focused instead of random.

So Briefcast became a mix of backend engineering, AWS deployment, scheduling, job queues, audio generation, and LLM tool use. The goal was to make it work like a small personal briefing service, not just a demo.

User Flow

From the user side, the flow is very simple.

I add the topics I care about, choose a daily briefing time, and Briefcast handles the rest. When the scheduled time comes, it creates a research job, runs the research process, generates briefing cards, and makes the result available in the app.

The home page shows the latest briefing, the cards, and audio when it is available. I also kept history, manual research, and operator views because I wanted to see what the system was doing, not just wait blindly for a result.

Briefcast home page with briefing cards

This part was important to me because the project should not feel like only a backend experiment. Even though most of the work happened behind the scenes, the final output had to feel usable: open the app, see the briefing, read the important parts, or listen to the audio.

Research Loop

The core part of Briefcast is the research loop.

I did not want the app to send one prompt to an LLM and accept the first answer. For each topic, the system runs a loop where the LLM can decide what it needs next: search the web, fetch a page, inspect a source, or call a tool.

The tool layer includes MCP tools like Brave Search and Firecrawl, plus built-in tools like webpage fetch and GitHub activity lookup. This helped keep the research flow flexible instead of hardcoding every source into the app.
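
To make the loop concrete, here is a minimal sketch of the idea, not the actual Briefcast code. It assumes a plain dictionary of tool functions and a `decide` callback that stands in for the LLM choosing its next step. The names `run_research_loop`, `fake_search`, `fake_fetch`, and `scripted_decide` are hypothetical, and the stub tools only return placeholder strings instead of calling Brave Search or Firecrawl.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for this sketch: a tool takes one string argument
# and returns text; an action is (tool_name, argument).
ToolFn = Callable[[str], str]
Action = Tuple[str, str]


@dataclass
class ResearchState:
    topic: str
    notes: List[str] = field(default_factory=list)


def run_research_loop(
    topic: str,
    tools: Dict[str, ToolFn],
    decide: Callable[[ResearchState], Optional[Action]],
    max_steps: int = 5,
) -> ResearchState:
    """Let `decide` pick one tool call per step until it returns None."""
    state = ResearchState(topic=topic)
    for _ in range(max_steps):  # hard cap so the loop always terminates
        action = decide(state)
        if action is None:  # the "LLM" has enough context
            break
        name, arg = action
        tool = tools.get(name)
        if tool is None:
            # Record the mistake so the next decision can see it.
            state.notes.append(f"unknown tool: {name}")
            continue
        state.notes.append(tool(arg))
    return state


# Stub tools: placeholders, not real MCP or HTTP calls.
def fake_search(query: str) -> str:
    return f"search results for {query}"


def fake_fetch(url: str) -> str:
    return f"page text from {url}"


def scripted_decide(state: ResearchState) -> Optional[Action]:
    """Stands in for the LLM: search first, then fetch one page, then stop."""
    if len(state.notes) == 0:
        return ("search", state.topic)
    if len(state.notes) == 1:
        return ("fetch", "https://example.com/post")
    return None


state = run_research_loop(
    "rust async",
    {"search": fake_search, "fetch": fake_fetch},
    scripted_decide,
)
```

In a real loop the `decide` step would be an LLM call that sees the notes so far. The step cap matters because each iteration can trigger paid external calls.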

During the loop, the LLM uses these tools to collect useful context. Then Briefcast turns the research into briefing cards, checks for repeated or very similar cards, and saves the final briefing.

This made the output feel more like a real briefing process and less like a simple summary.
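
The check for repeated or very similar cards can be sketched with the standard library. This is only one possible approach: it assumes cards are compared by their text using `difflib.SequenceMatcher`, and the 0.85 threshold is an arbitrary value picked for illustration. Briefcast's real comparison may work differently.

```python
from difflib import SequenceMatcher
from typing import List


def is_near_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two card texts as duplicates if their similarity ratio is high."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def dedupe_cards(cards: List[str], threshold: float = 0.85) -> List[str]:
    """Keep the first card of every group of near-identical cards."""
    kept: List[str] = []
    for card in cards:
        if not any(is_near_duplicate(card, k, threshold) for k in kept):
            kept.append(card)
    return kept


cards = [
    "Python 3.13 released with a new REPL",
    "Python 3.13 released with a new REPL!",
    "AWS announces new Fargate pricing",
]
unique_cards = dedupe_cards(cards)
```

Comparing every card against every kept card is quadratic, which is fine for a handful of cards per briefing but would need a different approach at larger scale.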

Briefcast research search flow

How the Backend Works

The backend is built with FastAPI and Postgres.

FastAPI handles the API routes, auth, topics, settings, briefings, audio, and operator endpoints. Postgres stores the durable state: users, topics, jobs, cards, audio references, and progress logs.

I designed the research flow as a job-based system. The API creates a job, a worker processes it, and the frontend can show progress while the longer research work happens in the background.

This same pipeline works for both daily scheduled briefings and manual research runs.
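
As a rough illustration of the job-based flow, the sketch below keeps jobs in memory and records progress lines that a frontend could poll. It is an assumption-heavy stand-in: the real system stores jobs in Postgres and hands them to a worker through a queue, and the names `JobStatus`, `Job`, `JobStore`, and `process` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class JobStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    id: str
    topic: str
    status: JobStatus = JobStatus.QUEUED
    progress: List[str] = field(default_factory=list)


class JobStore:
    """In-memory stand-in for the jobs table."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def create(self, topic: str) -> Job:
        job = Job(id=str(uuid.uuid4()), topic=topic)
        self._jobs[job.id] = job
        return job

    def get(self, job_id: str) -> Job:
        return self._jobs[job_id]


def process(
    store: JobStore,
    job_id: str,
    steps: List[Tuple[str, Callable[[], None]]],
) -> Job:
    """Worker side: run each step and log progress as it goes."""
    job = store.get(job_id)
    job.status = JobStatus.RUNNING
    try:
        for label, step in steps:
            job.progress.append(f"started: {label}")
            step()
            job.progress.append(f"finished: {label}")
        job.status = JobStatus.DONE
    except Exception as exc:
        # Keep the failure visible instead of losing the job silently.
        job.progress.append(f"error: {exc}")
        job.status = JobStatus.FAILED
    return job


store = JobStore()
job = store.create("ai news")  # what the API would do on request
process(store, job.id, [("search", lambda: None), ("write cards", lambda: None)])
```

Because the API only creates the job and returns, the same `process` path can serve both a scheduled run and a manual one.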

Briefcast backend architecture diagram

How I Moved It to AWS

After the local backend was working, I moved Briefcast to AWS step by step.

I started by containerizing the FastAPI backend and pushing the image to ECR. Then I ran the API on ECS Fargate, connected it to a private RDS Postgres database, and placed it behind a load balancer. For HTTPS access, I used CloudFront in front of the API path.

For the frontend, I used AWS Amplify to host the React app. The user opens the frontend from Amplify, and the frontend talks to the backend through the CloudFront API URL.

Briefcast AWS architecture diagram

The background work runs separately from the API. EventBridge checks the schedule every 15 minutes, creates jobs, and sends them to SQS. A separate ECS worker picks up those jobs, runs the research loop, writes the briefing results back to RDS, stores audio in S3, and sends logs to CloudWatch.

This setup made the system easier to manage. The API can stay responsive, while research and audio generation happen in the background.
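
The 15-minute schedule check can be sketched as a pure function. This is a simplified assumption of how such a check might work, not Briefcast's actual logic: it treats every briefing time as UTC, assumes ticks arrive exactly 15 minutes apart, and uses hypothetical names (`is_due`, `due_users`, `user_a`, `user_b`). A briefing time is considered due when it falls in the window that ends at the current tick, so each time matches exactly one tick.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Dict, List

TICK = timedelta(minutes=15)  # matches the 15-minute schedule check


def is_due(briefing_time: time, tick: datetime, window: timedelta = TICK) -> bool:
    """True if briefing_time falls in the half-open window (tick - window, tick]."""
    # Also try yesterday's date so a 23:55 briefing is caught by a 00:05 tick.
    for day_offset in (0, -1):
        day = tick.date() + timedelta(days=day_offset)
        scheduled = datetime.combine(day, briefing_time, tzinfo=tick.tzinfo)
        if tick - window < scheduled <= tick:
            return True
    return False


def due_users(schedules: Dict[str, time], tick: datetime) -> List[str]:
    """Return the users whose briefing should be queued on this tick."""
    return [user for user, t in schedules.items() if is_due(t, tick)]


schedules = {"user_a": time(7, 0), "user_b": time(7, 10)}
tick_0700 = datetime(2026, 7, 2, 7, 0, tzinfo=timezone.utc)
tick_0715 = datetime(2026, 7, 2, 7, 15, tzinfo=timezone.utc)
due_at_0700 = due_users(schedules, tick_0700)
due_at_0715 = due_users(schedules, tick_0715)
```

With this window a 07:10 briefing is queued by the 07:15 tick, so it can start up to 15 minutes late but never early. In the deployed setup, each due user would become one message on the queue.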

What I Learned From Deployment

The biggest lesson was that deploying the app was not only about putting the backend on AWS.

A lot of the work was operational: making sure the API stayed healthy, jobs did not disappear, old queue messages did not run unexpectedly, secrets were handled safely, and long-running research jobs could be observed from logs or the operator dashboard.

I also learned that background jobs need a different way of thinking than normal API requests. A research job can take more than a minute, so I had to think about the SQS visibility timeout, retries, a dead-letter queue (DLQ), worker scaling, and how to show progress to the frontend.
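
To show why the visibility timeout matters for long jobs, here is a small in-memory simulation. It is a simplified model of the queue behavior, not real SQS and not boto3: the class `SimQueue` and its methods are hypothetical, time is passed in as plain numbers of seconds, and only the basic rules are modeled (a received message is hidden for the timeout, reappears if it was not deleted, and moves to a dead-letter list after too many receives).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare messages by identity, not by field values
class Message:
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0


class SimQueue:
    """Toy queue with a visibility timeout and a max receive count."""

    def __init__(self, visibility_timeout: float, max_receive_count: int) -> None:
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self.messages: List[Message] = []
        self.dead_letters: List[Message] = []

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float) -> Optional[Message]:
        for msg in list(self.messages):
            if msg.invisible_until > now:
                continue  # still hidden while a worker holds it
            if msg.receive_count >= self.max_receive_count:
                # Too many attempts: park it instead of retrying forever.
                self.messages.remove(msg)
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.invisible_until = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        """The worker calls this only after the job finished successfully."""
        self.messages.remove(msg)


# Timeout (30 s) shorter than the job (say 90 s): the message reappears
# while the first worker is still busy, so the job can run twice.
short = SimQueue(visibility_timeout=30, max_receive_count=2)
short.send("job-1")
first = short.receive(now=0)
hidden = short.receive(now=10)
second = short.receive(now=31)
third = short.receive(now=62)

# Timeout (120 s) longer than the job: one delivery, then a clean delete.
safe = SimQueue(visibility_timeout=120, max_receive_count=2)
safe.send("job-2")
msg = safe.receive(now=0)
safe.delete(msg)
after_delete = safe.receive(now=121)
```

In the first scenario the same message is delivered twice and then lands in the dead-letter list, even though nothing actually failed. That is the kind of surprise a too-short timeout can cause when research jobs run for more than a minute.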

Briefcast operator dashboard

This part made the project feel much closer to a real system. The hard part was not only making the feature work once, but making it understandable when something goes wrong.

Cost and Trade-offs

I also checked the AWS cost while deploying Briefcast, but I did not want to turn this into a perfect billing report.

The useful part was seeing the pattern. The main visible cost areas were RDS for the database, Elastic Load Balancing for the public API path, ECS for running containers, and VPC-related networking. There were also smaller costs from supporting services such as Secrets Manager, CloudWatch, and S3.

Briefcast AWS billing snapshot

This made the trade-off clear. A production-shaped setup is easier to reason about, but some services start creating cost as soon as they are running. RDS, load balancing, networking, and running ECS tasks are billed for the time they are up, whether or not anyone uses the app, which is different from purely usage-based services.

I also had to think about paid external calls. The worker can call LLM providers, Brave, Firecrawl, and Polly, so I kept those paths controlled while testing.

The main lesson was that architecture and cost are connected. For this stage, I accepted some cost to build a realistic AWS setup. If I continued toward production, I would add stricter budgets, alerts, scaling rules, and a clearer worker operating model.

What I Would Improve Next

This version of Briefcast works, but I still see it as an early production-style build.

The next improvements would be moving the AWS setup into Terraform, adding a custom domain and HTTPS certificate, improving cost alerts, and deciding the final worker model. Right now, the worker can be controlled carefully, but for a real product I would need a cleaner strategy for when it should run and how it should scale.

I would also improve the email setup, monitoring, and deployment process so the system is easier to recreate and maintain.

One important reality is that I am not keeping this version publicly available all the time. It is usable, and right now only a friend and I are testing it, but keeping all the AWS pieces running continuously has a cost.

For this stage, my goal was not to launch a public SaaS. My goal was to understand how an AI product like Briefcast behaves when it moves from local development into a real cloud setup. So I am treating this as a production-style learning build: the architecture works, the flow is tested, and the next step is making the operating model cheaper and easier to maintain.

Overall, this project helped me understand how an AI app becomes more than an LLM wrapper. The hard part is not only generating text. The hard part is building the system around it: scheduling, tools, queues, storage, observability, cost control, and a user experience that feels simple.