Debugging in OpenClaw is not easy

How I tried to build myself a smart transcription assistant, and learned how hard debugging is with OpenClaw.

First, a little explanation. Last week had a long weekend because of the 1st of May, and it was well spent. That's why I did not put out a new "Promptly Done" post. No excuse, just a statement. Because sometimes there are way more important things than blog posts. Like enjoying life.
What started as an idea last week, and what I tried out this week, was building a new specialized skill for my OpenClaw instance. The exact prompt was:

Hey I want to have the possibility to record an interview in a loud place, send it to you, get it transcribed and get a summary of the keypoints. What would be the best approach? What can you do yourself and what do I need to do on my terminal?

All of the above are, of course, things that different products can achieve on their own. But I wanted to see if I could build this on my own. Or more exactly, let OpenClaw build it rather than buy it off the shelf. The answer was:

What I can do myself:
I can help you:
• write or improve the transcription script
• write a summarization script
• tune prompts for noisy rooms and German speakers
• help chunk long recordings
• help debug failed runs
• summarize transcripts you send me as text
• if transcription backend works, process audio you send here

What you need to do on your terminal:
You need to make sure the machine has:
1. a working speech-to-text backend
   • easiest: OpenAI Whisper API key
   • or local Whisper / Faster-Whisper
2. audio tooling for large files
   • ideally ffmpeg
3. optionally, a local summary script
   • transcript in, summary out

That sounded like a pretty solid game plan, so I dug into it. However, it did not work out. I got the scripts and the infrastructure set up, but then I spiraled into a never-ending debugging chat. I sent prompts like:

Can you please implement something that splits the audiofile in smaller ones
And:
Ok great. Can you get this to also work on m4a files that I attach?
And:
So what is happening. I have the feeling anything that has to be done in chunks gets caught up

But the larger audio files still would not get transcribed. To make matters worse:
So, in the end, I now have a skill that transcribes smaller bits of audio and directly recorded audio messages just fine. This worked amazingly well. But figuring out how to get this done for larger files, and why OpenClaw fails right now, has been immensely difficult.
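
For reference, the chunking step I kept asking for can be sketched in a few lines of Python. This is only a sketch under assumptions, not the script OpenClaw produced for me: it assumes ffmpeg is installed and on the PATH, the names build_split_command, split_audio and transcribe_chunks are made up for illustration, and transcribe is a placeholder for whatever Whisper backend is in use. The last function notes which chunk failed instead of stopping silently, which is the kind of visibility I was missing.

```python
import subprocess
from pathlib import Path
from typing import Callable


def build_split_command(source: Path, out_dir: Path, seconds: int = 600) -> list[str]:
    """Build an ffmpeg call that cuts `source` into pieces of about `seconds` each."""
    pattern = out_dir / "chunk_%03d.wav"
    return [
        "ffmpeg", "-i", str(source),
        "-ac", "1",                     # mix down to mono
        "-ar", "16000",                 # resample to 16 kHz
        "-f", "segment",                # ffmpeg's segment muxer
        "-segment_time", str(seconds),  # target length of each piece
        str(pattern),
    ]


def split_audio(source: Path, out_dir: Path, seconds: int = 600) -> list[Path]:
    """Run ffmpeg and return the chunk files in order (assumes ffmpeg is installed)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_split_command(source, out_dir, seconds), check=True)
    return sorted(out_dir.glob("chunk_*.wav"))


def transcribe_chunks(chunks, transcribe: Callable):
    """Call `transcribe` on every chunk; collect failures instead of aborting."""
    texts, failures = [], []
    for chunk in chunks:
        try:
            texts.append(transcribe(chunk))
        except Exception as exc:  # hypothetical backend, so catch broadly
            failures.append((chunk, repr(exc)))
    return texts, failures
```

Calling split_audio(Path("interview.m4a"), Path("chunks")) and passing the result to transcribe_chunks would then give back the transcribed texts plus a list of the chunks that failed and why. Re-encoding to 16 kHz mono WAV is a design choice in this sketch, so that m4a input gets cut the same way as any other format.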

And maybe this is also part of a more general problem with LLMs and the like. It is hard for me as an end user to understand what's going on. While it is working, I might not really care. But if it fails and keeps on failing – that's a real pain.

