Using AI to turn a talk into an article
- Balázs Kis
- Mar 28
- 4 min read
Updated: Apr 11

In this article, I describe how I created another article. The other article is called Only Humans Can Learn How to Translate.
This article is not intended as a parody of research articles – but I realize it will function as one, because some would-be research articles never get beyond this level. I want this piece to be more useful than that, though: I will point out how this very incidental and anecdotal experiment can be turned into serious research.
The original talk
At a memoQ University Summit, held on January 22, 2025, at NOVA University in Lisbon, I was asked to give a keynote relevant to the studies of the people present. I decided to speak about how only humans can learn how to translate. My main points centered on how it is still a human privilege (1) to be conscious of the process and the objective, (2) to have a sense and judgment of good quality, (3) to actually care about and want to achieve good quality.
The lecture was recorded and transcribed using AI. I have manually edited the transcript into an article, which is available here.
The experiment was meant to find out whether it is possible to use a publicly available AI tool to create the article instead of spending hours on manual editing. Without giving away too much, I can say that the answer is “yes and no” – some of the AI tools are suitable for the task, but with some caveats.
How it was transcribed
I had recorded the lecture in the classroom, and the original intention was to edit and publish the recording. However, the image in the recording was subpar (I got both the frame and the lighting wrong). The sound recording was passable, though.
I fed the video files into a transcription tool called Turboscribe, which uses an Azure OpenAI model (probably a variant of GPT-4 because multi-modal operation is required). The transcription was almost faultless; even the names were correctly transcribed, which is all the more impressive because I am not a native English speaker.
It was amusing, though, that “ChatGPT” was transcribed as “Tragic-PT”, which was one of roughly three mistakes Turboscribe made in a one-hour talk. (When feeding the videos into Turboscribe, I chose their “Whale” option, which promises the highest quality.)
I did not feed this transcript directly into the AI tools. Instead, I manually postedited it first. I usually speak without extensive rehearsals, which means that my talks are improvised and very spontaneous most of the time. As a result, the transcript reflected many glaring characteristics of spoken language – pauses, repetitions, restarted or superfluous phrases. I manually cut these down so that most sentences became grammatical and the transcript was easier to follow. But this postediting phase was minimal.
How to turn this into serious research: If you want to run an experiment to evaluate speech transcription tools, you need to evaluate them on multiple, shorter talks (take 1- or 2-minute samples), and you need multiple human posteditors. You might want to measure editing times and/or edit distances between the ‘raw’ and the postedited transcriptions. You may save the postedited transcriptions as benchmarks and compare the ‘raw’ transcriptions to them, thus setting up a BLEU-like score.
Do not use further AI tools to postedit the raw output – they will need to be evaluated separately and must not be part of your ‘measuring equipment’.
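To illustrate the edit-distance idea, here is a minimal sketch that counts word-level edits between a ‘raw’ transcript and its postedited version. The function names and the choice to normalize by the length of the postedited text are my assumptions for illustration, not a standard prescribed anywhere in this article; the sample sentence is invented.

```python
def word_edit_distance(raw: str, edited: str) -> int:
    """Levenshtein distance counted in words (insert, delete, substitute)."""
    a, b = raw.split(), edited.split()
    prev = list(range(len(b) + 1))  # distance from an empty raw text
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # substitute, or keep a matching word
            ))
        prev = curr
    return prev[-1]


def edit_rate(raw: str, edited: str) -> float:
    """Edit distance divided by the word count of the postedited text.

    Assumption: an empty postedited text is scored as 0.0.
    """
    n = len(edited.split())
    return word_edit_distance(raw, edited) / n if n else 0.0


# Invented sample: one wrong word and one spoken-language repetition.
raw = "I asked Tragic-PT to to translate this"
edited = "I asked ChatGPT to translate this"
print(word_edit_distance(raw, edited), round(edit_rate(raw, edited), 3))
```

With several posteditors per sample, the same function can be run on each raw/postedited pair and the rates averaged per transcription tool.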
What AIs I used and how
I tried four different publicly available AI tools with the following prompt (adjusted to the specific tool as necessary):
“You are a content writer for a software company. The attached document is the edited transcript of a talk I delivered about AI in translation. Use this transcript to create a well-formed article intended for a printed publication for localization professionals. Use the most suitable language, structure, and formatting. BUT keep the original title, train of thought and language as much as you can. Keep it under 2500 words. Return the result in another document.”
The AI tools used were the following:
ChatGPT 4o
Claude 3.5 Sonnet
Microsoft 365 Copilot
Gemini
How to turn this into serious research: Use a larger number of transcripts (see the comment in the previous section) and automate the processing through the APIs of the large language model or the AI service. Use human evaluation, but with structured instructions.
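A rough sketch of what such automation could look like follows. It deliberately does not call any real API: each tool is represented by a hypothetical `generate` function (prompt in, article text out) that you would have to write yourself around the client library of the service in question. The prompt template is an abbreviated stand-in for the prompt quoted above, and all names here are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Abbreviated stand-in for the full prompt; {transcript} is filled in per talk.
PROMPT_TEMPLATE = (
    "You are a content writer for a software company. "
    "Use the transcript below to create a well-formed article. "
    "Keep it under {max_words} words.\n\n"
    "TRANSCRIPT:\n{transcript}"
)


def run_batch(
    transcripts: Dict[str, str],
    models: Dict[str, Callable[[str], str]],
    max_words: int = 2500,
) -> List[dict]:
    """Send every transcript to every model and collect the outputs."""
    results = []
    for talk_id, transcript in transcripts.items():
        prompt = PROMPT_TEMPLATE.format(max_words=max_words, transcript=transcript)
        for model_name, generate in models.items():
            article = generate(prompt)  # hypothetical wrapper around a real API
            word_count = len(article.split())
            results.append({
                "talk": talk_id,
                "model": model_name,
                "article": article,
                "words": word_count,
                "within_limit": word_count <= max_words,
            })
    return results


# Usage with a dummy model, so the sketch runs without network access.
dummy_models = {"dummy": lambda prompt: "A very short article."}
rows = run_batch({"talk-1": "edited transcript text"}, dummy_models)
print(rows[0]["model"], rows[0]["words"], rows[0]["within_limit"])
```

Keeping the model call behind a plain function makes it easy to store every output with its talk and model name, which is what the human evaluators will need later.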
Results
If I have to rank the results, which I can only do subjectively, the ranking is as follows – from best to worst:
1. Claude – Claude created a very readable article of 925 words with little to give away that this was AI-generated content. Only minimal postediting was needed, to convert the markdown formatting that Claude uses by default.
2. ChatGPT 4o – ChatGPT created a mostly readable article of 1015 words. The article exposed two kinds of problems, though:
   - ChatGPT had a problem finding the focus or main point of the article. It omitted an important point but kept a tangent and used discourse gymnastics to fit it into the rest of the text. In the Conclusion part, it attempted to include all the points I made, which resulted in word salad that contained more tangents than main points.
   - On one occasion, ChatGPT inserted a point that was not made in the talk.
3. and 4. Gemini and Copilot – In both cases, the output was not usable. The AI tool practically regurgitated the original transcript, omitting small parts here and there. The text was not turned into a meaningful, publication-ready structure.
How to turn this into serious research: Create human-postedited benchmarks and use both automated comparisons (of several samples) as well as human evaluation.
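One possible form of such an automated comparison is a clipped n-gram precision: the share of word pairs in the AI-generated article that also occur in the human-postedited benchmark. The sketch below is an assumption-laden simplification, not full BLEU (which also combines several n-gram lengths and penalizes overly short output), and the function names and sample strings are mine.

```python
from collections import Counter
from typing import List, Tuple


def ngram_counts(text: str, n: int) -> Counter:
    """Count the word n-grams of a text, ignoring case."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: str, benchmark: str, n: int = 2) -> float:
    """Share of candidate n-grams found in the benchmark.

    Each n-gram is credited at most as often as it occurs in the benchmark.
    """
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(benchmark, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # Counter '&' keeps the minimum counts
    return overlap / total


def mean_precision(pairs: List[Tuple[str, str]], n: int = 2) -> float:
    """Average the score over several (candidate, benchmark) samples."""
    scores = [clipped_precision(c, b, n) for c, b in pairs]
    return sum(scores) / len(scores) if scores else 0.0


benchmark = "only humans can learn how to translate"
print(clipped_precision("machines can learn", benchmark))
```

A score like this only says how much wording survived from the benchmark. It cannot tell whether the right points were selected, which is why the human evaluation remains necessary.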
(No) Conclusion
This article is purely anecdotal. If we need to answer the question of how good AI tools are at turning conference talk transcripts into articles, we need to look at several samples from various places. That said, the outcome of this experiment suggests that you can find an AI tool that would turn your messy transcript into a publishable summary if you are pressed for time. I will still publish my own manually edited article (which is about twice the size of the AI-generated ones at 1,998 words) because that is the one where the points that remained in the text were selected by my judgment.