I use AI to draft scripts all the time. It's fast, it's useful, and it gets ideas out of my head and onto a page before I've finished my coffee. So this isn't an anti-AI piece. Far from it.

But I've noticed a pattern. We get scripts sent over for brand films, explainer videos, thought leadership pieces, and increasingly they have a particular flavour. Polished grammar. Neat structure. And a strange, corporate stiffness that makes the presenter sound like they're reading the back of a software box from 2009.

I tested it. Typed "write me a video script about digital transformation" into ChatGPT. First line back: "In today's rapidly evolving digital landscape, organisations must leverage innovative solutions to stay ahead of the curve." Try saying that to a camera with a straight face. It's hard. I've watched people try.

The tools aren't the problem. The defaults are. And once you know what to look for, they're easy to fix.

The Default Habits Worth Knowing About

Written words and spoken words follow different rules. A sentence that reads fine on a page can sound completely wrong when someone says it into a camera. AI defaults lean on written-English conventions: long subordinate clauses, formal vocabulary, passive constructions. These are fine in a white paper. On screen, they make your presenter sound like they're reading someone else's homework.

Ask ChatGPT to write a 60-second explainer and you'll get something like:

"Our comprehensive suite of solutions empowers businesses to streamline their operations, enhance productivity, and drive meaningful outcomes in an increasingly competitive marketplace."

Read that out loud. That sentence takes about nine seconds to deliver, and by second four most viewers have mentally moved on. It has the cadence of a terms-and-conditions page. Nobody talks like that, and when a presenter is forced to, you can see them stiffen up.
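As a rough sanity check on sentences like that, you can estimate delivery time from word count alone. This is a sketch, assuming a conversational on-camera pace of around 150 words per minute, which is a common ballpark rather than a hard rule:

```python
def speaking_seconds(text: str, words_per_minute: int = 150) -> float:
    """Approximate time in seconds to say `text` aloud.

    150 wpm is an assumed average conversational pace; real
    presenters vary, so treat the result as a ballpark.
    """
    words = len(text.split())
    return round(words / words_per_minute * 60, 1)

line = ("Our comprehensive suite of solutions empowers businesses to "
        "streamline their operations, enhance productivity, and drive "
        "meaningful outcomes in an increasingly competitive marketplace.")
print(speaking_seconds(line))  # roughly 9 seconds at 150 wpm
```

Anything pushing past eight or nine seconds in a single breath is a candidate for splitting.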

Copilot is a bit better at sounding conversational but has its own quirks. It loves the word "harness." It defaults to upbeat energy regardless of subject matter, which means a script about data breaches ends up sounding weirdly cheerful. The tone mismatch is something to watch for.

I've been keeping a list of phrases that crop up regularly in AI-generated video scripts. If you spot these in your draft, they're worth swapping out:

  • "In today's [adjective] landscape" — ChatGPT reaches for this in almost every first draft. It's filler.
  • "Empower / enable / unlock" — vague verbs that sound impressive but don't tell the viewer anything specific.
  • "Cutting-edge / state-of-the-art / next-generation" — interchangeable. Swap any one for any other and the sentence reads the same.
  • "Streamline operations" — perfectly fine in a report. Sounds odd when a human says it to camera.
  • "Drive meaningful outcomes" — what does this mean in practice? Replace it with something concrete.
  • "Harness the power of" — Copilot's go-to. Easily spotted.

These aren't bad phrases in writing. They're just not how people speak. And video is spoken.
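If you'd rather automate the spot-check, a few lines of Python will flag these phrases in a draft before anyone has to read them to camera. A minimal sketch: the patterns below cover only the list above, not every piece of corporate filler, so treat it as a first pass rather than a style checker:

```python
import re

# Filler phrases from the list above; the regexes are illustrative,
# not an exhaustive jargon detector.
FILLER_PATTERNS = [
    r"in today's .*? landscape",
    r"\b(?:empower|enable|unlock)\b",
    r"\b(?:cutting-edge|state-of-the-art|next-generation)\b",
    r"streamline (?:their )?operations",
    r"drive meaningful outcomes",
    r"harness the power of",
]

def flag_filler(script: str) -> list[str]:
    """Return every filler phrase found in the script, lowercased."""
    hits = []
    for pattern in FILLER_PATTERNS:
        hits += [m.group(0).lower()
                 for m in re.finditer(pattern, script, re.IGNORECASE)]
    return hits

draft = ("In today's rapidly evolving landscape, we empower teams to "
         "harness the power of cutting-edge tools.")
print(flag_filler(draft))
```

Every hit is a prompt to ask: would I actually say this to a colleague?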

Why It Matters More on Camera

When someone reads a mediocre blog post, they skim. Their eyes jump ahead, skip the filler, find the useful bits. Video doesn't offer that luxury. Your audience is locked into the pace you set. If the script is padded, viewers can't fast-forward to the good part. They just close the tab.

The bigger issue is what it does to your presenter. I've seen confident, articulate people stumble over AI-generated sentences because the phrasing doesn't match how they speak. They tighten up. Their delivery goes wooden. And when a presenter loses confidence mid-take, the audience feels it immediately.

We worked with a CEO earlier this year who runs a brilliant company. Sharp, funny, knows his stuff inside out. But the original script had him saying things like "synergistic value creation" and he couldn't get through a single take without breaking character. We spent forty minutes rewriting it together, putting it into his words, and the difference was night and day. Same message, completely different energy.

AI first draft

Our mission is to empower organisations to leverage cutting-edge technology solutions that drive transformative outcomes across their enterprise ecosystem.

After a quick rewrite

We help companies pick the right tech and get it working.

AI first draft

Let's explore how our innovative platform enables seamless collaboration across distributed teams, driving enhanced productivity and fostering a culture of continuous improvement.

Camera-ready version

Our platform makes it easier for remote teams to work together. Here's how.

How to Get Better Output (Without Ditching AI)

The tools are good. I use ChatGPT and Copilot regularly for brainstorming, outlines, and rough drafts. They save time and they're getting better. The trick is knowing that the first output is a starting point, not a finished script.

The simplest test: read it out loud. If any sentence makes you feel like you're narrating a PowerPoint from 2014, it needs rewriting. If you wouldn't say it to a colleague over coffee, it doesn't belong in a video.

One thing I've started doing with clients is asking them to read the AI draft to me on a video call. Not perform it. Just read it as if they're explaining something to a friend. Within thirty seconds they're pausing, laughing, saying "OK that bit sounds off." They rewrite it themselves, in their own words, and the result is always better. Hearing corporate phrasing come out of your own mouth is its own quality control.

The other thing that helps: change your prompt. Tell ChatGPT "write this as if I'm explaining it to a colleague over coffee" and the output improves noticeably. It'll still sneak in the odd "leverage" or "empower." But it's closer to how people communicate, which is closer to what works when a camera is rolling.

A few more prompts that produce better video scripts:

  • "Write this for someone who will say it out loud on camera" — forces shorter sentences and simpler vocabulary.
  • "Avoid corporate jargon, keep it conversational" — obvious, but it makes a measurable difference.
  • "Write it in the voice of [the presenter's name] based on this example of how they speak" — paste in a transcript from a previous video and the AI matches the cadence.
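If you find yourself typing these instructions repeatedly, it can help to bake them into a reusable prompt builder. The wording below is illustrative, just the tips above assembled into one request, so adjust it to your own voice:

```python
def build_script_prompt(topic: str, transcript_sample: str = "") -> str:
    """Assemble a spoken-word-first prompt for a chat model.

    The instruction phrases are the ones suggested above; the exact
    wording is an assumption, not a tested recipe.
    """
    parts = [
        f"Write a 60-second video script about {topic}.",
        "Write it for someone who will say it out loud on camera.",
        "Avoid corporate jargon; keep it conversational.",
    ]
    if transcript_sample:
        parts.append(
            "Match the voice in this transcript of the presenter speaking:\n"
            + transcript_sample
        )
    return "\n".join(parts)

print(build_script_prompt("digital transformation"))
```

Paste the result into ChatGPT or Copilot as usual; the point is simply that the spoken-word instructions travel with every request instead of being retyped.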

AI knows what looks correct on a page. What sounds right on camera is a different skill, and that's where the human ear comes in. Use the tools for speed and structure, then spend ten minutes reading the draft out loud and rewriting the bits that feel stiff. That ten minutes is the difference between a video people watch and one they click away from.

Kate Bennett

Group CEO, Compare the Cloud

Kate leads Compare the Cloud and Disruptive Live, working with B2B tech brands on content, video, and events. She writes about what she's learning along the way.