GPT-4.1 support

Hi guys,

You're using GPT-4o in AI Blaze, but GPT-4.1 has been out for a few weeks now – when will you be switching over?

Thanks,

Alex

Hi Alex,

We're mulling over two things regarding it right now:

  1. The benchmark results for it compared to GPT-4o seem mixed overall. It seems like 4.1 is much better at coding than 4o, but other than that it's not clear that it's an across-the-board improvement, and it may in fact be worse in some cases. We also default to Sonnet, which is probably better than 4.1 at coding, so those improvements don't seem critical to AI Blaze.

  2. ChatGPT either isn't adopting it or, if they did adopt it, they decided to keep the GPT-4o label in the UI (it's unclear from OpenAI's public pronouncements). We're a bit concerned that users might be confused if we swap to a GPT-4.1 label when they still see GPT-4o in ChatGPT.

Anyway, those are the two factors making us hesitant to switch. What are your thoughts on it?

Hi Scott, thanks for getting back to me. I've switched my rewriting prompts that use the API over to GPT-4.1 and it's working really well so far. If you're unsure about making the switch for everyone, maybe you could add an option in the settings to enable more models, or introduce an 'experimental' option for people to try out new releases?

By the way, have you noticed that when using AI Blaze’s "Write with AI", Sonnet seems to 'think' before responding and then writes the message in a block, while GPT-4o just starts writing immediately? The extra moment Sonnet takes to describe what it’s about to do seems to help it generate better responses. Are you prompting them differently, or is Sonnet just responding differently to the same prompt?

Thanks

Hi Alex,

Our Sonnet and OpenAI GPT models use the same prompts. Sonnet is probably a larger model, so it may be a bit slower.

It's a good idea to offer experimental options; we'll explore that.

Thanks for your reply, Scott.

It's interesting how differently Sonnet and GPT-4o handle the prompt. I'm curious to see whether GPT-4.1 behaves the same way or if it's an improvement.