Here at SOMO, we use AI tools such as ChatGPT for different things. Some of us use it for inspiration, copywriting, and design, while I use it mostly for coding and complex data analysis.
I remember when GPT-4 came out – it was a big deal. Everyone was super excited about its capabilities. But lately? Not so much. The responses feel stiff, overly formal, and sometimes just uninspired. Not to mention the laziness. And it seems I’m not alone in this. Online forums are full of people complaining about ChatGPT’s declining performance. There’s even a Stanford research paper from last year demonstrating the dip in performance.
The recently announced GPT-4o (Omni) model might be an improvement, but I’m not married to a single LLM. I like to explore different options and see what works best.
So, why is ChatGPT getting “dumber”?
There are three main theories:
- Cutting costs: Running an AI is extremely expensive. The chips are costly, and running them requires an insane amount of power. To save money and resources, OpenAI might be limiting GPT-4’s capabilities. This would also explain the usage caps they’ve implemented, even for paid users. Think of it as shrinkflation in computing resources.
- Copyright concerns: Some people speculate that OpenAI is playing it safe with GPT-4’s creativity to avoid copyright issues.
- The familiarity factor: Last year, Peter Welinder from OpenAI suggested that GPT-4’s perceived decline is just our impression. “You use it more heavily, you start noticing issues you didn’t see before,” he stated. However, this explanation hasn’t convinced everyone. Some believe it’s a case of confirmation bias – the more people complain about GPT-4 getting “dumber,” the more likely we are to notice and agree with those claims.
The rise of the alternatives
While GPT-4 faces these challenges, other LLMs are stepping into the spotlight. Anthropic’s Claude 3 Opus is currently leading in benchmarks (though it’s only available in the US for now, unless you use a VPN). Meta’s Llama 3 is promoted as the most powerful free and open-source LLM on the market, and then there’s Mistral.
But the most interesting contender comes from Google, and it’s called ✨ Gemini ✨ .
Gemini: From Bard’s failure to success
Remember Bard, Google’s first public LLM? It was terrible. But Gemini has learned from those mistakes and become something truly impressive.
It all started with the Gemini 1.0 announcement in December 2023. Google released three versions: Ultra for complex tasks, Pro for general use, and Nano, a lightweight version for smaller tasks and on-device processing. While a significant improvement over previous Google models, Gemini 1.0 didn’t fully live up to the hype. Its release was met with controversies and somewhat overshadowed by the simultaneous announcement of Sora, OpenAI’s groundbreaking video generation AI.
But then came Gemini 1.5 Pro just two months later, and everything changed.
Gemini 1.5: What’s the big deal?
The key improvement? Long-context understanding. Gemini 1.5 can process a massive amount of text – up to one million tokens – without any issues. That’s roughly eight times the 128,000-token window of GPT-4 Turbo.
Tokens and context windows: A simple explanation
Imagine feeding text to an AI. The AI breaks it down into smaller units called tokens. These can be whole words, parts of words, or punctuation. The context window is like the AI’s memory — it determines how many tokens it can handle at once. The larger the context window, the more information the AI can remember and use in its responses, resulting in more accurate and coherent results.
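To make this concrete, here’s a toy Python sketch of the idea. It splits text on words and punctuation, which is a deliberate simplification – production models like Gemini and GPT-4 use subword tokenizers (e.g. BPE or SentencePiece), so real token counts will differ:

```python
import re

def toy_tokenize(text):
    # Split text into word and punctuation tokens.
    # Real LLM tokenizers use subword schemes, so this
    # is only a rough illustration of the concept.
    return re.findall(r"\w+|[^\w\s]", text)

def fit_context(tokens, context_window):
    # A model can only "see" its most recent context_window
    # tokens; anything earlier is effectively forgotten.
    return tokens[-context_window:]

tokens = toy_tokenize("The larger the window, the more the model remembers.")
print(len(tokens))             # 11 tokens (words plus punctuation)
print(fit_context(tokens, 4))  # only the last 4 tokens fit
```

With a one-million-token window, Gemini 1.5 rarely has to drop anything – which is why it can answer questions about tiny details buried deep in a long document.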
To demonstrate this power, Google had Gemini process the complete 402-page transcript of the Apollo 11 mission. It then answered questions about the smallest details with ease. That’s pretty impressive!
Update 16/05/2024: Google has announced they’re increasing the Gemini 1.5 Pro context window to 2 million tokens. They have also announced Gemini 1.5 Flash, a smaller and faster model optimised for more basic tasks. Read the full announcement.
How to try Gemini 1.5 for free
You can access Gemini through the standard web interface at gemini.google.com, but that’s still running the older Gemini 1.0 model. To experience the full potential of 1.5, you can use Google AI Studio – a prototyping tool for experimenting with AI models. And the best part: it’s available for FREE, at least for now. Simply head over to aistudio.google.com, log in using your Google account, accept the terms, and start experimenting with it.
Instructions for Google Workspace users
If you have a Workspace account, you might need to enable AI Studio access first. Just follow the steps below:
⚠️ You must have admin access to Google Workspace to complete the steps below. If not, you will have to forward the instructions below to the Workspace admin.
- Open the Google Workspace admin console.
- Go to Menu > Apps > Additional Google services.
- In the list of all services, scroll down and click Early Access Apps. The Settings for Early Access Apps page opens.
- On the Settings for Early Access Apps page, click Service status.
- To turn AI Studio on or off, select On or Off, and then click Save.
- Also on the Settings for Early Access Apps page, click Core Data Access Permissions and check the “Allow users at your organization to access Google Workspace and Customer Data using Early Access apps” option, then click Save.
Once you’re in, click Create new prompt and select the Chat prompt option. Then, go to the Model settings on the right and make sure Gemini 1.5 Pro is selected. Now you should be able to interact with the AI just like you would on ChatGPT. You can also add images, videos and other files to the prompt.
⚠️ Important: There’s no auto-save feature on Google AI Studio. If you want to save the chat history, you must do it manually by clicking the Save button at the top.
SOMO’s recommendation
If you’ve been using ChatGPT, I definitely recommend trying Gemini. It might become your new favourite. But remember, different AIs are good at different things, so your mileage may vary. Gemini might excel at coding, while ChatGPT might be better for creative writing. It’s all about finding the right tool for the task.
And since there’s no standard way to measure LLMs, benchmarks can only tell us so much. The real test is how well they perform in practical applications.
If you’re looking to incorporate AI into your digital marketing mix, feel free to reach out to us. We’d be happy to help you explore the possibilities. And I’m always interested in hearing how other marketers are using AI in their work!