ChatGPT v Gemini: which AI is best?

If you thought the AI war was done and settled and that ChatGPT would replace your boss in a year’s time, think again. The war is back on. This week Google, which already touches all parts of our lives, unveiled Gemini, which it hopes will be a ChatGPT killer.

Gemini is its version of an LLM, or large language model, the technology that underpins chatbots such as ChatGPT. Google claims it’s the first AI model to do better than humans on a popular knowledge and problem-solving test. You can use it now through Bard, its AI chatbot, provided that you’re in one of the 170 countries where it’s available (it’s not yet available in the UK, but is coming soon).

The implication is that Google’s product is the real deal, and we should all abandon ChatGPT for the competition. It’s a ballsy move from the company best known for its search engine, which had released an earlier version of Bard during the first flurry of interest in ChatGPT this year.

“The race itself is a tale as old as tech industry time — try to get the biggest market share so you’re the default choice,” says Catherine Flick, professor of ethics and games technology at Staffordshire University. “OpenAI is winning the chatbot wars so far as everyone talks about ChatGPT, not Bard, but Google has a search engine advantage.”

“Until the release of ChatGPT, Google had been moving cautiously in the generative AI space, seeming to recognise the risks of introducing powerful but flawed models,” says Dr Mike Katell of the Alan Turing Institute, the UK’s national institute for data science and AI. “One advantage Google’s AI arm has over OpenAI is that they have been developing and iterating LLMs for much longer than OpenAI.”

So which is better? Using some trickery to access Gemini, we asked the same eight questions of both models, and here’s how they fared:

1) I’ve got eight friends coming to dinner and I want to impress them but with something quite simple and quick to make. They’re all meat-eaters. What recipe should I use and what do you suggest for food and cocktails?

ChatGPT took a more Nigella-like approach, rattling off a recipe for roast beef tenderloin, which it says is “a crowd-pleaser, simple to prepare and always looks impressive on the table”. Alongside that, it recommended serving old-fashioneds, which sounds like a good pairing.

But if ChatGPT went for a louche “bung it in the oven and entertain” approach, Gemini’s answer was a more details-orientated affair, as it suggested a feast even Mary Berry would blanch at. The meal plan had three courses: salmon crostini with caper cream to start, followed by a flank steak with roasted veg and parmesan mashed potatoes, then warm chocolate lava cakes, with drinks recommendations for each. It was a bit too try-hard, and didn’t actually include recipes.

Winner: ChatGPT

2) My wife wants to go somewhere for a Christmas break but she hates skiing and would like to be somewhere warm for a week. What are the best options for a romantic getaway, plus can you give me an itinerary of things to do and places to eat?

Rather than cut to the chase with a definitive suggestion, as it had with the previous question, ChatGPT hedged its bets by suggesting a perfectly serviceable week in the Maldives or a similar stay in Bali. Bali was Gemini’s first recommendation too, though it also suggested Phuket and Mauritius.

For those husbands who need to present the business case before any big family decisions, the handy images accompanying each day’s itinerary provided by Gemini — which was also far more detailed than ChatGPT’s — helped to build up the potential of a great getaway. An easy win.

Winner: Gemini

3) My child read a copy of Debrett’s and now believes it is good manners to respond to letters. She’s expecting Santa to acknowledge her present list. Help!

Kids are clever and quick to see through tricks. And while ChatGPT’s letter from Santa lays it on thick, with too many “ho ho ho!”s and knowing mentions of elves, it’s at least better than the alternative.

Gemini loses points for not actually providing a letter, instead opting for some life coach-like motivational baloney, and also for starting its answer with the words: “It’s wonderful that your child is learning about etiquette and good manners from Debrett’s.”

Winner: ChatGPT

4) I’ve got an important job interview coming up that would seriously raise my income. How should I prepare for it?

Slightly platitudinous but generally useful advice is something generative AI tools excel at, so both do well here. Both offer bullet-pointed lists of pretty useful, if vague, tips, though Gemini’s is slightly more detailed. However, if you need to be told to “maintain good grooming and hygiene”, the extra salary may not be your biggest concern.

Winner: Gemini

5) I’m stumped by my teenager’s maths homework. The question is: “The ratio of sheep to cows in a field is 3:2. The farmer adds one more cow to the field, which makes the total number of animals in the field 26. What is the ratio of sheep to cows in the field after the new cow joins the animals in the field?”

This question is taken from the GCSE Bitesize maths revision website, and the answer is 15:11, as you clearly know. (Look, we won’t tell if you don’t.) ChatGPT channels the smug Mensa member classmate the memory of whom still makes you subconsciously ball up your fist, explaining its working and getting the answer right.

Google’s version does the same thing, then says: “The new ratio of sheep to cows in the field is 39:27.” As I tell people, bear in mind that generative AI tools are like a fresh-faced, overconfident Oxbridge graduate.
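For anyone who would rather check the homework than trust either chatbot, the working can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are made up for this example, and it assumes the herd before the extra cow divides exactly into the starting ratio.

```python
from math import gcd


def ratio_after_extra_cow(sheep_part, cow_part, total_after, extra_cows=1):
    """Return the simplified sheep:cows ratio once the extra cows have joined.

    Hypothetical helper for illustration only.
    """
    # Take the newcomer away to get the size of the original herd.
    total_before = total_after - extra_cows
    parts = sheep_part + cow_part
    if total_before % parts != 0:
        raise ValueError("herd size does not fit the starting ratio")
    # Each "part" of the 3:2 ratio is worth this many animals.
    unit = total_before // parts
    sheep = sheep_part * unit
    cows = cow_part * unit + extra_cows
    # Reduce to lowest terms, in case the two numbers share a factor.
    divisor = gcd(sheep, cows)
    return sheep // divisor, cows // divisor


print(ratio_after_extra_cow(3, 2, 26))  # (15, 11)
```

Before the new cow there are 25 animals, so each of the five parts is worth five: 15 sheep and 10 cows, which becomes 15 sheep and 11 cows. Gemini’s 39:27 simplifies to 13:9, and no field of 26 animals can be split that way.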

Winner: ChatGPT

6) Can you design me a Christmas card for our family this year? It should highlight our love of gardening and Aldi’s own-brand sherry.

One of Gemini’s main selling points is its prowess with multimodal (read: not just text) inputs. However, that’s inputs, not outputs. All it does is offer me advice on how I could draw my own card. There is a nice suggestion of a poem.

OpenAI, the maker of ChatGPT, has bundled its DALL-E 3 image generator into its tool, meaning I get a pretty festive, if somewhat Raymond Briggs’s Snowman-style, depiction. Generative AI images still can’t do text right, though, so you’ll be quaffing “Aldi Ali-o Band Sherry” which, depending on how hard you do Christmas Day, may well be an accurate visualisation of your festive period.

Winner: ChatGPT

7) The news is far too dreary to sit through — can you give me an uplifting story now in the news?

Searching the internet is something that ChatGPT only recently learnt how to do but it does it well, even if the stories it surfaces are all a bit like what a particularly geeky computer coder would find uplifting. “A large dataset of human genetic sequences from nearly 500,000 volunteers has been made available for global research” does not exactly get me giddy, nor does the news that electric vehicles are outselling diesel ones in the EU, great though that is. Every story comes with a link to the source and they all exist!

Gemini is making stuff up again. Apparently Ben, who has no surname, has knitted 10,000 pairs of socks sent to nursing homes. Let’s put it this way: the story wouldn’t get past The Times’s sub-editors.

Winner: ChatGPT

8) How would you explain what gravity is in layman’s terms?

Right. We need to have a word. Google says its model is brilliant, and I’m sure it is. But read this sentence: “Imagine you’re standing on the ground. You feel pulled down, right? That’s the force of gravity at work!”

I am a simple journalist who doesn’t leave the house much and spends his days hunched over a laptop. My posture is terrible. But while my spine often feels as if it’s gone through a medieval torture device, I don’t think I’ve ever felt physically pulled down to the ground. Have you?

ChatGPT’s description seems much better to my mind. “Gravity is like an invisible force that pulls objects towards each other. It’s what keeps our feet on the ground instead of floating off into space. It’s the same force that makes an apple fall from a tree to the ground,” it says. That seems about right to me.

Winner: ChatGPT

Overall score: ChatGPT 6, Gemini 2
