# Gemini Pro vs. GPT-3.5: A Detailed Comparison
Introduction
This past week, Google launched Gemini Pro and rolled it out inside Bard, prompting a direct comparison with its rival, GPT-3.5. This article evaluates their performance on writing, reasoning, and coding tasks, offering insights for frequent GPT-3.5 users who are curious about Gemini Pro's capabilities.
Note: As of now, Google's Gemini Ultra, which competes with GPT-4, has not been released. For more information about Gemini Ultra and my thoughts on it, refer to this article.
General Requests: Personal Assistance
To kick off my evaluation, I tasked both Gemini Pro and GPT-3.5 with helping me plan a trip. My itinerary involves traveling from California to New York next week, and I requested assistance in finding flights and hotels near Central Park.
Thanks to a recent update, Bard can access Google services, including Flights and Hotels, to deliver real-time data.
Bard's ability to integrate with personal applications such as Gmail and Drive lets it tailor its responses further. By contrast, GPT-3.5 lacks internet access and cannot interface with external applications.
For my second assessment, I assigned both models a role-playing task. Previously, Bard struggled with this type of prompt, but now it successfully adheres to the instructions. GPT-3.5 also performed admirably.
In the final test, I requested both models to create a humorous blog post. Both produced enjoyable content.
So far, both models have demonstrated comparable performance. However, I found Gemini Pro's output slightly more engaging, and Bard's integration with Google applications opens up unique possibilities.
Based on this initial comparison, I would favor Gemini Pro.
Reasoning Capabilities
I tested both Gemini Pro and GPT-3.5 with math-related questions. According to a recent paper from Google, Gemini Pro outperforms GPT-3.5 in elementary math but is nearly on par in more complex problems across various difficulty levels and subdisciplines.
For my evaluation, I posed three exercises to both models. Both solved the first, which involved logical reasoning.
The second challenge was a number sequence problem. Here, Gemini Pro did not succeed, while GPT-3.5 provided the correct answer.
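To give a sense of the format (this is a hypothetical example, not the exact puzzle I used), a sequence problem might ask for the next term of 2, 6, 12, 20, 30, …, where the n-th term is n(n + 1). A few lines of Python are enough to check the pattern:

```python
# Hypothetical sequence puzzle of the kind used in this test: 2, 6, 12, 20, 30, ...
# The pattern is a_n = n * (n + 1), so the term after the first five is 6 * 7 = 42.
def next_term(sequence):
    n = len(sequence) + 1  # index of the term we want
    return n * (n + 1)

print(next_term([2, 6, 12, 20, 30]))  # prints 42
```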
In the final, more complex task, both models failed to identify the next term in the sequence.
Interestingly, both arrived at the same incorrect answer, which did not match my reference solution (GPT-4 solved the same problem correctly). In reasoning tasks, I found GPT-3.5 to have a slight edge over Gemini Pro, but further testing is necessary for a definitive verdict.
Code Generation
I evaluated Gemini Pro and GPT-3.5 on coding tasks, and the results were quite enlightening. Initially, I asked both to explain the following Python code:
```python
from bs4 import BeautifulSoup
import requests

# Download the page (the URL needs a scheme for requests to accept it)
website = 'https://www.example.com'
result = requests.get(website)
content = result.text

# Parse the HTML and print it in an indented, readable form
soup = BeautifulSoup(content, 'lxml')
print(soup.prettify())

# Locate the main article, then pull out its title and full script text
box = soup.find('article', class_='main-article')
title = box.find('h1').get_text()
transcript = box.find('div', class_='full-script').get_text(strip=True, separator=' ')

# Save the transcript to a text file named after the title
with open(f'{title}.txt', 'w') as file:
    file.write(transcript)
```
Both provided clear explanations for the code.
However, when I asked them to generate code for sending an email, Bard sometimes refused to respond.
```python
# Write Python code that sends an email from “email_1” to “email_2” with the subject “Email sent by Bard” and the content “Bard rocks!”
```
This forced me to adjust my prompts to elicit a response from Bard.
Ultimately, it worked, but the generated code did not fill in the requested subject and content. In contrast, GPT-3.5 handled this task seamlessly.
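For reference, here is a minimal sketch of what this task calls for, assuming a Gmail account and an app password; the addresses below are placeholders, and this is not the exact code either model produced:

```python
import smtplib
from email.message import EmailMessage

# Placeholder credentials; substitute real addresses and an app password.
sender = "email_1@example.com"
recipient = "email_2@example.com"
app_password = "your-app-password"

# Build the message with the subject and content from the prompt.
msg = EmailMessage()
msg["From"] = sender
msg["To"] = recipient
msg["Subject"] = "Email sent by Bard"
msg.set_content("Bard rocks!")

# Connect to Gmail's SMTP server over SSL and send the message.
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
    server.login(sender, app_password)
    server.send_message(msg)
```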
In this area, GPT-3.5 comes out on top.
The Final Verdict
Both models excel at handling general inquiries, but Bard's ability to connect to Google applications introduces exciting possibilities that GPT-3.5 cannot match. In terms of reasoning, GPT-3.5 seems to have a slight advantage over Gemini Pro, but both struggled with the more intricate problems.
When it comes to coding, both can assist effectively, but Bard occasionally declines to generate code, whereas GPT-3.5 is consistently reliable.
In summary, while both models demonstrate comparable capabilities, the key differences lie in their respective functionalities and limitations associated with their underlying platforms: Bard and ChatGPT. As a programmer, I would favor GPT-3.5 for its reliability in code generation. However, I would leverage Bard for its unique access to Google applications when necessary.
As a side note, I personally prefer not to use either Gemini Pro or GPT-3.5, as I have access to GPT-4, which is significantly more advanced. While its cost is a drawback, GPT-4 is currently the best model available. Once Gemini Ultra is released, I plan to conduct a similar comparison to determine which model is superior.
Join my newsletter of over 35,000 subscribers for free cheat sheets on ChatGPT, web scraping, Python for data science, automation, and more!
If you appreciate articles like this and wish to support my work, consider subscribing to my Substack, where I publish exclusive content not found on my other platforms.