AI Chatbot Comparison

Jul 3, 2025

Overview

This review compares four AI chatbots (ChatGPT, Google Gemini, Perplexity, and Grok) across a wide range of everyday tasks, evaluating accuracy, speed, usability, integrations, and overall value for the subscription cost.

Initial Product Comparison

Four AI chatbots were tested: ChatGPT, Gemini, Perplexity, and Grok, each running on a separate phone.
Key questions include which is most accurate, fastest, and most worth paying for.

Problem-Solving & Factual Accuracy

Luggage fit: Grok gave the most practical answer, while Perplexity was least accurate.
Object/photo recognition: Grok correctly identified an ingredient based on a photo.
Simple math and savings: All bots answered basic calculations and savings scenarios correctly, apart from minor rounding differences.

Language & Translation

Basic translation tasks: All bots gave acceptable English translations.
Complex homonym translation: ChatGPT and Perplexity performed best; Grok struggled with nuance.

Product Research & Recommendations

Earbud recommendations: Gemini fabricated a non-existent product; Grok performed well on color filtering.
Most bots failed to meet nuanced product requirements (color, price, features).
All bots failed to extract info directly from pasted shopping links.

Current Events & Data Retrieval

All AIs correctly identified Ugreen’s latest charger power output.
Only Gemini accurately retrieved real-time YouTube view counts.

Critical Thinking & Analysis

All four bots identified survivorship bias in the classic airplane reinforcement scenario.
Only ChatGPT and Perplexity nailed subtle car model identification from a single photo.

Content & Idea Generation

Email apologies and trip itineraries: ChatGPT organized responses best; others varied in clarity.
Video idea and image generation: Grok showed internet-savvy content; ChatGPT excelled at thumbnail understanding.
Humor: Grok performed best, likely benefiting from training on X (Twitter) data.

Fact-Checking & Research

All AIs resisted misinformation about the Nintendo Switch 2 and false Samsung rumors.
Deep research questions: ChatGPT gave balanced, informative summaries; Gemini was excessively verbose; Perplexity consistently cited sources.

Integrations & Ecosystem

Gemini integrates best with Google Workspace and hardware.
ChatGPT offers plugins and customizable GPTs.
Grok accesses real-time X content; Perplexity is limited to minor utility features.

Memory & Personalization

None reliably remembered cake details from previous prompts.
ChatGPT and Grok admitted they lacked conversational memory; Gemini and Perplexity gave generic or incorrect references.

Speed, Voice, and User Experience

Grok was fastest; ChatGPT close behind; Gemini slowest.
ChatGPT and Gemini best for natural-sounding voice interactions.

Final Scoring & Recommendations

ChatGPT: 29 points – most consistent, accurate, and well-integrated.
Grok: 26 points – fast, surprisingly good, but pricier.
Gemini: 22 points – strong integrations, weaker in speed and clarity.
Perplexity: 19 points – best sourcing but inconsistent in performance.
At $20/month, ChatGPT is the top recommendation; Grok costs $30/month.
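
As a rough illustration of how the final tally relates to price, the sketch below ranks the four bots by the reported scores and computes points per dollar for the two bots whose monthly prices are stated above. The points-per-dollar figure is an assumption added here for illustration, not a metric used in the review, and the function names are hypothetical. Gemini and Perplexity are left out of the value calculation because their prices are not given.

```python
# Scores as reported in the final tally above.
scores = {"ChatGPT": 29, "Grok": 26, "Gemini": 22, "Perplexity": 19}

# Monthly prices (USD) stated in the summary; the other two are not given.
monthly_price_usd = {"ChatGPT": 20, "Grok": 30}


def rank_by_score(scores):
    """Return bot names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def points_per_dollar(scores, prices):
    """Illustrative value metric (an assumption, not the reviewer's method)."""
    return {name: round(scores[name] / prices[name], 2) for name in prices}


ranking = rank_by_score(scores)
value = points_per_dollar(scores, monthly_price_usd)

print(ranking)
print(value)
```

On this simple measure ChatGPT comes out ahead of Grok on both score and cost, which is consistent with the recommendation above.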