🤖

AI Chatbot Comparison

Jul 3, 2025

Overview

This review compares four AI chatbots (ChatGPT, Google Gemini, Perplexity, and Grok) across a wide range of everyday tasks, evaluating accuracy, speed, usability, integrations, and overall value relative to subscription cost.

Initial Product Comparison

  • Four AI chatbots tested: ChatGPT, Gemini, Perplexity, and Grok, each on a separate phone.
  • Key questions include which is most accurate, fastest, and most worth paying for.

Problem-Solving & Factual Accuracy

  • Luggage fit: Grok gave the most practical answer, while Perplexity was least accurate.
  • Object/photo recognition: Grok correctly identified an ingredient based on a photo.
  • Simple math and savings: All bots answered basic calculations and savings scenarios correctly, apart from minor rounding differences.

Language & Translation

  • Basic translation tasks: All bots gave acceptable English translations.
  • Complex homonym translation: ChatGPT and Perplexity performed best; Grok struggled with nuance.

Product Research & Recommendations

  • Earbud recommendations: Gemini fabricated a non-existent product; Grok performed well on color filtering.
  • Most bots failed to satisfy nuanced product requirements (color, price, features).
  • All bots failed to extract info directly from pasted shopping links.

Current Events & Data Retrieval

  • All AIs correctly identified Ugreen’s latest charger power output.
  • Only Gemini accurately retrieved real-time YouTube view counts.

Critical Thinking & Analysis

  • All four bots identified survivorship bias in the classic airplane reinforcement scenario.
  • Only ChatGPT and Perplexity nailed subtle car model identification from a single photo.

Content & Idea Generation

  • Email apologies and trip itineraries: ChatGPT organized responses best; others varied in clarity.
  • Video idea and image generation: Grok showed internet-savvy content; ChatGPT excelled at thumbnail understanding.
  • Humor: Grok performed best, likely benefiting from training on X (Twitter) data.

Fact-Checking & Research

  • All AIs resisted misinformation about the Nintendo Switch 2 and false Samsung rumors.
  • Deep research questions: ChatGPT gave balanced, informative summaries; Gemini was excessively verbose; Perplexity consistently cited sources.

Integrations & Ecosystem

  • Gemini integrates best with Google Workspace and hardware.
  • ChatGPT offers plugins and customizable GPTs.
  • Grok accesses real-time X content; Perplexity is limited to minor utility features.

Memory & Personalization

  • None of the bots reliably remembered cake details from previous prompts.
  • ChatGPT and Grok admitted they lacked conversational memory; the other bots gave generic or incorrect references.

Speed, Voice, and User Experience

  • Grok was fastest, with ChatGPT close behind; Gemini was slowest.
  • ChatGPT and Gemini were best for natural-sounding voice interactions.

Final Scoring & Recommendations

  • ChatGPT: 29 points – most consistent, accurate, and well-integrated.
  • Grok: 26 points – fast, surprisingly good, but pricier.
  • Gemini: 22 points – strong integrations, weaker in speed and clarity.
  • Perplexity: 19 points – best sourcing but inconsistent in performance.
  • ChatGPT’s $20/month cost makes it the top recommendation; Grok is $30/month.
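
To make the value comparison concrete, here is a rough sketch that ranks the bots by their reported points and divides points by monthly price where a price is given. The "points per dollar" metric and the function name points_per_dollar are illustrative assumptions, not something the review itself computes, and Gemini and Perplexity are left out of the price comparison because their prices are not stated above.

```python
# Scores and prices as reported in the summary above.
scores = {"ChatGPT": 29, "Grok": 26, "Gemini": 22, "Perplexity": 19}

# Only ChatGPT and Grok prices are stated; the others are deliberately omitted.
monthly_price_usd = {"ChatGPT": 20, "Grok": 30}


def points_per_dollar(scores, prices):
    """Hypothetical value metric: review points divided by monthly price."""
    return {name: scores[name] / prices[name] for name in prices if name in scores}


# Rank all four bots by raw score, highest first.
ranking = sorted(scores, key=scores.get, reverse=True)
value = points_per_dollar(scores, monthly_price_usd)

print("Ranking by points:", ranking)
for name, v in value.items():
    print(f"{name}: {v:.2f} points per dollar per month")
```

Under this assumed metric, ChatGPT comes out at about 1.45 points per dollar and Grok at about 0.87, which is consistent with the recommendation above.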