Guide to Extracting Twitter Data with Python

Aug 6, 2024

Pulling Twitter Data Using Python

Introduction

  • Welcome to the tutorial on pulling Twitter data using Python.
  • The collected tweets make useful practice data for data analysis and machine learning.

Prerequisites

  • Ensure Python is installed.
    • If it is not installed yet, follow a Python installation tutorial first.

Getting Twitter API Access

  1. Obtain a Twitter Token:

    • Go to apps.twitter.com
    • Log in with your Twitter account; sign up if you don’t have one.
  2. Create an App:

    • Click on "create app."
    • Fill in the application details:
      • Name: Your choice
      • Description: Your choice
      • Website URL: Use a placeholder if you do not have a website.
    • Describe your intention with the Twitter API (e.g., for study purposes).
    • Press the create button.
  3. Application Approval:

    • Twitter will assess your application (approval typically within a day).
    • Check your email for approval notification.
  4. Get API Keys and Tokens:

    • Open the app details page.
    • Navigate to the "keys and tokens" section.
    • Note down the API key, API secret key, access token, and access token secret.

Setting Up in Jupyter Notebook

  1. Import Tweepy Library:

    import tweepy
    
    • If the import fails because Tweepy is not installed, install it from a terminal (inside a notebook cell, use %pip instead of pip):
      pip install tweepy
      
  2. Store API Credentials:

    • Set up the following variables:
      api_key = 'your_api_key'
      api_secret_key = 'your_api_secret_key'
      access_token = 'your_access_token'
      access_token_secret = 'your_access_token_secret'
      
  3. Authenticate with Twitter API:

    auth = tweepy.OAuthHandler(api_key, api_secret_key)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
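
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. As a minimal sketch, the credentials from step 2 could instead be read from environment variables. The variable names (TWITTER_API_KEY and so on) and the helpers load_credentials and build_api are assumptions made for illustration; they are not part of Tweepy or of the original tutorial.

```python
import os

# Hypothetical environment variable names; choose any names you like.
ENV_NAMES = {
    "api_key": "TWITTER_API_KEY",
    "api_secret_key": "TWITTER_API_SECRET_KEY",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=os.environ):
    """Read the four credentials from a mapping such as os.environ."""
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in ENV_NAMES.items()}


def build_api(creds):
    """Authenticate the same way as step 3, using the loaded credentials."""
    # Imported here so load_credentials works even without Tweepy installed.
    import tweepy

    auth = tweepy.OAuthHandler(creds["api_key"], creds["api_secret_key"])
    auth.set_access_token(creds["access_token"], creds["access_token_secret"])
    return tweepy.API(auth)
```

With the variables set in the shell that launches Jupyter, api = build_api(load_credentials()) would give the same api object as step 3.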
    

Pulling Data

Based on User Timeline

  • To get the 10 most recent tweets from a user (Tweepy 4.x no longer accepts the id argument, so pass screen_name):
    userResults = api.user_timeline(screen_name='username', count=10)
    
  • Print the tweets:
    for tweet in userResults:
        print(tweet.text)
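
For later analysis it is often easier to work with plain dictionaries than with tweet objects. The sketch below shows one possible way to flatten the results; tweets_to_rows is a hypothetical helper, and the SimpleNamespace objects are made-up stand-ins that only mimic the id, created_at, user.screen_name, and text attributes of a real tweet so the example runs without API access.

```python
from types import SimpleNamespace


def tweets_to_rows(tweets):
    """Flatten tweet objects into plain dicts, one per tweet."""
    return [
        {
            "id": tweet.id,
            "created_at": tweet.created_at,
            "screen_name": tweet.user.screen_name,
            "text": tweet.text,
        }
        for tweet in tweets
    ]


# Made-up stand-in data, not real tweets.
sample = [
    SimpleNamespace(
        id=1,
        created_at="2024-08-06",
        user=SimpleNamespace(screen_name="username"),
        text="hello",
    )
]

rows = tweets_to_rows(sample)
print(rows[0]["text"])  # hello
```

In the notebook, rows = tweets_to_rows(userResults) would apply the same helper to the tweets pulled above.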
    

Based on Search Query

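  • To search for up to 10 Indonesian-language (lang='id') tweets containing a specific keyword. The method is named search_tweets in Tweepy 4.x; older versions call it search:
    hasilSearch = api.search_tweets(q='jakarta', lang='id', count=10)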
    
  • Print the search results:
    for tweet in hasilSearch:
        print(f'{tweet.user.screen_name} tweeted: {tweet.text}')
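
Search results often include many retweets of the same text. As a small sketch, the query string can be built with the standard search operator -filter:retweets to leave them out. build_query is a hypothetical helper, and it assumes the search endpoint in use still honors that operator.

```python
def build_query(keyword, exclude_retweets=True):
    """Return a search string suitable for the q parameter."""
    parts = [keyword]
    if exclude_retweets:
        # Standard search operator that drops retweets from the results.
        parts.append("-filter:retweets")
    return " ".join(parts)


query = build_query("jakarta")
print(query)  # jakarta -filter:retweets
```

Passing q=build_query('jakarta') to the search call shown above would then return original tweets only.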
    

Conclusion

  • We pulled tweets in two ways: from a user timeline and from a search query.
  • The next video will cover processing the pulled data into more advanced output.