How to scrape a Discord channel

How to Scrape a Discord Channel: A Step-by-Step Guide

You might want to scrape Discord channels to get useful information and see what people are chatting about. It can help you learn more, keep track of conversations, or gather data to study.

For example, I downloaded 5,000 messages from the Midjourney Discord server. Here is how I analyzed them and the insights I got:

[Video, 0:14]

Disclaimer: Scraping data from Discord may breach their terms of service. Ensure you have authorization before scraping and use the information responsibly.

In this guide, we'll use scrapedisc.com to extract messages from a Discord channel.

Step 1: Create a Genuine Discord Account

To scrape a Discord channel, you need a genuine user account. To avoid putting your personal account at risk, create a separate one:

  1. Sign Up: Create a new Discord account using a valid email address.
  2. Complete Verification: Go through all verification steps, including email verification and, if needed, phone number verification.
You can use your personal account instead, but be aware that it could get banned.

Step 2: Get your user token

Once your account is set up, you'll need to grab the auth token of your account from the network requests. This token will be used to interact with Discord’s API. Using the API, you can scrape all the Discord channel data.

  1. Open Discord in a Browser: Log in to your Discord account through a web browser.
  2. Navigate to the Network Tab: Open the browser's Developer Tools (F12 in most browsers), go to the “Network” tab, and filter the traffic by “XHR” to view API requests.
  3. Locate Your Token: Choose a request that isn't an error. If there aren't any, click on a channel or server to trigger some requests. You'll find your Discord token in the request headers, under “authorization”. Copy it from there.
💡 Your token should look something like this:
MTI1Mjk4MTF5OKUwMzAxODY0OA.GQhQMe.kh66KlSBCn7iZ5HvrgrJLkji5-cxFsxgm_DKZ
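
Once you have copied the token, it is safer to keep it out of your scripts. Here is a minimal sketch, assuming you export the token in an environment variable. The variable name `DISCORD_TOKEN` and the helper `load_token` are hypothetical, and the three-part shape check is only inferred from the example above.

```python
import os


def load_token(env_var: str = "DISCORD_TOKEN") -> str:
    """Read the token from an environment variable (hypothetical name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    # Assumption: tokens look like the example above, i.e. three
    # dot-separated parts. This is a sanity check, not real validation.
    if len(token.split(".")) != 3:
        raise ValueError("Token does not look like 'part1.part2.part3'")
    return token
```

Reading the token at runtime means it never ends up in a file you might share or commit by accident.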

Step 3: Get the Server ID and the Channel ID you are trying to access

To scrape messages from a Discord channel, you need to identify both the server (called a guild in Discord's API) and the channel, each by its ID.

Here’s how you can obtain these IDs:

  1. Enable Developer Mode:
    • Open Discord Settings: Click on the gear icon next to your username in the bottom-left corner to open User Settings.
    • Navigate to Advanced Settings: Go to the "Advanced" section in the left sidebar.
    • Enable Developer Mode: Toggle the "Developer Mode" switch to ON.
  2. Find the Server ID:
    • Go to the Server: Navigate to the server (guild) you want to scrape data from.
    • Right-Click the Server Name: Right-click on the server name at the top of the Discord channel list.
    • Copy Server ID: From the context menu, select “Copy Server ID.”
  3. Find the Channel ID:
    • Open the Discord Channel: Click on the channel within the guild from which you want to scrape messages.
    • Right-Click the Channel Name: Right-click on the channel name in the channel list.
    • Copy Channel ID: Choose “Copy Channel ID” from the context menu (older Discord versions label it “Copy ID”).
💡 Your IDs should look something like this:
Server ID: 123456789012345678
Channel ID: 987654321098765432
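
Before sending anything, you can sanity-check the IDs you copied. Discord IDs are "snowflakes": 64-bit integers whose upper bits encode a creation timestamp in milliseconds since 2015-01-01 UTC. The sketch below is illustrative. The helper names are hypothetical, and the 17 to 20 digit length check is an assumption based on the examples above, not an official rule.

```python
from datetime import datetime, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z, Discord's epoch


def parse_snowflake(raw: str) -> int:
    """Return the ID as an int, or raise if it does not look like a Discord ID."""
    raw = raw.strip()
    # Assumption: IDs are plain ASCII digits, 17 to 20 characters long.
    if not (raw.isascii() and raw.isdigit() and 17 <= len(raw) <= 20):
        raise ValueError(f"Not a plausible Discord ID: {raw!r}")
    return int(raw)


def snowflake_created_at(snowflake: int) -> datetime:
    """Decode the creation time stored in the upper bits of a snowflake."""
    ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```

If the decoded date for a server or channel ID is far in the future or earlier than 2015, the ID was probably copied incorrectly.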

Step 4: Make your API Requests

The token you now have authenticates requests as your account. It lets you make API requests without logging in through the website.

With your token, you can now make API requests through scrapedisc.com and scrape the channel data.

Note:

  • Pass your token in the request headers.
  • Use /scrape_discord to scrape a channel. Pass a JSON body containing the server and channel IDs.
  • Use /health to check that the server is alive.
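
The notes above can be sketched in Python with only the standard library. Several details here are assumptions, since the guide does not specify them: the base URL `https://scrapedisc.com`, the `POST` method, the `Authorization` header name, the JSON field names `server_id` and `channel_id`, and a JSON response. Check the service's own documentation and adjust accordingly.

```python
import json
import urllib.request

# Assumption: the API is served from this base URL.
BASE_URL = "https://scrapedisc.com"


def build_scrape_request(token: str, server_id: str, channel_id: str) -> urllib.request.Request:
    """Build (but do not send) a request to the /scrape_discord endpoint."""
    # Assumption: these JSON field names are illustrative, not confirmed.
    body = json.dumps({"server_id": server_id, "channel_id": channel_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/scrape_discord",
        data=body,
        # Assumption: the token goes in an "Authorization" header.
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_scrape_request(token, "123456789012345678", "987654321098765432"))`. Keeping the request construction separate from sending makes it easy to inspect what will go over the wire before you use your real token.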

Step 5: Be considerate

Please be cautious and considerate. If you send too many requests, your account might get banned. Don’t ruin this for everyone.
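
One simple way to stay considerate is to enforce a minimum delay between requests. The sketch below is a generic throttle, not something the service provides. The helper name `throttled` is hypothetical and the 2-second default is an arbitrary, conservative guess rather than a documented limit.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled(
    items: Iterable[T],
    action: Callable[[T], R],
    min_interval: float = 2.0,  # arbitrary default, not an official limit
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[R]:
    """Run action on each item, waiting at least min_interval seconds between calls."""
    last_call = None
    for item in items:
        if last_call is not None:
            # Sleep only for the part of the interval that has not elapsed yet.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        yield action(item)
```

For example, `list(throttled(channel_ids, scrape_one))` would call a hypothetical `scrape_one` function once per channel with a pause in between. The `sleep` and `clock` parameters exist only so the behavior can be tested without real waiting.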

Step 6: Extract and Process Data

The data you scrape from a Discord channel is usually raw and unstructured, so it needs further processing before it yields anything meaningful.

Here’s what I got when scraping the Midjourney Discord channel. In this channel, they share prompts for image generation. The data is quite dirty and still needs some processing!

Raw data extracted from Discord
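
As an illustration of a first cleaning pass on data like this, the sketch below strips mentions and links, normalizes whitespace, and drops empty or duplicate messages. It assumes the scraped output is a list of dictionaries with a `content` field, as in Discord's message objects. The actual output format of the scraper may differ, and the helper name `clean_messages` is hypothetical.

```python
import re

# Discord mention syntax: <@id> or <@!id> (user), <@&id> (role), <#id> (channel).
MENTION_RE = re.compile(r"<[@#][!&]?\d+>")
URL_RE = re.compile(r"https?://\S+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_messages(messages):
    """Return de-duplicated message texts with mentions and links removed."""
    seen = set()
    cleaned = []
    for message in messages:
        # Assumption: each message is a dict with a "content" string.
        text = message.get("content") or ""
        text = MENTION_RE.sub("", text)
        text = URL_RE.sub("", text)
        text = WHITESPACE_RE.sub(" ", text).strip()
        key = text.lower()  # treat case-only variants as duplicates
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned
```

This only removes the most obvious noise. Channel-specific formatting, such as bot prefixes or command flags, would need its own rules.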

For automatic data processing, I chose phospho. The platform processes the messages automatically when you cluster them for exploration. See the results below.

[Video, 0:14]

Congratulations, you've learned how to scrape a Discord channel. Go to platform.phospho.ai to keep extracting value from text.