How to Scrape a Discord Channel: A Step-by-Step Guide
You might want to scrape Discord channels to get useful information and see what people are chatting about. It can help you learn more, keep track of conversations, or gather data to study.
For example, I downloaded 5,000 messages from the Midjourney Discord. Here is how I analyzed them and what insights I got:
In this guide, we'll be using scrapedisc.com to extract messages from a Discord channel.
Step 1: Create a Genuine Discord Account
To scrape a Discord channel, you need a genuine user account. To avoid putting your personal account at risk, create a separate one for scraping:
- Sign Up: Create a new Discord account using a valid email address.
- Complete Verification: Go through all verification steps, including email verification and, if needed, phone number verification.
Step 2: Get your user token
Once your account is set up, you'll need to grab the auth token of your account from the network requests. This token will be used to interact with Discord’s API. Using the API, you can scrape all the Discord channel data.
- Open Discord in a Browser: Log in to your Discord account through a web browser.
- Navigate to the Network Tab: Open the browser's Developer Tools (F12 in most browsers), go to the “Network” tab, and filter the traffic by “XHR” to view API requests.
- Locate Your Token: Choose a request that isn't an error (if there aren't any, click on a channel or server to trigger some). Your Discord token is the value of the authorization entry under the request headers. Copy it from there. It looks like this:
MTI1Mjk4MTF5OKUwMzAxODY0OA.GQhQMe.kh66KlSBCn7iZ5HvrgrJLkji5-cxFsxgm_DKZ
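Rather than pasting the token into every script, one option is to read it from an environment variable. The sketch below assumes a hypothetical variable name, DISCORD_TOKEN, which is not part of Discord or scrapedisc.com. It only tidies up whitespace and quotes picked up while copying; it does not check the token against Discord.

```python
import os


def load_token(env=None, var_name="DISCORD_TOKEN"):
    """Return the token stored in an environment variable.

    DISCORD_TOKEN is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    # Strip whitespace and quotes that often sneak in when copying from DevTools.
    token = env.get(var_name, "").strip().strip("\"'")
    if not token:
        raise ValueError(f"{var_name} is not set or is empty")
    return token
```

Calling load_token() with no arguments reads from the real environment; passing a plain dict makes the function easy to try out without touching your shell.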
Step 3: Get the Server ID and the Channel ID you are trying to access
To scrape messages from a Discord channel, you need two identifiers: the ID of the server (guild) and the ID of the channel you want to scrape.
Here’s how you can obtain these IDs:
- Enable Developer Mode:
  - Open Discord Settings: Click on the gear icon next to your username in the bottom-left corner to open User Settings.
  - Navigate to Advanced Settings: Go to the "Advanced" section in the left sidebar.
  - Enable Developer Mode: Toggle the "Developer Mode" switch to ON.
- Find the Server ID:
  - Go to the Server: Navigate to the server (guild) you want to scrape data from.
  - Right-Click the Server Name: Right-click on the server name at the top of the Discord channel list.
  - Copy Server ID: From the context menu, select “Copy Server ID.”
- Find the Channel ID:
  - Open the Discord Channel: Click on the channel within the guild from which you want to scrape messages.
  - Right-Click the Channel Name: Right-click on the channel name in the channel list.
  - Copy Channel ID: From the context menu, select “Copy Channel ID” (shown as “Copy ID” in older versions of Discord).
Server ID: 123456789012345678
Channel ID: 987654321098765432
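Both IDs are Discord "snowflakes": large integers whose upper bits encode a creation timestamp counted from the start of 2015. A quick check like the one below can catch a mis-copied ID before any request is sent. This is a minimal sketch; the 17 to 20 digit range is a loose heuristic chosen here, not an official rule.

```python
from datetime import datetime, timezone

# Milliseconds between the Unix epoch and 2015-01-01T00:00:00Z,
# the starting point of the timestamps embedded in Discord IDs.
DISCORD_EPOCH_MS = 1420070400000


def is_snowflake(value):
    """Loose check: the ID is a string of 17 to 20 ASCII digits (heuristic)."""
    return value.isascii() and value.isdigit() and 17 <= len(value) <= 20


def snowflake_created_at(snowflake):
    """Decode the creation time stored in the upper bits of a snowflake."""
    ms_since_unix_epoch = (int(snowflake) >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(ms_since_unix_epoch / 1000, tz=timezone.utc)
```

If snowflake_created_at returns a date that makes no sense for the server or channel (for example, one in the future), the ID was probably copied incorrectly.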
Step 4: Make your API Requests
The token you now have authenticates you as your account: it lets you make API requests without logging in through the website.
With your token, you can now make API requests through scrapedisc.com and scrape the channel data.
Note:
- Pass your token in the request headers.
- Use /scrape_discord to scrape a channel. Pass a JSON body containing the server and channel IDs.
- Use /health to check that the server is alive.
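The notes above can be sketched with Python's standard library. Several details here are assumptions, because this guide does not specify them: the https scheme, the POST method, the "Authorization" header name, the "server_id" and "channel_id" JSON keys, and the response being JSON. Check the scrapedisc.com documentation for the real names before relying on this.

```python
import json
import urllib.request

# Assumption: the service is reachable over HTTPS at this host.
BASE_URL = "https://scrapedisc.com"


def build_scrape_request(token, server_id, channel_id, base_url=BASE_URL):
    """Build (but do not send) a request to the /scrape_discord endpoint."""
    # Assumption: the JSON keys are named "server_id" and "channel_id".
    body = json.dumps({"server_id": server_id, "channel_id": channel_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/scrape_discord",
        data=body,
        # Assumption: the token travels in an "Authorization" header.
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",  # Assumption: a request with a JSON body is sent as POST.
    )


def send(request, timeout=30):
    """Send a prepared request and decode the reply, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body before anything goes over the network.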
Step 5: Be considerate
Please be cautious and considerate. If you send too many requests, your account may be banned. Don’t ruin this for everyone.
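One simple way to stay considerate is to enforce a minimum delay between consecutive requests. The class below is a minimal sketch of that idea. This guide does not state a safe request rate, so whatever interval you pick is your own guess rather than a documented limit.

```python
import time


class Throttle:
    """Block so that consecutive calls to wait() are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep just long enough to respect the minimum interval, then record the call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

In a scraping loop, create one Throttle and call its wait() method immediately before each request; the first call returns at once and later calls pause only when needed.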
Step 6: Extract and Process Data
Data scraped from a Discord channel is usually raw and unstructured. It needs further processing before it yields anything meaningful.
Here’s what I got when scraping the Midjourney Discord channel. In this channel, people share prompts for image generation. The data is quite dirty and still needs some processing!
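A first cleaning pass can be done by hand before any heavier analysis. The sketch below assumes each scraped message is a dict with a "content" string; that is a guess about the output format, which this guide does not describe. It collapses whitespace, drops empty messages, and removes duplicates that differ only in letter case.

```python
import re

_WHITESPACE = re.compile(r"\s+")


def clean_messages(messages):
    """Return de-duplicated, whitespace-normalised message texts.

    Assumes each message is a dict with a "content" string (hypothetical format).
    """
    seen = set()
    cleaned = []
    for message in messages:
        # Collapse runs of spaces, tabs, and newlines into single spaces.
        text = _WHITESPACE.sub(" ", message.get("content") or "").strip()
        key = text.lower()
        if not text or key in seen:
            continue  # skip empty messages and case-insensitive duplicates
        seen.add(key)
        cleaned.append(text)
    return cleaned
```

The first occurrence of each message is kept in its original casing, and the original order is preserved, which keeps the output easy to compare with the raw data.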
For automatic data processing, I chose to use phospho. The platform processes the messages automatically when you cluster them for exploration. See the results below.
Congratulations, you learned how to scrape a Discord channel. Go to platform.phospho.ai to keep extracting value from text.