Facebook bot (facebookexternalhit) can be a bandwidth hog

Recently I started paying a lot more attention to using Facebook to make some content more accessible to users. At the same time, I started making sure that each link shared on Facebook has the right thumbnail on it.

After doing this, some side effects became apparent:

1. Facebook doesn’t keep the processed images for long after the page is shared. So, every so often, you get a request for each of the images you (or someone else) shared on Facebook.

2. It also does a poor job of making sure that each of its servers doesn’t need to fetch the same image. So you get a few dozen requests for the same image, over and over, one for each Facebook server your users hit.

3. It seems that the Facebook servers refresh each image every day or so.

How to identify this? In your server logs, look for the following user agents:

facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
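
To see how much of this traffic a site actually gets, the log check can be scripted. The following is only a minimal sketch: it assumes the server writes the common "combined" access log format (request, status, size, referer, user agent), and the function name facebook_traffic is made up for this example. Adjust the pattern if your log format differs.

```python
import re
from collections import defaultdict

# Assumption: "combined" log format, i.e. each line contains
#   "METHOD /path PROTOCOL" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
    r' "[^"]*" "(?P<agent>[^"]*)"'
)


def facebook_traffic(lines):
    """Return {path: (hits, bytes)} for requests made by facebookexternalhit."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or "facebookexternalhit" not in match.group("agent"):
            continue  # unparsable line, or some other client
        # A size of "-" means no body was sent; count it as 0 bytes.
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        entry = totals[match.group("path")]
        entry[0] += 1
        entry[1] += size
    return {path: (hits, size) for path, (hits, size) in totals.items()}
```

It takes any iterable of log lines, so it can be called with an open file, for example `facebook_traffic(open("access.log"))`, where access.log stands in for wherever your server keeps its log. Images with many hits and a large byte total are the ones worth shrinking.
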
How can this cause trouble? If you, or any other user, share a page which contains big images (>200KB), you can end up with quite a lot of bandwidth usage.
Let’s imagine you have a blog with one image weighing about 200KB per post. The end result is that each day you can get 10 or 15 servers fetching the image of each post. Assuming 20 posts, we have 20 (posts) * 15 (servers) * 200KB = 60MB worth of traffic per day, or 1.8GB per month. This may not seem like much, but for sites which pay for bandwidth, it may make a dent.
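
The estimate above can be reproduced, and rerun with your own numbers, with a few lines of Python. This is just a sketch of the arithmetic: the function name crawler_traffic_mb is invented for this example, it uses 1MB = 1000KB and a 30-day month to match the figures above, and it assumes every server fetches every image once a day.

```python
def crawler_traffic_mb(posts, servers, image_kb, days=1):
    """Estimated crawler traffic in MB (1 MB = 1000 KB, one fetch per server per day)."""
    return posts * servers * image_kb * days / 1000


daily = crawler_traffic_mb(posts=20, servers=15, image_kb=200)
monthly = crawler_traffic_mb(posts=20, servers=15, image_kb=200, days=30)
print(f"{daily:.0f} MB per day, {monthly / 1000:.1f} GB per month")
# prints: 60 MB per day, 1.8 GB per month
```
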
Why Facebook servers don’t keep a local copy or thumbnail of each image is something I really can’t understand.
