Strange Behaviour of the Sogou Web Spider

Summary

Requests from an IP address associated with the Sogou web spider had two odd characteristics: they carried no user agent, and the URLs they requested exist on a different host.

The anomaly was noticed because the log analysis software used, Analog, reports a "corrupt line" when the user agent is missing.

Details

The anomaly was noted in webserver logs for www.yuikee.com.hk on 25th February 2010. Analog reported 57 corrupt lines in the logfile, all requests from 220.181.94.236, with the user agent field "-". The URLs requested were:

/activities/
/wines/activities/H20070504a/

The requests were spread across the day, from 00:00:30 +0800 to 21:39:40 +0800. On examination, the IP address 220.181.94.236 had also made 5 other requests, in which the user agent was given as:

Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)

The same user agent made 6 other requests, from 220.181.94.217. These 11 requests with a user agent were all for URLs that exist on the www.yuikee.com site.

Both IP addresses are registered to China Telecom's Beijing province network.
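
The tallies above (requests per IP address and user agent, with the first and last request times) can be reproduced with a short script. What follows is a minimal sketch in Python, assuming the webserver writes the Apache "combined" log format, which this article does not confirm. The function name summarise and the sample lines are hypothetical: the IP addresses, the two URLs and the two timestamps come from the text, but the pairing of URL to timestamp, the status codes, the sizes and the third request are invented for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Apache "combined" log format. This is an assumption; the article
# does not say which log format the server was using.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarise(lines):
    """Group requests by (client IP, user agent).

    For each group, record the request count, the earliest and latest
    timestamps, and the set of distinct paths requested.
    """
    groups = defaultdict(
        lambda: {"count": 0, "first": None, "last": None, "paths": set()}
    )
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # line is not in the assumed format
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        parts = match["request"].split()
        path = parts[1] if len(parts) >= 2 else match["request"]
        entry = groups[(match["ip"], match["agent"])]
        entry["count"] += 1
        if entry["first"] is None or when < entry["first"]:
            entry["first"] = when
        if entry["last"] is None or when > entry["last"]:
            entry["last"] = when
        entry["paths"].add(path)
    return dict(groups)


# Illustrative lines only, not copied from the real logfile.
SAMPLE = [
    '220.181.94.236 - - [25/Feb/2010:00:00:30 +0800] '
    '"GET /activities/ HTTP/1.1" 200 1234 "-" "-"',
    '220.181.94.236 - - [25/Feb/2010:21:39:40 +0800] '
    '"GET /wines/activities/H20070504a/ HTTP/1.1" 200 1234 "-" "-"',
    '220.181.94.217 - - [25/Feb/2010:10:00:00 +0800] '
    '"GET / HTTP/1.1" 200 1234 "-" '
    '"Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"',
]

if __name__ == "__main__":
    for (ip, agent), info in sorted(summarise(SAMPLE).items()):
        print(ip, repr(agent), info["count"], info["first"], info["last"])
        for path in sorted(info["paths"]):
            print("   ", path)
```

A group whose user agent is "-" corresponds to the lines that Analog flagged as corrupt. Grouping on the pair of IP address and user agent, rather than on the IP address alone, keeps the anomalous requests from 220.181.94.236 separate from the requests the same address made with a user agent.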

Discussion

The fact that the URLs requested in the anomalous requests actually exist on a different host seems more than coincidence. The most likely explanation is that a single instance of Sogou's web spider suffered corruption that both prevented the sending of the user agent name and caused the ".hk" to be dropped from the hostname it was contacting.

