Requests from an IP address associated with the Sogou web spider had the following odd characteristics:
- The User Agent was not reported.
- The URLs requested do not exist on the host they were requested from (www.yuikee.com.hk), but do exist on a similar hostname (www.yuikee.com), which is served from the same server.
The anomaly was noticed because the log analysis software used (analog) reports a "corrupt line" when the user agent is missing.
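The same check can be made directly on the raw log, without relying on analog. The following is a minimal sketch, assuming the server writes the Apache "combined" log format; the regular expression, the function name missing_agent_lines and the sample entries (documentation IP range, invented paths) are illustrative assumptions, not taken from the actual logfile.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def missing_agent_lines(lines):
    """Yield (host, time, request) for entries whose user agent is empty or "-"."""
    for line in lines:
        m = COMBINED.match(line.strip())
        if m and m.group("agent") in ("", "-"):
            yield m.group("host"), m.group("time"), m.group("request")


# Made-up sample entries: hypothetical address, paths and second user agent.
SAMPLE = [
    '192.0.2.10 - - [25/Feb/2010:00:00:30 +0800] '
    '"GET /example/page.html HTTP/1.1" 404 209 "-" "-"',
    '192.0.2.10 - - [25/Feb/2010:08:15:02 +0800] '
    '"GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
]

flagged = list(missing_agent_lines(SAMPLE))
for host, when, request in flagged:
    print(host, when, request)
```

Only the first sample entry is flagged, because its user agent field is "-". Lines that do not match the assumed format are silently skipped, so a real log in a different format would need the pattern adjusted.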
The anomaly was noted in webserver logs for www.yuikee.com.hk on 25th February 2010. Analog reported 57 corrupt lines in the logfile, all requests from 18.104.22.168, with the user agent field "-". The URLs requested were:
The requests were spread across the day, from 00:00:30 +0800 to 21:39:40 +0800. On examination, the IP address 22.214.171.124 had also made 5 other requests, in which the user agent was given as:
Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
The same user agent made 6 other requests, from 126.96.36.199. These 11 requests with a user agent were all for URLs that exist on the www.yuikee.com site.
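The breakdown above can be reproduced by tallying requests per address and user agent. This is only a sketch of the counting step: the labels "addr-A", "addr-B" and "spider-UA" are placeholders standing in for the two addresses and the Sogou user agent string, and the records are synthesised from the counts reported here rather than read from the logfile.

```python
from collections import Counter

# Stand-in (address, user agent) pairs built from the reported counts.
# "addr-A", "addr-B" and "spider-UA" are placeholder labels;
# "-" marks a request with no user agent.
records = (
    [("addr-A", "-")] * 57
    + [("addr-A", "spider-UA")] * 5
    + [("addr-B", "spider-UA")] * 6
)

# Count requests for each (address, user agent) combination.
by_pair = Counter(records)

# Total number of requests that did carry a user agent.
with_agent = sum(n for (_, agent), n in by_pair.items() if agent != "-")

for (addr, agent), n in sorted(by_pair.items()):
    print(f"{addr:8} {agent:10} {n}")
print("requests with a user agent:", with_agent)
```

Grouping on the pair rather than on the address alone is what separates the anomalous requests from the normal ones made by the same address.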
Both IP addresses are registered to China Telecom's Beijing province network.
The fact that the URLs requested in the anomalous requests actually exist on a different hostname served from the same server seems more than coincidence. The most likely explanation is that a single instance of Sogou's web spider suffered corruption that both prevented it from sending the user agent and altered the hostname it was contacting, so that requests for URLs on www.yuikee.com were sent to www.yuikee.com.hk.