In my post about “Pinging”, I talked about a rather simple method to track when spiders came to your website. At that time, my method was just viewing the “Latest Visitors” page within cPanel and refreshing it while waiting for a spider to come. At that point, I was merely checking whether doing a “ping” actually did anything. I wasn’t trying to track how often a spider came, or which pages it visited over a certain time period, just that it came when I called.
Well, the other day someone pointed me in the direction of the “Spider Tracker” WordPress plugin. This plugin does exactly what it says on the box: it tracks spider crawl rates across your blog. If you are going to install it on your own WordPress blog, I recommend using the plugin installer built into the admin section.
By default, it just tracks the Yahoo, Google and Bing bots. But you can add any user agents you wish to track, and it will track these as well. So in reality, you could use it to track anything with a unique user agent. If you were super adventurous, you could track things like BlackBerry or iPhone visits. I don’t know why you would, but you could!
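To give a rough idea of what tracking by user agent involves, here is a minimal Python sketch. It is not the plugin’s actual code (the plugin is PHP and I haven’t read its source); the `TRACKED_AGENTS` table and the `classify_user_agent` function are made-up names, and the substrings are only assumptions about what typically appears in each user agent string.

```python
# Hypothetical lookup table: label -> substring expected in the user agent.
# These substrings are assumptions; real user agents vary and can be spoofed.
TRACKED_AGENTS = {
    "Google": "googlebot",
    "Yahoo": "slurp",
    "Bing": "bingbot",
    "iPhone": "iphone",
    "BlackBerry": "blackberry",
}


def classify_user_agent(user_agent):
    """Return the label of the first tracked substring found, or None."""
    ua = user_agent.lower()  # compare case-insensitively
    for label, needle in TRACKED_AGENTS.items():
        if needle in ua:
            return label
    return None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(classify_user_agent(sample))
```

Adding a new thing to track is then just one more entry in the table, which is presumably why a plugin like this can follow anything with a distinctive user agent.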
As always, when I install something new I want to see it working instantly! Unfortunately I did have to wait a bit for Google to come, and when it did I noticed odd patterns in its crawling. It never really seemed to crawl pages that are linked from external URLs. I’m sure eventually it will, but most, if not all, of the crawls I saw were just random checks by Google to (what looks like) check whether the URL is still up and whether anything has changed.
![]()
You can see that it is going to page 18 of the blog. I can assure you, there are no external URLs pointing to that page. Logically, it should only be linked internally from page 17 and page 19, so I am rather surprised Google knows about it at all. There were other crawls much like this one, to very old posts of mine that have no outside links pointing to them at all.
I don’t really know the answer, but it does seem to be Google just coming back and checking on a page at random. I know a lot of people have theories about how PageRank factors into this, such as the higher your PageRank, the more often Google will come back and check on that page out of the blue. I’m not quite ready to wade (get it!) into that debate yet, but it is interesting.
Yahoo has done pretty much the same thing as Google: it has come back and crawled random pages that don’t have any external links pointing to them. Unfortunately the plugin doesn’t track where the spider came from (can it even?), so I can’t tell whether it arrived via an external link or just randomly. But judging from the pages it visited, it certainly looks random.
I’m going to leave it running for a week and see what sort of patterns show up when Google comes a-knocking.


Directly after installing this, Googlebot was visiting page 71!
Hey Wade, Google is really crawling my site in a weird way:
2010-06-02 14:38:00 66.249.68.65 crawl-66-249-68-65.googlebot.com /page/53?p=hbkpaluoziw
2010-06-02 14:35:19 66.249.68.65 crawl-66-249-68-65.googlebot.com /page/48?p=tgnnidic
I am wondering what the ?p=hbkpaluoziw and ?p=tgnnidic are?
So I’m guessing this is WordPress… What is your permalink structure?
I’m not entirely sure, to be honest. The ?p=sdfsdf means nothing in the long run. If the URL is already rewritten, you can change those letters to anything and it won’t change which page you are on.
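To show what I mean, here is a rough Python sketch (not WordPress code; `split_request` is just a name I made up). It only illustrates that the path and the query string are separate parts of the URL, on the assumption that a rewritten permalink is matched on the path.

```python
from urllib.parse import urlsplit, parse_qs


def split_request(url):
    """Split a request URL into its path and its parsed query parameters."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)


if __name__ == "__main__":
    # The two requests from the log above
    for url in ("/page/53?p=hbkpaluoziw", "/page/48?p=tgnnidic"):
        path, query = split_request(url)
        print(path, query)
```

Whatever follows `?p=`, the path part stays `/page/53` or `/page/48`, which is why swapping the letters doesn’t take you to a different page.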
Maybe someone is linking to that page in that way? I honestly have no idea. It is possible Google can crawl it like that, but I have no idea why it would.
Yeah, WordPress, and my permalink structure is the post name. I am almost sure no one is linking to those pages, but Google really has its own ways of doing stuff.