Poodle Predictor - See your site like Google does

One of the challenges a webmaster faces when designing a site is getting it well-placed in the search-engines. Some might regard this as a job for a specialized Search Engine Optimiser, but mostly it comes down to common sense and keeping a few basic rules in mind when designing the site.

One of the harder aspects, though, is knowing how the search-engine will see your site. Many sites are stuck with an ugly Google listing because their owners didn't realise that search-engines don't use JavaScript or cookies, for instance.

Just for fun, let's look at a few:

Those listings could have been avoided, but doing so means knowing something about what search-engines support and what they don't. The main problem is that you have to wait until your site is listed before you can see the result. This usually takes about a month, and at least as long again before a bad listing is refreshed.

So we developed Poodle Predictor, a free tool to help webmasters visualize what Google sees as it spiders your site: a search-engine simulator, if you will.

The main application is found here at gritechnologies.com/tools/spider.go?q=evolt.org

This is the predictor view. Its task is two-fold: firstly, to give you a rough idea of how your page's listing will look in Google, and secondly, to find every link on the page that the search-engine would.

This list of links is ideally quite long, yet not more than 100 or so.

If you are using Flash or DHTML navigation, you might find that none of the links show up in Poodle Predictor. In that case you should add the links inside <noscript> tags, or as an alternative navigation system.
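To illustrate why script-generated navigation goes missing, here is a minimal Python sketch of a link collector that, like a spider that does not run JavaScript, only sees <a> tags present in the markup. It is not Poodle Predictor's actual code; the class name, function name, and sample page are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in the raw markup.

    Nothing is executed, so links written by JavaScript are never seen.
    (Hypothetical helper, not part of Poodle Predictor.)
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def visible_links(html):
    """Return the links a script-less crawler could find, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page: one link exists only inside a script,
# two are repeated inside <noscript>, one is a plain anchor.
SAMPLE_PAGE = """
<html><body>
<script>document.write('<a href="/js-only.html">Menu</a>');</script>
<noscript>
  <a href="/products.html">Products</a>
  <a href="/contact.html">Contact</a>
</noscript>
<a href="/about.html">About</a>
</body></html>
"""

print(visible_links(SAMPLE_PAGE))
```

The link written by document.write never appears in the result, while the links inside <noscript> do, which is the reason for the advice above.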

Underneath the main listing, you see three links: Diagnostics View, Source-code View, and Header-Meta View.

The Diagnostics View opens the URL in Poodle Diagnostic. This gives you a color-coded view of the page through the 'eyes' of the search-engine spider.

The colors reflect whether various important tags and attributes were used on the page. The text displayed should read logically - some alt and title attributes will show in your page, and this could give unexpected results, like "welcome to [company_logo.gif]".
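As a rough illustration of how an image without alt text can end up as something like "[company_logo.gif]" in the readable text, the Python sketch below flattens a page to text, substituting each image's alt attribute and falling back to the bracketed file name when alt is missing. The fallback format is an assumption based on the example above, and the names are hypothetical; this is not how Poodle Diagnostic is known to be implemented.

```python
import posixpath
from html.parser import HTMLParser


class SpiderText(HTMLParser):
    """Flatten a page to the text a text-only reader might see.

    Assumption for this sketch: an <img> contributes its alt text,
    or its bracketed file name when no alt text is given.
    """

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skipping = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skipping += 1
        elif tag == "img":
            attributes = dict(attrs)
            alt = attributes.get("alt")
            if alt:
                self.parts.append(alt)
            else:
                name = posixpath.basename(attributes.get("src") or "")
                self.parts.append("[%s]" % name)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skipping:
            self._skipping -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skipping:
            self.parts.append(text)


def spider_text(html):
    """Return the page's text with images replaced as described above."""
    parser = SpiderText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


print(spider_text('<p>Welcome to <img src="/images/company_logo.gif"></p>'))
print(spider_text('<p>Welcome to <img src="/images/company_logo.gif" alt="Acme Widgets"></p>'))
```

Reading the output aloud is a quick check: if the sentence only makes sense once the alt text is filled in, the image needs one.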

The Source-code View speaks for itself: it's a color-coded view of the source-code of the page. The color-coding again reflects the use of tags and attributes with importance for search-engines.

Finally the Header-Meta View will toggle the header and meta-tag section into view.

Other related pages are:

I hope you enjoy using the tool, and feel free to comment here, or by email.

Cheers, Richard.
richard.b@gritechnologies.com

Comments

How Handy

It's amazing how much of your site gets ignored! I'll be using this to take search engines into consideration when building pages.

No support for imagemaps yet...

Hi, glad you like it.

I should point out that it doesn't support imagemaps yet, so sites using imagemaps for navigation actually get spidered better than Poodle shows.
That and several other bugs have been noted for the next update.
Thanks to everyone who sent mail reporting strange results.

Valid Code

Nice tool. Can you add some text to say that it works best with guaranteed well-formed code and that users should put their HTML code through a code validator first?

Great Tool

This is really useful, thanks!

Bug Report

* * BUG REPORT * *

Hmm, I think I broke it.

I got it to check a site like http://www.domain.isp.net/user/folder meaning http://www.domain.isp.net/user/folder/index.html of course.

It checked the page and then said that every link on the page was broken, because it was assuming http://www.domain.isp.net/user/foo.html and http://www.domain.isp.net/user/bar.html for the links, instead of http://www.domain.isp.net/user/folder/foo.html and http://www.domain.isp.net/user/folder/bar.html which is where the files really are (i.e. inside the same folder as the index.html file is located).

Hope you can fix that one.

re: bug report

Hi,
Thanks for the feedback.
If you want to send the URL in question offlist to richard.b@gritechnologies.com, I'll add it to my list of bugs.

If I try to replicate it by using:
http://gritechnologies.com/tools/spider.go?q=www.evolt.org/article/view

things work as normal.
Also, Poodle Predictor is supposed to point out potential problems before Googlebot chokes on them.
In some cases Poodle Predictor appears to break, but it is actually failing due to some sloppy mark-up on the page, which might also trip up Google.

Bug Report Sent by email.

Summary:

http://www.domain.isp.net/user/folder -- does not work.

http://www.domain.isp.net/user/folder/ -- does work.

Links that fail are ones inside /folder/index.html that point to other files in the /folder/ level folder.

Maybe other people can try this?
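The difference between the two URLs in the summary above matches standard relative-URL resolution, which can be checked with Python's urllib.parse.urljoin. This is only a sketch of the general rule; whether Poodle Predictor resolves links this way internally is an assumption.

```python
from urllib.parse import urljoin

# The two base URLs from the bug report, without and with a trailing slash.
BASES = (
    "http://www.domain.isp.net/user/folder",
    "http://www.domain.isp.net/user/folder/",
)

for base in BASES:
    # Without the trailing slash, "folder" is treated as a file name and is
    # replaced by the relative link; with it, "folder/" is kept as a directory.
    print(base, "->", urljoin(base, "foo.html"))
```

Web servers commonly redirect the slash-less form to the one with the trailing slash, so a spider that resolves links against the URL it was given, rather than the URL it ended up at after the redirect, would report exactly these broken links.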

Any other comments?


Description differs from Google

Looking at web design leeds, the description it gives shows the table summary, which isn't what Google gives. Why is this?

What about a bookmarklet?

For convenience, it would be nice if you could offer a bookmarklet.

Nice tool, but nothing special..

I wonder what makes this tool such a big deal. For one thing, it is inaccurate. For instance, if I have links on my website that contain my own personal file extensions (say, index.fcgi) then Google does not index those links. There are several other things about what Google slurps and doesn't. Secondly, you can use a simple link checker such as Xenu to figure out what your link maps look like and what Google or any other search engine for that matter will follow. Thirdly, you can simply try Google itself! :)

Web Confs

Hi,

http://www.webconfs.com/search-engine-spider-simulator.php

This tool displays the text & links that the Search Engine would see when it crawls a page.

Anatomy of Search Engine

Hi,

I totally agree with you. Everyone should read the "Anatomy of a Search Engine" to know more about it.
Thank you.

This is not really an issue anymore

While this has been a big issue in the past, most browsers support JavaScript, so we don't see this happen nearly as much. Although there are still some sites with issues like this, search engines will typically display a page that can be viewed in its full functionality first. Thanks, Brad Henry SEOslap

I've used this tool from

I've used this tool from time to time to check my NYC photography site and unfortunately I don't think it's very accurate. It returns plenty of errors that Google doesn't seem to have any problems with at all. --Andrew

It is accurate and that is

It is accurate and that is the point. You're not making your site for Google, you're making it for your users. If Poodle sees a problem, then some of your users may see it too. Dave

Grandpa remembers...

An alternative tool that could be useful for this purpose is the Lynx browser, a text-only browser from the early 1990s that is still included in some Linux distributions (there are also versions for Mac and Windows). --Would you believe that I used it quite extensively before Mosaic became popular... (Showing my age, aren't I?!)

Nice Tool for Valid Code

Hi, I agree with you. I used this tool for tRucks to check my site, and it is really useful, thanks!