Tuesday, June 22, 2010

Testing web pages with “spider” vision

In my previous post about the importance of choosing UI controls that provide server-side rendering, we discussed the problem that modern "JavaScript-only" UI components present: most notably, their inability to support search engine indexing. We established the importance of picking UI controls, like Telerik's RadControls for ASP.NET AJAX or Extensions for ASP.NET MVC, that provide both server- and client-side rendering, ensuring your websites maximize both performance and accessibility (to humans and search bots alike).

But how do you know you have a problem? You may be using UI controls now and not be sure whether you're SEO-friendly (unless you're using Telerik controls, of course, in which case you're covered). You need a way to see your pages the same way the bots do.

How do you gain search “spider vision” for testing your pages?

Fortunately, you have a couple of options:

  1. Text-only Web Browser – Text browsers, such as Lynx, provide you with a close approximation of what search spiders see when they visit your page. That makes them good tools for testing your content to see how your page renders when JavaScript is not in play. Even Google recommends using this approach to test content.
  2. Fetch as Googlebot – This is super spider vision. Google provides a beta tool as part of its Webmaster Tools called "Fetch as Googlebot." As the name implies, this tool lets you plug in a URL from your site (you have to have verified access to the site before you can use this tool, unfortunately) and get back a processed result that shows you "exactly" what Googlebot sees when it visits your page. Obviously, Google is probably not showing all its cards with this tool, but it gives you a good approximation and once again clearly highlights the "text-only" nature of search indexing.
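
The core idea behind both tools can be sketched in a few lines. Here is a minimal, hypothetical Python illustration (using the standard library's `HTMLParser`, not any real crawler code) of what a script-blind crawler extracts from a page: only the text in the delivered markup, never anything JavaScript adds later. The sample HTML is made up for illustration:

```python
from html.parser import HTMLParser

class SpiderVision(HTMLParser):
    """Collects only the text a script-blind crawler would see."""
    def __init__(self):
        super().__init__()
        self.ignoring = 0      # depth inside <script>/<style> blocks
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.ignoring += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.ignoring:
            self.ignoring -= 1

    def handle_data(self, data):
        # Script bodies arrive here too, but we skip them: a crawler
        # downloads the script text yet never executes it.
        if not self.ignoring and data.strip():
            self.text.append(data.strip())

# A page whose grid rows are injected by JavaScript after load:
html = """
<html><body>
  <h1>Customers</h1>
  <div id="grid"></div>
  <script>
    document.getElementById('grid').innerHTML = '<td>ALFKI</td>';
  </script>
</body></html>
"""

parser = SpiderVision()
parser.feed(html)
print(parser.text)   # ['Customers'] -- the injected row never appears
```

The heading survives; the customer data, which exists only as a string inside a script, does not.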

With tools available to give us spider vision, what does client-side rendering look like versus server-side rendering? Let’s test.

Using a simple ASP.NET MVC test page, I bind a Telerik Extensions Grid for MVC to customer data from Northwind. Since the Telerik MVC Grid supports both rendering modes, I create one view that uses Server Binding and another that uses Ajax Binding (i.e. client-side binding, similar to how something like a jQuery grid works). Here is what the page looks like in the browser (in both modes):
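
To make the difference concrete, here are simplified stand-ins for the two payloads the server might send. These are not Telerik's actual markup, and `loadGridData` is a hypothetical function name; the point is only where the data lives in each response:

```python
# Server Binding: the rows are already in the HTML the server returns.
server_bound = """
<table class="grid">
  <tr><td>ALFKI</td><td>Alfreds Futterkiste</td></tr>
  <tr><td>ANATR</td><td>Ana Trujillo Emparedados</td></tr>
</table>
"""

# Ajax Binding: the server returns an empty container; rows arrive
# later via an Ajax call that a crawler never makes.
ajax_bound = """
<div class="grid" id="CustomerGrid"></div>
<script>
  loadGridData('/Customers/AjaxBinding', 'CustomerGrid');  /* hypothetical */
</script>
"""

for name, markup in [("Server", server_bound), ("Ajax", ajax_bound)]:
    print(name, "binding response contains customer data:", "ALFKI" in markup)
```

Both pages look identical in a browser once the JavaScript runs; only the raw responses differ, and the raw response is all a spider reads.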


Now, here’s how Googlebot sees the Server Binding version of the page (relevant Grid section in the screen cap):


Notice two things. First, notice that Googlebot only sees text. Second, notice the data in the Grid (highlighted in red). This is good. This means Google is indexing our Grid content in the same way users see it. Now let's see what happens when we use Googlebot vision to load our client-side rendering (Ajax Binding) page:


Where’s our data? That’s right: missing in action! Because our client-side page uses Ajax and JavaScript to initialize, Googlebot, like search spiders in general, does not see the content. As far as indexing is concerned, the content might as well not exist.
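
You can turn this manual check into a crude smoke test: grab the raw HTML your server returns (before any JavaScript runs) and look for a phrase you know should be in the data. The helper and sample response below are made-up illustrations, not part of any real tool:

```python
def indexable(raw_html: str, must_contain: str) -> bool:
    """Crude SEO smoke test: is the phrase present in the raw HTML
    a crawler downloads, before any JavaScript executes?"""
    return must_contain in raw_html

# The Ajax-bound page's raw response holds only an empty container,
# so a phrase from the grid data fails the check:
ajax_response = '<div id="grid"></div><script>/* rows load later */</script>'
print(indexable(ajax_response, "ALFKI"))        # False

# The server-bound response carries the rows, so it passes:
server_response = '<table><tr><td>ALFKI</td></tr></table>'
print(indexable(server_response, "ALFKI"))      # True
```

If the phrase is missing from the raw response, no amount of client-side polish will get that data indexed.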

Make sure your pages aren’t invisible.

Fortunately, my pages are using Telerik controls, so I can easily fix the problem by enabling Server Binding for crawlers. If you are using "pure" client-side components that provide no server-side option, though, your solution will not be as easy. Take advantage of tools like Lynx and "Fetch as Googlebot" to gain spider vision and ensure your fancy pages aren't invisible to some of your most important visitors. And, of course, save yourself some time and use Telerik tools to avoid the problem!