Page Cloaking - To Cloak or Not
to Cloak.
By Sumantra Roy
Page cloaking can
broadly be defined as a technique used to deliver different
web pages under different circumstances. There are two
primary reasons that people use page cloaking:
i) It allows them
to create a separate optimized page for each search engine
and another page which is aesthetically pleasing and designed
for their human visitors. When a search engine spider visits
a site, the page which has been optimized for that search
engine is delivered to it. When a human visits a site, the
page which was designed for the human visitors is shown. The
primary benefit of doing this is that the human visitors
don't need to be shown the pages which have been optimized
for the search engines, because the pages which are meant for
the search engines may not be aesthetically pleasing, and may
contain an over-repetition of keywords.
ii) It allows
them to hide the source code of the optimized pages that they
have created, and hence prevents their competitors from being
able to copy the source code.
Page cloaking is
implemented by using some specialized cloaking scripts. A
cloaking script is installed on the server, which detects
whether it is a search engine or a human being that is
requesting a page. If a search engine is requesting a page,
the cloaking script delivers the page which has been
optimized for that search engine. If a human being is
requesting the page, the cloaking script delivers the page
which has been designed for humans.
There are two
primary ways by which the cloaking script can detect whether
a search engine or a human being is visiting a site:
i) The first and
simplest way is by checking the User-Agent variable. Each
time anyone (be it a search engine spider or a browser being
operated by a human) requests a page from a site, it reports
a User-Agent name to the site. Generally, if a search engine
spider requests a page, the User-Agent variable contains the
name of the search engine. Hence, if the cloaking script
detects that the User-Agent variable contains the name of a
search engine, it delivers the page which has been optimized
for that search engine. If the cloaking script does not
detect the name of a search engine in the User-Agent
variable, it assumes that the request has been made by a
human being and delivers the page which was designed for
human beings.
However, while
this is the simplest way to implement a cloaking script, it
is also the least safe. It is pretty easy to fake the
User-Agent variable, and hence, someone who wants to see the
optimized pages that are being delivered to different search
engines can easily do so.
ii) The second
and more complicated way is to use I.P. (Internet Protocol)
based cloaking. This involves the use of an I.P. database
which contains a list of the I.P. addresses of all known
search engine spiders. When a visitor (a search engine or a
human) requests a page, the cloaking script checks the I.P.
address of the visitor. If the I.P. address is present in the
I.P. database, the cloaking script knows that the visitor is
a search engine and delivers the page optimized for that
search engine. If the I.P. address is not present in the I.P.
database, the cloaking script assumes that a human has
requested the page, and delivers the page which is meant for
human visitors.
Although more
complicated than User-Agent based cloaking, I.P. based
cloaking is more reliable and safer because it is very
difficult to fake I.P. addresses.
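To make the difference between the two detection methods concrete, here is a
minimal sketch in Python. It is an illustration only, not the code of any
actual cloaking script. The spider names, the engine names and the I.P.
addresses are all made up for the example (the addresses come from ranges
reserved for documentation), and the function names are hypothetical.

```python
# Hypothetical data, for illustration only. Real spider names and
# addresses are not given in this article.
SPIDER_AGENT_NAMES = {
    "examplebot": "ExampleSearch",
    "samplespider": "SampleSearch",
}
SPIDER_IP_DATABASE = {
    "192.0.2.10": "ExampleSearch",
    "198.51.100.7": "SampleSearch",
}


def engine_from_user_agent(user_agent):
    """Return the engine whose spider name appears in the User-Agent, else None."""
    lowered = user_agent.lower()
    for spider_name, engine in SPIDER_AGENT_NAMES.items():
        if spider_name in lowered:
            return engine
    return None  # no spider name found, so the visitor is assumed to be human


def engine_from_ip(remote_address):
    """Return the engine listed for this I.P. address, else None."""
    return SPIDER_IP_DATABASE.get(remote_address)


# A curious competitor only has to type a spider's name into the
# User-Agent to be treated as that spider by the first check...
print(engine_from_user_agent("Mozilla/5.0 (compatible; ExampleBot/1.0)"))
# ...but the I.P. check still sees an address that is not in the database.
print(engine_from_ip("203.0.113.5"))
```

The first check trusts a value that the visitor supplies, which is why it is
so easy to fool. The second check depends entirely on how complete and up to
date the I.P. database is.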
Now that you have
an idea of what cloaking is all about and how it is
implemented, the question arises as to whether you should use
page cloaking. The one-word answer is "NO". The
reason is simple: the search engines don't like it, and will
probably ban your site from their index if they find out that
your site uses cloaking. The reason that the search engines
don't like page cloaking is that it prevents them from being
able to spider the same page that their visitors are going to
see. And if the search engines are prevented from doing so,
they cannot be confident of delivering relevant results to
their users. In the past, many people have created optimized
pages for some highly popular keywords and then used page
cloaking to take people to their real sites which had nothing
to do with those keywords. If the search engines allowed this
to happen, they would suffer because their users would
abandon them and go to another search engine which produced
more relevant results.
Of course, a
question arises as to how a search engine can detect whether
or not a site uses page cloaking. There are three ways by
which it can do so:
i) If the site
uses User-Agent cloaking, the search engines can simply send
to the site a spider which does not report the name of the
search engine in the User-Agent variable. If the search
engine sees that the page delivered to this spider is
different from the page which is delivered to a spider which
reports the name of the search engine in the User-Agent
variable, it knows that the site has used page cloaking.
ii) If the site
uses I.P. based cloaking, the search engines can send a
spider from a different I.P. address than any I.P. address
which it has used previously. Since this is a new I.P.
address, the I.P. database that is used for cloaking will not
contain this address. If the search engine detects that the
page delivered to the spider with the new I.P. address is
different from the page that is delivered to a spider with a
known I.P. address, it knows that the site has used page
cloaking.
iii) A human
representative from a search engine may visit a site to see
whether it uses cloaking. If she sees that the page which is
delivered to her is different from the one being delivered to
the search engine spider, she knows that the site uses
cloaking.
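All three checks come down to the same comparison: request the same page as
two different kinds of visitor and see whether the results match. The sketch
below, in Python, shows that comparison. It is only an assumption about how
such a check could be written, not a description of what any search engine
actually runs. The function looks_cloaked, the fetch argument and the
pretend site are all hypothetical, and the pretend site stands in for a real
network request so that the example runs on its own.

```python
def looks_cloaked(fetch, url, spider_agent, browser_agent):
    """Request the same URL under two User-Agents and compare the pages.

    fetch is any function taking (url, user_agent) and returning the
    page text. It is a stand-in for a real HTTP request.
    """
    page_for_spider = fetch(url, spider_agent)
    page_for_browser = fetch(url, browser_agent)
    return page_for_spider.strip() != page_for_browser.strip()


def pretend_cloaking_site(url, user_agent):
    """A made-up site that serves a keyword-stuffed page to one spider."""
    if "examplebot" in user_agent.lower():
        return "cheap widgets cheap widgets cheap widgets"
    return "Welcome to our online store!"


def pretend_honest_site(url, user_agent):
    """A made-up site that serves the same page to everyone."""
    return "Welcome to our online store!"


spider = "ExampleBot/1.0"
browser = "Mozilla/5.0 (Windows NT 10.0)"
print(looks_cloaked(pretend_cloaking_site, "http://example.com/", spider, browser))
print(looks_cloaked(pretend_honest_site, "http://example.com/", spider, browser))
```

A real check would need to be more forgiving than an exact comparison, since
pages can legitimately differ between two requests (rotating banners or
dates, for example). The same comparison applies to I.P. based cloaking if
the two requests are made from a known and an unknown I.P. address rather
than with two User-Agent names.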
Hence, when it
comes to page cloaking, my advice is simple: don't even think
about using it.
Article by Sumantra Roy.
Sumantra is one of the most respected search engine
positioning specialists on the Internet. To have Sumantra's
company place your site at the top of the search engines, go
to 1stSearchRanking.com. For more advice on
how you can take your web site to the top of the search
engines, subscribe to his FREE newsletter by going to