A Guide to Google and Caching Your Pet
I figured this guide was necessary after seeing so many boards from people who are sending their pets' names out to others on clraik in the belief that autorefreshing helps "speed up" the caching process. This can lead to pets being force cached, pets appearing in less than desirable places like tumblr, and pets generally being pushed out into the spotlight. It seems like there's a lot of misinformation and confusion about how Google actually works.
First up, big thanks to the two members who provided me with the technical details, pushed me to make this, and made sure I didn't miss anything.
So What is a Cache and How Does it Relate to Neopets?
In simplistic terms, a cache is a snapshot of a page taken at some earlier point in time. It exists to help you view a webpage if the page or website no longer exists, or if the webpage takes a long time to load: a snapshot saved by Google will usually load much faster because it is served from Google's own servers. So how does this fit into Neopets and to us here on clraik? The cache function is used by people as a way of determining the "legitimacy" of a pet. If the pet's snapshot shows it on a different account, particularly an account that may have its OWN cache showing it was recently returned from being inactive or a shell, it can suggest the pet was bought.
This can be a great source of anxiety for recent UC buyers, as it is widely accepted to be unsafe to attempt to trade/chat with the pet if the pet's cache shows it may have been bought.
How and When Does a Google Cache Update?
The cache of a page updates when Google's crawlers/spiders run over the page and send an updated snapshot back to Google's servers. This happens at Google's own discretion and there is no "neopets safe" way to speed up the process.
So why are some page caches far more up to date than others? In order for a page to be crawled and its snapshot updated, the crawlers must be able to find it. To do this, they must pick up the page's URL whilst crawling a page they are already destined to go over. So, if we imagine they start on the Neopets homepage, they will find the NeoBoards, and from there links to userlookups, and from there links to that user's pets, and so on and so forth.
In (very) simplistic terms, the more links to a URL, the more often the crawlers will pick it up and add it to their "queue" and the more often the cache will update. As a result, pages that are never linked to (e.g. years old inactive accounts) may have never been picked up by the crawlers and may well not have a cache at all.
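To make the link-following idea concrete, here is a minimal sketch of it in Python. It is a toy model only, not how Google's crawlers are actually implemented: the link graph, the page names (userlookup_alice, pet_gamma and so on) and the discover function are all made up for illustration.

```python
from collections import deque

# Hypothetical, made-up link graph: each key is a public page, each value
# is the list of pages it links to. None of these are real URLs.
LINKS = {
    "home": ["boards", "high_scores"],
    "boards": ["userlookup_alice"],
    "high_scores": ["userlookup_alice", "userlookup_bob"],
    "userlookup_alice": ["pet_alpha"],
    "userlookup_bob": ["pet_beta"],
    "userlookup_carol": ["pet_gamma"],  # nobody links to carol
    "pet_alpha": [],
    "pet_beta": [],
    "pet_gamma": [],
}


def discover(start, links):
    """Breadth-first walk: count how many links led to each reachable page."""
    inbound = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in inbound:
                inbound[target] = 0
                queue.append(target)  # first sighting: add to the crawl queue
            inbound[target] += 1      # every sighting counts as one more link
    return inbound


found = discover("home", LINKS)
print(sorted(found))         # pages the toy crawler reached
print("pet_gamma" in found)  # carol's pet has no public link leading to it
```

In this toy graph userlookup_alice is reached through two different links, while userlookup_carol and pet_gamma are never reached at all, because no page the crawler already knows about links to them.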
If you want a more in-depth look at how Google works with indexing, caching and ranking, you might consider searching for Google optimization or SEO.
Understanding Google Webmaster Tools
The only other way Google can find pages to index them and update the cache is by having the URL submitted to it via Google webmaster tools and the "add URL" function. This is problematic, as it can often lead to a double image, where two flash images of the pet appear in the cache, one on top of the other, and this is a sure-fire way to get yourself into trouble should anyone ever look into your pet. A double image suggests you've submitted the URL to Google, and why would you have done that if you had nothing to hide? As a result, the webmaster tools are best left well alone, and giving other people links to your pet may also end badly if they force cache it, either through malice or in the belief that they're doing you a favor.
As anyone can submit a URL to Google, it's impossible to know if the URL was submitted by the pet owner. It's not a good idea to assume a pet is illegitimate just because it has a double image. If you submit a userlookup URL to Google, the crawlers will then find the pets on the userlookup, and may well force through a cache of those too.
Another function exists within Google that allows you to submit a page for them to review and potentially remove the cache altogether. This will only happen if the URL you submit is exact and the page has changed since it was last crawled. This is also not a good strategy for protecting your pet, as the page is still within Google's system, and the next time it is crawled a cache may be put back up.
Speeding up the Caching Process Without Webmaster Tools
This is where things get confusing, and people may believe that refreshing a page means it will be crawled more quickly. Google has no way to know that you are mass refreshing a page that Google doesn't own. By giving people on clraik links over PM (where, by the way, crawlers can't find them, as they only work in the public domain) you are exposing your pet to strangers for no good reason.
Changing the Content on the Page
Google occasionally goes back through all indexed pages and updates the cache of those that have since changed. This means that if your pet has an outdated cache, updating your pet's lookup may well eventually lead to an update, but it can be a very long process, as there is no way to predict when this will happen.
If your pet and account have no cache at all, changing the petlookup will do nothing to help you.
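The difference between an outdated cache and no cache at all can be sketched like this. It is a simplified model under stated assumptions, not Google's actual process: the recrawl and fingerprint functions, the use of a SHA-256 hash to detect changes, and the page names and contents are all hypothetical.

```python
import hashlib


def fingerprint(content):
    """Hash of the page text, used to tell whether the content changed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def recrawl(cache, live_pages):
    """Revisit only pages already in the cache; return the ones refreshed."""
    refreshed = []
    for url in list(cache):
        current = live_pages.get(url)
        if current is None:
            continue  # page is gone; the toy model keeps the old snapshot
        if fingerprint(current) != fingerprint(cache[url]):
            cache[url] = current  # content changed: store a new snapshot
            refreshed.append(url)
    return refreshed


# Hypothetical data: the cache only knows about pet_alpha.
cache = {"pet_alpha": "old description, old owner"}
live_pages = {
    "pet_alpha": "new description, new owner",
    "pet_gamma": "freshly edited lookup",  # never indexed, so never revisited
}

refreshed = recrawl(cache, live_pages)
print(refreshed)             # pages whose snapshot was replaced
print("pet_gamma" in cache)  # editing an unindexed page changes nothing
```

The loop only ever walks over pages that are already in the cache, which is why editing the lookup of a pet that was never indexed has no effect in this model.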
Patience
There's no magic button to force Google to update the cache in a way that will make everything safe from prying eyes.
The only thing that can really be done is to wait and to have links to your pet/userlookup out there.
Throwing a pet out into the public with an old cache means it could easily be found to be bought. If no link exists on the web for Google to find your pet, it never will. It has no idea the pet, and page, exists. Ways around this could be to submit game scores for high score tables that aren't watched closely (ones with max scores, for example) or to move the pet to an account that Google already has a record of and hope for the best. When Google goes through its records to update old caches, it will crawl the HST or userlookup it already has on file, and find the new link to your userlookup/pet.
There is no completely safe method here. You cannot update the pet's cache without the pet being exposed to some degree.
The TL;DR Version
- Google magic is complicated.
- Refreshing does not cache your pet faster.
- Being linked to (on boards, HSTs, a userlookup that has a cache, etc.) will encourage Google to find your page and update it.
- Posting all over the boards with an uncached pet is a bad idea.
- Force caching your pet is a terrible idea because double images are a nightmare.
- Sending your pet/userlookup link to random people on clraik is probably an awful idea too.
- You must develop the patience of a saint and nerves of steel.