Google crawling is case sensitive for URL paths
Starting with the URI specification
The scheme and hostname are case insensitive, i.e. the URLs below are treated as the same.
http://www.xyz.com/ = HTTP://www.Xyz.com/
Directories and filenames, however, are case sensitive.
The examples below are treated as 3 different URLs:
* http://www.xyz.com/Page1.html
* http://www.xyz.com/PAGE1.HTML
* http://www.xyz.com/page1.html
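The rule above can be sketched in a few lines of Python. This is only an illustration using the standard library's `urllib.parse`; the helper name `normalize_case` is hypothetical, and it assumes the URL carries no `user:password@` prefix in the host part.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url: str) -> str:
    """Lowercase only the case-insensitive parts of a URL (scheme and host)."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),  # scheme: case insensitive
        parts.netloc.lower(),  # host: case insensitive (assumes no user:password@ prefix)
        parts.path,            # path: case sensitive, left untouched
        parts.query,
        parts.fragment,
    ))


same_host = normalize_case("http://www.xyz.com/") == normalize_case("HTTP://www.Xyz.com/")

paths = [
    "http://www.xyz.com/Page1.html",
    "http://www.xyz.com/PAGE1.HTML",
    "http://www.xyz.com/page1.html",
]
distinct = {normalize_case(u) for u in paths}

print(same_host)      # True: scheme and host differ only by case
print(len(distinct))  # 3: the paths differ by case, so the URLs stay different
```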
Google and Case Issues
Crawling
Google takes case variations in directory and file names into account, so it treats the URLs below as different and may crawl all 3:
* http://www.xyz.com/Page1.html
* http://www.xyz.com/PAGE1.HTML
* http://www.xyz.com/page1.html
Indexing
When case-varied URLs are accessible and the web server does not redirect to the preferred URL:
* Duplicate content is crawled under the different URL cases.
* Google consolidates properties (such as link information) across the duplicate URLs and stores them.
* Google displays the high-ranking URL selected from among the case-varied URLs.
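To check whether a site exposes such case-varied duplicates, one option is to group a list of known URLs (for example from a crawl export or server log) by their lowercased form. The sketch below is an assumption about how such a check might look, not a description of how Google does it; `find_case_variants` is a hypothetical helper name.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def find_case_variants(urls):
    """Return {lowercased URL: sorted variants} for URLs that differ only by path case."""
    groups = defaultdict(set)
    for url in urls:
        p = urlsplit(url)
        scheme, host = p.scheme.lower(), p.netloc.lower()
        # Scheme and host are case insensitive, so normalize them before comparing.
        variant = urlunsplit((scheme, host, p.path, p.query, p.fragment))
        # Grouping key: the same URL with the path lowercased as well.
        key = urlunsplit((scheme, host, p.path.lower(), p.query, p.fragment))
        groups[key].add(variant)
    # Keep only the groups that contain more than one distinct spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


crawled = [
    "http://www.xyz.com/Page1.html",
    "http://www.xyz.com/PAGE1.HTML",
    "http://www.xyz.com/page1.html",
    "http://www.xyz.com/about.html",
]
duplicates = find_case_variants(crawled)
for key, variants in duplicates.items():
    print(key, "has", len(variants), "case variants")
```

Each group reported here is a candidate for a redirect to one preferred spelling.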
URL Case Recommendations
Default web server behavior is as follows:
* IIS is case insensitive: it treats Page1.html = page1.html, so the two are served as the same page
* Apache is case sensitive: it treats Page1.html != page1.html, so the two are served as different pages
The most important issue, and one that is rarely discussed, is that robots.txt is case sensitive for paths.
The example below illustrates this:
* Disallow: /abc = disallow: /abc, the directive name itself is not case sensitive
* Disallow: /ABC != Disallow: /abc, the two paths are treated as different
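The same behavior can be observed with Python's standard `urllib.robotparser` module. Note that this is Python's own robots.txt parser, not Google's, so it only illustrates the matching rule; the `build_parser` helper and the sample rules are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_lines):
    """Parse robots.txt content supplied as a list of lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser


# The directive name may be written in any case ...
upper_directive = build_parser(["User-agent: *", "Disallow: /abc"])
lower_directive = build_parser(["User-agent: *", "disallow: /abc"])

# ... but the path is matched case-sensitively.
for parser in (upper_directive, lower_directive):
    blocked = not parser.can_fetch("*", "http://www.xyz.com/abc/page1.html")
    allowed = parser.can_fetch("*", "http://www.xyz.com/ABC/page1.html")
    print(blocked, allowed)  # True True for both parsers
```

A rule written for /abc therefore leaves /ABC open to crawling, which is easy to miss when a site serves both spellings.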
Recommendation
1. Follow a consistent format for URLs: choose either ePuppy.html or epuppy.html
2. Creating all-lowercase URLs such as epuppy.html is recommended and is often more error-proof
3. Verify case-sensitive paths with the robots.txt analysis tool in Webmaster Tools
If the points above are considered while creating a website, many duplicate content issues can be avoided.
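As a rough sketch of the all-lowercase recommendation, the function below computes the URL a server could redirect to when a request arrives with mixed-case path. It is a hypothetical helper, not tied to any particular web server, and it assumes that query strings should be left untouched because parameter values may legitimately be case sensitive.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the all-lowercase form of the URL, or None if it is already canonical."""
    parts = urlsplit(url)
    canonical = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),  # the preferred form uses a lowercase path
        parts.query,         # assumption: query values are left as they are
        parts.fragment,
    ))
    return canonical if canonical != url else None


print(lowercase_redirect_target("http://www.xyz.com/ePuppy.html"))  # http://www.xyz.com/epuppy.html
print(lowercase_redirect_target("http://www.xyz.com/epuppy.html"))  # None
```

A server that answers the first case with a permanent redirect to the returned URL exposes only one spelling of each page to crawlers.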