SEO (search engine optimization) is a real challenge! Concepts such as robots.txt, “noindex / nofollow” pages and “rel=canonical” often cause confusion among website owners and webmasters. In this article, we explain each concept in turn to clear up that confusion.
ROBOTS.TXT
The robots.txt file tells search engines what they can and cannot access within a website. It is placed in the root of the website. It is a simple text file that uses two directives, User-agent and Disallow, to specify which pages search engine robots may or may not crawl.
Ways to configure the robots.txt file at site level:
- Permissions for full site analysis and sitemap location indication:
User-agent: *
Disallow:
Sitemap: http://www.mysite.ro/sitemap.xml
- To exclude bots from accessing certain directories or website pages:
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /private/
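The exclusion rules above can be checked before they are published. The following is a minimal sketch, assuming Python's standard urllib.robotparser module as a stand-in for a real crawler; the helper name is_allowed and the sample paths are hypothetical, and actual search engine bots may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the exclusion example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /private/
"""


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if the given user agent may crawl `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only for illustration.
    for path in ("/admin/login", "/private/report.pdf", "/products/shoes"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Paths under the three disallowed directories are reported as blocked, while anything else remains crawlable.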
NO INDEX & NO FOLLOW: when do we use <meta name="robots" content="noindex, follow">?
Another way to keep search engine robots from indexing pages that generate duplicate content, or sections of a site that exist only for sorting (filter pages), is to use <meta name="robots" content="noindex, follow">. A site that generates duplicate content attracts negative attention from search engines.
Pages generated by filtering and sorting systems, tags, or other pages that are useful only for usability should be reviewed and excluded from indexing. Use “noindex, nofollow” when the page should not be indexed and the links on it should not be followed, and “noindex, follow” when the page itself should stay out of the index but the links on it should still be followed.
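The choice between the two variants can be made in the page template. This is a minimal sketch, assuming a hypothetical helper named robots_meta_tag that a template would call when rendering the head of a page; it only builds the tag and does not decide which pages deserve it.

```python
def robots_meta_tag(index: bool, follow: bool) -> str:
    """Build a robots meta tag from two yes/no decisions about the page."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    # Filter or sorting page: keep it out of the index, still follow its links.
    print(robots_meta_tag(index=False, follow=True))
    # Page whose links should not be followed either.
    print(robots_meta_tag(index=False, follow=False))
```

The first call produces the “noindex, follow” tag discussed above, and the second produces the stricter “noindex, nofollow” variant.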
“REL = CANONICAL”
The role of “rel=canonical” is to consolidate duplicate pages created by performance tracking parameters (UTMs), dynamic pages, or other dynamically generated URLs that produce duplicate content. It is not recommended to use “rel=canonical” to keep sorting or filtering pages out of the index. For such pages, use “meta noindex” instead.
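For the UTM case, the canonical URL is simply the page URL without its tracking parameters. The sketch below assumes that every parameter whose name starts with "utm_" is tracking-only and that all other parameters matter for the page content; the function names and the sample URL are hypothetical, and a real site may need a different rule.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop utm_* query parameters and any fragment, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")  # assumption: utm_* is tracking-only
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the link element to place in the head of the duplicate page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    tracked = "http://www.mysite.ro/shoes?utm_source=newsletter&utm_medium=email&color=red"
    print(canonical_link_tag(tracked))
```

Every tracked variant of the page then points to the same clean URL, so search engines can treat the variants as one page.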
General Recommendation
Don’t allow pages with no SEO value to be indexed just because you want a large number of pages in the Google index, or in the hope that they will bring additional traffic to your site. Google may apply a penalty, which is definitely to be avoided. Use SEO techniques to eliminate duplicate content, which improves the user experience and makes the site easier for search engine bots to analyze.
For all these actions, you can call on an agency that provides SEO services to ensure that all aspects of your website are in order. In addition, SEO specialists can make recommendations to grow your business online.