# SEO Configuration - Sitemap & Robots.txt
## Files Location

### robots.txt

- Location: `static/robots.txt`
- Production URL: https://xerago.ai/robots.txt
- Purpose: Controls search engine crawling behavior

### sitemap.xml

- Generated automatically by Docusaurus during build
- Production URL: https://xerago.ai/sitemap.xml
- Location after build: `build/sitemap.xml`
## robots.txt Configuration

The robots.txt file is configured to:

- Allow all search engines to crawl all pages
- Reference the sitemap location
- Include placeholders for blocking specific bots (if needed)
- Include a crawl-delay option (commented out by default)

### Current Configuration

```text
User-agent: *
Allow: /
Sitemap: https://xerago.ai/sitemap.xml
```
### To Block Specific Pages

Uncomment and modify these lines in `static/robots.txt`:

```text
Disallow: /admin/
Disallow: /private/
```

### To Block Specific Bots

Add these lines to `static/robots.txt`:

```text
User-agent: BadBotName
Disallow: /
```
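Putting these pieces together, a complete `static/robots.txt` that keeps the site open to all crawlers, hides a couple of sections, and blocks one misbehaving bot might look like this (the `/admin/`, `/private/`, and `BadBotName` entries are illustrative placeholders, not current settings):

```text
# Default rules for all crawlers
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/

# Block one specific crawler entirely
User-agent: BadBotName
Disallow: /

Sitemap: https://xerago.ai/sitemap.xml
```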
## Sitemap Configuration

Configured in `docusaurus.config.js`:

```js
sitemap: {
  changefreq: 'weekly',         // How often pages change
  priority: 0.5,                // Default priority (0.0 to 1.0)
  ignorePatterns: ['/tags/**'], // Patterns to exclude
  filename: 'sitemap.xml',      // Output filename
},
```
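For context, this block is an option of the sitemap plugin bundled with the classic preset. Assuming the site uses `@docusaurus/preset-classic` and a CommonJS config, a trimmed-down `docusaurus.config.js` would look roughly like this:

```js
// docusaurus.config.js (trimmed to the parts relevant to the sitemap)
module.exports = {
  url: 'https://xerago.ai', // base URL used to build absolute sitemap entries
  presets: [
    [
      'classic',
      {
        // ...docs, blog, and theme options omitted...
        sitemap: {
          changefreq: 'weekly',
          priority: 0.5,
          ignorePatterns: ['/tags/**'],
          filename: 'sitemap.xml',
        },
      },
    ],
  ],
};
```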
### Sitemap Settings Explained

- `changefreq: 'weekly'`
  - Tells search engines how often to check for updates
  - Options: `always`, `hourly`, `daily`, `weekly`, `monthly`, `yearly`, `never`
- `priority: 0.5`
  - Default priority for all pages (0.0 = lowest, 1.0 = highest)
  - Homepage typically gets 1.0, other pages 0.5-0.8
- `ignorePatterns`
  - Excludes specific URL patterns from the sitemap
  - Currently excludes tag pages: `/tags/**`
- `filename: 'sitemap.xml'`
  - Standard filename for sitemaps
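These settings are written into each entry of the generated sitemap. With the configuration above, a single page entry in `build/sitemap.xml` should look roughly like this (the URL is just an illustration):

```xml
<url>
  <loc>https://xerago.ai/about-us</loc>
  <changefreq>weekly</changefreq>
  <priority>0.5</priority>
</url>
```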
## How to Verify

### After Building for Production

- Build the site: `npm run build`
- Check robots.txt: the file should exist at `build/robots.txt`
- Check sitemap.xml: the file should exist at `build/sitemap.xml`
- Serve locally to test: `npm run serve`, then visit the served files (by default at http://localhost:3000/robots.txt and http://localhost:3000/sitemap.xml); a scripted check is sketched below
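If you prefer a repeatable check, a small Node helper (a hypothetical `scripts/check-seo-files.js`, not part of the current repo) can confirm that the build emitted both files:

```js
// scripts/check-seo-files.js: run `node scripts/check-seo-files.js` after `npm run build`
const fs = require('fs');

let ok = true;
for (const file of ['build/robots.txt', 'build/sitemap.xml']) {
  if (fs.existsSync(file)) {
    console.log(`OK      ${file} (${fs.statSync(file).size} bytes)`);
  } else {
    console.error(`MISSING ${file}`);
    ok = false;
  }
}
process.exitCode = ok ? 0 : 1;
```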
## Production Deployment

After deploying to production, verify:

- robots.txt is accessible:
  - Visit: https://xerago.ai/robots.txt
  - Should display the robots.txt content
- sitemap.xml is accessible:
  - Visit: https://xerago.ai/sitemap.xml
  - Should display the XML sitemap with all pages
  - (A scripted check covering both URLs is sketched after this list)
- Submit to Search Engines:

  **Google Search Console:**
  - Go to: https://search.google.com/search-console
  - Add property: https://xerago.ai
  - Submit sitemap: https://xerago.ai/sitemap.xml

  **Bing Webmaster Tools:**
  - Go to: https://www.bing.com/webmasters
  - Add site: https://xerago.ai
  - Submit sitemap: https://xerago.ai/sitemap.xml
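A small Node script (a hypothetical `scripts/verify-live-seo.js`; requires Node 18+ for the built-in `fetch`) can confirm that both production URLs respond with HTTP 200:

```js
// scripts/verify-live-seo.js: checks that robots.txt and sitemap.xml are live in production
const urls = [
  'https://xerago.ai/robots.txt',
  'https://xerago.ai/sitemap.xml',
];

(async () => {
  for (const url of urls) {
    const res = await fetch(url);
    console.log(`${res.status} ${res.statusText}  ${url}`);
    if (!res.ok) process.exitCode = 1;
  }
})();
```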
## Sitemap Contents

The sitemap will automatically include:

- All MDX pages in `src/pages/`:
  - `/` (home)
  - `/about-us`
  - `/blog/*`
  - `/customer-stories/*`
  - `/solutions/*`
  - etc.
- All documentation pages in `docs/`

Excluded patterns:

- `/tags/**` (tag pages)
- Any patterns added to `ignorePatterns`
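To confirm which URLs actually made it into the generated sitemap (and that the tag pages were excluded), a small Node helper (a hypothetical `scripts/list-sitemap-urls.js`) can print every `<loc>` entry:

```js
// scripts/list-sitemap-urls.js: run after `npm run build`
const fs = require('fs');

const xml = fs.readFileSync('build/sitemap.xml', 'utf8');
const urls = [...xml.matchAll(/<loc>(.*?)<\/loc>/g)].map((m) => m[1]);

console.log(`${urls.length} URLs in sitemap:`);
for (const url of urls) {
  console.log(`  ${url}`);
}
```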
## Customization Options

### Change Update Frequency for Specific Pages

You can set a different change frequency and priority in a page's frontmatter:

```md
---
title: About Us
description: Learn about Xerago
# SEO customization
sitemap:
  changefreq: daily
  priority: 0.9
---
```
### Exclude Specific Pages

Add the patterns to `docusaurus.config.js`:

```js
sitemap: {
  ignorePatterns: [
    '/tags/**',
    '/admin/**',
    '/private/**',
  ],
},
```
## Checklist

- robots.txt created in the `static/` folder
- Sitemap configuration added to `docusaurus.config.js`
- Sitemap references the correct production URL
- Test robots.txt after build (`build/robots.txt`)
- Test sitemap.xml after build (`build/sitemap.xml`)
- Submit sitemap to Google Search Console (after deployment)
- Submit sitemap to Bing Webmaster Tools (after deployment)
- Monitor crawl errors in Search Console
## Notes

- Docusaurus automatically generates `sitemap.xml` during the build process
- The sitemap is regenerated on every build with updated content
- robots.txt is a static file and won't change unless you edit it
- Both files will be available at the root of your production site