# robots.txt for G2 Middle East
# Updated: 2025-10-28
# Purpose: Maximise SEO, AI/LLM visibility, and protect sensitive content

# ============================================
# UNIVERSAL RULES - Apply to All Bots
# ============================================
User-agent: *
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /white-paper/
Disallow: /white-papers/
Disallow: /api/admin/
Disallow: /admin/login
Disallow: /admin/dashboard
Disallow: /*.json$
Disallow: /*?token=
Disallow: /*?reset=
Disallow: /projects/verify-email
Disallow: /projects/reset-password
# Allow public content and key areas
Allow: /projects/
Allow: /projects/dashboard
Allow: /projects/case-studies/
Allow: /projects/login
Allow: /projects/register
Allow: /team/
Allow: /team/tim-jacobs
Allow: /briefing/
Allow: /perspectives/
Allow: /contact/
Allow: /services/

# Sitemap locations
Sitemap: https://g2middleeast.com/sitemap.xml
Sitemap: https://g2middleeast.com/sitemap-projects.xml

# LLMs/AI: See https://g2middleeast.com/llms.txt for structured company data
# Privacy: https://g2middleeast.com/privacy-policy
# Terms: https://g2middleeast.com/terms-of-service
# Brand logo: https://g2middleeast.com/assets/logo-g2me.svg
# Preferred citation: "G2 Middle East & Africa, a specialised division of Casta Diva Group focused on strategic advisory and governmental affairs in the Middle East and Africa."
# ============================================
# AI/LLM CRAWLERS (Explicit Permissions)
# ============================================
User-agent: GPTBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: Claude-Web
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: anthropic-ai
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: Google-Extended
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: PerplexityBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: CCBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 2

User-agent: cohere-ai
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: meta-externalagent
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: Applebot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: Applebot-Extended
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

# ============================================
# SEARCH ENGINE CRAWLERS (Google, Bing, Yandex, Baidu, DuckDuckGo)
# ============================================
User-agent: Googlebot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 0.5

User-agent: Googlebot-Image
Allow: /
Disallow: /whitepaper/
Disallow: /api/admin/

User-agent: Googlebot-News
Allow: /
Disallow: /whitepaper/
Disallow: /api/admin/
Allow: /briefing/

User-agent: Bingbot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: YandexBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 2

User-agent: Baiduspider
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 2

User-agent: DuckDuckBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

# ============================================
# SOCIAL MEDIA CRAWLERS
# ============================================
User-agent: FacebookBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

User-agent: Twitterbot
Allow: /
Disallow: /whitepaper/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/

User-agent: LinkedInBot
Allow: /
Disallow: /whitepaper/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/

User-agent: Slackbot
Allow: /
Disallow: /whitepaper/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/

# ============================================
# ACADEMIC & RESEARCH CRAWLERS
# ============================================
User-agent: ia_archiver
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 2

User-agent: archive.org_bot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 5

User-agent: Wayback Machine
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Crawl-delay: 5

# ============================================
# EMERGING AI CRAWLERS
# ============================================
User-agent: Omgilibot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 2

User-agent: YouBot
Allow: /
Disallow: /whitepaper/
Disallow: /whitepapers/
Disallow: /api/admin/
Allow: /projects/
Allow: /team/tim-jacobs
Allow: /briefing/
Crawl-delay: 1

# ============================================
# BLOCK MALICIOUS/SCRAPER BOTS
# ============================================
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: serpstatbot
Disallow: /

User-agent: ZoominfoBot
Disallow: /

User-agent: Scrapy
Disallow: /

# ============================================
# NOTES & DOCUMENTATION
# ============================================
# This robots.txt file is configured to:
# 1. Allow all major AI crawlers (ChatGPT, Claude, Gemini, Perplexity, etc.)
# 2. Allow all major search engines (Google, Bing, DuckDuckGo, etc.)
# 3. Allow full access to /projects/ area (including authenticated pages)
# 4. Allow full access to /team/tim-jacobs profile
# 5. Allow full access to /briefing/ content (Perspectives)
# 6. BLOCK all access to /whitepaper/ and /whitepapers/ directories
# 7. BLOCK admin areas and sensitive API endpoints
# 8. BLOCK malicious scrapers and SEO bots
# 9. Set appropriate crawl delays to prevent server overload
# 10. Reference llms.txt for structured company data
# For questions or modifications, contact: tim@g2middleeast.com