SF Crawl System

Instructions

A guide to configuring and running SEO crawls with Screaming Frog Cloud. All crawl settings are managed through .seospiderconfig files created in Screaming Frog Desktop.

1. Create a Configuration File

All crawl settings (threads, URL limits, extraction rules, user agent, etc.) are controlled by a .seospiderconfig file. Create one in Screaming Frog Desktop:

  1. Open Screaming Frog SEO Spider on your desktop
  2. Configure your crawl settings (see sections below for key settings)
  3. Go to File > Configuration > Save As
  4. Save the .seospiderconfig file

2. Key Settings to Configure in SF Desktop

URL Crawl Limit

Configuration > Spider > Limits > Crawl Limit
Set the maximum number of URLs to crawl. Without a limit, SF will crawl the entire site, which can take hours on large sites. Recommended: 500 for quick audits, 5000 for full audits.

Setting            | Where in SF Desktop                | Recommendation
Crawl Limit        | Configuration > Spider > Limits    | 500 (quick) / 5000 (full)
Threads            | Configuration > Speed              | 10-15 recommended
JS Rendering       | Configuration > Spider > Rendering | OFF unless crawling a SPA (React/Angular/Vue)
URL Filtering      | Configuration > Include / Exclude  | Use to focus on specific sections (e.g. /blog/)
User Agent         | Configuration > User-Agent         | Googlebot or Chrome
Respect Robots.txt | Configuration > Robots.txt         | ON for production, OFF for full audit

3. Custom Extraction Rules

Extract specific content from every crawled page. Configure in SF Desktop:

  1. Go to Configuration > Custom > Extraction
  2. Click Add to create a new rule
  3. Choose the selector type (CSS Selector, XPath, or Regex)
  4. Enter the selector pattern
  5. Save the config file

Tip: Inspect the target site first (right-click → Inspect Element) to find the right CSS selectors for the content you want to extract.
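If a rule uses XPath, a quick local check can catch typos before you save the config. A rough sketch using Python's standard library (ElementTree supports only a subset of XPath and requires well-formed markup, so treat this as a sanity check rather than a guarantee that SF will match the same nodes; the sample HTML is made up):

```python
# Sanity-check XPath extraction rules against a sample page before
# saving them into the .seospiderconfig. ElementTree's XPath support
# is limited and needs well-formed markup, so this is only a rough
# local check.
import xml.etree.ElementTree as ET

sample = """<html><body>
<main id="content"><p>Hello world</p></main>
<script type="application/ld+json">{"@type": "Article"}</script>
</body></html>"""

root = ET.fromstring(sample)

# Roughly mirrors a "main content area" rule
main = root.find(".//main")
print(main.find("p").text)

# Roughly mirrors a "schema markup" rule (Inner HTML in SF)
schema = root.find(".//script[@type='application/ld+json']")
print(schema.text)
```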

Common extraction examples:

Goal              | Type         | Pattern                                          | Extract
Page body content | CSS Selector | .entry-content                                   | Extract Text
Main content area | CSS Selector | main, article, #content                          | Extract Text
Product price     | CSS Selector | .product-price, .price                           | Extract Text
Schema markup     | CSS Selector | script[type="application/ld+json"]               | Inner HTML
Email addresses   | Regex        | [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}   | Extract Text
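The email regex above can be tested locally with Python's re module before adding it to a config (SF's regex engine may differ slightly in edge cases, so treat this as a sanity check; the sample text is made up):

```python
# Quick sanity check of the email-extraction regex from the
# examples above, using Python's re module.
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

page_text = "Contact sales@example.com or support@mail.example.co.uk for help."
print(EMAIL_RE.findall(page_text))
# → ['sales@example.com', 'support@mail.example.co.uk']
```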

4. Upload Config and Start Crawl

  1. Go to New Crawl from the sidebar
  2. Enter the target URL
  3. Upload your .seospiderconfig file (optional — default config is used if not uploaded)
  4. Review the parsed settings and extraction rules displayed below the upload
  5. Click Save Config to save it for reuse
  6. Click Start Crawl

Previously saved configs can be loaded via the "Load Saved Config" link.

5. Viewing Results

  • Pages tab — Summary stats: total pages, status codes, errors, issues
  • Issues tab — SEO issues detected with severity, count, and downloadable CSV per issue
  • SF Exports tab — Download raw Screaming Frog CSV files
  • Extraction data — Custom extraction data is automatically cleaned and stored in the database; the raw CSV is also available in SF Exports
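The raw custom extraction CSV from SF Exports can also be post-processed directly. A minimal sketch with Python's csv module (the filename and the column names "Address" and "Extract 1" are assumptions; check the header row of your actual export):

```python
# Post-process a custom-extraction CSV downloaded from SF Exports.
# Column names ("Address", "Extract 1") and the inline sample data
# are assumptions; check the header row of your real export.
import csv
import io

# In practice you would use open("your_export.csv") instead of this
# inline sample.
raw = io.StringIO(
    "Address,Extract 1\n"
    "https://example.com/blog/post-1,Hello world\n"
    "https://example.com/blog/post-2,\n"
)

rows = list(csv.DictReader(raw))
# Keep only pages where the extraction rule actually matched
extracted = {r["Address"]: r["Extract 1"] for r in rows if r["Extract 1"]}
print(extracted)
```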

6. Managing Config Files

  • Save — Upload and save configs for reuse across crawls
  • Load — Select a previously saved config when starting a new crawl
  • Download — Download a saved config to edit in SF Desktop and re-upload
  • Configs store all SF settings including extraction rules, URL limits, speed, rendering, and filtering

7. Tips for Faster Crawls

  • Always set a URL crawl limit in SF Desktop to avoid crawling entire large sites
  • Keep JS Rendering OFF unless crawling a SPA
  • Use URL filters to focus on specific sections (e.g. /blog/)
  • Increase threads to 15-20 for faster crawls on robust servers
  • Disable image/CSS/JS checking in SF Desktop if you only need HTML page data