Use - Standalone

You can use Scrapy in standalone mode, where you run the process directly.

Doing this does not let you take full advantage of Scrapy, but it can be useful for testing, or in environments where it is hard to install or run other software.


Scrapy provides a command-line interface for spiders.

To see available spiders:

scrapy list

To run one:

scrapy crawl <spider_name> -a key=value


scrapy crawl canada_buyandsell -a note="Started by Fred." -a sample=true
scrapy crawl canada_buyandsell -a note="Started by Fred."

Update the note with your name, and anything else of interest.
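Each -a key=value pair is passed as a keyword argument to the spider's constructor, so the keys end up as attributes on the spider. The stub below is a minimal sketch of that behaviour; it mimics Scrapy's argument handling without requiring Scrapy itself, and the class and function names are illustrative, not part of any real API:

```python
# Sketch of how Scrapy's -a key=value arguments reach a spider:
# each pair is passed as a keyword argument to the spider's __init__.
# SpiderStub and parse_crawl_args are illustrative stand-ins.

class SpiderStub:
    """Stands in for scrapy.Spider; stores -a arguments as attributes."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


def parse_crawl_args(argv):
    """Collect the key=value pairs that follow each -a flag."""
    kwargs = {}
    tokens = iter(argv)
    for token in tokens:
        if token == "-a":
            key, _, value = next(tokens).partition("=")
            kwargs[key] = value
    return kwargs


# Equivalent of: scrapy crawl canada_buyandsell -a note="Started by Fred." -a sample=true
kwargs = parse_crawl_args(["-a", "note=Started by Fred.", "-a", "sample=true"])
spider = SpiderStub(**kwargs)
print(spider.note)    # Started by Fred.
print(spider.sample)  # true -- note that -a values arrive as strings, not booleans
```

One consequence worth knowing: because -a values are plain strings, a value like sample=true is the string "true", and the spider itself decides how to interpret it.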

Output - Disk

You must configure FILES_STORE.

FILES_STORE = 'data'

FILES_STORE should point to a local directory in which the scraped data will be written.

Files are stored in {FILES_STORE}/{scraper_name}/{scraper_start_date_time}.
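The path pattern above can be sketched in Python as follows. The spider name and the timestamp format are illustrative assumptions; check your own output directory for the exact names your installation produces:

```python
import os
from datetime import datetime

FILES_STORE = "data"  # as configured above

# Hypothetical run values -- in practice these come from the crawl itself.
scraper_name = "canada_buyandsell"
start = datetime(2020, 1, 2, 3, 4, 5)

# Files for this run land under {FILES_STORE}/{scraper_name}/{scraper_start_date_time}.
# The timestamp format here is an assumption for illustration.
run_directory = os.path.join(FILES_STORE, scraper_name, start.strftime("%Y%m%d_%H%M%S"))
print(run_directory)
```

On a POSIX system this prints data/canada_buyandsell/20200102_030405; each crawl therefore gets its own directory, so repeated runs never overwrite each other.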

Output - Kingfisher Process

In the settings file, make sure the 3 API variables are set to load from the environment.

The kingfisher-process API endpoint variables are read from the scraper's environment. To configure:

  1. Copy to
  2. Set the KINGFISHER_* variables in to match your instance (local or server).
  3. Run source to export them to the scraper environment.
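Once sourced, the scraper reads the variables from its environment, for example with os.getenv. The variable name below is an assumption for illustration; substitute the actual KINGFISHER_* names from your configuration file:

```python
import os

# Illustrative only: KINGFISHER_API_URI stands in for whichever
# KINGFISHER_* names your configuration defines. Setting os.environ
# here simulates the effect of sourcing the file beforehand.
os.environ["KINGFISHER_API_URI"] = "http://localhost:8000"

# The scraper side: read the value from the environment, None if unset.
api_uri = os.getenv("KINGFISHER_API_URI")
print(api_uri)  # http://localhost:8000
```

Using os.getenv (rather than os.environ[...]) means a missing variable yields None instead of raising, so the scraper can report a clear configuration error.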