Scrapy | How can I add the custom pipelines to custom_settings?
I'm having issues getting my scraper to load an item pipeline. When I try to add my custom pipeline, I get the following error:
builtins.ModuleNotFoundError: No module named 'scraper_app'
I have tried setting ITEM_PIPELINES = ["scraper_app.pipelines.LeasePipeline"] in settings.py, and that works, but when I set it through the custom_settings variable, the above error occurs.
Below is the directory structure of my application:
.
├── scraper_app
│   ├── __init__.py
│   ├── models.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       ├── leased.py
│       ├── lease.py
│       ├── sale.py
│       └── sold.py
└── scrapy.cfg
I need to run multiple pipelines for different spiders in my spiders folder.
In the lease.py file I set:
custom_settings = {
    "LOG_FILE": "cel_lease.log",
    "ITEM_PIPELINES": {"scraper_app.pipelines.LeasePipeline": 300},
}
I am running it as a standalone script:
python lease.py
The scraper fails with the following error:
builtins.ModuleNotFoundError: No module named 'scraper_app'
Can anyone point out what I am doing wrong?
Solution 1:[1]
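Since you are running it as a standalone script, Python cannot resolve the dotted path to the pipeline: it adds the script's own directory (spiders), not the project root, to the module search path, so the scraper_app package is not found. You can either import the pipeline into the script or define it in the same script, then reference the class directly in custom_settings as below, where LeasePipeline is the pipeline class: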
custom_settings = {
    "LOG_FILE": "cel_lease.log",
    "ITEM_PIPELINES": {LeasePipeline: 300},
}
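If you go the import route, keep in mind that `from scraper_app.pipelines import LeasePipeline` needs the project root on `sys.path` too, otherwise it raises the same ModuleNotFoundError. Below is a minimal sketch of one way to handle that, assuming lease.py sits at `<root>/scraper_app/spiders/lease.py` as in the tree from the question. The helper name `add_project_root` and the spider class name `LeaseSpider` are made up for illustration. As far as I know, passing the class object itself in ITEM_PIPELINES needs Scrapy 2.4 or newer; on older versions the dotted string path should also resolve once the root is importable.

```python
import sys
from pathlib import Path


def add_project_root(script_file, levels_up=2):
    """Put the directory `levels_up` levels above the script's folder on sys.path.

    For <root>/scraper_app/spiders/lease.py, levels_up=2 resolves to <root>,
    the folder that holds the scraper_app package and scrapy.cfg.
    """
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        # Insert at the front so the project package is found first.
        sys.path.insert(0, str(root))
    return root


# Hypothetical usage at the top of lease.py, before any scraper_app import:
#
#     add_project_root(__file__)
#     from scraper_app.pipelines import LeasePipeline
#
#     class LeaseSpider(scrapy.Spider):  # class name is an assumption
#         custom_settings = {
#             "LOG_FILE": "cel_lease.log",
#             "ITEM_PIPELINES": {LeasePipeline: 300},
#         }
```

Another option that avoids touching `sys.path` is to start the script from the folder containing scrapy.cfg with `python -m scraper_app.spiders.lease`, since `-m` puts the current directory on the module search path.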
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | msenior_ |
