An Example Code for Scrapy Spider Implementation in Python

For the first solution, you can create a worker and incorporate it into your Python script, as suggested by this blog post. To use it, follow the provided example or run the scrapy crawl command.
For the second solution, Maxime Lorant’s answer resolved the problems I ran into while building a Scrapy spider in my script. Alternatively, you can skip parsing the date by passing a datetime instance from the start. As a side note, this answer contains a comprehensive example of how to run Scrapy from a script.


Solution:

To revise your code, change the spider's __init__() constructor to accept a date parameter, and use datetime.strptime() to parse the date string.

from datetime import datetime

from scrapy.spiders import CrawlSpider


class MySpider(CrawlSpider):
    name = 'tw'
    allowed_domains = ['test.com']

    def __init__(self, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)
        # The date arrives as a keyword argument, e.g. date='01-01-2015'.
        date = kwargs.get('date')
        if not date:
            raise ValueError('No date given')
        # Parse the month-day-year string into a datetime object.
        dt = datetime.strptime(date, "%m-%d-%Y")
        self.start_urls = ['http://test.com/{dt.year}-{dt.month}-{dt.day}'.format(dt=dt)]

You can then create an instance of the spider like this:

spider = MySpider(date='01-01-2015')
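
To check which start URL a given date string produces, the parsing step can be tried on its own, without Scrapy. This is only a minimal sketch: the helper name build_start_url is hypothetical and not part of the original answer; it simply mirrors the date handling in the constructor.

```python
from datetime import datetime


def build_start_url(date_string):
    """Hypothetical helper mirroring the constructor's date handling."""
    # strptime raises ValueError if the string is not month-day-year.
    dt = datetime.strptime(date_string, "%m-%d-%Y")
    return 'http://test.com/{dt.year}-{dt.month}-{dt.day}'.format(dt=dt)


print(build_start_url('01-01-2015'))  # http://test.com/2015-1-1
print(build_start_url('12-25-2014'))  # http://test.com/2014-12-25
```

Note that dt.month and dt.day are plain integers, so the resulting URL is not zero-padded. If the target site expects a form such as 2015-01-01, dt.strftime('%Y-%m-%d') would produce it.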

Alternatively, instead of parsing the date, you can pass a datetime instance directly.

from datetime import datetime

from scrapy.spiders import CrawlSpider


class MySpider(CrawlSpider):
    name = 'tw'
    allowed_domains = ['test.com']

    def __init__(self, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)
        dt = kwargs.get('dt')
        if not dt:
            raise ValueError('No date given')
        self.start_urls = ['http://test.com/{dt.year}-{dt.month}-{dt.day}'.format(dt=dt)]


# Integer literals with leading zeros (month=01) are a syntax error in Python 3.
spider = MySpider(dt=datetime(year=2014, month=1, day=1))

For reference, this answer gives a detailed example of running Scrapy from a script.
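
As a rough illustration of what running a spider from a script can look like, here is a minimal sketch using Scrapy's CrawlerProcess. The function name run_spider is hypothetical, and the sketch assumes the spider class accepts a date keyword argument as in the first example; it is not taken from the linked answer.

```python
def run_spider(spider_cls, date_string):
    """Run one spider in-process and block until the crawl finishes."""
    # Imported here so this sketch can be loaded without Scrapy installed.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess()
    # Extra keyword arguments are forwarded to the spider's __init__().
    process.crawl(spider_cls, date=date_string)
    # Starts the Twisted reactor; returns once crawling is done.
    process.start()


# Usage (assumes MySpider from the first example is defined or imported):
# run_spider(MySpider, '01-01-2015')
```

From inside a Scrapy project, the same argument can be supplied on the command line with scrapy crawl tw -a date=01-01-2015, since values given with -a are passed to the spider's constructor as keyword arguments.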

Frequently Asked Questions