Coverage for scrapy/utils/request : 92%
""" This module provides some useful functions for working with scrapy.http.Request objects """
""" Return the request fingerprint.
The request fingerprint is a hash that uniquely identifies the resource the request points to. For example, take the following two URLs:
http://www.example.com/query?id=111&cat=222
http://www.example.com/query?cat=222&id=111
Even though those are two different URLs, both point to the same resource and are equivalent (i.e. they should return the same response).
Another example is cookies used to store session IDs. Suppose the following page is only accessible to authenticated users:
http://www.example.com/members/offers.html
Many sites use a cookie to store the session ID, which adds a random component to the HTTP request and thus should be ignored when calculating the fingerprint.
For this reason, request headers are ignored by default when calculating the fingerprint. If you want to include specific headers, use the include_headers argument, which is a list of Request headers to include.
"""
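The idea described in the docstring above can be sketched with the standard library alone. This is an illustrative approximation, not the module's actual implementation: the names `canonicalize` and `sketch_fingerprint` are hypothetical, the URL canonicalization only sorts query parameters, and the choice of SHA-1 over method, URL, body, and selected headers is an assumption.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url):
    """Sort query parameters so equivalent URLs compare equal (simplified)."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Drop the fragment; it does not change which resource is requested.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def sketch_fingerprint(method, url, body=b"", headers=None, include_headers=None):
    """Hash the method, canonical URL, body, and only the listed headers."""
    fp = hashlib.sha1()
    fp.update(method.upper().encode("ascii"))
    fp.update(canonicalize(url).encode("utf-8"))
    fp.update(body)
    # Headers are ignored unless explicitly named in include_headers.
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    for name in sorted(h.lower() for h in (include_headers or [])):
        if name in lowered:
            fp.update(name.encode("utf-8"))
            fp.update(lowered[name].encode("utf-8"))
    return fp.hexdigest()


# The two example URLs differ only in parameter order.
fp_a = sketch_fingerprint("GET", "http://www.example.com/query?id=111&cat=222")
fp_b = sketch_fingerprint("GET", "http://www.example.com/query?cat=222&id=111")

# Different session cookies on the same page.
page = "http://www.example.com/members/offers.html"
fp_c = sketch_fingerprint("GET", page, headers={"Cookie": "session=abc"})
fp_d = sketch_fingerprint("GET", page, headers={"Cookie": "session=xyz"})
fp_e = sketch_fingerprint("GET", page, headers={"Cookie": "session=abc"},
                          include_headers=["Cookie"])
fp_f = sketch_fingerprint("GET", page, headers={"Cookie": "session=xyz"},
                          include_headers=["Cookie"])
```

In this sketch `fp_a` equals `fp_b`, and `fp_c` equals `fp_d` because the cookie is ignored by default; `fp_e` and `fp_f` differ because the `Cookie` header was explicitly included.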
"""Authenticate the given request (in place) using the HTTP basic access authentication mechanism (RFC 2617) and the given username and password. """
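For reference, HTTP basic authentication amounts to setting an `Authorization` header whose value is `Basic ` followed by the base64 encoding of `username:password`. The sketch below shows that step on a plain dict standing in for the request headers; the helper name `basic_auth_value` and the default ISO-8859-1 encoding are assumptions, not taken from the module.

```python
import base64


def basic_auth_value(username, password, encoding="ISO-8859-1"):
    """Build the Authorization header value for HTTP basic authentication."""
    credentials = f"{username}:{password}".encode(encoding)
    return b"Basic " + base64.b64encode(credentials)


# A plain dict stands in for the request's headers (modified in place).
headers = {}
headers["Authorization"] = basic_auth_value("user", "pass")
```

Note that base64 is an encoding, not encryption, so this scheme is only safe over a secure transport such as HTTPS.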
"""Return the raw HTTP representation (as string) of the given request. This is provided only for reference since it's not the actual stream of bytes that will be sent when performing the request (that's controlled by Twisted). """
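A rough picture of what such a reference representation could look like, assuming an HTTP/1.1 request line, a `Host` header derived from the URL, and CRLF line endings. The function name `sketch_http_repr` and its parameters are hypothetical, and the real bytes on the wire may differ.

```python
from urllib.parse import urlsplit


def sketch_http_repr(method, url, headers=None, body=b""):
    """Render a request as raw HTTP/1.1 bytes, for display purposes only."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [f"{method} {path} HTTP/1.1", f"Host: {parts.hostname}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line separates the header section from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("utf-8") + body


raw = sketch_http_repr("GET", "http://www.example.com/query?id=111&cat=222")
```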
"""Wrap a request inside a Deferred.
This returns a Deferred whose first pair of callbacks are the request callback and errback. The Deferred also triggers when the request callback/errback is executed (i.e. when the request is downloaded).
"""
d = Deferred()
if request.callback:
    d.addCallbacks(request.callback, request.errback)
request.callback, request.errback = d.callback, d.errback
return d