This cannot be fixed without changes in gaxios (see the referenced PR). If the real (post-redirect) page URL were available in the response, this bug could be fixed by changing `opts.url` to `res.request.responseURL` in index.js:149.
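To illustrate the idea, here is a minimal sketch of resolving page links against the final URL instead of the requested one. It assumes the response exposes the post-redirect URL as `res.request.responseURL`, as proposed above. The helpers `getBaseUrl` and `resolveLinks` are hypothetical names, not code from index.js, and the trailing slashes on the URLs are an assumption made to reproduce the paths from the report.

```javascript
// Hypothetical helper: pick the base URL used to resolve links found on a page.
// Prefer the post-redirect URL when the response exposes it, otherwise fall
// back to the URL that was originally requested.
function getBaseUrl(opts, res) {
  const finalUrl = res && res.request && res.request.responseURL;
  return finalUrl || opts.url;
}

// Hypothetical helper: resolve every href against the chosen base URL.
function resolveLinks(hrefs, opts, res) {
  const base = getBaseUrl(opts, res);
  return hrefs.map((href) => new URL(href, base).href);
}

// Mocked objects for illustration only; no network request is made.
const opts = { url: 'https://www.sberbank.ru/ru/person/seizure/' };
const res = { request: { responseURL: 'https://www.sberbank.ru/seizure/' } };

const links = resolveLinks(['./1142'], opts, res);
console.log(links); // [ 'https://www.sberbank.ru/seizure/1142' ]
```

When the response carries no final URL, the sketch falls back to `opts.url`, which matches the current behaviour.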
This could also be a separate feature: the crawler's result JSON could include information about which page links are redirects. There are many cases where this would be useful:
- http links to sites that have fully upgraded to https
- links without www.
- redirects that now lead to a different page than before
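As a rough sketch of what such redirect information might look like, the function below compares a requested URL with its final URL and flags the cases listed above. The function name `describeRedirect` and the shape of the returned object are assumptions for illustration only and are not part of the crawler's current output.

```javascript
// Hypothetical classifier for a redirect, given the requested and final URLs.
function describeRedirect(requestedUrl, finalUrl) {
  const from = new URL(requestedUrl);
  const to = new URL(finalUrl);
  const stripWww = (host) => host.replace(/^www\./, '');
  return {
    requestedUrl: from.href,
    finalUrl: to.href,
    redirected: from.href !== to.href,
    // http link to a site that now serves https
    httpsUpgrade: from.protocol === 'http:' && to.protocol === 'https:',
    // hostnames differ only by a leading "www."
    wwwChange:
      from.hostname !== to.hostname &&
      stripWww(from.hostname) === stripWww(to.hostname),
    // redirect lands on a different path than the one requested
    pathChange: from.pathname !== to.pathname,
  };
}

const info = describeRedirect(
  'https://www.sberbank.ru/ru/person/seizure',
  'https://www.sberbank.ru/seizure'
);
console.log(info.redirected, info.pathChange); // true true
```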
For example, the URL https://www.sberbank.ru/ru/person/seizure redirects to https://www.sberbank.ru/seizure, and that page contains relative URLs such as `./1142`. If we crawl /seizure directly, all of these URLs are OK. But when we start scanning from /ru/person/seizure, every relative URL is incorrectly prefixed with the pre-redirect URL, for example `/ru/person/seizure/1142`, and is marked as broken.
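The effect can be reproduced without any crawler by resolving the same relative link against both URLs with the standard `URL` constructor. This is only a sketch: the trailing slashes on both base URLs are an assumption, added so that the resolved paths match the ones in the report.

```javascript
// Resolve one relative link against the pre-redirect and post-redirect URLs.
// Trailing slashes are assumed here to reproduce the paths from the report.
const relativeLink = './1142';
const requestedUrl = 'https://www.sberbank.ru/ru/person/seizure/';
const finalUrl = 'https://www.sberbank.ru/seizure/';

const resolvedAgainstRequested = new URL(relativeLink, requestedUrl).href;
const resolvedAgainstFinal = new URL(relativeLink, finalUrl).href;

console.log(resolvedAgainstRequested); // https://www.sberbank.ru/ru/person/seizure/1142
console.log(resolvedAgainstFinal); // https://www.sberbank.ru/seizure/1142
```

Only the second result points at a page that exists, which is why the base URL has to be the one reached after the redirect.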