The first important thing to understand is that TransclusionDecideRule, as used in the default config, is an ACCEPT rule, not a REJECT rule. This means it allows URIs that would otherwise be rejected to be accepted. In other words, it only ever widens the scope. If a URI is already accepted by another rule, such as by being in SURT scope, TransclusionDecideRule has no effect on it.
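
To make the accept-only behaviour concrete, here is a minimal sketch in Python (not Heritrix code). It assumes a simplified model in which rules are evaluated in order and the last rule that gives an opinion wins. The names `Decision`, `decide`, `reject_all`, `surt_scope_rule` and `transclusion_rule`, and the dictionary keys describing a URI, are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "ACCEPT"
    REJECT = "REJECT"
    NONE = "NONE"  # the rule has no opinion about this URI


def decide(rules, uri):
    """Apply rules in order; the last rule with an opinion wins (assumed model)."""
    result = Decision.NONE
    for rule in rules:
        decision = rule(uri)
        if decision is not Decision.NONE:
            result = decision
    return result


def reject_all(uri):
    # Starting point: everything is out of scope unless a later rule accepts it.
    return Decision.REJECT


def surt_scope_rule(uri):
    # Hypothetical stand-in for a SURT scope check.
    return Decision.ACCEPT if uri["in_surt_scope"] else Decision.NONE


def transclusion_rule(uri):
    # An ACCEPT rule returns ACCEPT or NONE, never REJECT,
    # so it can only widen the scope.
    return Decision.ACCEPT if uri["is_transclusion"] else Decision.NONE


RULES = [reject_all, surt_scope_rule, transclusion_rule]
```

In this model a URI that is already in SURT scope is accepted whether or not the transclusion rule matches, and an out-of-scope URI is accepted only if the transclusion rule matches. The transclusion rule never turns an accepted URI into a rejected one.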

For the purposes of the maxTransHops setting, a transclusion hop is any hop that is not a regular navigation link ('L'), a form submission ('S') or a site-map ('M') link.
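
The definition above can be sketched as a small Python function. This is an illustration only, not the Heritrix implementation. It assumes the hop path is a string with one character per hop, and that transclusion hops are counted backwards from the end of the path until the most recent navigation hop. The function names and the example limit of 2 are hypothetical, and any special cases beyond the 'L', 'S' and 'M' distinction are ignored.

```python
# Hop characters that count as navigation rather than transclusion,
# taken from the definition above.
NAVIGATION_HOPS = {"L", "S", "M"}


def trailing_transclusion_hops(hop_path: str) -> int:
    """Count hops at the end of the path that are not 'L', 'S' or 'M'.

    Assumption: counting stops at the most recent navigation hop.
    """
    count = 0
    for hop in reversed(hop_path):
        if hop in NAVIGATION_HOPS:
            break
        count += 1
    return count


def within_max_trans_hops(hop_path: str, max_trans_hops: int) -> bool:
    """True if the path ends in 1..max_trans_hops transclusion hops."""
    hops = trailing_transclusion_hops(hop_path)
    return 0 < hops <= max_trans_hops


if __name__ == "__main__":
    # Example hop paths; 'X' is a speculative hop, 'E' is used here
    # as a stand-in for any other non-navigation hop.
    for path in ("LL", "LLEX", "LEEE"):
        print(path, trailing_transclusion_hops(path), within_max_trans_hops(path, 2))
```

Under these assumptions a path ending in a navigation hop has zero transclusion hops, so the rule would have no opinion on it, and a path with more trailing transclusion hops than the limit would not be accepted by this rule.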

A speculative hop ('X') is where Heritrix finds something that looks like a URL in JavaScript source. Heritrix is not able to understand JavaScript code s…

Answer selected by ato

This discussion was converted from issue #496 on September 30, 2022 00:33.