Normalize documents

This setting normalizes both input files by replacing or removing words or parts of words before the texts are compared.

Every line in the normalization file represents one normalization rule. A rule is either a single sequence of characters containing no whitespace, or two such sequences separated by whitespace.

A rule with two character sequences replaces every occurrence of the first sequence with the second. For example, the rule ſ s replaces every long s (ſ) with a regular s before the texts are compared.

A rule consisting of a single sequence, with no whitespace, removes every occurrence of that sequence from both files before the texts are compared.

A normalization file could look as follows:

PAGEBREAK
ſ ss

These rules would remove every occurrence of PAGEBREAK from both files and replace every "ſ" with "ss" before the texts are compared.
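
To make the rule format concrete, the following Python sketch parses a normalization file and applies its rules to a text. It is an illustration only, not the tool's actual implementation: the function names parse_rules and normalize are hypothetical, and it assumes that blank lines are skipped, that rules are applied in file order, and that matching is plain substring replacement.

```python
def parse_rules(text):
    """Parse normalization rules, one rule per line.

    Returns a list of (pattern, replacement) pairs. A rule with a single
    sequence gets an empty replacement, so the sequence is removed.
    """
    rules = []
    for line in text.splitlines():
        parts = line.split()  # split on any run of whitespace
        if not parts:
            continue  # assumption: blank lines are ignored
        if len(parts) > 2:
            raise ValueError(f"invalid rule: {line!r}")
        pattern = parts[0]
        replacement = parts[1] if len(parts) == 2 else ""
        rules.append((pattern, replacement))
    return rules


def normalize(text, rules):
    """Apply every rule to the text, in file order (an assumption)."""
    for pattern, replacement in rules:
        text = text.replace(pattern, replacement)
    return text


# Hypothetical rule file: one removal rule, one replacement rule.
rules = parse_rules("PAGEBREAK\nſ ss\n")
print(normalize("eine groſe PAGEBREAKStadt", rules))  # eine grosse Stadt
```

Both input files would be passed through the same normalization step, so that differences covered by the rules no longer show up in the comparison.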