Difference in normalization of Levenshtein (edit) distance?

If the Levenshtein distance between two strings s and t is given by L(s,t), how do the following three normalization approaches differ in their effect on the resulting heuristic?

[Approach 1] L(s,t) / [length(s) + length(t)]
[Approach 2] L(s,t) / max[length(s), length(t)]
[Approach 3] (L(s,t)*2) / [length(s) + length(t)]

I noticed that approach 2 is recommended by the Levenshtein distance Wikipedia page, but approach 1 is not mentioned there. Are both approaches equally valid? Is there a mathematical justification for using one over the other?

Also, what is the difference between approach 1 and approach 3?
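One thing that can be checked directly from the formulas: approach 3 is approach 1 multiplied by 2, so the two always rank string pairs in the same order and differ only in scale. The minimal sketch below evaluates all three formulas on two hand-picked extreme cases to show the ranges they reach. The function names and the cases are hypothetical, chosen for illustration, and the distances are entered by hand rather than computed.

```python
from fractions import Fraction

# Each function takes a precomputed Levenshtein distance and the two lengths.
# Names are illustrative only, not taken from any library.
def approach_1(dist, len_s, len_t):
    return Fraction(dist, len_s + len_t)

def approach_2(dist, len_s, len_t):
    return Fraction(dist, max(len_s, len_t))

def approach_3(dist, len_s, len_t):
    return Fraction(2 * dist, len_s + len_t)

# Hypothetical extreme cases, given as (distance, length(s), length(t)):
#   "abc" vs "xyz": three substitutions, equal lengths
#   ""    vs "abc": three insertions, one string empty
cases = {"equal_length": (3, 3, 3), "one_empty": (3, 0, 3)}

scores = {
    name: (approach_1(*c), approach_2(*c), approach_3(*c))
    for name, c in cases.items()
}

for name, (a1, a2, a3) in scores.items():
    print(f"{name}: approach 1 = {a1}, approach 2 = {a2}, approach 3 = {a3}")
```

In these two cases approach 2 gives 1 both times, approach 1 gives 1/2 and 1, and approach 3 gives 1 and 2. Since L(s,t) can never exceed max[length(s), length(t)], approach 2 always stays within [0, 1], while approach 3 can exceed 1 when the lengths differ.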

With the following example:

s = "Hi, my name is"
t = "Hello, my name is"
L(s,t) = 4
length(s) = 14 # (includes white space)
length(t) = 17 # (includes white space)

The normalized Levenshtein distances under the three approaches above are:

[Approach 1]   4  /(14+17) = 0.129
[Approach 2]   4  /(17)    = 0.235
[Approach 3] (4*2)/(14+17) = 0.258
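For anyone who wants to reproduce these numbers, here is a minimal sketch in Python. The `levenshtein` function is a plain dynamic-programming implementation with unit costs for insertion, deletion, and substitution, and `normalized` is a hypothetical helper name; neither comes from a particular library. Returning 0.0 for two empty strings is an assumption made here to avoid dividing by zero.

```python
def levenshtein(s: str, t: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    if len(s) < len(t):
        s, t = t, s  # keep the shorter string on the inner loop
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # delete cs
                current[j - 1] + 1,      # insert ct
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized(s: str, t: str) -> dict:
    """Return the three normalizations from the question, keyed 1 to 3."""
    dist = levenshtein(s, t)
    total = len(s) + len(t)
    if total == 0:
        # Assumption: two empty strings are treated as distance 0.
        return {1: 0.0, 2: 0.0, 3: 0.0}
    return {
        1: dist / total,
        2: dist / max(len(s), len(t)),
        3: 2 * dist / total,
    }


s = "Hi, my name is"
t = "Hello, my name is"
print(levenshtein(s, t))
for approach, value in normalized(s, t).items():
    print(f"[Approach {approach}] {value:.3f}")
```

Run on the example strings, this should print a distance of 4 followed by 0.129, 0.235, and 0.258, matching the values above.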

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
