Using command line tools to extract and replace texts for translations

For an application, I have a language file in the form

first_identifier = English words
second_identifier = more English words

and need to translate it into further languages. As a first step, I need to extract the right-hand side of those lines, resulting in a file like ...

English words
more English words

... How can I achieve that? Using grep maybe?

Next I'd use a translation tool and receive something like

German words
more German words

that now needs to be inserted into the first file again (replacing the English words with the German ones). I was thinking about using sed, but I don't know how to use it for this purpose. Or do you have other recommendations?



Solution 1:[1]

To do it as you describe would be:

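$ cat tst.sh
#!/usr/bin/env bash

tmp=$(mktemp) || exit 1
trap 'rm -f "$tmp"; exit' 0

# Extract the text after " = " into a temp file, "translate" it (tr stands
# in for the real tool), then pair each result with its identifier.
sed 's/[^ =]* = //' "$@" > "$tmp" &&
tr 'a-z' 'A-Z' < "$tmp" |
awk '
    BEGIN { OFS = " = " }
    NR == FNR {             # first input (stdin): the translated lines
        ger[NR] = $0
        next
    }
    {                       # second input: the original language file
        sub(/ = .*/,"")
        print $0, ger[FNR]
    }
' - "$@"

$ ./tst.sh file
first_identifier = ENGLISH WORDS
second_identifier = MORE ENGLISH WORDS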

but you don't need a temp file for that:

$ cat tst.sh
#!/usr/bin/env bash

sed 's/[^ =]* = //' "$@" |
tr 'a-z' 'A-Z' |
awk '
    BEGIN { OFS = " = " }
    NR == FNR {
        ger[NR] = $0
        next
    }
    {
        sub(/ = .*/,"")
        print $0, ger[FNR]
    }
' - "$@"

$ ./tst.sh file
first_identifier = ENGLISH WORDS
second_identifier = MORE ENGLISH WORDS

I think this might be what you really want anyway, so that your translation tool translates one line at a time instead of the whole input at once, which might produce different results:

$ cat tst.sh
#!/usr/bin/env bash

while IFS= read -r line; do
    id="${line%% = *}"
    eng="${line#* = }"
    ger="$(tr 'a-z' 'A-Z' <<<"$eng")"
    printf '%s = %s\n' "$id" "$ger"
done < "${1:-/dev/stdin}"

$ ./tst.sh file
first_identifier = ENGLISH WORDS
second_identifier = MORE ENGLISH WORDS
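
One caveat with the loop above: a line that contains no " = " (a blank line or a comment, say) would be printed as "text = TEXT" rather than left alone. A minimal sketch that passes such lines through unchanged, assuming that is the behavior you want; translate and translate_file are hypothetical names, and tr again only stands in for the real tool:

```shell
#!/usr/bin/env bash

# Hypothetical placeholder: swap the body for a call to a real translation tool.
translate() {
    tr 'a-z' 'A-Z' <<<"$1"
}

# Reads language-file lines on stdin and writes the translated file to stdout.
translate_file() {
    local line id eng
    # "|| [[ -n $line ]]" also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *' = '* ]]; then
            id=${line%% = *}     # everything before the first " = "
            eng=${line#* = }     # everything after the first " = "
            printf '%s = %s\n' "$id" "$(translate "$eng")"
        else
            # Blank lines, comments, and anything without " = " pass through.
            printf '%s\n' "$line"
        fi
    done
}

translate_file <<'EOF'
first_identifier = English words

# a comment line
second_identifier = more English words
EOF
```

Only lines containing " = " are sent to the translation step, so the rest of the file keeps its layout.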

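In each script, tr 'a-z' 'A-Z' is only a stand-in for the translation step. Just replace tr 'a-z' 'A-Z' < "$tmp", the tr 'a-z' 'A-Z' stage of the pipeline, or tr 'a-z' 'A-Z' <<<"$eng" with the call to whatever translation tool you have in mind.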
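
If the translation happens outside the script, as in the question, you end up with two files: the original language file and a file of translated lines. A sketch of that round trip, assuming the translated file has exactly one line per input line, in the same order. The file names lang_en.txt, english.txt, german.txt and lang_de.txt are made up for the example, and german.txt is written by hand here in place of a real tool's output:

```shell
#!/usr/bin/env bash

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Sample language file in the format from the question.
printf '%s\n' 'first_identifier = English words' \
              'second_identifier = more English words' > lang_en.txt

# Step 1: drop everything up to and including the first " = ".
sed 's/^[^=]* = //' lang_en.txt > english.txt

# Step 2 (stand-in): pretend an external tool turned english.txt into this.
printf '%s\n' 'German words' 'more German words' > german.txt

# Refuse to merge if the tool merged or split lines along the way.
en_count=$(wc -l < english.txt)
de_count=$(wc -l < german.txt)
if [[ $en_count -ne $de_count ]]; then
    echo 'line counts differ; refusing to merge' >&2
    exit 1
fi

# Step 3: read the translations first, then re-attach them to the identifiers.
awk '
    NR == FNR { ger[FNR] = $0; next }    # first file: translated lines
    { sub(/ = .*/, ""); print $0 " = " ger[FNR] }
' german.txt lang_en.txt > lang_de.txt

cat lang_de.txt
```

The line-count check matters because the merge is purely positional: if the two files get out of step, every identifier after that point receives the wrong text.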

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1