Scanning log file using ForEach-Object and replacing text is taking a very long time

I have a PowerShell script that scans log files and replaces text when a match is found. The lookup list currently has 500 entries, and I plan to double or triple it. The log files range from 400KB to 800MB in size.

Currently, with the code below, a 42MB file takes 29 minutes. Can anyone see a way to make this faster?

I tried replacing ForEach-Object with ForEach-ObjectFast, but that made the script take significantly longer. I also tried changing the first ForEach-Object to a for loop, but it still took ~29 minutes.

$lookupTable = @{
    'aaa:bbb:123' = 'WORDA:WORDB:NUMBER1'
    'bbb:ccc:456' = 'WORDB:WORDBC:NUMBER456'
}

Get-Content -Path $inputfile | ForEach-Object {
    $line = $_
    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
    $line
} | Set-Content -Path $outputfile


Solution 1:[1]

Since you say your input files can be up to 800MB in size, reading and updating the entire content in memory may not be feasible. The way to go, then, is a fast line-by-line method, and the fastest I know of is switch with the -File parameter:

# hardcoded here for demo purposes. 
# In real life you get/construct these from the Get-ChildItem 
# cmdlet you use to iterate the log files in the root folder..  
$inputfile  = 'D:\Test\test.txt'
$outputfile = 'D:\Test\test_new.txt'  # absolute full file path because we use .Net here

# because we are going to Append to the output file, make sure it doesn't exist yet
if (Test-Path -Path $outputfile -PathType Leaf) { Remove-Item -Path $outputfile -Force }

$lookupTable= @{
    'aaa:bbb:123'='WORDA:WORDB:NUMBER1' 
}

# create a regex string from the Keys of your lookup table,
# merging the strings with a pipe symbol (the regex 'OR').
# your Keys could contain characters that have special meaning in regex, so we need to escape those
$regexLookup  = '({0})' -f (($lookupTable.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|')

# create a StreamWriter object to write the lines to the new output file
# Note: use an ABSOLUTE full file path for this
$streamWriter = [System.IO.StreamWriter]::new($outputfile, $true)  # $true for Append

switch -Regex -File $inputfile {
    $regexLookup {
        # do the replacement using the value in the lookup table.
        # because in one line there may be multiple matches to replace
        # get a System.Text.RegularExpressions.Match object to loop through all matches
        $line  = $_
        $match = [regex]::Match($line, $regexLookup)
        while ($match.Success) { 
            # $match.Value is the literal text found in the line, i.e. the original (unescaped) key,
            # so use it as-is for the lookup and do a literal (non-regex) string replace
            $line  = $line.Replace($match.Value, $lookupTable[$match.Value])
            $match = $match.NextMatch()
        }
        $streamWriter.WriteLine($line)
    }
    default { $streamWriter.WriteLine($_) }  # write unchanged
}

# dispose of the StreamWriter object
$streamWriter.Dispose()
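
As a possible variation on the above (a sketch, not part of the original answer): the per-line Match loop could be replaced by a single Regex.Replace call with a script block as match evaluator, so each line is scanned only once. The function name Convert-LogFile and its parameter names are made up for illustration. Sorting the keys longest-first is an added assumption, so that a short key cannot shadow a longer key that starts with the same text. Whether this is actually faster on your logs would need measuring.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Convert-LogFile {
    param(
        [Parameter(Mandatory)] [string]    $InputFile,    # absolute path, because .NET is used
        [Parameter(Mandatory)] [string]    $OutputFile,   # absolute path, because .NET is used
        [Parameter(Mandatory)] [hashtable] $LookupTable
    )

    # Escape every key and put the longest keys first in the alternation.
    $escapedKeys = $LookupTable.Keys |
        Sort-Object -Property Length -Descending |
        ForEach-Object { [regex]::Escape($_) }

    # Build the regex once; it is reused for every line (case-sensitive by default).
    $regex = [regex]::new(($escapedKeys -join '|'),
                          [System.Text.RegularExpressions.RegexOptions]::Compiled)

    # The script block is called once per match and returns the replacement text.
    $evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
        param($m)
        $LookupTable[$m.Value]
    }

    $writer = [System.IO.StreamWriter]::new($OutputFile, $false)  # $false = overwrite
    try {
        # switch -File reads the input line by line; lines without a match pass through unchanged
        switch -File $InputFile {
            default { $writer.WriteLine($regex.Replace($_, $evaluator)) }
        }
    }
    finally {
        # always release the file handle, even if an error occurs
        $writer.Dispose()
    }
}
```

It would be called like this: Convert-LogFile -InputFile 'D:\Test\test.txt' -OutputFile 'D:\Test\test_new.txt' -LookupTable $lookupTable. Because the writer overwrites instead of appending, the Remove-Item step is not needed with this version.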

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
