Find and replace with PowerShell based on the next line

I'm trying to find and replace via PowerShell based on what's on the next line of the line I want to replace. For example, the following text file:

blahblah 
flimflam 
zimzam

If the line after blahblah is flimflam, replace blahblah with new stuff

Here's the code I have so far:

$reader = New-Object System.IO.StreamReader($myFile.FullName);
$FileContents=$reader.ReadToEnd()
$reader.Close()


if(the line after "blahblah" == "flimflam") #pseudo code
{
    $FileContents=$FileContents.Replace("blahblah","new stuff")
}

If the next line is anything other than flimflam, do nothing.

One idea I had was to replace "blahblah `n flimflam" with "new stuff", but I can't get it to work. I think I might be onto something with including the newline character, though.



Solution 1:[1]

  • While your use of System.IO.StreamReader works, it's generally easier to use Get-Content -Raw to read a file into memory in full, as a single, multi-line string.

    • If performance is a concern, you can still use .NET types directly, in which case [System.IO.File]::ReadAllText($myFile.FullName) is a much simpler alternative - although if there's any performance gain to be had over Get-Content -Raw at all, it is probably insignificant.

    • To specify the input file's encoding explicitly, use Get-Content -Raw -Encoding <encoding> / [System.IO.File]::ReadAllText($myFile.FullName, <encoding>).

  • The [string] type's .Replace() method is limited to literal string replacement, so advanced matching such as limiting matches to a full line is not an option.

    • Use PowerShell's regex-based -replace operator instead.

    • To prevent confusion with PowerShell's string expansion (string interpolation) in double-quoted ("...") strings, it's generally preferable to use -replace with single-quoted ('...') strings, which PowerShell treats as literals, so you can focus on regex constructs in the string.

PS> $FileContents -replace '(?m)^blahblah(?=\r?\nflimflam\r?$)', 'new stuff'
new stuff
flimflam
zimzam

  • (?m) uses inline option m (multi-line) to make anchors ^ / $ match the start / end of each line (instead of the string as a whole).

  • (?=...) is a look-ahead assertion that matches without including the matching part in the overall match, so that it doesn't get replaced.

  • \r?\n is a platform-agnostic way to match a newline sequence / character: CRLF (\r\n) on Windows, LF-only (\n) on Unix-like platforms.

  • \r?$ is used instead of just $ because, in multi-line mode, $ matches only directly before \n (or at the very end of the string); without the optional \r, a flimflam line that ends in a CRLF newline would not match.
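
Putting the pieces together, a minimal end-to-end sketch might look like the following. The temporary file name and its sample content are made up for illustration, and the pattern includes an optional \r before $ so that it also matches when the file uses CRLF line endings.

```powershell
# Hypothetical sample file, created only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'replace-demo.txt'
Set-Content -LiteralPath $path -Value 'blahblah', 'flimflam', 'zimzam'

# Read the whole file as a single, multi-line string.
$FileContents = Get-Content -LiteralPath $path -Raw

# Replace 'blahblah' only if the next line is exactly 'flimflam'.
$updated = $FileContents -replace '(?m)^blahblah(?=\r?\nflimflam\r?$)', 'new stuff'

# Write the result back; -NoNewline avoids appending an extra trailing newline.
Set-Content -LiteralPath $path -Value $updated -NoNewline

# Show the file's lines after the replacement.
Get-Content -LiteralPath $path
```

If the line after blahblah is anything other than flimflam, the pattern does not match and the file content is written back unchanged.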

Solution 2:[2]

With a RegEx with a positive lookahead you can replace the previous line without even knowing the content of that line:

(Get-Content .\SO_53398250.txt -raw) -replace "(?sm)^[^`r`n]+(?=`r?`nflimflam)","new stuff"|
 Set-Content .\SO_53398250_2.txt

See the RegEx explained on regex101.com (with different escaping `n => \n)
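
To see the look-ahead at work without touching a file, the same pattern can be applied to an in-memory string. The sample text below is made up; it also illustrates that, because the pattern has no end-of-line anchor after flimflam, a following line that merely starts with flimflam (the hypothetical flimflammer) triggers the replacement too.

```powershell
# Made-up sample text; "`n" is PowerShell's escape sequence for a line feed.
$text = "alpha`nflimflam`nbeta`nflimflammer`ngamma"

# Same pattern as above: replace any line that is directly followed
# by a line starting with 'flimflam'.
$replaced = $text -replace "(?sm)^[^`r`n]+(?=`r?`nflimflam)", 'new stuff'

# Both 'alpha' and 'beta' are replaced; the other lines are left alone.
$replaced
```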

Solution 3:[3]

This question asks for a reusable cmdlet that supports streaming as much as possible...

Replace-String

Function Replace-String {
    [CmdletBinding()][OutputType([String[]])]Param (
        [String]$Match, [String]$Replacement, [Int]$Offset = 0,
        [Parameter(ValueFromPipeLine = $True)][String[]]$InputObject
    )
    Begin {
        $Count = 0
        $Buffer = New-Object String[] ([Math]::Abs($Offset))
    }
    Process {
        $InputObject | ForEach-Object {
            If ($Offset -gt 0) {
                If ($Buffer[$Count % $Offset] -Match $Match) {$Replacement} Else {$_}
            } ElseIf ($Offset -lt 0) {
                If ($Count -ge -$Offset) {If ($_ -Match $Match) {$Replacement} Else {$Buffer[$Count % $Offset]}}
            } Else {
                If ($_ -Match $Match) {$Replacement} Else {$_}
            }
            If ($Offset) {$Buffer[$Count++ % $Offset] = $_}
        }
    }
    End {
        For ($i = 0; $i -gt $Offset; $i--) {$Buffer[$Count++ % $Offset]}
    }
}

Syntax

$InputObject | Replace-String [-Match] <String to find>
                              [-Replacement] <Replacement string to use>
                              [-Offset] <Offset relative to the matched string>

Examples:

Replace the found string:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 0
One
Two
X
Four
Five

Replace the string prior to the found string:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X -1
One
X
Three
Four
Five

Replace the second string prior to the found string:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X -2
X
Two
Three
Four
Five

Replace the string after the found string:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 1
One
Two
Three
X
Five

Replace the second string after the found string:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String Three X 2
One
Two
Three
Four
X

Replace the strings prior to the (two) strings that contain a T:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String T X -1
X
X
Three
Four
Five

Replace the strings after the (two) strings that contain a T:

'One', 'Two', 'Three', 'Four', 'Five' | Replace-String T X 1
One
Two
X
X
Five

Specific to the question:

'blahblah', 'flimflam','zimzam' | Replace-String 'flimflam' 'new stuff' -1
new stuff
flimflam
zimzam

Parameters

-InputObject <String[]> (From pipeline)
A stream of strings to match and replace

-Match <String>
The string to match in the stream.
Note that the -Match operator is used for this parameter, which means that it supports regular expressions. If the whole string needs to match, use start-of-line and end-of-line anchors, e.g.: -Match '^Three$'.

-Replacement <String>
The string to use for replacing the target(s).

-Offset <Int> = 0
The offset, relative to the matched string(s), of the string to replace. The default is 0, meaning: replace the matched string(s).

Background

A little background on the programming in this cmdlet:

  • The input processing methods (Begin {...}, Process {...}, End {...}) are used to pass the strings through the cmdlet as fast as possible and release them for the next cmdlet in the pipeline. This cmdlet is designed for the middle of a pipeline (e.g. Get-Content $myFile | Replace-String A B 1 | ...). To take advantage of the pipeline:
    • Avoid parentheses around the input (like: ($List) | Replace-String A B)
    • Avoid assigning the output (like: $Array = ... | Replace-String A B)
    • Avoid parameters that read in the whole content (such as Get-Content -Raw)
  • In both cases (replacing behind, using a negative offset, or ahead, using a positive offset), a buffer the size of the offset is required ($Buffer = New-Object String[] ([Math]::Abs($Offset)))
  • To speed up the process, the script cycles through the buffer ($Buffer[$Count % $Offset]) rather than shifting the contained items
  • If ($Count -ge -$Offset) {... holds back the first input strings (as many as the absolute offset), as it can only be determined later whether an input string needs to be replaced or not
  • In the end (End {...}), if $Offset is negative, the buffer (which contains the rest of the input strings) is released. In other words, a negative offset (e.g. -Offset -$n) will buffer $n strings and cause the output to run $n strings behind the input stream
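
The cyclic buffer idea can be illustrated in isolation. The following is a simplified sketch, not the cmdlet itself: it delays a stream by a hypothetical $lag of 2 items using the same modulo indexing, without doing any matching or replacing. The variable names and the sample input are made up for the example.

```powershell
# Simplified sketch of the cyclic buffer; $lag and the sample input are made up.
$lag    = 2
$buffer = New-Object String[] $lag
$count  = 0

$delayed = 'One', 'Two', 'Three', 'Four', 'Five' | ForEach-Object {
    # Once the buffer is full, the slot about to be overwritten
    # holds the item that was seen $lag steps ago, so emit it first.
    if ($count -ge $lag) { $buffer[$count % $lag] }
    # Overwrite that slot with the current item instead of shifting the array.
    $buffer[$count++ % $lag] = $_
}

# Flush the items still held in the buffer, oldest first.
$rest = for ($i = 0; $i -lt $lag; $i++) { $buffer[$count++ % $lag] }

# All items come out in their original order, just $lag steps late.
$delayed + $rest
```

In the cmdlet, that delay is what allows an earlier string to be swapped for the replacement once a later string turns out to match.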

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3