How to find the count of a certain string across multiple text files in PowerShell

I have multiple text files, and I need to find and count the unique words in them that contain a specific string.

For example, we need to find how many users logged in during a certain period, based on multiple log files.

I wrote the following code. It works fine for a small number of files, but it takes too long for many large files:

   $A =Get-Content C:\Users\XXXXXXX\Documents\Python\Test\*.log | ForEach-Object { $wrds=$_.Split(" "); foreach ($i in $wrds) {  Write-Output $i } } | Sort-Object | Get-Unique | select-string  -pattern "AAA" -CaseSensitive -SimpleMatch

Is it possible to tune this so it runs faster?



Solution 1:[1]

If I understand correctly, you would like to find certain user logins occurring in many log files, based on your use of Select-String.

# an array of usernames to search for
$users = 'user1', 'user2', 'userX'
# create a regex from this array by joining the values with regex 'OR' (the pipe symbol)
[regex]$regex = ($users | ForEach-Object { [regex]::Escape($_)}) -join '|'

# or, if you need whole-word matches instead of allowing partial matches, use
# [regex]$regex = '\b({0})\b' -f (($users | ForEach-Object { [regex]::Escape($_)}) -join '|')

# get a list of all log files
$logFiles = Get-ChildItem -Path 'C:\Users\XXXXXXX\Documents\Python\Test' -Filter '*.log' -File

# loop through the list of log files and find the matches in each of them
$result = foreach ($file in $logFiles) {
    $allmatches = $regex.Matches(($file | Get-Content -Raw))
    $logins = @($allmatches.Value | Select-Object -Unique)
    if ($logins.Count) {
        [PsCustomObject]@{
            LogFile    = $file.FullName
            LoginCount = $logins.Count
            Users      = $logins -join ', '
        }
    }
}

# visual output
$result | Out-GridView -Title 'Login search results'

# or save as CSV file
$result | Export-Csv -Path 'X:\somewhere\results.csv' -NoTypeInformation
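
If what you actually need is closer to the original one-liner (every distinct space-separated word containing 'AAA', across all files combined rather than per file), the following is a minimal sketch of one way to do it. It assumes PowerShell 5 or later. The function name Get-UniqueToken and its parameter names are hypothetical and chosen only for illustration. The idea is to let Select-String discard non-matching lines first, so that only matching lines are split into words, and to de-duplicate with a HashSet instead of sorting every word of every file.

```powershell
# Hypothetical helper, not part of the original answer: returns the distinct
# space-separated words containing a literal substring, across all files
# matched by -Path. Matching is case-sensitive, like the original one-liner.
function Get-UniqueToken {
    param(
        [Parameter(Mandatory)] [string] $Path,       # may contain wildcards, e.g. '...\Test\*.log'
        [Parameter(Mandatory)] [string] $Substring   # literal text to look for, e.g. 'AAA'
    )

    # The HashSet drops duplicates as words arrive; Ordinal keeps it case-sensitive
    $seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::Ordinal)

    # Select-String reads the files itself and only emits lines containing the substring
    Select-String -Path $Path -Pattern $Substring -SimpleMatch -CaseSensitive |
        ForEach-Object {
            # split only the matching lines, on single spaces as in the question
            foreach ($word in $_.Line.Split(' ')) {
                if ($word.Contains($Substring)) { [void]$seen.Add($word) }
            }
        }

    # emit the distinct words in sorted order
    $seen | Sort-Object
}

# usage (path taken from the question):
# $words = @(Get-UniqueToken -Path 'C:\Users\XXXXXXX\Documents\Python\Test\*.log' -Substring 'AAA')
# $words.Count
```

Whether this is faster on your data depends on how many lines contain the search string. The gain comes from not splitting and sorting lines that cannot match, so it should be measured on real log files, for example with Measure-Command.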

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
