Why has this PowerShell Object changed? What is the difference?
foreach ($line in $test) {
$line.GetType()
$newline = $line -split ("<.*?>") -split ("{.*?}") # remove html and css tags
$newline.GetType()
}
I came across this when trying to use the .Trim() method on $newline. It works, but the intellisense did not indicate that it would. I thought .Trim() would only work on String Objects (BaseType:System.Object), but in this instance, it seems to work on String[] Objects as well (BaseType:System.Array).
$line.GetType() returns
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
$newline.GetType() returns
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String[] System.Array
First off, I would like to know why my original string was converted to an array, assuming it's the return value of -split... Is it now an array of characters? I am a little confused.
Secondly, if there is a good answer, why do the string methods work on what is technically an array?
Coming from Python and C/C++, thanks.
Solution 1:[1]
Santiago Squarzon's helpful answer provides an effective solution to what your code is trying to do.
To answer your questions as asked, building on Lasse V. Karlsen's helpful comments:
I would like to know why my original string was converted to an array, assuming it's the return value of -split... Is it now an array of characters?
The -split operator splits a string or an array of strings into substrings by a given separator regex and returns the substrings as a string array ([string[]]), not as an array of characters:
'foo|bar' -split '\|' # -> [string[]] ('foo', 'bar')
With an array as input, the splitting operation is performed on each element separately, and the per-element result arrays are concatenated to form a single, flat array.
'foo|bar', 'baz|quux' -split '\|' # -> [string[]] ('foo', 'bar', 'baz', 'quux')
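To confirm this for the chained -split from the question, you can inspect the result and one of its elements. This is only a sketch: the sample string below is made up for illustration, and the brace is escaped in the regex for clarity.

```powershell
# Hypothetical sample input; any string containing tags and braces behaves the same.
$line = 'a<b>c{d}e'

# The first -split yields [string[]] ('a', 'c{d}e'); the second -split is applied
# to each of those elements, and the results are flattened into a single array.
$newline = $line -split '<.*?>' -split '\{.*?\}'

$newline.GetType().Name      # String[]
$newline[0].GetType().Name   # String (each element is a whole substring, not a [char])
$newline.Count               # 3
```

So $line stays a [string], while $newline is a [string[]] holding the pieces between the separators.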
Secondly, if there is a good answer, why do the string methods work on what is technically an array?
What you're seeing is a feature previously semi-officially known as member enumeration and now officially termed member-access enumeration: the ability to access a member (a property or a method) on a collection and have it implicitly applied to each of its elements, with the results collected in an array (when there are two or more elements).
- It is described in detail in this answer.
Quick example:
# .Trim() is called on *each element* of the input array.
PS> (' foo', 'bar ').Trim() | ForEach-Object { "[$_]" }
[foo]
[bar]
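As a rough sketch of what happens behind the scenes, the call on the array behaves like an explicit loop over its elements. One caveat worth knowing: enumeration only applies when the collection itself lacks the member, so a property such as Length, which arrays have themselves, is not forwarded to the elements. The variable names here are illustrative only.

```powershell
$items = ' foo', 'bar '

# Member-access enumeration: [string[]] has no Trim() method of its own,
# so the call is applied to each element.
$viaEnumeration = $items.Trim()

# The explicit equivalent.
$viaLoop = $items | ForEach-Object { $_.Trim() }

# Caveat: arrays have their own Length property, so this is the element count.
$arrayLength = $items.Length

# To get the per-element lengths, enumerate explicitly.
$elementLengths = $items | ForEach-Object { $_.Length }
```

Here $viaEnumeration and $viaLoop both contain 'foo' and 'bar', $arrayLength is 2, and $elementLengths contains 4 and 4.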
Solution 2:[2]
Lasse V. Karlsen already provided the key information for understanding why the strings ($line) become string[] after being split. What you most likely wanted to use in this case is the -replace operator, which also takes a regex.
Using below as an example:
$htmlcss = @'
table {
font-family: arial, sans-serif;
border-collapse: collapse;
width: 100%;
}
td, th {
border: 1px solid #dddddd;
text-align: left;
padding: 8px;
}
tr:nth-child(even) {
background-color: #dddddd;
}
</style>
</head>
<body>
<h2>HTML Table</h2>
<table>
<tr>
<th>Company</th>
<th>Contact</th>
<th>Country</th>
</tr>
<tr>
<td>Alfreds Futterkiste</td>
<td>Maria Anders</td>
<td>Germany</td>
</tr>
</table>
</body>
</html>
'@
Use -replace to remove the HTML tags and CSS blocks, then -split to get a string[], and finally filter the array to skip the empty lines:
$htmlcss -replace '(?s)<.*?>|\{.*?\}' -split '\r?\n' |
Where-Object { $_ -match '\S' }
Results in:
table
td, th
tr:nth-child(even)
HTML Table
Company
Contact
Country
Alfreds Futterkiste
Maria Anders
Germany
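Note, regarding \{.*?\}: for this part of the regex to work, the input must be a single string (here, a multi-line string). It will not work with a string array (string[]), because -replace is then applied to each line separately and the opening and closing braces sit on different lines. You also need the (?s) flag so that . matches newlines. If you are reading the input from a file, use the -Raw switch on Get-Content to get the content as one string.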
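The difference can be seen with a small temporary file. This is a sketch under stated assumptions: the file name and its contents are hypothetical, chosen only to show the two reading modes side by side.

```powershell
# Hypothetical sample file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-page.html'
Set-Content -Path $path -Value "p {`n  color: red;`n}`n<p>Hello</p>"

# Without -Raw: one string per line, so -replace sees each line in isolation
# and \{.*?\} never finds both braces in the same string.
$lines = Get-Content -Path $path
$fromLines = $lines -replace '(?s)<.*?>|\{.*?\}'

# With -Raw: a single multi-line string, so (?s) lets .*? span the whole block.
$raw = Get-Content -Path $path -Raw
$fromRaw = $raw -replace '(?s)<.*?>|\{.*?\}' -split '\r?\n' |
    Where-Object { $_ -match '\S' }

Remove-Item -Path $path
```

With this input, $fromLines still contains the CSS lines ('p {', '  color: red;', '}') alongside 'Hello', whereas $fromRaw contains only 'p ' and 'Hello'.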
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
