How to find all Confluence pages with PowerShell
I'm trying to pull all Confluence pages in my instance that haven't been modified since 1/1/21. I was able to get all of the parent pages that haven't been modified since 1/1/21 fairly easily. However, I'm now trying to get all of the child pages.
I know get-confluencechildpage has a -recurse option (source), but when I use it I get: Invoke-Method : Page children is currently only supported for direct children.
I've created a script that iterates through the top-level pages and checks whether each one has child pages. What I can't figure out is how to set up the do/until loop so that it walks down the tree without duplicating the child page output.
Here's what I have so far. Once I have get-confluencechildpage figured out, I can add an if based on the date modified. Can anybody point me in the right direction? Please and thank you.
$Pages = get-confluencespace -spacekey 'SPACEKEY' | get-confluencepage
$NotMod = $pages | ? { $_.version.when -lt (get-date 1/1/21) }
$full = @()
foreach ($1 in $notmod) {
    $full += get-confluencepage $1.id
    if ($1 | Get-ConfluenceChildPage) {
        $Descendents = $1 | Get-ConfluenceChildPage
        foreach ($child in $Descendents) {
            $full += $child
            do {
                $Next = 1
                $Next = $child | Get-ConfluenceChildPage
                if ($next) {
                    $full += $next
                }
            } until (
                $null -eq $Next
            )
        }
    }
}
Solution 1:[1]
I've just tested the AtlassianPS ConfluencePS module (v2.5.1) on a self-hosted Confluence instance (v7.3.5), and the Get-ConfluencePage cmdlet appears to return a flattened list of the entire document tree for a given space.
Based on that, your code would simply be:
# get all pages from a Confluence "Space"
$all_pages = Get-ConfluencePage -SpaceKey "myspacekey";
# filter all the pages to just get those last edited before a specified date
$timestamp = (get-date -Year 2021 -Month 1 -Day 1).Date;
$filtered = $all_pages | where-object { $_.Version.When -lt $timestamp };
$filtered | format-table "Id"
ID
--
17714
67261
..etc
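To sanity-check the date filter without touching a live instance, the same comparison can be tried on a few stand-in objects. This is only a minimal sketch: the sample pages below are made up and merely mimic the Id and Version.When properties used above, and it assumes PowerShell 5 or later for [datetime]::new.

```powershell
# hypothetical stand-in objects that mimic the Id and Version.When
# properties used above; these are not real ConfluencePS pages
$sample_pages = @(
    [pscustomobject]@{ Id = 1; Version = [pscustomobject]@{ When = [datetime]::new(2020, 6, 15) } },
    [pscustomobject]@{ Id = 2; Version = [pscustomobject]@{ When = [datetime]::new(2021, 3, 1) } },
    [pscustomobject]@{ Id = 3; Version = [pscustomobject]@{ When = [datetime]::new(2019, 12, 31) } }
);

# build the cutoff from numeric parts so that no date string has to be parsed
$timestamp = [datetime]::new(2021, 1, 1);

# keep only the pages last edited before the cutoff
$filtered = @($sample_pages | where-object { $_.Version.When -lt $timestamp });

$filtered | format-table "Id"
```

With this sample data, only the pages with Id 1 and 3 fall before the cutoff.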
Update
If, for whatever reason, you don't get all the pages in the Space returned from Get-ConfluencePage, you could do a depth-first search of the root pages in the tree using Get-ConfluenceChildPage:
# get root pages in the space
$rootPages = ...
# push root pages onto a stack
$stack = new-object System.Collections.ArrayList;
foreach( $rootPage in $rootPages )
{
    $null = $stack.Add($rootPage);
}
# initialise the result set
$all_pages = new-object System.Collections.ArrayList;
# while stack not empty
while( $stack.Count -gt 0 )
{
    # pop the top page off the stack
    $parent = $stack[$stack.Count - 1];
    $stack.RemoveAt($stack.Count - 1);
    # add the top page to the result set
    $null = $all_pages.Add($parent);
    # get child pages
    write-host "getting child pages for '$($parent.Title)' ($($parent.ID))";
    $children = Get-ConfluenceChildPage -PageId $parent.ID;
    write-host ($children | format-table | out-string);
    # push child pages onto the stack
    foreach( $child in $children )
    {
        $null = $stack.Add($child);
    }
}
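If the list of starting pages might already contain some of their own descendants (for example, a list filtered by date rather than true roots), the same traversal could record each page ID as it is visited so that no page is returned twice. The following is a sketch under assumptions, not part of ConfluencePS: Get-PageTree and its -GetChildren parameter are hypothetical names, and the child lookup is passed in as a script block so the traversal can be tried without a live instance.

```powershell
# hypothetical helper: walks a page tree depth-first and returns each page once.
# $GetChildren is a script block that takes a page and returns its direct children.
function Get-PageTree
{
    param(
        [object[]] $RootPages,
        [scriptblock] $GetChildren
    )

    $stack  = [System.Collections.Generic.Stack[object]]::new();
    $seen   = [System.Collections.Generic.HashSet[string]]::new();
    $result = [System.Collections.Generic.List[object]]::new();

    foreach( $rootPage in $RootPages )
    {
        $stack.Push($rootPage);
    }

    while( $stack.Count -gt 0 )
    {
        $page = $stack.Pop();

        # HashSet.Add returns $false when the ID was already recorded,
        # so a page reached a second time is skipped
        if( -not $seen.Add([string] $page.ID) )
        {
            continue;
        }
        $result.Add($page);

        # push the direct children; they are expanded on later iterations
        foreach( $child in @(& $GetChildren $page) )
        {
            if( $null -eq $child )
            {
                continue;
            }
            $stack.Push($child);
        }
    }

    return $result;
}

# try it on a made-up tree in which page D is reachable from both B and C
$tree = @{ "A" = @("B", "C"); "B" = @("D"); "C" = @("D") };
$lookup = {
    param($p)
    foreach( $id in $tree[$p.ID] ) { [pscustomobject]@{ ID = $id } }
};
$pages = @(Get-PageTree -RootPages @([pscustomobject]@{ ID = "A" }) -GetChildren $lookup);
$pages | format-table "ID"
```

Against a live instance, the script block could presumably wrap the same call used above, e.g. { param($p) Get-ConfluenceChildPage -PageId $p.ID }, with the date filter applied to the returned list afterwards.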
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |