Fortran OMP: how to do a parallel and a single task?

I am a newbie in parallel programming. This is my serial code that I would like to parallelize:

program main
  implicit none
  integer :: pr_number, i, pr_sum
  real :: pr_av
  
  pr_sum = 0
  do i = 1, 1000
    ! The following instruction is an example to simplify the problem.
    ! In the real case, it takes a long time that is more or less the same for all threads
    ! and it returns a large array
    pr_number = int(rand()*10)
    pr_sum = pr_sum + pr_number
    pr_av = (1.d0*pr_sum) / i
    print *, i, pr_av ! In the real case, writing a huge amount of data to one file
  enddo

end program main

I would like to parallelize pr_number = int(rand()*10) and to print only once every num_threads iterations. I tried many things, but none of them works. For example:

program main
  implicit none
  integer :: pr_number, i, pr_sum
  real :: pr_av
  
  pr_sum = 0
!$OMP PARALLEL DEFAULT(SHARED) PRIVATE(pr_number) SHARED(pr_sum,pr_av)
!$OMP DO REDUCTION(+:pr_sum)
  do i = 1, 1000
    pr_number = int(rand()*10)
    pr_sum = pr_sum + pr_number
!$OMP SINGLE
    pr_av = (1.d0*pr_sum) / i
    print *, i, pr_av
!$OMP END SINGLE
  enddo
!$OMP END DO
!$OMP END PARALLEL

end program main

I get an error message at compile time: work-sharing region may not be closely nested inside of work-sharing, critical or explicit task region.
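
A note on this message: SINGLE is itself a work-sharing construct, so OpenMP does not allow it inside the loop of an !$OMP DO. One possible way around this, as a minimal sketch only, is to drop the work-sharing DO, let each thread pick its own iteration by hand, and follow each round with a BARRIER and a SINGLE. The program name, the variables step, tid, n_done and nsteps, the stand-in work expression mod(7*i, 10) and the use of ATOMIC are all assumptions made for illustration and are not part of the original code. rand() is replaced because it is a GNU extension whose behavior under threads is not specified by the Fortran standard.

```fortran
program region_average
  use omp_lib
  implicit none
  integer, parameter :: n_total = 1000
  integer :: step, nsteps, nthreads, tid, i, n_done, pr_number, pr_sum
  real :: pr_av

  pr_sum = 0

!$OMP PARALLEL DEFAULT(SHARED) PRIVATE(step, tid, i, n_done, pr_number)

  ! One thread records the team size; END SINGLE has an implicit barrier,
  ! so every thread sees nthreads and nsteps afterwards.
!$OMP SINGLE
  nthreads = omp_get_num_threads()
  nsteps = (n_total + nthreads - 1) / nthreads
!$OMP END SINGLE

  tid = omp_get_thread_num()

  ! Plain (non work-sharing) loop: every thread runs all nsteps rounds.
  do step = 1, nsteps
    i = (step - 1)*nthreads + tid + 1   ! iteration handled by this thread
    if (i <= n_total) then
      pr_number = mod(7*i, 10)          ! hypothetical stand-in for the long calculation
!$OMP ATOMIC
      pr_sum = pr_sum + pr_number
    end if
    n_done = min(step*nthreads, n_total)

    ! Wait until every thread has added its contribution for this round.
!$OMP BARRIER

    ! Exactly one thread prints; the implicit barrier at END SINGLE keeps
    ! the other threads from starting the next round before the print.
!$OMP SINGLE
    pr_av = real(pr_sum) / real(n_done)
    print *, n_done, pr_av
!$OMP END SINGLE
  end do

!$OMP END PARALLEL

end program region_average
```

Compiled with OpenMP enabled (for example gfortran -fopenmp) and run with OMP_NUM_THREADS=4, this should print one line for i = 4, 8, 12, and so on. The BARRIER sits outside the if block on purpose, since every thread in the team has to reach it.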

How can I get output like this (with 4 threads, for example)?

       4   3.00000000    
       8   3.12500000    
      12   4.00000000    
      16   3.81250000    
      20   3.50000000  
      ...

I repeat, I am a beginner in parallel programming. I have read many posts on Stack Overflow, but I think I do not yet have the skill to understand them. I am working on it, but ...

Edit 1

To explain, as suggested in the comments: a do loop performs a lengthy calculation N times (N Markov chain Monte Carlo runs), and the average of all calculations so far is written to a file at each iteration. The previous average is overwritten and only the latest one is kept, so the progress can be followed. I would like to parallelize this calculation over 4 threads. This is what I have in mind, but perhaps it is not the best idea.

[Image not available: the author's sketch of the scheme described above]
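
The scheme described in Edit 1 could also be sketched with a serial outer loop over batches and a parallel inner loop: each batch runs one calculation per thread, a REDUCTION clause accumulates the sum, and the average is written once per batch by the serial code after the parallel loop. This is only a sketch under assumptions: the program name, the batch bounds i0 and i1, and the stand-in work expression mod(7*i, 10) are hypothetical, and the real calculation (which returns a large array) would need an array as the reduction variable instead of a scalar.

```fortran
program batch_average
  use omp_lib
  implicit none
  integer, parameter :: n_total = 1000
  integer :: i, i0, i1, nthreads, pr_number, pr_sum
  real :: pr_av

  nthreads = omp_get_max_threads()   ! batch size = number of available threads
  pr_sum = 0

  ! Serial outer loop: one batch of nthreads iterations per pass.
  do i0 = 1, n_total, nthreads
    i1 = min(i0 + nthreads - 1, n_total)

    ! Parallel inner loop: each thread handles (about) one iteration of the batch.
    ! The reduction adds the per-thread partial sums to the running pr_sum.
!$OMP PARALLEL DO PRIVATE(pr_number) REDUCTION(+:pr_sum)
    do i = i0, i1
      pr_number = mod(7*i, 10)       ! hypothetical stand-in for the long calculation
      pr_sum = pr_sum + pr_number
    end do
!$OMP END PARALLEL DO

    ! Back in serial code: exactly one print per batch.
    pr_av = real(pr_sum) / real(i1)
    print *, i1, pr_av
  end do

end program batch_average
```

Re-entering a parallel region for every batch adds some overhead, which is likely to be negligible when each calculation takes a long time. With 4 threads, the printed index advances in steps of 4, as in the desired output.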

Thanks for your help.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
