並行譜數值方法/使用ParaView協同處理進行視覺化
ParaView協同處理外掛允許應用程式程式碼被檢測以連線到ParaView伺服器,以執行視覺化管道。管道可以生成各種格式的影像或VTK XML並行檔案格式資料集。
使用者使用一個示例(可能解析度較低)資料集在ParaView客戶端中建立一個管道。然後,該管道使用協同處理外掛匯出為Python指令碼,該指令碼將由應用程式載入。
應用程式開發人員還需要編寫一些Fortran/C++程式碼,將應用程式的資料結構轉換為ParaView可以理解的格式。對於完全規則的網格,這是很簡單的。
第一個要求是擁有一個可以匯出Python指令碼的ParaView客戶端。截至ParaView 3.14.1,這需要從原始碼構建。 標準說明適用於一些關於構建 指令碼生成外掛的額外說明。構建完成後,第一次啟動ParaView時,轉到工具,管理外掛,並將協同處理外掛設定為在啟動時載入。載入外掛後,應該有兩個新的選單項:寫入器和協同處理
- 可能需要安裝CMake
- 可能需要構建Qt
- 可能需要關閉構建Manta外掛
以下說明適用於2012年夏季在NCSA的Forge上測試的Navier-Stokes CUDA Fortran。
|
()
|
navierstokes.cuf添加了4行並刪除了.datbin寫入。 程式碼下載
!--------------------------------------------------------------------
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution.
!
! Periodic free-slip boundary conditions and Initial conditions:
! u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
! v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution (subscript denote derivatives):
! u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*t/Re)
! v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! u_y(x,y,t)=-2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! v_x(x,y,t)=2*pi*sin(2*pi*x)sin(2*pi*y)exp(-8*pi^2*t/Re)
! omega=v_x-u_y
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! nplots = number of plots produced
! plotgap = number of timesteps inbetween plots
! Re = dimensionless Renold's number
! ReInv = 1/Re for optimization
! dt = timestep size
! dtInv = 1/dt for optimization
! tol = determines when convergences is reached
! scalemodes = 1/(Nx*Ny), scaling after preforming FFTs
! numthreads = number of CPUs used in calculation
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! time = times at which data is saved
! chg = error at each iteration
! temp = used for ordering saved omega
! .. Arrays (gpu) ..
! u_d = velocity in x direction
! uold_d = velocity in x direction at previous timestep
! v_d = velocity in y direction
! vold_d = velocity in y direction at previous timestep
! omeg_d = vorticity in real space
! omeghat_d = 2D Fourier transform of vorticity
! at next iterate
! omegoldhat_d = 2D Fourier transform of vorticity at previous
! iterate
! nloldhat_d = nonlinear term in Fourier space
! at previous iterate
! omegexact_d = taylor-green vorticity at
! at final step
! psihat_d = 2D Fourier transform of streamfunction
! at next iteration
! omegcheck_d = store of vorticity at previous iterate
! temp1_d/temp2_d = reusable complex/real space used for
! calculations. This reduces number of
! arrays stored.
! .. Vectors (gpu) ..
! kx_d = fourier frequencies in x direction
! ky_d = fourier frequencies in y direction
! x_d = x locations
! y_d = y locations
! name_config = array to store filename for data to be saved
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been fully optimized.
!--------------------------------------------------------------------
module precision
! Precision control
integer, parameter, public :: Single = kind(0.0) ! Single precision
integer, parameter, public :: Double = kind(0.0d0) ! Double precision
integer, parameter, public :: fp_kind = Double
!integer, parameter, public :: fp_kind = Single
end module precision
module cufft
integer, public :: CUFFT_FORWARD = -1
integer, public :: CUFFT_INVERSE = 1
integer, public :: CUFFT_R2C = Z'2a' ! Real to Complex (interleaved)
integer, public :: CUFFT_C2R = Z'2c' ! Complex (interleaved) to Real
integer, public :: CUFFT_C2C = Z'29' ! Complex to Complex, interleaved
integer, public :: CUFFT_D2Z = Z'6a' ! Double to Double-Complex
integer, public :: CUFFT_Z2D = Z'6c' ! Double-Complex to Double
integer, public :: CUFFT_Z2Z = Z'69' ! Double-Complex to Double-Complex
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
! cufftPlan2d(cufftHandle *plan, int nx,int ny, cufftType type)
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
interface cufftPlan2d
subroutine cufftPlan2d(plan, nx, ny, type) bind(C,name='cufftPlan2d')
use iso_c_binding
integer(c_int):: plan
integer(c_int),value:: nx, ny, type
end subroutine cufftPlan2d
end interface
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
! cufftDestroy(cufftHandle plan)
!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
interface cufftDestroy
subroutine cufftDestroy(plan) bind(C,name='cufftDestroy')
use iso_c_binding
integer(c_int),value:: plan
end subroutine cufftDestroy
end interface
interface cufftExecD2Z
subroutine cufftExecD2Z(plan, idata, odata) &
& bind(C,name='cufftExecD2Z')
use iso_c_binding
use precision
integer(c_int), value :: plan
real(fp_kind), device :: idata(1:nx,1:ny)
complex(fp_kind),device :: odata(1:nx,1:ny)
end subroutine cufftExecD2Z
end interface
interface cufftExecZ2D
subroutine cufftExecZ2D(plan, idata, odata) &
& bind(C,name='cufftExecZ2D')
use iso_c_binding
use precision
integer(c_int),value:: plan
complex(fp_kind),device:: idata(1:nx,1:ny)
real(fp_kind),device :: odata(1:nx,1:ny)
end subroutine cufftExecZ2D
end interface
end module cufft
PROGRAM main
use precision
use cufft
! coprocessing:
use ns2dcnadaptor
! declare variables
IMPLICIT NONE
INTEGER(kind=4), PARAMETER :: Nx=1024
INTEGER(kind=4), PARAMETER :: Ny=1024
!INTEGER(kind=8) ::
REAL(fp_kind), PARAMETER :: dt=0.00025d0
REAL(fp_kind), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(fp_kind), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(fp_kind), PARAMETER :: Re=5000.0d0
REAL(fp_kind), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(fp_kind), PARAMETER :: tol=0.1d0**10
!coprocessing: temp is used as an int, runtime crashes on
! floating temp used in write format string on line 376
!REAL(fp_kind) :: scalemodes,chg,temp=10000000
real(fp_kind) :: scalemodes, chg
integer(kind=4) :: temp=100000000
INTEGER(kind=4), PARAMETER :: nplots=400, plotgap=200
REAL(fp_kind),DIMENSION(:), ALLOCATABLE :: x,y
REAL(fp_kind),DIMENSION(:,:), ALLOCATABLE :: omeg,omegexact
INTEGER(kind=4) :: i,j,n,t, AllocateStatus
INTEGER(kind=4) :: planz2d,pland2z, kersize
!variables used for saving data and timing
INTEGER(kind=4) :: start, finish, count_rate,count, iol
CHARACTER*100 :: name_config
! Declare variables for GPU
REAL(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE :: u_d,v_d,&
omegcheck_d,omeg_d,&
nl_d, temp2_d, omegexact_d
COMPLEX(fp_kind), DEVICE, DIMENSION(:,:), ALLOCATABLE :: omegoldhat_d, nloldhat_d,&
omeghat_d, nlhat_d, psihat_d,&
temp1_d
COMPLEX(fp_kind), DEVICE, DIMENSION(:), ALLOCATABLE :: kx_d,ky_d
REAL(kind=8),DEVICE, DIMENSION(:), ALLOCATABLE :: x_d,y_d
kersize=min(Nx,256)
PRINT *,'Program starting'
PRINT *,'Grid:',Nx,'X',Ny
PRINT *,'dt:',dt
ALLOCATE(x(1:Nx),y(1:Ny),omeg(1:Nx,1:Ny),omegexact(1:Nx,1:Ny),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Allocated CPU arrays'
ALLOCATE(kx_d(1:Nx/2+1),ky_d(1:Ny),x_d(1:Nx),y_d(1:Ny),u_d(1:Nx,1:Ny),&
v_d(1:Nx,1:Ny),omeg_d(1:Nx,1:Ny),&
omegexact_d(1:Nx,1:Ny),temp2_d(1:Nx,1:Ny),&
omegoldhat_d(1:Nx/2+1,1:Ny),nloldhat_d(1:Nx/2+1,1:Ny),&
omegcheck_d(1:Nx,1:Ny),omeghat_d(1:Nx/2+1,1:Ny),nl_d(1:Nx,1:Ny),&
nlhat_d(1:Nx/2+1,1:Ny), psihat_d(1:Nx/2+1,1:Ny),temp1_d(1:Nx/2+1,1:Ny),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Allocated GPU arrays'
CALL cufftPlan2D(pland2z,nx,ny,CUFFT_D2Z)
CALL cufftPlan2D(planz2d,nx,ny,CUFFT_Z2D)
PRINT *,'Setup FFTs'
! setup fourier frequencies
!$cuf kernel do <<< *,* >>>
DO i=1,Nx/2+1
kx_d(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind=fp_kind)
END DO
kx_d(1+Nx/2)=0.0d0
!$cuf kernel do <<< *,* >>>
DO i=1,Nx
x_d(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind=fp_kind)
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny/2+1
ky_d(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind=fp_kind)
END DO
ky_d(1+Ny/2)=0.0d0
!$cuf kernel do <<< *,* >>>
DO j = 1,Ny/2 -1
ky_d(j+1+Ny/2)=-ky_d(1-j+Ny/2)
END DO
!$cuf kernel do <<< *, * >>>
DO j=1,Ny
y_d(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind=fp_kind)
END DO
scalemodes=1.0d0/REAL(Nx*Ny,kind=fp_kind)
PRINT *,'Setup grid and fourier frequencies'
!initial data
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
u_d(i,j)=sin(2.0d0*pi*x_d(i))*cos(2.0d0*pi*y_d(j))
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
v_d(i,j)=-cos(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
omeg_d(i,j)=4.0d0*pi*sin(2.0d0*pi*x_d(i))*sin(2.0d0*pi*y_d(j))+0.01d0*cos(2.0d0*pi*y_d(j))
END DO
END DO
CALL cufftExecD2Z(pland2z,omeg_d,omeghat_d)
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!first part nonlinear term in real space
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
END DO
END DO
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do <<< *,* >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
END DO
END DO
CALL cufftExecD2Z(pland2z,nl_d,nlhat_d)
omegcheck_d=omeg_d
PRINT *,'Got initial data, starting timestepping'
CALL system_clock(start,count_rate)
! coprocessing: simulation loop
! coprocessing: before entering, initialze the ParaView coprocessor
! coprocessing: this is from FortranAdaptorAPI.cxx
! coprocessing: it takes the name of the pipeline script you created
! coprocessing: in the ParaView client and the name's length in chars
call coprocessorinitialize("pipeline.py", 11 )
DO t=1,nplots
DO n=1,plotgap
chg=1.0d0
nloldhat_d=nlhat_d
omegoldhat_d=omeghat_d
DO WHILE (chg>tol)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
omeghat_d(i,j)=((dtInv+0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))&
*omegoldhat_d(i,j) - 0.5d0*(nloldhat_d(i,j)+nlhat_d(i,j)))&
/(dtInv-0.5d0*ReInv*(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)))
END DO
END DO
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
psihat_d(i,j)=-omeghat_d(i,j)/(kx_d(i)*kx_d(i)+ky_d(j)*ky_d(j)+0.10d0**14)
END DO
END DO
CALL cufftExecZ2D(planz2d,omeghat_d,omeg_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=-psihat_d(i,j)*kx_d(i)*scalemodes
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,v_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=psihat_d(i,j)*ky_d(j)*scalemodes
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,u_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*kx_d(i)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=u_d(i,j)*temp2_d(i,j)
END DO
END DO
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx/2+1
temp1_d(i,j)=omeghat_d(i,j)*ky_d(j)
END DO
END DO
CALL cufftExecZ2D(planz2d,temp1_d,temp2_d)
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
nl_d(i,j)=(nl_d(i,j)+v_d(i,j)*temp2_d(i,j))*scalemodes
END DO
END DO
CALL cufftExecD2Z(pland2z,nl_d,nlhat_d)
chg=0.0d0
!$cuf kernel do(2) <<< (2,*), (kersize,1) >>>
DO j=1,Ny
DO i=1,Nx
chg=chg+(omeg_d(i,j)-omegcheck_d(i,j))*(omeg_d(i,j)-omegcheck_d(i,j))&
*scalemodes*scalemodes
END DO
END DO
omegcheck_d=omeg_d
END DO
END DO
PRINT *, t*plotgap*dt
omeg=omeg_d
!temp=temp+1
!write(name_config,'(a,i0,a)') 'omega',temp,'.datbin'
!INQUIRE(iolength=iol) omeg(1,1)
!OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
!count = 1 !coprocessing - there's an intrinsic array function this shadows
!DO j=1,Ny
! DO i=1,Nx
! WRITE(11,rec=count) omeg(i,j)*scalemodes
! count=count+1
! END DO
!END DO
!CLOSE(11)
! coprocessing: do the coprocessing at the end of the sim loop
! grid dims, time step, time, scalar array
call navierStokesCoprocessor(Nx, Ny, 1, t, t*dt ,omeg)
END DO
! coprocessing: clean up
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),&
'for Time stepping'
! Copy results back to host
x=x_d
y=y_d
!!!!!!!!!!!!!!!!!!!!!!!!
!copy over data to disk!
!!!!!!!!!!!!!!!!!!!!!!!!
name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
!!!!!!!!!!!!!!!!!!!!!!!!
CALL cufftDestroy(planz2d)
CALL cufftDestroy(pland2z)
PRINT *,'Destroyed fft plan'
DEALLOCATE(x,y,omeg,omegexact,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Deallocated CPU memory'
DEALLOCATE(kx_d,ky_d,x_d,y_d,u_d,&
v_d, omegcheck_d, omeg_d,omegoldhat_d,&
omegexact_d, nloldhat_d,omeghat_d,nl_d, nlhat_d,&
temp1_d,temp2_d, psihat_d,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Deallocated GPU memory'
PRINT *,'Program execution complete'
END PROGRAM main
|
()
|
! fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! subroutine determines if coprocessing needed during the
! current simulation step or not.
module ns2dcnadaptor
implicit none
public
contains
subroutine navierstokescoprocessor(nx, ny, nz, step, time, omeg)
! nx, ny, nz -- grid dimensions, nz will be 1 since this is a 2D problem.
! step -- current simulation time step
! time -- current simulation time
! omega -- scalar array for the current time step
integer, intent(in) :: nx, ny, nz, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:), intent (in) :: omeg
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag .ne. 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag .ne. 0) then
! grid needed
! defined in ns2dcnVTKDataSet.cxx
call createcpimagedata(nx, ny, nz)
end if
! defined in ns2dcnVTKDataSet.cxx
call addfield(omeg)
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine navierstokescoprocessor
end module ns2dcnadaptor
|
()
|
定義將模擬資料放入ParaView/VTK庫可以理解的格式的函式。對於完全規則的網格來說非常簡單。擴充套件到3D很簡單。非結構化網格將需要更多努力。 程式碼下載
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// ParaView-3.14.1-Source/CoProcessing/CoProcessor/
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
// ParaView-3.14.1-Source/VTK/Filtering/vtkImageData.h
#include "vtkImageData.h"
// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 2-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkImageData* Grid = vtkImageData::New();
// assuming dimZ == 1 for now
Grid->SetDimensions(*nx, *ny, *nz);
// Setting the Origin and Spacing are also options.
// Name should be consistent between here, Fortran and Python client script.
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(Grid);
}
// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK probably assumes row major, but
// omeg probably passed column major...
// by hand name mangling for fortran
extern "C" void addfield_(double* scalars)
{
vtkCPInputDataDescription *idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkImageData* Image = vtkImageData::SafeDownCast(idd->GetGrid());
if (!Image) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
// field name must match that in the fortran code.
if (idd->IsFieldNeeded("omeg")) {
vtkDoubleArray* omega = vtkDoubleArray::New();
omega->SetName("omeg");
omega->SetArray(scalars, Image->GetNumberOfPoints(), 1);
Image->GetPointData()->AddArray(omega);
omega->Delete();
}
}
指令碼管道是透過使用協同處理指令碼生成器外掛在ParaView客戶端中生成的。
- 檔案 > 開啟 > 瀏覽到資料集,該檔案現在應該在管道瀏覽器視窗中可見。
- 在屬性視窗中,單擊應用,資料現在應該出現在佈局視窗中。
- 要更改顏色對映,請單擊工具欄上帶有綠色圓圈的彩虹按鈕。
- 要在佈局中顯示顏色對映,請單擊工具欄上垂直彩虹的按鈕。
- 在工具 > 鎖定檢視大小自定義中設定視窗的大小。這將幫助你為最終影像排列專案。記下尺寸,稍後你需要它們來解決一個錯誤。
- 如果要寫出VTK資料集,請轉到寫入器 > 並行影像資料寫入器,這應該會在管道中新增一個寫入器。
- 要匯出python管道,請轉到協同處理 > 匯出狀態,這應該會啟動一個嚮導。
- 單擊下一步。
- 選擇你想要連線到你的模擬的資料集,新增它,單擊下一步。
- 預設的模擬名稱為“輸入”。如果你更改它,請務必更新ns2dcnVTKDataSet.cxx,單擊下一步。
- 檢查是否要輸出影像。單擊完成。
- 儲存檔案。檔名將在navierstokes.cuf中引用,因此請儲存為類似pipeline-test-??.py的名稱,然後將其複製到pipeline.py,以避免為新的管道重新編譯。
如果管道設定為輸出影像,那麼指令碼中可能存在兩個問題,可以視為錯誤。
- 無論在客戶端中設定了什麼視口大小,指令碼都不會使用它。
- 顏色對映標籤將以一個包含它們的框繪製。
第一個問題的解決方法是在客戶端中,向ViewSize方法呼叫新增與視口視窗尺寸相同的引數。第二個問題只需要簡單的編輯。這兩個問題都在示例pipeline.py中用#mvm註釋標記了。
Forge 要求載入 ParaView OSMesa 模組。$ module load paraview/3.14.1-OSMesa。在 Nautilus 上,該模組為 paraview/3.14.1。
按照執行未經檢測版本的程式的方式執行。
有許多方法可以做到這一點,如果 ffmpeg 可用,則可以使用 ffmpeg -sameq -i path/to/images/image%04d.png myanimation.mpg。
或者,如果將 VTK 資料集寫入檔案,則可以直接在 ParaView 中使用它們。
流暢影片約為 30 fps 或 dt = 0.033… 如果將較大 dt 寫入 VTK 資料集,則可以在 ParaView 中進行線性時間插值。如果將較大 dt 寫入影像並將其拼接成 mpg,則可用選項會大大降低影片質量。
- Forge 沒有 X 支援,因此需要構建離屏 Mesa (OSMesa)。
- Nautilus 支援 X,因此可以使用提供的 OpenGL 庫進行渲染。
自 ParaView 3.14.1 以來,協同處理 API 有一些細微的更改。以下內容適用於 ParaView 3.98+ 和 Catalyst 1.0 alpha+。這些更改已於 2013 年 8 月 20 日在 NICS 的 Beacon 上透過 offload 模式進行了測試。
程式碼更改如下:
- 在模擬程式碼中,coprocessorinitialize 現在為 coprocessorinitializewithpython,引數相同。
- 在 C++ 資料集輔助程式碼中,包含 vtkCPPythonAdaptorAPI.h,而不是 FortranAdaptorAPI.h。
- 在 C++ 資料集輔助程式碼中,從 ParaViewCoprocessing 名稱空間更改為 vtkCPPythonAdaptorAPI 名稱空間。
要連結的庫也已更改,對於 ParaView 4.0.1 版本,它們為:
- -lvtkPVCatalystPython26D-pv4.0
- -lvtkPVCatalyst-pv4.0
- -lvtkCommonCore-pv4.0
- -lvtkCommonDataModel-pv4.0
- -lvtkPVPythonCatalyst-pv4.0
|
()
|
!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 2D incompressible Navier-Stokes
! on a Square Domain [0,1]x[0,1] using pseudo-spectral methods and
! Crank-Nicolson timestepping. The numerical solution is compared to
! the exact Taylor-Green Vortex Solution.
!
! Periodic free-slip boundary conditions and Initial conditions:
! u(x,y,0)=sin(2*pi*x)cos(2*pi*y)
! v(x,y,0)=-cos(2*pi*x)sin(2*pi*y)
! Analytical Solution:
! u(x,y,t)=sin(2*pi*x)cos(2*pi*y)exp(-8*pi^2*nu*t)
! v(x,y,t)=-cos(2*pi*x)sin(2*pi*y)exp(-8*pi^2*nu*t)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! mu = viscosity
! rho = density
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! planfy = Forward 1d fft plan in y
! planby = Backward 1d fft plan in y
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! uold = velocity in x direction at previous timestep
! v = velocity in y direction
! vold = velocity in y direction at previous timestep
! u_y = y derivative of velocity in x direction
! v_x = x derivative of velocity in y direction
! omeg = vorticity in real space
! omegold = vorticity in real space at previous
! iterate
! omegcheck = store of vorticity at previous iterate
! omegoldhat = 2D Fourier transform of vorticity at previous
! iterate
! omegoldhat_x = x-derivative of vorticity in Fourier space
! at previous iterate
! omegold_x = x-derivative of vorticity in real space
! at previous iterate
! omegoldhat_y = y-derivative of vorticity in Fourier space
! at previous iterate
! omegold_y = y-derivative of vorticity in real space
! at previous iterate
! nlold = nonlinear term in real space
! at previous iterate
! nloldhat = nonlinear term in Fourier space
! at previous iterate
! omeghat = 2D Fourier transform of vorticity
! at next iterate
! omeghat_x = x-derivative of vorticity in Fourier space
! at next timestep
! omeghat_y = y-derivative of vorticity in Fourier space
! at next timestep
! omeg_x = x-derivative of vorticity in real space
! at next timestep
! omeg_y = y-derivative of vorticity in real space
! at next timestep
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kxx = square of fourier frequencies in x direction
! kyy = square of fourier frequencies in y direction
! x = x locations
! y = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfx = array to setup x Fourier transform
! fftbx = array to setup y Fourier transform
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)
! declare variables
PROGRAM main
use nsadaptor_glue
IMPLICIT NONE
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
REAL(kind=8), PARAMETER :: dt=0.00125
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510
REAL(kind=8), PARAMETER :: rho=1.0d0
REAL(kind=8), PARAMETER :: mu=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8) :: chg
INTEGER(kind=4), PARAMETER :: nplots=50
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: time
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx,kxx
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: ky,kyy
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: y
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE :: &
u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg,&
omegoldhat, omegoldhat_x, omegold_x,&
omegoldhat_y, omegold_y, nlold, nloldhat,&
omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y
! added for coprocessing
real(kind=8), dimension(:,:), allocatable :: dump
REAL(kind=8),DIMENSION(:,:), ALLOCATABLE :: uexact_y,vexact_x,omegexact
INTEGER(kind=4) :: i,j,k,n, allocatestatus, count, iol
INTEGER(kind=4) :: start, finish, count_rate
INTEGER(kind=4), PARAMETER :: FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32, &
FFTW_ESTIMATE = 64
INTEGER(kind=4),PARAMETER :: FFTW_FORWARD = -1, FFTW_BACKWARD=1
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE :: fftfx,fftbx
INTEGER(kind=8) :: planfxy,planbxy
CHARACTER*100 :: name_config
CALL system_clock(start,count_rate)
ALLOCATE(time(1:nplots),kx(1:Nx),kxx(1:Nx),ky(1:Ny),kyy(1:Ny),x(1:Nx),y(1:Ny),&
u(1:Nx,1:Ny),uold(1:Nx,1:Ny),v(1:Nx,1:Ny),vold(1:Nx,1:Ny),u_y(1:Nx,1:Ny),&
v_x(1:Nx,1:Ny),omegold(1:Nx,1:Ny),omegcheck(1:Nx,1:Ny), omeg(1:Nx,1:Ny),&
omegoldhat(1:Nx,1:Ny),omegoldhat_x(1:Nx,1:Ny), omegold_x(1:Nx,1:Ny), &
omegoldhat_y(1:Nx,1:Ny),omegold_y(1:Nx,1:Ny), nlold(1:Nx,1:Ny), nloldhat(1:Nx,1:Ny),&
omeghat(1:Nx,1:Ny), omeghat_x(1:Nx,1:Ny), omeghat_y(1:Nx,1:Ny), omeg_x(1:Nx,1:Ny),&
omeg_y(1:Nx,1:Ny), nl(1:Nx,1:Ny), nlhat(1:Nx,1:Ny), psihat(1:Nx,1:Ny), &
psihat_x(1:Nx,1:Ny), psi_x(1:Nx,1:Ny), psihat_y(1:Nx,1:Ny), psi_y(1:Nx,1:Ny),&
uexact_y(1:Nx,1:Ny), vexact_x(1:Nx,1:Ny), omegexact(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
fftbx(1:Nx,1:Ny),dump(1:Nx,1:Ny),stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'allocated space'
! set up ffts
CALL dfftw_plan_dft_2d_(planfxy,Nx,Ny,fftfx(1:Nx,1:Ny),fftbx(1:Nx,1:Ny),&
FFTW_FORWARD,FFTW_EXHAUSTIVE)
CALL dfftw_plan_dft_2d_(planbxy,Nx,Ny,fftbx(1:Nx,1:Ny),fftfx(1:Nx,1:Ny),&
FFTW_BACKWARD,FFTW_EXHAUSTIVE)
! setup fourier frequencies in x-direction
DO i=1,1+Nx/2
kx(i)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
kxx(i)=kx(i)*kx(i)
END DO
DO i=1,Nx
x(i)=REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0))
END DO
! setup fourier frequencies in y-direction
DO j=1,1+Ny/2
ky(j)= 2.0d0*pi*cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
DO j=1,Ny
kyy(j)=ky(j)*ky(j)
END DO
DO j=1,Ny
y(j)=REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0))
END DO
PRINT *,'Setup grid and fourier frequencies'
DO j=1,Ny
DO i=1,Nx
u(i,j)=sin(2.0d0*pi*x(i))*cos(2.0d0*pi*y(j))
v(i,j)=-cos(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
u_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
v_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))
omeg(i,j)=v_x(i,j)-u_y(i,j)
END DO
END DO
! Vorticity to Fourier Space
CALL dfftw_execute_dft_(planfxy,omeg(1:Nx,1:Ny),omeghat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!Initial nonlinear term !!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}_x^{n,k}
DO j=1,Ny
omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx)
END DO
! obtain \hat{\omega}_y^{n,k}
DO i=1,Nx
omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
END DO
! convert to real space
CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny))
CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
! compute nonlinear term in real space
DO j=1,Ny
nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
time(1)=0.0d0
PRINT *,'Got initial data, starting timestepping'
! New Coprocessor API:
call coprocessorinitializewithpython("pipeline.py", 11)
DO n=1,nplots
chg=1
! save old values
uold(1:Nx,1:Ny)=u(1:Nx,1:Ny)
vold(1:Nx,1:Ny)=v(1:Nx,1:Ny)
omegold(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)
omegcheck(1:Nx,1:Ny)=omeg(1:Nx,1:Ny)
omegoldhat(1:Nx,1:Ny)=omeghat(1:Nx,1:Ny)
nloldhat(1:Nx,1:Ny)=nlhat(1:Nx,1:Ny)
DO WHILE (chg>tol)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!nonlinear fixed (n,k+1)!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}_x^{n+1,k}
DO j=1,Ny
omeghat_x(1:Nx,j)=omeghat(1:Nx,j)*kx(1:Nx)
END DO
! obtain \hat{\omega}_y^{n+1,k}
DO i=1,Nx
omeghat_y(i,1:Ny)=omeghat(i,1:Ny)*ky(1:Ny)
END DO
! convert back to real space
CALL dfftw_execute_dft_(planbxy,omeghat_x(1:Nx,1:Ny),omeg_x(1:Nx,1:Ny))
CALL dfftw_execute_dft_(planbxy,omeghat_y(1:Nx,1:Ny),omeg_y(1:Nx,1:Ny))
! calculate nonlinear term in real space
DO j=1,Ny
nl(1:Nx,j)=u(1:Nx,j)*omeg_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))+&
v(1:Nx,j)*omeg_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! convert back to fourier
CALL dfftw_execute_dft_(planfxy,nl(1:Nx,1:Ny),nlhat(1:Nx,1:Ny))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! obtain \hat{\omega}^{n+1,k+1} with Crank Nicolson timestepping
DO j=1,Ny
omeghat(1:Nx,j)=( (1.0d0/dt+0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))&
*omegoldhat(1:Nx,j) - 0.5d0*(nloldhat(1:Nx,j)+nlhat(1:Nx,j)))/&
(1.0d0/dt-0.5d0*(mu/rho)*(kxx(1:Nx)+kyy(j)))
END DO
! calculate \hat{\psi}^{n+1,k+1}
DO j=1,Ny
psihat(1:Nx,j)=-omeghat(1:Nx,j)/(kxx(1:Nx)+kyy(j))
END DO
psihat(1,1)=0.0d0
psihat(Nx/2+1,Ny/2+1)=0.0d0
psihat(Nx/2+1,1)=0.0d0
psihat(1,Ny/2+1)=0.0d0
! obtain \psi_x^{n+1,k+1} and \psi_y^{n+1,k+1}
DO j=1,Ny
psihat_x(1:Nx,j)=psihat(1:Nx,j)*kx(1:Nx)
END DO
CALL dfftw_execute_dft_(planbxy,psihat_x(1:Nx,1:Ny),psi_x(1:Nx,1:Ny))
DO i=1,Nx
psihat_y(i,1:Ny)=psihat(i,1:Ny)*ky(1:Ny)
END DO
CALL dfftw_execute_dft_(planbxy,psihat_y(1:Ny,1:Ny),psi_y(1:Ny,1:Ny))
DO j=1,Ny
psi_x(1:Nx,j)=psi_x(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
psi_y(1:Nx,j)=psi_y(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! obtain \omega^{n+1,k+1}
CALL dfftw_execute_dft_(planbxy,omeghat(1:Nx,1:Ny),omeg(1:Nx,1:Ny))
DO j=1,Ny
omeg(1:Nx,j)=omeg(1:Nx,j)/REAL(Nx*Ny,kind(0d0))
END DO
! obtain u^{n+1,k+1} and v^{n+1,k+1} using stream function (\psi) in real space
DO j=1,Ny
u(1:Nx,j)=psi_y(1:Nx,j)
v(1:Nx,j)=-psi_x(1:Nx,j)
END DO
! check for convergence
chg=maxval(abs(omeg-omegcheck))
! saves {n+1,k+1} to {n,k} for next iteration
omegcheck=omeg
END DO
time(n+1)=time(n)+dt
PRINT *,'TIME ',time(n+1)
! call adaptor here
! not sure if there's a way to cleanly pass complex to C++ code
do j = 1, Ny
do i = 1, Nx
dump(i,j) = real(omeg(i,j), kind(0d0))
end do
end do
call nsadaptor(Nx, Ny, 1, n, time(n), dump)
END DO
call coprocessorfinalize()
DO j=1,Ny
DO i=1,Nx
uexact_y(i,j)=-2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
exp(-8.0d0*mu*(pi**2)*nplots*dt)
vexact_x(i,j)=2.0d0*pi*sin(2.0d0*pi*x(i))*sin(2.0d0*pi*y(j))*&
exp(-8.0d0*mu*(pi**2)*nplots*dt)
omegexact(i,j)=vexact_x(i,j)-uexact_y(i,j)
END DO
END DO
name_config = 'omegafinal.datbin'
INQUIRE(iolength=iol) omegexact(1,1)
OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
count = 1
DO j=1,Ny
DO i=1,Nx
WRITE(11,rec=count) REAL(omeg(i,j),KIND(0d0))
count=count+1
END DO
END DO
CLOSE(11)
name_config = 'omegaexactfinal.datbin'
OPEN(unit=11,FILE=name_config,form="unformatted", access="direct",recl=iol)
count = 1
DO j=1,Ny
DO i=1,Nx
WRITE(11,rec=count) omegexact(i,j)
count=count+1
END DO
END DO
CLOSE(11)
name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
CALL dfftw_destroy_plan_(planfxy)
CALL dfftw_destroy_plan_(planbxy)
CALL dfftw_cleanup_()
DEALLOCATE(time,kx,kxx,ky,kyy,x,y,&
u,uold,v,vold,u_y,v_x,omegold, omegcheck, omeg, &
omegoldhat, omegoldhat_x, omegold_x,&
omegoldhat_y, omegold_y, nlold, nloldhat,&
omeghat, omeghat_x, omeghat_y, omeg_x, omeg_y,&
nl, nlhat, psihat, psihat_x, psi_x, psihat_y, psi_y,&
uexact_y,vexact_x,omegexact, &
fftfx,fftbx,stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Program execution complete'
END PROGRAM main
|
()
|
Fortran 介面卡:NSadaptor.f90 -- 沒有 API 更改,出於完整性考慮,將其包含在內。 程式碼下載
! Fortran module for interacting with the ParaView/Catalyst coprocessor
module nsadaptor_glue
implicit none
public
contains
subroutine nsadaptor(nx, ny, nz, step, time, scalar)
! nx, ny, nz -- grid dimensions
! step -- current time step
! time -- current time
! scalar -- the data
integer, intent(in) :: nx, ny, nz, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:), intent(in) :: scalar
integer :: flag
! check if coprocessing this step
! defined in adaptor header in ParaView src
call requestdatadescription(step, time, flag)
if (flag .ne. 0) then
! coprocessing requested
! check if grid exists
! also from ParaView adaptor header
call needtocreategrid(flag)
if (flag .ne. 0) then
! grid needed
! defined by application developer
call createcpimagedata(nx, ny, nz)
end if
! also defined by application dev
call addfield(scalar, "test")
! from ParaView adaptor header
call coprocess()
end if
return
end subroutine nsadaptor
end module nsadaptor_glue
|
()
|
C++ VTK 資料輔助程式:PointBasedDataSet.cxx -- 更改包括新的標頭和新的名稱空間。還要注意 vtkSmartPointer 的使用,而不是原始指標處理。 程式碼下載
// VTK Data set creator for ParaView/Catalyst Coprocessing.
// Interfaces with a Fortran adaptor to get data from a Fortran
// simulation.
// Fortran specific header for API called from Fortran glue module
#include "vtkCPPythonAdaptorAPI.h"
// Adaptor specific headers, will need to change if simulation data structure
// changes. This should available from the ParaView install include directory
// if you chose to install development files during configuration.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
#include <string>
#include <iostream>
// Names are mangled with trailing underscore for linker visibility, along with
// using extern "C"
// Creates the vtkDataSet container for the CoProcessor. The simulation data
// must be compatible.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz)
{
if (!vtkCPPythonAdaptorAPI::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access coprocessor data.");
return;
}
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// Note: expects Fortran indexing
img->SetExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);
vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
}
// Add data to the container.
extern "C" void addfield_(double* simdata, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd =
vtkCPPythonAdaptorAPI::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
std::string fieldPythonName = name;
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(simdata, img->GetNumberOfPoints(), 1);
img->GetPointData()->AddArray(field);
}
}
以下說明是在 NICS 的 Nautilus 上使用 NS3D-MPI、NavierStokes3DfftIMR.f90、KgSemiImp3D.f90 和 NSLsplitting.f90 時得出的。
在使用單個 CPU 節點(非分割槽資料)時,可以將模擬陣列直接傳遞給 ParaView,並使用 vtkDataSet::GetPointData()->AddArray() 方法將其新增到管道中。然後,vtkImageData 的點將與模擬的網格節點完全對應。該 vtkImageData 將具有隱式細胞結構,每個細胞角點都是模擬節點之一。
如果沒有針對 MPI 的特殊更改,此程式碼將執行,但是 ParaView 將無法建立角點位於不同秩上的細胞。載入 .pvti 檔案時會顯示錯誤,指出範圍不正確。可以載入各個 .vtis 檔案,檢視實際情況確實是這樣的,以及細胞之間是否存在間隙。
例如,假設陣列是 dimension(20,10,10),並且已分解為 2 個鉛筆 (1:10,:,:) 和 (11:20,:,:)。ParaView 認為 (10:11,:,:) 的資料丟失,因此會出現間隙。
光暈提供了一種填補間隙的方法。2decomp&fft 光暈 api 建立包含鉛筆資料和周圍光暈元素的陣列。例如,假設鉛筆為 (10:20,10:20,:), 則光暈陣列為 (9:21,9:21,:). 使用光暈會帶來一些問題:
- 光暈元素可能會超出計算域,這些元素將包含垃圾資料。
- 並非每個鉛筆都需要將相同的光暈元素髮送給協同處理器。這需要在程式中新增額外的決策邏輯。
- 從 C++ 的角度來看,光暈陣列部分不會緊密打包,因此會包含垃圾資料。
|
()
|
- 添加了光暈陣列。
- 添加了用於將打包陣列傳遞給 C++ 的一維陣列。
- 根據鉛筆範圍添加了用於判斷要傳遞哪些光暈元素的分支。
- 如果鉛筆距離索引原點最遠,則僅傳送鉛筆,不傳送光暈元素。
- 否則,如果鉛筆位於最遠範圍的任一側,但不位於兩側,則傳送其他範圍中最遠的光暈元素。
- 否則,傳送兩側最遠的光暈元素。
- 添加了對光暈區域的重塑。陣列透過引用從 F90 傳遞到 C++。C++ 不理解 F90 陣列區域,並且只會從引用中遞增讀取(這是任何陣列區域問題,不僅僅是光暈)。解決此問題的一種方法是將區域重塑為一維陣列,這將使所需資料在記憶體中連續。
PROGRAM main
!-----------------------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 3D incompressible Navier-Stokes
! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
! Implicit Midpoint rule timestepping. The numerical solution is compared to
! an exact solution reported by Shapiro
!
! Analytical Solution:
! u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
! v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
! w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Re = Reynolds number
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfxyz = Forward 3d fft plan
! planbxyz = Backward 3d fft plan
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! v = velocity in y direction
! w = velocity in z direction
! uold = velocity in x direction at previous timestep
! vold = velocity in y direction at previous timestep
! wold = velocity in z direction at previous timestep
! ux = x derivative of velocity in x direction
! uy = y derivative of velocity in x direction
! uz = z derivative of velocity in x direction
! vx = x derivative of velocity in y direction
! vy = y derivative of velocity in y direction
! vz = z derivative of velocity in y direction
! wx = x derivative of velocity in z direction
! wy = y derivative of velocity in z direction
! wz = z derivative of velocity in z direction
! uxold = x derivative of velocity in x direction
! uyold = y derivative of velocity in x direction
! uzold = z derivative of velocity in x direction
! vxold = x derivative of velocity in y direction
! vyold = y derivative of velocity in y direction
! vzold = z derivative of velocity in y direction
! wxold = x derivative of velocity in z direction
! wyold = y derivative of velocity in z direction
! wzold = z derivative of velocity in z direction
! utemp = temporary storage of u to check convergence
! vtemp = temporary storage of u to check convergence
! wtemp = temporary storage of u to check convergence
! temp_r = temporary storage for untransformed variables
! uhat = Fourier transform of u
! vhat = Fourier transform of v
! what = Fourier transform of w
! rhsuhatfix = Fourier transform of righthand side for u for timestepping
! rhsvhatfix = Fourier transform of righthand side for v for timestepping
! rhswhatfix = Fourier transform of righthand side for w for timestepping
! nonlinuhat = Fourier transform of nonlinear term for u
! nonlinvhat = Fourier transform of nonlinear term for u
! nonlinwhat = Fourier transform of nonlinear term for u
! phat = Fourier transform of nonlinear term for pressure, p
! temp_c = temporary storage for Fourier transforms
! realtemp = Real storage
!
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kz = fourier frequencies in z direction
! x = x locations
! y = y locations
! z = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
!
! REFERENCES
!
! A. Shapiro " The use of an exact solution of the Navier-Stokes equations
! in a validation test of a three-dimensional nonhydrostatic numerical model"
! Monthly Weather Review vol. 121, 2420-2425, (1993).
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
!
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!
!--------------------------------------------------------------------------------
! External routines required
!
! External libraries required
! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
! (http://2decomp.org/)
!---------------------------------------------------------------------------------
USE decomp_2d
USE decomp_2d_fft
USE decomp_2d_io
USE MPI
!coprocessing:
use NSadaptor_module
IMPLICIT NONE
! declare variables
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
INTEGER(kind=4), PARAMETER :: Nz=256
INTEGER(kind=4), PARAMETER :: Lx=1
INTEGER(kind=4), PARAMETER :: Ly=1
INTEGER(kind=4), PARAMETER :: Lz=1
INTEGER(kind=4), PARAMETER :: Nt=20
REAL(kind=8), PARAMETER :: dt=0.05d0/Nt
REAL(kind=8), PARAMETER :: Re=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8), PARAMETER :: theta=0.0d0
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(kind=8), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(kind=8) :: scalemodes,chg,factor
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x, y, z, time,mychg,allchg
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: u, v, w,&
ux, uy, uz,&
vx, vy, vz,&
wx, wy, wz,&
uold, uxold, uyold, uzold,&
vold, vxold, vyold, vzold,&
wold, wxold, wyold, wzold,&
utemp, vtemp, wtemp, temp_r
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx, ky, kz
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: uhat, vhat, what,&
rhsuhatfix, rhsvhatfix,&
rhswhatfix, nonlinuhat,&
nonlinvhat, nonlinwhat,&
phat,temp_c
!coprocessing: added x,y,z component arrays also halos
!coprocessing: halos get allocated inside call to update_halo
REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: realtemp, realtempx, &
realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo
!coprocessing: added 1D array for passing to C++
real(kind=8), dimension(:), allocatable :: xpass2c, ypass2c, zpass2c
! MPI and 2DECOMP variables
TYPE(DECOMP_INFO) :: decomp
INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
INTEGER(kind=4) :: p_row=0, p_col=0, numprocs, myid, ierr
! variables used for saving data and timing
INTEGER(kind=4) :: count, iol
INTEGER(kind=4) :: i,j,k,n,t,allocatestatus
INTEGER(kind=4) :: ind, numberfile
CHARACTER*100 :: name_config
INTEGER(kind=4) :: start, finish, count_rate
! initialisation of 2DECOMP&FFT
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
! do automatic domain decomposition
CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
! get information about domain decomposition choosen
CALL decomp_info_init(Nx,Ny,Nz,decomp)
! initialise FFT library
CALL decomp_2d_fft_init
IF (myid.eq.0) THEN
PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
PRINT *,'dt:',dt
END IF
ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
u(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
v(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
w(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
ux(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
utemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
temp_r(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
kx(1:Nx),ky(1:Ny),kz(1:Nz),&
uhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
vhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
what(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsuhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsvhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhswhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinuhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinvhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinwhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
phat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
temp_c(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
realtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
realtempx(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempy(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempz(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'allocated space'
END IF
! setup fourier frequencies in x-direction
DO i=1,Nx/2+1
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
ind=1
DO i=-Nx/2,Nx/2-1
x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in y-direction
DO j=1,Ny/2+1
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
ind=1
DO j=-Ny/2,Ny/2-1
y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in z-direction
DO k=1,Nz/2+1
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
ind=1
DO k=-Nz/2,Nz/2-1
z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
ind=ind+1
END DO
scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
IF (myid.eq.0) THEN
PRINT *,'Setup grid and fourier frequencies'
END IF
!initial conditions for Taylor-Green vortex
! factor=2.0d0/sqrt(3.0d0)
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
! END DO; END DO; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
! END DO ; END DO ; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
! END DO ; END DO ; END DO
time(1)=0.0d0
factor=sqrt(3.0d0)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
! save initial data
n = 0
!coprocessing: removed savedata calls from coprocessing version
!coprocessing: could have this after the coprocessor initialize and
!coprocessing: followed by a call to the coprocessor to write out the initial data
!coprocessing: image. Otherwise, these are just wasted calls at this spot.
call coprocessorinitialize("pipeline.py", 11)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempx, tempxhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempy, tempyhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempz, tempzhalo, level=1)
!coprocessing: Four cases of sending halo data to the coprocessor.
if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
!coprocessing: case 1: pencil furthest from index origin, send just the pencil
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
!coprocessing: case 2: pencil furthest in z, send just upper y halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
!coprocessing: case 3: pencil furthest in y, send just upper z halo
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else
!coprocessing: case 4: send both upper y and upper z halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c,ypass2c,zpass2c)
end if
!start timer
CALL system_clock(start,count_rate)
! coprocessing: Simulation loop starts here
DO n=1,Nt
!fixed point
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
uold(i,j,k)=u(i,j,k)
uxold(i,j,k)=ux(i,j,k)
uyold(i,j,k)=uy(i,j,k)
uzold(i,j,k)=uz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vold(i,j,k)=v(i,j,k)
vxold(i,j,k)=vx(i,j,k)
vyold(i,j,k)=vy(i,j,k)
vzold(i,j,k)=vz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wold(i,j,k)=w(i,j,k)
wxold(i,j,k)=wx(i,j,k)
wyold(i,j,k)=wy(i,j,k)
wzold(i,j,k)=wz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k)
END DO; END DO ; END DO
chg=1
DO WHILE (chg .gt. tol)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
+ky(j)*nonlinvhat(i,j,k)&
+kz(k)*nonlinwhat(i,j,k))&
/(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=u(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=v(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=w(i,j,k)*scalemodes
END DO; END DO ; END DO
mychg(1) =maxval(abs(utemp-u))
mychg(2) =maxval(abs(vtemp-v))
mychg(3) =maxval(abs(wtemp-w))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT *,'chg:',chg
END IF
END DO
time(n+1)=n*dt
!goto 5100
IF (myid.eq.0) THEN
PRINT *,'time',n*dt
END IF
!coprocessing: populating the arrays sent to the coprocessor
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempx, tempxhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempy, tempyhalo, level=1)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call update_halo(realtempz, tempzhalo, level=1)
!coprocessing: same cases as for initial conditions.
if ((ny_global == decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
! case 1: send just the pencil
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global > decomp%xen(2)) .and. (nz_global == decomp%xen(3))) then
! case 2: send just upper y halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*decomp%xsz(3)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3), n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else if ((ny_global == decomp%xen(2)) .and. (nz_global > decomp%xen(3))) then
! case 3: send just upper z halo
allocate(xpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*decomp%xsz(2)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2),1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3)+1, n, n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c, ypass2c, zpass2c)
else
! send both upper y and upper z halo
allocate(xpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(ypass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
allocate(zpass2c(decomp%xsz(1)*(decomp%xsz(2)+1)*(decomp%xsz(3)+1)))
xpass2c = reshape(tempxhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(xpass2c)/))
ypass2c = reshape(tempyhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(ypass2c)/))
zpass2c = reshape(tempzhalo(:,1:decomp%xsz(2)+1,1:decomp%xsz(3)+1),(/size(zpass2c)/))
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2)+1,decomp%xst(3),decomp%xen(3)+1,n,n*dt, &
xpass2c, ypass2c, zpass2c)
deallocate(xpass2c,ypass2c,zpass2c)
end if
END DO
!coprocessing:
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
IF (myid.eq.0) then
PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
END IF
IF (myid.eq.0) THEN
name_config = './data/tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO n=1,1+Nt
WRITE(11,*) time(n)
END DO
CLOSE(11)
name_config = './data/xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = './data/ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
name_config = './data/zcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'
END IF
! Calculate error in final numerical solution
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k) -&
(-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k) -&
(0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)-&
(cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
mychg(1) = maxval(abs(utemp))
mychg(2) = maxval(abs(vtemp))
mychg(3) = maxval(abs(wtemp))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT*,'The error at the final timestep is',chg
END IF
! clean up
CALL decomp_2d_fft_finalize
CALL decomp_2d_finalize
DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
realtemp,stat=AllocateStatus)
!coprocessing: deallocate Coprocessing specific arrays
deallocate(realtempx, realtempy, realtempz, tempxhalo, tempyhalo, tempzhalo)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'Program execution complete'
END IF
CALL MPI_FINALIZE(ierr)
END PROGRAM main
|
()
|
- 更改介面卡以接受當前鉛筆的子範圍。
! fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaNSadaptor.f
! -- Changed for SESE
! subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NSadaptor_module
use iso_c_binding
implicit none
public
interface NSadaptor
! Originally also had a version that accepted 3D arrays, but in all other
! respects was identical. Decided this could lead to too much confusion.
module procedure NSadaptor1D
end interface NSadaptor
contains
subroutine NSadaptor1D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, omegax, omegay, omegaz)
! nx, ny, nz -- grid dimensions or entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! omega* -- scalar array for the current time step
! flag -- receives status from API calls
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:), intent (in) :: omegax, omegay, omegaz
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in VTKPointBasedDataSet.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in VTKPointBasedDataSet.cxx
! remember to null-terminate strings for C/C++
call addfield(omegax, "realtempx"//char(0));
call addfield(omegay, "realtempy"//char(0));
call addfield(omegaz, "realtempz"//char(0));
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NSadaptor1D
end module NSadaptor_module
|
()
|
- 指定了相對於整個資料的已分割槽資料範圍。
- 指定了單元格的相對間距。
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include <string>
// These will be called from the Fortran "glue" code"
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
int* yst, int* yen, int* zst, int* zen)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 3-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
img->SetExtent(*xst - 1, *xen - 1, *yst - 1, *yen - 1, *zst - 1, *zen - 1);
img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx - 1, 0, *ny - 1, 0, *nz - 1);
}
// Add field(s) to the data container.
// Separate from above because this will be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Expects "name" to point to null-terminated character array,
// be sure to null-terminate in caller.
extern "C" void addfield_(double* a, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(a, img->GetNumberOfPoints(), 1);
img->GetPointData()->AddArray(field);
}
}
- 某些 VTK/ParaView 過濾器可以使用幽靈(光暈)單元格,程式碼可以擴充套件以處理這些單元格。
- 在模擬程式碼本身中添加了很多程式碼。
- 光暈和打包陣列需要額外的記憶體。
- 為 C++ 重塑陣列的額外開銷。
- 不同的分解庫可能需要更改光暈邏輯。
- 不能與 3D 分解庫一起使用。
檢視資料的另一種方式是,模擬的每個節點都是視覺化域中的一個單元格。然後,當鉛筆傳送到 ParaView 時,就不會有缺失單元格的間隙。
|
()
|
- 除了常規的 CoProcessing 呼叫之外,沒有其他更改。
PROGRAM main
!-----------------------------------------------------------------------------------
!
!
! PURPOSE
!
! This program numerically solves the 3D incompressible Navier-Stokes
! on a Cubic Domain [0,2pi]x[0,2pi]x[0,2pi] using pseudo-spectral methods and
! Implicit Midpoint rule timestepping. The numerical solution is compared to
! an exact solution reported by Shapiro
!
! Analytical Solution:
! u(x,y,z,t)=-0.25*(cos(x)sin(y)sin(z)+sin(x)cos(y)cos(z))exp(-t/Re)
! v(x,y,z,t)= 0.25*(sin(x)cos(y)sin(z)-cos(x)sin(y)cos(z))exp(-t/Re)
! w(x,y,z,t)= 0.5*cos(x)cos(y)sin(z)exp(-t/Re)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Re = Reynolds number
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! count = keep track of information written to disk
! iol = size of array to write to disk
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfxyz = Forward 3d fft plan
! planbxyz = Backward 3d fft plan
! dt = timestep
! .. Arrays ..
! u = velocity in x direction
! v = velocity in y direction
! w = velocity in z direction
! uold = velocity in x direction at previous timestep
! vold = velocity in y direction at previous timestep
! wold = velocity in z direction at previous timestep
! ux = x derivative of velocity in x direction
! uy = y derivative of velocity in x direction
! uz = z derivative of velocity in x direction
! vx = x derivative of velocity in y direction
! vy = y derivative of velocity in y direction
! vz = z derivative of velocity in y direction
! wx = x derivative of velocity in z direction
! wy = y derivative of velocity in z direction
! wz = z derivative of velocity in z direction
! uxold = x derivative of velocity in x direction
! uyold = y derivative of velocity in x direction
! uzold = z derivative of velocity in x direction
! vxold = x derivative of velocity in y direction
! vyold = y derivative of velocity in y direction
! vzold = z derivative of velocity in y direction
! wxold = x derivative of velocity in z direction
! wyold = y derivative of velocity in z direction
! wzold = z derivative of velocity in z direction
! utemp = temporary storage of u to check convergence
! vtemp = temporary storage of u to check convergence
! wtemp = temporary storage of u to check convergence
! temp_r = temporary storage for untransformed variables
! uhat = Fourier transform of u
! vhat = Fourier transform of v
! what = Fourier transform of w
! rhsuhatfix = Fourier transform of righthand side for u for timestepping
! rhsvhatfix = Fourier transform of righthand side for v for timestepping
! rhswhatfix = Fourier transform of righthand side for w for timestepping
! nonlinuhat = Fourier transform of nonlinear term for u
! nonlinvhat = Fourier transform of nonlinear term for u
! nonlinwhat = Fourier transform of nonlinear term for u
! phat = Fourier transform of nonlinear term for pressure, p
! temp_c = temporary storage for Fourier transforms
! realtemp = Real storage
!
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kz = fourier frequencies in z direction
! x = x locations
! y = y locations
! z = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
!
! REFERENCES
!
! A. Shapiro " The use of an exact solution of the Navier-Stokes equations
! in a validation test of a three-dimensional nonhydrostatic numerical model"
! Monthly Weather Review vol. 121, 2420-2425, (1993).
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
!
! This program has not been optimized to use the least amount of memory
! but is intended as an example only for which all states can be saved
!
!--------------------------------------------------------------------------------
! External routines required
!
! External libraries required
! 2DECOMP&FFT -- Fast Fourier Transform in the West Library
! (http://2decomp.org/)
!---------------------------------------------------------------------------------
USE decomp_2d
USE decomp_2d_fft
USE decomp_2d_io
USE MPI
!coprocessing:
use NSadaptor_module
IMPLICIT NONE
! declare variables
INTEGER(kind=4), PARAMETER :: Nx=256
INTEGER(kind=4), PARAMETER :: Ny=256
INTEGER(kind=4), PARAMETER :: Nz=256
INTEGER(kind=4), PARAMETER :: Lx=1
INTEGER(kind=4), PARAMETER :: Ly=1
INTEGER(kind=4), PARAMETER :: Lz=1
INTEGER(kind=4), PARAMETER :: Nt=20
REAL(kind=8), PARAMETER :: dt=0.05d0/Nt
REAL(kind=8), PARAMETER :: Re=1.0d0
REAL(kind=8), PARAMETER :: tol=0.1d0**10
REAL(kind=8), PARAMETER :: theta=0.0d0
REAL(kind=8), PARAMETER &
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER :: ReInv=1.0d0/REAL(Re,kind(0d0))
REAL(kind=8), PARAMETER :: dtInv=1.0d0/REAL(dt,kind(0d0))
REAL(kind=8) :: scalemodes,chg,factor
REAL(kind=8), DIMENSION(:), ALLOCATABLE :: x, y, z, time,mychg,allchg
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: u, v, w,&
ux, uy, uz,&
vx, vy, vz,&
wx, wy, wz,&
uold, uxold, uyold, uzold,&
vold, vxold, vyold, vzold,&
wold, wxold, wyold, wzold,&
utemp, vtemp, wtemp, temp_r
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: kx, ky, kz
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: uhat, vhat, what,&
rhsuhatfix, rhsvhatfix,&
rhswhatfix, nonlinuhat,&
nonlinvhat, nonlinwhat,&
phat,temp_c
!coprocessing: added x,y,z component arrays
REAL(kind=8), DIMENSION(:,:,:), ALLOCATABLE :: realtemp, realtempx, &
realtempy, realtempz
! MPI and 2DECOMP variables
TYPE(DECOMP_INFO) :: decomp
INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
INTEGER(kind=4) :: p_row=0, p_col=0, numprocs, myid, ierr
! variables used for saving data and timing
INTEGER(kind=4) :: count, iol
INTEGER(kind=4) :: i,j,k,n,t,allocatestatus
INTEGER(kind=4) :: ind, numberfile
CHARACTER*100 :: name_config
INTEGER(kind=4) :: start, finish, count_rate
! initialisation of 2DECOMP&FFT
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
! do automatic domain decomposition
CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
! get information about domain decomposition choosen
CALL decomp_info_init(Nx,Ny,Nz,decomp)
! initialise FFT library
CALL decomp_2d_fft_init
IF (myid.eq.0) THEN
PRINT *,'Grid:',Nx,'X',Ny,'Y',Nz,'Z'
PRINT *,'dt:',dt
END IF
ALLOCATE(x(1:Nx),y(1:Ny),z(1:Nz),time(1:Nt+1),mychg(1:3),allchg(1:3),&
u(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
v(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
w(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
ux(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wx(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wy(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wz(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
uzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wxold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wyold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wzold(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
utemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
vtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
wtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
temp_r(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
kx(1:Nx),ky(1:Ny),kz(1:Nz),&
uhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
vhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
what(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsuhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhsvhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
rhswhatfix(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinuhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinvhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
nonlinwhat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
phat(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
temp_c(decomp%zst(1):decomp%zen(1),&
decomp%zst(2):decomp%zen(2),&
decomp%zst(3):decomp%zen(3)),&
realtemp(decomp%xst(1):decomp%xen(1),&
decomp%xst(2):decomp%xen(2),&
decomp%xst(3):decomp%xen(3)),&
realtempx(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempy(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
realtempz(decomp%xst(1):decomp%xen(1), &
decomp%xst(2):decomp%xen(2), &
decomp%xst(3):decomp%xen(3)),&
stat=AllocateStatus)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'allocated space'
END IF
! setup fourier frequencies in x-direction
DO i=1,Nx/2+1
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
ind=1
DO i=-Nx/2,Nx/2-1
x(ind)=2.0d0*pi*REAL(i,kind(0d0))*Lx/REAL(Nx,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in y-direction
DO j=1,Ny/2+1
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
ind=1
DO j=-Ny/2,Ny/2-1
y(ind)=2.0d0*pi*REAL(j,kind(0d0))*Ly/REAL(Ny,kind(0d0))
ind=ind+1
END DO
! setup fourier frequencies in z-direction
DO k=1,Nz/2+1
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
ind=1
DO k=-Nz/2,Nz/2-1
z(ind)=2.0d0*pi*REAL(k,kind(0d0))*Lz/REAL(Nz,kind(0d0))
ind=ind+1
END DO
scalemodes=1.0d0/REAL(Nx*Ny*Nz,kind(0d0))
IF (myid.eq.0) THEN
PRINT *,'Setup grid and fourier frequencies'
END IF
!initial conditions for Taylor-Green vortex
! factor=2.0d0/sqrt(3.0d0)
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! u(i,j,k)=factor*sin(theta+2.0d0*pi/3.0d0)*sin(x(i))*cos(y(j))*cos(z(k))
! END DO; END DO; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! v(i,j,k)=factor*sin(theta-2.0d0*pi/3.0d0)*cos(x(i))*sin(y(j))*cos(z(k))
! END DO ; END DO ; END DO
! DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
! w(i,j,k)=factor*sin(theta)*cos(x(i))*cos(y(j))*sin(z(k))
! END DO ; END DO ; END DO
time(1)=0.0d0
factor=sqrt(3.0d0)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(1)/Re)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(u,uhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(v,vhat,DECOMP_2D_FFT_FORWARD)
CALL decomp_2d_fft_3d(w,what,DECOMP_2D_FFT_FORWARD)
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
! save initial data
n = 0
!coprocessing: removed savedata calls from coprocessing version
!coprocessing: could have this after the coprocessor initialize and
!coprocessing: followed by a call to the coprocessor to write out the initial data
!coprocessing: image. Otherwise, these are just wasted calls at this spot.
call coprocessorinitialize("pipeline.py", 11)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
realtempx, realtempy, realtempz)
!start timer
CALL system_clock(start,count_rate)
! coprocessing: Simulation loop starts here
DO n=1,Nt
!fixed point
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
uold(i,j,k)=u(i,j,k)
uxold(i,j,k)=ux(i,j,k)
uyold(i,j,k)=uy(i,j,k)
uzold(i,j,k)=uz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vold(i,j,k)=v(i,j,k)
vxold(i,j,k)=vx(i,j,k)
vyold(i,j,k)=vy(i,j,k)
vzold(i,j,k)=vz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wold(i,j,k)=w(i,j,k)
wxold(i,j,k)=wx(i,j,k)
wyold(i,j,k)=wy(i,j,k)
wzold(i,j,k)=wz(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsuhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*uhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhsvhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*vhat(i,j,k)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
rhswhatfix(i,j,k) = (dtInv+(0.5*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)))*what(i,j,k)
END DO; END DO ; END DO
chg=1
DO WHILE (chg .gt. tol)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(ux(i,j,k)+uxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(uy(i,j,k)+uyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(uz(i,j,k)+uzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinuhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(vx(i,j,k)+vxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(vy(i,j,k)+vyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(vz(i,j,k)+vzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinvhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
temp_r(i,j,k)=0.25d0*((u(i,j,k)+uold(i,j,k))*(wx(i,j,k)+wxold(i,j,k))&
+(v(i,j,k)+vold(i,j,k))*(wy(i,j,k)+wyold(i,j,k))&
+(w(i,j,k)+wold(i,j,k))*(wz(i,j,k)+wzold(i,j,k)))
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_r,nonlinwhat,DECOMP_2D_FFT_FORWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
phat(i,j,k)=-1.0d0*( kx(i)*nonlinuhat(i,j,k)&
+ky(j)*nonlinvhat(i,j,k)&
+kz(k)*nonlinwhat(i,j,k))&
/(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k)+0.1d0**13)
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
uhat(i,j,k)=(rhsuhatfix(i,j,k)-nonlinuhat(i,j,k)-kx(i)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
vhat(i,j,k)=(rhsvhatfix(i,j,k)-nonlinvhat(i,j,k)-ky(j)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
what(i,j,k)=(rhswhatfix(i,j,k)-nonlinwhat(i,j,k)-kz(k)*phat(i,j,k))/&
(dtInv-(0.5d0*ReInv)*(kx(i)*kx(i)+ky(j)*ky(j)+kz(k)*kz(k))) !*scalemodes
END DO; END DO ; END DO
! derivative of u with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,ux,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=uhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,uz,DECOMP_2D_FFT_BACKWARD)
! derivative of v with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3); DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=vhat(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,vz,DECOMP_2D_FFT_BACKWARD)
! derivative of w with respect to x, y, and z
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kx(i)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wx,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*ky(j)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wy,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%zst(3),decomp%zen(3) ; DO j=decomp%zst(2),decomp%zen(2) ; DO i=decomp%zst(1),decomp%zen(1)
temp_c(i,j,k)=what(i,j,k)*kz(k)*scalemodes
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(temp_c,wz,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)
END DO; END DO ; END DO
CALL decomp_2d_fft_3d(uhat,u,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(vhat,v,DECOMP_2D_FFT_BACKWARD)
CALL decomp_2d_fft_3d(what,w,DECOMP_2D_FFT_BACKWARD)
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=u(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
v(i,j,k)=v(i,j,k)*scalemodes
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
w(i,j,k)=w(i,j,k)*scalemodes
END DO; END DO ; END DO
mychg(1) =maxval(abs(utemp-u))
mychg(2) =maxval(abs(vtemp-v))
mychg(3) =maxval(abs(wtemp-w))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT *,'chg:',chg
END IF
END DO
time(n+1)=n*dt
!goto 5100
IF (myid.eq.0) THEN
PRINT *,'time',n*dt
END IF
!coprocessing: populating the arrays sent to the coprocessor
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempx(i,j,k)=REAL(wy(i,j,k)-vz(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempy(i,j,k)=REAL(uz(i,j,k)-wx(i,j,k),KIND=8)
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
realtempz(i,j,k)=REAL(vx(i,j,k)-uy(i,j,k),KIND=8)
END DO; END DO ; END DO
call NSadaptor(Nx,Ny,Nz,decomp%xst(1),decomp%xen(1), &
decomp%xst(2),decomp%xen(2),decomp%xst(3),decomp%xen(3), n, n*dt, &
realtempx, realtempy, realtempz)
END DO
!coprocessing:
call coprocessorfinalize()
CALL system_clock(finish,count_rate)
IF (myid.eq.0) then
PRINT *, 'Program took', REAL(finish-start)/REAL(count_rate), 'for main timestepping loop'
END IF
IF (myid.eq.0) THEN
name_config = './data/tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO n=1,1+Nt
WRITE(11,*) time(n)
END DO
CLOSE(11)
name_config = './data/xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
name_config = './data/ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
name_config = './data/zcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'
END IF
! Calculate error in final numerical solution
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
utemp(i,j,k)=u(i,j,k) -&
(-0.5*( factor*cos(x(i))*sin(y(j))*sin(z(k))&
+sin(x(i))*cos(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
vtemp(i,j,k)=v(i,j,k) -&
(0.5*( factor*sin(x(i))*cos(y(j))*sin(z(k))&
-cos(x(i))*sin(y(j))*cos(z(k)) )*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
DO k=decomp%xst(3),decomp%xen(3); DO j=decomp%xst(2),decomp%xen(2); DO i=decomp%xst(1),decomp%xen(1)
wtemp(i,j,k)=w(i,j,k)-&
(cos(x(i))*cos(y(j))*sin(z(k))*exp(-(factor**2)*time(Nt+1)/Re))
END DO; END DO ; END DO
mychg(1) = maxval(abs(utemp))
mychg(2) = maxval(abs(vtemp))
mychg(3) = maxval(abs(wtemp))
CALL MPI_ALLREDUCE(mychg,allchg,3,MPI_DOUBLE_PRECISION,MPI_MAX,MPI_COMM_WORLD,ierr)
chg=allchg(1)+allchg(2)+allchg(3)
IF (myid.eq.0) THEN
PRINT*,'The error at the final timestep is',chg
END IF
! clean up
CALL decomp_2d_fft_finalize
CALL decomp_2d_finalize
DEALLOCATE(x,y,z,time,mychg,allchg,u,v,w,ux,uy,uz,vx,vy,vz,wx,wy,wz,uold,uxold,uyold,uzold,&
vold,vxold,vyold,vzold,wold,wxold,wyold,wzold,utemp,vtemp,wtemp,&
temp_r,kx,ky,kz,uhat,vhat,what,rhsuhatfix,rhsvhatfix,&
rhswhatfix,phat,nonlinuhat,nonlinvhat,nonlinwhat,temp_c,&
realtemp,stat=AllocateStatus)
!coprocessing: deallocate Coprocessing specific arrays
deallocate(realtempx, realtempy, realtempz)
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'Program execution complete'
END IF
CALL MPI_FINALIZE(ierr)
END PROGRAM main
|
()
|
- 接受當前鉛筆的子範圍。
! Fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NSadaptor_module
use iso_c_binding
implicit none
public
interface NSadaptor
module procedure NSadaptor3D
end interface NSadaptor
contains
subroutine NSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, omegax, omegay, omegaz)
! nx, ny, nz -- grid dimensions of entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! omega* -- scalar arrays for the current time step
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
real(kind=8), dimension(:,:,:), intent (in) :: omegax, omegay, omegaz
integer :: flag
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in VTKCellBasedDataSet.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in VTKCellBasedDataSet.cxx
! call for each field of interest. Be sure to null-terminate the
! string for C++!
call addfield(omegax, "realtempx"//char(0))
call addfield(omegay, "realtempy"//char(0))
call addfield(omegaz, "realtempz"//char(0))
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NSadaptor3D
end module NSadaptor_module
|
()
|
- 更改了範圍的計算方式。
- 將點資料呼叫更改為單元格資料呼叫。
// Adaptor for getting fortran simulation code into ParaView CoProcessor.
// Based on the PhastaAdaptor sample in the ParaView distribution.
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran specific header
// ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
#include "FortranAdaptorAPI.h"
// CoProcessor specific headers
// Routines that take the place of VTK dataset object creation.
// Called from Fortran code which also calls the Fortran Adaptor API
// supplied with ParaView source.
// Note: names mangled with trailing underscores for linker visibility.
#include "vtkCPDataDescription.h"
#include "vtkCPInputDataDescription.h"
#include "vtkCPProcessor.h"
#include "vtkDoubleArray.h"
#include "vtkPointData.h"
#include "vtkSmartPointer.h"
#include "vtkImageData.h"
#include "vtkCellData.h"
#include <string>
// These will be called from the Fortran adaptor "glue" code.
// Completely dependent on data layout, structured vs. unstructured, etc.
// since VTK/ParaView uses different internal layouts for each.
// Creates the data container for the CoProcessor.
// Takes the extents for both the global dataset and the particular subsection
// visible to the current MPI rank.
// Note: expects to receive Fortran base-1 indices.
extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen,
int* yst, int* yen, int* zst, int* zen)
{
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 3-dimensional topologically and geometrically
// regular grid. In VTK/ParaView, this is considered an image data set.
vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// Indexing based on change from F90 to C++, and also from nodal to cellular.
img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen);
img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}
// Add field to the data container.
// Separate from above because this could be dynamic, grid is static.
// Might be an issue, VTK assumes row major C storage.
// Underscore is by-hand name mangling for fortran linking.
// Note: Expects the address of the data, has no way of determining
// if the array is densely packed or not.
// Note 2: Assumes "name" points to null-terminated array of chars.
// Easiest way to do that is to concatenate a terminal NULL in caller.
extern "C" void addfield_(double* a, char* name)
{
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetName(name);
field->SetArray(a, img->GetNumberOfCells(), 1);
img->GetCellData()->AddArray(field);
}
}
- 最小的程式碼更改,模擬程式碼中沒有更改(除了協處理呼叫)。
- 不需要光暈或打包陣列。
- 應該可以在無需更改的情況下使用不同的分解庫。
- 需要在 ParaView 客戶端中進行額外處理,以處理單元格渲染和點渲染之間的差異。
- 不能與需要幽靈單元格的 VTK/ParaView 過濾器一起使用。
C++ 沒有本機複數資料型別。這需要在 ParaView 客戶端中進行一些額外的處理
- 以減少的網格大小執行原始的非協處理模擬。
- 使用 RAW 二進位制閱讀器在 ParaView 客戶端中開啟此資料。
- 將標量分量數設定為 2
- 建立管道,實部和虛部的預設名稱分別是 ImageFile_X 和 ImageFile_Y。
|
()
|
有關將複數傳遞給協處理器的詳細資訊,請參見 NLSadaptor.f90 的程式碼。 程式碼下載
! Fortran module for interacting with the ParaView CoProcessor
! loosely based on:
! ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/phastaadaptor.f
! -- Changed for SESE
! Subroutine determines if coprocessing needed during the
! current simulation step or not, and if so, calls the coprocessor.
! Some of the subroutines are supplied by ParaView's FortranAdaptorAPI,
! others have to be supplied by the programmer.
module NLSadaptor_module
use iso_c_binding
implicit none
public
! using an interface incase need arises to overload 1D, 2D versions
interface NLSadaptor
module procedure NLSadaptor3D
end interface NLSadaptor
contains
subroutine NLSadaptor3D(nx, ny, nz, xst, xen, yst, yen, zst, zen, &
step, time, a)
! nx, ny, nz -- grid dimensions of entire mesh
! used for setting whole extent
! xst, xen, etc. -- extents of current subdomain pencil
! step -- current simulation time step
! time -- current simulation time
! a -- scalar array for the current time step
! flag -- receives status from API calls
integer, intent(in) :: nx, ny, nz, xst, xen, yst, yen, zst, zen, step
real(kind=8), intent(in) :: time
complex(kind=8), dimension(:,:,:), intent (in) :: a
integer :: flag
flag = 0
! check if processing this time step
! defined in FortranAdaptorAPI.h
call requestdatadescription(step, time, flag)
if (flag /= 0) then
! processing requested
! check if need to create grid
! defined in FortranAdaptorAPI.h
call needtocreategrid(flag)
if (flag /= 0) then
! grid needed
! defined in adaptor.cxx
! takes the size of the entire grid, and the extents of the
! sub grid.
call createcpimagedata(nx, ny, nz, xst, xen, yst, yen, zst, zen)
end if
! defined in adaptor.cxx
! Be sure to null-terminate the
! string for C++!
call addfield(a, "u_complex"//char(0))
! defined in FortranAdaptorAPI.h
call coprocess()
end if
return
end subroutine NLSadaptor3D
end module NLSadaptor_module
|
( )
|
這也需要在附帶的 C++ 檔案中進行一些小的程式碼更改。 程式碼下載
// 介面卡,用於將 fortran 模擬程式碼輸入 ParaView 協處理。 // 基於 ParaView 發行版中的 PhastaAdaptor 示例。 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/PhastaAdaptor/PhastaAdaptor.cxx
// Fortran 特定標頭 // ParaView-3.14.1-Source/CoProcessing/Adaptors/FortranAdaptors/
- include "FortranAdaptorAPI.h"
// 協處理特定標頭 // 替代 VTK 資料集物件建立的例程。 // 從 Fortran 程式碼呼叫,該程式碼還呼叫與 ParaView 原始碼一起提供的 Fortran Adaptor API // 注意:名稱與尾部下劃線混合,用於連結器可見性。
- include "vtkCPDataDescription.h"
- include "vtkCPInputDataDescription.h"
- include "vtkCPProcessor.h"
- include "vtkDoubleArray.h"
- include "vtkPointData.h"
- include "vtkSmartPointer.h"
- include "vtkImageData.h"
- include "vtkCellData.h"
- include <string>
// 這些將從 Fortran “粘合” 程式碼呼叫 // 完全依賴於資料佈局,結構化與非結構化,等等 // 因為 VTK/ParaView 對每個佈局使用不同的內部佈局。
// 為協處理器建立資料容器。 // 獲取全域性資料集和當前 MPI 秩可見的特定子部分的範圍 // 注意:預期接收 Fortran 基於 1 的索引。 extern "C" void createcpimagedata_(int* nx, int* ny, int* nz, int* xst, int* xen, int* yst, int* yen, int* zst, int* zen) {
if (!ParaViewCoProcessing::GetCoProcessorData()) {
vtkGenericWarningMacro("Unable to access CoProcessorData.");
return;
}
// The simulation grid is a 3-dimensional topologically and geometrically // regular grid. In VTK/ParaView, this is considered an image data set. vtkSmartPointer<vtkImageData> img = vtkSmartPointer<vtkImageData>::New();
// For Fortram complex, need to set that there are 2 scalar components img->SetNumberOfScalarComponents(2);
// Indexing based on change from F90 to C++, and also from nodal to cellular.
// Extents are given in terms of nodes, not cells.
img->SetExtent(*xst - 1, *xen, *yst - 1, *yen, *zst - 1, *zen);
// Setting spacing is important so that the camera position in the pipeline makes
// sense if using different sized meshes between setting up the pipeline and running
// the simulation. Origin can often be ignored.
img->SetSpacing( 1.0 / *nx, 1.0 / *ny, 1.0 / *nz); // considering passing in as args.
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetGrid(img);
ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input")->SetWholeExtent(0, *nx, 0, *ny, 0, *nz);
}
// 向資料容器新增欄位。 // 與上述分開,因為這可能是動態的,網格是靜態的。 // 可能會出現問題,VTK 假設行主序 C 儲存。 // 下劃線是用於 fortran 連結的手動名稱混合。 // 注意:預期接收資料的地址,無法確定 // 陣列是否密集打包。 // 注意 2:假設 “name” 指向以 null 結尾的字元陣列。 // 最簡單的實現方式是在呼叫者中進行連線。 extern "C" void addfield_(double* a, char* name) {
vtkSmartPointer<vtkCPInputDataDescription> idd = ParaViewCoProcessing::GetCoProcessorData()->GetInputDescriptionByName("input");
vtkSmartPointer<vtkImageData> img = vtkImageData::SafeDownCast(idd->GetGrid());
if (!img) {
vtkGenericWarningMacro("No adaptor grid to attach field data to.");
return;
}
if (idd->IsFieldNeeded(name)) {
vtkSmartPointer<vtkDoubleArray> field = vtkSmartPointer<vtkDoubleArray>::New();
field->SetNumberOfComponents(2);
field->SetName(name);
field->SetArray(a, 2* img->GetNumberOfCells(), 1);
img->GetCellData()->AddArray(field);
}
}
- 某些 ParaView 過濾器和表示形式僅適用於點資料,例如體積渲染。使用單元格資料到點資料過濾器進行轉換。請務必選中選項,以將單元格資料透過過濾器傳遞。如果模擬執行沒有錯誤併產生沒有輸出,則可能的罪魁禍首是在建立管道時未能選中該選項。
- 使用體積渲染的管道將生成有關區域 ID 為 0 的警告。通常可以忽略這些警告。
- ParaView 維基條目
- SC10 協處理教程
- ParaView 原始碼,特別是 PHASTA Fortran 示例。
- paraview-users 郵件列表