三次非線性薛定諤方程

背景

三次非線性薛定諤方程出現在許多領域，包括量子力學、非線性光學和水面波。可以在http://en.wikipedia.org/wiki/Schrodinger_equation和http://en.wikipedia.org/wiki/Nonlinear_Schrodinger_equation找到一般介紹。薛定諤方程的數學介紹可以在 Sulem 和 Sulem^[1]和楊^[2]中找到。在本節中，我們將介紹運算元分裂的概念，然後解釋如何將其應用於一維、二維和三維的非線性薛定諤方程。在一維中，可以證明三次非線性薛定諤方程是亞臨界的，因此存在所有時間的解。在二維中，它是 $H^{1}$ 臨界的，因此解可能表現出 $H^{1}$ 範數的爆炸，即解的梯度的平方積分可以在有限時間內變為無窮大。最後，在三維中，非線性薛定諤方程是 $L^{2}$ 超臨界的，因此解的平方積分也可以在有限時間內變為無窮大。有關範數和希爾伯特空間的介紹，請參閱偏微分方程或分析的教科書，例如 Evans^[3]、Linares 和 Ponce^[4]、Lieb 和 Loss^[5]或 Renardy 和 Rogers^[6]。一個有趣的問題是如何發生這種爆炸，數值模擬經常被用來理解這一點；參見 Sulem 和 Sulem^[7]以獲取此類示例。三次非線性薛定諤方程^[8]由下式給出

$i\psi _{t}+\Delta \psi \pm |\psi |^{2}\psi =0,$

其中 $\psi$ 是波函式， $\Delta$ 是拉普拉斯運算元，因此在一維中它是 $\partial _{xx}$ ，在二維中， $\partial _{xx}+\partial _{yy}$ ，在三維中它是 $\partial _{xx}+\partial _{yy}+\partial _{zz}$ 。 $+$ 對應於聚焦三次非線性薛定諤方程，而 $-$ 對應於散焦三次非線性薛定諤方程。該方程具有許多守恆量，包括“質量”，

$\int _{\Omega }|\psi |^{2}{\text{d}}^{n}\mathbf {x}$

和“能量”，

$\int _{\Omega }{\frac {1}{2}}|\nabla \psi |^{2}\mp {\frac {1}{4}}|\psi |^{4}{\text{d}}^{n}\mathbf {x}$

其中 $n$ 是維度， $\Omega$ 是解的域。正如 Klein^[9] 解釋的那樣，這兩個量可以為數值生成解的精度提供有用的檢驗。

分裂

我們將考慮一種稱為分裂的數值方法來求解此方程。這種方法在多個應用中出現，並且當方程可以分裂成兩個單獨的方程時，它是一種有用的數值方法，每個方程都可以精確求解，或者每個部分最適合由不同的數值方法求解。 Holden 等人^[10]，McLachlan 和 Quispel^[11]，Thalhammer^[12]，Shen、Tang 和 Wang^[13]，Weideman 和 Herbst^[14] 以及 Yang^[15] 提供了對分裂的介紹，http://en.wikipedia.org/wiki/Split-step_method 也對此進行了介紹。對於那些對非線性薛定諤方程的時間步進方法比較感興趣的人，請參閱 Klein^[16]。為了描述該方法的基本原理，我們考慮 Holden 等人^[17] 給出的一個例子，即常微分方程：

u_{t}=u(u-1)

其中

u(t=0)=0.8

.

( 1)

我們可以透過變數分離相對簡單地求解此方程，發現

$u(t)={\frac {4}{4+\exp(t)}}.$

現在，一個有趣的觀察結果是，我們也可以單獨求解方程 $u_{t}=u^{2}$ 和 $u_{t}=-u$ 。對於第一個，我們得到 $u(t)={\frac {u(0)}{1-tu(0)}}$ ，對於第二個，我們得到 $u(t)=u(0)\exp(-t)$ 。分裂背後的原理是交替地求解這兩個單獨的方程，時間很短。我們將描述 Strang 分裂，雖然還有其他形式的分裂，例如 Godunov 分裂以及加性分裂。我們在這裡不介紹這些，而是將您推薦給前面提到的參考文獻，特別是 Holden 等人^[18]。為了理解如何使用分裂求解微分方程，請考慮線性常微分方程

$u_{t}=u+2u$ 其中 $u(0)=1$ .

We can first solve $p_{t}=p$ for a time $\delta t/2$ and then using $q(0)=p(\delta t/2)$ , we solve $q_{t}=2q$ also for a time $\delta t$ to get $q(\delta t)$ and finally solve $r_{t}=r$ for a time $\delta t/2$ with initial data $r(0)=q(\delta t)$ . Thus in this case $p(\delta t)=\exp(\delta t/2)$ , $q(\delta t)=p(\delta t/2)\exp(2\delta t)=\exp(5\delta t/2)$ and $u(\delta t)\approx r(\delta t/2)=q(\delta t)\exp(\delta t/2)=\exp(3\delta t)$ , which in this case is the exact solution. One can perform a similar splitting for matrix differential equations. Consider solving $\mathbf {u} _{t}=(\mathbf {A} +\mathbf {B} )\mathbf {u}$ , where $\mathbf {A}$ and $\mathbf {B}$ are $n\times n$ matrices, the exact solution is $\mathbf {u} =\exp \left((\mathbf {A} +\mathbf {B} )t\right)\mathbf {u} (t=0)$ , and an approximate solution produced after one time step of splitting is $u(\delta t)\approx u(0)\exp(\mathbf {A} \delta t)\exp(\mathbf {B} \delta t)$ , which is not in general equal to $u(t=0)\exp \left((\mathbf {A} +\mathbf {B} )\delta t\right)$ unless the matrices $\mathbf {A}$ and $\mathbf {B}$ commute^[19], and so the error in doing splitting in this case is of the form $(\mathbf {A} \mathbf {B} -\mathbf {B} \mathbf {A} )\delta t$ ^[20]. Listing A uses Matlab to demonstrate how to do splitting for eq. 1 .

( A)

使用 Strang 分裂求解 ODE 的 Matlab 程式程式碼下載

% A program to solve the u_t=u(u-1) using a
% Strang Splitting method

clear all; format compact; format short;
set(0,'defaultaxesfontsize',30,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',6,'defaultpatchlinewidth',3.7,...
    'defaultaxesfontweight','bold')
Nt = 1000; % number of time slices
tmax = 1; % maximum time
dt=tmax/Nt; % increment between times
time=(linspace(1,Nt,Nt)-1)*dt; % time
uexact=4./(4+exp(time)); % exact solution
u(1)=0.8

for i=1:Nt-1
    c=-1/u(i);
    utemp=-1/(c+0.5*dt);
    utemp2=utemp*exp(-dt);
    c=-1/utemp2;
    u(i+1)=-1/(c+0.5*dt);
end
figure(1)
plot(time,u,'r+',time,uexact,'b-');

( Ap)

使用 Strang 分裂求解 ODE 的 Python 程式。程式碼下載

"""
A program to solve u_t'=u(u-1) using a Strang
splitting method
"""

import time
import numpy
import matplotlib.pyplot as plt

Nt = 1000	# Number of timesteps
tmax = 1.0	# Maximum time
dt=tmax/Nt # increment between times
u0 = 0.8 # Initial value
t0 = 0	# Starting time
u = [u0]	# Variables holding the values of iterations
t = [t0]	# Times of discrete solutions

T0 = time.clock()
for i in xrange(Nt):
c = -1.0/u[i]
utemp=-1.0/(c+0.5*dt)
utemp2=utemp*numpy.exp(-dt)
c = -1.0/utemp2
unew=-1.0/(c+0.5*dt)
u.append(unew)
t.append(t[i]+dt)

T = time.clock() - T0	
uexact = [4.0/(4.0+numpy.exp(tt)) for tt in t]

print
print "Elapsed time is %f" % (T)

plt.figure()
plt.plot(t,u,'r+',t,uexact,'b-.')
plt.xlabel('Time')
plt.ylabel('Solution')
plt.legend(('Numerical Solution', 'Exact solution'))
plt.title('Numerical solution of du/dt=u(u-1)')
plt.show()

練習

修改 Matlab 程式碼，以計算時間 1 處的誤差，並選擇多個不同的時間步長。數值驗證 Strang 分裂的二階精度。
修改 Matlab 程式碼以使用 Godunov 分裂，其中求解 $u1_{t}=u1$ 時間為 $\delta t$ ，然後使用 $u1(\delta t)$ 作為初始資料，求解 $u2_{t}=2u2$ 也為時間 $\delta t$ ，以獲得對 $u(\delta t)$ 的近似值。計算時間為 1 時的誤差，對於不同的時間步長選擇。數值驗證 Godunov 分裂是一階精度的。

序列

對於非線性薛定諤方程

i\psi _{t}\pm |\psi |^{2}\psi +\Delta \psi =0

,

( 2)

我們首先求解

i\psi _{t}+\Delta \psi =0

( 3)

使用傅立葉變換精確求解，得到

$\psi (\delta {t}/2,\cdot )$ .

然後我們求解

i\psi _{t}\pm |\psi |^{2}\psi =0

( 4)

其中

$\psi (\delta {t}/2,\cdot )$

作為初始資料，時間步長為 $\delta t$ 。正如 Klein^[21] 和 Thalhammer^[22] 所解釋的，這可以在實空間中精確求解，因為在等式 4 中， $|\psi |^{2}$ 是空間和時間的每個點的守恆量。為了證明這一點，令 $\psi ^{*}$ 表示 $\psi$ 的複共軛，所以

${\frac {\mathrm {d} |\psi |^{2}}{\mathrm {d} t}}=\psi ^{*}{\frac {\mathrm {d} \psi }{\mathrm {d} t}}+{\frac {\mathrm {d} \psi ^{*}}{\mathrm {d} t}}\psi =\psi ^{*}\left(\pm i|\psi |^{2}\psi \right)+\left(\pm i|\psi |^{2}\psi \right)^{*}\psi =0$ .

然後使用方程式 3 計算另一個半步，該半步使用透過求解方程式 4 生成的解來獲得時間 $\delta t$ 處的近似解。下面是演示拆分的示例 Matlab 程式碼。

非線性薛定諤方程的示例 Matlab 和 Python 程式

清單 B 中的程式計算對聚焦非線性薛定諤方程的顯式已知精確解的近似值。

( B)

一個使用 Strang 拆分求解一維非線性薛定諤方程的 Matlab 程式程式碼下載

% A program to solve the nonlinear Schr\"{o}dinger equation using a
% splitting method

clear all; format compact; format short;
set(0,'defaultaxesfontsize',30,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',6,'defaultpatchlinewidth',3.7,...
    'defaultaxesfontweight','bold')

Lx = 20; % period 2*pi * L
Nx = 16384; % number of harmonics
Nt = 1000; % number of time slices
dt = 0.25*pi/Nt; % time step
U=zeros(Nx,Nt/10);

Es = -1; % focusing or defocusing parameter

% initialise variables
x = (2*pi/Nx)*(-Nx/2:Nx/2 -1)'*Lx; % x coordinate
kx = 1i*[0:Nx/2-1 0 -Nx/2+1:-1]'/Lx; % wave vector
k2x = kx.^2; % square of wave vector
% initial conditions
t=0; tdata(1)=t;
u=4*exp(1i*t)*(cosh(3*x)+3*exp(8*1i*t)*cosh(x))...
    ./(cosh(4*x)+4*cosh(2*x)+3*cos(8*t));
v=fft(u);
figure(1); clf; plot(x,u);xlim([-2,2]); drawnow;
U(:,1)=u;

% mass
ma = fft(abs(u).^2);
ma0 = ma(1);

% solve pde and plot results
for n =2:Nt+1
    
    vna=exp(0.5*1i*dt*k2x).*v;
    una=ifft(vna);
    pot=2*(una.*conj(una));
    unb=exp(-1i*Es*dt*pot).*una;
    vnb=fft(unb);
    v=exp(0.5*1i*dt*k2x).*vnb;
    t=(n-1)*dt;
    
    if (mod(n,10)==0)
        tdata(n/10)=t;
        u=ifft(v);
        U(:,n/10)=u;
        uexact=4*exp(1i*t)*(cosh(3*x)+3*exp(8*1i*t)*cosh(x))...
            ./(cosh(4*x)+4*cosh(2*x)+3*cos(8*t));
        figure(1); clf; plot(x,abs(u).^2); ...
            xlim([-0.5,0.5]); title(num2str(t));
        figure(2); clf; plot(x,abs(u-uexact).^2);...
            xlim([-0.5,0.5]); title(num2str(t));
        drawnow;
        ma = fft(abs(u).^2);
        ma = ma(1);
        test = log10(abs(1-ma/ma0))
    end
end
figure(3); clf; mesh(tdata(1:(n-1)/10),x,abs(U(:,1:(n-1)/10)).^2);

( Bp)

一個使用 Strang 拆分求解一維非線性薛定諤方程的 Python 程式。程式碼下載

"""
A program to solve the 1D Nonlinear Schrodinger equation using a
second order splitting method. The numerical solution is compared
to an exact soliton solution. The exact equation solved is
iu_t+u_{xx}+2|u|^2u=0

More information on visualization can be found on the Mayavi
website, in particular:
http://github.enthought.com/mayavi/mayavi/mlab.html
which was last checked on 6 April 2012

"""

import math
import numpy
import matplotlib.pyplot as plt
import time

plt.ion()

# Grid
Lx=16.0 # Period 2*pi*Lx
Nx=8192 # Number of harmonics
Nt=1000 # Number of time slices
tmax=1.0 # Maximum time
dt=tmax/Nt # time step
plotgap=10 # time steps between plots
Es= -1.0 # focusing (+1) or defocusing (-1) parameter
numplots=Nt/plotgap # number of plots to make

x = [i*2.0*math.pi*(Lx/Nx) for i in xrange(-Nx/2,1+Nx/2)]
k_x = (1.0/Lx)*numpy.array([complex(0,1)*n for n in range(0,Nx/2) \
+ [0] + range(-Nx/2+1,0)])

k2xm=numpy.zeros((Nx), dtype=float)
xx=numpy.zeros((Nx), dtype=float)

for i in xrange(Nx):
   k2xm[i] = numpy.real(k_x[i]**2)
   xx[i]=x[i]
        

# allocate arrays
usquared=numpy.zeros((Nx), dtype=float)
pot=numpy.zeros((Nx), dtype=float)
u=numpy.zeros((Nx), dtype=complex)
uexact=numpy.zeros((Nx), dtype=complex)
una=numpy.zeros((Nx), dtype=complex)
unb=numpy.zeros((Nx), dtype=complex)
v=numpy.zeros((Nx), dtype=complex)
vna=numpy.zeros((Nx), dtype=complex)
vnb=numpy.zeros((Nx), dtype=complex)
mass=numpy.zeros((Nx), dtype=complex)
test=numpy.zeros((numplots-1),dtype=float)
tdata=numpy.zeros((numplots-1), dtype=float)

t=0.0
u=4.0*numpy.exp(complex(0,1.0)*t)*\
   (numpy.cosh(3.0*xx)+3.0*numpy.exp(8.0*complex(0,1.0)*t)*numpy.cosh(xx))\
   /(numpy.cosh(4*xx)+4.0*numpy.cosh(2.0*xx)+3.0*numpy.cos(8.0*t))
uexact=u
v=numpy.fft.fftn(u)
usquared=abs(u)**2
fig =plt.figure()
ax = fig.add_subplot(311)
ax.plot(xx,numpy.real(u),'b-')
plt.xlabel('x')
plt.ylabel('real u')
ax = fig.add_subplot(312)
ax.plot(xx,numpy.imag(u),'b-')
plt.xlabel('x')
plt.ylabel('imaginary u')
ax = fig.add_subplot(313)
ax.plot(xx,abs(u-uexact),'b-')
plt.xlabel('x')
plt.ylabel('error')
plt.show()


# initial mass
usquared=abs(u)**2
mass=numpy.fft.fftn(usquared)
ma=numpy.real(mass[0])
maO=ma
tdata[0]=t
plotnum=0
#solve pde and plot results
for nt in xrange(numplots-1):
    for n in xrange(plotgap):
        vna=v*numpy.exp(complex(0,0.5)*dt*k2xm)
        una=numpy.fft.ifftn(vna)
        usquared=2.0*abs(una)**2
        pot=Es*usquared
        unb=una*numpy.exp(complex(0,-1)*dt*pot)
        vnb=numpy.fft.fftn(unb)
        v=vnb*numpy.exp(complex(0,0.5)*dt*k2xm)
        u=numpy.fft.ifftn(v)
        t+=dt
    plotnum+=1
    usquared=abs(u)**2
    uexact = 4.0*numpy.exp(complex(0,1.0)*t)*\
      (numpy.cosh(3.0*xx)+3.0*numpy.exp(8.0*complex(0,1.0)*t)*numpy.cosh(xx))\
      /(numpy.cosh(4*xx)+4.0*numpy.cosh(2.0*xx)+3.0*numpy.cos(8.0*t))
    ax = fig.add_subplot(311)
    plt.cla()
    ax.plot(xx,numpy.real(u),'b-')
    plt.title(t)
    plt.xlabel('x')
    plt.ylabel('real u')
    ax = fig.add_subplot(312)
    plt.cla()
    ax.plot(xx,numpy.imag(u),'b-')
    plt.xlabel('x')
    plt.ylabel('imaginary u')
    ax = fig.add_subplot(313)
    plt.cla()
    ax.plot(xx,abs(u-uexact),'b-')
    plt.xlabel('x')
    plt.ylabel('error')
    plt.draw()
    mass=numpy.fft.fftn(usquared)
    ma=numpy.real(mass[0])
    test[plotnum-1]=numpy.log(abs(1-ma/maO))
    print(test[plotnum-1])
    tdata[plotnum-1]=t
       
plt.ioff()
plt.show()

( C)

一個使用 Strang 拆分求解二維非線性薛定諤方程的 Matlab 程式程式碼下載

% A program to solve the 2D nonlinear Schr\"{o}dinger equation using a
% splitting method

clear all; format compact; format short;
set(0,'defaultaxesfontsize',30,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',6,'defaultpatchlinewidth',3.7,'defaultaxesfontweight','bold')

% set up grid
tic
Lx = 20; % period 2*pi*L
Ly = 20; % period 2*pi*L
Nx = 2*256; % number of harmonics
Ny = 2*256; % number of harmonics
Nt = 100; % number of time slices
dt = 5.0/Nt; % time step

Es = 1.0;

% initialise variables
x = (2*pi/Nx)*(-Nx/2:Nx/2 -1)'*Lx; % x coordinate
kx = 1i*[0:Nx/2-1 0 -Nx/2+1:-1]'/Lx; % wave vector
y = (2*pi/Ny)*(-Ny/2:Ny/2 -1)'*Ly; % y coordinate
ky = 1i*[0:Ny/2-1 0 -Ny/2+1:-1]'/Ly; % wave vector
[xx,yy]=meshgrid(x,y);
[k2xm,k2ym]=meshgrid(kx.^2,ky.^2);
% initial conditions
u = exp(-(xx.^2+yy.^2));
v=fft2(u);
figure(1); clf; mesh(xx,yy,u); drawnow;
t=0; tdata(1)=t;

% mass
ma = fft2(abs(u).^2);
ma0 = ma(1,1);

% solve pde and plot results
for n =2:Nt+1
    vna=exp(0.5*1i*dt*(k2xm + k2ym)).*v;
    una=ifft2(vna);
    pot=Es*((abs(una)).^2);
    unb=exp(-1i*dt*pot).*una;
    vnb=fft2(unb);
    v=exp(0.5*1i*dt*(k2xm + k2ym)).*vnb;
    u=ifft2(v);
    t=(n-1)*dt;
    tdata(n)=t;
     if (mod(n,10)==0)
         figure(2); clf; mesh(xx,yy,abs(u).^2); title(num2str(t));
         drawnow;
         ma = fft2(abs(u).^2);
         ma = ma(1,1);
         test = log10(abs(1-ma/ma0))
     end
end
figure(4); clf; mesh(xx,yy,abs(u).^2);
toc

( Cp)

一個使用 Strang 拆分求解二維非線性薛定諤方程的 Python 程式。程式碼下載

"""
A program to solve the 2D Nonlinear Schrodinger equation using a
second order splitting method

More information on visualization can be found on the Mayavi
website, in particular:
http://github.enthought.com/mayavi/mayavi/mlab.html
which was last checked on 6 April 2012

"""

import math
import numpy
from mayavi import mlab
import matplotlib.pyplot as plt
import time

# Grid
Lx=4.0 # Period 2*pi*Lx
Ly=4.0 # Period 2*pi*Ly
Nx=64 # Number of harmonics
Ny=64 # Number of harmonics
Nt=100 # Number of time slices
tmax=1.0 # Maximum time
dt=tmax/Nt # time step
plotgap=10 # time steps between plots
Es= 1.0 # focusing (+1) or defocusing (-1) parameter
numplots=Nt/plotgap # number of plots to make

x = [i*2.0*math.pi*(Lx/Nx) for i in xrange(-Nx/2,1+Nx/2)]
y = [i*2.0*math.pi*(Ly/Ny) for i in xrange(-Ny/2,1+Ny/2)]
k_x = (1.0/Lx)*numpy.array([complex(0,1)*n for n in range(0,Nx/2) \
+ [0] + range(-Nx/2+1,0)])
k_y = (1.0/Ly)*numpy.array([complex(0,1)*n for n in range(0,Ny/2) \
+ [0] + range(-Ny/2+1,0)])

k2xm=numpy.zeros((Nx,Ny), dtype=float)
k2ym=numpy.zeros((Nx,Ny), dtype=float)
xx=numpy.zeros((Nx,Ny), dtype=float)
yy=numpy.zeros((Nx,Ny), dtype=float)


for i in xrange(Nx):
    for j in xrange(Ny):
            k2xm[i,j] = numpy.real(k_x[i]**2)
            k2ym[i,j] = numpy.real(k_y[j]**2)
            xx[i,j]=x[i]
            yy[i,j]=y[j]
        

# allocate arrays
usquared=numpy.zeros((Nx,Ny), dtype=float)
pot=numpy.zeros((Nx,Ny), dtype=float)
u=numpy.zeros((Nx,Ny), dtype=complex)
una=numpy.zeros((Nx,Ny), dtype=complex)
unb=numpy.zeros((Nx,Ny), dtype=complex)
v=numpy.zeros((Nx,Ny), dtype=complex)
vna=numpy.zeros((Nx,Ny), dtype=complex)
vnb=numpy.zeros((Nx,Ny), dtype=complex)
mass=numpy.zeros((Nx,Ny), dtype=complex)
test=numpy.zeros((numplots-1),dtype=float)
tdata=numpy.zeros((numplots-1), dtype=float)

u=numpy.exp(-(xx**2 + yy**2 ))
v=numpy.fft.fftn(u)
usquared=abs(u)**2
src = mlab.surf(xx,yy,usquared,colormap='YlGnBu',warp_scale='auto')
mlab.scalarbar()
mlab.xlabel('x',object=src)
mlab.ylabel('y',object=src)
mlab.zlabel('abs(u)^2',object=src)

# initial mass
usquared=abs(u)**2
mass=numpy.fft.fftn(usquared)
ma=numpy.real(mass[0,0])
print(ma)
maO=ma
t=0.0
tdata[0]=t
plotnum=0
#solve pde and plot results
for nt in xrange(numplots-1):
    for n in xrange(plotgap):
        vna=v*numpy.exp(complex(0,0.5)*dt*(k2xm+k2ym))
        una=numpy.fft.ifftn(vna)
        usquared=abs(una)**2
        pot=Es*usquared
        unb=una*numpy.exp(complex(0,-1)*dt*pot)
        vnb=numpy.fft.fftn(unb)
        v=vnb*numpy.exp(complex(0,0.5)*dt*(k2xm+k2ym) )
        u=numpy.fft.ifftn(v)
        t+=dt
    plotnum+=1
    usquared=abs(u)**2
    src.mlab_source.scalars = usquared
    mass=numpy.fft.fftn(usquared)
    ma=numpy.real(mass[0,0])
    test[plotnum-1]=numpy.log(abs(1-ma/maO))
    print(test[plotnum-1])
    tdata[plotnum-1]=t
       
plt.figure()
plt.plot(tdata,test,'r-')
plt.title('Time Dependence of Change in Mass')
plt.show()

( D)

一個使用 Strang 拆分求解三維非線性薛定諤方程的 Matlab 程式程式碼下載

% A program to solve the 3D nonlinear Schr\"{o}dinger equation using a
% splitting method
% updated by Abdullah Bargash , AbdulAziz Al-thiban
% vol3d can be downloaded from http://www.mathworks.com/matlabcentral/fileexchange/22940-vol3d-v2
%
clear all; format compact; format short;
set(0,'defaultaxesfontsize',30,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',6,'defaultpatchlinewidth',3.7,...
    'defaultaxesfontweight','bold')

% set up grid
tic
Lx = 4; % period 2*pi*L
Ly = 4; % period 2*pi*L
Lz = 4; % period 2*pi*L
Nx = 64; % number of harmonics
Ny = 64; % number of harmonics
Nz = 64; % number of harmonics
Nt = 100; % number of time slices
dt = 1.0/Nt; % time step

Es = 1.0; % focusing or defocusing parameter

% initialise variables
x = (2*pi/Nx)*(-Nx/2:Nx/2 -1)'*Lx; % x coordinate
kx = 1i*[0:Nx/2-1 0 -Nx/2+1:-1]'/Lx; % wave vector
y = (2*pi/Ny)*(-Ny/2:Ny/2 -1)'*Ly; % y coordinate
ky = 1i*[0:Ny/2-1 0 -Ny/2+1:-1]'/Ly; % wave vector
z = (2*pi/Nz)*(-Nz/2:Nz/2 -1)'*Lz; % y coordinate
kz = 1i*[0:Nz/2-1 0 -Nz/2+1:-1]'/Lz; % wave vector
[xx,yy,zz]=meshgrid(x,y,z);
[k2xm,k2ym,k2zm]=meshgrid(kx.^2,ky.^2,kz.^2);

% initial conditions
u = exp(-(xx.^2+yy.^2+zz.^2));
v=fftn(u);
figure(1); clf; UP = abs(u).^2;
    H = vol3d('CData',UP,'texture','3D','XData',x,'YData',y,'ZData',z);
    xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
    axis equal; axis square; view(3);
 
    xlabel('x'); ylabel('y'); zlabel('z');
axis equal; axis square; view(3); drawnow;
t=0; tdata(1)=t;
   
% mass
ma = fftn(abs(u).^2);
ma0 = ma(1,1,1);

% solve pde and plot results

for n =2:Nt+1
    vna=exp(0.5*1i*dt*(k2xm + k2ym + k2zm)).*v;
    una=ifftn(vna);
    pot=Es*((abs(una)).^2);
    unb=exp(-1i*dt*pot).*una;
    vnb=fftn(unb);
    v=exp(0.5*1i*dt*(k2xm + k2ym + k2zm)).*vnb;
    u=ifftn(v);
    t=(n-1)*dt;
    tdata(n)=t;
    if (mod(n,10)==0)
        
    figure(1); clf; UP = abs(u).^2;
   
    H = vol3d('CData',UP,'texture','3D','XData',x,'YData',y,'ZData',z);
    xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
    axis equal; axis square; view(3);
    title(num2str(t));

        
        xlabel('x'); ylabel('y'); zlabel('z');
        axis equal; axis square; view(3); drawnow;
        ma = fftn(abs(u).^2);
        ma = ma(1,1,1); test = log10(abs(1-ma/ma0))
         
    end
     
end

    figure(1); clf; UP = abs(u).^2;
   
    H = vol3d('CData',UP,'texture','3D','XData',x,'YData',y,'ZData',z);
    xlabel('x'); ylabel('y'); zlabel('z'); colorbar;
    axis equal; axis square; view(3);

非線性薛定諤方程的示例一維 Fortran 程式

在考慮並行程式之前，我們需要了解如何為一維非線性薛定諤方程編寫 Fortran 程式碼。下面是一個示例 Fortran 程式，後面是一個 Matlab 繪圖指令碼，用於視覺化結果。在編譯 Fortran 程式時，需要標準 Fortran 編譯器和 FFTW 庫。由於所需命令與熱方程 makefile 中的命令相似，因此此處不再贅述。

( E)

一個使用拆分求解一維非線性薛定諤方程的 Fortran 程式程式碼下載

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves nonlinear Schrodinger equation in 1 dimension
! i*u_t+Es*|u|^2u+u_{xx}=0
! using a second order time spectral splitting scheme
!
! The boundary conditions are u(0)=u(2*L*\pi)
! The initial condition is u=exp(-x^2)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! L = width of box
! ES = +1 for focusing and -1 for defocusing
! .. Scalars ..
! i = loop counter in x direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! dt = timestep
! .. Arrays ..
! u = approximate solution
! v = Fourier transform of approximate solution
! .. Vectors ..
! una = temporary field
! unb = temporary field
! vna = temporary field
! pot = potential
! kx = fourier frequencies in x direction
! x = x locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfx = array to setup x Fourier transform
! fftbx = array to setup x Fourier transform
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)


PROGRAM main

! Declare variables
IMPLICIT NONE
INTEGER(kind=4), PARAMETER :: Nx=8*256	
INTEGER(kind=4), PARAMETER	:: Nt=200	
REAL(kind=8), PARAMETER	&
:: pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER	:: L=5.0d0	
REAL(kind=8), PARAMETER	:: Es=1.0d0	
REAL(kind=8)	:: dt=2.0d0/Nt	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: kx	
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: x	
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE	:: u	
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE	:: v
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: una,vn
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: unb,pot
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: time
INTEGER(kind=4)	:: i,j,k,n,modes,AllocateStatus
INTEGER(kind=4)	:: start, finish, count_rate
INTEGER(kind=4), PARAMETER	:: FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32, FFTW_ESTIMATE = 64
INTEGER(kind=4), PARAMETER :: FFTW_FORWARD = -1, FFTW_BACKWARD=1	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: fftfx,fftbx
INTEGER(kind=8)	:: planfx,planbx
   CHARACTER*100	:: name_config

CALL system_clock(start,count_rate)
ALLOCATE(kx(1:Nx),x(1:Nx),u(1:Nx,1:Nt+1),v(1:Nx,1:Nt+1),&
una(1:Nx),vn(1:Nx),unb(1:Nx),pot(1:Nx),time(1:Nt+1),&
fftfx(1:Nx),fftbx(1:Nx),stat=AllocateStatus)
IF (allocatestatus .ne. 0) STOP	
! set up ffts
CALL dfftw_plan_dft_1d_(planfx,Nx,fftfx(1:Nx),fftbx(1:Nx),&
FFTW_FORWARD,FFTW_PATIENT)
CALL dfftw_plan_dft_1d_(planbx,Nx,fftbx(1:Nx),fftfx(1:Nx),&
FFTW_BACKWARD,FFTW_PATIENT)
PRINT *,'Setup FFTs'
! setup fourier frequencies
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0d0)*(i-1.0d0)/L
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
x(i)=(-1.0d0 + 2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)))*pi*L
END DO
PRINT *,'Setup grid and fourier frequencies'

DO i=1,Nx
u(i,1)=exp(-1.0d0*(x(i)**2))
END DO
! transform initial data
CALL dfftw_execute_dft_(planfx,u(1:Nx,1),v(1:Nx,1))
PRINT *,'Got initial data, starting timestepping'
time(1)=0.0d0
DO n=1,Nt
time(n+1)=n*dt
DO i=1,Nx
vn(i)=exp(0.5d0*dt*kx(i)*kx(i)*cmplx(0.0d0,1.0d0))*v(i,n)
END DO
CALL dfftw_execute_dft_(planbx,vn(1:Nx),una(1:Nx))
! normalize
DO i=1,Nx
una(i)=una(1:Nx)/REAL(Nx,kind(0d0))	
pot(i)=Es*una(i)*conjg(una(i))
unb(i)=exp(cmplx(0.0d0,-1.0d0)*dt*pot(i))*una(i)
END DO
CALL dfftw_execute_dft_(planfx,unb(1:Nx),vn(1:Nx))
DO i=1,Nx
v(i,n+1)=exp(0.50d0*dt*kx(i)*kx(i)*cmplx(0.0d0,1.0d0))*vn(i)
END DO
CALL dfftw_execute_dft_(planbx,v(1:Nx,n+1),u(1:Nx,n+1))
! normalize
DO i=1,Nx
u(i,n+1)=u(i,n+1)/REAL(Nx,kind(0d0))
END DO
END DO
PRINT *,'Finished time stepping'
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',&
REAL(finish-start,kind(0d0))/REAL(count_rate,kind(0d0)),'for execution'

name_config = 'u.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Nt
DO i=1,Nx
WRITE(11,*) abs(u(i,j))**2
END DO
END DO
CLOSE(11)

name_config = 'tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Nt
WRITE(11,*) time(j)
END DO
CLOSE(11)

name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)

PRINT *,'Saved data'

CALL dfftw_destroy_plan_(planbx)
CALL dfftw_destroy_plan_(planfx)
CALL dfftw_cleanup_()

DEALLOCATE(kx,x,u,v,una,vn,unb,&
pot,time,fftfx,fftbx,&
stat=AllocateStatus)
IF (allocatestatus .ne. 0) STOP
PRINT *,'deallocated memory'
PRINT *,'Program execution complete'
END PROGRAM main

( F)

一個 Matlab 程式，它繪製清單 E 生成的對一維非線性薛定諤方程的數值解程式碼下載

% A program to plot the computed results

clear all; format compact, format short,
set(0,'defaultaxesfontsize',18,'defaultaxeslinewidth',.9,...
    'defaultlinelinewidth',3.5,'defaultpatchlinewidth',5.5);

% Load data
load('./u.dat');
load('./tdata.dat');
load('./xcoord.dat');
Tsteps = length(tdata);

Nx = length(xcoord); Nt = length(tdata);
 
u = reshape(u,Nx,Nt);

% Plot data
figure(3); clf; mesh(tdata,xcoord,u); xlabel t; ylabel x; zlabel('|u|^2');

共享記憶體並行：OpenMP

我們回顧一下，OpenMP 是一組編譯器指令，可以讓您輕鬆地讓 Fortran、C 或 C++ 程式在共享記憶體機器上執行——也就是說，所有計算程序都可以訪問同一全域性定址記憶體空間的計算機。它允許對已用上述語言之一編寫的序列程式進行輕鬆的並行化。

我們將演示二維非線性薛定諤方程的一種並行形式，其中我們將使用 OpenMP 命令對迴圈進行並行化，但將使用執行緒化的 FFTW 庫為我們並行化變換。示例程式在清單 E 中。在練習中概述了使用 OpenMP 命令顯式並行化迴圈和快速傅立葉變換的第二種方法。

( G)

一個使用拆分和執行緒化 FFTW 求解二維非線性薛定諤方程的 OpenMP Fortran 程式程式碼下載

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves nonlinear Schrodinger equation in 2 dimensions
! i*u_t+Es*|u|^2u+u_{xx}+u_{yy}=0
! using a second order time spectral splitting scheme
!
! The boundary conditions are u(x=0,y)=u(2*Lx*\pi,y),
! u(x,y=0)=u(x,y=2*Ly*\pi)
! The initial condition is u=exp(-x^2-y^2)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Lx = width of box in x direction
! Ly = width of box in y direction
! ES = +1 for focusing and -1 for defocusing
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! numthreads = number of openmp threads
! ierr = error return code
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! planfy = Forward 1d fft plan in y
! planby = Backward 1d fft plan in y
! dt = timestep
! .. Arrays ..
! u = approximate solution
! v = Fourier transform of approximate solution
! unax = temporary field
! vnax = temporary field
! vnbx = temporary field
! vnay = temporary field
! vnby = temporary field
! potx = potential
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! x = x locations
! y = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfx = array to setup x Fourier transform
! fftbx = array to setup x Fourier transform
! fftfy = array to setup y Fourier transform
! fftby = array to setup y Fourier transform
!
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)
! OpenMP library
PROGRAM main
USE omp_lib	
IMPLICIT NONE	
! Declare variables
INTEGER(kind=4), PARAMETER :: Nx=1024
INTEGER(kind=4), PARAMETER :: Ny=1024	
INTEGER(kind=4), PARAMETER	:: Nt=20	
INTEGER(kind=4), PARAMETER	:: plotgap=5	
REAL(kind=8), PARAMETER	:: &
pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER	:: Lx=2.0d0	
REAL(kind=8), PARAMETER	:: Ly=2.0d0	
REAL(kind=8), PARAMETER	:: Es=1.0d0	
REAL(kind=8)	:: dt=0.10d0/Nt	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: kx	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: ky	
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: x	
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: y	
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE:: unax,vnax,vnbx,potx
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE:: vnay,vnby
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: time
INTEGER(kind=4)	:: i,j,k,n,allocatestatus,ierr
INTEGER(kind=4)	:: start, finish, count_rate, numthreads
INTEGER(kind=8), PARAMETER :: FFTW_IN_PLACE=8, FFTW_MEASURE=0,&
FFTW_EXHAUSTIVE=8, FFTW_PATIENT=32,&
                    FFTW_ESTIMATE=64
INTEGER(kind=8),PARAMETER :: FFTW_FORWARD=-1, FFTW_BACKWARD=1	
INTEGER(kind=8)	:: planfxy,planbxy
    CHARACTER*100	:: name_config,number_file

numthreads=omp_get_max_threads()
PRINT *,'There are ',numthreads,' threads.'

ALLOCATE(kx(1:Nx),ky(1:Nx),x(1:Nx),y(1:Nx),unax(1:Nx,1:Ny),&
vnax(1:Nx,1:Ny),potx(1:Nx,1:Ny),time(1:1+Nt/plotgap),&
stat=allocatestatus)	
IF (allocatestatus .ne. 0) stop
PRINT *,'allocated memory'

! set up multithreaded ffts
CALL dfftw_init_threads_(ierr)
PRINT *,'Initiated threaded FFTW'
CALL dfftw_plan_with_nthreads_(numthreads)
PRINT *,'Inidicated number of threads to be used in planning'
CALL dfftw_plan_dft_2d_(planfxy,Nx,Ny,unax(1:Nx,1:Ny),vnax(1:Nx,1:Ny),&
FFTW_FORWARD,FFTW_ESTIMATE)
  CALL dfftw_plan_dft_2d_(planbxy,Nx,Ny,vnax(1:Nx,1:Ny),unax(1:Nx,1:Ny),&
  FFTW_BACKWARD,FFTW_ESTIMATE)
PRINT *,'Setup FFTs'

! setup fourier frequencies
!$OMP PARALLEL PRIVATE(i,j)
!$OMP DO SCHEDULE(static)
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
!$OMP END DO
kx(1+Nx/2)=0.0d0
!$OMP DO SCHEDULE(static)
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
!$OMP END DO
!$OMP DO SCHEDULE(static)
   DO i=1,Nx
x(i)=(-1.0d0+2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)) )*pi*Lx
END DO
!$OMP END DO
!$OMP DO SCHEDULE(static)
DO j=1,1+Ny/2
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
!$OMP END DO
ky(1+Ny/2)=0.0d0
!$OMP DO SCHEDULE(static)
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
!$OMP END DO
!$OMP DO SCHEDULE(static)
   DO j=1,Ny
y(j)=(-1.0d0+2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)) )*pi*Ly
END DO
!$OMP END DO
PRINT *,'Setup grid and fourier frequencies'
!$OMP DO SCHEDULE(static)
DO j=1,Ny
unax(1:Nx,j)=exp(-1.0d0*(x(1:Nx)**2 +y(j)**2))
END DO
!$OMP END DO
!$OMP END PARALLEL
name_config = 'uinitial.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
DO i=1,Nx
WRITE(11,*) abs(unax(i,j))**2
END DO
END DO
CLOSE(11)
! transform initial data and do first half time step
CALL dfftw_execute_dft_(planfxy,unax(1:Nx,1:Ny),vnax(1:Nx,1:Ny))

PRINT *,'Got initial data, starting timestepping'
time(1)=0.0d0
CALL system_clock(start,count_rate)	
DO n=1,Nt	
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
vnax(i,j)=exp(0.5d0*dt*(kx(i)*kx(i) + ky(j)*ky(j))&
*cmplx(0.0d0,1.0d0))*vnax(i,j)
END DO
END DO
!$OMP END PARALLEL DO
CALL dfftw_execute_dft_(planbxy,vnax(1:Nx,1:Ny),unax(1:Nx,1:Ny))
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
unax(i,j)=unax(i,j)/REAL(Nx*Ny,kind(0d0))
potx(i,j)=Es*unax(i,j)*conjg(unax(i,j))
unax(i,j)=exp(cmplx(0.0d0,-1.0d0)*dt*potx(i,j))&
*unax(i,j)
END DO
END DO
!$OMP END PARALLEL DO
CALL dfftw_execute_dft_(planfxy,unax(1:Nx,1:Ny),vnax(1:Nx,1:Ny))
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
vnax(i,j)=exp(0.5d0*dt*(kx(i)*kx(i) + ky(j)*ky(j))&	
*cmplx(0.0d0,1.0d0))*vnax(i,j)
END DO
END DO
!$OMP END PARALLEL DO
IF (mod(n,plotgap)==0) then
time(1+n/plotgap)=n*dt
PRINT *,'time',n*dt
CALL dfftw_execute_dft_(planbxy,vnax(1:Nx,1:Ny),unax(1:Nx,1:Ny))
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
unax(i,j)=unax(i,j)/REAL(Nx*Ny,kind(0d0))
END DO
END DO
!$OMP END PARALLEL DO
name_config='./data/u'
WRITE(number_file,'(i0)') 10000000+1+n/plotgap
ind=index(name_config,' ') -1
name_config=name_config(1:ind)//numberfile
ind=index(name_config,' ') -1
name_config=name_config(1:ind)//'.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
DO i=1,Nx
WRITE(11,*) abs(unax(i,j))**2
END DO
END DO
CLOSE(11)
END IF
END DO
PRINT *,'Finished time stepping'
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),&
'for Time stepping'


name_config = 'tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt/plotgap
WRITE(11,*) time(j)
END DO
CLOSE(11)

name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)

name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
PRINT *,'Saved data'

CALL dfftw_destroy_plan_(planbxy)
CALL dfftw_destroy_plan_(planfxy)
CALL dfftw_cleanup_threads_()

DEALLOCATE(unax,vnax,potx,stat=allocatestatus)	
IF (allocatestatus .ne. 0) STOP
PRINT *,'Deallocated memory'

PRINT *,'Program execution complete'
END PROGRAM main

( H)

一個編譯清單 G 中 OpenMP 程式的示例 makefile 程式碼下載

#define the complier
COMPILER = gfortran
# compilation settings, optimization, precision, parallelization
FLAGS = -O3 -fopenmp


# libraries
LIBS = -L/usr/local/lib -lfftw3 -lm
# source list for main program
SOURCES = NLSsplitting.f90

test: $(SOURCES)
${COMPILER} -o NLSsplitting $(FLAGS) $(SOURCES) $(LIBS)

clean:
rm *.o

clobber:
rm NLSsplitting

該示例假設您使用的是 Flux 並且已載入 GCC 編譯器以及 GCC 編譯的 FFTW 版本的環境。要使用 Intel 編譯器使用此程式碼，需要明確設定 OMP 棧大小以確保足夠大。如果您使用的是 PGI 編譯器而不是 GCC 編譯器，請將 makefile 中的標誌 $-fopenmp$ 更改為 $-mp$ 。

( I)

一個 Matlab 程式，它繪製清單 G 生成的對二維非線性薛定諤方程的數值解程式碼下載

% A program to plot the computed results for the 2D NLS equation

clear all; format compact, format short,
set(0,'defaultaxesfontsize',18,'defaultaxeslinewidth',.9,...
    'defaultlinelinewidth',3.5,'defaultpatchlinewidth',5.5);

% Load data
load('./ufinal.dat');
load('./tdata.dat');
load('./ycoord.dat');
load('./xcoord.dat');

Ny = length(ycoord); Nx = length(xcoord); Nt = length(tdata);
 
ufinal = reshape(ufinal,Nx,Ny);

% Plot data
figure(3); clf; mesh(xcoord,ycoord,ufinal); xlabel x; ylabel y; zlabel('|u|^2');

( J)

一個在 Flux 上使用的示例提交指令碼。將 {your_username} 替換為您自己的使用者名稱。程式碼下載

#!/bin/bash
#PBS -N NLS
#PBS -l nodes=1:ppn=2,walltime=00:03:00
#PBS -q flux
#PBS -l qos=math471f11_flux
#PBS -A math471f11_flux
#PBS -M your_username@umich.edu
#PBS -m abe
#PBS -V
#
# Create a local directory to run and copy your files to local.
# Let PBS handle your output
cp ${HOME}/parallelspectralintro/NLSsplitting /nobackup/your_username/NLSsplitting
cd /nobackup/your_username

export OMP_NUM_THREADS=2
./NLSsplitting

練習

下載與 Klein、Muite 和 Roidot 預印本^[23] 附帶的示例 Matlab 程式。檢查這些薛定諤型方程的質量和能量是如何計算的。在非線性薛定諤方程的 Matlab 程式中新增程式碼以檢查質量和能量的守恆性。
Gross-Pitaevskii 方程^[24] 由 $i\psi _{t}+|\psi |^{2}\psi +V(\mathbf {x} )\psi =0$ 給出，其中我們將取 $V(\mathbf {x} )=\lVert \mathbf {x} \rVert _{l^{2}}^{2}=\sum _{k=1}^{N}x_{k}^{2}$ ，其中 $N$ 是空間維度。證明該方程可以透過將其拆分為以下兩個方程來求解

$i\psi _{t}+\Delta \psi =0$

( 5)

和

$i\psi _{t}+|\psi |^{2}\psi +V(\mathbf {x} )\psi =0$ .

( 6)

請務必解釋如何求解公式 5 和 6 。
修改Matlab程式碼以求解一維、二維和三維的Gross-Pitaevskii方程。
修改序列Fortran程式碼以求解一維、二維和三維的Gross-Pitaevskii方程。

清單 J 和 K 給出了一種並行化OpenMP程式的替代方法。使清單 G 中的程式儘可能高效，並儘可能類似於 J 中的程式，但不要更改並行化策略。比較兩個不同程式的速度。嘗試改變使用的網格點數和核心數。哪段程式碼在您的系統上更快？您認為這是為什麼？

( J)

一個使用分裂方法求解二維非線性薛定諤方程的OpenMP Fortran程式程式碼下載

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves nonlinear Schrodinger equation in 2 dimensions
! i*u_t+Es*|u|^2u+u_{xx}+u_{yy}=0
! using a second order time spectral splitting scheme
!
! The boundary conditions are u(x=0,y)=u(2*Lx*\pi,y),
! u(x,y=0)=u(x,y=2*Ly*\pi)
! The initial condition is u=exp(-x^2-y^2)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Lx = width of box in x direction
! Ly = width of box in y direction
! ES = +1 for focusing and -1 for defocusing
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! planfx = Forward 1d fft plan in x
! planbx = Backward 1d fft plan in x
! planfy = Forward 1d fft plan in y
! planby = Backward 1d fft plan in y
! dt = timestep
! .. Arrays ..
! u = approximate solution
! v = Fourier transform of approximate solution
! unax = temporary field
! vnax = temporary field
! vnbx = temporary field
! vnay = temporary field
! vnby = temporary field
! potx = potential
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! x = x locations
! y = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfx = array to setup x Fourier transform
! fftbx = array to setup x Fourier transform
! fftfy = array to setup y Fourier transform
! fftby = array to setup y Fourier transform
!
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)
! OpenMP library

PROGRAM main
USE omp_lib	
IMPLICIT NONE	
! Declare variables
INTEGER(kind=4), PARAMETER :: Nx=2**8	
INTEGER(kind=4), PARAMETER :: Ny=2**8	
INTEGER(kind=4), PARAMETER	:: Nt=20	
INTEGER(kind=4), PARAMETER	:: plotgap=5	
REAL(kind=8), PARAMETER	:: &
pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER	:: Lx=2.0d0	
REAL(kind=8), PARAMETER	:: Ly=2.0d0	
REAL(kind=8), PARAMETER	:: Es=0.0d0	
REAL(kind=8)	:: dt=0.10d0/Nt	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: kx,ky
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: x,y
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE:: unax,vnax,vnbx,potx
COMPLEX(kind=8), DIMENSION(:,:), ALLOCATABLE:: vnay,vnby
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: time
INTEGER(kind=4)	:: i,j,k,n,allocatestatus
INTEGER(kind=4)	:: start, finish, count_rate
INTEGER(kind=8), PARAMETER :: FFTW_IN_PLACE=8, FFTW_MEASURE=0,&
FFTW_EXHAUSTIVE=8, FFTW_PATIENT=32,&
                    FFTW_ESTIMATE=64
INTEGER(kind=8),PARAMETER :: FFTW_FORWARD=-1, FFTW_BACKWARD=1	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE :: fftfx,fftbx,fftfy,fftby
INTEGER(kind=8)	:: planfx,planbx,planfy,planby
    CHARACTER*100	:: name_config

ALLOCATE(kx(1:Nx),ky(1:Nx),x(1:Nx),y(1:Nx),unax(1:Nx,1:Ny),&
vnax(1:Nx,1:Ny),vnbx(1:Nx,1:Ny),potx(1:Nx,1:Ny),fftfx(1:Nx),&
fftbx(1:Nx),fftfy(1:Nx),fftby(1:Nx),vnay(1:Ny,1:Nx),&
vnby(1:Ny,1:Nx),time(1:1+Nt/plotgap),stat=allocatestatus)	
IF (allocatestatus .ne. 0) stop
PRINT *,'allocated memory'
! set up ffts
CALL dfftw_plan_dft_1d_(planfx,Nx,fftfx(1:Nx),fftbx(1:Nx),&
FFTW_FORWARD,FFTW_ESTIMATE)
  CALL dfftw_plan_dft_1d_(planbx,Nx,fftbx(1:Nx),fftfx(1:Nx),&
  FFTW_BACKWARD,FFTW_ESTIMATE)
CALL dfftw_plan_dft_1d_(planfy,Ny,fftfy(1:Ny),fftby(1:Ny),&
FFTW_FORWARD,FFTW_ESTIMATE)
  CALL dfftw_plan_dft_1d_(planby,Ny,fftby(1:Ny),fftfy(1:Ny),&
  FFTW_BACKWARD,FFTW_ESTIMATE)	
PRINT *,'Setup FFTs'

! setup fourier frequencies
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
!$OMP END PARALLEL DO
kx(1+Nx/2)=0.0d0
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
!$OMP END PARALLEL DO
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
   DO i=1,Nx
x(i)=(-1.0d0+2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)) )*pi*Lx
END DO
!$OMP END PARALLEL DO
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,1+Ny/2
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
!$OMP END PARALLEL DO
ky(1+Ny/2)=0.0d0
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
!$OMP END PARALLEL DO
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
   DO j=1,Ny
y(j)=(-1.0d0+2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)) )*pi*Ly
END DO
!$OMP END PARALLEL DO
PRINT *,'Setup grid and fourier frequencies'
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
unax(i,j)=exp(-1.0d0*(x(i)**2 +y(j)**2))
END DO
END DO
!$OMP END PARALLEL DO
name_config = 'uinitial.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
DO i=1,Nx
WRITE(11,*) abs(unax(i,j))**2
END DO
END DO
CLOSE(11)
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
DO i=1,Nx
CALL dfftw_execute_dft_(planfx,unax(i,j),vnax(i,j))
END DO
END DO
!$OMP END PARALLEL DO
vnay(1:Ny,1:Nx)=TRANSPOSE(vnax(1:Nx,1:Ny))
! transform initial data and do first half time step
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO i=1,Nx
CALL dfftw_execute_dft_(planfy,vnay(1:Ny,i),vnby(1:Ny,i))
DO j=1,Ny
vnby(j,i)=exp(0.5d0*dt*(kx(i)*kx(i) + ky(j)*ky(j))&
*cmplx(0.0d0,1.0d0))*vnby(j,i)
END DO
CALL dfftw_execute_dft_(planby,vnby(j,i),vnay(j,i))
END DO
!$OMP END PARALLEL DO
PRINT *,'Got initial data, starting timestepping'
time(1)=0.0d0
CALL system_clock(start,count_rate)	
DO n=1,Nt	
vnbx(1:Nx,1:Ny)=TRANSPOSE(vnay(1:Ny,1:Nx))/REAL(Ny,kind(0d0))	
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
CALL dfftw_execute_dft_(planbx,vnbx(1:Nx,j),unax(1:Nx,j))
DO i=1,Nx
unax(i,j)=unax(1:Nx,j)/REAL(Nx,kind(0d0))
potx(i,j)=Es*unax(i,j)*conjg(unax(i,j))
unax(i,j)=exp(cmplx(0.0d0,-1.0d0)*dt*potx(i,j))&
*unax(i,j)
END DO
CALL dfftw_execute_dft_(planfx,unax(1:Nx,j),vnax(1:Nx,j))
END DO
!$OMP END PARALLEL DO
vnby(1:Ny,1:Nx)=TRANSPOSE(vnax(1:Nx,1:Ny))	
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO i=1,Nx
CALL dfftw_execute_dft_(planfy,vnby(1:Ny,i),vnay(1:Ny,i))
DO j=1,Ny
vnby(j,i)=exp(dt*(kx(i)*kx(i) + ky(j)*ky(j))&	
*cmplx(0.0d0,1.0d0))*vnay(j,i)	
END DO
CALL dfftw_execute_dft_(planby,vnby(1:Ny,i),vnay(1:Ny,i))
END DO
!$OMP END PARALLEL DO
IF (mod(n,plotgap)==0) then
time(1+n/plotgap)=n*dt
PRINT *,'time',n*dt
END IF
END DO
PRINT *,'Finished time stepping'
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),&
'for Time stepping'

! transform back final data and do another half time step
vnbx(1:Nx,1:Ny)=transpose(vnay(1:Ny,1:Nx))/REAL(Ny,kind(0d0))	
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
CALL dfftw_execute_dft_(planbx,vnbx(1:Nx,j),unax(1:Nx,j))
unax(1:Nx,j)=unax(1:Nx,j)/REAL(Nx,kind(0d0))
potx(1:Nx,j)=Es*unax(1:Nx,j)*conjg(unax(1:Nx,j))
unax(1:Nx,j)=exp(cmplx(0,-1)*dt*potx(1:Nx,j))*unax(1:Nx,j)
CALL dfftw_execute_dft_(planfx,unax(1:Nx,j),vnax(1:Nx,j))
END DO
!$OMP END PARALLEL DO
vnby(1:Ny,1:Nx)=TRANSPOSE(vnax(1:Nx,1:Ny))	
!$OMP PARALLEL DO PRIVATE(i) SCHEDULE(static)
DO i=1,Nx
CALL dfftw_execute_dft_(planfy,vnby(1:Ny,i),vnay(1:Ny,i))
vnby(1:Ny,i)=exp(0.5d0*dt*(kx(i)*kx(i) + ky(1:Ny)*ky(1:Ny))&
*cmplx(0,1))*vnay(1:Ny,i)	
CALL dfftw_execute_dft_(planby,vnby(1:Ny,i),vnay(1:Ny,i))
END DO
!$OMP END PARALLEL DO
vnbx(1:Nx,1:Ny)=TRANSPOSE(vnay(1:Ny,1:Nx))/REAL(Ny,kind(0d0))	
!$OMP PARALLEL DO PRIVATE(j) SCHEDULE(static)
DO j=1,Ny
CALL dfftw_execute_dft_(planbx,vnbx(1:Nx,j),unax(1:Nx,j))
unax(1:Nx,j)=unax(1:Nx,j)/REAL(Nx,kind(0d0))
END DO	
!$OMP END PARALLEL DO
name_config = 'ufinal.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
DO i=1,Nx
WRITE(11,*) abs(unax(i,j))**2
END DO
END DO
CLOSE(11)

name_config = 'tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt/plotgap
WRITE(11,*) time(j)
END DO
CLOSE(11)

name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)

name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
PRINT *,'Saved data'

CALL dfftw_destroy_plan_(planbx)
CALL dfftw_destroy_plan_(planfx)
CALL dfftw_destroy_plan_(planby)
CALL dfftw_destroy_plan_(planfy)
CALL dfftw_cleanup_()

DEALLOCATE(unax,vnax,vnbx,potx, vnay,vnby,stat=allocatestatus)	
IF (allocatestatus .ne. 0) STOP
PRINT *,'Deallocated memory'

PRINT *,'Program execution complete'
END PROGRAM main

( K)

一個用於編譯清單 J 中OpenMP程式的示例makefile 程式碼下載

#define the complier
COMPILER = gfortran
# compilation settings, optimization, precision, parallelization
FLAGS = -O0 -fopenmp


# libraries
LIBS = -L/usr/local/lib -lfftw3 -lm
# source list for main program
SOURCES = NLSsplitting.f90

test: $(SOURCES)
${COMPILER} -o NLSsplitting $(FLAGS) $(SOURCES) $(LIBS)

clean:
rm *.o

clobber:
rm NLSsplitting

此示例假設您使用的是Flux，並且已載入了intel編譯器以及Intel編譯版本的FFTW的環境。如果您使用的是免費的GCC編譯器而不是Intel編譯器，請將makefile中的標誌

-openmp

更改為

-fopenmp

。

修改OpenMP Fortran程式碼以求解二維和三維的Gross-Pitaevskii方程。
^[25] 一些等離子體的量子流體動力學模型與非線性薛定諤方程非常相似，也可以使用分裂方法進行數值近似。Eliasson和Shukla^[26] 使用的一個等離子體模型是

$i\Psi _{t}+\Delta \Psi +\phi \Psi -|\Psi |^{4/D}\Psi =0$

( 8)

和

$\Delta \phi =|\Psi |^{2}-1$

( 9)

其中 $\Psi$ 是有效波函式， $\phi$ 是靜電勢， $D$ 是維度，通常為 1、2 或 3。此方程可以以與Klein、Muite和Roidot^[27] 中的Davey-Stewartson方程類似的方式求解。具體來說，首先求解

$iP_{t}+\Delta P=0$

( 10)

使用傅立葉變換，使得
$P(\delta t)=\exp \left(-i\Delta \delta t\right)P(0)$
其中 $P(0)=\Psi (0)$ 。然後求解

$\phi =\Delta ^{-1}\left(|P|^{2}-1\right)$

( 11)

使用傅立葉變換。最後，求解

$iQ_{t}+\phi Q-|Q|^{4/D}Q=0$

( 12)

利用在每個網格點上 $\phi -|Q|^{4/D}$ 是一個常數，所以解是 $Q(\delta t)=\exp \left[i\left(\phi -|\Phi |^{4/D}\right)\delta t\right]Q(0)$ 其中 $Q(0)=P(\delta t)$ 以及 $\Psi (\delta t)\approx Q(\delta t)$ .

^[28] 運算元分裂法可以用於除非線性薛定諤方程以外的其他方程。另一個可以使用運算元分裂法的方程是復 Ginzburg-Landau 方程 ${\frac {\partial A}{\partial t}}=A+(1+i\alpha )\Delta A-(1+i\beta )|A|^{2}A,$ 其中 $A$ 是一個複函式，通常是一個、兩個或三個變數的函式。基於 Blanes、Casa、Chartier 和 Miura 早期有限差分程式碼，列表 L 提供了一個一維程式碼示例，該示例使用了 Blanes 等人^[29] 描述的方法。透過使用復係數，Blanes 等人^[30] 可以為拋物線方程建立高階分裂方法。以前的嘗試失敗了，因為如果只使用實係數，則對於高於二階的方法，所需的向後步驟會導致數值不穩定。修改示例程式碼，以解決一維、二維和三維空間中的復 Ginzburg-Landau 方程。線性部分 ${\frac {\partial A}{\partial t}}=A+(1+i\alpha )\Delta A$ 可以使用傅立葉變換顯式求解。要解決非線性部分， ${\frac {\partial A}{\partial t}}=-(1+i\beta )|A|^{2}A$ 考慮 ${\frac {\partial |A|^{2}}{\partial t}}={\frac {\partial A}{\partial t}}A^{*}+{\frac {\partial A^{*}}{\partial t}}A=2|A|^{4}$ 並精確地求解 $|A|^{2}$ 。為了恢復相位，觀察 ${\frac {\partial \log(A)}{\partial t}}=-(1+i\beta )|A|^{2}$ ，它也可以顯式積分，因為 $|A|^{2}(t)$ 是已知的。

( L)

一個使用 16 階分裂法來解決三次非線性薛定諤方程的 Matlab 程式程式碼下載

% A program to solve the nonlinear Schr\"{o}dinger equation using a
% splitting method. The numerical solution is compared to an exact
% solution.
% S. Blanes, F. Casas, P. Chartier and A. Murua
% "Optimized high-order splitting methods for some classes of parabolic
% equations"
% ArXiv pre-print 1102.1622v2
% Forthcoming Mathematics of Computation

clear all; format compact; format short;
set(0,'defaultaxesfontsize',30,'defaultaxeslinewidth',.7,...
    'defaultlinelinewidth',6,'defaultpatchlinewidth',3.7,...
    'defaultaxesfontweight','bold')

% set up grid
Lx = 20; % period 2*pi * L
Nx = 16384; % number of harmonics
Nt = 2000; % number of time slices
dt = 0.25*pi/Nt;% time step
U=zeros(Nx,Nt/10);
method=3; % splitting method: 1 Strang, 2 CCDV10, 3 Blanes et al 2012

% initialise variables
x = (2*pi/Nx)*(-Nx/2:Nx/2 -1)'*Lx; % x coordinate
kx = 1i*[0:Nx/2-1 0 -Nx/2+1:-1]'/Lx; % wave vector

% initial conditions
t=0; tdata(1)=t;
u=4*exp(1i*t)*(cosh(3*x)+3*exp(8*1i*t)*cosh(x))...
    ./(cosh(4*x)+4*cosh(2*x)+3*cos(8*t));
v=fft(u);
figure(1); clf; plot(x,u);xlim([-2,2]); drawnow;
U(:,1)=u;

% mass
ma = fft(abs(u).^2);
ma0 = ma(1);

if method==1,
    %
    % Strang-Splitting
    %
    s=2;
    a=[1;0];
    b=[1/2;1/2];
    %
elseif method==2,
    %
    % Method of Castella, Chartier, Descombes and Vilmart
    % BIT Numerical Analysis vol 49 pp 487-508, 2009
    %
    s=5;
    a=[1/4;1/4;1/4;1/4;0];
    b=[1/10-1i/30;4/15+2*1i/15;4/15-1i/5;4/15+2*1i/15;1/10-1i/30];
    %
elseif method==3,
    %
    % Method of Blanes, Casas, Chartier and Murua 2012
    %
    s=17;
    a=1/16*[1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;0];
    b=[0.028920177910074098791 - 0.005936580835725746103*1i;
       0.056654351383649876160 + 0.020841963949772627119*1i;
       0.067258385822722143569 - 0.039386393748812362460*1i;
       0.070333980553260772061 + 0.058952097930307840316*1i;
       0.077095100838099173580 - 0.038247636602014810025*1i;
       0.042022140317231098258 - 0.033116379859951038579*1i;
       0.050147397749937784280 + 0.061283684958324249562*1i;
       0.047750191909146143447 - 0.032332468814362628262*1i;
       0.119636547031757819706 + 0.015883426044923736862*1i;
       0.047750191909146143447 - 0.032332468814362628262*1i;
       0.050147397749937784280 + 0.061283684958324249562*1i;
       0.042022140317231098258 - 0.033116379859951038579*1i;
       0.077095100838099173580 - 0.038247636602014810025*1i;
       0.070333980553260772061 + 0.058952097930307840316*1i;
       0.067258385822722143569 - 0.039386393748812362460*1i;
       0.056654351383649876160 + 0.020841963949772627119*1i;
       0.028920177910074098791 - 0.005936580835725746103*1i];
end;


% solve pde and plot results
for n =2:Nt+1
    for m=1:(s-1)
        vna=exp(b(m)*1i*dt*kx.*kx).*v;
        una=ifft(vna);
        pot=(2*una.*conj(una));
        unb=exp(-1i*a(m)*(-1)*dt*pot).*una;
        v=fft(unb);
    end
    v=exp(b(s)*1i*dt*kx.*kx).*v;
    u=ifft(v);
    t=(n-1)*dt;
    if (mod(n,10)==0)
        tdata(n/10)=t;
        u=ifft(v);
        U(:,n/10)=u;
        uexact=...
            4*exp(1i*t)*(cosh(3*x)+3*exp(8*1i*t)*cosh(x))...
            ./(cosh(4*x)+4*cosh(2*x)+3*cos(8*t));
        figure(1); clf; plot(x,abs(u).^2); ...
            xlim([-0.5,0.5]); title(num2str(t));
        figure(2); clf; loglog(abs(v(1:Nx/2))); ...
            title('Fourier Coefficients');
        figure(3); clf; plot(x,abs(u-uexact).^2); ...
            xlim([-0.5,0.5]); title('error');
        drawnow;
        ma = fft(abs(u).^2);
        ma = ma(1);
        test = log10(abs(1-ma/ma0))
    end
end
figure(4); clf; mesh(tdata(1:(n-1)/10),x,abs(U(:,1:(n-1)/10)).^2);
xlim([0,t]);

分散式記憶體並行：MPI

在本節中，我們將使用從 http://www.2decomp.org/index.html 獲得的 2DECOMP&FFT 庫。該網站包含一些示例，說明了如何使用此庫，特別是 http://www.2decomp.org/case_study1.html 中的示例程式碼，非常有助於說明如何將使用 FFTW 的程式碼轉換為使用 MPI 和上述庫的程式碼。

在使用 2DECOMP&FFT 建立並行 MPI 程式碼之前，我們將生成一個使用分裂法來解決 3D 非線性薛定諤方程的序列 Fortran 程式碼。我們不會使用基於迴圈的並行化來執行一系列一維快速傅立葉變換，而是使用 FFTW 的三維 FFT，這樣序列版本和 MPI 並行版本就具有相同的結構。序列版本在列表 E 中。此檔案可以透過與 Fortran 程式章節中的列表 A 中相同的方式進行編譯。

( M)

一個使用分裂法和 FFTW 解決 3D 非線性薛定諤方程的 Fortran 程式程式碼下載

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves nonlinear Schrodinger equation in 3 dimensions
! i*u_t+Es*|u|^2u+u_{xx}+u_{yy}+u_{zz}=0
! using a second order time spectral splitting scheme
!
! The boundary conditions are u(x=0,y,z)=u(2*Lx*\pi,y,z),
! u(x,y=0,z)=u(x,y=2*Ly*\pi,z), u(x,y,z=0)=u(x,y,z=2*Lz*\pi)
! The initial condition is u=exp(-x^2-y^2)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! FFTW_IN_PLACE = value for FFTW input
! FFTW_MEASURE = value for FFTW input
! FFTW_EXHAUSTIVE = value for FFTW input
! FFTW_PATIENT = value for FFTW input
! FFTW_ESTIMATE = value for FFTW input
! FFTW_FORWARD = value for FFTW input
! FFTW_BACKWARD = value for FFTW input
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Lx = width of box in x direction
! Ly = width of box in y direction
! Lz = width of box in z direction
! ES = +1 for focusing and -1 for defocusing
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! count = keep track of information written to disk
! iol = size of array to write to disk
! planfxyz = Forward 3d fft plan
! planbxyz = Backward 3d fft plan
! dt = timestep
! modescalereal = Number to scale after backward FFT
! ierr = error code
! .. Arrays ..
! unax = approximate solution
! vnax = Fourier transform of approximate solution
! potx = potential
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! x = x locations
! y = y locations
! time = times at which save data
! name_config = array to store filename for data to be saved
! fftfxy = array to setup 2D Fourier transform
! fftbxy = array to setup 2D Fourier transform
!
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! FFTW3 -- Fast Fourier Transform in the West Library
! (http://www.fftw.org/)
PROGRAM main
! Declare variables
IMPLICIT NONE
INTEGER(kind=4), PARAMETER	:: Nx=2**5	
INTEGER(kind=4), PARAMETER :: Ny=2**5	
INTEGER(kind=4), PARAMETER :: Nz=2**5	
INTEGER(kind=4), PARAMETER	:: Nt=50	
INTEGER(kind=4), PARAMETER	:: plotgap=10
REAL(kind=8), PARAMETER	::&
pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8), PARAMETER	:: Lx=2.0d0,Ly=2.0d0,Lz=2.0d0	
REAL(kind=8), PARAMETER	:: Es=1.0d0	
REAL(kind=8)	:: dt=0.10d0/Nt	
REAL(kind=8) :: modescalereal
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: kx,ky,kz
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: x,y,z
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE	:: unax,vnax,potx
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: time
INTEGER(kind=4)	:: i,j,k,n,AllocateStatus,count,iol
! timing
INTEGER(kind=4)	:: start, finish, count_rate
   ! fftw variables
INTEGER(kind=8), PARAMETER	:: FFTW_IN_PLACE = 8, FFTW_MEASURE = 0, &
FFTW_EXHAUSTIVE = 8, FFTW_PATIENT = 32, FFTW_ESTIMATE = 64
INTEGER(kind=8),PARAMETER	:: FFTW_FORWARD = -1, FFTW_BACKWARD=1	
INTEGER(kind=8)	::	planfxyz,planbxyz
CHARACTER*100	::	name_config, number_file

CALL system_clock(start,count_rate)	
ALLOCATE(unax(1:Nx,1:Ny,1:Nz),vnax(1:Nx,1:Ny,1:Nz),potx(1:Nx,1:Ny,1:Nz),&
    kx(1:Nx),ky(1:Ny),kz(1:Nz),x(1:Nx),y(1:Ny),z(1:Nz),&
    time(1:1+Nt/plotgap),stat=AllocateStatus)	
IF (AllocateStatus .ne. 0) STOP
PRINT *,'allocated space'
modescalereal=1.0d0/REAL(Nx,KIND(0d0))
modescalereal=modescalereal/REAL(Ny,KIND(0d0))
modescalereal=modescalereal/REAL(Nz,KIND(0d0))

! set up ffts
CALL dfftw_plan_dft_3d_(planfxyz,Nx,Ny,Nz,unax(1:Nx,1:Ny,1:Nz),&
vnax(1:Nx,1:Ny,1:Nz),FFTW_FORWARD,FFTW_ESTIMATE)
  CALL dfftw_plan_dft_3d_(planbxyz,Nx,Ny,Nz,vnax(1:Nx,1:Ny,1:Nz),&
  unax(1:Nx,1:Ny,1:Nz),FFTW_BACKWARD,FFTW_ESTIMATE)

PRINT *,'Setup FFTs'

! setup fourier frequencies and grid points
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
x(i)=(-1.0d0+2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)) )*pi*Lx
END DO
DO j=1,1+Ny/2
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
DO j=1,Ny
y(j)=(-1.0d0+2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)) )*pi*Ly
END DO
DO k=1,1+Nz/2
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
DO k=1,Nz
z(k)=(-1.0d0+2.0d0*REAL(k-1,kind(0d0))/REAL(Nz,kind(0d0)) )*pi*Lz
END DO

PRINT *,'Setup grid and fourier frequencies'

DO k=1,Nz;	DO j=1,Ny; DO i=1,Nx
unax(i,j,k)=exp(-1.0d0*(x(i)**2 +y(j)**2+z(k)**2))
END DO; END DO; END DO
name_config = 'uinitial.dat'
INQUIRE(iolength=iol) unax(1,1,1)
OPEN(unit=11,FILE=name_config,form="unformatted", &
access="direct",recl=iol)
count=1
DO k=1,Nz; DO j=1,Ny; DO i=1,Nx
WRITE(11,rec=count) unax(i,j,k)
        count=count+1
END DO; END DO; END DO
CLOSE(11)

CALL dfftw_execute_dft_(planfxyz,unax(1:Nx,1:Ny,1:Nz),vnax(1:Nx,1:Ny,1:Nz))

PRINT *,'Got initial data, starting timestepping'
time(1)=0
DO n=1,Nt
DO k=1,Nz; DO j=1,Ny; DO i=1,Nx
vnax(i,j,k)=exp(0.50d0*dt*&
(kz(k)*kz(k) + kx(i)*kx(i) + ky(j)*ky(j))&
*cmplx(0.0d0,1.0d0))*vnax(i,j,k)
END DO; END DO; END DO
CALL dfftw_execute_dft_(planbxyz,vnax(1:Nx,1:Ny,1:Nz),&
unax(1:Nx,1:Ny,1:Nz))

DO k=1,Nz; DO j=1,Ny; DO i=1,Nx
unax(i,j,k)=unax(i,j,k)*modescalereal
potx(i,j,k)=Es*unax(i,j,k)*conjg(unax(i,j,k))
unax(i,j,k)=exp(cmplx(0.0d0,-1.0d0)*dt*potx(i,j,k))&
*unax(i,j,k)
END DO; END DO; END DO
CALL dfftw_execute_dft_(planfxyz,unax(1:Nx,1:Ny,1:Nz),&
vnax(1:Nx,1:Ny,1:Nz))

DO k=1,Nz; DO j=1,Ny; DO i=1,Nx
vnax(i,j,k)=exp(0.5d0*dt*&
(kx(i)*kx(i) + ky(j)*ky(j)+ kz(k)*kz(k))&
*cmplx(0.0d0,1.0d0))*vnax(i,j,k)	
END DO; END DO; END DO
IF (mod(n,plotgap)==0) THEN
time(1+n/plotgap)=n*dt
PRINT *,'time',n*dt
CALL dfftw_execute_dft_(planbxyz,vnax(1:Nx,1:Ny,1:Nz),unax(1:Nx,1:Ny,1:Nz))
DO k=1,Nz; DO j=1,Ny; DO i=1,Nx
unax(i,j,k)=unax(i,j,k)*modescalereal
END DO; END DO; END DO
name_config='./data/u'
WRITE(number_file,'(i0)') 10000000+1+n/plotgap
ind=index(name_config,' ') -1
name_config=name_config(1:ind)//numberfile
ind=index(name_config,' ') -1
name_config=name_config(1:ind)//'.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
DO i=1,Nx
WRITE(11,*) abs(unax(i,j))**2
END DO
END DO
CLOSE(11)

END IF
END DO
PRINT *,'Finished time stepping'

! transform back final data and do another half time step
CALL system_clock(finish,count_rate)
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),'for execution'

name_config = 'tdata.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt/plotgap
WRITE(11,*) time(j)
END DO
CLOSE(11)

name_config = 'xcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)

name_config = 'ycoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)

name_config = 'zcoord.dat'
OPEN(unit=11,FILE=name_config,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'

CALL dfftw_destroy_plan_(planbxyz)
CALL dfftw_destroy_plan_(planfxyz)
CALL dfftw_cleanup_()

DEALLOCATE(unax,vnax,potx,&
    kx,ky,kz,x,y,z,&
    time,stat=AllocateStatus)	
IF (AllocateStatus .ne. 0) STOP
PRINT *,'Program execution complete'
END PROGRAM main

與之前的程式相比，列表 E 中的程式將其最終資料作為二進位制檔案寫入。這通常比寫入文字檔案快得多，並且生成的二進位制檔案的大小通常也小得多。當寫入多個此類檔案時，或者當單個檔案很大時，這一點很重要。由於格式發生了變化，二進位制檔案也需要使用稍微不同的方式讀入。列表 N 中的 Matlab 指令碼展示瞭如何做到這一點。

( N)

一個繪製列表 E 或 O 生成的 3D 非線性薛定諤方程數值解的 Matlab 程式程式碼下載

% A program to plot the computed results

clear all; format compact, format short,
set(0,'defaultaxesfontsize',18,'defaultaxeslinewidth',.9,...
    'defaultlinelinewidth',3.5,'defaultpatchlinewidth',5.5);

% Load data
tdata=load('./tdata.dat');
x=load('./xcoord.dat');
y=load('./ycoord.dat');
z=load('./zcoord.dat');
Tsteps = length(tdata);

Nx = length(x); Nt = length(tdata);
Ny = length(y); Nz = length(z);
fid=fopen('./ufinal.datbin','r');
[fname,mode,mformat]=fopen(fid);
u=fread(fid,Nx*Ny*Nz,'double',mformat);
u = reshape(u,Nx,Ny,Nz);

% Plot data
figure (1); clf ; UP = abs(u).^2;
p1 = patch(isosurface(x,y,z,UP,.0025) ,...
'FaceColor','yellow','EdgeColor','none');
p2 = patch(isocaps(x,y,z,UP,.0025) ,...
'FaceColor','interp','EdgeColor','none');
isonormals(UP,p1); lighting phong;
xlabel('x'); ylabel('y'); zlabel('z');
axis equal; axis square; view(3); drawnow;

現在，我們將修改上述程式碼，以使用 MPI 和 2DECOMP&FFT 庫。2DECOMP&FFT 庫隱藏了大多數 MPI 的細節，儘管有一些命令對使用者理解非常有用。這些命令是

USE mpi或INCLUDE 'mpif.h'
MPI_INIT
MPI_COMM_SIZE
MPI_COMM_RANK
MPI_FINALIZE

程式在清單 O 中列出，請將其與 E 中的序列程式碼進行比較。庫 2DECOMP&FFT 對陣列進行域分解，以便將陣列的不同部分放到不同的處理器上。即使陣列儲存在不同的處理器上，該庫也可以對陣列執行傅立葉變換——該庫執行執行傅立葉變換所需的所有必要訊息傳遞和轉置。需要注意的是，傅立葉變換後陣列中條目順序不一定與 FFTW 使用的順序相同。但是，結構將返回條目的正確排序decomp因此，此結構用於獲取迴圈的起始和結束條目。我們假設庫 2DECOMP&FFT 已安裝在適當的位置。

( O)

使用分裂和 2DECOMP&FFT 求解 3D 非線性薛定諤方程的 Fortran 程式程式碼下載

!--------------------------------------------------------------------
!
!
! PURPOSE
!
! This program solves nonlinear Schrodinger equation in 3 dimensions
! i*u_t+Es*|u|^2u+u_{xx}+u_{yy}+u_{zz}=0
! using a second order time spectral splitting scheme
!
! The boundary conditions are u(x=0,y,z)=u(2*Lx*\pi,y,z),
! u(x,y=0,z)=u(x,y=2*Ly*\pi,z), u(x,y,z=0)=u(x,y,z=2*Lz*\pi)
! The initial condition is u=exp(-x^2-y^2)
!
! .. Parameters ..
! Nx = number of modes in x - power of 2 for FFT
! Ny = number of modes in y - power of 2 for FFT
! Nz = number of modes in z - power of 2 for FFT
! Nt = number of timesteps to take
! Tmax = maximum simulation time
! plotgap = number of timesteps between plots
! pi = 3.14159265358979323846264338327950288419716939937510d0
! Lx = width of box in x direction
! Ly = width of box in y direction
! Lz = width of box in z direction
! ES = +1 for focusing and -1 for defocusing
! .. Scalars ..
! i = loop counter in x direction
! j = loop counter in y direction
! k = loop counter in z direction
! n = loop counter for timesteps direction
! allocatestatus = error indicator during allocation
! start = variable to record start time of program
! finish = variable to record end time of program
! count_rate = variable for clock count rate
! dt = timestep
! modescalereal = Number to scale after backward FFT
! myid = Process id
! ierr = error code
! p_row = number of rows for domain decomposition
! p_col = number of columns for domain decomposition
! filesize = total filesize
! disp = displacement to start writing data from
! ind = index in array to write
! plotnum = number of plot to save
! numberfile = number of the file to be saved to disk
! stat = error indicator when reading inputfile
! .. Arrays ..
! u = approximate solution
! v = Fourier transform of approximate solution
! pot = potential
! .. Vectors ..
! kx = fourier frequencies in x direction
! ky = fourier frequencies in y direction
! kz = fourier frequencies in z direction
! x = x locations
! y = y locations
! z = z locations
! time = times at which save data
! nameconfig = array to store filename for data to be saved
! InputFileName = name of the Input File
! .. Special Structures ..
! decomp = contains information on domain decomposition
! see http://www.2decomp.org/ for more information
!
! REFERENCES
!
! ACKNOWLEDGEMENTS
!
! ACCURACY
!
! ERROR INDICATORS AND WARNINGS
!
! FURTHER COMMENTS
! Check that the initial iterate is consistent with the
! boundary conditions for the domain specified
!--------------------------------------------------------------------
! External routines required
!
! External libraries required
! 2DECOMP&FFT -- Domain decomposition and Fast Fourier Library
! (http://www.2decomp.org/index.html)
! MPI library

PROGRAM main
USE decomp_2d
USE decomp_2d_fft
USE decomp_2d_io
USE MPI
! Declare variables
IMPLICIT NONE
INTEGER(kind=4)	:: Nx=2**5
INTEGER(kind=4) :: Ny=2**5
INTEGER(kind=4) :: Nz=2**5
INTEGER(kind=4)	:: Nt=50	
INTEGER(kind=4)	:: plotgap=10
REAL(kind=8), PARAMETER	::&
pi=3.14159265358979323846264338327950288419716939937510d0
REAL(kind=8)	:: Lx=2.0d0,Ly=2.0d0,Lz=2.0d0	
REAL(kind=8)	:: Es=1.0d0	
REAL(kind=8)	:: dt=0.0010d0	
COMPLEX(kind=8), DIMENSION(:), ALLOCATABLE	:: kx,ky,kz
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: x,y,z
COMPLEX(kind=8), DIMENSION(:,:,:), ALLOCATABLE	:: u,v,pot
REAL(kind=8), DIMENSION(:), ALLOCATABLE	:: time
INTEGER(KIND=4), DIMENSION(1:5)	:: intcomm
REAL(KIND=8), DIMENSION(1:5)	:: dpcomm
REAL(kind=8) :: modescalereal
INTEGER(kind=4)	:: i,j,k,n,AllocateStatus,stat
INTEGER(kind=4)	:: myid,numprocs,ierr
TYPE(DECOMP_INFO)	:: decomp
INTEGER(kind=MPI_OFFSET_KIND) :: filesize, disp
INTEGER(kind=4)	:: p_row=0, p_col=0	
  INTEGER(kind=4)	:: start, finish, count_rate, ind, plotnum
CHARACTER*50	::	nameconfig
CHARACTER*20	::	numberfile
! initialisation of MPI
CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

! initialisation of 2decomp
! do automatic domain decomposition
CALL decomp_2d_init(Nx,Ny,Nz,p_row,p_col)
! get information about domain decomposition choosen
CALL decomp_info_init(Nx,Ny,Nz,decomp)
! initialise FFT library
CALL decomp_2d_fft_init
ALLOCATE(u(decomp%xst(1):decomp%xen(1),&
    decomp%xst(2):decomp%xen(2),&
    decomp%xst(3):decomp%xen(3)),&
            v(decomp%zst(1):decomp%zen(1),&
             decomp%zst(2):decomp%zen(2),&
             decomp%zst(3):decomp%zen(3)),&
            pot(decomp%xst(1):decomp%xen(1),&
             decomp%xst(2):decomp%xen(2),&
             decomp%xst(3):decomp%xen(3)),&
    kx(1:Nx),ky(1:Ny),kz(1:Nz),&
    x(1:Nx),y(1:Ny),z(1:Nz),&
    time(1:1+Nt/plotgap),stat=AllocateStatus)	
IF (AllocateStatus .ne. 0) STOP

IF (myid.eq.0) THEN
PRINT *,'allocated space'
END IF
modescalereal=1.0d0/REAL(Nx,KIND(0d0))
modescalereal=modescalereal/REAL(Ny,KIND(0d0))
modescalereal=modescalereal/REAL(Nz,KIND(0d0))

! setup fourier frequencies and grid points
DO i=1,1+Nx/2
kx(i)= cmplx(0.0d0,1.0d0)*REAL(i-1,kind(0d0))/Lx
END DO
kx(1+Nx/2)=0.0d0
DO i = 1,Nx/2 -1
kx(i+1+Nx/2)=-kx(1-i+Nx/2)
END DO
DO i=1,Nx
x(i)=(-1.0d0 + 2.0d0*REAL(i-1,kind(0d0))/REAL(Nx,kind(0d0)))*pi*Lx
END DO
DO j=1,1+Ny/2
ky(j)= cmplx(0.0d0,1.0d0)*REAL(j-1,kind(0d0))/Ly
END DO
ky(1+Ny/2)=0.0d0
DO j = 1,Ny/2 -1
ky(j+1+Ny/2)=-ky(1-j+Ny/2)
END DO
DO j=1,Ny
y(j)=(-1.0d0 + 2.0d0*REAL(j-1,kind(0d0))/REAL(Ny,kind(0d0)))*pi*Ly
END DO
DO k=1,1+Nz/2
kz(k)= cmplx(0.0d0,1.0d0)*REAL(k-1,kind(0d0))/Lz
END DO
kz(1+Nz/2)=0.0d0
DO k = 1,Nz/2 -1
kz(k+1+Nz/2)=-kz(1-k+Nz/2)
END DO
DO k=1,Nz
z(k)=(-1.0d0 + 2.0d0*REAL(k-1,kind(0d0))/REAL(Nz,kind(0d0)))*pi*Lz
END DO

IF (myid.eq.0) THEN
PRINT *,'Setup grid and fourier frequencies'
END IF
DO k=decomp%xst(3),decomp%xen(3)
DO j=decomp%xst(2),decomp%xen(2)
DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=exp(-1.0d0*(x(i)**2 +y(j)**2+z(k)**2))
END DO
END DO
END DO

! write out using 2DECOMP&FFT MPI-IO routines
nameconfig='./data/u'
plotnum=0
WRITE(numberfile,'(i0)') 10000000+plotnum
ind=index(nameconfig,' ') -1
nameconfig=nameconfig(1:ind)//numberfile
ind=index(nameconfig,' ') -1
nameconfig=nameconfig(1:ind)//'.datbin'
CALL decomp_2d_write_one(1,u,nameconfig)
  
CALL decomp_2d_fft_3d(u,v,DECOMP_2D_FFT_FORWARD)
IF (myid.eq.0) THEN
PRINT *,'Got initial data, starting timestepping'
END IF
CALL system_clock(start,count_rate)	
time(1)=0
DO n=1,Nt
! Use Strang splitting
DO k=decomp%zst(3),decomp%zen(3)
DO j=decomp%zst(2),decomp%zen(2)
DO i=decomp%zst(1),decomp%zen(1)
v(i,j,k)=exp(0.50d0*dt*&
(kz(k)*kz(k) + kx(i)*kx(i) + ky(j)*ky(j))&
*cmplx(0.0d0,1.0d0))*v(i,j,k)
END DO
END DO
END DO

CALL decomp_2d_fft_3d(v,u,DECOMP_2D_FFT_BACKWARD)

DO k=decomp%xst(3),decomp%xen(3)
DO j=decomp%xst(2),decomp%xen(2)
DO i=decomp%xst(1),decomp%xen(1)
u(i,j,k)=u(i,j,k)*modescalereal
pot(i,j,k)=Es*u(i,j,k)*conjg(u(i,j,k))
u(i,j,k)=exp(cmplx(0.0d0,-1.0d0)*dt*pot(i,j,k))*u(i,j,k)
END DO
END DO
END DO
CALL decomp_2d_fft_3d(u,v,DECOMP_2D_FFT_FORWARD)

DO k=decomp%zst(3),decomp%zen(3)
DO j=decomp%zst(2),decomp%zen(2)
DO i=decomp%zst(1),decomp%zen(1)
v(i,j,k)=exp(dt*0.5d0*&
(kx(i)*kx(i) +ky(j)*ky(j) +kz(k)*kz(k))&
*cmplx(0.0d0,1.0d0))*v(i,j,k)	
END DO
END DO
END DO
IF (mod(n,plotgap)==0) THEN
time(1+n/plotgap)=n*dt
IF (myid.eq.0) THEN
PRINT *,'time',n*dt
END IF
CALL decomp_2d_fft_3d(v,u,DECOMP_2D_FFT_BACKWARD)
u=u*modescalereal
nameconfig='./data/u'
plotnum=plotnum+1
WRITE(numberfile,'(i0)') 10000000+plotnum
ind=index(nameconfig,' ') -1
nameconfig=nameconfig(1:ind)//numberfile
ind=index(nameconfig,' ') -1
nameconfig=nameconfig(1:ind)//'.datbin'
! write out using 2DECOMP&FFT MPI-IO routines
CALL decomp_2d_write_one(1,u,nameconfig)
END IF
END DO
IF (myid.eq.0) THEN
PRINT *,'Finished time stepping'
END IF

CALL system_clock(finish,count_rate)

IF (myid.eq.0) THEN
PRINT*,'Program took ',REAL(finish-start)/REAL(count_rate),'for execution'
END IF
IF (myid.eq.0) THEN	
! Save times at which output was made in text format
nameconfig = './data/tdata.dat'
OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")
REWIND(11)
DO j=1,1+Nt/plotgap
WRITE(11,*) time(j)
END DO
CLOSE(11)
! Save x grid points in text format
nameconfig = './data/xcoord.dat'
OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")
REWIND(11)
DO i=1,Nx
WRITE(11,*) x(i)
END DO
CLOSE(11)
! Save y grid points in text format
nameconfig = './data/ycoord.dat'
OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")
REWIND(11)
DO j=1,Ny
WRITE(11,*) y(j)
END DO
CLOSE(11)
! Save z grid points in text format
nameconfig = './data/zcoord.dat'
OPEN(unit=11,FILE=nameconfig,status="UNKNOWN")
REWIND(11)
DO k=1,Nz
WRITE(11,*) z(k)
END DO
CLOSE(11)
PRINT *,'Saved data'
END IF

! clean up
   CALL decomp_2d_fft_finalize
   CALL decomp_2d_finalize
  DEALLOCATE(u,v,pot,&
    kx,ky,kz,x,y,z,&
    time,stat=AllocateStatus)	
IF (AllocateStatus .ne. 0) STOP
IF (myid.eq.0) THEN
PRINT *,'Program execution complete'
END IF
CALL MPI_FINALIZE(ierr)	
END PROGRAM main

要編譯程式碼，請根據您的系統在清單 P 中相應地修改 makefile。鍵入 make，然後在編譯後您應該就可以執行它了。假設 FFTW 是 2DECOMP&FFT 呼叫的底層一維快速傅立葉變換庫，因此如果使用其他庫，則應更改它。請注意，如果您要更改問題大小或執行時間，則需要重新構建程式。

( P)

使用分裂和 2DECOMP&FFT 求解 3D 非線性薛定諤方程的並行 Fortran 程式的示例 makefile 程式碼下載

COMPILER =  mpif90 
decompdir=../2decomp_fft
FLAGS = -O0 
	
DECOMPLIB = -I${decompdir}/include -L${decompdir}/lib -l2decomp_fft
LIBS = -L${FFTW_LINK} -lfftw3_threads -lfftw3 -lm 
SOURCES = NLSsplitting.f90  

NLSsplitting: $(SOURCES)
		${COMPILER} -o NLSsplitting $(FLAGS) $(SOURCES) $(LIBS) $(DECOMPLIB) 


clean:
	rm -f *.o
	rm -f *.mod
clobber:	
	rm -f NLSsplitting

練習

ASCII 字元集需要每個字元 7 位，因此至少需要 7 位來儲存 0 到 9 之間的數字。IEEE 算術中的雙精度數需要 64 位來儲存一個雙精度數，大約有 15 位十進位制數和大約 3 位十進位制數指數。儲存 IEEE 雙精度數需要多少位？假設一個檔案有 $10^{6}$ 個雙精度數。如果數字儲存為 IEEE 雙精度數，則檔案的最小大小是多少？如果數字儲存為字元，則檔案的最小大小是多少？
使用 2DECOMP&FFT 編寫一個 MPI 程式碼來求解三維 Gross-Pitaevskii 方程。
學習使用 VisIt (https://wci.llnl.gov/codes/visit/) 或 Paraview (http://www.paraview.org/) 並編寫一個指令碼以類似於 Matlab 程式碼的方式視覺化二維和三維輸出。

注意

↑ Sulem 和 Sulem (1999)
↑ 楊 (2010)
↑ Evans (2010)
↑ Linares 和 Ponce (2009)
↑ Lieb 和 Loss (2003)
↑ Rennardy 和 Rogers (2004)
↑ Sulem 和 Sulem (1989)
↑ 為了簡化演示，我們主要考慮聚焦立方非線性薛定諤方程。
↑ Klein (2008)
↑ Holden 等人 (2011)
↑ McLachlan 和 Quispel (2002)
↑ Thalhammer
↑ Shen、Tang 和 Wang (2011)
↑ Weideman 和 Herbst (1986)
↑ 楊 (2010)
↑ Klein (2008)
↑ Holden 等人 (2011)
↑ Holden 等人 (2011)
↑ 也就是說 $\mathbf {A} \mathbf {B} =\mathbf {B} \mathbf {A}$ 。
↑ 可以使用指數函式的級數展開來推匯出它， $\exp(\mathbf {A} t)=\sum _{n=0}^{\infty }{\frac {(\mathbf {A} t)^{n}}{n!}}$ ，並從 $\exp(\mathbf {A} \delta t)\exp(\mathbf {B} \delta t).$ 中減去 $\exp((\mathbf {A} +\mathbf {B} )\delta t)$ 。
↑ Klein (2008)
↑ Thalhammer (2008)
↑ Klein、Muite 和 Roidot (2011)
↑ http://en.wikipedia.org/wiki/Gross\%E2\%80\%93Pitaevskii\_equation
↑
這個問題是由 Joshua Kirschenheiter 的專案引起的。
↑ Eliasson 和 Shukla (2009)
↑ Klein、Muite 和 Roidot (2011)
↑
這個問題是由 Kohei Harada 和 Matt Warnez 的專案引起的。
↑ Blanes 等人
↑ Blanes 等人

參考

Blanes, S.; Casas, F.; Chartier, P.; Murua, A. "Splitting methods with complex coefficients for some classes of evolution equations". Mathematics of Computation.

Shukla, P.K. (2009). "Nonlinear aspects of quantum plasma physics: Nanoplasmonics and nanostructures in dense plasmas". Plasma and Fusion Research: Review Articles. 4: 32.

Evans, L.C. (2010). Partial Differential Equations. American Mathematical Society.

Holden, H.; Karlsen, K.H.; Risebro, N.H.; Tao, T. (2011). "Operator splitting for the KdV equation". Mathematics of Computation. 80: 821–846.

Klein, C. (2008). "Fourth order time-stepping for low dispersion Korteweg-De Vries and nonlinear Schrödinger equations". Electronic Transactions on Numerical Analysis. 29: 116–135.

Klein, C.; Muite, B.K.; Roidot, K. (2011). "Numerical Study of Blowup in the Davey-Stewartson System". {{cite journal}}: Cite journal requires |journal= (help)

Klein, C.; Roidot, K. (2011). "Fourth order time-stepping for Kadomstev-Petviashvili and Davey-Stewartson Equations". SIAM Journal on Scientific Computation. 33: 3333–3356.

Loss, M. (2003). Analysis. American Mathematical Society.

Ponce, G. (2009). Introduction to Nonlinear Dispersive Equations. Springer.

McLachlan, R.I.; Quispel, G.R.W. (2002). "Splitting Methods". Acta Numerica. 11: 341–434.

Rogers, R.C. (2004). An Introduction to Partial Differential Equations. Springer.

Wang, L.-L. (2011). Spectral Methods: Algorithms, Analysis and Applications. Springer.

Sulem, P.L. (1999). The Nonlinear Schrödinger equation: Self-Focusing and Wave Collapse. Springer.

Thalhammer, M. Time-Splitting Spectral Methods for Nonlinear Schrödinger Equations (PDF). {{cite book}}: Cite has empty unknown parameter: |coauthors= (help)

Weideman, J.A.C.; Herbst, B.M. (1986). "Split-step methods for the solution of the nonlinear Schrödinger equation". SIAM Journal on Numerical Analysis. SIAM. 23 (3): 485–507.

Yang, J. (2010). Nonlinear Waves in Integrable and Nonintegrable Systems. SIAM.

[1] Sulem 和 Sulem (1999)

[2] 楊 (2010)

[3] Evans (2010)

[4] Linares 和 Ponce (2009)

[5] Lieb 和 Loss (2003)

[6] Rennardy 和 Rogers (2004)

[7] Sulem 和 Sulem (1989)

[8] 為了簡化演示，我們主要考慮聚焦立方非線性薛定諤方程。

[9] Klein (2008)

[10] Holden 等人 (2011)

[11] McLachlan 和 Quispel (2002)

[12] Thalhammer

[13] Shen、Tang 和 Wang (2011)

[14] Weideman 和 Herbst (1986)

[15] 楊 (2010)

[16] Klein (2008)

[17] Holden 等人 (2011)

[18] Holden 等人 (2011)

[19] 也就是說 $\mathbf {A} \mathbf {B} =\mathbf {B} \mathbf {A}$ 。

[20] 可以使用指數函式的級數展開來推匯出它， $\exp(\mathbf {A} t)=\sum _{n=0}^{\infty }{\frac {(\mathbf {A} t)^{n}}{n!}}$ ，並從 $\exp(\mathbf {A} \delta t)\exp(\mathbf {B} \delta t).$ 中減去 $\exp((\mathbf {A} +\mathbf {B} )\delta t)$ 。

[21] Klein (2008)

[22] Thalhammer (2008)

[23] Klein、Muite 和 Roidot (2011)

[24] http://en.wikipedia.org/wiki/Gross\%E2\%80\%93Pitaevskii\_equation

[25] 
這個問題是由 Joshua Kirschenheiter 的專案引起的。

[26] Eliasson 和 Shukla (2009)

[27] Klein、Muite 和 Roidot (2011)

[28] 
這個問題是由 Kohei Harada 和 Matt Warnez 的專案引起的。

[29] Blanes 等人

[30] Blanes 等人

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]