
GLPK/Unix batch execution

From Wikibooks, open books for an open world

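These scripts demonstrate running batch jobs from the Unix command line. For generality, the demonstration is split into two parts. The first part creates a set of problem instances, and the second part runs them in a set of batch queues.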

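The first part consists of four scripts: tst1, tst2a, tst2b, and tst3. tst1 is an awk script that creates a set of points on a line, with errors in both the X and Y values. tst2a and tst2b wrap those points in the MathProg data statements, and tst2b also randomly reorders them. tst3 calls tst1 and either tst2a or tst2b multiple times, creating a set of data files in MathProg format.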

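These examples come from a Solaris system, where the awk(1) described in The AWK Programming Language by Aho, Weinberger, and Kernighan is called nawk. On other systems it is simply awk, and the earlier implementation of awk is called oawk (old awk). Many Linux systems use the GNU implementation, gawk. The references to nawk may therefore need some minor editing.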
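One way to avoid editing each script by hand is to look up the awk name once and reuse it. The following is a minimal sketch, assuming a POSIX shell that provides `command -v`. The variable name `AWK` is hypothetical and is not used by the scripts on this page.

```shell
#!/bin/sh
# Pick the first awk implementation found on PATH.
# AWK is a hypothetical variable name, not something the scripts here define.
# Plain "awk" is tried last because on Solaris it is the old awk (oawk).
AWK=
for candidate in nawk gawk awk
do
   if command -v "${candidate}" >/dev/null 2>&1
   then
      AWK=${candidate}
      break
   fi
done

if [ -z "${AWK}" ]
then
   echo "no awk implementation found" >&2
   exit 1
fi

echo "using ${AWK}"
```

A script could then call `${AWK}` wherever the listings here call nawk.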

With appropriate minor adjustments, all of this will work on Solaris, Linux, FreeBSD, OpenBSD, and MacOS X, and on Windows once a Unix environment such as Cygwin has been added.

#!/bin/nawk -f

#-----------------------------------------------------------------------
#  tst1
#
# generate the specified number of points on a line with random errors
# in the X & Y values
#-----------------------------------------------------------------------

BEGIN{

   srand();

   n = ARGV[1];
   ARGV[1] = "";

   for( i=0; i<n; i++ ){

      printf( "%d " ,i+1  );
      
      printf( "%f " ,i/3.0+rand() );
      printf( "%f " ,i/7.0+rand() );

      printf( "\n" );

   }
}

#!/bin/sh
#--------------------------------------------------------------
# tst2a
#
# This script uses Bourne shell here files to add the MathProg
# statements.
#--------------------------------------------------------------

# create the first part of the MathProg data file statements

cat <<EOF
data;

param : I :   x    y :=

EOF

# pass stdin through to stdout

cat 

# add the last part of the MathProg data file

cat <<EOF
;
end;
EOF
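To see what the wrapper produces, the sketch below feeds two made-up points through a function that repeats the tst2a logic. The function name `wrap_points` and the point values are assumptions for illustration only. The function stands in for ./tst2a so that the example runs on its own.

```shell
#!/bin/sh
# wrap_points is a hypothetical stand-in for ./tst2a.
wrap_points() {
cat <<EOF
data;

param : I :   x    y :=

EOF
# pass stdin through to stdout
cat
cat <<EOF
;
end;
EOF
}

# two made-up points: index, x, y
printf '1 0.50 0.25\n2 0.90 0.40\n' | wrap_points
```

The output is a complete MathProg data section: the `param : I : x y :=` table header, one row per point, and the closing `;` and `end;`.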

#!/bin/sh
#--------------------------------------------------------------
# tst2b
#
# This script uses Bourne shell here files to add the MathProg
# statements.  It then uses the Unix utilities, awk and sort to
# randomly reorder the points.
#--------------------------------------------------------------

# write out the first part of the MathProg data file

cat <<EOF
data;

param : I :   x    y :=

EOF

# randomly reorder the input

nawk 'BEGIN{srand();}{ print $0 ,rand()}' ${1}  \
| sort -k 4n | nawk '{print NR ,$2, $3}'

# write out the last part of the MathProg data file

cat <<EOF
;
end;
EOF
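The reordering step in tst2b appends a random key to each line, sorts on that key, and then drops the key while renumbering the points. The sketch below isolates that pipeline. It assumes plain `awk` rather than nawk. The function name `shuffle_points` and the explicit seed argument are additions for illustration, so that repeated runs are reproducible.

```shell
#!/bin/sh
# shuffle_points is a hypothetical name; the seed argument is not in tst2b.
shuffle_points() {
   # decorate each line with a random key in field 4
   awk -v seed="${1}" 'BEGIN{ srand(seed) } { print $0, rand() }' \
   | sort -k 4n \
   | awk '{ print NR, $2, $3 }'   # drop the key and renumber from 1
}

# three made-up points: index, x, y
printf '1 0.1 0.2\n2 0.4 0.5\n3 0.7 0.8\n' | shuffle_points 42
```

The first column of the output always runs 1, 2, 3, while the x and y pairs appear in the shuffled order.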
#!/bin/sh

#-----------------------------------------------------------------------
# tst3
#
# create a set of data files in MathProg format using tst1 to generate 
# the data and either tst2a or tst2b to add the required MathProg 
# statements.
#
# ./tst1 generates points on a line with random errors in the X & Y values
# ./tst2a just sets up the MathProg statements
# ./tst2b also randomly reorders the data using Unix command line 
# utilities.  
#
# tst3 takes two arguments specifying the number of points on the line 
# and the number of data files to create.  The sleep following tst2b 
# is to ensure that a new seed is used for each instance.
#-----------------------------------------------------------------------
 
if [ ${#} -ne 2 ]
   then

   echo "usage:"
   echo "./tst3 <npoints> <ninstances>"
   exit 1
fi

J=1

while [ ${J} -le ${2} ]
   do

   #./tst1 ${1}  | ./tst2a >${J}.dat
   ./tst1 ${1}  | ./tst2b >${J}.dat; sleep 1;

   J=`expr ${J} + 1`

done
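The sleep is needed because srand() with no argument seeds from the clock, which only changes once per second. A possible alternative, sketched below, is to pass a distinct seed to each instance. The function `gen_points` is a hypothetical replacement for ./tst1 that takes the seed as a second argument, and it assumes plain `awk` rather than nawk. The same idea would apply to the srand() call in tst2b.

```shell
#!/bin/sh
# gen_points is a hypothetical replacement for ./tst1 that takes the seed
# as a second argument instead of relying on the clock.
gen_points() {
   awk -v n="${1}" -v seed="${2}" 'BEGIN{
      srand( seed );
      for( i=0; i<n; i++ )
         printf( "%d %f %f\n", i+1, i/3.0+rand(), i/7.0+rand() );
   }'
}

# one distinct seed per instance, so no sleep is needed between them
J=1
while [ ${J} -le 3 ]
do
   gen_points 5 ${J}
   J=`expr ${J} + 1`
done
```

Using the loop counter as the seed also makes each data file reproducible, which the clock-based seed does not.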

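tst4 builds the job list from the files with the extension .dat, splits the list into 4 groups, and runs each group as a queue in the background. There are many ways to implement batch queues on Unix. This is a minimal example suited to a single researcher using a single multicore workstation. If you need to run a large number of long-running jobs on a cluster, consider a more comprehensive queuing system.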

All console output from glpsol is directed to a log file for each queue, named Q0.log, Q1.log, and so on.
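If a log should capture both standard output and standard error, the order of the redirections matters, because the shell processes them from left to right. The sketch below shows the difference. The function `emit` is a hypothetical stand-in for glpsol that writes one line to each stream.

```shell
#!/bin/sh
# emit is a hypothetical stand-in for glpsol: one line on each stream
emit() {
   echo "solver output"
   echo "solver warning" >&2
}

LOG="${TMPDIR:-/tmp}/redir_demo.$$"

# stdout goes to the file first, then stderr follows stdout into the file
( emit ) >"${LOG}.a" 2>&1

# stderr is duplicated onto the original stdout before stdout is moved,
# so the warning does not reach the file
( emit ) 2>&1 >"${LOG}.b"

BOTH=`grep -c '' "${LOG}.a"`
ONLY=`grep -c '' "${LOG}.b"`
echo "file a: ${BOTH} lines, file b: ${ONLY} lines"

rm -f "${LOG}.a" "${LOG}.b"
```

The first form puts both lines in the file. The second form puts only the standard output line in the file and leaves the warning on the console.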

#!/bin/sh 

#-----------------------------------------------------------------------
# tst4
#
# This script runs a collection of jobs in a set of 4 parallel queues.
#
# This can be extended to as many cores in a multicore processor as
# you wish.  If you plan to run a very large number of jobs that will
# require significant time to complete it is suggested that you use 
# N-1 queues where N is the number of cores in your system.  This 
# will ensure that you have a core free for interactive use.
#
# The jobs are identified by the extension .dat, however, any naming
# will work.
#
# White space matters.  In particular, the "\" must be followed by
# a newline (aka linefeed).
#-----------------------------------------------------------------------

# get a list of all jobs in a temporary file

/bin/ls *.dat >/tmp/tmp.$$

# break the list into 4 sublists

Q0=`nawk 'NR%4==0' /tmp/tmp.$$`
Q1=`nawk 'NR%4==1' /tmp/tmp.$$`
Q2=`nawk 'NR%4==2' /tmp/tmp.$$`
Q3=`nawk 'NR%4==3' /tmp/tmp.$$`

# remove the temporary file

rm /tmp/tmp.$$

# fire off the queues by putting shell loops into the background;
# stdout is redirected first so that 2>&1 sends stderr to the same log

(for I in ${Q0};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q0.log 2>&1 &


(for I in ${Q1};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q1.log 2>&1 &


(for I in ${Q2};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q2.log 2>&1 &


(for I in ${Q3};                                       \
   do                                                  \
   glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out ; \
done ) >Q3.log 2>&1 &
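The four near-identical loops can be generated by a loop over the queue number, which makes the queue count a single setting. This is a sketch only, not a replacement for tst4. The names `NQ`, `pick_queue`, and `run_queues` are hypothetical, plain `awk` is assumed in place of nawk, and the glpsol command line is copied from the script above.

```shell
#!/bin/sh
# NQ, pick_queue and run_queues are hypothetical names.
NQ=${NQ:-4}

# print the lines of stdin that belong to queue $1 out of $2 queues
pick_queue() {
   awk -v q="${1}" -v n="${2}" 'NR % n == q'
}

run_queues() {
   Q=0
   while [ ${Q} -lt ${NQ} ]
   do
      JOBS=`ls *.dat | pick_queue ${Q} ${NQ}`
      ( for I in ${JOBS}
        do
           glpsol -m tst.mod -d ${I} -o ${I}_log -y ${I}_out
        done ) >Q${Q}.log 2>&1 &
      Q=`expr ${Q} + 1`
   done
   wait    # block until every background queue has finished
}

# run_queues   # uncomment in a directory holding tst.mod and the .dat files
```

Setting `NQ` to one less than the number of cores follows the advice in the tst4 header. The trailing `wait` is a design choice so that the script returns only after all queues are done.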

The MathProg model is a slight modification of cf12a.mod from the examples directory of the distribution.

# set of points

set I;

# independent variable

param x {i in I};

# dependent variable

param y {i in I};

# define equation variables

var a;

var b;

var u {i in I}, >= 0;

var v {i in I}, >= 0;

# define objective function

minimize error: sum {i in I} u[i] + sum {i in I} v[i];

# define equation constraint

s.t. equation {i in I} : b * x[i] + a + u[i] - v[i] = y[i];

solve;

printf "y = %.4fx + %.4f\n", b, a;

end;
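Written out, the model above is the following linear program, using the same symbols as the MathProg source. At an optimum at most one of u_i and v_i is nonzero for each point, so u_i + v_i equals the absolute residual |y_i - (b x_i + a)|. The model therefore fits the line by minimizing the sum of absolute deviations.

```latex
\begin{aligned}
\min_{a,\,b,\,u,\,v}\quad & \sum_{i \in I} (u_i + v_i) \\
\text{s.t.}\quad & b\,x_i + a + u_i - v_i = y_i, \quad i \in I, \\
& u_i \ge 0,\; v_i \ge 0, \quad i \in I.
\end{aligned}
```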