Manipulating C Structs from Lisp

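Overview

Ultimately, handling a struct produced by a C or C++ program comes down to loading that structure into a buffer and picking out the individual fields. Likewise, sending such a struct means assembling a buffer.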

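When working with strings or characters in Lisp, some type conversion is required.

The preferred approach here is to use '(unsigned-byte 8) for the bytes coming from or going to a C struct.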

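Character arrays may be cheap in other languages such as C, and may let you rely on de facto properties of the ASCII character set, but for ANSI Common Lisp this is the wrong approach.

The ANSI specification does not guarantee that characters are encoded as ASCII, ISO-8859-1, or any similar character set. In fact, the ANSI specification only guarantees that char-code-limit is at least 96, the number of standard characters [1], although contemporary implementations may support Unicode.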
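To see what a particular implementation provides, the values can simply be inspected. This is a minimal sketch; the function name describe-character-codes is made up for illustration, and the numbers it returns are implementation-dependent, which is exactly why byte vectors are preferred.

```lisp
;; DESCRIBE-CHARACTER-CODES is a hypothetical name used only in this sketch.
(defun describe-character-codes ()
  "Return the implementation's code limit and the code it assigns to #\\A."
  (list :char-code-limit char-code-limit   ; at least 96 per the standard
        :code-of-a (char-code #\A)))       ; 65 only if codes match ASCII

(describe-character-codes)
```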

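As with the recommended approach for files and directories, simply read or write a vector of unsigned bytes.

Any other part of the Lisp program that needs to handle actual strings or characters should convert as needed after reading.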

Reading

(defun read-c-file (&optional (file-path "data.struct") (max-length 48))
  (with-open-file (stream (merge-pathnames file-path)
			  :element-type '(unsigned-byte 8)
			  :direction :input)
    (let ((buffer (make-array max-length
			      :element-type '(unsigned-byte 8)
			      :fill-pointer t)))
      (let ((actual-length (read-sequence buffer stream
					  :end max-length)))
	(setf (fill-pointer buffer) actual-length)
	(format t "received=~a max=~a buffer=~s~%" actual-length max-length buffer))
      buffer)))

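Using a fill-pointer on the array is optional but recommended: it keeps track of the length actually received, which may differ from the length you tried to read.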
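The effect of the fill pointer can be seen without touching a file. The sketch below uses a made-up capacity of 8 and three made-up bytes, and pretends that only those three bytes arrived.

```lisp
;; Hypothetical sizes and data, for illustration only.
(defparameter *demo-buffer*
  (make-array 8 :element-type '(unsigned-byte 8)
                :initial-element 0
                :fill-pointer t))

;; Pretend only three bytes arrived: copy them in, then record the count.
(replace *demo-buffer* #(1 2 3))
(setf (fill-pointer *demo-buffer*) 3)

(length *demo-buffer*)              ; active length, honours the fill pointer
(array-dimension *demo-buffer* 0)   ; allocated capacity, ignores it
```

Sequence functions such as length, subseq, and map stop at the fill pointer, so later code does not need to carry the received length around separately.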

Writing

(defun write-C-file (buffer &optional length (file-path "data.struct"))
  (unless length
    (setf length (length buffer)))
  (with-open-file (stream (merge-pathnames file-path)
			  :element-type '(unsigned-byte 8)
			  :direction :output
			  :if-exists :rename)
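    ;; write-sequence returns the sequence, not a count, so bound the
    ;; write with :end and report the requested length.
    (write-sequence buffer stream :end length)
    (format t "wrote=~a bytes buffer=~s~%" length buffer))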
  buffer)

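Processing

When processing a vector of '(unsigned-byte 8) elements, convert each field of the corresponding C struct as needed, based on its byte offset. (Note that the buffer really is a vector even though it was created with make-array; a vector is simply an array with exactly one dimension.)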
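As a worked illustration of offset-based access, here is a minimal sketch for an invented record. The layout, the offsets, the big-endian length field, and all names are assumptions made for this example; a real C struct may also contain padding that has to be accounted for. The string accessor further assumes that the implementation's character codes coincide with ASCII for these bytes, which ANSI does not guarantee.

```lisp
;; Hypothetical packed layout, for illustration only:
;;   offset 0      uint8_t  state
;;   offset 1..2   uint16_t length (big-endian)
;;   offset 3..6   char     name[4]
(defparameter *record*
  (make-array 7 :element-type '(unsigned-byte 8)
                :initial-contents '(2 1 0 76 73 83 80)))

(defun record-state (buffer)
  "Return the single byte at offset 0 as an integer."
  (aref buffer 0))

(defun record-length (buffer)
  "Combine the two bytes at offsets 1 and 2, most significant first."
  (logior (ash (aref buffer 1) 8)
          (aref buffer 2)))

(defun record-name (buffer)
  "Convert the four bytes at offsets 3..6 to a string.
Assumes character codes match ASCII for these values."
  (map 'string #'code-char (subseq buffer 3 7)))
```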

Extracting a string from raw bytes

(map 'string #'code-char
     (subseq buffer *start-index* *end-index*))

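Extracting just one byte (subseq returns a one-element vector; use elt or aref when you want the integer itself)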

(subseq buffer *state-index* (1+ *state-index*))

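Of course, you will want to bind or assign the value returned by each of the two examples above.

Similarly, assigning a value into the raw byte buffer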

(setf (elt buffer *magic-number-index*) (logand #xFF *preamble-value*))

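It is important to guard what is being assigned; a bit mask applied with logand works well here.

For more than one byte, such as inserting sequence2 into sequence1 as a subsequence, use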
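The reason for the mask is that an element of an '(unsigned-byte 8) vector can only hold 0 through 255; storing a wider integer is a type error in safe code. The sketch below uses an invented 16-bit value and invented variable names to show the mask, plus ldb as an alternative way of selecting a byte.

```lisp
;; Hypothetical buffer and value, for illustration only.
(defparameter *out*
  (make-array 4 :element-type '(unsigned-byte 8) :initial-element 0))

(defparameter *wide-value* #x1234)   ; wider than one byte

;; Keep only the low eight bits.
(setf (aref *out* 0) (logand #xFF *wide-value*))

;; LDB selects any eight-bit field; here bits 8..15.
(setf (aref *out* 1) (ldb (byte 8 8) *wide-value*))
```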

(replace sequence1 (map 'vector #'char-code sequence2)
	 :start1 a :end1 b)

Or, following the examples above

(replace buffer (map '(vector (unsigned-byte 8)) #'char-code string-text)
	 :start1 *start-index* :end1 *end-index*)

A helper function is provided below that avoids creating the intermediate vector.

Helper functions

(defun map-replace (fn sequence1 sequence2 &key (start1 0) end1 (start2 0) end2)
  "Alter elements of first sequence with those from second but after applying function
to that element first, performing each element in order.

Results will be identical to the following but without creating
intermediate vector:
  (replace sequence1 (map 'vector #'char-code sequence2) :start1 start1 :end1 end1)

See also: http://common-lisp.net/project/trivial-utf-8

Side-effects: sequence1 gets modified unless sequence2 is effectively nil.
Returns sequence1 after all modifications.
"
  (loop
     for i upfrom start1 below (or end1 (length sequence1))
     and j upfrom start2 below (or end2 (length sequence2))
     do (setf (elt sequence1 i) (funcall fn (elt sequence2 j))))
  sequence1)

(defun network-bytes-to-number (buffer start-index total-bits)
  "Convert network byte ordered sequence of unsigned bytes to a number."
  (unless (= (mod total-bits 8) 0)
    (error "Please specify total-bits as total for multiples of eight bit bytes"))
  (let ((value 0))
    (loop for i downfrom (- total-bits 8) downto 0 by 8
       for cursor upfrom start-index
       do (setf value (dpb (elt buffer cursor)
			   (byte 8 i) value))

	 (format t "buffer[~d]==#x~2,'0X; shift<< ~d bits; value=~d~%"
		 cursor (elt buffer cursor) i value))
    value))

(defun number-to-network-bytes (number total-bits &optional buffer (start-index 0))
  "Convert number to network byte ordered sequence of unsigned bytes."
  (unless (= (mod total-bits 8) 0)
    (error "Please specify total-bits as total for multiples of eight bit bytes"))
  (unless buffer
    (setf buffer (make-array (/ total-bits 8) :element-type '(unsigned-byte 8))))
  (loop for i downfrom (- total-bits 8) downto 0 by 8
     for cursor upfrom start-index
     do (setf (elt buffer cursor) (ldb (byte 8 i) number))

       (let ((value (ldb (byte 8 i) number)))
	 (format t "number=~d: shift>> ~d bits; value=~d #x~2,'0X; buffer[~d]==#x~2,'0X~%"
		 number i value value cursor (elt buffer cursor))))
  buffer)
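
The two conversion helpers above assume network (big-endian) byte order. If the struct was instead dumped straight from memory on a little-endian machine, which is an assumption about your data source rather than something this page covers, the bytes arrive least significant first. A minimal sketch of the reading side follows; the function name is made up for illustration, and it omits the tracing output.

```lisp
(defun little-endian-bytes-to-number (buffer start-index total-bits)
  "Hypothetical counterpart of network-bytes-to-number for fields stored
least significant byte first."
  (unless (zerop (mod total-bits 8))
    (error "total-bits must be a multiple of 8, got ~d" total-bits))
  (loop with value = 0
        for shift from 0 below total-bits by 8
        for cursor upfrom start-index
        ;; Deposit each byte at a bit position that grows with the offset.
        do (setf value (dpb (aref buffer cursor) (byte 8 shift) value))
        finally (return value)))

;; Usage: four bytes holding #x12345678 in little-endian order.
(little-endian-bytes-to-number (vector #x78 #x56 #x34 #x12) 0 32)
```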

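Time & Epoch

When converting time values from other languages (let alone other operating systems), be aware that the epoch, meaning what a value of 0 stands for, may differ.

The ANSI Common Lisp epoch is midnight UTC on January 1, 1900, with value 0, whereas Unix and many C libraries use January 1, 1970. Simple arithmetic converts between the two.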
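
The arithmetic amounts to adding or subtracting the number of seconds between the two epochs. The sketch below computes that offset with encode-universal-time rather than hard-coding it; the variable and function names are made up for illustration, and it assumes the foreign value is a count of whole seconds.

```lisp
;; Seconds between 1900-01-01 and 1970-01-01, both at 00:00:00 UTC.
;; The final argument 0 selects time zone GMT.
(defparameter *unix-epoch-offset*
  (encode-universal-time 0 0 0 1 1 1970 0))

(defun unix-to-universal-time (unix-seconds)
  "Convert seconds since 1970 to Common Lisp universal time."
  (+ unix-seconds *unix-epoch-offset*))

(defun universal-to-unix-time (universal-time)
  "Convert Common Lisp universal time to seconds since 1970."
  (- universal-time *unix-epoch-offset*))
```

A value read from a struct with network-bytes-to-number can be passed to unix-to-universal-time and then to decode-universal-time to obtain calendar fields.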