Ada Programming/Libraries/GNAT.String_Split

From 華夏公益教科書, open books for an open world

Ada. Time-tested, safe and secure.
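Splitting a string into several components according to a set of separators can be done in many different ways. In this article we focus on a solution that uses the GNAT.String_Split package.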

Caveat

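If you use the following examples in your own programs, the result will be a less portable program. The GNAT packages are only found in the GNAT compilers (the GPL edition and GCC GNAT), which means your program may not compile with other Ada compilers.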

Suppose you want to split a string into a set of individual components, for example

 This is a string 

into

 This
 is
 a
 string

This is exactly what the GNAT.String_Split package lets you do.

The GNAT.String_Split solution

Let's jump straight into the code that solves the string-splitting problem. Create a file named explode.adb and add this code to it:

--  A procedure to illustrate the use of the GNAT.String_Split package.  This
--  is just the simplest, most basic usage; the package can do a lot more, like
--  splitting on a char set, re-split the string with new separators, and
--  return the separators found before and after each substring.  Left as an
--  exercise for the reader. ;)

with Ada.Characters.Latin_1;
with Ada.Text_IO; 
with GNAT.String_Split;

procedure Explode is
   use Ada.Characters;
   use Ada.Text_IO;
   use GNAT;
   
   Data : constant String :=
            "This becomes a " & Latin_1.HT & " bunch of     substrings";
   --  The input data would normally be read from some external source or 
   --  whatever. Latin_1.HT is a horizontal tab.
   
   Subs : String_Split.Slice_Set;
   --  Subs is populated by the actual substrings.
   
   Seps : constant String := " " & Latin_1.HT;  
   --  just an arbitrary simple set of whitespace.                                 
begin
   Put_Line ("Splitting '" & Data & "' at whitespace.");
   --  Introduce our job.
   
   String_Split.Create (S          => Subs,
                        From       => Data,
                        Separators => Seps,
                        Mode       => String_Split.Multiple);
   --  Create the split, using Multiple mode to treat strings of multiple
   --  whitespace characters as a single separator.
   --  This populates the Subs object.
   
   Put_Line 
     ("Got" & 
      String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
      " substrings:");
   --  Report results, starting with the count of substrings created.
   
   for I in 1 .. String_Split.Slice_Count (Subs) loop
      --  Loop through the substrings.
      declare
         Sub : constant String := String_Split.Slice (Subs, I);
         --  Pull the next substring out into a string object for easy handling.
      begin
         Put_Line (String_Split.Slice_Number'Image (I) &
                   " -> " & 
                   Sub & 
                   " (length" & Positive'Image (Sub'Length) & 
                   ")");
         --  Output the individual substrings, and their length.
         
      end;
   end loop;
end Explode;

Compile and run the Explode program like this:

 $ gnatmake explode.adb
 $ ./explode

You should see output similar to this:

 Splitting 'This becomes a   bunch of     substrings' at whitespace.
 Got 6 substrings:
  1 -> This (length 4)
  2 -> becomes (length 7)
  3 -> a (length 1)
  4 -> bunch (length 5)
  5 -> of (length 2)
  6 -> substrings (length 10)

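The comments in the example more or less explain what is going on, but for clarity we will walk through the code step by step, starting with the dependencies and the use clauses: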

with Ada.Characters.Latin_1;
with Ada.Text_IO; 
with GNAT.String_Split;

procedure Explode is
   use Ada.Characters;
   use Ada.Text_IO;
   use GNAT;

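The three with lines list the packages our program depends on. When the compiler encounters them, it retrieves those packages from its library. The line procedure Explode is marks the start of our program, specifically the declarative part, where we declare and initialize our constants and variables. It also names our program Explode. Note the use clauses. Adding them lets us write this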

Put_Line ("Some text");

instead of this

Ada.Text_IO.Put_Line ("Some text");

in the program. Very handy.

As an exercise, try commenting out the three use clauses and adding the full package names to all types and procedures in the program.
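As a starting point for that exercise, here is a minimal sketch of what fully qualified names look like. The procedure name Qualified and the input "one two" are made up for illustration; only calls already used in explode.adb appear.

```ada
with Ada.Text_IO;
with GNAT.String_Split;

procedure Qualified is
   --  No use clauses: every name carries its full package prefix.
   Subs : GNAT.String_Split.Slice_Set;
begin
   GNAT.String_Split.Create (S          => Subs,
                             From       => "one two",
                             Separators => " ",
                             Mode       => GNAT.String_Split.Multiple);

   --  Print only the first slice.
   Ada.Text_IO.Put_Line (GNAT.String_Split.Slice (Subs, 1));
end Qualified;
```

The code is longer to read, but it is always obvious which package each name comes from.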

Next we have this:

Data : constant String :=
            "This becomes a " & Latin_1.HT & " bunch of     substrings";

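This is the String we want to split into individual components. Latin_1.HT is a constant declared in Ada.Characters.Latin_1; it inserts a horizontal tab into the string. Because we never change the value of Data anywhere in the program, we have declared it as a constant.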

Subs : String_Split.Slice_Set;

The Subs variable is the container for the individual components, or "slices".

Seps : constant String := " " & Latin_1.HT;

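These are our separators. In this case we split the string on the space (" ") and the horizontal tab (Latin_1.HT). Note that the separators are not included in the resulting Slice_Set. Try experimenting with different separators.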
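The comment at the top of explode.adb mentions that the package can also split on a character set. The following is a sketch of that variant, assuming the Create overload that takes an Ada.Strings.Maps.Character_Set; the procedure name Split_On_Set, the separator characters, and the input string are invented for this example.

```ada
with Ada.Characters.Latin_1;
with Ada.Strings.Maps;
with Ada.Text_IO;
with GNAT.String_Split;

procedure Split_On_Set is
   use Ada.Characters;
   use GNAT;

   Seps : constant Ada.Strings.Maps.Character_Set :=
            Ada.Strings.Maps.To_Set (" ,;" & Latin_1.HT);
   --  Space, comma, semicolon and horizontal tab all act as separators.

   Subs : String_Split.Slice_Set;
begin
   String_Split.Create (S          => Subs,
                        From       => "red, green;blue",
                        Separators => Seps,
                        Mode       => String_Split.Multiple);

   for I in 1 .. String_Split.Slice_Count (Subs) loop
      --  Print each slice on its own line.
      Ada.Text_IO.Put_Line (String_Split.Slice (Subs, I));
   end loop;
end Split_On_Set;
```

A Character_Set is convenient when the separators come from elsewhere in the program, for instance from one of the predefined sets in Ada.Strings.Maps.Constants.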

begin
   Put_Line ("Splitting '" & Data & "' at whitespace.");

begin marks the start of the body of our program. Right after begin we output a short message.

String_Split.Create (S          => Subs,
                     From       => Data,
                     Separators => Seps,
                     Mode       => String_Split.Multiple);

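This is the heart of the program. In this statement the Data String is split into individual slices according to the Seps separators, and the resulting slices are placed in the Subs Slice_Set. Note the Mode => String_Split.Multiple parameter. In Multiple mode, String_Split.Create treats a run of consecutive spaces and horizontal tabs as a single separator.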

As an exercise, try changing Multiple to Single and see what happens.
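To make the difference between the two modes easier to see, here is a small sketch that splits the same input both ways. The procedure name Modes, the helper Show, and the input "a,,b" are assumptions made for this example. In Single mode every separator ends a slice, so the two adjacent commas should produce an empty slice between them; in Multiple mode they should count as one separator.

```ada
with Ada.Text_IO;
with GNAT.String_Split;

procedure Modes is
   use Ada.Text_IO;
   use GNAT;

   Data : constant String := "a,,b";

   --  Hypothetical helper: split Data in the given mode and list the slices.
   procedure Show (Label : String; Mode : String_Split.Separator_Mode) is
      Subs : String_Split.Slice_Set;
   begin
      String_Split.Create (S          => Subs,
                           From       => Data,
                           Separators => ",",
                           Mode       => Mode);
      Put_Line (Label & ":" &
                String_Split.Slice_Number'Image
                  (String_Split.Slice_Count (Subs)) &
                " slices");
      for I in 1 .. String_Split.Slice_Count (Subs) loop
         --  Brackets make empty slices visible.
         Put_Line ("  [" & String_Split.Slice (Subs, I) & "]");
      end loop;
   end Show;
begin
   Show ("Single", String_Split.Single);
   Show ("Multiple", String_Split.Multiple);
end Modes;
```

Single mode is the one to use when empty fields carry meaning, as in comma-separated data.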

Put_Line 
     ("Got" & 
      String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
      " substrings:");

This is the statement responsible for the output line

 Got 6 substrings:

Yes, that looks like a very long statement for so little output, but there is a reason for it:

String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs))

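This expression is responsible for the "6" part of the output. All it does is convert the number 6 into the String " 6", which it does with the 'Image attribute. String_Split.Slice_Count (Subs) returns a value of type Slice_Number, which is basically just an Integer with a value >= 0, and 'Image then converts it into a String suitable for output. 'Image puts a leading space in front of non-negative values, which is why the string literal "Got" has no trailing space.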
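The behaviour of 'Image can be tried in isolation. The sketch below uses the standard Natural subtype instead of Slice_Number so that it does not depend on GNAT; the procedure name Image_Demo and the constant Count are made up for illustration. It also shows Ada.Strings.Fixed.Trim, one way to drop the leading space when it is not wanted.

```ada
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Image_Demo is
   use Ada.Text_IO;

   Count : constant Natural := 6;
begin
   --  'Image puts a leading space in front of non-negative numbers,
   --  so no space is needed after "Got".
   Put_Line ("Got" & Natural'Image (Count) & " substrings:");

   --  Trim removes that leading space when it is not wanted.
   Put_Line ("Count=" &
             Ada.Strings.Fixed.Trim (Natural'Image (Count),
                                     Ada.Strings.Left));
end Image_Demo;
```

The first line prints "Got 6 substrings:" and the second prints "Count=6".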

for I in 1 .. String_Split.Slice_Count (Subs) loop
   --  Loop through the substrings.
   declare
      Sub : constant String := String_Split.Slice (Subs, I);
      --  Pull the next substring out into a string object for easy handling.
   begin
      Put_Line (String_Split.Slice_Number'Image (I) &
                " -> " & 
                Sub & 
                " (length" & Positive'Image (Sub'Length) & 
                ")");
      --  Output the individual substrings, and their length.    
   end;
end loop;

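Here we start a loop that repeats String_Split.Slice_Count (Subs) times, in this case 6 times. So I is 1 on the first pass and 6 on the last. Inside the loop we declare a new block. This lets us create the Sub constant afresh on every pass, initialized with the next slice of our split. That is done with the String_Split.Slice function, which takes our Subs Slice_Set and the loop counter I as parameters and returns a String. In the body of the block we output each slice together with its index in the Subs Slice_Set and its length. As you can see, we again use the 'Image attribute to convert numeric values to Strings.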
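The block is needed because a String object gets its length when it is declared and keeps it from then on, while each slice may have a different length. The following stand-alone sketch shows the same pattern without GNAT.String_Split; the procedure name Block_Demo and the constant Stars are invented for this example.

```ada
with Ada.Text_IO;

procedure Block_Demo is
begin
   for I in 1 .. 3 loop
      declare
         --  A fresh constant on every pass, one character longer each time.
         Stars : constant String (1 .. I) := (others => '*');
      begin
         Ada.Text_IO.Put_Line (Stars);
      end;
   end loop;
end Block_Demo;
```

Each pass through the loop elaborates the declaration again, so Stars can have a different length every time, just as Sub does in explode.adb.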

You can get rid of the block inside the loop like this:

for I in 1 .. String_Split.Slice_Count (Subs) loop
   --  Loop through the substrings.
   Put_Line 
     (String_Split.Slice_Number'Image (I) &
      " -> " & 
      String_Split.Slice (Subs, I) & 
      " (length" & Positive'Image (String_Split.Slice (Subs, I)'Length) & 
      ")");
   --  Output the individual substrings, and their length.
end loop;

As you can see, we no longer use the Sub constant. Instead we call String_Split.Slice (Subs, I) directly. It works the same way, but is perhaps a little less readable.

Another option is to use Ada.Strings.Unbounded.Unbounded_String. Here is one possible solution:

foobar.adb

with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded;
with Ada.Text_IO;
with Ada.Text_IO.Unbounded_IO;
with GNAT.String_Split;

procedure Foobar is

  use Ada.Characters;
  use Ada.Strings.Unbounded;
  use Ada.Text_IO;
  use Ada.Text_IO.Unbounded_IO;
  use GNAT;

  Data : constant String := 
           "This becomes a " & Latin_1.HT & " bunch of     substrings";
  --  The input data, normally would be read from some external source or 
  --  whatever. Latin_1.HT is a horizontal tab.

  Subs : String_Split.Slice_Set;
  --  Subs is populated by the actual substrings.

  Seps : constant String := " " & Latin_1.HT;  
  --  just arbitrary simple set of whitespace.

  Sub : Unbounded_String;
  --  Object to hold a slice.

begin

  Put_Line ("Splitting '" & Data & "' at whitespace.");
  --  Introduce our job

  String_Split.Create (S          => Subs,
                       From       => Data,
                       Separators => Seps,
                       Mode       => String_Split.Multiple);
  --  Create the split, using Multiple mode to treat strings of multiple
  --  whitespace characters as a single separator.
  --  This populates the Subs object.

  Put_Line 
    ("Got" & 
     String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
     " substrings:");
  --  Report results, starting with the count of substrings created

  for I in 1 .. String_Split.Slice_Count (Subs) loop
     --  Loop through the substrings

     --  Note that we've avoided the block from the first example. This is
     --  possible because our Sub variable is now an Unbounded_String, which
     --  does not have to be declared with an initial length.

     Sub := To_Unbounded_String (String_Split.Slice (Subs, I));
     --  Pull the next substring out into an Unbounded_String object for 
     --  easy handling. String_Split.Slice returns a String, which we convert
     --  to an Unbounded_String using the aptly named To_Unbounded_String
     --  function.

     Put (String_Split.Slice_Number'Image (I));
     Put (" -> "); 
     Put (Sub); 
     Put (" (length" & Positive'Image (Length (Sub)) & ")");
     New_Line;
  end loop;

end Foobar;

Finally, back in explode.adb, we have

end Explode;

which simply ends the program.

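That concludes this little tutorial on how to split a string into individual parts (slices) according to a set of separators. I hope you enjoyed reading it as much as I enjoyed writing it.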