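Ada Programming/Libraries/GNAT.String Split
Splitting a string into multiple components based on a set of separators can be done in many different ways. In this article we focus on a solution that uses the GNAT.String_Split package.
If you use the following examples in your own programs, the result will be a less portable program. The GNAT packages are only found in the GPL and GCC editions of the GNAT compiler, which means your program may not compile with other Ada compilers.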
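If portability matters more than convenience, whitespace splitting can also be approximated with the standard library alone. The following is a minimal sketch, not part of the original tutorial: the procedure name Portable_Split and its variables are illustrative, and it only mimics the "Multiple" behaviour described later by repeatedly calling Ada.Strings.Fixed.Find_Token.

```ada
with Ada.Characters.Latin_1;
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Text_IO;

procedure Portable_Split is
   use Ada.Characters;

   Data : constant String :=
     "This becomes a " & Latin_1.HT & " bunch of substrings";

   --  The separator characters, expressed as a standard Character_Set.
   Whitespace : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (" " & Latin_1.HT);

   Pos   : Positive := Data'First;  --  where the next search starts
   First : Positive;
   Last  : Natural;
begin
   while Pos <= Data'Last loop
      --  Find the next run of characters that are NOT separators.
      Ada.Strings.Fixed.Find_Token
        (Source => Data (Pos .. Data'Last),
         Set    => Whitespace,
         Test   => Ada.Strings.Outside,
         First  => First,
         Last   => Last);

      exit when Last = 0;  --  no further token in the remaining text

      Ada.Text_IO.Put_Line (Data (First .. Last));
      Pos := Last + 1;
   end loop;
end Portable_Split;
```

This uses only packages defined by the Ada standard, so it should compile with any conforming compiler, at the cost of doing the bookkeeping yourself.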
You want to split a string into a set of individual components, for example
This is a string
into
This
is
a
string
This is exactly what the GNAT.String_Split package lets you do.
Let's jump straight into the code that solves the string-splitting problem. Create a file named explode.adb and add this code to it:
-- A procedure to illustrate the use of the GNAT.String_Split package. This
-- is just the simplest, most basic usage; the package can do a lot more, like
-- splitting on a char set, re-splitting the string with new separators, and
-- returning the separators found before and after each substring. Left as an
-- exercise for the reader. ;)
with Ada.Characters.Latin_1;
with Ada.Text_IO;
with GNAT.String_Split;
procedure Explode is
   use Ada.Characters;
   use Ada.Text_IO;
   use GNAT;
   Data : constant String :=
     "This becomes a " & Latin_1.HT & " bunch of substrings";
   -- The input data would normally be read from some external source or
   -- whatever. Latin_1.HT is a horizontal tab.
   Subs : String_Split.Slice_Set;
   -- Subs is populated by the actual substrings.
   Seps : constant String := " " & Latin_1.HT;
   -- Just an arbitrary simple set of whitespace.
begin
   Put_Line ("Splitting '" & Data & "' at whitespace.");
   -- Introduce our job.
   String_Split.Create (S          => Subs,
                        From       => Data,
                        Separators => Seps,
                        Mode       => String_Split.Multiple);
   -- Create the split, using Multiple mode to treat strings of multiple
   -- whitespace characters as a single separator.
   -- This populates the Subs object.
   Put_Line
     ("Got" &
      String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
      " substrings:");
   -- Report results, starting with the count of substrings created.
   for I in 1 .. String_Split.Slice_Count (Subs) loop
      -- Loop through the substrings.
      declare
         Sub : constant String := String_Split.Slice (Subs, I);
         -- Pull the next substring out into a string object for easy handling.
      begin
         Put_Line (String_Split.Slice_Number'Image (I) &
                   " -> " &
                   Sub &
                   " (length" & Positive'Image (Sub'Length) &
                   ")");
         -- Output the individual substrings, and their length.
      end;
   end loop;
end Explode;
Compile and run the Explode program like this:
$ gnatmake explode.adb
$ ./explode
You should see output similar to this:
Splitting 'This becomes a 	 bunch of substrings' at whitespace.
Got 6 substrings:
 1 -> This (length 4)
 2 -> becomes (length 7)
 3 -> a (length 1)
 4 -> bunch (length 5)
 5 -> of (length 2)
 6 -> substrings (length 10)
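The comments in the example more or less explain what is going on, but for clarity we will walk through the code step by step, starting with the dependencies and the use clauses: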
with Ada.Characters.Latin_1;
with Ada.Text_IO;
with GNAT.String_Split;
procedure Explode is
   use Ada.Characters;
   use Ada.Text_IO;
   use GNAT;
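The three with lines list the packages our program depends on. When the compiler encounters them, it retrieves those packages from its library. The line procedure Explode is marks the start of our program, specifically the declarative part, where we declare and initialize our constants and variables. It also names our program Explode. Note the use clauses. Adding them lets us write this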
Put_Line ("Some text");
instead of this
Ada.Text_IO.Put_Line ("Some text");
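in the program. Very handy.
As an exercise, try commenting out the three use clauses and prefixing every type and subprogram in the program with its full package name.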
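For comparison, here is one possible shape of the program without any use clauses. It is a shortened sketch rather than a full solution to the exercise: the name Explode_Qualified is made up for illustration, and the reporting is reduced to printing each slice.

```ada
with Ada.Characters.Latin_1;
with Ada.Text_IO;
with GNAT.String_Split;

procedure Explode_Qualified is
   --  Every name below carries its full package prefix.
   Data : constant String :=
     "This becomes a " & Ada.Characters.Latin_1.HT & " bunch of substrings";
   Seps : constant String := " " & Ada.Characters.Latin_1.HT;
   Subs : GNAT.String_Split.Slice_Set;
begin
   GNAT.String_Split.Create
     (S          => Subs,
      From       => Data,
      Separators => Seps,
      Mode       => GNAT.String_Split.Multiple);

   for I in 1 .. GNAT.String_Split.Slice_Count (Subs) loop
      Ada.Text_IO.Put_Line (GNAT.String_Split.Slice (Subs, I));
   end loop;
end Explode_Qualified;
```

The fully qualified version makes it obvious where each name comes from, which is the trade-off against the shorter text that use clauses give you.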
Next we have this:
Data : constant String :=
"This becomes a " & Latin_1.HT & " bunch of substrings";
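This is the String we are going to split into individual components. Latin_1.HT is a constant declared in Ada.Characters.Latin_1; it inserts a horizontal tab into the string. Because we never change the value of Data anywhere in the program, we declare it as a constant.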
Subs : String_Split.Slice_Set;
The Subs variable is the container for the individual components, or "slices".
Seps : constant String := " " & Latin_1.HT;
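These are our separators. In this case we split the string on spaces (" ") and horizontal tabs (Latin_1.HT). Note that the separators are not included in the slices stored in the Slice_Set. Try experimenting with different separators.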
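As a starting point for such experiments, the sketch below splits on commas and semicolons and also asks the package which separator sat on either side of each slice. The input string and the names Show_Separators and Describe are invented for this illustration. The Separators function, the Slice_Separators type, the Before and After positions and the Array_End constant are taken from the GNAT.Array_Split specification as I understand it, so check them against the spec shipped with your GNAT version.

```ada
with Ada.Text_IO;
with GNAT.String_Split;

procedure Show_Separators is
   use Ada.Text_IO;
   use GNAT;

   --  Hypothetical input, not taken from the tutorial.
   Data : constant String := "alpha,beta;gamma";
   Subs : String_Split.Slice_Set;

   --  Render a separator for display. Array_End is what the package
   --  reports at the start or end of the input.
   function Describe (C : Character) return String is
   begin
      if C = String_Split.Array_End then
         return "(start or end of input)";
      else
         return "'" & C & "'";
      end if;
   end Describe;
begin
   String_Split.Create (S          => Subs,
                        From       => Data,
                        Separators => ",;",
                        Mode       => String_Split.Single);

   for I in 1 .. String_Split.Slice_Count (Subs) loop
      declare
         Around : constant String_Split.Slice_Separators :=
           String_Split.Separators (Subs, I);
      begin
         Put_Line (String_Split.Slice (Subs, I) &
                   ": before " & Describe (Around (String_Split.Before)) &
                   ", after " & Describe (Around (String_Split.After)));
      end;
   end loop;
end Show_Separators;
```

The slices themselves never contain a comma or semicolon; the separators are only available through the separate query.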
begin
Put_Line ("Splitting '" & Data & "' at whitespace.");
begin marks the start of the body of our program. Right after begin, we output a short message.
String_Split.Create (S          => Subs,
                     From       => Data,
                     Separators => Seps,
                     Mode       => String_Split.Multiple);
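This is the heart of the program. In this statement the Data String is split into individual slices according to the Seps separators, and the resulting slices are placed in the Subs Slice_Set. Note the Mode => String_Split.Multiple parameter. In Multiple mode, String_Split.Create treats a run of consecutive spaces and horizontal tabs as a single separator.
As an exercise, try changing Multiple to Single and see what happens.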
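One way to compare the two modes side by side is to split once and then re-split the same Slice_Set with String_Split.Set, which the package provides for changing separators or mode after creation. This is a small sketch with a made-up input ("a" and "b" separated by two spaces); the names Single_Vs_Multiple and Report are illustrative.

```ada
with Ada.Text_IO;
with GNAT.String_Split;

procedure Single_Vs_Multiple is
   use Ada.Text_IO;
   use GNAT;

   --  Illustrative input: two consecutive spaces between the letters.
   Data : constant String := "a  b";
   Subs : String_Split.Slice_Set;

   --  Print the slice count and each slice in brackets, so that empty
   --  slices remain visible.
   procedure Report (Label : String) is
   begin
      Put_Line (Label & ":" &
                String_Split.Slice_Number'Image
                  (String_Split.Slice_Count (Subs)) &
                " slice(s)");
      for I in 1 .. String_Split.Slice_Count (Subs) loop
         Put_Line ("  [" & String_Split.Slice (Subs, I) & "]");
      end loop;
   end Report;
begin
   String_Split.Create (S          => Subs,
                        From       => Data,
                        Separators => " ",
                        Mode       => String_Split.Multiple);
   Report ("Multiple");

   --  Re-split the same data, this time cutting at every single separator.
   String_Split.Set (S          => Subs,
                     Separators => " ",
                     Mode       => String_Split.Single);
   Report ("Single");
end Single_Vs_Multiple;
```

According to the package documentation, Single mode cuts at every separator, so two adjacent separators produce an empty slice between them, whereas Multiple mode collapses them into one cut.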
Put_Line
  ("Got" &
   String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
   " substrings:");
This is the statement responsible for the output
Got 6 substrings:
Yes, that looks like a very long statement for so little output, but there is a reason for it:
String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs))
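This expression is responsible for the "6" in the output. All it does is convert the numeric value 6 into the String " 6", using the Image attribute. String_Split.Slice_Count (Subs) returns a value of type Slice_Number, an integer type derived from Natural (so its values are always >= 0), and Image converts that value into a String suitable for output. Image puts a leading space in front of non-negative values, which is why the literal "Got" has no trailing space.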
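The leading space produced by Image is easy to overlook. The short sketch below makes it visible and shows one common way to remove it with Ada.Strings.Fixed.Trim; the procedure name Image_Demo and the hard-coded count are only for illustration.

```ada
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Text_IO;
with GNAT.String_Split;

procedure Image_Demo is
   use Ada.Text_IO;
   use GNAT;

   --  A fixed count standing in for the result of Slice_Count.
   Count : constant String_Split.Slice_Number := 6;

   Raw  : constant String := String_Split.Slice_Number'Image (Count);
   Tidy : constant String := Ada.Strings.Fixed.Trim (Raw, Ada.Strings.Left);
begin
   Put_Line ("[" & Raw & "]");   --  prints "[ 6]"
   Put_Line ("[" & Tidy & "]");  --  prints "[6]"
end Image_Demo;
```

The tutorial program simply relies on the leading space as the word separator, which is why its string literals are written without one.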
for I in 1 .. String_Split.Slice_Count (Subs) loop
   -- Loop through the substrings.
   declare
      Sub : constant String := String_Split.Slice (Subs, I);
      -- Pull the next substring out into a string object for easy handling.
   begin
      Put_Line (String_Split.Slice_Number'Image (I) &
                " -> " &
                Sub &
                " (length" & Positive'Image (Sub'Length) &
                ")");
      -- Output the individual substrings, and their length.
   end;
end loop;
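Here we start a loop that repeats String_Split.Slice_Count (Subs) times, which in this case is 6. So I is 1 in the first iteration and 6 in the last. Inside the loop we declare a new block. This lets us create a fresh Sub constant on every iteration, initialized with the next slice of our split. That is done with the String_Split.Slice function, which takes our Subs Slice_Set and the loop counter I as arguments and returns a String. In the body of the block we output each slice together with its index in the Subs Slice_Set and its length. As you can see, we again use the Image attribute to convert numeric values to Strings.
You can get rid of the block inside the loop like this: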
for I in 1 .. String_Split.Slice_Count (Subs) loop
   -- Loop through the substrings.
   Put_Line
     (String_Split.Slice_Number'Image (I) &
      " -> " &
      String_Split.Slice (Subs, I) &
      " (length" & Positive'Image (String_Split.Slice (Subs, I)'Length) &
      ")");
   -- Output the individual substrings, and their length.
end loop;
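As you can see, we no longer use the Sub constant. Instead we call String_Split.Slice (Subs, I) directly, twice. It works the same way, but is arguably less readable.
Another option is to use Ada.Strings.Unbounded.Unbounded_String. Here is one possible solution: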
foobar.adb
with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded;
with Ada.Text_IO;
with Ada.Text_IO.Unbounded_IO;
with GNAT.String_Split;
procedure Foobar is
   use Ada.Characters;
   use Ada.Strings.Unbounded;
   use Ada.Text_IO;
   use Ada.Text_IO.Unbounded_IO;
   use GNAT;
   Data : constant String :=
     "This becomes a " & Latin_1.HT & " bunch of substrings";
   -- The input data would normally be read from some external source or
   -- whatever. Latin_1.HT is a horizontal tab.
   Subs : String_Split.Slice_Set;
   -- Subs is populated by the actual substrings.
   Seps : constant String := " " & Latin_1.HT;
   -- Just an arbitrary simple set of whitespace.
   Sub : Unbounded_String;
   -- Object that holds the current slice.
begin
   Put_Line ("Splitting '" & Data & "' at whitespace.");
   -- Introduce our job.
   String_Split.Create (S          => Subs,
                        From       => Data,
                        Separators => Seps,
                        Mode       => String_Split.Multiple);
   -- Create the split, using Multiple mode to treat strings of multiple
   -- whitespace characters as a single separator.
   -- This populates the Subs object.
   Put_Line
     ("Got" &
      String_Split.Slice_Number'Image (String_Split.Slice_Count (Subs)) &
      " substrings:");
   -- Report results, starting with the count of substrings created.
   for I in 1 .. String_Split.Slice_Count (Subs) loop
      -- Loop through the substrings.
      -- Note that we've avoided the block from the first example. This is
      -- possible because our Sub variable is now an Unbounded_String, which
      -- does not have to be declared with an initial length.
      Sub := To_Unbounded_String (String_Split.Slice (Subs, I));
      -- Pull the next substring out into an Unbounded_String object for
      -- easy handling. String_Split.Slice returns a String, which we convert
      -- to an Unbounded_String using the aptly named To_Unbounded_String
      -- function.
      Put (String_Split.Slice_Number'Image (I));
      Put (" -> ");
      Put (Sub);
      Put (" (length" & Positive'Image (Length (Sub)) & ")");
      New_Line;
   end loop;
end Foobar;
Finally, back in explode.adb, we have
end Explode;
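which simply ends the program.
That concludes this small tutorial on how to split a string into individual parts (slices) based on a set of separators. I hope you enjoyed reading it as much as I enjoyed writing it.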
External examples
- Search for examples of GNAT.String_Split in: Rosetta Code, GitHub (gists), any Alire package, or this Wikibook.
- Search for posts related to GNAT.String_Split in: Stack Overflow, comp.lang.ada, or any Ada-related page.
