Statistical Analysis: an Introduction using R / R / Logical operations

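When accessing vector elements, we saw how a simple logical expression involving the less-than sign (<) produces a logical vector, which can then be used to select the elements smaller than a given value. This type of logical operation is very useful. Besides <, there are several other comparison operators. Here is the full set (see ?Comparison for more details):

< (less than) and <= (less than or equal to)
> (greater than) and >= (greater than or equal to)
== (equal to^[1]) and != (not equal to)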
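
As a quick illustration of these operators, here is a minimal sketch on a small made-up vector (the name x and its values are assumptions for illustration, not part of the states data). The last two lines show the rounding caveat mentioned in note [1], using all.equal() as the tolerant comparison.

```r
# Hypothetical example vector (not from the states data)
x <- c(2, 5, 8, 5)

x < 5    # TRUE only for elements strictly below 5
x <= 5   # TRUE for elements below or equal to 5
x == 5   # TRUE where the element equals 5
x != 5   # TRUE where the element differs from 5

# Rounding error: 0.1 + 0.2 is not stored as exactly 0.3
(0.1 + 0.2) == 0.3                  # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal within a small tolerance
```

Wrapping all.equal() in isTRUE() is a common precaution, because all.equal() returns a description of the difference rather than FALSE when the values differ.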

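Greater flexibility comes from combining logical vectors using and, or, and not. For example, we might want to find which US states have an area of less than 10 000 or more than 100 000 square miles, or which states have an area of more than 100 000 square miles and also a short name. The code below shows how to do this using the following R symbols:

& ("and")
| ("or")
! ("not")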
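
Before turning to the states data, the three symbols can be tried on a tiny made-up vector where the answers are easy to check by hand. This is only a sketch: the vector area and its values are hypothetical.

```r
# Hypothetical areas in square miles (not real state data)
area <- c(1200, 50000, 150000, 8000)

small <- area < 10000     # TRUE FALSE FALSE TRUE
huge  <- area > 100000    # FALSE FALSE TRUE FALSE

small | huge    # OR:  TRUE where either condition holds
small & huge    # AND: TRUE only where both hold (nowhere in this example)
!small          # NOT: flips every element of small
```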

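When working with logical vectors, the following functions are particularly useful, as illustrated below:

which() identifies which elements of a logical vector are TRUE
sum() can be used to give the number of elements of a logical vector that are TRUE. This works because sum() coerces its input to numbers, and when TRUE and FALSE are converted to numbers they take the values 1 and 0 respectively.
ifelse() returns different values depending on whether each element of a logical vector is TRUE or FALSE. Specifically, a command such as ifelse(aLogicalVector, vectorT, vectorF) takes aLogicalVector and returns the corresponding element of vectorT for each element that is TRUE, and the corresponding element of vectorF for each element that is FALSE. In addition, if vectorT or vectorF is shorter than aLogicalVector, it is extended to the correct length by repetition.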

Input

### In these examples, we'll reuse the American states data, especially the state names
### To remind yourself of them, you might want to look at the vector "state.name"

nchar(state.name)       # nchar() returns the number of characters in strings of text ...
nchar(state.name) <= 6  #so this indicates which states have names of 6 letters or fewer
ShortName <- nchar(state.name) <= 6         #store this logical vector for future use
sum(ShortName)          #With a logical vector, sum() tells us how many are TRUE (11 here)
which(ShortName)        #These are the positions of the 11 elements which have short names
state.name[ShortName]   #Use the index operator [] on the original vector to get the names
state.abb[ShortName]    #Or even on other vectors (e.g. the 2 letter state abbreviations)

isSmall <- state.area < 10000  #Store a logical vector indicating states <10000 sq. miles
isHuge  <- state.area > 100000 #And another for states >100000 square miles in area
sum(isSmall)                   #there are 8 "small" states
sum(isHuge)                    #coincidentally, there are also 8 "huge" states

state.name[isSmall | isHuge]   # | means OR. So these are states which are small OR huge
state.name[isHuge & ShortName] # & means AND. So these are huge AND with a short name
state.name[isHuge & !ShortName]# ! means NOT. So these are huge and with a longer name

### Examples of ifelse() ###

ifelse(ShortName, state.name, state.abb) #mix short names with abbreviations for long ones
# (think of this as "*if* ShortName is TRUE then use state.name *else* use state.abb")

### Many functions in R increase input vectors to the correct size by duplication ###
ifelse(ShortName, state.name, "tooBIG")   #A silly example: the 3rd argument is duplicated
size <- ifelse(isSmall, "small", "large") #A more useful example, for both 2nd & 3rd args
size                                      #might be useful as an indicator variable?             
ifelse(size=="large", ifelse(isHuge, "huge", "medium"), "small") #A more complex example

Result

> ### In these examples, we'll reuse the American states data, especially the state names
> ### To remind yourself of them, you might want to look at the vector "state.name"
>  
> nchar(state.name)       # nchar() returns the number of characters in strings of text ...
 [1]  7  6  7  8 10  8 11  8  7  7  6  5  8  7  4  6  8  9  5  8 13  8  9 11  8  7  8  6 13
[30] 10 10  8 14 12  4  8  6 12 12 14 12  9  5  4  7  8 10 13  9  7
> nchar(state.name) <= 6  #so this indicates which states have names of 6 letters or fewer
 [1] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
[15]  TRUE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
[29] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
[43]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> ShortName <- nchar(state.name) <= 6         #store this logical vector for future use
> sum(ShortName)          #With a logical vector, sum() tells us how many are TRUE (11 here)
[1] 11
> which(ShortName)        #These are the positions of the 11 elements which have short names
 [1]  2 11 12 15 16 19 28 35 37 43 44
> state.name[ShortName]   #Use the index operator [] on the original vector to get the names
 [1] "Alaska" "Hawaii" "Idaho"  "Iowa"   "Kansas" "Maine"  "Nevada" "Ohio"   "Oregon"
[10] "Texas"  "Utah"  
> state.abb[ShortName]    #Or even on other vectors (e.g. the 2 letter state abbreviations)
 [1] "AK" "HI" "ID" "IA" "KS" "ME" "NV" "OH" "OR" "TX" "UT"
>  
> isSmall <- state.area < 10000  #Store a logical vector indicating states <10000 sq. miles
> isHuge  <- state.area > 100000 #And another for states >100000 square miles in area
> sum(isSmall)                   #there are 8 "small" states
[1] 8
> sum(isHuge)                    #coincidentally, there are also 8 "huge" states
[1] 8
>  
> state.name[isSmall | isHuge]   # | means OR. So these are states which are small OR huge
 [1] "Alaska"        "Arizona"       "California"    "Colorado"      "Connecticut"  
 [6] "Delaware"      "Hawaii"        "Massachusetts" "Montana"       "Nevada"       
[11] "New Hampshire" "New Jersey"    "New Mexico"    "Rhode Island"  "Texas"        
[16] "Vermont"      
> state.name[isHuge & ShortName] # & means AND. So these are huge AND with a short name
[1] "Alaska" "Nevada" "Texas" 
> state.name[isHuge & !ShortName]# ! means NOT. So these are huge and with a longer name
[1] "Arizona"    "California" "Colorado"   "Montana"    "New Mexico"
>  
> ### Examples of ifelse() ###
>  
> ifelse(ShortName, state.name, state.abb) #mix short names with abbreviations for long ones
 [1] "AL"     "Alaska" "AZ"     "AR"     "CA"     "CO"     "CT"     "DE"     "FL"    
[10] "GA"     "Hawaii" "Idaho"  "IL"     "IN"     "Iowa"   "Kansas" "KY"     "LA"    
[19] "Maine"  "MD"     "MA"     "MI"     "MN"     "MS"     "MO"     "MT"     "NE"    
[28] "Nevada" "NH"     "NJ"     "NM"     "NY"     "NC"     "ND"     "Ohio"   "OK"    
[37] "Oregon" "PA"     "RI"     "SC"     "SD"     "TN"     "Texas"  "Utah"   "VT"    
[46] "VA"     "WA"     "WV"     "WI"     "WY"    
> # (think of this as "*if* ShortName is TRUE then use state.name *else* use state.abb)
>  
> ### Many functions in R increase input vectors to the correct size by duplication ###
> ifelse(ShortName, state.name, "tooBIG")   #A silly example: the 3rd argument is duplicated
 [1] "tooBIG" "Alaska" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG"
[10] "tooBIG" "Hawaii" "Idaho"  "tooBIG" "tooBIG" "Iowa"   "Kansas" "tooBIG" "tooBIG"
[19] "Maine"  "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG"
[28] "Nevada" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "Ohio"   "tooBIG"
[37] "Oregon" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG" "Texas"  "Utah"   "tooBIG"
[46] "tooBIG" "tooBIG" "tooBIG" "tooBIG" "tooBIG"
> size <- ifelse(isSmall, "small", "large") #A more useful example, for both 2nd & 3rd args
> size                                      #might be useful as an indicator variable?             
 [1] "large" "large" "large" "large" "large" "large" "small" "small" "large" "large"
[11] "small" "large" "large" "large" "large" "large" "large" "large" "large" "large"
[21] "small" "large" "large" "large" "large" "large" "large" "large" "small" "small"
[31] "large" "large" "large" "large" "large" "large" "large" "large" "small" "large"
[41] "large" "large" "large" "large" "small" "large" "large" "large" "large" "large"
> ifelse(size=="large", ifelse(isHuge, "huge", "medium"), "small") #A more complex example
 [1] "medium" "huge"   "huge"   "medium" "huge"   "huge"   "small"  "small"  "medium"
[10] "medium" "small"  "medium" "medium" "medium" "medium" "medium" "medium" "medium"
[19] "medium" "medium" "small"  "medium" "medium" "medium" "medium" "huge"   "medium"
[28] "huge"   "small"  "small"  "huge"   "medium" "medium" "medium" "medium" "medium"
[37] "medium" "medium" "small"  "medium" "medium" "medium" "huge"   "medium" "small" 
[46] "medium" "medium" "medium" "medium" "medium"

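If you have done any computer programming, you may be more used to dealing with logic in the context of "if" statements. Although R also has an if() statement, it is less useful when dealing with vectors. For example, the following R expression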

if(aVariable == 0) print("zero") else print("not zero")

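expects aVariable to be a single number: it prints "zero" if that number is 0 and "not zero" otherwise^[2]. If aVariable is a vector of 2 or more values, older versions of R use only the first element and ignore all the others (with a warning), while R 4.2.0 and later stop with an error^[3]. There are also logical operators meant for single values rather than whole vectors: these are && (for AND) and || (for OR)^[4]. Older versions of R considered only the first element of each vector given to them; since R 4.3.0, giving them a longer vector is an error.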
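
The contrast between if() and ifelse() can be sketched as follows. The vector aVariable and its values are made up for illustration, and any() and all() are base R functions not mentioned in the text above; they are used here as one possible way to reduce a logical vector to the single TRUE or FALSE that if() needs.

```r
# Hypothetical vector with more than one element
aVariable <- c(0, 3, 0)

# ifelse() works element by element and returns a vector of the same length
labels <- ifelse(aVariable == 0, "zero", "not zero")
labels

# if() needs a single TRUE/FALSE, so reduce the logical vector first
if (any(aVariable == 0)) print("at least one zero") else print("no zeros")
if (all(aVariable == 0)) print("all zero") else print("not all zero")

# & compares element by element; && is meant for single values
c(TRUE, FALSE) & c(TRUE, TRUE)
TRUE && FALSE
```

In short, ifelse() and & suit whole vectors, while if() and && suit a single condition controlling which code runs.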

Notes

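[1] Note that when working with continuous (fractional) numbers, rounding error may mean that the results of calculations are not exactly equal to each other, even when it seems they should be. For this reason, be careful when using == with continuous numbers. R provides the function all.equal to help in this case.
[2] But unlike ifelse, it cannot cope with NA values.
[3] Hence using == in if statements may not be a good idea; see the Note in ?"==" for details.
[4] These are particularly helpful in more advanced computer programming in R; see ?"&&" for details.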
