How to subset an R data frame based on a string match in two columns with an OR condition?
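To subset an R data frame based on a string match in either of two columns, we can use the grepl function with double square brackets and the OR operator |. grepl returns a logical vector with one TRUE or FALSE per row, and | combines the two vectors element by element. For example, if we have a data frame called df that contains two string columns, say x and y, then we can keep the rows where a particular string matches in either column as follows: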
Syntax
df[grepl("text",df[["x"]])|grepl("text",df[["y"]]),]
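To see what the two grepl calls contribute before they are combined, here is a minimal sketch on a small hand-made data frame. The values in df and the names in_x, in_y, keep and result are made up for illustration and are not part of the examples in this article.

```r
# Small hand-made data frame (hypothetical values, for illustration only)
df <- data.frame(
  x = c("text one", "other", "other", "some text"),
  y = c("other", "text two", "other", "other"),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per row for each column
in_x <- grepl("text", df[["x"]])
in_y <- grepl("text", df[["y"]])

# The element-wise OR keeps a row when either column matches
keep <- in_x | in_y

# Logical row index, empty column index: keep matching rows, all columns
result <- df[keep, ]
print(result)
```

Only the third row is dropped here, because it is the only one where neither column contains "text".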
Check out the examples below to see how it works.
Example 1
Consider the following data frame:
# No seed is set, so the sampled values differ on every run
f1<-sample(c("India","China","Egypt","UK"),20,replace=TRUE)
f2<-sample(c("India","China","Egypt","UK"),20,replace=TRUE)
v1<-rnorm(20)
df1<-data.frame(f1,f2,v1)
df1
Output
      f1    f2          v1
1  India India  0.58383357
2     UK Egypt -0.71045054
3  India China -0.07848666
4  Egypt India  1.21017481
5  Egypt    UK -0.81991817
6  Egypt China  1.98979283
7  India India  0.36160374
8  Egypt China -1.77619986
9  China    UK -0.05397712
10 India Egypt -0.30372078
11 Egypt India -1.68623489
12 India India -0.41997104
13 India China -0.97064798
14    UK Egypt  2.02704796
15    UK Egypt -0.47732133
16 China China  0.53153059
17 Egypt    UK -1.71608164
18 Egypt India -0.73298689
19    UK    UK  1.83674440
20 China China -1.12186527
Subset df1 to the rows where either of the first two columns matches "India":
# Keep rows where f1 or f2 contains "India" (this overwrites df1)
df1<-df1[grepl("India",df1[["f1"]])|grepl("India",df1[["f2"]]),]
df1
Output
      f1    f2          v1
1  India India  0.58383357
3  India China -0.07848666
4  Egypt India  1.21017481
7  India India  0.36160374
10 India Egypt -0.30372078
11 Egypt India -1.68623489
12 India India -0.41997104
13 India China -0.97064798
18 Egypt India -0.73298689
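When more than two columns need the same check, writing one grepl call per column becomes repetitive. One possible way to generalize the pattern, shown here only as a sketch, is to build the logical vectors with lapply and fold them together with Reduce. The helper name match_any and the small countries data frame are hypothetical and introduced just for this illustration.

```r
# Hypothetical helper: TRUE for rows where any of the given columns matches
match_any <- function(data, cols, pattern) {
  # One logical vector per column
  hits <- lapply(cols, function(col) grepl(pattern, data[[col]]))
  # Fold the vectors together with element-wise OR
  Reduce(`|`, hits)
}

# Small hand-made data frame (hypothetical values)
countries <- data.frame(
  f1 = c("India", "UK", "Egypt", "China"),
  f2 = c("China", "Egypt", "India", "UK"),
  stringsAsFactors = FALSE
)

india_rows <- countries[match_any(countries, c("f1", "f2"), "India"), ]
print(india_rows)
```

For two columns this gives the same logical index as writing the two grepl calls joined by | by hand.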
Example 2
Consider the following data frame:
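# No seed is set, so the sampled values differ on every run
g1<-sample(c("Male","Female"),20,replace=TRUE)
g2<-sample(c("Male","Female"),20,replace=TRUE)
# v2 is generated but not added to df2
v2<-rpois(20,5)
df2<-data.frame(g1,g2)
df2
Output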
       g1     g2
1  Female   Male
2  Female   Male
3  Female Female
4    Male   Male
5    Male Female
6  Female Female
7  Female   Male
8    Male   Male
9    Male Female
10   Male Female
11 Female Female
12   Male   Male
13   Male   Male
14   Male Female
15 Female   Male
16 Female   Male
17 Female   Male
18   Male Female
19 Female Female
20   Male Female
Subset df2 to the rows where either of the first two columns matches "Female":
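# Keep rows where g1 or g2 contains "Female" (this overwrites df2)
df2<-df2[grepl("Female",df2[["g1"]])|grepl("Female",df2[["g2"]]),]
df2
Output
       g1     g2
1  Female   Male
2  Female   Male
3  Female Female
5    Male Female
6  Female Female
7  Female   Male
9    Male Female
10   Male Female
11 Female Female
14   Male Female
15 Female   Male
16 Female   Male
17 Female   Male
18   Male Female
19 Female Female
20   Male Female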
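Because grepl matches a pattern anywhere inside the string, values such as "Male" and "Female" deserve some care: one is almost a substring of the other. The sketch below, using a hypothetical two-element vector g, shows how case sensitivity and anchoring change the result, and how == can be used when an exact match is wanted.

```r
# Hypothetical vector, for illustration only
g <- c("Male", "Female")

# Case-sensitive by default: "Male" does not occur inside "Female"
default_match <- grepl("Male", g)                        # TRUE FALSE

# Ignoring case, "male" is also found inside "Female"
loose_match <- grepl("male", g, ignore.case = TRUE)      # TRUE TRUE

# Anchoring with ^ and $ forces a whole-string match
anchored_match <- grepl("^male$", g, ignore.case = TRUE) # TRUE FALSE

# For a plain exact comparison, == avoids regular expressions altogether
exact_match <- g == "Male"                               # TRUE FALSE

print(rbind(default_match, loose_match, anchored_match, exact_match))
```

Any of these logical vectors can be combined with | in the same way as in the examples above.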