Shell Scripting: Regular Expressions (Part 1)

Topics: regular expression overview, basic regular expressions, extended regular expressions

Regular expression overview

1. Definition of regular expressions

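A regular expression (commonly abbreviated in code as regex, regexp, or RE) is a single string that describes and matches a set of strings conforming to a syntactic rule. Put simply, it is a way of matching strings: with a handful of special symbols you can quickly find, delete, or replace specific text.

A regular expression is a text pattern made up of ordinary characters and metacharacters. Ordinary characters include upper- and lowercase letters, digits, punctuation, and some other symbols. Metacharacters are characters that carry a special meaning inside a regular expression; they can specify how the preceding character (the one immediately before the metacharacter) may appear in the target text.

Regular expressions are mostly used in scripts and text editors.

2. What regular expressions are used for

Regular expressions matter a great deal to system administrators. A running system produces a large volume of messages; some are important, others are merely informational. An administrator who reads all of that output directly cannot quickly locate the important entries, such as "user login failed" or "service failed to start". Regular expressions make it possible to pull out the problem entries quickly, which makes operations work simpler and more convenient.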

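Basic regular expressions

Regular expression syntax is divided, according to strictness and feature set, into basic regular expressions and extended regular expressions. Basic regular expressions are the most fundamental part of the commonly used syntax. Among the common Linux text-processing tools, grep and sed support basic regular expressions. To use them you first need to understand the meaning of the metacharacters they include, which are introduced below one by one with grep examples.

1. Basic regular expression examples

For the examples below, a copy of the httpd configuration file serves as test data.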

[root@localhost ~]# cp /etc/httpd/conf/httpd.conf  /opt/httpd.txt
[root@localhost ~]# cd /opt
[root@localhost opt]# ls
httpd.txt  rh
[root@localhost opt]# cat httpd.txt 
#
# This is the main Apache HTTP server configuration file.  It contains the
# configuration directives that give the server its instructions.
# See  for detailed information.
# In particular, see 
# 
# for a discussion of each configuration directive.
#
# Do NOT simply read the instructions in here without understanding
# what they do.  They're here only as hints or reminders.  If you are unsure
# consult the online docs. You have been warned.
#
# Configuration and logfile names: If the filenames you specify for many
# of the server's control files begin with / (or drive:/ for Win32), the
# server will use that explicit path.  If the filenames do *not* begin
# with /, the value of ServerRoot is prepended -- so 'log/access_log'
# with ServerRoot set to '/www' will be interpreted by the
# server as '/www/log/access_log', where as '/log/access_log' will be
# interpreted as '/log/access_log'.

#
# ServerRoot: The top of the directory tree under which the server's
# configuration, error, and log files are kept.
#
# Do not add a slash at the end of the directory path.  If you point
...// some content omitted ...
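1) Finding specific characters

Use grep to search for specific characters. The "-n" option shows line numbers and "-i" makes the match case-insensitive.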

[root@localhost opt]# grep -n the httpd.txt 
2:# This is the main Apache HTTP server configuration file.  It contains the
3:# configuration directives that give the server its instructions.
9:# Do NOT simply read the instructions in here without understanding
10:# what they do.  They're here only as hints or reminders.  If you are unsure
11:# consult the online docs. You have been warned.
13:# Configuration and logfile names: If the filenames you specify for many
14:# of the server's control files begin with / (or drive:/ for Win32), the
15:# server will use that explicit path.  If the filenames do *not* begin
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
25:# Do not add a slash at the end of the directory path.  If you point
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
27:# Mutex directive, if file-based mutexes are used.  If you wish to share the
35:# ports, instead of the default. See also the 
47:# To be able to use the functionality of a module which was built as a DSO you
48:# have to place corresponding `LoadModule' lines at this location so the
49:# directives contained in it are actually available _before_ they are used.
62:# User/Group: The name (or #number) of the user/group to run httpd as.
71:# The directives in this section set up the values used by the 'main'
74:# any  containers you may define later in the file.
76:# All of these directives may appear inside  containers,
77:# in which case these default settings will be overridden for the
...// some content omitted ...
[root@localhost opt]# grep -ni the httpd.txt 
2:# This is the main Apache HTTP server configuration file.  It contains the
3:# configuration directives that give the server its instructions.
9:# Do NOT simply read the instructions in here without understanding
10:# what they do.  They're here only as hints or reminders.  If you are unsure
11:# consult the online docs. You have been warned.
13:# Configuration and logfile names: If the filenames you specify for many
14:# of the server's control files begin with / (or drive:/ for Win32), the
15:# server will use that explicit path.  If the filenames do *not* begin
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
25:# Do not add a slash at the end of the directory path.  If you point
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
27:# Mutex directive, if file-based mutexes are used.  If you wish to share the
35:# ports, instead of the default. See also the 
47:# To be able to use the functionality of a module which was built as a DSO you
48:# have to place corresponding `LoadModule' lines at this location so the
49:# directives contained in it are actually available _before_ they are used.
62:# User/Group: The name (or #number) of the user/group to run httpd as.
71:# The directives in this section set up the values used by the 'main'
73:#  definition.  These values also provide defaults for
74:# any  containers you may define later in the file.
76:# All of these directives may appear inside  containers,
77:# in which case these default settings will be overridden for the
82:# ServerAdmin: Your address, where problems with the server should be
89:# ServerName gives the name and port that the server uses to identify itself.
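98:# Deny access to the entirety of your server's filesystem. You must
99:# explicitly permit access to web content directories in other
...// some content omitted ...

To invert the selection, for example to find the lines that do not contain "the", use grep with the "-v" option (combined here with "-n" as "-nv").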

[root@localhost opt]# grep -nv the httpd.txt 
1:#
4:# See  for detailed information.
5:# In particular, see 
6:# 
7:# for a discussion of each configuration directive.
8:#
12:#
18:# server as '/www/log/access_log', where as '/log/access_log' will be
19:# interpreted as '/log/access_log'.
20:
21:#
23:# configuration, error, and log files are kept.
24:#
28:# same ServerRoot for multiple httpd daemons, you will need to change at
29:# least PidFile.
30:#
31:ServerRoot /etc/httpd
32:
33:#
34:# Listen: Allows you to bind Apache to specific IP addresses and/or
36:# directive.
37:#
38:# Change this to Listen on specific IP addresses as shown below to 
39:# prevent Apache from glomming onto all bound IP addresses.
40:#
41:#Listen 12.34.56.78:80
42:Listen 80
43:
44:#
45:# Dynamic Shared Object (DSO) Support
46:#
50:# Statically compiled modules (those listed by `httpd -l') do not need
51:# to be loaded here
...// some content omitted ...
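
The three options used so far can be compared side by side on a small throwaway file. This is only an illustrative sketch: the sample lines and the variable names are assumptions and are not taken from httpd.txt.

```shell
# Hypothetical three-line sample; not taken from httpd.txt.
sample=$(mktemp)
printf '%s\n' 'The server reads the file' 'THE END' 'nothing here' > "$sample"

# -n: prefix each matching line with its line number (case-sensitive match).
with_numbers=$(grep -n 'the' "$sample")

# -i: ignore case, so "The" and "THE" match as well.
ignore_case=$(grep -ni 'the' "$sample")

# -v: invert the selection and print the lines that do NOT contain "the".
inverted=$(grep -nv 'the' "$sample")

printf '%s\n' "$with_numbers" '---' "$ignore_case" '---' "$inverted"
rm -f "$sample"
```

Capturing each result in a variable keeps the three behaviours easy to compare; in interactive use the grep commands would normally be run directly.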
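2) Using brackets "[]" to match a set of characters

Add the strings shirt, short, wd, wod, wood, and woooood to the test file httpd.txt. The strings "shirt" and "short" both contain "sh" and "rt", so the command below finds both at once. However many characters appear inside "[]", the bracket expression stands for exactly one character, so "[io]" matches either "i" or "o".

[root@localhost opt]# vim httpd.txt
...// some content omitted ...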
# Supplemental configuration
#
# Load config files in the /etc/httpd/conf.d directory, if any.
IncludeOptional conf.d/*.conf
shirt
short
wd
wod
wood
woooood
:wq
[root@localhost opt]# grep -n 'sh[io]rt' httpd.txt
354:shirt
355:short

To find lines containing the repeated character "oo", run the following command.

[root@localhost opt]# grep -n 'oo' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.  
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
130:# Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 The server made a boo boo.
358:wood
359:woooood

To find "oo" that is not preceded by "w", use a negated bracket expression, "[^]".

[root@localhost opt]# grep -n '[^w]oo' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.  
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
130:# Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 The server made a boo boo.
359:woooood

The output shows that "woooood" still matches: inside it, an "oo" is preceded by another "o", which satisfies "[^w]". To exclude any lowercase letter before "oo", use the command grep -n '[^a-z]oo' httpd.txt, where "a-z" denotes the lowercase letters; uppercase letters are written "A-Z".

[root@localhost opt]# grep -n '[^a-z]oo' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
230:    # access content that does not live under the DocumentRoot.

Lines containing a digit can be found with grep -n '[0-9]' httpd.txt.

[root@localhost opt]# grep -n '[0-9]' httpd.txt
4:# See  for detailed information.
6:# 
14:# of the server's control files begin with / (or drive:/ for Win32), the
41:#Listen 12.34.56.78:80
42:Listen 80
95:#ServerName www.example.com:80
141:    # http://httpd.apache.org/docs/2.4/mod/core.html#options
311:# interpretation of all content as UTF-8 by default.  To use the 
312:# default browser choice (ISO-8859-1), or to allow the META tags
316:AddDefaultCharset UTF-8
329:# 1) plain text 2) local redirects 3) external redirects
332:#ErrorDocument 500 The server made a boo boo.
333:#ErrorDocument 404 /missing.html
334:#ErrorDocument 404 /cgi-bin/missing_handler.pl
335:#ErrorDocument 402 http://www.example.com/subscription_info.html
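
The bracket expressions from this subsection can be tried on a handful of words without touching httpd.txt. The following is a minimal sketch under stated assumptions: the sample lines are hypothetical, and LC_ALL=C is set so that "[a-z]" means only the ASCII lowercase letters, since range behaviour can vary by locale.

```shell
# Assumption: force the C locale so [a-z] covers exactly the ASCII lowercase letters.
export LC_ALL=C
sample=$(mktemp)
# Hypothetical sample lines.
printf '%s\n' shirt short wood woooood ServerRoot 'Listen 80' > "$sample"

# [io] matches exactly one character, either "i" or "o".
set_match=$(grep -n 'sh[io]rt' "$sample")

# [^w] matches one character that is not "w"; "woooood" still matches
# because an "oo" inside it is preceded by another "o".
not_w=$(grep -n '[^w]oo' "$sample")

# [^a-z] rejects every lowercase letter in front of "oo".
not_lower=$(grep -n '[^a-z]oo' "$sample")

# [0-9] matches any single digit.
digits=$(grep -n '[0-9]' "$sample")

printf '%s\n' "$set_match" '---' "$not_w" '---' "$not_lower" '---' "$digits"
rm -f "$sample"
```

Note that "wood" drops out of the "[^w]oo" result while "woooood" stays, which mirrors the behaviour seen with httpd.txt above.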
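3) Matching the start of a line "^" and the end of a line "$"

Basic regular expressions include two anchor metacharacters: "^" (start of line) and "$" (end of line). To find lines that begin with the string "Ser", use the "^" metacharacter.

[root@localhost opt]# grep -n '^Ser' httpd.txt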
31:ServerRoot /etc/httpd
86:ServerAdmin root@localhost

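Lines that begin with a lowercase letter can be filtered with "^[a-z]", lines that begin with an uppercase letter with "^[A-Z]", and lines that do not begin with a letter with "^[^a-zA-Z]". The "^" symbol behaves differently inside and outside a bracket expression "[]": inside the brackets it negates the set, while outside it anchors the match at the start of the line.

[root@localhost opt]# grep -n '^[a-z]' httpd.txt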
354:shirt
355:short
356:wd
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n '^[A-Z]' httpd.txt
31:ServerRoot /etc/httpd
42:Listen 80
56:Include conf.modules.d/*.conf
66:User apache
67:Group apache
86:ServerAdmin root@localhost
119:DocumentRoot /var/www/html
182:ErrorLog logs/error_log
189:LogLevel warn
316:AddDefaultCharset UTF-8
348:EnableSendfile on
353:IncludeOptional conf.d/*.conf
[root@localhost opt]# grep -n '^[^a-zA-Z]' httpd.txt
1:#
2:# This is the main Apache HTTP server configuration file.  It contains the
3:# configuration directives that give the server its instructions.
4:# See  for detailed information.
5:# In particular, see 
6:# 
7:# for a discussion of each configuration directive.
8:#
9:# Do NOT simply read the instructions in here without understanding
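10:# what they do.  They're here only as hints or reminders.  If you are unsure
11:# consult the online docs. You have been warned.
...// some content omitted ...

To find lines that end with a particular character, use the "$" anchor. For example, the following command finds lines ending with a period (.). Because the period is itself a metacharacter in regular expressions, the escape character "\" is needed to turn it into an ordinary character.

[root@localhost opt]# grep -n '\.$' httpd.txt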
3:# configuration directives that give the server its instructions.
4:# See  for detailed information.
7:# for a discussion of each configuration directive.
19:# interpreted as '/log/access_log'.
23:# configuration, error, and log files are kept.
29:# least PidFile.
36:# directive.
...// some content omitted ...

To find blank lines, run grep -n '^$' httpd.txt.

[root@localhost opt]# grep -n '^$' httpd.txt
20:
32:
43:
57:
68:
80:
87:
96:
...// some content omitted ...
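
The anchors can also be checked on a tiny file where the expected answer is easy to see. This is a sketch only; the four sample lines are hypothetical, and LC_ALL=C is an assumption made so the letter ranges stay plain ASCII.

```shell
export LC_ALL=C   # assumption: keep [a-z] and [A-Z] to plain ASCII ranges
sample=$(mktemp)
# Hypothetical sample: a directive, a comment ending in ".", a blank line, a lowercase line.
printf '%s\n' 'ServerRoot /etc/httpd' '# a comment.' '' 'listen 80' > "$sample"

# "^" outside brackets anchors the match at the start of the line.
starts_with_ser=$(grep -n '^Ser' "$sample")
# The first character must be a lowercase letter.
starts_lower=$(grep -n '^[a-z]' "$sample")
# "^" inside brackets negates the set: the first character is not a letter.
starts_nonalpha=$(grep -n '^[^a-zA-Z]' "$sample")
# "\." is a literal period and "$" anchors the match at the end of the line.
ends_with_dot=$(grep -n '\.$' "$sample")
# Start immediately followed by end: an empty line.
blank_lines=$(grep -n '^$' "$sample")

printf '%s\n' "$starts_with_ser" "$starts_lower" "$starts_nonalpha" "$ends_with_dot" "$blank_lines"
rm -f "$sample"
```

The blank line is not reported by '^[^a-zA-Z]' because that pattern still requires one character at the start of the line.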
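4) Matching any single character "." and repetition "*"

In regular expressions the period (.) is also a metacharacter; it stands for any single character. For example, the following command finds strings of the form "w??d": four characters, starting with "w" and ending with "d".

[root@localhost opt]# grep -n 'w..d' httpd.txt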
108:# Note that from this point forward you must specifically allow
148:    # It can be All, None, or any combination of the keywords:
358:wood

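In the result above, the string "wood" matches the pattern "w..d". To search for o, oo, ooooo, and so on, use the asterisk (*) metacharacter. Note that "*" repeats the preceding single character zero or more times. "o*" matches zero "o" characters (the empty string) or one or more of them, and because the empty string is allowed, grep -n 'o*' httpd.txt prints every line of the file. With "oo*", the first "o" must be present and the second may occur zero or more times, so any line containing o, oo, ooooo, and so on matches. Likewise, to find strings with at least two "o" characters, run grep -n 'ooo*' httpd.txt.

[root@localhost opt]# grep -n 'o*' httpd.txt
...// some content omitted ...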
353:IncludeOptional conf.d/*.conf
354:shirt
355:short
356:wd
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n 'oo*' httpd.txt
...// some content omitted ...
353:IncludeOptional conf.d/*.conf
355:short
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n 'ooo*' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.  
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
130:# Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 The server made a boo boo.
358:wood
359:woooood

To find strings that start with "w", end with "d", and contain at least one "o" in between, run the following command.

[root@localhost opt]# grep -n 'woo*d' httpd.txt
357:wod
358:wood
359:woooood

To find strings that start with "w" and end with "d", with any characters (or none) in between:

[root@localhost opt]# grep -n 'w.*d' httpd.txt
...// some content omitted ...
342:# be turned off when serving from networked-mounted 
356:wd
357:wod
358:wood
359:woooood

To find lines containing any digits:

[root@localhost opt]# grep '[0-9][0-9]*' httpd.txt
# See  for detailed information.
# 
# of the server's control files begin with / (or drive:/ for Win32), the
#Listen 12.34.56.78:80
Listen 80
#ServerName www.example.com:80
    # http://httpd.apache.org/docs/2.4/mod/core.html#options
# interpretation of all content as UTF-8 by default.  To use the 
# default browser choice (ISO-8859-1), or to allow the META tags
AddDefaultCharset UTF-8
# 1) plain text 2) local redirects 3) external redirects
#ErrorDocument 500 The server made a boo boo.
#ErrorDocument 404 /missing.html
#ErrorDocument 404 /cgi-bin/missing_handler.pl
#ErrorDocument 402 http://www.example.com/subscription_info.html
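
The difference between ".", "*", and their combination is easiest to see on a short word list. The sketch below uses hypothetical sample words and variable names; it is meant as an illustration rather than a reproduction of the httpd.txt results.

```shell
sample=$(mktemp)
# Hypothetical sample words.
printf '%s\n' wd wod wood woooood word > "$sample"

# "w", then any two characters, then "d".
four_chars=$(grep -n 'w..d' "$sample")
# "o*" can match the empty string, so every line is selected; -c counts them.
every_line=$(grep -c 'o*' "$sample")
# "oo" followed by zero or more "o": at least two "o" in a row.
two_or_more_o=$(grep -n 'ooo*' "$sample")
# At least one "o" between "w" and "d".
w_o_d=$(grep -n 'woo*d' "$sample")
# ".*" allows anything, including nothing, between "w" and "d".
w_any_d=$(grep -n 'w.*d' "$sample")

printf '%s\n' "$four_chars" '---' "$every_line" '---' "$two_or_more_o" '---' "$w_o_d" '---' "$w_any_d"
rm -f "$sample"
```

Here 'w..d' selects "wood" and "word" but not "woooood", because the pattern fixes the distance between "w" and "d" at exactly two characters.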
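5) Matching a repetition range "{}"

The examples above used "." and "*" to match anywhere from zero to unlimited repetitions. To restrict repetition to a given range, use the interval metacharacters "{}". In basic regular expressions a bare "{" or "}" is an ordinary character, so each brace must be preceded by the escape character "\" to give it the interval meaning: "\{" and "\}".

(1) Find two consecutive "o" characters.

[root@localhost opt]# grep -n 'o\{2\}' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's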
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.  
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
130:# Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 The server made a boo boo.
358:wood
359:woooood

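(2) Find a run of 2 to 5 consecutive "o" characters. The command below selects every line that contains such a run; to also require "w" before the run and "d" after it, the pattern would be 'wo\{2,5\}d'.

[root@localhost opt]# grep -n 'o\{2,5\}' httpd.txt
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's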
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot /etc/httpd
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.  
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot /var/www/html
130:# Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 The server made a boo boo.
358:wood
359:woooood

(3) Find strings that start with "w", end with "d", and contain two or more "o" characters in between.

[root@localhost opt]# grep -n 'wo\{2,\}d' httpd.txt
358:wood
359:woooood
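
The three interval forms can be compared on a word list in which the number of "o" characters varies. This is a rough sketch with hypothetical sample words; the last word is assumed only to contain more than five "o" characters.

```shell
sample=$(mktemp)
# Hypothetical sample words; the last one has more than five "o" characters.
printf '%s\n' wd wod wood woooood woooooooood > "$sample"

# \{2\}: exactly two "o" between "w" and "d".
exactly_two=$(grep -n 'wo\{2\}d' "$sample")
# \{2,5\}: between two and five "o".
two_to_five=$(grep -n 'wo\{2,5\}d' "$sample")
# \{2,\}: two or more "o", with no upper bound.
two_or_more=$(grep -n 'wo\{2,\}d' "$sample")

printf '%s\n' "$exactly_two" '---' "$two_to_five" '---' "$two_or_more"
rm -f "$sample"
```

Anchoring the run between "w" and "d" is what makes the counts meaningful; a bare 'o\{2\}' would select any line that contains "oo" anywhere, however long the run.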
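2. Metacharacter summary

Metacharacter    Meaning
^          Matches the start of the input string. Inside a bracket expression it instead means "not in this set". To match a literal "^", use "\^".
$          Matches the end of the input string. In engines that provide a RegExp object, setting its Multiline property makes "$" also match at "\n" or "\r". To match a literal "$", use "\$".
.          Matches any single character except the line terminators "\r" and "\n".
\          Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline. The sequence '\\' matches "\", and '\(' matches "(" (in the basic syntax used by grep, '\(' instead opens a group).
*          Matches the preceding subexpression zero or more times. To match a literal "*", use "\*".
[]         Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^]        Negated character set. Matches any one character that is not enclosed. For example, "[^abc]" matches any of "p", "l", "i", "n" in "plain".
[n1-n2]    Character range. Matches any one character in the given range. For example, "[a-z]" matches any lowercase letter from "a" to "z". Note: a hyphen (-) denotes a range only when it appears inside the brackets between two characters; at the start of the bracket expression it stands for the hyphen itself.
{n}        n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob" but matches the two o's in "food".
{n,}       n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" but matches all the o's in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".
{n,m}      m and n are non-negative integers with n<=m. Matches at least n and at most m times.

In grep's basic syntax the three interval forms are written "\{n\}", "\{n,\}", and "\{n,m\}", as in the examples above.

Extended regular expressions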

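Basic regular expressions are usually enough, but extended regular expressions, which cover more operators, can simplify a command. For example, to show every line except blank lines and lines starting with "#" (a common way to view the effective configuration) with basic regular expressions, run grep -v '^$' httpd.txt | grep -v '^#'. This needs a pipe and two searches. With extended regular expressions it reduces to egrep -v '^$|^#' httpd.txt, where the "|" inside the single quotes means "or".

[root@localhost opt]# grep -v '^$' httpd.txt | grep -v '^#'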
ServerRoot /etc/httpd
Listen 80
Include conf.modules.d/*.conf
User apache
Group apache
ServerAdmin root@localhost

    AllowOverride none
    Require all denied

...// some content omitted ...
[root@localhost opt]# egrep -v '^$|^#' httpd.txt
ServerRoot /etc/httpd
Listen 80
Include conf.modules.d/*.conf
User apache
Group apache
ServerAdmin root@localhost

    AllowOverride none
    Require all denied

DocumentRoot /var/www/html

    AllowOverride None
    # Allow open access:
    Require all granted

...// some content omitted ...
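
That the two-pass and one-pass forms select the same lines can be checked on a small configuration fragment. This is a sketch under assumptions: the fragment and variable names are hypothetical, and grep -E is used as the equivalent spelling of egrep.

```shell
conf=$(mktemp)
# Hypothetical configuration fragment: comments, a blank line, two directives.
printf '%s\n' '# main config' '' 'Listen 80' '# user' 'User apache' > "$conf"

# Basic syntax: two passes joined by a pipe.
two_pass=$(grep -v '^$' "$conf" | grep -v '^#')

# Extended syntax: one pass, where "|" means "or".
one_pass=$(grep -E -v '^$|^#' "$conf")

printf '%s\n' "$one_pass"
rm -f "$conf"
```

Both variables end up holding only the two directive lines, so the single extended pattern can replace the pipeline.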

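By default grep interprets patterns as basic regular expressions. To use extended regular expressions, use egrep (equivalent to grep -E) or awk. awk is covered later; here we use egrep directly. egrep is used in much the same way as grep. It searches files for a pattern: it can look for arbitrary strings and symbols in one or more files, and the pattern can be a single character, a string, a word, or a sentence.

Like basic regular expressions, extended regular expressions include several metacharacters. The common ones are:

Metacharacter    Meaning
+          Repeats the preceding character one or more times
?          Matches the preceding character zero or one time
|          Matches any one of several alternatives (or)
()         Groups a string into a unit
()+        Matches one or more repetitions of a group

Examples

Run egrep -n 'wo+d' httpd.txt to find strings such as wod, wood, and woooood.

[root@localhost opt]# egrep -n 'wo+d' httpd.txt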
357:wod
358:wood
359:woooood

Run egrep -n 'wo?d' httpd.txt to find the two strings "wd" and "wod".

[root@localhost opt]# egrep -n 'wo?d' httpd.txt
168:# The following lines prevent .htaccess and .htpasswd files from being 
356:wd
357:wod

Run egrep -n 'if|is|on' httpd.txt to find lines containing "if", "is", or "on".

[root@localhost opt]# egrep -n 'if|is|on' httpd.txt
2:# This is the main Apache HTTP server configuration file.  It contains the
3:# configuration directives that give the server its instructions.
4:# See  for detailed information.
7:# for a discussion of each configuration directive.
9:# Do NOT simply read the instructions in here without understanding
10:# what they do.  They're here only as hints or reminders.  If you are unsure
11:# consult the online docs. You have been warned.
13:# Configuration and logfile names: If the filenames you specify for many
14:# of the server's control files begin with / (or drive:/ for Win32), the
16:# with /, the value of ServerRoot is prepended -- so 'log/access_log'
23:# configuration, error, and log files are kept.
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
27:# Mutex directive, if file-based mutexes are used.  If you wish to share the
...// some content omitted ...

Run egrep -n 'sh(i|o)rt' httpd.txt to find "shirt" and "short". Because the two words share "sh" and "rt", put "i" and "o" inside "()" separated by "|" to match either shirt or short.

[root@localhost opt]# egrep -n 'sh(i|o)rt' httpd.txt
354:shirt
355:short

Run egrep -n 'A(xyz)+C' httpd.txt. This finds strings that start with "A", end with "C", and have one or more "xyz" in between. First add the strings AxyzC and AxyzxyzC to httpd.txt.

[root@localhost opt]# vim httpd.txt
...// some content omitted ...
woooood
AxyzC
AxyzxyzC
~
~
:wq
[root@localhost opt]# egrep -n 'A(xyz)+C' httpd.txt
360:AxyzC
361:AxyzxyzC
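
The extended metacharacters from this section can be tried together on one short word list. The sketch below is illustrative only: the sample words and variable names are hypothetical, and grep -E is used as the equivalent spelling of egrep.

```shell
sample=$(mktemp)
# Hypothetical sample words.
printf '%s\n' wd wod wood shirt short AC AxyzC AxyzxyzC > "$sample"

# "+": one or more of the preceding character.
one_or_more=$(grep -En 'wo+d' "$sample")
# "?": zero or one of the preceding character.
zero_or_one=$(grep -En 'wo?d' "$sample")
# "(i|o)": a group containing two alternatives.
either=$(grep -En 'sh(i|o)rt' "$sample")
# "(xyz)+": one or more repetitions of the whole group.
group_repeat=$(grep -En 'A(xyz)+C' "$sample")

printf '%s\n' "$one_or_more" '---' "$zero_or_one" '---' "$either" '---' "$group_repeat"
rm -f "$sample"
```

The line "AC" is not selected by 'A(xyz)+C' because "+" requires the group to appear at least once.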

Article title: Shell Scripting: Regular Expressions (Part 1)
Reposted from: http://myzitong.com/article/cjiihi.html