Shell Script Programming: Regular Expressions, Part 2 (Extended Regular Expressions and sed)


I. Preface

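This article introduces extended regular expressions and, in the same style as the previous article, the sed text-processing tool. sed is one of the "three musketeers" of shell programming (grep, sed, and awk) and is very useful for processing text.
For the concept of regular expressions and examples of using grep with them, see: https://blog.51cto.com/14557673/2455588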


II. Extended regular expressions

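Extended regular expressions exist mainly to make commands shorter. For example, to list every line of a file except blank lines and lines that start with # (typically to see the settings that are actually in effect), basic regular expressions need "grep -v '^$' test.txt | grep -v '^#'", while extended regular expressions reduce this to "egrep -v '^$|^#' test.txt". grep uses basic regular expressions by default, so extended regular expressions require egrep (equivalent to grep -E) or awk. The next article covers awk.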
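The comparison above can be sketched as follows. This is only an illustration: the temporary file and its contents are made up, and grep -E is used in place of egrep since the two are equivalent.

```shell
# Hypothetical sample config; the contents are illustrative only.
conf=$(mktemp)
printf '%s\n' '# a comment' '' 'port=8080' '# another comment' 'host=localhost' > "$conf"

# Basic regular expressions: two grep passes.
basic=$(grep -v '^$' "$conf" | grep -v '^#')

# Extended regular expressions: one pass with alternation.
extended=$(grep -E -v '^$|^#' "$conf")

printf '%s\n' "$extended"
rm -f "$conf"
```

Both variables should hold the same two lines, port=8080 and host=localhost.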

egrep searches files for a pattern. It can look for any string or symbol in one or more files, and the pattern can be a single character, a string, a word, or a sentence.

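Common extended regular expression metacharacters

Metacharacter, purpose, and example
+ Purpose: matches the preceding character one or more times. Example: "egrep -n 'wo+d' test.txt" finds strings such as "wood", "woood", and "woooooood".
? Purpose: matches the preceding character zero or one time. Example: "egrep -n 'bes?t' test.txt" finds the two strings "bet" and "best".
| Purpose: matches any one of several alternatives (or). Example: "egrep -n 'of|is|on' test.txt" finds "of", "is", or "on".
() Purpose: groups part of a pattern. Example: "egrep -n 't(a|e)st' test.txt". "tast" and "test" share the "t" and the "st", so "a" and "e" are placed inside "()" and separated by "|"; the command finds "tast" or "test".
()+ Purpose: matches one or more repetitions of a group. Example: "egrep -n 'A(xyz)+C' test.txt" finds strings that start with "A", end with "C", and have one or more "xyz" in between.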
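A small sketch of the metacharacters in the table, run against a made-up word list rather than test.txt. The -x option (match the whole line) is an addition here so that each result shows exactly which strings the pattern accepts.

```shell
# Illustrative word list; one candidate string per line.
words=$(printf '%s\n' wd wod wood bet best beest tast test tost AC AxyzC AxyzxyzC)

plus=$(printf '%s\n' "$words" | grep -E -x 'wo+d')        # one or more "o"
opt=$(printf '%s\n' "$words" | grep -E -x 'bes?t')        # zero or one "s"
group=$(printf '%s\n' "$words" | grep -E -x 't(a|e)st')   # "a" or "e" between "t" and "st"
rep=$(printf '%s\n' "$words" | grep -E -x 'A(xyz)+C')     # one or more "xyz" between "A" and "C"

printf '%s\n' "$plus" "$opt" "$group" "$rep"
```

Note that "wd", "beest", "tost", and "AC" are rejected: + needs at least one "o", ? allows at most one "s", the group allows only "a" or "e", and (xyz)+ needs at least one "xyz".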

III. The sed text processor: a brief introduction

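sed (Stream Editor) is a powerful yet simple tool for parsing and transforming text. It reads text, edits it according to the conditions you specify (delete, replace, add, move, and so on), and then outputs either all lines or only the lines it processed. sed can carry out fairly complex text processing without any interaction, so it is widely used in shell scripts for all kinds of automation tasks.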

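sed works in three stages: read, execute, and display.

Read: sed reads the input one line at a time and stores the line in a temporary buffer, called the pattern space.

Execute: by default the commands run in order on the pattern space for every line; if a command has a line address, it runs only on the lines that address selects.

Display: the modified content is sent to the output stream. After the data is sent, the pattern space is cleared.

Note: by default all sed commands run in the pattern space, so the input file itself does not change unless the output is redirected and saved (or the -i option listed under the options is used).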
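The note above can be checked with a short sketch. The temporary file and its two lines are invented for the illustration.

```shell
# Hypothetical two-line input file.
f=$(mktemp)
printf '%s\n' 'one' 'two' > "$f"

# The edit happens in the pattern space; the result goes to standard output only.
shown=$(sed 's/one/ONE/' "$f")
after_plain=$(cat "$f")     # the file still holds the original text

# Redirecting the output to another file keeps the edited result.
sed 's/one/ONE/' "$f" > "$f.new"
saved=$(cat "$f.new")

printf '%s\n' "$shown"
rm -f "$f" "$f.new"
```

Redirecting to a different file is deliberate: redirecting straight back onto the input file would truncate it before sed reads it.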

Usage: sed [options] 'command' file... or sed [options] -f scriptfile file...

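Options:

  • -e or --expression=: process the input text with the given command or script.
  • -f or --file=: process the input text with the given script file.
  • -h or --help: show help.
  • -n, --quiet, or --silent: suppress the automatic printing of every line, so that only explicitly printed results (for example from p) are shown.
  • -i: edit the text file in place.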
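A minimal sketch of how -n, -e, and -f behave, using three made-up input lines and a temporary script file (both are assumptions for the illustration):

```shell
input=$(printf '%s\n' 'alpha' 'beta' 'gamma')

# Without -n, p prints the matching line in addition to the automatic print.
dup=$(printf '%s\n' "$input" | sed '/beta/p')

# With -n, only what p prints explicitly is shown.
only=$(printf '%s\n' "$input" | sed -n '/beta/p')

# Several -e expressions are applied in order to each line.
multi=$(printf '%s\n' "$input" | sed -e 's/alpha/A/' -e 's/gamma/G/')

# -f reads the same commands from a script file (hypothetical temporary file).
script=$(mktemp)
printf '%s\n' 's/alpha/A/' 's/gamma/G/' > "$script"
from_file=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$multi"
```

Here dup contains "beta" twice, only contains just "beta", and the -e and -f forms give the same result.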

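Commands:

  • a: append; adds a line with the given content below the current line.
  • c: change; replaces the selected lines with the given content.
  • d: delete; deletes the selected lines.
  • i: insert; adds a line with the given content above the selected line.
  • p: print; with an address it prints the selected lines, without one it prints everything. It is usually combined with the -n option. (Showing non-printing characters as escape codes is the job of the separate l command, not p.)
  • s: substitute; replaces the specified string.
  • y: transliterate; converts characters one for one.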
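The a, i, c, and y commands can be sketched as below. The input lines are invented, and the one-line forms such as '2a NEW' assume GNU sed; other sed implementations may require the text on a separate line after a backslash.

```shell
input=$(printf '%s\n' 'abc' 'bcd' 'cde')

appended=$(printf '%s\n' "$input" | sed '2a NEW')    # new line after line 2
inserted=$(printf '%s\n' "$input" | sed '2i NEW')    # new line before line 2
changed=$(printf '%s\n' "$input" | sed '2c NEW')     # line 2 replaced as a whole
mapped=$(printf '%s\n' "$input" | sed 'y/abc/ABC/')  # a->A, b->B, c->C, character by character

printf '%s\n' "$mapped"
```

Unlike s, the y command takes no regular expression: each character in the first list is mapped to the character at the same position in the second list, so "bcd" becomes "BCd".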

IV. sed usage examples in detail

1) Printing text that matches a condition (p prints the line as it is)

[root@lokott opt]# sed -n 'p' test.txt   # same as cat test.txt
he was short and fat.
He was wearing a blue polo shirt with black pants. 
The home of Football on BBC Sport online.
the tongue is boneless but it breaks bones.12!
 google is the best tools for search keyword.
The year ahead will test our political establishment to the limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #
#woooooood # 
AxyzxyzxyzxyzC
I bet this place is really spooky late at night! 
Misfortunes never come alone/single.
I shouldn't have lett so tast.
[root@lokott opt]# cat test.txt |wc -l
17
[root@lokott opt]# sed -n 'p' test.txt |wc -l
17

[root@lokott opt]# sed -n '2p' test.txt                 # print line 2
He was wearing a blue polo shirt with black pants. 
[root@lokott opt]# sed -n '2,5p' test.txt               # print lines 2 to 5
He was wearing a blue polo shirt with black pants. 
The home of Football on BBC Sport online.
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
[root@lokott opt]# sed -n 'n;p' test.txt       # repeat: skip one line, print the next (the even-numbered lines)
He was wearing a blue polo shirt with black pants. 
the tongue is boneless but it breaks bones.12!
The year ahead will test our political establishment to the limit.
a wood cross!

#woood #
AxyzxyzxyzxyzC
Misfortunes never come alone/single.
[root@lokott opt]# sed -n 'p;n' test.txt      # repeat: print one line, skip the next (the odd-numbered lines)
he was short and fat.
The home of Football on BBC Sport online.
 google is the best tools for search keyword.
PI=3.141592653589793238462643383249901429
Actions speak louder than words

#woooooood # 
I bet this place is really spooky late at night! 
I shouldn't have lett so tast.

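[root@lokott opt]# nl test.txt  # list the text with line numbers, for easier reading; nl does not number empty lines, so these numbers can differ from the line numbers sed uses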
     1  he was short and fat.               
     2  He was wearing a blue polo shirt with black pants. 
     3  The home of Football on BBC Sport online.
     4  the tongue is boneless but it breaks bones.12!
     5   google is the best tools for search keyword.
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     8  a wood cross!
     9  Actions speak louder than words

    10  #woood #
    11  #woooooood # 
    12  AxyzxyzxyzxyzC
    13  I bet this place is really spooky late at night! 
    14  Misfortunes never come alone/single.
    15  I shouldn't have lett so tast.
[root@lokott opt]# sed -n '2,5{p;n}' test.txt   # print lines 2 and 4
He was wearing a blue polo shirt with black pants. 
the tongue is boneless but it breaks bones.12!
[root@lokott opt]# sed -n '2,${p;n}' test.txt  # print every other line from line 2 to the last line
He was wearing a blue polo shirt with black pants. 
the tongue is boneless but it breaks bones.12!
The year ahead will test our political establishment to the limit.
a wood cross!

#woood #
AxyzxyzxyzxyzC
Misfortunes never come alone/single.
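The n;p and p;n patterns are easier to follow on numbered input. The sketch below uses the numbers 1 to 7 instead of test.txt and assumes GNU sed, as in the transcript above.

```shell
nums=$(printf '%s\n' 1 2 3 4 5 6 7)

# n loads the next line without printing (because of -n), then p prints it.
even=$(printf '%s\n' "$nums" | sed -n 'n;p')

# p prints the current line, then n skips over the next one.
odd=$(printf '%s\n' "$nums" | sed -n 'p;n')

# The same p;n pair restricted to the address range 2,5.
range=$(printf '%s\n' "$nums" | sed -n '2,5{p;n}')

printf '%s\n' "$even"
```

In the range case, line 5 is consumed by n while sed is processing line 4, so only lines 2 and 4 are printed, which matches the transcript above.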

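When sed is combined with regular expressions the format is slightly different: the regular expression must be enclosed in "/", as in the following examples: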

[root@lokott opt]# sed -n '/the/p' test.txt                     # print the lines containing the
the tongue is boneless but it breaks bones.12!
 google is the best tools for search keyword.
The year ahead will test our political establishment to the limit. 
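# Line 4 is always printed, whether or not it contains "the": the command prints from
# line 4 up to the next line that matches /the/, and a match on line 4 itself does not count.
[root@lokott opt]# sed -n '4,/the/p' test.txt
the tongue is boneless but it breaks bones.12!
 google is the best tools for search keyword.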
[root@lokott opt]# sed -n '4,/the/p' test.txt  # same command after the text of test.txt was edited to demonstrate the effect
tahe tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
[root@lokott opt]# sed -n '/the/=' test.txt # print the line numbers of the lines containing the
5
6
[root@lokott opt]# sed -n '/^PI/p' test.txt # print the lines starting with PI; grep '^PI' test.txt does the same
PI=3.141592653589793238462643383249901429
[root@lokott opt]# sed -n '/[0-9]$/p' test.txt # print the lines ending with a digit
PI=3.141592653589793238462643383249901429
[root@lokott opt]# sed -n '/\<wood\>/p' test.txt # print the lines containing the word wood; \< and \> mark word boundaries
a wood cross!
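The address forms used in this part can be sketched on a small invented input. The \< and \> word-boundary markers are a GNU extension, so GNU sed is assumed.

```shell
text=$(printf '%s\n' 'l1' 'l2 the' 'l3' 'l4 the' 'l5' 'l6 the' 'l7')

# Starts at line 4 (which itself contains "the") and ends at the NEXT match, line 6.
range=$(printf '%s\n' "$text" | sed -n '4,/the/p')

# Line numbers of every line that matches /the/.
numbers=$(printf '%s\n' "$text" | sed -n '/the/=')

# Word boundaries: "wood" matches as a whole word; "plywood" and "woods" do not.
whole=$(printf '%s\n' 'a wood cross' 'plywood' 'woods' | sed -n '/\<wood\>/p')

printf '%s\n' "$range"
```

The range result shows why a match on the starting line does not end the range: sed begins looking for the end pattern on the line after the start address.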

2) Deleting text that matches a condition (d)

[root@lokott opt]# nl test.txt | sed '3d' # delete line 3
     1  he was short and fat.
     2  He was wearing a blue polo shirt with black pants. 
     4  tahe tongue is boneless but it breaks bones.12!
     5  google is the best tools for search keyword.
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     8  a wood cross!
     9  Actions speak louder than words

    10  #woood #
    11  #woooooood # 
    12  AxyzxyzxyzxyzC
    13  I bet this place is really spooky late at night! 
    14  Misfortunes never come alone/single.
    15  I shouldn't have lett so tast.

[root@lokott opt]# nl test.txt | sed '3,5d' # delete lines 3 to 5
     1  he was short and fat.
     2  He was wearing a blue polo shirt with black pants. 
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     8  a wood cross!
     9  Actions speak louder than words

    10  #woood #
    11  #woooooood # 
    12  AxyzxyzxyzxyzC
    13  I bet this place is really spooky late at night! 
    14  Misfortunes never come alone/single.
    15  I shouldn't have lett so tast.
[root@lokott opt]# nl test.txt | sed '/cross/d' # delete the lines containing cross
     1  he was short and fat.
     2  He was wearing a blue polo shirt with black pants. 
     3  The home of Football on BBC Sport online.
     4  tahe tongue is boneless but it breaks bones.12!
     5  google is the best tools for search keyword.
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     9  Actions speak louder than words

    10  #woood #
    11  #woooooood # 
    12  AxyzxyzxyzxyzC
    13  I bet this place is really spooky late at night! 
    14  Misfortunes never come alone/single.
    15  I shouldn't have lett so tast.

[root@lokott opt]# sed '/^[a-z]/d' test.txt | nl  # delete every line that starts with a lowercase letter
     1  He was wearing a blue polo shirt with black pants. 
     2  The home of Football on BBC Sport online.
     3  The year ahead will test our political establishment to the limit.
     4  PI=3.141592653589793238462643383249901429
     5  Actions speak louder than words

     6  #woood #
     7  #woooooood # 
     8  AxyzxyzxyzxyzC
     9  I bet this place is really spooky late at night! 
    10  Misfortunes never come alone/single.
    11  I shouldn't have lett so tast.

[root@lokott opt]# sed '/\.$/d' test.txt   # delete the lines ending with a period
He was wearing a blue polo shirt with black pants. 
tahe tongue is boneless but it breaks bones.12!
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #
#woooooood # 
AxyzxyzxyzxyzC
I bet this place is really spooky late at night!

[root@lokott opt]# sed '/^$/d' test.txt             # delete all empty lines
he was short and fat.
He was wearing a blue polo shirt with black pants. 
The home of Football on BBC Sport online.
tahe tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
The year ahead will test our political establishment to the limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words
#woood #
#woooooood # 
AxyzxyzxyzxyzC
I bet this place is really spooky late at night! 
Misfortunes never come alone/single.
I shouldn't have lett so tast.
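As a sketch of how the d examples combine, the following strips empty lines and comment lines from a made-up config snippet. The [[:space:]] variant is an addition to the article's /^$/ pattern, shown because /^$/ does not match lines that contain only spaces.

```shell
# Hypothetical config text: a comment, an empty line, and a whitespace-only line.
conf=$(printf '%s\n' '# comment' '' 'a=1' '   ' 'b=2')

# /^$/ removes only truly empty lines; the whitespace-only line survives.
strict=$(printf '%s\n' "$conf" | sed '/^$/d')

# Two d commands: drop empty or whitespace-only lines, then comment lines.
effective=$(printf '%s\n' "$conf" | sed -e '/^[[:space:]]*$/d' -e '/^#/d')

printf '%s\n' "$effective"
```

This is the sed counterpart of the "egrep -v '^$|^#'" command from section II.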

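3) Replacing text that matches a condition

Replacement with sed uses the s (string substitution), c (whole-line or whole-block replacement), and y (character transliteration) commands. Common uses are shown below.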

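sed 's/the/THE/' test.txt        # replace the first the on each line with THE

sed 's/l/L/2' test.txt           # replace the second l on each line with L

sed 's/the/THE/g' test.txt       # replace every the in the file with THE

sed 's/o//g' test.txt            # delete every o in the file (replace it with the empty string)

sed 's/^/#/' test.txt            # insert # at the start of every line

sed '/the/s/^/#/' test.txt       # insert # at the start of every line containing the

sed 's/$/EOF/' test.txt          # append the string EOF to the end of every line

sed '3,5s/the/THE/g' test.txt    # replace every the on lines 3 to 5 with THE

sed '/the/s/o/O/g' test.txt      # replace every o with O on the lines containing the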
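The effect of the s flags is easiest to see on a single line. The sentence below is invented for the illustration.

```shell
line='the cat and the hat sat on the mat'

first=$(printf '%s\n' "$line" | sed 's/the/THE/')     # first match only
second=$(printf '%s\n' "$line" | sed 's/the/THE/2')   # second match only
all=$(printf '%s\n' "$line" | sed 's/the/THE/g')      # every match
prefixed=$(printf '%s\n' "$line" | sed 's/^/#/')      # ^ matches the empty string at line start
suffixed=$(printf '%s\n' "$line" | sed 's/$/EOF/')    # $ matches the empty string at line end

printf '%s\n' "$first" "$second" "$all"
```

Without a flag, s replaces only the first match on each line; a number selects one specific match and g selects all of them.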

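4) Moving text that matches a condition

The commands used here: H copies the line to the "clipboard" (sed's hold space); g and G overwrite or append the clipboard data to the specified line; w saves to a file; r reads the specified file; a appends the specified content.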

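sed '/the/{H;d};$G' test.txt        # move the lines containing the to the end of the file; {;} groups several commands

sed '1,5{H;d};17G' test.txt         # move lines 1 to 5 to after line 17

sed '/the/w out.file' test.txt      # save the lines containing the to the file out.file

sed '/the/r /etc/hostname' test.txt # add the contents of /etc/hostname after every line containing the

sed '3aNew' test.txt                # insert a new line with the content New after line 3

sed '/the/aNew' test.txt            # insert a new line with the content New after every line containing the

sed '3aNew1\nNew2' test.txt         # insert several lines after line 3; the \n in the middle is a line break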
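A minimal sketch of the move and save operations on three invented lines, assuming GNU sed (the one-line '{H;d};$G' form may need adjusting for other implementations). The temporary output file is hypothetical.

```shell
text=$(printf '%s\n' 'keep 1' 'the moved' 'keep 2')

# H appends the matching line to the hold space, d removes it from the output,
# and G on the last line appends the hold space to the pattern space.
moved=$(printf '%s\n' "$text" | sed '/the/{H;d};$G')

# w writes the matching lines to a file (hypothetical temporary file).
out=$(mktemp)
printf '%s\n' "$text" | sed -n "/the/w $out"
saved=$(cat "$out")
rm -f "$out"

printf '%s\n' "$moved"
```

The moved line is preceded by an empty line. H always adds a newline before the text it appends, and the hold space starts out empty, so the hold space begins with a newline. Also note that if the last line itself matched /the/, d would end the cycle before $G could run.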

(To be continued...)

