GoldenGate configuration: analyzing a pump process that appears hung after being added

1. Create the subdirectories:

GGSCI (jq-prod-oracle-wms3-120-24) 3> CREATE SUBDIRS

Creating subdirectories under current directory /u01/app/goldengate

Parameter file                 /u01/app/goldengate/dirprm: created.
Report file                    /u01/app/goldengate/dirrpt: created.
Checkpoint file                /u01/app/goldengate/dirchk: created.
Process status files           /u01/app/goldengate/dirpcs: created.
SQL script files               /u01/app/goldengate/dirsql: created.
Database definitions files     /u01/app/goldengate/dirdef: created.
Extract data files             /u01/app/goldengate/dirdat: created.
Temporary files                /u01/app/goldengate/dirtmp: created.
Credential store files         /u01/app/goldengate/dircrd: created.
Masterkey wallet files         /u01/app/goldengate/dirwlt: created.
Dump files                     /u01/app/goldengate/dirdmp: created.

2. Edit the Manager (mgr) parameters:

edit param mgr

port 7809
autostart er *
autorestart er *
PURGEOLDEXTRACTS /s01/app/goldengate/dirdat/sz*, USECHECKPOINTS, MINKEEPDAYS 3

Start the Manager:

GGSCI (chuanqiu) 9> start mgr
Manager started.


GGSCI (chuanqiu) 10> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING  

3. Add the pump process:

GGSCI (scm02db01.baozunops.com) 2> edit params p_wmsjq

extract p_wmsjq
rmthost 192.168.101, mgrport 7809, compress
passthru
numfiles 5000
rmttrail ./dirdat/sz
--dynamicresolution
ddl
table wms.T_USER;
table wms.T_BRAND;
table wms.T_CHANNEL;
table wms.T_CUSTOMER;

ADD EXTRACT p_wmsjq, EXTTRAILSOURCE ./dirdat/ea, BEGIN now

add rmttrail ./dirdat/sz extract p_wmsjq

Start the process:

start p_wmsjq

Check the status of the newly added process:

stats P_WMSJQ

Sending STATS request to EXTRACT P_WMSJQ ...

2018-09-19 16:34:33  ERROR   OGG-15149  EXTRACT P_WMSJQ is initializing, please try the command later.


GGSCI (chunqiu) 40> info P_WMSJQ

EXTRACT    P_WMSJQ   Initialized   2018-09-19 16:27   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:12:20 ago)
Process ID           11252
Log Read Checkpoint  File ./dirdat/ea000000000
                      2018-09-19 16:27:03.000000

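Why does this happen?

The primary extract had already been running for a long time and produces a very large volume of trail data, and many of the earlier trail files had already been purged. P_WMSJQ therefore spends its startup reading through the remaining trail files in the background, one after another, looking for the position that corresponds to BEGIN now. The following entries in the error log confirm this.

Follow the log with tail -f ggserr.log: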

2018-09-19T16:41:03.194+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm:  Switching to next trail file /u01/goldengate/dirdat/ea000001813 at 2018-09-19 16:41:03.194239 due to EOF. with current RBA 499,998,509.
2018-09-19T16:41:07.781+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm:  Switching to next trail file /u01/goldengate/dirdat/ea000001814 at 2018-09-19 16:41:07.781704 due to EOF. with current RBA 499,999,232.
2018-09-19T16:41:12.339+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm: Switching to next trail file /u01/goldengate/dirdat/ea000001815 at 2018-09-19 16:41:12.339296 due to EOF. with current RBA 499,998,210.
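
The three entries above already give a feel for how fast the pump is working through the backlog. As a minimal sketch (the regular expression and the function name seconds_per_trail_file are illustrative assumptions, not part of GoldenGate), the average time per trail file can be read off the OGG-02232 timestamps like this:

```python
import re
from datetime import datetime

# Matches an OGG-02232 "Switching to next trail file" entry and captures the
# event timestamp plus the numeric suffix of the trail file name.
SWITCH_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})[+-]\d{4}\s+INFO\s+"
    r"OGG-02232\s+.*?Switching to next trail file \S*?(?P<seq>\d+) at "
)

# The three log lines quoted in the article.
SAMPLE_LOG = """\
2018-09-19T16:41:03.194+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm:  Switching to next trail file /u01/goldengate/dirdat/ea000001813 at 2018-09-19 16:41:03.194239 due to EOF. with current RBA 499,998,509.
2018-09-19T16:41:07.781+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm:  Switching to next trail file /u01/goldengate/dirdat/ea000001814 at 2018-09-19 16:41:07.781704 due to EOF. with current RBA 499,999,232.
2018-09-19T16:41:12.339+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, p_wmsjq.prm: Switching to next trail file /u01/goldengate/dirdat/ea000001815 at 2018-09-19 16:41:12.339296 due to EOF. with current RBA 499,998,210.
"""


def seconds_per_trail_file(lines):
    """Return the mean number of seconds between consecutive trail switches."""
    events = []
    for line in lines:
        match = SWITCH_RE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
            events.append((int(match.group("seq")), ts))
    if len(events) < 2:
        raise ValueError("need at least two trail-switch entries")
    events.sort()
    (first_seq, first_ts), (last_seq, last_ts) = events[0], events[-1]
    if last_seq == first_seq:
        raise ValueError("entries do not span more than one trail file")
    # Elapsed time divided by the number of files advanced in that time.
    return (last_ts - first_ts).total_seconds() / (last_seq - first_seq)


if __name__ == "__main__":
    rate = seconds_per_trail_file(SAMPLE_LOG.splitlines())
    print(f"about {rate:.2f} s per trail file")
```

Three entries are a very small sample, so the result is only a rough indication; feeding in a longer stretch of ggserr.log would smooth it out.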

So how long will it take to finish? We can tell from the newest trail file written by the primary extract.

Check the trail file sequence numbers:

ls -l ea000002*

-rw-r----- 1 oracle oinstall 477M Sep 19 15:58 ea000002146
-rw-r----- 1 oracle oinstall 477M Sep 19 16:03 ea000002147
-rw-r----- 1 oracle oinstall 477M Sep 19 16:07 ea000002148
-rw-r----- 1 oracle oinstall 477M Sep 19 16:11 ea000002149
-rw-r----- 1 oracle oinstall 477M Sep 19 16:25 ea000002150
-rw-r----- 1 oracle oinstall 477M Sep 19 16:40 ea000002151
-rw-r----- 1 oracle oinstall 477M Sep 19 16:52 ea000002152
-rw-r----- 1 oracle oinstall 6.0M Sep 19 16:52 ea000002153

Alternatively, look at another pump process that reads the same trail:

GGSCI (chunqiu) 41> info PUMP_lbs

EXTRACT    PUMP_WMS  Last Started 2018-09-19 06:55   Status RUNNING
Checkpoint Lag       00:00:03 (updated 00:00:03 ago)
Process ID           92773
Log Read Checkpoint  File /u01/goldengate/dirdat/ea000002152
                      2018-09-19 16:42:10.000000  RBA 85805949

Comparing with this process, getting from ea000001815 to ea000002151 will still take a while, but the gap is steadily closing.
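
The remaining wait can be estimated with simple arithmetic. The sketch below is only a rough lower bound under stated assumptions: the pump is at sequence 1815 (from the log), the newest trail file is 2153 (from the ls listing), and the figure of about 4.57 seconds per file comes from the three log timestamps quoted earlier (9.145 s across two switches). The function name estimate_catch_up is hypothetical.

```python
from datetime import datetime, timedelta


def estimate_catch_up(current_seq, latest_seq, secs_per_file, now):
    """Return (files still to read, estimated time the pump catches up)."""
    remaining = max(latest_seq - current_seq, 0)
    return remaining, now + timedelta(seconds=remaining * secs_per_file)


# Values taken from the article: last switch logged at 16:41:12 for
# ea000001815, newest file in the listing is ea000002153.
NOW = datetime(2018, 9, 19, 16, 41, 12)
REMAINING, ETA = estimate_catch_up(1815, 2153, 4.57, NOW)

if __name__ == "__main__":
    minutes = (ETA - NOW).total_seconds() / 60
    print(f"{REMAINING} files left, roughly {minutes:.0f} more minutes")
```

The estimate ignores the files the primary extract keeps writing in the meantime and any change in read speed, so the real wait can be noticeably longer.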

After waiting a good while, check again:

GGSCI (scm02db01.baozunops.com) 54> info P_WMSJQ

EXTRACT    P_WMSJQ   Initialized   2018-09-19 16:27   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:42:57 ago)
Process ID           11252
Log Read Checkpoint  File ./dirdat/ea000000000
                      2018-09-19 16:27:03.000000

42 minutes have now passed without a new checkpoint, which shows just how large the trail backlog is.
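
To tell this situation apart from a normally running pump without eyeballing the output each time, the INFO text can be parsed. This is a minimal sketch that assumes the output layout shown in this article; the patterns and the function name checkpoint_state are illustrative, and reading "Initialized" plus an aging "updated ... ago" as "no checkpoint written since the group was added" is an interpretation of the output above, not something GoldenGate states.

```python
import re

# Header line, e.g. "EXTRACT  P_WMSJQ  Initialized  2018-09-19 16:27  Status RUNNING"
HEADER_RE = re.compile(
    r"EXTRACT\s+(?P<name>\S+)\s+(?P<state>Initialized|Last Started)\s+"
    r"(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s+Status\s+(?P<status>\S+)"
)
# Checkpoint age, e.g. "(updated 00:42:57 ago)"
UPDATED_RE = re.compile(r"\(updated (\d+):(\d{2}):(\d{2}) ago\)")

SAMPLE_INFO = """\
EXTRACT    P_WMSJQ   Initialized   2018-09-19 16:27   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:42:57 ago)
Process ID           11252
Log Read Checkpoint  File ./dirdat/ea000000000
                     2018-09-19 16:27:03.000000
"""


def checkpoint_state(info_text):
    """Summarize an INFO report: group name, status, and checkpoint age."""
    header = HEADER_RE.search(info_text)
    updated = UPDATED_RE.search(info_text)
    if header is None or updated is None:
        raise ValueError("unrecognized INFO output")
    hours, minutes, seconds = (int(part) for part in updated.groups())
    return {
        "name": header.group("name"),
        "status": header.group("status"),
        # True while the header still says "Initialized" instead of "Last Started".
        "still_initialized": header.group("state") == "Initialized",
        "checkpoint_age_s": hours * 3600 + minutes * 60 + seconds,
    }


if __name__ == "__main__":
    print(checkpoint_state(SAMPLE_INFO))
```

A group that reports RUNNING, is still flagged as initialized, and has a checkpoint age that keeps growing matches the pattern described here: the process is alive and busy, not dead.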

Final check:

GGSCI (scm02db01.baozunops.com) 79> info P_WMSJQ

EXTRACT    P_WMSJQ   Last Started 2018-09-19 17:32   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:07 ago)
Process ID           71325
Log Read Checkpoint  File /s01/goldengate/dirdat/ea000002156
                      2018-09-19 17:32:04.000000  RBA 37332782


GGSCI (chunqiu) 80> stats P_WMSJQ

Sending STATS request to EXTRACT P_WMSJQ ...

Start of Statistics at 2018-09-19 17:33:07.

DDL replication statistics (for all trails):

*** Total statistics since extract started     ***
     Operations                                   0.00
     Mapped operations                            0.00
     Unmapped operations                            0.00
     Other operations                            0.00
     Excluded operations                            0.00

Output to ./dirdat/sz:

Extracting from wms.T_USER to WMS.T_USER:

*** Total statistics since 2018-09-19 17:32:22 ***
     Total inserts                               7041.00
     Total updates                             108651.00
     Total deletes                             178000.00
     Total discards                                 0.00
     Total operations                          293692.00

On the target side, the trail files have been transferred:

-rw-r----- 1 oracle oinstall 499999847 Sep 19 17:31 sz000000000
-rw-r----- 1 oracle oinstall 499999640 Sep 19 17:31 sz000000001
-rw-r----- 1 oracle oinstall 499999561 Sep 19 17:43 sz000000002
-rw-r----- 1 oracle oinstall  50584137 Sep 19 17:47 sz000000003

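Note that this does not happen in a new environment or in one with very little log volume.

So in day-to-day work, especially in production, do not panic when you run into a GoldenGate problem. Once you understand the underlying mechanism, the problem is usually quite simple to resolve.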

Wednesday, 2018-09-19

