
Tuesday, March 12, 2019

Pentaho PDI logging: fixing concurrency issues

If only a single thread writes PDI transformation logs to the database, you are fine.
I am using MySQL (a Percona cluster), and I ran into multiple issues with concurrent inserts/updates to the log table:

ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : Unable to write log record to log table
or
Deadlock found when trying to get lock; try restarting transaction

The major issues I see with the default DB schema:
1) The ID_BATCH column is generated on the Pentaho side; PDI issues "LOCK TABLE ... WRITE" to update it, which is why I got the deadlock. Fix: remove the ID.
2) CHANNEL_ID is not indexed by default. Fix: add an index:
alter table log_DB.TABLE_NAME add index idx_2 (CHANNEL_ID);
Note: every time you apply SQL schema changes from PDI, comment out the drop-index statement, otherwise PDI will remove the index.
3) If possible, decrease the number of logged lines (this will also speed up inserts).
4) Don't use "Log record timeout (in days)". Instead, delete old rows from the table in a separate job/transformation.
5) It is better to create a primary key for the table:
alter table log_DB.TABLE_NAME add column id int primary key auto_increment;
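Putting the fixes above together, a minimal sketch of the one-time schema changes plus the retention query for the separate cleanup job might look like this. Assumptions: `log_DB.TABLE_NAME` is a placeholder for your transformation log table, the timestamp column has the default PDI name `LOGDATE`, and the 30-day retention window is an example value.

```sql
-- One-time schema changes on the PDI transformation log table.

-- 2) Index CHANNEL_ID, which PDI uses to locate the log record it updates.
ALTER TABLE log_DB.TABLE_NAME ADD INDEX idx_2 (CHANNEL_ID);

-- 5) Surrogate auto-increment primary key, so InnoDB gets a proper
--    clustered index instead of a hidden one.
ALTER TABLE log_DB.TABLE_NAME ADD COLUMN id INT PRIMARY KEY AUTO_INCREMENT;

-- 4) Retention, run from a separate job/transformation instead of
--    "Log record timeout (in days)". LOGDATE and the 30-day window
--    are assumptions; adjust to your table.
DELETE FROM log_DB.TABLE_NAME
WHERE LOGDATE < NOW() - INTERVAL 30 DAY;
```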


To the Pentaho team: this looks like a very old problem. It took me an hour to find a solution that works for me... Why can't you fix it?

Bug:
https://jira.pentaho.com/browse/PDI-2054

My version is 8.2 0_o




