Research on Topic Identification of Safety Hazard Information in Oilfield Enterprises

X Zhang, H Shi, C Chen, J Guan - Proceedings of the 2023 12th International Conference on Computing and …, 2023 - dl.acm.org
To address the issues of semantic sparsity and insufficient co-occurrence information encountered when using the Latent Dirichlet Allocation (LDA) topic model for identifying topics in production safety hazard texts, a method is proposed for mining and analyzing safety hazard texts from both word and topic perspectives. At the word level, the Pointwise Mutual Information (PMI) algorithm is employed for co-occurrence analysis of hazard texts, revealing the associations between key hazard words through the co-occurrence of textual feature words. At the topic level, the PMI-LDA topic model is utilized to model and represent the production safety hazard texts, with the quality of the modeling results evaluated based on indicators such as topic similarity and log-likelihood. The results demonstrate that the combination of the PMI algorithm and the LDA topic model can effectively identify the major types and characteristics of safety hazards recorded in the oilfield enterprise's production safety hazard data, thereby providing valuable support for hazard investigation and control work.
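
The word-level step can be illustrated with a minimal sketch of document-level PMI, assuming the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))), where the probabilities are estimated from the fraction of hazard records containing each word or word pair. The abstract does not give the authors' exact formulation, windowing, or preprocessing, so the function name `pmi_scores` and the toy records below are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(docs):
    """Return document-level PMI for every word pair that co-occurs at least once.

    Assumption: probabilities are estimated as document frequencies,
    which is one common choice and not necessarily the paper's.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of records containing each word
    pair_df = Counter()  # number of records containing both words of a pair

    for tokens in docs:
        vocab = sorted(set(tokens))  # sort so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    scores = {}
    for (w1, w2), joint in pair_df.items():
        p_joint = joint / n_docs
        p1 = word_df[w1] / n_docs
        p2 = word_df[w2] / n_docs
        scores[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return scores


# Hypothetical, pre-tokenized hazard records (illustration only).
docs = [
    ["valve", "leak", "pipeline"],
    ["valve", "leak", "corrosion"],
    ["cable", "exposed", "pump"],
    ["pump", "leak", "seal"],
]

scores = pmi_scores(docs)
# List the most strongly associated pairs first.
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(value, 3))
```

A positive score means two words appear together more often than independence would predict, a negative score means less often. Pairs that never co-occur are omitted here rather than assigned negative infinity, which is a design choice of this sketch.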
ACM Digital Library