TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity

Inoussa Mouiche; Sherif Saad

Topics: Machine learning applications to data security and privacy; Machine Learning Security and Privacy

In Proceedings of the 21st International Conference on Security and Cryptography SECRYPT - Volume 1, 357-370, 2024 , Dijon, France

Authors: Inoussa Mouiche and Sherif Saad

Affiliation: School of Computer Science, University of Windsor, ON, Canada

Keyword(s): Threat Intelligence, Named Entity Recognition, Data Annotation, Data Augmentation.

Abstract: Recent advancements highlight the crucial role of high-quality data in developing accurate AI models, especially in threat intelligence named entity recognition (TI-NER). This technology automates the detection and classification of information from extensive cyber reports. However, the lack of scalable annotated security datasets hinders TI-NER system development. To overcome this, researchers often use data augmentation techniques such as merging multiple annotated NER datasets to improve variety and scalability. Integrating these datasets poses challenges such as keeping entity annotations and entity categories consistent and adhering to standardized tagging schemes. Manually merging datasets is time-consuming and impractical at scale. Our paper presents TI-NERmerger, a semi-automated framework that integrates diverse TI-NER datasets into scalable, compliant datasets aligned with cybersecurity standards like STIX-2.1. We validated the framework's efficiency and effectiveness by comparing it with manual processes using the DNRTI and APTNER datasets, producing Augmented APTNER (2APTNER). The results demonstrate a reduction of over 94% in manual labour, cutting work that would take several months down to minutes. Additionally, we applied advanced ML algorithms to validate the effectiveness of the integrated NER datasets. We also provide publicly accessible datasets and resources, supporting further research in threat intelligence and AI model development.
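To make the integration problem described in the abstract concrete, the following is a minimal sketch of one way two BIO-tagged NER datasets could be merged under a shared label set. It is not the TI-NERmerger implementation: the function names, the `LABEL_MAP` contents, and the label names are illustrative assumptions, and the policy of turning unmapped labels into `O` is a simplification chosen for this example.

```python
from typing import Dict, List, Tuple

# A sentence is a list of (token, BIO tag) pairs, e.g. ("APT28", "B-HackOrg").
Sentence = List[Tuple[str, str]]

# Hypothetical mapping from a source dataset's labels to the target label set.
# These names are assumptions for illustration, not the paper's mapping.
LABEL_MAP: Dict[str, str] = {
    "HackOrg": "APT",
    "Tool": "TOOL",
    "Area": "LOC",
}


def remap_tag(tag: str, label_map: Dict[str, str]) -> str:
    """Translate one BIO tag into the target label set; unmapped labels become 'O'."""
    if tag == "O" or "-" not in tag:
        return "O"
    prefix, label = tag.split("-", 1)
    target = label_map.get(label)
    return f"{prefix}-{target}" if target else "O"


def repair_bio(tags: List[str]) -> List[str]:
    """Restore a valid BIO sequence: an I- tag that does not continue
    an entity of the same type is rewritten as a B- tag."""
    fixed: List[str] = []
    prev = "O"
    for tag in tags:
        if tag.startswith("I-") and prev[2:] != tag[2:]:
            tag = "B-" + tag[2:]
        fixed.append(tag)
        prev = tag
    return fixed


def convert_sentence(sentence: Sentence, label_map: Dict[str, str]) -> Sentence:
    """Remap every tag in a sentence, then repair the resulting BIO sequence."""
    tokens = [token for token, _ in sentence]
    tags = repair_bio([remap_tag(tag, label_map) for _, tag in sentence])
    return list(zip(tokens, tags))


def merge_datasets(
    target: List[Sentence], source: List[Sentence], label_map: Dict[str, str]
) -> List[Sentence]:
    """Append the converted source sentences to the target dataset."""
    return target + [convert_sentence(s, label_map) for s in source]


if __name__ == "__main__":
    target = [[("Lazarus", "B-APT"), ("group", "I-APT")]]
    source = [[("APT28", "B-HackOrg"), ("used", "O"), ("Mimikatz", "B-Tool")]]
    for sentence in merge_datasets(target, source, LABEL_MAP):
        print(sentence)
```

Dropping unmapped labels to `O` keeps the merged data consistent with the target scheme but discards annotations; a semi-automated workflow would more plausibly flag such cases for human review.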

CC BY-NC-ND 4.0

Paper citation in several formats:

Mouiche, I. and Saad, S. (2024). TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity. In Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT; ISBN 978-989-758-709-2; ISSN 2184-7711, SciTePress, pages 357-370. DOI: 10.5220/0012867900003767

@conference{secrypt24,
author={Inoussa Mouiche and Sherif Saad},
title={TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity},
booktitle={Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT},
year={2024},
pages={357-370},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012867900003767},
isbn={978-989-758-709-2},
issn={2184-7711},
}

TY - CONF

JO - Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT
TI - TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity
SN - 978-989-758-709-2
IS - 2184-7711
AU - Mouiche, I.
AU - Saad, S.
PY - 2024
SP - 357
EP - 370
DO - 10.5220/0012867900003767
PB - SciTePress