Paper: TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity

Authors: Inoussa Mouiche and Sherif Saad

Affiliation: School of Computer Science, University of Windsor, ON, Canada

Keyword(s): Threat Intelligence, Named Entity Recognition, Data Annotation, Data Augmentation.

Abstract: Recent advancements highlight the crucial role of high-quality data in developing accurate AI models, especially in threat intelligence named entity recognition (TI-NER), which automates the detection and classification of information in extensive cyber reports. However, the lack of scalable annotated security datasets hinders TI-NER system development. To overcome this, researchers often use data augmentation techniques such as merging multiple annotated NER datasets to improve variety and scale. Integrating these datasets raises challenges: entity annotations and entity categories must stay consistent, and the result must adhere to standardized tagging schemes. Merging datasets manually is time-consuming and impractical at large scale. Our paper presents TI-NERmerger, a semi-automated framework that integrates diverse TI-NER datasets into scalable, compliant datasets aligned with cybersecurity standards such as STIX-2.1. We validated the framework's efficiency and effectiveness by comparing it with manual processes using the DNRTI and APTNER datasets, producing Augmented APTNER (2APTNER). The results demonstrate a reduction of over 94% in manual labour, cutting several months of work to minutes. Additionally, we applied advanced ML algorithms to validate the effectiveness of the integrated NER datasets. We also provide publicly accessible datasets and resources, supporting further research in threat intelligence and AI model development.
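
To make the integration problem in the abstract concrete, the following is a minimal sketch of one way to harmonize two BIO-tagged NER datasets onto a shared label set before concatenating them. It is not the TI-NERmerger implementation. The source labels (`GROUP`, `SOFTWARE`, `ACTOR`, `UTILITY`, `LURE`), the mapping tables, and the function names are hypothetical, and the target names are only loosely modelled on STIX 2.1 object types. Labels with no mapping are set to `O` and collected for manual review, which is one possible reading of the "semi-automated" step.

```python
from typing import Dict, List, Set, Tuple

Sentence = List[Tuple[str, str]]  # (token, BIO tag) pairs

# Hypothetical label maps: source label -> unified category (assumed, not from the paper).
MAP_A: Dict[str, str] = {"GROUP": "threat-actor", "SOFTWARE": "tool"}
MAP_B: Dict[str, str] = {"ACTOR": "threat-actor", "UTILITY": "tool"}


def harmonize_tag(tag: str, label_map: Dict[str, str], unmapped: Set[str]) -> str:
    """Rewrite one BIO tag onto the unified label set."""
    if tag == "O":
        return tag
    prefix, _, label = tag.partition("-")
    if prefix not in ("B", "I") or not label:
        raise ValueError(f"not a BIO tag: {tag!r}")
    unified = label_map.get(label)
    if unified is None:
        # No mapping known: record the label for manual review and drop the annotation.
        unmapped.add(label)
        return "O"
    return f"{prefix}-{unified}"


def merge_datasets(
    datasets: List[Tuple[List[Sentence], Dict[str, str]]],
) -> Tuple[List[Sentence], Set[str]]:
    """Harmonize each dataset with its own label map, then concatenate the sentences."""
    merged: List[Sentence] = []
    unmapped: Set[str] = set()
    for sentences, label_map in datasets:
        for sentence in sentences:
            merged.append(
                [(tok, harmonize_tag(tag, label_map, unmapped)) for tok, tag in sentence]
            )
    return merged, unmapped


# Toy data in the two hypothetical source schemes.
dataset_a = [[("ExampleGroup", "B-GROUP"), ("Team", "I-GROUP"), ("used", "O"), ("ToolX", "B-SOFTWARE")]]
dataset_b = [[("ActorY", "B-ACTOR"), ("sent", "O"), ("emails", "B-LURE")]]

merged, needs_review = merge_datasets([(dataset_a, MAP_A), (dataset_b, MAP_B)])
```

Here `needs_review` holds the source labels that had no mapping, so a human can decide how to map them before the merge is rerun.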

CC BY-NC-ND 4.0

Paper citation in several formats:
Mouiche, I. and Saad, S. (2024). TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity. In Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT; ISBN 978-989-758-709-2; ISSN 2184-7711, SciTePress, pages 357-370. DOI: 10.5220/0012867900003767

@conference{secrypt24,
author={Inoussa Mouiche and Sherif Saad},
title={TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity},
booktitle={Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT},
year={2024},
pages={357-370},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012867900003767},
isbn={978-989-758-709-2},
issn={2184-7711},
}

TY - CONF

JO - Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT
TI - TI-NERmerger: Semi-Automated Framework for Integrating NER Datasets in Cybersecurity
SN - 978-989-758-709-2
IS - 2184-7711
AU - Mouiche, I.
AU - Saad, S.
PY - 2024
SP - 357
EP - 370
DO - 10.5220/0012867900003767
PB - SciTePress