There is a newer version of the record available.

Published November 15, 2023 | Version v1
Dataset Open

Predictions for sRNA-mRNA interactions in E. coli - kGraphRNA, GraphRNA, sInterRF, and sInterXGB models

  • 1. Ben-Gurion University of the Negev

Description

Bacterial small RNAs (sRNAs) are pivotal in post-transcriptional regulation, affecting functions like virulence, metabolism, and gene expression by binding specific mRNA targets. Identifying these targets is crucial to understanding sRNA regulation across species. Despite advancements in high-throughput (HT) experimental methods, they remain technically challenging and are limited to detecting sRNA-target interactions under specific environmental conditions. Therefore, computational approaches, especially machine learning (ML), are essential for identifying strong candidates for biological validation. 

In this study, we hypothesize that ML models trained on large-scale interaction data from specific conditions can accurately predict new interactions in unseen conditions within the same bacterial strain. To test this, we developed models from two families: (1) graph neural networks (GNNs), including GraphRNA and kGraphRNA, which learn transformed representations of interacting sRNA-mRNA pairs via graph relationships, and (2) decision forests, sInterRF (Random Forest) and sInterXGB (XGBoost), which use various interaction features for prediction. We also proposed Summation Ensemble Models (SEM) that combine scores from multiple models. Across three seen-to-unseen condition evaluations, our models, particularly kGraphRNA, significantly improved the area under the ROC curve (AUC) and the area under the Precision-Recall curve (PR-AUC) compared to sRNARFTarget, CopraRNA, and RNAup. The SEM model combining GraphRNA and CopraRNA outperformed CopraRNA alone on a low-throughput (LT) interactions test set (HT-to-LT).
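To illustrate the idea of a summation ensemble, the following is a minimal sketch that adds up per-pair scores from two models. This record does not state how scores are put on a common scale, so the min-max rescaling and the use of 1 - p for CopraRNA p-values are assumptions made for illustration only, as are the function names and the example values.

```python
def min_max_scale(scores):
    """Rescale a list of scores to [0, 1]; a constant list maps to all 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def summation_ensemble(*model_scores):
    """Sum per-pair scores from several models after rescaling each model."""
    if len({len(s) for s in model_scores}) != 1:
        raise ValueError("all models must score the same candidate pairs")
    scaled = [min_max_scale(list(s)) for s in model_scores]
    return [sum(pair) for pair in zip(*scaled)]


# Hypothetical scores for four candidate sRNA-mRNA pairs (not real data).
graph_scores = [0.90, 0.20, 0.60, 0.10]
copra_pvalues = [0.01, 0.50, 0.30, 0.90]

# Assumption: a lower p-value means stronger evidence, so use 1 - p as a score.
copra_scores = [1.0 - p for p in copra_pvalues]

ensemble = summation_ensemble(graph_scores, copra_scores)

# Indices of the candidate pairs, best ensemble score first.
ranking = sorted(range(len(ensemble)), key=lambda i: ensemble[i], reverse=True)
```

Rescaling before summing keeps one model from dominating simply because its scores span a wider range; the published models may handle this differently.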

In this dataset, we provide the prediction scores of our models: kGraphRNA, GraphRNA, sInterRF, and sInterXGB for any pair of sRNA and mRNA of Escherichia coli K12 MG1655 (NC_000913). We also provide the true labels and the CopraRNA p-value scores computed for all possible pairs. Note that prediction scores are provided for all the unlabeled sRNA-mRNA pairs not included in the HT train set (see our paper for details).
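Because both prediction scores and true labels are provided, the ROC AUC mentioned above can be recomputed for any model on the labeled pairs. The sketch below uses the pairwise (Mann-Whitney) definition of AUC; the function name and the example labels and scores are hypothetical and are not taken from the dataset.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labeled pairs: 1 = interaction, 0 = non-interaction.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.1]

auc = roc_auc(example_labels, example_scores)
```

The double loop is quadratic in the number of labeled pairs, which is fine for a quick check on one sRNA; a rank-based implementation would be preferable for large evaluations.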

For convenience, each CSV file contains the scores of a single sRNA with the following information: accession IDs, locus tags, and names of the sRNA and the mRNA; CopraRNA p-value (if available); the prediction scores of the kGraphRNA, GraphRNA, sInterRF, and sInterXGB models; true label (if available) – 1 for interaction and 0 for non-interaction; whether the sRNA-mRNA pair was included in the train set – true or false; whether the sRNA-mRNA pair was sampled for the train set as a random negative sample – true or false.
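As an illustration of how one of these per-sRNA files might be used, the sketch below ranks unlabeled pairs that were not part of the train set by a chosen model score. The column names (`mRNA_name`, `kGraphRNA_score`, `true_label`, `is_train`, `is_train_random_negative`) and the sample rows are assumptions for demonstration; check the header of an actual file and adjust accordingly.

```python
import csv
import io

# Hypothetical header and rows; the real column names may differ.
SAMPLE = """sRNA_name,mRNA_name,kGraphRNA_score,true_label,is_train,is_train_random_negative
sRNA_A,mRNA_1,0.91,,False,False
sRNA_A,mRNA_2,0.15,0,True,True
sRNA_A,mRNA_3,0.78,1,True,False
sRNA_A,mRNA_4,0.64,,False,False
"""


def parse_bool(text):
    """Interpret 'true'/'True'/'TRUE' as True, anything else as False."""
    return text.strip().lower() == "true"


def top_unseen_candidates(handle, score_column, limit=10):
    """Return (mRNA name, score) for unlabeled, non-train pairs, best first."""
    rows = []
    for row in csv.DictReader(handle):
        if parse_bool(row["is_train"]):
            continue  # pair was used for training
        if row["true_label"].strip():
            continue  # pair already has a known label
        rows.append((row["mRNA_name"], float(row[score_column])))
    rows.sort(key=lambda item: item[1], reverse=True)
    return rows[:limit]


candidates = top_unseen_candidates(io.StringIO(SAMPLE), "kGraphRNA_score")
```

For a downloaded file, the same function can be called with `open(path, newline="")` in place of the in-memory sample.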

Files

3'ets-leuz_G0-16636.csv

Files (38.0 MB)

Checksum Size
md5:316e555e5d2b63307b3c3f6f42bc40c3 491.1 kB
md5:3629d0b5bb55da2419ac7a6c5a525a6e 400.7 kB
md5:d5fead206106f9b508af9487359ce81f 395.5 kB
md5:0282e238fb4f5f5af6d615cdf190b1f8 417.7 kB
md5:fa9c9aa4fcaa9a4b840a2b63609011c1 421.0 kB
md5:2ccdc34adaebc76bb7ebd3c22d518186 279.4 kB
md5:02bab9471054d8d4958abb2b49a5c124 424.2 kB
md5:095dfe46ef9e11562d5d39e8cd55ece1 430.4 kB
md5:888e38a15576a87d4ec1dc743304ae1e 431.4 kB
md5:86ebfcaec14ecef27517f4470816e0b3 423.7 kB
md5:379c6ecc32312fb4d744177ee3601654 401.1 kB
md5:5767afc08563b783daa0cbb8379d56c8 304.3 kB
md5:f4f3dec83753576ed9c166d02cf02520 418.3 kB
md5:c4c0e8187d2a50b488cbf94c1ae3c350 433.5 kB
md5:7c3688c4f549967ec741ef69f5eb69b5 400.8 kB
md5:4055de56ab4284e2cf21d6f179fc1816 425.3 kB
md5:974a550defb2115de037ab25659288e7 432.3 kB
md5:6fa82a793a4a49c5f39abb3f2950a2b8 309.3 kB
md5:c0595abc2cd7a708145b2d6560abe4a0 403.4 kB
md5:606ac8cfcedf5aaf7fbfc6ce01d27705 420.3 kB
md5:e7cc71b7e4ee5679ed04b192ebb05d48 396.0 kB
md5:c2a36b9678517c6187f0e93559e104f3 433.5 kB
md5:0628fabdca12c21c4bcdc4859a06edb4 391.1 kB
md5:627cd6fc983121533ab50c47004d5f9a 407.1 kB
md5:6e442898aa6d7827ece877bae0aa7dd5 423.0 kB
md5:3b7cec10ef8edfb8f6a299c864fc5584 422.8 kB
md5:1de9970ceb55c80ad1de8a32d4006c76 426.6 kB
md5:4b3debca76a1cfa96adece91855f5a36 425.6 kB
md5:ce2e147a8441c257ab944107f699eba9 431.1 kB
md5:4f63cc2a82a9d07d3c61c07ff395b6ea 430.3 kB
md5:0e7fab1e06c2bae2246cbc4c71f03d3d 423.9 kB
md5:0723e2fad40d85d82a5608fd845a72ad 417.0 kB
md5:29ea05325367de47a3ef9d038803883f 405.2 kB
md5:f7e94ce8e1b10535fb06d01215d17ef2 399.3 kB
md5:a9ae9ca8974a78076488f0403c217892 414.3 kB
md5:b9930038752dd2d3cf0febff3a7ea8ca 414.7 kB
md5:894dfa18aa33d4b8ed6be8d897cba47c 380.2 kB
md5:68df6fe5424952873de645804aa13066 411.0 kB
md5:984ff3c55dac3107bed71fa631f67ed8 288.4 kB
md5:19946aa63135a566f15841ea86ba172e 377.6 kB
md5:75dc04fb79faf4b5e55cd9d3966ef202 427.0 kB
md5:7a284a82251391b5f70ab5c545327d97 426.7 kB
md5:ea9a6cfb12a59af09103f3e40534ffcb 423.6 kB
md5:8658c11463930ab6b1216872bdd21841 295.9 kB
md5:c04d5600170f50bf9259c44ea8f2bdfe 429.0 kB
md5:a7057cf6dd464324e924db9c42945769 429.6 kB
md5:fabfc90033ef6cb56b503efd0309bed4 437.2 kB
md5:f56dc90d59bf6ceac31480c51c739c35 303.1 kB
md5:0d89c20c7af2b332dff6002332b4bfd9 412.4 kB
md5:79fbd0b9f9ec41b0349d4965fa75fbd4 397.9 kB
md5:33f2aaee069b9767362c77596f26acfe 418.5 kB
md5:8cb1aca0746dea977cf78ba4b04712c8 416.1 kB
md5:32dd7948ff927fddd8bbd793b14a5bd1 406.1 kB
md5:036b4504142cf061e1e747a98f59d007 391.6 kB
md5:239437d5a9e28d71970dcace9feed388 402.0 kB
md5:16e143cef0c096fe66e933f9ba361bd3 430.9 kB
md5:f6162bc3510f9df89fbdf2abbc1c0c9b 426.0 kB
md5:ed274e43656e924cccc7253fe1587e55 417.0 kB
md5:c29deb60619cbbd508a73894787b7134 433.4 kB
md5:53369964191a1cb5cfaf6a69f521778e 430.4 kB
md5:f4d084db952c2b4b42045733b712e135 428.9 kB
md5:476886eac56b2b2cfb8dbdbe7a48e39e 434.9 kB
md5:c0e5736bc38f7c09279f882a9270dd3d 430.5 kB
md5:6091c2dc61472ec210b4184eb8eb94be 287.0 kB
md5:1abc1917c9e5e09bbbfd3f09de2a134e 391.4 kB
md5:1709d5ab9db7fbbc590ec56ec7ca74aa 437.0 kB
md5:ba70236e0c49d666a3417d0246b0c71b 428.8 kB
md5:3de1d1724d1c01f6891fa02ec32f2ae8 427.3 kB
md5:0140ef9b33bf11820b184f02ebf3bbb7 312.4 kB
md5:e1dd5902e0d167c0fab0280148f34e47 436.6 kB
md5:ec3fc8328d7ae33c499b4458d99e6fbd 308.1 kB
md5:dd2ec3045ec7dd010da18b60985c1583 427.2 kB
md5:84d6c1df85f0f5e484423ceb9a8cfae3 429.8 kB
md5:9bb6ea01643701598f5d5dacda62b4f1 432.2 kB
md5:188b888b296baa86a14b9c4f5cf32091 430.6 kB
md5:b8521c52e3e65744cc2d8536acc32215 423.7 kB
md5:b982fd0e427b104a8e131459baf51a83 431.0 kB
md5:1de3cc8e811835d85b915001bdb1334f 435.2 kB
md5:a66439a5675d2b9567528d6bba25d521 365.0 kB
md5:2f9daca0f85fba071617f9fcaa611416 388.8 kB
md5:7f916fce6b3dc2d41df5951aa5c15bad 399.4 kB
md5:36b152cf049eb15629d107eceaf2aac0 330.2 kB
md5:c64af1ace14faccdab4b09367b376f23 415.9 kB
md5:a53d15e50e9a9109f0e9bbb456713d99 403.9 kB
md5:8ae921ae3b014776288fa5ef0c1ab8ac 424.3 kB
md5:78086f68f1565dbbbce55343d9a99715 433.1 kB
md5:c442e5376515bf52c17bf4a9673d81dd 428.8 kB
md5:84a644237d32bc9e1245ae693eea2a8f 431.8 kB
md5:327cc4b16927b4af9cf7fa7c58490822 409.9 kB
md5:9fe166dcd93500110a5727074c4a000a 304.8 kB
md5:25e56fa2b57bba11fdb1b8c213757e87 424.9 kB
md5:faeb9a86e501464b467849fd63c4dc43 433.1 kB
md5:4910279091228d35a1a46bc68a9be3df 349.2 kB
md5:d3e0726a319c40aee7705264e933753c 426.6 kB
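
The md5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and since the file names are not shown next to the checksums here, the pairing of a given file with a given checksum has to come from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file's digest with a listed value such as 'md5:316e...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listing("downloaded.csv", "md5:316e555e5d2b63307b3c3f6f42bc40c3")` returns `True` only if the local file (a placeholder name here) has exactly that digest.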

Additional details

Related works