Combining Random Forest with Firefly Algorithm to Improve Darknet Traffic Detection


Vincent Timothy Lim
Rusdianto Roestam

Abstract

A darknet traffic detection system, which detects cyber-crime activity by identifying the use of Tor and VPNs, is one way to reduce darknet cyber-crime. Existing detection tools such as machine learning models have shown their capability in detecting darknet network traffic. However, their performance is still limited by suboptimal hyperparameters. Random Forest, one of the classification models already used for darknet traffic detection, has demonstrated strong performance in detecting darknet activity. This research uses the Firefly Algorithm (FA), a prominent swarm intelligence method, to tune hyperparameters and enhance the detection capability of the Random Forest (RF) model. The proposed RF-FA (Random Forest – Firefly Algorithm) approach is evaluated against the standard Random Forest model. Tests on the CIC-Darknet2020 dataset show that the Firefly Algorithm improves the RF model's performance on all key metrics. The optimized RF-FA model attains 98.73% in accuracy, precision, recall, and F1-score, surpassing the baseline RF model, which achieves 98.62% in accuracy, precision, and recall and an F1-score of 98.61%.
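The following is a minimal sketch of how a firefly search over Random Forest hyperparameters could be organized; it is not the authors' implementation. The abstract does not state which hyperparameters were tuned or which FA settings were used, so the search space (`n_estimators`, `max_depth`, `min_samples_split`), the bounds, the constants `beta0`, `gamma`, and `alpha`, the range-normalized distance, and the step decay are all assumptions. The function `surrogate_error` is a hypothetical stand-in so the example runs without data; in practice it would presumably be replaced by something like one minus the cross-validated accuracy of a Random Forest trained with the decoded hyperparameters.

```python
import math
import random

# Assumed search space: (lower, upper) bounds for n_estimators, max_depth,
# min_samples_split. These ranges are illustrative, not taken from the paper.
BOUNDS = [(50.0, 500.0), (2.0, 40.0), (2.0, 20.0)]


def decode(position):
    """Round a continuous firefly position to integer hyperparameters."""
    n_estimators, max_depth, min_samples_split = position
    return {
        "n_estimators": int(round(n_estimators)),
        "max_depth": int(round(max_depth)),
        "min_samples_split": int(round(min_samples_split)),
    }


def surrogate_error(position):
    """Hypothetical objective standing in for 1 - validation accuracy."""
    target = (300.0, 20.0, 4.0)  # arbitrary optimum, for illustration only
    return sum(((x - t) / (hi - lo)) ** 2
               for x, t, (lo, hi) in zip(position, target, BOUNDS))


def firefly_minimize(objective, bounds, n_fireflies=15, n_iter=40,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Basic firefly algorithm: dimmer fireflies move toward brighter ones."""
    rng = random.Random(seed)
    # Start with fireflies placed uniformly at random inside the bounds.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_fireflies)]
    cost = [objective(p) for p in pos]
    best_i = min(range(n_fireflies), key=cost.__getitem__)
    best_pos, best_cost = list(pos[best_i]), cost[best_i]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] >= cost[i]:
                    continue  # j is not brighter (lower cost) than i
                # Squared distance, normalized by each dimension's range
                # (a design choice here so gamma is scale-independent).
                r2 = sum(((a - b) / (hi - lo)) ** 2
                         for a, b, (lo, hi) in zip(pos[i], pos[j], bounds))
                beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                for d, (lo, hi) in enumerate(bounds):
                    noise = alpha * (rng.random() - 0.5) * (hi - lo)
                    x = pos[i][d] + beta * (pos[j][d] - pos[i][d]) + noise
                    pos[i][d] = min(max(x, lo), hi)  # clamp to the bounds
                cost[i] = objective(pos[i])
                if cost[i] < best_cost:
                    best_pos, best_cost = list(pos[i]), cost[i]
        alpha *= 0.97  # shrink the random step over time (assumed schedule)
    return best_pos, best_cost


best_pos, best_cost = firefly_minimize(surrogate_error, BOUNDS)
print(decode(best_pos), best_cost)
```

The dictionary returned by `decode` uses parameter names accepted by scikit-learn's `RandomForestClassifier`, so it could be passed to that constructor as keyword arguments. Because each objective evaluation would then involve training a forest, the population size and iteration count directly set the tuning cost.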


How to Cite
Timothy Lim, V. and Roestam, R. (2026) “Combining Random Forest with Firefly Algorithm to Improve Darknet Traffic Detection”, Ranah Research : Journal of Multidisciplinary Research and Development, 8(2), pp. 1254-1261. doi: 10.38035/rrj.v8i2.2023.
