This study explores how semantic information within log messages enhances anomaly detection, often outperforming models that rely solely on sequential or temporal data. Through Transformer-based experiments on public datasets, the authors find that event occurrence and semantic cues are more predictive of anomalies than sequence order. The research underscores the limits of current datasets and calls for new, well-annotated benchmarks to evaluate log-based anomaly detection more effectively, enabling models that fully leverage log semantics and event context.

Why Log Semantics Matter More Than Sequence Data in Detecting Anomalies


Abstract

1 Introduction

2 Background and Related Work

2.1 Different Formulations of the Log-based Anomaly Detection Task

2.2 Supervised vs. Unsupervised

2.3 Information within Log Data

2.4 Fixed-Window Grouping

2.5 Related Work

3 A Configurable Transformer-based Anomaly Detection Approach

3.1 Problem Formulation

3.2 Log Parsing and Log Embedding

3.3 Positional & Temporal Encoding

3.4 Model Structure

3.5 Supervised Binary Classification

4 Experimental Setup

4.1 Datasets

4.2 Evaluation Metrics

4.3 Generating Log Sequences of Varying Lengths

4.4 Implementation Details and Experimental Environment

5 Experimental Results

5.1 RQ1: How does our proposed anomaly detection model perform compared to the baselines?

5.2 RQ2: How much does the sequential and temporal information within log sequences affect anomaly detection?

5.3 RQ3: How much do the different types of information individually contribute to anomaly detection?

6 Discussion

7 Threats to validity

8 Conclusions


6 Discussion

We discuss our lessons learned according to the experimental results.

Semantic information contributes to anomaly detection

The findings of this study confirm the efficacy of utilizing semantic information within log messages for log-based anomaly detection. Recent studies show that classical machine learning models with simple log representation (vectorization) techniques can outperform their complex DL counterparts [7, 23]. In these simple approaches, log events are substituted with event IDs or tokens, and the semantic information is lost. However, according to our experimental results, semantic information is valuable for downstream models to distinguish anomalies, while event occurrence information is also a prominent signal.
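The contrast between the two representations can be illustrated with a small sketch. The templates, IDs, and the bag-of-words similarity below are all hypothetical stand-ins: an ID-based view reduces a sequence to counts over opaque symbols, while a semantic view (in practice, a sentence embedding model such as all-MiniLM [26]; here, a toy lexical overlap) can tell that two differently-parsed templates describe the same kind of failure.

```python
from collections import Counter

# Two hypothetical log templates that an ID-based representation
# treats as unrelated, even though their wording is nearly identical.
templates = {
    "E1": "Failed to connect to node <*>",
    "E2": "Connection to node <*> failed",
}

# ID/occurrence view: each event collapses to an opaque identifier,
# so a sequence becomes a count vector over event IDs.
sequence = ["E1", "E2", "E1"]
occurrence_vector = Counter(sequence)

# Semantic view: represent each template by its word content
# (a toy stand-in for a real sentence embedding).
def bag_of_words(text: str) -> set:
    return {t.lower() for t in text.split() if t != "<*>"}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

sim = jaccard(bag_of_words(templates["E1"]), bag_of_words(templates["E2"]))
print(occurrence_vector)                      # E1 and E2 stay unrelated symbols
print(f"lexical similarity E1~E2: {sim:.2f}") # the semantic view exposes the overlap
```

The point of the sketch is only that the ID view discards exactly the information the semantic view preserves; any real pipeline would replace the Jaccard toy with dense embeddings.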

We call for new, high-quality datasets to evaluate the influence of different components in logs

The results of our study confirm the findings of recent works [16, 23]: most anomalies may not be associated with sequential information within log sequences; instead, the occurrence of certain log templates and the semantics within those templates account for the anomalies. This finding highlights the importance of employing new datasets to validate recent designs of DL models (e.g., LSTM [10], Transformer [11]). Moreover, our flexible approach can be used off-the-shelf with such datasets to evaluate the influence of different components and contribute to high-quality anomaly detection that leverages the full capacity of logs.

The publicly available log datasets that are well-annotated for anomaly detection are limited, which greatly hinders the evaluation and development of anomaly detection approaches that have practical impacts. Except for the HDFS dataset, whose anomaly annotations are session-based, the existing public datasets contain annotations for each log entry within log data, which implies the anomalies are only associated with certain specific log events or associated parameters within the events. Under this setting, the causality or sequential information that may imply anomalous behaviors is ignored.
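One consequence of entry-level annotation is that, when entries are grouped into sequences for evaluation, a sequence is typically labeled anomalous if it contains at least one anomalous entry. A minimal sketch of that labeling rule (the label values are hypothetical, not taken from any of the studied datasets):

```python
# Entry-level labels (1 = anomalous log line), as in most public datasets;
# the values here are hypothetical.
entry_labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]

def window_labels(labels, window_size):
    """Label a fixed, non-overlapping window anomalous if any entry in it is."""
    return [
        int(any(labels[i:i + window_size]))
        for i in range(0, len(labels), window_size)
    ]

print(window_labels(entry_labels, 5))  # one flagged entry suffices per window
```

Under this rule a single anomalous event determines the window label regardless of where it occurs, which is consistent with the observation that event order carries little signal in these datasets.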

7 Threats to validity

We have identified the following threats to the validity of our findings:

Construct Validity

In our proposed anomaly detection method, we adopt the Drain parser to parse the log data. Although the Drain parser performs well and generates relatively accurate parsing results, parsing errors still exist. A parsing error may influence the generation of log event embeddings (i.e., logs from the same log event may have different embeddings) and thus affect the performance of the anomaly detection model. To mitigate this threat, we pass extra regular expressions for each dataset to the parser. These regular expressions help the parser filter known dynamic areas in log messages and thus achieve more accurate results.
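Such dataset-specific regular expressions act as masking rules applied to each message before parsing. A minimal sketch with illustrative patterns (the actual per-dataset expressions differ; the sample log line and mask tokens below are assumptions):

```python
import re

# Hypothetical masking rules for known dynamic fields; each dataset
# needs its own patterns (block IDs, IP addresses, hex values, ...).
MASKS = [
    (re.compile(r"blk_-?\d+"), "<BLK>"),                        # HDFS-style block IDs
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),  # IPv4, optional port
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),                   # hex addresses
]

def mask_dynamic_fields(message: str) -> str:
    """Replace known dynamic areas so the parser groups messages correctly."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message

line = "Received block blk_-1608999687919862906 of size 91178 from 10.250.19.102:50010"
print(mask_dynamic_fields(line))
```

With the dynamic fields masked, two messages from the same event can no longer be split into different templates by a variable block ID or peer address.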

Internal Validity

There are various hyperparameters involved in our proposed anomaly detection model and experiment settings: 1) In the process of generating samples for both training and test sets, we define minimum and maximum lengths, along with step sizes, to generate log sequences of varying lengths. We do not have prior knowledge about the range of sequence lengths in which anomalies may reside; however, we set these parameters according to the common practices of previous studies, which adopt fixed-length grouping. 2) The Transformer-based anomaly detection model entails numerous hyperparameters, such as the number of Transformer layers, attention heads, and the size of the fully-connected layer. As the number of combinations is huge, we were not able to perform a grid search. However, we referred to the settings of similar models, experimented with different combinations of hyperparameters, and selected the best-performing combination.
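One plausible reading of the first point can be sketched as a grouping procedure that enumerates window lengths between the configured bounds; the parameter values below are illustrative defaults, not the settings used in the paper:

```python
def variable_length_sequences(events, min_len=4, max_len=8, step=2):
    """Generate log sequences of varying lengths from one event stream.

    Window lengths range over [min_len, max_len] in increments of `step`;
    each length slides over the stream without overlap. The defaults are
    illustrative, not the paper's actual settings.
    """
    sequences = []
    for length in range(min_len, max_len + 1, step):
        for start in range(0, len(events) - length + 1, length):
            sequences.append(events[start:start + length])
    return sequences

stream = list(range(12))  # stand-in for a stream of parsed event IDs
seqs = variable_length_sequences(stream)
print(sorted({len(s) for s in seqs}))  # every configured length is represented
```

Exposing the model to several window lengths during training is what allows the later comparison against baselines on sequences of varying lengths.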

External Validity

In this study, we conducted experiments on four public log datasets for anomaly detection. Some findings and conclusions obtained from our experimental results are constrained to the studied datasets. However, the studied datasets are the most widely used ones for evaluating log-based anomaly detection models and have become the de facto standard for evaluation. As annotating log datasets demands substantial human effort, only a few publicly available datasets exist for log-based anomaly detection tasks. The studied datasets are representative, thus enabling our findings to shed light on prevalent challenges in anomaly detection.

Reliability

The reliability of our findings may be influenced by the reproducibility of results, as variations in dataset preprocessing, hyperparameter tuning, and log parsing configurations across implementations could lead to discrepancies. To mitigate this threat, we adhered to widely used preprocessing procedures and hyperparameter settings, which are detailed in the paper. However, even minor differences in experimental setups or parser configurations may yield divergent outcomes, potentially affecting the consistency of the model's performance across independent studies.

8 Conclusions

The existing log-based anomaly detection approaches have used different types of information within log data. However, it remains unclear how these different types of information contribute to the identification of anomalies. In this study, we first propose a Transformer-based anomaly detection model, with which we conduct experiments with different input feature combinations to understand the role of different information in detecting anomalies within log sequences. The experimental results demonstrate that our proposed approach achieves competitive and more stable performance than simple machine learning models when handling log sequences of varying lengths. With the proposed model and the studied datasets, we find that sequential and temporal information do not contribute to the overall performance of anomaly detection when event occurrence information is present. Event occurrence information is the most prominent feature for identifying anomalies, while the inclusion of semantic information from log templates is helpful to anomaly detection models. Our results and findings generally confirm those of recent empirical studies and indicate the deficiency of the existing public datasets for evaluating anomaly detection methods, especially deep learning models. Our work highlights the need for new datasets that contain different types of anomalies and align more closely with real-world systems. Our flexible approach can be readily applied to such datasets to evaluate the influence of different components and enhance anomaly detection by leveraging the full capacity of log information.

:::info Supplementary information: The source code of the proposed method is publicly available in our supplementary material package: https://github.com/mooselab/suppmaterial-CfgTransAnomalyDetector

:::

Acknowledgements

We would like to gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC, RGPIN-2021-03900) and the Fonds de recherche du Québec – Nature et technologies (FRQNT, 326866) for their funding support for this work.

References

[1] He, S., Zhu, J., He, P., Lyu, M.R.: Experience report: System log analysis for anomaly detection. In: 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp. 207–218 (2016). IEEE

[2] Oliner, A., Ganapathi, A., Xu, W.: Advances and challenges in log analysis. Communications of the ACM 55(2), 55–61 (2012)

[3] He, S., He, P., Chen, Z., Yang, T., Su, Y., Lyu, M.R.: A survey on automated log analysis for reliability engineering. ACM Computing Surveys (CSUR) 54(6), 1–37 (2021)

[4] Zhu, J., He, S., Liu, J., He, P., Xie, Q., Zheng, Z., Lyu, M.R.: Tools and benchmarks for automated log parsing. In: 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 121–130 (2019). IEEE

[5] Chen, Z., Liu, J., Gu, W., Su, Y., Lyu, M.R.: Experience report: Deep learning-based system log analysis for anomaly detection. arXiv preprint arXiv:2107.05908 (2021)

[6] Nedelkoski, S., Bogatinovski, J., Acker, A., Cardoso, J., Kao, O.: Self-attentive classification-based anomaly detection in unstructured logs. In: 2020 IEEE International Conference on Data Mining (ICDM), pp. 1196–1201 (2020). IEEE

[7] Wu, X., Li, H., Khomh, F.: On the effectiveness of log representation for log-based anomaly detection. Empirical Software Engineering 28(6), 137 (2023)

[8] Xu, W., Huang, L., Fox, A., Patterson, D., Jordan, M.I.: Detecting large-scale system problems by mining console logs. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pp. 117–132 (2009)

[9] Lou, J.-G., Fu, Q., Yang, S., Xu, Y., Li, J.: Mining invariants from console logs for system problem detection. In: 2010 USENIX Annual Technical Conference (USENIX ATC 10) (2010)

[10] Du, M., Li, F., Zheng, G., Srikumar, V.: Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1285–1298 (2017)

[11] Le, V.-H., Zhang, H.: Log-based anomaly detection without log parsing. In: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 492–504 (2021). IEEE

[12] Guo, H., Yang, J., Liu, J., Bai, J., Wang, B., Li, Z., Zheng, T., Zhang, B., Peng, J., Tian, Q.: Logformer: A pre-train and tuning pipeline for log anomaly detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, pp. 135–143 (2024)

[13] He, S., Lin, Q., Lou, J.-G., Zhang, H., Lyu, M.R., Zhang, D.: Identifying impactful service system problems via log analysis. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 60–70 (2018)

[14] Farzad, A., Gulliver, T.A.: Unsupervised log message anomaly detection. ICT Express 6(3), 229–237 (2020)

[15] Le, V.-H., Zhang, H.: Log-based anomaly detection with deep learning: how far are we? In: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), pp. 1356–1367 (2022). IEEE

[16] Landauer, M., Skopik, F., Wurzenberger, M.: A critical review of common log data sets used for evaluation of sequence-based anomaly detection techniques. Proceedings of the ACM on Software Engineering 1(FSE), 1354–1375 (2024)

[17] Zhu, J., He, S., He, P., Liu, J., Lyu, M.R.: Loghub: A large collection of system log datasets for ai-driven log analytics. In: 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), pp. 355–366 (2023). IEEE

[18] Bodik, P., Goldszmidt, M., Fox, A., Woodard, D.B., Andersen, H.: Fingerprinting the datacenter: automated classification of performance crises. In: Proceedings of the 5th European Conference on Computer Systems, pp. 111–124 (2010)

[19] Chen, M., Zheng, A.X., Lloyd, J., Jordan, M.I., Brewer, E.: Failure diagnosis using decision trees. In: International Conference on Autonomic Computing, 2004. Proceedings., pp. 36–43 (2004). IEEE

[20] Liang, Y., Zhang, Y., Xiong, H., Sahoo, R.: Failure prediction in ibm bluegene/l event logs. In: Seventh IEEE International Conference on Data Mining (ICDM 2007), pp. 583–588 (2007). IEEE

[21] Guo, H., Yuan, S., Wu, X.: Logbert: Log anomaly detection via bert. In: 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2021). IEEE

[22] Lin, Q., Zhang, H., Lou, J.-G., Zhang, Y., Chen, X.: Log clustering based problem identification for online service systems. In: Proceedings of the 38th International Conference on Software Engineering Companion, pp. 102–111 (2016)

[23] Yu, B., Yao, J., Fu, Q., Zhong, Z., Xie, H., Wu, Y., Ma, Y., He, P.: Deep learning or classical machine learning? an empirical study on log-based anomaly detection. In: Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, pp. 1–13 (2024)

[24] He, P., Zhu, J., Zheng, Z., Lyu, M.R.: Drain: An online log parsing approach with fixed depth tree. In: 2017 IEEE International Conference on Web Services (ICWS), pp. 33–40 (2017). IEEE

[25] Reimers, N., Gurevych, I.: Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084 (2019)

[26] Hugging Face: all-MiniLM-L6-v2 Model. Accessed: April 8, 2024. https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

[27] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)

[28] Irie, K., Zeyer, A., Schlüter, R., Ney, H.: Language modeling with deep transformers. arXiv preprint arXiv:1905.04226 (2019)

[29] Haviv, A., Ram, O., Press, O., Izsak, P., Levy, O.: Transformer language models without positional encodings still learn positional information. In: Goldberg, Y., Kozareva, Z., Zhang, Y. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 1382–1390. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (2022). https://doi.org/10.18653/v1/2022.findings-emnlp.99. https://aclanthology.org/2022.findings-emnlp.99

[30] Kazemi, S.M., Goel, R., Eghbali, S., Ramanan, J., Sahota, J., Thakur, S., Wu, S., Smyth, C., Poupart, P., Brubaker, M.: Time2vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321 (2019)

[31] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

[32] Oliner, A., Stearley, J.: What supercomputers say: A study of five system logs. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), pp. 575–584 (2007). IEEE

:::info Authors:

  1. Xingfang Wu
  2. Heng Li
  3. Foutse Khomh

:::

:::info This paper is available on arxiv under CC by 4.0 Deed (Attribution 4.0 International) license.

:::

