Title: Efficient algorithms for mining clickstream patterns using pseudo-IDLists
Author: Huynh, Huy M.; Nguyen, Loan T.T.; Vo, Bay; Yun, Unil; Komínková Oplatková, Zuzana; Hong, Tzung-Pei
Document type: Peer-reviewed article (English)
Source document: Future Generation Computer Systems. 2020, vol. 107, p. 18-30
ISSN: 0167-739X
DOI: https://doi.org/10.1016/j.future.2020.01.034
Abstract: Sequential pattern mining is an important task in data mining. Its subproblem, clickstream pattern mining, is attracting more research due to the growth of the Internet and the need to analyze online customer behavior. To date, only a few works have been dedicated to the problem of mining clickstream patterns. Although one approach is to use general algorithms for sequential pattern mining, their performance may suffer, and they may need more resources than a method dedicated to mining clickstreams. In this paper, we present pseudo-IDList, a novel data structure that is better suited to clickstream pattern mining. Based on this structure, a vertical-format algorithm named CUP (Clickstream pattern mining Using Pseudo-IDList) is proposed. Furthermore, we propose a pruning heuristic named DUB (Dynamic intersection Upper Bound) to improve the proposed algorithm. Experiments on four real-life clickstream databases show that the proposed methods are effective and efficient in terms of runtime and memory consumption. © 2020 Elsevier B.V.
Full text: https://www.sciencedirect.com/science/article/pii/S0167739X19314475