Exploiting topic analysis models to explore psychological dimensions in social media data
