Attention Decay in Science | Della Briotta Parolo, Pan, Ghosh, Huberman, Kaski, Fortunato

Pietro Della Briotta Parolo, Raj Kumar Pan, Rumi Ghosh, Bernardo A. Huberman, Kimmo Kaski, Santo Fortunato; Attention Decay in Science; preprint (submitted to an Elsevier journal); submitted: 2015-03-09; 12 pages; arXiv:1503.01881.

Abstract

The exponential growth in the number of scientific papers makes it increasingly difficult for researchers to keep track of all the publications relevant to their work. Consequently, the attention that can be devoted to individual papers, measured by their citation counts, is bound to decay rapidly. In this work we make a thorough study of the life-cycle of papers in different disciplines. Typically, the citation rate of a paper increases up to a few years after its publication, reaches a peak and then decreases rapidly. This decay can be described by an exponential or a power law behavior, as in ultradiffusive processes, with exponential fitting better than power law for the majority of cases. The decay is also becoming faster over the years, signaling that nowadays papers are forgotten more quickly. However, when time is counted in terms of the number of published papers, the rate of decay of citations is fairly independent of the period considered. This indicates that the attention of scholars depends on the number of published items, and not on real time.
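The comparison between exponential and power-law decay mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' fitting procedure: it assumes a simple least-squares fit in log space, and the function names (`linear_fit`, `compare_decay_models`) and the synthetic citation series are hypothetical, chosen only for illustration.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.
    Returns (slope, intercept, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss

def compare_decay_models(years, citations):
    """Fit the post-peak citation rate with two candidate decays.
    Assumes every year and every citation rate is strictly positive.
      exponential: log c = log a - t / tau        (linear in t)
      power law:   log c = log a - alpha * log t  (linear in log t)
    """
    log_c = [math.log(c) for c in citations]
    slope_e, _, rss_e = linear_fit(years, log_c)
    slope_p, _, rss_p = linear_fit([math.log(t) for t in years], log_c)
    tau = -1.0 / slope_e if slope_e != 0 else math.inf
    return {
        "exponential": {"tau": tau, "rss": rss_e},
        "power_law": {"alpha": -slope_p, "rss": rss_p},
    }

# Synthetic (made-up) series: yearly citation rate after the peak,
# generated from an exponential with a decay time of 4 years.
years = list(range(1, 21))
citations = [50.0 * math.exp(-t / 4.0) for t in years]
result = compare_decay_models(years, citations)
better = min(result, key=lambda name: result[name]["rss"])
print(better, result[better])
```

Comparing residuals in log space is only one possible criterion; a likelihood-based comparison would be an equally reasonable choice.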

Via: backfill

Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach | Schwartz, Eichstaedt, Kern, Dziurzynski, Ramones, Agrawal, Shah, Kosinski, Stillwell, Seligman, Ungar

H. Andrew Schwartz, Johannes C. Eichstaedt, Margaret L. Kern, Lukasz Dziurzynski, Stephanie M. Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin E. P. Seligman, Lyle H. Ungar; Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach; In PLoS One; 2013-09-23; 16 pages; landing.

Abstract

We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and person

Mentioned

  • Differential Language Analysis (DLA)
  • Five Factor Model (FFM), Big Five
  • Linguistic Inquiry and Word Count (LIWC)
  • Open Vocabulary, Closed Vocabulary
  • Regression
    • L0 Norm
    • L1 Norm
  • multi-predictor to multi-output regression
  • World Well-Being Program
  • Method
    • Linguistic Feature Extraction
    • Correlational Analysis
    • Visualization
  • Pointwise Mutual Information (PMI)
  • Potts’ happyfuntokenizer for <3 and :-)
  • Personality Tests
    • My Personality, an app
    • International Personality Item Pool
    • NEO Personality Inventory Revised (NEO-PI-R)
  • Vocabularies
    • Language of Gender
    • Language of Age
    • Language of Personality
  • Latent Dirichlet Allocation (LDA)
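
Pointwise Mutual Information, listed above, has a standard definition: PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). The sketch below applies it to adjacent word pairs. It is an illustration under stated assumptions, not the paper's pipeline: the function name `bigram_pmi`, the maximum-likelihood probability estimates, and the toy sentence are all hypothetical.

```python
import math
from collections import Counter

def bigram_pmi(tokens):
    """Score each adjacent word pair with pointwise mutual information.
    Probabilities are plain relative frequencies (no smoothing)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (first, second), count in bigrams.items():
        p_pair = count / n_bigrams
        p_first = unigrams[first] / n_unigrams
        p_second = unigrams[second] / n_unigrams
        scores[(first, second)] = math.log(p_pair / (p_first * p_second))
    return scores

# Toy input; a real corpus would need tokenization and frequency cutoffs.
tokens = "sick of this and sick of that".split()
for pair, score in sorted(bigram_pmi(tokens).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

Unsmoothed PMI tends to favor rare words, so in practice it is usually combined with a minimum-count threshold.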

Via: backfill

Tracking Sentiment in Mail: How Genders Differ on Emotional Axes | Mohammad, Yang

Saif M. Mohammad, Tony (Wenda) Yang; Tracking Sentiment in Mail: How Genders Differ on Emotional Axes; In Proceedings of the ACL Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA); 2011-06; 10 pages; Also available at arXiv; 2013-09-24.

Abstract

With the widespread use of email, we now have access to unprecedented amounts of text that we ourselves have written. In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in many types of mail. We create a large word–emotion association lexicon by crowdsourcing, and use it to compare emotions in love letters, hate mail, and suicide notes. We show that there are marked differences across genders in how they use emotion words in work-place email. For example, women use many words from the joy–sadness axis, whereas men prefer terms from the fear–trust axis. Finally, we show visualizations that can help people track emotions in their emails.

From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales | Mohammad

Saif Mohammad; From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales; In Proceedings of the ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH); 2011; Also available at arXiv; 2013-09-23; 10 pages.

Abstract

Today we have access to unprecedented amounts of literary texts. However, search still relies heavily on key words. In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in both individual books and across very large collections. We introduce the concept of emotion word density, and using the Brothers Grimm fairy tales as example, we show how collections of text can be organized for better search. Using the Google Books Corpus we show how to determine an entity’s emotion associations from cooccurring words. Finally, we compare emotion words in fairy tales and novels, to show that fairy tales have a much wider range of emotion word densities than novels.
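
The idea of emotion word density can be sketched as follows. This assumes density means the number of emotion-associated words per fixed number of tokens (here per 1,000), which may differ in detail from the paper's definition. The lexicon is a tiny made-up stand-in rather than the crowdsourced word-emotion lexicon, and `emotion_word_density` is a hypothetical name.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> set of associated emotions.
TOY_LEXICON = {
    "wolf": {"fear"},
    "afraid": {"fear"},
    "wept": {"sadness"},
    "happy": {"joy"},
}

def emotion_word_density(text, lexicon, per=1000):
    """Return, for each emotion, the number of associated words
    per `per` tokens of text. Empty text yields an empty dict."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return {emotion: per * n / len(tokens) for emotion, n in counts.items()}

sample = "The wolf was afraid and the child wept"
print(emotion_word_density(sample, TOY_LEXICON))
```

Normalizing by text length is what makes short fairy tales and long novels comparable on the same scale.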

Via: backfill