We Can Hear You With WiFi | Wang, Zou, Zhou, Wu, Ni

Guanhua Wang, Yongpan Zou, Zimu Zhou, Kaishun Wu, Lionel M. Ni; We Can Hear You with Wi-Fi!; In Proceedings of MobiCom; 2014-09-11; 12 pages; library.


Recent literature advances Wi-Fi signals to “see” people’s motions and locations. This paper asks the following question: Can Wi-Fi “hear” our talks? We present WiHear, which enables Wi-Fi signals to “hear” our talks without deploying any devices. To achieve this, WiHear needs to detect and analyze fine-grained radio reflections from mouth movements. WiHear solves this micro-movement detection problem by introducing Mouth Motion Profile that leverages partial multipath effects and wavelet packet transformation. Since Wi-Fi signals do not require line-of-sight, WiHear can “hear” people talks within the radio range. Further, WiHear can simultaneously “hear” multiple people’s talks leveraging MIMO technology. We implement WiHear on both USRP N210 platform and commercial Wi-Fi infrastructure. Results show that within our pre-defined vocabulary, WiHear can achieve detection accuracy of 91% on average for single individual speaking no more than 6 words and up to 74% for no more than 3 people talking simultaneously. Moreover, the detection accuracy can be further improved by deploying multiple receivers from different angle.

Locality-Sensitive Hashing for Search in High Dimensional Spaces






Some search queries

A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (DBSCAN) | Ester, Kriegel, Sander, Xu

Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu; A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise; In Proceedings of Knowledge Discovery in Databases (KDD); 1996; 6 pages.


Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases rises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that

  1. DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that
  2. DBSCAN outperforms CLARANS by factor of more than 100 in terms of efficiency.
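
The density-based notion of a cluster can be illustrated with a minimal sketch. It assumes a brute-force neighbor search with no spatial index, so it does not reflect the efficiency the abstract reports. The names `eps`, `min_pts`, `region_query`, and `dbscan` are chosen for this illustration; the abstract speaks of a single input parameter, while this sketch exposes both the radius and the density threshold as explicit arguments.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

Because clusters grow by chaining core points, the shape of a cluster is not constrained to be convex, which is the property the abstract contrasts with CLARANS.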

Attention Decay in Science | Della Briotta Parolo, Pan, Ghosh, Huberman, Kaski, Fortunato

Pietro Della Briotta Parolo, Raj Kumar Pan, Rumi Ghosh, Bernardo A. Huberman, Kimmo Kaski, Santo Fortunato; Attention Decay in Science; preprint; Elsevier (submitted to some journal of theirs); submitted: 2015-03-09; 12 pages; arXiv:1503.01881.


The exponential growth in the number of scientific papers makes it increasingly difficult for researchers to keep track of all the publications relevant to their work. Consequently, the attention that can be devoted to individual papers, measured by their citation counts, is bound to decay rapidly. In this work we make a thorough study of the life-cycle of papers in different disciplines. Typically, the citation rate of a paper increases up to a few years after its publication, reaches a peak and then decreases rapidly. This decay can be described by an exponential or a power law behavior, as in ultradiffusive processes, with exponential fitting better than power law for the majority of cases. The decay is also becoming faster over the years, signaling that nowadays papers are forgotten more quickly. However, when time is counted in terms of the number of published papers, the rate of decay of citations is fairly independent of the period considered. This indicates that the attention of scholars depends on the number of published items, and not on real time.
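
One simple way to compare the two decay shapes is to fit each as a straight line after a log transform and compare the residuals. The sketch below assumes ordinary least squares on log-transformed citation rates, which is not necessarily the fitting procedure used in the paper; `linear_fit` and `compare_decay_models` are names invented for this example, and the data in the usage lines are synthetic.

```python
from math import exp, log


def linear_fit(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def compare_decay_models(years, citations):
    """Fit c(t) ~ exp(-rate * t) and c(t) ~ t**(-exponent) in log space.

    years must be positive (time since the citation peak) and citations
    must be strictly positive so that the logarithms are defined.
    """
    log_c = [log(c) for c in citations]
    # Exponential decay is linear in t: log c = a - rate * t
    _, slope_exp, sse_exp = linear_fit(years, log_c)
    # Power-law decay is linear in log t: log c = a - exponent * log t
    _, slope_pow, sse_pow = linear_fit([log(t) for t in years], log_c)
    return {
        "exponential": {"rate": -slope_exp, "sse": sse_exp},
        "power_law": {"exponent": -slope_pow, "sse": sse_pow},
    }


if __name__ == "__main__":
    years = list(range(1, 11))
    citations = [100 * exp(-0.3 * t) for t in years]  # synthetic, purely illustrative
    print(compare_decay_models(years, citations))
```

The model with the smaller residual sum describes the decay better under this criterion. Replacing `years` with the cumulative number of papers published since the peak would give the "time counted in published papers" view described in the abstract.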

Via: backfill

Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach | Schwartz, Eichstaedt, Kern, Dziurzynski, Ramones, Agrawal, Shah, Kosinski, Stillwell, Seligman, Ungar

H. Andrew Schwartz, Johannes C. Eichstaedt, Margaret L. Kern, Lukasz Dziurzynski, Stephanie M. Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin E. P. Seligman, Lyle H. Ungar; Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach; In PLoS One; 2013-09-23; 16 pages; landing.


We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ’boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and person


  • Differential Language Analysis (DLA)
  • Five Factor Model (FFM), Big Five
  • Linguistic Inquiry and Word Count (LIWC)
  • Open Vocabulary, Closed Vocabulary
  • Regression
    • L0 Norm
    • L1 Norm
  • multi-predictor to multi-output regression
  • World Well-Being Program
  • Method
    • Linguistic Feature Extraction
    • Correlational Analysis
    • Visualization
  • Pointwise Mutual Information (PMI)
  • Potts’s happyfuntokenizer for <3 and :-)
  • Personality Tests
    • My Personality, an app
    • International Personality Item Pool
    • NEO Personality Inventory Revised (NEO-PI-R)
  • Vocabularies
    • Language of Gender
    • Language of Age
    • Language of Personality
  • Latent Dirichlet Allocation (LDA)
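
Of the terms listed above, Pointwise Mutual Information is easy to show in a few lines: PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), which is high when two words co-occur more often than their individual frequencies would predict. The sketch below is a generic bigram version under simple maximum-likelihood estimates; `bigram_pmi` is a name made up for this example and the paper's exact phrase-scoring variant may differ.

```python
from collections import Counter
from math import log2


def bigram_pmi(tokens):
    """Map each adjacent word pair (x, y) to its PMI in bits."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi          # probability of the pair
        p_x = unigrams[x] / n_uni    # probability of each word alone
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    toy = "sick of work sick of school happy birthday".split()
    for pair, score in sorted(bigram_pmi(toy).items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

Pairs of rare words that always appear together score highest, which is why PMI is commonly paired with a minimum-count threshold when it is used to pick out phrases.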


Via: backfill

Private traits and attributes are predictable from digital records of human behavior | Kosinski, Stillwell, Graepel

Michal Kosinski, David Stillwell, Thore Graepel; Private traits and attributes are predictable from digital records of human behavior; In Proceedings of the National Academy of Sciences of the United States of America (PNAS); 2013-02-12; 4 pages; landing.


We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait “Openness,” prediction accuracy is close to the test–retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.


  • You Are What You Like, promotional site.
  • Singular Value Decomposition (SVD)
  • Pseudo-Inverse of a Matrix
  • Five Factor Model (FFM)
    • Dimensions
      1. Openness to Experience
      2. Conscientiousness
      3. Extraversion
      4. Agreeableness
      5. Emotional Stability
    • Instruments
      • NEO Personality Inventory (NEO-PI-R)
      • NEO Five-Factor Inventory (NEO-FFI)
  • Intelligence
    • Raven’s Standard Progressive Matrices (SPM)
    • Spearman’s Theory of General Ability
  • International Personality Item Pool (IPIP)
  • Satisfaction With Life (SWL)
  • myPersonality Project
  • Receiver-Operating Characteristic (ROC)
  • Area Under [the] Curve (AUC)
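
The ROC and AUC entries above relate to the "in 88% of cases" style of figure in the abstract, assuming those percentages are AUC values: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition by pairwise comparison; `auc` is a name chosen for this example and the data in the usage lines are invented.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels; scores: predicted scores.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    y_true = [1, 1, 0, 0]           # invented labels
    y_score = [0.9, 0.4, 0.5, 0.1]  # invented model scores
    print(auc(y_true, y_score))
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation of the two classes.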


  • Lazer D, et al. (2009) Computational social science. In Science 323(5915):721–723.
  • Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender
    systems. In Computer 42(8):30–37.
  • Chen Y, Pavlov D, Canny JF (2009) Large-scale behavioral targeting. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD), pp 209–218.
  • Butler D (2007) Data sharing threatens privacy. In Nature 449(7163):644–645.
  • Narayanan A, Shmatikov V (2008) Robust de-anonymization of large sparse datasets. In Proceedings of the IEEE Symposium on Security and Privacy, pp 111–125.
  • Duhigg C (2012) The Power of Habit: Why We Do What We Do in Life and Business
    (Random House, New York).
  • Ince HO, Yarali A, Özsel D (2009) Customary killings in Turkey and Turkish modernization. In Middle East Studies 45(4):537–551.
  • Fast LA, Funder DC (2008) Personality as manifest in word use: Correlations with self-report, acquaintance report, and behavior. In Journal of Personal Social Psychology 94(2):334–346.
  • Costa PT, McCrae RR (1992) Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) Manual (Psychological Assessment Resources, Odessa, FL).
  • Gosling SD, Ko SJ, Mannarelli T, Morris ME (2002) A room with a cue: Personality
    judgments based on offices and bedrooms. In Journal of Personal Social Psychology 82(3):379–398.
  • Hu J, Zeng H-J, Li H, Niu C, Chen Z (2007) Demographic prediction based on user’s browsing behavior. In Proceedings of the International World Wide Web Conference (WWW), pp 151–160.
  • Murray D, Durrell K (1999) Inferring demographic attributes of anonymous Internet
    users. In Revised Papers from the International Workshop on Web Usage Analysis and User Profiling, eds Masand BM, Spiliopoulou M (Springer, London), pp 7–20.
  • De Bock K, Van Den Poel D (2010) Predicting website audience demographics for Web advertising targeting using multi-website clickstream data. In Fundamenta Informaticae 98(1):49–70.
  • Goel S, Hofman JM, Sirer MI (2012) Who does what on the Web: Studying Web
    browsing behavior at scale. In International Conference on Weblogs and Social Media, pp 130–137.
  • Kosinski M, Kohli P, Stillwell DJ, Bachrach Y, Graepel T (2012) Personality and website choice. In Proceedings of the ACM Web Science Conference, pp 251–254.
  • Marcus B, Machilek F, Schütz A (2006) Personality in cyberspace: Personal Web sites as media for personality expressions and impressions. In Journal of Personal Social Psychology 90(6):1014–1031.
  • Rentfrow PJ, Gosling SD (2003) The do re mi’s of everyday life: The structure and
    personality correlates of music preferences. In Journal Personal Social Psychology 84(6):1236–1256.
  • Quercia D, Lambiotte R, Kosinski M, Stillwell D, Crowcroft J (2012) The Personality of popular Facebook users. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work (CSCW), 2012, pp 955–964.
  • Bachrach Y, Kohli P, Graepel T, Stillwell DJ, Kosinski M (2012) Personality and patterns of Facebook usage. In Proceedings of the ACM Web Science Conference, pp 36–44.
  • Quercia D, Kosinski M, Stillwell DJ, Crowcroft J (2011) Our Twitter profiles, our selves: Predicting personality with Twitter. In Proceedings of the 2011 IEEE International Conference on Privacy, Security, Risk, and Trust, or maybe in Proceedings of the IEEE International Conference on Social Computing, pp 180–185.
  • Golbeck J, Robles C, Edmondson M, Turner K (2011) Predicting personality from
    Twitter. In Proceedings of the IEEE International Conference on Social Computing, pp 149–156.
  • Golbeck J, Robles C, Turner K (2011) Predicting personality with social media. In Proceedings of the Conference on Human Factors in Computing Systems (CHI), pp 253–262.
  • Jernigan C, Mistree BF (2009) Gaydar: Facebook friendships expose sexual orientation. First Monday 14(10).
  • Golub GH, Kahan W (1965) Calculating the singular values and pseudo-inverse of a matrix. In Journal Society for Industrial & Applied Math (SIAM) 2(2):205–224; also as Journal of SIAM Numerical Analysis, B 2(2).
  • Goldberg LR, et al. (2006) The international personality item pool and the future of
    public-domain personality measures. In Journal Research in Personality 40(1):84–96.
  • Raven JC (2000) The Raven’s progressive matrices: Change and stability over culture and time. In Cognitive Psychology 41(1):1–48.
  • Diener E, Emmons RA, Larsen RJ, Griffin S (1985) The satisfaction with life scale. In Journal Personal Assessment 49(1):71–75.
  • Musick K, Meier A (2010) Are both parents always better than one? Parental conflict
    and young adult well-being. In Social Science Research 39(5):814–830.
  • Schimmack U, Diener E, Oishi S (2002) Life-satisfaction is a momentary judgment and a stable personality characteristic: The use of chronically accessible and stable sources. In Journal of Personality 70(3):345–384.
  • Nass C, Lee KM (2000) Does computer-generated speech manifest personality? An experimental test of similarity-attraction. In Journal of Experimental Psychology 7(3):171–181.


  • Costa PT, McCrae RR (1992) Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) Manual (Psychological Assessment Resources, Odessa, FL).
  • Goldberg LR, et al. (2006) The international personality item pool and the future of public-domain personality measures. In Journal of Research on Personality 40(1):84–96.
  • Raven JC (2000) The Raven’s progressive matrices: change and stability over culture and time. In Cognitive Psychology 41(1):1–48.
  • Lubinski D (2004) Introduction to the special section on cognitive abilities: 100 years after Spearman’s (1904) “’General intelligence,’ objectively determined and measured”. In Journal of Personal Social Psychology 86(1):96–111.
  • Diener E, Emmons RA, Larsen RJ, Griffin S (1985) The satisfaction with life scale. In Journal of Personal Assessment 49(1):71–75.
  • Golub GH, Kahan W (1965) Calculating the singular values and pseudo-inverse of a matrix. In Journal Society for Industrial & Applied Math (SIAM) 2(2):205–224.


Fast Unfolding of Communities in Large Networks | Blondel, Guillaume, Lambiotte, Lefebvre

Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre; Fast unfolding of communities in large networks; In Journal of Statistical Mechanics: Theory and Experiment; Volume 10; 2008; 12 pages; landing


We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform all other known community detection method in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity. This is shown first by identifying language communities in a Belgian mobile phone network of 2.6 million customers and by analyzing a web graph of 118 million nodes and more than one billion links. The accuracy of our algorithm is also verified on ad-hoc modular networks.


The Louvain Method
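
The quantity that the method optimizes, modularity, can be computed for any given partition. For an unweighted, undirected graph with m edges it can be written as Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ], where L_c is the number of edges inside c and d_c is the total degree of its nodes. The sketch below only evaluates Q for a partition supplied by the caller; it does not implement the greedy optimization itself, and the names `modularity`, `edges`, and `community` are chosen for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community id.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)  # L_c: edges with both ends in community c
    degree = defaultdict(int)    # d_c: sum of node degrees in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    merged = {node: "all" for node in range(6)}
    print(modularity(edges, split), modularity(edges, merged))
```

Placing every node in one community always yields Q = 0, so a positive value indicates more intra-community edges than a degree-preserving random graph would be expected to have.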


Via: backfill

Inferring Trip Destinations From Driving Habits Data | Dewri, Annadata, Eltarjaman, Thurimella

Rinku Dewri, Prasad Annadata, Wisam Eltarjaman, Ramakrishna Thurimella; Inferring Trip Destinations From Driving Habits Data; In Proceedings of Workshop on Privacy in the Electronic Society (WPES); 2013; 9 pages.


The collection of driving habits data is gaining momentum as vehicle telematics based solutions become popular in consumer markets such as auto-insurance and driver assistance services. These solutions rely on driving features such as time of travel, speed, and braking to assess accident risk and driver safety. Given the privacy issues surrounding the geographic tracking of individuals, many solutions explicitly claim that the customer’s GPS coordinates are not recorded. Although revealing driving habits can give us access to a number of innovative products, we believe that the disclosure of this data only offers a false sense of privacy. Using speed and time data from real world driving trips, we show that the destinations of trips may also be determined without having to record GPS coordinates. Based on this, we argue that customer privacy expectations in non-tracking telematics applications need to be reset, and new policies need to be implemented to inform customers of possible risks.


  • Products
    • Progressive’s Snapshot,
    • AllState’s Drivewise,
    • State Farm’s In-Drive,
    • National General Insurance’s Low-Mileage Discount,
    • Travelers’ Intellidrive,
    • Esurance’s Drivesense,
    • Safeco’s Rewind,
    • Aviva’s Drive,
    • Amaguiz PAYD,
    • Insure The Box,
    • Cover-box,
    • Ingenie,
    • MyDrive.
  • Quasi-identifiers
  • Telematics
  • OnStar
  • OBD-II
  • LandAirSea GPS Tracking Key
  • OpenStreetMap
  • Stop Points
  • Depth-First Search (DFS)
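
The reason speed and time alone leak location can be sketched in one step: integrating speed over time gives the distance driven between consecutive stop points, and a sequence of such distances can then be matched against paths in a road map. The sketch below covers only the first half, under the assumption of trapezoidal integration and a simple speed threshold for detecting stops; `segment_distances` and `stop_speed` are names invented here, and the map search (for instance a depth-first search over OpenStreetMap road segments) is not shown.

```python
def segment_distances(samples, stop_speed=0.0):
    """Distances travelled between consecutive stops.

    samples: list of (time_seconds, speed_metres_per_second), sorted by time.
    A stop is any sample whose speed is at or below stop_speed.
    Returns a list of distances in metres, one per driven segment.
    """
    distances = []
    current = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # Trapezoidal rule: average speed over the interval times its length.
        current += 0.5 * (v0 + v1) * (t1 - t0)
        if v1 <= stop_speed and current > 0:
            distances.append(current)  # vehicle came to a stop: close the segment
            current = 0.0
    if current > 0:
        distances.append(current)  # trip ended while still moving
    return distances


if __name__ == "__main__":
    trip = [(0, 0), (10, 10), (20, 10), (30, 0), (40, 0), (50, 20), (60, 0)]
    print(segment_distances(trip))
```

Each returned distance constrains where the next intersection or turn can be, so a long trip produces a fingerprint that few routes from a known starting point can satisfy.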

Via: backfill, backfill

Tracking Sentiment in Mail: How Genders Differ on Emotional Axes | Mohammad, Yang

Saif M. Mohammad, Tony (Wenda) Yang; Tracking Sentiment in Mail: How Genders Differ on Emotional Axes; In Proceedings of the ACL Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA); 2011-06; 10 pages; Also available at arXiv; 2013-09-24.


With the widespread use of email, we now have access to unprecedented amounts of text that we ourselves have written. In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in many types of mail. We create a large word–emotion association lexicon by crowdsourcing, and use it to compare emotions in love letters, hate mail, and suicide notes. We show that there are marked differences across genders in how they use emotion words in work-place email. For example, women use many words from the joy–sadness axis, whereas men prefer terms from the fear–trust axis. Finally, we show visualizations that can help people track emotions in their emails.

From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales | Mohammad

Saif Mohammad; From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales; In Proceedings of the ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH); 2011, Available at arXiv; 2013-09-23; 10 pages.


Today we have access to unprecedented amounts of literary texts. However, search still relies heavily on key words. In this paper, we show how sentiment analysis can be used in tandem with effective visualizations to quantify and track emotions in both individual books and across very large collections. We introduce the concept of emotion word density, and using the Brothers Grimm fairy tales as example, we show how collections of text can be organized for better search. Using the Google Books Corpus we show how to determine an entity’s emotion associations from cooccurring words. Finally, we compare emotion words in fairy tales and novels, to show that fairy tales have a much wider range of emotion word densities than novels.
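
Emotion word density can be read as the number of words associated with a given emotion per fixed number of words of text. The sketch below assumes that reading; the tiny `toy_lexicon` is invented for illustration and is not the crowdsourced lexicon the authors built, and `emotion_word_density` and `per` are names chosen for this example.

```python
def emotion_word_density(tokens, lexicon, emotion, per=10_000):
    """Occurrences of words tagged with `emotion` per `per` tokens of text.

    tokens: list of word strings.
    lexicon: dict mapping a lowercase word to a set of emotion names.
    """
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if emotion in lexicon.get(t.lower(), ()))
    return per * hits / len(tokens)


if __name__ == "__main__":
    # Hypothetical lexicon entries, for illustration only.
    toy_lexicon = {
        "happy": {"joy"},
        "afraid": {"fear"},
        "wolf": {"fear"},
    }
    text = "The wolf was happy and the girl was afraid".split()
    for emotion in ("joy", "fear"):
        print(emotion, emotion_word_density(text, toy_lexicon, emotion, per=100))
```

Computing the density over a sliding window of a book, rather than over the whole text, gives a curve that tracks how an emotion rises and falls through the story.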

Via: backfill

Data Science Code of Professional Conduct



Ethics of Big Data: Balancing Risk and Innovation

Kord Davis; Ethics of Big Data: Balancing Risk and Innovation; O’Reilly Media; 2012-09; 82 pages; Amazon: kindle: $10, paperback: $18.

Privacy and Big Data

Terence Craig, Mary E. Ludloff; Privacy and Big Data; O’Reilly Media; 2011-09-29; 108 pages; kindle: $10, paperback: $18.



A Code of Ethics for Analysts

The INFORMS Code of Ethics for Certified Analytics Professionals includes six sections under the responsibilities for an analyst in the field. Excerpts are below:

Society


  • All professionals have societal obligations to perform their work in a professional, competent and ethical manner.
  • Professionals should adhere to all applicable laws, regulations and international covenants.

Employers and Clients

  • It is the practitioner’s responsibility to assure employers and clients that an analytical approach is suitable to their needs and resources, and include presenting the capabilities and limitations of analytical methods in addressing their problem.
  • Analytics professionals should clearly state their qualifications and relevant experience.
  • It is imperative to fulfill all commitments to employers and clients, guard any privileged information they provide unless required to disclose, and accept full responsibility for your performance.
  • Where appropriate, present a client or employer with choices among valid alternative approaches that may vary in scope, cost, or precision.
  • Apply analytical methods and procedures scientifically, without predetermining the outcome.
  • Resist any pressure from employers and clients to produce a particular “result,” regardless of its validity.

Colleagues


  • Analytics professionals have a responsibility to promote the effective and efficient use of analytical methods by all members of research teams and to respect the ethical obligations of members of other disciplines.
  • When possible, professionals share nonproprietary data and methods with others; participate in peer review, focusing on the assessment of methods not individuals.
  • Respect differing professional opinions while acknowledging the contributions and intellectual property of others.
  • Those professionals involved in teaching or training students or junior analysts have a responsibility to instill in them an appreciation for the practical value of the concept and methods they are learning.
  • Those in leadership and decision-making roles should use professional qualifications with regard to analytic professionals’ hiring, firing, promotion, work assignments, and other professional matters.
  • Avoid harassment of or discrimination based on professionally irrelevant bases such as race, color, ethnicity, gender, sexual orientation, national origin, age, religion, nationality, or disability.

Research Subjects

  • If a project involves research subjects, including census or survey respondents, an analytics professional will know and adhere to the appropriate rules for the protection of those human subjects.
  • Be particularly aware of situations involving vulnerable populations that may be subject to special risks and may not be able to protect their own interests.
  • This responsibility includes protecting the privacy and confidentiality of research subjects and data concerning them.

INFORMS and the Profession

  • Analytics professionals will strive for relevance in all analyses.
  • Each study or project should be based on a competent understanding of the subject-matter issues, appropriate analytical methods, and technical criteria to justify both the practical relevance of the study and the data to be used.
  • Guard against the possibility that a predisposition by investigators or data providers might predetermine the analytical result.
  • Remain current in constantly changing analytical methodology, as preferred methods from yesterday may be barely acceptable today and totally obsolete tomorrow.
  • Disclose conflicts of interest, financial and otherwise, and resolve them.
  • Provide only such expert testimony as you would be willing to have peer reviewed.
  • Maintain personal responsibility for all work bearing your name; avoid undertaking work or coauthoring publications for which you would not want to acknowledge responsibility.

Alleged Misconduct

  • Certified Analytics Professionals will strive to avoid condoning or appearing to condone careless, incompetent, or unethical practices. Misconduct broadly includes all professional dishonesty, by commission or omission, and, within the realm of professional activities and expression, all harmful disrespect for people, unauthorized or illegal use of their intellectual and physical property, and unjustified detraction from the reputation of others.
  • Recognize that differences of opinion and honest error do not constitute misconduct; they warrant discussion, but not accusation.
  • Questionable scientific practices may or may not constitute misconduct, depending on their nature and the definition of misconduct used.
  • Do not condone retaliation against or damage to the employability of those who responsibly call attention to possible scientific error or misconduct.

Data Science Code of Professional Conduct

Via Rose Business Technologies, excerpt from the section entitled, Data Science Evidence, Quality of Data and Quality of Evidence

A data scientist shall not knowingly:

  1. Fail to use scientific methods in performing data science.
  2. Fail to rank the quality of evidence in a reasonable and understandable manner for the client.
  3. Claim weak or uncertain evidence is strong evidence.
  4. Misuse weak or uncertain evidence to communicate a false reality or promote an illusion of understanding.
  5. Fail to rank the quality of data in a reasonable and understandable manner for the client.
  6. Claim bad or uncertain data quality is good data quality.
  7. Misuse bad or uncertain data quality to communicate a false reality or promote an illusion of understanding.
  8. Fail to disclose any and all data science results or engage in cherry-picking.
  9. Fail to attempt to replicate data science results.
  10. Fail to disclose that data science results could not be replicated.
  11. Misuse data science results to communicate a false reality or promote an illusion of understanding.
  12. Fail to disclose failed experiments or disconfirming evidence known to the data scientist to be directly adverse to the position of the client.
  13. Offer evidence that the data scientist knows to be false. If a data scientist questions the quality of data or evidence the data scientist must disclose this to the client. If a data scientist has offered material evidence and the data scientist comes to know of its falsity, the data scientist shall take reasonable remedial measures, including disclosure to the client. A data scientist may disclose and label evidence the data scientist reasonably believes is false.
  14. Cherry-pick data and data science evidence.

Social Network Cluster Analysis

Editor; Dancing the Bunny Hop with the NSA; In The Network Thinkers; 2013-07-27.




  • Kim Zetter; Tracking Terrorists the Las Vegas Way; In PC World; undated 2012?

    • Non-Obvious Relationship Awareness (NORA).
    • Systems Research and Development (SRD)
      • Jeff Jonas, founder, Chief Scientist
      • John Slitz, CEO
      • Funding: In-Q-Tel
      • Food Marketing Institute (prospected at, no announced deal)
    • Fuzzy Logic
    • Real-time analysis
    • Choicepoint
    • Record Linkage (probabilistic record linkage)
    • Black Hat Conference, Las Vegas
      • unrelated to the body of the story
      • SRD demonstrated something, “offered a glimpse”


  • Valdis Krebs; Connecting the Dots: Tracking Two Identified Terrorists; at Orgnet; undated; updates 2005, 2006, 2007, available through 2013-08-01.
  • Valdis E. Krebs (orgnet.com); Mapping Networks of Terrorist Cells; In CONNECTIONS; Vol. 24, No. 3; pages 42-52, 10 pages.
  • James Bamford; They Know Much More Than You Think; In The New York Review of Books; 2013-07-12.

    • History of the NSA
      • The Black Chamber
        • Herbert O. Yardley
        • Woodrow Wilson
        • Radio Communications Act
        • Newcomb Carlton, president of Western Union
      • Project Shamrock
        • Ended 1975
        • Senator Frank Church
      • Adrienne J. Kinne
        • age 24 in 2001
        • conducted eavesdropping without a warrant, interviewed & quoted.
      • Foreign Intelligence Surveillance Act (FISA)
      • Foreign Intelligence Surveillance Court (FISC)
    • Chronology of testimonies of NSA leader & Senators
      • General Keith Alexander, director of the NSA.
      • William Binney, ex-NSA, interviewed.
      • James Clapper, the director of national intelligence.
      • Senators Ron Wyden and Mark Udall
        • a joint statement
    • Database/project secret-silly-codenames:
      • UPSTREAM
      • PRISM

Via: backfill

Engineering Serendipity | Greg Lindsay, NYT

Greg Lindsay; Engineering Serendipity; In The New York Times (NYT); 2013-04-07.
Via: backfill


  • Google, a hagiography in Vanity Fair, forthcoming
  • Yahoo!, is like Google, but different
  • Theme: serendipity
    • “coined by the British aristocrat Horace Walpole in a 1754 letter, long referred to a fortunate accidental discovery” (no citation)
  • Ronald S. Burt
    • sociologist, a professor, University of Chicago
    • a study on Raytheon circa 2004, N=673.
    • Co-author Michael Fire
    • Theory of “organizational gap ‘structural holes’”
    • Constructed a social network map
  • Thomas J. Allen
    • a professor of management and engineering at M.I.T.
    • “out of sight, out of mind”
  • Uncredited, uncited study of 2012
    • researchers at Arizona State University
    • “sensors” to measure creativity
    • a study-that-shows
  • Sociometric Solutions
    • originated at MIT Media Lab’s (ML) Human Dynamics Laboratory (HDL)
    • Ben Waber
      • co-founder, had visited MIT ML HDL
      • claim: “employees who ate at cafeteria tables designed for 12 were more productive than those at tables for four”
      • book People Analytics forthcoming 2013-05-11 (no kindle)
  • Gratuitous color quote
    • Scott Doorley, a creative director at Stanford University’s Institute of Design
    • Scott Witthoft, a colleague
    • propose “positioning couches near doorways and stocking rooms with multiple types of seating to encourage lingering conversations.”