Exploring ADINT: Using Ad Targeting for Surveillance on a Budget — or — How Alice Can Buy Ads to Track Bob | Vines, Roesner, Kohno

Paul Vines, Franziska Roesner, Tadayoshi Kohno; Exploring ADINT: Using Ad Targeting for Surveillance on a Budget — or — How Alice Can Buy Ads to Track Bob; In Proceedings of the 16th ACM Workshop on Privacy in the Electronic Society (WPES 2017); 2017-10-30; 11 pages; outreach.

tl;dr → Tadayoshi et al. are virtuosos at these performance-art happenings: catchy hook, cool marketing name (ADINT), and press outreach front-running the actual conference venue. For the whuffie and the lulz. Nice demo, though.
and → They bought geofence campaigns in a grid, then used close-the-loop analytics to identify the sojourn trail of the target.
and → don’t use Grindr.
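The grid attack reads as a simple pipeline: buy a location-targeted (geofenced) ad in every cell of a grid, watch which cells report impressions for the target's Mobile Advertising ID, and order those reports in time. A toy reconstruction in Python, with all identifiers hypothetical:

```python
from collections import namedtuple

# One record per ad impression served to a mobile ad ID (MAID).
# In the real attack these come back from the DSP's reporting interface.
Impression = namedtuple("Impression", ["maid", "grid_cell", "timestamp"])

def sojourn_trail(impressions, target_maid):
    """Recover a target's movement trail from geofence-grid impressions."""
    hits = [i for i in impressions if i.maid == target_maid]
    hits.sort(key=lambda i: i.timestamp)
    # Collapse consecutive impressions in the same cell into one sojourn.
    trail = []
    for hit in hits:
        if not trail or trail[-1] != hit.grid_cell:
            trail.append(hit.grid_cell)
    return trail

impressions = [
    Impression("maid-bob", "cell-A1", 1000),
    Impression("maid-eve", "cell-C9", 1010),
    Impression("maid-bob", "cell-A1", 1400),
    Impression("maid-bob", "cell-B2", 2200),
    Impression("maid-bob", "cell-D4", 3100),
]
print(sojourn_trail(impressions, "maid-bob"))  # ['cell-A1', 'cell-B2', 'cell-D4']
```

The paper's refinement is choosing the grid cell size and ad-refresh cadence so that an app reporting location at all keeps landing in some purchased cell.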


The online advertising ecosystem is built upon the ability of advertising networks to know properties about users (e.g., their interests or physical locations) and deliver targeted ads based on those properties. Much of the privacy debate around online advertising has focused on the harvesting of these properties by the advertising networks. In this work, we explore the following question: can third-parties use the purchasing of ads to extract private information about individuals? We find that the answer is yes. For example, in a case study with an archetypal advertising network, we find that — for $1000 USD — we can track the location of individuals who are using apps served by that advertising network, as well as infer whether they are using potentially sensitive applications (e.g., certain religious or sexuality-related apps). We also conduct a broad survey of other ad networks and assess their risks to similar attacks. We then step back and explore the implications of our findings.


  • Markets
    They chose

    • Facebook
    • not Google
    • etc.
    • not to fight with big DSPs;
      they picked the weaker ones to highlight.
  • Apps
    They chose

    • lower-quality apps.
    • adult apps
      few (none?) “family-oriented” apps.
    • <ahem>Adult Diapering Diary</ahem>


  • DSPs sell location with 8 m CEP (circular error probable) precision.

Spooky Cool Military Lingo


Targeting Dimensions

  • Demographics
  • Interests
  • Personally-Identifying Information (PII)
  • Domain (a usage taxonomy)
  • Location
  • Identifiers
    • Cookie Identifier
    • Mobile Ad Identifier (e.g. IDFA, GPSAID)
  • Technographics
    • Device (Make Model OS)
    • Network (Carrier)
  • Search
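The dimensions above combine into one campaign spec at purchase time. A hypothetical sketch of a MAID-plus-geofence buy; the field names are illustrative, not any real DSP's API:

```python
# Hypothetical campaign spec, field names are illustrative only.
campaign = {
    "creative": "https://example.invalid/banner.png",
    "bid_usd_cpm": 2.50,
    "targeting": {
        "identifiers": {"mobile_ad_id": ["38400000-8cf0-11bd-b23e-10b96e40000d"]},
        "location": {"lat": 47.6062, "lon": -122.3321, "radius_m": 8},
        "technographics": {"os": "Android", "carrier": "any"},
    },
}

def matches(request, targeting):
    """Does a bid request fall inside the campaign's targeting?"""
    return (request["maid"] in targeting["identifiers"]["mobile_ad_id"]
            and request["os"] == targeting["technographics"]["os"])

print(matches({"maid": "38400000-8cf0-11bd-b23e-10b96e40000d", "os": "Android"},
              campaign["targeting"]))  # True
```

The surveillance point is that identifier targeting plus a tight location radius turns an ordinary ad buy into a per-person tripwire.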

Media Types

Supply-Side Platforms (SSPs)

  • Adbund
  • InnerActive
  • MobFox
  • Smaato
  • Xapas

Supply (the adware itself, The Applications, The Apps)

  • Adult Diapering Diary
  • BitTorrent
  • FrostWire
  • Grindr
  • Hide My Texts
  • Hide Pictures vault
  • Hornet
  • iFunny
  • Imgur
  • Jack’D
  • Meet24
  • MeetMe
  • Moco
  • My Mixtapez Music
  • Pregnant Mommy’s Maternity
  • Psiphon
  • Quran Reciters
  • Romeo
  • Tagged
  • Talkatone
  • TextFree
  • TextMe
  • TextPlus
  • The Chive
  • uTorrent
  • Wapa
  • Words with Friends

Demand-Side Platforms (DSPs)

  • Ademedo
  • AdRoll
  • AdWords
  • Bing
  • Bonadza
  • BluAgile
  • Centro
  • Choozle
  • Criteo
  • ExactDrive
  • Facebook
  • GetIntent
  • Go2Mobi
  • LiquidM
  • MediaMath
  • MightyHive
  • Simpli.Fi
  • SiteScout
  • Splicky
  • Tapad



  • Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, Claudia Diaz. 2014. The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. In Proceedings of the ACM Conference on Computer and Communications Security.
  • Rebecca Balebako, Pedro Leon, Richard Shay, Blase Ur, Yang Wang, L Cranor. 2012. Measuring the effectiveness of privacy tools for limiting behavioral advertising. In Web 2.0 Security and Privacy.
  • Hal Berghel. 2001. Caustic Cookies. In His Blog.
  • Interactive Advertising Bureau. 2015. IAB Tech Lab Content Taxonomy.
  • Interactive Advertising Bureau. 2017. IAB Interactive Advertising Wiki.
  • Giuseppe Cattaneo, Giancarlo De Maio, Pompeo Faruolo, Umberto Ferraro Petrillo. 2013. A review of security attacks on the GSM standard. In Information and Communication Technology-EurAsia Conference. Springer, pages 507–512.
  • Robert M Clark. 2013. Perspectives on Intelligence Collection. In The intelligencer, a Journal of US Intelligence Studies 20, 2, pages 47–53.
  • David Cole. 2014. We kill people based on metadata. In The New York Review of Books.
  • Jonathan Crussell, Ryan Stevens, Hao Chen. 2014. Madfraud: Investigating ad fraud in android applications. In Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services. ACM, pages 123–134.
  • Doug DePerry, Tom Ritter, Andrew Rahimi. 2013. Cloning with a Compromised CDMA Femtocell.
  • Google Developers. 2017. Google Ads.
  • Steven Englehardt and Arvind Narayanan. 2016. Online tracking: A 1-million-site measurement and analysis. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, pages 1388–1401.
  • Steven Englehardt, Dillon Reisman, Christian Eubank, Peter Zimmerman, Jonathan Mayer, Arvind Narayanan, Edward W Felten. 2015. Cookies that give you away: The surveillance implications of web tracking. In Proceedings of the 24th International Conference on World Wide Web. ACM, pages 289–299.
  • Go2mobi. 2017.
  • Aleksandra Korolova. 2010. Privacy violations using microtargeted ads: A case study. In Proceedings of the 2010 IEEE International Conference on IEEE Data Mining Workshops (ICDMW), pages 474–482.
  • Zhou Li, Kehuan Zhang, Yinglian Xie, Fang Yu, XiaoFeng Wang. 2012. Knowing your enemy: understanding and detecting malicious web advertising. In Proceedings of the 2012 ACM conference on Computer and Communications Security. ACM, pages 674–686.
  • Nicolas Lidzborski. 2014. Staying at the forefront of email security and reliability: HTTPS-only and 99.978 percent availability. In Their Blog, Google.
  • Steve Mansfield-Devine. 2015. When advertising turns nasty. In Network Security 11, pages 5–8.
  • Jeffrey Meisner. 2014. Advancing our encryption and transparency efforts. In Their Blog, Microsoft.
  • Rick Noack. 2014. Could using gay dating app Grindr get you arrested in Egypt?. In The Washington Post.
  • Franziska Roesner, Tadayoshi Kohno, David Wetherall. 2012. Detecting and Defending Against Third-Party Tracking on the Web. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI).
  • Sooel Son, Daehyeok Kim, Vitaly Shmatikov. 2016. What mobile ads know about mobile users. In Proceedings of the 23rd Annual Network and Distributed System Security Symposium (NDSS).
  • Mark Joseph Stern. 2016. This Daily Beast Grindr Stunt Is Sleazy, Dangerous, and Wildly Unethical. In Slate, 2016.
  • Ryan Stevens, Clint Gibler, Jon Crussell, Jeremy Erickson, Hao Chen. 2012. Investigating user privacy in android ad libraries. In Proceedings of the Workshop on Mobile Security Technologies (MoST).
  • Ratko Vidakovic. 2013. The Mechanics Of Real-Time Bidding. In Marketingland.
  • Craig E. Wills and Can Tatar. 2012. Understanding what they do with what they know. In Proceedings of the ACM Workshop on Privacy in the Electronic Society (WPES).
  • Tom Yeh, Tsung-Hsiang Chang, Robert C Miller. 2009. Sikuli: using GUI screenshots for search and automation. In Proceedings of the 22nd annual ACM Symposium on User Interface Software and Technology. ACM, pages 183–192.
  • Apostolis Zarras, Alexandros Kapravelos, Gianluca Stringhini, Thorsten Holz, Christopher Kruegel, Giovanni Vigna. 2014. The dark alleys of madison avenue: Understanding malicious advertisements. In Proceedings of the 2014 Conference on Internet Measurement Conference.
  • Tiliang Zhang, Hua Zhang, Fei Gao. 2013. A Malicious Advertising Detection Scheme Based on the Depth of URL Strategy. In Proceedings of the 2013 Sixth International Symposium on Computational Intelligence and Design (ISCID), Vol. 2. IEEE, pages 57–60.
  • Peter Thomas Zimmerman. 2015. Measuring privacy, security, and censorship through the utilization of online advertising exchanges. Technical Report, Princeton University.


The Suitcase Words

  • Mobile Advertising ID (MAID)
  • Demand-Side Platform (DSP)
  • Supply-Side Platform (SSP)
  • Global Positioning System (GPS)
  • Google Play Store (GPS)
  • geofencing
  • cookie tracking
  • Google Advertising Identifier (GAID), a.k.a. Google Play Services Advertising Identifier
  • Facebook
  • Snowden
  • WiFi

Previously filled.

Incompatible: The GDPR in the Age of Big Data | Tal Zarsky

Tal Zarsky (Haifa); Incompatible: The GDPR in the Age of Big Data; Seton Hall Law Review, Vol. 47, No. 4(2), 2017; 2017-08-22; 26 pages; ssrn:3022646.
Tal Z. Zarsky is Vice Dean and Professor, Haifa University, IL.

tl;dr → the opposition is elucidated and juxtaposed; the domain is problematized.
and → “Big Data,” by definition, is opportunistic and unsupervisable; it collects everything and identifies something later in the backend.  Else it is not “Big Data” (it is “little data,” which is known, familiar, boring, and of course has settled law surrounding its operational envelope).


After years of drafting and negotiations, the EU finally passed the General Data Protection Regulation (GDPR). The GDPR’s impact will, most likely, be profound. Among the challenges data protection law faces in the digital age, the emergence of Big Data is perhaps the greatest. Indeed, Big Data analysis carries both hope and potential harm to the individuals whose data is analyzed, as well as other individuals indirectly affected by such analyses. These novel developments call for both conceptual and practical changes in the current legal setting.

Unfortunately, the GDPR fails to properly address the surge in Big Data practices. The GDPR’s provisions are — to borrow a key term used throughout EU data protection regulation — incompatible with the data environment that the availability of Big Data generates. Such incompatibility is destined to render many of the GDPR’s provisions quickly irrelevant. Alternatively, the GDPR’s enactment could substantially alter the way Big Data analysis is conducted, transferring it to one that is suboptimal and inefficient. It will do so while stalling innovation in Europe and limiting utility to European citizens, while not necessarily providing such citizens with greater privacy protection.

After a brief introduction (Part I), Part II quickly defines Big Data and its relevance to EU data protection law. Part III addresses four central concepts of EU data protection law as manifested in the GDPR: Purpose Specification, Data Minimization, Automated Decisions and Special Categories. It thereafter proceeds to demonstrate that the treatment of every one of these concepts in the GDPR is lacking and in fact incompatible with the prospects of Big Data analysis. Part IV concludes by discussing the aggregated effect of such incompatibilities on regulated entities, the EU, and society in general.


<snide><irresponsible>Apparently this was not known before the activists captured the legislature and affected their ends with the force of law. Now we know. Yet we all must obey the law, as it stands and as it is written. And why was this not published in an EU-located law journal, perhaps one located in … Brussels?</irresponsible></snide>



    1. Purpose Limitation
    2. Data Minimization
    3. Special Categories
    4. Automated Decisions


  • Big Data (contra “little data”)
  • personal data
  • Big Data Revolution
  • evolution not revolution
    no really, revolution not evolution
  • The GDPR is a regulation “on the protection of natural persons,”
  • EU General Data Protection Regulation (GDPR)
  • EU Data Protection Directive (DPD)
  • Is the GDPR different from the DPD? Maybe not. Why? cf. page 10.
  • Various attempts at intuiting bright-line tests around the laws are recited.
    It is a law, but nobody knows how it is interpreted or how it might be enforced.
  • statistical purpose
  • analytical purpose
  • data minimization
  • pseudonymization
  • reidentification
  • specific individuals
  • <quote>In the DPD, article 8(1) prohibited the processing of data “revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life,” while providing narrow exceptions. This distinction was embraced by the GDPR.</quote>
  • Article 29 Working Party
  • on (special) category contagion
    “we feel that all data is credit data, we just don’t know how to use it yet.”
    c.f. page 19; attributed to Dr. Douglas Merrill, then-founder, ZestFinance, ex-CTO, Google.
  • data subjects
  • automated decisions
  • right to “contest the decision”
  • obtain human intervention
  • trade secrets contra decision transparency
    by precedent, in EU (DE), corporate rights trump decision subject’s rights.
  • [a decision process] must be interpretable
  • right to due process [when facing a machine]
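Pseudonymization, one of the GDPR's favored safeguards, illustrates the reidentification bullet above: a stable pseudonym removes the direct identifier yet still links every record about the same person. A minimal sketch (the salted-hash scheme is an assumption for illustration, not a GDPR prescription):

```python
import hashlib

SALT = b"per-dataset-secret"  # hypothetical per-dataset salt

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a stable pseudonym."""
    return hashlib.sha256(SALT + identifier.encode()).hexdigest()[:12]

records = [
    {"email": "bob@example.com", "purchase": "quran-app"},
    {"email": "bob@example.com", "purchase": "maternity-app"},
]
pseudonymized = [{"id": pseudonymize(r["email"]), "purchase": r["purchase"]}
                 for r in records]

# The direct identifier is gone, but both rows still share one pseudonym,
# so a profile of a specific individual can be rebuilt; reidentification
# then needs only a single linkage back to a known identity.
assert pseudonymized[0]["id"] == pseudonymized[1]["id"]
```

Which is the article's point: the GDPR treats pseudonymization as a meaningful protection, while Big Data practice treats it as a join key.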


Big Data is…

  • …wait for it… so very very big
    …thank you, thank you very much. I will be here all week. Please tip your waitron.
The Four Five “Vs”
  1. The Volume of data collected,
  2. The Variety of the sources,
  3. The Velocity,
    <quote>with which the analysis of the data can unfold,</quote>
  4. The Veracity,
    <quote>of the data which could (arguably) be achieved through the analytical process.</quote>
  5. The Value, yup, that’s five.
    … <quote>yet this factor seems rather speculative and is thus best omitted.</quote>

The Brussels Effect

  • What goes on in EU goes global,
  • “Europeanization”
  • Law in EU is applied world-wide because corporate operations are universal.


  • purpose limitation,
  • data minimization,
  • special categories,
  • automated decisions.


There are 123 references, across 26 pages of prose, made manifest as footnotes in the legal style. Here, simplified and deduplicated.

Previously filled.

Syllabus for Solon Barocas @ Cornell | INFO 4270: Ethics and Policy in Data Science

INFO 4270 – Ethics and Policy in Data Science
Instructor: Solon Barocas
Venue: Cornell University


Solon Barocas


A Canon, The Canon

In order of appearance in the syllabus, without the course cadence markers…

  • Danah Boyd and Kate Crawford, Critical Questions for Big Data; In <paywalled>Information, Communication & Society, Volume 15, Issue 5 (A decade in Internet time: the dynamics of the Internet and society); 2012; DOI:10.1080/1369118X.2012.678878</paywalled>
    Subtitle: Provocations for a cultural, technological, and scholarly phenomenon
  • Tal Zarsky, The Trouble with Algorithmic Decisions; In Science, Technology & Human Values, Vol 41, Issue 1, 2016 (2015-10-14); ResearchGate.
    Subtitle: An Analytic Road Map to Examine Efficiency and Fairness in Automated and Opaque Decision Making
  • Cathy O’Neil, Weapons of Math Destruction; Broadway Books; 2016-09-06; 290 pages; ASIN:B019B6VCLO: Kindle: $12, paper: $10+SHT.
  • Frank Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information; Harvard University Press; 2016-08-29; 320 pages; ASIN:0674970845: Kindle: $10, paper: $13+SHT.
  • Executive Office of the President, President Barack Obama, Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights; The White House Office of Science and Technology Policy (OSTP); 2016-05; 29 pages; archives.
  • Lisa Gitelman (editor), “Raw Data” is an Oxymoron; Series: Infrastructures; The MIT Press; 2013-01-25; 192 pages; ASIN:B00HCW7H0A: Kindle: $20, paper: $18+SHT.
    Lisa Gitelman, Virginia Jackson; Introduction (6 pages)
  • Agre, “Surveillance and Capture: Two Models of Privacy”
  • Bowker and Star, Sorting Things Out
  • Auerbach, “The Stupidity of Computers”
  • Moor, “What is Computer Ethics?”
  • Hand, “Deconstructing Statistical Questions”
  • O’Neil, On Being a Data Skeptic
  • Domingos, “A Few Useful Things to Know About Machine Learning”
  • Luca, Kleinberg, and Mullainathan, “Algorithms Need Managers, Too”
  • Friedman and Nissenbaum, “Bias in Computer Systems”
  • Lerman, “Big Data and Its Exclusions”
  • Hand, “Classifier Technology and the Illusion of Progress” [Sections 3 and 4]
  • Pager and Shepherd, “The Sociology of Discrimination: Racial Discrimination in Employment, Housing, Credit, and Consumer Markets”
  • Goodman, “Economic Models of (Algorithmic) Discrimination”
  • Hardt, “How Big Data Is Unfair”
  • Barocas and Selbst, “Big Data’s Disparate Impact” [Parts I and II]
  • Gandy, “It’s Discrimination, Stupid”
  • Dwork and Mulligan, “It’s Not Privacy, and It’s Not Fair”
  • Sandvig, Hamilton, Karahalios, and Langbort, “Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms”
  • Diakopoulos, “Algorithmic Accountability: Journalistic Investigation of Computational Power Structures”
  • Lavergne and Mullainathan, “Are Emily and Greg more Employable than Lakisha and Jamal?”
  • Sweeney, “Discrimination in Online Ad Delivery”
  • Datta, Tschantz, and Datta, “Automated Experiments on Ad Privacy Settings”
  • Dwork, Hardt, Pitassi, Reingold, and Zemel, “Fairness Through Awareness”
  • Feldman, Friedler, Moeller, Scheidegger, and Venkatasubramanian, “Certifying and Removing Disparate Impact”
  • Žliobaitė and Custers, “Using Sensitive Personal Data May Be Necessary for Avoiding Discrimination in Data-Driven Decision Models”
  • Angwin, Larson, Mattu, and Kirchner, “Machine Bias”
  • Kleinberg, Mullainathan, and Raghavan, “Inherent Trade-Offs in the Fair Determination of Risk Scores”
  • Northpointe, COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity
  • Chouldechova, “Fair Prediction with Disparate Impact”
  • Berk, Heidari, Jabbari, Kearns, and Roth, “Fairness in Criminal Justice Risk Assessments: The State of the Art”
  • Hardt, Price, and Srebro, “Equality of Opportunity in Supervised Learning”
  • Wattenberg, Viégas, and Hardt, “Attacking Discrimination with Smarter Machine Learning”
  • Friedler, Scheidegger, and Venkatasubramanian, “On the (Im)possibility of Fairness”
  • Tene and Polonetsky, “Taming the Golem: Challenges of Ethical Algorithmic Decision Making”
  • Lum and Isaac, “To Predict and Serve?”
  • Joseph, Kearns, Morgenstern, and Roth, “Fairness in Learning: Classic and Contextual Bandits”
  • Barocas, “Data Mining and the Discourse on Discrimination”
  • Grgić-Hlača, Zafar, Gummadi, and Weller, “The Case for Process Fairness in Learning: Feature Selection for Fair Decision Making”
  • Vedder, “KDD: The Challenge to Individualism”
  • Lippert-Rasmussen, “‘We Are All Different’: Statistical Discrimination and the Right to Be Treated as an Individual”
  • Schauer, Profiles, Probabilities, And Stereotypes
  • Caliskan, Bryson, and Narayanan, “Semantics Derived Automatically from Language Corpora Contain Human-like Biases”
  • Zhao, Wang, Yatskar, Ordonez, and Chang, “Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints”
  • Bolukbasi, Chang, Zou, Saligrama, and Kalai, “Man Is to Computer Programmer as Woman Is to Homemaker?”
  • Citron and Pasquale, “The Scored Society: Due Process for Automated Predictions”
  • Ananny and Crawford, “Seeing without Knowing”
  • de Vries, “Privacy, Due Process and the Computational Turn”
  • Zarsky, “Transparent Predictions”
  • Crawford and Schultz, “Big Data and Due Process”
  • Kroll, Huey, Barocas, Felten, Reidenberg, Robinson, and Yu, “Accountable Algorithms”
  • Bornstein, “Is Artificial Intelligence Permanently Inscrutable?”
  • Burrell, “How the Machine ‘Thinks’”
  • Lipton, “The Mythos of Model Interpretability”
  • Doshi-Velez and Kim, “Towards a Rigorous Science of Interpretable Machine Learning”
  • Hall, Phan, and Ambati, “Ideas on Interpreting Machine Learning”
  • Grimmelmann and Westreich, “Incomprehensible Discrimination”
  • Selbst and Barocas, “Regulating Inscrutable Systems”
  • Jones, “The Right to a Human in the Loop”
  • Edwards and Veale, “Slave to the Algorithm? Why a ‘Right to Explanation’ is Probably Not the Remedy You are Looking for”
  • Duhigg, “How Companies Learn Your Secrets”
  • Kosinski, Stillwell, and Graepel, “Private Traits and Attributes Are Predictable from Digital Records of Human Behavior”
  • Barocas and Nissenbaum, “Big Data’s End Run around Procedural Privacy Protections”
  • Chen, Fraiberger, Moakler, and Provost, “Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals”
  • Robinson and Yu, Knowing the Score
  • Hurley and Adebayo, “Credit Scoring in the Era of Big Data”
  • Valentino-Devries, Singer-Vine, and Soltani, “Websites Vary Prices, Deals Based on Users’ Information”
  • The Council of Economic Advisers, Big Data and Differential Pricing
  • Hannak, Soeller, Lazer, Mislove, and Wilson, “Measuring Price Discrimination and Steering on E-commerce Web Sites”
  • Kochelek, “Data Mining and Antitrust”
  • Helveston, “Consumer Protection in the Age of Big Data”
  • Kolata, “New Gene Tests Pose a Threat to Insurers”
  • Swedloff, “Risk Classification’s Big Data (R)evolution”
  • Cooper, “Separation, Pooling, and Big Data”
  • Simon, “The Ideological Effects of Actuarial Practices”
  • Tufekci, “Engineering the Public”
  • Calo, “Digital Market Manipulation”
  • Kaptein and Eckles, “Selecting Effective Means to Any End”
  • Pariser, “Beware Online ‘Filter Bubbles’”
  • Gillespie, “The Relevance of Algorithms”
  • Buolamwini, “Algorithms Aren’t Racist. Your Skin Is just too Dark”
  • Hassein, “Against Black Inclusion in Facial Recognition”
  • Agüera y Arcas, Mitchell, and Todorov, “Physiognomy’s New Clothes”
  • Garvie, Bedoya, and Frankle, The Perpetual Line-Up
  • Wu and Zhang, “Automated Inference on Criminality using Face Images”
  • Haggerty, “Methodology as a Knife Fight”
    <snide>A metaphorical usage. Let hyperbole be your guide</snide>

Previously filled.

Code Dependent: Pros and Cons of the Algorithm Age | Pew Research

Code Dependent: Pros and Cons of the Algorithm Age; Pew Research Center; 2017-02-08; 87 pages; landing.
Teaser: Algorithms are aimed at optimizing everything. They can save lives, make things easier and conquer chaos. Still, experts worry they can also put too much control in the hands of corporations and governments, perpetuate bias, create filter bubbles, cut choices, creativity and serendipity, and could result in greater unemployment.

tl;dr → there be dragons; this is an important area; the future is at stake; the alarum has been sounded; there are seers who can show us the way. In their own words.


Future of the Internet, of Pew Research & Elon University.

Table of Contents

  • Overview
  • Themes illuminating concerns and challenges
  • Key experts’ thinking about the future impacts of algorithms
  • About this canvassing of experts
  • Theme 1: Algorithms will continue to spread everywhere
  • Theme 2: Good things lie ahead
  • Theme 3: Humanity and human judgment are lost when data and predictive modeling become paramount
  • Theme 4: Biases exist in algorithmically-organized systems
  • Theme 5: Algorithmic categorizations deepen divides
  • Theme 6: Unemployment will rise
  • Theme 7: The need grows for algorithmic literacy, transparency and oversight
  • Acknowledgments




  • Pew Research Center of the Pew Charitable Trusts
  • Imagining the Internet Center at Elon University
  • <ahem>the Singularity enthusiasts … .</ahem>


  1. Algorithms will continue to spread everywhere
  2. Good things lie ahead
  3. Humanity and human judgment are lost when data and predictive modeling become paramount
  4. Biases exist in algorithmically-organized systems
  5. Algorithmic categorizations deepen divides
  6. Unemployment will rise
  7. The need grows for algorithmic literacy, transparency and oversight.


  • <snicker>Artificial Intelligence (AI)</snicker>
  • algocratic governance
  • surveillance capitalism
  • information capitalism
  • topsight
  • black-box nature [of]
  • digital scientism
  • obedience score


  • Aneesh Aneesh, Stanford University.
  • Peter Diamandis, CEO, XPrize Foundation.
  • Shoshana Zuboff, Harvard.
  • Jim Warren, activist.
  • Terry Langendoen, expert, U.S. National Science Foundation.
  • Patrick Tucker, technology editor at Defense One.
  • Paul Jones, clinical professor at the University of North Carolina-Chapel Hill and director of ibiblio.org.
  • David Krieger, director of the Institute for Communication & Leadership IKF.
  • Galen Hunt, partner research manager at Microsoft Research NExT.
  • Alf Rehn, professor and chair of management and organization at Åbo Akademi University in Finland.
  • Andrew Nachison, founder at We Media.
  • Luis Lach, president of the Sociedad Mexicana de Computación en la Educación, A.C.
  • Frank Pasquale, professor of law, University of Maryland.
  • Jeff Jarvis, reporter.
  • Cindy Cohn, executive director at the Electronic Frontier Foundation.
  • Bernardo A. Huberman, senior fellow and director of the Mechanisms and Design Lab at HPE Labs, Hewlett Packard Enterprise.
  • Marcel Bullinga, expert.
  • Michael Rogers, principal, Practical Futurist.
  • Brian Christian, Tom Griffiths.
  • David Gelernter.
  • Deloitte Global (anonymous contributors).
  • Barry Chudakov, founder and principal at Sertain Research and StreamFuzion Corp.
  • Stephen Downes, staff, National Research Council of Canada.
  • Bart Knijnenburg, assistant professor in human-centered computing at Clemson University.
  • Justin Reich, executive director at the MIT Teaching Systems Lab.
  • Dudley Irish, tradesman (a coder).
  • Ryan Hayes, owner of Fit to Tweet.
  • Adam Gismondi, a visiting scholar at Boston College.
  • Susan Etlinger, staff, Altimeter Group.
  • Chris Kutarna, fellow, Oxford Martin School.
  • Vinton Cerf, Internet Hall of Fame, vice president and chief internet evangelist at Google.
  • Cory Doctorow, writer, computer science activist-in-residence at MIT Media Lab and co-owner of Boing Boing.
  • Jonathan Grudin, Microsoft.
  • Doc Searls, director, Project VRM, Berkman Center, Harvard University.
  • Marc Rotenberg, executive director of the Electronic Privacy Information Center.
  • Richard Stallman, Internet Hall of Fame, president of the Free Software Foundation.
  • David Clark, Internet Hall of Fame, senior research scientist at MIT.
  • Baratunde Thurston, Director’s Fellow at MIT Media Lab, ex-digital director of The Onion.
  • Anil Dash, pundit.
  • John Markoff, New York Times.
  • Danah Boyd (“danah boyd”), founder, Data & Society, an advocacy group.
  • Henning Schulzrinne, Internet Hall of Fame, professor at Columbia University.
  • Amy Webb, futurist and CEO at the Future Today Institute.
  • Jamais Cascio, distinguished fellow at the Institute for the Future.
  • Mike Liebhold, senior researcher and distinguished fellow at the Institute for the Future.
  • Ben Shneiderman, professor of computer science at the University of Maryland.
  • David Weinberger, senior researcher at the Harvard Berkman Klein Center for Internet & Society.


Previously filled.

Roundup of miscellaneous notes, captured and organized

Blockchain Culture

The Seven(Hundred) Dwarves

  • Blockstack(.org) - The New Decentralized Internet
    • blockstack, at GitHub
    • Union Square Ventures (USV)
    • Promotion
      • Staff (USV); The Blockchain App Stack; In Their Blog; 2016-08-08.
      • Blockstack Unveils A Browser For The Decentralized Web; Laura Shin; In Forbes; 2017-05-15.
        tl;dr → <quote>Tuesday, at the main blockchain industry conference, Consensus, one of the companies working on this new decentralized web, Blockstack, which has $5.5 million in funding from Union Square Ventures and AngelList cofounder Naval Ravikant, released a browser add-on that enables that and more.<snip/>The add-on enables a browser to store the user’s identity information by a local key on the consumer’s device.</quote>; Ryan Shea, cofounder.
  • Everyone has something here.

Bluetooth Culture

Bluetooth LE (BLE)

  • and?

Bluetooth 5

  • Something about mesh networking
  • Something about the standard being released “summer 2017.”

C++ Culture


  • The roadmap onto the twenties.


  • MapReduce, from ETL or EU somewhere.
  • Kyoto Cabinet, Typhoon, Tycoon
  • Virtual Reality packages
  • Ctemplate, Olafud Spek (?)
  • Robot Operating System (ROS)
  • libgraphqlparser – A GraphQL query parser in C++ with C and C++ APIs

Computing Culture

Ubicomp, <ahem>Pervicomp</ahem>

  • Rich Gold
  • Mark Weiser

Dev(Ops) Culture

Futures Cult(ure)


  • Cory Doctorow, the coming war against general purpose computing, an article; WHERE?
  • Cory Doctorow, dystopia contra utopia, an article; WHERE?


  • Cory Doctorow, various works

Imagine a World In Which…

  • Stocks vs Flows
  • Chaos vs Stability
  • Permission vs Permissionless
  • Civil Society ↔ Crony Society
    • Transparency
    • Deals
    • Priorities
  • Predictive Technology “just works”
    • is trusted
    • is eventual
    • is law
    • “is” equates with “ought”

Fedora Culture

  • Flatpak

Fedora 26 Notes

  • nmcli connection reload
  • nmcli connection down "$i"; nmcli connection up "$i"
  • EUI-64 addressing (ipv6.addr-gen-mode eui64) must be configured manually

Internet of (unpatchable) Thingies (IoT)

  • MQTT
  • Mosquitto
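MQTT subscriptions use hierarchical topic filters in which `+` matches exactly one level and `#` matches the remainder; this matching is what a broker such as Mosquitto performs on every publish. A minimal matcher sketch in Python:

```python
def topic_matches(filter_: str, topic: str) -> bool:
    """MQTT topic-filter matching: '+' matches one level, '#' the rest."""
    flevels, tlevels = filter_.split("/"), topic.split("/")
    for i, f in enumerate(flevels):
        if f == "#":
            return True          # multi-level wildcard: matches the remainder
        if i >= len(tlevels):
            return False         # filter is deeper than the topic
        if f != "+" and f != tlevels[i]:
            return False         # literal level mismatch
    return len(flevels) == len(tlevels)

print(topic_matches("home/+/temp", "home/kitchen/temp"))  # True
print(topic_matches("home/#", "home/kitchen/temp/raw"))   # True
print(topic_matches("home/+/temp", "home/kitchen/hum"))   # False
```

This sketches the matching rules only; shared subscriptions and `$SYS` topics add further cases a real broker handles.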

Language Lifestyles

Go Lang

  • Go for it.
  • A package manager


  • theory
  • implementation?

Rust Lang

  • Was there a NoStarch book?


  • C++20?
    hey, surely someone has modules working by now, eh?



  • Repig, in C++, with threads, in an NVMe


  • sure, what?


  • Interface to the (discontinued) Proliphix thermostats


  • CDN Store
  • Picture Store
  • Document Cache (store & forward)


  • Firefox Tiles

SCOLD Experiences

SCOLD near-syntax, common errors

  • #import <hpp>
  • missing #divert
  • #using, a declaration
  • #origin
  • #namespace
  • $@


Build System
  • --with-std-scold or maybe --with-scold
  • vecdup, like strdup
  • vectree, like strfree→free
  • json::check::Failure or json::Cast.
  • namespace json::is
    • is_array
    • is_null
    • is_object
  • json::as<…>(…)
  • pathify(…)
  • column result
  • concept guarding the template parameter, from C++17
  • typed strings
    • location
    • path
    • etc.
  • and

Surveillance Culture


  • Eigenpeople
  • Eigenpersonas
  • Personality modeling


Yves-Alexandre de Montjoye, Jordi Quoidbach, Florent Robic, Alex (Sandy) Pentland; Predicting Personality Using Novel Mobile Phone-Based Metrics; In: A.M. Greenberg, W.G. Kennedy, N.D. Bos (editors) Social Computing, Behavioral-Cultural Modeling and Prediction as Proceedings of Social Computing, Behavioral (SBP 2013), Lecture Notes in Computer Science, vol 7812; 2013; paywalls: Springer, ACM. Previously filled.
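The "eigenpeople" notion above is ordinary PCA on a users-by-features matrix: the leading singular vectors are the eigenpersonas, and each user is a weighted mix of them. A toy sketch on made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Rows: users; columns: behavioral features (call counts, app usage, ...).
X = rng.normal(size=(100, 6))
Xc = X - X.mean(axis=0)            # center features before PCA

# SVD of the centered matrix: the rows of Vt are the "eigenpersonas".
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
eigenpersonas = Vt[:k]             # k archetypal behavior patterns
weights = Xc @ eigenpersonas.T     # each user's mix of the archetypes

print(eigenpersonas.shape, weights.shape)  # (2, 6) (100, 2)
```

The personality-prediction papers in this vein then regress survey-measured traits on such low-dimensional behavioral weights.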


  • POSS (Post Open Source Software)
    defined as: if everything is on GitHub, then who needs licenses?
    Was this ever amplified?
    Certainly it is facially incorrect and facile.


  • Rob Horning; Sick of Myself, an essay; In Real Life Magazine; 2017-05-17
    tl;dr → riffing on happiness; Facebook. Is. Bad. Q.E.D. Cites R.D. Laing, The Divided Self; John Cheney-Lippold, We Are Data; Donald Mackenzie.
  • Michael Nelson; University of California, Riverside.

Purposive directionality

  • increase
    • predictability
  • reduce
    • uncertainty
    • variability


Uncomprehensible, Unknown, Unpossible

  • Sunlight, a package? FOSS?

Networks of Control | Cracked Labs


Wolfie Christl and Sarah Spiekermann; Networks of Control; Facultas, Vienna; 2016; 185 pages; landing.
Teaser: A Report on Corporate Surveillance, Digital Tracking, Big Data & Privacy

Table of Contents

  1. Preface
  2. Introduction
  3. Analyzing Personal Data
    1. Big Data and predicting behavior with statistics and data mining
    2. Predictive analytics based on personal data: selected examples
      1. The “Target” example: predicting pregnancy from purchase behavior
      2. Predicting sensitive personal attributes from Facebook Likes
      3. Judging personality from phone logs and Facebook data
      4. Analyzing anonymous website visitors and their web searches
      5. Recognizing emotions from keyboard typing patterns
      6. Forecasting future movements based on phone data
      7. Predicting romantic relations and job success from Facebook data
    3. De-anonymization and re-identification
  4. Analyzing Personal Data in Marketing, Finance, Insurance and Work
    1. Practical examples of predicting personality from digital records
    2. Credit scoring and personal finance
    3. Employee monitoring, hiring and workforce analytics
    4. Insurance and healthcare
    5. Fraud prevention and risk management
    6. Personalized price discrimination in e-commerce
  5. Recording Personal Data – Devices and Platforms
    1. Smartphones, mobile devices and apps – spies in your pocket?
    2. Car telematics, tracking-based insurance and the Connected Car
      1. Data abuse by apps
    3. Wearables, fitness trackers and health apps – measuring the self
      1. A step aside – gamification, surveillance and influence on behavior
      2. Example: Fitbit’s devices and apps
      3. Transmitting data to third parties
      4. Health data for insurances and corporate wellness
    4. Ubiquitous surveillance in an Internet of Things?
      1. Examples – from body and home to work and public space
  6. Data Brokers and the Business of Personal Data
    1. The marketing data economy and the value of personal data
    2. Thoughts on a ‘Customers’ Lifetime Risk’ – an excursus
    3. From marketing data to credit scoring and fraud detection
    4. Observing, inferring, modeling and scoring people
    5. Data brokers and online data management platforms
    6. Cross-device tracking and linking user profiles with hidden identifiers
    7. Case studies and example companies
      1. Acxiom – the world’s largest commercial database on consumers
      2. Oracle and their consumer data brokers Bluekai and Datalogix
      3. Experian – expanding from credit scoring to consumer data
      4. arvato Bertelsmann – credit scoring and consumer data in Germany
      5. LexisNexis and ID Analytics – scoring, identity, fraud and credit risks
      6. Palantir – data analytics for national security, banks and insurers
      7. Alliant Data and Analytics IQ – payment data and consumer scores
      8. Lotame – an online data management platform (DMP)
      9. Drawbridge – tracking and recognizing people across devices
      10. Flurry, InMobi and Sense Networks – mobile and location data
      11. Adyen, PAY.ON and others – payment and fraud detection
      12. MasterCard – fraud scoring and marketing data
  7. Summary of Findings and Discussion of its Societal Implications
    1. Ubiquitous data collection
    2. A loss of contextual integrity
    3. The transparency issue
    4. Power imbalances
    5. Power imbalances abused: systematic discrimination and sorting
    6. Companies hurt consumers and themselves
    7. Long term effects: the end of dignity?
    8. Final reflection: From voluntary to mandatory surveillance?
  8. Ethical Reflections on Personal Data Markets (by Sarah Spiekermann)
    1. A short Utilitarian reflection on personal data markets
    2. A short deontological reflection on personal data markets
    3. A short virtue ethical reflection on personal data markets
    4. Conclusion on ethical reflections
  9. Recommended Action
    1. Short- and medium term aspects of regulation
    2. Enforcing transparency from outside the “black boxes”
    3. Knowledge, awareness and education on a broad scale
    4. A technical and legal model for a privacy-friendly digital economy
  10. List of tables
  11. List of figures
  12. References




  • Anna Fielder, Chair of Privacy International
  • Courtney Gabrielson, International Association of Privacy Professionals (IAPP)


There are 677 footnotes, which are distinct from the references.
There are 211 references.

Separately filled.

Corporate Surveillance in Everyday Life | Cracked Labs

Wolfie Christl; Corporate Surveillance in Everyday Life. How Companies Collect, Combine, Analyze, Trade, and Use Personal Data on Billions; Cracked Labs, Vienna; 2017-06; 93 pages.

Teaser: <shrill>How thousands of companies monitor, analyze, and influence the lives of billions. Who are the main players in today’s digital tracking? What can they infer from our purchases, phone calls, web searches, and Facebook likes? How do online platforms, tech companies, and data brokers collect, trade, and make use of personal data?</shrill>

Table of Contents

  1. Background and Scope
  2. Introduction
  3. Relevant players within the business of personal data
    1. Businesses in all industries
    2. Media organizations and digital publishers
    3. Telecom companies and Internet Service Providers
    4. Devices and Internet of Things
    5. Financial services and insurance
    6. Public sector and key societal domains
    7. Future developments?
  4. The Risk Data Industry
    1. Rating people in finance, insurance and employment
    2. Credit scoring based on digital behavioral data
    3. Identity verification and fraud prevention
    4. Online identity and fraud scoring in real-time
    5. Investigating consumers based on digital records
  5. The Marketing Data Industry
    1. Sorting and ranking consumers for marketing
    2. The rise of programmatic advertising technology
    3. Connecting offline and online data
    4. Recording and managing behaviors in real-time
    5. Collecting identities and identity resolution
    6. Managing consumers with CRM, CIAM and MDM
  6. Examples of Consumer Data Broker Ecosystems
    1. Acxiom, its services, data providers, and partners
    2. Oracle as a consumer data platform
    3. Examples of data collected by Acxiom and Oracle
  7. Key Developments in Recent Years
    1. Networks of digital tracking and profiling
    2. Large-scale aggregation and linking of identifiers
    3. “Anonymous” recognition
    4. Analyzing, categorizing, rating and ranking people
    5. Real-time monitoring of behavioral data streams
    6. Mass personalization
    7. Testing and experimenting on people
    8. Mission creep – everyday life, risk assessment and marketing
  8. Conclusion
  9. Figures
  10. References



  • Omer Tene
  • Jules Polonetsky


Yes. A work this polished could not stay hidden for long.


The web variant is summary material.

  1. Analyzing people
  2. Analyzing people in finance, insurance and healthcare
  3. Large-scale collection and use of consumer data
  4. Data brokers and the business of personal data
  5. Real-time monitoring of behaviors across everyday life
  6. Linking, matching and combining digital profiles
  7. Managing consumers and behaviors, personalization and testing
  8. Dragnet – everyday life, marketing data and risk analytics
  9. Mapping the commercial tracking and profiling landscape
  10. Towards a society of pervasive digital social control?


There are 601 footnotes, which are distinct from the references.
There are 102 references.

Previously filled.

Beyond Public Key Encryption | Matthew Green

Matthew Green; Beyond Public Key Encryption; In His Blog entitled A Few Thoughts on Cryptographic Engineering; 2017-07-02.
Matthew Green, professor, Johns Hopkins University.

tl;dr → overview & history of Identity Based Cryptography and allied arts.


  • Eugen Belyakoff, an artist, The Noun Project (licensed artwork, specifically communicative graphics)
  • Voltage Security, now Hewlett-Packard Enterprise (HPE)
  • IBE systems effectively “bake in” key escrow
  • Clifford Cocks (GCHQ) discovered the RSA cryptosystem circa five years before Rivest, Shamir & Adleman did;
    the Ellis documents record the allied discovery of non-secret (public-key) encryption
  • Boneh-Franklin Scheme, 2001

    • elliptic curves
    • support efficient bilinear maps (pdf)
  • Attribute-Based Encryption (ABE)
    think: biometric & encryption; record-level & field-level database access encryption

    • Sahai & Waters
    • “threshold gate”.
    • fuzzy IBE, or not.
    • key insight: a threshold gate can be used to implement the boolean AND and OR gates
    • ciphertext policy
  • Functional Encryption iacr:2010/543
    Concept: embed arbitrary computer programs? in the attributes of ABE, iacr:2013/337, arXiv:1210.5287
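The Sahai & Waters threshold-gate observation is easy to see with plain Shamir secret sharing: an n-of-n threshold behaves as boolean AND, a 1-of-n threshold as OR. A toy Python sketch of just that threshold logic (no pairings, nothing ABE-specific):

```python
import random

# Toy Shamir secret sharing over a prime field -- just the threshold
# logic, none of the pairing-based machinery of actual ABE.
P = 2**127 - 1  # a Mersenne prime

def share(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

secret = 123456789
# AND: an n-of-n threshold -- every attribute's share is required.
assert reconstruct(share(secret, 3, 3)) == secret
# OR: a 1-of-n threshold -- any single attribute's share suffices.
assert reconstruct(share(secret, 1, 3)[:1]) == secret
```

With intermediate thresholds (t-of-n) one gets the "fuzzy" matching of Sahai-Waters fuzzy IBE, e.g. a biometric that matches on most but not all attributes.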



  • Attribute-Based Encryption (ABE)
  • Diffie-Hellman Key Exchange (DHKE)
  • Functional Encryption (FE?, <aside>everything gets an acronym</aside>)
  • Identity Based Encryption (IBE); a.k.a. Identity-Based Cryptography
  • Identity-Based Encryption (IBE)
  • Identity-Based Signature (IBS)
  • Key Generation Authority.
  • Master Public Key (MPK)
  • Master Secret Key (MSK)
  • Pretty Good Privacy (PGP)
  • Public Key Encryption (PKE)
  • Public Key Infrastructure (PKI)
  • Rivest-Shamir-Adleman (RSA), a cryptosystem
The Roles
  • Alice
  • Bob
  • Eve
  • Mallory

Key Servers

At GitHub




At arXiv

At Semantic Scholar


In Jimi Wales’ Wiki

Previously filled.

Online Privacy and ISPs | Institute for Information Security & Privacy, Georgia Tech

Peter Swire, Justin Hemmings, Alana Kirkland; Online Privacy and ISPs; a whitepaper; Institute for Information Security & Privacy, Georgia Tech; 2016-05; 131 pages.
Teaser: ISP Access to Consumer Data is Limited and Often Less than Access by Others

  • Peter Swire
    • Associate Director,
      The Institute for Information
      Security & Privacy at Georgia Tech
    • Huang Professor of Law,
      Georgia Tech Scheller College of Business
      Senior Counsel, Alston & Bird LLP
  • Justin Hemmings,
    • Research Associate,
      Georgia Tech Scheller College of Business
    • Policy Analyst
      Alston & Bird LLP
  • Alana Kirkland
    • Associate Attorney, Alston & Bird LLP

tl;dr → ISP < Media; ISPs are not omnipotent; ISPs see less than you think; Consumer visibility is mitigated by allowed usage patterns: cross-ISP, cross-device, VPN, DNS obfuscation, encryption.  Anyway, Facebook has it all and more.

Consumer profiling observation is already occurring by other means anyway.

<quote> In summary, based on a factual analysis of today’s Internet ecosystem in the United States, ISPs have neither comprehensive nor unique access to information about users’ online activity. Rather, the most commercially valuable information about online users, which can be used for targeted advertising and other purposes, is coming from other contexts. Market leaders are combining these contexts for insight into a wide range of activity on each device and across devices. </quote>

<translation> The other guys are already doing it, why stop ISPs? </translation>

ISP observation of consumers is neither Comprehensive nor Unique

<quote> The Working Paper addresses two fundamental points. First, ISP access to user data is not comprehensive – technological developments place substantial limits on ISPs’ visibility. Second, ISP access to user data is not unique – other companies often have access to more information and a wider range of user information than ISPs. Policy decisions about possible privacy regulation of ISPs should be made based on an accurate understanding of these facts. </quote>

<view> It’s unargued why comprehensive or unique are bright-line standards of anything at all. </view>

Previously filled.



  • ISPs < Media
    The dumb-pipe, bit-shoving, ISPs see less than media services, who see semantic richness.
  • Cross-device is the new nowadays.
  • Encryption is everywhere.


  • a technical statement
  • contra “use” which is an action by a person
Cross-Device Tracking
Logged-In, Cross-Context Tracking
Not Logged-In, Cross-Context Tracking
Cross-Device Tracking
  • Frequency Capping
  • Attribution
  • Improved Advertising Targeting
  • Sequenced Advertising
  • Tracking Simultaneity
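Frequency capping, the first purpose listed above, is mechanically simple; a hypothetical Python sketch of a per-(user, campaign) impression counter (real ad servers cap per time window, and cross-device tracking is what lets the counter follow the user across devices):

```python
from collections import defaultdict

# Hypothetical frequency-capping sketch: a per-(user, campaign)
# impression counter. Names and the cap value are invented for
# illustration; they are not from the whitepaper.
class FrequencyCap:
    def __init__(self, cap=3):
        self.cap = cap
        self.seen = defaultdict(int)

    def should_serve(self, user_id, campaign):
        key = (user_id, campaign)
        if self.seen[key] >= self.cap:
            return False  # user already saw this ad `cap` times
        self.seen[key] += 1
        return True

fc = FrequencyCap(cap=2)
assert [fc.should_serve("bob", "shoes") for _ in range(3)] == [True, True, False]
```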
Limits the use of “data” (facts about consumers)
  • at the point of collection
  • at the point of use
Location of a consumer
  • Coarse contra Precise
  • Current contra Historical


The document has both a Preface and an Executive Summary, so the journeyperson junior policy wonk can approach the material at whatever level of complexity their time budget and training afford.


  • Technological Developments Place Substantial Limits on ISPs’ Visibility into Users’ Online Activity:
    1. From a single stationary device to multiple mobile devices and connections.
    2. Pervasive encryption.
    3. Shift in domain name lookup.
  • Non-ISPs Often Have Access to More and a Wider Range of User Information than ISPs:
    1. Non-ISP services have unique insights into user activity.
    2. Non-ISPs dominate in cross-context tracking.
    3. Non-ISPs dominate in cross-device tracking.

Executive Summary

  • Technological Developments Place Substantial Limits on ISPs’ Visibility into Users’ Online Activity:
    1. From a single stationary device to multiple mobile devices and connections.
    2. Pervasive encryption.
    3. Shift in domain name lookup.
  • Non-ISPs Often Have Access to More and a Wider Range of User Information than ISPs:
    1. Non-ISP services have unique insights into user activity.
      • social networks
      • search engines
      • webmail and messaging
      • operating systems
      • mobile apps
      • interest-based advertising
      • browsers
      • Internet video
      • e-commerce.
    2. Non-ISPs dominate in cross-context tracking.
    3. Non-ISPs dominate in cross-device tracking.

Table Of Contents

Online Privacy and ISPs: ISP Access to Consumer Data is Limited and Often Less than Access by Others

Summary of Contents:

  • Preface
  • Executive Summary
    • Appendix 1: Some Key Terms
  • Chapter 1: Limited Visibility of Internet Service Providers Into Users’ Internet Activity
    • Appendix 1: Encryption for Top 50 Web Site
    • Appendix 2: The Growing Prevalence of HTTPS as Fraction of Internet Traffic
  • Chapter 2: Social Networks
  • Chapter 3: Search Engines
  • Chapter 4: Webmail and Messaging
  • Chapter 5: How Mobile Is Transforming Operating Systems
  • Chapter 6: Interest-Based Advertising (“IBA”) and Tracking
  • Chapter 7: Browsers, Internet Video, and E-commerce
  • Chapter 8: Cross-Context Tracking
    • Appendix 1: Cross-Context Chart Citations
  • Chapter 9: Cross-Device Tracking
  • Chapter 10: Conclusion


  • Interest-Based Advertising (IBA)
  • Tracking
  • Location
    • Coarse Location
    • Precise Location
  • Natural Language Conversation Robots (a.k.a. ‘bots)
    • Siri, Apple
    • Now, Google Now
    • Cortana, Microsoft


Also see page 124 of The Work.

  • Availability → contra Use
  • Big Data → data which is very big.
  • Broadband Internet Access Services → an ISP, but not a dialup service
    as used in the Open Internet Order, of the FCC, 2015-24, Appendix A.
  • Chat bot → <fancy>Personal Digital Assistance</fancy>
  • Cookie
  • CPNI → Customer Proprietary Network Information
    47 U.S.C. §222. Also, Section 222 are at 47 C.F.R.§ 64.2001 et seq.
  • Cross-Context
  • Cross-Device
  • DNS → Domain Name Service
  • DPI → Deep Packet Inspection
  • Edge Providers → smart pipes, page stuffing, click-baiting; e.g. Akamai, CloudFlare, CloudFront as exemplars.
  • End-to-End
    • Argument
    • Encryption
  • Factual Analysis → this means something different to lawyers contra engineers.
  • FCC → Federal Communications Commission
  • Form
    Form Autofill, a browser feature
  • FTC → Federal Trade Commission
  • FTT → Freedom To Tinker, a venue, an oped
  • GPS → Global Positioning System
  • HTTP → you know.
  • HTTPS → you know.
  • IBA → Interest-Based Advertising
  • IP → Internet Protocol
    • Address
  • IoT → Internet of Thingies Toys Unpatchables
  • IRL → <culture who=”The Youngs”>In Real Life</culture>
  • ISP → Internet Service Provider
  • Last Mile, of an ISP
  • Location
    • Coarse → “city”- “DMA”- or “country”-level
    • Precise → an in-industry definition exists
  • Metadata → indeed.
  • OBA → Online Behavioral Advertising
  • Open Internet Order, of the FCC.
  • OS → <ahem>Operating System</ahem>
  • Party System
    • First Party
    • [Second Party], no one cares.
    • Third Party
    • [Fourth Party]
  • Personal Information → the sacred stuff, the poisonous stuff
  • Personal Digital Assistant → a trade euphemism for NLP + command patterns for IVR; all the 1st-tier shops have one nowadays.
    • Siri → Apple
    • Now → Google
    • Cortana → Microsoft
  • Scanning
  • Section 222, see Title II
  • SSL → you mean TLS
  • Title II, of the Telecommunications Act.
    • Section 222,
  • Tracking
    • (Across-) Cross-Context
    • (Across-) Cross-Device
  • TLS → you mean SSL
  • UGC → User-Generated Content (unsupervised filth; e.g. comment spam)
  • URL → you know.
  • VPN → run one.
  • WiFi → for some cultural reason “wireless” turns into “Wireless Fidelity” and “WiFi”
  • Working Paper → an unreviewed work product.
  • Visibility → bookkeeping by the surveillor observer.



Of course, it’s a legal-style policy whitepaper. Of course there are references; they are among the NN footnotes. In rough order of appearance in the work.


Persuasion and the other thing: A critique of big data methodologies in politics | Ethnography Matters

Molly Sauter; Persuasion and the other thing: A critique of big data methodologies in politics; In Ethnography Matters; 2017-05-24.

tl;dr → 3026 words. Big Data (which is so very big) is bad. The sphere is problematized. A problematic which situates the hegemons is synthesized via the dialectic. A mode of resistance is posited.

<ahem>… and by way of brief rebuttal: The Computers and The Establishment that owns & operates The Computers, their work inuring to the mutual benefit of them both, individually and severally, are smarter than all that (c.f. the trivial use of “grep -v”), and also the suggested modality of dissent violates the T&C which was previously freely given and binds & constrains individual future actions; its unilateral repudiation makes the performer at once dishonest, conflicted, and an outlaw who deserves no quarter; not in theory, not in practice, or under the reigning jurisdictional supervision (c.f. 18 U.S.C. Section 1001, as opined).</ahem>

Previously filled.


  • Cambridge Analytica
  • Donald Trump
  • Brexit Campaign
  • Facebook
    • “likes”
    • targeted nudges
  • Mother Jones
  • The Guardian
  • SCL Group
  • Apple (Computer) Inc.


  • There is not enough consent (from the subjects)
  • <quote>Democracy shifts from a form of governance at least theoretically concerned with public debate and persuasion to one focused on private, opaque manipulation and emotional coercion.</quote>


The obfuscation schemes, taxonomized in Brunton & Nissenbaum:

  • noisy bots
  • “like-farming,” i.e. spamming.
  • TrackMeNot
    a browser extension which generates abusive search-engine queries.
  • AdNauseam
    a browser extension which generates abusive click streams.
  • FaceCloak
    Something about storing data “off Facebook,” yet performing the data “on Facebook.”
  • Bayesian Flooding … sounds fancy; it means creating profile- & page- spam entries on Facebook.
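The noise-based schemes above all amount to mixing decoys into the genuine signal. A hypothetical TrackMeNot-flavored sketch in Python (SEED_TERMS and the noise ratio are invented for illustration; the real extension seeds its decoys from RSS feeds and popular-search lists):

```python
import random

# Illustrative obfuscation-by-noise generator, not TrackMeNot's actual
# method: each real query is hidden among decoys drawn from a seed list.
SEED_TERMS = ["weather", "recipes", "football scores", "tax forms",
              "movie times", "used cars", "headphones", "news today"]

def obfuscated_stream(real_queries, noise_ratio=3, rng=random):
    """Yield each real query hidden among `noise_ratio` decoys, so an
    observer of the wire sees a mixed stream."""
    for q in real_queries:
        decoys = [rng.choice(SEED_TERMS) for _ in range(noise_ratio)]
        slot = rng.randrange(noise_ratio + 1)  # random position for the real query
        yield from decoys[:slot] + [q] + decoys[slot:]

stream = list(obfuscated_stream(["sensitive query"]))
assert len(stream) == 4 and "sensitive query" in stream
```

The weakness Sauter's critics point at is visible even here: a decoy distribution drawn from a fixed seed list is itself a fingerprint, and trivial filtering (the "grep -v" quip below) can strip it.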


Unless otherwise noted, persons are credited as “an activist.”

  • Finn Brunton
    with Helen Nissenbaum
  • Michal Kosinski, a bad guy in the pantheon
    with et al. as David Stillwell, Thore Graepel
  • Helen Nissenbaum
    with Finn Brunton (for symmetry)
  • Kelly Oliver
  • Molly Sauter
  • Zeynep Tufekci
  • Sara Marie Watson


… sounds fancy, and more than a little dangerous (<quote> cacklingly evil</quote>).  In rough order of appearance.

  • psychographics
  • algorithmic nudging
  • entitlements (<quote>held by advertisers, tech firms, and researchers who deploy big data analytics in support of political campaigns or other political projects</quote>)
  • sense of entitlement
  • subjectivity (something about having agency; being such is good)
    objects (data objects); something about not having agency; being such is bad.
  • obfuscation
  • sabotage
    <quote>sabotaging the efficacy of the methodology in general, to resist attempts to be read, known, and manipulated.</quote>
  • emotional contagion; c.f. Facebook, an ”experiment,” 2014
  • nudge (contra shove)
  • algorithmic modeling → “opinions embedded in mathematics” [page 21, O'Neil].
  • otherness
  • knowability
  • digital shadow-selves
  • a paradoxical problem
    wow man, dig it … a paradox, a problem with a paradox, that’s like a paradox².
  • data broker
  • entitlement of inference
    <quote>a certain entitlement of inference</quote>
    <quote>the entitlement of inference on display</quote>
  • influence techniques
    secret or opaque influence techniques
  • consent of the governed
    meaningful consent
  • inferential modeling synthesizes (rather than merely collects) non-disclosed information
  • opting out
    social media abstinence
  • data doppelganger
  • pervasive surveillance and modeling systems
  • obfuscation
    <quote>creates noise, either at the level of the platform or the individual profile</quote>


Trajectory Recovery from Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data | Xu, Tu, Li, Zhang, Fu, Jin

Fengli Xu, Zhen Tu, Yong Li, Pengyu Zhang, Xiaoming Fu, Depeng Jin; Trajectory Recovery From Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data; In Proceedings of the Conference on the World Wide Web (WWW); 2017-02-21 (2017-02-25); 10 pages; arXiv:1702.06270

tl;dr → probabilistic individuation from timestamped aggregated population location records.


Human mobility data has been ubiquitously collected through cellular networks and mobile applications, and publicly released for academic research and commercial purposes for the last decade. Since releasing individual’s mobility records usually gives rise to privacy issues, datasets owners tend to only publish aggregated mobility data, such as the number of users covered by a cellular tower at a specific timestamp, which is believed to be sufficient for preserving users’ privacy. However, in this paper, we argue and prove that even publishing aggregated mobility data could lead to privacy breach in individuals’ trajectories. We develop an attack system that is able to exploit the uniqueness and regularity of human mobility to recover individual’s trajectories from the aggregated mobility data without any prior knowledge. By conducting experiments on two real-world datasets collected from both mobile application and cellular network, we reveal that the attack system is able to recover users’ trajectories with accuracy about 73%~91% at the scale of tens of thousands to hundreds of thousands users, which indicates severe privacy leakage in such datasets. Through the investigation on aggregated mobility data, our work recognizes a novel privacy problem in publishing statistic data, which appeals for immediate attentions from both academy and industry.
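The linking step the attack exploits can be caricatured in a few lines: at each timestep only the aggregate count per tower is visible, and trajectories are recovered by choosing the assignment that minimizes total movement (the regularity prior). A toy Python sketch, with brute-force search standing in for the Hungarian-algorithm matching the paper uses:

```python
from itertools import permutations

# Toy version of trajectory recovery from aggregates. Tower positions
# are 1-D integers here purely for illustration.
def link(prev_positions, next_counts):
    """Assign each user to a tower at the next timestep, consistent with
    the aggregate counts, minimizing total movement (regularity)."""
    towers = [t for t, c in next_counts.items() for _ in range(c)]
    return list(min(permutations(towers),
                    key=lambda perm: sum(abs(a - b)
                                         for a, b in zip(prev_positions, perm))))

t0 = [0, 10]                  # assumed (known) starting positions of two users
t1 = link(t0, {1: 1, 9: 1})   # aggregates only: one user at tower 1, one at tower 9
t2 = link(t1, {2: 1, 8: 1})
# Each user's sojourn trail is individuated from purely aggregate data:
assert list(zip(t0, t1, t2)) == [(0, 1, 2), (10, 9, 8)]
```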



  1. R. Wang, M. Xue, K. Liu, et al. Data-driven privacy analytics: A wechat case study in location-based social networks. In Wireless Algorithms, Systems, and Applications. Springer, 2015.
  2. Apple’s commitment to your privacy.
  3. V. D. Blondel, M. Esch, C. Chan, et al. Data for development: the D4D challenge on mobile phone data. arXiv:1210.0137, 2012.
  4. G. Acs and C. Castelluccia. A case study: privacy preserving release of spatio-temporal density in Paris. In Proceedings of the ACM Conference of the Special Interest Group on Knowledge D-something and D-Something (SIGKDD). ACM, 2014.
  5. China telcom’s big data products.
  6. C. Song, Z. Qu, N. Blumm. Limits of predictability in human mobility. In Science, 2010.
  7. S. Isaacman, R. Becker, R. Cáceres, et al. Ranges of human mobility in Los Angeles and New York. In Proceedings of the IEEE Workshops on Pervasive Computing and Communications (PERCOM). IEEE, 2011.
  8. S. Isaacman, R. Becker, R. Cáceres, et al. Human mobility modeling at metropolitan scales. In In Proceedings of the ACM Conference on Mobile Systems (MOBISYS). ACM, 2012.
  9. M. Seshadri, S. Machiraju, A. Sridharan, et al. Mobile call graphs: beyond power-law and lognormal distributions. In Proceedings of the ACM Conference on Knowledge Discovery? and Discernment? (KDD). ACM, 2008.
  10. Y. Wang, H. Zang, M. Faloutsos. Inferring cellular user demographic information using homophily on call graphs. In Proceedings of the IEEE Workshop on Computer Communications (INFOCOM) IEEE, 2013.
  11. A. Wesolowski, N. Eagle, A. J. Tatem, et al. Quantifying the impact of human mobility on malaria. In Science, 2012.
  12. M. Saravanan, P. Karthikeyan, A. Aarthi. Exploring community structure to understand disease spread and control using mobile call detail records. NetMob D4D Challenge, 2013. Probably there’s a promotional micro-site for this.
  13. R. W. Douglass, D. A. Meyer, M. Ram, et al. High resolution population estimates from telecommunications data. In EPJ Data Science, 2015.
  14. H. Wang, F. Xu, Y. Li, et al. Understanding mobile traffic patterns of large scale cellular towers in urban environment. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2015.
  15. L. Sweeney. k-anonymity: A model for protecting privacy. In International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002.
  16. Y. de Montjoye, L. Radaelli, V. K. Singh, et al. Unique in the shopping mall: On the reidentifiability of credit card metadata. In Science, 2015.
  17. H. Zang and J. Bolot. Anonymization of location data does not work: A large-scale measurement study. In Proceedings of the ACM Conference on Mobile Communications (Mobicom). ACM, 2011.
  18. M. Gramaglia and M. Fiore. Hiding mobile traffic fingerprints with glove. In Proceedings of the ACM Conference CoNEXT, 2015.
  19. A.-L. Barabasi. The origin of bursts and heavy tails in human dynamics. In Nature, 2005.
  20. A. Machanavajjhala, D. Kifer, J. Gehrke, et al. l-Diversity: Privacy beyond k-Anonymity. In Transactions on Knowledge Doodling? and Deliverance? (TKDD), 2007.
  21. Y. de Montjoye, C. A. Hidalgo, M. Verleysen, et al. Unique in the crowd: The privacy bounds of human mobility. In Scientific Reports, 2013.
  22. G. B. Dantzig. Linear Programming and Extensions. Princeton University Press, 1998.
  23. H. W. Kuhn. The Hungarian Method for the Assignment Problem. In Naval Research Logistics Quarterly, 1955.
  24. O. Abul, F. Bonchi, M. Nanni. Anonymization of moving objects databases by clustering and perturbation. In Information Systems, 2010.
  25. Pascal Welke, Ionut Andone, Konrad Blaszkiewicz, Alexander Markowetz. Differentiating smartphone users by app usage. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pages 519–523. ACM, 2016.
  26. Lukasz Olejnik, Claude Castelluccia, Artur Janc. Why Johnny Can’t Browse in Peace: On the uniqueness of web browsing history patterns. In Proceedings of the 5th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs), 2012.
  27. M. C. Gonzalez, C. A. Hidalgo, A.-L. Barabasi. Understanding individual human mobility patterns. In Nature, 2008.
  28. C. Song, T. Koren, P. Wang, et al. Modelling the scaling properties of human mobility. In Nature Physics, 2010.
  29. Y. Liu, K. P. Gummadi, B. Krishnamurthy, et al. Analyzing Facebook Privacy Settings: User Expectations vs. Reality. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2011.
  30. B. Krishnamurthy and C. E. Wills. Generating a privacy footprint on the Internet. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2006.
  31. S. Le Blond, C. Zhang, A. Legout, et al. I know where you are and what you are sharing: exploiting P2P communications to invade users’ privacy. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2011.
  32. S. Liu, I. Foster, S. Savage, et al. Who is. com? learning to parse WHOIS records. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2015.
  33. H. Kido, Y. Yanagisawa, T. Satoh. Protection of location privacy using dummies for location-based services. In Proceedings of the IEEE International Conference on (Mountain?) DEW (ICDEW). IEEE, 2005.
  34. A. Monreale, G. L. Andrienko, N. V. Andrienko, et al. Movement data anonymity through generalization. In Transactions on Data Privacy, 2010.
  35. K. Sui, Y. Zhao, D. Liu, et al. Your trajectory privacy can be breached even if you walk in groups. In Proceedings of the IEEE/ACM International Workshop on Quality of Service (IWQoS), 2016.
  36. Y. Song, D. Dahlmeier, S. Bressan. Not so unique in the crowd: a simple and effective algorithm for anonymizing location data. In PIR@ SIGIR, 2014.
  37. S. Garfinkel. Privacy protection and RFID. In Ubiquitous and Pervasive Commerce. Springer, 2006.
  38. J. Domingo-Ferrer and R. Trujillo-Rasua. Microaggregation-and permutation-based anonymization of movement data. In Information Sciences, 2012.
  39. Cynthia Dwork, Adam Smith, Thomas Steinke, Jonathan Ullman, Salil Vadhan. Robust Traceability From Trace Amounts. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), , pages 650–669. IEEE, 2015.

Previously filled.

Experience with Let’s Encrypt certbot for Fedora 23 (fails)

At certbot.eff.org with Apache on Fedora 23+

sudo dnf install -y python-certbot-apache
Error: nothing provides python2-augeas needed by python2-certbot-apache-0.8.1-1.fc23.noarch
(try to add '--allowerasing' to command line to replace conflicting packages)


dnf install -y augeas
dnf install -y python-augeas

Therefore: certbot isn’t ready for Fedora 23 yet.

Fedora 22?


wget https://dl.eff.org/certbot-auto

Nope … too big and complicated … it will never work … and they didn’t test it on Fedora anyway.


Prerequisites of python-certbot-apache


Still fails

$ sudo dnf install python2-certbot-apache
Last metadata expiration check performed 2:49:52 ago on Wed Sep 28 04:06:26 2016.
Error: nothing provides python2-augeas needed by python2-certbot-apache-0.8.1-1.fc23.noarch
(try to add '--allowerasing' to command line to replace conflicting packages)


wget https://dl.fedoraproject.org/pub/fedora/linux/updates/23/x86_64/p/python2-certbot-apache-0.8.1-1.fc23.noarch.rpm
sudo rpm --install --nodeps python2-certbot-apache-0.8.1-1.fc23.noarch.rpm

What got installed?

$ rpm -q -l -p ./python2-certbot-apache-0.8.1-1.fc23.noarch.rpm  | grep -v test

You also have to install


It will list, but fails to create, the directories /etc/letsencrypt and /var/lib/letsencrypt:

$ sudo dnf install certbot
Last metadata expiration check performed 0:18:54 ago on Wed Sep 28 07:09:29 2016.
Dependencies resolved.
 Package               Arch                 Version                     Repository             Size
 certbot               noarch               0.8.1-2.fc23                updates                20 k

Transaction Summary
Install  1 Package

Total download size: 20 k
Installed size: 20 k
Is this ok [y/N]: y
Downloading Packages:
certbot-0.8.1-2.fc23.noarch.rpm                                      42 kB/s |  20 kB     00:00    
Total                                                                16 kB/s |  20 kB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Installing  : certbot-0.8.1-2.fc23.noarch                                                     1/1 
  Verifying   : certbot-0.8.1-2.fc23.noarch                                                     1/1 

  certbot.noarch 0.8.1-2.fc23                                                                       

$ rpm -q -l certbot
$ rpm -q -l certbot | xargs ls -ld
ls: cannot access /etc/letsencrypt: No such file or directory
ls: cannot access /var/lib/letsencrypt: No such file or directory
-rwxr-xr-x. 1 root root   302 Jul  6 06:42 /usr/bin/certbot
lrwxrwxrwx. 1 root root    16 Jul  6 06:42 /usr/bin/letsencrypt -> /usr/bin/certbot
drwxr-xr-x. 2 root root  4096 Sep 28 07:28 /usr/share/doc/certbot
-rw-r--r--. 1 root root   362 Jun 14 16:46 /usr/share/doc/certbot/CHANGES.rst
-rw-r--r--. 1 root root   604 Jun 14 16:46 /usr/share/doc/certbot/CONTRIBUTING.md
-rw-r--r--. 1 root root  7702 Jun 14 16:46 /usr/share/doc/certbot/README.rst
drwxr-xr-x. 2 root root  4096 Sep 28 07:28 /usr/share/licenses/certbot
-rw-r--r--. 1 root root 11456 Jun 14 16:46 /usr/share/licenses/certbot/LICENSE.txt
$ certbot plugins
An unexpected error occurred:
OSError: [Errno 13] Permission denied: '/etc/letsencrypt'
Please see the logfile 'certbot.log' for more details.

You have to do it yourself:

sudo mkdir /etc/letsencrypt /var/lib/letsencrypt

Emergent Privacy | Ran Wolff

Ran Wolff (Yahoo); Emergent Privacy; 2013-01-17 → 2014-07-30; 17 pages; ssrn:2193164


Defining privacy is a long sought goal for philosophers and legal scholars alike. Current definitions lack mathematical rigor. They are therefore impracticable for domains such as economics and computer science in which privacy needs to be quantified and computed.

This paper describes a game theoretic framework in which privacy requires no definition per se. Rather, it is an emergent property of specific games, the strategy by which players maximize their reward. In this context, key activities related to privacy, such as methods for its protection and ways in which it is traded, are given concrete meaning.

Based in game theory, emergent privacy demonstrates that the right to privacy can be derived, at least in part, on a utilitarian philosophical basis.


  • Alessandro Acquisti. The Economics of Privacy.
  • Giacomo Calzolari, Alessandro Pavan. On the optimality of privacy in sequential contracting. In Journal of Economic Theory, 130(1):168-204, 2006.
  • Nikhil S. Dighe, Jun Zhuang, Vicki M. Bier. Secrecy in Defensive Allocations as a Strategy for achieving more Cost-effective Attacker Deterrence. In International Journal of Performability Engineering, 5(1):31-43, 2009.
  • Arik Friedman, Ran Wolff, and Assaf Schuster. Providing k-Anonymity in data mining. In VLDB Journal, 17(4):789-804, 2008.
  • Peter H. Huang. The Law and Economics of Consumer Privacy Versus Data Mining. 1998. ssrn:94041. 35 pages.
  • Joseph B. Kadane, Mark Schervish, Teddy Seidenfield. Is Ignorance Bliss? In Journal of Philosophy, 105(1):5-36, 2008.
  • Helen Nissenbaum. Privacy as Contextual Integrity. In Washington Law Review, 79(4):119-158, 2004.
  • Paul Ohm. Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. In UCLA Law Review, 57:1701-1777, 2010.
  • Martin J. Osborne. An introduction to game theory. Oxford University Press, 2003-08. p. 283.
  • Richard A. Posner. The right to privacy. In Georgia Law Review, 12(3):393-422, 1978.
  • E. Rasmusen. Games and information. Cambridge, 1994.
  • Pierangela Samarati, Latanya Sweeney. Protecting Privacy when Disclosing Information: k-Anonymity and Its Enforcement through Generalization and Suppression. Technical Report SRI-CSL-98-04, Computer Science Laboratory, SRI International. 1998. landing.
  • Daniel J. Solove. Understanding privacy. Harvard University Press, Cambridge, Mass, 2008.
  • Samuel D. Warren, Louis D. Brandeis. The Right to Privacy. In Harvard Law Review, IV(5), 1890.


WebRTC and STUN for intra-LAN exploration & end-user tracking


  • WebRTC, promotional site
  • Availabilities
    all the browsers that matter

    • Android
    • Chrome (Linux, Android, Windows)
    • Firefox
    • Opera
    • Safari (iOS)




  • RFC 7350Datagram Transport Layer Security (DTLS) as Transport for Session Traversal Utilities for NAT (STUN); Petit-Huguenin, Salgueiro; IETF; 2014-08.
  • RFC 7064URI Scheme for the Session Traversal Utilities for NAT (STUN) Protocol; Nandakumar, Salgueiro, Jones, Petit-Huguenin; IETF; 2013-11.
  • RFC 5928Traversal Using Relays around NAT (TURN) Resolution Mechanism; Petit-Huguenin; IETF; 2010-08.
  • RFC 5389Session Traversal Utilities for NAT (STUN); Rosenberg, Mahy, Matthews, Wing; IETF; 2008-10.

    • RFC 3489STUN – Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs); Rosenberg, Weinberger, Huitema, Mahy; 2003-03.

In Jimi Wales’ Wiki.



In archaeological order


665909WebRTC Tracking; In Bugzilla of Mozilla; 2011-06-21 → 2016-01-11; Closed as INVALID

Some droid using the self-asserted identity token cchen; How to Stop WebRTC Local IP Address Leaks on Google Chrome and Mozilla Firefox While Using Private IPs; In Privacy Online Forums; 2015-01→2015-03.


  • Availability
    of the problem (not of WebRTC in general)

    • Chrome of Google
      • Windows
    • Firefox of Mozilla
      • Unclear, perhaps Windows only
    • Internet Explorer of Microsoft
      WebRTC is not available at all.
    • Opera of Opera
      • Unclear
    • Safari of Apple
      WebRTC is not available except through a plugin
    • Unavailable
      • Chrome of Google
        • OS/X
        • Android
      • Linux at all
        not clear; not mentioned at all.
  • Blocking
    • Chrome of Google
    • Firefox of Mozilla
      • Production
        • about:config
        • media.peerconnection.enabled set to false (the default is true)
      • Development

        • Canary
        • Nightly
        • Bowser
    • Opera of Opera
  • API Directory
    • voice calls
    • video chats
    • p2p file sharing


  • Chrome
    default is available and active
  • Firefox
    • about:config
    • media.peerconnection.enabled set to true (default true)
  • Opera
    only when configured, with a plugin, to run Google Chrome extensions


webrtc-ips, a STUN & WebRTC test rig

  • diafygi/webrtc-ips
  • via on-page JavaScript, makes latent requests to certain STUN servers.
  • Firefox 34 → Does. Not. Work.
  • Fails with
    Error: RTCPeerConnection constructor passed invalid RTCConfiguration - missing url webrtc-ips:58


  • Private Internet Access (PIA)
  • Real-Time-Communication (RTC)
  • Virtual Private Network (VPN)
  • WebRTC


In Privacy Online Forums:


  • 2013
  •  Since WebRTC uses javascript requests to get your IP address, users of NoScript or similar services will not leak their IP addresses.

Via: backfill.


  • about:config
  • media.peerconnection.enabled set to false (the default is true)

Web Privacy Census | Altaweel, Good, Hoofnagle

Ibrahim Altaweel, Nathaniel Good, Chris Jay Hoofnagle; Web Privacy Census; In Technology Science; 2015-12-15.

tl;dr → there are lots of (HTML4) cookies; cookies are for tracking; cookies are bad. factoids are exhibited.


Most people may believe that online activities are tracked more pervasively now than they were in the past. In 2011, we started surveying the online mechanisms used to track people online (e.g., HTTP cookies, Flash cookies and HTML5 storage). We called this our Web Privacy Census. We repeated the study in 2012. In this paper, we update the study to 2015.


  • Universe
    • Quantcast
    • “top 1 million”
  • Attack
    • Firefox 39
    • OpenWPM
  • Client
    • HTML4 Cookies
    • HTML5 Storage
    • Flash
  • Use Cases
    indistinguishable in the census method

    • Analytics
    • Tracking (Trak-N-Targ)
    • Conversion
    • Personalization
    • Security


Revisiting the Uniqueness of Simple Demographics in the US Population | Philippe Golle

Philippe Golle; Revisiting the Uniqueness of Simple Demographics in the US Population; In Proceedings of the Workshop on Privacy in the Electronic Society (WPES); 2006-10-30; 4 pages.


According to a famous study [10] of the 1990 census data, 87% of the US population can be uniquely identified by gender, ZIP code and full date of birth. This short paper revisits the uniqueness of simple demographics in the US population based on the most recent census data (the 2000 census). We offer a detailed, comprehensive and up-to-date picture of the threat to privacy posed by the disclosure of simple demographic information. Our results generally agree with the findings of [10], although we find that disclosing one’s gender, ZIP code and full date of birth allows for unique identification of fewer individuals (63% of the US population) than reported in [10]. We hope that our study will be a useful reference for privacy researchers who need simple estimates of the comparative threat of disclosing various demographic data.

Simple Demographics Often Identify People Uniquely | Latanya Sweeney

Latanya Sweeney; Simple Demographics Often Identify People Uniquely; Data Privacy Working Paper 3; Carnegie Mellon University; Pittsburgh, PA; 2000; 34 pages.


In this document, I report on experiments I conducted using 1990 U.S. Census summary data to determine how many individuals within geographically situated populations had combinations of demographic values that occurred infrequently. It was found that combinations of few characteristics often combine in populations to uniquely or nearly uniquely identify some individuals. Clearly, data released containing such information about these individuals should not be considered anonymous. Yet, health and other person-specific data are publicly available in this form. Here are some surprising results using only three fields of information, even though typical data releases contain many more fields. It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. About half of the U.S. population (132 million of 248 million or 53%) are likely to be uniquely identified by only {place, gender, date of birth}, where place is basically the city, town, or municipality in which the person resides. And even at the county level, {county, gender, date of birth} are likely to uniquely identify 18% of the U.S. population. In general, few characteristics are needed to uniquely identify a person.
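The uniqueness measure behind these figures is just a group-by count over a quasi-identifier. A toy sketch with fabricated records (not census data; field names are illustrative):

```python
from collections import Counter

def unique_fraction(population, quasi_identifier):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    combos = Counter(tuple(person[k] for k in quasi_identifier)
                     for person in population)
    unique = sum(1 for person in population
                 if combos[tuple(person[k] for k in quasi_identifier)] == 1)
    return unique / len(population)

people = [
    {"zip": "98195", "gender": "F", "dob": "1970-01-01"},
    {"zip": "98195", "gender": "F", "dob": "1970-01-01"},  # shares a combination
    {"zip": "98195", "gender": "M", "dob": "1980-05-17"},
    {"zip": "15213", "gender": "F", "dob": "1990-09-09"},
]

print(unique_fraction(people, ("zip", "gender", "dob")))  # 0.5
```

Sweeney’s 87% and Golle’s 63% are this same fraction computed over census summary tables rather than individual records.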


(but different)

L. Sweeney; Uniqueness of Simple Demographics in the U.S. Population; Data Privacy Lab White Paper Series LIDAP-WP4; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA; 2000; 34 pages; abstract; catalog.

Via: backfill

RFC 7624 – Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement

RFC 7624Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement; R. Barnes, B. Schneier, C. Jennings, T. Hardie, B. Trammell, C. Huitema, D. Borkmann; IETF; 2015-08.

  • state-level actors
    (police- & military-focused)
  • some mention of adtrade tracking
    (is passive, pervasive & persistent but nominally T&C, N&C, etc.)


Since the initial revelations of pervasive surveillance in 2013, several classes of attacks on Internet communications have been discovered. In this document, we develop a threat model that describes these attacks on Internet confidentiality. We assume an attacker that is interested in undetected, indiscriminate eavesdropping. The threat model is based on published, verified attacks.

Table of Contents

1. Introduction
2. Terminology
3. An Idealized Passive Pervasive Attacker
3.1. Information Subject to Direct Observation
3.2. Information Useful for Inference
3.3. An Illustration of an Ideal Passive Pervasive Attack
3.3.1. Analysis of IP Headers
3.3.2. Correlation of IP Addresses to User Identities
3.3.3. Monitoring Messaging Clients for IP Address Correlation
3.3.4. Retrieving IP Addresses from Mail Headers
3.3.5. Tracking Address Usage with Web Cookies
3.3.6. Graph-Based Approaches to Address Correlation
3.3.7. Tracking of Link-Layer Identifiers
4. Reported Instances of Large-Scale Attacks
5. Threat Model
5.1. Attacker Capabilities
5.2. Attacker Costs
6. Security Considerations
7. References
7.1. Normative References
7.2. Informative References
IAB Members at the Time of Approval
Authors’ Addresses



  • Encryption
  • Snowden
  • National Security Agency (NSA)
    • PRISM

3.3.2. Correlation of IP Addresses to User Identities

The correlation of IP addresses with specific users can be done in various ways. For example, tools like reverse DNS lookup can be used to retrieve the DNS names of servers. Since the addresses of servers tend to be quite stable and since servers are relatively less numerous than users, an attacker could easily maintain its own copy of the DNS for well-known or popular servers to accelerate such lookups.

On the other hand, the reverse lookup of IP addresses of users is generally less informative. For example, a lookup of the address currently used by one author’s home network returns a name of the form “c-192-000-002-033.hsd1.wa.comcast.net”. This particular type of reverse DNS lookup generally reveals only coarse-grained location or provider information, equivalent to that available from geolocation databases.

In many jurisdictions, Internet Service Providers (ISPs) are required to provide identification on a case-by-case basis of the “owner” of a specific IP address for law enforcement purposes. This is a reasonably expedient process for targeted investigations, but pervasive surveillance requires something more efficient. This provides an incentive for the attacker to secure the cooperation of the ISP in order to automate this correlation.

Even if the ISP does not cooperate, user identity can often be obtained via inference. POP3 [RFC1939] and IMAP [RFC3501] are used to retrieve mail from mail servers, while a variant of SMTP is used to submit messages through mail servers. IMAP connections originate from the client, and typically start with an authentication exchange in which the client proves its identity by answering a password challenge. The same holds for the SIP protocol [RFC3261] and many instant messaging services operating over the Internet using proprietary protocols.

The username is directly observable if any of these protocols operate in cleartext; the username can then be directly associated with the source address.
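That correlation step is a simple scan of cleartext payloads for authentication commands. A sketch; the packet log, addresses, and usernames below are all invented:

```python
import re

# Hypothetical cleartext observations: (source_ip, payload line).
packets = [
    ("203.0.113.7", "a1 LOGIN alice@example.org hunter2"),  # IMAP login
    ("198.51.100.9", "USER bob"),                           # POP3 login
    ("192.0.2.5", "a2 SELECT INBOX"),                       # no credential
]

IMAP_LOGIN = re.compile(r"^\S+ LOGIN (\S+)")
POP3_USER = re.compile(r"^USER (\S+)")

def correlate(packets):
    """Map source IP addresses to usernames seen in cleartext auth commands."""
    ip_to_user = {}
    for ip, line in packets:
        m = IMAP_LOGIN.match(line) or POP3_USER.match(line)
        if m:
            ip_to_user[ip] = m.group(1)
    return ip_to_user

print(correlate(packets))
# {'203.0.113.7': 'alice@example.org', '198.51.100.9': 'bob'}
```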

3.3.4. Retrieving IP Addresses from Mail Headers

SMTP [RFC5321] requires that each successive SMTP relay adds a “Received” header to the mail headers. The purpose of these headers is to enable audit of mail transmission, and perhaps to distinguish between regular mail and spam. Here is an extract from the headers of a message recently received from the perpass mailing list:

Received: from 192-000-002-044.zone13.example.org (HELO ? (xxx.xxx.xxx.xxx)
    by lvps192-000-002-219.example.net
    with ESMTPSA (DHE-RSA-AES256-SHA encrypted, authenticated); 27 Oct 2013 21:47:14 +0100
Message-ID: <526D7BD2.7070908@example.org>
Date: Sun, 27 Oct 2013 20:47:14 +0000
From: Some One <some.one@example.org>

This is the first “Received” header attached to the message by the first SMTP relay; for privacy reasons, the field values have been anonymized. We learn here that the message was submitted by “Some One” on October 27, from a host behind a NAT [RFC1918] that used the IP address 192.0.2.44 (encoded in the hostname above). The information remained in the message and is accessible by all recipients of the perpass mailing list, or indeed by any attacker that sees at least one copy of the message.

An attacker that can observe sufficient email traffic can regularly update the mapping between public IP addresses and individual email identities. Even if the SMTP traffic was encrypted on submission and relaying, the attacker can still receive a copy of public mailing lists like perpass.
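Extracting the originating host from such a header is a one-line parse. A sketch against a single-line version of the header above, with the anonymized IP filled in with the documentation address that the hostname itself encodes:

```python
import re

received = (
    "Received: from 192-000-002-044.zone13.example.org (HELO ?) (192.0.2.44) "
    "by lvps192-000-002-219.example.net with ESMTPSA; 27 Oct 2013 21:47:14 +0100"
)

# Pull the submitting host name and its IPv4 address out of the first
# "Received" header (the one added closest to the sender).
m = re.search(r"from (\S+) .*?\((\d{1,3}(?:\.\d{1,3}){3})\)", received)
host, ip = m.group(1), m.group(2)
print(host, ip)  # 192-000-002-044.zone13.example.org 192.0.2.44
```

Real "Received" headers are messier (folding, comments, missing fields), so a production parser would follow the RFC 5321 trace-field grammar rather than a regex.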

3.3.5. Tracking Address Usage with Web Cookies

Many web sites only encrypt a small fraction of their transactions. A popular pattern is to use HTTPS for the login information, and then use a “cookie” to associate following cleartext transactions with the user’s identity. Cookies are also used by various advertisement services to quickly identify the users and serve them with “personalized” advertisements. Such cookies are particularly useful if the advertisement services want to keep tracking the user across multiple sessions that may use different IP addresses.

As cookies are sent in cleartext, an attacker can build a database that associates cookies to IP addresses for non-HTTPS traffic. If the IP address is already identified, the cookie can be linked to the user identity. After that, if the same cookie appears on a new IP address, the new IP address can be immediately associated with the predetermined identity.
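The cookie-pivoting logic reduces to two dictionaries; the observations below are invented:

```python
def track(observations):
    """Associate cookies with identities, then carry the identity to new IPs.

    observations: (ip, cookie, identity_or_None) tuples in arrival order,
    e.g. from cleartext HTTP traffic; identity is non-None only when the
    IP was already identified by some other means.
    """
    cookie_to_identity = {}
    ip_to_identity = {}
    for ip, cookie, identity in observations:
        if identity is not None:
            cookie_to_identity[cookie] = identity
        if cookie in cookie_to_identity:
            ip_to_identity[ip] = cookie_to_identity[cookie]
    return ip_to_identity

obs = [
    ("203.0.113.7", "SID=abc123", "alice"),  # IP already tied to alice
    ("198.51.100.9", "SID=abc123", None),    # same cookie on a new IP
]
print(track(obs))  # {'203.0.113.7': 'alice', '198.51.100.9': 'alice'}
```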

3.3.6. Graph-Based Approaches to Address Correlation

An attacker can track traffic from an IP address not yet associated with an individual to various public services (e.g., web sites, mail servers, game servers) and exploit patterns in the observed traffic to correlate this address with other addresses that show similar patterns. For example, any two addresses that show connections to the same IMAP or webmail services, the same set of favorite web sites, and game servers at similar times of day may be associated with the same individual. Correlated addresses can then be tied to an individual through one of the techniques above, walking the “network graph” to expand the set of attributable traffic.
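The pattern-similarity step can be sketched as Jaccard similarity over per-address destination sets; the addresses, services, and threshold are invented:

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

# Destination services observed per (not-yet-attributed) source address.
visits = {
    "203.0.113.7": {"imap.example.org", "news.example.com", "game.example.net"},
    "198.51.100.9": {"imap.example.org", "news.example.com", "game.example.net"},
    "192.0.2.5":   {"video.example.com"},
}

# Pair up addresses whose traffic patterns are similar enough to suggest
# the same individual behind both.
THRESHOLD = 0.6  # arbitrary for this sketch
addrs = sorted(visits)
pairs = [(a, b) for i, a in enumerate(addrs) for b in addrs[i + 1:]
         if jaccard(visits[a], visits[b]) >= THRESHOLD]
print(pairs)  # [('198.51.100.9', '203.0.113.7')]
```

Once any one address in a correlated cluster is tied to an identity, the whole cluster is.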

3.3.7. Tracking of Link-Layer Identifiers

Moving back down the stack, technologies like Ethernet or Wi-Fi use MAC (Media Access Control) addresses to identify link-level destinations. MAC addresses assigned according to IEEE 802 standards are globally unique identifiers for the device. If the link is publicly accessible, an attacker can eavesdrop and perform tracking. For example, the attacker can track the wireless traffic at publicly accessible Wi-Fi networks. Simple devices can monitor the traffic and reveal which MAC addresses are present. Also, devices do not need to be connected to a network to expose link-layer identifiers. Active service discovery always discloses the MAC address of the user, and sometimes the Service Set Identifiers (SSIDs) of previously visited networks. For instance, certain techniques such as the use of “hidden SSIDs” require the mobile device to broadcast the network identifier together with the device identifier. This combination can further expose the user to inference attacks, as more information can be derived from the combination of MAC address, SSID being probed, time, and current location. For example, a user actively probing for a semi-unique SSID on a flight out of a certain city can imply that the user is no longer at the physical location of the corresponding AP. Given that large-scale databases of the MAC addresses of wireless access points for geolocation purposes have been known to exist for some time, the attacker could easily build a database that maps link-layer identifiers and time with device or user identities, and use it to track the movement of devices and of their owners. On the other hand, if the network does not use some form of Wi-Fi encryption, or if the attacker can access the decrypted traffic, the analysis will also provide the correlation between link-layer identifiers such as MAC addresses and IP addresses. 
Additional monitoring using techniques exposed in the previous sections will reveal the correlation between MAC addresses, IP addresses, and user identity. For instance, similarly to the use of web cookies, MAC addresses provide identity information that can be used to associate a user to different IP addresses.
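The movement-tracking database the RFC describes is, at its core, an append-only log keyed by MAC address; the sightings below are invented:

```python
from collections import defaultdict

# Passive Wi-Fi observations: (mac, ssid_probed, timestamp, sensor_location).
sightings = [
    ("aa:bb:cc:dd:ee:ff", "HomeNet-5G", "09:00", "cafe-downtown"),
    ("aa:bb:cc:dd:ee:ff", "HomeNet-5G", "17:30", "airport-gate-12"),
]

trails = defaultdict(list)
for mac, ssid, t, loc in sightings:
    trails[mac].append((t, loc))

# The device (hence, likely, its owner) moved from the cafe to the airport;
# the probed SSID additionally hints where the device usually lives.
print(trails["aa:bb:cc:dd:ee:ff"])
# [('09:00', 'cafe-downtown'), ('17:30', 'airport-gate-12')]
```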




  • RFC 6973Privacy Considerations for Internet Protocols, A. Cooper, H. Tschofenig, B. Aboba, J. Peterson, J. Morris, M. Hansen, R. Smith, DOI 10.17487/RFC6973, 2013-07.


Mostly newspaper articles (exposés) & techreport whitepapers.

  • RFC 1035Domain names – implementation and specification, P. Mockapetris, STD 13, RFC 1035, doi:10.17487/RFC1035, 1987-11.
  • RFC 1918Address Allocation for Private Internets, Y. Rekhter, B. Moskowitz, D. Karrenberg, G. de Groot, E. Lear, BCP 5, RFC 1918, doi:10.17487/RFC1918, 1996-02.
  • RFC 1939Post Office Protocol – Version 3, J. Myers, M. Rose, STD 53, RFC 1939, doi:10.17487/RFC1939, 1996-05.
  • RFC 3261SIP: Session Initiation Protocol, J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, E. Schooler, RFC 3261, doi:10.17487/RFC3261, 2002-06.
  • RFC 3365Strong Security Requirements for Internet Engineering Task Force Standard Protocols, J. Schiller, BCP 61, RFC 3365, doi:10.17487/RFC3365, 2002-08.
  • RFC 3501INTERNET MESSAGE ACCESS PROTOCOL – VERSION 4rev1, M. Crispin, RFC 3501, doi:10.17487/RFC3501, 2003-03.
  • RFC 4033DNS Security Introduction and Requirements, R. Arends, R. Austein, M. Larson, D. Massey, S. Rose, RFC 4033, doi:10.17487/RFC4033, 2005-03.
  • RFC 4303IP Encapsulating Security Payload (ESP), S. Kent, RFC 4303, doi:10.17487/RFC4303, 2005-12.
  • RFC 4949Internet Security Glossary, Version 2, R. Shirey, FYI 36, RFC 4949, doi:10.17487/RFC4949, 2007-08.
  • RFC 5246The Transport Layer Security (TLS) Protocol Version 1.2, T. Dierks, E. Rescorla, RFC 5246, doi:10.17487/RFC5246, 2008-08.
  • RFC 5321Simple Mail Transfer Protocol, J. Klensin, RFC 5321, doi:10.17487/RFC5321, 2008-10.
  • RFC 6962Certificate Transparency, B. Laurie, A. Langley, E. Kasper, RFC 6962, doi:10.17487/RFC6962, 2013-06.
  • RFC 7011Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information, B. Claise, B. Trammell, (editors), P. Aitken, STD 77, RFC 7011, doi:10.17487/RFC7011, 2013-09.
  • RFC 7258Pervasive Monitoring Is an Attack, S. Farrell, H. Tschofenig, BCP 188, RFC 7258, doi:10.17487/RFC7258, 2014-05.

Via: backfill.

Header Enrichment or ISP Enrichment? Emerging Privacy Threats in Mobile Networks | Vallina-Rodriguez, Sundaresan, Kreibich, Paxson

Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Vern Paxson; Header Enrichment or ISP Enrichment? Emerging Privacy Threats in Mobile Networks; In Proceedings of the ACM SIGCOMM Workshop on Hot Topics in Middleboxes and Network Function Virtualization (HotMiddlebox 2015, huh? now you’re just being silly); 2015-08-17; 6 pages; landing.


HTTP header enrichment allows mobile operators to annotate HTTP connections via the use of a wide range of request headers. Operators employ proxies to introduce such headers for operational purposes, and—as recently widely publicized—also to assist advertising programs in identifying the subscriber responsible for the originating traffic, with significant consequences for the user’s privacy. In this paper, we use data collected by the Netalyzr network troubleshooting service over 16 months to identify and characterize HTTP header enrichment in modern mobile networks. We present a timeline of HTTP header usage for 299 mobile service providers from 112 countries, observing three main categories:

  1. unique user and device identifiers (e.g., IMEI and IMSI)
  2. headers related to advertising programs, and
  3. headers associated with network operations.


  • HTTP header enrichment
  • Netalyzr
    • Netalyzer-for-Android
  • Verizon Precision Market Insights
  • The IETF’s Service Function Chaining (SFC) standards are vague about whether injected headers are legitimate or should be removed.
  • Data
    • Collected: 2013-11 → 2015-03.
    • 112 countries
    • 299 operators
  • Belief: no M?NO is yet cracking TLS to insert HTTP headers into the encrypted stream.
  • Suggested as an ID-less methods of identification: device-unique allocation of the (routable) IPv6 space to identify the device, in addition to routing to it.
  • RFC 7239Forwarded HTTP Extension; A. Petersson, M. Nilsson (Opera); IETF; 2014-06.
  • Cessation Timeline
    • 2014-10 → Vodafone (ZA) has ceased their practices in 2014-10; nothing to see there now.
    • 2014-11 → AT&T has ceased their practices 2014-11.
    • 2015-03 → Verizon was not respecting opt-out (as evidenced by continuing to insert the X-UIDH header) through 2015-03.
  • Continuation
    • Verizon continues the X-UIDH header insertion.
  • The X-Forwarded-For header carries extra freight in T-Mobile (DE)
  • Carrier-Grade NAT (CGN), per RFC 6598IANA-Reserved IPv4 Prefix for Shared Address Space (2012-04)


Table 1 & Table 2; Table 3 (not shown)

HTTP Header           Operator   Country  Estimated Purpose
x-up-calling-line-id  Vodacom    ZA       Phone Number
msisdn                Orange     JO       MSISDN
x-nokia-msisdn        Smart      PH
tm_user-id            Movistar   ES       Subscriber ID
x-up-3gpp-imeisv      Vodacom    ZA       IMEI
lbs-eventtime         SmarTone   HK       Timestamp
lbs-zoneid            SmarTone   HK       Location
x-acr                 AT&T       US       unstated, an identifier
x-amobee-1            Airtel     IN
x-amobee-2            Singtel    SG
x-uidh                Verizon    US
x-vf-acr              Vodacom    ZA
                      Vodafone   NL

  • Access Point Name (APN)
  • GPRS
  • HTTP
  • IMSI
  • IMEI
  • J2ME
  • Location-Based Services (LBS)
  • Mobile Country Code (MCC)
  • Mobile Network Code (MNC)
  • Mobile Network Operator (MNO)
  • Mobile Virtual Network Operator (MVNO)
  • Hong Kong Metro (subway) (MTR)
  • Service Function Chaining (SFC)
  • SIM
  • Transport-Layer Security (TLS)
  • Unique Identifier (UID); contra the specific UUID or GUID
  • Virtual Private Network (VPN)
  • WAP


A significant number of newspaper articles, vulgarizations & bloggist opinements.

Google Chrome (Chromium) becomes spyware, listening for “Ok Google Now”

But you can disable it, they say.



In archaeological order, derivative works on top, original causality chain goes down.

  • 500922#c22 (mgiuca@chromium.org) – continued response
  • 500922#c6 (mgiuca@chromium.org) – Hotword behaviour in chromium v43 (binary blob download); Chromium; 2015-06-16.
  • mgiuca – <quote>Hi, I’m an engineer from Google responsible for the hotword module<snip/></quote> 2015-06-17
  • 786909chromium: unconditionally downloads binary blob; Debian; 2015-05-26.
  • 491435Opt-out Chrome Hotword Shared Module; Chromium; 2015-05-22.
  • 9724409Chromium unconditionally downloads binary blob (debian.org); on Hacker News; 2015-06-16.



$ sudo find /home -xdev -name 'hotword*'
$ sudo find /home/raymond/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn | sort | sudo xargs ls -ld | sed -e 's/raymond/USER/g'
drwx------. 3 USER USER   4096 Aug 23  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn
drwx------. 6 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
drwx------. 2 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   8918 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER  30773 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    175 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER  55089 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    276 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER  30427 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    243 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    252 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    243 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    237 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    243 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
drwx------. 2 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   1067 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   1909 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   3932 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    547 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   1493 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    482 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER    524 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   1809 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   1873 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   2925 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
drwx------. 2 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   4134 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER  10942 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER  19510 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER   2691 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
drwx------. 3 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
drwx------. 2 USER USER   4096 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER 273631 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/
-rw-rw-r--. 1 USER USER 394688 Aug  8  2014 /home/USER/.config/google-chrome/Default/Extensions/bepbmhgboaologfdajaanbcjmnhjmhfn/