Exploring ADINT: Using Ad Targeting for Surveillance on a Budget — or — How Alice Can Buy Ads to Track Bob | Vines, Roesner, Kohno

Paul Vines, Franziska Roesner, Tadayoshi Kohno; Exploring ADINT: Using Ad Targeting for Surveillance on a Budget — or — How Alice Can Buy Ads to Track Bob; In Proceedings of the 16th ACM Workshop on Privacy in the Electronic Society (WPES 2017); 2017-10-30; 11 pages; outreach.

tl;dr → Tadayoshi et al. are virtuosos at these performance art happenings. Catchy hook, cool marketing name (ADINT) and press outreach frontrunning the actual conference venue. For the wuffie and the lulz. Nice demo tho.
and → They bought geofence campaigns in a grid. They used close-the-loop analytics to identify the sojourn trail of the target.
and → don’t use Grindr.
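
The grid-of-geofences idea can be sketched in a few lines. This is only an illustration, not the authors' tooling: it assumes the DSP's campaign reporting can be reduced to (timestamp, campaign_id) pairs for the target's device, and the names `Cell`, `make_grid` and `sojourn_trail` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cell:
    """One geofenced campaign, identified by the centre of its grid cell."""
    campaign_id: str
    lat: float
    lon: float


def make_grid(lat0: float, lon0: float, rows: int, cols: int,
              step_deg: float) -> Dict[str, Cell]:
    """Lay out rows x cols campaigns, one per cell (hypothetical naming scheme)."""
    grid = {}
    for r in range(rows):
        for c in range(cols):
            cid = f"cell-{r}-{c}"
            grid[cid] = Cell(cid, lat0 + r * step_deg, lon0 + c * step_deg)
    return grid


def sojourn_trail(grid: Dict[str, Cell],
                  impressions: List[Tuple[int, str]]) -> List[Tuple[int, str, float, float]]:
    """Order impression reports by time and collapse consecutive hits in the same cell."""
    trail = []
    for ts, cid in sorted(impressions):
        cell = grid[cid]
        if not trail or trail[-1][1] != cid:
            trail.append((ts, cid, cell.lat, cell.lon))
    return trail


# Illustrative data only: a 2 x 2 grid and four reported impressions.
grid = make_grid(47.60, -122.33, rows=2, cols=2, step_deg=0.01)
reports = [(30, "cell-0-1"), (10, "cell-0-0"), (20, "cell-0-0"), (40, "cell-1-1")]
for ts, cid, lat, lon in sojourn_trail(grid, reports):
    print(ts, cid, round(lat, 2), round(lon, 2))
```

Each served impression says only "the device was inside this cell at about this time"; the trail is just those facts in time order.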


The online advertising ecosystem is built upon the ability of advertising networks to know properties about users (e.g., their interests or physical locations) and deliver targeted ads based on those properties. Much of the privacy debate around online advertising has focused on the harvesting of these properties by the advertising networks. In this work, we explore the following question: can third-parties use the purchasing of ads to extract private information about individuals? We find that the answer is yes. For example, in a case study with an archetypal advertising network, we find that — for $1000 USD — we can track the location of individuals who are using apps served by that advertising network, as well as infer whether they are using potentially sensitive applications (e.g., certain religious or sexuality-related apps). We also conduct a broad survey of other ad networks and assess their risks to similar attacks. We then step back and explore the implications of our findings.


  • Markets
    They chose

    • Facebook
    • not Google
    • etc.
    • not to fight with big DSPs;
      they picked the weaker ones to highlight.
  • Apps
    They chose

    • lower-quality apps.
    • adult apps
      few “family oriented” [none?] apps.
    • <ahem>Adult Diapering Diary</ahem>


  • DSPs sell 8m CEP (precision) location.

Spooky Cool Military Lingo


Targeting Dimensions

  • Demographics
  • Interests
  • Personally-Identifying Information (PII)
  • Domain (a usage taxonomy)
  • Location
  • Identifiers
    • Cookie Identifier
    • Mobile Ad Identifier (e.g. IDFA, GPSAID)
  • Technographics
    • Device (Make Model OS)
    • Network (Carrier)
  • Search

Media Types

Supply-Side Platforms (SSPs)

  • Adbund
  • InnerActive
  • MobFox
  • Smaato
  • Xapas

Supply (the adware itself, The Applications, The Apps)

  • Adult Diapering Diary
  • BitTorrent
  • FrostWire
  • Grindr
  • Hide My Texts
  • Hide Pictures vault
  • Hornet
  • iFunny
  • Imgur
  • Jack’D
  • Meet24
  • MeetMe
  • Moco
  • My Mixtapez Music
  • Pregnant Mommy’s Maternity
  • Psiphon
  • Quran Reciters
  • Romeo
  • Tagged
  • Talkatone
  • TextFree
  • TextMe
  • TextPlus
  • The Chive
  • uTorrent
  • Wapa
  • Words with Friends

Demand-Side Platforms (DSPs)

  • Admedo
  • AdRoll
  • AdWords
  • Bing
  • Bonadza
  • BluAgile
  • Centro
  • Choozle
  • Criteo
  • ExactDrive
  • Facebook
  • GetIntent
  • Go2Mobi
  • LiquidM
  • MediaMath
  • MightyHive
  • Simpli.Fi
  • SiteScout
  • Splicky
  • Tapad



  • Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, Claudia Diaz. 2014. The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. In Proceedings of the ACM Conference on Computer and Communications Security.
  • Rebecca Balebako, Pedro Leon, Richard Shay, Blase Ur, Yang Wang, L Cranor. 2012. Measuring the effectiveness of privacy tools for limiting behavioral advertising. In Web 2.0 Security and Privacy.
  • Hal Berghel. 2001. Caustic Cookies. In His Blog.
  • Interactive Advertising Bureau. 2015. IAB Tech Lab Content Taxonomy.
  • Interactive Advertising Bureau. 2017. IAB Interactive Advertising Wiki.
  • Giuseppe Cattaneo, Giancarlo De Maio, Pompeo Faruolo, Umberto Ferraro Petrillo. 2013. A review of security attacks on the GSM standard. In Information and Communication Technology-EurAsia Conference. Springer, pages 507–512.
  • Robert M Clark. 2013. Perspectives on Intelligence Collection. In The intelligencer, a Journal of US Intelligence Studies 20, 2, pages 47–53.
  • David Cole. 2014. We kill people based on metadata. In The New York Review of Books
  • Jonathan Crussell, Ryan Stevens, Hao Chen. 2014. Madfraud: Investigating ad fraud in android applications. In Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services. ACM, pages 123–134.
  • Doug DePerry, Tom Ritter, Andrew Rahimi. 2013. Cloning with a Compromised CDMA Femtocell.
  • Google Developers. 2017. Google Ads.
  • Steven Englehardt and Arvind Narayanan. 2016. Online tracking: A 1-million-site measurement and analysis. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, pages 1388–1401.
  • Steven Englehardt, Dillon Reisman, Christian Eubank, Peter Zimmerman, Jonathan Mayer, Arvind Narayanan, Edward W Felten. 2015. Cookies that give you away: The surveillance implications of web tracking. In Proceedings of the 24th International Conference on World Wide Web. ACM, pages 289–299.
  • Go2mobi. 2017.
  • Aleksandra Korolova. 2010. Privacy violations using microtargeted ads: A case study. In Proceedings of the 2010 IEEE International Conference on IEEE Data Mining Workshops (ICDMW), pages 474–482.
  • Zhou Li, Kehuan Zhang, Yinglian Xie, Fang Yu, XiaoFeng Wang. 2012. Knowing your enemy: understanding and detecting malicious web advertising. In Proceedings of the 2012 ACM conference on Computer and Communications Security. ACM, pages 674–686.
  • Nicolas Lidzborski. 2014. Staying at the forefront of email security and reliability: HTTPS-only and 99.978 percent availability. In Their Blog, Google.
  • Steve Mansfield-Devine. 2015. When advertising turns nasty. In Network Security 11, pages 5–8.
  • Jeffrey Meisner. 2014. Advancing our encryption and transparency efforts. In Their Blog, Microsoft.
  • Rick Noack. 2014. Could using gay dating app Grindr get you arrested in Egypt?. In The Washington Post.
  • Franziska Roesner, Tadayoshi Kohno, David Wetherall. 2012. Detecting and Defending Against Third-Party Tracking on the Web. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI).
  • Sooel Son, Daehyeok Kim, Vitaly Shmatikov. 2016. What mobile ads know about mobile users. In Proceedings of the 23rd Annual Network and Distributed System Security Symposium (NDSS).
  • Mark Joseph Stern. 2016. This Daily Beast Grindr Stunt Is Sleazy, Dangerous, and Wildly Unethical. In Slate, 2016.
  • Ryan Stevens, Clint Gibler, Jon Crussell, Jeremy Erickson, Hao Chen. 2012. Investigating user privacy in android ad libraries. In Proceedings of the Workshop on Mobile Security Technologies (MoST).
  • Ratko Vidakovic. 2013. The Mechanics Of Real-Time Bidding. In Marketingland.
  • Craig E. Wills and Can Tatar. 2012. Understanding what they do with what they know. In Proceedings of the ACM Workshop on Privacy in the Electronic Society (WPES).
  • Tom Yeh, Tsung-Hsiang Chang, Robert C Miller. 2009. Sikuli: using GUI screenshots for search and automation. In Proceedings of the 22nd annual ACM Symposium on User Interface Software and Technology. ACM, pages 183–192.
  • Apostolis Zarras, Alexandros Kapravelos, Gianluca Stringhini, Thorsten Holz, Christopher Kruegel, Giovanni Vigna. 2014. The dark alleys of madison avenue: Understanding malicious advertisements. In Proceedings of the 2014 Conference on Internet Measurement Conference
  • Tiliang Zhang, Hua Zhang, Fei Gao. 2013. A Malicious Advertising Detection Scheme Based on the Depth of URL Strategy. In Proceedings of the 2013 Sixth International Symposium on Computational Intelligence and Design (ISCID), Vol. 2. IEEE, pages 57–60.
  • Peter Thomas Zimmerman. 2015. Measuring privacy, security, and censorship through the utilization of online advertising exchanges. Technical Report. Tech. rep., Princeton University.


The Suitcase Words

  • Mobile Advertising ID (MAID)
  • Demand-Side Platform (DSP)
  • Supply-Side Platform (SSP)
  • Global Positioning System (GPS)
  • Google Play Store (GPS)
  • geofencing
  • cookie tracking
  • Google Advertising Identifier (GAID)
    Google Play Services Advertising Identifier (GAID)
  • Facebook
  • Snowden
  • WiFi

Previously filled.

Pre-Conference AdTech Summarization | Gubbins

Gubbins; Things you should know about AdTech, today; In His Blog, centrally hosted on LinkedIn; 2017-08-30; regwalled (you have to log in to LinkedIn).


Boosterism in front of the trade shows
  • Exchange Wire #ATSL17
  • Dmexco
  • Programmatic IO


  • There will be consolidation in the DSP category.
  • There will be more DSPs, not fewer.
  • Owned & Operated (O&O)
  • preferential deals
  • private equity companies
  • party data & a GDPR compliant screen agnostic ID
  • no “point solutions.”
  • Doubleclick Bid Manager (DBM), Google
  • Lara O’Reilly; Some Article; In Business Insider (maybe); WHEN?
    tl;dr → something about how Google DSP DBM guarantee “fraud-free” traffic.
  • Ads.txt (Authorized Digital Sellers), IAB Tech Lab
  • Claimed:
    comScore publishers are starting to adopt Ads.txt

Buy Side

Deal Flow
  • Sizmek acquired Rocket Fuel, (unverified) $145M.
  • Tremor sells its DSP to Taptica for $50M.
  • Singtel acquired Turn for $310M.
No flow, yet
  • Adform
  • MediaMath
  • DataXu
  • AppNexus

Sell Side

  • Header Bidding (HB)
    • Replaces the SSP category
    • <quote>effectively migrated the sell sides narrative & value prop of being a yield management partner to that of a feet on the street publisher re-seller.</quote>
  • QBR (Quarterly Business Result?)
  • Prebid.js
  • With server bidding, too.
  • Supply Path Optimization (SPO)
    • Brian O’Kelley (AppNexus); Article; In His Blog; WHEN?
      Brian O’Kelley, CEO, AppNexus.
    • Article; ; In ExchangeWire; WHEN?
  • Exchange Bidding in Dynamic Allocation (EBDA), Google
The Rubicon Project
a header tag, compatible with most wrappers, no proprietary wrapper, only Prebid.js
Index Exchange
a header tag, compatible with most wrappers, a proprietary wrapper
a header tag, compatible with many (not ‘most’) wrappers, a proprietary wrapper
a header tag, compatible with many (not ‘most’) wrappers, a proprietary wrapper (that is better than OpenX’s which is not enterprise grade)
a header tag, compatible with many (not ‘most’) wrappers, a proprietary wrapper.
  • TrustX
    • with
      • Digital Content Next
      • IPONWEB
      • ANA
    • Something about a transparent marketplace.
  • Something about another supply network
    • German
    • trade press in Digiday
  • No header bidding, yet.
  • Mobile equals Adware (“in app”)
    • but Apps don’t have “browsers.”
    • but App browsers don’t have “pages” with “headers.”
    • though Apps have SDKs (libraries).
  • RTL acquires SpotX
  • <quote>One could argue video is the perfect storm for header bidding, limited quality supply & maximum demand, the ideal conditions for a unified auction…</quote>
Talking Points
  • The industry is currently debating the pros & cons of running header bidding either client or server side (a lot boils down to latency vs. audience match rates).
  • Google offer their own version of header bidding, this is referred to as EBDA (Exchange Bidding in Dynamic Allocation) and is available to DFP customers.
  • Facebook recently entered header bidding by launching a header tag that enables publishers to capture FAN demand via header bidding on their mobile traffic.
  • Criteo entered header bidding by offering publishers their header tag (AKA Direct Bidder) that effectively delivers Criteo’s unique demand into the publisher’s header auction, at a 1st rather than cleared 2nd price.
  • Amazon have launched a server to server header bidding offering for publishers that delivers unique demand and the ability to manage other S2S demand partners for the publisher.
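
The waterfall-versus-header-bidding contrast behind these talking points can be sketched as follows. This is a toy model under stated assumptions, not any vendor's implementation: partner names, bids and the floor are invented, and `waterfall` and `unified` are hypothetical helper names.

```python
from typing import List, Optional, Tuple

Bid = Tuple[str, float]  # (demand partner, bid in CPM)


def waterfall(partners: List[Bid], floor: float) -> Optional[Bid]:
    """Ask partners one at a time in fixed priority order; the first bid at or above the floor wins."""
    for name, bid in partners:
        if bid >= floor:
            return name, bid
    return None


def unified(partners: List[Bid], floor: float) -> Optional[Bid]:
    """Header-bidding style: collect every bid first, then take the highest one at or above the floor."""
    eligible = [(name, bid) for name, bid in partners if bid >= floor]
    return max(eligible, key=lambda nb: nb[1]) if eligible else None


# Illustrative bids, listed in the publisher's (assumed) priority order.
partners = [("ssp_a", 1.20), ("ssp_b", 2.50), ("ssp_c", 0.80)]
print(waterfall(partners, floor=1.00))  # ('ssp_a', 1.2)
print(unified(partners, floor=1.00))    # ('ssp_b', 2.5)
```

In the waterfall, the higher bid from ssp_b is never seen because ssp_a already cleared the floor; the unified auction is what surfaces it.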
Extra Credit
  • <quote>senior AdTech big wigs</quote>
  • programmatic auction process
  • 1st v 2nd price
  • 2nd price was for waterfall
  • 1st price will be for unified (header bidding)
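
A minimal sketch of what "1st v 2nd price" means for the clearing price. The bidder names and amounts are made up, `clear_auction` is a hypothetical helper, and real exchanges add details (for example a one-cent increment over the runner-up) that are left out here.

```python
from typing import Dict, Tuple


def clear_auction(bids: Dict[str, float], rule: str, floor: float = 0.0) -> Tuple[str, float]:
    """Return (winner, price paid) for a sealed-bid auction.

    rule="first":  the winner pays its own bid.
    rule="second": the winner pays the highest losing bid, or the floor if it bid alone.
    """
    eligible = {name: bid for name, bid in bids.items() if bid >= floor}
    if not eligible:
        raise ValueError("no bid meets the floor")
    ranked = sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
    winner, top = ranked[0]
    if rule == "first":
        return winner, top
    if rule == "second":
        runner_up = ranked[1][1] if len(ranked) > 1 else floor
        return winner, max(runner_up, floor)
    raise ValueError(f"unknown rule: {rule!r}")


bids = {"dsp_a": 5.00, "dsp_b": 3.50, "dsp_c": 2.00}
print(clear_auction(bids, "first"))   # ('dsp_a', 5.0)
print(clear_auction(bids, "second"))  # ('dsp_a', 3.5)
```

The winner is the same under both rules; only the price paid changes, which is why a move to 1st price changes how buyers should bid.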

General Data Protection Regulation (GDPR)

  • 2018-05
  • Consent must be collected.
  • Will make 2nd party data marketplaces economical.
  • The salubrious effect.
  • Publishers have a Direct Relationship with consumers.
    this is argued as being “better.”
  • Industry choices
    • collect holistic consent
      <quote>one unified [process] of consumer [outreach] rather than one for every vendor</quote>
    • individual vendor consent
      <quote>for every cookie or device ID that flows through the OpenRTB pipes we have spent the last 10 years laying.</quote>

Viewability & Brand Safety

  • IAB
  • MRC

Talking Points

  • Moat was sold to Oracle for reported number of $800M.
  • PE Firm Providence Equity bought a % of DoubleVerify giving them a reported value of $300M.
  • Integral Ad Science remains independent, for now


  • Telcos have what everybody in AdTech wants:
    • accurate data
    • privacy compliant data
    • scaled data
    • 1st party data.
  • Telcos want what AdTech & publishing companies have:
    • programmatic sell and buy side tools
    • content creation functions
    • distribution at scale.
    • diversification of revenues

Talking Points

  • Verizon buys AOL & Yahoo to form Oath, a publisher, a DSP, a DMP.
  • Telenor buys TapAd, a cross-device DMP-type-thing
  • Altice buys Teads, a streaming video vendor
  • Singtel buys Turn, a DSP
  • AT&T needs a line in this list; might want to buy Time Warner which is a movie studio, media holding company, a cable operator, an old owner of AOL.
Raised $18.75M, Series A. Why?
Raised $20M, through Series B, Why?

Data Management Platform (DMP)

  • Not a pure-play business.
    • A division, not a business.
    • An interface, not a division.
  • Everyone wants to own one.
  • Should DMPs also be in the media buying business?
  • What are DMPs doing to stay relevant for a world without cookies?
  • Do DMPs plan to build or buy device graph features / functions?
  • For platforms that process & model a lot of 1st, 2nd & 3rd party data, how will they be affected by the pending GDPR?
Talking Points
  • Adobe bought TubeMogul, a video DSP, for $540M (based on information & belief).
  • Oracle bought Moat, a verification feature, for $800M
  • Oracle bought Crosswise, a cross-device database, for <unstated/>
  • Salesforce bought Krux, a DMP, for $700M

Lotame remains independent, for now

ID Consortiums & Cross-Device Players

Probabilistic “won’t work”
<quote>The GDPR may make it very difficult for a number of probabilistic methods to be applied to digital ID management.</quote>
Walled Garden
They … <quote>are using their own proprietary cross-screen deterministic token / people based ID that in many cases only works within their O&O environments.</quote>
Universal ID
Is desired. <quote>CMO’s & agencies in the future will not be requesting a cleaner supply chain, but a universal ID (or ID clearing house) that will enable them to manage reach, frequency & attribution across all of the partners they buy from.</quote>
The DigiTrust
<quote>This technology solution creates an anonymous user token, which is propagated by and between its members in lieu of billions of proprietary pixels and trackers on Web pages.</quote>
Claim: “Many” leading AdTech companies are already working with the DigiTrust team. [Which?]
AppNexus ID Consortium
  • Scheme: people-based ID.
  • Launch: 2017-05
  • Trade Name: TBD
    • Index Exchange
    • LiveRamp
    • OpenX
    • Live Intent
    • Rocket Fuel
  • Adbrain
  • Screen6
  • Drawbridge



  • Blockchain is slow, too slow, way too slow
    Blockchain can handle 10 tps.
  • Does not work in OpenRTB
    • New York City
  • Some Q&A; In AdExchanger
    tl;dr → interview of Dr Boris WHO?, IPONWEB; self-styled “the smartest man in AdTech and he concurs”

Artificial Intelligence

  • Is bullshit.
  • c.f.(names dropped)
    • Deepmind
    • Boston Dynamics


  • DOOH
  • Audio
  • Programmatic TV
  • Over The Top (OTT)
  • MarTech != AdTech


Roundup of miscellaneous notes, captured and organized

Blockchain Culture

The Seven(Hundred) Dwarves

  • Blockstack(.org)- The New Decentralized Internet
    • blockstack, at GitHub
    • Union Square Ventures (USV)
    • Promotion
      • Staff (USV); The Blockchain App Stack; In Their Blog; 2016-08-08.
      • Blockstack Unveils A Browser For The Decentralized Web; Laura Shin; In Forbes; 2017-05-15.
        tl;dr → <quote>Tuesday, at the main blockchain industry conference, Consensus, one of the companies working on this new decentralized web, Blockstack, which has $5.5 million in funding from Union Square Ventures and AngelList cofounder Naval Ravikant, released a browser add-on that enables that and more.<snip/>The add-on enables a browser to store the user’s identity information by a local key on the consumer’s device.</quote>; Ryan Shea, cofounder.
  • Everyone has something here.

Bluetooth Culture

Bluetooth LE (BLE)

  • and?

Bluetooth 5

  • Something about mesh networking
  • Something about the standard being released “summer 2017.”

C++ Culture


  • The roadmap onto the twenties.


  • MapReduce, from ETL or EU somewhere.
  • Kyoto Cabinet, Typhoon, Tycoon
  • Virtual Reality packages
  • Ctemplate, Olafud Spek (?)
  • Robot Operating System (ROS)
  • libgraphqlparser – A GraphQL query parser in C++ with C and C++ APIs

Computing Culture

Ubicomp, <ahem>Pervicomp</ahem>

  • Rich Gold
  • Mark Weiser

Dev(Ops) Culture

Futures Cult(ure)


  • Cory Doctorow, the coming war against general purpose computing, an article; WHERE?
  • Cory Doctorow, dystopia contra utopia, an article; WHERE?


  • Cory Doctorow, various works

Imagine a World In Which…

  • Stocks vs Flows
  • Chaos vs Stability
  • Permission vs Permissionless
  • Civil Society ↔ Crony Society
    • Transparency
    • Deals
    • Priorities
  • Predictive Technology “just works”
    • is trusted
    • is eventual
    • is law
    • “is” equates with “ought”

Fedora Culture

  • Flatpak

Fedora 26 Notes

  • nmcli reload con down $i
  • nmcli reload con up $i
  • eui64 must be manually configured

Internet of (unpatchable) Thingies (IoT)

  • MQTT
  • Mosquitto

Language Lifestyles

Go Lang

  • Go for it.
  • A package manager


  • theory
  • implementation?

Rust Lang

  • Was there a NoStarch book?


  • C++20?
    hey, surely someone has modules working by now, eh?



  • Repig, in C++, with threads, in an NVMe


  • sure, what?


  • Interface to the (discontinued) Proliphix thermostats


  • CDN Store
  • Picture Store
  • Document Cache (store & forward)


  • Firefox Tiles

SCOLD Experiences

SCOLD near-syntax, common errors

  • #import <hpp>
  • missing #divert
  • #using, a declaration
  • #origin
  • #namespace
  • $@


Build System
  • --with-std-scold or maybe --with-scold
  • vecdup, like strdup
  • vectree, like strfree→free
  • json::check::Failure or json::Cast.
  • namespace json::is
    • is_array
    • is_null
    • is_object
  • json::as<…>(…)
  • pathify(…)
  • column result
  • concept guarding the template parameter (Concepts TS; standardized in C++20, not C++17)
  • typed strings
    • location
    • path
    • etc.
  • and

Surveillance Culture


  • Eigenpeople
  • Eigenpersonas
  • Personality modeling


Yves-Alexandre de Montjoye, Jordi Quoidbach, Florent Robic, Alex (Sandy) Pentland; Predicting Personality Using Novel Mobile Phone-Based Metrics; In: A.M. Greenberg, W.G. Kennedy, N.D. Bos (editors) Social Computing, Behavioral-Cultural Modeling and Prediction as Proceedings of Social Computing, Behavioral (SBP 2013), Lecture Notes in Computer Science, vol 7812; 2013; paywalls: Springer, ACM.


  • POSS (Post Open Source Software)
    defined as: if everything is on GitHub, then who needs licenses?
    Was this ever amplified?
    Certainly it is facially incorrect and facile.


  • Rob Horning; Sock of Myself, an essay; In Real Life Magazine; 2017-05-17
    tl;dr → riffing on happiness, Facebook. Is. Bad. Q.E.D. R.D. Laing, The Divided Self; John Cheney-Lippold’s We Are Data; Donald Mackenzie.
  • Michael Nelson; University of California, Riverside.

Purposive directionality

  • increase
    • predictability
  • reduce
    • uncertainty
    • variability


Uncomprehensible, Unknown, Unpossible

  • Sunlight, a package? FOSS?

De-Anonymizing Web Browsing Data with Social Networks | Su, Shukla, Goel, Narayanan

Jessica Su, Ansh Shukla, Sharad Goel, Arvind Narayanan; De-Anonymizing Web Browsing Data with Social Networks; draft; In Some Venue Surely (they will publish this somewhere, it is so very nicely formatted); 2017-05; 9 pages.


Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show—theoretically, via simulation, and through experiments on real user data—that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one’s feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user’s social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is—to our knowledge—the largest-scale demonstrated de-anonymization to date.




  • <quote>Network adversaries—including government surveillance agencies, Internet service providers, and coffee shop eavesdroppers—also see URLs of unencrypted web traffic. The adversary may also be a cross-device tracking company aiming to link two different browsing histories (e.g., histories generated by the same user on different devices). For such an adversary, linking to social media profiles is a stepping stone.</quote>


374 people confirmed the accuracy of our deanonymization attempt.
268 people (72%) were the top candidate generated by the MLE when using t.co links.
303 people (81%) were among the top 15 candidates generated by the MLE when using t.co links.
Yet only 49% de-anonymization when using fully expanded links (the redirect target of the t.co link)

<paraphrasing>We recruited participants by advertising the experiment on a variety of websites, including

  • Twitter,
  • Facebook,
  • Quora,
  • Hacker News,
  • Freedom to Tinker
Story Line

  • People submitted web browsing histories.
  • 119 cases (18%): the application encountered a fatal error (e.g., because the Twitter API was temporarily unavailable), and it was unable to run the de-anonymization algorithm.
  • 530 cases: remaining are useful.
  • 87 users (16%): had fewer than four informative links, and so no attempt to de-anonymize them was made.
  • 443 users: remaining are useful.
  • 374 users (84%): confirmed whether or not our de-anonymization attempt was successful.
  • 77 users (21%): additionally disclosed their identity by signing into Twitter.
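
The counts and percentages quoted in these notes can be cross-checked with a little arithmetic. One assumption is labeled in the code: the number of submitted histories is not stated above, so it is taken to be 119 + 530 = 649.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


fatal_errors = 119
after_errors = 530
submitted = fatal_errors + after_errors  # ASSUMPTION: 649, not stated in the notes

too_few_links = 87
attempted = after_errors - too_few_links  # 443
confirmed = 374
signed_in = 77
top_candidate = 268
in_top_15 = 303

print(pct(fatal_errors, submitted))    # 18
print(pct(too_few_links, after_errors))  # 16
print(attempted)                       # 443
print(pct(confirmed, attempted))       # 84
print(pct(signed_in, confirmed))       # 21
print(pct(top_candidate, confirmed))   # 72
print(pct(in_top_15, confirmed))       # 81
```

Under that assumption every quoted percentage is consistent with its stage of the funnel.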

Apology: noted that the users who participated in our experiment are not representative of the Twitter population. In particular, they are quite active: the users who reported their identity had a median number of 378 followers and posted a median number of 2,041 total tweets.


Framing (Environment)

  • TargetConsumer is a Registered Twitter User,
    with activity and warm content selection algo in operation at Twitter HQ
  • Twitter algo selects snippets for presentation to TargetConsumer.
  • TargetConsumer either elects to read or discards the linked page.
  • A URL trail is recorded by The Panopticon Surveillance Machinery in The Record
  • Adversary has access to The Record across long spans of time and large numbers of TargetConsumers.

Problem Statement

  • Can one or many TargetConsumers be distinguished solely by URL traces in The Record?

Algorithm (Conceptual)

See C. Y. Ma, D. K. Yau, N. K. Yip, N. S. Rao. “Privacy vulnerability of published anonymous mobility traces,” In IEEE/ACM Transactions on Networking, 21(3):720–733, 2013.

  1. The simple model of web browsing behavior in which a user’s likelihood of visiting a URL is governed by the URL’s overall popularity and whether the URL appeared in the TargetConsumer’s Twitter feed.
  2. For each TargetConsumer, we compute their likelihood (under the model) of generating a given anonymous browsing history.
  3. Identify the TargetConsumer most likely to have generated that history.
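
The three steps above can be sketched with a deliberately simplified model. This is not the paper's exact estimator: it assumes a URL's visit probability is its overall popularity multiplied by a constant `boost` when the URL appeared in the candidate's feed, then normalized. The popularity values, the feeds and the `boost` value are all invented for illustration.

```python
import math
from typing import Dict, List, Set


def log_likelihood(history: List[str], feed: Set[str],
                   popularity: Dict[str, float], boost: float) -> float:
    """Log-probability of `history` under the toy model for one candidate's feed."""
    # Normalizer: total weight of all URLs given this candidate's feed.
    z = sum(p * (boost if url in feed else 1.0) for url, p in popularity.items())
    total = 0.0
    for url in history:
        weight = popularity[url] * (boost if url in feed else 1.0)
        total += math.log(weight / z)
    return total


def most_likely_user(history: List[str], feeds: Dict[str, Set[str]],
                     popularity: Dict[str, float], boost: float = 10.0) -> str:
    """Step 3: the candidate whose feed makes the history most likely."""
    return max(feeds, key=lambda u: log_likelihood(history, feeds[u], popularity, boost))


# Illustrative data: four URLs, three candidate profiles, one anonymous history.
popularity = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
feeds = {"alice": {"c", "d"}, "bob": {"a"}, "carol": {"b", "c"}}
history = ["c", "d", "c"]
print(most_likely_user(history, feeds, popularity))  # alice
```

The normalizer matters: a candidate with a large feed "explains" many URLs but spreads probability thinly, so matching a few links against a very large feed scores lower than matching them against a small one.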



  • Cookie Syncing
  • E-Tag
  • HTML5 localStorage
  • Jaccard Similarity
  • Maximum Likelihood Estimate (MLE)
  • Uniform Resource Locator (URL)


  • Ad Networks Can Personally Identify Web Users; Wendy Davis; In MediaPost; 2017-01-20.
    <quote> The authors tested their theory by recruiting 400 people who allowed their Web browsing histories to be tracked, and then comparing the sites they visited to sites mentioned in Twitter accounts they followed. The researchers say they were able to use that method to identify more than 70% of the volunteers.</quote>


  • G. Acar, C. Eubank, S. Englehardt, M. Juarez, A. Narayanan, C. Diaz. The web never forgets: Persistent tracking mechanisms in the wild. In Proceedings of ACM Conference on Computer Communications & Security (CCS), pages 674–689. ACM, 2014.
  • G. Acar, M. Juarez, N. Nikiforakis, C. Diaz, S. Gürses, F. Piessens, B. Preneel. Fpdetective: dusting the web for fingerprinters. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security (CCS), pages 1129–1140. ACM, 2013.
  • M. D. Ayenson, D. J. Wambach, A. Soltani, N. Good, C. J. Hoofnagle. Flash cookies and privacy II: Now with HTML5 and ETag respawning. 2011.
  • C. Budak, S. Goel, J. Rao, G. Zervas. Understanding emerging threats to online advertising. In Proceedings of the ACM Conference on Economics and Computation, 2016.
  • M. Chew, S. Stamm. Contextual identity: Freedom to be all your selves. In Proceedings of the Workshop on Web 2.0 Security and Privacy (W2SP), volume 2. Citeseer, 2013.
  • N. Christin, S. S. Yanagihara, K. Kamataki. Dissecting one click frauds. In Proceedings of the 17th ACM conference on Computer and Communications Security.
  • Y.-A. De Montjoye, C. A. Hidalgo, M. Verleysen, V. D. Blondel. Unique in the crowd: The privacy bounds of human mobility. In Scientific Reports, 3, 2013.
  • Y.-A. De Montjoye, L. Radaelli, V. K. Singh, et al. Unique in the shopping mall: On the reidentifiability of credit card metadata. In Science, 347(6221), 2015.
  • P. Eckersley. How unique is your web browser? In, pages 1–18. Springer, 2010.
  • S. Englehardt, A. Narayanan. Online tracking: A 1-million-site measurement and analysis. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2016.
  • S. Englehardt, D. Reisman, C. Eubank, P. Zimmerman, J. Mayer, A. Narayanan, E. W. Felten. Cookies that give you away: The surveillance implications of web tracking. In Proceedings of the 24th Conference on World Wide Web (WWW), 2015.
  • Ú. Erlingsson, V. Pihur, A. Korolova. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the Conference on Computer and Communications Security (CCS), 2014.
  • D. Fifield, S. Egelman. Fingerprinting web users through font metrics. In Proceedings of the International Conference on Financial Cryptography and Data Security, 2015.
  • S. Hill, F. Provost. The myth of the double-blind review?: Author identification using only citations. In SIGKDD Explor(ification) Newsletter, 5(2):179–184, Dec. 2003.
  • M. Korayem, D. J. Crandall. De-anonymizing users across heterogeneous social computing platforms. In Proceedings of the International Conference on W(something) S(something) M(something) as “Some Acronym” (ICWSM), 2013.
  • A. Korolova, K. Kenthapadi, N. Mishra, A. Ntoulas. Releasing search queries and clicks privately. In Proceedings of the 18th International Conference on World Wide Web (WWW). ACM, 2009.
  • B. Krishnamurthy, K. Naryshkin, C. Wills. Privacy leakage vs. protection measures: the growing disconnect. In Proceedings of the Web
  • B. Krishnamurthy, C. E. Wills. On the leakage of personally identifiable information via online social networks. In Proceedings of the 2nd ACM Workshop on Online Social Networks (WOSN), pages 7–12. ACM, 2009.
  • P. Laperdrix, W. Rudametkin, B. Baudry. Beauty and the beast: Diverting modern web browsers to build unique browser fingerprints. In Proceedings of the 37th IEEE Symposium on Security and Privacy, 2016.
  • A. Lerner, A. K. Simpson, T. Kohno, F. Roesner. Internet jones and the raiders of the lost trackers: An archaeological study of web tracking from 1996 to 2016. In Proceedings of the 25th USENIX Security Symposium, 2016.
  • T. Libert. Exposing the invisible web: An analysis of third-party http requests on 1 million websites. In International Journal of Communication, 9:18, 2015.
  • C. Y. Ma, D. K. Yau, N. K. Yip, N. S. Rao. Privacy vulnerability of published anonymous mobility traces. In IEEE/ACM Transactions on Networking, 21(3):720–733, 2013.
  • A. Marthews, C. Tucker. Government surveillance and internet search behavior. Available at ssrn:2412564, 2015.
  • N. Mathewson, R. Dingledine. Practical traffic analysis: Extending and resisting statistical disclosure. In Proceedings of the International Workshop on Privacy Enhancing Technologies (PETS), pages 17–34. Springer, 2004.
  • J. R. Mayer, J. C. Mitchell. Third-party web tracking: Policy and technology. In Proceedings of the 2012 IEEE Symposium on Security and Privacy. IEEE, 2012.
  • K. Mowery, H. Shacham. Pixel perfect: Fingerprinting canvas in HTML5. In Proceedings of the Conference with the Acronym “W2SP” (W2SP), 2012.
  • A. Narayanan, H. Paskov, N. Z. Gong, J. Bethencourt, E. Stefanov, E. C. R. Shin, D. Song. On the feasibility of internet-scale author identification. In Proceedings of the IEEE Symposium on Security and Privacy, 2012.
  • A. Narayanan, V. Shmatikov. Robust de-anonymization of large sparse datasets. In Proceedings of the 2008 IEEE Symposium on Security and Privacy (SP), pages 111–125. IEEE, 2008.
  • N. Nikiforakis, A. Kapravelos, W. Joosen, C. Kruegel, F. Piessens, G. Vigna. Cookieless monster: Exploring the ecosystem of web-based device fingerprinting. In Proceedings of the 2013 IEEE symposium on Security and Privacy (SP), pages 541–555. IEEE, 2013.
  • L. Olejnik, G. Acar, C. Castelluccia, C. Diaz. The leaking battery: A privacy analysis of the HTML5 Battery Status API. Technical Report, WHERE? 2015.
  • L. Olejnik, C. Castelluccia, A. Janc. Why Johnny can’t browse in peace: On the uniqueness of web browsing history patterns. In Proceedings of the 5th Workshop on Hot Topics in Privacy Enhancing Technologies (PETS), 2012.
  • J. Penney. Chilling effects: Online surveillance and wikipedia use. In Berkeley Technology Law Journal, 2016.
  • A. Ramachandran, Y. Kim, A. Chaintreau. “I knew they clicked when I saw them with their friends”. In Proceedings of the 2nd Conference on Online Social Networks, 2014.
  • F. Roesner, T. Kohno, D. Wetherall. Detecting and defending against third-party tracking on the web. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, pages 12–12. USENIX Association, 2012.
  • K. Sharad, G. Danezis. An automated social graph de-anonymization technique. In Proceedings of the 13th Workshop on Privacy in the Electronic Society (WPES), pages 47–58. ACM, 2014.
  • A. Soltani, S. Canty, Q. Mayo, L. Thomas, C. J. Hoofnagle. Flash cookies and privacy. In Proceedings of the AAAI Spring Symposium: Intelligent Information Privacy Management, volume 2010, pages 158–163, 2010.
  • J. Su, A. Sharma, S. Goel. The effect of recommendations on network structure. In Proceedings of the 25th Conference on World Wide Web (WWW), 2016.
  • G. Wondracek, T. Holz, E. Kirda, C. Kruegel. A practical attack to de-anonymize social network users. In Proceedings of the IEEE Symposium on Security and Privacy, 2010.

Previously filled.

Online Privacy and ISPs | Institute for Information Security & Privacy, Georgia Tech

Peter Swire, Justin Hemmings, Alana Kirkland; Online Privacy and ISPs; a whitepaper; Institute for Information Security & Privacy, Georgia Tech; 2016-05; 131 pages.
Teaser: ISP Access to Consumer Data is Limited and Often Less than Access by Others

  • Peter Swire
    • Associate Director,
      The Institute for Information
      Security & Privacy at Georgia Tech
    • Huang Professor of Law,
      Georgia Tech Scheller College of Business
    • Senior Counsel, Alston & Bird LLP
  • Justin Hemmings
    • Research Associate,
      Georgia Tech Scheller College of Business
    • Policy Analyst,
      Alston & Bird LLP
  • Alana Kirkland
    • Associate Attorney, Alston & Bird LLP

tl;dr → ISP < Media; ISPs are not omnipotent; ISPs see less than you think; Consumer visibility is mitigated by allowed usage patterns: cross-ISP, cross-device, VPN, DNS obfuscation, encryption.  Anyway, Facebook has it all and more.

Consumer profiling observation is already occurring by other means anyway.

<quote> In summary, based on a factual analysis of today’s Internet ecosystem in the United States, ISPs have neither comprehensive nor unique access to information about users’ online activity. Rather, the most commercially valuable information about online users, which can be used for targeted advertising and other purposes, is coming from other contexts. Market leaders are combining these contexts for insight into a wide range of activity on each device and across devices. </quote>

<translation> The other guys are already doing it, why stop ISPs? </translation>

ISP Observation of consumers is neither Comprehensive, nor Unique

<quote> The Working Paper addresses two fundamental points. First, ISP access to user data is not comprehensive – technological developments place substantial limits on ISPs’ visibility. Second, ISP access to user data is not unique – other companies often have access to more information and a wider range of user information than ISPs. Policy decisions about possible privacy regulation of ISPs should be made based on an accurate understanding of these facts. </quote>

<view> It’s unargued why comprehensive or unique are bright-line standards of anything at all. </view>

Previously filled.



  • ISPs < Media
    The dumb-pipe, bit-shoving, ISPs see less than media services, who see semantic richness.
  • Cross-device is the new nowadays.
  • Encryption is everywhere.


  • a technical statement
  • contra “use” which is an action by a person
Cross-Device Tracking
Logged-In, Cross-Context Tracking
Not Logged-In, Cross-Context Tracking
Cross-Device Tracking
  • Frequency Capping
  • Attribution
  • Improved Advertising Targeting
  • Sequenced Advertising
  • Tracking Simultaneity
Limits the use of “data” (facts about consumers)
  • at the point of collection
  • at the point of use
Location of a consumer
  • Coarse contra Precise
  • Current contra Historical


The document has both a Preface and an Executive Summary, so the journeyperson junior policy wonkmaker can approach the material at whatever level of complexity their time budget and training afford.


  • Technological Developments Place Substantial Limits on ISPs’ Visibility into Users’ Online Activity:
    1. From a single stationary device to multiple mobile devices and connections.
    2. Pervasive encryption.
    3. Shift in domain name lookup.
  • Non-ISPs Often Have Access to More and a Wider Range of User Information than ISPs:
    1. Non-ISP services have unique insights into user activity.
    2. Non-ISPs dominate in cross-context tracking.
    3. Non-ISPs dominate in cross-device tracking.
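
A rough illustration of the "pervasive encryption" point: the sketch below models what an on-path observer such as an ISP could read from a single request. It assumes a deliberately simplified model in which plain HTTP exposes the whole URL while HTTPS exposes only the hostname. The function name `isp_visible` is hypothetical, and the model ignores VPNs, encrypted DNS, and traffic analysis.

```python
from urllib.parse import urlsplit

def isp_visible(url: str) -> dict:
    """Return the URL parts an on-path observer could read (simplified model)."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Plain HTTP: host, path, and query all cross the wire unencrypted.
        return {"host": parts.hostname, "path": parts.path, "query": parts.query}
    if parts.scheme == "https":
        # HTTPS: path and query are encrypted; assume the hostname still leaks.
        return {"host": parts.hostname, "path": None, "query": None}
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")

print(isp_visible("http://example.com/health/condition?q=flu"))
print(isp_visible("https://example.com/health/condition?q=flu"))
```

The second call keeps only the hostname, which is the sense in which the ISP's view is "not comprehensive" once a site moves to HTTPS.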

Executive Summary

  • Technological Developments Place Substantial Limits on ISPs’ Visibility into Users’ Online Activity:
    1. From a single stationary device to multiple mobile devices and connections.
    2. Pervasive encryption.
    3. Shift in domain name lookup.
  • Non-ISPs Often Have Access to More and a Wider Range of User Information than ISPs:
    1. Non-ISP services have unique insights into user activity.
      • social networks
      • search engines
      • webmail and messaging
      • operating systems
      • mobile apps
      • interest-based advertising
      • browsers
      • Internet video
      • e-commerce.
    2. Non-ISPs dominate in cross-context tracking.
    3. Non-ISPs dominate in cross-device tracking.

Table Of Contents

Online Privacy and ISPs: ISP Access to Consumer Data is Limited and Often Less than Access by Others

Summary of Contents:

  • Preface
  • Executive Summary
    • Appendix 1: Some Key Terms
  • Chapter 1: Limited Visibility of Internet Service Providers Into Users’ Internet Activity
    • Appendix 1: Encryption for Top 50 Web Sites
    • Appendix 2: The Growing Prevalence of HTTPS as Fraction of Internet Traffic
  • Chapter 2: Social Networks
  • Chapter 3: Search Engines
  • Chapter 4: Webmail and Messaging
  • Chapter 5: How Mobile Is Transforming Operating Systems
  • Chapter 6: Interest-Based Advertising (“IBA”) and Tracking
  • Chapter 7: Browsers, Internet Video, and E-commerce
  • Chapter 8: Cross-Context Tracking
    • Appendix 1: Cross-Context Chart Citations
  • Chapter 9: Cross-Device Tracking
  • Chapter 10: Conclusion


  • Interest-Based Advertising (IBA)
  • Tracking
  • Location
    • Coarse Location
    • Precise Location
  • Natural Language Conversation Robots (a.k.a. ‘bots)
    • Siri, Apple
    • Now, Google
    • Cortana, Microsoft


Also see page 124 of The Work.

  • Availability → contra Use
  • Big Data → data which is very big.
  • Broadband Internet Access Services → an ISP, but not a dialup service
    as used in the Open Internet Order, of the FCC, 2015-24, Appendix A.
  • Chat bot → <fancy>Personal Digital Assistant</fancy>
  • Cookie
  • CPNI → Customer Proprietary Network Information
    47 U.S.C. § 222. Also, the rules implementing Section 222 are at 47 C.F.R. § 64.2001 et seq.
  • Cross-Context
  • Cross-Device
  • DNS → Domain Name System
  • DPI → Deep Packet Inspection
  • Edge Providers → smart pipes, page stuffing, click-baiting; e.g. Akamai, CloudFlare, CloudFront, etc.
  • End-to-End
    • Argument
    • Encryption
  • Factual Analysis → this means something different to lawyers contra engineers.
  • FCC → Federal Communications Commission
  • Form
    Form Autofill, a browser feature
  • FTC → Federal Trade Commission
  • FTT → Freedom To Tinker, a venue, an oped
  • GPS → Global Positioning System
  • HTTP → you know.
  • HTTPS → you know.
  • IBA → Interest-Based Advertising
  • IP → Internet Protocol
    • Address
  • IoT → Internet of Things (Thingies, Toys, Unpatchables)
  • IRL → <culture who=”The Youngs”>In Real Life</culture>
  • ISP → Internet Service Provider
  • Last Mile, of an ISP
  • Location
    • Coarse → “city”- “DMA”- or “country”-level
    • Precise → an in-industry definition exists
  • Metadata → indeed.
  • OBA → Online Behavioral Advertising
  • Open Internet Order, of the FCC.
  • OS → <ahem>Operating System</ahem>
  • Party System
    • First Party
    • [Second Party], no one cares.
    • Third Party
    • [Fourth Party]
  • Personal Information → the sacred stuff, the poisonous stuff
  • Personal Digital Assistant → a trade euphemism for NLP + command patterns for IVR; all the 1st-tier shops have one nowadays.
    • Siri → Apple
    • Now → Google
    • Cortana → Microsoft
  • Scanning
  • Section 222, see Title II
  • SSL → you mean TLS
  • Title II, of the Communications Act.
    • Section 222
  • Tracking
    • (Across-) Cross-Context
    • (Across-) Cross-Device
  • TLS → you mean SSL
  • UGC → User-Generated Content (unsupervised filth; e.g. comment spam)
  • URL → you know.
  • VPN → run one.
  • WiFi → for some cultural reason “wireless” turns into “Wireless Fidelity” and “WiFi”
  • Working Paper → an unreviewed work product.
  • Visibility → bookkeeping by the observer.



Of course, it’s a legal-style policy whitepaper. Of course there are references; they are among the NN footnotes. In rough order of appearance in the work.


Trajectory Recovery from Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data | Xu, Tu, Li, Zhang, Fu, Jin

Fengli Xu, Zhen Tu, Yong Li, Pengyu Zhang, Xiaoming Fu, Depeng Jin; Trajectory Recovery From Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data; In Proceedings of the Conference on the World Wide Web (WWW); 2017-02-21 (2017-02-25); 10 pages; arXiv:1702.06270

tl;dr → probabilistic individuation from timestamped aggregated population location records.


Human mobility data has been ubiquitously collected through cellular networks and mobile applications, and publicly released for academic research and commercial purposes for the last decade. Since releasing individual’s mobility records usually gives rise to privacy issues, datasets owners tend to only publish aggregated mobility data, such as the number of users covered by a cellular tower at a specific timestamp, which is believed to be sufficient for preserving users’ privacy. However, in this paper, we argue and prove that even publishing aggregated mobility data could lead to privacy breach in individuals’ trajectories. We develop an attack system that is able to exploit the uniqueness and regularity of human mobility to recover individual’s trajectories from the aggregated mobility data without any prior knowledge. By conducting experiments on two real-world datasets collected from both mobile application and cellular network, we reveal that the attack system is able to recover users’ trajectories with accuracy about 73%~91% at the scale of tens of thousands to hundreds of thousands users, which indicates severe privacy leakage in such datasets. Through the investigation on aggregated mobility data, our work recognizes a novel privacy problem in publishing statistic data, which appeals for immediate attentions from both academy and industry.
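
To make the attack idea concrete, here is a minimal sketch of recovering trajectories from per-tower counts by linking anonymous locations across consecutive timestamps. It is not the authors' system: the Manhattan-distance cost, the brute-force matching, and the names `expand`, `link_step` and `recover` are all assumptions for illustration. The paper's reference list includes the Hungarian method, which would be the scalable replacement for the brute-force step. The sketch also assumes the same number of users appears in every snapshot.

```python
from itertools import permutations

def expand(counts: dict) -> list:
    """Turn per-tower user counts into one anonymous location per user."""
    return sorted(loc for loc, n in counts.items() for _ in range(n))

def link_step(prev: list, nxt: list) -> tuple:
    """Match each location in `prev` to one in `nxt`, minimizing total distance.

    Brute force over permutations; only viable for tiny inputs. A real system
    would use a polynomial-time assignment solver instead.
    """
    def cost(perm):
        return sum(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(prev, perm))
    return min(permutations(nxt), key=cost)

def recover(snapshots: list) -> list:
    """Chain the step-wise matchings into one trajectory per anonymous user."""
    trajectories = [[loc] for loc in expand(snapshots[0])]
    for counts in snapshots[1:]:
        current = [t[-1] for t in trajectories]
        for traj, loc in zip(trajectories, link_step(current, expand(counts))):
            traj.append(loc)
    return trajectories

# Towers are (x, y) grid cells; values are user counts at each timestamp.
snapshots = [
    {(0, 0): 1, (5, 5): 1},
    {(0, 1): 1, (5, 4): 1},
    {(1, 1): 1, (4, 4): 1},
]
for traj in recover(snapshots):
    print(traj)
```

Even though each snapshot only says "one user here, one user there", the continuity of movement is enough to stitch two separate paths back together.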



  1. R. Wang, M. Xue, K. Liu, et al. Data-driven privacy analytics: A wechat case study in location-based social networks. In Wireless Algorithms, Systems, and Applications. Springer, 2015.
  2. Apple’s commitment to your privacy.
  3. V. D. Blondel, M. Esch, C. Chan, et al. Data for development: the D4D challenge on mobile phone data. arXiv:1210.0137, 2012.
  4. G. Acs and C. Castelluccia. A case study: privacy preserving release of spatio-temporal density in Paris. In Proceedings of the ACM Conference of the Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD). ACM, 2014.
  5. China Telecom’s big data products.
  6. C. Song, Z. Qu, N. Blumm. Limits of predictability in human mobility. In Science, 2010.
  7. S. Isaacman, R. Becker, R. Cáceres, et al. Ranges of human mobility in Los Angeles and New York. In Proceedings of the IEEE Workshops on Pervasive Computing and Communications (PERCOM). IEEE, 2011.
  8. S. Isaacman, R. Becker, R. Cáceres, et al. Human mobility modeling at metropolitan scales. In Proceedings of the ACM Conference on Mobile Systems (MOBISYS). ACM, 2012.
  9. M. Seshadri, S. Machiraju, A. Sridharan, et al. Mobile call graphs: beyond power-law and lognormal distributions. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2008.
  10. Y. Wang, H. Zang, M. Faloutsos. Inferring cellular user demographic information using homophily on call graphs. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM). IEEE, 2013.
  11. A. Wesolowski, N. Eagle, A. J. Tatem, et al. Quantifying the impact of human mobility on malaria. In Science, 2012.
  12. M. Saravanan, P. Karthikeyan, A. Aarthi. Exploring community structure to understand disease spread and control using mobile call detail records. NetMob D4D Challenge, 2013. Probably there’s a promotional micro-site for this.
  13. R. W. Douglass, D. A. Meyer, M. Ram, et al. High resolution population estimates from telecommunications data. In EPJ Data Science, 2015.
  14. H. Wang, F. Xu, Y. Li, et al. Understanding mobile traffic patterns of large scale cellular towers in urban environment. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2015.
  15. L. Sweeney. k-anonymity: A model for protecting privacy. In International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002.
  16. Y. de Montjoye, L. Radaelli, V. K. Singh, et al. Unique in the shopping mall: On the reidentifiability of credit card metadata. In Science, 2015.
  17. H. Zang and J. Bolot. Anonymization of location data does not work: A large-scale measurement study. In Proceedings of the ACM International Conference on Mobile Computing and Networking (MobiCom). ACM, 2011.
  18. M. Gramaglia and M. Fiore. Hiding mobile traffic fingerprints with glove. In Proceedings of the ACM Conference CoNEXT, 2015.
  19. A.-L. Barabasi. The origin of bursts and heavy tails in human dynamics. In Nature, 2005.
  20. A. Machanavajjhala, D. Kifer, J. Gehrke, et al. l-Diversity: Privacy beyond k-Anonymity. In ACM Transactions on Knowledge Discovery from Data (TKDD), 2007.
  21. Y. de Montjoye, C. A. Hidalgo, M. Verleysen, et al. Unique in the crowd: The privacy bounds of human mobility. In Scientific Reports, 2013.
  22. G. B. Dantzig. Linear Programming and Extensions. Princeton University Press, 1998.
  23. H. W. Kuhn. The Hungarian Method for the Assignment Problem. In Naval Research Logistics Quarterly, 1955.
  24. O. Abul, F. Bonchi, M. Nanni. Anonymization of moving objects databases by clustering and perturbation. In Information Systems, 2010.
  25. Pascal Welke, Ionut Andone, Konrad Blaszkiewicz, Alexander Markowetz. Differentiating smartphone users by app usage. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pages 519–523. ACM, 2016.
  26. Lukasz Olejnik, Claude Castelluccia, Artur Janc. Why Johnny Can’t Browse in Peace: On the uniqueness of web browsing history patterns. In Proceedings of the 5th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs), 2012.
  27. M. C. Gonzalez, C. A. Hidalgo, A.-L. Barabasi. Understanding individual human mobility patterns. In Nature, 2008.
  28. C. Song, T. Koren, P. Wang, et al. Modelling the scaling properties of human mobility. In Nature Physics, 2010.
  29. Y. Liu, K. P. Gummadi, B. Krishnamurthy, et al. Analyzing Facebook Privacy Settings: User Expectations vs. Reality. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2011.
  30. B. Krishnamurthy and C. E. Wills. Generating a privacy footprint on the Internet. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2006.
  31. S. Le Blond, C. Zhang, A. Legout, et al. I know where you are and what you are sharing: exploiting P2P communications to invade users’ privacy. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2011.
  32. S. Liu, I. Foster, S. Savage, et al. Who is .com? Learning to parse WHOIS records. In Proceedings of the ACM Internet Measurement Conference (IMC). ACM, 2015.
  33. H. Kido, Y. Yanagisawa, T. Satoh. Protection of location privacy using dummies for location-based services. In Proceedings of the IEEE International Conference on Data Engineering Workshops (ICDEW). IEEE, 2005.
  34. A. Monreale, G. L. Andrienko, N. V. Andrienko, et al. Movement data anonymity through generalization. In Transactions on Data Privacy, 2010.
  35. K. Sui, Y. Zhao, D. Liu, et al. Your trajectory privacy can be breached even if you walk in groups. In Proceedings of the IEEE/ACM International Workshop on Quality of Service (IWQoS), 2016.
  36. Y. Song, D. Dahlmeier, S. Bressan. Not so unique in the crowd: a simple and effective algorithm for anonymizing location data. In PIR@SIGIR, 2014.
  37. S. Garfinkel. Privacy protection and RFID. In Ubiquitous and Pervasive Commerce. Springer, 2006.
  38. J. Domingo-Ferrer and R. Trujillo-Rasua. Microaggregation-and permutation-based anonymization of movement data. In Information Sciences, 2012.
  39. Cynthia Dwork, Adam Smith, Thomas Steinke, Jonathan Ullman, Salil Vadhan. Robust Traceability From Trace Amounts. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 650–669. IEEE, 2015.

Previously filled.

(Cross-)Browser Fingerprinting via OS and Hardware Level Features | Cao, Li, Wijmans

Yinzhi Cao, Song Li, Erik Wijmans; (Cross-)Browser Fingerprinting via OS and Hardware Level Features; In Proceedings of the Network & Distributed System Security Symposium (NDSS); 2017-02-26; 15 pages.


In this paper, we propose a browser fingerprinting technique that can track users not only within a single browser but also across different browsers on the same machine. Specifically, our approach utilizes many novel OS and hardware level features, such as those from graphics cards, CPU, and installed writing scripts. We extract these features by asking browsers to perform tasks that rely on corresponding OS and hardware functionalities.

Our evaluation shows that our approach can successfully identify 99.24% of users as opposed to 90.84% for state of the art on single-browser fingerprinting against the same dataset. Further, our approach can achieve higher uniqueness rate than the only cross-browser approach in the literature with similar stability.
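
A minimal sketch of why restricting a fingerprint to OS- and hardware-level features lets it survive a browser switch. This is not the paper's feature extraction: the feature names (`gpu_renderer`, `cpu_cores`, `writing_scripts`), the sample values, and the choice of SHA-256 over a canonical JSON encoding are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical feature names: assumed stable across browsers on one machine.
CROSS_BROWSER_KEYS = ("gpu_renderer", "cpu_cores", "writing_scripts")

def fingerprint(features: dict, keys=None) -> str:
    """Hash a canonical JSON encoding of the selected features."""
    selected = {k: features[k] for k in sorted(keys or features)}
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

chrome = {"user_agent": "Chrome", "gpu_renderer": "ExampleGPU 1000",
          "cpu_cores": 8, "writing_scripts": ["Latin", "Cyrillic"]}
firefox = dict(chrome, user_agent="Firefox")

# Single-browser fingerprints differ; the cross-browser ones coincide.
print(fingerprint(chrome) == fingerprint(firefox))
print(fingerprint(chrome, CROSS_BROWSER_KEYS) == fingerprint(firefox, CROSS_BROWSER_KEYS))
```

Dropping the browser-specific field is what makes the two hashes agree; the cost is fewer bits of entropy, which is the uniqueness-versus-stability trade-off the abstract alludes to.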



  • Chrome
  • Edge
  • Firefox
  • Internet Explorer
  • Opera
  • Safari
  • Other
    • Maxthon
    • Tor
    • UC


  • Amazon Mechanical Turk
  • MacroWorkers


  • AmIUnique
  • Panopticlick
  • Boda



Yinzhi Cao, Assistant Professor, Computer Science and Engineering Department, Lehigh University.


New Fingerprinting Techniques Identify Users Across Different Browsers on the Same PC; In BleepingComputer; 2017-01-12.


  • Core estimator.
  • [email threads] proposal: navigator.cores; In Archives of WhatWG of the W3C, circa 2014-05.
  • Am I Unique?, at GitHub.
  • anti-aliasing, at Graphics Wikia.
  • Panopticlick: Is your browser safe against tracking?
  • Watched; Wall Street Journal (WSJ).
  • cube mapping; In Jimi Wales’ Wiki.
  • list of writing systems; In Jimi Wales’ Wiki.
  • G. Acar, C. Eubank, S. Englehardt, M. Juarez, A. Narayanan, C. Diaz; “The web never forgets: Persistent tracking mechanisms in the wild,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS ’14), 2014, pp. 674–689.
  • G. Acar, M. Juarez, N. Nikiforakis, C. Diaz, S. Gürses, F. Piessens, B. Preneel; “FPDetective: Dusting the web for fingerprinters,” in Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security (CCS ’13), 2013, pp. 1129–1140.
  • M. Ayenson, D. Wambach, A. Soltani, N. Good, C. Hoofnagle; “Flash cookies and privacy II: Now with HTML5 and ETag respawning,” Available at SSRN 1898390, 2011.
  • S. Berger. You should install two browsers.
  • T. Bigelajzen. Cross browser zoom and pixel ratio detector.
  • K. Boda, A. M. Földes, G. G. Gulyás, S. Imre, “User tracking on the web via cross-browser fingerprinting,” in Proceedings of the 16th Nordic Conference on Information Security Technology for Applications, (NordSec’11), 2012, pp. 31–46.
  • F. Boesch. Soft shadow mapping.
  • Federal Trade Commission (FTC). Cross-device tracking. A celebration. 2015-11.
  • P. Eckersley, “How unique is your web browser?” in Proceedings of the 10th International Conference on Privacy Enhancing Technologies (PETS’10), 2010.
  • S. Englehardt, A. Narayanan, “Online tracking: A 1-million-site measurement and analysis,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, (CCS ’16), 2016.
  • A. Etienne, J. Etienne. Classical suzanne monkey from blender to get your game started with threex.suzanne.
  • D. Fifield, S. Egelman, “Fingerprinting web users through font metrics,” in Financial Cryptography and Data Security. Springer, 2015, pp. 107–124.
  • S. Kamkar. Evercookie.
  • B. Krishnamurthy, K. Naryshkin, C. Wills, “Privacy leakage vs. protection measures: the growing disconnect,” in Web 2.0 Security and Privacy Workshop, 2011.
  • B. Krishnamurthy, C. Wills, “Privacy diffusion on the web: a longitudinal perspective,” in Proceedings of the 18th International Conference on World Wide Web (WWW). ACM, 2009, pp. 541–550.
  • B. Krishnamurthy, C. E. Wills. “Generating a privacy footprint on the internet,” in Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement (IMC). ACM, 2006, pp. 65–70.
  • B. Krishnamurthy, C. E. Wills. “Characterizing privacy in online social networks,” in Proceedings of the First Workshop on Online Social Networks. ACM, 2008, pp. 37–42.
  • P. Laperdrix, W. Rudametkin, B. Baudry, “Beauty and the beast: Diverting modern web browsers to build unique browser fingerprints”, in Proceedings of the 37th IEEE Symposium on Security and Privacy (S&P 2016), 2016.
  • A. Lerner, A. K. Simpson, T. Kohno, F. Roesner, “Internet jones and the raiders of the lost trackers: An archaeological study of web tracking from 1996 to 2016,” in Proceedings of the 25th USENIX Security Symposium (USENIX Security 16), Austin, TX, 2016.
  • J. R. Mayer, J. C. Mitchell, “Third-party web tracking: Policy and technology,” in Proceedings of the 2012 IEEE Symposium on Security and Privacy (SP), 2012, pp. 413–427.
  • W. Meng, B. Lee, X. Xing, W. Lee, “Trackmeornot: Enabling flexible control on web tracking,” in Proceedings of the 25th International Conference on World Wide Web (WWW ’16), 2016, pp. 99–109.
  • H. Metwalley, S. Traverso, “Unsupervised detection of web trackers,” in Globecom, 2015.
  • K. Mowery, D. Bogenreif, S. Yilek, H. Shacham, “Fingerprinting information in javascript implementations,” 2011.
  • K. Mowery, H. Shacham, “Pixel perfect: Fingerprinting canvas in HTML5,” in Proceedings of W2SP, 2012.
  • M. Mulazzani, P. Reschl, M. Huber, M. Leithner, S. Schrittwieser, E. Weippl, F. Wien, “Fast and reliable browser identification with javascript engine fingerprinting,” in Proceedings of W2SP, 2013.
  • G. Nakibly, G. Shelef, S. Yudilevich, “Hardware fingerprinting using HTML5,” arXiv preprint arXiv:1503.01408, 2015.
  • N. Nikiforakis, W. Joosen, B. Livshits, “Privaricator: Deceiving fingerprinters with little white lies,” in Proceedings of the 24th International Conference on World Wide Web, (WWW ’15), 2015, pp. 820–830.
  • N. Nikiforakis, A. Kapravelos, W. Joosen, C. Kruegel, F. Piessens, G. Vigna, “Cookieless monster: Exploring the ecosystem of web-based device fingerprinting,” in Proceedings of the IEEE Symposium on Security and Privacy (SP), 2013.
  • X. Pan, Y. Cao, Y. Chen, “I do not know what you visited last summer – protecting users from third-party web tracking with trackingfree browser,” in Proceedings of the Network & Distributed Systems Symposium (NDSS), 2015.
  • M. Perry, E. Clark, S. Murdoch, “The design and implementation of the Tor Browser [draft][online], United States,” 2015.
  • F. Roesner, T. Kohno, D. Wetherall, “Detecting and defending against third-party tracking on the web,” in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI’12), 2012, pp. 12–12.
  • I. Sánchez-Rola, X. Ugarte-Pedrero, I. Santos, P. G. Bringas, “Tracking users like there is no tomorrow: Privacy on the current internet,” in International Joint Conference. Springer, 2015, pp. 473–483.
  • A. Soltani, S. Canty, Q. Mayo, L. Thomas, C. J. Hoofnagle. “Flash cookies and privacy,” in Proceedings of the AAAI Spring Symposium: Intelligent Information Privacy Management, 2010.
  • US-CERT. Securing your web browser.
  • Do Not Track Policy. In Jimi Wales’ Wiki.
  • Privacy Mode
  • M. Xu, Y. Jang, X. Xing, T. Kim, W. Lee, “Ucognito: Private browsing without tears,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS ’15), 2015, pp. 438–449.
  • T.-F. Yen, Y. Xie, F. Yu, R. P. Yu, M. Abadi, “Host fingerprinting and tracking on the web: Privacy and security implications,” in Proceedings of the Network & Distributed Systems Symposium (NDSS), 2012.

FreeSense:Indoor Human Identification with WiFi Signals | Xin, Guo, Wang, Li, Yu

Tong Xin, Bin Guo, Zhu Wang, Mingyang Li, Zhiwen Yu; FreeSense:Indoor Human Identification with WiFi Signals; 2016-08-11; arXiv:1608.03430.


Human identification plays an important role in human-computer interaction. There have been numerous methods proposed for human identification (e.g., face recognition, gait recognition, fingerprint identification, etc.). While these methods could be very useful under different conditions, they also suffer from certain shortcomings (e.g., user privacy, sensing coverage range). In this paper, we propose a novel approach for human identification, which leverages WIFI signals to enable non-intrusive human identification in domestic environments. It is based on the observation that each person has specific influence patterns to the surrounding WIFI signal while moving indoors, regarding their body shape characteristics and motion patterns. The influence can be captured by the Channel State Information (CSI) time series of WIFI. Specifically, a combination of Principal Component Analysis (PCA), Discrete Wavelet Transform (DWT) and Dynamic Time Warping (DTW) techniques is used for CSI waveform-based human identification. We implemented the system in a 6m*5m smart home environment and recruited 9 users for data collection and evaluation. Experimental results indicate that the identification accuracy is about 88.9% to 94.5% when the candidate user set changes from 6 to 2, showing that the proposed human identification method is effective in domestic environments.
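
Of the three techniques named in the abstract, Dynamic Time Warping is the one that does the actual waveform matching, so here is a minimal sketch of it. The nearest-template classifier, the names `dtw_distance` and `identify`, and the toy waveforms are assumptions; the PCA and DWT preprocessing stages are left out entirely, and real CSI data would be far noisier.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or diagonal match.
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def identify(sample, templates):
    """Return the label whose stored waveform is nearest to `sample` under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))

templates = {
    "alice": [0.0, 1.0, 2.0, 1.0, 0.0],
    "bob": [0.0, 3.0, 0.0, 3.0, 0.0],
}
# The same shape as alice's template, walked at a slower pace.
sample = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]
print(identify(sample, templates))
```

DTW is a natural fit here because two walks by the same person rarely take the same number of samples; warping absorbs the pace difference and compares the shape.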

Background on SilverPush PRISM Ultrasound Beacons for Cross-Device Tracking


  • SilverPush SDK is in adware (apps of phones).
  • Broadcast transmission of signals
    Broadcast television inflicts watermarked commercials on Consumer Premises Equipment (CPE)
  • CPE transmits back to home base with the acknowledgement of the signal received.
    Cross-device linking is developed by correlating time & network provenances.
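
The correlation step can be sketched as a group-by on beacon, network address, and time window. This is not SilverPush's implementation: the record layout, the 30-second window, and the function name `link_devices` are hypothetical, and the IP addresses are documentation-range placeholders.

```python
from collections import defaultdict

def link_devices(tv_airings, acks, window_s=30):
    """Group device IDs whose beacon acks share a beacon, an IP, and a time window.

    tv_airings: {beacon_id: airing_time_s}
    acks: iterable of (device_id, beacon_id, ip, time_s)
    """
    groups = defaultdict(set)
    for device_id, beacon_id, ip, t in acks:
        aired = tv_airings.get(beacon_id)
        # Keep only acks that arrive shortly after the matching broadcast.
        if aired is not None and 0 <= t - aired <= window_s:
            groups[(beacon_id, ip)].add(device_id)
    # A shared household is suggested when several devices fall into one group.
    return [sorted(devs) for devs in groups.values() if len(devs) > 1]

airings = {"ad-42": 1000}
acks = [
    ("phone-A", "ad-42", "203.0.113.7", 1003),
    ("tablet-B", "ad-42", "203.0.113.7", 1005),
    ("phone-C", "ad-42", "198.51.100.9", 1004),
    ("phone-D", "ad-42", "203.0.113.7", 5000),  # too late: ignored
]
print(link_devices(airings, acks))
```

Only the two devices that heard the same beacon from behind the same address within the window are linked; the late ack and the lone device on another network are dropped.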


  • SilverPush
  • Founded 2012
  • Founders
    • Hitesh Chawla
    • Alex Modon
    • Mudit Seth
  • Locations
    • Gurgaon
    • New Delhi
    • offices in San Francisco
  • Investors
    • M&S Partners, JP; Hiro Mashita
    • IDG Ventures
    • 500 Startups; Dave McClure
    • Unilazer Ventures; Ronnie Screwvala
  • Product Lines
    • Unique Audio Beacon (UAB)
      Cross-channel attribution
    • PRISM
      Real-Time TV Ad Analytics
    • Syncads
      <buzz>Moment Marketing</buzz>
  • Customers
    • Airtel
    • Candy Crush
    • Domino’s
    • Kabam
    • Myntra
    • Procter & Gamble
    • Samsung



Via: backfill.

Tracking the Digital Footprints of Personality | Lambiotte, Kosinski

Renaud Lambiotte, Michal Kosinski; Tracking the Digital Footprints of Personality; In Proceedings of the Institute of Electrical & Electronics Engineers (IEEE); Volume 102, Issue 12; 2014-12 (2015); 6 pages.
Teaser: This paper reviews literature showing how pervasive records of digital footprints can be used to infer personality.

tl;dr → Very broad, not even a survey really.  More of an introduction to the area.  The References.


A growing portion of offline and online human activities leave digital footprints in electronic databases. Resulting big social data offers unprecedented insights into population-wide patterns and detailed characteristics of the individuals. The goal of this paper is to review the literature showing how pervasive records of digital footprints, such as Facebook profile, or mobile device logs, can be used to infer personality, a major psychological framework describing differences in individual behavior. We briefly introduce personality and present a range of works focusing on predicting it from digital footprints and conclude with a discussion of the implications of these results in terms of privacy, data ownership, and opportunities for future research in computational social science.


  • myPersonality
    • Michal Kosinski, coordinator
  • Michal Kosinski, publications.
  • Five Factor Model of Personality (FFM)
    1. Openness to Experience
    2. Conscientiousness
    3. Extroversion
    4. Agreeableness
    5. Emotional Stability (contra neuroticism)
  • Previous Work
    • Facebook
    • Twitter


  • Call Data Records (CDR)
  • Community Similarity Networks (CSN)
  • Global Positioning System (GPS)
  • Global System for Mobile Communications (GSM)


  • 54 references
  • M. Kosinski, ‘Measurement and prediction of individual and group differences in the digital environment,’ Ph.D. dissertation, Department of Psychology, Cambridge University, Cambridge, U.K., 2014. 200 pages. Lulu: $11.



  • Philadelphia, PA
  • New York, NY


  • Deterministic cross-device
  • Email address to (browser) cookie
  • Data Supply Side Platform (DSSP)
  • <quote>match Data as a Service</quote>
  • <quote>Deterministically target and retarget customers across their journey on desktop and mobile, with confidence.</quote>
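
"Deterministic" here means joining on a login identifier rather than guessing from behavior. A minimal sketch, assuming both sides log an email address at sign-in and join on a normalized SHA-256 hash of it; the function names, ID formats, and sample addresses are hypothetical and not taken from any vendor's documentation.

```python
import hashlib

def email_key(email: str) -> str:
    """Normalize (trim, lowercase) and hash an email address into a join key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

def match(cookie_logins, device_logins):
    """Join two login logs on the hashed email; returns {cookie_id: [device_id, ...]}."""
    by_key = {}
    for device_id, email in device_logins:
        by_key.setdefault(email_key(email), []).append(device_id)
    return {cookie: by_key.get(email_key(email), []) for cookie, email in cookie_logins}

cookie_logins = [("cookie-1", "Alice@Example.com"), ("cookie-2", "bob@example.com")]
device_logins = [("idfa-9", " alice@example.com "), ("gaid-3", "carol@example.com")]
print(match(cookie_logins, device_logins))
```

The hash keeps the raw address out of the join table, but it is only pseudonymous: anyone holding the same email can recompute the key.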



  • First Round Capital
  • New Atlantic Ventures
  • Charles River Ventures
  • Felicis Ventures
  • Bullpen Capital

Security Advisory Board

  • Adam J. O’Donnell
  • Bruce Schneier
  • Justin Somaini
  • Chris Wysopal
  • Elad Yoran


Content-Based Methods for Predicting Web-Site Demographic Attributes | Kabbur, Han, Karypis

Santosh Kabbur, Eui-Hong Han, George Karypis; Content-Based Methods for Predicting Web-Site Demographic Attributes; In Proceedings of Some Conference, Surely; 2010-01; 11 pages; paywall.


Demographic information plays an important role in gaining valuable insights about a web-site’s user-base and is used extensively to target online advertisements and promotions. This paper investigates machine-learning approaches for predicting the demographic attributes of web-sites using information derived from their content and their hyperlinked structure and not relying on any information directly or indirectly obtained from the web-site’s users. Such methods are important because users are becoming increasingly more concerned about sharing their personal and behavioral information on the Internet. Regression-based approaches are developed and studied for predicting demographic attributes that utilize different content-derived features, different ways of building the prediction models, and different ways of aggregating web-page level predictions that take into account the web’s hyperlinked structure. In addition, a matrix-approximation based approach is developed for coupling the predictions of individual regression models into a model designed to predict the probability mass function of the attribute. Extensive experiments show that these methods are able to achieve an RMSE of 8–10% and provide insights on how to best train and apply such models.
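
For a feel of what "an RMSE of 8–10%" measures, the sketch below rolls page-level predictions up to a site-level distribution and scores it against a known one. Plain averaging is an assumption standing in for the paper's link-structure-aware aggregation, and the three age buckets and all numbers are made up.

```python
import math

def aggregate(page_predictions):
    """Average page-level distributions into one site-level distribution."""
    n = len(page_predictions)
    return [sum(col) / n for col in zip(*page_predictions)]

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length vectors."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Each row: predicted share of visitors in three hypothetical age buckets.
pages = [[0.50, 0.30, 0.20], [0.30, 0.40, 0.30]]
site = aggregate(pages)            # approximately [0.40, 0.35, 0.25]
truth = [0.50, 0.30, 0.20]
print(site, round(rmse(site, truth), 4))
```

With these toy numbers the error comes out near 0.07, i.e. about seven percentage points per bucket on average, which is the scale the abstract's figure refers to.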


  • Jian Hu, Hua-Jun Zeng, Hua Li, Cheng Niu, Zheng Chen, Demographic prediction based on user’s browsing behavior, In Proceedings of the 16th International Conference on World Wide Web (WWW), 2007-05-08, Banff, Alberta, Canada.
  • D Chakrabarti, R Kumar, K Punera. Page-level template detection via isotonic smoothing, In Proceedings of the 16th International Conference on World Wide Web (WWW), 2007, pp 61-70.
  • Joachims, T. Text Categorization with Support Vector Machines: Learning with Many Relevant Features, In Proceedings of the 10th European Conference on Machine Learning (ECML), Chemnitz, Germany, 137-142, 1998
  • B Zhang, H Dai, HJ Zeng, L Qi, T Najm, TB Mah, V Shipunov, Y Li, Z Chen (Microsoft), Predicting demographic attributes based on online behavior. US Patent Publication number 2007/0208728 A1
  • D. Murray, K. Durrell. Inferring demographic attributes of anonymous internet users. In Proceedings of the Web Usage Analysis and User Profiling Workshop, Volume 1836 of Lecture Notes in Computer Science, pages 7-20. Springer, 200?.
  • E Adar, LA Adamic, FR Chen – Xerox Corporation, User profile classification by web usage analysis. US Patent Publication number 2007/0073682 A1
  • Vladimir Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995
  • George Karypis. YASSPP: better kernels and coding schemes lead to improvements in protein secondary structure prediction, In Journal of Proteins, 2006-08, Volume 64-3, pages 575-586
  • Huzefa Rangwala, George Karypis, Building multiclass classifiers for remote homology detection and fold recognition, In Journal of BMC Bioinformatics, 2006, vol 7, page 455
  • Ricardo Baeza-Yates, Berthier Ribeiro-Neto. Modern Information Retrieval, Addison Wesley Longman Publishing Co. Inc.
  • E. Michailidou, S. Harper, S. Bechhofer. Visual complexity and aesthetic perception of web pages, In Proceedings of the 26th ACM International Conference on Design of Communication (SIGDOC08), Lisbon, Portugal, 2008-09-22.
  • E. H. Moore. On the reciprocal of the general algebraic matrix, In Bulletin of the American Mathematical Society 26: 394-395. 1920.
  • Roger Penrose. A generalized inverse for matrices, In Proceedings of the Cambridge Philosophical Society 51: 406-413. 1955.
  • Comscore
  • Quantcast
  • Alexa Top Sites

Revisiting the Uniqueness of Simple Demographics in the US Population | Philippe Golle

Philippe Golle; Revisiting the Uniqueness of Simple Demographics in the US Population; In Proceedings of the Workshop on Privacy in the Electronic Society (WPES); 2006-10-30; 4 pages.


According to a famous study [10] of the 1990 census data, 87% of the US population can be uniquely identified by gender, ZIP code and full date of birth. This short paper revisits the uniqueness of simple demographics in the US population based on the most recent census data (the 2000 census). We offer a detailed, comprehensive and up-to-date picture of the threat to privacy posed by the disclosure of simple demographic information. Our results generally agree with the findings of [10], although we find that disclosing one’s gender, ZIP code and full date of birth allows for unique identification of fewer individuals (63% of the US population) than reported in [10]. We hope that our study will be a useful reference for privacy researchers who need simple estimates of the comparative threat of disclosing various demographic data.

Simple Demographics Often Identify People Uniquely | Latanya Sweeney

Latanya Sweeney; Simple Demographics Often Identify People Uniquely; Data Privacy Working Paper 3; Carnegie Mellon University; Pittsburgh, PA; 2000; 34 pages.


In this document, I report on experiments I conducted using 1990 U.S. Census summary data to determine how many individuals within geographically situated populations had combinations of demographic values that occurred infrequently. It was found that combinations of few characteristics often combine in populations to uniquely or nearly uniquely identify some individuals. Clearly, data released containing such information about these individuals should not be considered anonymous. Yet, health and other person-specific data are publicly available in this form. Here are some surprising results using only three fields of information, even though typical data releases contain many more fields. It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. About half of the U.S. population (132 million of 248 million or 53%) are likely to be uniquely identified by only {place, gender, date of birth}, where place is basically the city, town, or municipality in which the person resides. And even at the county level, {county, gender, date of birth} are likely to uniquely identify 18% of the U.S. population. In general, few characteristics are needed to uniquely identify a person.
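The uniqueness figures above boil down to counting how many people share each combination of quasi-identifiers. A minimal sketch of that counting, assuming a list of per-person records, follows. The field names and the four toy records are invented for illustration and are not census data; the original studies work from census summary tables rather than individual rows.

```python
from collections import Counter

def fraction_unique(records, fields):
    """Fraction of records whose combination of `fields` is shared with nobody else."""
    keys = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys)

# Hypothetical toy population.
people = [
    {"zip": "15213", "county": "Allegheny", "gender": "F", "dob": "1970-01-02"},
    {"zip": "15213", "county": "Allegheny", "gender": "F", "dob": "1970-01-02"},
    {"zip": "15213", "county": "Allegheny", "gender": "M", "dob": "1970-01-02"},
    {"zip": "15217", "county": "Allegheny", "gender": "M", "dob": "1970-01-02"},
]

by_zip = fraction_unique(people, ("zip", "gender", "dob"))
by_county = fraction_unique(people, ("county", "gender", "dob"))
```

Coarsening the place field from ZIP to county makes more people collide on the same key, which is the same direction of effect as the 87% versus 18% figures quoted above.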


(but different)

L. Sweeney; Uniqueness of Simple Demographics in the U.S. Population; Data Privacy Lab White Paper Series LIDAP-WP4; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA; 2000; 34 pages; abstract; catalog.

Via: backfill

Location Terminology Guide: The Language of Location | Mobile Marketing Association (MMA)

Location Terminology Guide: The Language of Location; Mobile Marketing Association (MMA); 2013-09; 24 pages; landing; regwalled (pay with PII).

Table of Contents

  1. Introduction: The Language Of Location
  2. Location Data & Signals
  3. Location Targeting & Strategies
  4. Location Measurement & Metrics
  5. Glossary Of Terms
  6. Who We Are
  7. About MMA


  • Digital Advertising Alliance (DAA)

DAA’s Application of Self-Regulatory


The Location Committee’s Location Terminology Guide Working Group:

Name | Title | Organization
Leo Scullin | Global Industry Initiatives | MMA
Monica Ho | VP of Marketing | xAd
Jake Moskowitz | VP, Innovations Lab | Nielsen
Dan Silver | Director of Marketing | PlaceIQ
David Tannenbaum | Associate Director | ThinkNear
Allison Merlino | VP of Sales, Northeast | Millennial Media
Shannon Denison | VP, Products & Insights | Voltari
Sean Trepeta | President | Mobiquity Networks
Renee Soucy | Writer | Voltari
Alexa Irish | Senior Strategic Designer | Nielsen


MMA Introduces Location Terminology Guide; press release; Mobile Marketing Association (MMA); 2013-09-25.
Teaser: MMA committee unveils lexicon to define mobile location practices and educate marketers on the methods available across data, measurement and technology


  • Mobile Guidance; Digital Advertising Alliance (DAA)
  • Principles to the Mobile Environment; Digital Advertising Alliance (DAA); 2013-07.

Header Enrichment or ISP Enrichment? Emerging Privacy Threats in Mobile Networks | Vallina-Rodriguez, Sundaresan, Kreibich, Paxson

Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Vern Paxson; Header Enrichment or ISP Enrichment? Emerging Privacy Threats in Mobile Networks; In Proceedings of the ACM SIGCOMM Workshop on Hot Topics in Middleboxes and Network Function Virtualization (HotMiddlebox 2015, huh? now you’re just being silly); 2015-08-17; 6 pages; landing.


HTTP header enrichment allows mobile operators to annotate HTTP connections via the use of a wide range of request headers. Operators employ proxies to introduce such headers for operational purposes, and—as recently widely publicized—also to assist advertising programs in identifying the subscriber responsible for the originating traffic, with significant consequences for the user’s privacy. In this paper, we use data collected by the Netalyzr network troubleshooting service over 16 months to identify and characterize HTTP header enrichment in modern mobile networks. We present a timeline of HTTP header usage for 299 mobile service providers from 112 countries, observing three main categories:

  1. unique user and device identifiers (e.g., IMEI and IMSI)
  2. headers related to advertising programs, and
  3. headers associated with network operations.


  • HTTP header enrichment
  • Netalyzr
    • Netalyzer-for-Android
  • Verizon Precision Market Insights
  • The IETF’s Service Function Chaining (SFC) standards are vague about whether injected headers are good or bad (should be removed).
  • Data
    • Collected: 2013-11 → 2015-03.
    • 112 countries
    • 299 operators
  • Belief: no M?NO is yet cracking TLS to insert HTTP headers into the encrypted stream.
  • Suggested as an ID-less method of identification: device-unique allocation of the (routable) IPv6 space to identify the device, in addition to routing to it.
  • RFC 7239, Forwarded HTTP Extension; A. Petersson, M. Nilsson (Opera); IETF; 2014-06.
  • Cessation Timeline
    • 2014-10 → Vodafone (ZA) ceased their practices in 2014-10; nothing to see there now.
    • 2014-11 → AT&T ceased their practices in 2014-11.
    • 2015-03 → Verizon was not respecting opt-out through 2015-03 (respecting it would show as the X-UIDH header no longer being inserted).
  • Continuation
    • Verizon continues the X-UIDH header insertion.
  • The X-Forwarded-For header carries extra freight in T-Mobile (DE)
  • Carrier-Grade NAT (CGN), as per RFC 6598, IANA-Reserved IPv4 Prefix for Shared Address Space (2012-04)
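RFC 6598 reserves 100.64.0.0/10 as the shared address space for carrier-grade NAT, so an address in that block appearing in a header such as X-Forwarded-For hints that the client sits behind an operator's CGN. A minimal sketch of that check follows. The function names and the sample header value are illustrative assumptions, and treating the first X-Forwarded-For entry as the client address is a common convention rather than a guarantee.

```python
import ipaddress

# Shared address space reserved by RFC 6598 for carrier-grade NAT.
CGN_SHARED_SPACE = ipaddress.ip_network("100.64.0.0/10")

def is_cgn_address(addr):
    """True if `addr` is an IPv4 address inside the RFC 6598 shared space."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in CGN_SHARED_SPACE

def first_forwarded_hop(x_forwarded_for):
    """Return the first (leftmost) entry of an X-Forwarded-For header value."""
    return x_forwarded_for.split(",")[0].strip()

# Hypothetical header value as a server might receive it.
header_value = "100.72.14.9, 203.0.113.7"
client_hop = first_forwarded_hop(header_value)
behind_cgn = is_cgn_address(client_hop)
```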


Table 1 & Table 2; Table 3 (not shown)

HTTP Header | Operator | Country | Estimated Purpose
x-up-calling-line-id | Vodacom | ZA | Phone Number
msisdn | Orange | JO | MSISDN
x-nokia-msisdn | Smart | PH |
tm_user-id | Movistar | ES | Subscriber ID
x-up-3gpp-imeisv | Vodacom | ZA | IMEI
lbs-eventtime | SmarTone | HK | Timestamp
lbs-zoneid | | | Location
x-acr | AT&T | US | unstated, an identifier
x-amobee-1 | Airtel | IN |
x-amobee-2 | Singtel | SG |
x-uidh | Verizon | US |
x-vf-acr | Vodacom | ZA |
 | Vodafone | NL |
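Because the operator's proxy adds these headers in flight, they are visible only at the receiving server. A minimal sketch of a server-side check, assuming the request headers are already available as a dictionary, follows. The header names are the ones listed in the table above; the function name and the sample request are hypothetical, and this is not how Netalyzr itself is implemented.

```python
# Header names taken from the table above, lowercased for comparison.
ENRICHMENT_HEADERS = {
    "x-up-calling-line-id", "msisdn", "x-nokia-msisdn", "tm_user-id",
    "x-up-3gpp-imeisv", "lbs-eventtime", "lbs-zoneid",
    "x-acr", "x-amobee-1", "x-amobee-2", "x-uidh", "x-vf-acr",
}

def find_enrichment_headers(request_headers):
    """Return the subset of headers whose names match a known enrichment header.

    HTTP header names are case-insensitive, so names are lowercased first.
    """
    return {
        name: value
        for name, value in request_headers.items()
        if name.lower() in ENRICHMENT_HEADERS
    }

# Hypothetical request as seen by the server.
sample_request = {
    "Host": "example.com",
    "User-Agent": "ExampleBrowser/1.0",
    "X-UIDH": "OTgxNTk2NDk0ADJV",
}
injected = find_enrichment_headers(sample_request)
```

A client that compares the headers it sent against what the server reports receiving can see which ones were added along the way, provided the connection is plain HTTP.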


  • Access Point Name (APN)
  • GPRS
  • HTTP
  • IMSI
  • IMEI
  • J2ME
  • Location-Based Services (LBS)
  • Mobile Country Code (MCC)
  • Mobile Network Code (MNC)
  • Mobile Network Operator (MNO)
  • Mobile Virtual Network Operator (MVNO)
  • Hong Kong Metro (subway) (MTR)
  • Service Function Chaining (SFC)
  • SIM
  • Transport-Layer Security (TLS)
  • Unique Identifier (UID); contra the specific UUID or GUID
  • Virtual Private Network (VPN)
  • WAP


A significant number of newspaper articles, vulgarizations & bloggist opinements.

The Wealthiest ZIP Codes in America | Experian

Via: Experian



The Richest Zip Codes in America in One Map; Jeff Desjardins; In Some Blog entitled Visual Capitalist; 2015-08-11.

Hoodsquare: Modeling and Recommending Neighborhoods in Location-based Social Networks | Zhang, Noulas, Scellato, Mascolo

Amy Xian Zhang, Anastasios Noulas, Salvatore Scellato, Cecilia Mascolo; Hoodsquare: Modeling and Recommending Neighborhoods in Location-based Social Networks; In The Human Journal; Vol. 1, No. 1; 2013; 15 pages; landing


Information garnered from activity on location-based social networks can be harnessed to characterize urban spaces and organize them into neighborhoods. In this work, we adopt a data-driven approach to the identification and modeling of urban neighborhoods using location-based social networks. We represent geographic points in the city using spatio-temporal information about Foursquare user check-ins and semantic information about places, with the goal of developing features to input into a novel neighborhood detection algorithm. The algorithm first employs a similarity metric that assesses the homogeneity of a geographic area, and then with a simple mechanism of geographic navigation, it detects the boundaries of a city’s neighborhoods. The models and algorithms devised are subsequently integrated into a publicly available, map-based tool named Hoodsquare that allows users to explore activities and neighborhoods in cities around the world.

Finally, we evaluate Hoodsquare in the context of a recommendation application where user profiles are matched to urban neighborhoods. By comparing with a number of baselines, we demonstrate how Hoodsquare can be used to accurately predict the home neighborhood of Twitter users. We also show that we are able to suggest neighborhoods geographically constrained in size, a desirable property in mobile recommendation scenarios for which geographical precision is key.
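To make the "similarity metric, then boundary detection" idea concrete, here is a deliberately simplified sketch: each grid cell is a count of venue categories, neighbouring cells are compared by cosine similarity, and a boundary is declared wherever similarity drops below a threshold. The one-dimensional strip of cells, the cosine metric, the 0.8 threshold and the venue counts are all assumptions chosen for illustration. This is not the Hoodsquare algorithm, which works on richer spatio-temporal features.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_strip(cells, threshold):
    """Walk a strip of adjacent cells, starting a new group at each dissimilar step."""
    groups = [[0]]
    for i in range(1, len(cells)):
        if cosine_similarity(cells[i - 1], cells[i]) >= threshold:
            groups[-1].append(i)   # similar enough: same neighbourhood
        else:
            groups.append([i])     # similarity dropped: boundary detected
    return groups

# Hypothetical venue-category counts for three adjacent cells.
cells = [
    {"Bar": 8, "Cafe": 2},
    {"Bar": 6, "Cafe": 3},
    {"Office": 9, "Cafe": 1},
]
neighbourhoods = group_strip(cells, threshold=0.8)
```

The first two cells are dominated by bars and land in one group, while the office-heavy third cell starts a new one.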



  • Zillow Estate Agency. Neighborhood boundaries.
  • Airbnb. Introducing neighborhoods.
  • M. Ankerst, M. M. Breunig, H. Kriegel, J. Sander. OPTICS: Ordering points to identify the clustering structure. In Proceedings of the Conference of the Special Interest Group on the Management of Data (SIGMOD), pages 49–60, 1999.
  • United States Census Bureau. Census boundaries. Tiger Shape Files.
  • T. Butler and G. Robson. Social Capital, Gentrification and Neighbourhood Change in London: A Comparison of Three South London Neighbourhoods. Urban Studies, 38, 2001.
  • Foursquare Venue Categories.
  • J. Chang and E. Sun. Location: How Users Share and Respond To Location-based Data on Social Networking Sites. In International AAAI Conference on Web and Social Media (ICWSM), 2011.
  • R. J. Chaskin. Perspectives on Neighborhood and Community: A Review of the Literature. In Social Service Review, 71, 1997.
  • J. Cranshaw, R. Schwartz, J. I. Hong, N. Sadeh. The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City. In International AAAI Conference on Web and Social Media (ICWSM), 2012.
  • J. Cranshaw, T. Yano. Seeing a Home Away From the Home: Distilling Proto-neighborhoods From Incidental Data with Latent Topic modeling. In Proceedings of the NIPS Workshop on Computational Social Science and the Wisdom of Crowds, 2010.
  • M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten. The WEKA Data Mining Software: an Update. In SIGKDD Explorations Newsletter, 11(1):10–18, 2009.
  • H. Hotelling. Stability in Competition. In The Economic Journal, 39:41–57, 1929.
  • M. Mehaffy, S. Porta, Y. Rofè, N. Salingaros. Urban Nuclei and the Geometry of Streets: The ’emergent neighborhoods’ Model. In Urban Design International, 15(1):22–46, 2010.
  • A. Noulas, S. Scellato, C. Mascolo, M. Pontil. An Empirical Study of Geographic User Activity Patterns in Foursquare; In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. 2011. pages 570-573.
  • D. Quercia, J.P. Pesce, V. Almeida. Psychological Maps 2.0: A Web Gamification Enterprise Starting in London. In Proceedings of the Conference on the World Wide Web (WWW). 2013.
  • Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993.
  • P.J. Rentfrow, S.D. Gosling, J. Potter. A Theory of the Emergence, Persistence, and Expression of Geographic Variation in Psychological Characteristics. In Perspectives on Psychological Science, 3(5):339–369, 2008.
  • A. Roux. Investigating Neighborhood and Area Effects on Health. In American Journal of Public Health, 91:1783–1789, 2001.
  • J. Sander, X. Qin, Z. Lu, N. Niu, A. Kovarsky. Automatic Extraction of Clusters From Hierarchical Clustering Representations. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2003.
  • N. Sastry, A. R. Pebley, M. Zonta. Neighborhood Definitions and The Spatial Dimension Of Daily Life in Los Angeles. California Center for Population Research, 2002.
  • C. M. Tiebout. A Pure Theory of Local Expenditures. In Journal of Political Economy, 64:416–424, 1956.
  • New York Times. New York’s Little Italy, Littler by the Year. 2011-02-22.
  • L. Weiss, D. Ompad, S. Galea, D. Vlahov. Defining Neighborhood Boundaries for Urban Health Research. In American Journal of Preventive Medicine, 32:154–159, 2007.

Via: backfill

Compendium on Verizon’s Precision Market Insights, Precision ID, X-UIDH Header


  • Unique IDentifier Header (UIDH)
  • The (silently-added) HTTP header X-UIDH
  • X-UIDH: OTgxNTk2NDk0ADJVquRu5NS5+rSbBANlrp+13QL7CXLGsFHpMi4LsUHw
  • Behaviors (based on information & belief)
    • X-UIDH changes weekly
    • The UIDH identifier indexes demographic, persona and browsing history-type records of the subscriber (of the handset or PSTN or paying account).
  • Demonstrators
  • Trade Names
    • Verizon Selects
    • Relevant Mobile Advertising
    • Verizon’s Precision Market Insights
  • Precision Market Insights, a partner
  • Availability
    • No 1st party program
    • Something vague about making data available via partnerships.
  • Capabilities
    • Demographic segments on mobile
    • loyalty
    • retargeting
  • Partners
    • BlueKai
    • BrightRoll
    • RUN
  • Pilot
    • PrecisionID
    • Kraft with Starcom MediaVest group
    • 1-800-Flowers
  • Separately
    • Precision has an in-stadium identification scheme
  • Who
    • Colson Hillier, VP, Precision Market Insights
    • Debra Lewis, press relations, Verizon.
    • Adria Tomaszewski, press relations, Verizon.
    • Kathy Zanowic, senior privacy officer, Verizon.
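The sample X-UIDH value listed above is valid base64, and decoding it shows a run of ASCII digits, a NUL byte, and then opaque binary. The sketch below does only that decoding. What the digits or the trailing bytes mean is not documented in these notes, so the split at the NUL byte is an observation about this one sample, not a format specification.

```python
import base64

# Sample header value quoted in the notes above.
sample_uidh = "OTgxNTk2NDk0ADJVquRu5NS5+rSbBANlrp+13QL7CXLGsFHpMi4LsUHw"

raw = base64.b64decode(sample_uidh)

# Split at the first NUL byte: leading ASCII digits, then opaque bytes.
prefix, _, opaque = raw.partition(b"\x00")

print(prefix.decode("ascii"))   # the leading digit string
print(len(opaque), "opaque bytes:", opaque.hex())
```

Since the value is the same on every request until it rotates, any site receiving it can use it as a stable identifier without needing to interpret its contents.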


In archaeological order; derivative works on top, original reportage lower down.


  • Open RTB v2.1 Specification, as implemented by MoPub; on DropBox; updated 2015-02-13; landing.
    <quote>2015-02-15: Removed passing of UIDH parameter and removed all references in the specification</quote>
  • HTTP Header Enrichment Overview; Documentation; Juniper; 2013-02-14.
    • HTTP Header insertion X-MSISDN
    • MobileNext Broadband Gateway for an Access Point Name (APN)
    • <quote>installing one or more Multiservices Dense Port Concentrators (MS-DPCs) in the broadband gateway chassis</quote>