2024-12-07

Web Fingerprinting


Sites can use JavaScript to query your workstation and build a "fingerprint" of your hardware and software that uniquely identifies you.  In the old days, this was done by storing a third-party cookie.  Newer techniques are more sophisticated, and the results are stored outside of your control.

This can harm you in ways that are hard to see.  For example, while searching for airline tickets, if you bounce between airlines looking for the best fare, you will often see the fare creep upward each time you return to review the price.  The hope is to scare you into committing to the purchase before it rises higher.

They are able to do this because they can tell you are the same computer returning to the same site.

This is fun to review, but it is unclear what actions you can take to prevent it.

The tracking is tied to many things, including your external-facing IP address (which is hard to change), the cookies stored on your computer, the browser version being used, the video card, and the number of fonts installed.  This is called a "fingerprint."  This tags your machine, not you as a user-id.
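To make the idea concrete, here is a minimal sketch in Python of how a handful of attributes could be boiled down into one fingerprint.  This is only an illustration, not how any particular tracker does it; the attribute names and values are made up, and real sites collect far more than four.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Return a short, stable hash of a set of browser/machine attributes."""
    # Sort the keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical attribute values, for illustration only.
visit_1 = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0",
    "video_card": "Example GPU 1234",
    "font_count": 180,
    "timezone": "America/Denver",
}
visit_2 = dict(visit_1)                       # same machine, a later visit
other_machine = dict(visit_1, font_count=97)  # one attribute differs

print(fingerprint(visit_1) == fingerprint(visit_2))        # True
print(fingerprint(visit_1) == fingerprint(other_machine))  # False
```

Notice that nothing here is stored on the workstation.  The site can recompute the same value on every visit, which is why the same machine is recognized when it returns.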

Am I Unique

Click the link below to display your workstation's fingerprint.  The AmIUnique and EFF.org sites both show your digital "fingerprint," and the results are beyond interesting.

https://amiunique.org

See also this site (same idea, more educational):
https://coveryourtracks.eff.org/learn

The gist is this:  Yes, you are probably unique and can be easily identified.  This happens even if you are using a boring PC with a normal video card, and a normal operating system.  Click the link to see how identifiable your PC is.


For example, my first attempt at AmIUnique said, yes, I am absolutely unique among all of the people who have visited the site (compared with the 3 million current visitors in their database):



The test shows 20 or so attributes.  Some are marked Green (which means you are not differentiated from other users -- you blend into the crowd) and some are marked Yellow or Red, which means this can be used to target your machine.  The more Red you have, the more unique you are.


User Agent Tag

The "User Agent" tag is particularly interesting.  This is what your browser broadcasts to the world, and it is transmitted in an HTTP "header" with every request.  For years, browsers have lied about this -- telling websites they are a more-or-less generic browser.  Amazingly, almost all browsers (including Chrome, IE, and others) claim to be Mozilla.

In theory, everyone should look the same, but sadly browsers also return a version number so sites can detect old browsers and display the page differently to accommodate them.  This helps identify your computer.

I use Mozilla Firefox, and it shows as "Mozilla" (just because almost all browsers claim this), but at the tail end of the string are the real browser name and a version number.  In effect it reports, "I am unusual, a bit odd, and identifiable."  This earns me a failing grade in AmIUnique's calculations:

My friend checked on his iPhone (Safari) and on his Surface Pro (MS Edge), and both showed a similarly bad score.  In other words, his browser, which was boring-as-hell, still gave a red score in this category, and when all was said and done, he was unique compared to the other 3 million people.  I am a bit confused about how this can be.

Changing the UserAgent

Your machine's total fingerprint is a collection of discoveries, and the User Agent is one of the most important.  Because I use Firefox, it marks me as a particularly odd duck.  But keep in mind, Microsoft Edge and Chrome have the same issue.
 
This is not recommended, but you can override what the browser sends by manually setting the User-Agent to some other, more generic value.  With this, you can try to blend into the crowd.  Indeed, this worked at a superficial level and my score improved.  As you will see, this caused problems in later testing.

Change the UserAgent/Header 

In Firefox:
a.  Browse to this URL:  "about:config"  (no quotes)
b.  Accept the warning message
c.  Type:  "general.useragent.override"  (no quotes, case-sensitive!)
d.  Select (*) String, click "+"
e.  Paste the value below (here, I am pasting Microsoft Edge's string),
     click the checkmark, and close the window.

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86

In Chrome/Edge:
a.  From a browser page, press F12 (opening Developer Tools)
b.  Press Ctrl-Shift-P  (opening a command menu)
c.  Type "network conditions"  (no quotes)
d.  In the bottom section, scroll down, locating "User Agent"
e.  Uncheck [ ] Use Browser Default
f.  Paste the value from above (see Firefox)
g.  Close the developer tools box ("x"), upper-right

Results:
This changed my UserAgent flag from a red 0.27% (which means highly identifiable) to a yellow 16.22% (which means in a crowded field).
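To put those two percentages in perspective, here is a rough sketch in Python that converts "share of visitors with the same value" into bits of identifying information.  This is my own back-of-the-envelope arithmetic, not AmIUnique's scoring method, and it treats the User Agent in isolation (in practice the attributes overlap, so the bits do not simply add up).

```python
import math

def bits_of_information(share_percent: float) -> float:
    """Bits revealed by an attribute value shared by share_percent of visitors."""
    return -math.log2(share_percent / 100.0)

before = bits_of_information(0.27)    # my original Firefox User Agent
after = bits_of_information(16.22)    # after switching to the Edge string

# Bits needed to single out one machine among 3 million visitors.
needed = math.log2(3_000_000)

print(f"before: {before:.1f} bits")   # about 8.5
print(f"after:  {after:.1f} bits")    # about 2.6
print(f"needed: {needed:.1f} bits")   # about 21.5
```

In other words, the change took away roughly 6 of the 21 or so bits a site needs to pick one machine out of 3 million.  That helps, but the remaining attributes (fonts, video card, and so on) can easily make up the difference.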


Note: This string still has a version number.  If you remove the version number, you will become even more unique because nobody else will have done that.  Even if you manually set a different (say, an older and more common) version number, over time, as everyone upgrades to newer versions of Chrome, your value will slowly become more unique -- think more odd.  If you go this route, periodically reset it to a new baseline.
Chrome, realizing this is a tracking problem, has been reducing the detail in this string.  The minor version numbers are now reported as zeros (the "131.0.0.0" in the strings in this article), but the major version is still sent (2024.12).  See this article:
https://developers.google.com/privacy-sandbox/blog/user-agent-reduction-android-model-and-version

The latest Edge browser (2024.12) shows:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86

The latest Chrome browser (2024.12) shows:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
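The Edge and Chrome strings above are nearly identical.  A small Python sketch makes the difference easy to see by splitting each string into its Name/Version tokens.  The helper name product_tokens is my own, and the regular expression is a simplification that works for these two strings, not a full User-Agent parser.

```python
import re

EDGE = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86")
CHROME = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36")

def product_tokens(user_agent: str) -> dict:
    """Map each product name in a User-Agent string to its version."""
    # Drop the parenthesized comments, then collect the Name/Version pairs.
    without_comments = re.sub(r"\([^)]*\)", " ", user_agent)
    return dict(re.findall(r"([A-Za-z]+)/([\d.]+)", without_comments))

edge = product_tokens(EDGE)
chrome = product_tokens(CHROME)

print(sorted(set(edge) - set(chrome)))     # ['Edg']
print(edge["Chrome"] == chrome["Chrome"])  # True
print(edge["Mozilla"])                     # 5.0
```

Both claim to be Mozilla, Chrome, and Safari all at once.  The only thing separating Edge from Chrome is the trailing "Edg" token and its detailed version number.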

Find the current User Agent tags here:
https://explore.whatismybrowser.com/useragents/explore

In the future, if Chrome and the others finally get around to removing the version number from this string, revert your manual change and let the browser broadcast its default value.  To undo the change in Firefox, return to this config value (about:config) and delete the newly-added key.

But there is a problem: 

Although changing my UserAgent improved the browser's "hiding-in-the-crowd-ness", it prevented me from logging into GMail or into Google's blogging tool (this article).  Google said, "this is an unsupported browser," even though millions of other browsers use the same string.  This means something else is leaking -- something else is telling the website that there is a mismatch between the "header" and the browser itself. 
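I do not know what Google actually checks, so treat the following as a guess.  One plausible explanation is that the site compares what the header claims against what the browser reveals through JavaScript.  The Python sketch below shows the idea; the function names and the "observed_engine" value (which a site would have to work out with its own client-side tests) are hypothetical.

```python
def claimed_engine(user_agent: str) -> str:
    """Guess which browser engine a User-Agent string claims to be."""
    if "Chrome/" in user_agent or "Edg/" in user_agent:
        return "chromium"
    if "Firefox/" in user_agent:
        return "gecko"
    return "unknown"

def looks_spoofed(user_agent: str, observed_engine: str) -> bool:
    """True when the header's claim disagrees with the engine actually observed."""
    claimed = claimed_engine(user_agent)
    return claimed != "unknown" and claimed != observed_engine

# The Edge string from this article, sent from a browser that is really Firefox.
spoofed_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86")
honest_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0"

print(looks_spoofed(spoofed_ua, "gecko"))  # True
print(looks_spoofed(honest_ua, "gecko"))   # False
```

If something like this is happening, changing the header alone makes the browser stand out rather than blend in, because very few visitors send an Edge header from a Firefox engine.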



Trouble on the Horizon

The next area of the fingerprint is the list of fonts.  With JavaScript, websites can query the fonts on your system.  For example, my system has 180 fonts and is flagged as highly unique.  Your font list is probably just as unique.


I don't know how to defend against this.  It seems that no matter what I do here, my list would still be unique.  The only good solution would be a setting in the browser that says, "Return a simplified, standard list of fonts" -- but all browsers would have to agree to do this, and this is unlikely.

Perhaps another solution would be to periodically add or remove inconsequential fonts from your operating system's font list, just to add variation.  This would break the tie to your old fingerprint, causing a new fingerprint to be generated.  Naturally, this makes your list even more specialized.  I'd pick an obscure font and remove the Bold, then later the Italic, then later add the font back.  This is a lot of monkey work.  Perhaps I might write a program to automate this (check back in the future).


More Trouble

In the AmIUnique site, there are two references to video/image settings, and the website can tunnel into the hardware and tag your machine's video card.  This article describes the technique:  

https://blog.amiunique.org/an-explicative-article-on-drawnapart-a-gpu-fingerprinting-technique

It identifies your exact card by having the browser draw hidden test graphics and timing them to detect minor variations in GPU processing speed.  With this, it is able to identify your video card uniquely -- even among all the other identical video cards sold by the same company.

The article discusses mitigations, but none are palatable, and this too seems hard to defend against.  On the plus side, this is just one of several parameters that need to align in order to detect your machine.


JavaScript

Of course, most of this nonsense can be stopped by disabling JavaScript on a site-by-site basis.  Firefox has a neat plugin called "Disable JavaScript" (see this link:  https://addons.mozilla.org/en-US/firefox/addon/disable-javascript).

This plugin immediately stops JavaScript-based fingerprinting (your IP address and headers, such as the User Agent, are still sent).  It is also (sometimes!) useful for bypassing paywalls on news-reading sites.*  But it does not work well on sites such as Amazon, your airline, or this blogging tool -- those sites require JavaScript for basic functionality and whine when JavaScript is gone.


I use this plugin.  On a site-by-site basis, I'll disable or enable JavaScript.  It works fairly well, if inconsistently.  The plugin keeps track of the sites where you use it, making it transparent once set.  The trouble is, it does not work on all sites.  But this is a low-risk, low-effort thing to try.

* Journalists deserve to be paid for their work, but the current design, where you have to subscribe to each news site, is unmanageable.  When you randomly and infrequently visit a publisher from a news aggregator service, such as Google News, the subscription idea doesn't work very well.


Returning to Cookies

Sites can store a fingerprint as a cookie on your local workstation (but most now store it in their own databases, where you have no visibility).

Regardless, it is still wise to manage your third-party cookies, and all browsers have controls in this area.  For example, in Firefox, under "Settings," "Privacy & Security," you can block cross-site and third-party cookies:



Chrome and Edge have similar settings, not illustrated.

2024.12 - Plus, this week, Firefox announced they are removing the "Do Not Track" feature -- why?  Neither reputable nor disreputable sites paid attention to it.

Painfully, you could periodically clear all cookies (doing so from each browser you use).  This resets default file folder settings, forgets previously-typed user IDs and passwords, and causes other such nonsense.  When cleared, it is vaguely annoying.  I used to clear all cookies frequently -- but do so less often now, just because the fingerprints are usually stored on that site's servers.

Clearing cookies only helps if the website stores the fingerprint in a cookie; your workstation would re-generate the same fingerprint, regardless.  Of course, cookies can store other information too.  Cookies are not inherently bad, but they can be used by third parties for cross-site data.  If you ever wondered why you are seeing a ton of ads for wrist watches after clicking a wrist-watch ad just once, blame your cookies.

However, it is 'sometimes' helpful to clear all cookies prior to price-matching (airlines, hotels.com, and other such sites).  Do this on your initial foray into the price comparisons, then a second time when you are ready to commit to the purchase.  Or better, use one computer for the initial research and a second to complete the purchase.  If that second computer were on a separate IP-address-range (such as a cellular hotspot), that would be even better.  


Brave Browser

Finally, consider a different browser, at least for some surfing activities.  
https://brave.com

This browser offers a VPN (a paid add-on, not automatic) and private windows that route through Tor, either of which helps with the IP address problem.
It resists fingerprinting (by randomizing some of the values sites use to build a fingerprint)
It blocks 3rd party and cross-site cookies
It blocks most ads (except their own)
It has the ability to make BAT (micropayments) to publishers

But they have bills to pay, so they present ads on their own landing page, and the ads often target crypto.  ZDNet had comments in this article:  https://www.zdnet.com/article/brave-browser-the-bad-and-the-ugly  and this Wikipedia article talks about other advertising-related issues with the Brave browser:  https://en.wikipedia.org/wiki/Brave_(web_browser)#Controversies

I've not yet tried Brave, but am considering it for some browsing tasks (competitive purchases, airbnb, airlines, etc.).

Final Notes: 
You may have noticed this article is a little scatter-brained.  I'm still pondering what is happening here.  Regardless, I found it interesting.  When I reviewed the reports from my friend's iPhone and Surface laptop, I came to the conclusion that all machines are uniquely fingerprinted.  In other words, they know when you've returned.

I'm considering changing my asset-tag/management program, DeviceID, so that it periodically changes fonts and a few other settings, in an effort to make my calculated fingerprint less predictable.  This article represents my first thoughts on the matter.
 


Your comments welcome.

Related links: 
A keyliner article:  Network-wide blocking of advertising.  I do this and highly recommend it:
https://keyliner.blogspot.com/2018/01/network-wide-blocking-of-ads-tracking.html