05 Aug 2020 - by 'Maurits van der Schee'
I have created a free GDPR scanner at: TQdev.com/gdpr-scanner. You can use it to see which domains your website connects to, who is running those domains and where they are hosted. Ideally you see only one entry: the domain that you entered in the web browser. In reality many websites use a lot of external services and thus also share the visitor's IP address and user agent with those services. Since this information is considered personal information, it is subject to the GDPR. The GDPR says that you can only share this information if there is a need and a legal ground to do so OR the user has given consent. This scanner (which never gives consent) helps you identify the services that information is shared with, so that you can consider whether or not you are GDPR compliant.
You submit a URL in the GDPR scanner form on the server. This URL gets picked up by a worker (once the worker has finished its current tasks) that runs Chromium in "headless" and "incognito" mode. The worker loads the page and creates a HAR file (the output you see in the network tab of the Developer Tools). These results are enriched with TCPing latencies and IP address information from ip-api.com. A summary of this information is sent back from the worker to the server. The server stores the result and shows a report. This report gets its own unique URL that you can easily share.
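To make the HAR step a bit more concrete, here is a minimal Python sketch of how the contacted domains could be pulled out of a HAR file. It is not the scanner's actual code: the function names domains_in_har and third_party_domains are made up for this illustration, and the only thing assumed about the input is the standard HAR layout (log, entries, request, url).

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def domains_in_har(har_text: str) -> Counter:
    """Count the requests per hostname in a HAR document (hypothetical helper)."""
    har = json.loads(har_text)
    counts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host:  # data: and blob: URLs have no hostname, so they are skipped
            counts[host] += 1
    return counts


def third_party_domains(har_text: str, scanned_url: str) -> list:
    """Return the contacted hostnames other than the scanned site's own."""
    own_host = urlsplit(scanned_url).hostname
    return sorted(h for h in domains_in_har(har_text) if h != own_host)
```

If third_party_domains returns an empty list, you are in the ideal situation where only the domain you entered shows up in the report.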
You can rely on the "Domain", the "Ping" and the "Hostname" data to be correct as these are measured by the scanner on your request. The "EU", "Country" and "Organization" fields are provided by ip-api.com and are lookups in an incomplete and imperfect database. I estimate that in 80-90% of the cases the data in these columns is correct and complete. I have evaluated other databases, but their data quality was even lower.
Many sites use Google Analytics. Google lets you anonymize the IP addresses it receives. The scanner can detect whether or not this setting is enabled. You will see a "ga_aip" or "ga_no_aip" flag in the "Flags" column for the domain "www.google-analytics.com". These flags are clickable and clicking them will show a page with a little background information. We will be adding more privacy analysis in the future.
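How the scanner decides between the two flags is not described here, so the following Python sketch is only one plausible approach. It assumes that Google Analytics hits are sent to a path ending in "/collect" and that IP anonymization shows up as an "aip" parameter in the query string or POST body (as in the Measurement Protocol for Universal Analytics). The function name ga_flag is made up for this illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

GA_HOST = "www.google-analytics.com"


def ga_flag(request_url: str, post_body: str = "") -> Optional[str]:
    """Return "ga_aip", "ga_no_aip", or None if this is not a GA hit."""
    parts = urlsplit(request_url)
    # Assumption: only requests to ".../collect" on the GA host are hits;
    # loading analytics.js itself says nothing about anonymization.
    if parts.hostname != GA_HOST or not parts.path.endswith("/collect"):
        return None
    # The parameters may travel in the query string (GET) or the body (POST).
    params = parse_qs(parts.query, keep_blank_values=True)
    params.update(parse_qs(post_body, keep_blank_values=True))
    # The mere presence of "aip" is treated as "anonymization enabled".
    return "ga_aip" if "aip" in params else "ga_no_aip"
```

For example, a request to "https://www.google-analytics.com/collect?v=1&aip=1" would be flagged "ga_aip", while the same request without the "aip" parameter would be flagged "ga_no_aip".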
The scanner will also show which cookies are set and what data is stored in local storage. This helps you to identify trackers. Ideally these should only be session cookies (forgotten after the session) and all cookies should be "Secure", "HttpOnly" and have "SameSite" set to "Strict".
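The cookie criteria above can be checked mechanically. The Python sketch below takes the value of a single Set-Cookie header and lists where it falls short of that ideal. It is an illustration, not the scanner's own code, and the function name cookie_issues is made up. A cookie is treated as a session cookie when it has neither an "Expires" nor a "Max-Age" attribute.

```python
def cookie_issues(set_cookie: str) -> list:
    """List the ways a Set-Cookie header value falls short of the ideal."""
    # The first "name=value" pair is the cookie itself; the rest are attributes.
    attributes = {}
    for part in set_cookie.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        attributes[name.lower()] = value

    issues = []
    if "expires" in attributes or "max-age" in attributes:
        issues.append("persistent (not a session cookie)")
    if "secure" not in attributes:
        issues.append("missing Secure")
    if "httponly" not in attributes:
        issues.append("missing HttpOnly")
    if attributes.get("samesite", "").lower() != "strict":
        issues.append("SameSite is not Strict")
    return issues
```

A header such as "sid=abc123; Path=/; Secure; HttpOnly; SameSite=Strict" yields an empty list, while a typical tracking cookie with an expiry date and no flags yields all four issues.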