Trackmap

Scripts needed to support Trackography project

Trackography project

The Trackography project is research by Claudio ࿓ vecna, developed with Tactical Tech, as part of the MyShadow project.

Take a look at the presentation of Trackography at 31c3, or at its online slides.

When you access media websites from your country, your Internet connection is being tracked by multiple third parties. And this happens constantly. That's what we aim to illustrate.

Our aim is to show, through a visualization, where our data travels when we visit our favorite news websites. We are currently looking for collaborators in various countries around the world, who would make this project possible.

Install

Please check the file INSTALL.md.

You, your country and the code

This repository contains the software and data source required to detect online trackers.

The collection of data needs to happen in a distributed way, which means that the software needs to run from each of the selected countries. This is important because the network and the trackers behave differently based on the specific country of the user.

Is your country analyzed in Trackography?

If the answer is NO there are two possible reasons:

  • No one has reviewed or created the media list. Check this directory and look at the expected format at the bottom of the page. An unverified media list may be present here; if not, send us an email at trackmap at tacticaltech dot org (we have some unrefined lists usable as a starting point) or provide us with a list.

  • No one has run the software from your country yet. In this case, read the sections below.

If the answer is YES: perfect, this means that someone has already run the software. You can still help, because different ISPs and different geographical locations within a country bring different results. Having multiple results can be useful for further analysis and comparison.

How can you help?

At the moment we cannot manage new data: the project is temporarily in maintenance mode only :(

If you're a media-aware citizen, we need a reliable media list for every country. What matters is explained above; use git to help us, or open an issue.

If you're a Linux user you can help run the software and collect results from your country. A distributed effort is required, because the Internet works differently in different locations. You can run the software as explained below, and it will automatically send the results to our server.

Support the collection

This procedure assumes an apt-get based system (Debian, Ubuntu, etc.). A Dockerfile is available but currently unmaintained.

Special conditions where you can't run the software, or need extra care:

  • Over the Tor network: a traceroute cannot run over Tor, so your web connections would appear to come from a different network point than your traceroutes, which would produce unacceptable results.
  • On networks that filter ICMP packets: a traceroute relies on receiving ICMP time exceeded packets, so the results would be incomplete. The software itself reports this condition.
  • On Internet lines with heavy packet loss (such as WiFi/WiMax far from the access point): traceroute is based on a protocol that does not support retransmission, so the likelihood of incomplete results would be high.
  • Over a VPN: you can run the test via VPN, but you have to specify the country of your endpoint. If you are in the USA and your VPN ends in Sweden, you have to specify -c sweden.

Tor should be used only once the script has completed the collection, to deliver the results, because we anonymize the users interacting with us.

Options

  • -i: Use on an unstable Internet connection, such as WiFi with heavy packet loss.
  • -o: Specify a different output directory; needed when performing multiple tests.
  • -d: Disable data delivery to us. This lets you collect the data and keep the file locally (you can send it later with -s).
  • -X: Loop forever: perform the test, submit the results, and start again.

To see the list of countries, just type ./perform_analysis.py -c something: the software shows the available countries (or check for yourself here).
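As an illustration of how the options above combine, here is a minimal Python sketch that assembles a command line for the script. The helper name build_command and the directory name output-vpn are hypothetical, and the sketch assumes that -o takes the directory as its next argument; only flags named in this README are used.

```python
import shlex


def build_command(country, unstable=False, output_dir=None, send=True):
    """Assemble an argument list for ./perform_analysis.py.

    Hypothetical helper: only flags named in this README are used, and
    the form "-o <directory>" is an assumption.
    """
    cmd = ["./perform_analysis.py", "-c", country]
    if unstable:
        cmd.append("-i")           # unstable connection, e.g. lossy WiFi
    if output_dir is not None:
        cmd += ["-o", output_dir]  # assumed form: -o <directory>
    if not send:
        cmd.append("-d")           # keep the results local, do not deliver
    return cmd


# A user in the USA on a VPN ending in Sweden, keeping the data local.
cmd = build_command("sweden", unstable=True, output_dir="output-vpn", send=False)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as-is; building a list rather than a single string avoids shell quoting problems.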

Clean the data

When you've completed the collection and the script reports:

Data collected has been sent, Thank You! :)

The collected data are deleted automatically, because if kept they would be treated as a test to be resumed in the next collection. To keep the data, use the option -d (disable sending); you can then send them later with -s.

Risks faced by users under pervasive surveillance

Users in countries where Internet control is pervasive should not raise concern or appear anomalous. To an observer, the traffic will look like a user opening a news agency's home page and closing it afterwards, followed by a certain amount of traceroute traffic; traceroute is a legitimate and common tool used to check network speed and topology.

If you have concerns about using Tor (which we use at the end of the test to deliver the results), you can avoid it with the option -d, and we can later work out a dedicated way to receive the collected output. In this case, please contact the email address with the PGP key specified at the end of this document.

Step details, timing and resources

Few bandwidth, CPU, and disk resources are needed. A precise estimate is not possible, because the operations executed depend directly on the number of media websites being analysed. Based on past experience, however:

  • A list of around 200 media sites launches a "one-time browser" (200 + 50) times, taking 5-10 seconds each. Each website costs roughly 300 KB, for about 75 megabytes downloaded in total.
  • For every media fetch, 7 to 20 included hosts are discovered. The software processes them in the following ways:
    • Every host is resolved to an IPv4 address, which helps minimize duplicated results (quite often different hosts share the same IP address). The count depends on the first point, but there are commonly between 1400 and 1700 unique hosts. This operation requires 10 to 30 minutes.
    • For every resolved IPv4 address, a reverse DNS resolution is performed. This operation is slower than the previous one and requires at least 30 to 60 minutes.
  • Then a traceroute is started for every unique IPv4 address. In the worst case (and with option -i), this step lasts two to three hours, but that depends heavily on the number of target hosts.
  • Overall, the software requires roughly 4 hours to complete and sends us 8-15 megabytes of data at the end.
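The figures above can be reproduced with some simple arithmetic. The following Python sketch uses only the numbers quoted in this section; the function name estimate and its parameter names are hypothetical, and 1 MB is taken as 1000 KB.

```python
def estimate(sites=200, extra_fetches=50, kb_per_fetch=300,
             secs_per_fetch=(5, 10), hosts_per_site=(7, 20)):
    """Rough resource estimate for one collection run.

    All defaults come from the figures quoted in this README; the
    function itself is only an illustration, not part of the software.
    """
    fetches = sites + extra_fetches                # (200 + 50) browser launches
    download_mb = fetches * kb_per_fetch / 1000    # KB -> MB (1 MB = 1000 KB)
    fetch_minutes = tuple(fetches * s / 60 for s in secs_per_fetch)
    hosts_before_dedup = tuple(sites * h for h in hosts_per_site)
    return {
        "fetches": fetches,
        "download_mb": download_mb,
        "fetch_minutes": fetch_minutes,
        "hosts_before_dedup": hosts_before_dedup,
    }


for key, value in estimate().items():
    print(key, value)
```

With the default values this gives 250 fetches and 75 MB of download, matching the figure above, and between 1400 and 4000 included hosts before duplicates are removed; the 1400 to 1700 unique hosts reported above is the count after deduplication.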

You can interrupt the script with Ctrl+C; when the command is given again, it resumes the previous execution. To start a new collection, remove the 'output' directory.

It is very important to have a stable network. If you can, avoid WiFi (or stay physically near the access point).

The operation performed by the software

  • Opens an HTTP connection (using phantomjs) to every news media site under analysis.
  • Collects all the third-party URLs (trackers, ads, social). Note: these are all the resources loaded automatically with JavaScript enabled, not the <a href> resources.
  • Dumps <object> elements. (These can be used for future analysis: by analyzing this code, we can point out who the worst trackers are ;) This analysis has not been done yet.)
  • Performs DNS resolution and reverse resolution for every unique included URL.
  • Performs a traceroute for every included URL.
  • Converts every included IP address via GeoIP.
  • Sends the results to our server (213.108.108.84:32001) or hidden service (mzvbyzovjazwzch6.onion; if you want to submit via Tor, use -T).

This shows all the nations capable of knowing which users are visiting the (selected) news media.
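Two of the steps listed above, collecting third-party hosts and resolving them to unique IPv4 addresses, can be sketched in Python as follows. This is not the project's own code: the function names are hypothetical, "third party" is simplified to "any hostname different from the page's own", and the example uses a fake resolver with reserved documentation addresses so that it runs without network access.

```python
import socket
from collections import defaultdict
from urllib.parse import urlsplit


def third_party_hosts(page_url, resource_urls):
    """Return the sorted hosts of resources not served by the page's host.

    Simplification (an assumption of this sketch): hostnames are compared
    exactly, so a subdomain of the same site also counts as third party.
    """
    own = urlsplit(page_url).hostname
    hosts = {urlsplit(url).hostname for url in resource_urls}
    hosts.discard(None)   # URLs without a host part
    hosts.discard(own)    # first-party resources
    return sorted(hosts)


def group_by_ipv4(hosts, resolve=socket.gethostbyname):
    """Map each IPv4 address to the hosts resolving to it.

    Grouping by address removes duplicates: different hosts often share
    the same IP, and each address needs only one traceroute.
    """
    by_ip = defaultdict(list)
    for host in hosts:
        try:
            by_ip[resolve(host)].append(host)
        except OSError:
            continue      # unresolvable host: skipped in this sketch
    return dict(by_ip)


# Fake resolver for the example (documentation-only addresses).
fake_dns = {
    "ads.example.net": "192.0.2.10",
    "cdn.example.net": "192.0.2.10",
    "social.example.org": "198.51.100.7",
}
hosts = third_party_hosts("http://news.example.com/", [
    "http://news.example.com/style.css",
    "http://ads.example.net/pixel.gif",
    "http://cdn.example.net/lib.js",
    "https://social.example.org/widget.js",
])
print(group_by_ipv4(hosts, resolve=fake_dns.__getitem__))
```

Dropping the resolve argument makes group_by_ipv4 use the system resolver through socket.gethostbyname. The reverse lookup step could use socket.gethostbyaddr in the same way.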

Technopolitical goal

We know that the online business model is mostly based on tracking.

If you don't want a specific company to make money out of your data because you don't like it, or you simply don't want to be tracked by that company, your only option is to 'opt out'.

But what are the options? This is one of the first goals of Trackography: to make you aware of the amount of tracking involved when you read the news online on a daily basis. We have addressed the media because they involve all active citizens in a country.

But why bother if Google, the New York Times or others track your behavior and interests? It's pointless to be scared of those organizations. After all, they have done nothing bad against users.

But sadly, the data they collect is extremely valuable for market and political strategies, and such companies have been targeted by intelligence agencies for that precise reason. So it's easier to understand what we want to show: tracking is not only performed by Internet companies, but is also actively used as a nation's asset.

With Trackography we show the invisible links between a news reader from a nation and all the nations that can eventually snoop on their behavior. In a foreign network, you have no rights.

Is this paranoia?

No ;) This has been done by the NSA, which intercepts the advertising network of the Angry Birds game. Angry Birds was the most deployed game, but was still an option for citizens. News media, however, are accessed by the majority of populations around the world and by tracking which websites users access, th
