The "Echo Beacon"

Preface

With my games site, one thing that kept happening was people taking the games or embedding them on their own sites. This was both good and bad. It was good because my site was spreading and becoming more widely used, but it was bad because most of the time the traffic stayed on whatever site the game was embedded on, and half of the time, people just took the credit for ripping the game. I decided to experiment with my site to see how far it would spread.

I have mentioned the “Echo Beacon” a few times as this mysterious tracking magic that let me find sites that used my stuff, but I never really went into the details, so it’s time to show how it all worked.

The “Echo Beacon”

So, the beacon is more than just one thing; it is a bunch of different systems and tricks working together. Most of these are based on the different codes that Google has registered for my site, so here are a few examples, and I will go through each one.

Google Site Verification

One option for verifying a site with Google Search Console is to add an HTML file to the root of your site. For me, this meant adding a file named google5eb562ebd0c01680.html with the contents google-site-verification: google5eb562ebd0c01680.html to the root of my site. This was the first step in the beacon.

By adding this file, I could search GitHub for sites that had the same file in their repositories. This was a good start, but I wanted to go further.
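For a single suspect site, the same idea can be checked directly instead of through a GitHub search. The following is only a rough sketch of how that could look, not the method actually used for the beacon. The function names are made up for illustration, and it assumes a runtime with a global fetch (Node 18+ or a browser).

```javascript
// Hypothetical helpers; the names are illustrative and not from the original site.
const TOKEN = "google5eb562ebd0c01680";

// URL where a copied site would serve the verification file.
function verificationUrl(host, token = TOKEN) {
  return `https://${host}/${token}.html`;
}

// True when a response body matches the expected file contents.
function isVerificationBody(body, token = TOKEN) {
  return body.trim() === `google-site-verification: ${token}.html`;
}

// Fetch the file from a host and report whether the beacon file is present.
// Assumes a global fetch is available.
async function hasBeaconFile(host, token = TOKEN) {
  try {
    const res = await fetch(verificationUrl(host, token));
    return res.ok && isVerificationBody(await res.text(), token);
  } catch {
    return false; // unreachable host, TLS error, and so on
  }
}
```

Comparing the body, and not just the status code, matters because many sites answer unknown paths with a 200 page of their own.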

Google Analytics

The next step was to add Google Analytics to my site. This already helps me do basic tracking, and if someone copies a game without knowing how to remove the tag, it will also lead me to their site. This was by far the most successful part of the beacon, as it was put EVERYWHERE. Now, there are a few different ways I was able to measure how far this one got.

First, there are the analytics reports themselves, which are helpful but do not show me where the games ended up.

Google Analytics Homepage

I did find one panel that showed me the top 10 hostnames for the past week, so I want to investigate some of these.

Top Hostnames within the past week

Investigating the top hostname

Slope.io homepage

At first glance, it looks like any normal game site that focuses on one game and has many ads. But how did my tracking code get here, and how did it register over 135,000 users within one week? Well, I placed the tracking code on the games hosted on my site, and sure enough, when opening the game embedded on this site (slopeio.org/game/slope), there was the tracking code in the source.

<!-- curled from https://slopeio.org/game/slope/ with lines 56 to 65 -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-98DP5VKS42"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag() {
    dataLayer.push(arguments);
  }
  gtag("js", new Date());
  gtag("config", "G-98DP5VKS42");
</script>
<script src="shared/lib.js" type="text/javascript"></script>

That was not hard, was it? I wonder if this is the same for the other sites on the list.

Source code of slope3d.net showing the tracking code

I guess slope3d.net is the same?

Source code of slope-game.net showing the tracking code

And slope-game.net is the same as well.
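Checking page sources by hand gets old quickly, so here is a small sketch of how the check could be scripted. It only scans a string of HTML for measurement IDs. The function names are hypothetical, and the pattern assumes IDs look like G- followed by at least six uppercase letters or digits, which matches the ID shown above but is not an official format definition.

```javascript
// Find every measurement ID of the form G-XXXXXXXXXX in a page's HTML.
// The pattern is an assumption based on the ID format seen in this post.
function extractMeasurementIds(html) {
  const matches = html.match(/\bG-[A-Z0-9]{6,}\b/g) || [];
  return [...new Set(matches)]; // drop duplicates, keep first-seen order
}

// True when the page source contains the given measurement ID.
function usesTag(html, id = "G-98DP5VKS42") {
  return extractMeasurementIds(html).includes(id);
}
```

The ID usually shows up twice in a gtag snippet (once in the script URL and once in the config call), which is why duplicates are removed.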

I am pretty sure you can see where this is going; it is on a lot of sites, but what if I wanted to get the exact number? Google Analytics can be the most complex thing in existence, but there has to be a way, right?

Google loves to track people, so they make sure you add your tag to every page. It will let you know if it detects your tag on a page not within your allowed list of domains. And would you look at that:

Page showing pages that are being tracked

The other domains are blurred to respect their privacy

I am sure the real number is way bigger, but the panel cannot load data for more than 10,000 different pages, which is understandable.
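If the hostname data is exported, the "not on my allowed list" check is easy to reproduce by hand. This is a sketch under assumptions: the row shape ({ hostname, users }) is a made-up export format and not a Google Analytics API response, and the function name is hypothetical.

```javascript
// rows: [{ hostname, users }] in a hypothetical export shape.
// allowed: hostnames that are supposed to carry the tag.
// Returns [hostname, totalUsers] pairs for every other host, biggest first.
function foreignHostnames(rows, allowed) {
  const allowSet = new Set(allowed.map((h) => h.toLowerCase()));
  const totals = new Map();
  for (const { hostname, users } of rows) {
    const host = hostname.toLowerCase();
    if (allowSet.has(host)) continue; // skip my own domains
    totals.set(host, (totals.get(host) || 0) + users);
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}
```

Hostnames are lowercased before comparing because they are case-insensitive, so Copy.example and copy.example count as the same site.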

Google Analytics Code

This part is the same as the Google Site Verification, but instead of a file, it is just the tracking code: 98DP5VKS42. You can easily search for the code on your favorite duck-based search engine or GitHub. You will not find many results, but copies are still floating around on closed-source sites.

/js/main.js

One of the most important features I had on my site was tab cloaking. To do this while having full-screen games, each game had a script tag pointing to /js/main.js. Most of the time, people leave this tag in when they copy a game, so on their site it points at a file that does not exist and does not resolve to anything.

This one is not as useful, since the path is pretty common: searching for the full script tag returns over 313,000 files on GitHub. But it was there nonetheless.
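To see why the leftover tag breaks on a copied site, it helps to look at how a root-relative path like /js/main.js gets resolved. This is a small sketch using the standard URL class. The function name and the copy.example host are placeholders for illustration.

```javascript
// Resolve a script src the way a browser would, relative to the page it sits on.
function resolveScriptSrc(pageUrl, src) {
  return new URL(src, pageUrl).href;
}

// A root-relative src always lands on the root of whatever host serves the page,
// so a copied game ends up asking its new host for /js/main.js.
const resolved = resolveScriptSrc("https://copy.example/game/slope/", "/js/main.js");
console.log(resolved);
```

Unless the copying site happens to have its own /js/main.js, that request fails, which is the leftover trace described above.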

What did we learn

Well, we learned that many people used the version of the slope game that I stole from somewhere at some point and were too lazy to remove the Google Analytics tracking code.

Is there a new version of the Echo Beacon? Yup, it is active right now. It is just the umami analytics server that I have, and it has been embedded into 3kh0 lite and more.

NOTE

Just a random side note: If you do have a site and would like to have faster, privacy-preserving analytics powered by umami, shoot me a message, and let’s talk!

This is something that a third grader made up… Yes, it was only the long-term solution that I had at the time

What was the point of this? I’m not sure, but I thought it would be cool to share. After all, that is what a blog is all about! Sharing things that you find interesting and hope others do too.