OSINT At Home #26: Top 5 hacks to find deleted websites, posts and secret changes
rwuXOL5W0gE • 2025-09-19
What if I told you that everything deleted on the internet is never really gone? For example, this page, which was published by a Russian government news outlet, celebrated Russia capturing Ukraine on February 26, 2022. But they deleted it once they realized that was not the case. And even though they deleted it, it's still available for the world to see. And what about NASA's website? Especially where it used to say NASA will land the first woman, person of color, and international partner on the moon. But changes made in March 2025 removed that line.

Well, the reason why these pages haven't vanished is because of one tool, the Internet Archive. And in today's world of vanishing tweets, edited articles, and disappearing websites, being able to retrieve the original version of a web page can be the difference between truth and deception. And so in this video, I'm going to show you some of my favorite ways to use the Internet Archive to dig up deleted content, track online changes, and search inside of news broadcasts.

Hi everyone, and welcome back to this series on how to do open-source investigations at home. I'm Ben, and this is part 26. So, let's get started.

What is the Wayback Machine in the Internet Archive? Well, if you've ever wanted to rewind the internet like a videotape, this is the tool to do that. The Wayback Machine is run by the Internet Archive, and it lets you see how websites looked in the past. In this section, I'll give you a quick tour of what it is, how it works, and why it matters for your investigations and research.

We've all seen it: someone posts something online, or a website puts something out, and then sometimes it vanishes. This NASA article obviously hasn't vanished. It's on the live internet, but we're going to use it as an example of how a page can be stored. Simply put, all we have to do is take the URL and paste it into the tool called the Wayback Machine.
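For anyone who would rather script this step than paste URLs by hand, here is a minimal sketch of the web addresses involved. It assumes the publicly documented Wayback Machine URL patterns (`/save/`, `/web/<timestamp>/`) and the availability endpoint at `archive.org/wayback/available`. The function names are made up for illustration, and the code only builds URLs and reads an already-fetched JSON reply; it makes no network requests itself.

```python
from urllib.parse import urlencode

WAYBACK = "https://web.archive.org"


def save_url(page_url: str) -> str:
    """Address that asks Save Page Now to capture page_url."""
    return f"{WAYBACK}/save/{page_url}"


def snapshot_url(page_url: str, timestamp: str) -> str:
    """Address of the capture closest to timestamp.

    timestamp is YYYYMMDDhhmmss and may be shortened, e.g. "2019".
    """
    return f"{WAYBACK}/web/{timestamp}/{page_url}"


def availability_url(page_url: str, timestamp: str = "") -> str:
    """Address of the availability check for page_url."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response: dict):
    """Pull the archived URL out of a parsed availability reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None
```

Opening the result of `save_url(...)` in a browser triggers a capture, and `snapshot_url(..., "2019")` lands on whatever capture sits closest to that year.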
And so we can pop that into Save Page Now. Now, sometimes it takes a few minutes to go through because it's got a lot of work to do to archive that page.

While that's archiving, let's have a brief look at how we can find out about previous websites that might have been archived. In a previous video, I spoke about a marketing firm that was run out of Indonesia, running influence operations and bot networks. Since then, that website has been taken down, but we can easily glean a few more details from it through the Internet Archive. So, all we have to do is take the URL, which you can see copied up the top there. And if I go into the Wayback Machine and pop that URL in there, we can see how many times it has been archived. So, I'll take that back to 2019 and see what it actually used to look like. And there we have it. Even though that website is currently not available, we're able to see what it used to look like, and the descriptions on that site when it was actually up and running, which is really helpful for research.

The second function of the Wayback Machine I really want to show you is how to track changes on websites. Sometimes you might have the same URL, but different information over time. So for an example, I'm going to go to NASA's Artemis page, and this is a brief description of the current page. So what we'll do is run that through the Wayback Machine and see just how many times it's been archived. And you can see it's actually been archived quite a fair bit, probably since its publication or the creation of that URL in mid to late 2023. So we could open up all of these and identify visual changes. But actually some of the work's been done for us to detect that. All we have to do is click on the Changes tab. Now this might take some time to load, but it gives us a chance to have a look at potential changes that have happened over time.
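If you want the same "how many times has this been archived, and when did it change" view outside the browser, a rough sketch follows. It assumes the Wayback Machine's CDX capture index at `web.archive.org/cdx/search/cdx` with its `url`, `output`, `fl`, `from` and `to` parameters, and it treats a change in the capture's content digest as a sign that the stored page changed. This is not how the Changes tab itself is built, and the helper names are hypothetical. Pages with dynamic elements can get a new digest on every capture, so read the result as a hint about where to look rather than as proof of an edit.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url: str, year_from=None, year_to=None) -> str:
    """Build a capture-index query that returns timestamp and digest as JSON."""
    params = {"url": page_url, "output": "json", "fl": "timestamp,digest"}
    if year_from:
        params["from"] = str(year_from)
    if year_to:
        params["to"] = str(year_to)
    return CDX_ENDPOINT + "?" + urlencode(params)


def capture_count(rows: list) -> int:
    """Number of captures in a JSON reply (the first row is the header)."""
    return max(len(rows) - 1, 0)


def captures_with_new_content(rows: list) -> list:
    """Timestamps of captures whose digest differs from the capture before."""
    if not rows:
        return []
    header, *captures = rows
    ts_index = header.index("timestamp")
    digest_index = header.index("digest")
    changed, previous = [], None
    for row in captures:
        if row[digest_index] != previous:
            changed.append(row[ts_index])
            previous = row[digest_index]
    return changed
```

The first capture always appears in the result, since there is nothing earlier to compare it with; every later timestamp marks a capture worth opening next to the one before it.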
So I might pick an earlier version from March 5, and I want to compare that with something a little bit more recent, say September 10. This will bring up a visual comparison tool where, at the bottom, it explains that yellow indicates content deletion and blue indicates content addition. On the left we have the website on the 5th of March, and on the right we have the website on the 10th of September. Specifically, the difference in this part is this last line here that says NASA will land the first woman, first person of color, and first international partner astronaut on the moon using innovative technologies to explore more of the lunar surface than ever before. On the right, it shows that that's not there. And we can even double check that on the current website just to see if it's still gone. And yeah, we can see that it's still not there. So it shows that someone has removed that line from the NASA website, and thanks to the Internet Archive and its Wayback Machine, we're able to identify that change. And that's a really useful way to use this tool for tracking changes on websites and showing the difference between versions over time.

The third feature I'd like to show is this tab called Collections. I'm going to click the Collections tab. It shows which collections this page has been archived to, and we can see some of those there, and clicking one will open up that collection. We can see one called End of Term 2024, and we can even check out this collection, and you can see the captures that have been made under it as well as a timeline of those captures.

The fourth function I want to show you is the calendar view, and reading the calendar a bit more closely to identify whether there's been a spike in interest in those pages. The calendar function is really quite useful for that. When we actually load a page and load a snapshot, we also have the function up here to click through those.
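Going back to the yellow and blue comparison from a moment ago, the same idea can be roughed out in a few lines once you have the text of two snapshots. This is only a sketch using Python's standard `difflib`, not the Wayback Machine's own comparison code, and it assumes you have already extracted the visible text of each capture yourself. The function name is made up for illustration.

```python
import difflib


def line_changes(old_text: str, new_text: str):
    """Return (deleted, added) lines between two versions of a page's text.

    deleted corresponds to what the visual tool would mark as removed,
    added to what it would mark as new.
    """
    deleted, added = [], []
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    for line in diff:
        if line.startswith("- "):
            deleted.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " are unchanged and "? " lines are hints.
    return deleted, added
```

Calling `line_changes(march_text, september_text)` on the two captures, where both variables hold text you pulled from the snapshots, returns the removed sentence in the first list and any new wording in the second.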
So you can either use this function, which just takes you back to the previously archived version, or this functionality over here to scroll along the timeline. And you can start to see why the Wayback Machine is so fascinating, because for websites like, for example, the White House, we can just have a look at the historical versions. And this is a version that we're looking at here from May 10, 1997. These are the very early stages of the internet, where the White House web pages are laid out in a very simple manner, which is a fascinating glimpse into what the website used to look like.

The fifth function that I really want to go through is probably one of my favorites, and it's in the video section. If you go to the archive.org homepage, go up to Video and select TV News. You can also get there by going to archive.org/details/tv. There are a lot of different functions to this one. You can see the special collections that they've got here. They've also got recent quotes that have been picked up, and these really cover not just English-language news but many other channels as well. So you've got Belarus TV, Al Jazeera, Press TV, Fox News, CNN, BBC and many others.

So, a couple of quick ways that you can use this. You can search the captions, and I'll open up advanced search just to show some of the functionality in there. You can go through all of these networks. You can search only quotes. You can look for specific shows, and then you've got the date range function there. I'm going to type in Yangon, which is a city in Myanmar, and it's going to load the mentions of Yangon for me. So we can see Anthony Bourdain in there, Piers Morgan Live, but also BBC World News and others. On the left, you can filter by programs, creators, years, and also topics and subjects. And you can even see a little timeline, which is very useful.
That timeline shows many broadcasts in March 2021, which was just after the military coup in Myanmar, and obviously the crackdowns on civilians and protesters in the streets, which were very violent, with a huge level of oppression following from that. One of the other things I often like to type in here is terms around, say, interference or online interference, like election interference, and just have a look at the mentions of those and see where that's been brought up.

And that segues into a very impressive spin-off tool that allows you to visualize this. This is the GDELT Summary tool, which basically uses the data from the TV News Archive and allows you to make visualizations with it. So I'm going to type that term into here. When I click create summary on election interference, it brings up a nice little timeline. So instead of the way we'd usually see this content brought up, it gives more of a summary and overview, along with a word cloud of other terms that regularly pop up alongside election interference, and then the top clips viewer as well. I really love that this has been created, so that you can go from searching in detail for election interference and drilling down to the exact video, to zooming back out and having a look at the bigger picture of trends and mentions over time. You're constantly jumping between what's being mentioned on a specific news broadcast at one single time and an overview of all of the mentions over time.

So, we've covered quite a few interesting tools, and I think it's important to pay a visit to the Internet Archive and the Wayback Machine. It's a tool that shows us that the internet never forgets, and that you just need to know where to look. And don't forget to stay tuned for some of the next sessions on open-source investigations. Until then, take it easy.