Transcript
heRPI0NDko0 • OSINT At Home #12 – How to pull text from an image and use it in search
Kind: captions Language: en

Sometimes we just don't have that friend nearby who can tell us what a sign like this might mean in a photo or a video. And it's not just about translating languages. What if we wanted to know who this person is on the right in this image, or be able to type this text that you can see at the back of this conference hall and run a Google search on it? Well, thankfully there's a fix, and it's all available within a few clicks of your mouse. So sit tight and get ready as we look at how you can extract text from an image to use it for searches, translations and furthering your research.

[Music]

Hi everyone, and welcome back to this series on how to do open source investigations from home. I'm Ben, and this is part 12. So let's get started.

To start this session off, let's say we have this photo of someone holding up this sign at a protest. Now, we could ask a friend what it means, sure, and I'd always recommend you get a second opinion on translations rather than using machines for translations because, well, as we know, machines aren't always right. But I've got an English keyboard in front of me, and I can't really type out what that sign says if I wanted to search for it in Google. So for a sign like this, maybe I can use something like Google Translate's draw function and start to write out some of these characters. But we could actually do this a little bit easier, and there's something really quite handy that exists called OCR. OCR stands for optical character recognition. It allows the text to be pulled out of an image, so we can either use that in translations or we can use that to run a search in Google or another search engine as well.

One of my favorite apps to do this is Yandex, and this is really helpful if you're doing this from an image reverse search perspective. So say you wanted to reverse search that image that you found in an article, and you want to get a translation out of that or pull that text out of it. All we need to do
is right-click, or upload that using this icon that you see in the middle there, and already you can see on the right of this image that we have the text that's been pulled out of that image. I can easily just click on that translate function and that'll bring that up in translate, and what we've got is "one for all and all for one", which is what that banner says on that image right there.

So what about other uses for this? Because perhaps this might be simple for some people; perhaps it might be written in an article already, or wherever the context is of the image that's been used. So, for example, what if we wanted to identify this person on the right? Well, we've got a little name written there, don't we? So if I was to right-click on this and run an image reverse search on this in Yandex, perhaps we can click on "recognize text". Now, there's a couple of bits of text that we can see in there. It seems like this one has picked this up, and it's always nice to just double check to make sure that we've got that. It's picked up "Getty Images" as well, and it's also picked up that name. We could use the Google Translate function on that, but what we can also do, which is really useful about pulling that text out of the image, is that we can just copy that text and run it in a search engine. And what I'm doing now is running a Google search for the text that we've just pulled out of that image of that name plate right there, which is the original spelling of that person's name, and we can see that that's the vice president of the People's Republic of China. Cool.

So what about another example? Well, how about we look at what this actual phrase says on the back of this image here, from a conference hall in North Korea. Well, again, we can take it through the same process. I always like to try and get the biggest and clearest image possible to reverse search or to run through any platform that you might have your hands on, and simply run that through Yandex as well. If you don't have
any plugins that allow you to right-click and go to Yandex, you can simply save the image and just upload it as an image reverse search into Yandex. So you can see that that's pulled out that text for us, and already we can just select that, and with my Google Translate plugin we can see what that might say. Now, I always like to just double check all these characters as well, because sometimes this might just be off or it might be a little bit inaccurate. Also, sometimes the translation function on Google isn't really that great for so many different languages. But one thing we can do is just run a simple search on that text as well and see perhaps if that comes up with any different results. Yes, we've got a few different results, which is quite interesting.

So you're probably thinking, okay, this seems pretty easy for small bits of text. But it's also for big bits of text as well, and it's not just on Yandex. There's lots of OCR software and platforms out there. I'd recommend you vet each one of them before you start uploading any private documents or private images that are not on the internet. However, here's one called i2OCR. I really like using this one because you have two different options: you can do image OCR or you can do PDF OCR. The reason why PDF OCR on this one is quite good in comparison to platforms like Google is because this one offers very different languages. So, for example, we could run a PDF in Burmese and we could get that loosely translated. Again, I recommend you get someone that speaks Burmese or can read Burmese to actually confirm any translations before you rely on them in any publications or anything like that.

So in testing out how this platform actually operates, let's try an example such as this, which is a post from December 25 from the Sudan Doctors Committee. Now, often you'll see perhaps Facebook posts, or you might not know that it's from Facebook, but it might be posted as a screenshot on Twitter. And because it's not raw text but rather a
screenshot of the post, you might not know or might not be able to easily translate this in Google or something like that. So a really easy method to do that is just to take that screenshot and run that through one of these OCR platforms. So we'll do that now. I've selected my input language, and I can select the file here. We definitely want to make sure that we are not a robot, so sometimes it's always fun to do a little bit of "spot the tractors", and we'll extract the text.

So what we have here is the raw text on the left that I can select and pop into a translate function or something like that. On the right, we can see how that has been picked up from that image that we originally uploaded, that screenshot image. So the i2OCR site is quite useful because we can just take that text and run it in either Bing or Google and actually translate that text.

So you can see that it's quite useful to translate lots of text, but of course this is clearly written text that you can see from that screenshot there. If, for example, you were to have a street sign or a shop sign that you can see in a video but that isn't really easily replicable for something like Yandex or something like this, where we're not able to pull out that text, I'd recommend going back to the draw function and just having a little go at drawing that text in there, whether it's Arabic or whether it's Cyrillic or some other form of alphabet that you just can't clearly get that OCR working on.

So there we have it. I hope you found this session on extracting text from images helpful and useful. If this type of technique has helped you in your research, or does help you, please do let me know in the comments section below. I also just wanted to say a big thank you for all of the comments, likes and subscribes to this channel, especially the messages from journalists, researchers and investigators in developing countries who are learning this content and using it to hold those responsible to account. I'm so glad I can
be of such help to you all. I'll see you in the next session soon. Until then, take it easy.

[Music]
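
Editor's sketch (not part of the original video): the session's copy-and-search step can also be done with a few lines of Python once an OCR tool has given you the text. The sketch below builds a Google search link and a Google Translate link for the extracted text, and lists each character's Unicode name so that lookalike characters can be double checked, as recommended in the session. The function names are hypothetical, the sample strings are stand-ins rather than text from the video, and the query parameters (`q`, `sl`, `tl`, `text`, `op`) are assumptions based on the public web URLs these sites appear to use, not a documented API.

```python
import unicodedata
from urllib.parse import urlencode


def google_search_url(text):
    """Build a Google search link for text copied out of an OCR result."""
    # Wrapping the text in double quotes asks for an exact-phrase match.
    phrase = '"' + text.strip() + '"'
    return "https://www.google.com/search?" + urlencode({"q": phrase})


def google_translate_url(text, target="en"):
    """Build a Google Translate link with automatic source-language detection."""
    params = {"sl": "auto", "tl": target, "text": text.strip(), "op": "translate"}
    return "https://translate.google.com/?" + urlencode(params)


def describe_characters(text):
    """Return one line per non-space character with its code point and Unicode name.

    Useful for checking OCR output by eye: a Latin "a" and a Cyrillic "а"
    look the same on screen but have different names here.
    """
    return [
        f"{ch} U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
        if not ch.isspace()
    ]


if __name__ == "__main__":
    # Hypothetical stand-in for text pulled out of an image by an OCR tool.
    extracted = "привет мир"
    print(google_search_url(extracted))
    print(google_translate_url(extracted))
    for line in describe_characters(extracted):
        print(line)
```

Opening the printed links in a browser gives the same result as pasting the text into the search or translate box by hand. As in the session, treat any machine translation as a first pass and get it confirmed by someone who reads the language.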