Dr Stephen Hill
Finding & Investigating
Digital Footprints with
Open Source Intelligence
Workshop
The Web Explained
Search Engines
▪ To be truly effective at online research and investigation, it is important to understand the unique and combined qualities of each search engine and to use them effectively in conjunction with each other…
Search Engines (Index)
▪ Search engines are "engines" or "robots" that crawl the web looking for new web pages
▪ These robots read the web pages and put the text (or parts of the text) into a large database or index that you can then access…
▪ Google - https://www.google.co.uk
▪ Bing - http://www.bing.com
▪ Yahoo - https://uk.yahoo.com
▪ Yandex - https://www.yandex.com
Index Search Explained
▪ Page A and Page B have equivalent location and frequency of
keywords; however
▪ Page A has 20 external webpages linking to it and Page B
has 40
▪ Based on the implication that Page B is more popular, it
would achieve a higher page ranking within Google and
Bing’s search results than Page A
▪ This information is significant to investigators as many of the
webpages sought may be “hidden” or purposely forced to be
“unpopular” by the owner due to the nature or intention of the
site…
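The Page A / Page B comparison can be sketched in a few lines of Python. This is a toy illustration only: real search engines combine many more signals, and the `rank_by_inbound_links` helper and the page data are made up for the example.

```python
# Toy illustration: pages with equal keyword relevance, ordered by inbound links.
# The figures mirror the Page A / Page B example above; nothing here reflects
# how Google or Bing actually compute their rankings.

def rank_by_inbound_links(pages):
    """Return page names ordered from most to fewest inbound links."""
    return sorted(pages, key=lambda name: pages[name], reverse=True)


pages = {"Page A": 20, "Page B": 40}  # page name -> number of external links
ranking = rank_by_inbound_links(pages)
print(ranking)
```

A deliberately "unpopular" page with few inbound links would sit at the bottom of such a list, which is why it can be hard to surface through popularity-ranked results.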
Point to Remember!
This presents a challenge when using Google and Bing
as both of these search engines focus on presenting the
most popular pages at the top of their search results
When using these search engines, it may be necessary
to locate the least popular sites within millions of search
results, proving time consuming and relatively
ineffective…
Google – Index Search (Regional)
▪ https://www.google.com.au
▪ https://www.google.co.nz
▪ https://www.google.co.uk
‘Bubbling & Tracking’
Search History
Location
Browser
Browser version
Computer being used
Language being used
Time to type in a query
Time we spent on the search result page
Time between selecting different results for the same query
Operating system
Frequency of clicking on AdSense advertising on other websites
Operating system version
Resolution of computer screen
Average amount of search requests per day
Average amount of search requests per topic (to finish search)
Distribution of search services used (web / images / videos)
Average position of search results clicked on
Time of the day
Current date
Topics of ads clicked on
Frequency of clicking advertising
Frequency of searches of domains on Google
http://www.rene-pickhardt.de/google-uses-57-signals-to-filter
Google – Time Filter
Google – Cache
http://webcache.googleusercontent.com/search?q=cache:efj0Wj8fzxUJ:dfk.com/+&cd=1&hl=en&ct=clnk&gl=au
Google – Similar
Google Image Search
Google Image Search – Face Filter
Google Reverse Image Search
BEYOND GOOGLE
Bing
https://www.bing.com
Google & Bing
http://advangle.com
Search Directories
▪ Search directories are hierarchical databases with references to web sites
▪ The web sites that are included are hand-picked by individuals and classified according to the rules of that particular search service
▪ Yahoo Directory - https://business.yahoo.com
▪ BOTW - http://botw.org
▪ DMOZ - http://www.dmoz.org
DMOZ
http://www.dmoz.org
StartPage
https://startpage.com
Carrot2
http://search.carrot2.org
Yippy - Cluster Search
Formerly known as ‘Clusty’
http://www.yippy.com
DuckDuckGo Bangs
https://duckduckgo.com/bang
Semantic Search
www.cluuz.com
Qwant
https://www.qwant.com
Exalead - Advanced
http://www.exalead.com/search
Where to Find Search Engines?
www.searchenginecolossus.com
Advanced Search Techniques
▪ Phrase searching: “fraud in New Zealand”
▪ Boolean search: AND* fraud, NOT* scam
▪ Google Alternative: “fraud”, -scam
▪ Boolean search: fraud OR scam OR swindle
▪ Parentheses: ( ) also known as nesting…
* Will not work with Google
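The Google-style syntax above (quoted phrase, OR group, minus sign for exclusion) can be assembled programmatically. The following is a minimal sketch; `build_query` is a hypothetical helper written for this example, and the `/search?q=` URL form is an assumption about how the query would be submitted to the search page listed earlier.

```python
from urllib.parse import urlencode


def build_query(phrase=None, any_of=(), exclude=()):
    """Assemble a Google-style query string.

    phrase  -> wrapped in double quotes (phrase search)
    any_of  -> joined with OR inside parentheses
    exclude -> each term prefixed with a minus sign
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


phrase_query = build_query(phrase="fraud in New Zealand", exclude=["scam"])
or_query = build_query(any_of=["fraud", "scam", "swindle"])

# URL-encode the query so it can be pasted into a browser address bar.
search_url = "https://www.google.co.uk/search?" + urlencode({"q": phrase_query})

print(phrase_query)
print(or_query)
print(search_url)
```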
Check the Spelling
▪ Remember that words can be spelt differently, or there may be a misspelt word or typo on the website you are looking for, which is why some search engines fail to find the word/phrase
▪ Consider spelling and typos
▪ Tyres & Tires, colour & color
▪ Stephen Hill, Steven Hill, Steve Hill
▪ Serach Engine, Fraud Invesdigation...
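One way to cover spelling variants and simple typos is to generate them and join them with OR. The sketch below is illustrative only: `adjacent_swaps` and `or_query` are hypothetical helpers, and swapping neighbouring letters is just one of many kinds of typo (it produces "serach" from "search", but not "invesdigation").

```python
def adjacent_swaps(word):
    """Return every variant made by swapping two neighbouring letters."""
    variants = []
    for i in range(len(word) - 1):
        if word[i] != word[i + 1]:  # swapping identical letters changes nothing
            variants.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    return variants


def or_query(terms):
    """Join quoted terms with OR so any one of them can match."""
    return " OR ".join(f'"{term}"' for term in terms)


names = ["Stephen Hill", "Steven Hill", "Steve Hill"]
name_query = or_query(names)
typos = adjacent_swaps("search")

print(name_query)
print(typos)
```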
Wildcards *
In most search engines and directories, a search for
investigat*
will give you pages with the words including:
investigate, investigated, investigation, investigator
Note: Google uses a process called stemming
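The effect of a trailing wildcard can be reproduced locally with Python's standard `fnmatch` module. This is only a sketch of the matching idea; the word list is made up for the example and no search engine is queried.

```python
from fnmatch import fnmatchcase

# Sample vocabulary (illustrative); the last two words should NOT match.
words = [
    "investigate",
    "investigated",
    "investigation",
    "investigator",
    "invest",
    "investment",
]

# "investigat*" matches any word that starts with "investigat".
matches = [word for word in words if fnmatchcase(word, "investigat*")]
print(matches)
```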
Truncation & Wildcards *
Other ways to search using the *
"* * director of HTC Parking and Security Limited" = ?
"Ms Anna Koltsova phone *" = ?
"the * population of Auckland is" = ?
Parentheses
▪ Require the terms and operations that occur inside the brackets to be searched first
▪ This is called "nesting"
“identity theft” ((organized OR organised) -crime)
▪ Parentheses MUST BE USED to group terms joined by OR when there is any other Boolean operator in the search…
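Why the grouping matters can be seen by evaluating the nested query against a few sample texts, with and without brackets. This is a sketch under stated assumptions: the sample texts are invented, and the un-nested version relies on Python's own precedence (AND before OR) as a stand-in, since individual search engines may parse un-nested queries differently.

```python
def matches_nested(text):
    """“identity theft” ((organized OR organised) -crime), with explicit grouping."""
    t = text.lower()
    return "identity theft" in t and (
        ("organized" in t or "organised" in t) and "crime" not in t
    )


def matches_unnested(text):
    """Same terms without brackets: AND binds tighter than OR in Python."""
    t = text.lower()
    return "identity theft" in t and "organized" in t or "organised" in t and "crime" not in t


samples = [
    "Identity theft by an organized gang",
    "Identity theft and organised crime",
    "Organised fraud ring jailed",  # no mention of identity theft
]

for text in samples:
    print(text, matches_nested(text), matches_unnested(text))
```

The third sample is the interesting one: without brackets it is returned even though it never mentions identity theft.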
Keyword Searching
Finding Archived Web Pages
https://archive.org/web
Internet Archive
http://archive.org/web
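Archived copies can also be looked up programmatically through the Internet Archive's Wayback availability endpoint. The sketch below builds the request URL and parses the JSON reply; the helper names are hypothetical, `example.com` is a placeholder target, and the `lookup` function needs network access, so it is defined but not called here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{API}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the URL of the closest archived snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page, timestamp=None):
    """Query the live service (requires network access)."""
    with urlopen(availability_url(page, timestamp), timeout=10) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))


print(availability_url("example.com", "20160101"))
```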
News Links
http://www.onlinenewspapers.com/
http://www.world-newspapers.com/
http://www.listofnewspapers.com/
http://www.refdesk.com/paper.html
http://www.allyoucanread.com/
http://www.actualidad.com/
http://www.thepaperboy.com/newspapers-by-country.cfm
http://news.silobreaker.com/
http://www.newsola.com
Real Time News
Classifieds - A Criminal Hotspot?
People Search
https://pipl.com
Company Search
https://opencorporates.com
https://www.gov.uk/government/publications/overseas-registries/overseas-registries
Paste Sites – What Could You Find?
▪ Paste sites are websites allowing users to upload text for public viewing.
▪ Originally designed for software developers who needed a place to store large amounts of text
▪ Links would be created to the text and the user could share the link with other programmers to review the code.
▪ Many hacking groups use this area of the Internet to store compromised data.
▪ Most popular site – ‘Pastebin’
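As one example of what an investigator might look for in a paste, the sketch below pulls email addresses out of a block of text that has already been saved locally. The regular expression is a simple approximation rather than a full address validator, `find_emails` is a hypothetical helper, and the sample text and addresses are invented.

```python
import re

# Simple approximation of an email address; not a complete validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return the unique email addresses found in text, lower-cased and sorted."""
    return sorted({match.lower() for match in EMAIL_RE.findall(text)})


# Invented sample standing in for the contents of a saved paste.
sample_paste = """
staff list
alice@example.com
Bob <BOB@example.org>, alice@example.com
"""

emails = find_emails(sample_paste)
print(emails)
```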
Tools for Social Media Intelligence
Facebook Search
LinkedIn Search
https://www.linkedin.com/help/linkedin/answer/76015
Twitter Search
Social Searcher
http://www.social-searcher.com
Reverse Image & EXIF Extraction
Reverse Image Search
http://www.tineye.com
Metadata (EXIF)
▪ Exchangeable Image File Format
▪ Standard that specifies the formats for images, sound, and ancillary tags used by digital cameras (including smartphones), scanners etc
▪ Applied to JPEG & TIFF images and can include:
▪ Original image date & time, modified date & time
▪ Camera details including ‘geolocation’ settings…
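EXIF geolocation is stored as degrees, minutes and seconds plus a hemisphere reference (N/S/E/W), whereas most mapping tools expect signed decimal degrees. A minimal conversion sketch follows; `dms_to_decimal` is a hypothetical helper and the coordinates are made-up sample values, not taken from any real image.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus N/S/E/W reference to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# Made-up sample values in the form a GPS-tagged photo would carry them.
latitude = dms_to_decimal(36, 50, 54, "S")
longitude = dms_to_decimal(174, 45, 48, "E")

print(round(latitude, 6), round(longitude, 6))
```

The resulting pair can be pasted into a mapping service to see where the photo was taken.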
EXIF Sites to Consider
Jeffrey’s EXIF Viewer
▪ http://regex.info/exif.cgi
Others
▪ http://www.takenet.or.jp/~ryuuji/minisoft/exifread/english/
▪ http://www.impulseadventure.com/photo/jpeg-snoop.html
▪ http://www.sno.phy.queensu.ca/~phil/exiftool
Camera Trace
▪ http://cameratrace.com/trace
▪ http://www.stolencamerafinder.com
Video Metadata
▪ https://mediaarea.net/en/MediaInfo
Where Was This Taken?
Tracing Location of a Photo
https://petapixel.com/assets/uploads/2012/12/fugitivemcafee.jpg
WHOIS
http://whois.domaintools.com/planethollywoodlondon.com
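Web front ends such as the one above sit on top of the plain WHOIS protocol: a TCP connection to port 43, the query followed by a line break, then the reply until the server closes the connection. The sketch below shows that exchange; the helper names are hypothetical, `whois.iana.org` is used as a starting server (it typically refers you on to the registry for the relevant top-level domain), and `whois_query` needs network access, so it is defined but not called here.

```python
import socket


def build_whois_request(domain):
    """WHOIS requests are the query text terminated by CRLF."""
    return (domain.strip().lower() + "\r\n").encode("ascii")


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send a WHOIS query and return the raw text reply (requires network access)."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_whois_request(domain))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when the reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


print(build_whois_request("planethollywoodlondon.com"))
```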
Hiding Your Identity Online
Disguising your ID
▪ Every time you surf the Internet, your IP address is publicly visible to everyone on target network resources
▪ It is important therefore not to leave a digital footprint...
Sock (Finger) Puppets
4 steps to create a sock puppet:
▪ Create fake ID – use name generator
▪ Create fake profiles/user accounts on Facebook etc.
▪ Fake/disguised email, phone and IP details
▪ Consider payment method – pre-paid credit card…
http://www.fakenamegenerator.com
Disguising Your Online ID
Proxy and VPN services re-route your internet traffic and change your IP
A Proxy is like a web filter
▪ Proxy will only secure traffic via the internet browser using the proxy server settings
A VPN encrypts all of your traffic
▪ VPNs route all traffic through the VPN server, including all programs and applications, so your ISP sees only an encrypted connection to that server...
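The per-application nature of a proxy can be seen in code: only requests sent through the configured client use it, while everything else on the machine connects directly. The sketch below uses Python's standard `urllib`; the proxy address is a hypothetical placeholder and no request is actually sent.

```python
from urllib.request import ProxyHandler, build_opener

# Hypothetical proxy address; replace with a proxy you are authorised to use.
PROXY = "http://127.0.0.1:8080"


def make_proxied_opener(proxy_url):
    """Return an opener whose HTTP and HTTPS requests go via proxy_url."""
    handler = ProxyHandler({"http": proxy_url, "https": proxy_url})
    return build_opener(handler)


opener = make_proxied_opener(PROXY)

# Only calls made through this opener, e.g. opener.open(url), use the proxy.
# Other programs, and other openers in this same script, are unaffected,
# which is the key difference from a VPN that carries all traffic.
print([type(h).__name__ for h in opener.handlers])
```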
TOR
https://www.torproject.org
“Tor protects you by bouncing your communications around a distributed network of relays run by volunteers all around the world:
It prevents somebody watching your Internet connection from learning what sites you visit, and it prevents the sites you visit from learning your physical location.
Tor works with many of your existing applications, including web browsers, instant messaging clients, remote login, and other applications based on the TCP protocol”.
Who is using Tor?
▪ Normal people (e.g. protect their browsing records)
▪ Militaries (e.g. military field agents)
▪ Journalists and their audiences
(e.g. citizen journalists encouraging social change)
▪ Law enforcement officers (e.g. for online “undercover” operations)
▪ Activists and Whistleblowers (e.g. avoid persecution while still raising a voice)
▪ Bloggers
▪ IT professionals (e.g. during development and operational testing, access
internet resources while leaving security policies in place)
Tor Project
Some of the software and services under the Tor project umbrella:
▪ Torbutton
▪ Tor Browser Bundle
▪ Vidalia
▪ Orbot
▪ Tails
▪ Onionoo
▪ Metrics Portal
▪ Tor Cloud
▪ Shadow
▪ Tor2web…
Tails
https://tails.boum.org
TOR to Web
https://tor2web.org
VPN Options
https://www.privateinternetaccess.com
How Safe is your Browser?
https://panopticlick.eff.org
Public Vote on Secure Browser
Source: Sensors Tech Forum (http://sensorstechforum.com)
The users voted that the most secure browsers are:
▪ Google Chrome - 49% or 296 votes
▪ Mozilla Firefox - 31% or 187 votes
▪ Internet Explorer - 7% or 43 votes
▪ Safari and Opera both got 4% or 25 votes
▪ Microsoft Edge - 3% or 19 votes
▪ Maxthon - 1% or 9 votes…
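The percentages can be checked against the vote counts. The sketch below assumes that the options listed make up the whole poll (the list is truncated in the source, so this may not hold) and that Safari and Opera received 25 votes each.

```python
# Vote counts as listed above; Safari and Opera are assumed to have 25 each.
votes = {
    "Google Chrome": 296,
    "Mozilla Firefox": 187,
    "Internet Explorer": 43,
    "Safari": 25,
    "Opera": 25,
    "Microsoft Edge": 19,
    "Maxthon": 9,
}

total = sum(votes.values())

# Share of the listed votes, rounded to the nearest whole percent.
shares = {name: round(100 * count / total) for name, count in votes.items()}

print(total)
print(shares)
```

Under those assumptions the rounded shares agree with the published figures.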
http://sensorstechforum.com/which-is-the-most-secure-browser-for-2016-firefox-chrome-internet-explorer-safari-2
Final Considerations
Other questions should also be taken into consideration in addition to securing your web browser:
▪ Do you update your browser whenever a new version is available?
▪ Have you configured your browser updates as automatic?
▪ Do you use third-party browser add-ons and plugins, and if yes, are you familiar with their developers?
▪ Do you install third-party software from unknown download pages, without paying attention to the Download Agreement?