Internet Marketing Overview: Part 2 of 12

George A. Rubsam, @GARubsam


Background: Websites

• 60 trillion webpages and constantly growing.
• The name Google comes from "googol": the number 1 followed by 100 zeros.
• Build your website for users (first) and search engines (second).
  - Know what you are trying to accomplish with your website (Chapter 1).
  - Know the audience for which you are building the site: needs, wants, internet behaviors, etc.


Background: Market Share

• Google U.S. market share has plateaued at 67%

• Google European market share is 90%+


Position Matters

• #1 position = 18.2% of all clicks
• #2 position = 10.1%
• #3 position = 7.2%
• #4 position = 4.8%
• #5 position and greater = <2%
• Top ten positions = 52.32%

• Google Search
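To put the click shares above in concrete terms, the following minimal sketch estimates clicks per position for one keyword. The monthly search volume of 1,000 is an assumed figure for illustration; only the percentages come from the slide.

```python
# Click share by search-result position, taken from the figures above.
CLICK_SHARE = {1: 0.182, 2: 0.101, 3: 0.072, 4: 0.048}

def estimated_clicks(position: int, monthly_searches: int) -> float:
    """Estimate clicks for a result at `position` (only positions 1-4 are listed)."""
    return CLICK_SHARE[position] * monthly_searches

# Hypothetical keyword with 1,000 searches per month (assumed value).
searches = 1000
for pos in sorted(CLICK_SHARE):
    print(f"Position {pos}: about {estimated_clicks(pos, searches):.0f} clicks")

# Share of clicks captured by the top four positions combined.
top_four_share = sum(CLICK_SHARE.values())
print(f"Top four positions: {top_four_share:.1%} of clicks")
```

Under these assumptions, moving from position 4 to position 1 would nearly quadruple the estimated clicks for the same keyword.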

How Search Engines Work: Crawling and Indexing


How Search Engines Work

• Start with a “Spider” or “Webcrawler”
  - Software that reads webpages.
    • Text
    • Meta tags/HTML code
  - It creates an index of content on each page.
  - Follows every link and indexes those pages too.
  - It indexes entire site maps/website architecture.
  - Creates a searchable index database noting the content on every readable webpage.
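The link-following behavior described above can be sketched in a few lines. This is a simplified illustration, not how any real spider is implemented: the `SITE` dictionary is a hypothetical in-memory website standing in for pages that a real crawler would fetch over HTTP and parse.

```python
from collections import deque

# Hypothetical in-memory "website": URL -> (page text, outgoing links).
# A real spider would fetch pages over HTTP and parse the HTML instead.
SITE = {
    "/": ("welcome to our shirt shop", ["/shirts", "/about"]),
    "/shirts": ("cotton shirts and linen shirts", ["/", "/coats"]),
    "/coats": ("winter coats", []),
    "/about": ("about our company", ["/"]),
}

def crawl(start: str) -> dict[str, str]:
    """Visit every page reachable from `start` and return URL -> text."""
    seen = {start}
    queue = deque([start])
    pages = {}
    while queue:
        url = queue.popleft()
        text, links = SITE[url]
        pages[url] = text          # record what the spider "read"
        for link in links:         # follow every link exactly once
            if link in SITE and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

pages = crawl("/")
print(sorted(pages))
```

The `seen` set keeps the spider from looping forever when pages link back to each other, as "/about" and "/" do here.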


What is an Index?

• Example: The index at the back of a book lists keywords, people, concepts, events, page numbers, etc.
• Without an index, every webpage would have to be scanned, then ranked, which could take hours or days.
  - Ex: You would have to scan every page of the book to find the relevant pages.
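The book-index analogy corresponds to what is commonly called an inverted index: a mapping from each word to the pages that contain it. The sketch below contrasts a lookup with a full scan; the page texts and function names are assumed sample data, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical page texts, keyed by URL (assumed sample data).
PAGES = {
    "/shirts": "cotton shirts and linen shirts",
    "/coats": "winter coats and rain coats",
    "/about": "we sell shirts and coats",
}

def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def scan(pages: dict[str, str], word: str) -> set[str]:
    """Slow path: read every page in full for every query."""
    return {url for url, text in pages.items() if word in text.lower().split()}

index = build_index(PAGES)
# One dictionary lookup replaces reading every page.
print(sorted(index["coats"]))
```

Both approaches return the same pages; the index simply does the reading once, ahead of time, rather than once per query.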


Indexed Data Trade-Off

• Example: Shirt company expands product line to include coats.
• Google indexes the Internet in two to four weeks.
  - Relevance to “coat” keyword searches could take weeks or months.
  - Exceptions: Viral videos and lots of press
• Users save time in search, but there could be a delay in being considered relevant to certain keywords.

http://www.dailymail.co.uk/femail/article-3177059/Windbreaker-branded-Swiss-Army-jacket-15-pockets-neck-pillow-blanket-iPhone-charger.html


What Spiders See


Indexing Keywords, Files & Multimedia

• Spiders easily read text files, but are challenged by images, video, or animations (e.g., Flash or Java).

• Algorithms look at the webpage content and HTML code associated with the file.

http://www.jibjab.com/

https://www.google.com/search?q=java+for+android+apps&espv=2&biw=1810&bih=1126&source=lnms&sa=X&ved=0CAUQ_AUoAGoVChMI15ixlcPmxwIVSC2ICh0VjQqN&dpr=1.1#q=java+android+apps

How Search Engines Work: Providing Answers


What is a Search Engine?

• A program that
  - Accepts a query and searches an index database.
  - Using algorithms, it finds keyword combinations in a database that correspond with a query.
  - Ranks all the webpages that are relevant to the keywords, not just the ones that matched.
  - Provides a list of webpages in order of relevance.
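The query-to-ranked-list flow above can be illustrated with a toy example. The index contents and the scoring rule (one point per matched keyword) are assumptions chosen for brevity; real relevance scoring is far more involved.

```python
# Hypothetical pre-built index: keyword -> pages containing it (assumed data).
INDEX = {
    "evening": {"/gowns", "/cocktail", "/events"},
    "dresses": {"/gowns", "/cocktail", "/sale"},
    "shoes": {"/sale"},
}

def search(query: str) -> list[str]:
    """Return pages ordered by how many query keywords they match."""
    scores: dict[str, int] = {}
    for word in query.lower().split():
        for page in INDEX.get(word, set()):
            scores[page] = scores.get(page, 0) + 1
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda page: (-scores[page], page))

print(search("evening dresses"))
```

Pages that match only one of the two keywords still appear in the list, just lower down, which mirrors the point that results are ordered by relevance rather than filtered to exact matches.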


Types of Queries

• Do
  - Accomplishment: buy a plane ticket or shoes.
• Know
  - Information: names of American designers, e.g., Ralph Lauren.
• Go
  - Visit Bloomingdales (specific), department stores (less specific), shoe stores (least specific).


Keyword: Fashion

• Trends
• Magazines
• News (weather.com)
• Content Aggregators
• Wikipedia
• Images + Categories


Relevance Ranking Factors
Ranked most impactful to least

• Domain-Level, Link Authority Features
  - Quantity of links, trust, domain-level PageRank, etc.
• Page-Level Link Metrics
  - Quality/spamminess of linking sources, etc.
• Page-Level Keyword & Content-Based Metrics
  - Content relevance scoring, on-page optimization of keyword usage, topic-modeling algorithm scores on content, content quantity/quality/relevance, etc.
• Page-Level, Keyword-Agnostic Features
  - Content length, readability, uniqueness, load speed, etc.
• User Usage & Traffic/Query Data
  - Search Engine Results Page (SERP) engagement metrics, clickstream data, visitor traffic/usage signals, quantity/diversity/CTR of queries, both on the domain and page level.
• Domain-Level Brand Metrics
  - Usage of brand/domain name relative to mentions in news/media/press, browser data of usage for specific site/page.
• Domain-Level Keyword Usage
  - Exact-match keyword domains, partial-keyword matches.
• Domain-Level, Keyword-Agnostic Features
  - Domain name length
• Page-Level Social Metrics
  - Quantity/quality of tweeted links, Facebook shares, Google +1s, etc. to the page


Relevance Ranking Factors

• How many links from the site to other relevant sources?
• Trustworthy server? Long server history?
• Quality of sites linking back to your content.
• Content readable, unique or “scraped”?
• Are people engaging with the content?
• Is the site/brand mentioned in credible news sources?
• Does the site have exact or partial-keyword matches?
• Is the domain relevant to the query?
• On what social media platforms and how often is your site/brand mentioned?


Ranking Logic: Keyword Hits

Hit Type      Weight*
URL           100
Title Tag     95
Anchor Text   90
Text large    60
Text medium   30
Text small    10

*For context only and not accurate.

• Notice how the search engine knows that "evening dresses" relates to gowns, cocktail dresses, formal dresses, etc.

• Google uses 200+ considerations to rank content.


Ranking Logic: Keyword Hits

Hit Type              Weight   No. of Hits   Weight x Hits
URL                   100      1             100
Title Tag             95       1             95
Anchor Text (links)   90       5             450
Text large font       60       1             60
Text medium font      30       3             90
Text small font       10       20            200
Relevance Score                              995

For context only and not accurate.
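The relevance score in the table is a weighted sum: each hit type's weight multiplied by its number of hits, added together. A minimal sketch of that arithmetic follows, using the table's own illustrative figures; the function and variable names are assumptions.

```python
# (hit type, weight, number of hits) rows copied from the table above.
# The weights are illustrative only, as the slide itself notes.
HITS = [
    ("URL", 100, 1),
    ("Title Tag", 95, 1),
    ("Anchor Text (links)", 90, 5),
    ("Text large font", 60, 1),
    ("Text medium font", 30, 3),
    ("Text small font", 10, 20),
]

def relevance_score(hits):
    """Sum weight x hits over every hit type."""
    return sum(weight * count for _, weight, count in hits)

score = relevance_score(HITS)
print(score)  # 995
```

Note how five anchor-text hits (450) outweigh twenty small-font hits (200): where a keyword appears matters more than how often.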


Ranking Logic: Keywords

• Number of times a word appears on a page.
• Where on the page terms occur (distribution)
  - URL, header, headlines, body copy, footer
• The main theme and topics (on-topic issues) of the page.
  - Ray Ban Sunglasses referenced on TMZ vs. PopSugar.
• Relative distance between keywords (proximity)
  - Evening dresses … … … … evening dresses
  - Evening dresses … evening dresses (most relevant)
  - Evening dresses, evening dresses, evening dresses, etc.
• How often individual and related terms occur (occurrence)
  - Evening dresses … evening … dresses … dress … related words (dinner, dancing, gowns, party planning, etc.)
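Two of the signals above, frequency and proximity, are easy to measure on a piece of text. The sketch below is one simple way to do it, not a description of any search engine's actual method; the sample copy and helper names are hypothetical.

```python
from typing import Optional

def phrase_positions(text: str, phrase: str) -> list[int]:
    """Return the word offsets at which `phrase` starts in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == target]

def min_gap(positions: list[int], phrase_len: int) -> Optional[int]:
    """Fewest words separating two consecutive occurrences, or None."""
    gaps = [b - a - phrase_len for a, b in zip(positions, positions[1:])]
    return min(gaps) if gaps else None

# Hypothetical body copy (assumed sample text).
copy = "evening dresses for every party and evening dresses for dinner"
hits = phrase_positions(copy, "evening dresses")
print(len(hits))          # frequency: how many times the phrase occurs
print(min_gap(hits, 2))   # proximity: words between the two occurrences
```

A gap of zero would mean the phrase is repeated back to back, which matches the keyword-stuffing pattern in the third proximity example.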

Search Engine Limitations


Search Engine Limitations

• Search engines struggle with completing login/online forms.
  - CAPTCHA protects forms from bots.
• “Robots.txt” code purposely blocking search engines.
• “Nofollow links” code blocks search engines from following links.
• Minimally-exposed content may be deemed unimportant by the engine's index.
• Uncommon terms normally unused in search.
  - Ex: "food cooling units" vs. "refrigerators"
• International languages
  - Color vs. Colour
  - Content in French when the majority of the visitors are from Japan.
• Mixed context signals
  - Blog post reads "Mexico's Best Coffee" but content is about a Canadian vacation resort that serves great coffee.


Broken Link Structures

• Spiders can reach page A and see links to pages B and E.

• Without a link pointing to them, pages C and D are not accessible.
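The broken-link situation can be checked mechanically by comparing the pages a spider can reach with the pages that exist. In the sketch below, the link graph is an assumption reconstructed from the two bullets above (A links to B and E; nothing links to C or D).

```python
# Hypothetical link graph matching the bullets above: A links to B and E,
# while nothing links to C or D (the exact graph is an assumption).
LINKS = {"A": ["B", "E"], "B": [], "C": [], "D": [], "E": []}

def reachable(start: str) -> set[str]:
    """Pages a spider can reach by following links from `start`."""
    found, stack = {start}, [start]
    while stack:
        for target in LINKS[stack.pop()]:
            if target not in found:
                found.add(target)
                stack.append(target)
    return found

# Pages that exist on the site but that no crawl path leads to.
orphans = set(LINKS) - reachable("A")
print(sorted(orphans))  # ['C', 'D']
```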


Limiting Search Engine Access

• The robots exclusion protocol (REP), or robots.txt, is a text file webmasters create to instruct robots (search engine software) how to crawl and index pages on their website.
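As an illustration of how a well-behaved robot might honor these instructions, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt rules, the "ExampleBot" user agent, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (assumed for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks each URL before fetching it.
allowed_home = parser.can_fetch("ExampleBot", "https://example.com/index.html")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/data.html")
print(allowed_home, allowed_private)
```

Compliance is voluntary: robots.txt tells crawlers what the site owner wants, but it does not technically prevent access.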

Addendum


Resources

• Video: How Search Works
  https://www.youtube.com/watch?v=BNHR6IQJGZs
• Interactive Page: How Search Works
  http://www.google.com/insidesearch/howsearchworks/thestory/index.html
• Internet Archive: Wayback Machine
  http://archive.org/web/


Anatomy of a Link

• The "<a" tag indicates the start of a link.
• The link referral location (the href attribute) tells the browser (and the search engines) where the link points.
• Next, the visible portion of the link for visitors, called anchor text in the SEO world, describes the page the link points to.
• The "</a>" tag closes the link to constrain the linked text between the tags and prevent the link from encompassing other elements on the page.
• The crawlers can read this basic link and use it to calculate query-independent variables, and follow it to index the contents of the referenced page.
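To show how a crawler might pull these parts out of a page, the sketch below uses Python's standard `html.parser` module to collect each link's destination and anchor text. The sample markup and the `LinkExtractor` class are hypothetical, and example.com is a placeholder URL.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished (href, anchor text) pairs
        self._href = None     # href of the link currently open, if any
        self._text = []       # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# Hypothetical markup (the example.com URL is a placeholder).
html = '<p>See <a href="http://www.example.com/gowns">Evening Dresses</a> today.</p>'
extractor = LinkExtractor()
extractor.feed(html)
print(extractor.links)
```

The text outside the tags ("See" and "today.") is ignored, which is the constraint the closing tag provides.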


Ranking Factors Survey

Questions

Part 3 of 12 is up next.