Infinite Loops Dirty Architecture And Too Many Indexed URLs
INFINITE LOOPS, DIRTY ARCHITECTURE & crawl rank
Dawn Anderson
CAME TO THIS INDUSTRY VIA A DIFFERENT ROUTE
I decided to add an additional dimension
to the site
TO ‘EXPLODE’ NATURAL SEARCH TRAFFIC
1.5 Million URLs
Crawl Rate: Going Down
Indexation Levels: Going Up
GOOGLE: Only crawling 0.1% of our pages per day
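To put those two figures side by side, here is a rough back-of-envelope sketch. It assumes the 0.1% applies to the full 1.5 million URLs and that Googlebot never re-crawls a URL before finishing a pass, which is a best case; the variable names are illustrative only.

```python
from fractions import Fraction

# Figures taken from the slides above.
total_urls = 1_500_000
daily_crawl_share = Fraction(1, 1000)  # 0.1% of pages per day

# Assumption: the 0.1% is measured against all 1.5 million URLs.
urls_crawled_per_day = total_urls * daily_crawl_share

# Assumption (best case): no URL is crawled twice before every URL is seen once.
days_for_one_full_pass = 1 / daily_crawl_share

print(int(urls_crawled_per_day), "URLs per day")
print(int(days_for_one_full_pass), "days for one full pass")
```

Under those assumptions a single pass over the site takes the best part of three years, which is why the falling crawl rate matters.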
Infinite Loop Definition: An infinite loop is a sequence of instructions in a computer program which loops endlessly, either due to the loop having no terminating condition, having one that can never be met, or one that causes the loop to start over.
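The definition above is about programs, but the same thing can happen in a site's URL space. The sketch below is a hypothetical example, not the site from this talk: a made-up template (`outlinks`) links every page to a copy of itself with one more parameter, so a crawler never runs out of new URLs unless it is given a cap (`max_urls`).

```python
from collections import deque

def outlinks(url):
    """Hypothetical buggy template: each page links to itself plus one more parameter."""
    separator = "&" if "?" in url else "?"
    return [url + separator + "sort=asc"]

def crawl(start_url, max_urls):
    """Breadth-first crawl that stops only because of the max_urls cap."""
    seen = set()
    queue = deque([start_url])
    # Without the len(seen) check this loop has no terminating condition,
    # because outlinks() always produces a URL that has not been seen yet.
    while queue and len(seen) < max_urls:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        queue.extend(outlinks(url))
    return seen

found = crawl("http://www.example.com/shoes", max_urls=50)
print(len(found), "distinct URLs from a single real page")
```

Every one of those URLs would render the same content, which is how a small site can end up with a very large number of indexed URLs.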
PENGUIN & PANDA updates came along
TOO MANY URLS = SEO DEATH
‘WE’RE ALL DOOMED’
CRAWL Budget
Roughly proportionate to PageRank
Pages with a lot of links get crawled more
Still applies in current search landscape
CRAWL Rank
A ranking metric for ‘no’ to ‘low’ PageRank pages??
Pages crawled more often rank higher
Get ‘low’ to ‘no’ PageRank pages crawled more than competitors = YOU WIN
CRAWL OPTIMISATION
FIND OUT WHERE Googlebot goes
AND KEEP WATCHING
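One way to find out where Googlebot goes is to count its requests in the server access log. This is a minimal sketch that assumes the common "combined" log format; the sample lines, paths and the function name `googlebot_hits` are made up for illustration. The user-agent string can be spoofed, so treat the counts as indicative only.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests per path where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

# Hypothetical sample data, not real log entries.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'66.249.66.1 - - [10/Oct/2014:13:55:36 +0000] "GET /shoes/red HTTP/1.1" 200 2326 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2014:13:55:40 +0000] "GET /shoes/red HTTP/1.1" 200 2326 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2014:13:56:02 +0000] "GET /shoes?sort=asc HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    '10.0.0.7 - - [10/Oct/2014:13:57:11 +0000] "GET /dresses HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

for path, count in googlebot_hits(sample_log).most_common():
    print(count, path)
```

Run daily over real logs, the same count shows whether the crawl is being spent on the pages that matter or on parameter variants.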
CHECK & MONITOR for over-indexation
500 Page Website, 50,00 URLs in Google
YOU MAY HAVE DODGY CODE
Shoes.sitemap.xml
Dresses.sitemap.xml
tshirts.sitemap.xml
Check THOROUGHLY, Name & Categorise XML Sitemaps
yoursite.sitemap.xml
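A categorised set of sitemaps like the ones named above can be tied together with a sitemap index file. The sketch below builds one with Python's standard library; the base URL and the function name `build_sitemap_index` are assumptions for illustration, and the file names are the ones from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(base_url, sitemap_files):
    """Return a sitemap index document listing one <sitemap> per category file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in sitemap_files:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/{name}"
    return ET.tostring(root, encoding="unicode")

index_xml = build_sitemap_index(
    "http://www.example.com",  # hypothetical site
    ["Shoes.sitemap.xml", "Dresses.sitemap.xml", "tshirts.sitemap.xml"],
)
print(index_xml)
```

Splitting by category means indexation can be checked per section rather than for the site as one lump.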
DON’T BE AFRAID of hard 404s
Use 410s where you can
Giraffe
AVOID soft 404s
ENSURE THAT dynamic variables / parameters are validated
Don’t render just any old thing with a ‘200 OK’ response code or return a soft 404
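As a rough sketch of what that validation can look like, the function below picks a status code for a requested product value instead of rendering a page for anything it is given. The product sets and the name `status_for` are hypothetical; a real site would look the value up in its own catalogue.

```python
from http import HTTPStatus

# Hypothetical catalogue data for illustration only.
LIVE_PRODUCTS = {"red-shoes", "blue-dress"}
RETIRED_PRODUCTS = {"green-hat"}  # existed once, removed for good

def status_for(product_slug):
    """Choose the response code for a requested product value."""
    if product_slug in LIVE_PRODUCTS:
        return HTTPStatus.OK          # real page, render it
    if product_slug in RETIRED_PRODUCTS:
        return HTTPStatus.GONE        # 410: it was here and is not coming back
    return HTTPStatus.NOT_FOUND       # hard 404 for any value we do not recognise

for slug in ["red-shoes", "green-hat", "giraffe"]:
    status = status_for(slug)
    print(slug, status.value, status.phrase)
```

The important part is the last branch: an unrecognised value gets a hard 404 rather than a page with a 200.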
HOW WILL YOU KNOW IF THERE’S A PROBLEM?
You won’t
AVOID A ‘JUMBLE SALE’
BUT
Use robots.txt, nofollows, sitemaps, nav paths & cross-module internal linking
‘Herd’ Googlebot
Get Those Low Level Pages Crawled - Often
Whichever way you can
Pass equity to Siblings as Well as children
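For the robots.txt part of herding Googlebot, it helps to test rules before they go live. This is a small sketch using Python's standard `urllib.robotparser`; the rules and URLs are made-up examples, and the standard-library parser does plain prefix matching, so it may not behave exactly as Googlebot does with more elaborate patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of low-value sections.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /basket
"""

def allowed(url, agent="Googlebot"):
    """Return True if the rules above let the given agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

for url in [
    "http://www.example.com/shoes/red",
    "http://www.example.com/search/red",
    "http://www.example.com/basket/checkout",
]:
    print(url, "->", "crawl" if allowed(url) else "blocked")
```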
Visit the internal links section on GWT
Most Important Page 1
Most Important Page 2
Most Important Page 3
IS THIS YOUR BLOG?? HOPE NOT
CANONICALISATION
In web search and search engine optimization (SEO), URL canonicalization deals with web content that has more than one possible URL. Having multiple URLs for the same web content can cause problems for search engines - specifically in determining which URL should be shown in search results.
Example:
• http://wikipedia.com
• http://www.wikipedia.com
• http://www.wikipedia.com/
• http://www.wikipedia.com/?source=asdf
All of these URLs point to the homepage of Wikipedia, but a search engine will only consider one of them to be the canonical form of the URL. (source: Wikipedia)
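The four example URLs above can be collapsed to one form with a few normalisation rules. The sketch below is one possible set of rules, not a standard: it assumes the `www` host is the preferred one, that a bare host gets a trailing slash, and that `source` is a tracking parameter safe to drop. The name `canonical_url` is illustrative.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: parameters that never change the content of the page.
TRACKING_PARAMS = {"source"}

def canonical_url(url):
    """Normalise a URL to a single preferred form (assumed rules, see lead-in)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host          # assumption: the www host is preferred
    path = parts.path or "/"          # empty path and "/" are the same page
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))

examples = [
    "http://wikipedia.com",
    "http://www.wikipedia.com",
    "http://www.wikipedia.com/",
    "http://www.wikipedia.com/?source=asdf",
]
for url in examples:
    print(url, "->", canonical_url(url))
```

The value this produces is what would go into a rel="canonical" link or be used as the target of a 301.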
Deal Well With Near & near duplicate content
Via canonicalization, 301s & Content Build Out
STOP LYING & ‘GET FRESH’
Genuine ‘last modified dates’ are ALL important - FORGET PRIORITY
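In sitemap terms that means writing a real `<lastmod>` for each URL and leaving `<priority>` out. A minimal sketch, assuming you can supply the date each page's content actually last changed; the URLs, dates and the name `build_urlset` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_urlset(pages):
    """Build a sitemap from (url, last_changed_date) pairs, with no <priority>."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_changed in pages:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        # Use the date the content genuinely changed, not the date the file was generated.
        ET.SubElement(url, "lastmod").text = last_changed.isoformat()
    return ET.tostring(root, encoding="unicode")

# Hypothetical pages and dates for illustration only.
sitemap_xml = build_urlset([
    ("http://www.example.com/shoes/red", date(2014, 10, 1)),
    ("http://www.example.com/shoes/blue", date(2014, 6, 15)),
])
print(sitemap_xml)
```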
"It's not that Google will penalize you, it's the opportunity cost for dirty architecture based on a finite crawl budget" (A.J.Kohn) (BLIND FIVE YEAR OLD)
REMEMBER THIS
Me: @dawnieando