1
Promoting Your Project Web Site
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
England
UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
[email protected]
http://www.ukoln.ac.uk/
Project Manager for Exploit Interactive web magazine
http://www.exploit-lib.org/
2
Approaches
What approaches can we take to raising the profile of our web site?
• Tell our friends and colleagues (at conferences in exotic places)
• Give away pens and bags
• Let it happen automatically
• Submit resources
• Perhaps giving parts of our web site away?
3
Automated Indexing
Many users use search engines such as AltaVista, HotBot, Northern Light, etc. to find resources.
Issues:
• Will my site be indexed?
• Will it be near the top of a sensible search query?
• How can I improve things?
4
Problems in Being Indexed
Size of Index
Search engines are failing to keep up with the growth of the web
Not all pages on a web site will be indexed
Typically a 500 page sample will be indexed
Frames (and "splash screens")
Many indexing robots can't access framed web sites or web sites which use "splash screens"
5
Improving Indexing of Key Resources
How to ensure that quality pages are indexed:
• Don't publish non-work pages on the server
• Move from a single large institutional server to multiple (real or virtual) servers:
Instead of <www.ukoln.ac.uk/exploit/> use <exploit.ukoln.ac.uk/> or (even better) <exploit-lib.org/>
• Avoid use of frames (or provide link to alternative entry point)
These approaches will improve chances of more complete indexing of the web site
6
Improving Indexing (2)
Do you know if your project web site uses the Robot Exclusion Protocol (REP), i.e. a /robots.txt file?
Use the REP to:
• Prevent junk (old or draft versions, experimentation, etc.) from being indexed
Check your /robots.txt file to:
• Ensure that your web site can be indexed
User-agent: *          # Following apply to all robots
Disallow: /cgi-bin/    # Don't index /cgi-bin directory
Disallow: /tmp/        # Don't index /tmp directory
Tools are available to help you manage the robots.txt file. For example RoboGen: <http://www.rietta.com/robogen/>
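One way to check what a /robots.txt file permits is to feed it to a parser and ask about specific paths. The following is a minimal sketch using Python's standard `urllib.robotparser`; the rules are the ones shown on this slide, while the function name `allowed_paths` and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the slide, held in a string so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""


def allowed_paths(robots_txt, paths, agent="*"):
    """Return a {path: bool} map saying whether `agent` may fetch each path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules above.
    sample = ["/", "/issue1/", "/cgi-bin/search", "/tmp/draft.html"]
    for path, ok in allowed_paths(ROBOTS_TXT, sample).items():
        print(f"{path:20} {'indexable' if ok else 'blocked'}")
```

To test a live site instead, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file over HTTP.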
7
Improving Indexing (3)
Updating the /robots.txt file may be difficult. The (new) <META> feature allows HTML authors to control robots.
Use this in key menu pages for resources you don't want indexed.
<META NAME="robots" CONTENT="noindex, nofollow">
[Diagram labels: deliverables, reports, draft, personal]
See <http://info.webcrawler.com/mak/projects/robots/meta-user.html> and <http://www.kollar.com/robots.html>
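To audit which pages carry the robots <META> element, the directives can be read back out of the HTML. The sketch below uses Python's standard `html.parser`; the class and function names are assumptions made for illustration, and only the `noindex, nofollow` example comes from the slide.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <META NAME="robots"> elements."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for item in attributes.get("content", "").split(","):
            if item.strip():
                self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<HEAD><META NAME="robots" CONTENT="noindex, nofollow"></HEAD>'
    print(sorted(robots_directives(page)))
```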
8
Some Solutions (3)
Getting Your Web Site Indexed (cont)
Several search engines allow URLs to be submitted
Bulk Submissions
Turnaround time from a few days to several months
And what about bulk submission services?
9
Some Solutions (4)
Some Submission Engines
http://www.webposition.com/
http://www.netsubmitter.com/
http://www.registerpro.com/
http://www.pegasoweb.com/engenius/
http://www.exploit.com/wizard/
There are products for submitting sites to multiple search engines (and analysing your pages, reporting on your position in search engines, etc.) But:
• How good are they?
• How ethical are they?
• How cost-effective are they?
10
Has It Worked?
How do you know if robots are visiting your web site?
The free BotWatch Perl program will analyse your log files and generate a report on visits by robots.
BotWatch is available at <http://www.tardis.ed.ac.uk/~sxw/robots/botwatch.html>
See also <http://www.botspot.com/>
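The idea of spotting robots in a server log can be sketched in a few lines. This is not BotWatch's own method. It is a simple heuristic which assumes that any host requesting /robots.txt is a robot and that the log is in Common Log Format. The host names in the sample are hypothetical.

```python
import re
from collections import Counter

# Common Log Format: host ident user [date] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)


def robot_visits(log_lines):
    """Count requests per host, for hosts that asked for /robots.txt."""
    matches = [m for m in (LOG_LINE.match(line) for line in log_lines) if m]
    # Heuristic (an assumption): well-behaved robots fetch /robots.txt first.
    robots = {m.group("host") for m in matches if m.group("path") == "/robots.txt"}
    return Counter(m.group("host") for m in matches if m.group("host") in robots)


# Hypothetical log entries, for illustration only.
SAMPLE_LOG = [
    'crawler.example.com - - [18/Nov/1999:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 68',
    'crawler.example.com - - [18/Nov/1999:10:00:02 +0000] "GET /issue3/ HTTP/1.0" 200 5120',
    'pc1.example.ac.uk - - [18/Nov/1999:10:05:00 +0000] "GET /issue3/ HTTP/1.0" 200 5120',
]

if __name__ == "__main__":
    for host, hits in robot_visits(SAMPLE_LOG).most_common():
        print(f"{host}: {hits} requests")
```

Robots that ignore the REP will not be caught by this heuristic; matching on the User-Agent field is a common complement.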
11
Problems in Ranking
Typically large numbers of hits are obtained.
Metadata may help:
<meta name="keywords" content="exploit, web magazine, TAP, telematics">
<meta name="description" content="Exploit Interactive is a ..">
<meta name="DC.Title" content="Exploit ..">
But:
• "AltaVista" and Dublin Core metadata are not supported by all (many?) search engines
• Issues about maintenance of metadata
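One way to ease the maintenance burden is to generate the <meta> elements from a single record rather than editing each page by hand. The sketch below assumes that approach; the function name `meta_tags` and the sample title and description are hypothetical, and only the element names and keywords come from the slide.

```python
from html import escape


def meta_tags(title, description, keywords):
    """Build keywords, description and DC.Title <meta> elements as one string."""
    fields = [
        ("keywords", ", ".join(keywords)),
        ("description", description),
        ("DC.Title", title),
    ]
    # escape() protects against quotes and ampersands in the metadata values.
    return "\n".join(
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in fields
    )


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    print(meta_tags(
        title="Example Project",
        description="An example project web site",
        keywords=["exploit", "web magazine", "TAP", "telematics"],
    ))
```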
12
Some Solutions
Use of "AltaVista" metadata is a must for key pages
Use of Dublin Core:
• Could be used in specialist applications (domain-specific search engines, current awareness services, B2B, etc.)
• Think about additional benefits to you (e.g. local searching, auditing)
• Scope for discussions with search engine vendors?
• Need to think about deployment and maintenance
The Exploit Interactive web magazine uses Dublin Core metadata to enhance local searching. The metadata can also be used by 3rd parties
13
Analysis of NFP Web Sites
A report on an analysis of NFP (National Focal Point) web sites was published in Exploit Interactive issue 3. Of the 10 web sites:
• No significant use of metadata on main entry point
• Six made no use of REP, one disallowed all robots and three made sensible use
• No use of separate domain names
• One framed site
http://www.exploit-lib.org/issue3/nfp-websites/
14
Web Directories
Web directories (e.g. Yahoo!) provide manually-compiled classifications of the web
Benefits to Projects:
• Additional place to be found
• "61% reach in UK Search engine market"
• Can be sensibly classified e.g. Ariadne magazine is in <http://www.yahoo.co.uk/Reference/Libraries/Professional_Resources/Internet_in_Libraries/>
Problems:
• Time-consuming for cataloguers
• Entries can be submitted, but this can be time-consuming
• "..sub-domains have difficulties in getting into Yahoo!"
Compare:
www.ukoln.ac.uk/projects/eu/exploit/
www.ukoln.ac.uk/~exploit/
www.exploit-lib.org
www.ukoln-exploit.ac.uk
15
Submission to Web Directories
It might be worth submitting to web directories such as Yahoo!
Remember that the information will be processed by humans.
See <http://www.searchenginewatch.com/webmasters/>
16
Give Your Web Site Away
Another way to promote your web site is to give it away!
You could give away:
• Parts of the site to robots (e.g. metadata)
• Parts of the interface
• The entire site
Search interface embedded in Exploit Interactive article at <http://www.exploit-lib.org/issue3/nfp-websites/>
You could give away the interface to:
• your local indexer
• a remote indexing service e.g. HotBot
See <www.ariadne.ac.uk/issue21/webwatch/>
17
Give Part of Your Site Away
http://www.omni.ac.uk/other-search/
OMNI gives an example of a site hosting remote search interfaces.
Enhances remote interface, but several issues.
See article at <http://www.ariadne.ac.uk/issue21/webwatch/> for discussion
18
Give Your Web Site Away
Why not have your web site mirrored? Mirrors in, say, USA and Australia will help to promote your service.
Is your web site easily mirrored?
Issues:
• Are relative URLs used?
• Do you use directory structures to delineate areas of your web site?
• If you use server-side scripting for management purposes, do you hide unusual URLs:
/issue1/mag-features.asp            # Problems
/issue1/mag-features/default.asp
/issue1/mag-features/               # Usable on Unix
(also techniques such as Apache rewrites)
If your web site can't be mirrored, can it be preserved?
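The question about relative URLs can be checked mechanically. The sketch below scans a page for links that name the site's own host in full, since these would still point at the original server from a mirror. The class and function names are assumptions made for illustration, and root-relative links (such as /issue1/) are not flagged, although they can also cause problems if a mirror lives in a subdirectory.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def links_that_break_mirrors(html, own_host):
    """Return links that hard-code `own_host` instead of being relative."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        link for link in collector.links
        if (urlsplit(link).hostname or "").lower() == own_host.lower()
    ]


if __name__ == "__main__":
    page = (
        '<a href="http://www.exploit-lib.org/issue1/">Issue 1</a>'
        '<a href="../issue2/">Issue 2</a>'
        '<img src="logo.gif">'
    )
    print(links_that_break_mirrors(page, "www.exploit-lib.org"))
```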
19
Citation
Is your project web site address easy to remember?
Issues:
• Short domain names are a winner
• Short URLs are desirable (try to avoid org. structure)
• Try to cite directories (shorter and less ambiguous):
www.exploit.org/issue1/pride/article.htm    (article.html, article.asp)
www.exploit.org/issue1/pride/               # pride/default.asp
• Very important for web site home page
• Try to avoid use of tilde (~)
• Avoid citing binary files (inaccessible, lack of metadata, alternative versions, etc.)
"Promoting Web Site" Talk
Given on 18 Nov 1999
Slides: [HTML] – [PowerPoint]
See AlertBox column at <http://www.useit.com/alertbox/990321.html>
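Some of the citation guidelines on this slide can be turned into a rough automated check. The sketch below flags a tilde, a cited file rather than a directory, and a binary file. The function name `citation_warnings` and the list of binary extensions are assumptions made for illustration, not a definitive rule set.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed list of binary formats; extend to suit local practice.
BINARY_EXTENSIONS = {".pdf", ".doc", ".ppt", ".zip"}


def citation_warnings(url):
    """Return a list of reasons why `url` may be awkward to cite."""
    # The slides cite URLs without a scheme, so add one before parsing.
    parts = urlsplit(url if "://" in url else "http://" + url)
    warnings = []
    if "~" in parts.path:
        warnings.append("contains a tilde (~)")
    extension = posixpath.splitext(parts.path)[1].lower()
    if extension in BINARY_EXTENSIONS:
        warnings.append("cites a binary file")
    elif extension:
        warnings.append("cites a file rather than a directory")
    return warnings


if __name__ == "__main__":
    for candidate in (
        "www.exploit.org/issue1/pride/article.htm",
        "www.exploit.org/issue1/pride/",
        "www.ukoln.ac.uk/~exploit/",
    ):
        print(candidate, citation_warnings(candidate) or "OK")
```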
20
Let's Not Forget Publications
Getting published in a web magazine (such as Exploit Interactive) can have many benefits:
• Visibility to (variety of) readers
• Web magazine may submit its pages to search services
• Links in web magazine may be harvested
• Web magazine may be made available on CD ROM, free text system, etc.
http://www.exploit-lib.org/issue3/
Magazine articles may also be cited e.g. see <http://sunsite.berkeley.edu/CurrentCites/>
21
Measuring Your Success
Link popularity is growing in importance as search engines make use of citation analysis ("this site is best, as there are lots of links to it" or "this site is linked to by important sites").
"I tried [LinkPopularity.com], pointing out to a potential advertiser that EEVL had, according to HotBot, 1099 sites linking to it, whilst there were only 18 sites linking to their site, and suggested that what they needed was more exposure. It seems to have worked, as they have agreed to buy an ad on the soon to be released new design EEVL site."
Roddy McLeod, EEVL (posting to lis-elib list)
LinkPopularity.com lets you check on the number of sites linking to your web site
22
Don't Forget Your Stats
You will produce graphs of your web statistics for project reports
Do the graphs indicate:
• A healthy growth
• Growth in the number of robots
• Growth in the wrong community
Look beneath the surface
Think about "enterprise analysis packages"
referer: ""                         # Entered directly
referer: "www.foo.fr/goodstuff/"    # Followed link
If you record the referrer field you will be able to see the links users follow to arrive at your web site. This may help to inform dissemination strategies.
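Tallying the recorded referrer field takes only a few lines. The sketch below assumes the server writes the widely used "combined" log format, in which the referrer and user agent follow the status code and byte count; the function name and the sample entries are hypothetical.

```python
import re
from collections import Counter

# Tail of a combined-format line: "request" status bytes "referer" "user-agent"
COMBINED_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_counts(log_lines):
    """Count how often each referrer appears; empty or "-" means a direct visit."""
    counts = Counter()
    for line in log_lines:
        match = COMBINED_TAIL.search(line.strip())
        if not match:
            continue  # Not a combined-format line; skip it.
        referer = match.group("referer")
        key = "(entered directly)" if referer in ("", "-") else referer
        counts[key] += 1
    return counts


# Hypothetical log entries, for illustration only.
SAMPLE_LOG = [
    'pc1.example.ac.uk - - [18/Nov/1999:10:05:00 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "http://www.foo.fr/goodstuff/" "Mozilla/4.0"',
    'pc2.example.ac.uk - - [18/Nov/1999:10:06:00 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "http://www.foo.fr/goodstuff/" "Mozilla/4.0"',
    'pc3.example.ac.uk - - [18/Nov/1999:10:07:00 +0000] "GET / HTTP/1.0" 200 2048 "-" "Mozilla/4.0"',
]

if __name__ == "__main__":
    for referer, hits in referrer_counts(SAMPLE_LOG).most_common():
        print(f"{hits:4}  {referer}")
```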
23
Universal Design
Many of the guidelines provided will have additional benefits:
• Robots and people with disabilities (e.g. blind users) have similar characteristics i.e. can't follow images, may not be able to access framed sites, etc.
• Indexing programs may index ALT attributes in <IMG> elements
• Sensibly-structured web sites can be more easily archived and mirrored.
• Metadata for general resource discovery can be reused for other applications (e.g. current awareness services).
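Since both indexing programs and users who cannot see images may rely on ALT attributes, it can be useful to list the images that lack them. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are assumptions made for illustration.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Record the SRC of every <IMG> element with no (or an empty) ALT."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src") or "")


def images_missing_alt(html):
    """Return the SRC values of images that give robots no text to index."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


if __name__ == "__main__":
    page = '<IMG SRC="logo.gif"><img src="map.gif" alt="Site map">'
    print(images_missing_alt(page))
```

An empty ALT can be a deliberate choice for purely decorative images, so the output is a list to review rather than a list of errors.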
24
Conclusions
To conclude:
• There are approaches to the web site architectural design which can help in promoting your project web site, including:
– Project-specific domains
– Short URLs
– Use of the robots.txt file
– Metadata
– Accessible web design
• Once you have the correct architecture, you can assist in the promotion process through various submission tools
• Many of the solutions will have additional benefits
• Ideally the solutions will be implemented at the start of the project!
• Dialogue with your server administrator is important
25
Further Information
Search Engine Watch
<http://www.searchenginewatch.com/>
Deadlock
<http://www.deadlock.com/promote/>
Did-it
<http://www.did-it.com/>
VirtualPromote
<http://www.virtualpromote.com/promotea.html>
Pegasoweb
<http://www.pegasoweb.com/>
Broadcaster – URL submission service
<http://www.broadcaster.co.uk/>
Submit-it – URL submission service
<http://www.submit-it.com/>
Yahoo!
<http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Information_and_Documentation/Site_Announcement_and_Promotion/>
Book Reviews
<http://www.hw.ac.uk/libWWW/irn/irn58/irn58d.html#recent>
<http://www.hw.ac.uk/libWWW/irn/irn59/irn59d.html#recent>