Case Study On Search Engine

keckdonkey
Internet and Web Development

Nov 18, 2013




Search Engines

Search engines are programs that search documents for specified keywords and return a list of the documents where the keywords were found. A search engine is really a general class of programs; however, the term is often used specifically to describe systems like Google, Bing and Yahoo! Search that enable users to search for documents on the World Wide Web.

Web Search Engines



Typically, Web search engines work by sending out a spider to fetch as many documents as possible. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indices such that, ideally, only meaningful results are returned for each query.



A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented in a list often referred to as SERPs, or "search engine results pages". The information may consist of web pages, images, information and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.


How web search engines work

A search engine operates in the following order:

1. Crawling
2. Indexing
3. Searching


1. Crawling

This is the use of special software, commonly known as bots, crawlers or spiders, to access information on various websites through principally three means:

1. Links from other websites already in the search engine's index or gathered while crawling
2. URLs/links submitted by webmasters
3. Sitemaps submitted by webmasters
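The third channel, sitemap submission, is easy to illustrate. Below is a minimal Python sketch that pulls the listed URLs out of a sitemap.xml document; the sitemap content and URLs are invented for the example:

```python
# Minimal sketch: extract <loc> URLs from a submitted sitemap.xml.
# The sitemap below is a made-up example document.
import xml.etree.ElementTree as ET

SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc><lastmod>2013-11-01</lastmod></url>
  <url><loc>http://example.com/about</loc></url>
</urlset>"""

def urls_from_sitemap(xml_text):
    """Return the <loc> URLs listed in a sitemap document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall("sm:url/sm:loc", ns)]

# These URLs would simply be appended to the crawler's list of pages to fetch.
print(urls_from_sitemap(SITEMAP_XML))
```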

Ordinarily one would visualize the bots as crawling objects moving rapidly all over the web via links to reach different websites in performing their tasks. In reality, however, that is not the case. A crawler operates from a particular physical location and is akin to a web browser: it sends requests to web servers and downloads/fetches information on new web pages, updated web pages and dead links, all of which is used to update the engine's index. As web pages are crawled, new links detected on these web pages are added to the engine's list of pages to crawl.

In the process of crawling, the engines face a trade-off between minimizing the resources spent on crawling and maintaining an up-to-date index. A crawler tries to avoid re-indexing an unchanged web page while striving to capture all changed web pages in order to keep its index current.





Fig: High-level architecture of a standard Web crawler
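The crawl loop described above can be sketched in a few lines of Python. The page contents, link graph and fetch behaviour here are all invented stand-ins for real HTTP requests; a real crawler would also honour robots.txt, politeness delays and re-crawl scheduling:

```python
# Toy sketch of a crawl loop over a hypothetical web (URL -> (content, links)).
from collections import deque

WEB = {
    "http://example.com/":  ("home",   ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ("page a", ["http://example.com/"]),
    "http://example.com/b": ("page b", ["http://example.com/a", "http://example.com/dead"]),
}

def crawl(seed):
    frontier = deque([seed])          # pages still to fetch
    seen = {seed}                     # avoid queueing a URL twice
    fetched, dead_links = {}, []
    while frontier:
        url = frontier.popleft()
        if url not in WEB:            # server returned nothing: a dead link
            dead_links.append(url)
            continue
        content, links = WEB[url]
        fetched[url] = content        # this document goes to the indexer
        for link in links:            # newly discovered links join the frontier
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return fetched, dead_links

pages, dead = crawl("http://example.com/")
print(sorted(pages), dead)
```

Both the fetched pages and the dead links feed back into keeping the index current, as the text above describes.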

2. Indexing

The search engine stores the pages its crawlers retrieve in a massive index database. It sorts this information based on search terms and arranges it in alphabetical order; this sorting enables rapid retrieval of documents from the index when search queries demand them. It processes the words in the web pages, noting the location of the keywords within the pages (e.g. title tags, alt attributes). The engines process many, but not all, content types; for example, they cannot process the content of some rich media files or dynamic pages.

To improve search performance, search engines ignore (do not index) common words called stop words (such as "the", "is", "on", "or", "of", "how", "why", as well as certain single digits and single letters). These words are so common that they do little to narrow a search, and therefore can safely be ignored. The indexer also ignores some punctuation and multiple spaces, and converts all letters to lowercase, to improve its performance.
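A toy version of this indexing step, with made-up documents and an abbreviated stop-word list, might look like the following: lowercase the text, drop punctuation and stop words, and record which documents each remaining word occurs in (an inverted index):

```python
# Minimal inverted-index sketch with stop-word filtering and lowercasing.
# The documents and the stop-word list are abbreviated examples.
import re

STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why", "a", "and"}

docs = {
    "doc1": "How the Web crawler works",
    "doc2": "The indexer builds the index of the Web",
}

def build_index(documents):
    index = {}
    for doc_id, text in documents.items():
        # lowercase, strip punctuation, split into words
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word in STOP_WORDS:
                continue            # stop words are not indexed
            index.setdefault(word, set()).add(doc_id)
    return index

index = build_index(docs)
print(sorted(index["web"]))   # documents containing "web"
```

A query can then be answered by looking words up in `index` and intersecting the resulting document sets, which is what makes retrieval fast.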

3. Search Query Processor

This is what most search users are conversant with and, in fact, quite often erroneously regard as the "search engine". It comprises several components, the most visible being the search box or interface through which the search user interacts with the search engine, forwarding his search query for processing.

When a user sends in a query through the interface, the index rapidly retrieves the most relevant documents for the search query. Relevance is determined algorithmically, based on many ranking factors numbering over 200. A key factor amongst these is PageRank, a measure of the importance of a web page determined by the number and quality of links pointing to it. It is important to stress, however, that not all links are equal: links emanating from highly ranked web pages are considered more powerful than links from low-ranked web pages.
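The PageRank idea can be sketched as a power iteration over a tiny hypothetical link graph. The damping factor of 0.85 is the commonly cited default, not something specified in this text:

```python
# PageRank power-iteration sketch: a page's rank flows along its out-links,
# so pages with many (and highly ranked) in-links accumulate more rank.
links = {          # hypothetical graph: page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}

def pagerank(links, damping=0.85, iters=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}   # start uniform
    for _ in range(iters):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)    # split rank across out-links
            for target in outgoing:
                new[target] += damping * share
        rank = new
    return rank

ranks = pagerank(links)
# C is linked to by both A and B, so it ends up with the highest rank.
print(max(ranks, key=ranks.get))
```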

Search Engine Services

A phrase used to describe a collection of services offered by a third-party vendor that are designed to assist organizations and businesses to obtain exposure and a better search engine ranking, or placement, for their website. Typical search engine services include search engine optimization (SEO) services and search engine marketing (SEM) services, as well as website promotion and website optimization services.

1. Search Engine Optimization Services


Why would my company need Search Engine Optimization services?

Many businesses take advantage of the opportunity to do business online. Search engine
optimization companies will help your business get better search results on the search engines.

1. Search engine optimization companies specialize in getting your company noticed on the Web. They will be able to let you know what your company needs to do to attract more traffic to your site.

2. Search engine optimization companies conduct a variety of services. They can orchestrate copy, design pages, buy links, etc. for your Web site.

3. The search engines are constantly changing their parameters. Search engine optimization services must remain continual students of these changes.

How do I choose a Search Engine Optimization service?

Ask your search engine optimization service for references. Due to the nature of the service, it may seem difficult to tell how well the service is performing for you, so it is necessary to contact companies which have previously worked with them.

1. Choose a search engine optimization service which is familiar with your industry.

2. It is important to have regular, ongoing contact with your search engine optimization service for the duration of the relationship.

3. Have the search engine optimization company provide you with an outline of their services. Know what they intend to do and how they will achieve your desired results.

4. Understand that it takes time to see results. Be wary of any service saying they can get you great results quickly.


2. Search Engine Marketing (SEM)

Short for search engine marketing, SEM is often used to describe acts associated with researching, submitting and positioning a Web site within search engines to achieve maximum exposure of your Web site. SEM includes things such as search engine optimization, paid listings and other search-engine-related services and functions that will increase exposure and traffic to your Web site.

Search engine marketing (SEM) is a form of Internet marketing that involves the promotion of websites by increasing their visibility in search engine results pages (SERPs) through the use of paid placement, contextual advertising, and paid inclusion.[1] Depending on the context, SEM can be an umbrella term for various means of marketing a website.



SEM methods and metrics

There are four categories of methods and metrics used to optimize websites through search
engine marketing.

1. Keyword research and analysis involves three steps: (a) ensuring the site can be indexed in the search engines; (b) finding the most relevant and popular keywords for the site and its products; and (c) using those keywords on the site in a way that will generate and convert traffic.


2. Website saturation and popularity, how much presence a website has on search engines, can be analyzed through the number of pages of the site that are indexed on search engines (saturation) and how many backlinks the site has (popularity). It requires that your pages contain the keywords people are looking for and that they rank high enough in search engine rankings. Most search engines include some form of link popularity in their ranking algorithms. Major tools measuring various aspects of saturation and link popularity include Link Popularity, Top 10 Google Analysis, and Marketleap's Link Popularity and Search Engine Saturation.


3. Back-end tools, including Web analytic tools and HTML validators, provide data on a website and its visitors and allow the success of a website to be measured. They range from simple traffic counters to tools that work with log files[10] and to more sophisticated tools based on page tagging (putting JavaScript or an image on a page to track actions). These tools can deliver conversion-related information. There are three major tools used by EBSCO: (a) a log-file analyzing tool, WebTrends by NetIQ; (b) a tag-based analytic program, WebSideStory's HitBox; and (c) a transaction-based tool, TeaLeaf RealiTea. Validators check the invisible parts of websites, highlighting potential problems and many usability issues, and ensure your website meets W3C code standards. Try to use more than one HTML validator or spider simulator, because each tests, highlights, and reports on slightly different aspects of your website.
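To give a feel for the simplest end of that range, a "simple traffic counter" over Common Log Format access-log lines can be written in a few lines; the log lines below are invented for the demonstration:

```python
# Count hits per page from Common Log Format lines (invented sample data).
from collections import Counter

LOG = """\
1.2.3.4 - - [18/Nov/2013:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512
1.2.3.4 - - [18/Nov/2013:10:00:05 +0000] "GET /about.html HTTP/1.1" 200 256
5.6.7.8 - - [18/Nov/2013:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 512
"""

def hits_per_page(log_text):
    counts = Counter()
    for line in log_text.splitlines():
        # the request is the quoted field: "GET /path HTTP/1.1"
        request = line.split('"')[1]
        path = request.split()[1]
        counts[path] += 1
    return counts

print(hits_per_page(LOG).most_common())
```

The page-tagging tools mentioned above gather similar per-page counts, but from the browser side rather than from server logs.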


4. Whois tools reveal the owners of various websites, and can provide valuable information relating to copyright and trademark issues.



3. Website Optimization

Also called search engine optimization (SEO), website optimization is a phrase that describes the procedures used to optimize, or to design from scratch, a website to rank well in search engines. Website optimization includes processes such as adding relevant keywords and phrases to the website, editing meta tags and image tags, and optimizing other components of your website to ensure that it is accessible to a search engine and to improve the overall chances that the website will be indexed by search engines.


Search Engine List

1. 20SEARCH
2. ALL THE WEB
3. ALTA VISTA
4. AOL SEARCH
5. ASK JEEVES
6. DOGPILE
7. EBAY
8. EXCITE
9. GIGABLAST
10. GOOGLE
11. IWON
12. JOEANT
13. LYCOS
14. MAMMA
15. MSN
16. NETSCAPE
17. OPEN DIRECTORY
18. WEBCRAWLER
19. WIKIPEDIA
20. YAHOO


Metasearch Engine:

A search engine that queries other search engines and then combines the results that are received from all. In effect, the user is not using just one search engine but a combination of many search engines at once to optimize Web searching. For example, Dogpile is a metasearch engine.
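A minimal illustration of the idea, using two stand-in "engines" that return canned results rather than calling real search APIs: each engine's ranked list is merged, with earlier positions scoring higher and URLs returned by several engines accumulating score.

```python
# Metasearch sketch: forward one query to several engines and merge the
# ranked result lists. The two engines below are hypothetical stand-ins.
def engine_a(query):
    return ["http://example.com/1", "http://example.com/2"]

def engine_b(query):
    return ["http://example.com/2", "http://example.com/3"]

def metasearch(query, engines):
    scores = {}
    for engine in engines:
        for position, url in enumerate(engine(query)):
            # simple rank aggregation: position 0 scores 1.0, position 1
            # scores 0.5, and so on; scores add up across engines
            scores[url] = scores.get(url, 0) + 1.0 / (position + 1)
    return sorted(scores, key=scores.get, reverse=True)

# /2 appears in both engines' results, so it rises to the top of the merge.
print(metasearch("web crawlers", [engine_a, engine_b]))
```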

Offline Search Engine:

Also called a local search engine, an offline search engine is designed to be used for offline PC, CD-ROM or LAN searching. For example, a Web site can be indexed and a local search engine used if the author wanted to distribute the Web site on CD or DVD.


Blog Search Engine:

A search engine for the blogosphere. Blog search engines only index and provide search results from blogs (Web logs). Examples of blog search engines include Google Blog Search and Technorati.

Google Custom Search Engine

A Google Custom Search Engine enables Web site authors to host a Web site (or Web) search box and search results on their site. Users can customize the search engine that is built using Google's core search technology. In creating your own Google Custom Search Engine, you can prioritize or restrict search results based on specific Web sites and pages you specify. Once you've defined your search engine, Google provides code for a search box that users can copy and then paste right into their own Web site or blog.