Semantic Search Engine Based on Ontology Extraction


22 Oct 2013



Semantic Search Engine Based on Ontology

Vijaysenthil K
Suresh Thanga Krishnan


School of Computer Science and Engineering
VIT University
Vellore 632014
Tamil Nadu

vijaysenthilk2009, sureshm


With the development of the Web, an information “Big Bang”
has taken place on the Internet. Search engines have become
the most helpful tool for obtaining useful information from the
Internet. However, the search results returned by even the most
popular search engines are not satisfactory. When we try to
comprehend an entity, we comprehend it from the way it relates
to other entities. The next-generation Web, the Semantic Web,
offers a solution to this problem at the system architecture
level.

In our approach, information is presented through its relations
with other information and is recorded in RDF. The relations
are then interpreted in OWL. The relations or their properties,
such as hasAccommodation and hasRating in RDF, can be
interpreted in OWL. Thus OWL ontologies are created. Once the
ontologies are created, we perform the search by taking input
from the user.

We ascertain the importance of each relation according to the
order of input. Once we have the user input, we match the
relations and concepts of the user query with those of the web
pages for which ontologies have been created. A rank is thus
calculated for each page, and the result is returned to the
user.

Keywords: Semantic Web, RDF, OWL, Ontology



In recent years, the way people search for information has
benefited from the technological evolution of the Web.
Web information indexing and retrieval technology plays an
important role: Web-based indexing and retrieval systems
such as Google and Yahoo have greatly changed the way people
access information.

However, with the rapid increase of information on the
Web, people demand more from the indexing and retrieval of
the information they are interested in. On one hand, the
Web provides an enormous amount of information, which
offers an inexhaustible source for searching; on the other
hand, it causes an excessive amount of noise in the search
results. Search engines are currently widely used for
searching, but many unsolved problems limit their
effectiveness. For example, keyword-based search engines
present serious problems related to the quality of the
search results. Relevant pages that are not indexed by a
traditional search engine can be reached only if their
specific Internet address is known.

The spelling of the keywords is more crucial in the search
process than their meaning. Moreover, the Web is increasingly
used not only by people: software agents are becoming users
of the Web too. All these needs have led to the development
of the Semantic Web. One of the main aims of the Semantic
Web is to extend the existing Web with a semantic layer that
allows machines to understand it and enables software
programs to process information more efficiently. In this
paper we show how a traditional search engine can be
improved to create a semantic search engine.


Web 2.0

The bursting of the dot-com bubble in the fall of 2001
marked a turning point for the web. Many people concluded
that the web was overhyped, when in fact bubbles and
consequent shakeouts appear to be a common feature of all
technological revolutions. Shakeouts typically mark the point
at which an ascendant technology is ready to take its place at
center stage. The pretenders are given the bum's rush, the real
success stories show their strength, and there begins to be an
understanding of what separates one from the other.

The concept of "Web 2.0" began with a conference
brainstorming session between web pioneers, who noted that far
from having "crashed", the web was more important than
ever, with exciting new applications and sites popping up
with surprising regularity. What's more, the companies that
had survived the collapse seemed to have some things in
common. Could it be that the dot-com collapse marked some
kind of turning point for the web, such that a call to action
such as "Web 2.0" might make sense? We agreed that it did,
and so the Web 2.0 Conference was born.

In the year and a half since, the term "Web 2.0" has clearly
taken hold, with more than 9.5 million citations in Google.
But there's still a huge amount of disagreement about just
what Web 2.0 means, with some people decrying it as a
meaningless marketing buzzword, and others accepting it as
the new conventional wisdom.


Web 3.0

As time goes on and technology matures, experts feel the need
to develop something better that is more fruitful, advanced,
user-friendly and intelligent. Thus originated the concept of
Web 3.0, which is now taking shape. Web 3.0 adds further
features on top of those of Web 2.0.

Web 3.0 sites will only allow collaboration of
content generated from an approved pseudo-sequence of
characters. Web 3.0 would have three main features.



Seeking Information

Searching for information would be more compact in Web 3.0.
Until now, the Web has used keywords to organize data into
usable chunks. Search engines index the Internet and present
it to the end user in order of relevance. Users select the
information nearest to their requirements, which can
sometimes be a very tedious process. Web 2.0 went one step
further and changed the basic way of searching by applying
tags to the data being searched. For example, if a user wants
to look for a car, he or she types the word into the search
box of the search engine, and the search engine displays
many websites. But if the user types "BMW cars", it displays
only the sites related to BMW cars. So "BMW" works as a tag.

The ultimate goal of Semantic Search is to search by meaning
rather than by keyword spelling. This is achieved by
representing the data using the concepts of the Conceptual
Graph (CG) and producing the output by matching. The CG
concept is implemented with RDF and OWL, which we use to
create a tree-like structure. Using this tree-like structure
we perform matching and ranking. Based on the ranking, the
outputs are returned to the user.
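
The matching and ranking step can be illustrated with a
minimal sketch. It is not the authors' implementation: the
tree-like structure is flattened into (subject, predicate,
object) tuples, and the scoring rule is an assumption made for
illustration. Relations are weighted by their position in the
user's input (earlier means more important), an exact
relation-and-concept match earns the full weight, and a match
on the relation alone earns half. The function names are
hypothetical.

```python
def position_weights(n):
    """Weights n, n-1, ..., 1 so that earlier query relations count more."""
    return [n - i for i in range(n)]


def score_page(page_triples, query):
    """Score one page against an ordered list of (relation, concept) pairs.

    Assumed rule: full weight when relation and concept both match,
    half weight when only the relation matches.
    """
    relations = {(p, o) for _, p, o in page_triples}
    predicates = {p for _, p, _ in page_triples}
    score = 0.0
    for weight, (relation, concept) in zip(position_weights(len(query)), query):
        if (relation, concept) in relations:
            score += weight
        elif relation in predicates:
            score += weight / 2
    return score


def rank_pages(pages, query):
    """Return page URLs with a non-zero score, best first (ties by URL)."""
    scored = [(score_page(triples, query), url) for url, triples in pages.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for score, url in scored if score > 0]


# Hypothetical pages, each described by its triples.
pages = {
    "a": [("a", "hasAccommodation", "hotel"), ("a", "hasRating", "5")],
    "b": [("b", "hasRating", "5")],
    "c": [("c", "hasKeyword", "car")],
}
# The user entered the accommodation relation first, so it weighs more.
query = [("hasAccommodation", "hotel"), ("hasRating", "5")]
ranking = rank_pages(pages, query)
```

Here page "a" matches both relations, page "b" matches only
the lower-weighted one, and page "c" matches nothing and is
left out of the ranking.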


Ontology Extraction


Site Registration

A new site must be registered with the required information
to generate metadata, in the form of an RDF (Resource
Description Framework) file, for the site. Metadata is
information about the site. The user registers the site by
giving a description and the keywords most used on the site,
from which a relevant RDF file is created. Only the admin
user has the right to register a site in the semantic search.


RDF Generation

The RDF files are generated using the knowledge gathered
during site registration. The content of each RDF file is
generated as subjects, predicates and objects. The user must
inspect the triples and then create the ontologies.
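
As a rough illustration of this step, the sketch below turns
registration data into subject-predicate-object triples and
prints them as N-Triples-style lines. It is a sketch under
stated assumptions, not the system's actual generator: the
predicate names hasDescription and hasKeyword, the namespace
http://example.org/terms# and the function names are all
hypothetical, and the paper's RDF files may well use RDF/XML
instead.

```python
from typing import NamedTuple

EX = "http://example.org/terms#"  # hypothetical vocabulary namespace


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def registration_to_triples(site_url, description, keywords):
    """Build triples from the description and keywords given at registration."""
    triples = [Triple(site_url, "hasDescription", description)]
    for keyword in keywords:
        # Normalise keywords so later matching is case-insensitive.
        triples.append(Triple(site_url, "hasKeyword", keyword.strip().lower()))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples-style lines with literal objects."""
    lines = []
    for t in triples:
        literal = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{t.subject}> <{EX}{t.predicate}> "{literal}" .')
    return "\n".join(lines)


# Hypothetical registration of one site.
triples = registration_to_triples(
    "http://example.org/hotels", "A hotel guide", ["Hotel", " Rating "]
)
print(to_ntriples(triples))
```

Printing the triples in this line-per-statement form makes the
inspection step described above straightforward, since each
line is one subject, predicate and object.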


OWL Generation

An existing user enters a username and password. If the user
is authenticated, he or she can extract RDF files by searching
the RDF file collection for a particular domain category.
The output is shown as subjects, predicates and objects.
Because many RDF files may be returned, the user must inspect
them to check whether they meet the criteria of the domain
being searched. If so, the user adds them to the ontology
library under that domain. The ontology is now created and
can be used for ontology-based search.
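
The workflow above can be sketched as follows. This is an
illustrative assumption, not the paper's code: the names
find_candidates and OntologyLibrary are hypothetical, the
domain search is reduced to a case-insensitive match on triple
objects, and the manual inspection by the user is replaced by
accepting every candidate.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (subject, predicate, object)


def find_candidates(collection: dict[str, list[Triple]],
                    domain_terms: list[str]) -> list[str]:
    """Return sites whose triples mention any of the domain terms."""
    terms = {term.lower() for term in domain_terms}
    hits = []
    for site, triples in collection.items():
        if any(obj.lower() in terms for _, _, obj in triples):
            hits.append(site)
    return sorted(hits)


class OntologyLibrary:
    """Approved triples grouped by domain, then by site."""

    def __init__(self) -> None:
        self._by_domain: dict[str, dict[str, list[Triple]]] = defaultdict(dict)

    def add(self, domain: str, site: str, triples: list[Triple]) -> None:
        self._by_domain[domain][site] = list(triples)

    def sites(self, domain: str) -> list[str]:
        return sorted(self._by_domain.get(domain, {}))


# Hypothetical RDF file collection, one entry per registered site.
collection = {
    "http://example.org/hotels": [
        ("http://example.org/hotels", "hasKeyword", "hotel"),
    ],
    "http://example.org/cars": [
        ("http://example.org/cars", "hasKeyword", "bmw"),
    ],
}

library = OntologyLibrary()
for site in find_candidates(collection, ["Hotel"]):
    # In the paper a user inspects each candidate; here all are accepted.
    library.add("tourism", site, collection[site])
```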



On the framework offered by the Semantic Web, “Semantic
Search” proposes a novel search method. It takes full
advantage of semantic information and achieves semantic
search over Web resources.

The core idea of “Semantic Search” is that there are
relations among the submitted keywords, and the Semantic Web
offers the ability to process relations at the system
architecture level. The Web pages returned from the database
not only include the keywords the user inputs but also
include the relations; some semantics of the keywords are
recorded in the form of RDF triples. So the Web pages
returned by “Semantic Search” will be closer to the users’
needs.

Once the Semantic Web exists, it can provide the
ability to tag all content on the Web, describe what each
piece of information is about and give semantic meaning to
the content item. Thus, search engines become more effective
than they are now, and users can find the precise information
they are hunting for. Organizations that provide various
services can tag those services with meaning; using Web
software agents, you can dynamically find these services on
the fly and use them to your benefit or in collaboration with
other services.

Every application has its own merits and demerits. The
project has covered almost all of the requirements. Further
requirements and improvements can easily be accommodated,
since the code is structured and modular: improvements can be
made by changing the existing modules or adding new ones.
Further enhancements can make the web site function in a more
attractive and useful manner than it does at present.

In Semantic Search the description of the web page and the
user query are given explicitly to form the concept-relation
graph. This is necessary because certain pages on the Web do
not have a proper description, and certain descriptions
contain complex sentences. In future this step can also be
automated.




Author Biographies

Vijaysenthil Kathirvel is currently pursuing an M.Tech in
Computer Science and Engineering at VIT University, Vellore,
India, and received his B.E. in Computer Science and
Engineering from Park College of Engineering and Technology,
Coimbatore, affiliated to Anna University, Chennai. His areas
of interest include Web services and the Semantic Web.


Suresh Thanga Krishnan is working as an Asst. Professor in the
School of Computing Sciences and Engineering, VIT University,
Vellore, India. His areas of interest include the Semantic
Web, networking, and MANET.