amrit - WordPress – www.wordpress.com

16 Oct 2013

INTERNET PORTALS WITH MACHINE LEARNING

PRESENTED BY:-

AMRIT CHOUDHARY

BTECH-CSE 7TH SEM

WHY PORTALS ???

Portals gather content from the Web and organize it for easy access, retrieval and search.

E.g. www.twenty19.com

Disadvantage:-

These portals are difficult and time-consuming to maintain.

Solution:-

My project proposes the use of machine learning techniques to greatly automate the creation and maintenance of portals.

MACHINE LEARNING

The study of computer algorithms that improve automatically through experience.

IMPORTANCE OF MACHINE LEARNING:-

Machine learning matters for 4 general categories of tasks that are otherwise impossible or difficult:

1) Problems for which no human expert is available.

2) Problems where human experts are available but cannot explain their expertise.

3) Problems where the phenomena change rapidly.

4) Applications that must be customized for each computer user.

HOW A MACHINE LEARNS ???

ASSIGNING WEIGHTS:-

Some weights are assigned, and the outcome is compared with previously stored results.



DECISION TREES:-

The system starts from the parent (root) node and searches the tree with techniques such as BFS (breadth-first search) and DFS (depth-first search).

FORMAL GRAMMARS:-

CREATION:-

A new rule is constructed by the system or acquired from an external entity.

GENERALISATION:-

Conditions are dropped or made less restrictive, so that the rule applies in a larger number of situations.

SPECIALIZATION:-

Additional conditions are added, or existing conditions are made more restrictive, so that the rule applies only to specific situations.
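
The generalisation and specialization steps above can be sketched in a few lines. This is only an illustration: the `Rule` class, the condition keys and the "index" action are hypothetical names chosen for the example, and a rule is assumed to be a plain set of key/value conditions that must all hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: fires only when every condition holds."""
    conditions: frozenset  # set of (key, value) pairs
    action: str

    def matches(self, situation: dict) -> bool:
        return all(situation.get(k) == v for k, v in self.conditions)


def generalise(rule: Rule, key: str) -> Rule:
    """Drop every condition on `key`, so the rule applies in more situations."""
    kept = frozenset((k, v) for k, v in rule.conditions if k != key)
    return Rule(kept, rule.action)


def specialise(rule: Rule, key: str, value: str) -> Rule:
    """Add a condition, so the rule applies in fewer situations."""
    return Rule(rule.conditions | {(key, value)}, rule.action)


base = Rule(frozenset({("format", "pdf"), ("topic", "ml")}), "index")
wider = generalise(base, "topic")               # no longer cares about topic
narrower = specialise(base, "language", "en")   # also requires English

robotics_doc = {"format": "pdf", "topic": "robotics", "language": "en"}
french_ml_doc = {"format": "pdf", "topic": "ml", "language": "fr"}
```

Here `wider` accepts the robotics document that `base` rejects, while `narrower` rejects the French document that `base` accepts.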




APPLICATIONS:-

Optical Character Recognition (OCR)

Face detection

Spam filtering

Medical diagnosis

Spoken language understanding

Fraud detection

PLAN:-

E-BOOK PORTAL

MACHINE LEARNING FEATURES INCLUDED IN PORTALS

CLASSIFICATION INTO TOPIC HIERARCHY:-

Efficiently organize, view and explore large quantities of information.
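
One possible way to place documents into topics is a naive Bayes text classifier. The sketch below is an assumption made for illustration, not a description of the project's actual method: the topic paths and training sentences are invented, and the hierarchy is flattened into path-like labels.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (topic path, text) pairs.
TRAINING = [
    ("computing/machine-learning", "neural networks learn weights from data"),
    ("computing/machine-learning", "decision trees classify data"),
    ("engineering/robotics", "robot arms move with motors and sensors"),
    ("engineering/robotics", "mobile robot navigation uses sensors"),
]


def train(labelled_docs):
    """Count words per topic and documents per topic."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for topic, text in labelled_docs:
        doc_counts[topic] += 1
        word_counts[topic].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the topic with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_topic, best_score = None, -math.inf
    for topic in doc_counts:
        score = math.log(doc_counts[topic] / total_docs)  # topic prior
        total_words = sum(word_counts[topic].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[topic][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


model = train(TRAINING)
```

Calling `classify("sensors for a robot", model)` picks the robotics topic because those words occur only in the robotics training text.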


SPIDERING:-

The spider efficiently explores the Web, following links that are more likely to lead to e-books.
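
The idea of following the most promising links first can be sketched as a best-first crawl over a priority queue. Everything here is an assumption for illustration: `TOY_WEB` is a made-up in-memory link graph rather than the real Web, and `link_score` is a hand-written keyword count standing in for a learned scorer.

```python
import heapq

# Hypothetical in-memory "web": page -> list of (anchor text, target page).
TOY_WEB = {
    "home": [("sports news", "sports"), ("free e-books library", "library")],
    "sports": [("football scores", "scores")],
    "library": [("download e-book (pdf)", "book1"), ("about us", "about")],
    "scores": [],
    "book1": [],
    "about": [],
}


def link_score(anchor_text):
    """Crude stand-in for a learned scorer: count e-book related keywords."""
    keywords = ("e-book", "pdf", "library")
    return sum(word in anchor_text.lower() for word in keywords)


def crawl(start, web, max_pages=10):
    """Best-first crawl: always expand the highest-scoring link seen so far."""
    frontier = [(0, 0, start)]  # (negated score, insertion order, page)
    visited, order, counter = set(), [], 1
    while frontier and len(order) < max_pages:
        _, _, page = heapq.heappop(frontier)
        if page in visited:
            continue
        visited.add(page)
        order.append(page)
        for anchor, target in web.get(page, []):
            if target not in visited:
                heapq.heappush(frontier, (-link_score(anchor), counter, target))
                counter += 1
    return order
```

With a budget of three pages, `crawl("home", TOY_WEB, max_pages=3)` visits the library and the e-book page before it ever looks at the sports branch.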


Each reference is broken down into appropriate fields, such as author, title, journal, and date.
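
Breaking a reference into fields can be illustrated with a regular expression. This is a deliberately simple stand-in, not a learned extractor: it assumes every reference follows the hypothetical layout "Author. Title. Journal, Year." and the sample reference is invented.

```python
import re

# Assumed layout: "Author. Title. Journal, Year."
REFERENCE_PATTERN = re.compile(
    r"^(?P<author>[^.]+)\.\s+"
    r"(?P<title>[^.]+)\.\s+"
    r"(?P<journal>[^,]+),\s+"
    r"(?P<date>\d{4})\.?$"
)


def extract_fields(reference):
    """Return a dict of fields, or None if the layout is not recognised."""
    match = REFERENCE_PATTERN.match(reference.strip())
    return match.groupdict() if match else None


# Invented example reference, used only to exercise the pattern.
sample = "Jane Doe. A sample title about portals. Journal of Examples, 2001."
fields = extract_fields(sample)
```

A fixed pattern like this breaks as soon as the layout varies, which is the kind of brittleness that motivates learning the extractor from examples instead.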


WEBWATCHER

A tour guide that highlights the hyperlinks it believes will be of interest.


REINFORCEMENT LEARNING

Learning optimal decision making from rewards or punishments.

Goal of reinforcement learning:-

Learn a policy, a mapping from states to actions, that maximizes the sum of reward over time.

Supervised learning:-

The learner is told the correct action for a particular state.

ADVANTAGE OVER SUPERVISED LEARNING:-

The reinforcement learner is never told the correct action; instead it is told how good or bad the selected action was, expressed in the form of a “scalar reward”.
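
Learning a policy from scalar rewards alone can be sketched with tabular Q-learning. The setting is an assumption chosen for brevity: a hypothetical chain of five states in which only the last one gives a reward, loosely like a target page that is several links away. The learning rate, discount factor and episode count are illustrative values.

```python
import random

N_STATES = 5            # states 0..4; state 4 is the rewarding goal
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)


def step(state, action):
    """Deterministic toy environment: reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=200, seed=0):
    """Learn action values from rewards while acting uniformly at random."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move towards reward plus discounted best next value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Map each non-goal state to the action with the higher learned value."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


q_table = q_learning()
policy = greedy_policy(q_table)
```

The learner is never shown the correct move. It only sees the reward at the goal, and the discounting makes moving right the better-valued action in every state.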





INFORMATION EXTRACTION

Information extraction is identifying phrases of interest in textual data.

It is a powerful way to summarize the information relevant to a user's needs.

E.g. on-topic documents may be several hyperlinks away from the current choice point, but the text on the current page may offer indications of which hyperlink will lead to reward soonest.

ADVANTAGE:-

Allows searches over specific fields.

Effective presentation of search results (matches shown in bold).
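
Once fields have been extracted, a search restricted to one field and a bold highlight of the match take very little code. The records below are invented, and the use of `<b>` tags for bold is an assumption about how results would be displayed.

```python
import re

# Hypothetical records, as an extraction step might produce them.
BOOKS = [
    {"author": "Jane Doe", "title": "Learning Machines", "date": "2001"},
    {"author": "John Roe", "title": "Doe Hunting Basics", "date": "1999"},
]


def search(records, field, term):
    """Return the records whose `field` contains `term` (case-insensitive)."""
    return [r for r in records if term.lower() in r.get(field, "").lower()]


def highlight(text, term):
    """Wrap each match in <b>...</b> so a results page can show it in bold."""
    return re.sub(
        re.escape(term),
        lambda m: f"<b>{m.group(0)}</b>",
        text,
        flags=re.IGNORECASE,
    )


by_author = search(BOOKS, "author", "doe")
by_title = search(BOOKS, "title", "doe")
```

Searching the author field for "doe" returns only the first record, whereas a plain full-text search would also return the second one because of its title.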


CONCLUSION:-

In addition to the future work discussed earlier, there are many other areas where machine learning can further automate the construction and maintenance of domain-specific search engines. E.g. text classification can decide which documents on the Web are relevant to the domain.

This paper has shown that machine learning techniques can significantly aid the creation and maintenance of portals and domain-specific search engines.



ADVANTAGES

These techniques allow portals to be created quickly with minimal effort.

Performance is based on the rewards over time.

The environment presents situations with delayed rewards.

DISADVANTAGES

o Backtracking:- The algorithm fails to backtrack to the original path, resulting in a deadlock state.

o The initial and goal states and the rules must be specified, and the rules sometimes have to be modified.

o If the knowledge base of the expert system is not correct or lacks facts and figures, the solutions thus acquired are ineffective.


ANY QUERIES ??? …