Final_Presentation-Spring_2012x


6 Nov 2013


Speaker Tracker

SD1113

Layne Berge

Whitney Conmy

Tom Haselhorst

Derek Wiseman


Advisor: Dr. Cristinel Ababei

Agenda


Introduction/Problem Statement


Requirements


Design Decisions


Hardware


Software: Tracking


Software: GUI


Problems Encountered and Lessons Learned


Conclusion



Problem Statement:


The goal of our project is to design an
autonomous lecture recording system


System must


Capture video and audio


Pan and zoom to most effectively capture
lecturer


Be minimally invasive


Design Decisions

Design Decisions


Many different implementation options


Must be:


Accurate


Computationally efficient


Non-intrusive


Moveable


Cost effective

Design Decisions


RF Tracking


Speaker wears a
transmitter


Receiver uses a
directional antenna
to determine
location


Requires speaker to
wear a device


Signal blocked if the
speaker turns

Design Decisions


GPS Tracking


Speaker wears a
GPS receiver


Sub-inch GPS accuracy available


Cost prohibitive


Difficult to receive
signal indoors


Pressure sensing
mats


Used by ITS


Accurate


Simple data
processing


Cost Prohibitive


Not mobile

Design Decisions


Infrared Tracking


The speaker wears
an infrared beacon


Infrared filtered from
video feed


“Dot” location used
to determine where
the speaker is


Intrusive

Design Decisions


Motion Tracking


Analyzes video to
determine the
location of the
speaker


Works on the basis of
“seeing” motion


Not intrusive to the
speaker


Flexible and
accurate


Potentially difficult to implement

Design Decisions


Final Design = Motion
Tracking


Use image
processing


PTZ camera


Computer


Focus on:


Limiting calibration


Ease of use


Accuracy

Design Justification


Advantages


Cheap hardware


Mobile


Non-intrusive


Not dependent on room geometry


Little to no calibration necessary


Easy to adjust (software)




Disadvantages


Needs a host laptop


Can only use included camera


Not well suited to an embedded host


Non-trivial programming required


Design Justification


Motion Tracking is
the best option


Advantages >
Disadvantages


Choice inherently
meets design
criteria


Offers a quick-to-market product



Hardware

System Overview


Laptop running XP


PTZ Camera


Non-auditorium setting: may need to raise the camera


Cardioid Microphone


Presenter



Preset locations


Foldable Cart


On wheels!


System Block Diagram

[Block diagram. Components: Laptop, Camera, Pointer, Pointer Dongle, Microphone. Links: Ethernet, USB, 2.4 GHz radio, XLR to 3.5 mm cable.]

Cisco PVC300


Pan: -175° to 175°


Tilt: -35° to 90°


Zoom: 4x digital and 2.6x optical


Audio I/O for external mic


Video resolution: 640 x 480


Purchased from newegg.com for $332.99


Retails for $794


Why PVC300?


Already contains PTZ features


Meets #2 and #3 of our Requirements Capture:

2. System will be able to follow a speaker (pan the video image) as they move normally during a speech or lecture.

3. System will allow for some form of zoom functionality (either optical or digital).


Eliminates need for hardware


Able to zoom optically → better resolution than digital zoom


Able to capture audio


IP camera, allowing for broad access


PVC300 Homepage


http://10.248.123.150


Control camera features and configure settings


Features


Pointer: allows user to zoom into features such as a white board or presentation


Unobtrusive


Easy to use


Preset threshold settings to customize for different environments

Project Expenses

Product                                          Retail Price   Acquired Price   Vendor
Cisco Small Business PVC300 PTZ IP Camera        $794           $332.99          newegg.com
CAD12 Cardioid Dynamic Microphone                $49.00         $21.99           newegg.com
Sharper Image Foldable Table Laptop Cart         $89.00         $62.00           amazon.com
ARCTIC COOLING Presenter 1 Wireless Presenter    $29.95         $27.99           newegg.com
Ultimate Support Tabletop Mic Stand              $14.99         $14.99           Best Buy
XLR to 3.5mm TRS Cable (10')                     $15.99         $0.00            Storage Closet
Dell Latitude D520 Laptop                        $200.00        $0.00            Dr. Schroeder
Total                                            $1,192.93      $459.96          -

Software

Software Overview


Our software solution consists of three
main parts


Program to record audio and video


Graphical User Interface (GUI)


Tracking Algorithm


The entire SpeakerTracker software solution is installed on the target machine, and the user only sees the simple-to-use GUI

Program Structure


The program is broken into two executables,
one to host the GUI and tracking routines,
and another for recording.


Keeping recording in a separate process lets the two share processor time and limits how much the tracking routine's load affects recording


Originally, the tracking routine was written in
C++. After the GUI was added, the code was
converted to Managed C++, which is very
similar to C#

Tracking


Motion tracking algorithm processes each frame according to the chart →


The actual code is much more complex

Tracking


Tracking routine is very robust and adaptable through user-accessible settings


Due to the nature of our design decisions, the tracking routine will track some objects or colors better than others


Stripes vs. solids



Since each frame is reduced to a single 16x16 matrix of integer counts, the routine is very efficient


The tracking algorithm easily outpaces the camera’s ability to feed the program frames
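
As a rough illustration of the 16x16 reduction, here is a minimal Python sketch (the project itself was written in C++). The slides do not say what each cell counts, so this assumes the count is the number of pixels that changed by more than a threshold between two consecutive grayscale frames. The function names, the threshold value, and the centroid step are hypothetical, not taken from the project code.

```python
def motion_grid(prev, curr, threshold=25, cells=16):
    """Count changed pixels per cell of a cells x cells grid.

    prev and curr are equally sized grayscale frames (lists of rows of ints).
    The threshold default is an assumed value, not one from the project.
    """
    height, width = len(curr), len(curr[0])
    grid = [[0] * cells for _ in range(cells)]
    for y in range(height):
        row = y * cells // height          # grid row this pixel falls in
        for x in range(width):
            if abs(curr[y][x] - prev[y][x]) > threshold:
                grid[row][x * cells // width] += 1
    return grid


def motion_centroid(grid):
    """Return the count-weighted (column, row) centre of motion, or None."""
    total = sum(sum(r) for r in grid)
    if total == 0:
        return None                        # nothing moved in this frame
    col = sum(c * n for r in grid for c, n in enumerate(r)) / total
    row = sum(i * sum(r) for i, r in enumerate(grid)) / total
    return col, row
```

After the reduction, every later step works on 256 integers rather than the full 640 x 480 frame, which is one plausible reading of why the routine outpaces the camera.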

GUI Overview


Written in Visual C++


Easy for the user to interact with


4 tabs: Home, Settings, Video Options and
Help

Home Tab


Connect to the
camera


Connect to the
pointer


Start/Stop recording


Change camera
settings

Settings Tab


Adjust threshold
settings


Choose an
environmental
setting or create a
custom setting that
automatically
adjusts the
threshold settings

Video Options


Set presets


Choose which
video to display on
laptop


Set the mode of
operation


Move the camera


Set a recording
location

Help Tab


Find the User’s
Manual and
Camera manual

Problems Encountered and Lessons Learned

Initial Obstacles


Purchasing


Cost


Universal system
desired


Delivery


Shipping took two weeks


IT issues


Needed to assign static IP to camera for access


Did not have sufficient rights


Recording Routine


Used Real Time Streaming Protocol (RTSP)


Used live555 library to assist in attaining and
parsing RTSP stream


Utilized the second stream available from the
Cisco camera


1st stream, used for tracking, needs to use a different format


640 x 480 resolution


H.264 encoding

Audio / Video Sync Issue


Our H.264 recording needs a fixed frame rate for proper playback


Cisco camera only allows setting a max frame
rate


During recording, network traffic would cause
frame rate to be reduced


Caused severe sync issue between audio and
video


5 minute video resulted in video ending after 3
minutes with audio playing for the full 5 minutes
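
The 5-minute and 3-minute figures are consistent with a player that spaces stored frames at a fixed nominal rate while the camera delivered fewer frames per second than that. A rough check in Python follows. The 30 fps nominal rate and the 18 fps delivered rate are illustrative assumptions, chosen only so the ratio matches the slide. The slides state neither value.

```python
def playback_seconds(frames_stored, assumed_fps):
    """Duration a player shows when it spaces frames at a fixed rate."""
    return frames_stored / assumed_fps


NOMINAL_FPS = 30             # assumption: rate the player expects
DELIVERED_FPS = 18           # assumption: average rate under network load
recording_seconds = 5 * 60   # audio runs the full five minutes

frames_stored = DELIVERED_FPS * recording_seconds
video_seconds = playback_seconds(frames_stored, NOMINAL_FPS)
print(video_seconds / 60, "minutes of video for",
      recording_seconds / 60, "minutes of audio")
```

Under these assumptions, delivering 60% of the expected frames leaves video that covers only 60% of the audio.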


Audio / Video Sync Resolution


To fix, we needed to recompile the library files used to handle the RTSP stream (live555)


Twofold fix


First, observed the timestamp for each frame. If a gap existed between frames, we filled in the gap with copied frames


Non-integer frame counts were stored and carried into later calculations


Second, we used the CPU clock to give overall corrections


Monitored the number of frames stored during a given amount of time


If too few, placed a “gain” on the number of copied frames inserted during a gap


If too many, we attenuated this gain


Successfully fixed the sync issue
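
The twofold fix above can be sketched as follows. This is a minimal Python illustration, not the modified live555 code. The class name, method names, and the 0.05 gain step are all hypothetical. It assumes timestamps in seconds and a known nominal frame rate.

```python
class GapFiller:
    """Decide how many copied frames to insert before each received frame."""

    def __init__(self, fps, gain=1.0):
        self.interval = 1.0 / fps   # nominal time between frames
        self.gain = gain            # overall correction factor
        self.carry = 0.0            # fractional frames not yet inserted
        self.last_ts = None

    def copies_for(self, timestamp):
        """Return the number of copies to insert before this frame."""
        if self.last_ts is None:
            self.last_ts = timestamp
            return 0
        # Frames missing from the gap, possibly a non-integer amount.
        missing = (timestamp - self.last_ts) / self.interval - 1.0
        self.last_ts = timestamp
        self.carry += max(missing, 0.0) * self.gain
        copies = int(self.carry)    # insert whole frames only
        self.carry -= copies        # keep the remainder for later gaps
        return copies

    def correct(self, frames_stored, elapsed_seconds, step=0.05):
        """Nudge the gain using a clock-based count of stored frames."""
        expected = elapsed_seconds / self.interval
        if frames_stored < expected:
            self.gain += step                        # too few: insert more
        elif frames_stored > expected:
            self.gain = max(0.0, self.gain - step)   # too many: attenuate
```

Carrying the fractional remainder means a run of 2.5-frame gaps alternates between one and two copies rather than always rounding down and drifting.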


Glitch Issue In Recording


Manifested when we
integrated code


Caused when
camera moves and
stops sending frames


Copied frames
inserted causing the
issue


H.264 does not
properly render
exact copies of
video frames



Glitch Resolution


We first tried to fix the timestamp
associated with each frame


Copied frames kept the same timestamp


Fix incremented timestamp as per frame
rate


Did not seem to make a difference
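
For reference, the timestamp fix described above amounts to something like the sketch below. The function name is hypothetical, and the real change was made inside the recording code. It assumes timestamps in seconds.

```python
def copy_timestamps(last_ts, copies, fps):
    """Timestamps for duplicated frames that follow a frame at last_ts.

    Each copy is stamped one nominal frame interval after the previous
    frame, instead of reusing last_ts for every copy.
    """
    return [last_ts + (i + 1) / fps for i in range(copies)]
```

As noted on the slide, stamping the copies this way did not appear to remove the glitch.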


Glitch Resolution


We then tried subtly changing each
copied frame


Increased the stated size of each frame by
a few bytes


Thought that this would force the H.264 decoder to see this as a new frame and render it on playback


Fix succeeded!


Still have random glitches on long moves


Suspect hardware limitation problems with
the camera


Code Compatibility


Original tracking routine was standard C++,
had to convert to Managed C++ upon GUI
integration


Managed C++ ≈ C#


External library (OpenCV) was used to grab camera frames and convert between color and black & white


OpenCV needed a wrapper (Emgu) upon GUI/Tracking integration


Documentation is your friend, unless it’s on
MSDN

System Architecture


In the future, more focus on high-level system design before execution


Would have begun in C#


Would have better code structure


Would have a better camera (tested for encoding)


Would have documented code better in-process instead of afterwards


Documenting is incredibly boring, but absolutely necessary

Results of In-Classroom Testing


Tested in EE 123, which is long and narrow


Learned that our stand is too short


Elevation for the camera would have helped


Even with long range mic, distance was too great


In future, camera should be in 2nd or 3rd row of seats

Future Work


Different tracking method


Universal turntable to mount any camera


Second video stream/PIP of board or
PowerPoint


Motion tracking turret

Summary


We succeeded in building a system that
tracks a lecturer and records their lecture
autonomously


We accomplished this using image
processing and a PTZ camera


We overcame many difficulties and in
doing so learned a lot about software
engineering


We toiled for fortnights. Give us an A.