Collection Analysis Version 2 Revision 1 technical documentation.


Collection Analysis Technical Documentation

Yue Ji

February 26, 2007



This is the Collection Analysis Version 2 Revision 1.


Table of Contents:


1. Collection Analysis Version 2 Revision 1 Interface Structure

2. Interface Programming Technique

3. Collection Analysis Version 2 Revision 1 Output Description

4. Special Technique Used in Programming

5. Related Documentations

6. Contact Staff




1. Collection Analysis Version 2 Revision 1 Interface Structure.


1.1 Interface overview.


The call number is the key to finding records. This version deals only with LC call
numbers. The rules for forming call numbers and the formats in which they are stored
in the database are complex. It is hard for users to work out the call number from the
record category they are interested in, and it is easy to pull the wrong records because
of the call number’s complex format.


This version of the Collection Analysis Tool changes the traditional approach of letting
users type in a call number. Instead, it presents a list of call numbers based on the
users’ interests.

There are four select boxes, each of whose values depends on the previous selections.
Here is the interface screen shot. The highlighted items are the examples that will be
used to explain the interface design.







1.2 First select box.


The first select box displays the location code, the location name, and the count of
MFHDs with LC call numbers in each location.

The data in this box are loaded whenever the application is invoked by a browser.


Here is an example:

yulint [6] -> Yale Internet Resource

Location code: yulint.

Location name: Yale Internet Resource.

Total count of MFHDs with LC call numbers in the yulint location: 6.


1.3 Second select box.


The second select box displays the location code, LC class letter, class label, and MFHD
count in that location. The data in this box are the results of clicking the SELECT button
under “1. Select Location(s)” after making selections in the first select box.


For example, after selecting the first-box entry above, the data in this box are:

yulint->D[2] HISTORY (GENERAL) AND HISTORY OF EUROPE

yulint->G[2] GEOGRAPHY. ANTHROPOLOGY. RECREATION

yulint->Q[3] SCIENCE


These MFHDs from yulint break down as follows:

Two are in LC class D, whose label is HISTORY (GENERAL) AND HISTORY OF EUROPE.

Two are in LC class G, whose label is GEOGRAPHY. ANTHROPOLOGY. RECREATION.

Three are in LC class Q, whose label is SCIENCE.


To see more detail, for example which subclasses of Q yulint has, highlight the Q line,
then click the SELECT button under “2. Select Class(es)”. The results will show up in
the third select box.


1.4 Third select box.


The third select box displays the location code, LC subclass letter, subclass label, and
MFHD count in that location.


For example, after selecting all the second-box entries above, the data in this box are:

yulint->DS[2] Asia

yulint->G[2] Geography (General). Atlases. Maps

yulint->QC[1] Physics

yulint->QE[1] Geology

yulint->QL[1] Zoology


The two D MFHDs in yulint are in subclass DS, whose label is Asia.

The two G MFHDs in yulint are in subclass G, whose label is Geography (General).
Atlases. Maps.

The three Q MFHDs in yulint are in the following subclasses:

One is in QC, whose label is Physics.

One is in QE, whose label is Geology.

One is in QL, whose label is Zoology.


To find out which call numbers these subclasses cover in yulint, for example the call
number range of QC, highlight the QC line, then click the SELECT button under
“3. Select Subclass(es)”. The results will show up in the fourth select box.


1.5 Fourth select box.


The fourth select box displays the location code, LC subclass call number range, call
number range label, and MFHD count in that location.


For example, after selecting the third-box entries above, the data in the fourth box are:

yulint->***DS1-937[2] History of Asia***

yulint->   DS501-518[1] East Asia. The Far East

yulint->   DS801-897[1] Japan

yulint->***G1-922[1] Geography (General) ***

yulint->***G3180-9980[1] Maps***

yulint->   G3290-9880[1] By region or country

yulint->***QC1-999[1] Physics***

yulint->   QC81-114[1] Weights and measures

yulint->***QE1-996[1] Geology***

yulint->   QE500-639[1] Dynamic and structural geology

yulint->***QL1-991[1] Zoology***

yulint->   QL605-739[1] Chordates. Vertebrates


LC subclasses have a parent-child hierarchy structure.

For example, “DS1-937 History of Asia” is the parent; “DS501-518 East Asia. The Far
East” and “DS801-897 Japan” are its children. So if you are wondering where the two
MFHDs in DS1-937 fall, the answer is that one is in DS501-518 and the other is in
DS801-897.


This parent-child hierarchy structure is displayed in the browser as shown above.

Parent level: the lines start and end with ***.

Child level: indented away from the arrow and listed under its parent.

A parent level may have no children.

The sum of all parent-level MFHD counts equals the section’s total MFHD count.


Example of LC Subclass Hierarchy:

Parent / MFHD count     Children / MFHD count

DS1-937 / 2             DS501-518 / 1, DS801-897 / 1 (two children)

G1-922 / 1              No child

G3180-9980 / 1          G3290-9880 / 1 (one child)

QC1-999 / 1             QC81-114 / 1 (one child)

QE1-996 / 1             QE500-639 / 1 (one child)

QL1-991 / 1             QL605-739 / 1 (one child)
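
The parent-child display and the counting rule can be sketched in Java as follows. This is only an illustration, not the application's code: the class RangeNode and its methods child, render, and totalMfhds are hypothetical names, and the sample data in main come from the yulint example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of one call number range and its child ranges. */
final class RangeNode {
    final String range;
    final String label;
    final int mfhdCount;
    final List<RangeNode> children = new ArrayList<>();

    RangeNode(String range, String label, int mfhdCount) {
        this.range = range;
        this.label = label;
        this.mfhdCount = mfhdCount;
    }

    /** Adds a child range and returns this parent so calls can be chained. */
    RangeNode child(String range, String label, int mfhdCount) {
        children.add(new RangeNode(range, label, mfhdCount));
        return this;
    }

    /** Builds lines in the fourth-box style: parents wrapped in ***, children indented. */
    static List<String> render(String location, List<RangeNode> parents) {
        List<String> lines = new ArrayList<>();
        for (RangeNode p : parents) {
            lines.add(location + "->***" + p.range + "[" + p.mfhdCount + "] " + p.label + "***");
            for (RangeNode c : p.children) {
                lines.add(location + "->   " + c.range + "[" + c.mfhdCount + "] " + c.label);
            }
        }
        return lines;
    }

    /** Sums parent counts only; child counts are already included in their parent. */
    static int totalMfhds(List<RangeNode> parents) {
        int total = 0;
        for (RangeNode p : parents) {
            total += p.mfhdCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<RangeNode> parents = Arrays.asList(
            new RangeNode("DS1-937", "History of Asia", 2)
                .child("DS501-518", "East Asia. The Far East", 1)
                .child("DS801-897", "Japan", 1),
            new RangeNode("G1-922", "Geography (General)", 1));
        for (String line : render("yulint", parents)) {
            System.out.println(line);
        }
        System.out.println("Total: " + totalMfhds(parents));
    }
}
```

Summing only the parent level avoids counting a holding twice when it also appears in a child range.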



1.6 Conversion of “Library of Congress Classification Outline”.



The resource used to build the call number hierarchy is the “Library of Congress
Classification Outline”.

The URL is
http://www.loc.gov/catdir/cpso/lcco/lcco.html

The “Library of Congress Classification Outline” was entered into three Microsoft Excel
sheets: LC_MAIN_CLASS.xls, LC_SUB_CLASS.xls, LC_RANGE.xls.


A separate standalone Java application, EXPORTLCCLASS, processes these three Excel
files and imports the data into the following three Oracle tables in LIBSYS.


Creation of these three tables.


1). create table "LC_MAIN_CLASS"
    (
        "CLASS_LETTER" VARCHAR2(1) not null constraint "CLASS_LETTER_PK" primary key,
        "CLASS_TITLE" VARCHAR2(70) not null
    )




The data sample from LC_MAIN_CLASS table screen shot:









2). create table "LC_SUB_CLASS"
    (
        "CLASS_LETTER" VARCHAR2(1) not null,
        "SUBCLASS_LETTER" VARCHAR2(3) not null constraint "SUBCLASS_LETTER_PK" primary key,
        "SUBCLASS_TITLE" VARCHAR2(70) not null
    )







The data sample from LC_SUB_CLASS table screen shot:





3). create table "LC_RANGE"
    (
        "RANGE_ID" DECIMAL(22) not null constraint "RANGE_ID_PK" primary key,
        "SUBCLASS" VARCHAR2(3) not null,
        "START_NUMBER" VARCHAR2(6),
        "END_NUMBER" VARCHAR2(6),
        "RANGE_TITLE" VARCHAR2(70) not null,
        "HIERARCHY" DECIMAL(22) not null,
        "SEQUENCE" DECIMAL(22) not null,
        "CATEGORY_ID" DECIMAL(22) not null
    )









The data sample from LC_RANGE table screen shot:




1.7 Queries behind SELECT buttons.


1). For the first select box:

    select * from LIBSYS.LC_MAIN_CLASS

    select location_id, count(*) as class_count,
           substr(normalized_call_no,1,1) as class_letter
    from (select * from mfhd_master where location_id in [list of location id]
          and call_no_type = '0')
    group by location_id, substr(normalized_call_no,1,1)
    order by location_id, substr(normalized_call_no,1,1)


2). For the second select box:

    select * from LIBSYS.LC_SUB_CLASS

    select count(*) as sub_count,
           substr(normalized_call_no,1,instr(normalized_call_no,' ')) as sub_letter
    from (select * from (select * from MFHD_MASTER where location_id = ?)
          where call_no_type = '0')
    where substr(normalized_call_no,1,1) = ?
    group by substr(normalized_call_no,1,instr(normalized_call_no,' '))
    order by 2





3). For the third select box:

    select * from LIBSYS.LC_RANGE where SUBCLASS = ? order by range_id

    select count(*) as range_count
    from (select * from MFHD_MASTER where location_id = ? and call_no_type = '0')
    where substr(normalized_call_no,1,8) between ? AND ?


‘?’ represents a value that the programs supply dynamically.


The first part of the NORMALIZED_CALL_NO field in the MFHD_MASTER table has one
of the following structures:

Subclass letter (one or more) + four spaces + one digit (subclass number part).

Subclass letter (one or more) + three spaces + two digits (subclass number part).

Subclass letter (one or more) + two spaces + three digits (subclass number part).

Subclass letter (one or more) + one space + four digits (subclass number part).
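
In all four structures the spaces and digits together occupy five characters, so the subclass number is right-aligned in a fixed-width field. The Java sketch below reproduces that layout. It is an illustration under stated assumptions, not the code that populates NORMALIZED_CALL_NO: the class CallNumberPrefix and its method normalize are hypothetical names, and the 1 to 9999 bound is inferred from the four structures listed.

```java
import java.util.Locale;

/** Hypothetical helper that builds the first part of a normalized LC call number. */
final class CallNumberPrefix {

    /**
     * Returns the subclass letters followed by the subclass number,
     * right-aligned with leading spaces in a field of five characters.
     */
    static String normalize(String subclassLetters, int subclassNumber) {
        if (subclassNumber < 1 || subclassNumber > 9999) {
            throw new IllegalArgumentException("subclass number must have 1 to 4 digits");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%s%5d", subclassLetters, subclassNumber);
    }

    public static void main(String[] args) {
        // Brackets make the padding visible.
        System.out.println("[" + normalize("QC", 81) + "]");
        System.out.println("[" + normalize("QC", 114) + "]");
    }
}
```

Because a space sorts before any digit, two prefixes with the same letters compare as strings in the same order as their numbers, which is the property a string comparison such as the between condition in the third query relies on.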



2. Interface Programming Technique.


2.1 Programming language.


Client: JSP.

Middle tier: JavaScript, AJAX (DWR).

Server: Java.

Special use: Java Thread, Java TreeMap, PreparedStatement, Java encoded output.


2.2 Three tiers connections.


The following example explains what is needed to move data from the first select box
to the second select box.


1). In JSP - CARevision1Main.jsp:


In the HEAD:

    <script src='src/MoveSelection.js'> </script>
    <script src='dwr/interface/StartMoveSelection.js'> </script>
    <script src='dwr/interface/SelectLocation.js'> </script>
    <script src='dwr/interface/ThreadGetLocation.js'> </script>


In the BODY:

    <select name="selectlocation" size=8 multiple class="selectlocation">

    <button type="button" name="move1"
        onClick="moveFrom1To2(this.form.selectlocation)"
        style="background:images/yellowbackground.jpg;border-width:0px">
    <img src="images/selectbutton.gif" width="98" height="23"></button>


moveFrom1To2 is a function in MoveSelection.js.
ction.js


2). In JavaScript - MoveSelection.js:


Clear the second select box.


StartMoveSelection.startLocation(refreshLocation,wholeOptions).

StartMoveSelection is a Java program, and startLocation is one of its methods.

refreshLocation is a function in MoveSelection.js.

wholeOptions is the processed string of the first select box’s selected data.


In refreshLocation:

ThreadGetLocation.isRunning(updateLocation).

ThreadGetLocation is a Java program, and isRunning is one of its methods.

updateLocation is a function in MoveSelection.js.

updateLocation is called as updateLocation(runStatusBean).

runStatusBean corresponds to the Java program RunStatusBean.java, a bean of setters
and getters. It needs to be declared in dwr.xml: <convert converter="bean"
match="JavaCodes.RunStatusBean"/>. This bean is the connection between client and
server.


In updateLocation(runStatusBean):

runStatusBean.finishRunning reports the status of the query running on the server.

The status is represented as a number:

0: The query is still running. refreshLocation is called again every 1000 milliseconds.

1: The query has finished. Populate the results:
SelectLocation.queryResults(popListIn2).

SelectLocation is a Java program, and queryResults is one of its methods.

popListIn2 is a function in MoveSelection.js.

It uses a DWR method to write the results into the second HTML select box.

2: An error happened on the server. Display the error message in the browser.


3). In Java:


The communication from the middle tier to the server starts in StartMoveSelection.java.

This program starts two Java threads using the following two Java classes:

    ThreadPutLocation putData = new ThreadPutLocation(thisApplication,inputOpt);
    ThreadGetLocation getData = new ThreadGetLocation(thisApplication);

    putData.start();
    getData.start();


ThreadPutLocation is a thread that invokes the query-running Java class on the server.

This query-running Java class is SelectLocation, with the method runQuery.


ThreadGetLocation is also a thread. It checks the query’s running status and assigns
the status, as one of the numbers described above, through RunStatusBean’s setters.


In updateLocation, RunStatusBean’s getters are called.

The JavaScript function updateLocation(runStatusBean) checks this number periodically
(every 1000 milliseconds). When the query has finished, SelectLocation.queryResults is
called to get the query results.


RunStatusBean.java is a setter/getter bean.

The setters are: setFinishRunning, setCountRunning.

The getters are: getFinishRunning, getCountRunning.
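
A status bean of this shape might look like the sketch below. The four accessor names and the status values 0, 1, and 2 come from this document; everything else is an assumption made for illustration: the class name RunStatusBeanSketch, the int field types, the constant names, and the use of synchronized accessors. The meaning of countRunning is not described here, so the sketch only stores it.

```java
/** Illustrative status bean; field types and constant names are assumptions. */
final class RunStatusBeanSketch {
    static final int RUNNING = 0;   // query is still running
    static final int FINISHED = 1;  // query finished, results can be fetched
    static final int ERROR = 2;     // an error happened on the server

    private int finishRunning = RUNNING;
    private int countRunning;

    // Synchronized so a status written by one thread is visible to the reader.
    public synchronized int getFinishRunning() {
        return finishRunning;
    }

    public synchronized void setFinishRunning(int finishRunning) {
        this.finishRunning = finishRunning;
    }

    public synchronized int getCountRunning() {
        return countRunning;
    }

    public synchronized void setCountRunning(int countRunning) {
        this.countRunning = countRunning;
    }

    public static void main(String[] args) throws InterruptedException {
        RunStatusBeanSketch status = new RunStatusBeanSketch();
        // Stand-in for the query thread: it reports completion through the setter.
        Thread worker = new Thread(() -> status.setFinishRunning(FINISHED));
        worker.start();
        worker.join();
        System.out.println("status = " + status.getFinishRunning());
    }
}
```

A real bean exposed through DWR's bean converter would normally be a public class with a public no-argument constructor.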


2.3 Programs and their methods/functions behind SELECT button.


1). Each SELECT button’s background functions with their parameters
in MoveSelection.js:

First SELECT button              Second SELECT button          Third SELECT button

moveFrom1To2(fbox)               moveFrom2To3(fbox)            moveFrom3To4(fbox)
refreshLocation()                refreshClass()                refreshSubclass()
updateLocation(runStatusBean)    updateClass(runStatusBean)    updateSubclass(runStatusBean)
popListIn2(selectLocation)       popListIn3(selectClass)       popListIn4(selectSubclass)


2). Each SELECT button’s background methods of Java programs:

Button                  Java Program (.java)    Methods Included

All SELECT Buttons      StartMoveSelection      startLocation, startClass, startSubclass
                        RunStatusBean           setFinishRunning, getFinishRunning,
                                                setCountRunning, getCountRunning

First SELECT Button     SelectLocation          queryStatus, runQuery, queryResults
                        ThreadPutLocation       run
                        ThreadGetLocation       run, isRunning, isCompleted

Second SELECT Button    SelectClass             queryStatus, runQuery, queryResults
                        ThreadPutClass          run
                        ThreadGetClass          run, isRunning, isCompleted

Third SELECT Button     SelectSubclass          queryStatus, runQuery, queryResults
                        ThreadPutSubclass       run
                        ThreadGetSubclass       run, isRunning, isCompleted


Although some method names are the same in different Java classes, their contents
are different.




3. Collection Analysis Version 2 Revision 1 Output Description.


3.1 Output overview.


You can output data from each of the four select boxes. The output file is a
“|”-delimited text file. The file name pattern is netid_timestamped_CA.txt. You can
import the text file into Microsoft Excel or Access to review and manipulate the data.
The maximum number of records that the text file can contain depends on multiple
factors, such as the capability of the Oracle functions, the maximum size of the Oracle
result set, the maximum size of a text file, the import limits of Excel or Access, and
the memory size of the desktop and the server. It is hard to say what the maximum
number of records that can be output is. Fewer than 40,000 records is recommended.


The time needed to produce the output varies with the request. It can range from
seconds to hours. The application uses the AJAX technique to decouple the client from
the server. After the client submits the request, the client does not need to wait for
the response from the server. The connection is over at that point, but the server
continues its own work. When the job is done, the server notifies the user by sending
an email with the URL that points to the file path.


3.2 Output button queries.


There are 4 OUTPUT buttons, each with 4 output types, so there are 16 queries in total
behind the OUTPUT buttons. These 16 queries are documented in the following four
files:

OutputLocationQueries.

OutputClassQueries.

OutputSubclassQueries.

OutputRangeQueries.


3.3 Output programming summary.


The following example explains what is needed for the OUTPUT of the first select box.


1). In JSP - CARevision1Main.jsp:


In the HEAD:

    <script src='src/InvokeOutput.js'> </script>
    <script src='dwr/interface/OutputLocation.js'> </script>


In the BODY:

    <button type="button" name="out1"
        onClick="output1(this.form,'<% out.print(passData); %>',
        '<% out.print(lastName); %>','<% out.print(netID); %>')"
        style="background:images/yellowbackground.jpg;border-width:0px">
    <img src="images/output.gif" width="98" height="23"></button>


output1 is a function in InvokeOutput.js.



2). In JavaScript - InvokeOutput.js:


output1 parses the parameters passed in from the JSP, then concatenates them into the
parameters that will be passed to the Java server program. Depending on the
parameters passed into output1, one of these four methods of the Java program on the
server is invoked.


OutputLocation.OutputBM(wholeOptions,passdata):

Outputs bibliographic and holdings information for the selected location.


OutputLocation.OutputBMA(wholeOptions,passdata):

Outputs bibliographic and holdings information for all related locations.


OutputLocation.OutputBMI(wholeOptions,passdata):

Outputs bibliographic and holdings plus items information for the selected location.


OutputLocation.OutputBMIA(wholeOptions,passdata):

Outputs bibliographic and holdings plus items information for all related locations.


OutputLocation is a Java program with four methods: OutputBM, OutputBMA,
OutputBMI, OutputBMIA.


After you click the OUTPUT button, the message “Your report URL link will be sent
to your email” is displayed. At this point, the interactive transaction between client
and server is over. The client and server will not wait for each other’s response.



3). In Java - OutputLocation.java:


Each method follows a similar procedure. The steps are listed below in execution
order.


Parse the parameters that have been passed in from InvokeOutput.js.


Get the system date, then create the timestamped file name. The file name pattern
is netid_YYYYMMDD_hh-mm-ss_CA.txt, where "YYYYMMDD_hh-mm-ss" is the date
and time at which the file is created.


Assign the output text file’s path (where to get this file).


Set up the environment for sending email.


Dynamically build the queries.


Run the queries.


Write the results into the text file.


Send an email to notify the user that the output file is ready.
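
The file-naming step can be sketched in Java as follows. This is an assumption-laden illustration rather than the code in OutputLocation.java: the class OutputFileName and its method build are hypothetical names, java.time is used here in place of whatever date API the application actually uses, and the hour field is taken to be on a 24-hour clock.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that builds netid_YYYYMMDD_hh-mm-ss_CA.txt names. */
final class OutputFileName {
    // yyyyMMdd = date, HH-mm-ss = 24-hour time; '_' and '-' are emitted literally.
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyyMMdd_HH-mm-ss");

    /** Joins the netid, the creation timestamp, and the fixed _CA.txt suffix. */
    static String build(String netId, LocalDateTime createdAt) {
        return netId + "_" + createdAt.format(STAMP) + "_CA.txt";
    }

    public static void main(String[] args) {
        // Uses the current system time, as the procedure step describes.
        System.out.println(build("netid", LocalDateTime.now()));
    }
}
```

Passing the timestamp in as a parameter, rather than reading the clock inside build, keeps the naming rule easy to check against a known date.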


3.4 Email servers.


1). There are two domains on campus.


Central campus.

Incoming mail server: netid.mail.yale.edu

Email address: netid@netid.mail.yale.edu


Medical campus.

Incoming mail server: email.med.yale.edu

Email address: netid@email.med.yale.edu


Both have the same outgoing mail server: mail.yale.edu


The correct domain name has to be used in order for the user to receive his/her
output file.

For example, for staff working at SML, the incoming mail server should be
netid.mail.yale.edu. If the program assigns their incoming mail server as
email.med.yale.edu, sending the email will fail.


2). How is the user’s email domain name decided?


In the Voyager OPERATOR table, the LAST_NAME field contains staff group data.
Most groups are located on central campus, except for the following 4 groups on
medical campus:

Medical Library, Medical Library Student, EPH Library, EPH Library Student.


The program uses the netid to find the staff group, then selects the incoming mail
server based on that group.
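
The selection rule can be sketched in Java as follows. This is a minimal illustration, not the application's code: the class MailServerChooser and its methods are hypothetical names, the lookup of the group from the OPERATOR table is left out, and exact matching of the trimmed group name is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical chooser of the incoming mail server from a staff group name. */
final class MailServerChooser {
    static final String CENTRAL_SERVER = "netid.mail.yale.edu";
    static final String MEDICAL_SERVER = "email.med.yale.edu";

    // The four groups this document lists as being on the medical campus.
    private static final Set<String> MEDICAL_GROUPS = new HashSet<>(Arrays.asList(
        "Medical Library", "Medical Library Student",
        "EPH Library", "EPH Library Student"));

    /** Returns the medical server for the four medical groups, else the central one. */
    static String incomingServerFor(String staffGroup) {
        return MEDICAL_GROUPS.contains(staffGroup.trim()) ? MEDICAL_SERVER : CENTRAL_SERVER;
    }

    /** Builds the address in the netid@server form shown above. */
    static String addressFor(String netId, String staffGroup) {
        return netId + "@" + incomingServerFor(staffGroup);
    }

    public static void main(String[] args) {
        System.out.println(addressFor("netid", "EPH Library"));
    }
}
```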




3.5 Output file link.


The output file can be very large. Sending a large file through email may crash the
email system, so this application sends only the file’s URL link in the email. When the
user clicks this link, it takes the user to the directory on the server where the file is
located.

Because the file name starts with the netid, the user can easily find his/her file on
the server, then right-click the file to save it to his/her desktop. Be cautious:
DON’T double-click to open the file. If the file is too large, it can freeze the
browser, or even the whole desktop. After the file is downloaded to the desktop, open
a new Excel sheet and import the file.


3.6 Output file structure.


The output file is a text file. The fields are delimited by the pipe sign ‘|’.


The fields in the bib and holding file:

MFHD_LOCATION_CODE|CALL_NUMBER|BIB_FORMAT|AUTHOR|BRIEF_TITLE|IMPRINT|BEGIN_PUB_YEAR|PHYSICAL_DESC|LANGUAGE|BIB_ENCODING_LEVEL|BIB_ID|MFHD_ID|SUCCEEDING|BIB_DATE_TYPE|HOLDING


The fields in the bib, holding, and item file:

MFHD_LOCATION_CODE|CALL_NUMBER|BIB_FORMAT|AUTHOR|BRIEF_TITLE|IMPRINT|BEGIN_PUB_YEAR|PHYSICAL_DESC|LANGUAGE|BIB_ENCODING_LEVEL|BIB_ID|MFHD_ID|SUCCEEDING|BIB_DATE_TYPE|HOLDING|ITEM_PERM_LOC_CODE|ITEM_TEMP_LOC_CODE|LAST_CIRC_DATE|CHARGES|BROWSES|BARCODE|ITEM_ID


3.7 Reason for disordered records in the Excel file.



After the text file is imported into an Excel sheet, any record whose data contain a
non-display character or a pipe sign ‘|’ will be disordered in the Excel sheet. Such a
record should be fixed by the Cataloging Department.
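
A record of this kind can be spotted before import by counting its fields. The Java sketch below is an illustration only: the class PipeRecordCheck and its methods are hypothetical names, the expected counts of 15 and 22 are obtained by counting the field names listed in section 3.6, and treating "non-display character" as an ISO control character is an assumption.

```java
/** Hypothetical pre-import check for pipe-delimited output records. */
final class PipeRecordCheck {
    static final int BIB_HOLDING_FIELDS = 15;       // bib and holding file
    static final int BIB_HOLDING_ITEM_FIELDS = 22;  // bib, holding, and item file

    /** Counts fields, keeping trailing empty fields (limit -1). */
    static int fieldCount(String line) {
        return line.split("\\|", -1).length;
    }

    /** True if the line contains an ISO control character such as BEL or TAB. */
    static boolean hasControlCharacter(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (Character.isISOControl(line.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    /** A record looks clean when the field count matches and no control character is present. */
    static boolean looksClean(String line, int expectedFields) {
        return fieldCount(line) == expectedFields && !hasControlCharacter(line);
    }

    public static void main(String[] args) {
        String record = "yulint|QC   81|am|Author|Title";
        System.out.println(fieldCount(record) + " fields, clean for 5: " + looksClean(record, 5));
    }
}
```

An extra pipe sign inside a title or author raises the field count above the expected value, which is the same shift that shows up as misplaced columns in Excel.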


3.8 Programs and their methods/functions behind OUTPUT button.


1). Each OUTPUT button’s background functions with their parameters
in InvokeOutput.js:

Button                  Function

First OUTPUT Button     output1(fbox,passdata,lastname,netid)
Second OUTPUT Button    output2(fbox,passdata,lastname,netid)
Third OUTPUT Button     output3(fbox,passdata,lastname,netid)
Fourth OUTPUT Button    output4(fbox,passdata,lastname,netid)


2). Each OUTPUT button’s background methods of Java programs:

Button                  Java Program (.java)    Methods Included

First OUTPUT Button     OutputLocation          OutputBM, OutputBMA, OutputBMI, OutputBMIA
Second OUTPUT Button    OutputClass             OutputCBM, OutputCBMA, OutputCBMI, OutputCBMIA
Third OUTPUT Button     OutputSubclass          OutputSBM, OutputSBMA, OutputSBMI, OutputSBMIA
Fourth OUTPUT Button    OutputRange             OutputRBM, OutputRBMA, OutputRBMI, OutputRBMIA





4. Special Technique Used in Programming.


4.1 prepareStatement vs. createStatement.


prepareStatement is used instead of createStatement in this application.

The decision is based on the following explanation.


When should you actually use a PreparedStatement vs. a Statement object?

It depends on your usage. If you plan on executing your statement infrequently, you
might want to consider the createStatement() approach. If you plan on executing that
statement frequently, and would not want to incur the repeated cost of creating and
compiling the statement, you may be better off using prepared statements.


PreparedStatement objects are best used when you will be executing a large number
of identical queries with different values. If you are going to be looping through code
and adding or updating rows in bulk, go for the PreparedStatement; otherwise,
Statement is your answer.



Example code 1:

    PreparedStatement pstmt = conn.prepareStatement(
        "update table set column2 = "My Value" where id = 1000");
    pstmt.execute();


Example code 2:

    PreparedStatement pstmt = conn.prepareStatement(
        "update table set column2 = ? where id = ?");
    pstmt.setString(1, "My Value");
    pstmt.setInt(2, 1000);
    pstmt.execute();


The first one is blatantly wrong, but what's wrong with the second one?

It's being executed every time you run through the code. Why is it bad to do it this
way? You are DOUBLING your number of calls to the database.


When you call conn.prepareStatement(String) you are sending a message to the
database to pre-compile the sql string. You then send another message to the
database when you call execute() after you set the variables. The correct way of
using a prepared statement would be in a situation like this:


Example code 3:

    PreparedStatement pstmt = conn.prepareStatement(
        "update table set column2 = ? where id = ?");

    while (true) // some kind of terminating loop here, not just while true
    {
        pstmt.setString(1, valueVar);
        pstmt.setInt(2, intVar);
        pstmt.execute();
    }


However, there is a large speed difference for the first 50-60 records sent.
If you are doing fewer than 50-60 iterations of this query, it is still faster to use
Statements rather than a PreparedStatement. However, it is twice as fast to use
PreparedStatements once you have iterated through it about 1000 times.


Statements are good for one-time inserts/updates and also for sending batches of
several different inserts/updates.

Example code 4:

    String sql1 = "insert into...";
    String sql2 = "update table set ...";

    Statement st = conn.createStatement();
    st.addBatch(sql1);
    st.addBatch(sql2);
    int[] returnRows = st.executeBatch();





4.2 Precompile JSP and Servlet.

1). What data are “loaded into” the browser every time CARevision1Main.jsp is invoked?


The first load of data is all locations with Library of Congress call numbers.

There are two steps to get the data:


Get all locations that have holding counts from the LOCATION table. These holding
counts include all classifications, such as LC, Government Documents, etc., so the
program needs to go to the MFHD_MASTER table to find the LC holding counts only.


For each location, go through the whole MFHD_MASTER table to count the number of
LC call numbers it has.


Because MFHD_MASTER is a huge table, with around 8 million records, counting each
location’s LC holdings is time consuming: it takes about 1 minute. A minute spent
facing a blank page feels too long to users. The results do not change after this JSP
page is loaded for the first time, so there is no need to execute the above two steps
every time. To make performance more efficient, the init() method is used in
PrecompileInit.java and the jspInit() method is used in CARevision1Main.jsp in this
application.


2). init() method in Java servlet.


PrecompileInit.java is a servlet. It is located at WEB-INF/classes/Precompile.

If the servlet is declared in web.xml as below, its init() method is precompiled and
executed only once, with its results cached, when this application is loaded into the
tomcat container for any reason, such as deploying the application, starting the whole
tomcat, starting the application, or reloading the application. If it is not declared in
web.xml, init() won’t be precompiled and executed.

The <init-param> part is not required for precompiling, but it is a useful feature that
brings changeable external key-value pairs into the code.


    <servlet>
        <servlet-name>PrecompileInit</servlet-name>
        <display-name>Servlet Precompile Init</display-name>
        <description>Fast servelet for listing location with LC MFHD counts.
        </description>
        <servlet-class>Precompile.PrecompileInit</servlet-class>
        <init-param>
            <param-name>Incoming_central_mail_server</param-name>
            <param-value>netid.mail.yale.edu</param-value>
        </init-param>
        <init-param>
            <param-name>Incoming_medical_mail_server</param-name>
            <param-value>email.med.yale.edu</param-value>
        </init-param>
        <init-param>
            <param-name>Outgoing_mail_server</param-name>
            <param-value>mail.yale.edu</param-value>
        </init-param>
        <init-param>
            <param-name>Send_email_address</param-name>
            <param-value>prodsys@mailman.yale.edu</param-value>
        </init-param>
        <init-param>
            <param-name>Output_file_path</param-name>
            <!--param-value>
            /usr/local/tomcat/webapps/DownloadFiles/Collection_Analysis_Files
            </param-value-->
            <param-value>c:/temp</param-value>
        </init-param>
        <init-param>
            <param-name>Output_file_URL</param-name>
            <param-value>
            http://magellan.library.yale.edu:8085/DownloadFiles/Collection_Analysis_Files
            </param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>

    <servlet-mapping>
        <servlet-name>PrecompileInit</servlet-name>
        <url-pattern>/servlet/PrecompileInit</url-pattern>
    </servlet-mapping>




Explanation of <load-on-startup>.

This tag specifies that the servlet should be loaded automatically when the web
application is started.

The value is a single positive integer, which specifies the loading order. Servlets
with lower values are loaded before servlets with higher values (i.e., a servlet
with a load-on-startup value of 1 or 5 is loaded before a servlet with a value of
10 or 20).

When loaded, the init() method of the servlet is called. Therefore this tag
provides a good way to do the following:

- start any daemon threads, such as a server listening on a TCP/IP port, or a
  background maintenance thread

- perform initialization of the application, such as parsing a settings file
  which provides data to other servlets/JSPs

If no <load-on-startup> value is specified, the servlet will be loaded when the
container decides it needs to be loaded, typically on its first access. This is
suitable for servlets that don't need to perform special initialization.




If the servlet is not declared in web.xml, init() won’t be precompiled and executed
at the time tomcat starts the application.


If the servlet is declared in web.xml, init() will be precompiled and executed only
once, at the time tomcat starts the application.


The data generated by init() are cached for the lifetime of the application, starting
from the time tomcat starts the application.


The init() of PrecompileInit.java makes the data source connection, gets all locations
with LC MFHD counts as described above, and saves them in a cached temp file for
jspInit() to use.



3). jspInit() method in CARevision1Main.jsp.


The jspInit() method is compiled and executed only once, with its results cached, when
the JSP is invoked for the first time, whether or not the JSP is declared in web.xml.
If the JSP is declared in web.xml as below, jspInit() is compiled and executed before
the JSP is invoked, but its results can’t be cached; when the JSP is then invoked for
the first time, jspInit() is compiled and executed again, and this time its results are
cached for the lifetime. If the JSP is not declared in web.xml, jspInit() won’t be
compiled and executed before the JSP is invoked. There is no need to add the
declaration to web.xml, because it can cause jspInit() to be compiled and executed
twice.


    <servlet>
        <servlet-name>JSPINIT Preload</servlet-name>
        <jsp-file>/CARevision1Main.jsp</jsp-file>
        <load-on-startup>1</load-on-startup>
    </servlet>




The JSP’s preload doesn’t need a mapping section like the servlet’s.


In the tomcat environment, a JSP's jspInit() method is called only once, the first
time the JSP is invoked, and what it caches lasts for the JSP’s lifetime. Be aware that
this must happen the first time the JSP is invoked. You can use this to improve
performance: use the jspInit() method to cache static data.


Generally a JSP generates not only dynamic data but also static data. Programmers
often make the mistake of creating both the dynamic and the static data on every
request in the JSP page. Dynamic data must be created per request because of its
nature, but there is no need to create static data for every request in a JSP page.




If the JSP is not declared in web.xml, jspInit() won’t be precompiled and executed
at the time tomcat starts the application.


If the JSP is declared in web.xml, jspInit() will be precompiled and executed at the
time tomcat starts the application.


Regardless of whether the JSP is declared in web.xml, the JSP will be compiled and
executed only once, at the time this JSP is invoked by a browser.


The data generated by jspInit() are cached for the lifetime.


The jspInit() of CARevision1Main.jsp parses the data acquired from the init() of
PrecompileInit.java and caches them in string arrays for lifetime use. That means that
after CARevision1Main.jsp is invoked for the first time, jspInit() won’t be compiled
and executed any more. CARevision1Main.jsp just reads the cached data every time it
runs.
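
The load-once, read-many idea behind init() and jspInit() can be shown outside a servlet container with a small Java sketch. It is an analogy under stated assumptions, not the application's code: the class OnceCache is a hypothetical name, and the supplier stands in for the slow MFHD_MASTER counting step.

```java
import java.util.function.Supplier;

/** Hypothetical holder that runs an expensive loader once and then serves the cached value. */
final class OnceCache<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    OnceCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on the first call only; later calls return the cached value. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    public static void main(String[] args) {
        OnceCache<String[]> locations = new OnceCache<>(() -> {
            System.out.println("loading (slow step runs here)");
            return new String[] {"yulint [6] -> Yale Internet Resource"};
        });
        System.out.println(locations.get()[0]); // triggers the load
        System.out.println(locations.get()[0]); // served from the cache
    }
}
```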


4.3 Diacritics.


Displaying foreign-language diacritics correctly is a complex issue. For the most
common European languages, the encoding is ISO8859_1. Whether the diacritics are
displayed or retrieved correctly depends on whether each environment supports
ISO8859_1. The environments include the Java language, SQL, text editors,
browsers, MS Excel, Access, Word, etc.


From the Java programming point of view, if the code uses inappropriate methods, the
output function still works, but the diacritics come out as the wrong characters.

Here is the code used in this application to output the correct diacritics into the
file:

    OutputStream fout = new FileOutputStream(txtName);
    OutputStream bout = new BufferedOutputStream(fout);
    OutputStreamWriter txtOutput = new OutputStreamWriter(bout, "8859_1");
    txtOutput.write(dataLine);
    txtOutput.close();


This code successfully outputs the diacritics. However, displaying the diacritics
correctly also depends on whether the display environment supports ISO8859_1.
For example, NOTEPAD displays the diacritics correctly, but VEDIT does not.
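
The effect of choosing the encoding explicitly can be checked with a short, self-contained Java sketch. It is an illustration, not the application's code: the class Latin1Output and its methods are hypothetical names, an in-memory stream replaces the output file, and StandardCharsets.ISO_8859_1 is used instead of the "8859_1" charset name.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical writer that encodes output lines as ISO-8859-1. */
final class Latin1Output {

    /** Writes one line to the stream as ISO-8859-1 and closes the writer. */
    static void writeTo(OutputStream target, String dataLine) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new BufferedOutputStream(target), StandardCharsets.ISO_8859_1)) {
            out.write(dataLine);
        }
    }

    /** Convenience wrapper that returns the encoded bytes instead of writing a file. */
    static byte[] encode(String dataLine) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeTo(buffer, dataLine);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encode("caf\u00e9"); // "café"
        System.out.println(bytes.length + " bytes, last byte 0x"
            + Integer.toHexString(bytes[bytes.length - 1] & 0xFF));
    }
}
```

In ISO-8859-1 every character in the Latin-1 range is written as exactly one byte, so an accented letter such as é occupies a single byte in the output file.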







4.4 The feature of output file’s path on the server.


This path has to be accessible through a URL link, so it can’t be just any path on the
server. The path has to be in the tomcat container, under the root of webapps.


One simple web application, DownloadFiles, was created for this purpose.


“DownloadFiles” is the root directory for the URL-accessible file path. All files and
directories that need to be accessed from a URL are under “DownloadFiles”.


For this version of Collection Analysis, the directory is named
Collection_Analysis_Files. All output files are saved in the Collection_Analysis_Files
directory. After the user OUTPUTs his/her file, he/she gets an email with the URL link
that indicates the file path.


The Collection_Analysis_Files directory needs to be cleaned up daily. Files are kept
in this directory for 14 days from their creation date.
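
The 14-day retention rule can be expressed as a small Java check that a daily cleanup job could apply to each file. This sketch is an assumption, not the cleanup procedure actually used: the class RetentionCheck and its method isExpired are hypothetical names, and treating a file as expired only once it is strictly older than 14 days is a choice made for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical check for the 14-day retention rule. */
final class RetentionCheck {
    static final Duration RETENTION = Duration.ofDays(14);

    /** True when more than 14 days have passed between creation and now. */
    static boolean isExpired(Instant createdAt, Instant now) {
        return Duration.between(createdAt, now).compareTo(RETENTION) > 0;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2007-03-05T16:55:36Z");
        Instant later = created.plus(Duration.ofDays(15));
        System.out.println("expired after 15 days: " + isExpired(created, later));
    }
}
```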




Here is an example of the email message that users receive after they click the OUTPUT
button:


    Here is your Collection Analysis File:
    yj33_20070305_16-55-36_CA.txt
    Please click the link to find your file.
    Then right click your file name to save on your desktop.
    http://magellan.library.yale.edu:8085/DownloadFiles/Collection_Analysis_Files
    This file will be saved for 14 days.


5. Related Documentations.


- ReadMe_CARevision1_Deploy.txt.

- HowToUseDWR.doc.

- OutputLocationQueries.

- OutputClassQueries.

- OutputSubclassQueries.

- OutputRangeQueries.


6. Contact Staff.

IS&P: Estelle Pope < este.pope@yale.edu >

ITS: Gail Barnett < gail.barnett@yale.edu >

Bob Rice < robert.rice@yale.edu >