convertingtown
Software and s/w Development

Nov 4, 2013

This article describes how you can add the ability to convert DOC to PDF (DOC2PDF) to Microsoft
Office SharePoint Server 2007 (MOSS) using Aspose.Words.

In this article, we show how you can create a small console application in Visual Studio that works
as a document converter for SharePoint and invokes the Aspose components to perform the
conversion.

It is easy to add other types of conversions such as DOC to DOCX, DOC to RTF, RTF to DOC, DOC to
WordML, WordML to DOC, HTML to DOC, etc. by following the example in this article. You can also
investigate other Aspose file format components such as Aspose.Cells and Aspose.Slides and use
them to support even more types of document conversions in SharePoint.

Document Converters in SharePoint

Microsoft Office SharePoint Server 2007 includes a new feature that allows the conversion of
documents from one format (content type) to another. You can use document conversions to
transform your content to suit your business requirements. You can invoke the conversion from the
user interface or programmatically via the SharePoint Object Model.

Built-in Document Converters in SharePoint

SharePoint includes several document converters that you can use out of the box:



- .DOCX (Office Open XML) to HTML web page (also .DOCM to web page)
- InfoPath to HTML web page
- XML to HTML web page


Converting a Word Document (DOCX) to a Web Page using a built-in MOSS document converter.


Need for More Document Converters

The set of document converters included with MOSS is limited. You can only convert DOCX, InfoPath
and XML documents into web pages.

There are many possible scenarios where additional converters might be required:



- When a draft document is stored in one format (Microsoft Word DOC) and the final
  document is published in another format (Adobe PDF) to a customer-facing site.
- When the main format for documents is DOCX inside the organization, but it needs to
  make the documents available to its customers and partners as DOC documents or
  vice versa.


Extensible Document Converter Framework

Thankfully, the document converter framework in SharePoint is extensible. It allows custom
converters to be implemented and seamlessly integrated into SharePoint, so that any required
content type conversion can be supported.

The Document Converters Overview section in MSDN is a good introduction to document converters.
Although it is geared towards developers implementing a custom document converter, it makes
good reading for any IT professional who is tasked with planning or supporting document converters
in SharePoint.

Summary of Document Converters in SharePoint

To summarize the features of Document Converters in SharePoint:



- Extensible. Custom converters can be added to facilitate almost any content
  conversion.
- A Document Converter is an executable. You can develop one or find a suitable
  commercial product.
- Document conversions are usually resource intensive; they run on the server(s) and
  are controlled by the SharePoint load balancer service.
- Documents that are the results of the conversion can be versioned and they maintain
  a link to the original document in their metadata, history, properties etc.


Aspose to the Rescue

Aspose provides a great line of .NET and Java components. Trusted by thousands of customers
worldwide, the products include File Format Components, Reporting Products, Visual Components
and Utility Components.

Aspose File Format Components include products such as Aspose.Words, Aspose.Cells, Aspose.Pdf,
Aspose.Slides and so on that allow you to programmatically open, modify, generate, save, merge
and convert documents in various formats including DOC, DOCX, RTF, WordML, HTML, PDF, XLS,
PPT and others. These products are class libraries that developers use when building .NET
or Java applications that require access to documents in different formats.

Aspose File Format Components are often chosen for their superior performance, scalability and
stability in a server environment over Microsoft Office Automation. Microsoft Office Automation is
not recommended on the server for these reasons.


While Aspose components cannot be directly used as document converters for SharePoint out of
the box, this article shows how you can easily create a small .NET application that wraps an
Aspose component and works as a document converter for SharePoint.

Create a Document Converter for MOSS

A document converter for SharePoint (MOSS) is a custom executable that SharePoint calls with
command line arguments. The arguments specify the input, output, configuration and log files. The
command line arguments are described in detail in Document Converter Run Command in MSDN.

We are going to create a simple console application in Visual Studio 2005 that supports the
command line arguments passed by SharePoint and performs the DOC to PDF conversion using
Aspose.Words.

In this example we are using Visual Studio 2005 and the application will be built for .NET 2.0, but
you can also use Visual Studio 2003, in which case the document converter will be built for
.NET 1.1, which will also work fine. SharePoint places no requirements on the .NET version of a
document converter; in fact, a document converter does not have to be a .NET application at all.
It just needs to be an executable.

Download and Install Aspose Components

You need to download Aspose.Words for .NET from Aspose Downloads.

Install Aspose.Words on your development computer. All Aspose components, when installed, work
in evaluation mode. The evaluation mode has no time limit and injects watermarks into produced
documents.

Create a Project

Start Visual Studio 2005 and create a new console application. This example will show a C# console
application, but you can use VB.NET too.

Add References

Add a reference to C:\Program Files\Aspose\Aspose.Words\Bin\net2.0\Aspose.Words.DLL.


Add Code

Example

The following is the complete code of the document converter.

[C#]



using System;
using System.IO;
using Aspose.Words;

namespace Examples
{
    /// <summary>
    /// DOC2PDF document converter for SharePoint.
    /// Uses Aspose.Words to perform the conversion.
    /// </summary>
    public class ExMossDoc2Pdf
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            // Although SharePoint passes "-log <filename>" to us and we are
            // supposed to log there, for the sake of simplicity, we will use
            // our own hard coded path to the log file.
            //
            // Make sure there are permissions to write into this folder.
            // The document converter will be called under the document
            // conversion account (not sure what name), so for testing purposes
            // I would give the Users group write permissions into this folder.
            gLog = new StreamWriter(@"C:\Aspose2Pdf\log.txt", true);

            try
            {
                gLog.WriteLine(DateTime.Now.ToString() + " Started");
                gLog.WriteLine(Environment.CommandLine);

                ParseCommandLine(args);

                // Uncomment the code below when you have purchased a license for
                // Aspose.Words.
                //
                // You need to deploy the license in the same folder as your
                // executable, alternatively you can add the license file as an
                // embedded resource to your project.
                //
                // // Set license for Aspose.Words.
                // Aspose.Words.License wordsLicense = new Aspose.Words.License();
                // wordsLicense.SetLicense("Aspose.Total.lic");

                ConvertDoc2Pdf(gInFileName, gOutFileName);
            }
            catch (Exception e)
            {
                // Log the failure and report it to SharePoint via a non-zero exit code.
                gLog.WriteLine(e.Message);
                Environment.ExitCode = 100;
            }
            finally
            {
                gLog.Close();
            }
        }

        private static void ParseCommandLine(string[] args)
        {
            int i = 0;
            while (i < args.Length)
            {
                string s = args[i];
                switch (s.ToLower())
                {
                    case "-in":
                        i++;
                        gInFileName = args[i];
                        break;
                    case "-out":
                        i++;
                        gOutFileName = args[i];
                        break;
                    case "-config":
                        // Skip the name of the config file and do nothing.
                        i++;
                        break;
                    case "-log":
                        // Skip the name of the log file and do nothing.
                        i++;
                        break;
                    default:
                        throw new Exception("Unknown command line argument: " + s);
                }
                i++;
            }
        }

        private static void ConvertDoc2Pdf(string inFileName, string outFileName)
        {
            // You can load not only DOC here, but any format supported by
            // Aspose.Words: DOC, RTF, WordML, HTML.
            Document doc = new Document(inFileName);

            doc.Save(outFileName, SaveFormat.Pdf);
        }

        private static string gInFileName;
        private static string gOutFileName;
        private static StreamWriter gLog;
    }
}





[Visual Basic]



Imports Microsoft.VisualBasic
Imports System
Imports System.IO
Imports Aspose.Words

Namespace Examples
    ''' <summary>
    ''' DOC2PDF document converter for SharePoint.
    ''' Uses Aspose.Words to perform the conversion.
    ''' </summary>
    Public Class ExMossDoc2Pdf
        ''' <summary>
        ''' The main entry point for the application.
        ''' </summary>
        <STAThread> _
        Shared Sub Main(ByVal args As String())
            ' Although SharePoint passes "-log <filename>" to us and we are
            ' supposed to log there, for the sake of simplicity, we will use
            ' our own hard coded path to the log file.
            '
            ' Make sure there are permissions to write into this folder.
            ' The document converter will be called under the document
            ' conversion account (not sure what name), so for testing purposes
            ' I would give the Users group write permissions into this folder.
            gLog = New StreamWriter("C:\Aspose2Pdf\log.txt", True)

            Try
                gLog.WriteLine(DateTime.Now.ToString() & " Started")
                gLog.WriteLine(Environment.CommandLine)

                ParseCommandLine(args)

                ' Uncomment the code below when you have purchased a license for
                ' Aspose.Words.
                '
                ' You need to deploy the license in the same folder as your
                ' executable, alternatively you can add the license file as an
                ' embedded resource to your project.
                '
                ' ' Set license for Aspose.Words.
                ' Dim wordsLicense As Aspose.Words.License = New Aspose.Words.License()
                ' wordsLicense.SetLicense("Aspose.Total.lic")

                ConvertDoc2Pdf(gInFileName, gOutFileName)
            Catch e As Exception
                ' Log the failure and report it to SharePoint via a non-zero exit code.
                gLog.WriteLine(e.Message)
                Environment.ExitCode = 100
            Finally
                gLog.Close()
            End Try
        End Sub

        Private Shared Sub ParseCommandLine(ByVal args As String())
            Dim i As Integer = 0
            Do While i < args.Length
                Dim s As String = args(i)
                Select Case s.ToLower()
                    Case "-in"
                        i += 1
                        gInFileName = args(i)
                    Case "-out"
                        i += 1
                        gOutFileName = args(i)
                    Case "-config"
                        ' Skip the name of the config file and do nothing.
                        i += 1
                    Case "-log"
                        ' Skip the name of the log file and do nothing.
                        i += 1
                    Case Else
                        Throw New Exception("Unknown command line argument: " & s)
                End Select
                i += 1
            Loop
        End Sub

        Private Shared Sub ConvertDoc2Pdf(ByVal inFileName As String, ByVal outFileName As String)
            ' You can load not only DOC here, but any format supported by
            ' Aspose.Words: DOC, RTF, WordML, HTML.
            Dim doc As Document = New Document(inFileName)

            doc.Save(outFileName, SaveFormat.Pdf)
        End Sub

        Private Shared gInFileName As String
        Private Shared gOutFileName As String
        Private Shared gLog As StreamWriter
    End Class
End Namespace





Select the Release configuration and rebuild the solution.

You now have the AsposeDoc2Pdf.exe executable that can be used as a document converter for
SharePoint.
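
The converter above writes to a hard-coded log path even though SharePoint passes "-log <filename>" on the command line. The following is a minimal sketch of reading that path from the arguments instead, falling back to the hard-coded path when the switch is absent. The ConverterArgs class and its GetValue method are hypothetical helper names introduced for illustration; they are not part of the original example or of any SharePoint or Aspose API, and the sketch does not cover what SharePoint expects to find inside the log file.

```csharp
using System;

public static class ConverterArgs
{
    // Hypothetical helper: returns the value that follows the given switch
    // (for example "-log"), or the fallback when the switch is missing or
    // is the last argument and so has no value after it.
    public static string GetValue(string[] args, string name, string fallback)
    {
        for (int i = 0; i < args.Length - 1; i++)
        {
            if (string.Equals(args[i], name, StringComparison.OrdinalIgnoreCase))
                return args[i + 1];
        }
        return fallback;
    }

    public static void Main(string[] args)
    {
        // Assumption: fall back to the hard-coded path used in the article.
        string logPath = GetValue(args, "-log", @"C:\Aspose2Pdf\log.txt");
        Console.WriteLine("Logging to " + logPath);
    }
}
```

In the converter, the returned path could be passed to the StreamWriter constructor in place of the literal path. The same helper would also work for "-in" and "-out".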

How to Build Converters for Other Formats

It is very easy to build more document converters. Aspose.Words supports DOC, DOCX, RTF,
WordML and HTML documents and can perform conversions between these formats in any direction.
Conversions between Microsoft Word formats (DOC, DOCX, RTF and WordML) are high-fidelity,
meaning no content or formatting in the document is lost.

Example

Converts an RTF document to OOXML.

[C#]



public static void ConvertRtfToDocx(string inFileName, string outFileName)
{
    // Load an RTF file into Aspose.Words.
    Aspose.Words.Document doc = new Aspose.Words.Document(inFileName);

    // Save the document in the OOXML format.
    doc.Save(outFileName, Aspose.Words.SaveFormat.Docx);
}





[Visual Basic]



Public Shared Sub ConvertRtfToDocx(ByVal inFileName As String, ByVal outFileName As String)
    ' Load an RTF file into Aspose.Words.
    Dim doc As Aspose.Words.Document = New Aspose.Words.Document(inFileName)

    ' Save the document in the OOXML format.
    doc.Save(outFileName, Aspose.Words.SaveFormat.Docx)
End Sub





Deploy a Document Converter to MOSS

The document converter for SharePoint must be packaged as a SharePoint Feature and deployed at
the web application level.

If you need an overview of deploying document converters as SharePoint features, see the following
topics in MSDN:



- Document Converter Deployment
- Working with Features


A Feature in SharePoint is a unit of functionality that can be added to or removed from a SharePoint
server. A feature is defined in an XML file that describes the feature, its name, scope and required
files. The feature definition XML and accompanying files must be placed in a folder in the
C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES folder.

Each feature needs to have a Feature.xml file that specifies the feature name, unique id, scope and
the elements that comprise the feature.

Create a Folder for the Feature

Create the C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf
folder on the SharePoint server.

Create a Feature Definition XML File

In the feature folder, create the Feature.xml as shown below.

Content of the Feature.xml file.

<Feature xmlns="http://schemas.microsoft.com/sharepoint/"
         Id="{b4ce4c29-8aaf-4b80-bb63-d676e836f8ef}"
         Title="DOC to PDF Converter (by Aspose)"
         Description="Makes it possible to convert documents from DOC to PDF."
         Scope="WebApplication">

    <ElementManifests>
        <ElementManifest Location="Elements.xml"/>
        <ElementFile Location="AsposeDoc2Pdf.exe"/>
        <ElementFile Location="Aspose.Words.dll"/>
    </ElementManifests>

</Feature>

If you create more converters later on, pick a different GUID for each feature. The easiest way to
generate a unique GUID is to use the Tools / Create GUID menu in Visual Studio.
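
If the Visual Studio menu is not at hand, a GUID can also be produced in code. The following is a minimal sketch using the standard .NET Guid type; the "B" format specifier yields the braced, hyphenated form that the Id attributes in this article use. The GuidPrinter class name is hypothetical and only serves the illustration.

```csharp
using System;

public static class GuidPrinter
{
    // Returns a new GUID as 32 hex digits separated by hyphens and
    // enclosed in braces, matching the shape of the Id attribute values
    // in Feature.xml and Elements.xml.
    public static string NewBracedGuid()
    {
        return Guid.NewGuid().ToString("B");
    }

    public static void Main()
    {
        Console.WriteLine(NewBracedGuid());
    }
}
```

Run it once per feature and once per converter, and paste each value into the corresponding Id attribute.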

Create a Document Converter Definition XML File

The ElementManifest element in the Feature.xml file refers to the Elements.xml file. This file
contains the definition of the document converter. The definition of the document converter includes
a unique id, a display name, the name of the executable to launch and the extensions of the source
and destination content types.

In the feature folder, create the Elements.xml as shown below.

Content of the Elements.xml file.

<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
    <DocumentConverter Id="{a4df1dac-a22c-431a-bbf6-dcc91848fee9}"
                       Name="Word Document to PDF (by Aspose)"
                       App="AsposeDoc2Pdf.exe"
                       From="doc"
                       To="pdf"
    />
</Elements>

If you create more converters later on, pick a different GUID for each converter. The easiest way to
generate a unique GUID is to use the Tools / Create GUID menu in Visual Studio.
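
As an illustration of what a definition for a second converter might look like, here is a sketch of an Elements.xml for an RTF to DOCX converter. It reuses only the attributes shown in the definition above. The executable name AsposeRtf2Docx.exe and the display name are hypothetical, and the Id is a placeholder that must be replaced with a freshly generated GUID. I have not deployed this variant, so treat it as a starting point rather than a tested configuration.

```xml
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
    <!-- Placeholder Id: replace with a newly generated GUID. -->
    <!-- AsposeRtf2Docx.exe is a hypothetical converter built around ConvertRtfToDocx. -->
    <DocumentConverter Id="{00000000-0000-0000-0000-000000000000}"
                       Name="RTF Document to DOCX (by Aspose)"
                       App="AsposeRtf2Docx.exe"
                       From="rtf"
                       To="docx"
    />
</Elements>
```

The accompanying Feature.xml would need its own GUID and ElementFile entries for the new executable.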

AsposeDoc2Pdf is now deployed as a SharePoint Feature.


Enable Document Converters

You need to enable document conversions in SharePoint, as they seem to be disabled by default.

Go to the Central Administration / Application Management / Configure Document Conversion screen
and enable document conversions.

Enable document conversions in MOSS.


It is a good idea to check that the document conversion services are installed and running. In my
case they were installed and running.

Go to Central Administration / Operations / Services on Server and make sure that the
Document Conversions Launcher Service and the Document Conversions Load Balancer Service are
installed and running.

Check Document Conversion services are installed and running.


Install the Document Converter Feature

Now we need to install the feature so the document converter becomes available in SharePoint.
Execute the following command on the server:

"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE" -o installfeature -filename AsposeDoc2Pdf\Feature.xml -force

Activate the Document Converter Feature

Now we need to activate the document converter. Execute the following command on the server:

"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE" -o activatefeature -name AsposeDoc2Pdf -url http://win2k3r2ee

Note that you need to specify a URL in the -url argument. I have not fully figured out exactly what
URL must be specified; I just specified the name of my SharePoint server and it worked, making the
document converter available to all SharePoint sites on this server.

Now is a good time to verify that the feature was in fact installed and activated. In my case, I found
I still needed to click the Activate button in the SharePoint Central Administration / Application
Management / Manage Web Application Features window.

Make sure the new document converter feature is activated in the Manage Web Application
Features window.


Copy the Document Converter Files!

After installing and activating the document converter as a SharePoint Feature, I was expecting that
the conversions would just run. But the conversions did not run (nothing was happening) and I had
to examine the logs in the C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS
folder.

The error message I was getting was that the Document Conversion Launcher Service was
attempting to start my converter
C:\Program Files\Microsoft Office Servers\12.0\TransformApps\AsposeDoc2Pdf.exe, but there was no
such file in that directory.

Most likely, I did something wrong in the Feature Definition XML file, so the files of the
feature were not copied to the correct location. I could not find a solution, so I copied the
files manually.

Copy the following files from
C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf
to C:\Program Files\Microsoft Office Servers\12.0\TransformApps:

- AsposeDoc2Pdf.exe
- Aspose.Words.DLL


Make Sure the Converter is Enabled for the Site

Go to your SharePoint home page, click Site Actions, Site Settings, Modify All Site Settings, Site
Content Types, Document, Manage Document Conversion for This Content Type and make sure your
document converter is enabled.

Checking Document Converter is enabled for Documents on this SharePoint Site.


Test Your Conversion

Finally, we can test if the conversion works.

Upload a test DOC file to the server. I uploaded a document called “Distributable VHD Image
EULA.doc”.

Upload a DOC file to MOSS.


Click on your test file so the context menu opens, select Convert Document, Word Document to PDF
(by Aspose).

Selecting to convert a DOC document to PDF.


Click OK in the window confirming the request for conversion.

Note that the conversion is managed by the document conversion schedule and the document
conversion load balancer service, so it might not happen instantaneously. By default, the document
conversion starts every minute.

Refresh the page with the list of documents after several seconds until the converted
document appears in the list.

The document that is a result of the conversion appears in the list.


Click on the document to download it. It is a PDF document and will open Adobe Reader on your
machine.

Adobe Reader displays the PDF document downloaded from the MOSS site.


Just to verify that Aspose.Words did a great job at accurately converting DOC to PDF, open the
original DOC in Microsoft Word and compare with what you have in Adobe Reader.

The original DOC file opened in Microsoft Word to compare how well it was converted to PDF.


Troubleshooting

If the conversion does not work, check the MOSS log files, which are detailed. The log files are in
C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS.

Summary

In this article, I have shown how to use Aspose.Words to add the DOC2PDF conversion feature to
MOSS. I have also shown it is very easy to add many more types of conversions to SharePoint using
Aspose components.

If you feel that building your own document converter as shown in this article is too much for you
and you would prefer a finished product with a simple installer, let us know. We might package it as
a product eventually, say Aspose.Words for SharePoint.

If you are using Microsoft SQL Server Reporting Services, make sure to check our other great
product, Aspose.Words for Reporting Services, which makes it possible to generate true DOC,
DOCX, RTF and WordprocessingML reports in Microsoft SQL Server 2005 Reporting Services.

Please excuse any technical inaccuracies regarding SharePoint (if you find any) because my
experience with SharePoint before this article was nil and I had to spend some time grasping many
concepts that were new to me such as sites, web applications, document libraries and so on and
what they mean in the context of MOSS.

Any questions or comments are welcome in the Aspose.Words Forums.