Newsletter #5

Here is my fifth newsletter. Today I would like to address a topic that has come up several times in the WSH newsgroup: how can we process Adobe Acrobat PDF files with WSH scripts? My apologies for the long delay since the fourth newsletter. I wrote the Microsoft Windows 2000 Professional handbook for Microsoft Press Germany, and I undertook the adventure of writing two manuscripts about WSH for Microsoft Press (the German and the US branch). One 950-page manuscript updates my existing German WSH title. The other manuscript was destined for a 600-page WSH book published by Microsoft Press USA. Unfortunately I ran into a slight problem: after writing for a while, I counted the pages and ended up with 1200. So how could we fit all that material into one 600-page book? Microsoft Press USA decided that the material is good for two books. One teaches WSH script programming step by step. The other is more for the "hardcore" WSH scripters (I won't say the "masochists") interested in programming ActiveX controls, accessing databases with ADO, automating their Windows NT/Windows 2000 administration with ADSI, or exploring the new adventure called Windows Management Instrumentation (WMI). So everybody can imagine that I have been pretty busy during the last months. But I discovered a lot of new technologies, and I intend (besides the books mentioned above) to release this knowledge step by step here in the WSH Bazaar.

(c) Günter Born, http://www.borncity.de

Warning: The material submitted as a newsletter comes AS-IS without warranty of any kind. In no event shall the author be liable for damages or losses of any kind. The material discussed herein is copyrighted by the author. You are granted a free right to use it. This newsletter may be distributed freely as long as: a) the zip archive News5.zip remains intact and unmodified, b) a reference to the WSH Bazaar www.borncity.de is given and c) the newsletter is not shipped with other commercial material. If you would like to ship the newsletter on CD-ROMs etc., please ask. Permission will normally be granted, but I like to know where my material ends up.

Scripting with the Adobe Acrobat ActiveX control (PDF.OCX)

It has been asked several times in the WSH newsgroup: how can we use the Adobe ActiveX control PDF.OCX from scripts? Obviously it isn't trivial to use the control, because the questions posted in the newsgroup haven't received a useful response. Therefore I spent one rainy afternoon (after installing a new hard disk in my system, and while waiting for the new SuSE Linux 6.4 version) investigating the PDF.OCX control. Well, let's go into the details.

The control itself will be registered with the progID "Pdf.PdfCtrl.1" after installing Acrobat Reader 4.0 (www.adobe.com). So I thought it might be possible to address the control with the following VBScript code snippet:

Set oPDF = WScript.CreateObject("Pdf.PdfCtrl.1")
oPDF.src="C:\Test.pdf"

After creating a sample, I received the runtime error "Catastrophic failure" in the second line. Obviously there is something wrong with this approach. The control was designed to cooperate with an application that provides a valid window, and a WSH script doesn't come with such a window. Therefore any attempt to access the object referenced by oPDF fails. We see similar behavior with several other ActiveX controls, as I discussed in the WSH Tutorial and in my WSH books. In some cases we may use ugly tricks to invoke a control without having a window (for instance, using the desktop window). But these tricks won't work with Pdf.ocx.
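To see exactly where the windowless approach breaks, the two lines can be wrapped in error handling. The following is only a sketch: it assumes Acrobat Reader 4.0 has registered "Pdf.PdfCtrl.1", and the path C:\Test.pdf is the placeholder from the snippet above. "Catastrophic failure" is the standard message text for the COM error E_UNEXPECTED (0x8000FFFF), so that is the code one would expect to see reported for the second step.

```vbscript
Option Explicit
Dim oPDF

On Error Resume Next              ' trap runtime errors instead of aborting

' Step 1: create the control without a hosting window
Set oPDF = WScript.CreateObject("Pdf.PdfCtrl.1")
If Err.Number <> 0 Then
    WScript.Echo "CreateObject failed: 0x" & Hex(Err.Number) & " " & Err.Description
    WScript.Quit 1
End If

' Step 2: assign a document; this is the line reported to fail above
oPDF.src = "C:\Test.pdf"
If Err.Number <> 0 Then
    WScript.Echo "Setting src failed: 0x" & Hex(Err.Number) & " " & Err.Description
End If

On Error GoTo 0                   ' restore normal error handling
```

Reporting each step separately makes it clear whether the object creation or the property access is the part that needs a window.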

The solution is to provide a window that embeds the control, and Internet Explorer may be the right tool for that purpose. We can create an HTML document and insert the ActiveX control Pdf.ocx into that page. Using Microsoft FrontPage Express to create a blank page and inserting the control with the class ID:

{CA8A9780-280D-11CF-A24D-444553540000}

results in the following HTML code.

<html>
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
<title>Born's Adobe Acrobat PDF Viewer</title>
</head>

<body bgcolor="#FFFFFF">

<p><object id="Test" name="Test"
data="file:///C:/Test/Acrobat.pdf"
classid="clsid:CA8A9780-280D-11CF-A24D-444553540000"
align="middle" border="0" width="491" height="296">PDF</object></p>
</body>
</html>

The code looks good at first glance: we have an <object> tag addressing the class ID. Unfortunately, after loading the HTML document, the browser shows an empty page. The problem with many ActiveX controls is that the author needs to know which interface must be used to invoke the control. Although there is a data attribute pointing to the source file, the control isn't able to read the file. After browsing my Adobe Acrobat Distiller online documents, I found a short note that the <object> tag may be used to insert the control into a page. The note shows a code line indicating that the control looks for a parameter "src" that defines the path to the PDF file. Therefore I amended the code created by Microsoft FrontPage Express as shown below:

<html>
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
<title>Born's Adobe Acrobat PDF Viewer</title>
</head>

<body bgcolor="#FFFFFF">

<p><object id="Pdf1"
classid="clsid:CA8A9780-280D-11CF-A24D-444553540000"
align="baseline" border="0" width="346" height="290"><param
name="SRC" value="Acrobat.pdf"></object></p>
</body>
</html>

The code uses the same attributes (classid, border, width, height and align) to set the size of the control's window, hide the border and align the control to the baseline. What is new is the <param> tag, which has the name attribute "SRC" and sets the value to a valid file name. Loading this HTML document creates a page containing a frame that displays the PDF document addressed in the value attribute (see the figure below).

Figure: An HTML page with the Adobe Acrobat PDF document.

Well, with the code above, it is just a small step to using the Acrobat control from WSH scripts. All we need to do is fire up Internet Explorer from a script, load an HTML document that embeds the ActiveX control, and access the properties of that control from the script. And here is the code to create such an HTML document on the fly from a WSH script:

Sub MakeHTMLPage (file)
    ' Launch Internet Explorer, prepare a
    ' page and insert Acrobat control, load a file.
    Dim html
    ' define HTML code (the leading blanks separate the attributes)
    html = "<html><head><title>Born's Adobe Acrobat Viewer</title></head>" & _
           "<body bgcolor='white'>" & _
           "<object id='A1'" & _
           " classid='clsid:CA8A9780-280D-11CF-A24D-444553540000'" & _
           " border='0' width='550' height='450'>" & _
           "<param name='SRC' value='" & file & "'>" & _
           "</object></body></html>"

    ' *** launch Internet Explorer ***
    Set oIE = WScript.CreateObject("InternetExplorer.Application")
    oIE.top = 10
    oIE.left = 10
    oIE.width = 600
    oIE.height = 520
    oIE.menubar = 0                ' no menu
    oIE.toolbar = 0
    oIE.statusbar = 0
    oIE.navigate "about:" & html   ' Form
    oIE.visible = 1                ' keep visible

    ' Important: wait till MSIE is ready
    Do While (oIE.Busy): Loop
End Sub

I use a simple procedure to handle all the steps. The procedure creates an instance of Internet Explorer, sets the window properties and inserts the HTML code into the document window. Details about this technique are discussed in my WSH books. There is only one important issue I need to mention here: I set the id attribute of the <object> tag to "A1". This enables us later on to access the control using the object name A1.
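The one-line wait loop in MakeHTMLPage spins at full CPU speed and never gives up if the browser hangs. A gentler variant is sketched below. The function name WaitForIE and its parameters are hypothetical, and the sketch assumes WSH 2.0 or later, because it relies on WScript.Sleep. ReadyState value 4 means the document has finished loading.

```vbscript
' Hypothetical helper: wait until the browser has finished loading,
' but give up after maxSeconds. Returns True on success.
Function WaitForIE(oBrowser, maxSeconds)
    Const READYSTATE_COMPLETE = 4
    Dim elapsed                   ' milliseconds waited so far
    elapsed = 0
    Do While oBrowser.Busy Or oBrowser.ReadyState <> READYSTATE_COMPLETE
        If elapsed >= maxSeconds * 1000 Then
            WaitForIE = False     ' timed out
            Exit Function
        End If
        WScript.Sleep 100         ' yield the CPU for 100 ms
        elapsed = elapsed + 100
    Loop
    WaitForIE = True
End Function
```

Inside MakeHTMLPage, a call such as `If Not WaitForIE(oIE, 30) Then WScript.Quit 1` could stand in for the busy loop; the 30-second limit is an arbitrary choice.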

With a little help from my friends - or when methods come in handy

After creating the HTML page from the script, it comes in handy if the script can manipulate the PDF document: for instance, browse through the pages, change the zoom value or print some pages. The ActiveX control provides a collection of methods and a few properties. The following table gives an overview of some methods and properties of this control.

Name                    Type      Remarks
width                   property  Width of the control
height                  property  Height of the control
src                     property  Path and name of the document currently loaded. If this property is written, the ActiveX control will load the new file.
gotoFirstPage           method    Shows the first page of a PDF document; takes no parameters.
gotoNextPage            method    Shows the next page of a PDF document; takes no parameters.
gotoLastPage            method    Shows the last page of a PDF document; takes no parameters.
setCurrentPage number   method    Shows the page submitted as a parameter.
setShowToolbar bool     method    Shows or hides the toolbar; the parameter must be set to true or false.
setShowScrollbar bool   method    Shows or hides the scrollbars; the parameter must be set to true or false.
setZoom percent         method    Sets the zoom factor; the parameter is in percent.
printPages from, to     method    Allows a silent print; the parameters define the page interval (from page, to page).
print                   method    Prints all pages; takes no parameters; invokes a print dialog box in Windows 2000.
printWithDialog         method    Invokes the Print dialog box in all Windows versions.

Note: I used the object browser provided by several Microsoft applications to inspect the object and find out more about the methods. Details about these techniques may be obtained from the WSH Tutorial; see for instance chapter 2 (freely downloadable from the WSH Bazaar as a sneak preview). Besides the methods and properties mentioned in the table above, there are other entries that will not be discussed here.

Warning: The dialogs shown may be hidden in the background if the ActiveX control refreshes the HTML page. Therefore I used the VBScript MsgBox function with the buttons parameter set to 4096 (vbSystemModal, which keeps the message box on top). Some Windows dialogs (like Print) may also end up in the background. Because the WSH script doesn't have a window, none of those dialogs has a button in the taskbar. This can be a bit confusing for the user if he or she can't switch to the dialog. The dialogs become visible when the browser window is minimized.

Like "puppets on a string" - control the PDF document from a script

After invoking the browser, loading the HTML document and embedding the ActiveX control, we can load any PDF document, change how it is displayed and control the pages viewed. And all those actions may be done from a WSH script. Let's have a look at the following code snippet:

MakeHTMLPage GetPath() & "Intro.pdf"

' Try to access PDF-Control (object named as A1)
' First we create an object variable with a reference
Set oDoc = oIE.Document.All.A1

' now show a few properties of the control
MsgBox "Width " & oDoc.width & vbCrLf & _
"Height " & oDoc.height & vbCrLf & _
"Source " & oDoc.src, _
vbOkOnly, title

' now we are going to change the size of the control
MsgBox "Change size & load new document", 4096
oDoc.width=520 ' set new size

The first line uses the procedure to create the HTML page with the ActiveX control and loads the document Intro.pdf, which must be located in the script's folder (the function GetPath() returns the folder containing the script). The code uses a global object variable oIE that holds a reference to the browser object. Now we can build the path to the document object and then to the HTML tags within the document. This object hierarchy is addressed by oIE.Document.All, and if we append .A1 to that hierarchy, we get a reference to the ActiveX control. The line shown below assigns a reference to the ActiveX control to the object variable oDoc:

Set oDoc = oIE.Document.All.A1
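If the page has not loaded, or the control is not installed, this assignment raises a runtime error and the script stops. A defensive variant might look like the sketch below. It assumes the global oIE set up by MakeHTMLPage and the element id "A1" used in this newsletter; the message text is made up for illustration.

```vbscript
' Sketch: obtain the control defensively.
Dim oDoc                          ' omit if oDoc is already declared

On Error Resume Next              ' a missing element raises a runtime error
Set oDoc = oIE.Document.All.A1
If Err.Number <> 0 Or Not IsObject(oDoc) Then
    MsgBox "PDF control A1 not found: " & Err.Description, vbSystemModal
    WScript.Quit 1
End If
On Error GoTo 0                   ' restore normal error handling
```

IsObject is used instead of an Is Nothing test, because the variable stays Empty when the assignment fails, and comparing Empty with Nothing would itself raise an error.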

After obtaining this reference, we can use this variable to access the object's properties and methods. The statement:

oDoc.width=520

sets the width property of the ActiveX control. Use the following line to load a new document:

oDoc.src = GetPath() & "WSHCh03.pdf"

And we can use the following loop to browse through subsequent pages of the PDF document using the gotoNextPage method.

For i = 1 to 10 ' step through 10 pages
    MsgBox "Goto next Page " & i, 4096, title
    oDoc.gotoNextPage
Next

Easy, isn't it? Below is a sample code listing demonstrating how to invoke the HTML page and how to access the methods and properties of the control.

'************************************************
' File: Acrobat.vbs (WSH sample in VBScript)
' Author: (c) Günter Born
'
' Demonstrates how to use the Adobe Acrobat Reader
' ActiveX control to manipulate Acrobat PDF files.
' Uses Internet Explorer 4.0/5.0 as front end.
'************************************************
Option Explicit

Const title = "Born's Adobe Acrobat Viewer"

' define object variable for reference to Internet Explorer
Dim oIE ' important, variable oIE must be global
Dim oDoc ' reference to the embedded PDF control
Dim i, txt

' launch IE, create a HTML page, insert Acrobat Control,
' and load a PDF-document within the control.
MakeHTMLPage GetPath() & "Intro.pdf"

' Try to access PDF-Control (object named as A1)
' First we create an object variable with a reference
Set oDoc = oIE.Document.All.A1

' now show a few properties of the control
MsgBox "Width " & oDoc.width & vbCrLf & _
"Height " & oDoc.height & vbCrLf & _
"Source " & oDoc.src, _
vbOkOnly, title

' now we are going to change the size of the control
MsgBox "Change size & load new document", 4096
oDoc.width=520 ' set new size
oDoc.height=470
oDoc.src = GetPath() & "WSHCh03.pdf" ' load new document

MsgBox "Document: " & oDoc.src & " loaded", 4096, title

For i = 1 to 10 ' step through 10 pages
    MsgBox "Goto next Page " & i, 4096, title
    oDoc.gotoNextPage
Next

' hide toolbar
If MsgBox ("Hide Toolbar?", vbYesNo, title) _
    = vbYes Then ' user wants the toolbar hidden
    oDoc.setShowToolbar false
End If

MsgBox "Goto last Page", 4096, title ' goto last page
oDoc.gotoLastPage

MsgBox "Goto Page 11", 4096, title ' goto page
oDoc.setCurrentPage 11

MsgBox "Goto first Page", 4096, title ' goto first page
oDoc.gotoFirstPage

MsgBox "Set Zoom 80%", 4096, title ' set zoom in percent
oDoc.setZoom 80

MsgBox "Set Zoom 70%", 4096, title ' set zoom in percent
oDoc.setZoom 70

If MsgBox ("Print Pages 1 to 2", vbYesNo, title) _
    = vbYes Then ' try to print
    oDoc.printPages 1,2
ElseIf MsgBox ("Print all Pages", vbYesNo, title) _
    = vbYes Then ' try to print
    oDoc.print
ElseIf MsgBox ("Print With", vbYesNo, title) _
    = vbYes Then ' try to print
    oDoc.printWithDialog
End If

MsgBox "Ready", 4096, title

oIE.Quit ' close Internet Explorer
Set oIE = Nothing ' reset object variable

WScript.Quit ' Ready

'### Helper ####
Sub MakeHTMLPage (file)
    ' Launch Internet Explorer, prepare a
    ' page and insert Acrobat control, load a file.
    Dim html
    ' define HTML code (the leading blanks separate the attributes)
    html = "<html><head><title>Born's Adobe Acrobat Viewer</title></head>" & _
           "<body bgcolor='white'>" & _
           "<object id='A1'" & _
           " classid='clsid:CA8A9780-280D-11CF-A24D-444553540000'" & _
           " border='0' width='550' height='450'>" & _
           "<param name='SRC' value='" & file & "'>" & _
           "</object></body></html>"

    ' *** launch Internet Explorer ***
    Set oIE = WScript.CreateObject("InternetExplorer.Application")
    oIE.top = 10
    oIE.left = 10
    oIE.width = 600
    oIE.height = 520
    oIE.menubar = 0                ' no menu
    oIE.toolbar = 0
    oIE.statusbar = 0
    oIE.navigate "about:" & html   ' Form
    oIE.visible = 1                ' keep visible

    ' Important: wait till MSIE is ready
    Do While (oIE.Busy): Loop
End Sub

Function GetPath()
    ' Retrieve the script path
    Dim path
    path = WScript.ScriptFullName  ' Script name
    GetPath = Left(path, InstrRev(path, "\"))
End Function
' End

The script enables you to print PDF files page by page or in intervals. And here is the ZIP archive with the sample files: News5.zip.
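For printing in intervals, the silent printPages call can be wrapped in a small helper that checks the page range first. This is a sketch only: the name PrintRange and its parameters are hypothetical, and oCtrl is assumed to be the control reference obtained via oIE.Document.All.A1 as in the listing.

```vbscript
' Hypothetical helper: silently print an inclusive page range.
Sub PrintRange(oCtrl, firstPage, lastPage)
    ' Reject ranges that cannot be valid before touching the control
    If firstPage < 1 Or lastPage < firstPage Then
        MsgBox "Invalid page range: " & firstPage & " to " & lastPage, _
               vbSystemModal + vbExclamation, "PrintRange"
        Exit Sub
    End If

    On Error Resume Next          ' report a failure instead of aborting
    oCtrl.printPages firstPage, lastPage
    If Err.Number <> 0 Then
        MsgBox "Printing failed: " & Err.Description, _
               vbSystemModal + vbExclamation, "PrintRange"
    End If
    On Error GoTo 0               ' restore normal error handling
End Sub

' Usage, with oDoc obtained as in the listing:
' PrintRange oDoc, 1, 2
```

The helper does not check the upper bound against the document length, because the table above lists no page-count property.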

Tip: The archive also contains chapter 3 of the WSH Tutorial in PDF format. I used this file for demonstration purposes. Chapter 3 is good for novices, because it gives a brief introduction to VBScript.

OK, that's all for this time. I will continue to present new technologies for WSH scripters soon. There is a lot of hot stuff on my hard disk (I just added a new 15 GB drive to keep all those nice things along with Win 2000, and still have some space for Linux). And finally, I guess all that stuff will be brought to you soon by Microsoft Press as printed WSH titles ...

Further samples and details may also be found in my WSH Bazaar. Have a look at the sample page and at the WSHExtend Programmers Reference.

Other topics I mentioned in my last newsletters (like shell access) are already described in my WSH Bazaar as ordinary samples. So check out the WSH Bazaar.

Planned topics for newsletter #6 and future newsletters:

If I have time, I will introduce ADO, ADSI and WMI stuff. Enjoy scripting, till the next newsletter arrives...

... and all registered users of my WSH Tutorial will receive the first update dealing with ADO, ADSI and WMI within the next days.


(c) G. Born, 30 - March 2000 - www.borncity.de