Introduction
By default, SharePoint 2007 Search indexes only the metadata of a PDF document. Once a PDF IFilter is installed and configured, Search also indexes the contents of the PDF document, so users can find documents based on the text inside them. This is called full-text indexing.
[Indexing Server]: the server(s) in the SharePoint Farm that has/have the "Indexing" Role assigned. In a small farm this can be a single server for all roles.
[Web Front End Server]: the server(s) in the SharePoint Farm that has/have the "Web Front End" Role assigned. In a small farm this can be a single server for all roles.
Windows SharePoint Services 3.0
[Indexing Server]
- Install the PDF IFilter (see below for a list of available IFilters)
- Add the .pdf file type to the index list:
- Open the Registry Editor (Start > Run > regedit)
- Go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\\Gather\Search\Extensions\ExtensionList
- Add a new String Value
- Value name:
- Value data: pdf
- [This step only applies to 64-bit servers]
- Go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
- Change the (Default) key value
- Old value: {4C904448-74A9-11D0-AF6E-00C04FD8DC02}
- New value (Foxit x64 PDF IFilter): {987F8D1A-26E6-4554-B007-6B20E2680632}
- New value (Adobe x64 PDF IFilter): {E8978DA6-047F-4E3D-9C78-CDBE46041603}
- Perform an iisreset
- Perform a Full Update on the Search content indexes
- Open a Command Prompt on the Indexing Server
- net stop spsearch
- net start spsearch
- cd "C:\Program Files\Common Files\Microsoft Shared\Web server extensions\12\BIN"
- stsadm.exe -o spsearch -action fullcrawlstop
- stsadm.exe -o spsearch -action fullcrawlstart
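The restart-and-recrawl steps above could be scripted along these lines. This is only a sketch: the function names `wss_full_crawl_commands` and `run_all` are made up for illustration, the stsadm path is the default install location quoted above, and the script defaults to a dry run that only prints the commands. Note that the `-o` switch uses a plain ASCII hyphen; a typographic dash pasted from a web page will not be recognized by stsadm.

```python
import subprocess

# Default location of stsadm.exe as given in the steps above (adjust if installed elsewhere).
STSADM_DIR = r"C:\Program Files\Common Files\Microsoft Shared\Web server extensions\12\BIN"


def wss_full_crawl_commands(stsadm_dir=STSADM_DIR):
    """Return the WSS 3.0 commands, in order, as argument lists."""
    stsadm = stsadm_dir + "\\stsadm.exe"
    return [
        ["net", "stop", "spsearch"],
        ["net", "start", "spsearch"],
        [stsadm, "-o", "spsearch", "-action", "fullcrawlstop"],
        [stsadm, "-o", "spsearch", "-action", "fullcrawlstart"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)


# Dry run: shows what would be executed on the Indexing Server.
run_all(wss_full_crawl_commands())
```

Calling `run_all(wss_full_crawl_commands(), dry_run=False)` from an elevated prompt on the Indexing Server would execute the commands for real.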
[Web Front End Server]
- Copy the ICPDF.GIF file to "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Images"
- Edit the file C:\Program Files\Common Files\Microsoft Shared\Web server extensions\12\Template\Xml\DOCICON.XML
- Add an entry for the .pdf extension
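The DOCICON.XML edit can also be done programmatically. The sketch below assumes the file has the usual layout, with `<Mapping Key="..." Value="..."/>` entries inside a `<ByExtension>` element, and that the icon copied in the previous step is referenced as `icpdf.gif`. The function name `add_pdf_mapping` and the shortened sample document are illustrative only. Back up the real file before changing it.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml, icon_file="icpdf.gif"):
    """Return DOCICON.XML content with a pdf -> icon mapping under <ByExtension>."""
    root = ET.fromstring(docicon_xml)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> element found")
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            # An entry already exists: just point it at the icon file.
            mapping.set("Value", icon_file)
            break
    else:
        # No pdf entry yet: append one.
        ET.SubElement(by_ext, "Mapping", {"Key": "pdf", "Value": icon_file})
    return ET.tostring(root, encoding="unicode")


# Shortened stand-in for the real file (assumed structure).
SAMPLE = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif"/>
  </ByExtension>
</DocIcons>"""

updated = add_pdf_mapping(SAMPLE)
print(updated)
```

The function is idempotent: running it a second time updates the existing entry instead of adding a duplicate.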
Microsoft Office SharePoint Server 2007
[Indexing Server]
- Install the PDF IFilter (see below for a list of available IFilters)
- Add the .pdf file type to the index list:
- Go to Central Administration, then to the Shared Services Administration Web of the current SSP, go to Search Settings and then to File Types
- Add a new file type pdf
- [This step only applies to 64-bit servers]
- Go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
- Change the (Default) key value
- Old value: {4C904448-74A9-11D0-AF6E-00C04FD8DC02}
- New value (Foxit x64 PDF IFilter): {987F8D1A-26E6-4554-B007-6B20E2680632}
- New value (Adobe x64 PDF IFilter): {E8978DA6-047F-4E3D-9C78-CDBE46041603}
- Perform an iisreset
- Perform a Full Update on the Search content indexes
- Open a Command Prompt on the Indexing Server
- net stop osearch
- net start osearch
- Go to Central Administration, then to the Shared Services Administration Web of the current SSP, go to Search Settings and start a full crawl of all locations containing PDF files
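For 64-bit servers, the (Default) value change can be prepared as a .reg file instead of being typed into the Registry Editor. The sketch below only renders the file text for the MOSS 2007 key above or its WSS 3.0 counterpart. The key paths and CLSIDs are the ones listed in this article; the names `FILTER_KEYS`, `IFILTER_CLSIDS` and `render_reg_file` are made up for illustration. Review the output before importing it, and export the existing key first so the old value can be restored.

```python
# Registry keys for the .pdf filter, as listed in this article.
FILTER_KEYS = {
    "wss": r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions"
           r"\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf",
    "moss": r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
            r"\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf",
}

# CLSIDs of the x64 PDF IFilters, as listed in this article.
IFILTER_CLSIDS = {
    "foxit": "{987F8D1A-26E6-4554-B007-6B20E2680632}",
    "adobe": "{E8978DA6-047F-4E3D-9C78-CDBE46041603}",
}


def render_reg_file(product, vendor):
    """Return .reg file text that sets the (Default) value of the .pdf filter key."""
    key = FILTER_KEYS[product]
    clsid = IFILTER_CLSIDS[vendor]
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'@="{clsid}"',  # "@" denotes the (Default) value in .reg syntax
        "",
    ]
    return "\r\n".join(lines)


print(render_reg_file("moss", "foxit"))
```

An iisreset and a restart of the search service are still needed afterwards, as in the manual procedure.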
[Web Front End Server]
- Copy the ICPDF.GIF file to "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Images"
- Edit the file C:\Program Files\Common Files\Microsoft Shared\Web server extensions\12\Template\Xml\DOCICON.XML
- Add an entry for the .pdf extension
Available IFilters
Adobe PDF IFilter 6.0 - x64
- free (always good!)
- 32 bit and 64 bit (64 bit released recently, applies to the [Indexing Server])
Foxit PDF IFilter v1.0
- free for desktops, servers require a license
- 32 bit and 64 bit (IA64 currently being tested, applies to the [Indexing Server])
Conclusion
With the above procedure for either WSS 3.0 or MOSS 2007, SharePoint Search can index the contents of your PDF documents.