
oscar
I would like to search a folder with .pdf and .docx documents. In particular, I want to find all documents that have a specific word or phrase. While this seems to work for .pdf files, it does not always work for .docx files.

For example, I search for the term 'kotahitanga' but the results do not include a file '1993 - Cox - Kotahitanga.docx' that I know contains that term.

Do you know how I can resolve this problem? Perhaps it is a general problem with Spotlight?

Thank you.
 
houdah
Hi!

HoudahSpot searches the Spotlight index. Spotlight relies on importer plug-ins to extract metadata and text content from files. For proprietary file formats it needs third-party plug-ins, which are usually installed with the application that creates these files.

You can use the following procedure to check what data the plug-in extracts from a file.

1. Open /Applications/Utilities/Terminal.app
2. Type the following, without the quotes but with the trailing space: "mdimport -d4 "
3. Drag a file from Finder into the Terminal window. This appends the file path to the above command
4. Hit Enter

The mdimport command will index the file. It will also output a lot of Spotlight debugging information. Towards the start of the output (scroll up) it will tell you which Spotlight importer plug-in it chose to process the file. On OS X Yosemite, docx files are handled by the Apple-provided RichText.mdimporter.

Towards the end of the output you will see what the plug-in reported for kMDItemTextContent. This is the text that Spotlight indexes and that you can search for.
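
If scrolling through the debug output is tedious, the check can be scripted. The following is only a rough sketch: the helper names output_mentions_term and check_extracted_text are made up for illustration, and it assumes the extracted text shows up in the -d4 output as described above.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the term.
# -i ignores case, -F treats the term as plain text, -q prints nothing.
output_mentions_term() {
    grep -qiF -- "$1"
}

# Hypothetical helper: runs mdimport -d4 on one file and checks whether the
# debug output mentions the term. 2>&1 merges both output streams so the
# check does not depend on which one mdimport writes to.
check_extracted_text() {
    file=$1
    term=$2
    mdimport -d4 "$file" 2>&1 | output_mentions_term "$term"
}
```

On the Mac you could then run, for example: check_extracted_text "/path/to/1993 - Cox - Kotahitanga.docx" kotahitanga && echo "term was extracted"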

The above procedure imports the file you have selected. If the file then appears in HoudahSpot search results, the plug-in works and the problem lies with your Spotlight index. Try re-indexing your drive.
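
Before rebuilding the whole index, it may be worth re-importing just the affected folder. This is a sketch under assumptions: reimport_and_search is a made-up helper name, it assumes mdimport accepts a folder path as its man page describes, and results may lag behind because indexing can take a moment.

```shell
# Hypothetical helper: re-import one folder, then search only that folder.
reimport_and_search() {
    folder=$1
    term=$2
    if [ ! -d "$folder" ]; then
        echo "not a folder: $folder" >&2
        return 1
    fi
    # Ask Spotlight to import the files in the folder.
    mdimport "$folder" || return 1
    # List files below the folder whose indexed data matches the term.
    mdfind -onlyin "$folder" "$term"
}

# To rebuild the index for the whole startup volume instead (slow):
#   sudo mdutil -E /
```

Example: reimport_and_search "$HOME/Dropbox/Library" kotahitanga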


Best,

Pierre Bernard
Houdah Software s.à r.l.

https://www.houdah.com

HoudahGeo: One-stop photo geocoding
HoudahSpot: Advanced file search utility
Tembo: Easy and effective file search
oscar
Thank you.

Unfortunately, this did not work: I got a 'permission denied' message. Here is one example, though I tried several different files, including a .docx file in my Desktop folder.


 

732L-114016-M:~ ckirkby$ mdimport -d4
Usage: mdimport [OPTION] path
-d debugLevel Integer between 1-4
-g plugin     Import files using the listed plugin, rather than the system installed plugins.
-p            Print out performance information gathered during the run
-A            Print out the list of all of the attributes and exit
-X            Print out the schema file and exit
-L            Print out the List of plugins that we are going to use and exit
-r            Ask the server to reimport files for UTIs claimed by the listed plugin.
-n            Don't send the imported attributes to the data store.
-o path       Write the imported attributes to a file, instead of sending them to the server.
732L-114016-M:~ ckirkby$ /Users/ckirkby/Dropbox/Library/2012\ -\ Ballantyne\ -\ Webs\ of\ Empire.docx
-bash: /Users/ckirkby/Dropbox/Library/2012 - Ballantyne - Webs of Empire.docx: Permission denied

 
houdah
Hi!

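You seem to have hit Return or Enter after typing "mdimport -d4 ". This ran an incomplete command, so mdimport only printed its usage text. The file path you dragged in afterwards was then run as a command of its own, which is why the shell answered 'Permission denied'.

You want to create a complete command that includes the path of your file before hitting Return.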

Example (one line):

mdimport -d4 /Users/ckirkby/Dropbox/Library/2012\ -\ Ballantyne\ -\ Webs\ of\ Empire.docx
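
The backslashes come from dragging the file into Terminal: each one escapes a space so that the shell passes the whole path as a single argument. If you type the path by hand, double quotes do the same job. A small sketch to show that both spellings give the same path (the variable names are just for illustration):

```shell
# Path as Terminal writes it when a file is dragged in: spaces escaped.
escaped=/Users/ckirkby/Dropbox/Library/2012\ -\ Ballantyne\ -\ Webs\ of\ Empire.docx

# The same path typed by hand inside double quotes.
quoted="/Users/ckirkby/Dropbox/Library/2012 - Ballantyne - Webs of Empire.docx"

# Both variables now hold the identical string.
if [ "$escaped" = "$quoted" ]; then
    echo "same path"
fi

# The complete command could therefore also be written as:
#   mdimport -d4 "$quoted"
```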


Best,

Pierre Bernard
Houdah Software s.à r.l.

oscar
Thanks. It worked. But it seems I might have to re-index this folder, since Spotlight still cannot 'see' many files that I know contain certain keywords or phrases.

Thanks again.