Houdah Software Forums
Rumboogy

Posts: 8
 #1 
There is an old problem with Spotlight (and hence HoudahSpot) and Word .docx files.  I have seen it touched on here in the forums but I could not find any post that fully addresses it so I am creating such a post. 

The issue is simple.  When you open a .docx file, the file is removed from Spotlight's index and thus from HoudahSpot's results.  The only way to get the file back into Spotlight/HoudahSpot's results is to force a re-indexing of the file using the Terminal command mdimport -d1 WordFile.docx (where WordFile.docx is the file that has gone missing).  I have a crontab job that is fired off once an hour to do this, but this is really a hack and does not fully solve the problem.  I have discussed this in detail on the Apple forum (https://discussions.apple.com/thread/7368240) but no one really has a definitive solution. 
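
For anyone who wants to reproduce the hourly workaround, here is a minimal sketch of what such a job could look like. It is not the exact script described above: the function name reindex_docx, the MDIMPORT override variable and the example folder are assumptions made for illustration. It simply hands every .docx file under a folder back to mdimport -d1, skipping Word's temporary "~$" lock files.

```shell
#!/bin/sh
# Sketch of an hourly re-index job (illustrative; the names below are assumptions).

# Re-import every .docx file under the given folder.
# MDIMPORT is a hypothetical override so the function can be dry-run
# with another command (e.g. MDIMPORT=echo); it defaults to mdimport.
reindex_docx() {
    folder="$1"
    importer="${MDIMPORT:-mdimport}"
    # Skip Word's "~$name.docx" lock files, then pass the remaining files
    # to the importer in batches with the same -d1 flag used in this thread.
    find "$folder" -type f -name '*.docx' ! -name '~$*' \
        -exec "$importer" -d1 {} +
}

# Example call (commented out so that sourcing this file has no side effects):
# reindex_docx "$HOME/Documents"
```

Saved as a script with the last line uncommented, it could be scheduled with a crontab entry such as 0 * * * * /bin/sh /path/to/reindex-docx.sh (a placeholder path). This only papers over the problem: a file opened just after the job runs can stay missing for up to an hour.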

So any thoughts on workarounds to solve this?  I realize that it is not a HoudahSpot issue since it is in the indexing portion of Spotlight but I am hoping you have some clever workaround to solve Microsoft/Apple's deficiency.

I am using Office for Mac 2011 and all my software is up to date.
houdah

Moderator
Posts: 2,906
 #2 
Hi!

I have seen Spotlight re-index files at the moment they are opened. I assume what you are seeing is Spotlight trying and failing to re-index.

The state of Word importers is rather convoluted. Older versions of Office installed an importer plug-in in /Library/Spotlight. Newer versions include the importer plug-in within the .app bundle. Last I checked, the importer plug-ins shipped by Microsoft were 32-bit, so mdimport ignores them. There is a separate mdimport32 command that calls 32-bit importers. And then there are some Word files that are processed by Apple’s RichText importer.

My thinking is that the problem could be that Spotlight picks the wrong importer. Or may sometimes pick the right one and sometimes the wrong one. Your mdimport call may pick yet another one because it cannot call 32-bit importers. Try mdimport -d1 WordFile.docx and mdimport32 -d1 WordFile.docx to see which importers get used and if one makes the file appear / disappear from search results.

Thus part of the solution could be to remove redundant importers, i.e. the ones Microsoft installed in /Library/Spotlight as well as older versions of Microsoft Word.


Best,

Pierre Bernard
Houdah Software s.à r.l.





__________________
Houdah Software s. à r. l.
https://www.houdah.com

HoudahGeo: One-stop photo geocoding
HoudahSpot: Advanced file search utility
Tembo: Easy and effective file search
gilby

Posts: 10
 #3 

I think something has got worse recently.

For me (on my desktop Mac) any docx files modified since April this year were not being indexed.  Here is what I have discovered (this content is largely from my post in the discussion referenced by Rumboogy): 

1.  Microsoft Office.mdimporter is supplied by Apple (not by Microsoft as part of Office, as I had assumed) and is regularly changed even though the version number remains at 12.3.0 and the bundle's creation date remains at 27 June 2011.  The contents of the bundle do change.   New versions are certainly included in the 10.12.4 and 10.12.5 updates (I have not checked older updates).   And this Apple version is the one in /Library/Spotlight - it does not come (directly) from Microsoft.

2. If you delete the importer that may partially fix indexing of .docx files, but means that Excel files are no longer indexed! I believe this is not a good solution.

3. I have the problem (non indexing of docx) on my desktop Mac, but not on my MacBook.  Both are running 10.12.5 and Office 15.34.   I have copied Microsoft Office.mdimporter from the MacBook to the desktop.  And the problem is fixed!!  All new/modified documents are being indexed correctly.   As far as I can see the two importers are identical. 

4. Doing mdimport -d1 <path to docx/xlsx file> in Terminal still gives an error about 'wrong architecture'.  This suggests there is some 32/64 bit issue.   But automatic indexing is working. 

5. My solution is to replace Microsoft Office.mdimporter with a copy from a Mac which is working correctly. Keeping my fingers crossed that it continues to work for a while.  It might work for you if you find another Mac working correctly.

6. I don't understand this behaviour! 

houdah

Moderator
Posts: 2,906
 #4 
I believe Spotlight takes the modification date of the importer into account when picking the importer to use. It will probably prefer the newest one.

Thus replacing the Microsoft Office.mdimporter with an identical copy may cause Spotlight to start using it because it got today’s date.
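
If the modification date does play a role (this is a guess, not documented behaviour as far as this thread goes), it helps to see the dates side by side. A small sketch; the function name importers_newest_first is invented for this example, and it only looks at the date of each .mdimporter bundle itself, not at the files inside it.

```shell
# List the *.mdimporter bundles in one directory, most recently modified first.
# -d lists the bundle directories themselves; -t sorts by modification time.
importers_newest_first() {
    ls -dt -- "$1"/*.mdimporter
}
```

It could be run once per location, e.g. importers_newest_first /Library/Spotlight and importers_newest_first /System/Library/Spotlight. Using ls -ldt instead would also print the dates.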

Rumboogy

Posts: 8
 #5 
This is in response to entry #2 from houdah.

I am using Office for Mac 2011.  The file "/Library/Spotlight/Microsoft Office.mdimporter" exists.  There are no .mdimporter files inside the "Microsoft Word.app" package.  So I guess 2011 is one of what you would call the "Older versions of Office". 

I tried forcing indexing using the mdimport/mdimport32 commands you suggested.

For mdimport I got this result:
AAW@Mac15:~/Desktop/TEST$ mdimport -d1 *.docx
2017-06-12 13:17:35.271 mdimport[7960:1067954] Imported '/Users/AAW/Desktop/TEST/test.docx' of type 'org.openxmlformats.wordprocessingml.document' with plugIn /System/Library/Spotlight/RichText.mdimporter.

For mdimport32 I got this result:
2017-06-12 13:17:45.487 mdimport32[7963:1068075] Error loading /System/Library/Spotlight/RichText.mdimporter/Contents/MacOS/RichText:  dlopen(/System/Library/Spotlight/RichText.mdimporter/Contents/MacOS/RichText, 262): no suitable image found.  Did find:
    /System/Library/Spotlight/RichText.mdimporter/Contents/MacOS/RichText: mach-o, but wrong architecture
    /System/Library/Spotlight/RichText.mdimporter/Contents/MacOS/RichText: mach-o, but wrong architecture
2017-06-12 13:17:45.487 mdimport32[7963:1068075] Cannot find function pointer RichTextSnifferPluginFactory for factory 502B7F32-60DD-11D8-87A4-000393CC3466 in CFBundle/CFPlugIn 0x7c863350 </System/Library/Spotlight/RichText.mdimporter> (bundle, not loaded)
2017-06-12 13:17:45.487 mdimport32[7963:1068075] Imported '/Users/AAW/Desktop/TEST/test.docx' of type 'org.openxmlformats.wordprocessingml.document' with plugIn /System/Library/Spotlight/RichText.mdimporter.

From this it seems that it is not using the Microsoft .mdimporter file at all but rather the "/System/Library/Spotlight/RichText.mdimporter" file.  
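
To compare the two commands without reading the full debug output, the importer can be pulled out of the "Imported ... with plugIn ..." line. A minimal sketch, assuming the log format is the one shown above (it may differ on other macOS versions); the function names importer_from_log and which_importer are made up for this example.

```shell
# Read debug output on stdin and print the importer path named in any
# "Imported '...' of type '...' with plugIn <path>." line.
importer_from_log() {
    sed -n 's/.* with plugIn \(.*\)\.$/\1/p'
}

# Run one import command (mdimport or mdimport32) on a file and report
# which importer it says it used. Both output streams are captured so the
# log line is seen whichever one the tool writes to.
which_importer() {
    cmd="$1"
    file="$2"
    "$cmd" -d1 "$file" 2>&1 | importer_from_log
}
```

Usage would be which_importer mdimport test.docx and which_importer mdimport32 test.docx. Fed the two logs quoted above, both report /System/Library/Spotlight/RichText.mdimporter. Note that the mdimport32 log prints its "Imported" line even though the plug-in failed to load, so it is worth checking for "Error loading" as well.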

I should also point out that running this from the command line has always worked to make the file visible in Spotlight.  The issue comes when I double-click to open a file.  Whatever type of indexing is associated with file opening seems to be the troublemaker.

Rumboogy

Posts: 8
 #6 
Continuing my response to entry #2 from houdah.

I tried removing the file "/Library/Spotlight/Microsoft Office.mdimporter" and this did not fix the problem - some .docx files still disappear from Spotlight when I open them in Word.

I can't remove old versions of Word because that is all I have - a 2011 version of Office. 

Do you think a newer version of Office would fix this?  Is there something else I could try to fix this?
houdah

Moderator
Posts: 2,906
 #7 
On my Mac, .docx files are handled by the RichText.mdimporter
.doc files are handled by /Library/Spotlight/Microsoft Office.mdimporter

A full rebuild of the Spotlight index might help.

Rumboogy

Posts: 8
 #8 
How can you tell which importer is handling which file? 

I had done full rebuilds in the past with no success.  I will try again now.

Yesterday, I removed "/Library/Spotlight/Microsoft Office.mdimporter", tested Spotlight, then put the package back by restoring it from a .ZIP.  Today I notice that the problem is gone.  Removing and restoring the package changed some of the modification dates for files inside Microsoft Office.mdimporter but did not change the modification date of the package as a whole.  So I don't know what is going on - whether this thing just changes over time or whether I actually fixed it.  Time will tell if the problem comes back.

In any event, are you suggesting that I touch the file "/Library/Spotlight/Microsoft Office.mdimporter"?
gilby

Posts: 10
 #9 
"How can you tell which importer is handling which file?": I also would like to know how to tell which importer is being used by the automatic indexing.

"Now today I notice that the problem is gone": Great news - a bit like me when I copied the importer from another Mac.

"So I don't know what is going on": Glad (is that the right word) that you and I are together in our understanding [wink]

As an aside, if you have any intention of updating to High Sierra (when it is released), be aware that Office 2011 is unsupported and most likely won't work.  Even the latest update to Office 2016 (15.34) is known to have problems.
Rumboogy

Posts: 8
 #10 
Gilby, thanks for your reply.

I will wait at least half a year before upgrading to the next macOS.  I always wait in an attempt to avoid the bugs that turn up in the first few months of any new major release.

I guess I will have to upgrade to Office for Mac 2016.  There has really been no benefit in any Office upgrade in the last 20 years as far as I am concerned - except to make it compatible with the latest OS.  So I tend to stay as far behind as possible on Office releases.  I am sure eventually Office for Mac 2016 will work with High Sierra.
Rumboogy

Posts: 8
 #11 
This is a follow-up to my post labeled #8.

What I had done at that time was to ZIP the folder "/Library/Spotlight/Microsoft Office.mdimporter", then delete this folder, then restore it from the ZIP.  This seemed to fix the issue with .DOCX files disappearing, but it started to happen again yesterday. 

So I retract my previous suggestion that this might be a solution to this problem.