
Houdah Software Forums


alto2nn

Registered:
Posts: 5
 #1 
Hi,

I work in a university communications office and we have, as you might imagine, a very large server volume that stores all our work files, past/present/future. In the eight years I've been here, we've never been able to search this behemoth, and I recently decided I'd had enough and to take matters into my own hands and see if I could find a utility that could help.

I'm testing both HoudahSpot and Tembo; I suspect Tembo would be much more user-friendly for the rest of our group (I'm the one with a previous life in tech support, and mostly our folks just want the file about so-and-so or the poster from X event, not something specific to image dimensions, etc). I have both running similar searches right now and HoudahSpot has found server files where Tembo has not--but I also ran a preliminary test with HS on Friday, which leads me to my core questions:

Because this server volume is so vast, I don't expect instant results. HS took less than an hour on Friday (I wasn't watching, so I don't have a more specific time); so far today it's taken about 15 minutes to start returning server results. Tembo has been running, as of right now, for just shy of an hour and still isn't showing anything from our server.

1. Is it possible to index the server if, say, we leave a machine running all weekend (possibly longer) to do the job?

2. Has HS done some indexing already based on my earlier test, or is it just faster/better suited to a server search than Tembo in general?

3. If we can index the server, is there a way to make that index available to the whole group--and can Tembo and HS access the same index?

Thanks so much for your help!

Nancy
houdah

Moderator
Registered:
Posts: 2,958
 #2 

Hi!

Both HoudahSpot and Tembo rely on the Spotlight search engine that is part of macOS. Spotlight, in turn, relies on the index that macOS maintains for each volume.

There are cases where the Spotlight search engine (and thus Tembo and HoudahSpot) can return search results from volumes for which no index exists. These searches are, of course, slow and very limited in capability (i.e. search only by basic file properties, not file content or metadata).

Neither HoudahSpot nor Tembo creates an index of its own. Tembo may have more problems working with a drive that has no Spotlight index. The grouping and filtering features in Tembo rely on file metadata that may not be available when the index is missing.


Spotlight indexing SMB volumes is possible - with caveats:

I have only limited knowledge of SMB. From what I have read, Apple has extended SMB with custom features that allow Spotlight to work. Some third-party vendors support these custom features. Others don't. So it may be a matter of which brand of NAS you choose, or what software is running on the server that shares the drive. Please check with the vendor of your SMB solution. Also, there may be a difference between macOS being able to create a Spotlight index and it being able to keep the index current as files are modified.

You can use "mdutil" at the command line in Terminal.app to check if indexing is enabled for a drive.

1. Launch /Applications/Utilities/Terminal.app
2. Paste in the following line:
mdutil -s -a
3. Press Return

This will list all connected drives and tell you if indexing is enabled. It will also give you the volume paths to use in the commands below.
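
With many volumes mounted, the status list can be filtered down to the ones that are not indexed. The following is only a rough sketch. It assumes that "mdutil -s -a" prints each volume path on a line ending in a colon, followed by an indented status line such as "Indexing enabled." or "Indexing disabled."; the exact wording may differ between macOS versions. The function name unindexed_volumes is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads "mdutil -s -a" style output on stdin and prints
# every volume whose status line does not contain "Indexing enabled".
# Assumed input format: "<volume path>:" followed by an indented status line.
unindexed_volumes() {
    awk '
        /:$/ { vol = $0; sub(/:$/, "", vol); next }      # remember the volume path
        vol != "" && !/Indexing enabled/ { print vol }   # status is something else
        { vol = "" }                                     # status line consumed
    '
}

# Only call mdutil where it exists (macOS).
if command -v mdutil >/dev/null 2>&1; then
    mdutil -s -a | unindexed_volumes
fi
```

Any path this prints is a candidate for the "mdutil -i on" command below.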

To enable indexing for "My External Harddrive", use:
sudo mdutil -i on "/Volumes/My External Harddrive"

This will ask for an administrator password.

To start re-indexing of "My External Harddrive", use:
sudo mdutil -E "/Volumes/My External Harddrive"

Do also check that the following directory exists: /private/var/db/Spotlight-V100/Volumes/
This is where Spotlight stores indexes for remote volumes.

Check the existence and contents of the folder using:

1. Launch /Applications/Utilities/Terminal.app
2. Paste in the following line:
sudo ls -la /private/var/db/Spotlight-V100/Volumes/
3. Press Return

If that says "No such file or directory", you need to create the folder:

1. Paste in the following line:
sudo mkdir -p /private/var/db/Spotlight-V100/Volumes/
2. Press Return
Now use mdutil to enable indexing on your remote volume.
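
The folder check and creation above could be combined into one small function, sketched here. The function name ensure_index_dir is hypothetical, and the default path is simply the one quoted above. For the real path the script would have to run with administrator rights (for example via sudo), since that location is not readable or writable by normal users.

```shell
#!/bin/sh
# Sketch: create the folder for remote-volume indexes if it is missing.
# Defaults to the path mentioned above; pass another path to try it safely.
ensure_index_dir() {
    dir="${1:-/private/var/db/Spotlight-V100/Volumes}"
    if [ -d "$dir" ]; then
        echo "exists: $dir"
    else
        mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Example (run the script with sudo for the real path):
#   ensure_index_dir
#   mdutil -i on "/Volumes/My External Harddrive"
```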


Best,

Pierre Bernard
Houdah Software s.à r.l.


__________________
Houdah Software s. à r. l.
https://www.houdah.com

HoudahGeo: One-stop photo geocoding
HoudahSpot: Advanced file search utility
Tembo: Easy and effective file search
alto2nn

Registered:
Posts: 5
 #3 
Interesting! I ran mdutil and it says that indexing is disabled for our network server. Since we're not the ones who maintain this server, I'm reluctant to turn it on without checking with our IT folks to make sure there isn't some good reason for that (and I doubt that I'd have privileges to turn it on anyway). There's definitely no folder for Spotlight to be keeping indexes for the server, either. I'll check with our tech folks and see what they have to say.

If we assume that nothing's going to change on the server indexing front, since that's the safest bet, here's what I'm wondering:

Tembo did eventually return the same results I got from HoudahSpot on Friday for the same search, but it took well over an hour (at least 90 minutes, after which I had to leave my desk for a while, so I don't know when they finally came up). Is it fair to take from that test that HoudahSpot is likely to be a better choice for us, and I should just set our less technically inclined folks up to search the server automatically?

Of course, literally just having said that, Tembo brought up server-based results for a second search--the one I ran on HS this morning to time it--in less than five minutes, so maybe it will work after all? Though if the drive is not indexed, does that mean the first Tembo/HS search will take a very very long time but then subsequent searches will be faster? (If that's the case, I assume that we'll be starting from scratch each morning, since no index file is created?)

If you're not sure, since SMB is neither my strong suit nor yours, your best guess will at least give me a starting point with our IT folks.

Thanks so much!

Nancy
alto2nn

Registered:
Posts: 5
 #4 
Hi, Pierre,

Just wondering if you have any ideas about my questions from yesterday, before I go talk to our IT folks to see if we can order either Tembo or HoudahSpot.

Thanks!

Nancy
houdah

Moderator
Registered:
Posts: 2,958
 #5 
Hi Nancy,

Tembo and HoudahSpot should take about the same time to get results. The underlying technologies are the same.

Tembo will need to get additional information about the files to sort them into groups. Without an index, Tembo may not be able to get enough information to populate filters in the drill-down groups.

HoudahSpot offers the option to add search criteria and show more columns in results. Many of these will not be functional unless there is an index.

AFAIK, the index is kept locally. Creating the index will cause network traffic. Maintaining the index may need an SMB implementation that supports Apple-specific features.

Pierre Bernard
Houdah Software s.à r.l.

gilby

Registered:
Posts: 17
 #6 
If I may butt in, I suspect you have the wrong search architecture.

If you get Spotlight/HoudahSpot to work, Spotlight creates an index on your Mac.  This index is only usable on your Mac.  It is not usable by anyone else in your group.  Keeping it up to date is not easy.  I think that your IT people will have kittens (be most alarmed) at the prospect of multiple client computers (or even just one) maintaining indexes of their server.  The network traffic generated in creating and maintaining the index is what will alarm them most.

A better architecture is one where your IT people run a product on one of their servers which a) indexes the document server, b) automatically maintains the index as files are created/deleted/modified and c) has a client interface (web based?) which enables any computer (with the appropriate access privileges) to search by metadata and content.    Now all the processing is within the computer centre with client computers sending search queries to the search server which returns just the results promptly.  And all with minimal network traffic.

Just my thoughts,
John Gilbert


alto2nn

Registered:
Posts: 5
 #7 
I'm thrilled that you "butted in," John! I've been thinking it sounded much more efficient to have something that doesn't have to keep an index on our individual machines, but don't know enough to know if such a thing exists.

I've opened a conversation with our IT folks and have kept it to "we really need to be able to search our stuff" and "the server's not indexed, which isn't helping--any idea why?" So we'll see what they say, but I'm really glad to have a little more info in case they seem willing to help.

Do you know if there is a tool that will do what's needed on their end? It might be nice to be able to mention something if they are baffled. We don't have many Macs on campus, so they may not be terribly aware of what the options are.

Thanks again!

Nancy
gilby

Registered:
Posts: 17
 #8 
I have sent Nancy a private message, as this is getting way beyond HoudahSpot and Tembo.