
Houdah Software Forums
omarruvalcaba

Registered:
Posts: 3
 #1 

Hi Everyone,

I was hoping someone would have an idea of how I could solve (or get started on solving) an issue with my PDF collection in my shared lab folder.

I have several project folders with article pdfs that my team has collected. I’d like to create a list of the PDFs we have collected across folders so that new team members can use the list to see whether we have an article and where to find it.

Is there a way to create a list that includes the article title from the metadata (or the PDF file name), pulls in the author and publication year from the metadata, and includes the file location (possibly as a link)? Would it be possible to auto-update the list as PDFs are added and removed, and possibly also to list duplicates?

Someone on the Mac Power Users forum suggested HoudahSpot, but so far I've only been able to narrow the list down to PDFs and show the file path and authors.

Is there a way to input year of publication (not modification or creation)?

Is there a way to make this a PDF that auto-updates?

The smart folders export function could have been an alternative approach, but my cloud service (Box) doesn't sync these folders.

houdah

Moderator
Registered:
Posts: 3,040
 #2 
Most file information / metadata that HoudahSpot shows comes from the Spotlight index. This in turn gets its information from the file system and from importer plug-ins. Such importer plug-ins specialize in a given file type. They are included with the system or installed with applications that bring new file formats to your Mac.

macOS includes a Spotlight importer to process PDF files.

Check the Info pane in HoudahSpot to see what metadata is available for PDFs. You can view the same data as columns in HoudahSpot. You can copy column text or save result lists as text files.

I don't know if the Spotlight importer knows about author or publication year. If it does, it will most likely only be able to get that information if it is already explicitly available as such in the PDF, i.e. set as metadata. I would not expect the importer to visually process the PDF and figure out which date in there could be understood as the publication date.


If the information is not readily available, your best approach will be to enter it manually, e.g. as a file tag "published-2018", …
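
A minimal sketch of how such a tag could later be turned back into a year when building a list, assuming Python 3 and assuming the tags have already been read into a list of strings. The function name `year_from_tags` and the strict `published-YYYY` pattern are illustrative assumptions, not something HoudahSpot or Spotlight defines.

```python
import re

# Assumed convention, taken from the example tag above: "published-2018".
TAG_PATTERN = re.compile(r"^published-(\d{4})$")


def year_from_tags(tags):
    """Return the year from the first 'published-YYYY' tag, or None if absent."""
    for tag in tags:
        match = TAG_PATTERN.match(tag.strip())
        if match:
            return int(match.group(1))
    return None
```

Tags that do not follow the convention are ignored, so other tags on the same file do not interfere.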

You can save HoudahSpot searches to run them again at a later point. You could script this to run periodically, extract search results, and save those as text files. However, once you get to this point, it may be just as easy to skip HoudahSpot and script the mdfind and mdls commands. mdfind is the shell command to run Spotlight searches; mdls lists the metadata attributes of a given file.
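
A rough sketch of that scripting route, assuming Python 3 is installed on the Mac. It walks a folder, asks `mdls` for the `kMDItemTitle` and `kMDItemAuthors` attributes of each PDF, and writes a CSV. The helper names (`mdls_raw`, `parse_list`, `collect_rows`, `write_csv`) are made up for illustration, and the parsing assumes `mdls -raw` prints `(null)` for a missing value and a parenthesized list of quoted strings for arrays; check that against the output on your own system before relying on it.

```python
import csv
import os
import re
import subprocess


def mdls_raw(path, attribute):
    """Return the raw mdls value for one attribute, or "" if it is not set."""
    result = subprocess.run(
        ["mdls", "-raw", "-name", attribute, path],
        capture_output=True, text=True, check=False,
    )
    value = result.stdout.strip()
    return "" if value == "(null)" else value


def parse_list(raw):
    """Pull the quoted items out of an array value such as ("A", "B")."""
    return re.findall(r'"([^"]*)"', raw)


def collect_rows(root, read_attr=mdls_raw):
    """Walk root and return one (title, authors, path) row per PDF found."""
    rows = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(folder, name)
            # Fall back to the file name when the PDF carries no title metadata.
            title = read_attr(path, "kMDItemTitle") or name
            authors = "; ".join(parse_list(read_attr(path, "kMDItemAuthors")))
            rows.append((title, authors, path))
    return sorted(rows, key=lambda row: row[2])


def write_csv(rows, destination):
    """Write the rows to a CSV file with a header line."""
    with open(destination, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["title", "authors", "path"])
        writer.writerows(rows)
```

Calling `write_csv(collect_rows("/path/to/lab/folder"), "pdf_list.csv")` would regenerate the whole list on each run; both paths here are placeholders. Because the CSV is an ordinary file, it can live inside the shared folder itself.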

Also look at https://www.macosxautomation.com. I believe the site describes ways to access Spotlight via AppleScript.

__________________
Houdah Software s. à r. l.
https://www.houdah.com

HoudahGeo: One-stop photo geocoding
HoudahSpot: Advanced file search utility
Tembo: Easy and effective file search
omarruvalcaba

Registered:
Posts: 3
 #3 
Thank you for that explanation.

I don't mind entering the metadata myself. It looks like I may only be able to get author, file name, and file path.

Once entered, is there a way to automate the creation of CSV files based on my filters? For example, I would love it if it could apply my filter every Sunday and update the CSV based on new files (or just generate a new CSV).


houdah

Moderator
Registered:
Posts: 3,040
 #4 
HoudahSpot does not have an AppleScript command to trigger CSV export. I believe you can use AppleScript to do "UI Scripting", i.e. simulate the selection of a menu item. That is beyond my AppleScript skills.
omarruvalcaba

Registered:
Posts: 3
 #5 
Same here, but thanks for the lead!