Firstly, thanks to everybody who helped with the system restore problem in CCleaner from another thread on this forum.
At the suggestion of a couple of people, I am posting this here. It’s not a Piriform issue, but you did ask!
I am running Windows XP Professional SP3, and it's as up to date as it can be (or at least as far as I can see). For a number of reasons I don't want to update this machine to a later OS. Almost everything else works just fine, but I have a very large mixed-format collection of documents (PDF, Word, text, MHT, etc.) that I need to search.
I know it's unpopular with some, but quite frankly the most comprehensive search product I have used is Windows Search 4.0. With the indexed locations set to email and particular file locations, the indexing overhead is slight once the index is built, and indexing can be turned off if necessary. The usual complaint that it “slows the machine down” can, in my opinion, be rectified simply by confining what it indexes to certain locations. There are also settings in the Group Policy Editor that can improve things.
For finding files by name I would recommend Everything from Voidtools (it works like lightning and is FREE!). For limited content searches I would recommend Agent Ransack (another excellent free product, but because it doesn't use an index it can become slow on a large document store, as it searches the content of individual files one by one). Like most things in IT, it's "horses for courses".
Windows Search 4.0 was a late addition to XP and, of course, has not been updated much since its initial release. It could be that this problem is insoluble without a later OS.
Sadly, Search 4.0 seems inconsistent in getting results. The problem I've noticed lies in finding MHT archive files saved from either Firefox (you need an add-on for this) or Internet Explorer. It finds only some (actually most) of them in the indexed locations. Strangely, even a search for "type:mht" doesn't find them all, whereas "Everything", for example, finds plenty more. The problem seems to rest with only certain files, and I cannot work out what it is about them that causes them not to be found. I have tried Sysinternals' "streams.exe" to strip the alternate data streams that mark files as “blocked” because they came from “another machine”; it makes no difference. I’ve checked all the security attributes and even removed the “compressed” bit. The MIME type persistent handlers are up to date.
All to no avail. I’ve done lots of searches on the Internet, posted on other forums and cannot find a solution.
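In case anyone wants to help compare the files that are found with the ones that are not, here is a rough Python sketch of the kind of check I have in mind. It groups the MHT files under a folder by a few surface properties of the file itself: the top-level Content-Type header, whether the file starts with a UTF-8 byte order mark, and whether it uses CRLF line endings. I have no evidence that any of these matters to Windows Search; they are only guesses at what might differ between files. The function names and the 64 KB probe size are my own choices for illustration.

```python
import os
from email.parser import BytesHeaderParser

UTF8_BOM = b"\xef\xbb\xbf"


def find_mht_files(root):
    """Yield every .mht/.mhtml file below root (extension match ignores case)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".mht", ".mhtml")):
                yield os.path.join(dirpath, name)


def describe_mht(path, probe_bytes=65536):
    """Return a few surface properties of one MHT file for comparison."""
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes)  # the MIME headers sit at the very top
    has_bom = head.startswith(UTF8_BOM)
    # Strip the BOM so the header parser sees the first header line cleanly.
    header_bytes = head[len(UTF8_BOM):] if has_bom else head
    msg = BytesHeaderParser().parsebytes(header_bytes)
    return {
        "size": os.path.getsize(path),
        "bom": has_bom,
        "crlf": b"\r\n" in head,
        "content_type": msg.get_content_type(),
    }


def summarise(root):
    """Group file paths by (content_type, bom, crlf) so odd ones stand out."""
    groups = {}
    for path in find_mht_files(root):
        info = describe_mht(path)
        key = (info["content_type"], info["bom"], info["crlf"])
        groups.setdefault(key, []).append(path)
    return groups
```

Calling `summarise()` on one of the indexed folders returns a dictionary keyed by those three properties. If the files that Search misses all land in one small group, that would at least be a lead.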
An example would be:
Search DOES NOT find: saved as MHT from either Firefox or Internet Explorer (I don’t use Internet Explorer as a rule, just as part of this experiment)
Search DOES find:…
Copernic does find the files (but I find the free product has other deficiencies for my purposes).
Agent Ransack does find the files; however, not being an indexing program, it is slow over the whole of my document store (approx. 30,000 items). (It’s brilliant for finding bits of code etc. when confined to a few directories.)
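To pin down exactly which files are being missed, the comparison I have been doing by eye could be sketched in Python as below. It assumes you can get two lists of full paths by whatever means suits you: one of the files that are really on disk, and one of the files that Search returns. The function names are hypothetical, and the comparison ignores case and slash direction because Windows paths do.

```python
def normalise(path):
    """Fold case and slash direction, since Windows paths ignore both."""
    return path.strip().replace("/", "\\").lower()


def missing_from_index(on_disk, indexed):
    """Return the on-disk paths that have no match among the indexed paths."""
    seen = {normalise(p) for p in indexed if p.strip()}
    return sorted(
        p.strip() for p in on_disk
        if p.strip() and normalise(p) not in seen
    )
```

Both arguments can be any iterable of strings, for example the lines of two text files with one path per line. Blank lines are skipped.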
As to why I use the MHT format: it is a good way of preserving a web page with all (or most) of its content functional. Converting to PDF is useful, but many of the links within a page can cease to function.
I think I’ve done fairly comprehensive research, so if there’s a Windows Search 4.0 expert out there with a suggestion, that would be just fantastic.
Many thanks in anticipation, and if it's a case of "that old chestnut", then forgive me, I must have missed it.