What's New with Wilbur
(1.64 and before)
What's New in Version 1.64
Wilbur now prescreens the results of a near search so only those files
that actually meet the near criteria are presented in the file list.
In previous versions all files that contained both words of a near search
were put in the file list, but the contents pane would only display
matches where the words were actually near. This modification removes
the confusion that this could cause, but it can take quite a bit longer
to do the actual screening. Clicking in either Wilbur result pane will
halt the screening process and leave all unscreened files in the list
in a manner similar to the previous behavior.
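The difference between the old and new file list behavior can be
illustrated with a small Python sketch. This is not Wilbur's code. The
function names, the max_gap parameter and the exact meaning of "near"
(the second word within max_gap words after the first) are assumptions
made only for illustration.

```python
def words_near(text, first, second, max_gap):
    """Return True if `second` occurs within `max_gap` words after `first`.

    Assumed semantics for illustration only; Wilbur's real rule may differ.
    """
    words = text.lower().split()
    first_at = [i for i, w in enumerate(words) if w == first]
    second_at = [i for i, w in enumerate(words) if w == second]
    return any(0 < j - i <= max_gap for i in first_at for j in second_at)


def prescreen(files, first, second, max_gap):
    """Keep only the files that actually meet the near criterion.

    `files` maps a (hypothetical) file name to its text.
    """
    hits = []
    for name, text in files.items():
        words = text.lower().split()
        # Old behavior: any file containing both words was listed.
        if first in words and second in words:
            # New behavior: list it only if the words really are near.
            if words_near(text, first, second, max_gap):
                hits.append(name)
    return hits


files = {
    "a.txt": "tom chased jerry",
    "b.txt": "tom slept all day and much later on jerry woke up",
    "c.txt": "only tom appears here",
}
print(prescreen(files, "tom", "jerry", max_gap=3))  # ['a.txt']
```

In this sketch b.txt contains both words, so the earlier behavior would
have listed it, but the words are too far apart to pass the prescreen.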
Fixed a problem which had crept in and caused files in zip files that
were in turn inside zip files to be missed if a specific extension was
requested. Also fixed a problem with zip files when driveless include
specifications were used and the current directory was not the root
directory.
Minor modification to allow single words in MSWord tables to be indexed.
Added automation support for containsHit function to permit automation
clients to make use of near search prescreening.
Improvements were made to the status line during indexing. The stop
button should also be more responsive allowing indexing to be stopped
within a few seconds.
What's New in Version 1.63
Minor modifications to facilitate the use of Wilbur on CD-ROM distributions.
Click here for details of using Wilbur on
a CD-ROM.
Improved error messages displayed when a file cannot be opened for
content viewing.
A single quote character turns out to be a legitimate file name character.
Wilbur now allows single quotes in include and exclude paths, but they can
no longer be used for quoting in those paths. (Double quotes can still
be used for quoting.)
A GetRank method was added to the OLE Automation SearchResults object.
What's New in Version 1.62
Many accented characters were not being interpreted correctly in the
new HTML parsing routine. This has been corrected.
Quoted strings in searches are now treated as if all the words in the
string were separated by the <1 near operator. Thus "lazy brown dog"
is now equivalent to lazy <1 brown <1 dog.
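As a rough Python sketch of that equivalence (the function name is
hypothetical and this is not how Wilbur itself is implemented), a quoted
phrase can be rewritten by joining its words with the <1 operator:

```python
def phrase_to_near(phrase):
    """Rewrite a quoted phrase as its words joined by the <1 near operator."""
    return " <1 ".join(phrase.split())


print(phrase_to_near("lazy brown dog"))  # lazy <1 brown <1 dog
```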
RTF (Rich Text Format) files are now recognized and indexed correctly.
Most formatting information is also stripped out in the viewer.
Improvements in status lines, particularly with Zip files.
Changes were made to the redistribution policy. Unregistered versions
of Wilbur can now be used beyond the 30 day evaluation period as long
as they are only being used to view indexes stored by licensed copies.
In this case no reminder screens are displayed. Licensed copies are
of course still required to create or modify indexes. This change was
aimed at folks who wish to distribute indexed material on media like
CD-ROMs.
What's New in Version 1.61
In removing the time limit in version 1.6, we also removed any way
of registering Wilbur before the reminder screen appeared after 30 days
of use. There is now a Help/Register command.
A minor change was made to work around the annoying problem of not
being able to view a file that QuickView or Word has held open.
What's New in Version 1.6
The focus of the changes in version 1.6 was on improving the indexing
and searching engines.
- The disk scanning routine was modified significantly and now no
directory is ever scanned more than once, regardless of how many include
statements you might have. Since most people will have several include
paths, the time savings can be substantial. Note that existing indexes
will have to be rebuilt before they can be used.
- A significant change was made to the compression portion of the
indexing. This reduced the virtual memory footprint at this point
and in some cases leads to dramatic improvements in speed. The combination
of this and the disk scanning modification reduced the time on our
main test index by over a factor of two. Updates in particular were
much faster.
- Previous versions did not recover the space used for the update
after a full build was done, until the next update. The build now
empties the update file.
- The indexer is now more aggressive about skipping words which appear
to be just random characters in the middle of binary data. In some
indexes this might significantly reduce indexed word counts.
- Wilbur is now somewhat smarter about displaying HTML files. Files
with HTM or HTML extensions will, by default, strip all HTML tags
out of the viewed contents. Some basic formatting is also done and
special characters are replaced. A command has been added to the View
menu to allow the raw HTML to still be viewed.
- Parentheses have finally been added to the search strings. You can
now enter search strings such as (red | green) & (blue | pink).
Parentheses can even be nested if appropriate. A bug was also fixed
where a search string such as blue | red & green would effectively
skip the | red part. It is hard to believe, but this bug has been
here a long time.
- A bidirectional near search has been added. Now "tom : jerry" is
the same as "(tom < jerry) | (jerry < tom)". For completeness an
"after" operator has also been added, so "tom > jerry" is equivalent
to "jerry < tom". Note that like the original near search, these
operators only affect the highlighting of words in the contents pane.
All are treated as AND operators for the purpose of adding files to
the file list.
- You can now prefix a '+' sign to a word in the search dialog and
this word will be used in determining which files are listed, but the
word will not be searched for or highlighted in the contents view.
For instance the search string "Walt +Disney" would only find files
containing both Walt and Disney, but only Walt would be highlighted
in the contents pane. If outline mode is on, only lines containing
Walt would be shown.
- Another silly bug fixed was one that prevented Wilbur from ever
finding the first word indexed.
- The time limits on evaluation copies have been eliminated. Now a
reminder dialog appears at start up after 30 days of use, but otherwise
the program is completely functional.
- Beeps were added for the phrase not found condition in most circumstances.
- The default include phrase for html files has been changed from
htm? to htm* since the former would miss htm files.
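The last item above comes down to the difference between the two
wildcards. The following Python sketch uses the standard fnmatch module
as a stand-in; it assumes Wilbur's wildcards behave like fnmatch's
(? matches exactly one character, * matches any run including none), and
the file names and the leading "*." are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names used only for this illustration.
names = ["index.htm", "index.html"]


def matching(pattern):
    """Return the names that match a wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


old_matches = matching("*.htm?")  # ? needs exactly one more character
new_matches = matching("*.htm*")  # * also accepts no further characters

print(old_matches)  # ['index.html']
print(new_matches)  # ['index.htm', 'index.html']
```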
What's New in Version 1.53
- The major changes in this update have
to do with zip files. The most noticeable one is that Wilbur now removes
the empty directories in the temporary zip directory when it is done
with them. On some machines with large cluster sizes and a large number
of these directories, they could waste a significant amount of space
even though empty. The original idea behind leaving them was performance,
but removing them seems to have had a negligible impact.
- In previous versions it was necessary
to include "*.*" to get all files in a zip archive. This
was inconsistent with the rest of Wilbur where only "*"
was necessary. The single wildcard now works with zip archives as
well.
- When a zip archive had more than one
file with the same name, but in different subdirectories, the temporary
files were not being removed. This has been fixed.
- There was a problem accessing files
inside zip archives when their names contained spaces. This has been fixed.
- A couple of possible minor resource
leaks were fixed.
- The status line spelling of containing
was fixed.
- The File/Copy List Files command had
a bug that caused problems under Windows NT.
- Improvements in error trapping. In particular
an oddity in the Microsoft libraries meant that out of memory errors
would not be caught in helper threads like indexing.
- User break from the compression loop
is now handled correctly.
- The executable file is now slightly
smaller thanks to a new version of the Microsoft compiler.
What's New in Version 1.52
- You no longer need quotes when entering
file paths containing spaces or commas etc., but you may no longer
enter more than one path in the File Include or File Exclude dialogs.
Since just hitting Enter twice will bring this dialog back ready for
another entry, this does not seem like a significant loss.
- The local search in a file has been
brought into closer conformance with the overall index search. In
particular if numbers are not indexed, they are no longer considered
valid characters in a local search. This should eliminate the problem
where the word ABC is in the index, but is not found in the local
search since it appears in the form ABC123.
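The ABC123 case can be sketched in Python as follows. The function name
and the regular expressions are assumptions chosen for illustration and
do not reflect Wilbur's actual character handling.

```python
import re


def local_words(text, numbers_indexed):
    """Split text into words the way a local search might see them.

    When numbers are not indexed, digits are not valid word characters,
    so they act as separators.
    """
    pattern = r"[A-Za-z0-9]+" if numbers_indexed else r"[A-Za-z]+"
    return re.findall(pattern, text)


print(local_words("part ABC123 shipped", numbers_indexed=True))
# ['part', 'ABC123', 'shipped']
print(local_words("part ABC123 shipped", numbers_indexed=False))
# ['part', 'ABC', 'shipped']
```

With digits excluded, the local search sees the word ABC, which agrees
with what the index contains.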
What's New in Version 1.5
Major Stuff:
Memory Mapped Indexes: Rather than
reading its indexes into memory, Wilbur now uses a memory image of the
index which stays on the disk. The important benefit of this is that
startup times are drastically reduced for larger indexes. This makes
using Wilbur to quickly look something up much more convenient.
Search speeds are not significantly
affected, but there are some downsides, particularly for folks
sharing the same indexes over a network. Specifically, indexes cannot
be built or updated if someone else is using them. Also, each index is
now broken into several files with different extensions, which is slightly
messier. To compensate, Wilbur now lets you designate a specific directory
where the indexes are stored by default.
Contents Outline Mode: The
file contents can now be collapsed so only lines containing the target
words are shown. This mode can be quickly toggled back and forth using
the number pad + key. By default Wilbur now starts in
the outline mode, but this can be changed with an option on the File/Preferences
dialog.
File Rankings: An option can now
be set to have the number of occurrences (to a maximum of 256) of each
word in each file saved in the index. A corresponding column in the
file list allows files to be sorted by the frequency of hits of the
target words.
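A minimal Python sketch of the ranking idea, assuming a simple
whitespace word split and made-up function names. Only the cap of 256
comes from the description above; the rest is illustrative.

```python
from collections import Counter

MAX_COUNT = 256  # cap on stored occurrences, taken from the text above


def hit_counts(text):
    """Count each word in a file, capping every count at MAX_COUNT."""
    counts = Counter(text.lower().split())
    return {word: min(n, MAX_COUNT) for word, n in counts.items()}


def rank_files(files, target_words):
    """Return file names sorted by total hits of the target words, highest first."""
    scores = {}
    for name, text in files.items():
        counts = hit_counts(text)
        scores[name] = sum(counts.get(w, 0) for w in target_words)
    return sorted(scores, key=lambda name: scores[name], reverse=True)


files = {"a": "cat cat dog", "b": "cat", "c": "cat " * 300}
print(rank_files(files, ["cat"]))  # ['c', 'a', 'b']
```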
Indexing:
- Changes so file specs with no drive
designation work. This allows the use of indexes on CD ROMs and other
removable media where drive letters may not be known in advance. Note
that for network use, Wilbur has always supported the use of UNC names.
- A default index directory can now be
set in the File/Preferences dialog. This directory will be the default
location for saving and restoring indexes, although any directory
can still be used.
- A new index now has a default set of
common include files. This is designed to get new users going more
easily and reduce the temptation to just use *.*.
- The results of the current search can
be used as the basis of the next search. This is done by simply starting
the search line with either an & or |
character depending on whether you want the new results to be a subset
of the current results or added to them. An error in the help file
documentation incorrectly identified the tilde character ~
as the NOT operator in searches. The correct character is in fact
the caret ^.
- It is now possible to specify how many
characters constitute the smallest and largest words that will be
indexed. These settings can be adjusted to exclude undesired words
and thus reduce the size of the index.
- A search string made up completely of
invalid (too long or short or in the skip.txt file) words no longer
finds all the files in the index. A completely empty search string
can still be used for this purpose.
- Due to problems with some environments,
the extension for indexes has been changed to just the 3 characters
"wil". Sigh... The "wilbur" extension will still
work, but must be entered explicitly.
- The dialog box which reports the results
of an indexing operation now includes the option to immediately save
the index.
- This dialog also reports how long the
indexing took.
- Added percent complete to Compressing
status message. Compressing can now be a significant part of the indexing
operation, so some indication of progress seemed appropriate.
- Problems that occurred with file exclusion,
specifically with quote marks and zip files, have been fixed.
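The word length limits and the skip.txt list mentioned in the items
above amount to a filter on words. A hedged Python sketch, with
hypothetical function and parameter names and a skip list passed in as
a set rather than read from skip.txt:

```python
def is_indexable(word, min_len, max_len, skip_words):
    """True if the word is within the length limits and not a skip word."""
    return min_len <= len(word) <= max_len and word.lower() not in skip_words


def valid_search_words(search, min_len, max_len, skip_words):
    """Return the words of a search string that could appear in the index."""
    return [w for w in search.split()
            if is_indexable(w, min_len, max_len, skip_words)]


skip = {"the"}
print(valid_search_words("a the extraordinarily index", 2, 10, skip))
# ['index']
print(valid_search_words("a the", 2, 10, skip))
# []
```

In this sketch a search string such as "a the" yields no valid words,
which corresponds to the case that no longer lists every file.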
Contents Page:
- Added Edit/Export command to append
selected text to Export.txt file. This is hot keyed to the Ctrl X
key so collecting specific tidbits is now easy.
- Fixed near search so it uses the right
number (not +1).
- Failed searches now leave the last page
of text in viewing area rather than just showing a blank area beyond
the end of the file.
- Fixed not being able to select last
character in line without going to the next line.
Dialogs:
- The index dialog was changed to have
separate include and exclude file pages. The multiple entries on these
pages can now be selected for deleting and copying to the clipboard.
The resulting clipboard information can be pasted back into include
or exclude boxes in any index or just pasted into a text editor for
direct manipulation. They are plain ASCII text, so the results of
such editing can be copied to the clipboard and pasted back into an
include or exclude list.
- The search dialog now reports the index
attributes. Specifically when the files were last indexed, what number
handling is in effect, the character set being used and the minimum/maximum
indexed word size.
- Removed zip path from the index options.
It is now in the overall preferences in the File/Preferences dialog.
- Added minimum/maximum word length to
index options.
- Help/About box now reports the number
of indexed files, bytes and unique words in the current index.
Main Window
- Changed registry functions to better
support multiple users on single machine.
- Fixed a problem where the Toolbar occasionally
did not update its status.
- Added a command line option which requests
that search be done without search dialog. This allows word processing
macros to start Wilbur and jump directly to the word without having
to hit enter.
- Fix so that aborting by shutting down
wouldn't get hung up on the "Indexing - Aborted" message box.
- Changed registration security so an
unregistered and timed out program can still read (but not save or
update) indexes created by it or by a registered version. This permits
unregistered copies to be included with indexes on removable media.
- Changed index and update threads to
lowest priority in the hopes of reducing their impact on foreground
tasks.
- Added numerous OLE automation routines.
It is now possible to start Wilbur, specify and build indexes and
do searches from OLE automation compatible programming and script
languages.
- Removed the shortcut "X" to
Export File List since it conflicted with Exit in the File menu.
File List
- Multiple files can now be selected for
the File/Export File List and the File/Copy commands.
- Added File copy to folder command for
selected files in the file list. This command allows you to select
a folder to copy the selected files to.
- If more than one file is selected, the
File/Export File List command just exports the names etc. of the selected
files. If only a single file is selected, the original behavior of
exporting the whole list continues.
- The file list now remembers the previous
four sort orders and uses them to progressively break ties in the
primary sort order. This means that by sorting first on time and then
on file type, the files will be listed by type and by time within
each type group.
- Fixed context menu so it opens in the
right spot. It now opens where the mouse cursor is rather than a fixed
spot on the screen.
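The tie-breaking sort described in the list above can be sketched in
Python. The class and field names are hypothetical, and this is only one
way to get the described result, not necessarily how Wilbur does it.

```python
from collections import deque


class FileListSorter:
    """Remember recent sort columns and use them to break ties."""

    def __init__(self, remembered=4):
        # The primary key plus `remembered` earlier keys as tie-breakers.
        self.keys = deque(maxlen=remembered + 1)

    def sort_by(self, rows, key):
        # Move the chosen column to the front; older columns follow it.
        if key in self.keys:
            self.keys.remove(key)
        self.keys.appendleft(key)
        return sorted(rows, key=lambda row: tuple(row[k] for k in self.keys))


rows = [
    {"name": "b.txt", "type": "txt", "time": 2},
    {"name": "a.doc", "type": "doc", "time": 3},
    {"name": "c.txt", "type": "txt", "time": 1},
]
sorter = FileListSorter()
rows = sorter.sort_by(rows, "time")
rows = sorter.sort_by(rows, "type")
print([row["name"] for row in rows])  # ['a.doc', 'c.txt', 'b.txt']
```

Sorting on time and then on type lists the files by type, and by time
within each type group, as in the example given above.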
Other
Due to the ever increasing size and update
frequency of the support DLLs from Microsoft, Wilbur now has the
necessary routines statically linked directly in its executable. This
makes Wilbur's executable much larger, but still much smaller than
the size of the old executable and the DLLs combined. This is not as
esthetically pleasing as the concept of shared DLLs, but in practice
will probably be faster and more convenient for the majority of users.