Free Text Database specs:


Here is what I want in a database:


Format:

Complete free text input. There are just two concepts: paragraphs and sentences. Sentences are separated by periods, and are preferably short, stating one fact each. All of the sentences in a single paragraph are assumed to be related. Paragraphs are separated by an extra blank line.

For example:

His name is Steve Bush, but also known as Burnin' to his friends. His car is a miata. Hair color used to be dark black. Cell phone # is 123-45678. Works at Solutionware. Writes computer programs for a living. Wants to buy a digital camera. His main address is 1583 Curtner Ave, San Jose, CA 95125. In case of emergency call John Smith at 408-273-6097. Home phone is (408)469-4907. Favorite actress is Jennifer Garner.

Now this person's name is John James. His address is 1234 North Back St, Hollywood, CA. He doesn't like Jennifer Garner.

And that is all there is to it! The input file is just a set of paragraphs, each containing a set of sentences. Sentences are defined by periods, paragraphs by the extra blank line between them.
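
As a rough sketch of how such a file could be read in, here is one way to do it in Python. The function name parse_database is made up for illustration. It assumes a period only ends a sentence when followed by whitespace or the end of the paragraph, and it drops the trailing period from each sentence.

```python
import re

def parse_database(text):
    """Split free text into paragraphs, each a list of sentences."""
    paragraphs = []
    # Paragraphs are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        # Assumption: a period ends a sentence only when followed by
        # whitespace or the end of the paragraph.
        parts = re.split(r"\.(?:\s+|$)", block)
        sentences = [p.strip() for p in parts if p.strip()]
        if sentences:
            paragraphs.append(sentences)
    return paragraphs

sample = (
    "His car is a miata. Works at Solutionware.\n"
    "\n"
    "Now this person's name is John James. He doesn't like Jennifer Garner."
)
for number, sentences in enumerate(parse_database(sample), start=1):
    print(number, sentences)
```

Each paragraph comes back as a list of sentence strings, which is all the structure the rest of the program needs.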

User interface:

We start with a single "Find box":

Find:
  words required in sentence:

  ______________________  -
  +

Entering any set of words in the blank space means find a single sentence that contains all of those words. For example:

Find:
  words required in sentence:

  hair color black        -
  +

This would ask to search for a single sentence which contains all three of the words, hair, color, and black.
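
A minimal sketch of that sentence test in Python follows. The names words_of and sentence_matches are hypothetical. It assumes matching ignores upper/lower case and punctuation, which the examples here seem to imply (the query word "hair" is expected to match "Hair").

```python
import re

def words_of(text):
    """Lower-cased set of words in a piece of text, punctuation dropped."""
    return set(re.findall(r"[a-z0-9#']+", text.lower()))

def sentence_matches(sentence, query):
    """True if every word typed in the Find box occurs in the sentence."""
    return words_of(query) <= words_of(sentence)

print(sentence_matches("Hair color used to be dark black", "hair color black"))
print(sentence_matches("His car is a miata", "hair color black"))
```

Note that an empty Find box matches every sentence under this sketch, since there are no required words.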

Clicking on the "+" sign adds another box, which is automatically labelled "AND":

Find:
  words required in sentence:

  hair color black        -
  +
  AND
  ______________________  -
  +

Entering words in this second box adds a second requirement: another sentence containing these new words must be found in the SAME paragraph. For example:

Find:
  words required in sentence:

  hair color black        -
  +
  AND
  address san jose        -
  +

The above Find request specifies that a matching paragraph must contain a sentence with the words hair, color, and black, and must also contain a sentence with the words address, san, and jose in it. In other words, it finds all paragraphs regarding black-haired people who live in San Jose.

More generally, each line specifies a group of words that must appear together in one sentence, and the lines together specify sentences that must all appear in a single paragraph.
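
The whole Find form can be sketched in a few lines of Python. The names words_of and find_paragraphs are hypothetical, each paragraph is taken to be a list of sentence strings, and matching is assumed to ignore case and punctuation. It also assumes one sentence is allowed to satisfy more than one line, which the spec does not settle either way.

```python
import re

def words_of(text):
    """Lower-cased set of words in a piece of text, punctuation dropped."""
    return set(re.findall(r"[a-z0-9#']+", text.lower()))

def find_paragraphs(paragraphs, find_lines):
    """Return paragraphs in which every Find line is met by some sentence."""
    wanted = [words_of(line) for line in find_lines if line.strip()]
    results = []
    for sentences in paragraphs:
        sentence_words = [words_of(s) for s in sentences]
        # Every line needs at least one sentence containing all its words.
        if all(any(w <= sw for sw in sentence_words) for w in wanted):
            results.append(sentences)
    return results

steve = [
    "His name is Steve Bush, but also known as Burnin' to his friends",
    "Hair color used to be dark black",
    "His main address is 1583 Curtner Ave, San Jose, CA 95125",
]
john = [
    "Now this person's name is John James",
    "His address is 1234 North Back St, Hollywood, CA",
]
found = find_paragraphs([steve, john], ["hair color black", "address san jose"])
print(len(found), "matching paragraph(s)")
```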

The first paragraph matching this form would appear in a large box on the screen below the Find boxes. If more than one paragraph matched, the first would be shown along with the number of matching paragraphs, and NEXT/PREVIOUS buttons would let you cycle through them.
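
The NEXT/PREVIOUS behavior could be tracked with something as small as the following Python sketch. The class name ResultCursor and the wording of the status line are made up for illustration. It assumes NEXT stops at the last match and PREVIOUS stops at the first, rather than wrapping around.

```python
class ResultCursor:
    """Tracks which matching paragraph is currently on screen."""

    def __init__(self, results):
        self.results = list(results)
        self.index = 0

    def current(self):
        """The paragraph to display, or None if nothing matched."""
        return self.results[self.index] if self.results else None

    def status(self):
        """Text for the match counter shown next to the result box."""
        if not self.results:
            return "0 matching paragraphs"
        return f"{self.index + 1} of {len(self.results)} matching paragraphs"

    def next(self):
        # Assumption: stay on the last match instead of wrapping around.
        if self.index < len(self.results) - 1:
            self.index += 1
        return self.current()

    def previous(self):
        # Assumption: stay on the first match instead of wrapping around.
        if self.index > 0:
            self.index -= 1
        return self.current()

cursor = ResultCursor(["first paragraph", "second paragraph"])
print(cursor.status())
cursor.next()
print(cursor.status())
```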


Printing reports:

Next to NEXT/PREVIOUS is a PRINT button. It simply prints the results of the find.

But we also have a second type of form called a Print form. It looks much the same as the Find form, and its purpose is to shorten the printout by printing only some sentences from each matching paragraph rather than the whole paragraph.

You do the FIND first, filling out that form and getting a result. Then instead of hitting PRINT, you bring up the PRINT form. It reduces the printout to a more condensed format.

For example, you find all people with black hair who live in San Jose, as in the example above, but you just want to print their name and phone number, not their entire paragraph of info. We use the Find form first, as above, then bring up the Print form and do:

Print:
  words in sentence:      Shorten?  Sort by?

  name                                        -
  +
  phone number                                -
  +

Here, we search only the found paragraphs for a sentence with the word "name" in it, and print only that sentence. We also look for a sentence with the words phone number in it and print only that sentence. (Or, if both name and phone number are in the same sentence, print just that sentence.)

Run on the first example paragraph above, that is:
His name is Steve Bush, but also known as Burnin' to his friends. His car is a miata. Hair color used to be dark black. Cell phone # is 123-45678. Works at Solutionware. Writes computer programs for a living. Wants to buy a digital camera. His main address is 1583 Curtner Ave, San Jose, CA 95125. In case of emergency call John Smith at 408-273-6097. Home phone is (408)469-4907. Favorite actress is Jennifer Garner.
This PRINT form would produce:

His name is Steve Bush, but also known as Burnin' to his friends.
Cell phone # is 123-45678.
Home phone is (408)469-4907.
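
Here is a small Python sketch of that Print filtering. The names words_of and print_sentences are hypothetical. One interpretation is needed: the sample printout keeps sentences that contain "phone" but not "number", so this sketch assumes a Print line selects any sentence containing at least one of its words. That is a guess at the intended rule, not something the spec states.

```python
import re

def words_of(text):
    """Lower-cased set of words in a piece of text, punctuation dropped."""
    return set(re.findall(r"[a-z0-9#']+", text.lower()))

def print_sentences(sentences, print_lines):
    """Keep only sentences selected by at least one Print line."""
    wanted = [words_of(line) for line in print_lines if line.strip()]
    # Assumption: a sentence is selected if it shares ANY word with a line.
    # Paragraph order is kept and each sentence is printed at most once.
    return [s for s in sentences if any(words_of(s) & w for w in wanted)]

paragraph = [
    "His name is Steve Bush, but also known as Burnin' to his friends",
    "His car is a miata",
    "Hair color used to be dark black",
    "Cell phone # is 123-45678",
    "Works at Solutionware",
    "In case of emergency call John Smith at 408-273-6097",
    "Home phone is (408)469-4907",
    "Favorite actress is Jennifer Garner",
]
for sentence in print_sentences(paragraph, ["name", "phone number"]):
    print(sentence + ".")
```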

Now, if we add an "X" in the Shorten box, like this:

Print:
  words in sentence:      Shorten?  Sort by?

  name                       X         X      -
  +
  phone number               X                -
  +

the program uses an intelligent algorithm to separate the actual name or phone number from the rest of the words in the sentence and so prints only:
Steve Bush. 123-45678. (408)469-4907
Also, since we added an "X" in the Sort box, it sorts all the found paragraphs by that name.
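
The "intelligent algorithm" is not specified here, so the following Python sketch is only one crude guess at it. The names shorten and sort_paragraphs are hypothetical, and both patterns are assumptions: a phone number is taken to be a run of digits, hyphens, parentheses and spaces containing at least seven digits, and a name is taken to be the capitalized words right after "name is". Anything else is left unshortened.

```python
import re

# Assumption: a phone number is a run of digits, hyphens, parentheses and
# spaces that contains at least seven digits.
PHONE = re.compile(r"[\d(][\d()\- ]{5,}\d")
# Assumption: a name is the run of capitalized words after "name is".
NAME = re.compile(r"\b[Nn]ame is ([A-Z][\w']*(?:\s[A-Z][\w']*)*)")

def shorten(sentence):
    """Return just the phone number or name, else the whole sentence."""
    phone = PHONE.search(sentence)
    if phone and sum(ch.isdigit() for ch in phone.group()) >= 7:
        return phone.group()
    name = NAME.search(sentence)
    if name:
        return name.group(1)
    return sentence

def sort_paragraphs(paragraphs, sort_word):
    """Sort paragraphs by the shortened sentence containing sort_word."""
    def key(sentences):
        for s in sentences:
            if sort_word in re.findall(r"[a-z0-9#']+", s.lower()):
                return shorten(s).lower()
        return ""  # paragraphs with no such sentence sort first
    return sorted(paragraphs, key=key)

steve = [
    "His name is Steve Bush, but also known as Burnin' to his friends",
    "Cell phone # is 123-45678",
    "Home phone is (408)469-4907",
]
john = ["Now this person's name is John James"]
for sentences in sort_paragraphs([steve, john], "name"):
    print(". ".join(shorten(s) for s in sentences))
```

A real version would need far better name and number detection than two regular expressions, but the overall shape (select sentences, shorten each, sort by the marked line) would stay the same.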

And that is the entirety of the program that I want!!

Thanks for listening,

Steve Bush.