*** Help for DataProspector

DataProspector is (c) Copyright 2003, P. Lutus.

DataProspector is CareWare (no money, now or ever). Visit the CareWare Page at www.arachnoid.com/careware.

For recent revisions and further information, visit the DataProspector Home Page at www.arachnoid.com/DataProspector.

DataProspector is a Java program meant to access and modify databases in many convenient ways. It works best with the MySQL data engine, but it also works with PostgreSQL (see below for further discussion).

DataProspector requires the Java 1.4 or newer runtime engine, available for download at java.sun.com.

DataProspector retains all user choices in a configuration file located on this machine at (user home)/.DataProspector/DataProspector.ini. This file can be hand-edited to accomplish some customizations, and it can be deleted to solve some kinds of program problems. If this file is deleted, DataProspector simply creates another with default values.
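
For readers who want to find or inspect this file from their own code, the location described above can be sketched in Java. This is a minimal sketch, not DataProspector's own code: the class name ConfigLocator is hypothetical, it uses a current Java runtime rather than 1.4, and it assumes the file can be read as standard Java key=value properties, which this help file does not confirm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of DataProspector.
final class ConfigLocator {

    // Builds (user home)/.DataProspector/DataProspector.ini
    static Path configPath(String userHome) {
        return Paths.get(userHome, ".DataProspector", "DataProspector.ini");
    }

    // Reads the file as key=value properties (an assumption about its format).
    // A missing file yields an empty set, mirroring the "recreated with
    // default values" behavior only in the sense that nothing fails.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path file = configPath(System.getProperty("user.home"));
        System.out.println(file + " holds " + load(file).size() + " entries");
    }
}
```

Make a backup copy before hand-editing the file, since deleting it discards all saved choices.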

*** How to use DataProspector

The first step is to set up a database engine on your machine or network. I recommend MySQL - most of the testing, and the best results, have been obtained with this engine. Installing MySQL is fully covered elsewhere, and the remainder of these instructions assumes you have solved this problem. DataProspector also supports PostgreSQL, but at the time of writing (3/2003) the JDBC driver for this database engine has some flaws that prevent complete testing.

The second step is to create a table. If you do not already have a database with tables set up, DataProspector can help you create some in a simple way using the system clipboard (see below).

Finally, you can use DataProspector to examine and edit your database tables. After installing the Java runtime engine, you may need to acquire and install your database engine's JDBC driver. A suitable driver for MySQL can be obtained at http://www.mysql.com/downloads/api-jdbc-stable.html.

Place the JDBC driver JAR file on this machine in the (java home)/lib/ext directory, or put a symbolic link there if you are fortunate enough to be running a version of Unix.

Now you may run DataProspector:

$ java -jar DataProspector.jar

This invocation statement can be put in a shell script for convenience (Windows users will find a program shortcut already provided).

The next step is to provide a username and password (you may optionally save the password between sessions), choose a JDBC driver and enter a database path. A typical MySQL path statement looks like this:

jdbc:mysql://hostname

If the path and driver entries are both valid, a list of databases will appear in the database list. When you choose a database, the list of tables beneath it changes, and you can use the drop-down list to select the table you desire.
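
To check a path, username and password outside DataProspector, the same steps can be sketched with plain JDBC. This is only an illustration under stated assumptions: the class name DatabaseLister is hypothetical, the host and credentials in the usage note are placeholders, and the "show databases" statement is specific to MySQL. It also requires the engine's JDBC driver to be installed as described above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DataProspector.
final class DatabaseLister {

    // Opens a connection with the given JDBC path and returns the database
    // names. "show databases" is MySQL-specific; other engines need other SQL.
    static List<String> listDatabases(String jdbcPath, String user, String password)
            throws SQLException {
        List<String> names = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcPath, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            while (rs.next()) {
                names.add(rs.getString(1)); // first column holds the name
            }
        }
        return names;
    }
}
```

A call such as DatabaseLister.listDatabases("jdbc:mysql://hostname", "user", "password") throws an SQLException if no installed driver accepts the path or the login is refused, which is a quick way to tell a driver problem from a DataProspector problem.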

The "Autoload" option can be used to automatically load the chosen table when it is selected - this is appropriate for small tables that don't take too long to load.

Having chosen a database table, click the "Data" tab to view it. If you have not chosen "Autoload," press "Execute" to the right of the SQL text window to load the table. You have the choice of working with the data as a table or as a form - just click the tab appropriate to your choice.

DataProspector defaults to "read-only," meaning you cannot write to your databases unless you deselect the read-only check box. Remember this as you begin working with DataProspector.

It is a good idea to practice with DataProspector in read-only mode until you are familiar with its operation, or you may want to practice with a sample table to preserve more valuable data.

The SQL text window allows you to customize the data displayed by DataProspector with the full flexibility of SQL. But remember that, if you intend to edit the data, in most cases you need to select entire rows, not just specific columns. This comes about because most tables have a key field, and if that field is not included in the query, the query cannot be used for updates.

You can always filter by row using a query that includes a "WHERE" clause and the rows will be suitable for editing/updating, but selecting specific columns using "SELECT" often will produce a dataset that cannot be updated.

On that topic, if you make an SQL statement error, an error message will normally appear in the "Messages" window at the bottom of the program display.

*** Editing databases

You may eventually want to edit data using DataProspector, something this program was designed to do easily. You have the choice to edit individual cells in a table, or records in a form showing one record at a time. The form display can display and edit multi-line data fields and fields containing tabs (one of the design goals for DataProspector). You may choose the number of displayed lines using the drop-down list at the bottom right of the form display.

If you edit a line in the table display and press "Enter," the database record is updated. If this doesn't work, remember to deselect "Read-only" in the selection pane to allow writes.

NOTE: If you edit a field in the form display, pressing Enter updates the record when in "1 Line" display mode. In multi-line displays the Enter key is a valid entry (a linefeed), so to update the record you must either press "Commit" at the lower right or press one of the navigation buttons - moving to a different record also updates the changed record.

*** Copy/Paste/Create tables using the system clipboard

While viewing the table display, press the right mouse button. You will see a list of menu options. If you want to write to your tables or create new ones, be sure to disable "read-only" on the selections pane. This will activate all the context (right-click) menu options.

Using these menu items you can copy and paste selections or entire databases, with and without header information, delete selected records, and create tables using clipboard data, again with or without header information.

An obvious source for tabular clipboard data is a spreadsheet program, but any program will do. All you need to remember is that the clipboard contents should resemble a table of tab-delimited fields. Most spreadsheet programs cooperate with this assumption, delimiting clipboard contents with tabs when copying, and accepting tab-delimited tables for paste operations.
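
The tab-delimited layout described above is simple enough to parse by hand. The following is a minimal sketch, assuming the clipboard text has already been obtained as a string; the class name TabTable is hypothetical and this is not how DataProspector itself is known to read the clipboard. It treats every line break as a row boundary, so fields that contain line breaks would need extra handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DataProspector.
final class TabTable {

    // Splits clipboard text into rows (one per line) and fields (tab-separated).
    static List<List<String>> parse(String clipboardText) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : clipboardText.split("\\r?\\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // limit -1 keeps trailing empty fields so rows stay the same width
            rows.add(Arrays.asList(line.split("\t", -1)));
        }
        return rows;
    }
}
```

If the first row is a header row, it is simply the first element of the returned list.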

Here's an example. Run your favorite spreadsheet program and either enter a table or use an existing one for this exercise. Let's say you have this spreadsheet table:

X	X^2	X^3	X^4	X^5
2	4	8	16	32
3	9	27	81	243
4	16	64	256	1024
5	25	125	625	3125
6	36	216	1296	7776
7	49	343	2401	16807
8	64	512	4096	32768
9	81	729	6561	59049
10	100	1000	10000	100000

Notice that the first row consists of identifying labels. This is a "header row," and, since it conveys useful information, it should be included in the copy.

I just realized something - you can simply copy the contents of this table directly from this help file, and submit it to DataProspector's "paste table" feature. Just move your mouse cursor across the entire table above, press Ctrl+C (copy), then move to the data table display and choose the context menu (mouse right-click) item "Paste data with header into new table." Type a suitable name for your table and it will be created.

The point is, using the system clipboard, virtually any program on your system can be used as a source or destination for database information.

When you are preparing to copy data from DataProspector, you have a number of ways of filtering the data. You can always submit a custom query and copy the results:

select Firstname, Lastname from OldFriends where status not like "dead";

Another approach is to select the table rows you want with the mouse, then perform the copy operation. Clipboard copies only include selected rows, so in the event you want all the records, select at least one row, then press Ctrl+A (All).

Some words and characters may not be acceptable as table headers. The list of exceptions is long and boring, so I refer you to the MySQL documentation. In difficult cases you may be able to precede the column name with an underscore to force acceptance.

Another way to force the acceptance of words and characters that are ordinarily not accepted is to enclose the header names in back-ticks like this:

`X`	`X^2`	`X^3`	`X^4`	`X^5`
2	4	8	16	32
3	9	27	81	243
4	16	64	256	1024

This allows the use of the otherwise forbidden character "^", which is essential for this table to make sense. If the database engine you are using is MySQL, DataProspector automatically puts the back-ticks in for you, so you don't have to do anything. With other database engines, you may have to figure out a different approach, because some of them don't allow this trick. PostgreSQL, for example, does not accept back-ticks, which is why DataProspector doesn't wrap all table headers in back-ticks by default.
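
The engine-dependent handling of header names can be sketched as a small function. This is an illustration only: the class and method names are hypothetical, the engine is passed in as a plain string such as "mysql", and the replacement of spaces with underscores follows the behavior this help file describes under Common Problems rather than anything verified in DataProspector's code. Doubling an embedded back-tick is MySQL's own rule for quoted identifiers.

```java
// Hypothetical helper, not part of DataProspector.
final class HeaderNames {

    // Turns a raw header into a column name for the given engine.
    static String prepare(String header, String engine) {
        // Spaces become underscores for every engine.
        String name = header.trim().replace(' ', '_');
        if ("mysql".equals(engine)) {
            // MySQL accepts almost any name inside back-ticks;
            // an embedded back-tick is escaped by doubling it.
            return "`" + name.replace("`", "``") + "`";
        }
        // Other engines: leave the name bare, so awkward names still
        // have to be renamed by hand before importing.
        return name;
    }
}
```

With this sketch, "X^2" becomes `X^2` for MySQL and stays X^2 for any other engine.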

A final note about the clipboard features - they present an easy way to make a copy of a database table within DataProspector itself. Just select all records (Ctrl+A), copy with header, then paste to a new table with header under a different table name.

*** Direct SQL Entries

This is a deep topic, but you may enter virtually any SQL statement into the SQL text window. Your past entries are preserved for you in the dropdown list, and this list is maintained between program runs.

A useful SQL entry when using MySQL is:

describe (tablename)

This will create a table showing the properties of the named table.

If your SQL entry does not create output, be sure to precede it with ">", like this:

>drop table MyStuff

This ">" character is used by DataProspector to distinguish between SQL "update" and "query" class commands.
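
The distinction matters because JDBC itself has separate calls for the two classes of statement: Statement.executeQuery for statements that return rows and Statement.executeUpdate for those that do not. A minimal sketch of the ">" convention follows, with a hypothetical class name; how DataProspector actually handles the prefix internally is an assumption here, including the tolerance for surrounding spaces.

```java
// Hypothetical helper, not part of DataProspector.
final class SqlEntry {

    // True when the entry is marked with ">" as an "update" class command.
    static boolean isUpdate(String entry) {
        return entry.trim().startsWith(">");
    }

    // Returns the SQL text with the ">" marker, if any, removed.
    static String sql(String entry) {
        String text = entry.trim();
        return text.startsWith(">") ? text.substring(1).trim() : text;
    }
}
```

A caller would then pass sql(entry) to executeUpdate when isUpdate(entry) is true, and to executeQuery otherwise.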

Here are some more useful MySQL queries (they are specific to MySQL, not generic SQL):

show tables
show databases

Remember, though, that you normally choose a database and table at the Selection pane - and that is easier.

It is in this way that DataProspector's SQL entry window and table display serve as a nice SQL teaching tool. Obviously learning SQL is a huge topic entirely beyond the scope of this help file.

Now, just one more SQL example. I have a large dictionary set up as a database, and it is fun to play with. Here is my SQL palindrome detector:

select * from dictionary where word = reverse(word) and length(word) > 4;

Output:

civic, dewed, kayak, level, ma'am, madam, minim, radar, redder, refer, rever, reviver, rotator, rotor, semmes, sexes, solos, tenet, terret

Cool, huh?

*** Arcana

1. If you get into a lock situation, where there is a DataProspector problem you cannot solve, simply delete this file:

(user home)/.DataProspector/DataProspector.ini

If you delete this file, DataProspector will recreate one with default values on the next run.

2. DataProspector tries to move gracefully between the MySQL and PostgreSQL database engines, and the transition code is under your control. (The task is made worse by a problem with the PostgreSQL JDBC driver at the time of writing, 3/2003.)

Place the above listed configuration file in your favorite text editor and you will see some peculiar definitions, like this:

ShowDatabases_mysql=show databases;
ShowDatabases_postgresql=select datname from pg_database;
ShowTables_mysql=show tables;
ShowTables_postgresql=select tablename from pg_tables where tablename not like 'pg_%';

The idea of these entries is that a particular string value is chosen based on the host prefix:

jdbc:mysql://hostname

To decide which command to carry out, the section between the first two colons ":" ("mysql" in this example) is extracted and appended to a prefix string name such as "ShowDatabases_", giving the entry name "ShowDatabases_mysql". The hope is that, if you change from MySQL to PostgreSQL database engines, DataProspector will change its behavior to accommodate the different syntax requirements of the two engines, on the fly.
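
The lookup just described can be sketched in a few lines of Java. This is a sketch under assumptions, not DataProspector's actual code: the class and method names are hypothetical, and the configuration is modeled as a java.util.Properties object, which fits the key=value entries shown above but is not confirmed by this help file.

```java
import java.util.Properties;

// Hypothetical helper, not part of DataProspector.
final class EngineCommands {

    // Extracts the engine name from a path such as jdbc:mysql://hostname.
    static String engineOf(String jdbcPath) {
        String[] parts = jdbcPath.split(":", 3);
        if (parts.length < 3 || !"jdbc".equals(parts[0])) {
            throw new IllegalArgumentException("Not a JDBC path: " + jdbcPath);
        }
        return parts[1]; // the text between the first two colons
    }

    // Looks up an entry such as ShowTables_mysql; returns null when the
    // configuration has no entry for this engine.
    static String command(Properties config, String prefix, String jdbcPath) {
        return config.getProperty(prefix + "_" + engineOf(jdbcPath));
    }
}
```

A null result is the signal that the configuration file needs a new entry for that engine.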

If you decide to try to adapt DataProspector to some other database engine about which I know nothing, you can edit this configuration file to include custom commands of your own.

Let's say there is a new database engine "newengine," and the host string looks like this:

jdbc:newengine://hostname

To accommodate this engine, you could place some new strings into the configuration file:

ShowDatabases_newengine=(SQL to list databases);
ShowTables_newengine=(SQL to list tables);

And so forth. I always try to make my programs as user-customizable as possible.

*** Common Problems

1. As I have explained above, at the time of writing (3/2003), the current version of the PostgreSQL JDBC driver is buggy and broken - you can create tables, but you cannot update them. I will be testing any new versions of this driver that are released, and I will try to make DataProspector work fully with both PostgreSQL and MySQL. At the moment, it works fully only with MySQL.

2. After only a few days of experience, I see one problem area. I have converted a lot of spreadsheets into database tables using the slick DataProspector import methods, but one problem comes up again and again - one of the column headings will contain unacceptable characters or a reserved word like index, and this will break the import. Unfortunately, the error messages provided by the SQL engine are sometimes not very revealing.

This is ordinarily not a problem with the MySQL database engine because, as explained earlier, DataProspector places back-ticks around the column headings, which allows most any string to be used as a column name.

But for other engines, to solve the problem, use this procedure:

a. Try locating column headings that look funny - ones that have non-letter characters, or that might be reserved words like "index". If you find any, rename them before importing. Spaces in header names are not a problem, because DataProspector automatically replaces each space with an underscore.

b. Try importing the table without a header (DataProspector creates a default header in this case). If that works, you at least know where the problem is. And in this case you can simply rename the column headers rather than importing again.

*** User support

Because DataProspector is free (but please visit www.arachnoid.com/careware anyway), there is no user support. This program is very easy to use and common sense should be able to stand in for user support.

If you detect a bug in DataProspector, please report it at www.arachnoid.com/messages. Make sure what you are reporting is in fact a bug. :)

-- P. Lutus, Port Hadlock, WA