PgDBF is a program for converting XBase databases - particularly FoxPro tables with memo files - into a format that PostgreSQL can directly import. It is the successor to the XBaseToPg project, which works well but is comparatively slow and big and complicated. The smallest changes touched lots of files, and the scope of that project is far larger than I actually needed on a daily basis.
Thus, PgDBF. It's a small C project comprising two files with no dependencies other than standard Unix libraries. While the project is relatively tiny and simple, it's also heavily optimized via profiling. I routinely got benchmark results that were many times faster than with XBaseToPg. In fact, even on slower systems, conversions were typically limited by hard drive speed.
Features
PgDBF was designed with a few core principles:
- Simplicity. This code should be understandable by anyone who wants to hack it.
- Robustness. Every syscall is checked for success.
- Speed. I wanted the fastest conversion available anywhere.
- Completeness. It has full support for FoxPro memo files.
- Portable. PgDBF runs on 32- and 64-bit systems, and both little-endian (eg x86) and big-endian (eg PowerPC) architectures.
Installation
$ cc -O2 pgdbf.c -o pgdbf
Then, copy it somewhere in your $PATH. I really meant the "simple" part. I use a Makefile locally that copies the binary to where I needed it to be, but it's guaranteed to be in the wrong place for you.
Usage
Use pgdbf to convert your XBase tables into a format suitable for piping into the psql client. pgdbf will generate the commands necessary to create near-exact replicas of your tables. Its output looks something like:
BEGIN; SET statement_timeout=60000; DROP TABLE invoice; SET statement_timeout=0; CREATE TABLE invoice (invoiceid INTEGER, note VARCHAR(50), billdate DATE, submitted TIMESTAMP, memofield TEXT, approved BOOLEAN); \COPY invoice FROM STDIN ... ... ... \. COMMIT; CREATE INDEX invoice_invoiceid ON invoice(invoiceid);
Note that the entire process is wrapped inside a transaction so that other clients will have access to the old data until the transaction is completed.
Indices are automatically created if you specify the columns (or expressions!) you want indexed on the command line. For example,
pgdbf foo.dbf rowid "substr(textfield,1,4)" price
will create three indices on the new foo table: one each for the rowid and price columns, and one for the substr() expression. It tries to give each index a reasonable name.
Differences from XBaseToPg
I changed datestamps to print as their Julian day to avoid limitations in standard time library functions.
This is a brand new implementation with no code common to XBaseToPg, other than a few lines I'd originally written myself to add PostgreSQL support to that project. I found some XBase file format specs and started from the ground up.
Bugs
Filenames are case sensitive. Memo files must be named the same as their associated data files, but with the extension .fpt. For example, if the data file is named FooTABle.DBF, the memo file must be named FooTABle.fpt. This is more of a Unix pattern than a bug, but may catch Windows admins off-guard.
When multiple incompatible interpretations of a type are available, such as the 'B' type which can mean 'binary object' in dBASE V or 'double-precision float' in FoxPro, I've used the FoxPro version.
Not all XBase datatypes are supported right now. As of this writing, PgDBF can handle boolean, currency, date, double-precision float, float, general (although only outputs empty strings; I don't know how to resolve OLE objects at this time), integer, memo, numeric, timestamp, and varchar fields. If you need other datatypes, send me a small sample database that I can test.
Releases
| Official releases | Date | Size | Links | Status | |
|---|---|---|---|---|---|
| 0.3.3 | 2008-Nov-07 | 18.81 KB | Recommended for all | ||
Post new comment