Database Connector
The Database Connector extracts content from any JDBC-compatible database by executing SQL queries and mapping each result row to a searchable document. It supports Oracle, PostgreSQL, MariaDB, MySQL, and any database with a JDBC driver.
How It Works
- Execute the configured SQL query against the database
- Iterate through the result set
- Map each column to a search index field based on the attribute mapping
- Create a Job Item with the mapped fields
- Submit to the pipeline in configurable chunks
- Repeat until all rows are processed
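The mapping-and-chunking loop above can be sketched in plain Java. This is an illustrative sketch only; the class and method names (`RowMappingSketch`, `mapRow`, `chunk`) are hypothetical and not part of the actual Dumont API.

```java
import java.util.*;

// Illustrative sketch of the connector's row-processing loop
// (names here are hypothetical, not the real Dumont classes).
public class RowMappingSketch {

    // Map one result-set row (column name -> value) to an index document,
    // renaming columns according to the attribute mapping.
    static Map<String, Object> mapRow(Map<String, Object> row,
                                      Map<String, String> attributeMapping) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Object> col : row.entrySet()) {
            String field = attributeMapping.getOrDefault(col.getKey(), col.getKey());
            doc.put(field, col.getValue());
        }
        return doc;
    }

    // Split mapped documents into chunks of a configurable size before
    // submitting each chunk to the indexing pipeline.
    static List<List<Map<String, Object>>> chunk(List<Map<String, Object>> docs,
                                                 int size) {
        List<List<Map<String, Object>>> chunks = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += size) {
            chunks.add(docs.subList(i, Math.min(i + size, docs.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = Map.of("name", "title", "description", "text");
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Widget");
        row.put("description", "A useful widget");

        Map<String, Object> doc = mapRow(row, mapping);
        System.out.println(doc.get("title")); // Widget

        List<Map<String, Object>> docs = new ArrayList<>(Collections.nCopies(5, doc));
        System.out.println(chunk(docs, 2).size()); // 3 chunks: 2 + 2 + 1
    }
}
```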
Key Features
| Feature | Description |
|---|---|
| Any JDBC database | Oracle, PostgreSQL, MariaDB, MySQL, and any JDBC-compatible source |
| SQL flexibility | Use any SELECT query — joins, aggregations, subqueries, functions |
| External SQL files | Load queries from files using the file:// protocol |
| Batch processing | Configurable chunk size for memory-efficient processing |
| Max content size | Limits per-document content size (default: 5 MB) to prevent oversized documents |
| Standalone CLI | Run imports from the command line without the full Dumont DEP application |
| Custom extensions | DumDbExtCustomImpl interface for custom row processing logic |
| Locale support | Configurable default locale for all extracted documents |
CLI Parameters
The standalone Database Connector accepts the following command-line parameters:
| Parameter | Required | Default | Description |
|---|---|---|---|
| --driver | Yes | — | JDBC driver class name |
| --connect | Yes | — | JDBC connection string |
| --query / -q | Yes | — | SQL query or file://path/to/query.sql |
| --site | Yes | — | Target Semantic Navigation Site name |
| --server / -s | Yes | — | Dumont DEP server URL |
| --api-key / -a | Yes | — | API key for authentication |
| --locale | No | en_US | Default locale for documents |
| --chunk / -z | No | (varies) | Batch size for processing |
| --max-content-size | No | 5 | Maximum content size in MB |
| --deindex-before-importing | No | false | Remove all existing documents before import |
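The parameters above could be collected by a small argument parser along these lines. This is a hedged sketch, not the real DumDbImportTool parser: the class name `DbImportArgs` and the chunk default of 100 are assumptions for illustration, while the flag names and documented defaults (en_US, 5 MB, false) come from the table.

```java
import java.util.Locale;

// Hypothetical sketch of parsing the Database Connector's CLI parameters.
// Flag names match the documentation; the parser itself is illustrative.
public class DbImportArgs {
    String driver, connect, query, site, server, apiKey;
    Locale locale = Locale.US;                 // --locale, default en_US
    int chunk = 100;                           // --chunk (assumed default)
    long maxContentBytes = 5L * 1024 * 1024;   // --max-content-size, default 5 MB
    boolean deindexBeforeImporting = false;    // --deindex-before-importing

    static DbImportArgs parse(String[] argv) {
        DbImportArgs a = new DbImportArgs();
        for (int i = 0; i < argv.length; i++) {
            switch (argv[i]) {
                case "--driver" -> a.driver = argv[++i];
                case "--connect" -> a.connect = argv[++i];
                case "--query", "-q" -> a.query = argv[++i];
                case "--site" -> a.site = argv[++i];
                case "--server", "-s" -> a.server = argv[++i];
                case "--api-key", "-a" -> a.apiKey = argv[++i];
                case "--locale" ->
                    a.locale = Locale.forLanguageTag(argv[++i].replace('_', '-'));
                case "--chunk", "-z" -> a.chunk = Integer.parseInt(argv[++i]);
                case "--max-content-size" ->
                    a.maxContentBytes = Long.parseLong(argv[++i]) * 1024 * 1024;
                case "--deindex-before-importing" -> a.deindexBeforeImporting = true;
                default -> throw new IllegalArgumentException("Unknown option: " + argv[i]);
            }
        }
        return a;
    }
}
```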
Example: Indexing a Product Catalog
SQL Query
SELECT
p.id,
p.name AS title,
p.description AS text,
p.price,
c.name AS category,
p.updated_at AS date,
CONCAT('https://shop.example.com/products/', p.slug) AS url
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.active = 1
Standalone Import
java -cp dumont-db.jar com.viglet.dumont.db.DumDbImportTool \
--server http://localhost:30130 \
--api-key your-api-key \
--driver org.mariadb.jdbc.Driver \
--connect "jdbc:mariadb://localhost:3306/shop?user=reader&password=secret" \
--query "file:///opt/queries/products.sql" \
--site ProductCatalog \
--locale en_US \
--chunk 100
This will:
- Connect to the MariaDB database
- Execute the SQL query from the external file
- Map each row to a document (columns become fields)
- Send documents in batches of 100 to the ProductCatalog Semantic Navigation Site
Supported Databases
| Database | Driver | Connection String Example |
|---|---|---|
| MariaDB | org.mariadb.jdbc.Driver | jdbc:mariadb://host:3306/dbname |
| MySQL | com.mysql.cj.jdbc.Driver | jdbc:mysql://host:3306/dbname |
| PostgreSQL | org.postgresql.Driver | jdbc:postgresql://host:5432/dbname |
| Oracle | oracle.jdbc.OracleDriver | jdbc:oracle:thin:@host:1521/service |
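A small helper can assemble the connection strings from the table above. The vendor prefixes and Oracle's thin-driver format are as documented; the helper itself (`JdbcUrls.jdbcUrl`) is an illustrative convenience, not part of the connector's API.

```java
// Illustrative helper that builds the JDBC URLs shown in the table.
// Not part of the Dumont Database Connector itself.
public class JdbcUrls {

    static String jdbcUrl(String vendor, String host, int port, String database) {
        // Oracle thin URLs use "@host:port/service" rather than "//host:port/db".
        if (vendor.equals("oracle")) {
            return "jdbc:oracle:thin:@" + host + ":" + port + "/" + database;
        }
        return "jdbc:" + vendor + "://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl("mariadb", "localhost", 3306, "shop"));
        // jdbc:mariadb://localhost:3306/shop
        System.out.println(jdbcUrl("oracle", "dbhost", 1521, "ORCLPDB1"));
        // jdbc:oracle:thin:@dbhost:1521/ORCLPDB1
    }
}
```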
Tips
- Use aliases in your SQL query to match the field names expected by the Dumont DEP index (e.g., SELECT name AS title).
- External SQL files (file:// protocol) keep complex queries maintainable and version-controlled.
- De-index before importing (--deindex-before-importing) is useful for full refreshes: it removes stale documents that no longer exist in the database.
- Chunk size affects memory usage: larger chunks are faster but use more memory. Start with 100 and adjust based on document size.
Customizing the Database Connector
Need to transform or enrich rows during import? See Extending the Database Connector for the DumDbExtCustomImpl interface and step-by-step guide.