We are pleased to introduce you the new development version of Relational pipes. This release brings an abstraction layer (ODBC) in the SQL transformation + several smaller improvements.
relpipe-tr-sql
module: see details belowrelpipe-in-jack
module: see details belowrelpipe-out-gui
module: use Ctrl+PgUp and Ctrl+PgDown to switch panels (relations) and Ctrl+Q to quitrelpipe-out-xhtml
command: number of records is printed under the table (this command part of relpipe-out-xml
, not a standalone module)See the examples and screenshots pages for details.
Please note that this is still a development release and thus the API (libraries, CLI arguments, formats) might and will change. Any suggestions, ideas and bug reports are welcome in our mail box.
Former versions of relpipe-tr-sql
were tied to SQLite
and user had no option to change the SQL engine.
However great SQLite is (and we are very thankful for it), having some particular DBMS (database management system) hard-coded in our program is too constraining.
So we added an abstraction layer (ODBC) and get rid of the direct dependency on SQLite.
Now any DBMS can be used with Relational pipes.
ODBC (Open Database Connectivity) is an industry standard and provides API for accessing a DBMS. In late 80s several vendors (mostly from the Unix and database communities) established the SQL Access Group (SAG) and then specified the Call Level Interface (CLI). ODBC, which is based on CLI, was published in early 90s. ODBC is available on many operating systems and there are at least two free software implementations: unixODBC and iODBC. We use unixODBC for development and testing. Future releases of Relational pipes should be tested also with other implementations and various database drivers.
SQLite remains the default option
(in the C++ implementation, while Java or other implementations may have different default and may use different abstraction layer like JDBC).
We count on SQLite for future releases. It is the simplest way to get full SQL power in your relational pipeline.
However, relpipe-tr-sql
do not depend on SQLite and can be installed without it (and then used e.g. with PostgreSQL driver).
Using different DBMS makes sense for two main reasons:
cat source-data.csv | relpipe-in-csv \
| relpipe-tr-sql \
--data-source-name "MyDatabaseServer" \
--relation "transformed_data" "
SELECT
some_csv_field AS id,
our_special_function(some_other, third_one) AS result
FROM csv" \
| relpipe-out-xml \
| xsltproc template.xsl - > some-fancy-report.xhtml
# or just: | relpipe-out-xhtml > some-generic-report.xhtml
relpipe-tr-sql
and relpipe-in-sql
can be used as a generic database clients
and are able to load relational data to and from any DBMS.
We can also write a pipeline to transfer data between two different DBMS, do some ETL (extract, transform, load) tasks
or just cache some result sets from a remote database in our local SQLite file.
We can cache e.g. some codelist tables or other data for offline use:
relpipe-in-sql \
--data-source-name "MyCompanyDatabase" \
--relation "country" "SELECT * FROM country" \
--relation "currency" "SELECT * FROM currency" \
--relation "exchange_rate" "SELECT * FROM exchange_rate WHERE …" \
--relation "phonebook" "SELECT * FROM phonebook" \
| relpipe-tr-sql \
--data-source-string 'Driver=SQLite3;Database=file:MyCachedCompanyData.sqlite'
n.b. However it still looks like executing a local command, we should be aware that while using a remote data source, our data travel to given remote server – this impacts performance and our privacy. Never use untrustworthy remote server for processing sensitive data (even if using just a temporary schema or tables). If SQLite is „too small“ then PostgreSQL installed on localhost is usually a good option.
There are ODBC drivers for any conceivable database system.
We can also write a custom driver for any other resource and just plug it in Relational pipes
without recompiling (a driver is a shared library – simply an .so
file).
This release also comes with better diagnostics. This feature is not specific to ODBC, but was implemented during the rewrite of the database layer. So if we make a mistake in our query or try to create a table with the same name as already exists in the DB, we will get a useful message with detailed description of the problem (instead of a pointless failure notice in the previous version).
The new implementation of relpipe-tr-sql
is still a bit raw and will be tuned in the upcoming versions,
but it seems working quite well (with SQLite, PostgreSQL and MySQL on GNU/Linux).
As always, testers are welcomed.
More details in the example: Accessing SQLite, PostgreSQL and MySQL through ODBC.
A powerful audio system called JACK allows us to build pipelines consisting of audio interfaces, players, recorders, filters and effects… and route sound streams (both PCM and MIDI) through them. MIDI messages can come from keyboards or other hardware MIDI controllers or from MIDI players and other software. Sometimes it is useful to check what is happening under the hood and examine particular MIDI messages instead of just playing them on a sound module or synthesizer. Now we can bridge two seemingly unrelated worlds: real-time audio and relational pipes.
We can join the JACK graph with relpipe-in-jack
command.
It does not consume STDIN, it gets events from JACK instead, so no other input data are needed.
More details in the example: Monitoring MIDI messages using JACK.
The options --file
and --file-keep
in relpipe-tr-sql
(and relpipe-in-sql
, which is an alias for the same binary)
have been dropped.
These options were specific to SQLite and make no sense now, when we do not depend on particular DBMS and can use any engine for SQL processing
(even a remote one somewhere on the network that could not reach our local files).
However SQLite is still the default option and the:
relpipe-tr-sql --file 'myDatabase.sqlite'
can be simply replaced by:
relpipe-tr-sql --data-source-string 'Driver=SQLite3;Database=file:myDatabase.sqlite'
Bash-completion works and will suggest even the Driver=SQLite3;Database=file:
part, so it is not necessary to memorize the connection string.
Frequently used databases can be configured in the ~/.odbc.ini
file and then referenced just by their names using --data-source-name
(the data source names – DSN – are also suggested by Bash-completion).
There is no built-in replacement for the --file-keep
option.
But if the user wants to create a temporary file and delete it at the end of the transformation,
he can simply add rm -f myDatabase.sqlite
to his script.
Instalation was tested on Debian GNU/Linux 10.2. The process should be similar on other distributions.
# Install dependencies as root:
apt install g++ make cmake mercurial pkg-config
apt install libxerces-c-dev # needed only for relpipe-in-xml module
apt install guile-2.2-dev # needed only for relpipe-tr-guile module
apt install gawk # needed only for relpipe-tr-awk module
apt install libxml++2.6-dev # needed only for relpipe-in-xmltable module
apt install unixodbc-dev # needed only for relpipe-tr-sql module
apt install libsqliteodbc # needed only for relpipe-tr-sql module if used with SQLite
apt install odbc-postgresql # needed only for relpipe-tr-sql module if used with PostgreSQL
apt install libjack-jackd2-dev # needed only for relpipe-in-jack module
# Run rest of installation as a non-root user:
export RELPIPE_VERSION="v0.16"
export RELPIPE_SRC=~/src
export RELPIPE_BUILD=~/build
export RELPIPE_INSTALL=~/install
export PKG_CONFIG_PATH="$RELPIPE_INSTALL/lib/pkgconfig/:$PKG_CONFIG_PATH"
export PATH="$RELPIPE_INSTALL/bin:$PATH"
rm -rf "$RELPIPE_BUILD"/relpipe-*
mkdir -p "$RELPIPE_SRC" "$RELPIPE_BUILD" "$RELPIPE_INSTALL"
# Helper functions:
relpipe_download() { for m in "$@"; do cd "$RELPIPE_SRC" && ([[ -d "relpipe-$m.cpp" ]] && hg pull -R "relpipe-$m.cpp" && hg update -R "relpipe-$m.cpp" "$RELPIPE_VERSION" || hg clone -u "$RELPIPE_VERSION" https://hg.globalcode.info/relpipe/relpipe-$m.cpp) || break; done; }
relpipe_install() { for m in "$@"; do cd "$RELPIPE_BUILD" && mkdir -p relpipe-$m.cpp && cd relpipe-$m.cpp && cmake -DCMAKE_INSTALL_PREFIX:PATH="$RELPIPE_INSTALL" "$RELPIPE_SRC/relpipe-$m.cpp" && make && make install || break; done; }
# Download all sources:
relpipe_download lib-common lib-reader lib-writer lib-cli lib-xmlwriter in-cli in-fstab in-xml in-xmltable in-csv in-filesystem in-recfile out-gui.qt out-nullbyte out-ods out-tabular out-xml out-csv out-asn1 out-recfile tr-cut tr-grep tr-python tr-sed tr-validator tr-guile tr-awk tr-sql in-jack
# Optional: At this point, we have all dependencies and sources downloaded, so we can disconnect this computer from the internet in order to verify that our build process is sane, deterministic and does not depend on any external resources.
# Build and install libraries:
relpipe_install lib-common lib-reader lib-writer lib-cli lib-xmlwriter
# Build and install tools:
relpipe_install in-fstab in-cli in-fstab in-xml in-xmltable in-csv in-recfile tr-cut tr-grep tr-sed tr-guile tr-awk tr-sql out-nullbyte out-ods out-tabular out-xml out-csv out-asn1 out-recfile in-filesystem in-jack
# Load Bash completion scripts:
for c in "$RELPIPE_SRC"/relpipe-*/bash-completion.sh ; do . "$c"; done
# Enable streamlet examples:
export RELPIPE_IN_FILESYSTEM_STREAMLET_PATH="$RELPIPE_SRC"/relpipe-in-filesystem.cpp/streamlet-examples/
# Clean-up:
unset -f relpipe_install
unset -f relpipe_download
unset -v RELPIPE_VERSION
unset -v RELPIPE_SRC
unset -v RELPIPE_BUILD
unset -v RELPIPE_INSTALL
# List configured ODBC data sources:
relpipe-in-sql --list-data-sources | relpipe-out-tabular
# n.b. since v0.18 there is a boolean parameter i.e. --list-data-sources true
Download: examples/release-v0.16.sh
Relational pipes are modular thus you can download and install only parts you need (the libraries are needed always).
Tools out-gui.qt
and tr-python
require additional libraries and are not built by default.
Relational pipes, open standard and free software © 2018-2022 GlobalCode