chewdata 1.7.0

Extract Transform and Load data

This application is a simple ETL written in Rust that can be used as a connector between systems.

  • It handles multiple formats: JSON, JSONL, CSV, TOML, XML, YAML and text
  • It can read/write data from:
    • MongoDB databases
    • S3/MinIO with versioning & select
    • HTTP(S) APIs with several authenticators: Basic, Bearer, JWT
    • The local filesystem
    • Relational databases like PostgreSQL (not yet)
    • Message brokers (not yet)
  • It needs only rustup
  • No garbage collector
  • Parallel processing
  • Multi-platform
  • Uses Tera templates to configure the actions for the data transformation
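A transformation step is driven by Tera templates inside the ETL configuration. A minimal sketch of such a pipeline is shown below; the `transformer` action fields (`field`, `pattern`) are assumptions based on the project's examples, so check ./examples for the exact syntax:

```json
[
    {"type": "reader", "connector": {"type": "io"}, "document": {"type": "json"}},
    {"type": "transformer", "actions": [
        {"field": "full_name", "pattern": "{{ input.first_name }} {{ input.last_name }}"}
    ]},
    {"type": "writer"}
]
```

Each step consumes the output of the previous one, so the writer emits the documents produced by the transformer.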

The goal of this project is to simplify the work of developers and the connections between systems. The work is not finished, but I hope it will be useful to you.

Getting started

Setup from source code

Requirements: rustup, git and make.

Commands to execute:

git clone https://github.com/jmfiaschi/chewdata.git chewdata
cd chewdata
cp .env.dev .env
vim .env # edit the .env file
make build
make unit-tests
make integration-tests

If all the tests pass, the project is ready. Read the Makefile to see which shortcuts you can use.

If you want some examples to discover this project, have a look at ./examples.

Run the ETL

If you run the program without parameters, it waits for JSON data on standard input. By default, it writes JSON data to standard output and stops when you press the 'Enter' key multiple times.

$ cargo run 
$ [{"key":"value"},{"name":"test"}]
$ exit
[{"key":"value"},{"name":"test"}]

Another example, without an ETL configuration and with a file as input:

$ cat ./data/multi_lines.json | cargo run 
[{...}]

or

$ cat ./data/multi_lines.json | make run 
[{...}]

Another example, with a JSON ETL configuration as argument:

$ cat ./data/multi_lines.csv | cargo run '[{"type":"reader","document":{"type":"csv"}},{"type":"writer"}]'
[{...}] # the CSV data transformed into JSON format

or

$ cat ./data/multi_lines.csv | make run json='[{\"type\":\"reader\",\"document\":{\"type\":\"csv\"}},{\"type\":\"writer\"}]'
[{...}] # the CSV data transformed into JSON format

Another example, with an ETL configuration file as argument:

$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | cargo run -- --file my_etl.conf.json
[{...}]

or

$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]

It is possible to use aliases and default values to shorten the configuration:

$ echo '[{"type":"r","doc":{"type":"csv"}},{"type":"w"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]
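The same configuration format covers the remote connectors listed above. As a hedged sketch, a reader could pull from an HTTP API with a `curl`-style connector; the connector type and its `endpoint`/`path`/`method` fields here are assumptions inferred from the feature list, not verified field names, so check the documentation before use:

```json
[
    {"type": "reader", "connector": {"type": "curl", "endpoint": "https://api.example.com", "path": "/users", "method": "get"}, "document": {"type": "json"}},
    {"type": "writer"}
]
```

Saved as a file, it would run the same way as the previous examples: `cargo run -- --file my_api.conf.json`.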

How to contribute

In progress...

After code modifications, please run all tests.

make test

Useful links