odbc_api/
guide.rs

1#![allow(clippy::needless_doctest_main)]
2/*!
3# Introduction to `odbc-api` (documentation only)
4
5## About ODBC
6
7ODBC is an open standard which allows you to connect to various data sources. Most of these data sources are databases, but ODBC drivers are also available for various file types, such as Excel or CSV.
8
9Your application does not link against a driver, but instead links against an ODBC driver manager, which must be installed on the system where you intend to run the application. On modern Windows platforms, ODBC is always installed; on OS-X or Linux distributions, a driver manager like [unixODBC](http://www.unixodbc.org/) must be installed.
10
11To connect to a data source, a driver for the specific data source in question must be installed. On Windows, you can type 'ODBC Data Sources' into the search box to start a little GUI which shows you the various drivers and preconfigured data sources on your system.
12
13However, this is not a guide on how to configure and set up ODBC. This is a guide on how to use the Rust bindings for applications that want to utilize ODBC data sources.
14
15## Quickstart
16
17```no_run
18//! A program executing a query and printing the result as csv to standard out. Requires
19//! `anyhow` and `csv` crate.
20
21use anyhow::Error;
22use odbc_api::{buffers::TextRowSet, Cursor, Environment, ConnectionOptions, ResultSetMetadata};
23use std::{
24    ffi::CStr,
25    io::{stdout, Write},
26    path::PathBuf,
27};
28
29/// Maximum number of rows fetched with one row set. Fetching batches of rows is usually much
30/// faster than fetching individual rows.
31const BATCH_SIZE: usize = 5000;
32
33fn main() -> Result<(), Error> {
34    // Write csv to standard out
35    let out = stdout();
36    let mut writer = csv::Writer::from_writer(out);
37
38    // If you do not do anything fancy it is recommended to have only one Environment in the
39    // entire process.
40    let environment = Environment::new()?;
41
42    // Connect using a DSN. Alternatively we could have used a connection string
43    let mut connection = environment.connect(
44        "DataSourceName",
45        "Username",
46        "Password",
47        ConnectionOptions::default(),
48    )?;
49
50    // Execute a one of query without any parameters.
51    match connection.execute("SELECT * FROM TableName", (), None)? {
52        Some(mut cursor) => {
53            // Write the column names to stdout
54            let mut headline : Vec<String> = cursor.column_names()?.collect::<Result<_,_>>()?;
55            writer.write_record(headline)?;
56
57            // Use schema in cursor to initialize a text buffer large enough to hold the largest
58            // possible strings for each column up to an upper limit of 4KiB.
59            let mut buffers = TextRowSet::for_cursor(BATCH_SIZE, &mut cursor, Some(4096))?;
60            // Bind the buffer to the cursor. It is now being filled with every call to fetch.
61            let mut row_set_cursor = cursor.bind_buffer(&mut buffers)?;
62
63            // Iterate over batches
64            while let Some(batch) = row_set_cursor.fetch()? {
65                // Within a batch, iterate over every row
66                for row_index in 0..batch.num_rows() {
67                    // Within a row iterate over every column
68                    let record = (0..batch.num_cols()).map(|col_index| {
69                        batch
70                            .at(col_index, row_index)
71                            .unwrap_or(&[])
72                    });
73                    // Writes row as csv
74                    writer.write_record(record)?;
75                }
76            }
77        }
78        None => {
79            eprintln!(
80                "Query came back empty. No output has been created."
81            );
82        }
83    }
84
85    Ok(())
86}
87```
88
89## 32 Bit and 64 Bit considerations.
90
91To consider whether you want to work with 32 Bit or 64 Bit data sources is especially important
92for windows users, as driver managers (and possibly drivers) may both exist at the same time
93in the same system.
94
95In any case, depending on the platform part of your target triple either 32 Bit or 64 Bit
96drivers are going to work, but not both. On a private windows machine (even on a modern 64 Bit
97Windows) it is not unusual to find lots of 32 Bit drivers installed on the system, but none for
9864 Bits. So for windows users it is worth thinking about not using the default toolchain which
99is likely 64 Bits and to switch to a 32 Bit one. On other platforms you are usually fine
100sticking with 64 Bits, as there are not usually any drivers preinstalled anyway, 64 Bit or
101otherwise.
102
103No code changes are required, so you can also just build both if you want to.
104
105## Connecting to a data source
106
107### Setting up the ODBC Environment
108
109To connect with a data source we need a connection. To create a connection we need an ODBC
110environment.
111
112```no_run
113use odbc_api::Environment;
114
115let env = Environment::new()?;
116
117# Ok::<(), odbc_api::Error>(())
118```
119
120This is it. Our ODBC Environment is ready for action. We can use it to list data sources and
121drivers, but most importantly we can use it to create connections.
122
123These bindings currently support two ways of creating a connections:
124
125### Connect using a connection string
126
127Connection strings do not require that the data source is preconfigured by the driver manager
128this makes them very flexible.
129
130```no_run
131use odbc_api::{ConnectionOptions, Environment};
132
133let env = Environment::new()?;
134
135let connection_string = "
136    Driver={ODBC Driver 18 for SQL Server};\
137    Server=localhost;\
138    UID=SA;\
139    PWD=My@Test@Password1;\
140";
141
142let mut conn = env.connect_with_connection_string(connection_string, ConnectionOptions::default())?;
143# Ok::<(), odbc_api::Error>(())
144```
145
146There is a syntax to these connection strings, but few people go through the trouble to learn
147it. Most common strategy is to google one that works for with your data source. The connection
148borrows the environment, so you will get a compiler error, if your environment goes out of scope
149before the connection does.
150
151> You can list the available drivers using [`crate::Environment::drivers`].
152
153### Connect using a Data Source Name (DSN)
154
155Should a data source be known by the driver manager we can access it using its name and
156credentials. This is more convenient for the user or application developer, but requires a
157configuration of the ODBC driver manager. Think of it as shifting work from users to
158administrators.
159
160```no_run
161use odbc_api::{Environment, ConnectionOptions};
162
163let env = Environment::new()?;
164
165let mut conn = env.connect(
166    "YourDatabase", "SA", "My@Test@Password1",
167    ConnectionOptions::default()
168)?;
169# Ok::<(), odbc_api::Error>(())
170```
171
172How to configure such data sources is not the scope of this guide, and depends on the driver
173manager in question.
174
175> You can list the available data sources using [`crate::Environment::data_sources`].
176
177### Connection pooling
178
179ODBC specifies an interface to enable the driver manager to enable connection pooling for your
180application. It is off by default, but if you use ODBC to connect to your data source instead of
181implementing it in your application, or importing a library you may simply enable it in ODBC
182instead.
183Connection Pooling is governed by two attributes. The most important one is the connection
184pooling scheme which is `Off` by default. It must be set even before you create your ODBC
185environment. It is global mutable state on the process level. Setting it in Rust is therefore
186unsafe.
187
188The other one is changed via [`crate::Environment::set_connection_pooling_matching`]. It governs
189how a connection is chosen from the pool. It defaults to strict which means the `Connection`
190you get from the pool will have exactly the attributes specified in the connection string.
191
192Here is an example of how to create an ODBC environment with connection pooling.
193
194```
195use odbc_api::{Environment, environment, sys::{AttrConnectionPooling, AttrCpMatch}};
196
197fn main() {
198    // Enable connection pooling. Let driver decide whether the attributes of two connection
199    // are similar enough to change the attributes of a pooled one, to fit the requested
200    // connection, or if it is cheaper to create a new Connection from scratch.
201    // See <https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/driver-aware-connection-pooling>
202    //
203    // Safety: This call changes global mutable space in the underlying ODBC driver manager. Easiest
204    // to prove it is not causing a raise by calling it once right at the startup of your
205    // application.
206    unsafe {
207        Environment::set_connection_pooling(AttrConnectionPooling::DriverAware).unwrap();
208    }
209
210    // As long as `environment` is called for the first time **after** connection pooling is
211    // activated the static environment returned by it will have it activated.
212    let env = environment();
213    // ... use env to do db stuff
214}
215```
216
217## Executing SQL statements
218
219### Executing a single SQL statement
220
221With our ODBC connection all set up and ready to go, we can execute an SQL query:
222
223```no_run
224use odbc_api::{Environment, ConnectionOptions};
225
226let env = Environment::new()?;
227
228let mut conn = env.connect(
229    "YourDatabase", "SA", "My@Test@Password1",
230    ConnectionOptions::default()
231)?;
232let query = "SELECT year, name FROM Birthdays;";
233let parameters = (); // This query does not use any parameters.
234let timeout_sec = None;
235if let Some(cursor) = conn.execute(query, parameters, timeout_sec)? {
236    // Use cursor to process query results.
237}
238# Ok::<(), odbc_api::Error>(())
239```
240
241The first parameter of `execute` is the SQL statement text. The second parameter is used to pass
242arguments of the SQL Statements itself (more on that later). Ours has none, so we use `()` to
243not bind any arguments to the statement. You can learn all about passing parameters from the
244[`parameter module level documentation`](`crate::parameter`). It may feature an example for
245your use case.
246
247Note that the result of the operation is an `Option`. This reflects that not every statement
248returns a [`Cursor`](crate::Cursor). `INSERT` statements usually do not, but even `SELECT`
249queries which would return zero rows can depending on the driver return either an empty cursor
250or no cursor at all. Should a cursor exists, it must be consumed or closed. The `drop` handler
251of Cursor will close it for us. If the `Option` is `None` there is nothing to close, so is all
252taken care of, nice.
253
254### Executing a many SQL statements in sequence
255
256Execution of a statement, its bound parameters, its buffers and result set, are all managed by
257ODBC using a statement handle. If you call [`crate::Connection::execute`] a new one is allocated
258each time and the resulting cursor takes ownership of it (giving you an easier time with the
259borrow checker). In a use case there you want to execute multiple SQL Statements over the same
260connection in sequence, you may want to reuse an already allocated statement handle. This is
261what the [`crate::Preallocated`] struct is for. Please note that if you want to execute the same
262query with different parameters you can have potentially even better performance by utilizing
263prepared Statements ([`crate::Prepared`]).
264
265```
266use odbc_api::{Connection, Error};
267use std::io::{self, stdin, Read};
268
269fn interactive(conn: &Connection) -> io::Result<()>{
270    let mut statement = conn.preallocate().unwrap();
271    let mut query = String::new();
272    stdin().read_line(&mut query)?;
273    while !query.is_empty() {
274        match statement.execute(&query, ()) {
275            Err(e) => println!("{}", e),
276            Ok(None) => println!("No results set generated."),
277            Ok(Some(cursor)) => {
278                // ...print cursor contents...
279            }
280        }
281        stdin().read_line(&mut query)?;
282    }
283    Ok(())
284}
285```
286
287### Executing prepared queries
288
289Should your use case require you to execute the same query several times with different parameters,
290prepared queries are the way to go. These give the database a chance to cache the access plan
291associated with your SQL statement. It is not unlike compiling your program once and executing it
292several times.
293
294```
295use odbc_api::{Connection, Error, IntoParameter};
296use std::io::{self, stdin, Read};
297
298fn interactive(conn: &Connection) -> io::Result<()>{
299    let mut prepared = conn.prepare("SELECT * FROM Movies WHERE title=?;").unwrap();
300    let mut title = String::new();
301    stdin().read_line(&mut title)?;
302    while !title.is_empty() {
303        match prepared.execute(&title.as_str().into_parameter()) {
304            Err(e) => println!("{}", e),
305            // Most drivers would return a result set even if no Movie with the title is found,
306            // the result set would just be empty. Well, most drivers.
307            Ok(None) => println!("No result set generated."),
308            Ok(Some(cursor)) => {
309                // ...print cursor contents...
310            }
311        }
312        stdin().read_line(&mut title)?;
313    }
314    Ok(())
315}
316```
317
318### Asynchronous execution using polling mode
319
320ODBC supports asynchronous execution of queries. It offers two modes: Notification and Polling.
321Notification to the authors knowlegde is only supported by the Driver Manager shipping with
322Microsoft Windows. Polling mode is also supported by unixODBC. This crate only supports the Polling
323model.
324
325⚠️**Attention**⚠️: Your driver manager (e.g. UnixODBC) may support polling mode, your driver may not.
326The function call may succeed without an error, but actually block the thread during the entire
327operation. In a manual check performed by the author with the drivers of PostgreSQL, MariaDB, SQLite
328and Microsoft SQL Server on a windows machine only the Microsoft SQL Server driver actually returned
329control to the application during statement execution. It seems asynchrnous execution is not widely
330supported.
331
332Generic applications which want to support a wide array of drivers should use one or more system
333threads executing actors using the blocking ODBC api, and communicate the query results via channels
334to the rest of the async applications.
335
336This caveat aside, using the polling mode api is the only way to execute many statements at once
337with few system threads.
338
339Functions in this crate which support asynchronous execution using polling mode:
340
341* [`crate::Connection::execute_polling`]
342* [`crate::Preallocated::into_polling`]
343
344See also the ODBC reference on polling execution:
345<https://learn.microsoft.com/sql/odbc/reference/develop-app/asynchronous-execution-polling-method>
346
347## Fetching results
348
349ODBC offers two ways of retrieving values from a cursor over a result set. Row by row fetching and
350bulk fetching using application provided buffers.
351
352### Fetching results by filling application provided buffers
353
354The most efficient way to query results is not query an ODBC data source row by row, but to
355ask for a whole bulk of rows at once. The ODBC driver and driver manager will then fill these
356row sets into buffers which have been previously bound. This is also the most efficient way to
357query a single row many times for many queries, if the application can reuse the bound buffer.
358This crate allows you to provide your own buffers by implementing the [`crate::RowSetBuffer`]
359trait. That however requires `unsafe` code.
360
361This crate also provides three implementations of the [`crate::RowSetBuffer`] trait, ready to be
362used in safe code:
363
364* [`crate::buffers::ColumnarBuffer`]: Binds to the result set column wise. This is usually helpful
365  in dataengineering or data sciense tasks. This buffer type can be used in situations there the
366  schema of the queried data is known at compile time, as well as for generic applications which do
367  work with wide range of different data. Checkt the struct documentation for examples.
368* [`crate::buffers::TextRowSet`]: Queries all data as text bound in columns. Since the columns are
369  homogeneous, you can also use this, to iterate row wise over the buffer. Excellent if you want
370  to print the contents of a table, or are for any reason only interessted in the text
371  representation of the values.
372* [`crate::buffers::RowVec`]: A good choice if you know the schema at compile time and your
373  application logic is build in a row by row fashion, rather than column by column.
374
375You can read more about them in the documentation of the [`crate::buffers`] module.
376
377## Inserting values into a table
378
379### Inserting a single row into a table
380
381Inserting a single row can be done by executing a statement and binding the fields as parameters
382in a tuple.
383
384```no_run
385use odbc_api::{Connection, Error, IntoParameter};
386
387fn insert_birth_year(conn: &Connection, name: &str, year: i16) -> Result<(), Error>{
388    conn.execute(
389        "INSERT INTO Birthdays (name, year) VALUES (?, ?)",
390        (&name.into_parameter(), &year),
391        None,
392    )?;
393    Ok(())
394}
395```
396
397### Columnar bulk inserts
398
399Inserting values row by row can introduce a lot of overhead. ODBC allows you to perform either
400row or column wise bulk inserts. Especially in pipelines for data science you may already have
401buffers in a columnar layout at hand. [`crate::ColumnarBulkInserter`] can be used for bulk inserts.
402
403```no_run
404use odbc_api::{Connection, Error, IntoParameter, buffers::BufferDesc};
405
406fn insert_birth_years(conn: &Connection, names: &[&str], years: &[i16]) -> Result<(), Error> {
407
408    // All columns must have equal length.
409    assert_eq!(names.len(), years.len());
410
411    let prepared = conn.prepare("INSERT INTO Birthdays (name, year) VALUES (?, ?)")?;
412
413    // Create a columnar buffer which fits the input parameters.
414    let buffer_description = [
415        BufferDesc::Text { max_str_len: 255 },
416        BufferDesc::I16 { nullable: false },
417    ];
418    // The capacity must be able to hold at least the largest batch. We do everything in one go, so
419    // we set it to the length of the input parameters.
420    let capacity = names.len();
421    // Allocate memory for the array column parameters and bind it to the statement.
422    let mut prebound = prepared.into_column_inserter(capacity, buffer_description)?;
423    // Length of this batch
424    prebound.set_num_rows(capacity);
425
426
427    // Fill the buffer with values column by column
428    let mut col = prebound
429        .column_mut(0)
430        .as_text_view()
431        .expect("We know the name column to hold text.");
432
433    for (index, name) in names.iter().enumerate() {
434        col.set_cell(index, Some(name.as_bytes()));
435    }
436
437    let col = prebound
438        .column_mut(1)
439        .as_slice::<i16>()
440        .expect("We know the year column to hold i16.");
441    col.copy_from_slice(years);
442
443    prebound.execute()?;
444    Ok(())
445}
446```
447*/