odbc_api/guide.rs
1#![allow(clippy::needless_doctest_main)]
2/*!
3# Introduction to `odbc-api` (documentation only)
4
5## About ODBC
6
7ODBC is an open standard which allows you to connect to various data sources. Most of these data sources are databases, but ODBC drivers are also available for various file types, such as Excel or CSV.
8
9Your application does not link against a driver, but instead links against an ODBC driver manager, which must be installed on the system where you intend to run the application. On modern Windows platforms, ODBC is always installed; on OS-X or Linux distributions, a driver manager like [unixODBC](http://www.unixodbc.org/) must be installed.
10
11To connect to a data source, a driver for the specific data source in question must be installed. On Windows, you can type 'ODBC Data Sources' into the search box to start a little GUI which shows you the various drivers and preconfigured data sources on your system.
12
13However, this is not a guide on how to configure and set up ODBC. This is a guide on how to use the Rust bindings for applications that want to utilize ODBC data sources.
14
15## Quickstart
16
17```no_run
18//! A program executing a query and printing the result as csv to standard out. Requires
19//! `anyhow` and `csv` crate.
20
21use anyhow::Error;
22use odbc_api::{buffers::TextRowSet, Cursor, Environment, ConnectionOptions, ResultSetMetadata};
23use std::{
24 ffi::CStr,
25 io::{stdout, Write},
26 path::PathBuf,
27};
28
29/// Maximum number of rows fetched with one row set. Fetching batches of rows is usually much
30/// faster than fetching individual rows.
31const BATCH_SIZE: usize = 5000;
32
33fn main() -> Result<(), Error> {
34 // Write csv to standard out
35 let out = stdout();
36 let mut writer = csv::Writer::from_writer(out);
37
38 // If you do not do anything fancy it is recommended to have only one Environment in the
39 // entire process.
40 let environment = Environment::new()?;
41
42 // Connect using a DSN. Alternatively we could have used a connection string
43 let mut connection = environment.connect(
44 "DataSourceName",
45 "Username",
46 "Password",
47 ConnectionOptions::default(),
48 )?;
49
50 // Execute a one of query without any parameters.
51 match connection.execute("SELECT * FROM TableName", (), None)? {
52 Some(mut cursor) => {
53 // Write the column names to stdout
54 let mut headline : Vec<String> = cursor.column_names()?.collect::<Result<_,_>>()?;
55 writer.write_record(headline)?;
56
57 // Use schema in cursor to initialize a text buffer large enough to hold the largest
58 // possible strings for each column up to an upper limit of 4KiB.
59 let mut buffers = TextRowSet::for_cursor(BATCH_SIZE, &mut cursor, Some(4096))?;
60 // Bind the buffer to the cursor. It is now being filled with every call to fetch.
61 let mut row_set_cursor = cursor.bind_buffer(&mut buffers)?;
62
63 // Iterate over batches
64 while let Some(batch) = row_set_cursor.fetch()? {
65 // Within a batch, iterate over every row
66 for row_index in 0..batch.num_rows() {
67 // Within a row iterate over every column
68 let record = (0..batch.num_cols()).map(|col_index| {
69 batch
70 .at(col_index, row_index)
71 .unwrap_or(&[])
72 });
73 // Writes row as csv
74 writer.write_record(record)?;
75 }
76 }
77 }
78 None => {
79 eprintln!(
80 "Query came back empty. No output has been created."
81 );
82 }
83 }
84
85 Ok(())
86}
87```
88
89## 32 Bit and 64 Bit considerations.
90
91To consider whether you want to work with 32 Bit or 64 Bit data sources is especially important
92for windows users, as driver managers (and possibly drivers) may both exist at the same time
93in the same system.
94
95In any case, depending on the platform part of your target triple either 32 Bit or 64 Bit
96drivers are going to work, but not both. On a private windows machine (even on a modern 64 Bit
97Windows) it is not unusual to find lots of 32 Bit drivers installed on the system, but none for
9864 Bits. So for windows users it is worth thinking about not using the default toolchain which
99is likely 64 Bits and to switch to a 32 Bit one. On other platforms you are usually fine
100sticking with 64 Bits, as there are not usually any drivers preinstalled anyway, 64 Bit or
101otherwise.
102
103No code changes are required, so you can also just build both if you want to.
104
105## Connecting to a data source
106
107### Setting up the ODBC Environment
108
109To connect with a data source we need a connection. To create a connection we need an ODBC
110environment.
111
112```no_run
113use odbc_api::Environment;
114
115let env = Environment::new()?;
116
117# Ok::<(), odbc_api::Error>(())
118```
119
120This is it. Our ODBC Environment is ready for action. We can use it to list data sources and
121drivers, but most importantly we can use it to create connections.
122
123These bindings currently support two ways of creating a connections:
124
125### Connect using a connection string
126
127Connection strings do not require that the data source is preconfigured by the driver manager
128this makes them very flexible.
129
130```no_run
131use odbc_api::{ConnectionOptions, Environment};
132
133let env = Environment::new()?;
134
135let connection_string = "
136 Driver={ODBC Driver 18 for SQL Server};\
137 Server=localhost;\
138 UID=SA;\
139 PWD=My@Test@Password1;\
140";
141
142let mut conn = env.connect_with_connection_string(connection_string, ConnectionOptions::default())?;
143# Ok::<(), odbc_api::Error>(())
144```
145
146There is a syntax to these connection strings, but few people go through the trouble to learn
147it. Most common strategy is to google one that works for with your data source. The connection
148borrows the environment, so you will get a compiler error, if your environment goes out of scope
149before the connection does.
150
151> You can list the available drivers using [`crate::Environment::drivers`].
152
153### Connect using a Data Source Name (DSN)
154
155Should a data source be known by the driver manager we can access it using its name and
156credentials. This is more convenient for the user or application developer, but requires a
157configuration of the ODBC driver manager. Think of it as shifting work from users to
158administrators.
159
160```no_run
161use odbc_api::{Environment, ConnectionOptions};
162
163let env = Environment::new()?;
164
165let mut conn = env.connect(
166 "YourDatabase", "SA", "My@Test@Password1",
167 ConnectionOptions::default()
168)?;
169# Ok::<(), odbc_api::Error>(())
170```
171
172How to configure such data sources is not the scope of this guide, and depends on the driver
173manager in question.
174
175> You can list the available data sources using [`crate::Environment::data_sources`].
176
177### Connection pooling
178
179ODBC specifies an interface to enable the driver manager to enable connection pooling for your
180application. It is off by default, but if you use ODBC to connect to your data source instead of
181implementing it in your application, or importing a library you may simply enable it in ODBC
182instead.
183Connection Pooling is governed by two attributes. The most important one is the connection
184pooling scheme which is `Off` by default. It must be set even before you create your ODBC
185environment. It is global mutable state on the process level. Setting it in Rust is therefore
186unsafe.
187
188The other one is changed via [`crate::Environment::set_connection_pooling_matching`]. It governs
189how a connection is chosen from the pool. It defaults to strict which means the `Connection`
190you get from the pool will have exactly the attributes specified in the connection string.
191
192Here is an example of how to create an ODBC environment with connection pooling.
193
194```
195use odbc_api::{Environment, environment, sys::{AttrConnectionPooling, AttrCpMatch}};
196
197fn main() {
198 // Enable connection pooling. Let driver decide whether the attributes of two connection
199 // are similar enough to change the attributes of a pooled one, to fit the requested
200 // connection, or if it is cheaper to create a new Connection from scratch.
201 // See <https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/driver-aware-connection-pooling>
202 //
203 // Safety: This call changes global mutable space in the underlying ODBC driver manager. Easiest
204 // to prove it is not causing a raise by calling it once right at the startup of your
205 // application.
206 unsafe {
207 Environment::set_connection_pooling(AttrConnectionPooling::DriverAware).unwrap();
208 }
209
210 // As long as `environment` is called for the first time **after** connection pooling is
211 // activated the static environment returned by it will have it activated.
212 let env = environment();
213 // ... use env to do db stuff
214}
215```
216
217## Executing SQL statements
218
219### Executing a single SQL statement
220
221With our ODBC connection all set up and ready to go, we can execute an SQL query:
222
223```no_run
224use odbc_api::{Environment, ConnectionOptions};
225
226let env = Environment::new()?;
227
228let mut conn = env.connect(
229 "YourDatabase", "SA", "My@Test@Password1",
230 ConnectionOptions::default()
231)?;
232let query = "SELECT year, name FROM Birthdays;";
233let parameters = (); // This query does not use any parameters.
234let timeout_sec = None;
235if let Some(cursor) = conn.execute(query, parameters, timeout_sec)? {
236 // Use cursor to process query results.
237}
238# Ok::<(), odbc_api::Error>(())
239```
240
241The first parameter of `execute` is the SQL statement text. The second parameter is used to pass
242arguments of the SQL Statements itself (more on that later). Ours has none, so we use `()` to
243not bind any arguments to the statement. You can learn all about passing parameters from the
244[`parameter module level documentation`](`crate::parameter`). It may feature an example for
245your use case.
246
247Note that the result of the operation is an `Option`. This reflects that not every statement
248returns a [`Cursor`](crate::Cursor). `INSERT` statements usually do not, but even `SELECT`
249queries which would return zero rows can depending on the driver return either an empty cursor
250or no cursor at all. Should a cursor exists, it must be consumed or closed. The `drop` handler
251of Cursor will close it for us. If the `Option` is `None` there is nothing to close, so is all
252taken care of, nice.
253
254### Executing a many SQL statements in sequence
255
256Execution of a statement, its bound parameters, its buffers and result set, are all managed by
257ODBC using a statement handle. If you call [`crate::Connection::execute`] a new one is allocated
258each time and the resulting cursor takes ownership of it (giving you an easier time with the
259borrow checker). In a use case there you want to execute multiple SQL Statements over the same
260connection in sequence, you may want to reuse an already allocated statement handle. This is
261what the [`crate::Preallocated`] struct is for. Please note that if you want to execute the same
262query with different parameters you can have potentially even better performance by utilizing
263prepared Statements ([`crate::Prepared`]).
264
265```
266use odbc_api::{Connection, Error};
267use std::io::{self, stdin, Read};
268
269fn interactive(conn: &Connection) -> io::Result<()>{
270 let mut statement = conn.preallocate().unwrap();
271 let mut query = String::new();
272 stdin().read_line(&mut query)?;
273 while !query.is_empty() {
274 match statement.execute(&query, ()) {
275 Err(e) => println!("{}", e),
276 Ok(None) => println!("No results set generated."),
277 Ok(Some(cursor)) => {
278 // ...print cursor contents...
279 }
280 }
281 stdin().read_line(&mut query)?;
282 }
283 Ok(())
284}
285```
286
287### Executing prepared queries
288
289Should your use case require you to execute the same query several times with different parameters,
290prepared queries are the way to go. These give the database a chance to cache the access plan
291associated with your SQL statement. It is not unlike compiling your program once and executing it
292several times.
293
294```
295use odbc_api::{Connection, Error, IntoParameter};
296use std::io::{self, stdin, Read};
297
298fn interactive(conn: &Connection) -> io::Result<()>{
299 let mut prepared = conn.prepare("SELECT * FROM Movies WHERE title=?;").unwrap();
300 let mut title = String::new();
301 stdin().read_line(&mut title)?;
302 while !title.is_empty() {
303 match prepared.execute(&title.as_str().into_parameter()) {
304 Err(e) => println!("{}", e),
305 // Most drivers would return a result set even if no Movie with the title is found,
306 // the result set would just be empty. Well, most drivers.
307 Ok(None) => println!("No result set generated."),
308 Ok(Some(cursor)) => {
309 // ...print cursor contents...
310 }
311 }
312 stdin().read_line(&mut title)?;
313 }
314 Ok(())
315}
316```
317
318### Asynchronous execution using polling mode
319
320ODBC supports asynchronous execution of queries. It offers two modes: Notification and Polling.
321Notification to the authors knowlegde is only supported by the Driver Manager shipping with
322Microsoft Windows. Polling mode is also supported by unixODBC. This crate only supports the Polling
323model.
324
325⚠️**Attention**⚠️: Your driver manager (e.g. UnixODBC) may support polling mode, your driver may not.
326The function call may succeed without an error, but actually block the thread during the entire
327operation. In a manual check performed by the author with the drivers of PostgreSQL, MariaDB, SQLite
328and Microsoft SQL Server on a windows machine only the Microsoft SQL Server driver actually returned
329control to the application during statement execution. It seems asynchrnous execution is not widely
330supported.
331
332Generic applications which want to support a wide array of drivers should use one or more system
333threads executing actors using the blocking ODBC api, and communicate the query results via channels
334to the rest of the async applications.
335
336This caveat aside, using the polling mode api is the only way to execute many statements at once
337with few system threads.
338
339Functions in this crate which support asynchronous execution using polling mode:
340
341* [`crate::Connection::execute_polling`]
342* [`crate::Preallocated::into_polling`]
343
344See also the ODBC reference on polling execution:
345<https://learn.microsoft.com/sql/odbc/reference/develop-app/asynchronous-execution-polling-method>
346
347## Fetching results
348
349ODBC offers two ways of retrieving values from a cursor over a result set. Row by row fetching and
350bulk fetching using application provided buffers.
351
352### Fetching results by filling application provided buffers
353
354The most efficient way to query results is not query an ODBC data source row by row, but to
355ask for a whole bulk of rows at once. The ODBC driver and driver manager will then fill these
356row sets into buffers which have been previously bound. This is also the most efficient way to
357query a single row many times for many queries, if the application can reuse the bound buffer.
358This crate allows you to provide your own buffers by implementing the [`crate::RowSetBuffer`]
359trait. That however requires `unsafe` code.
360
361This crate also provides three implementations of the [`crate::RowSetBuffer`] trait, ready to be
362used in safe code:
363
364* [`crate::buffers::ColumnarBuffer`]: Binds to the result set column wise. This is usually helpful
365 in dataengineering or data sciense tasks. This buffer type can be used in situations there the
366 schema of the queried data is known at compile time, as well as for generic applications which do
367 work with wide range of different data. Checkt the struct documentation for examples.
368* [`crate::buffers::TextRowSet`]: Queries all data as text bound in columns. Since the columns are
369 homogeneous, you can also use this, to iterate row wise over the buffer. Excellent if you want
370 to print the contents of a table, or are for any reason only interessted in the text
371 representation of the values.
372* [`crate::buffers::RowVec`]: A good choice if you know the schema at compile time and your
373 application logic is build in a row by row fashion, rather than column by column.
374
375You can read more about them in the documentation of the [`crate::buffers`] module.
376
377## Inserting values into a table
378
379### Inserting a single row into a table
380
381Inserting a single row can be done by executing a statement and binding the fields as parameters
382in a tuple.
383
384```no_run
385use odbc_api::{Connection, Error, IntoParameter};
386
387fn insert_birth_year(conn: &Connection, name: &str, year: i16) -> Result<(), Error>{
388 conn.execute(
389 "INSERT INTO Birthdays (name, year) VALUES (?, ?)",
390 (&name.into_parameter(), &year),
391 None,
392 )?;
393 Ok(())
394}
395```
396
397### Columnar bulk inserts
398
399Inserting values row by row can introduce a lot of overhead. ODBC allows you to perform either
400row or column wise bulk inserts. Especially in pipelines for data science you may already have
401buffers in a columnar layout at hand. [`crate::ColumnarBulkInserter`] can be used for bulk inserts.
402
403```no_run
404use odbc_api::{Connection, Error, IntoParameter, buffers::BufferDesc};
405
406fn insert_birth_years(conn: &Connection, names: &[&str], years: &[i16]) -> Result<(), Error> {
407
408 // All columns must have equal length.
409 assert_eq!(names.len(), years.len());
410
411 let prepared = conn.prepare("INSERT INTO Birthdays (name, year) VALUES (?, ?)")?;
412
413 // Create a columnar buffer which fits the input parameters.
414 let buffer_description = [
415 BufferDesc::Text { max_str_len: 255 },
416 BufferDesc::I16 { nullable: false },
417 ];
418 // The capacity must be able to hold at least the largest batch. We do everything in one go, so
419 // we set it to the length of the input parameters.
420 let capacity = names.len();
421 // Allocate memory for the array column parameters and bind it to the statement.
422 let mut prebound = prepared.into_column_inserter(capacity, buffer_description)?;
423 // Length of this batch
424 prebound.set_num_rows(capacity);
425
426
427 // Fill the buffer with values column by column
428 let mut col = prebound
429 .column_mut(0)
430 .as_text_view()
431 .expect("We know the name column to hold text.");
432
433 for (index, name) in names.iter().enumerate() {
434 col.set_cell(index, Some(name.as_bytes()));
435 }
436
437 let col = prebound
438 .column_mut(1)
439 .as_slice::<i16>()
440 .expect("We know the year column to hold i16.");
441 col.copy_from_slice(years);
442
443 prebound.execute()?;
444 Ok(())
445}
446```
447*/