yash_syntax/decl_util.rs
// This file is part of yash, an extended POSIX shell.
// Copyright (C) 2024 WATANABE Yuki
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.

//! Defining declaration utilities
//!
//! This module contains the [`Glossary`] trait, which is used by the parser to
//! determine whether a command name is a declaration utility. It also provides
//! two implementations of the `Glossary` trait: [`EmptyGlossary`] and
//! [`PosixGlossary`].
//!
//! # What are declaration utilities?
//!
//! A [declaration utility] is a type of command that causes its argument words
//! to be expanded in a manner slightly different from other commands. Usually,
//! command word expansion includes field splitting and pathname expansion. For
//! declaration utilities, however, those expansions are not performed on the
//! arguments that have a form of variable assignments.
//!
//! [declaration utility]: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_100
//!
//! Generally, a simple command consists of assignments, redirections, and command
//! words. The shell syntax allows the redirections to be placed anywhere in the
//! command, but the assignments must come before the command words. An assignment
//! token has the form `name=value`. The first token that does not match this
//! form is considered the command name, and the rest are arguments regardless of
//! whether they match the form. For example, in the command `a=1 b=2 echo c=3`,
//! `a=1` and `b=2` are assignments, `echo` is the command name, and `c=3` is an
//! argument.
//!
//! All assignments and command words are expanded when the command is executed,
//! but the expansions are different. The expansions of assignments are performed
//! in a way that does not include field splitting and pathname expansion. This
//! ensures that the values of the assignments are not split or expanded into
//! multiple fields. The expansions of command words, on the other hand, are
//! performed in a way that includes field splitting and pathname expansion,
//! which may expand a single word into multiple fields.
//!
//! The assignments specified in a simple command are performed by the shell
//! before the utility specified by the command name is invoked. However, some
//! utilities perform their own assignments based on their arguments. For such
//! a utility, the tokens that specify the assigned variable names and values
//! are given as arguments to the utility as in the command `export a=1 b=2`.
//!
//! By default, such arguments are expanded in the same way as usual command
//! words, which means that the assignments are subject to field splitting and
//! pathname expansion even though they are effectively assignments. To prevent
//! this, the shell recognizes certain command names as declaration utilities
//! and expands their arguments differently. The shell does not perform field
//! splitting and pathname expansion on the arguments of declaration utilities
//! that have the form of variable assignments.
//!
//! # Example
//!
//! POSIX requires the `export` utility to be recognized as a declaration
//! utility. In the command `v='1 b=2'; export a=$v`, the word `a=$v` is not
//! subject to field splitting because `export` is a declaration utility. The
//! expanded word `a=1 b=2` is passed to `export` as a single argument, so
//! `export` assigns the value `1 b=2` to the variable `a`. If `export` were not
//! a declaration utility, the word `a=$v` would be subject to field splitting,
//! and the expanded word `a=1 b=2` would be split into two fields `a=1` and
//! `b=2`, so `export` would assign the value `1` to the variable `a` and the
//! value `2` to the variable `b`.
//!
//! # Which command names are declaration utilities?
//!
//! The POSIX standard specifies how the following command names are treated:
//!
//! - `export` and `readonly` are declaration utilities.
//! - `command` is neutral; it delegates to the next command word to determine
//!   whether it is a declaration utility.
//!
//! It is unspecified whether other command names are declaration utilities.
//!
//! The syntax parser in this crate uses the [`Glossary`] trait to determine
//! whether a command name is a declaration utility. The parser calls its
//! [`is_declaration_utility`] method when it encounters a command name, and
//! changes how the following arguments are parsed based on the result.
//!
//! [`is_declaration_utility`]: Glossary::is_declaration_utility
//!
//! This module provides two implementations of the `Glossary` trait:
//!
//! - [`PosixGlossary`] recognizes the declaration utilities defined by POSIX
//!   (and no others). This is the default glossary used by the parser.
//! - [`EmptyGlossary`] recognizes no command name as a declaration utility.
//!   The parse result does not conform to POSIX when this glossary is used.
//!
//! You can implement the `Glossary` trait for your own glossary if you want to
//! recognize additional command names as declaration utilities. (In yash-rs,
//! the `yash-env` crate provides a shell environment that implements `Glossary`
//! based on the built-ins defined in the environment.)
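//!
//! A minimal sketch of such a custom glossary follows. `ExtendedGlossary` is a
//! hypothetical type, and treating `typeset` as a declaration utility is an
//! assumption made for illustration only; POSIX does not require it.
//!
//! ```rust
//! use yash_syntax::decl_util::{Glossary, PosixGlossary};
//!
//! // `Glossary` requires `Debug`, so the custom type derives it.
//! #[derive(Debug)]
//! struct ExtendedGlossary;
//!
//! impl Glossary for ExtendedGlossary {
//!     fn is_declaration_utility(&self, name: &str) -> Option<bool> {
//!         match name {
//!             // Hypothetical extension beyond POSIX
//!             "typeset" => Some(true),
//!             // Defer to the POSIX rules for every other name
//!             _ => PosixGlossary.is_declaration_utility(name),
//!         }
//!     }
//! }
//!
//! assert_eq!(ExtendedGlossary.is_declaration_utility("typeset"), Some(true));
//! assert_eq!(ExtendedGlossary.is_declaration_utility("export"), Some(true));
//! assert_eq!(ExtendedGlossary.is_declaration_utility("command"), None);
//! assert_eq!(ExtendedGlossary.is_declaration_utility("echo"), Some(false));
//! ```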
//!
//! The glossary to use can be set in the parser configuration with
//! [`Config::declaration_utilities`].
//!
//! [`Config::declaration_utilities`]: crate::parser::Config::declaration_utilities
//!
//! # Parser behavior
//!
//! When the [parser] recognizes a command name as a declaration utility,
//! command words that follow the command name are tested for the form of
//! variable assignments. If a word is a variable assignment, it is parsed as
//! such: the word is split into a variable name and a value, and tilde expansions
//! are parsed with the [`parse_tilde_everywhere_after`] method in the value part.
//! The result word is marked with [`ExpansionMode::Single`] in
//! [`SimpleCommand::words`] to indicate that the word is not subject to field
//! splitting and pathname expansion. If a word is not a variable assignment, it
//! is parsed as a normal command word with [`parse_tilde_front`] and marked with
//! [`ExpansionMode::Multiple`].
//!
//! The shell is expected to change the expansion behavior of the words based on
//! the [`ExpansionMode`] of the words. In yash-rs, the semantics is implemented
//! in the `yash-semantics` crate.
//!
//! [parser]: crate::parser
//! [`parse_tilde_front`]: crate::syntax::Word::parse_tilde_front
//! [`parse_tilde_everywhere_after`]: crate::syntax::Word::parse_tilde_everywhere_after
//! [`ExpansionMode`]: crate::syntax::ExpansionMode
//! [`ExpansionMode::Multiple`]: crate::syntax::ExpansionMode::Multiple
//! [`ExpansionMode::Single`]: crate::syntax::ExpansionMode::Single
//! [`SimpleCommand::words`]: crate::syntax::SimpleCommand::words

use std::cell::RefCell;
use std::fmt::Debug;

/// Interface used by the parser to tell if a command name is a declaration utility
///
/// The parser uses this trait to determine whether a command name is a declaration
/// utility. See the [module-level documentation](self) for details.
pub trait Glossary: Debug {
    /// Returns whether the given command name is a declaration utility.
    ///
    /// If the command name is a declaration utility, this method should return
    /// `Some(true)`. If the command name is not a declaration utility, this
    /// method should return `Some(false)`. If the return value is `None`, this
    /// method is called again with the next command word in the simple command
    /// being parsed, effectively delegating the decision to the next command word.
    ///
    /// To meet the POSIX standard, the method should return `Some(true)` for the
    /// command names `export` and `readonly`, and `None` for the command name
    /// `command`.
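    ///
    /// # Example
    ///
    /// The delegation rule above can be sketched as follows. The `resolve`
    /// helper is hypothetical and is not the parser's actual code; it only
    /// mirrors the documented contract by querying each command word in turn
    /// until one gives a definite answer. It ignores any options that a real
    /// `command` invocation may take.
    ///
    /// ```rust
    /// use yash_syntax::decl_util::{Glossary, PosixGlossary};
    ///
    /// // Hypothetical helper: returns the first definite answer, or `None`
    /// // if every word delegates to the next one.
    /// fn resolve(glossary: &dyn Glossary, words: &[&str]) -> Option<bool> {
    ///     words
    ///         .iter()
    ///         .find_map(|word| glossary.is_declaration_utility(word))
    /// }
    ///
    /// // `command` delegates, so `export` decides.
    /// assert_eq!(resolve(&PosixGlossary, &["command", "export", "a=1"]), Some(true));
    /// // `command` delegates, so `echo` decides.
    /// assert_eq!(resolve(&PosixGlossary, &["command", "echo", "a=1"]), Some(false));
    /// // No word gives a definite answer.
    /// assert_eq!(resolve(&PosixGlossary, &["command"]), None);
    /// ```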
    fn is_declaration_utility(&self, name: &str) -> Option<bool>;
}

/// Empty glossary that does not recognize any command name as a declaration utility
///
/// When this glossary is used, the parser recognizes no command name as a
/// declaration utility. Note that this does not conform to POSIX.
#[derive(Clone, Debug, Default, Eq, Hash, PartialEq)]
pub struct EmptyGlossary;

impl Glossary for EmptyGlossary {
    #[inline(always)]
    fn is_declaration_utility(&self, _name: &str) -> Option<bool> {
        Some(false)
    }
}

/// Glossary that recognizes declaration utilities defined by POSIX
///
/// This glossary recognizes the declaration utilities defined by POSIX and no
/// others. The `is_declaration_utility` method returns `Some(true)` for the
/// command names `export` and `readonly`, and `None` for the command name
/// `command`.
///
/// This is the minimal glossary that conforms to POSIX, and is the default
/// glossary used by the parser.
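///
/// # Example
///
/// A short sketch contrasting this glossary with [`EmptyGlossary`], using
/// only the types defined in this module:
///
/// ```rust
/// use yash_syntax::decl_util::{EmptyGlossary, Glossary, PosixGlossary};
///
/// for name in ["export", "readonly"] {
///     assert_eq!(PosixGlossary.is_declaration_utility(name), Some(true));
///     assert_eq!(EmptyGlossary.is_declaration_utility(name), Some(false));
/// }
///
/// // Only `PosixGlossary` delegates the decision for `command`.
/// assert_eq!(PosixGlossary.is_declaration_utility("command"), None);
/// assert_eq!(EmptyGlossary.is_declaration_utility("command"), Some(false));
/// ```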
#[derive(Clone, Debug, Default, Eq, Hash, PartialEq)]
pub struct PosixGlossary;

impl Glossary for PosixGlossary {
    fn is_declaration_utility(&self, name: &str) -> Option<bool> {
        match name {
            "export" | "readonly" => Some(true),
            "command" => None,
            _ => Some(false),
        }
    }
}

impl<T: Glossary> Glossary for &T {
    fn is_declaration_utility(&self, name: &str) -> Option<bool> {
        (**self).is_declaration_utility(name)
    }
}

impl<T: Glossary> Glossary for &mut T {
    fn is_declaration_utility(&self, name: &str) -> Option<bool> {
        (**self).is_declaration_utility(name)
    }
}

/// Allows a glossary to be wrapped in a `RefCell`.
///
/// This implementation's methods immutably borrow the inner glossary.
/// If the inner glossary is mutably borrowed at the same time, it panics.
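///
/// # Example
///
/// A brief sketch of the borrowing behavior described above. It uses
/// `PosixGlossary` as the inner glossary purely for illustration.
///
/// ```rust
/// use std::cell::RefCell;
/// use yash_syntax::decl_util::{Glossary, PosixGlossary};
///
/// let glossary = RefCell::new(PosixGlossary);
/// assert_eq!(glossary.is_declaration_utility("readonly"), Some(true));
///
/// // An outstanding immutable borrow does not conflict with the query,
/// // because the query itself only borrows immutably.
/// let guard = glossary.borrow();
/// assert_eq!(glossary.is_declaration_utility("command"), None);
/// drop(guard);
/// ```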
impl<T: Glossary> Glossary for RefCell<T> {
    fn is_declaration_utility(&self, name: &str) -> Option<bool> {
        self.borrow().is_declaration_utility(name)
    }
}