// egg_mode/common/mod.rs

// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

//! Set of structs and methods that act as a sort of internal prelude.
//!
//! The elements available in this module and its children are fairly basic building blocks that
//! the other modules all glob-import to make available as a common language. A lot of
//! infrastructure code goes in here.
//!
//! # Module contents
//!
//! Since i split this into multiple files that are then "flattened" into the final module, it's
//! worth giving an inventory of what's in here, since every file has a `use common::*;` in it.
//!
//! ## Type Aliases
//!
//! These types are used commonly enough in the library that they're re-exported here for easy use.
//!
//! * `hyper::header::HeaderMap<hyper::header::HeaderValue>` (re-exported as the alias `Headers`)
//!
//! ## `ParamList`
//!
//! `ParamList` is a wrapper type used as a collection of parameters to a given web call. It's
//! consumed in the auth module, and provides some easy wrappers to consistently handle some types.
//!
//! `add_param` is a basic function that turns its arguments into `Cow<'static, str>`, then inserts them
//! as a parameter into the given `ParamList`.
//!
//! `add_user_param` provides some special handling for the `UserID` enum, since Twitter always
//! handles user parameters the same way: either as a `"user_id"` parameter with the ID, or as a
//! `"screen_name"` parameter with the screen name. Since that's also how the `UserID` enum is laid
//! out, this just puts the right parameter into the given `ParamList`.
//!
//! `add_list_param` does the same thing, but for `ListID`. Lists get a little more complicated
//! than users, though, since there are technically *three* ways to reference a list: by its ID, by
//! the owner's ID and the list slug, or by the owner's screen name and the list slug. Again, since
//! Twitter always uses the same set of parameters when referencing a list, this deals with all of
//! that work in one place, and i can just take a `ListID` from the user and shove it directly into
//! a `ParamList`.
//!
//! `multiple_names_param` is for when a function takes an `IntoIterator<Item=UserID>`. It's
//! possible to mix and match the use of the `"user_id"` and `"screen_name"` parameters on these
//! lookup functions, so this saves up all that handling and splits the iterator into two strings:
//! one for the user IDs, one for the screen names.
//!
//! ## Miscellaneous functions
//!
//! `codepoints_to_bytes` is a convenience function that i use when Twitter returns text ranges in
//! terms of codepoint offsets rather than byte offsets. It takes the pair of numbers from Twitter
//! and the string it refers to, and updates the pair in place so that it can be used directly to
//! slice the given string. It's also an example of how function parameters are themselves
//! patterns, because i destructure the pair right in the signature. `>_>`
//!
//! `serde_datetime` and `serde_via_string` are helper modules to use with derived
//! `Serialize`/`Deserialize` implementations. `serde_datetime` loads and saves `DateTime`s with
//! the format Twitter uses for timestamps, and `serde_via_string` uses `Display` and `FromStr` to
//! save a string representation of the original type.
//!
//! `merge_by` and its companion type `MergeBy` are a copy of the iterator adapter of the same name
//! from itertools, because i didn't want to add another dependency onto the great towering pile
//! that is my dep tree. `>_>`
//!
//! `max_opt` and `min_opt` are helper functions because i didn't realize that `Option` derived
//! `PartialOrd` and `Ord` at the time. Strictly speaking they're subtly different because
//! `std::cmp::{min,max}` require `Ord` and `min_opt` won't reach for the None if it's there,
//! unlike the derived `PartialOrd` which considers None to be less than Some.
//!
//! ## Authentication functions
//!
//! The functions `get`, `post`, and `post_json` are re-exported here to keep people from having to
//! qualify them from `auth::raw`.
//!
//! ## `Response`
//!
//! Also in its own module, `Response` is a public structure that contains rate-limit information
//! from Twitter, alongside some other desired output. This type is used all over the place in
//! egg-mode, because i wanted to make sure people always had rate-limit information on hand. The
//! module also contains the types and functions that all web calls go through: the ones that load
//! a web call, parse out the rate-limit headers, and call some handler to perform final processing
//! on the result.
//!
//! `request_with_json_response` is the most common future constructor, which just defers to
//! `raw_request` (which just calls `serde_json` and loads up the rate-limit headers),
//! then deserializes the JSON response to the given type.
//!
//! `rate_headers` is an infra function that takes the `Headers` and returns an empty `Response`
//! with the rate-limit info parsed out. It's only exported for a couple functions in `list` which
//! need to get that info even on an error.

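
The difference between the derived ordering on `Option` and the `min_opt`/`max_opt` helpers described in the module docs can be illustrated with a small standalone sketch. The signatures below are assumptions made for illustration; the real helpers are not shown in this file and may differ.

```rust
use std::cmp;

/// Hypothetical sketch: returns the smaller value, skipping a `None` when the other side is `Some`.
fn min_opt<T: Ord>(left: Option<T>, right: Option<T>) -> Option<T> {
    match (left, right) {
        (Some(l), Some(r)) => Some(cmp::min(l, r)),
        // At most one side is `Some` here, so take whichever is present.
        (l, r) => l.or(r),
    }
}

/// Hypothetical sketch: returns the larger value, skipping a `None` when the other side is `Some`.
fn max_opt<T: Ord>(left: Option<T>, right: Option<T>) -> Option<T> {
    match (left, right) {
        (Some(l), Some(r)) => Some(cmp::max(l, r)),
        (l, r) => l.or(r),
    }
}

fn main() {
    // The derived ordering on `Option` sorts `None` before every `Some`.
    assert_eq!(cmp::min(None, Some(3)), None);
    // The helper skips the `None` instead.
    assert_eq!(min_opt(None, Some(3)), Some(3));
    assert_eq!(min_opt(Some(5), Some(3)), Some(3));
    assert_eq!(max_opt(Some(5), None), Some(5));
    assert_eq!(min_opt::<u64>(None, None), None);
}
```
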
use std::borrow::Cow;
use std::collections::HashMap;
use std::future::Future;
use std::iter::Peekable;
use std::pin::Pin;

use hyper::header::{HeaderMap, HeaderValue};
use percent_encoding::{utf8_percent_encode, AsciiSet, PercentEncode};

mod response;

pub use crate::auth::raw::{get, post, post_json};

pub use crate::common::response::*;
use crate::{error, list, user};

/// Macro to create a `Serialize`/`Deserialize` implementation allowing for deserialization via the
/// given "raw" struct or via a "round-trip" using the type's own serialization.
///
/// This macro takes two arguments: the name of a "raw" type, and a public struct definition. The
/// given struct must implement `From` or `TryFrom` for the given raw type. In return, it derives
/// `Serialize` and `Deserialize` for the struct, and creates a handful of helper types to modify
/// the `Deserialize` implementation.
///
/// ## Warning
///
/// If you're adding this to something that should have custom (de-)serialization logic on some
/// fields (e.g. `DateTime`), make sure to add both the `serialize_with` and `deserialize_with`
/// attributes to the struct definition. All the attributes are copied in to the `SerCopy` struct,
/// so it inherits the deserialization logic that otherwise goes unused. If you don't do this, then
/// the type will fail to "round-trip" properly and may create an error when you try to deserialize
/// from the saved data.
///
/// ## Example
///
/// ```rust,ignore (internal-items)
/// use crate::common::*;
///
/// round_trip! { raw::RawDummyStruct,
///     /// A dummy struct to demonstrate `round_trip!`.
///     pub struct DummyStruct {
///         // ...
///     }
/// }
///
/// impl From<raw::RawDummyStruct> for DummyStruct {
///     fn from(src: raw::RawDummyStruct) -> DummyStruct {
///         // ...
///     }
/// }
/// ```
///
/// ## Implementation
///
/// This macro abuses the `#[serde(untagged)]` enum representation to allow it to deserialize via
/// the existing "raw" type, or the generated "SerCopy" struct which is a field-for-field copy of
/// the original struct. This way, either representation can be used to load the struct without the
/// overhead of loading it all into a `serde_json::Value` first to manually decode into either
/// type.
macro_rules! round_trip {
    ( $raw_name:path,
      $(#[$outer_attr:meta])*
      pub struct $struct_name:ident { $(
          $(#[$attr:meta])*
          $v:vis $f:ident : $t:ty
      ),+ $(,)? } ) => {
        $(#[$outer_attr])*
        #[derive(serde::Serialize)]
        #[derive(serde::Deserialize)]
        #[serde(try_from = "SerEnum")]
        pub struct $struct_name { $(
            $(#[$attr])*
            $v $f: $t
        ),+ }

        #[allow(unused_qualifications)]
        impl crate::common::RoundTrip for $struct_name {
            fn upstream_deser_error(input: serde_json::Value) -> Option<String> {
                use crate::common::MapString;

                serde_json::from_value::<$raw_name>(input).err().map_string()
            }

            fn roundtrip_deser_error(input: serde_json::Value) -> Option<String> {
                use crate::common::MapString;

                serde_json::from_value::<SerCopy>(input).err().map_string()
            }
        }

        #[derive(serde::Deserialize)]
        struct SerCopy { $(
            $(#[$attr])*
            $v $f: $t
        ),+ }

        impl From<SerCopy> for $struct_name {
            fn from(src: SerCopy) -> $struct_name {
                $struct_name { $(
                    $f: src.$f
                ),+ }
            }
        }

        #[derive(serde::Deserialize)]
        #[serde(untagged)]
        enum SerEnum {
            Raw($raw_name),
            Ser(SerCopy),
        }

        #[allow(unused_qualifications)]
        impl std::convert::TryFrom<SerEnum> for $struct_name
        where
            $struct_name: std::convert::TryFrom<$raw_name>,
        {
            type Error = <$struct_name as std::convert::TryFrom<$raw_name>>::Error;

            fn try_from(src: SerEnum) -> std::result::Result<$struct_name, Self::Error> {
                use std::convert::TryInto;

                match src {
                    SerEnum::Raw(raw) => raw.try_into(),
                    SerEnum::Ser(ser) => Ok(ser.into()),
                }
            }
        }
    };
}

/// Types that implement `Deserialize` either by loading from upstream JSON, or via a "round-trip"
/// serialization.
///
/// Starting in egg-mode 0.16, select types gained a `Serialize` implementation, which caused them
/// to require special handling when deserializing. This special handling created an issue for when
/// errors occur: When the input data didn't match the expected type definition, the only error
/// that would be returned is a generic `"data did not match any variant of untagged enum
/// SerEnum"`. In an attempt to allow these errors to be recovered, this trait was created.
///
/// If you get an error when trying to load a type that implements `RoundTrip`, and can isolate it
/// to a specific instance of data, you can try to load it with either of these functions to see
/// the specific error. For example, to find the error from loading a user:
///
/// ```no_run
/// use egg_mode::user::TwitterUser;
/// use egg_mode::raw::{self, RoundTrip};
///
/// # #[tokio::main]
/// # async fn main() {
/// # let token: egg_mode::Token = unimplemented!();
/// let url = "https://api.twitter.com/1.1/users/show.json";
/// let params = raw::ParamList::new().add_user_param("rustlang".into());
/// let req = raw::request_get(url, &token, Some(&params));
/// let resp = raw::response_json::<serde_json::Value>(req).await.unwrap();
///
/// if let Some(msg) = TwitterUser::upstream_deser_error(resp.response) {
///     println!("there was an error: {}", msg);
/// }
/// # }
/// ```
pub trait RoundTrip {
    /// Returns the string representation of an error from loading JSON from Twitter, if
    /// applicable.
    ///
    /// Use this function if trying to load something from the API gave you a
    /// deserialization error.
    fn upstream_deser_error(input: serde_json::Value) -> Option<String>;

    /// Returns the string representation of an error from loading JSON given by
    /// serializing this type.
    ///
    /// Use this function if trying to load saved JSON from saving previously-loaded data
    /// gave you a deserialization error.
    fn roundtrip_deser_error(input: serde_json::Value) -> Option<String>;
}

// n.b. this type alias is re-exported in the `raw` module - these docs are public!
/// A set of headers returned with a response.
pub type Headers = HeaderMap<HeaderValue>;
/// A `'static` copy-on-write string: either a borrowed string literal or an owned `String`.
pub type CowStr = Cow<'static, str>;

// n.b. this type is re-exported in the `raw` module - these docs are public!
/// Represents a list of parameters to a Twitter API call.
///
/// This type is a wrapper around a `HashMap<Cow<'static, str>, Cow<'static, str>>` to collect a
/// set of parameter key/value pairs. These are then used to assemble and sign a Twitter API
/// request. The `Cow` type is used to avoid having to allocate a `String` if a string literal is
/// used for a parameter. All the functions that add parameters to this `ParamList` accept `impl
/// Into<Cow<'static, str>>`, meaning that either a string literal or an owned `String` may be
/// used.
///
/// Most of the functions to add parameters follow a builder pattern, so that you can assemble a
/// `ParamList` in a single statement:
///
/// ```
/// use egg_mode::raw::ParamList;
///
/// // If you were looking up the user `@rustlang` with `GET users/show`, you might assemble a
/// // ParamList like this...
/// let params = ParamList::new()
///     .extended_tweets()
///     .add_user_param("rustlang".into());
/// ```
#[derive(Debug, Clone, Default, derive_more::Deref, derive_more::DerefMut, derive_more::From)]
pub struct ParamList(HashMap<Cow<'static, str>, Cow<'static, str>>);

impl ParamList {
    /// Creates a new, empty `ParamList`.
    pub fn new() -> Self {
        Self(HashMap::new())
    }

    /// Adds the `tweet_mode=extended` parameter to this `ParamList`. Not including this parameter
    /// will cause tweets to be loaded with legacy parameters, and a potentially-truncated `text`
    /// if the tweet is longer than 140 characters. The `Deserialize` impl for `Tweet`s (or
    /// anything that directly or indirectly includes a `Tweet`) expects the extended tweet format
    /// enabled by this function.
    pub fn extended_tweets(self) -> Self {
        self.add_param("tweet_mode", "extended")
    }

    /// Adds the given key/value parameter to this `ParamList`.
    pub fn add_param(
        mut self,
        key: impl Into<Cow<'static, str>>,
        value: impl Into<Cow<'static, str>>,
    ) -> Self {
        self.insert(key.into(), value.into());
        self
    }

    /// Adds the given key/value parameter to this `ParamList` only if the given value is `Some`.
    ///
    /// This can be a convenient wrapper to use in case you may or may not want to include
    /// something based on some condition. If the given value is `None`, then the `ParamList` is
    /// returned unmodified.
    pub fn add_opt_param(
        self,
        key: impl Into<Cow<'static, str>>,
        value: Option<impl Into<Cow<'static, str>>>,
    ) -> Self {
        match value {
            Some(val) => self.add_param(key.into(), val.into()),
            None => self,
        }
    }

    /// Adds the given key/value to this `ParamList` by mutating it in place, rather than consuming
    /// it as in `add_param`.
    pub fn add_param_ref(
        &mut self,
        key: impl Into<Cow<'static, str>>,
        value: impl Into<Cow<'static, str>>,
    ) {
        self.0.insert(key.into(), value.into());
    }

    /// Adds the given `UserID` as a parameter to this `ParamList` by adding either a `user_id` or
    /// `screen_name` parameter as appropriate.
    pub fn add_user_param(self, id: user::UserID) -> Self {
        match id {
            user::UserID::ID(id) => self.add_param("user_id", id.to_string()),
            user::UserID::ScreenName(name) => self.add_param("screen_name", name),
        }
    }

    /// Adds the given `ListID` as a parameter to this `ParamList` by adding either an
    /// `owner_id`/`owner_screen_name` and `slug` pair, or a `list_id`, as appropriate.
    pub fn add_list_param(mut self, list: list::ListID) -> Self {
        match list {
            list::ListID::Slug(owner, name) => {
                match owner {
                    user::UserID::ID(id) => {
                        self.add_param_ref("owner_id", id.to_string());
                    }
                    // Named `screen_name` so that it does not shadow the list slug in `name`.
                    user::UserID::ScreenName(screen_name) => {
                        self.add_param_ref("owner_screen_name", screen_name);
                    }
                }
                self.add_param("slug", name.clone())
            }
            list::ListID::ID(id) => self.add_param("list_id", id.to_string()),
        }
    }

    /// Merge the parameters from the given `ParamList` into this one.
    pub(crate) fn combine(&mut self, other: ParamList) {
        self.0.extend(other.0);
    }

    /// Renders this `ParamList` as an `application/x-www-form-urlencoded` string.
    ///
    /// The key/value pairs are printed as `key1=value1&key2=value2`, with all keys and values
    /// being percent-encoded according to Twitter's requirements. Because the parameters are
    /// stored in a `HashMap`, the order of the pairs in the output is unspecified.
    pub fn to_urlencoded(&self) -> String {
        self.0
            .iter()
            .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
            .collect::<Vec<_>>()
            .join("&")
    }
}
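
The builder pattern used by `ParamList` can be tried out without any of the crate's dependencies. The following is a minimal standalone sketch, not the crate's implementation: `Params` and `to_query` are hypothetical names, a `BTreeMap` is swapped in for the `HashMap` so the output order is stable, and percent-encoding is left out.

```rust
use std::borrow::Cow;
use std::collections::BTreeMap;

/// Hypothetical stand-in for `ParamList`; a `BTreeMap` keeps the output order stable.
#[derive(Debug, Default)]
struct Params(BTreeMap<Cow<'static, str>, Cow<'static, str>>);

impl Params {
    fn new() -> Self {
        Self::default()
    }

    /// Consumes and returns `self` so that calls can be chained.
    fn add_param(
        mut self,
        key: impl Into<Cow<'static, str>>,
        value: impl Into<Cow<'static, str>>,
    ) -> Self {
        self.0.insert(key.into(), value.into());
        self
    }

    /// Adds the parameter only when a value is present.
    fn add_opt_param(
        self,
        key: impl Into<Cow<'static, str>>,
        value: Option<impl Into<Cow<'static, str>>>,
    ) -> Self {
        match value {
            Some(val) => self.add_param(key, val),
            None => self,
        }
    }

    /// Joins the pairs as `key=value&key=value` (no percent-encoding in this sketch).
    fn to_query(&self) -> String {
        self.0
            .iter()
            .map(|(k, v)| format!("{}={}", k, v))
            .collect::<Vec<_>>()
            .join("&")
    }
}

fn main() {
    let count: Option<String> = Some(20.to_string());
    let cursor: Option<String> = None;
    let params = Params::new()
        .add_param("tweet_mode", "extended") // string literal: no allocation
        .add_opt_param("count", count) // owned `String`
        .add_opt_param("cursor", cursor); // `None`: nothing is added
    println!("{}", params.to_query());
}
```

Taking `self` by value in the adder methods is what allows the single-statement chaining; the `add_param_ref` variant in the real type exists for the cases where a `&mut` borrow is all that is available.
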

// Helper trait to stringify the contents of an Option
pub(crate) trait MapString {
    fn map_string(&self) -> Option<String>;
}

impl<T: std::fmt::Display> MapString for Option<T> {
    fn map_string(&self) -> Option<String> {
        self.as_ref().map(|v| v.to_string())
    }
}

/// Splits the given accounts into two comma-separated strings: the first holds the user IDs, the
/// second holds the screen names.
pub fn multiple_names_param<T, I>(accts: I) -> (String, String)
where
    T: Into<user::UserID>,
    I: IntoIterator<Item = T>,
{
    let mut ids = Vec::new();
    let mut names = Vec::new();

    for x in accts {
        match x.into() {
            user::UserID::ID(id) => ids.push(id.to_string()),
            user::UserID::ScreenName(name) => names.push(name),
        }
    }

    (ids.join(","), names.join(","))
}
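
To make the splitting behavior concrete, here is a small standalone sketch of the same idea. `UserId` and `split_accounts` are hypothetical stand-ins for `user::UserID` and `multiple_names_param`; the real enum's variants and conversions may differ.

```rust
/// Hypothetical stand-in for `user::UserID`.
enum UserId {
    Id(u64),
    ScreenName(String),
}

impl From<u64> for UserId {
    fn from(id: u64) -> Self {
        UserId::Id(id)
    }
}

impl From<&str> for UserId {
    fn from(name: &str) -> Self {
        UserId::ScreenName(name.to_string())
    }
}

/// Collects IDs and screen names into two separate comma-separated strings.
fn split_accounts<T, I>(accts: I) -> (String, String)
where
    T: Into<UserId>,
    I: IntoIterator<Item = T>,
{
    let mut ids = Vec::new();
    let mut names = Vec::new();

    for acct in accts {
        match acct.into() {
            UserId::Id(id) => ids.push(id.to_string()),
            UserId::ScreenName(name) => names.push(name),
        }
    }

    (ids.join(","), names.join(","))
}

fn main() {
    // IDs and screen names can be mixed freely in one list.
    let accts = vec![
        UserId::from(2244994945u64),
        UserId::from("rustlang"),
        UserId::from(783214u64),
    ];
    let (ids, names) = split_accounts(accts);
    println!("user_id={} screen_name={}", ids, names);
}
```
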

/// Convenient type alias for futures that resolve to responses from Twitter.
pub(crate) type FutureResponse<T> =
    Pin<Box<dyn Future<Output = error::Result<Response<T>>> + Send>>;

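/// Converts a pair of codepoint offsets, as returned by Twitter, into byte offsets that can be
/// used to slice `text` directly. The pair is updated in place.
///
/// `end` may equal the number of codepoints in `text`, in which case it becomes `text.len()`. Any
/// other offset that does not index a codepoint of `text` is left unchanged.
pub fn codepoints_to_bytes(&mut (ref mut start, ref mut end): &mut (usize, usize), text: &str) {
    let mut byte_start = *start;
    let mut byte_end = *end;
    for (ch_offset, (by_offset, _)) in text.char_indices().enumerate() {
        // These are separate checks so that an empty range (`start == end`) converts both ends.
        if ch_offset == *start {
            byte_start = by_offset;
        }
        if ch_offset == *end {
            byte_end = by_offset;
        }
    }
    *start = byte_start;
    // `char_indices` never yields the offset just past the last character, so handle it here.
    if text.chars().count() == *end {
        *end = text.len()
    } else {
        *end = byte_end
    }
}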
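
The same conversion can be written as a function that returns the byte range rather than mutating its argument. This is only a sketch of an alternative shape, assuming a hypothetical name `codepoint_range_to_bytes`; it is not part of the crate.

```rust
/// Converts a half-open range of codepoint offsets into byte offsets.
/// Returns `None` if either offset lies past the end of `text`.
fn codepoint_range_to_bytes(text: &str, start: usize, end: usize) -> Option<(usize, usize)> {
    // Byte offset of every character boundary, including the one at the end of the string.
    let boundaries: Vec<usize> = text
        .char_indices()
        .map(|(byte_offset, _)| byte_offset)
        .chain(std::iter::once(text.len()))
        .collect();
    Some((*boundaries.get(start)?, *boundaries.get(end)?))
}

fn main() {
    // "héllo wörld", written with escapes so the two-byte characters are unambiguous.
    let text = "h\u{e9}llo w\u{f6}rld";
    // Codepoints 6..11 cover the second word.
    let (start, end) = codepoint_range_to_bytes(text, 6, 11).unwrap();
    assert_eq!(&text[start..end], "w\u{f6}rld");
    println!("bytes {}..{} = {}", start, end, &text[start..end]);
}
```

Returning an `Option` makes out-of-range offsets visible to the caller, at the cost of allocating a vector of boundaries for each call.
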

/// A clone of `MergeBy` from itertools.
pub struct MergeBy<Iter, Fun>
where
    Iter: Iterator,
{
    left: Peekable<Iter>,
    right: Peekable<Iter>,
    comp: Fun,
    fused: Option<bool>,
}

impl<Iter, Fun> Iterator for MergeBy<Iter, Fun>
where
    Iter: Iterator,
    Fun: FnMut(&Iter::Item, &Iter::Item) -> bool,
{
    type Item = Iter::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let is_left = match self.fused {
            Some(lt) => lt,
            None => match (self.left.peek(), self.right.peek()) {
                (Some(a), Some(b)) => (self.comp)(a, b),
                (Some(_), None) => {
                    self.fused = Some(true);
                    true
                }
                (None, Some(_)) => {
                    self.fused = Some(false);
                    false
                }
                (None, None) => return None,
            },
        };

        if is_left {
            self.left.next()
        } else {
            self.right.next()
        }
    }
}
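
As an illustration of how a merging adapter like this behaves, the sketch below reimplements a simplified version and feeds it two sorted inputs. The struct name `Merge` and the constructor `merge_by` are assumptions for the example (the crate's own constructor is not shown in this file), and the `fused` shortcut is omitted for brevity.

```rust
use std::iter::Peekable;

/// Simplified merging adapter: yields from whichever side the comparator prefers.
struct Merge<I: Iterator, F> {
    left: Peekable<I>,
    right: Peekable<I>,
    comp: F,
}

/// Hypothetical constructor; `comp` returns `true` when the left item should come first.
fn merge_by<I, F>(left: I, right: I, comp: F) -> Merge<I, F>
where
    I: Iterator,
    F: FnMut(&I::Item, &I::Item) -> bool,
{
    Merge {
        left: left.peekable(),
        right: right.peekable(),
        comp,
    }
}

impl<I, F> Iterator for Merge<I, F>
where
    I: Iterator,
    F: FnMut(&I::Item, &I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // Peek at both heads and decide which side to advance.
        let take_left = match (self.left.peek(), self.right.peek()) {
            (Some(a), Some(b)) => (self.comp)(a, b),
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => return None,
        };

        if take_left {
            self.left.next()
        } else {
            self.right.next()
        }
    }
}

fn main() {
    let older = vec![1, 4, 7];
    let newer = vec![2, 3, 9];
    // Both inputs are sorted ascending, so the merged output is too.
    let merged: Vec<i32> = merge_by(older.into_iter(), newer.into_iter(), |a, b| a <= b).collect();
    println!("{:?}", merged);
}
```
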

/// Helpers for `#[serde(with = "...")]` that load and save `DateTime`s using the format Twitter
/// uses for timestamps.
pub mod serde_datetime {
    use chrono::TimeZone;
    use serde::de::Error;
    use serde::{Deserialize, Deserializer, Serializer};

    const DATE_FORMAT: &str = "%a %b %d %T %z %Y";

    pub fn deserialize<'de, D>(ser: D) -> Result<chrono::DateTime<chrono::Utc>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s = String::deserialize(ser)?;
        let date = (chrono::Utc)
            .datetime_from_str(&s, DATE_FORMAT)
            .map_err(D::Error::custom)?;
        Ok(date)
    }

    pub fn serialize<S>(src: &chrono::DateTime<chrono::Utc>, ser: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        ser.collect_str(&src.format(DATE_FORMAT))
    }
}

/// Helpers for `#[serde(with = "...")]` that save a value as its `Display` string and load it
/// back with `FromStr`.
pub mod serde_via_string {
    use serde::de::Error;
    use serde::{Deserialize, Deserializer, Serializer};

    use std::fmt;

    pub fn deserialize<'de, D, T>(ser: D) -> Result<T, D::Error>
    where
        D: Deserializer<'de>,
        T: std::str::FromStr,
        <T as std::str::FromStr>::Err: std::fmt::Display,
    {
        let str = String::deserialize(ser)?;
        str.parse().map_err(D::Error::custom)
    }

    pub fn serialize<T, S>(src: &T, ser: S) -> Result<S::Ok, S::Error>
    where
        T: fmt::Display,
        S: Serializer,
    {
        ser.collect_str(src)
    }
}


/// Percent-encodes the given string based on the Twitter API specification.
///
/// Twitter bases its encoding scheme on RFC 3986, Section 2.1. They describe the process in full
/// [in their documentation][twitter-percent], but the process can be summarized by saying that
/// every *byte* that is not an ASCII number or letter, or the ASCII characters `-`, `.`, `_`, or
/// `~` must be replaced with a percent sign (`%`) and the byte value in hexadecimal.
///
/// [twitter-percent]: https://developer.twitter.com/en/docs/basics/authentication/oauth-1-0a/percent-encoding-parameters
///
/// When this function was originally implemented, the `percent_encoding` crate did not have an
/// encoding set that matched this, so it was recreated here.
pub fn percent_encode(src: &str) -> PercentEncode {
    lazy_static::lazy_static! {
        static ref ENCODER: AsciiSet = percent_encoding::NON_ALPHANUMERIC
            .remove(b'-')
            .remove(b'.')
            .remove(b'_')
            .remove(b'~');
    }
    utf8_percent_encode(src, &*ENCODER)
}
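
The encoding rule summarized in the doc comment above is simple enough to sketch with only the standard library. The function below is an illustrative stand-in under that summary, with the hypothetical name `percent_encode_sketch`; it returns an owned `String` and does not use the `percent_encoding` crate.

```rust
/// Percent-encodes every byte except ASCII letters, digits, `-`, `.`, `_`, and `~`.
fn percent_encode_sketch(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    for byte in src.bytes() {
        match byte {
            // Unreserved characters pass through untouched.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else becomes `%` followed by two uppercase hex digits.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

fn main() {
    println!("{}", percent_encode_sketch("Ladies + Gentlemen"));
    // Non-ASCII characters are encoded one UTF-8 byte at a time.
    println!("{}", percent_encode_sketch("\u{2603}"));
}
```

Working on bytes rather than `char`s is what makes multi-byte UTF-8 characters expand into several `%XX` groups.
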

#[cfg(test)]
pub(crate) mod tests {
    use super::*;
    use std::fs::File;
    use std::io::Read;

    pub(crate) fn load_file(path: &str) -> String {
        let mut file = File::open(path).unwrap();
        let mut content = String::new();
        file.read_to_string(&mut content).unwrap();
        content
    }

    #[test]
    fn test_codepoints_to_bytes() {
        let unicode = "frônt Iñtërnâtiônàližætiøn ënd";
        // suppose we want to slice out the middle word.
        // 30 codepoints of which we want the middle 20;
        let mut range = (6, 26);
        codepoints_to_bytes(&mut range, unicode);
        assert_eq!(&unicode[range.0..range.1], "Iñtërnâtiônàližætiøn");

        let mut range = (6, 30);
        codepoints_to_bytes(&mut range, unicode);
        assert_eq!(&unicode[range.0..range.1], "Iñtërnâtiônàližætiøn ënd");
    }
}