egg_mode/common/mod.rs
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

//! Set of structs and methods that act as a sort of internal prelude.
//!
//! The elements available in this module and its children are fairly basic building blocks that
//! the other modules all glob-import to make available as a common language. A lot of
//! infrastructure code goes in here.
//!
//! # Module contents
//!
//! Since i split this into multiple files that are then "flattened" into the final module, it's
//! worth giving an inventory of what's in here, since every file has a `use common::*;` in it.
//!
//! ## Type Aliases
//!
//! These types are used commonly enough in the library that they're re-exported here for easy use.
//!
//! * `hyper::header::HeaderMap<hyper::header::HeaderValue>` (re-exported as the alias `Headers`)
//! * `Cow<'static, str>` (available as the alias `CowStr`)
//!
//! ## `ParamList`
//!
//! `ParamList` is a wrapper around a `HashMap` that collects the parameters to a given web call.
//! It's consumed in the auth module, and provides some easy wrappers to consistently handle some
//! types.
//!
//! `add_param` is a basic function that turns its arguments into `Cow<'static, str>`, then inserts
//! them as a parameter into the given `ParamList`.
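/*!

The `Cow` handling described above can be sketched with the standard library alone. This is a
minimal illustration, not the crate's implementation: `Params` is a hypothetical stand-in for
`ParamList`, and the free function `add_param` and the helper `example_params` exist only for
this sketch.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

// Hypothetical stand-in for `ParamList`; only the `Cow` handling is the point here.
type Params = HashMap<Cow<'static, str>, Cow<'static, str>>;

// Converts both arguments into `Cow<'static, str>` and inserts them.
fn add_param(
    mut params: Params,
    key: impl Into<Cow<'static, str>>,
    value: impl Into<Cow<'static, str>>,
) -> Params {
    params.insert(key.into(), value.into());
    params
}

fn example_params() -> Params {
    // A string literal stays borrowed; a `String` is moved in as owned data.
    let params = add_param(Params::new(), "tweet_mode", "extended");
    add_param(params, "count", 20.to_string())
}
```

Passing a literal costs no allocation, while a computed `String` is stored without being copied.
*/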
//!
//! `add_user_param` provides some special handling for the `UserID` enum, since Twitter always
//! handles user parameters the same way: either as a `"user_id"` parameter with the ID, or as a
//! `"screen_name"` parameter with the screen name. Since that's also how the `UserID` enum is laid
//! out, this just puts the right parameter into the given `ParamList`.
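/*!

As a rough sketch of that mapping, assuming a hypothetical `UserId` enum that mirrors `UserID`
and a hypothetical helper `user_param` that returns the pair instead of inserting it:

```rust
// Hypothetical mirror of `UserID`, for illustration only.
enum UserId {
    Id(u64),
    ScreenName(String),
}

// Returns the single key/value pair that identifies the user.
fn user_param(id: UserId) -> (&'static str, String) {
    match id {
        UserId::Id(id) => ("user_id", id.to_string()),
        UserId::ScreenName(name) => ("screen_name", name),
    }
}
```
*/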
34//!
35//! `add_list_param` does the same thing, but for `ListID`. Lists get a little more complicated
36//! than users, though, since there are technically *three* ways to reference a list: by its ID, by
37//! the owner's ID and the list slug, or by the owner's screen name and the list slug. Again, since
38//! Twitter always uses the same set of parameters when referencing a list, this deals with all of
39//! that work in one place, and i can just take a `ListID` from the user and shove it directly into
40//! a `ParamList`.
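/*!

The three ways of referencing a list can be sketched like so. The `UserId` and `ListId` enums
and the `list_params` helper are hypothetical stand-ins written for this example; the parameter
names are the ones `add_list_param` uses.

```rust
// Hypothetical mirrors of `UserID` and `ListID`, for illustration only.
enum UserId {
    Id(u64),
    ScreenName(String),
}

enum ListId {
    Id(u64),
    Slug(UserId, String),
}

// Returns every key/value pair needed to reference the list.
fn list_params(list: ListId) -> Vec<(&'static str, String)> {
    match list {
        ListId::Id(id) => vec![("list_id", id.to_string())],
        ListId::Slug(owner, slug) => {
            let owner = match owner {
                UserId::Id(id) => ("owner_id", id.to_string()),
                UserId::ScreenName(name) => ("owner_screen_name", name),
            };
            vec![owner, ("slug", slug)]
        }
    }
}
```
*/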
//!
//! `multiple_names_param` is for when a function takes an `IntoIterator<Item=UserID>`. It's
//! possible to mix and match the use of the `"user_id"` and `"screen_name"` parameters on these
//! lookup functions, so this saves up all that handling and splits the iterator into two strings:
//! one for the user IDs, one for the screen names.
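/*!

A std-only sketch of that split, assuming a hypothetical `UserId` enum in place of `UserID` and
a hypothetical function name `split_accounts`:

```rust
// Hypothetical mirror of `UserID`, for illustration only.
enum UserId {
    Id(u64),
    ScreenName(String),
}

// Collects IDs and screen names separately, then joins each group with commas.
fn split_accounts(accts: impl IntoIterator<Item = UserId>) -> (String, String) {
    let mut ids = Vec::new();
    let mut names = Vec::new();
    for acct in accts {
        match acct {
            UserId::Id(id) => ids.push(id.to_string()),
            UserId::ScreenName(name) => names.push(name),
        }
    }
    (ids.join(","), names.join(","))
}
```

Either string may come back empty when the input holds only one kind of identifier.
*/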
//!
//! ## Miscellaneous functions
//!
//! `codepoints_to_bytes` is a convenience function that i use when Twitter returns text ranges in
//! terms of codepoint offsets rather than byte offsets. It takes the pair of numbers from Twitter
//! and the string it refers to, and rewrites the pair in place so that it can be used directly to
//! slice the given string. It's also an example of how function parameters are themselves
//! patterns, because i destructure the pair right in the signature. `>_>`
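/*!

The conversion itself can be sketched as below. This is an illustration under assumptions, not
the function in this module: the hypothetical `codepoint_range_to_bytes` returns a new pair
rather than mutating its argument, and it clamps offsets past the end of the text to
`text.len()`. Like the real function, it destructures the pair in the signature.

```rust
// Converts a half-open range of codepoint offsets into byte offsets.
// Offsets at or past the end of the text are clamped to `text.len()`.
fn codepoint_range_to_bytes((start, end): (usize, usize), text: &str) -> (usize, usize) {
    let byte_at = |codepoint: usize| {
        text.char_indices()
            .nth(codepoint)
            .map(|(byte, _)| byte)
            .unwrap_or(text.len())
    };
    (byte_at(start), byte_at(end))
}
```

The returned pair always falls on character boundaries, so slicing `text` with it cannot panic.
*/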
//!
//! `serde_datetime` and `serde_via_string` are helper modules to use with derived
//! `Serialize`/`Deserialize` implementations. `serde_datetime` loads and saves `DateTime`s with
//! the format Twitter uses for timestamps, and `serde_via_string` uses `Display` and `FromStr` to
//! save a string representation of the original type.
//!
//! `merge_by` and its companion type `MergeBy` are a copy of the iterator adapter of the same name
//! from itertools, because i didn't want to add another dependency onto the great towering pile
//! that is my dep tree. `>_>`
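/*!

The merging logic can be sketched with two `Peekable` iterators. Note the assumptions: this
hypothetical `merge_by` collects eagerly into a `Vec`, whereas `MergeBy` is a lazy iterator, and
the closure name `take_left` is made up for the example.

```rust
use std::iter::Peekable;

// Merges two iterators, taking from the left whenever `take_left` returns true.
// When both inputs are sorted under that comparison, the output is sorted too.
fn merge_by<I, F>(left: I, right: I, mut take_left: F) -> Vec<I::Item>
where
    I: Iterator,
    F: FnMut(&I::Item, &I::Item) -> bool,
{
    let mut left: Peekable<I> = left.peekable();
    let mut right: Peekable<I> = right.peekable();
    let mut merged = Vec::new();
    loop {
        let from_left = match (left.peek(), right.peek()) {
            (Some(a), Some(b)) => take_left(a, b),
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break,
        };
        // The chosen side was just peeked as `Some`, so `next` yields exactly one item.
        merged.extend(if from_left { left.next() } else { right.next() });
    }
    merged
}
```
*/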
//!
//! `max_opt` and `min_opt` are helper functions because i didn't realize that `Option` derived
//! `PartialOrd` and `Ord` at the time. Strictly speaking they're subtly different because
//! `std::cmp::{min,max}` require `Ord` and `min_opt` won't reach for the None if it's there,
//! unlike the derived `PartialOrd` which considers None to be less than Some.
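/*!

The difference is easiest to see side by side. The `min_opt` below is a hypothetical
reconstruction written for this note (its signature is an assumption), compared against
`std::cmp::min` on `Option` values:

```rust
// Hypothetical reconstruction: the smaller of two values, preferring `Some` over `None`.
fn min_opt<T: PartialOrd>(left: Option<T>, right: Option<T>) -> Option<T> {
    match (left, right) {
        (Some(l), Some(r)) => Some(if l < r { l } else { r }),
        // At most one side holds a value here, so keep whichever one does.
        (l, r) => l.or(r),
    }
}
```

With this definition `min_opt(Some(3), None)` is `Some(3)`, while `std::cmp::min(Some(3), None)`
is `None`, because the derived ordering puts `None` before every `Some`.
*/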
//!
//! ## Authentication functions
//!
//! The functions `get`, `post`, and `post_json` are re-exported here to keep people from having to
//! qualify them from `auth::raw`.
//!
//! ## `Response`
//!
//! Also in its own module, `Response` is a public structure that contains rate-limit information
//! from Twitter, alongside some other desired output. This type is used all over the place in
//! egg-mode, because i wanted to make sure people always had rate-limit information on hand. The
//! module also contains the types and functions that all web calls go through: the ones that load
//! a web call, parse out the rate-limit headers, and call some handler to perform final processing
//! on the result.
//!
//! `request_with_json_response` is the most common future constructor, which just defers to
//! `raw_request` (which just calls `serde_json` and loads up the rate-limit headers)
//! then deserializes the JSON response to the given type.
//!
//! `rate_headers` is an infra function that takes the `Headers` and returns an empty `Response`
//! with the rate-limit info parsed out. It's only exported for a couple functions in `list` which
//! need to get that info even on an error.
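/*!

Parsing the rate-limit info out of a set of headers might look roughly like this. Everything
here is an assumption made for illustration: the `RateLimit` struct and its field names are
hypothetical, a plain `HashMap` of lowercase header names stands in for `Headers`, and the
header names are the `x-rate-limit-*` trio that Twitter's API documentation describes. The
function only shares its name with the real `rate_headers`.

```rust
use std::collections::HashMap;

// Hypothetical holder for the parsed values; the field names are assumptions.
#[derive(Debug, Default, PartialEq)]
struct RateLimit {
    limit: Option<u32>,
    remaining: Option<u32>,
    reset: Option<u64>,
}

// Reads the rate-limit headers from a plain map of lowercase header names.
// Missing or malformed values become `None`.
fn rate_headers(headers: &HashMap<String, String>) -> RateLimit {
    RateLimit {
        limit: headers.get("x-rate-limit-limit").and_then(|v| v.parse().ok()),
        remaining: headers.get("x-rate-limit-remaining").and_then(|v| v.parse().ok()),
        reset: headers.get("x-rate-limit-reset").and_then(|v| v.parse().ok()),
    }
}
```
*/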

use std::borrow::Cow;
use std::collections::HashMap;
use std::future::Future;
use std::iter::Peekable;
use std::pin::Pin;

use hyper::header::{HeaderMap, HeaderValue};
use percent_encoding::{utf8_percent_encode, AsciiSet, PercentEncode};

mod response;

pub use crate::auth::raw::{get, post, post_json};

pub use crate::common::response::*;
use crate::{error, list, user};

/// Macro to create a `Serialize`/`Deserialize` implementation allowing for deserialization via the
/// given "raw" struct or via a "round-trip" using the type's own serialization.
///
/// This macro takes two arguments: the name of a "raw" type, and a public struct definition. The
/// given struct must implement `From` or `TryFrom` for the given raw type. In return, it derives
/// `Serialize` and `Deserialize` for the struct, and creates a handful of helper types to modify
/// the `Deserialize` implementation.
///
/// ## Warning
///
/// If you're adding this to something that should have custom (de-)serialization logic on some
/// fields (e.g. `DateTime`), make sure to add both the `serialize_with` and `deserialize_with`
/// attributes to the struct definition. All the attributes are copied into the `SerCopy` struct,
/// so it inherits the deserialization logic that otherwise goes unused. If you don't do this, then
/// the type will fail to "round-trip" properly and may create an error when you try to deserialize
/// from the saved data.
///
/// ## Example
///
/// ```rust,ignore (internal-items)
/// use crate::common::*;
///
/// round_trip! { raw::RawDummyStruct,
///     /// A dummy struct to demonstrate `round_trip!`.
///     pub struct DummyStruct {
///         // ...
///     }
/// }
///
/// impl From<raw::RawDummyStruct> for DummyStruct {
///     fn from(src: raw::RawDummyStruct) -> DummyStruct {
///         // ...
///     }
/// }
/// ```
///
/// ## Implementation
///
/// This macro abuses the `#[serde(untagged)]` enum representation to allow it to deserialize via
/// the existing "raw" type, or the generated "SerCopy" struct which is a field-for-field copy of
/// the original struct. This way, either representation can be used to load the struct without the
/// overhead of loading it all into a `serde_json::Value` first to manually decode into either
/// type.
macro_rules! round_trip {
    ( $raw_name:path,
      $(#[$outer_attr:meta])*
      pub struct $struct_name:ident { $(
          $(#[$attr:meta])*
          $v:vis $f:ident : $t:ty
      ),+ $(,)? } ) => {
        $(#[$outer_attr])*
        #[derive(serde::Serialize)]
        #[derive(serde::Deserialize)]
        #[serde(try_from = "SerEnum")]
        pub struct $struct_name { $(
            $(#[$attr])*
            $v $f: $t
        ),+ }

        #[allow(unused_qualifications)]
        impl crate::common::RoundTrip for $struct_name {
            fn upstream_deser_error(input: serde_json::Value) -> Option<String> {
                use crate::common::MapString;

                serde_json::from_value::<$raw_name>(input).err().map_string()
            }

            fn roundtrip_deser_error(input: serde_json::Value) -> Option<String> {
                use crate::common::MapString;

                serde_json::from_value::<SerCopy>(input).err().map_string()
            }
        }

        #[derive(serde::Deserialize)]
        struct SerCopy { $(
            $(#[$attr])*
            $v $f: $t
        ),+ }

        impl From<SerCopy> for $struct_name {
            fn from(src: SerCopy) -> $struct_name {
                $struct_name { $(
                    $f: src.$f
                ),+ }
            }
        }

        #[derive(serde::Deserialize)]
        #[serde(untagged)]
        enum SerEnum {
            Raw($raw_name),
            Ser(SerCopy),
        }

        #[allow(unused_qualifications)]
        impl std::convert::TryFrom<SerEnum> for $struct_name
        where
            $struct_name: std::convert::TryFrom<$raw_name>,
        {
            type Error = <$struct_name as std::convert::TryFrom<$raw_name>>::Error;

            fn try_from(src: SerEnum) -> std::result::Result<$struct_name, Self::Error> {
                use std::convert::TryInto;

                match src {
                    SerEnum::Raw(raw) => raw.try_into(),
                    SerEnum::Ser(ser) => Ok(ser.into()),
                }
            }
        }
    };
}

/// Types that implement `Deserialize` either by loading from upstream JSON, or via a "round-trip"
/// serialization.
///
/// Starting in egg-mode 0.16, select types gained a `Serialize` implementation, which caused them
/// to require special handling when deserializing. This special handling created an issue for when
/// errors occur: When the input data didn't match the expected type definition, the only error
/// that would be returned is a generic `"data did not match any variant of untagged enum
/// SerEnum"`. In an attempt to allow these errors to be recovered, this trait was created.
///
/// If you get an error when trying to load a type that implements `RoundTrip`, and can isolate it
/// to a specific instance of data, you can try to load it with either of these functions to see
/// the specific error. For example, to find the error from loading a user:
///
/// ```no_run
/// use egg_mode::user::TwitterUser;
/// use egg_mode::raw::{self, RoundTrip};
///
/// # #[tokio::main]
/// # async fn main() {
/// # let token: egg_mode::Token = unimplemented!();
/// let url = "https://api.twitter.com/1.1/users/show.json";
/// let params = raw::ParamList::new().add_user_param("rustlang".into());
/// let req = raw::request_get(url, &token, Some(&params));
/// let resp = raw::response_json::<serde_json::Value>(req).await.unwrap();
///
/// if let Some(msg) = TwitterUser::upstream_deser_error(resp.response) {
///     println!("there was an error: {}", msg);
/// }
/// # }
/// ```
pub trait RoundTrip {
    /// Returns the string representation of an error from loading JSON from Twitter, if
    /// applicable.
    ///
    /// Use this function if trying to load something from the API gave you a
    /// deserialization error.
    fn upstream_deser_error(input: serde_json::Value) -> Option<String>;

    /// Returns the string representation of an error from loading JSON given by
    /// serializing this type.
    ///
    /// Use this function if trying to load saved JSON from saving previously-loaded data
    /// gave you a deserialization error.
    fn roundtrip_deser_error(input: serde_json::Value) -> Option<String>;
}

// n.b. this type alias is re-exported in the `raw` module - these docs are public!
/// A set of headers returned with a response.
pub type Headers = HeaderMap<HeaderValue>;
/// A string that is either a borrowed `&'static str` or an owned `String`.
pub type CowStr = Cow<'static, str>;

// n.b. this type is re-exported in the `raw` module - these docs are public!
/// Represents a list of parameters to a Twitter API call.
///
/// This type is a wrapper around a `HashMap<Cow<'static, str>, Cow<'static, str>>` to collect a
/// set of parameter key/value pairs. These are then used to assemble and sign a Twitter API
/// request. The `Cow` type is used to avoid having to allocate a `String` if a string literal is
/// used for a parameter. All the functions that add parameters to this `ParamList` accept `impl
/// Into<Cow<'static, str>>`, meaning that either a string literal or an owned `String` may be
/// used.
///
/// Most of the functions to add parameters follow a builder pattern, so that you can assemble a
/// `ParamList` in a single statement:
///
/// ```
/// use egg_mode::raw::ParamList;
///
/// // If you were looking up the user `@rustlang` with `GET users/show`, you might assemble a
/// // ParamList like this...
/// let params = ParamList::new()
///     .extended_tweets()
///     .add_user_param("rustlang".into());
/// ```
#[derive(Debug, Clone, Default, derive_more::Deref, derive_more::DerefMut, derive_more::From)]
pub struct ParamList(HashMap<Cow<'static, str>, Cow<'static, str>>);

impl ParamList {
    /// Creates a new, empty `ParamList`.
    pub fn new() -> Self {
        Self(HashMap::new())
    }

    /// Adds the `tweet_mode=extended` parameter to this `ParamList`. Not including this parameter
    /// will cause tweets to be loaded with legacy parameters, and a potentially-truncated `text`
    /// if the tweet is longer than 140 characters. The `Deserialize` impl for `Tweet`s (or
    /// anything that directly or indirectly includes a `Tweet`) expects the extended tweet format
    /// enabled by this function.
    pub fn extended_tweets(self) -> Self {
        self.add_param("tweet_mode", "extended")
    }

    /// Adds the given key/value parameter to this `ParamList`.
    pub fn add_param(
        mut self,
        key: impl Into<Cow<'static, str>>,
        value: impl Into<Cow<'static, str>>,
    ) -> Self {
        self.insert(key.into(), value.into());
        self
    }

    /// Adds the given key/value parameter to this `ParamList` only if the given value is `Some`.
    ///
    /// This can be a convenient wrapper to use in case you may or may not want to include
    /// something based on some condition. If the given value is `None`, then the `ParamList` is
    /// returned unmodified.
    pub fn add_opt_param(
        self,
        key: impl Into<Cow<'static, str>>,
        value: Option<impl Into<Cow<'static, str>>>,
    ) -> Self {
        match value {
            Some(val) => self.add_param(key.into(), val.into()),
            None => self,
        }
    }

    /// Adds the given key/value to this `ParamList` by mutating it in place, rather than consuming
    /// it as in `add_param`.
    pub fn add_param_ref(
        &mut self,
        key: impl Into<Cow<'static, str>>,
        value: impl Into<Cow<'static, str>>,
    ) {
        self.0.insert(key.into(), value.into());
    }

    /// Adds the given `UserID` as a parameter to this `ParamList` by adding either a `user_id` or
    /// `screen_name` parameter as appropriate.
    pub fn add_user_param(self, id: user::UserID) -> Self {
        match id {
            user::UserID::ID(id) => self.add_param("user_id", id.to_string()),
            user::UserID::ScreenName(name) => self.add_param("screen_name", name),
        }
    }

    /// Adds the given `ListID` as a parameter to this `ParamList` by adding either an
    /// `owner_id`/`owner_screen_name` and `slug` pair, or a `list_id`, as appropriate.
    pub fn add_list_param(mut self, list: list::ListID) -> Self {
        match list {
            list::ListID::Slug(owner, name) => {
                match owner {
                    user::UserID::ID(id) => {
                        self.add_param_ref("owner_id", id.to_string());
                    }
                    user::UserID::ScreenName(name) => {
                        self.add_param_ref("owner_screen_name", name);
                    }
                }
                self.add_param("slug", name.clone())
            }
            list::ListID::ID(id) => self.add_param("list_id", id.to_string()),
        }
    }

    /// Merge the parameters from the given `ParamList` into this one.
    pub(crate) fn combine(&mut self, other: ParamList) {
        self.0.extend(other.0);
    }

    /// Renders this `ParamList` as an `application/x-www-form-urlencoded` string.
    ///
    /// The key/value pairs are printed as `key1=value1&key2=value2`, with all keys and values
    /// being percent-encoded according to Twitter's requirements.
    pub fn to_urlencoded(&self) -> String {
        self.0
            .iter()
            .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
            .collect::<Vec<_>>()
            .join("&")
    }
}

// Helper trait to stringify the contents of an Option
pub(crate) trait MapString {
    fn map_string(&self) -> Option<String>;
}

impl<T: std::fmt::Display> MapString for Option<T> {
    fn map_string(&self) -> Option<String> {
        self.as_ref().map(|v| v.to_string())
    }
}

/// Splits the given accounts into two comma-separated strings: the first holds the user IDs, the
/// second holds the screen names.
pub fn multiple_names_param<T, I>(accts: I) -> (String, String)
where
    T: Into<user::UserID>,
    I: IntoIterator<Item = T>,
{
    let mut ids = Vec::new();
    let mut names = Vec::new();

    for x in accts {
        match x.into() {
            user::UserID::ID(id) => ids.push(id.to_string()),
            user::UserID::ScreenName(name) => names.push(name),
        }
    }

    (ids.join(","), names.join(","))
}

/// Convenient type alias for futures that resolve to responses from Twitter.
pub(crate) type FutureResponse<T> =
    Pin<Box<dyn Future<Output = error::Result<Response<T>>> + Send>>;

/// Converts the given pair of codepoint offsets into byte offsets, in place, so that the pair can
/// be used directly to slice `text`.
pub fn codepoints_to_bytes(&mut (ref mut start, ref mut end): &mut (usize, usize), text: &str) {
    let mut byte_start = *start;
    let mut byte_end = *end;
    for (ch_offset, (by_offset, _)) in text.char_indices().enumerate() {
        // Checked independently, so that an empty range (`start == end`) converts both offsets.
        if ch_offset == *start {
            byte_start = by_offset;
        }
        if ch_offset == *end {
            byte_end = by_offset;
        }
    }
    *start = byte_start;
    // An end offset equal to the codepoint count points one past the last character.
    if text.chars().count() == *end {
        *end = text.len()
    } else {
        *end = byte_end
    }
}

/// A clone of MergeBy from Itertools.
pub struct MergeBy<Iter, Fun>
where
    Iter: Iterator,
{
    left: Peekable<Iter>,
    right: Peekable<Iter>,
    comp: Fun,
    fused: Option<bool>,
}

impl<Iter, Fun> Iterator for MergeBy<Iter, Fun>
where
    Iter: Iterator,
    Fun: FnMut(&Iter::Item, &Iter::Item) -> bool,
{
    type Item = Iter::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let is_left = match self.fused {
            Some(lt) => lt,
            None => match (self.left.peek(), self.right.peek()) {
                (Some(a), Some(b)) => (self.comp)(a, b),
                (Some(_), None) => {
                    self.fused = Some(true);
                    true
                }
                (None, Some(_)) => {
                    self.fused = Some(false);
                    false
                }
                (None, None) => return None,
            },
        };

        if is_left {
            self.left.next()
        } else {
            self.right.next()
        }
    }
}

pub mod serde_datetime {
    use chrono::TimeZone;
    use serde::de::Error;
    use serde::{Deserialize, Deserializer, Serializer};

    const DATE_FORMAT: &str = "%a %b %d %T %z %Y";

    pub fn deserialize<'de, D>(ser: D) -> Result<chrono::DateTime<chrono::Utc>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s = String::deserialize(ser)?;
        let date = (chrono::Utc)
            .datetime_from_str(&s, DATE_FORMAT)
            .map_err(D::Error::custom)?;
        Ok(date)
    }

    pub fn serialize<S>(src: &chrono::DateTime<chrono::Utc>, ser: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        ser.collect_str(&src.format(DATE_FORMAT))
    }
}

pub mod serde_via_string {
    use serde::de::Error;
    use serde::{Deserialize, Deserializer, Serializer};

    use std::fmt;

    pub fn deserialize<'de, D, T>(ser: D) -> Result<T, D::Error>
    where
        D: Deserializer<'de>,
        T: std::str::FromStr,
        <T as std::str::FromStr>::Err: std::fmt::Display,
    {
        let str = String::deserialize(ser)?;
        str.parse().map_err(D::Error::custom)
    }

    pub fn serialize<T, S>(src: &T, ser: S) -> Result<S::Ok, S::Error>
    where
        T: fmt::Display,
        S: Serializer,
    {
        ser.collect_str(src)
    }
}

/// Percent-encodes the given string based on the Twitter API specification.
///
/// Twitter bases its encoding scheme on RFC 3986, Section 2.1. They describe the process in full
/// [in their documentation][twitter-percent], but the process can be summarized by saying that
/// every *byte* that is not an ASCII number or letter, or the ASCII characters `-`, `.`, `_`, or
/// `~` must be replaced with a percent sign (`%`) and the byte value in hexadecimal.
///
/// [twitter-percent]: https://developer.twitter.com/en/docs/basics/authentication/oauth-1-0a/percent-encoding-parameters
///
/// When this function was originally implemented, the `percent_encoding` crate did not have an
/// encoding set that matched this, so it was recreated here.
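/**

The rule summarized above can be sketched without the `percent_encoding` crate. This is an
illustrative std-only version, not what this function does internally; the name
`percent_encode_unreserved` is hypothetical, and it returns an owned `String` rather than a
lazy `PercentEncode`.

```rust
// Percent-encodes every byte outside the RFC 3986 unreserved set.
fn percent_encode_unreserved(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    for byte in src.bytes() {
        match byte {
            // ASCII letters, digits, and `-`, `.`, `_`, `~` pass through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else becomes `%` plus the byte value as two uppercase hex digits.
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}
```

Because the loop walks bytes rather than characters, a multi-byte UTF-8 character turns into
one `%XX` group per byte.
*/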
pub fn percent_encode(src: &str) -> PercentEncode {
    lazy_static::lazy_static! {
        static ref ENCODER: AsciiSet = percent_encoding::NON_ALPHANUMERIC
            .remove(b'-')
            .remove(b'.')
            .remove(b'_')
            .remove(b'~');
    }
    utf8_percent_encode(src, &*ENCODER)
}

#[cfg(test)]
pub(crate) mod tests {
    use super::*;
    use std::fs::File;
    use std::io::Read;

    pub(crate) fn load_file(path: &str) -> String {
        let mut file = File::open(path).unwrap();
        let mut content = String::new();
        file.read_to_string(&mut content).unwrap();
        content
    }

    #[test]
    fn test_codepoints_to_bytes() {
        let unicode = "frônt Iñtërnâtiônàližætiøn ënd";
        // suppose we want to slice out the middle word.
        // 30 codepoints of which we want the middle 20;
        let mut range = (6, 26);
        codepoints_to_bytes(&mut range, unicode);
        assert_eq!(&unicode[range.0..range.1], "Iñtërnâtiônàližætiøn");

        let mut range = (6, 30);
        codepoints_to_bytes(&mut range, unicode);
        assert_eq!(&unicode[range.0..range.1], "Iñtërnâtiônàližætiøn ënd");
    }
}
583}