Drop Rate

A Rust crate for choosing outcomes based on a weighted probability list.

You can use this to generate outcomes from a table built on the fly, or to store tables and reuse them so that repeated trials draw from the same table.

Random

The RandomTable uses the most naive random selection possible. It is more or less as close to "true" random as the pseudo-random number generator used by Rust. It exhibits all the famous attributes of random numbers, such as clustering and unpredictability, and eventually (given enough trials) a trend toward the theoretical ratio of results.

If all you want is a tool to help you pick outcomes randomly, then this is a fairly simple, direct drop-in solution.

How it Works

Upon creation, the RandomTable sums the weights of all possible outcomes in the table to determine a total. When you call random(), it generates a random number between 0 and that total, finds which entry's weight range the number landed in, and returns that outcome.
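
For illustration, here is a minimal sketch of that approach (not the crate's actual code; the names here are invented), using the rand crate:

```rust
use rand::Rng;

/// Weighted random selection, sketched: roll a number in [0, total)
/// and walk the entries until the roll falls inside one entry's band.
fn weighted_pick<'a>(outcomes: &[(&'a str, f64)]) -> &'a str {
    let total: f64 = outcomes.iter().map(|(_, w)| w).sum();
    let mut roll = rand::thread_rng().gen_range(0.0..total);
    for &(name, weight) in outcomes {
        if roll < weight {
            return name;
        }
        roll -= weight;
    }
    outcomes.last().unwrap().0 // floating-point edge case: return the last entry
}

fn main() {
    let table = [("common", 70.0), ("uncommon", 25.0), ("rare", 5.0)];
    for _ in 0..10 {
        println!("{}", weighted_pick(&table));
    }
}
```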

Notes

When used for games, random distributions can be really brutal or really generous. If something has a 10% probability, then given an infinite number of trials, you will see that outcome an average of once for every 10 trials. But you could also see it take 100 trials before you see that outcome, or you could get the outcome on your first trial. The nature of "random" is that every trial has exactly the same probability, and the past does not affect the probability of the next result. (You could talk about how likely any given sequence of outcomes might be, but that's not the same thing at all.)
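
To put numbers on that: with a 10% drop rate, the chance of seeing nothing in 20 trials is 0.9^20 ≈ 12%, and roughly 1 in 37,000 players will still be empty-handed after 100 trials (0.9^100 ≈ 0.0027%) -- rare, but over a large player base it's guaranteed to happen to someone.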

Most people are really bad at understanding how "random" actually works, which gives rise to the gambler's fallacy (https://en.wikipedia.org/wiki/Gambler%27s_fallacy). It leads users to (incorrectly) assume that games have systems built in to "screw them over" sometimes, when, in fact, that's just how genuinely random distributions work.

"Random" numbers generated by a computer are never actually random. There is usually a predictable outcome which yields a consistent sequence of numbers given a constant starting "seed." More correctly, these are called "pseudo-random" number sequences. The conform fairly well to the attributes of an actually random sequence, and the sequence is (correct me if I'm wrong) generally impossible to predict without knowing both the random seed and the index through the sequence (though I suppose a sequence of numbers suitably long could help you figure out the index to a reasonable degree). So please be forgiving if you see "random" in a computer sense and know that it's actually not a "true" random, but for most day-to-day uses, it's "random enough."

If you want actual randomness, here is a website which uses "quantum fluctuations of the vacuum" to generate number sequences that have, thus far, proven statistically indistinguishable from true random: https://qrng.anu.edu.au/

Fairly Random

The Fairly Random algorithm is designed to give more "human-like" pseudo-random outcomes, with less clustering and a faster trend toward theoretical probability. Put another way, it makes the gambler's fallacy come true. These outcomes, of course, will not even remotely resemble true randomness.

More specifically, outcomes become less likely immediately after they are chosen and more likely the longer they go unchosen. Clustering is still possible (i.e., you can have a "hot streak") and is an intended feature of this algorithm, but clusters are less frequent, and longer runs become exponentially less likely the longer they continue. The result is that most users will experience what feels to them like a "fair" random distribution.

This is useful for creating scenarios where you want outcomes which aren't quite predictable (without keeping track), but which will conform to user expectation and consistently reward the hope that "it's gotta happen soon." One use would be in games with random item drops that players are expected to grind for. A common player experience is feeling like the game (or its developers) intentionally built the underlying system to prevent them from acquiring the item they desire. In reality, this is just how random numbers work -- they are not nice, and they don't care how many dozens of times you've tried to get a 10% drop. The odds are always the same, and to many players this isn't fun. If you want to be nicer to your players and keep them from grinding potentially much longer than intended, this is the algorithm for you!

How it Works

Fairly Random takes the probability weight of each trial's outcome and redistributes it proportionally across all possible outcomes (including the selected outcome). As an example, suppose outcome A has a 20% probability of occurring, B has 50%, and C has 30%. If A is selected, its 20 points are redistributed across the table in proportion to the original weights: B increases by 10 percentage points (50% of 20) to 60%, C increases by 6 points (30% of 20) to 36%, and A is set to only 4% (20% of 20). All of the weight is preserved within the system (60% + 36% + 4% = 100%). It is still possible for A to be selected again, but it is now FAR more likely that B or C will be selected -- as though a human being were trying to stay true to a "sense" of randomness. If outcome B is selected next, its 60 points are redistributed the same way, always using the original weight ratio, not the modified one: A gains 12 and becomes 16%, B is set to 30%, and C gains 18 and becomes 54%. C is now the most probable outcome, which "makes sense" to many people because it hasn't been chosen yet.
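
Here is that redistribution step sketched in isolation (illustrative only, not the crate's actual implementation), reproducing the A/B/C walkthrough above:

```rust
/// Redistribute the chosen outcome's current weight across the whole
/// table in proportion to the ORIGINAL weights, so long-run ratios stay
/// anchored to the configured table.
fn redistribute(original: &[f64], current: &mut [f64], picked: usize) {
    let total: f64 = original.iter().sum();
    let freed = current[picked]; // weight given up by the chosen outcome
    current[picked] = 0.0;
    for (cur, orig) in current.iter_mut().zip(original) {
        *cur += freed * orig / total;
    }
}

fn main() {
    let original = [20.0, 50.0, 30.0]; // A, B, C
    let mut current = original;
    redistribute(&original, &mut current, 0); // A is selected
    println!("{:?}", current); // [4.0, 60.0, 36.0]
    redistribute(&original, &mut current, 1); // then B is selected
    println!("{:?}", current); // [16.0, 30.0, 54.0]
}
```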

Notes

I haven't done the mathematical analysis properly, but Fairly Random appears not only to return the expected ratio of results, but to converge on the theoretical ratios faster than the basic pseudo-random generator does. In hindsight this is probably obvious: the algorithm actively works toward the theoretical results while still allowing randomness to take effect.

TODO: Explain how Fairly Random works...

ShuffleRandom

(Future) This table requires the probability of each outcome to be specified as an integer. It then creates a shuffled list (like a deck of cards) of all possible outcomes, which you walk through by calling the random() function. The list automatically reshuffles all outcomes once it has been fully consumed.
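
As a sketch of the intended behavior (this table does not exist in the crate yet; the names are invented):

```rust
use rand::seq::SliceRandom;

/// A weight-N outcome becomes N "cards" in a deck that is drawn down
/// and reshuffled once empty.
struct ShuffleTable<T: Clone> {
    entries: Vec<(T, u32)>,
    deck: Vec<T>,
}

impl<T: Clone> ShuffleTable<T> {
    fn new(entries: Vec<(T, u32)>) -> Self {
        Self { entries, deck: Vec::new() }
    }

    fn random(&mut self) -> T {
        if self.deck.is_empty() {
            // Rebuild: one card per unit of weight, then shuffle.
            for (outcome, weight) in &self.entries {
                for _ in 0..*weight {
                    self.deck.push(outcome.clone());
                }
            }
            self.deck.shuffle(&mut rand::thread_rng());
        }
        self.deck.pop().unwrap()
    }
}

fn main() {
    let mut table = ShuffleTable::new(vec![("hit", 3), ("miss", 1)]);
    for _ in 0..8 {
        print!("{} ", table.random()); // each aligned run of 4 draws has exactly 3 hits
    }
    println!();
}
```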

Notes

Because the "deck" is reshuffled only when it runs out, the only time it is possible to get a weight-1 outcome multiple times in a row is across a shuffle boundary, where an outcome at the end of one deck could be reshuffled to the beginning of the next. (If an outcome has a probability weight > 1, it could theoretically occur in succession as many times as its weight, since each unit of weight is a separate "card" in the deck.) In other words, with a "deck" of the numbers 1-10, each with a probability weight of 1, the same number could occur twice in a row, but only straddling a multiple of 10 trials, so the clustering seen in a truer random sequence would not happen.

PlaylistRandom

(Future) Playlist is intended to provide a compelling random selector as one might use for playing random songs in a media player.

How do expectations for a shuffled playlist differ from a shuffled deck of cards?

  • When the playlist starts again, it shouldn't pick a song that has played recently.
  • Maybe use a threshold/priority tradeoff for continuous playback: play the single stalest song outright if it has waited long enough, or otherwise pick from among the least-recently-played options (within a threshold of "oldest"); see the sketch below.
    • Make sure that this actually allows the order of selections to change.
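
A rough sketch of that tradeoff (nothing here is in the crate yet; STALE_AFTER and WINDOW are parameters invented for illustration):

```rust
use rand::Rng;

const STALE_AFTER: u32 = 20; // a song this overdue gets played outright
const WINDOW: usize = 5;     // otherwise, choose among the WINDOW oldest

/// `ages` holds, per song, how many picks have elapsed since it last
/// played. Returns the next song to play and updates the ages.
fn next_song(ages: &mut [(String, u32)]) -> String {
    // Sort most-stale first.
    ages.sort_by(|a, b| b.1.cmp(&a.1));
    let idx = if ages[0].1 >= STALE_AFTER {
        0 // stale enough: play the most overdue song
    } else {
        // A random pick within the oldest few keeps the order changing
        // from cycle to cycle.
        rand::thread_rng().gen_range(0..WINDOW.min(ages.len()))
    };
    let title = ages[idx].0.clone();
    for (i, entry) in ages.iter_mut().enumerate() {
        entry.1 = if i == idx { 0 } else { entry.1 + 1 };
    }
    title
}
```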

TODO: Implement and explain

TrueRandom

(Future) This is probably the slowest algorithm. It literally queries an online service (such as the ANU QRNG above) for random numbers for every trial (bonus: it requests numbers for multiple trials at once to minimize down-time), ensuring that your randomness is as close to true random as possible without paying for commercial-grade hardware random number generators (HRNGs).
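
The batching idea might look something like this (purely a sketch; fetch_batch is a hypothetical stand-in for the real HTTP call to the service):

```rust
/// Keep a local buffer of numbers from the remote service and refill it
/// in bulk, so most trials never wait on the network.
struct TrueRandomSource {
    buffer: Vec<u64>,
}

impl TrueRandomSource {
    fn new() -> Self {
        Self { buffer: Vec::new() }
    }

    fn next(&mut self) -> u64 {
        if self.buffer.is_empty() {
            // One network round-trip buys many future trials.
            self.buffer = fetch_batch(256);
        }
        self.buffer.pop().unwrap()
    }
}

/// Hypothetical placeholder for querying an online QRNG service.
fn fetch_batch(n: usize) -> Vec<u64> {
    unimplemented!("issue an HTTP request for {n} numbers here")
}
```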

Recurrence Interval

(Future) When studying languages or learning new terms, research suggests a system of increasing recurrence intervals: items seen recently recur very frequently at first, and then with decreasing frequency as they become more "familiar" to the person studying.

This random table will initially be shuffled (e.g., like ShuffleRandom) and set to a moderate recurrence. Options already returned will have a higher priority than unseen ones until they pass the default recurrence threshold, at which point new items will be introduced from the total list.

This would be ideal for creating a flash-card program, but it could be used in any scenario where it's useful to familiarize the user with things they've already seen and give them time to master each item before introducing new ones (e.g., generating random dungeons and choosing which monsters to add to each level).
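
A minimal sketch of the scheduling idea (none of this is in the crate yet): each item carries a "due time" on a virtual clock, and being picked doubles its interval, so familiar items recur less and less often:

```rust
struct Entry<T> {
    item: T,
    interval: u64, // grows as the item becomes familiar
    due: u64,      // virtual-clock tick at which the item is next due
}

struct RecurrenceTable<T> {
    clock: u64,
    entries: Vec<Entry<T>>,
}

impl<T: Clone> RecurrenceTable<T> {
    fn new(items: Vec<T>) -> Self {
        let entries = items
            .into_iter()
            .enumerate()
            // Stagger initial due times so new items are introduced gradually.
            .map(|(i, item)| Entry { item, interval: 1, due: i as u64 })
            .collect();
        Self { clock: 0, entries }
    }

    fn next(&mut self) -> T {
        self.clock += 1;
        // Pick the most overdue entry (a real version would add the fuzzy
        // threshold mentioned in the notes below).
        let e = self
            .entries
            .iter_mut()
            .min_by_key(|e| e.due)
            .expect("table must not be empty");
        e.interval *= 2; // seen again: recur less often next time
        e.due = self.clock + e.interval;
        e.item.clone()
    }
}
```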

Notes

  • Could specify the initial recurrence value for each item, thus making each entry inherently more or less common from the start as deemed ideal (e.g., you could categorize monsters by difficulty, and then monsters from each difficulty level could be chosen and introduced in random orders, but each tier would be seen before introducing any from the next tier).
  • Add a fuzzy threshold so that if items are close-to-the-same threshold, they won't always be in the same order the next time they come around.
  • Initialize from a Vec<T> and auto-select a weight.
  • Hold onto the most recent trial result and allow the API to adjust the recurrence interval (multiplier?) (e.g., with flash-cards, if the user doesn't remember, mark to recur sooner rather than later).
  • Allow custom recurrence algorithm/function for custom linear/non-linear recurrence intervals.
  • Make sure intervals can be used like real-world time but also work fine when not representing a real-world time interval.
    • This might mean that users could supply a "time since last trial" value that reprioritizes the list?

Roadmap

  • Implement Shuffle table
  • Implement Playlist table
  • Make a macro to more easily create tables in-line?
  • Generate some sample code (for now, check out the tests -- it's pretty straightforward)
  • Output statistics from a table
    • e.g., theoretical probabilities, relative probabilities, etc...
    • e.g., probability of N in a row, or other patterns
  • Create reference tables, which hold their own local history, but can reference other base tables