Module backup

§NOT part of the final solution – Legacy backup module

This module is NOT part of the final distributed system solution.

It was originally developed as a concept for local fault tolerance, where a backup process would start automatically in a separate terminal if the main elevator program crashed. This idea was inspired by the fault tolerance mechanisms presented in the real-time lab exercises in TTK4145.

§Intended Failover Behavior (Not Active in Final Design):

  • To automatically restart the elevator program locally in case of crashes.
  • To allow the elevator to serve pending tasks while offline, even without reconnecting to the network.
  • To eventually rejoin the network and synchronize with the system once a connection is restored.

§Why is it not part of our solution?

After discussions with course assistants and a better understanding of the assignment, it became clear that:

  • The project aims to implement a distributed system, not local persistence or replication.
  • A local failover process like this is conceptually similar to writing to a file and reloading, which is explicitly not the intended direction of the assignment.
  • All call redundancy and recovery should happen through the shared synchronized worldview, not through isolated local state or takeover logic.

As a result, the failover behavior was disabled (e.g., by using high takeover timeouts), and this module now functions purely as a GUI client:

  • Connects to the master
  • Receives WorldView updates
  • Visualizes elevator state and network status as colorized terminal output
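
As a rough illustration of the two ideas above (a takeover timeout set so high that failover never triggers, and a colorized status print), here is a minimal sketch. It is not the module's actual code: `WorldViewSnapshot`, `should_take_over`, and `render` are hypothetical names, and the real WorldView type is assumed to be richer than the simplified struct shown here.

```rust
use std::time::Duration;

/// Hypothetical, simplified snapshot of the shared state.
/// The project's real WorldView type is assumed to hold more than this.
#[derive(Debug, Clone, PartialEq)]
pub struct WorldViewSnapshot {
    pub master_id: u8,
    pub elevator_floors: Vec<u8>,
    pub network_up: bool,
}

/// Returns true if the backup should take over, given how long it has been
/// since the last update arrived from the master.
pub fn should_take_over(since_last_update: Duration, takeover_timeout: Duration) -> bool {
    since_last_update > takeover_timeout
}

/// Renders the snapshot as text, coloring the network status with ANSI
/// escape codes (green when up, red when down).
pub fn render(wv: &WorldViewSnapshot) -> String {
    const GREEN: &str = "\x1b[32m";
    const RED: &str = "\x1b[31m";
    const RESET: &str = "\x1b[0m";

    let (color, label) = if wv.network_up {
        (GREEN, "ONLINE")
    } else {
        (RED, "OFFLINE")
    };

    let mut out = format!(
        "Master: {} | Network: {}{}{}\n",
        wv.master_id, color, label, RESET
    );
    for (id, floor) in wv.elevator_floors.iter().enumerate() {
        out.push_str(&format!("  Elevator {}: floor {}\n", id, floor));
    }
    out
}
```

With `takeover_timeout` set to `Duration::MAX`, `should_take_over` can never return true, which is one way to leave the failover path in place while keeping it effectively disabled. The client then only calls something like `render` on each received update.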

§Summary:

  • This is a separate visualization tool, not part of the distributed control logic.
  • It remains in the codebase as a helpful debug utility, but should not be considered a part of the system design.

§Note:

In industrial applications, local crash recovery might be useful, especially to avoid reinitializing the elevator in a potentially unstable state. For example, if a bug caused a crash, restarting at the same point could lead to an immediate second crash. A clean backup process, starting with the previous tasks, can offer a more controlled re-entry.

However, this type of resilience mechanism falls outside the scope and intention of this assignment, which emphasizes distributed coordination and recovery via the networked WorldView, not local persistence or reboot logic.

Functions§

run_as_backup
Entry point for the backup program (invoked with cargo run -- backup).
start_backup_server
Starts the backup server, listening for incoming backup clients and transmitting the current system state (WorldView) and network status.
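
The server/client relationship described for these two functions can be sketched with the standard library alone. This is an assumption-laden illustration, not the signatures of `run_as_backup` or `start_backup_server`: the helper names `serve_one_client` and `receive_state`, the single-client handling, and the plain-text wire format are all hypothetical.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Server side (hypothetical): accepts one backup client on a background
/// thread and sends it the current state as a single line of text.
pub fn serve_one_client(
    listener: TcpListener,
    state_line: String,
) -> thread::JoinHandle<std::io::Result<()>> {
    thread::spawn(move || {
        let (mut stream, _addr) = listener.accept()?;
        stream.write_all(state_line.as_bytes())?;
        // The stream is dropped here, which closes the connection and
        // signals end-of-data to the client.
        Ok(())
    })
}

/// Client side (hypothetical): connects to the server and reads everything
/// it sends until the connection is closed.
pub fn receive_state(addr: SocketAddr) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    let mut buf = String::new();
    stream.read_to_string(&mut buf)?;
    Ok(buf)
}
```

A real server would presumably keep the connection open and push a new WorldView on every change; the one-shot exchange here only shows the direction of data flow, from the main program to the backup client.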