Blog Post: Where Recovery?
@@ -18,6 +18,19 @@
<category>Life</category>
<category>Mental Health</category>
<category>Health</category>
<item>
<title>Where Recovery?</title>
<pubDate>13 April, 2026</pubDate>
<link>https://www.cutieguwu.ca/blog/posts/7_where_recovery.html</link>
<description>
Going over what's up with kramer, and how it's [hopefully] going to change for the better.
</description>
<category>Coding</category>
<category>Programming</category>
<category>Optical Media</category>
<category>Rust</category>
<category>Data Recovery</category>
</item>
<item>
<title>Hack Racing</title>
<pubDate>28 March, 2026</pubDate>
@@ -1,6 +1,13 @@
<li class="spacer_container blog_recent_posts">
<p class="title">Recent Posts</p>
<ul class="section_list">
<li>
<header>
<p class="name">Where Recovery?</p>
<p class="subtitle">13 April, 2026</p>
<a href="/blog/posts/7_where_recovery.html" class="status">View</a>
</header>
</li>
<li>
<header>
<p class="name">Hack Racing</p>
@@ -0,0 +1,412 @@
|
||||
<!doctype html>
|
||||
|
||||
<html lang="en-ca">
|
||||
<head>
|
||||
<title>Where Recovery? | Cutieguwu</title>
|
||||
<include src="includes/meta.html" />
|
||||
<link rel="stylesheet" type="text/css" href="/styles/blog_post.css" />
|
||||
</head>
|
||||
<body>
|
||||
<nav class="pane">
|
||||
<include src="includes/nav_header.html" />
|
||||
<include src="includes/nav_menu.html" />
|
||||
<div class="location">
|
||||
<p class="title">You are here:</p>
|
||||
<p class="page">Blog - Where Recovery?</p>
|
||||
</div>
|
||||
<include src="includes/nav_quick_links.html" />
|
||||
</nav>
|
||||
<main class="pane blog">
|
||||
<div>
|
||||
<header>
|
||||
<h1 class="title">Where Recovery?</h1>
|
||||
<p class="date">Posted: 13 April, 2026</p>
|
||||
<p class="date">Last Edited: 13 April, 2026</p>
|
||||
</header>
|
||||
<!-- Insert Megamind head meme here -->
|
||||
<p>
|
||||
"Cutieguwu," I hear the void cry, "where is kramer's development
|
||||
at? You know, the optical disc recovery utility you've been so keen to work on,
|
||||
and have talked about so much that you're wishing there was a better way to
|
||||
refer to it than the working name <em>kramer</em> or
|
||||
<em>optical disc recovery utility</em>."
|
||||
</p>
|
||||
<p>Ah... right.</p>
|
||||
<h2>Where the Project Came From</h2>
|
||||
<p>
|
||||
I honestly didn't expect to get as far along with <code>kramer</code> as I did,
|
||||
before I ever formally announced it. After all, I have no experience messing
|
||||
around in the lower levels of the kernel. All my experience was in the abstract
|
||||
scripting land of semi-OOP Python.
|
||||
</p>
|
||||
<p>
|
||||
And let me be the first to admit, my last major Python project was a disaster of
a codebase. Hacky workarounds everywhere, little forethought, and
<em>a lot</em> of nesting.
|
||||
</p>
|
||||
<p>
|
||||
But, being the over-ambitious fool that I am, and given a bit of thought for
|
||||
cleaning up my code disasters, I started learning Rust. I wanted, and needed,
|
||||
something more performant than Python that could still baby-step my bumbling
|
||||
arse through to low-level interactions and memory handling.
|
||||
</p>
|
||||
<p>
|
||||
Besides this, I grew more keen to collect and preserve my family's optical media.
In my efforts, I came across some discs that, despite my best cleaning efforts
and different drives, with and without LibreDrive patches on supported boards,
could not be read to a recoverable state.
|
||||
</p>
|
||||
<p>
|
||||
So, I tried <code>ddrescue</code>, and from there I encountered the issues
|
||||
outlined in my original post about <code>kramer</code>.
|
||||
</p>
|
||||
<h2>Where the Project Stalled</h2>
|
||||
<p>
|
||||
<code>kramer</code> was originally meant to be a simple data scraper, and its
|
||||
codebase very much reflected that, being almost entirely procedural. It also
|
||||
tried to learn from <code>ddrescue</code>'s structure, for better or for worse.
|
||||
After all, <code>kramer</code> was but a simple data scraper, and I wasn't
|
||||
trying to reinvent the wheel in a domain of programming I was just starting to
|
||||
dip my toes into.
|
||||
</p>
|
||||
<p>
|
||||
However, as my list of goals for <code>kramer</code> expanded, and as I took a
|
||||
course centered around basic architecture design, it became clear that
|
||||
<code>kramer</code> needed to be refactored with clear structural design goals in
|
||||
place.
|
||||
</p>
|
||||
<p>Also, I never want to write in Java ever again*.</p>
|
||||
<h2>What's the Plan?</h2>
|
||||
<p>
|
||||
<code>kramer</code>'s design goals aren't entirely cohesive, and I don't know
|
||||
many architecture paradigms. The course introduced the Model-View-Controller
|
||||
paradigm, and... it looks like it may actually work quite well to appropriately
|
||||
separate a number of the major issues.
|
||||
</p>
|
||||
<p>
|
||||
The TUI and CLI ultimately want to make this a total disaster. Pushing them into
|
||||
a View role handled by a generic Controller should eliminate issues with unique
|
||||
interfaces. Almost like a whack adapter.
|
||||
</p>
|
||||
<p>
|
||||
Remember how I said I never want to write Java again? Well... I may end up doing
|
||||
a mock of the overall project structure in Java, despite my discontent with the
|
||||
language. The advantage being that I can forget about the borrow checker and
|
||||
focus on the architecture. Then, adapt that to sensible Rust. There are a few
|
||||
reasons why I would choose Java over Python for this:
|
||||
</p>
|
||||
<ul>
|
||||
<li>Python's OOP isn't nearly as well done as Java's</li>
|
||||
<li>
|
||||
Java has proper (i.e. integrated into the language) Interfaces ("Protocols" in
Python), which transfer easily to Rust's Traits.
|
||||
</li>
|
||||
<li>
|
||||
Java's paradigms, libraries, and core data types are more similar to Rust.
|
||||
</li>
|
||||
<li>
|
||||
Static typing. <code>mypy</code> can enforce static typing, and that's how I
|
||||
use Python nowadays, but it can fail to resolve polymorphism and shadows
|
||||
appropriately.
|
||||
</li>
|
||||
</ul>
|
||||
<p>
|
||||
And, as much as Java and Python's enums are wildly limited compared to Rust's
enums and pattern matching, Python somehow still does it worse.
|
||||
</p>
|
||||
<h3>Goals</h3>
|
||||
<ul>
|
||||
<li>
|
||||
Recovery Mechanisms
|
||||
<ul>
|
||||
<li>Data Scraping</li>
|
||||
<li>
|
||||
File structure validation
|
||||
<ul>
|
||||
<li>ISO9660</li>
|
||||
<li>UTF-8</li>
|
||||
<li>MPEG-2 Headers</li>
|
||||
<li>Requires post-processing.</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
Hash-based validation
|
||||
<ul>
|
||||
<li>
|
||||
Hash a section, possibly attempt to brute force the hash to
|
||||
repair.
|
||||
</li>
|
||||
<li>Requires pre-processing.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
TUI (akin to <code>ddrescueview</code>)
|
||||
<ul>
|
||||
<li>Stats</li>
|
||||
<li>Progress</li>
|
||||
<li>Disc Properties</li>
|
||||
<li>Visual map</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
CLI
|
||||
<ul>
|
||||
<li>Stats</li>
|
||||
<li>Progress</li>
|
||||
<li>Disc Properties</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
i18n
|
||||
<ul>
|
||||
<li>English</li>
|
||||
<li>French</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h3>Considerations</h3>
|
||||
<ul>
|
||||
<li>The TUI and CLI share a number of common readouts.</li>
|
||||
<li>
|
||||
The system needs to function asynchronously
|
||||
<ul>
|
||||
<li>Drives have a habit of misbehaving with buggered data streams.</li>
|
||||
<li>
|
||||
TUI and CLI need to be responsive, and not hang from an unresponsive
|
||||
drive.
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>i18n should be shared between TUI and CLI.</li>
|
||||
<li>
|
||||
I may try to implement support for using LibreDrive.
|
||||
<ul>
|
||||
<li>
|
||||
Admittedly, there's a <em>very</em> slim chance of this, but there
|
||||
does appear to at least be some source for
|
||||
<code>libdriveio</code> in the <code>makemkv-oss</code> Linux
|
||||
downloads.
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
Maintain support for reading data without <code>DIRECT_IO</code>.
|
||||
<ul>
|
||||
<li>Polymorphism for the reader, baby.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h3>Libraries</h3>
|
||||
<ul>
|
||||
<li>Clap (duh)</li>
|
||||
<li>
|
||||
Ratatui
|
||||
<ul>
|
||||
<li>
|
||||
May also be useful for the simple CLI interface, but at the very
|
||||
least, it'll handle the TUI view.
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2>New Architecture</h2>
|
||||
<p>The highest level of abstracted architecture follows MVC.</p>
|
||||
<h3>Views and Controller</h3>
|
||||
<p>
|
||||
Obviously, this is going to be the CLI and TUI run through an interface to the
|
||||
controller. The controller will spawn one view or the other based upon a
|
||||
command-line argument.
|
||||
</p>
|
||||
<p>
|
||||
The controller should also handle the localization, and pass localized strings
|
||||
to the views as they need them.
|
||||
</p>
|
||||
<p>
|
||||
The goal of this is clean code, reusability of the controller (which has the
|
||||
most non-generalizable behaviour), and potential support for a GUI later on.
|
||||
</p>
|
||||
<p>
|
||||
The last thing I want to cement into the Controller is a toggle to run only
|
||||
preprocessors (more on that later).
|
||||
</p>
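<p>
As a rough sketch of the shape this could take (not the final design), here is a
controller that picks a view and hands it localized strings. Every name in it
(<code>View</code>, <code>Controller</code>, <code>ViewKind</code>,
<code>Progress</code>) is a placeholder I made up for illustration, and the "TUI"
is just a text bar standing in for real widgets.
</p>

```rust
use std::collections::HashMap;

/// Progress snapshot the controller hands to whichever view is active.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct Progress {
    pub sectors_done: u64,
    pub sectors_total: u64,
}

/// Hypothetical interface shared by every front end.
pub trait View {
    /// Render one update. The text is returned so the sketch is easy to test.
    fn render(&mut self, label: &str, progress: &Progress) -> String;
}

pub struct CliView;
pub struct TuiView;

impl View for CliView {
    fn render(&mut self, label: &str, p: &Progress) -> String {
        format!("{label}: {}/{}", p.sectors_done, p.sectors_total)
    }
}

impl View for TuiView {
    fn render(&mut self, label: &str, p: &Progress) -> String {
        // A real TUI would draw widgets; a ten-cell text bar stands in here.
        let width = 10u64;
        let filled = if p.sectors_total == 0 {
            0
        } else {
            p.sectors_done * width / p.sectors_total
        };
        let bar: String = (0..width)
            .map(|i| if i < filled { '#' } else { '.' })
            .collect();
        format!("{label} [{bar}]")
    }
}

/// Which view to spawn, e.g. decided from a command-line argument.
pub enum ViewKind {
    Cli,
    Tui,
}

pub struct Controller {
    view: Box<dyn View>,
    strings: HashMap<&'static str, &'static str>,
}

impl Controller {
    pub fn new(kind: ViewKind, locale: &str) -> Self {
        let view: Box<dyn View> = match kind {
            ViewKind::Cli => Box::new(CliView),
            ViewKind::Tui => Box::new(TuiView),
        };
        // The controller owns localization; views only ever see final strings.
        let mut strings = HashMap::new();
        match locale {
            "fr" => strings.insert("progress", "Progression"),
            _ => strings.insert("progress", "Progress"),
        };
        Controller { view, strings }
    }

    pub fn report(&mut self, progress: &Progress) -> String {
        let label = self.strings.get("progress").copied().unwrap_or("progress");
        self.view.render(label, progress)
    }
}
```

<p>
Because the views only ever receive finished strings, adding a GUI later would
mean one more <code>View</code> implementation and one more
<code>ViewKind</code> variant.
</p>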
|
||||
<h3>Model</h3>
|
||||
<p>This is where everything explodes and most of the refactoring occurs.</p>
|
||||
<p>
|
||||
A lot of things are needlessly hard-coded in and inter-connected; the issues
|
||||
with separation of tasks become apparent.
|
||||
</p>
|
||||
<p>
|
||||
Another major problem is that it's borrowing concepts from
|
||||
<code>ddrescue</code>'s C++ architecture, while crappily patching in bad Rust
|
||||
conversions, like <code>DirectIOBuffer</code>. Which is to say, a mish-mash of
|
||||
conflicting paradigms and bad, highly ignorant coding practices.
|
||||
</p>
|
||||
<p>
|
||||
So, what about the refactor? Well, it's looking like it's going to be a number
|
||||
of independent libraries.
|
||||
</p>
|
||||
<p>Here's the current idea for the architecture:</p>
|
||||
<h4>AlignedBufReader</h4>
|
||||
<p>
|
||||
Effectively, a proper, full-featured replacement of the stupidity that is
|
||||
<code>DirectIOBuffer</code>, taking the idea of wrapping a generic
|
||||
<a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a>
|
||||
like
|
||||
<a href="https://doc.rust-lang.org/std/io/struct.BufReader.html"
|
||||
><code>BufReader</code></a
|
||||
>.
|
||||
</p>
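<p>
To make the idea concrete, here is a minimal sketch of what such a wrapper might
look like. It is an assumption-laden toy, not the planned implementation: the
<code>new</code> and <code>read_block</code> signatures are invented here, and it
only aligns the buffer's start address. Real direct I/O usually also wants
aligned file offsets and transfer lengths, which this does not handle.
</p>

```rust
use std::io::{self, Read};

/// Hypothetical sketch: reads whole blocks from `inner` into a buffer whose
/// start address is a multiple of `align`.
pub struct AlignedBufReader<R> {
    inner: R,
    storage: Vec<u8>, // over-allocated so an aligned window always fits
    offset: usize,    // where the aligned window starts inside `storage`
    block: usize,     // size of the aligned window in bytes
}

impl<R: Read> AlignedBufReader<R> {
    /// `align` must be a power of two.
    pub fn new(inner: R, block: usize, align: usize) -> Self {
        assert!(align.is_power_of_two(), "alignment must be a power of two");
        // `storage` is never resized afterwards, so the heap address (and
        // therefore the computed offset) stays valid for the reader's lifetime.
        let storage = vec![0u8; block + align];
        let offset = storage.as_ptr().align_offset(align);
        assert!(offset < align, "could not align the buffer");
        AlignedBufReader { inner, storage, offset, block }
    }

    /// Fill the aligned window with up to one block and return the bytes read.
    /// An empty slice means the source is exhausted.
    pub fn read_block(&mut self) -> io::Result<&[u8]> {
        let window = &mut self.storage[self.offset..self.offset + self.block];
        let mut filled = 0;
        while filled < window.len() {
            match self.inner.read(&mut window[filled..]) {
                Ok(0) => break, // end of input
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        Ok(&window[..filled])
    }
}
```

<p>
Since it is generic over <code>Read</code>, the same type could wrap a plain
<code>File</code>, a <code>Cursor</code> in tests, or whatever ends up
representing the drive.
</p>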
|
||||
<h4>Recovery Pipeline</h4>
|
||||
<p>
|
||||
The point of the recovery pipeline is to introduce flexibility into implementing
|
||||
new recovery mechanisms. This is the new core of the Model.
|
||||
</p>
|
||||
<p>
|
||||
The pipeline will work through a plugin system, allowing
|
||||
<em>dynamic</em> runtime discovery and loading of recovery mechanisms. This will
|
||||
likely leverage Rust's <code>dylib</code>s (a.k.a. shared objects, DLLs, or
|
||||
Dynamic Libraries), and potentially later expand to supporting
|
||||
<code>cdylib</code> (the OG, C's shared objects) if there's a need.
|
||||
</p>
|
||||
<p>
|
||||
Somewhere in the setup of the pipeline will have to be a system for resolving
|
||||
incompatible plugins, plugin dependencies, and where in the pipeline a plugin
|
||||
lives.
|
||||
</p>
|
||||
<p>From a distance, plugin placement is relatively simple:</p>
|
||||
<ol>
|
||||
<li>Preprocessors</li>
|
||||
<li>Data Scraping (Packaged with the Model)</li>
|
||||
<li>Postprocessors</li>
|
||||
</ol>
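<p>
The three-stage ordering above can be sketched with an ordered
<code>enum</code>. The names <code>Stage</code>, <code>Step</code>, and
<code>plan</code> are hypothetical, and the <code>preprocess_only</code> flag
stands in for the "run only preprocessors" toggle on the Controller.
</p>

```rust
/// Pipeline stages. The derived `Ord` follows declaration order, so
/// Preprocess < Scrape < Postprocess.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Preprocess,
    Scrape,
    Postprocess,
}

/// One entry in the pipeline (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Step {
    pub name: &'static str,
    pub stage: Stage,
}

/// Order the steps by stage, optionally keeping only the preprocessors.
pub fn plan(mut steps: Vec<Step>, preprocess_only: bool) -> Vec<Step> {
    if preprocess_only {
        steps.retain(|s| s.stage == Stage::Preprocess);
    }
    // `sort_by_key` is stable, so steps within a stage keep their given order.
    steps.sort_by_key(|s| s.stage);
    steps
}
```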
|
||||
<h4>Preprocessors</h4>
|
||||
<p>
|
||||
Preprocessors operate by reading the original data, and storing error-correcting
|
||||
data for a postprocessor to leverage.
|
||||
</p>
|
||||
<p>
|
||||
I might try to enforce a standard for this error-correcting data, even if it's
|
||||
as simple as storing error-correction files in a standard compressed archival
|
||||
format that has error-correction mechanisms.
|
||||
</p>
|
||||
<p>Examples of preprocessors would be hashing mechanisms.</p>
|
||||
<h4>Data Scraping</h4>
|
||||
<p>This should mostly follow the procedures of the current system.</p>
|
||||
<h4>Postprocessors</h4>
|
||||
<p>
|
||||
Postprocessors operate by reading the scraped data, and potentially combining it
|
||||
with cached recovery information from preprocessors, to repair the scraped data.
|
||||
</p>
|
||||
<p>
|
||||
Examples of postprocessors would be structure validation, and attempting
|
||||
recovery through brute forcing a region to match its cached hash.
|
||||
</p>
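<p>
A toy version of that hash-then-brute-force pairing might look like the
following. It assumes the simplest possible case, where exactly one byte at a
known position is bad. The function names are invented, and
<code>DefaultHasher</code> is only a stand-in from the standard library: its
output isn't guaranteed stable across Rust releases, so a real preprocessor would
want a proper, fixed hash.
</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest. Not cryptographic and not stable across Rust versions.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Preprocessor side: record a digest of a region while it is still readable.
pub fn preprocess(sector: &[u8]) -> u64 {
    digest(sector)
}

/// Postprocessor side: try every value for one suspect byte until the region
/// matches the cached digest. Returns `true` if a match was found; otherwise
/// the original byte is restored. Panics if `bad_index` is out of bounds.
pub fn repair_byte(sector: &mut [u8], bad_index: usize, expected: u64) -> bool {
    let original = sector[bad_index];
    for candidate in 0..=u8::MAX {
        sector[bad_index] = candidate;
        if digest(sector) == expected {
            return true;
        }
    }
    sector[bad_index] = original;
    false
}
```

<p>
The search space grows by a factor of 256 per unknown byte, so this only stays
practical for very small damaged regions.
</p>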
|
||||
<h4>Plugins</h4>
|
||||
<p>Plugins need systems to:</p>
|
||||
<ul>
|
||||
<li>Report incompatible plugins</li>
|
||||
<li>
|
||||
Report dependent plugins, and their relative ordering in the pipeline
|
||||
<ul>
|
||||
<li>
|
||||
Enforce including a relative position to the Data Scraping plugin?
|
||||
Effectively, "is this a preprocessor, or postprocessor?"
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Read from the drive</li>
|
||||
<li>Read and write the scraped data</li>
|
||||
<li>Read and write recovery data</li>
|
||||
<li>Read and write to the map</li>
|
||||
</ul>
|
||||
<p>They might also want systems to:</p>
|
||||
<ul>
|
||||
<li>
|
||||
Call upon other plugins as dependencies.
|
||||
<ul>
|
||||
<li>They are shared objects, after all.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
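<p>
Leaving the actual dynamic loading aside, the reporting half of that list could
be sketched as a trait plus a small resolver. Everything here
(<code>Plugin</code>, <code>StaticPlugin</code>, <code>resolve</code>,
<code>ResolveError</code>) is hypothetical, and it assumes plugin names are
unique.
</p>

```rust
use std::collections::HashSet;

/// Hypothetical plugin interface covering only the reporting requirements.
pub trait Plugin {
    fn name(&self) -> &str;
    /// Names of plugins that must not be loaded alongside this one.
    fn conflicts_with(&self) -> Vec<&str> {
        Vec::new()
    }
    /// Names of plugins that must run before this one.
    fn depends_on(&self) -> Vec<&str> {
        Vec::new()
    }
}

/// Simple data-driven plugin, handy for tests.
pub struct StaticPlugin {
    pub name: &'static str,
    pub conflicts: Vec<&'static str>,
    pub deps: Vec<&'static str>,
}

impl Plugin for StaticPlugin {
    fn name(&self) -> &str {
        self.name
    }
    fn conflicts_with(&self) -> Vec<&str> {
        self.conflicts.clone()
    }
    fn depends_on(&self) -> Vec<&str> {
        self.deps.clone()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    Conflict(String, String),
    MissingDependency(String, String),
    /// No valid order exists (a dependency cycle, or duplicate names).
    Cycle,
}

/// Check conflicts and dependencies, then return the names in a runnable order.
pub fn resolve(plugins: &[Box<dyn Plugin>]) -> Result<Vec<String>, ResolveError> {
    let names: HashSet<&str> = plugins.iter().map(|p| p.name()).collect();
    for p in plugins {
        for c in p.conflicts_with() {
            if names.contains(c) {
                return Err(ResolveError::Conflict(p.name().to_string(), c.to_string()));
            }
        }
        for d in p.depends_on() {
            if !names.contains(d) {
                return Err(ResolveError::MissingDependency(
                    p.name().to_string(),
                    d.to_string(),
                ));
            }
        }
    }
    // Repeatedly place every plugin whose dependencies are already placed.
    let mut ordered: Vec<String> = Vec::new();
    let mut placed: HashSet<String> = HashSet::new();
    while ordered.len() < plugins.len() {
        let before = ordered.len();
        for p in plugins {
            let ready = p.depends_on().iter().all(|d| placed.contains(*d));
            if ready && !placed.contains(p.name()) {
                placed.insert(p.name().to_string());
                ordered.push(p.name().to_string());
            }
        }
        if ordered.len() == before {
            return Err(ResolveError::Cycle);
        }
    }
    Ok(ordered)
}
```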
|
||||
<h4>Mapping</h4>
|
||||
<p>With the plugin system, mapping gets more difficult.</p>
|
||||
<p>
|
||||
Before, mapping was rather simple and could be a small
|
||||
<code>enum</code> representing the recovery stage. However, the new map will
|
||||
have to use dynamic tagging of some sort, as various plugins may want to pass
|
||||
mapping information between each other.
|
||||
</p>
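<p>
One way dynamic tagging might look, purely as a sketch: string tags attached to
sector ranges, instead of a fixed <code>enum</code>. The <code>TagMap</code> type
and the dotted tag names are made up for illustration, and this naive version
does nothing about merging or splitting overlapping ranges.
</p>

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical map: each tag name owns a list of sector ranges.
#[derive(Default)]
pub struct TagMap {
    tags: BTreeMap<String, Vec<Range<u64>>>,
}

impl TagMap {
    /// Attach `name` to a half-open range of sectors.
    pub fn tag(&mut self, name: &str, sectors: Range<u64>) {
        self.tags.entry(name.to_string()).or_default().push(sectors);
    }

    /// All tags covering `sector`, in alphabetical order (BTreeMap key order).
    pub fn tags_at(&self, sector: u64) -> Vec<&str> {
        self.tags
            .iter()
            .filter(|(_, ranges)| ranges.iter().any(|r| r.contains(&sector)))
            .map(|(name, _)| name.as_str())
            .collect()
    }
}
```

<p>
A preprocessor could then call something like
<code>map.tag("iso9660.file_header", 16..17)</code>, and a postprocessor could
ask which tags cover a sector before deciding what to attempt.
</p>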
|
||||
<p>
|
||||
For example, a preprocessor could map out the locations of all files. Then, a
|
||||
postprocessor could attempt to repair headers that have been damaged, and/or the
|
||||
file system tree. Using a preprocessor to map this information is more reliable
|
||||
as there may not be enough of the header left to validate that it is in fact a
|
||||
header, and not just data that's reminiscent of one.
|
||||
</p>
|
||||
<p>
|
||||
Along with dynamic tagging, there needs to be some kind of standard naming
|
||||
practice in place beforehand. But, there are limits to how well this can be
|
||||
enforced.
|
||||
</p>
|
||||
<p>
|
||||
Further details of how mapping will work will need to be sorted out. I still
|
||||
need a cleaner way of updating the status of regions than... whatever the heck
|
||||
the current system is. I don't understand how I got it working, but hey. Maybe I
|
||||
can at least reuse the tests now that I actually know all the overlaps?
|
||||
</p>
|
||||
<p>
|
||||
I don't think the tagging system should handle any form of incompatibility
|
||||
testing. That should be left to the plugins' own systems to handle
|
||||
appropriately.
|
||||
</p>
|
||||
<h2>The Ultimate Challenges</h2>
|
||||
<p>
|
||||
Really, all that stands in my way is the motivation to keep working on this, and
|
||||
in a <em>sensible</em> way. I have to always ensure that I'm not just writing
|
||||
code, but I'm thinking about the larger structure all the while, lest I code
|
||||
myself into a pit once more.
|
||||
</p>
|
||||
<p>
|
||||
Also, there stand the issues that I have yet to solve with sleepy and/or
permission-revoking drives. I don't actually know what's happening. I may have
|
||||
to learn to inspect the SCSI communications to know better what the drive is
|
||||
doing when it hits hard-to-read areas.
|
||||
</p>
|
||||
<p>
|
||||
It also looks like it could be fun, and useful, to dig into SCSI controls:
|
||||
<a href="https://en.wikipedia.org/wiki/Optical_disc_drive#SCSI_configuration"
|
||||
>https://en.wikipedia.org/wiki/Optical_disc_drive#SCSI_configuration</a
|
||||
>
|
||||
</p>
|
||||
<p>
|
||||
There's also the
|
||||
<a href="https://sg.danny.cz/sg/sdparm.html"
|
||||
><code>sdparm</code> Documentation</a
|
||||
>
|
||||
to dig into for general knowledge, and how SCSI actually works.
|
||||
</p>
|
||||
<h2>Why all of this?</h2>
|
||||
<p>
|
||||
Well, after having spent two courses reading requirements sheets and writing
|
||||
code around them, it's become natural to consult a spec.
|
||||
</p>
|
||||
<p>
|
||||
And again, the biggest failing of all my projects has been a lack of structural
|
||||
forethought. Usually, that's because I haven't the faintest idea where I want to
|
||||
go with it. But in this instance, I know enough about what I want to make
|
||||
<em>possible</em>, and have enough sense not to just dive in blindly a third, or
|
||||
perhaps fourth, time.
|
||||
</p>
|
||||
</div>
|
||||
<include src="includes/tailer.html" />
|
||||
</main>
|
||||
<ul class="pane spacer">
|
||||
<include src="./includes/blog_recent_posts.html" />
|
||||
<li class="spacer_container">#AD</li>
|
||||
</ul>
|
||||
<include src="includes/footer.html" />
|
||||
<include src="includes/scripts.html" />
|
||||
</body>
|
||||
</html>
|
||||