DispatchToElsewhere: A security vulnerability in my world

HiddenLayer555@lemmy.ml · 9 days ago


Zonetrooper@lemmy.world · 7 days ago

As an engineer, I have two responses to this. The first is that this is an incredibly cool bit of worldbuilding; you’ve done your work to actually show how such a critical vulnerability might arise and how it works.

The second, of course, is awestruck horror that they would let such a critical system exist without checking or validating inputs. It sounds like if a malformed but non-malicious message was sent, then the pilot would get a rejected dispatch message… but what would happen if a new dispatch needed to be issued mid-flight (e.g., due to weather or airspace disruption)? Would a different system handle that?

Either way, really cool addition.

early_riser@lemmy.radio · 2 days ago

I like this a lot. My own conworld also involves quadrupedal sophonts (aliens rather than animals), and I’ve thought about writing RFCs for some of their communication protocols.

YAP (yinrih ansible protocol) is a link layer protocol that manages messaging on an ansible link. Communication via ansible is very low bandwidth, meaning interplanetary networks look like 80s BBS’s or very very early (we’re talking still at CERN) websites. Ansibles use a kind of subspace called the Underlay. In order for two ansibles to communicate, they must contain wafers of tailstone shaved from the same monocrystal. Raw tailstone (or tailstone precursors) are mined and manufactured into monocrystals in a similar process to how silicon ingots are grown. At a certain point in this growth process, the crystal “locks”, and any wafers shaved from that monocrystal will only communicate with other wafers from the same monocrystal.

The Underlay link is shared between all such wafers, similar to a shared wifi channel, so individual ansibles have to take turns sending frames. You can achieve full duplex communication by growing two separate monocrystals, shaving two wafers from each, and placing one wafer from each crystal into each of two ansibles, with one wafer serving as an RX interface and its partner on the other ansible serving as TX, and vice versa. In practice, this is only done on high capacity trunk lines, as more Underlay links require more power.

State actors can perform a supply chain attack by ordering a tailstone fab to grow monocrystals twice as large, break them in half, send half downstream to their customers and give the other half to the government, which can spy on communications made via those ansibles undetected or perform MITM attacks.
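To make the shared-link idea concrete, here is a toy model in Python. It is only a sketch of the behaviour described above: the `Underlay` class, its method names, and the crystal labels are all made up for illustration, and turn-taking and power costs are ignored entirely.

```python
from collections import defaultdict


class Underlay:
    """Toy model: every wafer cut from the same monocrystal shares one link."""

    def __init__(self):
        # crystal_id -> list of inboxes, one per attached wafer (hypothetical model)
        self._wafers = defaultdict(list)

    def attach(self, crystal_id):
        """Attach a wafer cut from crystal_id and return that wafer's inbox."""
        inbox = []
        self._wafers[crystal_id].append(inbox)
        return inbox

    def send(self, crystal_id, sender_inbox, frame):
        """Deliver a frame to every other wafer cut from the same crystal."""
        for inbox in self._wafers[crystal_id]:
            if inbox is not sender_inbox:
                inbox.append(frame)


underlay = Underlay()

# Full duplex: one crystal per direction, one wafer of each in both ansibles.
a_tx = underlay.attach("crystal-1")  # ansible A transmits on crystal 1
b_rx = underlay.attach("crystal-1")  # ansible B receives on crystal 1
b_tx = underlay.attach("crystal-2")  # ansible B transmits on crystal 2
a_rx = underlay.attach("crystal-2")  # ansible A receives on crystal 2

# Supply chain attack: a third wafer from the "other half" of crystal 1.
spy_rx = underlay.attach("crystal-1")

underlay.send("crystal-1", a_tx, "hello from A")
underlay.send("crystal-2", b_tx, "hello from B")
```

After those two sends, the eavesdropper's inbox holds the same frame as B's, and nothing on the link tells A or B that a third wafer exists.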

YIP (yinrih internetworking protocol) is a network layer protocol that can serve in either a reliable or a best effort configuration. While the address space of the original YIP was exhausted millennia ago and a new, but incompatible, YIP version was developed with a much larger address space, there are still single-stack networks using the older protocol at the time of First Contact.

Because yinrih evolved writing rather than inventing it, their digital age stretches back to the mid-Pleistocene despite only gaining sapience around the same time as humanity, and because yinrih live over 700 Earth years on average, their networks are built to be apocalypse-proof. With such robust networks with enough uptime to be measured in geologic terms, some networks are left to run forgotten for millennia. This gives rise to the discipline of cyberarcheology, which specializes in ferreting out these archeonets and uncovering their secrets. Cyberarcheologists are experts in defunct communication protocols, obsolete storage formats, and long outmoded hardware architectures.

HiddenLayer555@lemmy.ml · edit-2 7 days ago

Thank you!

but what would happen if a new dispatch needed to be issued mid-flight (e.g., due to weather or airspace disruption)? Would a different system handle that?

That would go through no problem, and since the hovercraft keeps track of which entry ID it’s currently executing, you can give it extra steps by rearranging/adding entries after the current one (ATC would also be able to poll the hovercraft for its current entry ID).

In the worst case scenario, ATC can call up the pilot, ask them to disconnect the autopilot, stop the hovercraft in midair, and tell the hovercraft to wait for a brand new dispatch message. Once it has received such a message, the pilot can manually select which entry ID it should start at, and then reconnect the autopilot.
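A rough sketch of that mid-flight amendment, in Python. The `Entry` and `DispatchTable` classes, the `amend` and `poll_current` names, and the waypoint `MPK500` are all hypothetical; the only behaviour taken from the description above is that entries up to the current one stay untouched and ATC can poll the current entry ID.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    entry_id: int
    entry_type: str
    name: str


@dataclass
class DispatchTable:
    entries: list
    current_id: int = 0  # entry the autopilot is executing right now

    def amend(self, new_tail):
        """Replace everything after the current entry with new_tail."""
        kept = [e for e in self.entries if e.entry_id <= self.current_id]
        # Renumber the new entries so IDs stay contiguous after the current one.
        renumbered = [
            Entry(self.current_id + 1 + i, e.entry_type, e.name)
            for i, e in enumerate(new_tail)
        ]
        self.entries = kept + renumbered

    def poll_current(self):
        """What ATC would get back when polling for the current entry ID."""
        return self.current_id


table = DispatchTable(
    entries=[
        Entry(0, "DispatchStart", "FMT1002"),
        Entry(1, "FlightStart", "MPKL2"),
        Entry(2, "WayPoint", "MPK927"),
        Entry(3, "WayPoint", "ALD103"),
    ],
    current_id=1,
)

# Reroute around weather: swap the next waypoint, keep the one after it.
table.amend([Entry(0, "WayPoint", "MPK500"), Entry(0, "WayPoint", "ALD103")])
```

The IDs passed in with the new entries are ignored and reassigned, so an amendment can never overwrite the entry that is currently executing or anything before it.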

awestruck horror that they would let such a critical system exist without checking or validating inputs.

I have also written a post from the point of view of a cat working for the Feline Ministry of Transportation. Think of it as something that would be commented on their equivalent of /c/programming under a link to this report. Thought you might enjoy!

FMT/FDTMS dev here, thought I’d shed some light on what happened on our side.

This was found during a Ministry of Security audit as part of their Safe Infrastructure initiative. All of this was legacy code that hadn’t been touched in ages, the oldest of which dates back to before the Felines had even signed the ISPA and stopped eating prey. Yes, FDTMS as a protocol only became an official thing after the Feline Revolution (and by extension us signing the ISPA), but before that there were a dozen different ad hoc ATC standards used by different parts of Feline Territory, and some of the source code from those (namely parts of Flight Dispatch Protocol, the old standard used in Moonpeak where all the government ministries are headquartered) was reused for FDTMS because it seemed easier and faster than writing everything from scratch.

All four points in the official root cause analysis can be boiled down to “cats are lazy” and/or “cats are bad at communicating with each other.”

1. Parsing the message before checking message signature: The message parsing and cryptography teams didn’t adequately consult with each other and failed to make sure their components were executed in the correct sequence. Both teams worked on their own thing and assumed that as long as both were done between receiving the message and prompting the pilot to accept the final dispatch table, there would be no problems. What happened was that the message processing and signature checking happened simultaneously on different threads. In theory, the cryptography thread could stop the message processing as soon as it detected an issue, but in practice, the message processing was way faster, so by the time the cryptography thread finished and returned a result, the message had already been parsed and had overflowed the buffer. This was changed to a sequential system where the cryptography thread executes first, and it is now the one responsible for triggering subsequent data processing threads if and only if it doesn’t detect any issues.
2. Failing silently on an invalid dispatch and reverting to the previous one: This was actually intentional because cats in the flight deck/ATC centre didn’t want to have to click two extra buttons whenever a dispatch message got corrupted during transmission. So the system was designed to be as automatic and “paws off” as possible. In theory, the pilot is supposed to check that the reversion to the previous valid dispatch is the correct action when they get a dispatch rejected message, and also report the error to ATC, but in practice that pretty much never happens. This was part of the code that was reused from FDP, and this whole thing was a really stupid idea and we only realized that after we found this issue.
3. The parser being more tolerant than the actual FDTMS protocol: The standard originally treated TerritoryChange and RegionChange as completely different entries, but it got changed to a single entry fairly late in development because we realized that crossing a territorial border always implies crossing a regional border since the other territory would have its own regions. So buildDispatchTable had already been implemented and had to be updated. The cats who updated it figured that they only had to implement support for including the region information in TerritoryChange and didn’t need to completely remove support for two separate entries (because that would have been more work). But they also forgot to test that the changes made to TerritoryChange didn’t affect the ability to safely accept two separate entries (because the standard was being changed to disallow that anyway) and this is actually where the buffer overflow originated.
4. All of FDTMS-Client being run in privileged context with no Secure Mode protections that could have detected the buffer overflow/prevented privileged memory from being overwritten: WhiskerOS Secure Mode is really strict and honestly a pain in the tail to deal with, especially in terms of getting privileged and non-privileged code to talk to each other. So in classic cat fashion, we said “screw it” and made the entire codebase privileged so we wouldn’t have to deal with that. Actually, now that we’ve had to transition to a more restrictive privilege model, we’re getting a lot of issues with interprocess communication as expected. I actually got reassigned to a team specifically for dealing with complying with Secure Mode’s restrictions.
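For the first point, the fix amounts to "verify, then parse, never both at once". Below is a minimal Python sketch of that ordering. Everything in it is an assumption made for illustration: the function names are invented, the parser is a placeholder, and an HMAC with a shared key stands in for whatever signature scheme FDTMS actually uses.

```python
import hashlib
import hmac


class DispatchRejected(Exception):
    """Raised loudly instead of silently reverting to the previous dispatch."""


def verify_signature(key: bytes, message: bytes, signature: bytes) -> bool:
    # HMAC-SHA256 is a stand-in here; a real system would likely use
    # asymmetric signatures so hovercraft never hold a signing key.
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def parse_dispatch(message: bytes):
    # Placeholder parser: one tab-separated row per line.
    text = message.decode("utf-8")
    return [line.split("\t") for line in text.splitlines() if line]


def receive_dispatch(key: bytes, message: bytes, signature: bytes):
    # The signature check runs to completion first. The parser is only
    # reached if it passes, so unauthenticated bytes are never parsed.
    if not verify_signature(key, message, signature):
        raise DispatchRejected("bad signature; message was never parsed")
    return parse_dispatch(message)
```

Because `receive_dispatch` is a plain sequential function, there is no window in which the parser can outrun the check, which is the property the original two-thread design lacked. Raising an exception also touches the second point: the caller has to decide what happens next rather than getting a silent fallback.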

I also want to say that things are getting better, and from my experience, the Feline Government actually has a far better work culture than most other software development places in Feline Territory. But a lot of that old fashioned, counterproductive, least-effort Feline culture is still present, especially in legacy code or massive codebases where many teams have to collaborate.

Zonetrooper@lemmy.world · 6 days ago

Point #1

…ah, race conditions. My old enemy.

What’s impressive (or scary, depending on your point of view) is how much of this rings entirely true to my real-life experiences. “The correct method is a pain in the rear/would require rework, so we’re going to take the fast and dirty method” is something I’ve watched have catastrophic consequences down the line. That, and peeling back the lid on the “new standard” and finding a hodgepodge of old standards, legacy code, and poor decision making is painfully real.

The question I have now is, if they actually tried to rip the bandaid off and replace the whole thing, would it be a political problem?

HiddenLayer555@lemmy.ml · edit-2 3 days ago

Thank you so much! I tried my darndest to incorporate my own experiences in software development into this! Though I definitely have way less experience than you.

The question I have now is, if they actually tried to rip the bandaid off and replace the whole thing, would it be a political problem?

I’ll let dev kitty respond to this!

Context: the Unified Territories is an alliance of many small to medium sized animal species, ranging from mice and songbirds to dogs and foxes; who all subscribe to an ideology called Unitism (hence unified territories), which is basically like “vegan socialism” where former predators and prey live in peace. The Felines are Unitist as well.

Fortunately there aren’t really many political issues that would get in the way of that. The Feline Ministry of Transportation has full authority over the transportation infrastructure in Feline Territory, especially the implementation stuff like which ATC system to use.

Though, changes to the flight control standards (as opposed to the internal implementation of those standards) might require consultation with the Avian Government over in the Unified Territories because we want to ensure that naturally flying animals are kept safe when flying near hovercrafts. Even if the flight is entirely within Feline Territory, a lot of birds live here and they are still represented by their own taxonomic government, because all the Unitist taxonomic govs talk to each other and uphold Freedom of Migration between their territories. Generally, we would send our changes to them and they’ll let us know if they have any concerns.

However, there probably won’t be a rewrite of FDTMS specifically, because we’re starting talks with the Unified Territories to develop a combined ATC standard so our hovercrafts can more easily operate in each other’s airspace. That should improve the scheduling of cross-border flights, make passenger connections between UTMT and FMT flights easier, allow an FMT aircraft to fill a UTMT scheduled flight and vice versa, and generally combine the advantages of FDTMS and their UniFlyControl system. These are reeeally early stage talks, so none of the technical details have been set yet, but it does seem like it will go through eventually.

I would hope it would be written in a completely memory safe language.

EntryID	EntryType	Name	SetAttributes	Location	EstimatedTime
0	DispatchStart	FMT1002	flightID=FMT1002; radioID=FMT1002; origin=MPK; originLandingPadID=2; dest=CVY; startingTerritory=FelineTerritory; startingRegion=MoonDistrict; atcServerURL=fdtms://transportation.feline.gov/atc/MoonDistrict; serviceType=Commuter	(-130.993234, -47.997826)	12:00:00
1	FlightStart	MPKL2	departureCorridorID=39; speed=100	(-130.993234, -47.997826)	12:00:00
2	WayPoint/DepartureCorridorExitPoint	MPK927	heading=305.1; speed=300; alt=1000	(-131.017368, -47.997826)	12:02:00
3	RegionChange	AlderDistrict	atcServerURL=fdtms://transportation.feline.gov/atc/AlderDistrict	(-130.994368, -47.997826)	12:19:13
4	WayPoint	ALD103	heading=305.1; speed=1000; alt=1700	(-129.083812, -47.997826)	12:21:02
5	TerritoryChange	UnifiedTerritories/FeatherDistrict	atcServerURL=fdtms://transportation.ut.gov/atc/alternateProtocols/Feline/FDTMS/region/FeatherDistrict	(-131.001929, -48.725192)	12:45:51
6	WayPoint	FET972	heading=307.9; speed=1000; alt=1700	(124.138493, -47.997826)	13:00:05
7	RegionChange	CommonDistrict	atcServerURL=fdtms://transportation.ut.gov/atc/alternateProtocols/Feline/FDTMS/region/CommonDistrict	(-122.904928, -46.203840)	13:37:19
8	WayPoint/ApproachCorridorEntryPoint	CVY671	approachTo=CVY; requestApproachCorridorID=81; requestLandingPadID=9; defaultAction=HoverInPlace/Wait; triggerNextEntry=OnAtcAuthorization	(-122.301937, -44.927394)	13:41:32
9	ApproachStart	CVYL9	approachCorridor=81; targetLandingPad=9	(-122.904928, -46.203840)	13:45:00
10	ApproachEnd	CVYL9		(-122.804221, -46.103840)	13:50:00
11	FlightEnd	FMT1002		(-122.804221, -46.103840)	13:50:00
12	DispatchEnd	FMT1002	clear all	(-122.804221, -46.103840)	13:50:00

EntryID	EntryType	Name	SetAttributes	Location	EstimatedTime
0	DispatchStart	FMT1002	onError=LoadPreviousDispatch; atcServerURL=fdtms://null	(-130.993234, -47.997826)	12:00:00
1	TerritoryChange	UnifiedTerritories	atcServerURL=fdtms://null	(-131.001929, -48.725192)	12:00:00
2	RegionChange	FeatherDistrict	atcServerURL=fdtms://AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[Malicious Code Goes Here]	(-131.001929, -48.725192)	12:00:00
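As a sketch of what a strict parser could check, the Python below rejects both tricks used in the three-row table above: a TerritoryChange that does not carry its region, and an oversized attribute value. The 128-character limit, the function names, and the special case for `clear all` are assumptions made for this example, not part of the FDTMS description. Python also cannot overflow a buffer the way the original parser did, so the length check only illustrates the missing bounds check.

```python
MAX_ATTR_VALUE_LEN = 128  # assumed limit, chosen only for this sketch


def parse_attributes(raw):
    """Parse 'key=value; key=value' into a dict, enforcing a length bound."""
    raw = raw.strip()
    if raw == "clear all":  # bare directive seen on the DispatchEnd row
        return {}
    attrs = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed attribute: {pair!r}")
        value = value.strip()
        if len(value) > MAX_ATTR_VALUE_LEN:
            raise ValueError(f"attribute {key.strip()!r} is too long")
        attrs[key.strip()] = value
    return attrs


def validate_row(row):
    """Validate one tab-separated dispatch row and return its parsed fields."""
    fields = row.split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    entry_id, entry_type, name, raw_attrs, _location, _eta = fields
    if not entry_id.isdigit():
        raise ValueError(f"bad entry ID: {entry_id!r}")
    # Current standard: the region rides along inside TerritoryChange,
    # so a bare territory name is rejected instead of being tolerated.
    if entry_type == "TerritoryChange" and "/" not in name:
        raise ValueError("TerritoryChange must be named Territory/Region")
    return int(entry_id), entry_type, name, parse_attributes(raw_attrs)
```

With these checks, entry 1 of the proof of concept fails on the missing region and entry 2 fails on the attribute length, while the combined `UnifiedTerritories/FeatherDistrict` form from the full dispatch table passes.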


CVE109-10028FL: DispatchToElsewhere, a memory corruption vulnerability in FDTMS resulting in remote code execution on Feline Ministry of Transportation hovercrafts

Exploit proof of concept

Response

Root Cause Analysis: