View Issue Details

IDProjectCategoryView StatusLast Update
0032079FPCCompilerpublic2019-05-22 20:46
ReporterMartok Assigned ToJonas Maebe  
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionduplicate 
Product Version3.1.1 
Summary0032079: Jumptables for case..of may jump into invalid memory
DescriptionI have written about it on the ML yesterday <http://lists.freepascal.org/pipermail/fpc-devel/2017-June/038005.html>, but after some investigation I feel there is an actual issue here.

With -O1 or above, "case Ordinal of" may generate a jumptable (essentially array[Ordinal] of CodePointer). If both the lowest and highest declared element of the type have a case label, range checking is omitted, otherwise a safety is put in place to check the number of elements of the jumptable.

If, for some reason, the input ordinal value is out of range, that results in an invalid memory access past the jumptable limits, jumping to an arbitrary (potentially attacker-controlled) location.
It also means that a potentially existing else statement (which may be intended for input validation) will never be reached, instead, the *best* case is a SIGSEGV.
Steps To ReproduceSee attached test program.
Additional InformationI'd say that without {$RangeChecks ON} (which, in this example, has absolutely no influence either way), the compiler can't make any assumption if X is in range or not. The only simple case where the compiler can safely omit the check is if the highest and lowest case label correspond to the base type size of the enum.
TagsNo tags attached.
Fixed in Revision
FPCOldBugId
FPCTarget
Attached Files

Relationships

duplicate of 0016006 closedJonas Maebe Dead code mistake 
has duplicate 0035603 resolvedFlorian Case statement does not handle out of bounds value of enumerated type, crashes instead 

Activities

Martok

2017-06-29 15:27

reporter  

project1.lpr (1,171 bytes)

Martok

2017-07-01 15:40

reporter   ~0101406

Not a duplicate.

This is an optimization-caused issue (with potential security implications), with no good way of catching it in user code.

I see 2 solutions:
 * always emit the check, regardless of previous determination of jumptable_no_range, unless cs_opt_level4 in current_settings.optimizerswitches. This is the easiest, and on somewhat modern CPUs the cheaper one.
 * always generate at least high(indexreg) elements, ie. on x86, if we jmp $table[$al*4], make $table have 256 elements (the rest pointing to the elselabel).


At the very least, it must be documented (refman, 13.2.2) that the case statement MUST NOT be used with untrusted input. Paragraphs 3 and 4 are essentially a lie, if this optimization is taken.

Jonas Maebe

2017-07-01 16:15

manager   ~0101407

It is exactly the same issue: the compiler always assumes that a variable of a particular type can only contain values that are valid for that type. That is the foundation of a typed language.

The security issues and crashes are identical when e.g. accessing an array (could be reading/writing out of bounds). We also assume that if a variable is typed as a function pointer and you call it, that it in fact points to a valid function and not to random memory.

There is nothing unsafe about the optimisation and e.g. ADA, one of the most safe languages there is, behaves exactly the same way. It's a matter of garbage-in-garbage-out: if you disable the type checking somewhere in your code and thereby put invalid data in a variable, any further code that operates on this variable has undefined behaviour.

Martok

2017-07-01 18:27

reporter   ~0101412

Are you really saying that the following two samples are identical, and clearly show the same intention of the programmer?

procedure Test1(const X: TFooEnum);
const
  Ptrs: array[TFooEnum] of TProcedure = (@Func1, @Func2, @Func3);
begin
  Ptrs[X]();
end;

// ---

procedure Test2(const X: TFooEnum);
begin
  case X of
    elOne: Something;
    elTwo: SomethingOther;
  else
    SomethingElse;
  end;
end;


I know that your statements may be formally correct, but this is not a CS lecture. A compiler must be able to generate safe code when asked to do so, or at least be deterministic in when it generates unsafe code, so that the programmer can put checks in place. If I can 'switch' between safe and unsafe code by adding a single value to my enum somewhere else, than that's not good. And I can't even write code to check for the problem beforehand because that code would be broken by the same issue as well, defeating the entire point.

Btw, ADA has safe conversions, and would actually catch this error before it occured: "[on 'Val:]If no value of the type of S has the position number Arg, the Constraint_Error exception is raised.". FPC has nothing of the sort, and no, even $R+ does nothing (useful), especially not for enums with gaps.

May I ask for a second opinion on this?

Jonas Maebe

2017-07-01 18:57

manager   ~0101413

I am saying is impossible for the compiler to generate safe code if the data in the variables does not correspond to their declared types. Jump tables are just one case where this goes wrong, and are not fundamentally different from other cases. Adding an arbitrary check for one case that you personally consider important does not solve that issue in any way. That has nothing to do with abstract computer science. Ad hoc changes like the one you propose only result in a false sense of security, because fundamentally the code is just as insecure as it was before.

The only way to get data with an invalid value in an enum in Pascal is by using an unchecked (aka explicit) typecast, by executing code without range checking (assigning an enum from a larger parent enum type into a smaller sub-enum type), or by having an uninitialised variable.

If you want a checked conversion operation from integers to enums, please file a separate feature request for that.

Martok

2017-07-01 21:13

reporter   ~0101415

> "Adding an arbitrary check for one case that you personally consider important does not solve that issue in any way."

This is *not* what this discussion is about! I'm talking about *not removing* a check that is aleady there in most cases but arbitrarily gets left out under very specific circumstances.


> "Ad hoc changes like the one you propose only result in a false sense of security, because fundamentally the code is just as insecure as it was before."

Then assign this to Michaƫl and change the documentation. The documentation on CASE..OF is very specific about how the code should behave and describes no undefined behaviour.


> "The only way to get data with an invalid value in an enum in Pascal is by using an unchecked (aka explicit) typecast, by executing code without range checking (assigning an enum from a larger parent enum type into a smaller sub-enum type), or by having an uninitialised variable."

Or by using Read(File). Or TStream.Read or the socket equivalent of choice. Or by calling a library function. There are many ways to have an invalid value in an enum in any meaningful code. The whole point in using an enum instead of untyped constants is type safety, so the compiler can tell me if I've handled every expected value. Currently, by using the type safe way I get *less* safe code, which is probably not what most people would expect.
This is why I may seem a bit... passionate about this. You're willfully trading an uncatchable potential RCE in every FPC-compiled program that interacts with the outside world for a single unlikely branch instruction that most CPUs will predict correctly. That seems like a rather... odd balance.


If it helps (as it sometimes does): Delphi keeps the check as well.

Jonas Maebe

2017-07-01 21:34

manager   ~0101416

> This is *not* what this discussion is about! I'm talking about *not removing* a check that is aleady there in most cases but arbitrarily gets left out under very specific circumstances.

The check gets removed because the type system says it is safe. It would be arbitrary to assume the type system holds everywhere, except in this particular case. The type system is an inherent part of the language. If it must be assumed to not be valid, then the language makes no sense.

> The documentation on CASE..OF is very specific about how the code should behave and describes no undefined behaviour.

This is unrelated to case..of. The behaviour of every single statement and expression is undefined if involved variables hold values that they cannot contain according to the type system (and not just for enums, but for every single type).

> The whole point in using an enum instead of untyped constants is type safety, so the compiler can tell me if I've handled every expected value. Currently, by using the type safe way I get *less* safe code, which is probably not what most people would expect.

You get equally unsafe code. The same holds for subrange types.

> You're willfully trading an uncatchable potential RCE in every FPC-compiled program that interacts with the outside world for a single unlikely branch instruction that most CPUs will predict correctly. That seems like a rather... odd balance.

Turning undefined behaviour into defined behaviour for one single kind of statement/expression is the worst kind of thing you can do in a language, because it is inconsistent and requires you to remember when it is safe to violate the language standards and when it is not (which makes no sense).

Either you do it everywhere (and then it's no longer undefined behaviour: then you explicitly define for each and every operation involving enums and subrange types what should/will happen if they contain an invalid value), or you don't do it anywhere at all. Something in the middle is the worst thing you can do for a programming language, because it creates false expectations of predictability (why would a case statement involving impossible values be predictable, but array indexations not?).

If you wish to continue this discussion, please do so on the fpc-devel list.

Issue History

Date Modified Username Field Change
2017-06-29 15:27 Martok New Issue
2017-06-29 15:27 Martok File Added: project1.lpr
2017-06-29 20:03 Jonas Maebe Relationship added duplicate of 0016006
2017-06-29 20:03 Jonas Maebe Status new => resolved
2017-06-29 20:03 Jonas Maebe Resolution open => duplicate
2017-06-29 20:03 Jonas Maebe Assigned To => Jonas Maebe
2017-07-01 15:40 Martok Note Added: 0101406
2017-07-01 15:40 Martok Status resolved => feedback
2017-07-01 15:40 Martok Resolution duplicate => reopened
2017-07-01 16:15 Jonas Maebe Note Added: 0101407
2017-07-01 16:15 Jonas Maebe Status feedback => resolved
2017-07-01 16:15 Jonas Maebe Resolution reopened => duplicate
2017-07-01 18:27 Martok Note Added: 0101412
2017-07-01 18:27 Martok Status resolved => feedback
2017-07-01 18:27 Martok Resolution duplicate => reopened
2017-07-01 18:57 Jonas Maebe Note Added: 0101413
2017-07-01 18:57 Jonas Maebe Status feedback => resolved
2017-07-01 18:57 Jonas Maebe Resolution reopened => duplicate
2017-07-01 21:13 Martok Note Added: 0101415
2017-07-01 21:13 Martok Status resolved => feedback
2017-07-01 21:13 Martok Resolution duplicate => reopened
2017-07-01 21:34 Jonas Maebe Note Added: 0101416
2017-07-01 21:34 Jonas Maebe Status feedback => resolved
2017-07-01 21:34 Jonas Maebe Resolution reopened => duplicate
2019-05-22 20:46 Florian Relationship added has duplicate 0035603