View Issue Details

ID: 0029325
Project: FPC
Category: Patch
View Status: public
Last Update: 2020-10-18 19:16
Reporter: Paul Gevers
Assigned To: Florian
Priority: normal
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Fixed in Version: 3.3.1
Summary: 0029325: Enable reproducible builds of FPC.
Description: Reproducible builds¹ require that the build time stamp is not recorded in binaries. In FPC it is built in via the $INCLUDE %DATE% directive, which calls getdatestr in globals.pas. To enable reproducible builds, it would be nice if FPC honored the SOURCE_DATE_EPOCH environment variable².

To not depend on the dateutil unit, we included the required code from that unit in this patch, which we successfully apply in Debian³.

¹ https://reproducible-builds.org/
² https://reproducible-builds.org/specs/source-date-epoch/
³ https://reproducible.debian.net/rb-pkg/experimental/amd64/fpc.html
Tags: No tags attached.
Fixed in Revision: 47133
FPCOldBugId:
FPCTarget: -

Relationships

related to 0032937 (status: new): TZ and TZDIR env vars being overwritten in Arch & Suse Linux

Activities

Paul Gevers

2016-01-03 21:27

reporter  

honor_SOURCE_DATE_EPOCH_in_date.patch (2,145 bytes)   
Description: Reproducible builds requires that the build time stamp is not
 recorded in binaries. In FPC they are built in via the $INCLUDE %DATE%
 directive which calls getdatestr in globals.pas. To allow reproducible builds
 we should honor the SOURCE_DATE_EPOCH environment variable. To not depend on
 the dateutil unit, we include the required code from that package here.
Author: Paul Gevers <elbrus@debian.org>
Author: Abou Al Montacir <abou.almontacir@sfr.fr>

Index: fpc/fpcsrc/compiler/globals.pas
===================================================================
--- fpc.orig/fpcsrc/compiler/globals.pas
+++ fpc/fpcsrc/compiler/globals.pas
@@ -510,6 +510,7 @@ interface
       starttime  : real;
 
     function getdatestr:string;
+    Function UnixToDateTime(const AValue: Int64): TDateTime;
     function gettimestr:string;
     function filetimestring( t : longint) : string;
     function getrealtime : real;
@@ -766,11 +767,29 @@ implementation
    }
       var
         Year,Month,Day: Word;
+        SourceDateEpoch: string;
       begin
-        DecodeDate(Date,year,month,day);
+        SourceDateEpoch := GetEnvironmentVariable('SOURCE_DATE_EPOCH');
+        if Length(SourceDateEpoch)>0 then
+           DecodeDate(UnixToDateTime(StrToInt64(SourceDateEpoch)),year,month,day)
+        else
+          DecodeDate(Date,year,month,day);
         getdatestr:=L0(Year)+'/'+L0(Month)+'/'+L0(Day);
       end;
 
+   Function UnixToDateTime(const AValue: Int64): TDateTime;
+   { Code copied from fpcsrc/packages/rtl-objpas/src/inc/dateutil.inc and
+   fpcsrc/rtl/objpas/sysutils/datih.inc }
+   const
+      TDateTimeEpsilon = 2.2204460493e-16 ;
+      UnixEpoch = TDateTime(-2415018.5) + TDateTime(2440587.5) ;
+   begin
+      Result:=UnixEpoch + AValue/SecsPerDay;
+      if (UnixEpoch>=0) and (Result<-TDateTimeEpsilon) then
+         Result:=int(Result-1.0+TDateTimeEpsilon)-frac(1.0+frac(Result))
+      else if (UnixEpoch<=-1.0) and (Result>-1.0+TDateTimeEpsilon) then
+         Result:=int(Result+1.0-TDateTimeEpsilon)+frac(1.0-abs(frac(1.0+Result)));
+   end;
 
    function  filetimestring( t : longint) : string;
    {
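
For readers who want to check the arithmetic in the patch, here is a minimal sketch in Python rather than Pascal, so it is an illustration and not part of the attachment. It assumes TDateTime counts days from 1899-12-30, which is why the two Julian-day constants in the patch add up to 25569, and it leaves out the negative-date correction branch. The names unix_to_tdatetime and get_date_str are made up for this example.

```python
import os
from datetime import date, datetime, timedelta, timezone

SECS_PER_DAY = 86400
# Same constants as the patch: the sum is 25569 days, i.e. 1970-01-01
# expressed as a day count from the assumed TDateTime origin.
UNIX_EPOCH_AS_TDATETIME = -2415018.5 + 2440587.5
# Assumption: TDateTime day zero is 1899-12-30.
TDATETIME_DAY_ZERO = datetime(1899, 12, 30, tzinfo=timezone.utc)


def unix_to_tdatetime(seconds):
    """Convert Unix seconds to a TDateTime-style day count (non-negative case only)."""
    return UNIX_EPOCH_AS_TDATETIME + seconds / SECS_PER_DAY


def get_date_str(environ=os.environ):
    """Return YYYY/MM/DD from SOURCE_DATE_EPOCH if set, else from today's date."""
    raw = environ.get("SOURCE_DATE_EPOCH", "")
    if raw:
        days = unix_to_tdatetime(int(raw))
        day = (TDATETIME_DAY_ZERO + timedelta(days=days)).date()
    else:
        day = date.today()
    return f"{day.year:04d}/{day.month:02d}/{day.day:02d}"


if __name__ == "__main__":
    print(get_date_str())
```

Running it with SOURCE_DATE_EPOCH=1451856420 prints 2016/01/03 on any machine, whatever the local clock says.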

Mark Morgan Lloyd

2016-01-03 21:59

reporter   ~0088570

(i) What's to stop somebody spoofing SOURCE_DATE_EPOCH? (ii) What's the behaviour of your patch when DST is applied? (iii) Does this result in consecutive builds with the same build ID (ld --build-id option without an explicit style)? (iv) What's the situation with this on non-unix OSes (specifically, variants of Windows)?

Sven Barth

2016-01-04 23:12

manager   ~0088633

@Mark:
(i) They can also set the date/time of the computer differently, so no real difference there...
(ii) What does DST have to do with that?
(iii) Don't know.
(iv) This should probably be put into a {$ifdef unix} or even {$ifdef linux}

@Paul:
I'll have to discuss this with the others (I personally don't like this... like at all). Nevertheless you shouldn't need to copy UnixToDateTime. You can just use the unit "DateUtil" (or "DateUtils") in a corresponding ifdef (see issue iv I answered to Mark).

Regards,
Sven

Jonas Maebe

2016-01-04 23:22

manager   ~0088634

dateutils was removed from the RTL a while ago. Maybe we can move it back, though.

Sven Barth

2016-01-04 23:39

manager   ~0088636

Ah, right, I had forgotten that it was also a "victim" of the RTL split... :/

Regards,
Sven

Paul Gevers

2016-01-05 07:59

reporter   ~0088640

@Sven, an alternative for reproducible builds is to stop embedding the compile date into the binaries. I can create a patch for that if you would rather have that. But my patch was meant to leave things as they are for those who care about that date, while not getting in the way of people/distributions that care about reproducibility.

@Sven, why do you think this is only for unix/linux? I think it is also valuable for Windows. IMHO reproducible builds are valuable for everybody, but I respect that others may not find it such a big deal.

For the questions raised:
(i) Spoofing the date is the purpose of this patch, so what are you (Mark) worried about? I probably don't understand the question, and Sven's remark also stands.
(ii) epoch is not sensitive to DST.
(iii) I'll investigate. Out of curiosity, if the answer is yes, would you (Mark) consider this good or bad and why?
(iv) As already mentioned above, why would the situation be different for the other OS's? I didn't claim that this patch is enough for reproducible builds (it is not), but some fix about the embedded timestamp is required for all OS's. There is nothing (or indeed shouldn't be) unix or linux specific in the patch.
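
Point (ii) can be checked with a small sketch (Python, hypothetical helper names, not code from the patch). Decoding the epoch as UTC never consults the local clock, so a one-hour DST shift changes what a wall clock shows but not the date derived from SOURCE_DATE_EPOCH. The fixed offsets below stand in for a zone with and without DST applied.

```python
from datetime import datetime, timedelta, timezone


def utc_date(epoch):
    """Date string obtained by decoding the epoch as UTC, as the patch does."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%Y/%m/%d")


def local_date(epoch, utc_offset_hours):
    """Date string a wall clock at the given fixed UTC offset would show."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch, tz=zone).strftime("%Y/%m/%d")


if __name__ == "__main__":
    epoch = 1467329400  # 2016-06-30 23:30:00 UTC
    print(utc_date(epoch))       # 2016/06/30
    print(local_date(epoch, 0))  # 2016/06/30 (no DST)
    print(local_date(epoch, 1))  # 2016/07/01 (DST adds one hour)
```

Only the wall-clock rendering moves across midnight; the UTC decoding stays on the same day.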

Mark Morgan Lloyd

2016-01-05 16:26

reporter   ~0088660

(i) I'm not, I was just trying to get clarification.
(ii) There's longstanding debate about how FPC can get a reliable timestamp without DST correction, preferably UTC. I think this was finally resolved a few months ago.
(iii) Neutral. I'm highlighting the existence of a minimal "watermarking" facility supported by ld, distinct from simply taking the cksum of an entire file. It's obviously important to appreciate that the ld build stamp can be spoofed.
(iv) I'm purely looking ahead allowing that FPC is cross-platform. If the idea is to be able to force in a timestamp for a banner etc., such that if this is the same between two binaries and an external checksum is the same then the two binaries are almost certainly identical, then it seems fine to me... except that it would be worth knowing how Windows tackles this sort of thing since as I understand it they've got much better code signing etc. than Linux or (almost any) unix.

Thaddy de Koning

2016-01-06 12:57

reporter   ~0088688

Last edited: 2016-01-06 13:10

At most this should be a switch. But it is useful for release builds.

Code-signing has nothing to do with this: this only means you can get a bit for bit reproducible build at any given time (in an epoch) given the exact same - bit for bit - tools used to build a binary. Code-signing is adding a certificate that enables verification of the binary to come from a reputable source (can be internal) and has not been tampered with since signing. That means wildly different binaries can be signed and accepted with the same certificate set. There is really a HUGE difference.

Sven Barth

2016-01-08 18:54

manager   ~0088732

Last edited: 2016-01-08 18:55

@Thaddy: the problem with a switch would be that one would need to enable it for each build, so the Debian people would need to add it to each and every package that depends on FPC or to the fpc.cfg. Not an ideal solution in my opinion.

@Paul: I think that specific code of yours is only for Unix/Linux, because it relies on the Unix Epoch. Windows (or other non-Unix systems for that matter) does not normally use the epoch (e.g. NTFS uses dates starting from 1601), so it would be like a foreign substance there. Thus operating system specific approaches should be implemented.
Of course I wouldn't force you to implement these and thus my preference to put the code into Unix/Linux ifdefs.

And yes, we wouldn't stop putting dates into binaries just because you (as in "Debian", not you personally) want to achieve reproducible builds.

Regards,
Sven

Marco van de Voort

2016-01-09 16:35

manager   ~0088741

Last edited: 2016-01-09 16:36

I think interpreting hordes of environment variables is not a good thing. I've no problem with reproducible builds, but IMHO it is better to make this an optional thing under ifdef.

Jonas Maebe

2016-01-09 16:52

manager   ~0088742

It can't be under ifdef, because then all user-compiled units will also have this same timestamp for {$i %date%} if the compiler is built that way.

Paul Gevers

2016-01-10 13:04

reporter   ~0088755

@Sven: it isn't even a Debian thing yet. It is just a project started by some Debian people and now rapidly picked up by a much bigger part of the open source community. And no, you shouldn't do this because "I" want this, but rather because you (as in "fpc") also believe in reproducible builds.

As already explained, reproducible builds are really about getting a bit-by-bit identical binary when built with the same version of all the tools. In Debian, the problem of signing has been solved since the mid '90s. All our binaries are signed already. Quoting from the front page of reproducible-builds.org:
"""Most aspects of software verification are done on source code, as that is what humans can reasonably understand. But most of the time, computers require software to be first built into a long string of numbers to be used. With reproducible builds, multiple parties can redo this process independently and ensure they all get exactly the same result. We can thus gain confidence that a distributed binary code is indeed coming from a given source code."""

One major aspect is about verifiable trust.

Abou Al Montacir

2016-01-10 22:06

manager   ~0088767

I think Paul already gave most important points but I just want to add my 2 cents here.

For me, reproducible builds are one of the most important features of a compiler. Indeed, this is the only way one can trust an open source compiler against a compiler attack [1].

One of the most important features of open source software is trust in the code, which is achieved by code availability but also because the same code is available to everybody. So even if one person misses an issue, another will catch it. Now if that trusted code is compiled with an untrusted compiler, this chain is broken.

Trust in the compiler cannot be ensured without reproducible builds. Even if the software is signed by someone trusted, a malware attack can change the compiler on that trusted person's machine, and then the signature is no longer valuable.

The only way to achieve trust is to have many trusted people build the same code on their own machines and check that all their outputs are bit-for-bit identical.

I hope this helps clarify some of the reasons behind this feature. Other arguments can be found by searching the internet.

[1] http://www.linuxjournal.com/article/7839
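
The cross-check described above can be sketched as follows. This is only an illustration in Python with hypothetical helper names (sha256_of, builds_agree); it assumes each party's build output is available as a local file and that comparing SHA-256 digests is an acceptable stand-in for a byte-by-byte comparison.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_agree(paths):
    """True when every file in paths has the same digest (False for an empty list)."""
    digests = {sha256_of(path) for path in paths}
    return len(digests) == 1
```

A single differing byte, such as an embedded build date, is enough to make builds_agree return False, which is why the timestamp has to be pinned first.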

Thaddy de Koning

2016-01-11 09:20

reporter   ~0088778

Last edited: 2016-01-11 09:48

@Jonas Maebe (0088742):

I wonder how this is handled by other compilers.
Maybe Paul can enlighten us.
For corporate customers I already pack the (hashed) toolset and source code for them to reproduce a release and debug build, but that is never bit for bit because of the timestamp. Maybe such a feature can be solved by a crc32/diff, which should always point to the timestamp fields only. AFAIK from the discussions and documentation this wouldn't be against the intent, and it also more or less achieves the goal of reproducible builds: bit for bit except for the timestamp. Since the original timestamp is a given, it can be patched in for the purists to produce a bit for bit verifiable build. I can't read anywhere that such a patch would be against a reproducible build process; in fact it can be part of the build process.

As an aside: make/building the compiler already uses the concept of reproducible builds as a sanity check.

Abou Al Montacir

2016-01-11 10:31

manager   ~0088781

@Thaddy

Please note that the feature should be available for the generated compiler but also for any other program compiled by the generated compiler. This makes it hard to look for all dates and try to avoid timestamps when computing CRCs.

I think honoring SOURCE_DATE_EPOCH or providing a compiler switch is mandatory. I can live with a new compiler switch that enables passing a date. I can also live with passing the date in another format if you don't like the unix epoch, provided that a mechanism for passing a static date exists.

Please note that changing the machine date is not an option, as Debian build servers are shared resources, and while one person is compiling FPC someone else could be compiling GCC or whatever other program.

Paul Gevers

2016-05-21 22:18

reporter   ~0092680

I don't like to do this, but just a short ping. Is moving this issue forward waiting for a response/investigation from me in any way?

What would be needed to move this issue forward?

@Thaddy, honoring the SOURCE_DATE_EPOCH variable has been accepted by a lot of open source communities, including e.g. gcc¹ for their __DATE__ and __TIME__ macros.

¹ https://reproducible.alioth.debian.org/blog/posts/53/ (Toolchain fixes)

Marco van de Voort

2016-05-22 14:05

manager   ~0092685

Last edited: 2016-05-22 14:07

Jonas: I mean to only enable reproducible builds by ifdef or switch for people/distributions that want to make reproducible builds.

I don't like binaries changing behaviour based on environment variables, because while fun for a few *nix diehards, users then have another source of input (besides config file and command line) to contend with.

But I guess a parameter that enables such interpretation (and other things needed for reproducible builds) is ok of course, and nothing in the RTL.

Jonas Maebe

2016-10-02 13:36

manager   ~0094901

> But I guess a parameter that enables such interpretation (and
> other things needed for reproducible builds) is ok of course

Maybe we can do it if -Ur is specified?

Marco van de Voort

2018-05-14 16:58

manager   ~0108293

Last edited: 2018-05-14 20:30

IMHO no, since that parameter also has other meanings and purposes. The whole reason to avoid environment variables is to avoid side effects based on platform (and distribution) dependent environment variables. Moving the side effects to standard parameters is then not an option.

One could tie all reproducible build related stuff to one parameter though, and let it evolve with whatever the average distro requires.

Thaddy de Koning

2018-05-14 18:50

reporter   ~0108294

If possible, that's a good idea.

Florian

2020-10-18 19:16

administrator   ~0126404

Since r47133, the compiler makefile takes care of SOURCE_DATE_EPOCH and passes an appropriate string to the compiler. If no SOURCE_DATE_EPOCH is passed, the compiler checks whether it can read the date from git log. This also solves the ugly problem that make cycle breaks if it is run over midnight (which hits me maybe every 10 years :) ).
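
As a rough illustration of the lookup order described here (not the actual makefile logic from r47133), the sketch below is written in Python with made-up names. It assumes that `git log -1 --format=%ct` is an acceptable way to get the last commit time, and that using the current UTC date is acceptable when neither source is available; the real makefile may differ on both counts.

```python
import os
import subprocess
from datetime import datetime, timezone


def git_commit_epoch():
    """Return the last commit's Unix timestamp as a string, or None if unavailable."""
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--format=%ct"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None
    return out or None


def resolve_build_date(environ=os.environ, git_epoch=git_commit_epoch):
    """Pick the build date once: SOURCE_DATE_EPOCH, then git, then the clock."""
    raw = environ.get("SOURCE_DATE_EPOCH") or git_epoch()
    if raw:
        moment = datetime.fromtimestamp(int(raw), tz=timezone.utc)
    else:
        moment = datetime.now(timezone.utc)
    return moment.strftime("%Y/%m/%d")


if __name__ == "__main__":
    print(resolve_build_date())
```

Because the date is resolved once and then handed to every compiler stage as a fixed string, all stages see the same value even if the build runs past midnight.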

Issue History

Date Modified Username Field Change
2016-01-03 21:27 Paul Gevers New Issue
2016-01-03 21:27 Paul Gevers File Added: honor_SOURCE_DATE_EPOCH_in_date.patch
2016-01-03 21:59 Mark Morgan Lloyd Note Added: 0088570
2016-01-04 23:12 Sven Barth Note Added: 0088633
2016-01-04 23:22 Jonas Maebe Note Added: 0088634
2016-01-04 23:39 Sven Barth Note Added: 0088636
2016-01-05 07:59 Paul Gevers Note Added: 0088640
2016-01-05 16:26 Mark Morgan Lloyd Note Added: 0088660
2016-01-06 12:57 Thaddy de Koning Note Added: 0088688
2016-01-06 13:04 Thaddy de Koning Note Edited: 0088688
2016-01-06 13:09 Thaddy de Koning Note Edited: 0088688
2016-01-06 13:10 Thaddy de Koning Note Edited: 0088688
2016-01-08 18:54 Sven Barth Note Added: 0088732
2016-01-08 18:55 Sven Barth Note Edited: 0088732
2016-01-09 16:35 Marco van de Voort Note Added: 0088741
2016-01-09 16:36 Marco van de Voort Note Edited: 0088741
2016-01-09 16:52 Jonas Maebe Note Added: 0088742
2016-01-10 13:04 Paul Gevers Note Added: 0088755
2016-01-10 22:06 Abou Al Montacir Note Added: 0088767
2016-01-11 09:20 Thaddy de Koning Note Added: 0088778
2016-01-11 09:21 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:26 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:28 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:35 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:36 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:47 Thaddy de Koning Note Edited: 0088778
2016-01-11 09:48 Thaddy de Koning Note Edited: 0088778
2016-01-11 10:31 Abou Al Montacir Note Added: 0088781
2016-05-21 22:18 Paul Gevers Note Added: 0092680
2016-05-22 14:05 Marco van de Voort Note Added: 0092685
2016-05-22 14:07 Marco van de Voort Note Edited: 0092685
2016-10-02 13:36 Jonas Maebe Note Added: 0094901
2018-05-14 16:58 Marco van de Voort Note Added: 0108293
2018-05-14 18:50 Thaddy de Koning Note Added: 0108294
2018-05-14 20:29 Marco van de Voort Note Edited: 0108293
2018-05-14 20:29 Marco van de Voort Note Edited: 0108293
2018-05-14 20:30 Marco van de Voort Note Edited: 0108293
2020-10-18 14:20 Marco van de Voort Relationship added related to 0032937
2020-10-18 19:16 Florian Assigned To => Florian
2020-10-18 19:16 Florian Status new => resolved
2020-10-18 19:16 Florian Resolution open => fixed
2020-10-18 19:16 Florian Fixed in Version => 3.3.1
2020-10-18 19:16 Florian Fixed in Revision => 47133
2020-10-18 19:16 Florian FPCTarget => -
2020-10-18 19:16 Florian Note Added: 0126404