View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0037833 | FPC | RTL | public | 2020-09-28 21:39 | 2021-01-17 01:33 |
Reporter | Andrey "Croco" Stolyarov | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | new | Resolution | open | ||
OS | Linux | ||||
Product Version | 3.3.1 | ||||
Summary | 0037833: Output to text streams other than real disk files goes unbuffered | ||||
Description | Output to text streams, including stdout, is done without buffering, unless it is redirected to a real disk file. This significantly raises the program's execution time. | ||||
Steps To Reproduce | To reliably reproduce the effect, one can actually use any program that does text output (including the output to stdout) of a notable volume. To make a simple test, a program that outputs numbers from 0 to 100000000 (100 millions), one per line, will do: program million; var i: longint; begin for i:=0 to 100000000 do writeln(i) end. 100 mils is choosen because this takes measurable amount of time (tens of seconds), but not too much in order not to fall asleep. With the version of RTL present in current repositories, the following measurements are observed. When the output goes to a plain disk file: crocodil@frock:~/FPC$ time ./mil > file real 0m28.787s user 0m18.256s sys 0m9.532s So it took less than 29 seconds of real (wall clock) time, of which the kernel had to spend 9.5 seconds. If the same thing is done using a pipeline, the figures become different (please note the time command is only applied to ./mil, not to the whole pipeline): crocodil@frock:~/FPC$ time ./mil | cat > file real 1m15.093s user 0m26.912s sys 2m0.156s The "wall clock" time actually raised less than 2.5 times, but please note the system time which is more that the "real". This means the poor kernel had to utilize more than one CPU to serve this process and, roughly speaking, spent 13 times more time. This is because for the first run, the output (on the level of syscalls) is done like this (the following are excerpts from what strace command shows): write(1, "0\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n"..., 256) = 256 write(1, "\n89\n90\n91\n92\n93\n94\n95\n96\n97\n98\n9"..., 256) = 256 write(1, "5\n156\n157\n158\n159\n160\n161\n162\n16"..., 256) = 256 which seems reasonable (altough 256 is not the best buffer size -- well, 4096 is used by most of existing software; but 256 is still acceptable). But in case the output goes anywhere but a real disk file, the output is done like this: write(1, "0\n", 2) = 2 write(1, "1\n", 2) = 2 write(1, "2\n", 2) = 2 [...] write(1, "160\n", 4) = 4 write(1, "161\n", 4) = 4 write(1, "162\n", 4) = 4 [...] write(1, "87337\n", 6) = 6 write(1, "87338\n", 6) = 6 write(1, "87339\n", 6) = 6 [...] This means every writeln within the program causes its own system call write(2), that is, there's no output buffering in effect. The funniest thing here is that in case the output is redirected to /dev/null, it works slower that for a disk file: crocodil@frock:~/FPC$ time ./mil > /dev/null real 0m38.543s user 0m20.356s sys 0m18.184s This is ONLY because of total number of syscalls; every single syscall in this case returns almost immediately, because there's nothing to do for it; but the costs of switching contexts between the process and the kernel are still in effect. This all was not the case for the past releases, e.g., if we compile the same program with fpc 3.2.0, we've got the following figures: crocodil@frock:~/FPC$ time ./mil320 > file real 0m31.800s user 0m21.064s sys 0m10.136s crocodil@frock:~/FPC$ time ./mil320 | cat > file real 0m21.423s user 0m16.672s sys 0m19.272s crocodil@frock:~/FPC$ time ./mil320 > /dev/null real 0m17.139s user 0m16.592s sys 0m0.544s Please specifically note less than one second system time in the last case. | ||||
Additional Information | It looks like that this misbehaviour was introduced during the work on ticket https://bugs.freepascal.org/view.php?id=37716 when the function Do_IsDevice got changed despite that it had no relationship to the bug reported in that ticket, and despite that I strolngly suggested not to do this particular change. The function Do_IsDevice was not broken, it just was (and still is) misnamed. It was relatively safe not to touch it at all (as from SeekEof it was called in error, and now in SeekEof there's no call to Do_IsDevice). The "fix" for non-broken function lead to the obvious misbehaviour, as only the streams that go to a human user may go unbuffered (and even these, actually, should be buffered, but in a manner different from other streams). Pipes, sockets and devices (like, e.g., /dev/null) are NOT of this kind, only TTYs are; it is common practice in Unix world to pass gigabytes of data through pipes and/or through standard input/output streams, so unbuffering them all is a catastrophe, which can not even be worked around anyhow from within the pascal program. Effectively this makes Free Pascal totally unusable for implementing a large kind of programs known as "unix filters". Please note that before the "fix", Do_IsDevice was implemented as isatty(3) -- that is, it issued ioctl(2) syscall to determine whether the stream is connected to TTY or not. It is very easy to fix this misbehaviour: roll back the Do_IsDevice implementation to its previous state, and you're done. What I also strongly suggest is to rename Do_IsDevice into Do_IsConsole or the like, so that it will not do more confusion than it already did. | ||||
Tags | No tags attached. | ||||
Fixed in Revision | |||||
FPCOldBugId | |||||
FPCTarget | |||||
Attached Files |
|
related to | 0037716 | closed | Tomas Hajny | SeekEof fails on pipelines, FIFOs and sockets (quick fix attached) |
|
Note that you can increase buffersize using settextbuf, for applications that you expect to generate a lot of output. |
|
Marco, I think it is impossible (or useless) to increase any buffer when no buffering is done at all (which is the case I report), and I see/know no way to forcibly switch the buffering on from within the program. Actually, what really frustrates me here is the "fix" which fixes nothing but breaks correct behavior. |
|
@ Andrey: 1) The fix/change ensures consistent behaviour among supported targets - the previous state was a terrible mess with each FPC target behaving differently. 2) The fix/change makes the behaviour more compatible to TP/BP - the underlying DOS function used in TP/BP differentiates between files on one hand and all types of devices (regardless from them being related to a console or any other type of device) on the other hand. 3) The fix/change allows e.g. GUI applications to run a FPC-compiled console programs and show output of such console programs as soon as they are available (without having to modify these programs). Regarding possibility to forcibly switch the buffering on from within the program - this may be achieved easily using: TextRec(YourTextFile).FlushFunc := nil; And finally - anybody expecting decent behaviour when processing 'gigabytes of data' (as mentioned in the bug report) while performing I/O on a level of individual characters (even with standard files on a disk) is simply doing something wrong. This would never work reasonably with TP/BP (processing just hundreds of kilobytes using regular string reading / writing line by line already took considerable time by then; I know it very well because I have been using my own unit for exactly these purposes since 1997 after realizing that I couldn't achieve reasonable performance with standard text I/O) and it doesn't work efficiently with FPC either. |
|
Tomas: IMHO such things enabled by default touch the realm of going for a TP emulator rather than a TP compatible compiler. And FPC is a Delphi compatible too, not just TP. Andrey: I though in case it was a matter of the buffersize set to 1 or so, then setting a buffersize would override it later. Anyway, apparently not, Tomas already answered. |
|
Tomas, once again, your "fix" to the Do_IsDevice function fixes nothing, and I'm tired to repeat this. The detailed explanation was given on the previous ticket, What the "fix" really does is that it breaks the behavior which was relatively correct before. As of "compatibility with MSDOS", it is impossible to be "compatible" with it in dealing with pipes, sockets and other similar streams, as there were no such things in MSDOS. If Free Pascal is declared being cross-platform compiler and not an MSDOS emulator, perhaps it should work correctly on platforms other than MSDOS, such as unix-like systems. Next, the TextRec is a part of internals, no? That is, the fields of this record are not the part of the officially documented API, am I wrong? If so, then this can hardly considered a solution. The most interesting thing today is that idea of 'performing I/O on a level of individual characters'. First of all, I never mentioned anything like that. However (second) this might surprize you, but there's absolutely nothing wrong with such an output, provided that correct buffering is done. If you take a look, e.g., inside the glibc implementation, you'll notice that all output created with functions like printf, fputs etc., actually goes through the fputc function which places a single char into the buffer. Actually, all data in the world is built byte-by-byte, or may be word-by-word. The buffering is what protects the poor OS kernel from overwhelming number of syscalls (and, heh, this is exactly what you break with your "fix"). What really matters here is that in Free Pascal, the standard output (that output/stdout thing) is always of type 'text', so, yes, any data that must go to stdout just have to go thru all these text-output facilities. This is nothing wrong, too (no matter how huge are the data to be transferred), as nothing prevents the program from outputting anything byte by byte -- provided that the correct buffering is active. Marco, I'd appreciate if you take time to read all the discussion on the "related" ticket (and the texts on this ticket, too). I'm a bit surprised with how hard it appeared to explain obvious and well-known things to someone here, and I don't feel powerful enough to repeat everything again. |
|
> TextRec is a part of internals, no? That is, the fields of this record are not the part of the officially documented API, am I wrong? https://www.freepascal.org/docs-html/rtl/system/textrec.html |
|
Andrey: I'm tired as well, which is the reason why I won't add any further comments here after this one. You didn't mention output of individual characters, but your artificial test program writes between 1 and 9 characters at a time. I can easily show that the performed change improves performance in other cases considerably and that it has completely negligible impact in yet other cases, but that is not the point - the point is that FPC should behave the same way on all supported platforms (which wasn't the case before my change) and that it should provide possibility of working with the output of FPC compiled programs immediately if the output doesn't go to a disk file. Your question about TextRec was already answered by Bart in the meantime, I can just add that TextRec was part of the officially documented API even with TP/BP (Dos.TextRec). |
|
> but your artificial test program writes between 1 and 9 characters at a time Well, it does so solely because of your "fix" which is in question here. Normally output to a text stream is not a real output operation, it is an operation of composing data internally withing the program's own memory and costs next to nothing, compared with output syscalls. This is what buffering is for. Funny to see that you accuse my "artificial" program for things you introduces with your broken "fix". > I can easily show that the performed change improves performance in other cases considerably No you can't, because it's impossible due to the nature of basic things. If you can, surprize me, heh. Or at least surprise others. > won't add any further comments here after this one Okay, it's your right. Can anyone here tell me how to contact more reasonable people on the Free Pascal team? |
|
As an exception to my previous statement of not adding further comments: All members of the Free Pascal team received notification about your bug report and moreover, I explicitly mentioned in the core team mailing list that I'm too biased in this case, so that I don't want to make the final decision. I won't comment your statement suggesting who is reasonable and who not, anybody may judge it for himself or herself. |
|
Bart, > https://www.freepascal.org/docs-html/rtl/system/textrec.html citing the page on this reference: > TextRec is the underlying type used in text files. It should be treated as opaque and never manipulated directly. This is exaclty what I expected, that is, the "solution" offered to me here is not a solution. |
|
I would like to put two cents into this discussion, because for me this bug feels important. It is my point of view and I do not attaching it to any of the technical terms provided here, but I think when we are talking about "cross-platform" development, we need to ensure that our internal behaviors works the same way as target-platform expects and that means true cross-platform support and if we are moving priority to have same behavior for an interfaces inside our application instead of keeping expected behavior, that leads to fake cross-platform support and users, who expects OS-oriented behavior will encounter troubles. So in my point of view we need to keep behavior as Croco said and find another way to implement higher level thing what Tomas talking about. Because operation systems lays much down and their behavior much more important, than upper interface in terms of software for software developers. p.s Of course it is just my motivation to talk about it and it can be differ in compare with motivations of Croco and Tomas, but maybe my words can change someone's mind. |
|
Thanks for your input. What do you mean by "the way as target platform expects"? None of the FPC supported platforms has any notion of outbut buffering or not buffering as a property of the "platform"; such behaviour is always related to particular API calls (should it be OS API, or RTL, etc.) as far as I can see. Talking about behaviour native for a particular platform (in this case Linux referred to in the report), it is IMHO much more predictible / "native" if a program behaves the same way if sending its output directly to the console and if sending it to the console through the 'tee' filter. |
|
> much more predictible / "native" if a program behaves the same way if sending its output directly to the console and if sending it to the console through the 'tee' filter. This contradicts with what Unix (not only Linux) people can expect because of well-established unix traditions, so at least this has nothing to do with being "native" for any of the unix-like platforms. That is, no other language, interpreter or compiler (of what I know, and I know a lot of them as actually Linux is the only OS I use for more than 20 years) will demonstrate such behavior. From the other hand, this (mis)behavior makes FPC useless for programming unix filters as such. Furthermore, as far as I can tell, implementations of that Do_IsDevice thing for various platforms were consistent between each other, as well as with all uses of the function found in the rtl code (after the absolutely nonsense call to Do_IsDevice was finally removed from SeekEof code, which costed me, BTW, way too much time wasted on discussions), in that they did their best to detect whether the output goes to a human user or not. For unix-like systems, this is exactly what isatty(3) function does, and Do_IsDevice versions for unix-like systems were all the same doing exactly this; for other platforms, they actually checked whether the stream is a "standard" one, and when there's no other option, this could be not too bad. What is "inconsistent" here is the name of the function, which definitely must be named differently to prevent more confusion: it should be named Do_IsTerminal or something similar. Breaking functionality of possible unix-filters written in FPC will gain you nothing, you only loose. |
|
Alexander, > So in my point of view we need to keep behavior as Croco said and find another way to implement higher level thing what Tomas talking about. the problem is that these "higher level things" don't exist. Could you may be read the (well, way too long) discussion on the "related" (now closed) ticket, just to get the explanation of what really happens here? BTW, your name looks like Russian, so if you speak Russian, may be you can be so kind to contact me directly (e.g. via my site, http://stolyarov.info) so that we can discuss all this stuff more quickly? |
|
Consistent? Original state (still existing in FIXES_3_2): AIX, Haiku, Solaris - no buffering for std input/output/error regardless from them being redirected or not, always buffering for all other file handles regardless from them possibly being attached to console or anything else. BeOS, BSD targets, Linux - no buffering if the file handle was attached to a TTY device EMX, GO32v2, MSDOS, Watcom - buffering only if the file handle is a disk file (not a device - as the function name suggests and exactly as with TP/BP) OS/2 - no buffering if the file handle is attached to any device, buffering in all other cases Win16 - no buffering for std input/output/error (regardless from them being redirected or not) and for all other file handles attached to anything different from a real disk file Win32/Win64 - no buffering for std input/output/error _unless_ they are redirected, but also no buffering for other handles originally attached to console _even_ if redirected Current state in trunk: All above mentioned targets - no buffering for file handles attached to any device, pipe or socket |
Date Modified | Username | Field | Change |
---|---|---|---|
2020-09-28 21:39 | Andrey "Croco" Stolyarov | New Issue | |
2021-01-02 23:00 | Tomas Hajny | Relationship added | related to 0037716 |
2021-01-03 14:49 | Marco van de Voort | Note Added: 0128046 | |
2021-01-04 08:22 | Andrey "Croco" Stolyarov | Note Added: 0128063 | |
2021-01-04 10:05 | Tomas Hajny | Note Added: 0128064 | |
2021-01-04 10:06 | Tomas Hajny | Note Edited: 0128064 | View Revisions |
2021-01-04 12:22 | Marco van de Voort | Note Added: 0128067 | |
2021-01-04 19:34 | Andrey "Croco" Stolyarov | Note Added: 0128082 | |
2021-01-04 22:11 | Bart Broersma | Note Added: 0128084 | |
2021-01-05 00:03 | Tomas Hajny | Note Added: 0128086 | |
2021-01-07 16:23 | Andrey "Croco" Stolyarov | Note Added: 0128142 | |
2021-01-07 16:36 | Tomas Hajny | Note Added: 0128144 | |
2021-01-09 07:58 | Andrey "Croco" Stolyarov | Note Added: 0128191 | |
2021-01-11 15:21 | Alexander Logunov | Note Added: 0128266 | |
2021-01-11 15:58 | Tomas Hajny | Note Added: 0128268 | |
2021-01-11 15:59 | Tomas Hajny | Note Edited: 0128268 | View Revisions |
2021-01-16 22:59 | Andrey "Croco" Stolyarov | Note Added: 0128382 | |
2021-01-16 23:05 | Andrey "Croco" Stolyarov | Note Added: 0128383 | |
2021-01-17 01:33 | Tomas Hajny | Note Added: 0128384 |