View Issue Details

ID: 0028627
Project: Lazarus
Category: Packages
View Status: public
Last Update: 2015-09-16 17:18
Reporter: Bart Broersma
Assigned To: Bart Broersma
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Platform: i386
OS: Windows
OS Version: Win7
Product Version: 1.5 (SVN)
Product Build: r49759
Target Version: 1.6
Fixed in Version: 1.6
Summary: 0028627: TSynExporterHTML does not generate doctype and content-type information
Description:
Here's a typical fragment of the exporter's output:

<html>
<head></head>
<body>unit main;

{$assertions on}

{$mode objfpc}{$H+}
//{$mode delphi}
interface


The file seems to be written in the "current codepage" string encoding, but the file itself does not declare this.

Here's a fragment of a "proper" html file:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
       "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8">


I know that in general it is the server's task to inform the client of the charset of the HTML pages it serves, but that only works if the charset actually used is indeed the charset that the server says it serves *
Since we don't know that this is the case, we'd better insert some content-type info which states what the charset is.

Preferably the charset of the export should be configurable; if that cannot be done, the better choice IMO would be to default to UTF-8.
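
To make the request concrete, here is a minimal sketch of the kind of header the exporter could emit. BuildHtmlHeader and its ACharset parameter are hypothetical names used only for illustration; they are not part of SynEdit and this is not the change that was eventually committed. The markup simply mirrors the "proper" fragment above, with the charset passed in and UTF-8 as the default.

```pascal
program HeaderSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (not a SynEdit API): returns the opening part of an
// HTML 4.01 document that declares its own charset.
function BuildHtmlHeader(const ACharset: string = 'utf-8'): string;
begin
  Result :=
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"' + LineEnding +
    '       "http://www.w3.org/TR/html4/strict.dtd">' + LineEnding +
    '<html>' + LineEnding +
    '<head>' + LineEnding +
    '<meta http-equiv="content-type" content="text/html; charset=' +
      ACharset + '">' + LineEnding +
    '</head>' + LineEnding +
    '<body>';
end;

begin
  // Default: declare UTF-8.
  WriteLn(BuildHtmlHeader());
  WriteLn;
  // Configurable: declare another charset. The exported text must then
  // really be encoded in that charset, which this sketch does not check.
  WriteLn(BuildHtmlHeader('windows-1252'));
end.
```

The declared charset only helps if the exported bytes really are in that encoding, so a configurable charset would also need a matching conversion of the exported text.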

B.t.w. the exporter also seems to assume that the input (TStrings) is encoded as windows-1252, which is not the default encoding at least for TSynEdit.

* My server says that it serves HTML as UTF-8. If I place an HTML file created with the exporter on my server, your browser wrongly assumes it is UTF-8, and it may very well display incorrectly.
Steps To Reproduce:
Put a TSynEdit and a TSynExporterHTML on a form, with a highlighter attached.

I used this code to export:
procedure TForm1.Button1Click(Sender: TObject);
begin
  HtmlExp.CreateHTMLFragment := False;
  HtmlExp.ExportAll(SynEdit1.Lines);
  HtmlExp.SaveToFile('test.html');
end;

(Q: how to get rid of the annoying HTML fragment (which is meant for copying to clipboard)?)
Tags: No tags attached.
Fixed in Revision: r49820
LazTarget: 1.6
Widgetset:
Attached Files

Activities

Bart Broersma

2015-09-06 12:30

developer  

test.html (2,036 bytes)

Bart Broersma

2015-09-06 12:31

developer  

proper.html (644 bytes)

Bart Broersma

2015-09-06 12:33

developer   ~0085743

OK, the HTML tags nicely influence the rendering of the description.
I attached test.html (exported by TSynExporterHTML) and proper.html.

Bart Broersma

2015-09-06 12:44

developer   ~0085745

Note that exporting with charset=utf-8 has the advantage that much less text needs to be replaced: just the "reserved" symbols like <, >, & etc.
And since these are all in the ASCII range, replacing them is a lot easier than dealing with multi-byte codepoints.
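
As a rough illustration of that point, the sketch below escapes only the reserved ASCII symbols and copies every other byte unchanged. EscapeHtml is a hypothetical name, not the exporter's actual replacement routine, and it assumes the input string is already UTF-8. Because the bytes of a UTF-8 multi-byte sequence are always >= $80, a byte-wise scan can never mistake part of such a sequence for one of the reserved characters.

```pascal
program EscapeSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (not the SynEdit implementation): replaces the
// reserved HTML symbols in a UTF-8 string, byte by byte.
function EscapeHtml(const S: string): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
    case S[i] of
      '<': Result := Result + '&lt;';
      '>': Result := Result + '&gt;';
      '&': Result := Result + '&amp;';
      '"': Result := Result + '&quot;';
    else
      // All other bytes, including UTF-8 lead and continuation bytes,
      // are copied through untouched.
      Result := Result + S[i];
    end;
end;

begin
  WriteLn(EscapeHtml('a < b & c'));  // prints: a &lt; b &amp; c
end.
```

Building the result by repeated concatenation keeps the sketch short; a real exporter would more likely write to a buffer or stream.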

Bart Broersma

2015-09-07 19:35

developer   ~0085768

Last edited: 2015-09-07 19:54

Finally figured out what ExportAsText and CreateHTMLFragment do:
ExportAsText means: DoNotIncludeFragmentInformationForWindowsClipBoard
CreateHTMLFragment means: DoNotSurroundWithHtmlAndBodyTags

Rather confusing names for these variables.

Issue History

Date Modified | Username | Field | Change
2015-09-06 12:28 | Bart Broersma | New Issue |
2015-09-06 12:30 | Bart Broersma | File Added: test.html |
2015-09-06 12:31 | Bart Broersma | File Added: proper.html |
2015-09-06 12:33 | Bart Broersma | Note Added: 0085743 |
2015-09-06 12:35 | Bart Broersma | Steps to Reproduce Updated |
2015-09-06 12:36 | Bart Broersma | Summary | TSynExporterHTML does not generate doctype and content-type informtion => TSynExporterHTML does not generate doctype and content-type information
2015-09-06 12:44 | Bart Broersma | Note Added: 0085745 |
2015-09-06 12:45 | Bart Broersma | Description Updated |
2015-09-07 19:35 | Bart Broersma | Note Added: 0085768 |
2015-09-07 19:54 | Bart Broersma | Note Edited: 0085768 |
2015-09-13 10:51 | Bart Broersma | Assigned To | => Bart Broersma
2015-09-13 10:51 | Bart Broersma | Status | new => assigned
2015-09-13 14:11 | Bart Broersma | Fixed in Revision | => r49820
2015-09-13 14:11 | Bart Broersma | LazTarget | - => 1.6
2015-09-13 14:11 | Bart Broersma | Status | assigned => resolved
2015-09-13 14:11 | Bart Broersma | Fixed in Version | => 1.6
2015-09-13 14:11 | Bart Broersma | Resolution | open => fixed
2015-09-13 14:11 | Bart Broersma | Target Version | => 1.6
2015-09-16 17:18 | Bart Broersma | Status | resolved => closed