FOP - HTML2PDF

echoo

2011-01-25 08:32:15 UTC

Dear

I have a problem which I don't know how to solve:

I have an XML file which I want to transform to:
- a PDF file.
- an XSL file.

For this, I use Apache FOP (as I am working in a Java environment).
The result of this is nice except for one thing:

My XML has one field called 'introduction', which accepts
HTML content.
After transformation, the raw HTML markup is shown in the PDF file.

I want the HTML to be interpreted.

Yours Sincerely

Christof

--
View this message in context: http://old.nabble.com/FOP---HTML2PDF-tp30748316p30748316.html
Sent from the FOP - Users mailing list archive at Nabble.com.

mehdi houshmand

2011-01-25 08:47:20 UTC

Hi Christof,

Just to be clear, you've got an XML element that contains HTML and you
want that HTML to be interpreted as text (i.e. you want the HTML tags
removed)? If so, this isn't strictly a FOP question. FOP isn't
responsible for analysing, parsing or interpreting XML directly (though
admittedly it does accept XML as input, together with an XSLT that
transforms the XML to FO). Anyway, the point is that you want that
knowledge to be in the XSL. The XSLT is responsible for parsing the XML
and converting it into FO. How you do that is a question for an XSLT
forum, but one way would be to use regexes to return the string you
want. One Google search yielded
http://www.xml.com/pub/a/2003/06/04/tr.html, which seems like a
fairly nice little introduction to regexes in XSLT.

I hope that helps

Mehdi

echoo

2011-01-25 13:33:03 UTC

Dear Mehdi

Thank you for your reply.

What do you mean by 'interpreted as text'?
What I want is that if I have a <table> tag in my HTML content, a table
should be drawn in the resulting PDF file. I just don't know how :-) (yet)
Your link might, indeed, be useful.

Yours Sincerely

Christof

mehdi houshmand

2011-01-25 13:47:36 UTC

Hi Christof,

Correct me if I'm wrong, but you're trying to extract the relevant
text from the HTML and convert that to FO objects in XML. If so, that
looks like a job for regex, i.e. finding strings. In your case, you'd
be looking for <table>ANY STRING</table> (I presume) and inserting that
text into FO elements. However, there's almost definitely a more
intuitive way to do that using XSLT, but that's not really within the
scope of this forum. You want all that intelligence in the XSLT: you
want the XSLT to parse the HTML and create the necessary FO elements.
XSLT is a very powerful tool, and most likely someone else has already
done what you're trying to do, or at least something similar that you
could... Uhm... *cough* plagiarize *cough*. My point is there's no
point reinventing the wheel. Google is your friend; check this out, it
might be a good starting point:
http://stackoverflow.com/questions/1639625/can-i-parse-an-html-using-xslt

I hope that helps

Mehdi

Wim VN

2011-01-25 10:36:34 UTC

Hello Christof,

I'm not sure but I think the solution can be found in XSLT and not in FOP.

If I understand correctly: you use an XSL transformation to go from a source
xml file to an intermediate XSL-FO file. Afterwards you process this with
Apache FOP to a final PDF document.
Within the xml there is a tag that holds html content. You wish this content
to be interpreted.

Is it not possible to have the XSL transformation look up that <introduction>
tag and make sure it converts the HTML content to FO as well?
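
To make that idea concrete, here is a minimal sketch in Java (JDK classes only, no FOP needed to try it). It assumes the HTML sits inside <introduction> as real, well-formed child elements in no namespace. The class name, the method name and the handful of template rules are made up for illustration; a real stylesheet such as xhtml2fo.xsl covers far more of HTML. The output is only an FO fragment, so the rules would have to be merged into the stylesheet that already builds the rest of the page.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IntroductionToFo {

    // Hypothetical template rules: HTML tables found under <introduction>
    // become fo:table / fo:table-row / fo:table-cell.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='introduction'>"
      + "<fo:block><xsl:apply-templates/></fo:block>"
      + "</xsl:template>"
      + "<xsl:template match='introduction//table'>"
      + "<fo:table table-layout='fixed' width='100%'><fo:table-body>"
      // rows directly under the table or under tbody/thead/tfoot
      + "<xsl:apply-templates select='tr | */tr'/>"
      + "</fo:table-body></fo:table>"
      + "</xsl:template>"
      + "<xsl:template match='tr'>"
      + "<fo:table-row><xsl:apply-templates select='td | th'/></fo:table-row>"
      + "</xsl:template>"
      + "<xsl:template match='td | th'>"
      + "<fo:table-cell><fo:block><xsl:apply-templates/></fo:block></fo:table-cell>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet above over the given XML and returns the FO fragment.
    static String toFo(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<introduction><table>"
                   + "<tr><td>A</td><td>B</td></tr>"
                   + "</table></introduction>";
        System.out.println(toFo(xml));
    }
}
```

If the HTML is in the XHTML namespace, the match patterns would need a namespace prefix bound to it, otherwise none of these rules will fire.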

I am not an XSLT expert and if I'm not mistaken other forums might be a
better choice to get help on this specific problem.

Good luck with your project
Wim

echoo

2011-01-25 13:27:07 UTC

Hello Wim VN

Thank you for your reply.

Yes, what you mention is exactly what I want to do.

I am not sure, but I believe I can use the XSL file (xhtml2fo.xsl) provided
by Antenna House (http://www.antennahouse.com/XSLsample/XSLsample.htm) to
look up the HTML translations. Is this what you mean by 'have the XSL
transformation look up that <introduction> tag'?
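
One thing worth checking before applying a stylesheet like that: template rules can only match HTML that is present as real elements. If the 'introduction' field stores the HTML as escaped text (or in a CDATA section), the stylesheet just sees one long string, which would explain raw tags showing up in the PDF. The thread does not say how the field is stored, so this is an assumption. Under that assumption, a rough sketch of a pre-processing step in Java could look like the following. It also assumes the stored HTML is well-formed XHTML; the class and method names and the temporary 'wrapper' element are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IntroductionUnescaper {

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the text of every <introduction> with the elements that text spells out.
    static Document expandIntroduction(String xml) throws Exception {
        Document doc = parse(xml);
        NodeList intros = doc.getElementsByTagName("introduction");
        for (int i = 0; i < intros.getLength(); i++) {
            Element intro = (Element) intros.item(i);
            // getTextContent() returns the unescaped string, e.g. "<table>...".
            String html = intro.getTextContent();
            // A temporary wrapper gives the fragment a single root element.
            // This throws if the stored HTML is not well-formed.
            Document fragment = parse("<wrapper>" + html + "</wrapper>");
            // Drop the old text node(s).
            while (intro.hasChildNodes()) {
                intro.removeChild(intro.getFirstChild());
            }
            // Copy the parsed nodes into the original document.
            Node child = fragment.getDocumentElement().getFirstChild();
            while (child != null) {
                intro.appendChild(doc.importNode(child, true));
                child = child.getNextSibling();
            }
        }
        return doc;
    }
}
```

The returned Document can then be handed to the XSL transformation through a javax.xml.transform.dom.DOMSource instead of the original file. HTML that is not well-formed would need to be cleaned up first.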

Yours Sincerely

Christof
