texor is a proposed package that helps convert older LaTeX-based documents and research papers to HTML through a series of intermediate conversions. This is a particular problem for R research papers written when HTML export was not available, which therefore have no web version. HTML web pages are a much better way to bring these documents online than PDFs, and texor provides a mechanism for producing the missing web versions.
In this stage, we check the basics: using the correct path, normalizing the path, extracting the file_name/wrapper_name, and so on.
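The checks in this stage can be sketched roughly as follows. This is a Python illustration rather than texor's own R code; the function name `describe_article` and the example wrapper name `RJwrapper.tex` are assumptions made for the example.

```python
from pathlib import Path


def describe_article(wrapper_path):
    """Normalize a wrapper path and split it into the pieces later stages need."""
    # resolve() makes the path absolute and collapses "." and ".." segments
    path = Path(wrapper_path).expanduser().resolve()
    if path.suffix.lower() != ".tex":
        raise ValueError(f"expected a .tex wrapper file, got {path.name!r}")
    return {
        "article_dir": path.parent,   # normalized directory of the article
        "wrapper_name": path.name,    # file name with extension
        "file_name": path.stem,       # file name without extension
    }


if __name__ == "__main__":
    # Hypothetical input path, used only to show the shape of the result
    print(describe_article("paper/../paper/RJwrapper.tex"))
```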
Pandoc does not need all of the style files, as it is not trying to compile the document but rather to convert it. Hence, to work around certain limitations, we remove the RJournal.sty file and include a new style file which redefines certain commands.
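One way to picture the style swap is as a rewrite of the `\usepackage` line in the wrapper. The sketch below is a Python illustration under that assumption; the replacement name `replacement_style` is hypothetical and does not refer to the style file texor actually ships.

```python
import re


def swap_style_package(tex_source, old="RJournal", new="replacement_style"):
    """Replace \\usepackage{old} (with or without options) by \\usepackage{new}."""
    pattern = re.compile(r"\\usepackage(\[[^\]]*\])?\{" + re.escape(old) + r"\}")
    # A function is used as the replacement so backslashes are inserted literally.
    # Any package options on the old line are dropped in this sketch.
    return pattern.sub(lambda match: "\\usepackage{" + new + "}", tex_source)


if __name__ == "__main__":
    sample = "\\documentclass{report}\n\\usepackage{RJournal}\n\\begin{document}\n"
    print(swap_style_package(sample))
```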
As we do not want the embedded bibliography to be included as a div element in the article itself, we need to convert it to BibTeX format.
To remove the bibliography div elements from the article, we use a Lua filter later on.
To convert the embedded bibliography, we use the rebib package. By default, the bibliography aggregation function is enabled; it creates or updates the BibTeX file and also includes it in the article_tex_file (if not already linked).
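The "include it if not linked" step can be sketched as below. This is a Python illustration, not rebib's implementation; the function name and the choice to insert the link just before `\end{document}` are assumptions.

```python
import re


def ensure_bibliography_linked(tex_source, bib_name):
    """Add a \\bibliography{bib_name} line unless the article already has one."""
    # The brace right after "bibliography" keeps \bibliographystyle{...} from matching
    if re.search(r"\\bibliography\{[^}]*\}", tex_source):
        return tex_source  # already linked, leave the source untouched
    link = "\\bibliography{" + bib_name + "}\n"
    marker = "\\end{document}"
    if marker in tex_source:
        head, sep, tail = tex_source.rpartition(marker)
        return head + link + sep + tail
    # No \end{document} (for example an included article file): append at the end
    return tex_source + "\n" + link
```

Calling the function a second time on its own output returns the text unchanged, so it is safe to run on every conversion.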
The texor package creates a YAML report about the figure environments, including tikz and algorithm2e images. There is also a function which uses pandoc's image data to convert PDF images to PNG.
Pandoc does not support certain environments, such as:
in figures: figure*, algorithmic, algorithm;
in tables: table*;
in code: example, example*, Sin, Sout, Scode, Sinput, Soutput, smallverbatim, boxedverbatim.
Here, texor uses the stream editor to patch these environments to the default types figure, table and verbatim.
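The patching described above amounts to renaming `\begin{...}` and `\end{...}` pairs. The following Python sketch uses a regular expression instead of the stream editor; the mapping follows the environment lists given above, while the function and variable names are assumptions.

```python
import re

# Unsupported environment -> default environment that pandoc understands
ENV_MAP = {
    "figure*": "figure", "algorithmic": "figure", "algorithm": "figure",
    "table*": "table",
    "example": "verbatim", "example*": "verbatim",
    "Sin": "verbatim", "Sout": "verbatim", "Scode": "verbatim",
    "Sinput": "verbatim", "Soutput": "verbatim",
    "smallverbatim": "verbatim", "boxedverbatim": "verbatim",
}


def patch_environments(tex_source, env_map=ENV_MAP):
    """Rename unsupported environments in both \\begin{...} and \\end{...}."""
    names = "|".join(re.escape(name) for name in env_map)
    # The closing brace in the pattern ensures only whole names match
    pattern = re.compile(r"\\(begin|end)\{(" + names + r")\}")

    def rename(match):
        return "\\" + match.group(1) + "{" + env_map[match.group(2)] + "}"

    return pattern.sub(rename, tex_source)
```

Environments that are already supported, such as plain figure or table, are left untouched because they do not appear in the mapping.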
There is also a function to patch equations (especially the eqnarray environment).
Here we convert the document to Markdown, with a number of Lua filters modifying the document along the way.
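A conversion of this kind can be expressed as a single pandoc call with one `--lua-filter` option per filter. The Python sketch below only assembles the command; the file and filter names in the usage example are hypothetical, and texor's actual set of filters and options is not reproduced here.

```python
def build_pandoc_command(input_tex, output_md, lua_filters):
    """Assemble a pandoc LaTeX-to-Markdown command with Lua filters."""
    command = [
        "pandoc", str(input_tex),
        "--from", "latex",
        "--to", "markdown",
        "--output", str(output_md),
    ]
    # pandoc applies filters in the order they are given on the command line
    for lua_filter in lua_filters:
        command += ["--lua-filter", str(lua_filter)]
    return command


if __name__ == "__main__":
    # Hypothetical file and filter names, shown only to illustrate the call
    print(build_pandoc_command("RJwrapper.tex", "RJwrapper.md",
                               ["bib_filter.lua", "image_filter.lua"]))
```

With pandoc installed, the returned list can be passed to `subprocess.run(command, check=True)` to perform the conversion.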
The pre_conversion statistics step reads through the article and counts elements such as figures, tables, code blocks and citations.
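A rough idea of such a count, as a Python sketch: it scans the LaTeX source for `\begin{...}` and `\cite...` commands. The function name, the list of code environments and the rule of counting each citation command once (rather than each cited key) are assumptions.

```python
import re

# Environments treated as code blocks in this sketch
CODE_ENVS = {"verbatim", "example", "Sin", "Sout", "Scode", "Sinput",
             "Soutput", "smallverbatim", "boxedverbatim"}


def count_elements(tex_source):
    """Count figures, tables, code blocks and citation commands in LaTeX source."""
    counts = {"figure": 0, "table": 0, "code": 0, "citation": 0}
    for env in re.findall(r"\\begin\{([^}]*)\}", tex_source):
        base = env.rstrip("*")  # treat starred variants like the plain form
        if base in ("figure", "table"):
            counts[base] += 1
        elif base in CODE_ENVS:
            counts["code"] += 1
    # Matches \cite, \citep, \citet, ... with optional [..] arguments;
    # each command counts once, however many keys it cites
    cite_pattern = r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{"
    counts["citation"] = len(re.findall(cite_pattern, tex_source))
    return counts
```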
This function copies supporting files, such as figures of all kinds, the BibTeX file and PDFs, to the /web folder.
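The copy step might look like the following Python sketch. The list of file suffixes and the decision to preserve sub-folder structure inside the web folder are assumptions, not a description of texor's behaviour.

```python
import shutil
from pathlib import Path

# Suffixes treated as supporting files in this sketch
SUPPORT_SUFFIXES = {".png", ".jpg", ".jpeg", ".pdf", ".bib"}


def copy_to_web(article_dir, suffixes=SUPPORT_SUFFIXES):
    """Copy supporting files into <article_dir>/web, keeping relative paths."""
    article_dir = Path(article_dir)
    web_dir = article_dir / "web"
    copied = []
    # sorted() materializes the listing before any file is written
    for source in sorted(article_dir.rglob("*")):
        if web_dir in source.parents or not source.is_file():
            continue  # skip directories and anything already inside web/
        if source.suffix.lower() not in suffixes:
            continue
        target = web_dir / source.relative_to(article_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        copied.append(target.relative_to(web_dir).as_posix())
    return copied
```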
In this stage we convert the Markdown to R Markdown by reading and adding metadata such as ctv, CRANpkgs, BIOpkgs, slug, author metadata, title and abstract.
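Adding metadata in this way comes down to placing a YAML front matter block ahead of the Markdown body. The Python sketch below writes each value as JSON, relying on JSON values being readable as YAML flow style; the function names are assumptions and the values used with them would be placeholders, not real article data.

```python
import json


def build_front_matter(metadata):
    """Render a metadata mapping as a YAML front matter block."""
    lines = ["---"]
    for key, value in metadata.items():
        # JSON strings, lists and mappings are used here as YAML flow-style values
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


def to_rmarkdown(markdown_body, metadata):
    """Put the front matter ahead of the converted Markdown body."""
    return build_front_matter(metadata) + "\n" + markdown_body
```

Quoting every string avoids surprises with titles that contain colons or other characters that YAML treats specially.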
We also add important parameters for rjtools::rjournal_web_article, like:
This package solves conversion problems on multiple fronts and therefore has to rely on multiple software tools. A list with reasoning is included here: