Package 'lyxport'

Title: LyX to MSWord etc
Description: Tools for smooth LyX export via pandoc, currently to MSWord but potentially to other formats
Authors: Mark Bravington <[email protected]>
Maintainer: Mark Bravington <[email protected]>
License: GPL-2
Version: 1.0.155
Built: 2024-12-12 09:22:53 UTC
Source: https://github.com/markbravington/lyxport


How to use the lyxport package

Description

The lyxport R package is for exporting LyX documents to MSWord— which I sometimes have to do, under duress— and perhaps other formats. Unlike LyX's built-in “MS Word Open Office XML” export, lyxport does proper cross-referencing including tables, figures (correctly sized), lists, equations, appendices, and bibliography. Most of the heavy lifting is still done by Pandoc, as in LyX's built-in export option; but Pandoc— wonderful though it is!— doesn't get everything right even with well-known filters (as you have probably discovered yourself by now, else you mightn't be reading this). So the package contains a lot of behind-the-scenes fiddly code in order to save you lots of manual post-tinkering.

Once you've installed the package, run lyxprefhack to set things up for direct use from LyX. Then you should see an "MSWord (lyxport)" option in "File->Export", and a "lyxport" item at the end of "Help->Specific manuals", where the full package documentation lives (you may wish to consider at least opening it...). To just see the features, do "File->Open example" and filter for "lyx", to open "lyxport-demo.lyx". You can try exporting with LyX's built-in MSWord option, and with lyxport. I haven't tested every LyX feature; it's mostly just stuff I need. More things might get added.

The package currently has one main user-visible function, lyxprefhack, which you need to call (once) for setup. The conversion work is done by lyxzip2word, but you don't normally need to call it directly; lyxzip2word is normally called by LyX on your behalf, when you export to "MSWord (lyxport)" or perhaps some other format. There are also some helper utilities:

  • requote_lyx (qv) to help sort out quotation marks, e.g. in case your document includes imports from other formats such as plain-text.

  • tidy_initials (qv) and tex2utf8 to tweak bibliography files to be MSWord-ready. These are called automatically by lyxzip2word, unless you tell it not to, but you might also find them useful in their own right.

Just in case

lyxprefhack makes a number of assumptions about your LyX config files (for good reasons). It works for me, but that could be dumb luck; if it goes wrong for you, be aware that it makes backups of your "preferences" and "ui/stdmenus.inc" files— so you can manually restore them. If you can't see "lyxport" as a help option, try typing "help-open lyxport-docu" in the minibuffer. And you can also see a PDF version of that documentation in R, via RShowDoc("lyxport-docu",package="lyxport").


Prepare LyX for better export to MSWord

Description

This function should be called just after you install the lyxport package, and you probably won't need to call it again. It modifies your LyX "preferences" file to add better MSWord export, with shortcut "W" in File->Export; see Details. It also creates a LyX help file and an example that you can access straight from LyX— so you may never need to use this package again from R, since almost everything you need is accessible from LyX itself. (One exception is if you want to use requote_lyx (qv) to sort out quotation-mark problems. Many people will never need it.)

Usage

lyxprefhack( userdir)

Arguments

userdir

Where your config files live; see section 2 of LyX's "Customization" manual. R will prompt you for it if you don't set the parameter, so you can copy it from "Help->About" in LyX; single backslashes are OK.
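
Single backslashes are fine at the prompt, but in R code a backslash inside an ordinary string literal is an escape character. If you prefer to pass userdir explicitly, a minimal sketch is shown below. The path is hypothetical (substitute whatever "Help->About" reports on your machine), and the raw-string form needs R 4.0 or later.

```r
# Hypothetical Windows user directory; replace with your own.
userdir <- r"(C:\Users\me\AppData\Roaming\LyX2.4)"

# Other ways to write the same path in ordinary string literals.
same_doubled <- "C:\\Users\\me\\AppData\\Roaming\\LyX2.4"
same_forward <- "C:/Users/me/AppData/Roaming/LyX2.4"

# Only attempt the setup if the package is installed and the folder exists.
if (requireNamespace("lyxport", quietly = TRUE) && dir.exists(userdir)) {
  lyxport::lyxprefhack(userdir)
}
```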

Details

To enable automatic export from LyX, you need to add two Preference settings inside LyX, either manually or by editing the "preferences" file in your LyX Userdir. The function lyxprefhack will attempt to do the latter, at your own risk.

The two settings can instead be set manually inside LyX, from the "Tools->Preferences->File Handling" menu. First, define a new File Format, which should be a copy of LyX's built-in "MS Word Open Office XML" but with a different name. The only fields you absolutely need are:

  • Format name: MSWord (lyxport)

  • Tick the boxes for "Document Format", "Show in Export menu", and "Vector graphics format"

  • Short name: wordx

  • Extensions: docx

  • Shortcut: you don't need this, but I use "W" so I can export via "Alt-F E W"

Second, you need to define a Converter, as follows:

  • From: LyX Archive (zip) - not from straight LyX!

  • To: MSWord (lyxport) - ie the name of the new Format

  • Converter: Rscript --no-save --no-restore --verbose -e lyxport::lyxzip2word(FROM_LYX=TRUE) $$i $$r $$p 1 > docxconv.log 2>&1

When the converter runs, it will write a logfile into LyX's temporary folder, which unfortunately is a bit hard to find if things go wrong (if they don't, you don't need to find it). Weirdly, if you try to put the logfile into the "main" folder (ie where source and export live), by using "$$r/<something>" after the redirect, then LyX says it can't execute the command...
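
For orientation, here is a minimal sketch of how an Rscript-based converter can pick up the values that LyX substitutes for $$i, $$r and $$p. The helper parse_converter_args is hypothetical and not part of lyxport. The mapping onto zipfile, origdir, tempdir and dbglyx is an assumption based on the FROM_LYX description, not a copy of what lyxzip2word really does.

```r
# Hypothetical helper: map trailing command-line arguments onto named settings.
# Assumed order, following the Converter line: $$i, $$r, $$p, optional debug level.
parse_converter_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 3) {
    stop("expected at least three arguments: input file, original dir, temp dir")
  }
  list(
    zipfile = args[1],                                # $$i: the file LyX hands over
    origdir = args[2],                                # $$r: folder of the original document
    tempdir = args[3],                                # $$p: LyX's temporary folder
    dbglyx  = if (length(args) >= 4) args[4] else ""  # optional debug level (assumed)
  )
}

# Made-up values standing in for what LyX would pass.
settings <- parse_converter_args(c("mydoc.zip", "C:/docs/", "C:/tmp/lyx_tmpdir/", "1"))
str(settings)
```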

Value

Overwrites the "preferences" file (unless there's no change required), after backing up the old one to "old_preferences<N>" (guaranteed not to overwrite any existing backup). Adds a "lyxport" option to "Help->Specific manuals", in the file "ui/stdmenus.inc", again making a backup of the latter if there's any change. Copies one file from the R installation to LyX's "<userdir>/doc" ("lyxport-docu.lyx") and one set of files to LyX's "<userdir>/examples/lyxport" ("lyxport-demo.lyx" and associated files).
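
The "never overwrite an existing backup" behaviour can be pictured with the following sketch. It is an illustration only: next_backup_name is a hypothetical helper, and the numbering scheme is assumed rather than taken from the package source.

```r
# Hypothetical helper: return the first "old_preferences<N>" name not yet in use.
next_backup_name <- function(dir, stem = "old_preferences") {
  n <- 1
  while (file.exists(file.path(dir, paste0(stem, n)))) {
    n <- n + 1
  }
  file.path(dir, paste0(stem, n))
}

# Demonstration in a scratch folder, so no real LyX files are touched.
scratch <- file.path(tempdir(), "lyxport-backup-demo")
dir.create(scratch, showWarnings = FALSE)
file.create(file.path(scratch, "preferences"))
file.create(file.path(scratch, "old_preferences1"))  # pretend an earlier backup exists

backup <- next_backup_name(scratch)
file.copy(file.path(scratch, "preferences"), backup, overwrite = FALSE)
basename(backup)
```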

See Also

lyxzip2word, lyxport

Examples

## Not run: 
lyxprefhack()

## End(Not run)

Convert LyX to MSWord or other format

Description

lyxzip2word starts from a "LyX archive" (zip export from LyX) and converts to MS Word (or potentially other formats) using various tools, mostly Pandoc. See RShowDoc("lyxport-docu",package="lyxport") for more information about normal use from LyX, and any requirements of your LyX source file (eg specifying the bibliography format).

You don't normally need to know anything about this function, since it is called automatically from LyX. However, I should document it for maintenance-type reasons. Also, if you want to experiment with exporting to other formats, you might want to use it direct from R, setting the outext and panoutopts arguments.

Usage

lyxzip2word(
  zipfile,
  outext= 'docx',
  panoutopts= outext,
  origdir= dirname( zipfile),
  tempdir= base::tempdir(),
  copy= FALSE,
  FROM_LYX= !interactive(),
  refdir= NULL,
  lyxdir= NULL,
  lyx_userdir= NULL,
  natnum_pandoc= FALSE, # devil or deep-blue-sea?
  crossref_pandoc= TRUE,
  verbose= FALSE,
  dbglyx= ''
)

Arguments

zipfile

Name of input file, normally a LyX-zip archive with a path. Extension is optional, but ".zip" is assumed. If (for experimentation only) the extension is ".lyx", then all the other necessary files had better be in tempdir, which had better be the same folder as the in-fact-not-zippy zipfile.

outext

File extension of output.

panoutopts

For pandoc's writer, to tell it what kind of output to produce, ie pandoc's "-t" argument. Normally the default of outext is fine, but there might be other useful options for specific writers, eg "markdown+myfilter-theirfilter".

origdir

where zipfile lives, and where the output should go; deduced from zipfile if missing.

tempdir

where to unpack the zipfile and create temporary files etc.

copy

whether to copy zipfile to tempdir (leaving the original in place), or move it (in which case the original is "lost"). Normally it's fine to just move it, because the zipfile is only a temporary step in the creation of your magnificent MSWord-flavoured document...

FROM_LYX

set TRUE iff called from inside LyX by a Converter, in which case zipfile, origdir, tempdir, and possibly dbglyx will be set according to commandArgs.

refdir

Top of the folder-tree for bibliography-finding. It should have a folder ./bibtex/bib/ underneath it, containing dot-bib files. It is normally found by automatic magic.

lyxdir

Where the LyX executable lives. However, you should probably make sure LyX is on the search path anyway, otherwise things may not work; if it is, then you can leave this NULL.

lyx_userdir

what you'd pass in the "-u <userdir>" option when starting LyX. However, as of LyX 2.4.2.1, there's a bug which stops that working (in the particular context of lyxzip2word). What this means is, you can't use your own private layouts/modules in your own userdir tree; you'd have to copy them to LyX's system layouts folder.

natnum_pandoc

whether to turn on pandoc's own numbering scheme (when reading and when writing). More trouble than it's worth so far, hence the default is FALSE.

crossref_pandoc

whether to use the "pandoc-crossref" filter when reading the Latex source. The default is TRUE, but lyxzip2word actually has to spend quite a bit of time fixing incorrect xrefs, and I suspect I could get FALSE to work with just minor mods, leading to one less "dependency". But everything's fine at the moment, so leave well enough alone.

verbose

if TRUE

dbglyx

Only for debugging, obvs. Should be blank, or a positive integer as per "lyx -dbg". Any number will also cause R to print out various things, such as paths.

Details

The steps in the conversion process are (actually there's more than this, this list is out-of-date...):

  • Move & extract zip file

  • Generate Tex, by running "lyx --export"

  • Merge any input/include files

  • Add eqn labels: eqn_labels_for_word()

  • Check and prepare bibliography

  • Twiddle any appendices, so as to not confuse pandoc

  • Export Tex -> pandoc-native: pandoc

  • Fix labelled-eqn column widths: eqalignfix()

  • Perhaps move the bibliography to before appendices

  • Export pandoc-native -> docx: pandoc
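
The two pandoc stages in the list above can be sketched as follows. This is a simplified illustration, not the package's actual code: pandoc_stage_args is a hypothetical helper, the file names are made up, and the real function does considerably more between the stages. The pandoc options used (-f, -t, -o, --filter) are standard ones.

```r
# Hypothetical helper: build the pandoc argument vector for one conversion stage.
pandoc_stage_args <- function(infile, outfile, from, to, filters = character()) {
  filter_args <- unlist(lapply(filters, function(f) c("--filter", f)))
  c("-f", from, "-t", to, filter_args, "-o", outfile, infile)
}

# Stage 1: Latex -> pandoc-native, here with the pandoc-crossref filter.
stage1 <- pandoc_stage_args("mydoc.tex", "mydoc.native",
                            from = "latex", to = "native",
                            filters = "pandoc-crossref")

# Stage 2: pandoc-native -> docx.
stage2 <- pandoc_stage_args("mydoc.native", "mydoc.docx",
                            from = "native", to = "docx")

# Run the stages only if pandoc and the (made-up) input file are actually present.
if (nzchar(Sys.which("pandoc")) && file.exists("mydoc.tex")) {
  status1 <- system2("pandoc", stage1)
  if (status1 == 0) system2("pandoc", stage2)
}
```

Splitting the conversion in two leaves an intermediate pandoc-native file that can be adjusted before the final export, which is where fixes such as eqalignfix() fit in.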

Value

Should produce a file "<zipfilename>.<outext>" in folder origdir. There will also be various files in tempdir (which will be LyX's session tempdir, if this was invoked from LyX itself), including a logfile "docxconv.log" which should/might contain useful error messages if there are any. Look carefully thru LyX's "View->Messages" window to see where that LyX tempdir is (it changes from one LyX session to the next). The formal R return-value of lyxzip2word is TRUE or FALSE according as whether it thinks everything worked.

See Also

lyxport

Examples

## Not run: 
# In LyX, open the "lyxport-demo.lyx" example, then File->Export->Lyxzip. Then try exporting
# to a non-MSWord format, via eg...
lyxzip2word( 'lyxport-demo.zip', outext='html')

## End(Not run)

Clean up LyX quotation marks

Description

This function tries to sensibly turn assorted types of quotation marks in a LyX document into LyX "dynamic quotes". The latter make it easy to render the document with any of a number of defined "nationalesque" quotation schemes, just by tweaking a single item in Document->Settings->Language. For example, you can get outer double/inner single quotes, outer single/inner double, guillemets, and so on; see "Quotation marks" in the LyX UserGuide (currently section 3.9.4.2). Without this functionality, a large LyX document can end up having multiple types of quotation marks (especially if it is multi-authored or includes excerpts of other documents), which can't easily be changed or searched for or made coherent.

Not all quotation marks should be changed: for example, straight quotes within listings or ERT should be left alone. requote_lyx tries to get that right. However, many funny-looking things come out OK when exported to Latex (which is a necessary step in producing eg nice MSWord documents, using the other functions in this package).

requote_lyx mainly aims at double-quotes (since IME these are the commonest defaults for normal quotation), but does some single-quote stuff too:

  • Apostrophes are left alone (deliberately; they are tricky!)

  • Any explicit single quotes are made dynamic, but their singleness is kept; it's assumed to be deliberate.

  • Hard-wired directional single-quote characters are turned into dynamic double quotes, just like hard-wired directional double-quotes. Coz that was probably the intention of an author who just prefers single-quotes for outer.

requote_lyx isn't aiming at perfection, and may well not be foolproof; there might be situations where it doesn't work properly, because LyX used some structuring that I hadn't anticipated. Sorry.

Usage

requote_lyx( filename = NULL, lyx = NULL, outfile = NULL)

Arguments

filename

optional name of file to read from

lyx

or you can pass the actual text in directly, as a character vector

outfile

optional filename to write the output to.

Value

The modified LyX text will be returned, invisibly. Also, if outfile is not NULL, the modified LyX text will be written to outfile.
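
Before running requote_lyx on a large document, it can be handy to see how many hard-wired directional quote characters it contains. The sketch below is plain R and not part of lyxport: count_curly_quotes is a hypothetical helper, and the file names in the guarded call are made up. The guarded call uses only the documented arguments.

```r
# Hypothetical helper: count hard-wired directional quote characters per line.
# Note that a typographic apostrophe is the same character as a right single quote.
count_curly_quotes <- function(lines) {
  curly <- "[\u201c\u201d\u2018\u2019]"   # left/right double and single quotes
  lengths(regmatches(lines, gregexpr(curly, lines)))
}

sample_lines <- c("He said \u201chello\u201d.", "No quotes here.", "\u2018single\u2019")
count_curly_quotes(sample_lines)

# Guarded call to the real function; does nothing unless package and file exist.
if (requireNamespace("lyxport", quietly = TRUE) && file.exists("mydoc.lyx")) {
  new_lyx <- lyxport::requote_lyx(filename = "mydoc.lyx",
                                  outfile = "mydoc-requoted.lyx")
}
```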


Transform Biblatex bibliography to native UTF8 characters

Description

Biblatex bibliography (dot-bib) files may contain legacy Latex/Bibtex representations of characters, such as "\c{g}" for "ģ". These are normally fine (though some of them are technically incorrect, but may still work), but not always, e.g. when looking for consistent names, as in tidy_initials (qv). So you can try tex2utf8 to translate such representations into "native UTF8 codepoints". It might help you.

This might even work on more general Latex (ie not on a bib file) but you are on your own there...

Usage

tex2utf8(tex, file = NULL, outfile = NULL, debrace = FALSE)

Arguments

tex

character vector containing the bibliography contents. Provide just one of tex or file.

file

if supplied, this is used in place of tex (which is ignored), to read the lines in from.

outfile

if supplied, the result will be written here.

debrace

whether to remove superfluous braces around single UTF8 characters. IME these are mostly legacy effects of Latex representation, rather than deliberate statements about upper/lower case (the only legit use I can think of). Extra braces are usually harmless if you are using tex2utf8 as a standalone, but they do muck up tidy_initials (qv), so lyxzip2word sets this option TRUE when tidying up the biblio (unless I have given an option to change that, which I haven't yet).

Details

This is harder than it sounds. The key was to find a couple of tables on WWW; see source code for details. tex2utf8 tries to fix up common Bibtex representation errors (i.e. semi-problems in my own master biblio file, mostly from WWW sources), but probably won't catch everything. And there may be "native UTF8 codepoints" for some characters that aren't in the (original?) Latex list, and won't be transformed. They can of course be produced in Latex by a composite (eg "\k{n}"; I have no idea whether that is a real character in some alphabet). They will stay composite in the output.
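
The general idea of table-driven translation can be illustrated with a toy version. This is only a sketch: toy_tex2utf8 and its four-entry table are hypothetical stand-ins, whereas the real function relies on much larger tables and copes with many awkward cases that this does not.

```r
# Toy lookup table: legacy Latex representation -> UTF8 character.
toy_table <- c(
  "\\\"{u}" = "\u00fc",  # \"{u} -> u with diaeresis
  "\\'{e}"  = "\u00e9",  # \'{e} -> e with acute
  "\\c{c}"  = "\u00e7",  # \c{c} -> c with cedilla
  "\\c{g}"  = "\u0123"   # \c{g} -> g with cedilla
)

# Hypothetical helper: replace each known representation by its UTF8 character.
toy_tex2utf8 <- function(tex, table = toy_table) {
  for (pattern in names(table)) {
    tex <- gsub(pattern, table[[pattern]], tex, fixed = TRUE)
  }
  tex
}

toy_tex2utf8(c("author = {M\\\"{u}ller, A.}", "title = {Caf\\'{e} fran\\c{c}ais}"))
```

Anything not in the table is left untouched, which mirrors the remark above that unlisted composites stay composite in the output.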

Value

The modified contents.

See Also

lyxzip2word, tidy_initials

Examples

## Not run: 
# Illustrative only: a made-up bib entry with a legacy Latex accent
bib <- c( '@article{psmith1999,', '  author = {Psmith, A. B. and M{\\"u}ller, C.},', '  year = {1999}', '}')
tex2utf8( bib, debrace=TRUE)

## End(Not run)

Consistent author names in dot-bib files

Description

In a dot-bib file, the same author can appear with slightly different names in different papers: "A. Psmith", "Alan Psmith", "Alan B. Psmith", "A. Bertram Psmith", and so on. If you are not careful, your citations can come out funny as a result. For example, you might see "Psmith et al. (1999)" but "A.B. Psmith (2004)" even though Alan Bertram is the only Psmith you are citing. With Biblatex and PDFs, you can suppress such nonsense via "uniquename=false" and "uniquelist=false". But with CSL and MSWord etc, it seems to be harder— well, so does pretty much everything, actually.

To circumvent the problem, you can call tidy_initials on your bib file beforehand (which lyxzip2word does automatically by default, on a temporary copy of the dot-bib, unless you ask it not to). This will ensure all plausibly-identical authors have exactly the same bib entries (initials only, and the longest possible set), and should eliminate silly citations.

The merging rules, which depend slightly on the gungho argument, are as follows:

  • "Alan Psmith" and "A. Psmith" are always assumed to be the same (and "Alan Psmith" will be remapped to "A. Psmith", since only initials are retained when there's a discrepancy).

  • "Alan Psmith" and "Alicia Psmith" are never merged. That is: if there's a full name rather than initial, it's taken seriously.

  • Mismatched initials are never merged, eg "A.B. Psmith" and "A.C. Psmith".

  • If gungho=TRUE, then missing initials are ignored, so that "A. Psmith" gets merged with "A.B. Psmith". If there's also an "A.C. Psmith", then it's the luck of the draw as to which one "A. Psmith" will be merged with.

  • If gungho=FALSE, then merging only occurs when two authors also have the same number of initials; so "Alan Bertram Psmith" gets merged with "A.B. Psmith" but not with "A. Bertram Psmith".

In order to make this surprisingly tricky programming task a bit easier, tidy_initials calls tex2utf8, which you might find useful in its own right.
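
The initials-only part of the merging rules can be sketched in a few lines. This is a hypothetical illustration, not the package's code: initials_compatible is an invented helper, it compares only leading initials (an assumption about what "missing initials are ignored" means), and it leaves out the handling of full names altogether.

```r
# Hypothetical helper: could two sets of initials belong to the same author?
# 'a' and 'b' are character vectors of initials, e.g. c("A", "B") for "A.B.".
initials_compatible <- function(a, b, gungho = TRUE) {
  if (!gungho && length(a) != length(b)) {
    return(FALSE)                      # strict mode: same number of initials required
  }
  n <- min(length(a), length(b))
  all(a[seq_len(n)] == b[seq_len(n)])  # the shared leading initials must agree
}

initials_compatible("A", c("A", "B"))                  # TRUE: missing initial ignored
initials_compatible(c("A", "B"), c("A", "C"))          # FALSE: mismatched initials
initials_compatible("A", c("A", "B"), gungho = FALSE)  # FALSE: different counts
```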

Usage

tidy_initials( bib, gungho = TRUE)

Arguments

bib

character vector containing the bibliography contents. Use one of bib or file.

file

if supplied, this is used in place of bib (which is then ignored), to read the lines in from.

outfile

if supplied, the result will be written here.

gungho

TRUE if you prefer to assume that middle initials are easy-come-easy-go. Saves pain, but in rare cases could merge genuinely different people.

Value

The modified contents, as a character vector. If outfile is supplied, that file will be created.

See Also

tex2utf8, lyxzip2word

Examples

## Not run: 
# Illustrative only: a made-up bibliography where one author appears in two forms
bib <- c( '@article{psmith1999,', '  author = {Psmith, Alan},', '}',
    '@article{psmith2004,', '  author = {Psmith, A.},', '}')
tidy_initials( bib, gungho=TRUE)

## End(Not run)