Default file formats for research articles: my own story (so far)

I was going to write that I used to write articles in Latex and now I write them in Word (for some examples, see here), and then speculate about how the change in format might change how and what I write.

But then I thought that more background could be useful.

1. The pencil-and-paper era. Back when I was a kid I used to write everything by hand. I was proficient with the eraser. In high school I developed a style where I’d outline the English composition first and then write it.

2. Typewriter. In college we had to type our papers. Which I did, using my little typewriter (which had been my sister’s before that), until . . .

3. Mark’s homemade word processor. My college roommate was a CS major and had an Atari computer. I persuaded him to write a word processor in 6502 assembler, and I ended up using it more than he did.

I took a couple of English classes that required a paper every week or so, and to get everything done, I developed a system whereby I first diagrammed my plan on a single sheet of paper using circles and arrows, then wrote a series of outlines, the last of which had at least one sentence per paragraph of the final paper. I’d then take that outline, sit at the computer, and type up the paper pretty much in one take.

4. Troff. I wrote my senior thesis on a campus computer which printed out really nice—not that yucky dot-matrix stuff. I formatted it using Troff.

5. Pencil and paper again. For homework assignments in graduate school I went back to pencil and paper. I was still doing graphs by hand on graph paper (but that’s another story).

6. Latex. One of my colleagues told me about Latex, and this quickly became my standard. When I wanted to write an article, I’d take an old latex file and map out sections and subsections, then fill in different parts when I was ready. I did it this way for several books and a few zillion articles.

I have to admit, I’ve never learned Bibtex, so I spend lots of time cutting and pasting bibliographic references.

7. Html. A few years ago I started this blog (originally intended as a way for members of our research group to communicate with each other). I often use the blog to record thoughts that later are published more formally. Writing in Html puts these thoughts in a different shape than what was happening in Latex. More conversational, less locked into a formal structure.

8. Word. Recently, for some reason I’ve been writing articles (and our forthcoming book) in Word, which doesn’t work so well when I have formulas but somehow seems smoother otherwise.

The medium does affect the message. In many ways I’m dissatisfied with my current approach of composing at the keyboard. Maybe I’ll try pencil and paper outlining for a while.

Oh, yeah, . . . happy new year.

P.S. There are a bunch of comments below, but none of the commenters addressed my point, which was the way in which the typesetting or word processing environment affects the style of writing (and the choice of what to write about). Lots of suggestions about Latex implementations, but this is really beside the point here.

P.P.S. I just came across this post. It’s been over ten years! In the meantime, I’ve switched back to Latex as a default, but with what I think is a more attractive format (for example ). And sometimes I use Google docs for shared projects. The one thing I still can’t stand is “track changes.” Whenever someone sends me a document with “track changes,” I turn off that feature. Also those marked-up PDF’s, I hate them. And don’t get me started on Github. What a mess. Whatever. Latex and Google docs, they still do the job.

21 thoughts on “Default file formats for research articles: my own story (so far)”

  1. Bibtex would take you 1/2 hour to learn if you find a moderately well written intro, and look at some examples.

    Anything 2 pages or less, or up to say 5 pages that requires a moderate amount of unique visual formatting, can be written productively in Word.

    Anything like a book, a journal article, or a formal report for publication works much better in LaTeX provided that someone has produced a class file for your needs.

    One major advantage to LaTeX is that you can use computer programming version control systems (such as CVS, SVN, RCS, monotone, git, etc) to keep track of revisions and multiple authors' contributions.

  2. Lyx has become surprisingly good over time. (Certainly far better than the $500 SciWord.) Version 1.6 has lots of nice features for those that don't want to spend all their time looking at latex code. At the same time the latex output is pretty clean. Track changes is great for working with coauthors and the formula tools are incredibly robust, with macros and all sorts of advanced features too. Once you learn the keyboard combinations I find it vastly superior to working in latex directly. Bibtex is a breeze.

    I do miss inline spellchecking but that's on their todo list.

  3. I downloaded Lyx but I found it no better than my current Emacs/Latex setup. And Scientific Word is horrible; don't get me started on that.

    But the main point of my blog entry above was to reflect upon the ways that my word-processing tools have affected my writing style (and choices of what to write).

  4. I prefer to use LaTeX because: I like the fact that I don't have to worry about how stuff is going to look; equations and mathematical symbols are far, far, far easier to enter than in MS-Word's Equation Editor (which is the only alternative to LaTeX that I have considered); and I like the easy, semi-automatic numbering of equations and figures. But continuing to use LaTeX seems to be shoveling against the tide, and in fact, more than half of my papers in the past sixteen months or so have been written with MS-Word.

    First, many of my colleagues don't use LaTeX and don't want to learn it, so when we write a paper together it either has to be in Word or I have to make all of the edits.

    Second, many of the journals in which I publish either don't accept LaTeX submissions or they treat them in a second-class way somehow. I'm actually a bit puzzled about what to do with a paper that was just conditionally accepted by a journal that wants MS-Word files but accepts LaTeX. They want the reference list numbered like this: "(4) Smith and Jones,…" whereas the LaTeX default is "[4] Smith and Jones,…" I can probably figure out how to edit the LaTeX style file to make this happen…but is it even worth it? After all, they're not going to use the LaTeX file at all, they're going to convert everything to Word. But the author instructions state unambiguously that the author is required to put everything into the journal's format. If this journal would do as some others do, and provide a .sty file, there would be no problem here, but alas, they don't.

    I ought to just take the path of least resistance, which is MS-Word, but I'm just not willing to do that, so I continue to use LaTeX when I can. But it's increasingly painful.
    By the way, on the separate section of LaTeX editors, I've tried Lyx and hated it. I'm quite happy with the very simple TeXshop.
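
    For what it's worth, the "(4)" versus "[4]" change described above is often a small preamble tweak rather than a style-file edit. The following is only a minimal sketch, assuming the standard article class with its built-in thebibliography environment (no natbib or biblatex); the citation key `placeholder` and the bibliography entry text are stand-ins, not a real reference.

    ```latex
    \documentclass{article}

    % Assumption: standard article class, plain thebibliography environment.
    % \@biblabel formats the label in the reference list; the class default is [#1].
    \makeatletter
    \renewcommand{\@biblabel}[1]{(#1)}
    \makeatother

    \begin{document}

    A sentence that cites something \cite{placeholder}.

    \begin{thebibliography}{9}
    % Placeholder entry, for illustration only.
    \bibitem{placeholder} Smith and Jones, \ldots
    \end{thebibliography}

    \end{document}
    ```

    This changes only the labels in the reference list; in-text citations keep their square brackets. With natbib loaded, the analogous hook would be `\bibnumfmt` rather than `\@biblabel`, so the sketch would need adjusting for that setup.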

  5. It doesn't feel like I've seen enough here to decide whether anything's changed for you style-wise or subject-wise. Perhaps an annotated bibliography would help? (Easier said than done, of course.)

    Some questions that come to mind include:

    You mention that Word doesn't work so well for formulas: do you think you thus avoid formulas, perhaps concentrating more on concepts and results, or do you still use the same number of formulas but it's just something that's painful in Word?

    You mention that you basically outlined in LaTeX by taking a previous document and then setting up the sections and subsections first. (Perhaps this was because the overall LaTeX process was a little difficult, but setting up a section/subsection outline is a fairly easy way to get started and to get traction.) Do you not outline like that in Word? Do you use Word's outline capabilities?

    Are you also finding that Word is more accepted by your publishers and those you submit papers to? Or rather that you now favor different outlets because they accept Word? (And different outlets will tend to have different subject matter and styles.)

    Are you noticing any difference in the length of your papers? (E.g. shorter papers, which might benefit more from Word's more fluid style?)

    Personally, I've always used word processors — though I don't use Microsoft products, so don't use Word specifically — but I've been very successful lately in graduate school using Lyx and R (and graphviz, and OmniOutliner, and Omnigraffle, and Igor Pro, and Numbers, …), and part of the reason my papers have been well-received has been due to appearance: LaTeX's outstanding output and the excellent graphs I can make in R.

    By way of contrast, I've not been very happy with technical Word documents I've received — they've tended to be ugly and a bit sloppy — and I've been absolutely appalled by Excel (and other, more-technical products) graphs I've received.

    I have noticed that I have a different workflow and even thought process when using Lyx, at least in part because I have the need to output to PDF and preview in that format frequently. (Lyx's near-WYSIWYG is okay, but doesn't feel "real" to me as the super-nice-looking PDF does.) I can't quite put my finger on it otherwise.

    Based on your experience, what I'm reading in your blog, and feedback from partners on team-papers, it sounds like LaTeX may be on the way out. It's a bit odd that its replacement is also probably on its way out as well, though it will have a much longer tail.

  6. I write my articles in Outline View in MS Word. Using Outline View makes it easy to navigate through the hierarchy of headings in the document. Outline View can be used to zoom-out or zoom-in. Zooming-out is achieved by showing just the headings which provides an overview of the structure. To navigate to a particular section of the document, I zoom out and then move to the section that I want to work on.

    I use the short cut keys to navigate (e.g., – and + on the number pad) and structure (e.g., tab) the outline.

    I also often give each paragraph a title in the form of a heading in the outline view. I use a small macro to convert these paragraph titles into “hidden text”. Paragraph titles allow me to get a clear sense of the structure and how the material could be rearranged, although they are not intended to be displayed in the final document.

    I use two Word documents, one for the actual document and one for brainstorming and planning. I write both in Outline View. The brainstorming document is mainly written in an outline style format with a series of headings.

    For citations I use Endnote.

    I save document formatting to the very end. While writing the document I specify heading levels and body text. I then use styles to format.

    I don’t have a huge need for formulas, but my sense is that the newer MS Word 2007 equation editor is easier to use than the previous one.

    Thus, in response to the post, I find that writing articles in outline view facilitates my thinking about document structure and flow.

  7. I completely agree that the medium affects the message. This is very clear not only in papers, but in presentations, lecture material, etc. as well.

    I used to do everything (papers, talk slides, lecture material) in Latex. I recently had to write an article in Word, which made me very determined to keep using Latex as far as papers are concerned, even though the paper didn't have significant amounts of math.

    However, I've switched from Latex (Beamer) to the infamous Powerpoint in talk slides. This encourages (and is encouraged by) reducing the level of detail and math in my talks; the details can always be found in papers if someone is interested. I've even tried giving a talk without any slides at all.

    In lecture material where we want to have proofs and can afford a lot of detail, I still think Latex is very good.

    Teemu

  8. I think one way the medium affects the way one writes is the extent to which one can easily produce something that looks exactly the way it will look to the reader. I find it very hard to think something through properly if some major element of a text (graphics, unusual scripts, other features) can't be put in place while I'm working on the piece, but has to wait till it's handed over to a designer or typesetter. But I also find it hard to think something through properly if I have to take time out to get up to speed on some package / plug-in / other with a steep learning curve. Sometimes pencil and paper actually is the best solution, because I can integrate everything without having to think about anything but the text.

  9. Andrew, very good points. I'm a bit more eclectic, perhaps, for I currently use several approaches, but I agree that the tool can affect the thinking.

    I gave up on Word for two reasons. First, in carrying out a 50-100 document document management project, I was bitten two or three times by the "spaghetti numbering" bug. The pain of recreating documents so afflicted caused me to swear off Word, hopefully forever. I feel much safer making my documents in tools that have an underlying text-based format that can be repaired by humans. Second, when I discovered OpenOffice.org Write, I discovered they had done a better job at enabling me to write with styles rather than with direct formatting, and I think that focus on styles helps me organize better. I know it makes me more productive and gives my documents a better appearance.

    At some point, I discovered DocBook. I like its single-sourcing ability and its abundance of tags to give subtle visual highlighting to concepts, but it does require a bit more labor to enter (even using Emacs' excellent nxml-mode).

    All three–LaTeX, OO.o, and DocBook–help me organize my thinking better because they're not as oriented to direct entry: you have to (or are encouraged to) think before typing.

    I had also discovered asciidoc markup, which I tend to use for client reports these days. I still find Emacs the most productive way to place text on paper, and asciidoc offers just enough structure to keep the organization straight without adding noticeable complexity.

    Then I discovered that there's at least one book in the works using asciidoc, and that convinced me of its broader utility. While I still use beamer for presentations, there is an asciidoc to beamer processor, too. You can embed LaTeX math expressions in asciidoc, as well, and process them using dblatex (if you care, you can also embed music using LilyPond notation).

    How has asciidoc affected how and what I write? I think it's keeping my writing more organized and less complex. I think it's causing me to make even short notes (not this comment, unfortunately) more coherent and slightly more formal, for I tend to use asciidoc markup for almost everything, realizing I can easily turn it into HTML, PDF, ODT, or even RTF later if I wish without further editing. I think it's pushing a regression to the mean, with what would have been scribbled notes turning into asciidoc-formatted quarter pages and what would have been longer articles or reports getting a bit simpler and clearer.

    If I process the result with dblatex, I can get the benefits of LaTeX's layout engine. Because it can create embedded graphics in HTML using the data-uri scheme, I can easily send graphic-rich documents in HTML by email, as long as my recipients don't use IE (or do use IE 8 beta 1 or later).

    I still use LaTeX and DocBook directly for some work, and I still use OO.o on occasion, but most of my documents seem to pass through asciidoc these days.

  10. Wayne: Latex worked ok for books but it seemed to make my articles less readable. Somehow in Word I was able to achieve a more conversational style. Go to my recent unpublished papers for some examples. (But I still have to figure out where to submit some of these articles.) Writing in Html (i.e., blogging) also helped in that regard.

    Helen: I'm obsessive about the exact appearance of my books (and, to a lesser extent, my articles). But I'm thinking that maybe I've been a bit _too_ obsessive, spending time getting the graphs to look just right instead of doing more writing, research, or thinking about copyright issues (ugh). One nice thing about Word (and, similarly, Powerpoint) is that there's a practical upper limit for me on how good the final product _can_ look. Being forced to give up on getting it to look perfect allows me to relax and focus more on content, perhaps.

    Bill and others: Yes, as a practical matter, the reproducibility of Latex is a huge advantage. What I should probably do is spend a few hours making a template Latex document that looks like a Word article, then I can work with that from then on.

    To all: Yet another issue is that I want people to read what I've written. In this web-based era, perhaps html documents will be more widely read than pdf. Conversely, when I have a nice point in a blog post, I find it helpful to have an accompanying pdf article that people can read. Then people will take my idea more seriously, it's not just a blog entry. It's the one-two punch: the blog entry gets noticed and linked, and the pdf article gets respect. I probably should've written a pdf article to summarize my 2008 election summary….

  11. One other way that my tool set affects my writing: quite often I use a revision control system (CS-RCS under Windows, bazaar under Linux), and I find that makes me more likely to edit heavily because I know I can go back to an earlier version if I want to.

    Using a revision control system often seems easier with a text-based source document, but it worked with Word and CS-RCS, too. I'm not sure I got the space compression with Word, though.

  12. I had almost the exact same trajectory, even going from troff to LaTeX.

    I type so fast that I find pencil and paper very painful for text. I just wrote a bunch of post-holiday thank-you notes and it's really hard to write neatly now that I type almost everything. It certainly sets a different pace with the slowness and lack of corrections.

    I like to outline and do math with pencil and paper on a steno-sized graph paper notebook.

    I find that LaTeX really helps me to organize my notation better with macros. I'm very fussy about fonts and typesetting, so Word drives me crazy. (I'd recommend the movie Helvetica).

    What's really changed me, though, is blogging. I just tried to write a conference submission and found my blog "style" creeping in.

  13. I used to use LaTeX during my time as a Computer Science student, but I've moved back to Word for Polsci & Soc. Mainly because Zotero and Word integration is too easy to change to something else.

    Has anyone tried the Office OpenXML format with some sort of configuration management sys (RCS or SVN)?

  14. Just a quick comment regarding Word and Latex: You can use latex commands in Word 2007 to type up equations. There are some minor differences between actual latex commands and the commands you use in Word 2007, but for the most part they are the same. Perhaps, to an extent, Word 2007 offers the best of both worlds?

    See my blog post at http://www.svadali.com/blog/2008/10/28/equations-… for more details about this feature of Word 2007.

  15. Andrew: you mention that PDF is more "authoritative", but I view it as more convenient and aesthetically pleasing. If I think something's worth saving — call me Old School — I want a PDF, even if it means printing to PDF from Safari. (Preview lets me easily mark up PDFs with highlights, etc, and it's easy to search them with Spotlight. Plus, purpose-generated PDF looks much nicer than the vast majority of HTML, et al, and is free of ads.)

    I looked at your unpublished papers, though it's a bit hard to do the A/B comparison to published papers and decide. I would not have immediately guessed that they were done in Word, which is a good thing.

    Have you noticed whether using Word for your text has led you to use, say, Excel for your graphics? Or would you still tend to use something like R to generate PDF's and bring them into Word?

  16. I am not sure I can detect any difference in my writing style from one means to another. The change in audience has changed the style, but the means has changed it little if any, at least as far as I can discern.

    However, the means has dramatically changed the way I think about problems. I have never got the knack of doing algebra in an electronic format. Given the need for menus or keystroke combinations for things like integral or derivative symbols and especially for indexing, pencil and paper is much faster. But the means seems to affect my speed of thinking. With a pencil and paper, I feel I can't write fast enough, but with the slower equation typing, I find I am thinking slower than I write.

    I also suspect that the choice of pencil and paper versus math notation word processing can make a difference in what I write because the difference in the methods of writing equations or expressions changes what I think about them immediately after I have written them, and will (presumably) change what I want to say even after some reflection.

  17. Andrew,

    I just read your PS statement and I am not quite sure my contribution falls into that category. Using a mind mapper software like Freemind has a definite feel that is different from the pencil and paper approach or the more linear Latex/text editor approach. In effect, I really think the end product at that stage is different from a more traditional latex/typesetting approach.

    If you try it, you can build something real fast using three keys: Insert, F2 and your mouse to move the bubbles around.

    Cheers,

    Igor.

  18. This discussion reminds me of a comment made by Donald Knuth: he wrote somewhere that he writes the first draft of "The Art of Computer Programming" using pen and paper because this is well synchronized with his speed of thinking; if he uses the computer directly, he types too fast and his thoughts cannot follow. Then, afterwards, he uses TeX, as one would expect from him.

    This looks like a nice example of how the chosen medium influences what one writes.

  19. I have used computers since the 70's and moved from a layout system modeled on TeX to Word (from Word 5.1 on a Mac SE, to today's Word 2003), and recently LyX.

    I have had many frustrations working with Word, and this autumn I had to do a paper where only a PDF was required.

    After some learning, and lots of package loading I have been pleased with LyX as a LaTeX front end.
    Apart from the often cited things of bibliographies and X-refs (which are great, better implemented than Word and easier to use) I love the live outline, which, unlike Word, you can use at the same time as your normal view.
    And, actually there are several other usability features that put LyX ahead of Word for my daily use.
    One example is the option to always work on the documents tabs you had open last time. This is SO simple, Word has never done this, and it saves me 30 secs of energy as I start my document, which as you all know is a crucial moment ;-)

    After experimenting with different packages I settled on the KOMA classes and I am very pleased with the result. The paper is not on the web yet; when it is, I will come back and point to it.
    (It is about SAS V9.2 graphics so it might be of interest to people here anyway.)

    For note making & 'creating' I use both Tiddlywiki and Tinderbox (from Eastgate systems). And sometimes outline mode in Word or PP. And I use paper a lot with pseudo mind-maps and notes scribbled everywhere, :-(
    I would actually use TB a lot more if I had a Mac at work. But I don't, so there it is.
    I have also used TB successfully as an R journal (program plus output) because it is so easy to copy & paste stuff in. Then organise it later.
    I am aware of the various Weaves but TB has the advantage that the various ways of exporting can be run starting from any point – so a TB file can have more than one document in there. EG a paper PLUS all the notes PLUS a presentation.

    The other VITAL tool I have is dropbox. I discovered it earlier this year and it is indispensable.
    I have just a free account (2GB) and all my current work goes there. When finished I move the folders out to permanent storage (& CD-ROM backup) on the appropriate machine (Client, my company or private).
    Because you install dropbox on all your machines, it is a life saver when you find that one change to make when you get home. And you left the company laptop at work, where it should be.
    And also when working on my LyX document I could easily sync it from a WiFi cafe, long before I finished my Espresso :-)
    And it keeps up to 30 versions (free) so it saves the effort of setting up a version control system, if you just want revertability and many backups.

    Dave
