GitHub + LaTeX for collaborative writing


(Dave Murphy) #1

hi folks,

does anyone have experience of doing collaborative writing with LaTeX + GitHub that they would care to share? Any articles, links, etc. would be very welcome.

cheers
rgds Dave


(Ugo Pattacini) #2

Have you ever tried Overleaf?

It is widely used in academia since it lets peers collaborate online while writing up articles in LaTeX, and it can use GitHub for storage.


(Dave Murphy) #3

Thanks Ugo,

I’m trying to use an ‘open source’ solution. Git + LaTeX should work in theory, but I’d like to hear whether others have experience of using this combination.

I’ve used others such as ShareLaTeX, but our experiences haven’t been satisfactory.

cheers
rgds Dave


(Dave Murphy) #4

I came across some interesting resources (below) that address collaborative LaTeX on GitHub.
Things to watch out for include:

  1. Only commit the salient files, e.g. .tex and .bib files (and any media files). Don’t commit cache files or files generated by the LaTeX rendering process (.aux, .log, .bbl, etc.). You can set up a .gitignore to ignore those files.
  2. Develop a strategy for avoiding merge conflicts. Some prefer to create separate branches for different document sections; others simply agree to work only on specific, allocated files.
  3. Don’t include PDFs in the commit. It’s a pain because in most LaTeX environments the master PDF is re-rendered (= modified) every time. So consider not using the PDF format for input documents, such as images or existing PDFs that you integrate, and include PDF in the .gitignore.

Tip: If you create the Repo on GitHub (which I recommend) you can automatically create the .gitignore file, which has a template for filtering out ‘noisy’ (unnecessary) TeX files.

as my experience with this develops I’ll add more to this posting.

Good Resources:
Medium article - https://medium.com/@rvprasad/a-git-workflow-for-writing-papers-in-latex-4cfb31be4b06
Youtube Playlist - https://www.youtube.com/playlist?list=PL8y6P9DSfNarZpzGmLYRpCfB1Pgt90OJi


(Gerardo Marx Chávez-Campos) #5

Hi, I have used Git + LaTeX for collaborative writing for 3 or 4 years. I would recommend the following:

  • Use the README file to define the contents of the document, the contributors, and how the auxiliary files (figures, sections of code, bibliography) should be placed in the repository.
  • Try to create a specific document class first, and explain in the README file how to use it.
  • Explain to your students how the review process will work, to avoid commits for every step or paragraph written.
  • Create a .gitignore file to exclude auxiliary files (*.aux, *.toc, and so on).
  • Base your review process on the way editors work: make each collaborator responsible for a task (writing, editing, figures, etc.) by creating Issues or branches for each part.

I think it would be a good idea to develop a method for better understanding and use of Git in the academic writing process. I hope these ideas help you, best regards.


(Dave Murphy) #6

Some good points there Gerardo, especially something as simple as a README which can be very effective. :+1:


(Alexander L. Hayes) #7

Hey! I do a fair bit of collaborative writing, so I’ll share some of my process.

General Project Setup:

CollaborativeWritingProject/
├── .git/
├── .gitignore
├── compile.sh
├── LICENSE
├── ProjectTitle
│   ├── chapters/
│   │   ├── chapter00_front_matter.tex
│   │   ├── chapter01_introduction.tex
│   │   └── chapter02_main_topic.tex
│   ├── images/
│   │   └── creative_commons88x31.png
│   ├── Main.tex
│   └── .gitignore
├── README.md
└── StableCopy.pdf

My writing repositories look something like the above. The base of the repository has some info about what is being worked on, a recent copy of the .pdf, and the LICENSE/README/ other things a repository generally needs.

The project itself lives in a src/ folder (that I usually name after the title). This will contain a main .tex document, as well as subdirectories for individual chapters/sections and images.

The Writing Process

This is the main point I deviate on. I like using branches to group work as either stable or development. Keeping the most recent copy of the .pdf around can be useful if you have friends who are nice enough to help you proofread or discuss feedback.

This gets into a workflow somewhat like the following: keep a stable copy of the .pdf at the base of the repository and create a new copy during merges/rebases.

|
Stable .pdf at base of repository
|\
| \
|  Tweaking a section
|  |
|  Fixing some typos
| /
|/
Merge and create new stable document
|

Git allows you to create different .gitignore files for different directories. The .gitignore under the ProjectTitle/ directory tends to be structured like this (this is somewhat tuned to using Atom + pdflatex when writing so your exact requirements may differ).

The .gitignore specific to the internal src/ directory should ignore things like log files and “in-progress” versions of the .pdf that we don’t want to be tracked.

*.aux
*.bbl
*.dvi
*.fdb_latexmk
*.fls
*.out
*.pdf
*.synctex.gz

Using subfiles to divide chapters

If your work can be divided into sections or chapters, then using individual files can be like using modules or classes in a programming language. I tend to find this even helps me focus, and helps me be less tempted to switch into editing a different chapter when I’m working somewhere else.

The main file tends to be where formatting, document structure, and a few other things will be defined. This way you can define how your document should look in a single file, then focus on the content in the individual chapters.

My Main.tex file tends to look something like the following, and makes use of the subfiles package to combine parts of the document:

% Overview:
%   Main TeX file for the project.
%   Subfiles should reside in the chapters/ directory;
%   and should have a chapter number, title, and a .tex extension.
%
% Build:
%   $ pdflatex Main.tex

\documentclass[usenames,dvipsnames,letterpaper]{article}
\usepackage[left=2cm, right=2cm, top=2cm]{geometry}
\usepackage{subfiles}

\begin{document}
\title{ProjectTitle}
\author{Alexander L. Hayes\\
The University of Texas at Dallas\\
alexander@batflyer.net}

\maketitle

\subfile{chapters/chapter00_front_matter.tex}
\subfile{chapters/chapter01_introduction.tex}
\subfile{chapters/chapter02_main_topic.tex}

\end{document}

Beyond that, the individual chapters can have your content.
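
A chapter file for this layout might look like the following sketch. It follows the usual subfiles convention of pointing back to the main file in the \documentclass option; the section title and body text are placeholders, not content from the actual project.

```latex
% chapters/chapter01_introduction.tex
% Hypothetical chapter file: the section title and text are placeholders.
% The optional argument is the path to the main file, relative to this file,
% so the chapter reuses the preamble defined in Main.tex.
\documentclass[../Main.tex]{subfiles}

\begin{document}

\section{Introduction}
Placeholder text for the introduction.

\end{document}
```

With this structure, running pdflatex on the chapter from inside chapters/ should typeset only that chapter with the main file's preamble, while compiling Main.tex pulls in every chapter.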


(Dave Murphy) #8

hi Alexander, that’s a great workflow and many thanks for sharing.
The reason I don’t incorporate the PDF is that with our LaTeX setup, every render (which happens frequently) generates a new PDF, which can clutter the logs and commit history. But I also see the advantage of having a current version of the PDF for proofreading and distributing.

I like your structured approach. I use something similar, but with \include instead of \subfile. I’ve never actually used \subfile, but since your reply I did a bit of reading on it, and it looks like a very powerful package for compartmentalising the writing and rendering each part separately. Very cool!
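
For comparison, the \include approach mentioned here might look like the sketch below. It borrows the chapter file names from the layout earlier in the thread as an assumption; the actual setup may differ.

```latex
% Main file using \include instead of \subfile.
% A sketch only: the file names are assumed from the layout earlier in the thread.
\documentclass{article}

% Optional: uncomment to typeset only the listed chapters while reusing
% the cross-reference data from the last full run.
% \includeonly{chapters/chapter01_introduction}

\begin{document}

% \include is conventionally given the file name without the .tex extension
% and starts each included file on a new page.
\include{chapters/chapter00_front_matter}
\include{chapters/chapter01_introduction}
\include{chapters/chapter02_main_topic}

\end{document}
```

Unlike a subfile, an included chapter contains no \documentclass or \begin{document}, so it cannot be compiled on its own.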

thanks for sharing :+1:


(Łukasz Łaniewski-Wołłk) #9

What I can recommend is:

Every sentence on a new line.

(Plus no line wrapping, which is the default in most TeX editors.) I read this somewhere early on, and it helped me a lot. The idea is that because a new paragraph in LaTeX is started by a blank line (two consecutive newline characters), a single line break counts as a space, so you can write every sentence on a new line. The main advantages are:

  • It makes working with Git and GitHub much clearer, because diffs show exactly which sentences changed. Really. It saves lives.
  • From an editing point of view, you can clearly see oversized sentences and fix them.
  • It’s harder to make ugly punctuation mistakes.
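
A minimal sketch of the one-sentence-per-line style, with made-up example sentences:

```latex
\documentclass{article}
\begin{document}

% Each sentence sits on its own source line.
% A single line break is treated as a space, so the three
% sentences below are typeset as one paragraph.
Git compares files line by line.
With one sentence per line, changing a sentence touches only that line.
The resulting diff is short and easy to review.

% The blank line above ends the paragraph; the next sentence starts a new one.
This sentence begins the second paragraph.

\end{document}
```

The typeset output is the same as if the first paragraph had been written on a single long line; only the source layout, and therefore the diff, changes.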

Other things from my experience:

  • Dividing the core text into files is overrated (unless you’re writing an epic 1000-page novel).
  • Use the glossaries package for acronyms and symbols. Apart from being good practice, it saves a lot of trouble in collaboration.
  • I completely don’t understand the comment about PDFs. Of course don’t include your output PDF, but do include your input PDFs, like images. You can exclude the main PDF in .gitignore explicitly (main.pdf in place of *.pdf).
  • Agree at the start who the main author is, then fork/branch for all co-authors, and have the main author merge all pull requests (with meld).
  • There is a nice tool called latexdiff, which takes two .tex files and generates a .tex file with the differences marked up, which you can then compile. It’s a good visual aid.
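
As a rough illustration of the glossaries tip, the sketch below defines one acronym and one symbol in a single place so that every co-author typesets them the same way. The entries (cfd, viscosity) are hypothetical examples.

```latex
\documentclass{article}
\usepackage[acronym]{glossaries}
\makeglossaries

% Hypothetical entries for illustration; a shared project would
% typically keep these together in one file.
\newacronym{cfd}{CFD}{computational fluid dynamics}
\newglossaryentry{viscosity}{
  name={\ensuremath{\nu}},
  description={kinematic viscosity}
}

\begin{document}

% On first use the acronym is printed in full together with its short form.
The first use expands the acronym: \gls{cfd}.
% Subsequent uses print only the short form.
Later uses print only the short form: \gls{cfd}.
The symbol \gls{viscosity} is typeset the same way everywhere.

\printglossaries

\end{document}
```

The printed glossary only appears after running the makeglossaries tool between LaTeX runs. Changing a definition in one place then updates every use across all co-authors' text.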

Have a nice day.


(Dave Murphy) #10

hi @llaniewski

that’s a really good collection of tips.
To clarify, my comment about the PDFs is only on output pdfs, not input pdfs. Although for images I tend to use png files.

The newline for each sentence is a great tip, one I’ve seen other collaborators use, so that’s going into my LaTeX + Git toolbox :wink:

I don’t use meld (I’m on a Mac), but FileMerge (called opendiff on the command line) does pretty much the same thing + a nice gui.

glossaries is a clever way of standardising on aspects of writing and style - nice tip :+1:

this is becoming a great thread of tips and suggestions.

Does anyone have an opinion on rebasing vs merging for LaTeX + Git collaborative writing?