Diffing Collaborative Text
A common piece of advice I've come across is to write text documents (such as LaTeX or Markdown files) with each sentence on its own line. This makes it easier to diff documents that are written collaboratively.
Emacs offers a function, `fill-paragraph`, which breaks the lines of a paragraph so that each one fits within a given column width. The result might look like this:
```text
A [common
advice](https://github.com/Wookai/paper-tips-and-tricks#one-sentence-per-line)
I've read in the past suggests, that we should write text documents (like LaTeX
documents or Markdown documentation) in a way, so that each sentence is put on
its own line. This gives us an easier time to create diffs of documents which
are created in a collaborative fashion.
```
The output of `fill-paragraph` looks decent, but a small edit can reflow the whole paragraph and create massive diffs which are hard to reason about. There are several questions on StackOverflow (such as this one) where people ask for a customized version of `fill-paragraph` that formats a piece of text so that each sentence starts on a new line. I've recently started reading the Emacs Lisp introduction tutorial (which ships with Emacs itself), so I tried to come up with my own solution to this problem. I took some inspiration from the above StackOverflow post. Here's the code:
```elisp
(defun fw/unfill-paragraph ()
  "Unfill the paragraph at point."
  (interactive)
  ;; A huge `fill-column' makes `fill-paragraph' join everything into one line.
  (let ((fill-column (point-max)))
    (fill-paragraph)))

(defun fw/wrap-at-sentences ()
  "Fill the current paragraph, but start each sentence on a new line."
  (interactive)
  (save-excursion
    (fw/unfill-paragraph)
    (mark-paragraph)
    (while (< (point) (region-end))
      (forward-sentence)
      ;; We don't want to add a new line at the end of the paragraph
      (when (< (+ (point) 1) (region-end))
        (newline-and-indent))))
  ;; The selection will not be cleared if there is only one sentence in a paragraph
  (deactivate-mark))
```
The above code works for the most part, but there are still two edge cases which annoy me:
First, Emacs might treat abbreviations such as "e.g." or "i.e." as the end of a sentence, which means that a single sentence might end up on more than one line. This behavior depends on the value of your `sentence-end-double-space` variable, but we can still construct examples in which `forward-sentence` does not behave as it should. Here's an example:
```text
This sentence, which contains the phrase e.g.
and because of how the new lines are put, Emacs interprets it as two sentences.
```
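This edge case can be reproduced outside of an interactive session. The following is a minimal sketch, not part of the solution above: `fw/sentences-in` is a hypothetical helper that inserts a string into a temporary buffer and collects whatever `forward-sentence` considers a sentence, with `sentence-end-double-space` bound to the value you pass in. It assumes the default `sentence-end` settings of a fresh temporary buffer.

```elisp
;; Minimal sketch; `fw/sentences-in' is a hypothetical helper, not part of Emacs.
(require 'subr-x) ; provides `string-trim' on older Emacs versions

(defun fw/sentences-in (text double-space)
  "Return the sentences `forward-sentence' finds in TEXT as a list of strings.
DOUBLE-SPACE is used as the value of `sentence-end-double-space'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((sentence-end-double-space double-space)
          (sentences '())
          (moved t))
      (while (and moved (< (point) (point-max)))
        (let ((start (point)))
          (forward-sentence)
          ;; Stop if point did not advance, to avoid looping forever.
          (setq moved (> (point) start))
          (when moved
            (push (string-trim (buffer-substring-no-properties start (point)))
                  sentences))))
      (nreverse sentences))))

;; Abbreviation followed by a single space: treated as one sentence.
(fw/sentences-in "This sentence contains e.g. and is one sentence." t)

;; Abbreviation followed by a line break: treated as two sentences.
(fw/sentences-in "This sentence contains e.g.\nand is one sentence." t)
```

With `sentence-end-double-space` set to `t`, a period followed by a single space is not a sentence end, but a period at the end of a line still is. That is why a paragraph that was previously filled can trip up `forward-sentence` even with this setting.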
Second, the `markdown-mode` package behaves ambiguously with lists. Depending on which operations you perform, a list item might end up formatted in two different ways. An example illustrates this:
Here's the initial text on one line:
```text
- Some example sentence which contains no content. Another pointless example sentence. A third sentence.
```
If we put a new line right after every sentence, the text will end up like this:
```text
- Some example sentence which contains no content.
Another pointless example sentence.
A third sentence.
```
If we instead indent the second sentence by hand before breaking the remaining sentences, every following sentence picks up the correct indentation:
```text
- Some example sentence which contains no content.
  Another pointless example sentence.
  A third sentence.
```
Both versions seem to be valid Markdown (well, at least for every interpreter that I've tried), but I'd still prefer the second version.
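One way to get the second version deterministically is to compute the indentation from the list marker rather than relying on `newline-and-indent`. The following is only a sketch under a few assumptions: `fw/break-list-item` is a hypothetical helper that works on a plain string, it does not use `markdown-mode` at all, and it detects sentence ends naively, so it shares the "e.g." problem described above.

```elisp
;; Minimal sketch; `fw/break-list-item' is a hypothetical helper that works on
;; a plain string and does not depend on markdown-mode.
(defun fw/break-list-item (text)
  "Return TEXT with each sentence on its own line.
If TEXT starts with a list marker, continuation lines are indented to
align with the text after the marker.  Sentence ends are detected
naively as a period, question mark or exclamation mark followed by
whitespace."
  (let* ((marker-width
          ;; Optional indentation, then a bullet or a number, then whitespace.
          (if (string-match "\\`[ \t]*\\(?:[-*+]\\|[0-9]+[.)]\\)[ \t]+" text)
              (match-end 0)
            0))
         (marker (substring text 0 marker-width))
         (body (substring text marker-width))
         (indent (make-string marker-width ?\s)))
    ;; Only the body is rewritten, so the period of a marker like "1." is
    ;; never mistaken for a sentence end.
    (concat marker
            (replace-regexp-in-string "\\([.?!]\\)[ \t]+"
                                      (concat "\\1\n" indent)
                                      body))))

(fw/break-list-item
 "- Some example sentence which contains no content. Another pointless example sentence. A third sentence.")
```

Applied to the list item from the example, this yields continuation lines indented by two spaces, which matches the second version. Text without a list marker gets no extra indentation.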