Showing posts with label Computers. Show all posts

11 February 2025

Reformatting paragraphs in Project Gutenberg ebooks

I would be the last person to criticise the vast repository of public domain literature at Project Gutenberg. However, some titles there are formatted in a way I don’t much like. Paragraphs may either be block formatted (as in this post) or be indented and also separated from one another by white space.

The fix is simple and needs only a modicum of computer skills.

There are two basic file formats for ebooks. The industry standard is epub; the Amazon Kindle uses mobi. (The latter is an older format, but readable by all Kindle devices.) An epub is just a zip file containing text and the instructions for displaying it. The innards of an epub are complicated; luckily we do not have to delve into them too deeply.
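Because an epub is just a zip file, its contents can be listed with any zip tool. The following is only a minimal sketch in Python, using the standard zipfile module. The file names in the sample archive are invented for illustration, and a real Project Gutenberg epub will contain more files than this.

```python
import io
import zipfile


def list_epub_contents(epub):
    """Return the names of the files stored inside an epub (a zip archive).

    `epub` may be a path to a file on disk or an open binary file object.
    """
    with zipfile.ZipFile(epub) as archive:
        return archive.namelist()


def build_sample_epub():
    """Build a tiny epub-like archive in memory (hypothetical file names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<p>Hello</p>")
        archive.writestr("OEBPS/0.css", "p { text-indent: 1em; }")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_epub_contents(build_sample_epub()):
        print(name)
```

To inspect a downloaded book, pass its path instead, for example list_epub_contents("book.epub"), where the file name is whatever you saved it as.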

If you use a Kindle, there is an extra step involved to convert your newly tweaked epub to mobi (of which more later).

Sigil is a free-to-use application for editing epub files. The left-hand column shows all the files contained in the epub. The middle column shows the editable material (whether the text of the book or the instructions for its display). The right-hand column, when preview mode is selected, shows how the text will look on an ereader; it can also display the Table of Contents.

The files we are interested in are in the folders ‘Text’ and ‘Styles’. As one would expect, ‘Text’ contains the text of the book. It is formatted as HTML – a file extracted from this folder will display in any browser, but use only that browser’s defaults for text size, heading style, etc.

The instructions for displaying the text in an ereader reside in the ‘Styles’ folder. The file or files there are in CSS (cascading style sheet) format. We need to edit part of a style sheet in order to modify the appearance of the text.

Having downloaded your epub from Project Gutenberg, load it into Sigil. Open the ‘Text’ folder. Click on one of the files there to make its content appear in the Preview window on the right. If the formatting of the paragraphs is not to your taste (in this example, there is extraneous space between them), open the ‘Styles’ folder and find the style sheet which defines the properties of paragraphs – Styles/OEBPS/0.css in this case.

In HTML, each paragraph is enclosed between <p> and </p> tags. The statement we are looking for is the rule in the style sheet that applies to p.

It causes the first line of each paragraph to be indented by 1 em. The ‘em’ is a printer’s measure adopted for CSS; you may find ‘px’ (pixels) used instead. Here the top (‘margin-top’) and bottom (‘margin-bottom’) of the paragraph are each given 0.25 em of inserted space. By changing these values to 0em, the extraneous white space between paragraphs will disappear.
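In Sigil the change is made by hand, but the same edit can be sketched as a small script. What follows is a rough illustration in Python, not the method used in this post. The sample rule is an assumption, reconstructed from the description above (a 1 em indent with 0.25 em above and below), and the function name is made up. It handles only a plain p { ... } rule and the long-hand ‘margin-top’ and ‘margin-bottom’ properties, so a style sheet written differently would still need editing by hand.

```python
import re

# Hypothetical rule, reconstructed from the description in the post.
SAMPLE_CSS = """\
p {
    text-indent: 1em;
    margin-top: 0.25em;
    margin-bottom: 0.25em;
}
"""


def remove_paragraph_spacing(css):
    """Set margin-top and margin-bottom to 0em inside each plain `p { ... }` rule.

    Other selectors (h1, div p, p.classname and so on) are left untouched.
    """
    def fix_rule(match):
        # Rewrite only the two margin values inside the matched rule body.
        body = re.sub(
            r"(margin-(?:top|bottom)\s*:\s*)[^;}]+",
            r"\g<1>0em",
            match.group(2),
        )
        return match.group(1) + body + match.group(3)

    # A rule that starts a line with the bare selector `p`.
    return re.sub(r"(^\s*p\s*\{)([^}]*)(\})", fix_rule, css, flags=re.MULTILINE)


if __name__ == "__main__":
    print(remove_paragraph_spacing(SAMPLE_CSS))
```

The indent is left alone, so paragraphs remain distinguishable once the space between them has gone.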

The margin property is explained here; you may find ‘margin-top’, ‘margin-right’, ‘margin-bottom’ and ‘margin-left’ abbreviated, as the article explains.

Kindle users will need to convert the epub file to the mobi format. This is easily done with Kindle Previewer (Windows and Mac) or calibre (Windows, Mac, Linux).

30 August 2014

Some superb fonts

If you spend much time staring at computer text, you ought to consider very carefully which fonts you use.

Philipp H Poll and his team have provided us with the elegant and readable Linux Libertine. It outclasses Times New Roman by a country mile. The package includes Linux Biolinum, which is an open-source replacement for Linotype’s Optima.


Linux Libertine looks great when printed, and if you want serifs on your display font then Libertine is your man. However, I’m coming to prefer a sanserif face, and a monospaced one at that, and Raph Levien’s crisp and humane Inconsolata is now my first choice for the screen.

9 March 2012

Imagination at work

Cyriak Harris is a British animator, working in Brighton, with his own unique vision. Have a look at this video for We Got More by Eskmo (be sure to view in Full Screen mode):


cows & cows & cows is even more representative of his stunning talent:


Some of his work is disturbing as well as very clever and funny. If you wish to visit his site, consider yourself warned ...

12 March 2011

Using OpenOffice Writer for drafting

I have just finished drafting a new novel. It was composed using OpenOffice Writer running under Linux, and has now migrated to Microsoft Word on a Mac for further formatting.

OpenOffice uses a much more compact file-format than Word. The finished book occupies 305 KB in odt format and 1.5 MB in doc format. At the end of every working day during the drafting period, I saved what I had written so far with a name like “101103.odt” (being the saved file for 3 November, 2010). The result is a directory containing 121 cumulative versions of the book, totalling 16.7 MB.
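I made these daily copies by hand, but the naming scheme is easy to automate. The sketch below, in Python, is only one possible way of doing it; the function names and the idea of copying a working file into an archive directory are assumptions for illustration, not a description of how the drafts were actually saved.

```python
import shutil
from datetime import date
from pathlib import Path


def dated_draft_name(day, extension="odt"):
    """Return a name such as 101103.odt (two-digit year, month, day)."""
    return f"{day:%y%m%d}.{extension}"


def save_snapshot(working_file, archive_dir, day=None):
    """Copy the working file into archive_dir under that day's dated name."""
    day = day or date.today()
    working_file = Path(working_file)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    extension = working_file.suffix.lstrip(".") or "odt"
    target = archive_dir / dated_draft_name(day, extension)
    shutil.copy2(working_file, target)  # copy2 also keeps the timestamps
    return target


if __name__ == "__main__":
    print(dated_draft_name(date(2010, 11, 3)))
```

Names built this way sort chronologically within a given century, so the directory listing doubles as a timeline of the draft.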

Thus I have a permanent record of the drafting process, complete with all the errors, cul-de-sacs and general groping for direction that accompany the construction of any novel, no matter how well planned it may be – and on this project I dispensed with my usual synopsis and flew by the seat of my pants.

I am making this post for the benefit of other authors. Such a record might prove instructive long after composition. It retains ideas and passages that, even if at first rejected, you may decide to use later. Finally, it provides incontrovertible proof of authorship, should there ever (heaven forfend!) be a need to produce it.

OpenOffice has become very stable and sophisticated, and if you haven’t checked it out recently or at all I recommend that you do. It crashed three times during perhaps 500 instantiations, which beats Word on the Mac hands down; and only crashed at all when I was doing unusual things. The autosave feature is configurable. Very little is lost even if a crash occurs. The flavour of Linux I used is Linux Mint, which is an Ubuntu derivative I can also recommend.