BreakIntoLines Project

Description

This project is about a Java program that vectorizes a string into Java Path2D.Float paths, breaking it into lines and right justifying the text.

It started with a PostScript function, shown in the project π Vector GUI for Java and Android. But PostScript was only a prototype language, and it takes only Type 1 fonts, whose kerning pairs cannot be accessed because they are encrypted. With the project Vector Fonts: Information Extraction - Glyphs, Their Widths and Kerning Pairs, the kerning pairs and other font information can be read directly from the font files. This is the final version, which generates text with OpenType and TrueType fonts without needing the original font files, which makes it suitable for embedded applications.

After a font is converted with the modified Glyph Inspector, which generates a font class that can be embedded into any Java program, the instance of this font class (font classes are singletons) is used to convert strings into paths. For each character in the string, the program looks up the glyph in the font, applies the appropriate translation and scale, and writes the path for the glyph. Before moving on to the next character, a variable holding the translation is incremented by the glyph width and by the kerning width for that pair of characters. Because of the justification feature, kerning is applied only inside words, not between them. The space between words is calculated by the program so that the text is justified to the right.

When a line reaches the given line width, it is shown right justified, and the y coordinate of the next line is incremented by a value passed as a parameter.

The program can handle 256 simultaneous fonts for each string. Each string becomes a new paragraph as described above. Font switching is done with character /000 followed by a character carrying the index of the font, that is, a character between /000 and /377 in octal (0 to 255). These characters can appear anywhere in a word, and they are not counted as actual characters. Other non-printing characters can also be used as commands in the same manner.
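The font-switching convention can be sketched as follows. This is only an illustration under stated assumptions, not the project's code: the names FontSwitchDemo, Run and splitRuns are hypothetical, and the text is assumed to start in font 0.

```java
import java.util.ArrayList;
import java.util.List;

class FontSwitchDemo {
    /** A run of printable characters sharing one font index (hypothetical helper type). */
    static final class Run {
        final int fontIndex;
        final String text;
        Run(int fontIndex, String text) { this.fontIndex = fontIndex; this.text = text; }
    }

    /** Splits a word into runs, treating '\000' followed by an index character as a font switch. */
    static List<Run> splitRuns(String word) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int font = 0; // assumption: text starts in font 0
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == '\000') {
                // Close the run built so far; the command characters are not printed.
                if (current.length() > 0) {
                    runs.add(new Run(font, current.toString()));
                    current.setLength(0);
                }
                // The next character carries the font index (0..255);
                // a switch character at the very end of the word is ignored.
                if (i + 1 < word.length()) {
                    font = word.charAt(++i) & 0xFF;
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            runs.add(new Run(font, current.toString()));
        }
        return runs;
    }
}
```

For example, splitRuns("ab\000\001cd") yields the run "ab" in font 0 followed by the run "cd" in font 1.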

Implementation Details and Data Structures

Converting Strings to Glyphs

The initial text is a string that is to be converted into a formatted paragraph. The first step is to separate the words of the string with the split method of class String. This method generates the array of strings words, where each position holds one word of the original string. Next, the array paragraph, of the same size, is created, holding in each position a Path2D.Float path with the glyphs of the characters of the corresponding word, as shown below:

This process is done by looking up each glyph in our embedded font (which is a Java class, as explained in the project Automatic Vector Fonts Generator Project – Glyphs, their widths and kerning pairs). Each glyph of the font is defined as a Path2D.Float path, composed of moveTo, lineTo, quadTo, curveTo, or closePath segments with their respective point coordinates. These coordinates are all defined in the font coordinate system, in font units, with unitsPerEm units to the em. A glyph is supposed to fit a 1 point square, so its scale is 1÷unitsPerEm. To scale glyphs to their proper size, coordinates must be multiplied by (font size)÷unitsPerEm.
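As a small worked example of the scale factor, the sketch below uses illustrative values only: 1000 units per em is common in PostScript-flavored OpenType fonts, but the value for a given font must be read from that font. The class name GlyphScaleDemo is hypothetical.

```java
class GlyphScaleDemo {
    /** Scale factor that maps font units to points for a given font size. */
    static double scale(double fontSize, int unitsPerEm) {
        return fontSize / unitsPerEm;
    }

    public static void main(String[] args) {
        // Assumed values: a 20 point font with 1000 units per em.
        double s = scale(20.0, 1000);
        System.out.println(s);       // 0.02
        System.out.println(500 * s); // a 500-unit coordinate becomes 10.0 points
    }
}
```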

Once the glyphs are found, they are traversed one by one, segment by segment, and their coordinates are transformed accordingly. The first glyph is not translated, since it is assumed to be placed at the origin. The x coordinates of subsequent glyphs are incremented by the accumulated widths of the previous glyphs plus the kerning widths of their kerning pairs. Finally, both coordinates are multiplied by the scale ((font size)÷unitsPerEm).
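The accumulation of translations can be illustrated with plain arithmetic. In this sketch the class AdvanceDemo and the parallel arrays widths and kerns are hypothetical stand-ins for the values stored in the font class, and all numbers are in font units.

```java
class AdvanceDemo {
    /**
     * Returns the x offset (in font units) of every glyph in a word.
     * widths[i] is the width of glyph i; kerns[i] is the kerning width
     * between glyph i and glyph i + 1 (0 when the pair has no kerning).
     */
    static int[] offsets(int[] widths, int[] kerns) {
        int[] dx = new int[widths.length];
        int acc = 0; // the first glyph stays at the origin
        for (int i = 0; i < widths.length; i++) {
            dx[i] = acc;
            acc += widths[i];
            if (i < kerns.length) {
                acc += kerns[i];
            }
        }
        return dx;
    }
}
```

With widths {500, 600, 400} and kerning widths {-50, 0}, the offsets are {0, 450, 1050}. Multiplying each offset by the scale gives the translation in points.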

The resulting glyphs are stored in a Path2D.Float path, as described above. But a path cannot store a very important datum: the total word width, in other words, the width of the whole path. That is why a new class Word is defined, inheriting from Path2D.Float. The width of the path is stored in the new class, and the Word class is used in place of Path2D.Float. Its definition in Java is as shown:

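import java.awt.geom.Path2D;

// A path that also remembers the total width of the word it holds.
class Word extends Path2D.Float {
	double size; // width of the whole path, already scaled

	public void setWidth(double translation) { size = translation; }
	public double getWidth() { return size; }
}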

The algorithm below details how the method is implemented. The "input" string contains the characters of the word to be converted and is actually a parameter passed to the method. The variable word is of type Word, inherited from Path2D.Float; it is the path created and built in this method. At first, the x translation value dx is zero and the word path is created, initially empty. Next, two nested loops are carried out. The outer loop traverses the characters of the string, one at a time. For each character, the glyph path is looked up in the font and saved in variable p of type Path2D.Float. The inner loop traverses the segments of p until none remain. For each segment, the method tests whether it is a moveTo, lineTo, quadTo, curveTo, or closePath. Once the segment is identified, its points are translated and scaled as indicated, and a copy of the segment is appended to word with the corresponding building method (moveTo, lineTo, quadTo, curveTo, or closePath). When p has no further segments, dx is incremented by the width of the glyph and by the kerning width, so that the next glyph is translated to the position right after it, and the algorithm moves on to the next character. Once all characters have been processed, the width of the path is stored inside word (this is not indicated in the algorithm) with word.setWidth(dx * scale), and word is returned.
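The steps described above can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the project's actual method: the GlyphFont interface and its method names are hypothetical (the generated font class may expose glyphs, widths and kerning differently), buildWord is an illustrative name, and Word is repeated here as a nested class only to keep the example self-contained.

```java
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;

class WordBuilderDemo {
    /** Hypothetical view of an embedded font; the generated font class may differ. */
    interface GlyphFont {
        Path2D.Float glyph(char c);          // glyph outline, in font units
        int width(char c);                   // glyph width, in font units
        int kerning(char left, char right);  // kerning width for the pair, 0 if none
        int unitsPerEm();
    }

    /** Same idea as the Word class above: a path that remembers its width. */
    static class Word extends Path2D.Float {
        double size;
        public void setWidth(double translation) { size = translation; }
        public double getWidth() { return size; }
    }

    static Word buildWord(String input, GlyphFont font, float fontSize) {
        float scale = fontSize / font.unitsPerEm();
        float dx = 0f;            // running x translation, in font units
        Word word = new Word();   // starts empty
        float[] c = new float[6]; // coordinates of the current segment
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            Path2D.Float p = font.glyph(ch);
            // Copy every segment of the glyph, translated by dx and then scaled.
            for (PathIterator it = p.getPathIterator(null); !it.isDone(); it.next()) {
                switch (it.currentSegment(c)) {
                    case PathIterator.SEG_MOVETO:
                        word.moveTo((c[0] + dx) * scale, c[1] * scale);
                        break;
                    case PathIterator.SEG_LINETO:
                        word.lineTo((c[0] + dx) * scale, c[1] * scale);
                        break;
                    case PathIterator.SEG_QUADTO:
                        word.quadTo((c[0] + dx) * scale, c[1] * scale,
                                    (c[2] + dx) * scale, c[3] * scale);
                        break;
                    case PathIterator.SEG_CUBICTO:
                        word.curveTo((c[0] + dx) * scale, c[1] * scale,
                                     (c[2] + dx) * scale, c[3] * scale,
                                     (c[4] + dx) * scale, c[5] * scale);
                        break;
                    case PathIterator.SEG_CLOSE:
                        word.closePath();
                        break;
                }
            }
            // Advance past this glyph, applying kerning with the next character.
            dx += font.width(ch);
            if (i + 1 < input.length()) {
                dx += font.kerning(ch, input.charAt(i + 1));
            }
        }
        word.setWidth(dx * scale);
        return word;
    }
}
```

The sketch scales y coordinates without flipping them, as in the description above; any change of axis direction is left to the code that finally shows the line.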

Notice that dx can be seen as a vector in the font coordinate system, because it is incremented with values (glyph widths and kerning widths) that come from the font and are therefore also in the font coordinate system. That is why the scale must be applied to it before it can be used in the word, paragraph, or line coordinate system (the three share the same coordinate system). When the line is finally "shown", the words are transformed to the final screen coordinate system, where they can be explicitly "seen". Implicitly, one is working with three different coordinate systems, although no transformation matrix is needed because the calculations are very simple, involving only scales and translations.

A very important additional point that follows from what is shown here is that the result is always WYSIWYG, because "What You See" is exactly what is calculated internally, and it will remain unchanged in any other medium. This is perhaps the most interesting reason to work with vector graphics primitives after all.

Breakintolines Method

The Algorithm

The algorithm for the breakintolines method is shown in the flowchart below. The input is in fact the parameter to the method, that is, the paragraph array containing the glyphs for each word, as explained in the previous section. The output operator can be replaced by any output, such as a file, a println, a display, or a Path2D.Float. Each time a line is output, the variable yline is incremented by yinc, the distance between lines.

Formatting the Paragraph

Formatting the paragraph is a matter of determining which words fit into a line. A line is limited in width, and this limit is a parameter that must be provided (pw in the algorithm above). It also determines the width of the formatted paragraph. As usual, this width is assumed to be in points (1/72 inch). A line is defined as an ArrayList<Word>, where words are stored one by one as long as they all fit in the line, considering their widths and the space separating them. This space is recalculated on the fly each time a new word is to be added. Once a word no longer fits in the line under the given constraints, the line is "shown" with its words separated by the calculated space (ws in the algorithm).
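One way to read this fitting rule is sketched below, working on word widths only. It is a sketch under assumptions rather than the project's algorithm: the names LineBreakDemo, Line and minWs are hypothetical, a word is taken to fit when the resulting separator stays at or above a minimum separator, and the last line is assumed to use that minimum separator instead of being stretched.

```java
import java.util.ArrayList;
import java.util.List;

class LineBreakDemo {
    /** One formatted line: the widths of its words and the separator to use (hypothetical type). */
    static final class Line {
        final List<Double> widths = new ArrayList<>();
        double ws; // space between consecutive words, in points
    }

    static List<Line> breakIntoLines(double[] wordWidths, double pw, double minWs) {
        List<Line> lines = new ArrayList<>();
        Line line = new Line();
        double total = 0; // sum of the word widths in the current line
        for (double w : wordWidths) {
            int n = line.widths.size();
            // Separator the line would get if this word joined it (n gaps for n + 1 words).
            double ws = (n == 0) ? Double.MAX_VALUE : (pw - total - w) / n;
            if (ws < minWs) {
                // The word does not fit: close the line, spreading the free space
                // over its gaps so that the line is exactly pw wide.
                line.ws = (n > 1) ? (pw - total) / (n - 1) : 0;
                lines.add(line);
                line = new Line();
                total = 0;
            }
            line.widths.add(w);
            total += w;
        }
        if (!line.widths.isEmpty()) {
            line.ws = minWs; // assumption: the last line is not stretched
            lines.add(line);
        }
        return lines;
    }
}
```

For four words of 100 points each, pw = 250 and minWs = 5, the result is two lines of two words: the first with a separator of 50 points, the second with the minimum separator of 5 points.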

"Showing" the lines

Since each line is represented by a collection of words (actually an ArrayList), each word assumed to be at the origin, the final line can be seen as a new path where the words are copied and translated according to their widths (see algorithm above) and the word separator ws. This final path is the actual line that can be either displayed, printed, or written to a file.
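Copying the words into a line path might look like the sketch below. The names ShowLineDemo and showLine are hypothetical, and the word widths are passed as a separate list only to keep the example independent of the Word class. For brevity the translation is done with AffineTransform, although the text notes that a transformation matrix is not strictly needed.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.util.List;

class ShowLineDemo {
    /**
     * Copies word paths (each built at the origin) into one line path.
     * words and widths are parallel lists; ws is the word separator and
     * yline the vertical position of the line, all in points.
     */
    static Path2D.Float showLine(List<? extends Path2D> words, List<Double> widths,
                                 double ws, double yline) {
        Path2D.Float line = new Path2D.Float();
        double x = 0; // x position of the next word
        for (int i = 0; i < words.size(); i++) {
            // Translate the word to (x, yline) while copying its segments.
            AffineTransform t = AffineTransform.getTranslateInstance(x, yline);
            line.append(words.get(i).getPathIterator(t), false);
            x += widths.get(i) + ws;
        }
        return line;
    }
}
```

After each line is shown, the caller would add yinc to yline before building the next one.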

Subsequent lines are constructed in the same way, except that the y coordinates are incremented each time by yinc. The only differences between constructing lines and constructing words are the following: lines use only translations (in words, glyphs are also scaled); words are separated by the calculated space ws (in words, glyphs are separated by the character widths and kerning distances); and y coordinates are incremented by yinc at each line (in words, y coordinates are scaled).

Examples

With a Single Font

This snapshot uses the MyriadPro-Regular.otf font scaled to 20 points, a line width limited to 380 points, a y coordinate increment of 20 points, and a minimum word separator of 5 points.

With Two Fonts

This snapshot uses the MyriadPro-Regular.otf and MyriadPro-Bold.otf fonts scaled to 20 points, a line width limited to 380 points, a y coordinate increment of 20 points, and a minimum word separator of 4 points.

With Four Fonts

This snapshot uses the MyriadPro-Regular.otf, MyriadPro-Bold.otf, MyriadPro-It.otf and MyriadPro-SemiboldIt.otf fonts scaled to 20 points, a line width limited to 380 points, a y coordinate increment of 20 points, and a minimum word separator of 4 points.

The following string was used to obtain the above result:
