Data-driven Handwriting Synthesis in a Conjoined Mannerrobin/docs/pg15.pdf · Data-driven...

Pacific Graphics 2015N. J. Mitra, J. Stam, and K. Xu(Guest Editors)

Volume 34 (2015), Number 7

Data-driven Handwriting Synthesis in a Conjoined Manner

Hsin-I Chen1 Tse-Ju Lin1 Xiao-Feng Jian1 I-Chao Shen2 Bing-Yu Chen1

1National Taiwan University2University of British Columbia

Abstract

(Please refer to Appendix for easy reading.)

Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Picture/Image Generation—;I.4.9 [Image Processing and Computer Vision]: Applications—

1. Introduction

Though printed characters are easy to read, most people stilllike handwriting styles due to their uniqueness and aesthetics,so handwritten words are getting more and more popular invarious design applications, from greeting card decoration topersonal website stylization. As an alternative to standardprinted characters, synthesizing handwriting-style text froma collected database has attracted a lot of attention [LW07,LYFD12, CS12, XWG∗13].

However, automatic synthesis of characters and paragraphsthat mimic one’s writing style is challenging. An individual’shandwriting appears differently within a typical range of vari-ations, and the shape of individual characters also showscomplex interaction with nearby neighbors. That is, the hand-written characters look different when they are written ina conjoined manner (see example in Figure 1). Such localtrajectory variation between neighboring characters, as an im-

portant consideration in human writing, is less addressed inprevious data analysis and handwriting synthesis approaches.In addition, the arrangement of the writing and the symbol-ism of margins, zones, spacing and slant must be taken intoconsideration in synthesizing paragraphs.

In this paper, we present a novel method for synthesiz-ing handwritten text according to the writer’s style whileconsidering characters’ conjoined property. Our method iscomposed of two phases: In the first phase, the morphablemodel of different characters is created, which enables usto synthesize various characters while simulating a person’swriting style. In the second phase, a trajectory synthesis al-gorithm is used to produce smooth trajectories for specifiedwords. The parameters of the trajectory synthesis algorithmare trained automatically from the collected samples usingthe coordinate descent method. Given the input ASCII words,the corresponding words are automatically synthesized by

c© 2015 The Author(s)Computer Graphics Forum c© 2015 The Eurographics Association and JohnWiley & Sons Ltd. Published by John Wiley & Sons Ltd.

H.-I. Chen& T.-J. Lin & X.-F. Jian & I.-C. Shen & B.-Y Chen / Data-driven Handwriting Synthesis in a Conjoined Manner

Figure 1: Two characters “t” in the same word “the”. Oneconjoins with its neighbor “h”, and the other does not. Thesetwo “t” look different, though they are both in the same word“the”, and written by the same writer in the same article veryclose in time.

the proposed method. In addition, the global features, suchas the angle of the handwritten text line with respect to thehorizontal direction, word and line spacing, are also adjustedaccordingly to reflect the writer’s patterns in the paragraphgenerating process.

Contribution The proposed method utilizes a statistic learn-ing approach to compactly model the distinctness of indi-vidual characters as well as the conjoined properties in ac-cordance with the collected dataset. With a small datasetprovided by a writer, we are able to synthesize non-existentscripts that mimic a person’s handwriting, both at the levelof individual character and complete words. Specifically, thecontributions of this paper include:

• A data analysis stage that offers an understanding of howcharacters interact with nearby neighbors, which reflectsindividual writing styles.• A character grouping method to guide the computation

of cursive probability for each character pair that decidewhether adjacent characters are conjoined together or not.• A novel data-driven optimization method which can be

used to synthesize non-existent paragraphs from a smallset of training data, where the distinctness of individual’swriting as well as the natural character conjoined propertyare modeled jointly.

2. Related Works

Handwriting. Handwriting styles are different from personto person, and have been studied a lot in forensic science.There are some important factors that can be used to dis-tinguish different handwriting styles [HH99, SCAL02]: (1)Element of style: the look of one’s writing, including theplacement of text, relative size between words or characters,and spacings, etc. (2) Elements of execution: things affectedby the action while writing, including usage of abbreviation,embellishment, pen pressure or movement while writing, etc.(3) Natural variance between each writing.

Several methods have explored the synthesis of peoples’

handwritings or drawings. Most of these works [WWXS05,LW07, CS12] collected samples of users’ handwriting (inEnglish) first, and then synthesized target words. Wanget al. [WWXS05] used the active shape model to modelthe distribution of handwriting samples, and then generatedsmooth connections between consecutive characters in En-glish words with the delta log-normal model. Chang andShin [CS12] decomposed collected sample handwritings intostrokes, and established a statistical model for generating andcombining strokes into a character. Hinton and Nair [HN05]proposed a generative model for synthesizing handwrittendigits. On the other hands, some previous works focus onstylization [LYFD12] or beautification [Zit13] of handwrit-ings or drawings. Lu et al. [LYFD12] presented a data-drivenapproach for synthesizing the 6D gesture data for users oflow-quality input devices. Zitnick [Zit13] beautified eachhandwritten character instance with its averaging appearance.

Our method is mostly inspired by [WWXS05] as we alsouse the active shape model to model the distribution of hand-writing samples. However, our method greatly differs intwo aspects. First, instead of collecting handwriting samplesthrough a Tablet PC, which is not a natural environment forhandwriting, we collected the samples from scanned imagesdirectly. Second, instead of concatenating successive char-acters while neglecting the individual preference in cursivewriting, we investigate the collected data and generate con-nections between consecutive characters according to theircursive probability and structure information revealed fromthe data.

Data-driven synthesis. There have been a number of workstargeting data-driven content synthesis in the graphics com-munity. In the 2D domain, many methods focus on stochastictexture synthesis as surveyed in [WLKT09], stroke styliza-tion [LYFD12], and pattern drawing [LLSC13, LBW∗14]. Inthe 3D domain, a family of data-driven shape synthesis meth-ods have been proposed. Chaudhuri et al. [CK10, CKGK11]introduced data-driven suggestions for 3D modeling, andKalogerakis et al. [KCKK12] proposed a generative modelof component-based shape structure to synthesize novelshapes automatically. In terms of animation synthesis, Maet al. [MWLT13] synthesized a variety of repetitive spatial-temporal phenomena using data-driven computation. Maet al. [MHS∗14] proposed an analogy driven shape syn-thesis framework to accomplish 3D style transfer. Li et al.[LZW∗13] identified curve style from a set of shapes whichis useful in several style-related applications including styleexaggeration and style transfer for 2D shape synthesis.

Multidimensional morphable model. Multidimensionalmorphable model (MMM) is widely used in previous works,including object recognition [BP96, JP98], appearance mod-eling [CET01], image morphing [LWS98], and animationsynthesis [EGP02]. By tuning the parameter of the model,we can generate novel objects in the same class but not in-cluded a priori. Ezzat et al. [EGP02] aimed for generating a

c© 2015 The Author(s)Computer Graphics Forum c© 2015 The Eurographics Association and John Wiley & Sons Ltd.


(a) Individual sample. (b) Integral sample - Word

(c) Integral sample - Paragraph

Figure 2: The sample document set. We asked users to write(a) individual characters including 26 English letters, in bothlower and upper case, and 10 Arabic numbers, (b) individualwords, and (c) a sample paragraph of an article.

facial animation with any speech, and used MMM to modelthe faces under different speeches. Part of our method wasinspired by this work, we built one MMM for one letter, thuswe can generate novel instances of letters. By constrainingthe parameters, the novel instances will preserve the style ofcollected samples.

3. Data Collection

We collected handwriting samples by asking users to writea set of “sample documents”. During the design phase ofthis sample document set, we observed that: (1) different in-stances of the same character tend to be more consistent whenthey are written separately (i.e., the variation among them issmaller) in comparison when they are written conjointly, and(2) the layouts of paragraphs, such as border width, line spac-ing or line parallelism, are different from person to person.Based on this observation, we have concluded the follow-ing criteria in designing the sample document set: (1) Eachcharacter must appear several times to capture its variation ofhandwriting style. (2) The sample paragraphs should coveras many common character pairs as possible to capture thelocal deformations and trajectories of the conjoined characterpairs. (3) While collecting integral samples, we cannot over-constrain the text placements in order to preserve personalstyle.

To support the above criteria, there are two different partsin the sample document set. In the first part, we asked usersto write individual characters, including 26 English letters,in both lower and upper cases, and 10 Arabic numbers (Fig-ure 2(a)). In order to capture the variation of writing styles,the users were asked to write the same content twice. Inthe second part, we target to collect writing samples in a

(a) (b) (c)

(f) (e) (d)

Figure 3: Work flow of our system. (a) Scanning input images.(b) Parameterizing each character. (c) Building the shapemodel for each character. (d) Synthesizing characters, (e)synthesizing words by optimizing smooth trajectory fromsynthesized characters, and (f) synthesizing paragraphs whilepreserving the writer’s style.

conjoined manner, i.e., character pairs composed of at leasttwo characters. Specifically, we asked users to first write theChapter 12 of The Little Prince as the sample paragraph (Fig-ure 2(c)). In addition, we also chose 15 additional words† tocover more character pairs (Figure 2(b)). (See Appendix to ex-plain the procedure to generate integral samples by analyzinga target corpus.)

4. Handwriting Synthesis

4.1. Overview

The work flow of our method is shown in Figure 3. Givena handwritten source document from a user, we first scan(Figure 3(a)) and parameterize (Figure 3(b)) the characters inthe document (Section 4.2) to obtain the control points repre-senting the characters for further processing. Then, we builda statistical shape model from the control points (Figure 3(c))to represent each character ci using a set of shape param-eters αααi (Section 4.3). By investigating the collected dataand grouping characters according to their structural similar-ities, we can also calculate the conjoined probability P(i, j)for every pair of character combination (ci,c j) (Section 4.4).Given an input text, our synthesis algorithm (Section 4.5)performs as following: (a) For each individual characterci, we synthesize similar but different character shapes ciby varying corresponding multidimensional parameters αααiin model space (Figure 3(d)). (b) If the current character ciand its next character ci+1 form a conjoined compound inthe collected data, their conjoined property P(i, j) is used toadjust the shape of ci to connect or not connect with ci+1.(c) However, if no conjoined information of pair (ci,c j) canbe found in the collected data, we synthesize the smooth tra-jectories between ci and ci+1 according to the stroke-based

† The 15 words are: cake, cars, cat, caveolae, dig, love, moon, no,pack, pick, poignant, rabbit, toss, troop, world.



structural similarities to adjust the shape of ci. By formulat-ing (b) and (c) as an energy minimization framework, wecan synthesize novel characters and words (Figure 3(e)) withpersonalized cursiveness property. Finally, we imitate theparagraph layout (Figure 3(f)) extracted from the integralsamples (Section 4.6).

4.2. Character Parametrization

After scanning the input images containing handwritten char-acters as shown in (Figure 3(a)), we fit B-spline curve witha low number of control points to our data and. We thenrepresent each character c as:

(1)

where px ∈ R2 indicates the relative coordinate of px fromthe center, and N is the number of control points.

4.3. Shape Model Generation

Our method relies on the morphable models trained by theinput handwritings, and is capable to morph between a setof examples to synthesize previously non-existent shapesthat are consistent with individual handwriting styles. Theassumption of the shape model is that all the handwrittenexamples of each character (e.g. “a”) from a single user liein a low-dimensional space whose axes represent shape vari-ation. For each character ci, we take the collected samplesCi = {ck

i }Sik=1 as input for building the shape model, where

Si denotes the number of samples for character ci. In addi-tion, we arbitrarily select one example, say c1

i , as a reference,and compute displacement vectors Di = {dk

i }Sik=1 by perform-

ing shape matching between other examples and the refer-ence. We construct 62 shape models in total (i.e., 26×2 forlowercase and uppercase English letters, and 10 for Arabicnumbers).

Shape matching. To match the reference example with oth-ers, we compute the displacement vector dk

i for each pair(c1

i ,cki ) as:

dki = {dk

i (p1),dki (p2), ...,d

ki (pN)} (2)

where dki (px) denotes the displacement of control point px

between c1i and ck

i . The displacement between the referenceexample and itself is zero (i.e., d1

i = 0).

To compute correspondences between the control pointson different examples, the shape context [BM02] descriptoris adopted, which is composed of a log-polar histogram thatmeasures edge points’ locations and orientations to providea θ× r feature vector. In our experiment, we use 12 angu-lar (θ) and 4 radial (r) bins. In addition, we supplement theshape context descriptor with the control points’ 2D posi-tions, which leads to a 12× 4+ 2 = 50 dimensional shapedescriptor.

(a) (b)

Figure 4: (a) One of character examples “a” in the dataset. (b)Instances synthesized using our shape model with differentαααi in Eq. (5).

Statistical shape model. The morphable shape model foreach character ci is created by performing principal compo-nent analysis (PCA). For each character ci, we assume eachexample is described by Ni control points, so aligned vectorsform a distribution in the 2Ni dimensional shape space. Theco-variance matrix of the shape space is:

Var(Di) =1

Si−1

Si

∑k=1

(dki −µµµi)(d

ki −µµµi)

T , (3)

where µµµi is the vector of mean displacements of character ci.The probabilistic model of PCA can be written as:

ΦT (dk

i −µµµi) =

(Ir0(2N−r)×r

)αααi + εεε, (4)

where ΦT is a 2Ni× 2Ni matrix whose row vectors are the

eigenvectors of Var(Di), I is a r× r identity matrix, εεε is anisotropic noise in the PCA space, and αααi is a r-dimensionalvector and r is the number of modes to train the PCA model ,where η = diag(λ1, ...,λr), and λ j is the j-th eigenvalue.

Hence, we can rewrite Eq. (4) as:

dki = µµµi +Φrαααi +Φεεε, (5)

where each element in αααi reflects a specific variation alongthe corresponding principle component axis. By varying theelements in αααi (see Figure 4), the generated shape parameterαααi enables us to synthesize the new handwritten character ciby

ci = µµµi +Φrαααi + c1i , (6)

where αααi is the coefficient of shape parameter plus a variantδ ∼ N(0,

√η

3 ), and c1i is the control points of the reference

sample of character ci.

4.4. Character Grouping and Cursive Probability

Given the collected samples and their shape models, we thengroup these examples according to their structural similarity.We subsequently use both the grouping results and collectedsamples to define a cursive probability for each characterpair that decides whether adjacent characters are conjoinedtogether or not.

Character grouping from structural similarity. When peo-ple perform a flowing and uninterrupted movement, two adja-cent characters have the property to form a conjoined com-pound: the finishing position of the former character may link



(a) (b)

Figure 5: We partition the writing sheet into 6 groups. (a) “h”and “l” are started from the same position G1, so they are inthe same starting group. (b) “r” and “v” are finished at thesame position G4, so they are in the same finishing group.

directly to the starting position of the later one. This impliesthat the starting and finishing positions are prominent factorsin determining whether and how characters are linked. Basedon this observation, we make an assumption: the characterpairs which have similar starting-finishing positions may havesimilar and identical writing trajectories. This assumption isvital for the conjoining probability computation and trajectoryshape synthesis (Section 4.5). For example, if a character pair(ci,c j) is absence in the collected data, we can find anothercharacter pair to synthesize similar writing trajectories withthe same conjoining probability.

We group characters by their structural similarity, whichis measured by the distance between their starting (and fin-ishing) positions. Specifically, we divide the space on thewriting sheet into six partitions (see Figure 5), say group 1 togroup 6. Two characters ci and c j are clustered into the samestarting/finishing group if their starting/finishing positionsare located in the same partition (see Figure 5(a)/Figure 5(b)).We use Ms

ik,k = 1, ...,6 to denote the starting group assign-ment of character ci, and we set Ms

ik = 1 if ci belongs togroup k. In addition, we use Gs

ik := {ci : Msik = 1} to denote

all characters in starting group k clustered by their startingpositions (M f

ik and G fik are defined in the same way for fin-

ishing groups). Please note that every individual may havedifferent start/finish group structure, thus the grouping resultfor each character may be different from person to person.

Cursive probability. For each spatially adjacent characterpair ci and c j, we use two different functions, Pdata andPgroup, to estimate its cursive probability P(i, j). We defineP(i, j) = Pdata(i, j) as the fraction of the occurrences of thecharacter pair ci and c j if they existed conjointly in the col-lected data. Otherwise, we use P(i, j) = Pgroup(G

fik,G

sjk′)

to estimate its conjoining probability of character pairs inG f

ik and Gsjk′ in the collected data, which are the character

sets in the same finishing and starting groups with ci and c j,respectively.

We define the cursiveness probability of a letter pair (i, j)as:

P(i, j)=

Pdata(i, j), if character pair (i, j) forms a

conjoined compound in dataset,Pgroup(i, j), otherwise.

(7)

(a) (b) (c) (d) (e)

Figure 6: (a) and (b) The samples written by a user. (c) Aword written by the same user. (d) The same word synthesizedby our method while the trajectory between “t” and “o” canbe generated smoothly. (e) The same word synthesized bydirectly pasting “t” from (a) and “o” from (b), so they cannotbe connected naturally.

4.5. Word Synthesis with Trajectory

Given an input text, we firstly generate individual charactersci using the morphable models in Eq. (6). However, when thecharacters are written in a flowing movement, they tend toconnect together with their adjacent neighbors. This makesthe characters look quite different from their regular shapes(when they are written separately). In order to mimic one’snatural handwriting with cursive style, two concerns are tobe dealt with. Firstly, for an individual user, a probabilitymodel should be learned from the collected data to estimatethe cursive probability when writing adjacent characters (Sec-tion 4.4). Secondly, the shape of the generated character cimust be adjusted to be consistent with those in the collectedsamples in both appearance style and local conjoined trajec-tory.

Structure similarity constraint. To form a cursive writinggiven P(i, i+1)> 0, our goal is to adjust the shape parameterαααi to specify how to deform the shape model to synthesizethe character ci to connect with its neighbor ci+1 naturally.Our first requirement is that the shape parameter αααi shouldcome from the character sample set Ci with similar conjoinedstyle. Specifically, we would like to find the shape parametersβββ

ki corresponding to the character samples ck

i ∈ Ci, underthe constraint that the pair (ck

i ,ci+1) existed conjointly in thecollected data. At the same time, we would like the trajectorybetween the conjoined characters to be as smooth as possible,since unnecessary discontinuities in the path will result inunnatural handwriting. Hence, we attempt to achieve theseobjectives by minimizing the following function:

αααi = arg minαααi,βββ

ki

{∑i(‖αααi−βββ

ki ‖2 + γ(gi

pN ,pN−1 −gi+1p1,p2)

2}

s.t. piN = pi+1

1 .

(8)

The first term is the data term, responsible for measuringthe distance between the estimated shape parameters αααi andβββ

ki . However, if there is no such pair (ci,ci+1) in the collected

data, we use the βββki from its structural similar group with ci+1.

The second term is the smoothness term, whose objective isto keep the slope of the trajectories between the finishingand starting positions of the conjoined characters as small



(a) Handwritten paragraph

(b) Synthesized paragraph without layout information

(c) Synthesized paragraph with layout

Figure 7: A comparison of synthesized paragraphs with andwithout considering the handwriting layout style. In (b), theresult looks like a typed paragraph using “handwriting” font.In (c), the result looks much nature compared with (b).

as possible. The slope between control points Pj and Pj′ incharacter ci is denoted as gi

Pj ,Pj′, and Pi

N denotes the N-thcontrol point of character ci. We also diminish the interval byconstraining the finishing position of character ci to be thesame as the starting position of character ci+1. The relativeweight of the two terms is controlled by the parameter γ. Inour current implementation, we use the default value γ =0.8. The multidimensional non-linear optimization problemis solved by the coordinate descent algorithm, where theinitial guess of the shape of character ci is generated byEq. (6). Figure 6(d) shows a synthesized result, showingsmooth connections and preserving the style in the dataset.

4.6. Paragraph Synthesis

To compose a paragraph that is consistent with one’s hand-writing style, some important features, such as the heightsand slopes of the characters, and the slopes of the entirelines, must be taken into consideration. It is because whenpeople write on blank paper without any guide line or grid,it is not easy for them to write the characters with exactlycorrect positions, i.e., the words may slant heavily upsidedown. In our method, the relative height among charactersand slant are preserved by our shape model. Words are ren-dered one by one, and the height and slope of each wordis adjusted with two Gaussian distributions, Gheight(µh,σh)and Gslope(µs,σs), respectively. To produce a smooth writingpath, the parameters µh and µs are set as the height and slopeof the preceding character, while σh and σs are set as 3 pixelsand 2 degrees for all the results in our experiment. Mean-while, the angle of each line with respect to the horizontaldirection is also altered by a Gaussian function, Gline(µl ,σl),where µl and σl are set as the mean and standard deviation ofdetected lines in the collected integral samples. In addition,our method can synthesize characters that faithfully reflect

the size of the user-written characters. The result is shown inFigure 7.

5. Experimental Result

We first describe our experiment setting in Section 5.1, re-port the user study result in Section 5.2 and visual results inSection 5.3, respectively. Finally, we compare our result withtwo related methods in Section 5.4.

5.1. Experiment Setting

We implement our method in MATLAB and all results weregenerated on a desktop PC with an Intel Core i7-3370 CPUand 16G RAM. Building the shape model for our datasetdescribed in Section 3 takes about 360 seconds while syn-thesizing a five-character word in a conjoined manner takesabout 1.2 seconds. Synthesizing the Chapter 12 of The LittlePrince, which contains 159 words (769 characters), takesabout 190 seconds.

5.2. User Study

To evaluate the ability of our method to mimic one’s handwrit-ing, we conducted two user studies, i.e., visual discriminationand similarity, using an online questionarie website.

Study 1: Visual discrimination. This test consists of 8cases, 4 of them are randomly selected from the integral sam-ples, and the rest 4 are synthesized by the proposed method.We recruited 65 participants (30 males and 35 females) inthis study. The 8 cases were shown in random order, and theparticipants were asked to identify whether each case is hand-written or synthesized. Figure 8 (a) shows the distribution ofthe participants’ score. A score of 8 indicates a participantcorrectly distinguished 4 real handwritten pieces as real, andall 4 synthesized ones as synthesized. From this, we can seethat most participants can not get all questions correctly. Theaverage number of correct answers across all participants is4.96. In Figure 8 (b), we can see the average percentage ofsentences in each of the four different writing styles that werecorrectly labelled. For the 4 synthesized paragraphs, 48%,60%, 49%, and 55% of the participants voted that they arehandwritings. The p-value of chi-square test at 5% significantlevel comparing with random guess is 0.998, suggesting thatparticipants are confused to distinguish the synthesized frag-ments thus performed similar to random guessing at spottingour rendered handwriting. Figure 12(a) ∼(c) display threesynthesized sentences used in User Study 1 and an exampleof our user interface are shown in Appendix .

Study 2: Similarity. The second user study attempts to eval-uate if our method is capable of synthesizing paragraph inthe same style as a given input. The participants were askedto perform similarity ratings between a paragraph pair, whichare formed from one of three following cases: (a) written by



(a) (b)

Figure 8: The result of User Study 1. (a) Histogram of thenumber of correct answers by participants. (b) Accuracy foreach different handwriting style. An average of 53% of par-ticipants confused synthesized handwritings as real ones.

mean score stdCase 1 4.13 0.72Case 2 2.07 0.82Case 3 3.21 0.94

Table 1: A statistical result in User Study 2. Case 1 containstwo paragraphs written by the same writer twice; Case 2contains two paragraphs written by different writers; Case3 contains one handwritten and one synthesized paragraphs.The scores are from 1 (totally different) to 5 (almost thesame).

p (α = 0.05)Case 1 vs. Case 3 0.999

Table 2: The result of chi-square test of User Study 2. Thep-value between Case 1 and Case 3 suggests that the partici-pants confused real handwriting and synthesized handwritinggenerated by our method.

(a) Handwritten sample

(b) Our result

Figure 9: An example paragraph used in User Study 2. Thewriter’s actual handwriting is shown in (a); synthesized resultis shown in (b).

(a) (b) (c) (d)

Figure 10: (a), (c), and (d) are collected samples in the dataset,while (b) is synthesized by the proposed method. Thoughthere is no conjoined pair “r-i” in the dataset, we still cangenerate the trajectory by described in Section 4.4.

the same writer twice; (b) written by two different writers;(c) one handwritten and one synthesized paragraph. For eachrating, the participant was instructed to read a paragraph pairand rate how similar the paragraphs are from 1 (totally dif-ferent) to 5 (almost the same). We recruited 41 participants(19 males and 22 females) in this study, and each participantrated 12 (= 3 cases ×4 styles) pairs.

For each case, we show the averaged score and the standarddeviation in Table 1. Case 2 got lowest score since they arefrom difference writers. At 5% significant level, the p-valueby performing chi-square test for the similarity test result isshown in Table 2. The p-value of Case 1 and 3 suggests thatthe participants confused real handwriting and synthesizedfragments generated by our method. Figure 7 (a) and (c)shows one pair in this test. In this comparison, more than68% of the participants considered these two paragraphs werewritten by the same writer. More than 70% of the participantsgave the scores 4 or 5, and less than 10% of the participantsconsidered they are written by different writers. Figure 9shows another pair in this test.

5.3. Visual Results

Figure 10 shows a conjoined result learned from existed pairin the collected dataset as well as an example leveraging thegrouping method described in Section 4.4. Figure 10(b) isour synthesized result of the word “prince”. For the “c-e”pair, it existed in the collected dataset, so we can generate itstrajectory by learning from the existed samples. On the otherhand, there is no “r-i” pair in the dataset. However, there are“r-y” and “t-i” pairs in (c) and (d), where “i” and “y” are in thesame starting group, and “r” and “t” are in the same finishinggroup.

Next, we show several synthesized results of small para-graphs in Figure 11 and 12. In Figure 11, we can see that theshape of our synthesized characters, ex:’y’ and ’g’, look simi-lar to their original styles in (a). Meanwhile, paragraph layoutin the same style still remains consistent. Figure 12(a)∼(c)display three various synthesized writing styles, in whichdifferent datasets are collected from different writers. Wecan easily distinguish different handwriting styles. We alsoinclude a failure case in Figure 13, where the synthesizedletter ’e’ and ’g’, and the spacing between ’r’ and ’y’ do notlook like real handwritings. This might be due to the prepro-cessing procedure for letter segmentation and binarization isnot always perfect.



(a) Handwritten paragraph

(b) Our synthesized paragraph

Figure 11: (a) A collected sample paragraph used as thetraining data. (b) The synthesized result using the trainedmodel from (a), so their styles are similar.

(a) Synthesized paragraph in style 1

(b) Synthesized paragraph in style 2

(c) Synthesized paragraph in style 3

Figure 12: (a)∼(c) are synthesized using different datasetscollected from different writers. The synthesized results looknatural and their styles are quite different.

Finally, we synthesize the Abstract of this paper (see thefirst page). The training set contains 159 words (769 char-acters), and there are 164 words (1151 characters) in theAbstract. It shows that we can synthesize paragraph largerthan our training set naturally, without repeated shapes. Wealso used Albert Einstein’s handwriting style to synthesizeEq. (1). Our training data is collected from Albert Einstein’smanuscripts for mathematical proof and scientific publica-tions. Because these datum are difficult to obtained, we onlyuse 2 ∼ 4 samples in the training phase for each letter orsymbol. In this experiment, we show that our method is ableto synthesize the styles that are difficult to replicate by otherwriters and under limited dataset.

Figure 13: An example of failure case.

(a) Original (d) [WWXS05] (g) Ours

(b) Original (e) [WWXS05] (h) Ours

(c) Original (f) [WWXS05] (i) Ours

Figure 14: Original handwriting examples are shown in leftcolumn (a)∼(c). Synthesized results by conditional samplingmethod [WWXS05] are shown in the middle (d)∼(f), inwhich ligatures are synthesized by delta log-normal model.Our method learns individual’s handwriting style and theconnects letters driven by information revealed from collecteddataset are shown on the right column (g)∼(i).

5.4. Comparison

We compare our method with two other approaches in hand-writing synthesis. First, we compare our result with glyph-based approach proposed in [LW07]. Figure 15 (a) is part oftraining set, considered as ground truth used in [LW07], andFigure 15 (b) is the corresponding synthesized result. We cansee that they appear quite different writing styles. Instead ofonly using one glyph for synthesizing each individual letter,and adding random rotation and scale variations [LW07], ourmethod learns the writer’s style from the collected data setby modelling shape distribution. Our synthesized results inFigure 15(d) looks much similar to the original writing styleas shown in Figure 15(c).

Next, we compare our result with the method proposedin [WWXS05], which synthesized cursive handwriting of auser’s personal writing style by combining shape and physicalmodel. Figure14 shows the comparison results. In the originalhandwritten sample in Figure 14(a), joined-up writing is sup-ported by data log-normal model as shown in Figure 14(d).Our method synthesizes glyphs by learning shape models aswell as their conjoined properties from data, leading to morenatural cursive styles between connected letters, e.g. ’w’ and’n’ in Figure 14(g). In Figure 14(b), it shows that although theindividual shape model of each letter is learned, the sparsecontrol point representation might result in distortion suchthat the generated glyphs might have an aliasing effect. Incontrast, our method produces more natural looking in Fig-ure 14(i). In Figure14 (h), the letter ’f’ synthesized by ourapproach also looks more similar to the original ground truthin Figure14(b) compared to the one in Figure14(e).



(a) Captured handwritten samples

(b) Synthesized result by [LW07]

(c) Captured handwrittensamples

(d) Ours

Figure 15: (a) is part of training set in [LW07], and (b) is thecorresponding synthesized words. (c) is original handwritingin our training set. (d) is the synthesized fragment by ourmethod. (a) and (b) appear different styles. In contrast, (c)and (d) looks more similar in style.

5.5. Implementation Detail

To enable synthesis, we included an semi-automatic user-interface for segmenting handwritten letters. We adopt leastsquare fitting method to find the b-spline model for eachcharacter. For the letters with many connected components,e.g. ’i’,’j’, we first perform connected component analysisand run the b-spline for each connected component.

Attributes Lettersa b c d e f g h i j

N 18 18 14 23 16 28 46 27 10 25S 15 13 24 39 100 14 18 49 65 4r 17 17 13 22 15 27 45 26 9 24

k l m n o p q r s tN 21 8 22 18 15 27 35 11 19 17S 8 45 17 53 53 36 3 47 37 74r 20 7 21 17 14 26 34 10 18 16

u v w x y zN 18 15 24 14 30 14S 8 45 17 8 45 17r 17 14 23 13 29 13

Table 3: This table lists the number of control points (Ni),the number of samples collected for each character (Si), andalso the number of eigenvectors (ri) used to represent eachcharacter.

6. Conclusion

In this paper, we proposed a data-driven framework to syn-thesize non-existent paragraphs while preserving the writer’s

handwriting style. We first determine whether each pair ofcharacters in the word are conjoined or not, by analyzingthe cursiveness probability in the collected dataset. Further-more, we learn the smooth conjoining trajectory from thesame dataset, and formulate a novel trajectory optimizationfor synthesizing conjoining characters. Finally, we arrangethe paragraph layout by imitating the writer’s handwritingstyle. The result of user study shows that we can generateparagraphs that is hard to distinguish from the handwrittenparagraphs.

Limitation and future work. Currently, we assume thateach character has only one topology, and cannot handlethe characters with large deformation because of current char-acter matching method. We believe a better matching methodwill help our method to handle more general dataset. In ad-dition, our work does not consider pen pressure, which mayaffect the handwriting styles. In some cases, larger pen pres-sure makes the stroke color darker. We believe this is aninteresting direction to obtain pen pressure from stroke colorsand leverage it to generate better synthesis results. Finally,while the strok order information is difficult to capture, thegrouping strategy can not work very well in some cases.

References

[BM02] BELONGIE S., MALIK J.: Shape matching and objectrecognition using shape contexts. IEEE Transactions on PatternAnalysis and Machine Intelligence 24, 4 (2002), 509–522. 4

[BP96] BEYMER D., POGGIO T.: Image representations for visuallearning. Science 272, 5270 (1996), 1905–1909. 2

[CET01] COOTES T. F., EDWARDS G. J., TAYLOR C. J.: Activeappearance models. IEEE Transactions on Pattern Analysis andMachine Intelligence 23, 6 (2001), 681–685. 2

[CK10] CHAUDHURI S., KOLTUN V.: Data-driven suggestionsfor creativity support in 3D modeling. ACM Transactions onGraphics 29, 6 (2010), 183:1–183:10. 2

[CKGK11] CHAUDHURI S., KALOGERAKIS E., GUIBAS L.,KOLTUN V.: Probabilistic reasoning for assembly-based 3D mod-eling. ACM Transactions on Graphics 30, 4 (2011), 35:1–35:10.2

[CS12] CHANG W.-D., SHIN J.: A statistical handwriting modelfor style-preserving and variable character synthesis. InternationalJournal on Document Analysis and Recognition 15, 1 (2012), 1–19. 1, 2

[EGP02] EZZAT T., GEIGER G., POGGIO T.: Trainable videore-alistic speech animation. ACM Transactions on Graphics 21, 3(2002), 388–398. 2

[HH99] HUBER R., HEADRICK A.: Handwriting Identification:Facts and Fundamentals. Taylor & Francis, 1999. 2

[HN05] HINTON G. E., NAIR V.: Inferring motor programs fromimages of handwritten digits. In Proc. of Neural InformationProcessing Systems 2005 (2005), pp. 515–522. 2

[JP98] JONES M. J., POGGIO T.: Multidimensional morphablemodels: A framework for representing and matching object classes.In Proc. of IEEE International Conference on Computer Vision1998 (1998), pp. 683–688. 2



[KCKK12] KALOGERAKIS E., CHAUDHURI S., KOLLER D.,KOLTUN V.: A probabilistic model for component-based shapesynthesis. ACM Transactions on Graphics 31, 4 (2012), 55:1–55:11. 2

[LBW∗14] LU J., BARNES C., WAN C., ASENTE P., MECH R.,FINKELSTEIN A.: DecoBrush: Drawing structured decorativepatterns by example. ACM Transactions on Graphics 33, 4 (2014).2

[LLSC13] LUO S.-J., LIN C.-Y., SHEN I.-C., CHEN B.-Y.:Stroke-guided image synthesis for skeletal structure editing. Com-puter Graphics Forum 32, 7 (2013), 71–78. 2

[LW07] LIN Z., WAN L.: Style-preserving english handwritingsynthesis. Pattern Recognition 40, 7 (2007), 2097–2109. 1, 2, 8, 9

[LWS98] LEE S., WOLBERG G., SHIN S. Y.: Polymorph: Mor-phing among multiple images. IEEE Computer Graphics andApplications 18, 1 (1998), 58–71. 2

[LYFD12] LU J., YU F., FINKELSTEIN A., DIVERDI S.: Help-ingHand: Example-based stroke stylization. ACM Transactionson Graphics 31, 4 (2012), 46:1–46:10. 1, 2

[LZW∗13] LI H., ZHANG H., WANG Y., CAO J., SHAMIR A.,COHEN-OR D.: Curve style analysis in a set of shapes. InComputer Graphics Forum (2013), vol. 32, pp. 77–88. 2

[MHS∗14] MA C., HUANG H., SHEFFER A., KALOGERAKIS E.,WANG R.: Analogy-driven 3D style transfer. Computer GraphicsForum 33, 2 (2014), 175–184. 2

[MWLT13] MA C., WEI L.-Y., LEFEBVRE S., TONG X.: Dy-namic element textures. ACM Transactions on Graphics 32 (2013),90:1–90:10. 2

[SCAL02] SRIHARI S. N., CHA S.-H., ARORA H., LEE S.: In-dividuality of handwriting. Journal of Forensic Sciences 4, 47(2002), 856–872. 2

[WLKT09] WEI L.-Y., LEFEBVRE S., KWATRA V., TURK G.:State of the art in example-based texture synthesis. In Eurograph-ics 2009 State of the Art Reports (2009), pp. 93–117. 2

[WWXS05] WANG J., WU C., XU Y.-Q., SHUM H.-Y.: Com-bining shape and physical models for online cursive handwritingsynthesis. International Journal of Document Analysis and Recog-nition 7, 4 (2005), 219–227. 2, 8

[XWG∗13] XIA Y., WU J., GAO P., LIN Y., MAO T.: Ontology-based model for chinese calligraphy synthesis. Computer Graph-ics Forum 32, 7 (2013), 11–20. 1

[Zit13] ZITNICK C. L.: Handwriting beautification using tokenmeans. ACM Transactions on Graphics 32, 4 (2013), 53:1–53:8.2

Appendix

Abstract. A person’s handwriting appears differently withina typical range of variations, and the shapes of handwritingcharacters also show complex interaction with their nearbyneighbors. This makes automatic synthesis of handwritingcharacters and paragraphs very challenging. In this paper, wepropose a method for synthesizing handwriting texts accord-ing to a writer’s handwriting style. The synthesis algorithmis composed by two phases. First, we create the multidimen-sional morphable models for different characters based onone writer’s data. Then, we compute the cursive probabilityto decide whether each pair of neighboring characters are

conjoined together or not. By jointly modeling the handwrit-ing style and conjoined property through a novel trajectoryoptimization, final handwriting words can be synthesizedfrom a set of collected samples. Furthermore, the paragraphs’layouts are also automatically generated and adjusted ac-cording to the writer’s style obtained from the same dataset.We demonstrate that our method can successfully synthesizean entire paragraph that mimic a writer’s handwriting usinghis/her collected handwriting samples.

English Character-Pair Analysis for Data Collection. Be-fore performing the data collection as described in Section 3,we conducted an English character-pair analysis to designthe writing task as shown in Figure 2(b) and Figure 2(c).Frequency analysis of English characters compound canbe achieved via various approaches. In this paper, we an-alyze character distributions using the statistics provided byProject Gutenberg‡, which digitized the works of our prede-cessors. Most of the collected books are written in Englishand scanned after their copyright expired. Of all the scannedeBooks, the 36,663 most commonly used words are listed un-til 2006-04-16. To obtain the frequency list of character pairs,we performed a straight frequency count on the commonlyused word list and found 590 pairs of connected characters,where the 159 mostly used pairs covered 90% of total pairs,109 pairs covered 80%, and 78 pairs covered 70%.

To design the writing task, we search for a concise andcomplete paragraph among world literature masterpieces thatcan (1) cover as many common character pairs as possible,and (2) be not too cumbersome for users to write. Chapter12 of The Little Prince was chosen because it features a setof 164 character pairs which covered 82% of the above totalpairs. The additional 15 words we choose can increase thecoverage to be about 90%, see Figure2(b).

Questionnaire in User Study 1

‡ http://www.gutenberg.org/


Data-driven Handwriting Synthesis in a Conjoined Mannerrobin/docs/pg15.pdf · Data-driven...

Documents

Transcript of Data-driven Handwriting Synthesis in a Conjoined Mannerrobin/docs/pg15.pdf · Data-driven...