Unison and replicating TWiki
A talk for the Toronto Perl Mongers
[email protected]
24 Nov 2005
(Many slides herein are ‘replicated’ from other presentations found on the internet)
About me
Background
  BSc / MSc Computing Science (Newcastle University, UK, 1994)
  Masters of Business Administration (Melbourne Business School, Australia, 2004)
  Ex-programmer for Reuters International Technology Product group
  Ex-manager at Arthur Andersen
Interests
  Concept Mapping
  Wiki
  Organisational Design, Innovation
  Middleware (Message Oriented: MQ, TIBCO Message Brokers)
Currently working at Helix Commerce International
  http://www.helixcommerce.com
  http://www.collaborationcommerce.com
  Specialists in Collaboration, Social Network Analysis, Portal work
  Draws on a large Network of experts in the Greater Toronto Area
Motivation
  Backup my wiki on the hosting provider
  Wiki on the road
  Sync to my palm top
    Because I want to sync the digimemo
  … and because it's fun
Read Write Offline Wiki
A read-write offline wiki is for people in the field who need to change content while offline. (Topic started in OfflineWiki)
Pros:
  Can edit and search content while offline.
  Webs shared this way cannot be censored or controlled by a single individual or small group of individuals without significant implementation overhead. MS
  Can be implemented as a bolt-on and "grown into", rather than increasing the initial install hurdle. MS
Cons/catch:
  Setup issues: a web server and TWiki need to be installed on the client.
  Intelligent merge is necessary, for example a TWikiWithCVS.
  Webs shared in this way lose the ability to perform access control, unless a heavyweight, tightly integrated approach is taken. MS
How:
  Work on content independently on different TWiki installations.
  Synchronize periodically.
http://twiki.org/cgi-bin/view/Codev/ReadWriteOfflineWiki
Other domains where data synchronization is useful
  Synchronizing files on a PDA
  Synchronizing directory services such as an LDAP or Active Directory server
  Any application where there is a mobile workforce
    E.g. synchronizing data in Customer Relationship Management (CRM) and Sales Force Automation (SFA) applications between a compact mobile database on a PDA and a full-size relational database
Today’s toolset
TWiki Dakar
  A rewrite of the guts of TWiki
  Truly free and open source Wiki
  >90% compatible
  A shift from ‘coreteam’ to ‘collaboration’
  Some functional improvements
  Some speed improvements
  Test suites
  Massively restructured
Unison File synchroniser
  Think “My briefcase”, but for Gigabytes of data
  Academic Project by University of Pennsylvania
  Robust, portable and very useful
Wiki
What is a wiki?
A wiki is a website where a community collectively maintains and socializes ideas through a set of named hypertext pages. Uses include:
  Terminology database
  Meeting minutes
  Initiative Status pages
  Personal pages
  What’s going on
Users each have a log-in and are granted access to edit pages.
A recent changes page shows who has made what change.
Every page is revision controlled. Every alteration on a page can be tracked back to the user that made the change.
Subscribers are notified by email as changes occur.
What is a wiki?
A wiki is an online space based on a set of discussion documents interconnected with key words. Participants can add a discussion or can build on each other's thoughts by adding or moving the text around between existing documents. In this way a set of interconnected documents emerges that represents the sum of ideas in play.
The key to collaboration on a Wiki is interjecting and building thoughts at the right place of an existing document - ideas each get their own "watercooler", a place named according to key words where discussions on that topic are 1) collected and 2) continually synthesised.
These key words become common knowledge and become a linkage between verbal and online discussion.
Wikis are characterised by both a 'Recent Changes' page which sorts conversation by the last contribution and a 'Notifications' feature, which allows participants to be notified (by email, messenger) that new topics have been posted or that topics of interest have been modified.
The result is a place where individuals and groups can define, refine and relate terminology. The goal is conceptual integration, such that the terminology evolves to greater inclusiveness and power by integrating a wider number of opinions whilst maintaining higher definitional power.
  Conceptual models can be discussed and refined
  The models provide seed for new ideas
  Ideas are fleshed out until socially accepted
  New stakeholders can see and resurrect previous discussions
Many Wikis to choose from
TWiki KWiki MediaWiki TikiWiki … 100s
Unison
http://www.cis.upenn.edu/~bcpierce/unison/index.html
Unison (v 2.7.7)
  GNU General Public License
  Runs on both Windows (95, 98, NT, and 2k) and Unix (Solaris, Linux, etc.)
  User-level program
  Single executable
  Deals with symlinks, file permissions, modtimes, uids, etc.
  Deals with updates to both replicas
    Non-conflicting updates propagate automatically
  Works between any pair of directories
  Works between any pair of machines
    direct socket link
    tunneling over rsh or ssh
  Resilient to failure
  Tuned for high-speed (ethernet) and medium-speed (PPP) connections
  Uses the rsync algorithm
  ~15k lines of OCaml
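The rule "updates on either replica propagate, as long as they do not conflict" can be sketched in a few lines of Python. This is an illustrative model only, not Unison's implementation: the function `reconcile`, the path-to-content dictionaries and the `archive` snapshot of the last synchronised state are all assumptions made for the example.

```python
def reconcile(archive, replica_a, replica_b):
    """Classify each path by comparing both replicas with the last synced state.

    Each argument maps path -> content; a missing key means the file is absent.
    Returns (actions, conflicts), where actions maps path -> "a->b" or "b->a".
    """
    actions, conflicts = {}, []
    for path in sorted(set(archive) | set(replica_a) | set(replica_b)):
        old = archive.get(path)
        a = replica_a.get(path)
        b = replica_b.get(path)
        if a == b:
            continue  # replicas already agree, nothing to do
        if a != old and b != old:
            conflicts.append(path)  # both sides changed: leave for a human
        elif a != old:
            actions[path] = "a->b"  # only A changed: propagate to B
        else:
            actions[path] = "b->a"  # only B changed: propagate to A
    return actions, conflicts


# Hypothetical topic files, named after the examples used later in the talk.
archive = {"MyTopic.txt": "v1", "AnotherTopic.txt": "v1", "Clash.txt": "v1"}
replica_a = {"MyTopic.txt": "v2", "AnotherTopic.txt": "v1", "Clash.txt": "va", "New.txt": "x"}
replica_b = {"MyTopic.txt": "v1", "AnotherTopic.txt": "v1b", "Clash.txt": "vb"}

actions, conflicts = reconcile(archive, replica_a, replica_b)
print(actions)
print(conflicts)
```

The point of keeping the archive is that a difference between the replicas alone does not say which side changed; comparing each side with the last agreed state does.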
Banging your head against a wall uses 150 calories an hour.
Synchronization (a simple example)
As long as they do not conflict
A more interesting example
Three reasonable possibilities
Heterogeneity
System Architecture
Robustness
Merging
Controlling Unison
By either:
  Profiles
  Command line switches
Text only
  Useful for batch
  Used by SyncContrib
GUI
  Fantastic for debugging
  Needed for reconciling conflicts
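As a rough illustration of the command line route, here is a Python sketch that assembles a text-mode, batch Unison invocation. The helper `unison_command` and both root paths are hypothetical; the switches (`-ui text`, `-batch`, `-logfile`) are the same ones that appear in the SyncContrib configuration in this talk. The sketch only builds and prints the command, it does not run Unison.

```python
import shlex


def unison_command(root_a, root_b, batch=True, logfile=None, extra=()):
    """Return the argument list for a text-mode Unison run between two roots."""
    args = ["unison", root_a, root_b, "-ui", "text"]
    if batch:
        args.append("-batch")  # run without asking questions
    if logfile:
        args += ["-logfile", logfile]
    args += list(extra)
    return args


# Hypothetical roots: a local data directory and the same web on a server.
cmd = unison_command(
    "/srv/twiki/data/Sandbox",
    "ssh://example.org/twiki/data/Sandbox",
    logfile="sync.log",
)
print(shlex.join(cmd))
```

Keeping the arguments as a list rather than one string avoids quoting problems if the list is later handed to `subprocess.run`.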
Unison File Synchronizer
1. Create a profile that maps the Source path and the Synch path.
   (Diagram: Source Directory, Synch Directory)
File Synchronizer Tools (continued)
2. Some changes are applied to the Source Directory.
   (Diagram: Source Directory, Synch Directory)
File Synchronizer Tools (continued)
3. Update the non-conflicting structure.
A reconciler
Options
Sync Contrib
Problem Definition
  Typically laptop* and server
  Typically cross-platform
  Inserts, deletes, updates could happen on either server
  Want to make updates made on either available on both
* Fairly tricky to install on a laptop
The problem
What I set out to solve
  Backup from a TWiki server to my laptop
  Using a free tool
  That could run automatically
What I actually solved
  Two way synchronisation of TWiki
  Free tools
  That can be run interactively or in the background
  “ReadWriteOfflineWiki”
But… it's not perfect
Two Wiki servers
(Diagram: TWiki A and TWiki B, each with Unison, linked by SSH access; the call goes through ssh to unison -server)
  Linux Debian Sarge @ Dreamhost, Debian Perl 5.8, TWiki (any with TWiki Store 1.0)
  Windows XP on my laptop, IndigoPerl 5.8, TWiki Dakar
Unison does have a server mode too
(Diagram: Laptop and Hosting Provider stacks; labels: Apache, ModPerl, Perl, IndigoPerl)
  C:\moreprgs\Unison\Unison.exe
  C:\moreprgs\IndigoPerl\apache
  C:\moreprgs\IndigoPerl\perl
Multiple sites merged into one composite site
On TWiki I use nested webs
(Diagram: Webs inside sites. Sites A, B and C, on Servers 1, 2 and 3, contain Webs A to D. In the composite site each source site becomes a web (Site A web, Site B web, Site C web) and its webs become SubWebs A to D.)
Inside each web is pub (attachments) and data
  Syncs each directory separately
    Unison GUI appears multiple times
  Does not use profiles
    Your changes in the GUI have no lasting effect
  Ignores templates
    So skins too
  Ignores bin
    So functionality too
  Ignores lib
    So plugins too
Do plugins really belong in the same hierarchy as TWiki?
Unison
(Diagram: the same layout on both replicas, with Unison between them)
data
  Web/
    MyTopic.txt
    AnotherTopic.txt
pub
  Web/
    MyTopic/
      AttachmentA.jpg
      AttachmentA2.doc
    AnotherTopic/
      AttachmentB.gif
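Because topics live under data and attachments under pub, each web has to be synchronised as two separate directory pairs. A minimal Python sketch of that enumeration follows; the function `root_pairs` and the example roots are hypothetical, and only the data/pub split and the web names come from the slides.

```python
def root_pairs(client_root, server_root, webs, dirs=("data", "pub")):
    """Yield (local, remote) directory pairs: one per web per TWiki tree."""
    for web in webs:
        for tree in dirs:
            yield (f"{client_root}/{tree}/{web}", f"{server_root}/{tree}/{web}")


# Hypothetical installation roots on the laptop and on the server.
pairs = list(root_pairs("c:/twiki", "ssh://example.org/twiki", ["Sandbox", "People"]))
for local, remote in pairs:
    print(local, "<->", remote)
```

Two webs therefore mean four synchroniser runs, which is why the GUI appears several times when the sync is run interactively.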
Sync Contrib
  A script loosely coupled to TWiki
  Knows about the layout of TWiki’s filesystem
  Sets up the command line options for unison
  Writes out a temporary intermediate launcher script
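To make the "temporary intermediate launcher script" concrete, here is a Python sketch that writes a one-line batch file which opens the SSH session through plink. It is not SyncContrib's own code: the function `write_launcher`, the host, the user and the key path are hypothetical. The plink switches used (`-i`, `-l`, `-ssh`) follow the launcher line shown in the appendix of this talk.

```python
import os
import tempfile


def write_launcher(path, host, user, key_file, remote_cmd="unison -server"):
    """Write a batch file that starts the remote command over plink."""
    line = f'@plink {host} -i "{key_file}" -l {user} -ssh {remote_cmd}\r\n'
    # newline="" keeps the explicit CRLF ending that Windows batch files expect
    with open(path, "w", newline="") as handle:
        handle.write(line)
    return line


# Hypothetical values; the file goes into a fresh temporary directory.
launcher = os.path.join(tempfile.mkdtemp(), "plinkLauncher.bat")
line = write_launcher(launcher, "example.org", "me", r"c:\keys\PuttyPrivateKey.ppk")
print(line.strip())
```

The resulting file is what would be handed to Unison through its `-sshcmd` switch, as in the appendix.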
Sync Contrib configuration
# the default unison profile must not exist
# you must have set up a shared key
# pageant must be running
# designed to be run on a windows box
# Common debug options
# SYMPTOM:
# Two or more files on a Unix system have names identical except for case.
# They cannot be synchronized to a Windows file system.
# DIAGNOSIS
Sync Contrib: Sample Configuration File
$UnisonSync::options = {
    'syncAccounts' => ['TestAccountRemote'],
    'clientServerPrivateKey' => 'c:\\Documents and Settings\\Martin Cleaver\\PuttyPrivateKey.ppk',
    'clientRoot' => 'c:/moreprgs/indigoperl/apache/TWiki',
    'plinkExecutable' => 'c:\\program files\\putty\\plink.exe',
    'plinkTempLauncherScriptFile' => 'c:\\temp\\plinkLauncher.bat',
    'clientUnisonExecutable' => 'c:\\moreprgs\\unison\\unison-2.12.15-win-gui.exe',
    'serverUnisonExecutable' => 'unison',
    'unisonStdErrFile' => '%timestamp%-%accountName%.stdErr',
    'unisonCaptureErrors' => '2> %unisonStdErrFile%',
    'unisonCaptureLog' => '%timestamp%-%accountName%.uniLog',
    'unisonLogfile' => '%timestamp%-%accountName%.log',
    'unisonUimode' => '-ui text',
    'unisonOptions' => '%unisonBatchMode% %unisonDebugMode% %unisonUimode% -logfile %unisonCaptureLog% %unisonIgnoreTWikiHistory% %unisonTimeOutProtection%',
    'unisonIgnoreTWikiHistory' => '-ignore "Regex ^.*,v$"',
    'unisonTimeOutProtection' => '-sshargs "-o ServerAliveInterval=60"',
    'unisonDebugMode' => '-debug none', # none, all, update
    'unisonBatchMode' => '-batch', # e.g. -prefer roota or -force roota (see docs)
Config… cont.
    'dataDir' => 'data',
    'pubDir' => 'pub',
    'protocol' => 'ssh',
    'accounts' => {
        'TestAccountRemote' => {
            serverRoot => '/home/mrjc/cairotwiki.mrjc.com/twiki',
            serverAccount => 'mrjc',
            serverSite => 'mrjc.com',
            webs => ['Sandbox'],
            debug => '2',
            dryrun => '',
        },
        'TestAccountLocal' => {
            # Overrides the config keys above
            clientRoot => 'c:/temp/syncContribTest/TWiki',
            serverRoot => 'c:/moreprgs/indigoperl/apache/TWiki',
            webs => ['Sandbox', 'CleaverSite'],
            unisonOptions => '-backups',
            debug => '2',
            dryrun => '',
        },
Config… cont.
        'Palm' => {
            serverRoot => 'd:',
        },
        'WikiConsultingSite' => {
            serverRoot => '/home/mrjc/wikiconsulting.com/twiki',
            serverAccount => 'mrjc',
            serverSite => 'mrjc.com',
            webs => ['WikiConsulting', 'People', 'PublicWikis', 'KmSurvey2003', 'PublicWikisOverview', 'WikiHistory', 'WikiHosting', 'WikiSuccessStories', 'WikiTechnologies'],
            debug => '0'
        },
    }
};
1;
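The configuration relies on two ideas: `%name%` placeholders that refer to other keys, and per-account keys that override the global ones. The Python sketch below shows one plausible way both could work. It is an assumption about the behaviour, not SyncContrib's actual code: `expand`, `account_options` and the `timestamp` value are made up for the example, and the option names are a trimmed copy of those in the sample file.

```python
import re

PLACEHOLDER = re.compile(r"%(\w+)%")


def expand(template, options, max_depth=10):
    """Replace %key% placeholders from options until nothing changes."""
    for _ in range(max_depth):
        expanded = PLACEHOLDER.sub(
            lambda m: str(options.get(m.group(1), m.group(0))), template
        )
        if expanded == template:
            break  # unknown placeholders are left untouched
        template = expanded
    return template


def account_options(options, account):
    """Merge the global keys with one account's keys; the account wins."""
    merged = {k: v for k, v in options.items() if k != "accounts"}
    merged.update(options["accounts"][account])
    return merged


options = {
    "unisonBatchMode": "-batch",
    "unisonUimode": "-ui text",
    "unisonCaptureLog": "%timestamp%-%accountName%.uniLog",
    "unisonOptions": "%unisonBatchMode% %unisonUimode% -logfile %unisonCaptureLog%",
    "accounts": {
        "TestAccountRemote": {"webs": ["Sandbox"]},
        "TestAccountLocal": {"unisonOptions": "-backups"},
    },
}

remote = account_options(options, "TestAccountRemote")
remote.update(timestamp="20051124", accountName="TestAccountRemote")
print(expand(remote["unisonOptions"], remote))
print(account_options(options, "TestAccountLocal")["unisonOptions"])
```

Expanding repeatedly is what lets `unisonOptions` refer to `unisonCaptureLog`, which in turn refers to the timestamp and account name.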
Insurance (i.e. in the eventuality of no connection)
Filesystem Replication vs. Messaging Middleware
  Messaging middleware is another way
    Some free implementations
    E.g. MQ
    Need to build more into the application from the outset
  Not needed if you are just doing backup
  Changing data under an app can be a bad thing
  RSS might work
Broader Issues
What TWiki does to make this hard
  The files are in two places (data, pub): you have to synchronise both
    They always go together, so really they ought to be stored together
  Some random files that you don’t want to synchronise:
    .changes
    Thumbs.db files
Broader problems:
  Synchronising user identities
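The files to leave out can be expressed as a small filter. The Python sketch below is illustrative only: `should_sync` and `IGNORED_NAMES` are hypothetical names, while the regular expression is the one used by `unisonIgnoreTWikiHistory` in the sample configuration to skip the `,v` revision history files.

```python
import re

IGNORED_NAMES = {".changes", "Thumbs.db"}
RCS_HISTORY = re.compile(r"^.*,v$")  # same pattern as unisonIgnoreTWikiHistory


def should_sync(path):
    """Return False for files that should not be replicated."""
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    return name not in IGNORED_NAMES and not RCS_HISTORY.match(path)


for candidate in [
    "data/Sandbox/MyTopic.txt",
    "data/Sandbox/MyTopic.txt,v",
    "data/Sandbox/.changes",
    "pub/Sandbox/MyTopic/Thumbs.db",
]:
    print(candidate, should_sync(candidate))
```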
Summary: Where we are today
  Text and Attachment replication
  No conflict resolution
    But the right hook is there in TWiki::Merge
Practical applications
  Useful for backup
  Useful for publishing
  Useful for laptop work, publish off-line
Need an ssh account on the server => open access
  Gain the rights of that user
There is a server mode in unison but it does not permit simultaneous connections
Future
  Store implementations will change
    Need to sync between different types of store
    Different webs on same system could have different storage methods (1 yr +)
  Tell TWiki when files are dumped under it
    Automatic attachments (TWiki will attach a file dumped in its attach directory)
What TWiki does to make this easy
  No database
  Everything is a human-readable and human-fixable flat file
  Hierarchical Webs
Limitations
  Doesn’t handle merges well
  Can’t handle more than client-server
  Merges transactions. E.g.
    Server A: Revisions made to version 1.1 make 1.2, 1.3
    Server B: No revision made
    Replicate
    1.2 & 1.3 are seen as a single change on Server B, making version 1.2
    Server A is at 1.3. Server B is at 1.2
  Can’t do conflicts
    Server A: Revision made to version 1.1 makes 1.2
    Server B: Revision made to version 1.1 makes 1.2 (different)
    Replicate
    Conflict files are made on both machines
    Copies no longer replicate until conflict files are dealt with
  What it actually transmits is the head revision
    As long as the HEAD revision on both machines has not changed, it replicates the newer
  If it detects a conflict, Unison will call a predefined Merge.
    Presently I don’t have any call hooked into merge.
  Conflicts are just dumped side by side on the disk.
    These are not visible from a distance
    Need a halt-on-conflict switch
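The "merges transactions" limitation follows directly from shipping only the head revision. The Python sketch below models it: a history is a list of (revision, text) pairs, and `replicate_head` copies just the newest text across as one new revision. Both functions are made up for the illustration and say nothing about how TWiki's store numbers revisions internally.

```python
def next_revision(rev):
    """Return the revision after rev, e.g. "1.1" -> "1.2"."""
    major, minor = rev.split(".")
    return f"{major}.{int(minor) + 1}"


def replicate_head(source, target):
    """Copy only the newest text of source into target, as ONE new revision."""
    _, head_text = source[-1]
    target_rev, target_text = target[-1]
    if head_text != target_text:
        target.append((next_revision(target_rev), head_text))


server_a = [("1.1", "first"), ("1.2", "second"), ("1.3", "third")]
server_b = [("1.1", "first")]

replicate_head(server_a, server_b)
print(server_a[-1][0], server_b[-1][0])  # revision numbers now differ
print(server_b)
```

After replication the two servers hold the same text, but Server B has lost the intermediate revision and the numbering no longer lines up.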
Limitations
(Diagram: replication topologies: Peer to peer, Client Server, Backup and Federated, drawn with clients C1 to C3 and servers S, S2, S3)
Simultaneous updates leave questions:
(Diagram: Topic’ compared with Topic’’)
In what order should the updates be placed into the resulting topic?
Future of SyncContrib
  Transaction log transmission
    Instead of shunting the HEAD copy of client/server back and forth, pass the set of changes made since last revision
    E.g. Server A on Revision 1.1 makes Revisions 1.2, 1.3
      Server B up to date with Server A’s Revision 1.1
      Server B should get 1.2 + 1.3 diff from Server A
  Make config and logging web-visible
    Write to a wiki-compatible file format
    Expose as an attachment
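Transaction log transmission could look roughly like the following Python sketch: given a topic's history and the last revision the other side has seen, produce one diff per later revision. The function `changes_since` and the sample history are hypothetical; the diffs come from the standard library's `difflib.unified_diff`, not from TWiki's own revision store.

```python
import difflib


def changes_since(history, last_synced):
    """Return [(revision, unified diff lines)] for revisions after last_synced."""
    revisions = [rev for rev, _ in history]
    start = revisions.index(last_synced)
    log = []
    for (prev_rev, prev_text), (rev, text) in zip(history[start:], history[start + 1:]):
        diff = list(
            difflib.unified_diff(
                prev_text.splitlines(),
                text.splitlines(),
                fromfile=prev_rev,
                tofile=rev,
                lineterm="",
            )
        )
        log.append((rev, diff))
    return log


history = [("1.1", "Hello"), ("1.2", "Hello\nWorld"), ("1.3", "Hello\nWiki World")]
log = changes_since(history, "1.1")
for rev, diff in log:
    print(rev)
    print("\n".join(diff))
```

Server B, which is up to date with 1.1, would receive two entries (1.2 and 1.3) and could replay them in order, keeping the revision history intact.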
Appendices
How to make a Unison key
How to create a shared SSH PuTTY (Windows) - SSH (Unix) key
  Install PuTTY and add the folder to the PATH environment variable.
  Generate OpenSSH keys with puttygen.
    I left the config at SSH2 RSA.
    Add a private passkey phrase
    Save the private key in "c:\documents and settings\Martin Cleaver\PuttyPrivateKey.ppk"
      For some reason it did not ask for an extension but I did this to match the instructions below
  Start pageant
    An icon appeared in the bar next to the clock
    I right clicked this, did add key and picked my PuttyPrivateKey.ppk
    Putty confirmed that it had loaded this.
  Back in puttygen, copy and paste the generated public key and append it to $HOME/.ssh/authorized_keys on the server. This will be a one-line entry as required by OpenSSH.
    chmod og-w $HOME/.ssh/authorized_keys
  To keep PuTTY's pageant running, start it from the Start Menu with the private key as parameter:
    pageant.exe "D:\somefolder\putty-key-jerry.ppk"
See http://twiki.org/cgi-bin/view/Codev/UnisonKeySetup
Notes
  You can get a copy of your public key at any time; reload puttygen, pick the private key and enter your password.
Create a launcher script*
  Put this:
    @plink mrjc.com -i "c:\Documents and Settings\Martin Cleaver\PuttyPrivateKey.ppk" -l mrjc -ssh unison -server unison -server -contactquietly
  into a separate file, called plink-mrjc.bat, and invoke:
    C:\moreprgs\unison>unison.exe c:\moreprgs\indigoperl\apache\TWiki\data\Sandbox ssh://mrjc.com/home/mrjc/cairotwiki.mrjc.com/twiki/data/Sandbox -sshcmd plink-mrjc.bat
*In my opinion this launcher script should be completely unnecessary.
References on TWiki.org
http://twiki.org/cgi-bin/view/Codev/DataAndCodeSeparation
Timeline
(Diagram: timeline from 2000 to 2006 marking Joachim’s algorithm, my first comment on this topic, Sparks syncs with Rsync, and Unison)
Replicated slides…
Normal Unison setup
Unison Profiles Example
# Unison preferences file
batch = true
log = true
times = true
prefer = newer
servercmd = bin/unison
rshargs = -i E:\home\.ssh\identityU
include ignore
root = E:\home\working
root = ssh://[email protected]/working
----- cut here ------
ignore = Name {*~,.*~,.xvpics,*.o,*.tmp,tmp,temp,*.out}
A fish has a memory span of 3 seconds,
this explains why they move a lot.
Multi-configuration synchronizations with Unison
  Figure out and make Unison configurations for which files / directories:
    Almost never change
    Change while working
    Change frequently
    Need to update when you move
  Create a non-password ssh key pair
  Adjust authorized_keys:
    command="bin/unison -server",no-pty,\
    no-port-forwarding,no-X11-forwarding 1024 35 1..3
  Now make some simple scripts like: (umain)
    unison palm
    unison docs
    unison working
By law, it is illegal to eat oranges while bathing in California.
Other Sync methods reported at TWiki.org
RSync: One way sync for master-slave TWiki
  Rsync was used to great effect at Inktomi many years back for a significant time period (>12 months of active use) to replicate data between sites, and worked very well. One limitation of the approach taken was that you could only have one master site for each web, but that was resolved by simply making the edit and attach links point at the appropriate master site for that web.
  Michael Sparks – who got the rsync version to work
How to handle merged numbering?
So if there's a revision history like this:
  1.1 1.2 1.3A 1.4A 1.3B 1.4B 1.5A
then the author of 1.5A will get a message with the deltas of 1.2->1.5A and 1.2->1.4B, with a request to merge the changes into a new, common 1.6 revision.
What if the 1.4A author is lazy (or ill or was run over by a bus) and ignores the request? Assume somebody has a 1.4B version on his machine and adds a change; in the moment that his new 1.5B change is distributed, a conflict will arise and the author of 1.5B will get a conflict resolution request.
http://twiki.org/cgi-bin/view/Codev/ReadWriteOfflineWiki -- JoachimDurchholz - 22 Nov 2000
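The deltas in that proposal are taken from the last revision both branches share. A small Python sketch, using the revision names from the quote, shows how the common ancestor and the two deltas could be found. The functions `common_ancestor` and `merge_request` are invented for the illustration and are not part of TWiki.

```python
def common_ancestor(branch_a, branch_b):
    """Return the last revision shared by two histories, or None."""
    ancestor = None
    for rev_a, rev_b in zip(branch_a, branch_b):
        if rev_a != rev_b:
            break
        ancestor = rev_a
    return ancestor


def merge_request(branch_a, branch_b):
    """Return the (from, to) deltas an author would be asked to merge."""
    base = common_ancestor(branch_a, branch_b)
    return [(base, branch_a[-1]), (base, branch_b[-1])]


branch_a = ["1.1", "1.2", "1.3A", "1.4A", "1.5A"]
branch_b = ["1.1", "1.2", "1.3B", "1.4B"]

print(common_ancestor(branch_a, branch_b))
print(merge_request(branch_a, branch_b))
```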
Other tools
RSYNC (v 2.4.6)
  Pushes new and changed files or directories to remote machine
  Free
  Unidirectional
  To get a kind of bi-directional transfer: use the ‘--update’ option (NTP!!)
    rsync -auvz ~/ othermachine:
    rsync -auzv othermachine:/ ~
  Uses rsh (can use ssh)
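The "NTP!!" warning is worth a tiny worked example. With --update, rsync skips files that are newer on the receiving side, so the decision rests entirely on modification times. The Python sketch below models only that comparison (it does not call rsync); the function name and the timestamps are made up, and the clock error is a hypothetical two minutes.

```python
def skipped_by_update(sender_mtime, receiver_mtime):
    """True when an --update style rule leaves the receiver's file alone."""
    return receiver_mtime > sender_mtime


laptop_edit = 1000.0         # true time of the laptop edit, in seconds
server_edit = 1060.0         # the server edit really happened a minute later
server_clock_error = -120.0  # hypothetical: the server clock runs two minutes slow

# Pushing laptop -> server: is the later server edit protected?
kept_with_good_clocks = skipped_by_update(laptop_edit, server_edit)
kept_with_skew = skipped_by_update(laptop_edit, server_edit + server_clock_error)
print(kept_with_good_clocks, kept_with_skew)
```

With synchronised clocks the newer server copy survives; with the skewed clock it looks older and is overwritten, which is why the two machines need to agree on the time.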
It's impossible to sneeze with your eyes open.
RFS vs. RSync vs. Unison
(Chart: synchronization time for a constant file size. Y axis: time in seconds, 0 to 25. X axis: number of files, labelled 100K 1000, 100K 100, 100K 10, 100K 1. Series: RSYNC, Unison, RFS.)
Reconcile
  Mitsubishi Electric Research Laboratories
  Does everything Unison does (mostly)
  Still in research
  Requires you to become a collaborator to get a testing copy of it
"Any sufficiently advanced bureaucracy
is indistinguishable from molasses."
More File Synchronizer Tools
  MS BriefCase (Microsoft), xSync (VBX System), FileTiger (Science Translations Software)
  File Tiger
  High-Speed-Drive
Filesystem Synchronization Tools
rsync (v 2.4.6) http://rsync.samba.org
unison (v 2.7.7 stable) http://www.cis.upenn.edu/~bcpierce/unison
Cfengine (v 2.0.a14) http://www.iu.hio.no/cfengine/
Reconcile (internal) http://www.merl.com/projects/reconcile/
Other replication technologies:
  Synchronizing FTP Files with Perl - http://www.linuxjournal.com/article/6686
  Thread discussing some alternatives
    http://www.misticriver.net/boards/archive/index.php/t-1543.html
    http://www.talkaboutshareware.com/group/alt.comp.freeware/messages/362086.html
    http://www.tgrmn.com/web/kb/showall.htm
A duck's quack doesn't echo, and nobody knows why.