Jul 28 2009

Weasel Words “May Not Function Properly”

As a Mac user and developer, I’ve been looking for good alternatives to DropBox for online backup and sync, since I found myself dragging all kinds of configuration files and directories into DropBox and then linking to them from their original paths. This move-and-link scheme becomes too awkward to bear as more and more scattered files and directories get involved.

Today, I stumbled upon SugarSync, which seems very promising. Its most outstanding advantage over DropBox is the flexibility to back up or sync any folder. You can see the detailed comparison between SugarSync and other prominent competitors here.

However, just before I set out to download SugarSync Manager 1.6.2 for Mac (Beta), I glanced at the Known Issues ^(The more software you use, and especially the more software you develop, the warier of software you become.)$, and one of them scared the pants off me. It said:

If you are using a mac that has been formatted with a case-sensitive filesystem instead of the default case-insensitive filesystem, SugarSync may not function properly due to incorrect case of paths.

I am definitely one of those implied idiots who formatted their effeminate Macs with case-sensitive filesystems just to look more masculine. Fine, in fact, I have been a Linux user for six years. Anyway, what on earth makes you developers resist case-sensitivity so much that you would rather turn away your fellow Unix developers ^(I don’t think Windows users care about case-sensitivity because they have never had it, and weren’t all Unix users (before Mac OS X) developers?)$? Are you as arrogant as Adobe?

I am also a coward who dares not take a risk. What do you mean by “may not function properly”? ^(Maybe, by “may not” instead of “will not”, they mean it works well even on case-sensitive filesystems as long as there are no files or directories distinguishable only by case?)$ Will it damage my original data, or just refuse to back it up? And why don’t you just say “SugarSync does not support case-sensitive filesystems (right now), so do not use SugarSync on them.”?
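If the footnote's guess is right, the practical question is whether a folder contains any paths that differ only by case. Here is a minimal sketch of such a check; it says nothing about what SugarSync actually does, the function name case_collisions is made up for illustration, and it assumes paths contain no newlines and only cares about ASCII letters.

```shell
#!/bin/sh
# Read one path per line on stdin and print every path that collides with
# another path when letter case is ignored.
case_collisions() {
    LC_ALL=C sort | awk '
        {
            key = tolower($0)
            if (key in seen) {
                # Print the first path of the group once, then the newcomer.
                if (!(key in reported)) { print seen[key]; reported[key] = 1 }
                print
            } else {
                seen[key] = $0
            }
        }'
}
```

A typical call would be something like `find ~/Documents | case_collisions`; empty output means no two paths in that tree differ only by case.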

What if I took the risk and lost my data? Would you then disclaim “We have warned you that it may not function properly. You made the decision. You take the responsibility.”?

I think there is a basic moral code that a properly functioning vendor should follow:

  • For cases where your product functions perfectly, proudly announce them and dutifully guarantee them.
  • For cases where your product does not function perfectly, describe the deficiencies clearly.
  • For cases you are not sure about, admit candidly that even you are not sure, and advise users against using the product in those cases.
  • Never make any cunning disclaimers, because, you see, weasel words “may not function properly”, or may even hurt your prospects.

Jul 24 2009

Meta Backup DreamHost on DreamHost Backups

DreamHost provides a snapshot backup service. However, this backup service seems a little shabby:

You may request a backup once every 30 days.

Due to some technical limitations, any users or databases greater than 4GB will not be included in these backups!

We also recommend that you maintain your own off-line backups locally.

Of course, they ought to complement such a feeble backup service with something. It turns out that they do. DreamHost offers one Backups User per account (I’ll simply call it DreamHost Backups):

At DreamHost, you may only keep website-related content on your regular users.
You do, however, get one user per account where anything legal may be stored; your Backups User.

This user cannot have any websites pointed to it, nor may you share files via it… it is only to be used as an off-site backup for your personal files.
As such, we keep no backups of files on this account. These are already supposed to be your backups… not your only copy!
(Of course, you should always keep your own copies of all data stored with us.. we make no guarantees!)

Every full DreamHost Hosting plan includes 50GB of backups space!
(Additional usage will be charged at the rate of 10 cents / GB a month: the best backup deal on the net!)

You may access your backup user via ftp, sftp, scp, and rsync!

Surely, anything legal includes your legal web sites on DreamHost. And 50GB, in my opinion, is large enough for personal or small business web sites ^(If you are running web sites larger than 50GB, I’m sure you know everything I’m talking about here much better than me, so this article is obviously not for you.)$. So why not a meta backup of your DreamHost web sites on DreamHost Backups? ^(Notice that DreamHost Backups are hosted on different servers from web hosting servers, which makes much more sense for our meta backup.)$

Let’s just do it. You are sure to get your hands dirty along the way, so be warned before reading on.

Mainly, I grabbed the pith from the clever Snapshot-Style Backups:

rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
rsync -a --delete --link-dest=../backup.1 source_directory/  backup.0/

and adapted it to off-site backup, which entails more effort.

The core application used is rsync, which is fortunately supported by DreamHost Backups.

But rsync alone is not enough to build multiple versions of a snapshot incrementally. We also need at least two other shell commands: mv and rm.

Neither of them is available due to the lack of shell support on DreamHost Backups. What we have are only ftp, sftp, scp, and rsync, remember?

So, what we can do now is just make good use of what we have. In fact, we’ll need to make extreme use of two of them.

rsync is one. Which is the other? scp can be eliminated first, since copying is already handled perfectly by rsync. Of the remaining two, I’ll play it safe. So, sftp.

All ftp clients can do renaming; they just rename mv to rename (quirk intended). To our great joy, sftp supports batch mode, which means we can put all the renames in a script and feed it to sftp, which then cranks through it dutifully without any further interference from us.
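The batch file does not have to be written by hand. Here is a minimal sketch that prints the rename commands for any number of snapshots; the function name make_sftp_renames is an assumption of this sketch, and the backup.N naming follows the snippet above.

```shell
#!/bin/sh
# Print sftp "rename" commands that shift backup.0 .. backup.(N-1) up by one,
# oldest first, so that no rename lands on a name that is still in use.
# $1: index of the oldest snapshot to keep (3 in this article).
make_sftp_renames() {
    i=$1
    while [ "$i" -gt 0 ]; do
        printf 'rename backup.%d backup.%d\n' "$((i - 1))" "$i"
        i=$((i - 1))
    done
}
```

Running `make_sftp_renames 3 > ~/backup_utils/sftpcmds` would produce the three-line batch file used later in this article. Note that sftp in batch mode aborts at the first failing command unless that command is prefixed with "-", which matters on the very first runs, when the older snapshots do not exist yet.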

rm is the only knot left, which turns out to be a really hard one.

sftp supports rm, but only for files; it also supports rmdir, but only for empty directories. What we need is recursive rm ^(lftp supports recursive rm though not very smoothly; besides, it is not pre-installed on DreamHost.)$, and we are going to simulate one, by rsync, of course.

NOTICE: this is the only valuable part of this fluff, so attention please.

rsync -avz --delete --include='/backup.3' --exclude='/*' any_source_dir_without_backup.3/ backups_user@backups_host:path

It is mind-boggling, I know. So let me explain it:

  • '--delete': deletes from the destination whatever is missing at the source;
  • '--exclude='/*'': skips all top-level files/directories when comparing source and destination, except for;
  • '--include='/backup.3'': the one to be deleted. It must come before the exclude-all rule, because rsync applies the first matching rule; otherwise it would be excluded first and never even be checked for inclusion;
  • because '--delete' affects only what is missing at the source, '/backup.3' must be missing there.
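The trick can be tried out locally before pointing it at the real backups host. Here is a minimal sketch under a few assumptions: rsync is installed, both sides are local temporary directories, and the names demo, empty and dest are made up for this illustration. The source is an empty directory given with a trailing slash, so that its contents (nothing) are compared against the top level of the destination.

```shell
#!/bin/sh
# Build a throwaway destination holding two snapshots.
demo=$(mktemp -d)
mkdir -p "$demo/empty" "$demo/dest/backup.2/site" "$demo/dest/backup.3/site"
echo newer > "$demo/dest/backup.2/site/index.html"
echo old   > "$demo/dest/backup.3/site/index.html"

if command -v rsync >/dev/null 2>&1; then
    # /backup.3 is included and missing at the source, so --delete removes it.
    # Every other top-level entry is excluded, which also protects it from deletion.
    rsync -a --delete --include='/backup.3' --exclude='/*' "$demo/empty/" "$demo/dest/"
    remaining=$(ls "$demo/dest")
else
    remaining="rsync not installed"
fi

echo "$remaining"
rm -rf "$demo"
```

If rsync behaves as described above, only backup.2 should be listed afterwards, with its contents untouched.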

Another point that may be of interest: you had better give rsync a proper filter to use when syncing. It will save you both bandwidth and disk space.

I exclude my locally installed software (~/bin, ~/include, ~/lib, ~/share, ~/var), sensitive information (~/.ssh/, ~/.my.cnf; actually, I exclude all dot files under my home directory, i.e., ~/.*, since it is the web sites, not the development host, that are being backed up), and other stuff that makes no sense to back up, like source packages of the software (~/soft), temporary swap files of Vim (*.swp), and backup files spewed by Emacs (*~). I put all these exclude rules in a file and pass its path as the value of the '--exclude-from' option to rsync.
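Before trusting a filter file with a real home directory, it may be worth checking it against a small fake one. Here is a minimal sketch, assuming rsync is installed; the directory layout and the names demo and kept are invented for this illustration, and the filter file contains only a subset of the rules described above.

```shell
#!/bin/sh
# A fake home directory with one file worth keeping and several to skip.
demo=$(mktemp -d)
mkdir -p "$demo/home/bin" "$demo/home/.ssh" "$demo/home/site" "$demo/dest"
echo tool  > "$demo/home/bin/tool"
echo key   > "$demo/home/.ssh/id_rsa"
echo page  > "$demo/home/site/index.html"
echo stale > "$demo/home/site/index.html~"
echo swap  > "$demo/home/site/.index.html.swp"

# One exclude pattern per line; a leading slash anchors a pattern to the
# top of the transfer, patterns without one match at any depth.
cat > "$demo/rsync.filter" <<'EOF'
/bin/
/.*
*~
*.swp
EOF

if command -v rsync >/dev/null 2>&1; then
    rsync -a --exclude-from="$demo/rsync.filter" "$demo/home/" "$demo/dest/"
    kept=$(cd "$demo/dest" && find . -type f | sort)
else
    kept="rsync not installed"
fi

echo "$kept"
rm -rf "$demo"
```

With these rules, only site/index.html should make it to the destination.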

That’s all. The other utilities, such as mysqldump for MySQL backup and cron for scheduled automatic backup, are explained very clearly on wiki.dreamhost.com.

Lastly, I’ll wrap all these up into several scripts (assumed to be put in ~/backup_utils), for you to take away:

mybackup
#!/bin/sh
# Remember to make it executable, via "chmod +x ~/backup_utils/mybackup".
 
# dump mysql: the password is stored in ~/.my.cnf which is not to be backed up for security
mysqldump -u mysql_user -h mysql_host -A >| mysql_dump_path
 
# rsync and sftp both use ssh whose keys are stored in ~/.ssh/ which is not to be backed up for security
# delete the oldest
rsync -avz --delete --include='/backup.3' --exclude='/*' any_source_dir_without_backup.3/ backups_user@backups_host:path
# rename olds to olders
sftp -b ~/backup_utils/sftpcmds backups_user@backups_host:path
# send the new
rsync -avz --exclude-from="$HOME/backup_utils/rsync.filter" --delete --link-dest='../backup.1' ~/ backups_user@backups_host:path/backup.0/

sftpcmds
rename backup.2 backup.3
rename backup.1 backup.2
rename backup.0 backup.1

rsync.filter
/bin/
/include/
/lib/
/libexec/
/share/
/soft/
/var/
/.*
*~
*.swp

crontab
# daily backup
0 0 * * * ~/backup_utils/mybackup

PS: While I am writing this article about DreamHost backup, my backup script is backing up drafts of the article. What a meta backup!


Jul 20 2009

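Housewarming, or Exile

With Blogger being blocked intermittently, I finally could not put up with it any longer. I happen to have a $9-a-year hosting account on DreamHost with nothing serious to do at the moment, so I have moved the blog here.

It still feels great to call the shots on my own turf.

However, since I haven’t enabled SSL, some keywords still cause trouble. Just now, for example, while tidying up the Categories and Tags, I got cut off because I had used “freedom”. In the end I could only finish the update by proxying through an SSH tunnel. Lesson learned: I have wiped words like China and freedom out of the Categories, once and for all. Nobody really wants to air the family’s dirty laundry; it is only that being pestered beyond endurance makes one curse a few times.

I will soon have more time at my own disposal, and I will write more. I hope more friends will drop by.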

So, I’m wangling you.


Jul 5 2009

How content changes of ContentProvider propagate to ListView via Cursor

I know there must be an { Observable->Observer, Observable->Observer, … } chain from ContentProvider to ListView (or general AdapterView), but I didn’t know how exactly this chain is composed.

I put off scratching that itch until recently, when I set out to implement an OrderedMergeCursor ^(I will talk about it in a future post.)$, which should be a subclass of MergeCursor that merge-sorts pre-ordered cursors.

It turns out that the chain is quite long, and kinda entangled by Cursor’s two notions of data: Content (ContentObservable->ContentObserver) and DataSet (DataSetObservable->DataSetObserver).

As I see it, Cursor’s Content is the source that Cursor retrieves data from, while Cursor’s DataSet is the data that Cursor has retrieved. This will become clearer in the notification chain below.

For simplicity, I will omit the enclosed ContentObservable/DataSetObservable of Cursor and Adapter, since they are just one more layer of indirection without any real functionality. You can think of Cursor and Adapter as Observables themselves.

Without further ado, here comes the untangled register chain and the opposite notification chain.

  • Register Chain:
    ListView.setAdapter(CursorAdapter):
      Adapter.registerDataSetObserver(AdapterView.AdapterDataSetObserver)
        ^
        ^
    CursorAdapter <- CursorAdapter(Context, Cursor):
      Cursor.registerContentObserver(CursorAdapter.ChangeObserver)
      Cursor.registerDataSetObserver(CursorAdapter.MyDataSetObserver)
        ^
        ^
    Cursor <- ContentResolver.query(Uri, String[], String, String[], String):
      ContentProvider.query(Uri, String[], String, String[], String):
        AbstractCursor.setNotificationUri(ContentResolver, Uri):
          ContentResolver.registerContentObserver(Uri, boolean, AbstractCursor.SelfContentObserver)
  • Notification Chain:
    ContentResolver.notifyChange(Uri, ContentObserver) --->
    AbstractCursor.SelfContentObserver.onChange(boolean) --->
    AbstractCursor.onChange(boolean) --->
    CursorAdapter.ChangeObserver.onChange(boolean) --->
    AbstractCursor.requery() --->
    CursorAdapter.MyDataSetObserver.onChanged() --->
    BaseAdapter.notifyDataSetChanged() --->
    AdapterView.AdapterDataSetObserver.onChanged() --->
    View.requestLayout()

Jul 2 2009

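Dream Jobs

  1. Making animation at Disney
  2. Making games at Blizzard
  3. Making computers at Apple
  4. Making the web at Google

None of them beats being my own boss at home.

So I will first see whether I can live a free life of which I am the master; if that does not work out, I will go hunting through the list above.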