In trying to finish transitioning from my old backup drive to my new backup mini thumper, I ran across a file with this content:
-take car in to get checked
-order lion king tickets
-clean room
+do laundry
+organize honeymoon
continue working on g4g saap psd
write more for guestlist application
call mom
It’s an old todo list. The modification date on the file is 8/10/02. Yup, exactly 8 years before my daughter was born.
I used to keep files like this around quite a bit. A minus sign means that I haven’t done that particular item and a plus means I’ve done it. I don’t know what the significance of a leading space is.
At this time in my life, I was:
just over three months away from getting married (on 11/9)
just under a month away from losing my job (on 9/2, because my employer went out of business)
and three days away from my 24th birthday
I’m glad to report that I did get the Lion King tickets (we went later that year), I did clean my room (at least as clean as it could get), and I’m sure I called my mom. I’m also glad I had organized the honeymoon by that time.
I’ve been using git for a while now, and I’m just getting to the point where I can think in it.
It’s the same as learning a new spoken language. I took three years of Spanish in high school, so I knew most of the rules and could translate back and forth to English, but I never really learned to think in Spanish (as opposed to thinking in English and then quickly translating). And, until you start thinking in Spanish, you are not on the path to full fluency and you’ll be constantly hampered whenever you try to talk with a native speaker.
I came to git from the world of subversion (a world that I’m quickly wishing I could fully leave). So my first forays into git-dom were guided by the SVN Crash Course. This allowed me to get access to git repositories quickly, but it kept me thinking in subversion. I wanted to use git more idiomatically. So I read and I read, I cloned and fetched and pushed and merged and after a while, I began to feel like I was really understanding git.
Last week, when I was chatting with a couple of my programmer geek friends about git and version control, I got to thinking about what has helped me the most in my learning experience. Here’s what I came up with.
Really understand how git stores its data.
Git stores its data as a directed acyclic graph, which is just about as simple as you can get. Each commit points to its parent commit, which in turn points to its parent, and so on. A new node is created for every commit and a merge just points at all of its parents. Combine this knowledge with the fact that branches are just named pointers to nodes in this graph and you start to have more confidence when it comes time to change things around.
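To see the graph for yourself, here is a minimal sketch that builds a throwaway repository and inspects it. The identity, file name, and branch name are made up for the demo; only standard git commands are used.

```shell
#!/bin/sh
# Sketch: build a tiny throwaway repository and look at how commits and
# branches relate. The identity, file, and branch names are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Throwaway identity so the commits work on an unconfigured machine.
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -am "second commit"

# A branch is just a name for a commit: creating one adds no new commit.
git branch experiment
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse experiment)
echo "HEAD:       $head_commit"
echo "experiment: $branch_commit"

# Each commit records its parent; HEAD~1 names that parent node.
parent_commit=$(git rev-parse HEAD~1)
git cat-file -p HEAD | grep '^parent'

# Show the whole graph with branch names attached.
git --no-pager log --graph --oneline --all
```

Both names print the same hash, because the new branch is only a second pointer at the commit that already existed.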
Learn the “get out of jail” command: git reset --hard
That’s all you need to undo almost anything you might have done. This command takes the current branch and points it at the commit you referenced.
For instance, if you want to revert a branch back to where it was at the last fetch:
$ git reset --hard origin/branchname
Or, if you decide that the last three commits should be forgotten:
$ git reset --hard HEAD~3
Or, if you don’t like the result of that interactive rebase:
$ git branch before
$ git rebase -i HEAD~3
# decide the rebase was bad
$ git reset --hard before # all better
Don’t be afraid to experiment.
One of the best things about a distributed version control system is that you are free to mess around with your repository all you want and no one will know about it. Go ahead. Rebase your commits to be cleaner, rearrange branches to make them easier to work with, or rewrite every commit because you forgot to set your email address. It doesn’t matter at all until you share those changes with other people.
Also, almost nothing in git will delete data. In truth, it’s rather hard to permanently remove a commit. Because of this, even if something happens that you don’t like, you can use ‘git reset’ to make it all go away. For instance, if you have a repository with a dozen commits and you want to change the committer email, when the commits are rewritten, the old commits are still there in the database, waiting to be resurrected if you change your mind.
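As a rough illustration of how hard it is to lose a commit, the sketch below throws away two commits and then gets them back through the reflog. It runs in a scratch repository, and the identity and file name are placeholders.

```shell
#!/bin/sh
# Sketch: discard commits with "git reset --hard", then resurrect them.
# Runs in a scratch repository; the identity and file name are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

for n in 1 2 3; do
    echo "change $n" >> notes.txt
    git add notes.txt
    git commit -q -m "commit $n"
done
tip_before=$(git rev-parse HEAD)

# "Forget" the last two commits.
git reset -q --hard HEAD~2

# The reflog still lists every commit HEAD has pointed at.
git --no-pager reflog

# HEAD@{1} is where HEAD was just before the reset, so this undoes it.
git reset -q --hard 'HEAD@{1}'
tip_after=$(git rev-parse HEAD)
echo "before: $tip_before"
echo "after:  $tip_after"
```

The discarded commits stay reachable from the reflog until git eventually expires those entries and garbage-collects the objects, which leaves plenty of time to change your mind.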
This book was instrumental in helping me move from beginner to intermediate. I recommend reading it cover to cover, even if you don’t ever plan on writing a hook or running ‘git filter-branch’. Just seeing how it all fits together helps inform day to day decisions when using git.
I’m a big fan of data visualization. Something about taking huge swaths of numbers and reducing them down to a set of conclusions or messages is very intriguing.
For a while, I’ve been consuming blogs and articles related to data visualization, so my head is full of theories with not much practice. So here is my feeble first foray into the dataviz space. The data comes from a post on flowingdata.com. It presents one way of looking at just how little water (percentage-wise) is available to drink and asks if there is a better way to depict the information. This is what popped into my head when I read the question, and hopefully that idea has translated well:
[Update] It looks like this only really applies to USB flash drives. When I mounted my actual backup drive, it showed up in prtpart. This post was written using the root drive on my old backup server, which is a SanDisk Cruzer flash drive.
Now that I finally got my mini thumper up and online, it’s time to pull everything from my previous backup drive. The problem is that it’s a USB drive with an ext3 partition on it. I did a little googling and found several references to using the belenix FSWpart and FSWfsmisc packages, with this one being the most helpful.
My only problem was that when I ran prtpart, it only showed disk information for my non-USB drives. I could see that the drive was recognized by looking in syslog:
root@silo:~# cat /var/adm/messages
Mar 14 12:03:36 silo usba: [ID 349649 kern.info] SanDisk U3 Cruzer Micro 0774920CB281D664
Mar 14 12:03:36 silo genunix: [ID 936769 kern.info] scsa2usb0 is /pci@0,0/pci1462,7418@1d,3/storage@1
...
So, I dug around a bit, trying to look for various names in /dev/rdsk that were in the above output when I stumbled across the fact that everything in /dev/rdsk is a symlink. So I did a quick grep:
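The general idea of searching symlink targets can be sketched as below. This is not the exact command I ran: find_links is a hypothetical helper, and using 'storage@1' as the search string is an assumption based on the device path in the syslog output above.

```shell
# Hypothetical helper, not the command from the original post:
# list the entries in a directory whose symlink target contains a string.
find_links() {
    dir=$1
    pattern=$2
    # "ls -l" prints each link as "name -> target"; grep -F matches the
    # pattern literally, so characters like "@" and "," need no escaping.
    ls -l "$dir" | grep -F -- "$pattern"
}

# On the server this would be called as (assumed search string):
#   find_links /dev/rdsk 'storage@1'
```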
Aha! Now I know what the device name is, so I can use prtpart to figure out what to mount:
root@silo:~# prtpart /dev/rdsk/c11t0d0p0 -ldevs
Fdisk information for device /dev/rdsk/c11t0d0p0
** NOTE **
/dev/dsk/c11t0d0p0 - Physical device referring to entire physical disk
/dev/dsk/c11t0d0p1 - p4 - Physical devices referring to the 4 primary partitions
/dev/dsk/c11t0d0p5 ... - Virtual devices referring to logical partitions
Virtual device names can be used to access EXT2 and NTFS on logical partitions
/dev/dsk/c11t0d0p1 Linux native
And mount it:
root@silo:~# mkdir /mnt/linux
root@silo:~# mount -F ext2fs /dev/dsk/c11t0d0p1 /mnt/linux
root@silo:~# ls /mnt/linux/
bin dev etc home initrd lib lost+found media mnt proc root sbin sys tmp usr var www
After basically copying my friend’s exact specifications, I now have a little server at home with 1.5T of mirrored disk space. By and large it was a straightforward process, with the following interesting tidbits.
Most of the assembly went smoothly. You do have to pull the motherboard out to get the CF drive into its slot. In order to maneuver it out, you have to unclip the SATA cables and unscrew the VGA connector.
You can see the SATA cables snaking up the left and top and the VGA connector is in the lower right (blue). The CF slot is just left of center at the bottom of the picture. Here’s a picture with the drive and RAM installed.
The other issue I ran into was related to the optical drive bay. My first drive slid in and mounted fine in the HD bay, but I was stuck without brackets to properly secure the second drive in the 5.25 inch bay. I could have just put it in and held it with one screw, but after figuring that this is my backup server, I opted to head to Best Buy to pick up the brackets.
When I got there, I was informed that they don’t carry them any more and that I would have to pay a visit to Fry’s. Well, I hate going to Fry’s more than most bad things in life, so I called it a day and decided to figure it out later. Then, earlier this week, Sara and I were walking by a little local computer shop named *techquest. The proprietor was able to dig up a pair of brackets, so I bought them from him.
Yesterday, I finished assembling the hardware and then spent a while trying to figure out how to get it to boot OpenSolaris from the USB drive I had created.
The first problem was that I couldn’t get into the Wind BIOS. I could see it flash something on the screen after the POST beep, but it was cleared far too fast for me to get any information. After rebooting a few times and only catching a few words, I turned my iPhone video camera on it and was finally able to read the information with a well-timed pause.
The rest of my issues revolved around the unique arrangement of boot options in the BIOS and having to remove the stupid U3 stuff from the Cruzer so that it behaved like a simple USB disk, but soon enough I was installing OpenSolaris.
The little box now sits in my entertainment center, ready for me to start transferring data to it.
For a while now, I’ve been backing up the few WordPress blogs that I run for various people with a very simple script that followed this algorithm:
Copy files to a temporary directory.
Dump the MySQL data into a file in that directory.
Tarball it up.
Scp that file to another server that I run.
At the time, I did this because it was the simplest thing that could possibly work. It didn’t depend on any external facility other than mysqldump, tar, and scp.
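The four steps can be sketched as a small shell function. This is a reconstruction of the idea rather than the script I actually ran: backup_site, its arguments, and the archive naming are all hypothetical, and mysqldump is assumed to pick up its credentials from ~/.my.cnf.

```shell
# Hypothetical sketch of the four steps; every argument is a placeholder.
# usage: backup_site SITE_DIR DB_NAME REMOTE_DEST
backup_site() {
    site_dir=$1
    db_name=$2
    remote_dest=$3      # for example user@host:backups/
    stamp=$(date +%Y%m%d)
    tmp=$(mktemp -d) || return 1
    name="$(basename "$site_dir")-$stamp"

    # 1. Copy files to a temporary directory.
    cp -R "$site_dir" "$tmp/$name" || return 1
    # 2. Dump the MySQL data into a file in that directory
    #    (credentials are assumed to come from ~/.my.cnf).
    mysqldump "$db_name" > "$tmp/$name/$db_name.sql" || return 1
    # 3. Tarball it up.
    tar -czf "$tmp/$name.tar.gz" -C "$tmp" "$name" || return 1
    # 4. Scp that file to another server.
    scp "$tmp/$name.tar.gz" "$remote_dest" || return 1

    # Clean up the temporary copy once the upload has succeeded.
    rm -rf "$tmp"
}
```

Every run ships a complete archive, so nightly runs add up quickly on the remote side.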
Well, running that script on a nightly cron filled up my disk allocation on that remote server a couple of times, so I got clever with the backup organization so I could quickly remove old backups while keeping sparser (monthly) backups for longer. This only helped a little: I was still nervous about deleting backups because I didn’t know what they contained.
I also have been using git more and more recently and I liked the idea of version control that can go in any direction. So, in the spare bits of time I’ve had in the past few weeks, I wrote git_backup.pl. It takes a git repository and does the following:
git add <any new or modified files>
git rm <any deleted files>
git commit
git push backup
Now, when the backup is run, only the small changes are sent to the remote server and I can look at the differences by examining the git log.
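In plain shell, those four steps look roughly like the sketch below. It is not the code in git_backup.pl: backup_repo is a hypothetical helper, "git add -A" stands in for the separate add and rm steps, and the demo pushes to a local bare repository instead of a real server.

```shell
#!/bin/sh
# Sketch of the add / rm / commit / push cycle. Not the real git_backup.pl.
set -e

# Hypothetical helper: commit everything in a working tree and push it to a
# remote named "backup".
backup_repo() {
    repo_dir=$1
    message=${2:-"backup $(date '+%Y-%m-%d %H:%M:%S')"}
    (
        cd "$repo_dir"
        # "git add -A" stages new, modified, and deleted files in one pass.
        git add -A
        # Commit only when something is staged, so quiet nights add no commits.
        if ! git diff --cached --quiet; then
            git commit -q -m "$message"
        fi
        # Push the current branch to the "backup" remote.
        git push -q backup HEAD
    )
}

# Demo: a scratch working tree with a local bare repository as "backup".
work=$(mktemp -d)
remote=$(mktemp -d)
git init -q --bare "$remote"
git init -q "$work"
git -C "$work" config user.name "Backup Demo"
git -C "$work" config user.email "backup@example.com"
git -C "$work" remote add backup "$remote"

echo "hello" > "$work/index.php"
backup_repo "$work" "first backup"

echo "hello again" > "$work/index.php"
backup_repo "$work" "second backup"

# The remote now holds one commit per backup run that changed something.
git --no-pager -C "$remote" log --oneline --all
```

A run with no changes creates no commit, so the log on the remote only grows when something was actually different.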
There are options for dumping database tables, changing the commit message and the remote that gets the push. Running “git_backup.pl --man” will show all the options.
The source is (of course) in a git repo: http://git.endot.org/git_backup.git
Being the data and visualization nerd that I am, I’ve been delving into R on occasion. For this purpose, I am using R.app on my Mac. To start it up for a certain working directory (to keep different projects separate), I run “open -a R <working dir>”. This worked great until I noticed that my history wasn’t getting saved to the .Rhistory file in each directory. The command line R executable saves it, but the R.app GUI does not.
So, it took me a little while to figure out that it’s a bug in the R.app code and you have to use a workaround. Open R.app’s preferences and set the “R history file” key to something other than “.Rhistory”:
Now, after a restart, the .nateRhistory file in the working directory is properly updated.