Tag Archives: Computer problems

A Kuhnian look at the iPad

In case anyone has missed it: Apple announced its new tablet computer last week. They call it the iPad. This has released a massive wave of comments of all kinds across the internet. Some have called the device over-hyped and under-performing, others have hailed it as the device that will change mobile computing, and still others haven’t got the point at all (Stoke’s comparison is hilarious; he obviously doesn’t see what Apple is aiming for). But I don’t aim to get into detail on whether the iPad is a good or bad product, at least not before I’ve touched one myself. Instead, I will use two iPad-related articles as a starting point for a discussion on the paradigms of computer environments, in a Kuhnian manner.

The philosophy of Kuhn

Thomas Samuel Kuhn was a physicist who is best known for his work on the theory of science and the book “The Structure of Scientific Revolutions”. His main idea was that science periodically undergoes what he called paradigm shifts. The new paradigm introduces a new set of concepts and interpretations that can explain problems left unsolved by the previous paradigm(s). The new paradigm can also bring a number of problems that scientists have never considered before. This paradigm view contradicts the common perception that science progresses linearly and continuously. Kuhn was, for example, sceptical about textbooks of science, which he thought distorted how science was carried out. He also argued that science can never be done in a completely objective way; instead, findings will always be interpreted within the present paradigm, making our conclusions subjective.

The iPad

The iPad combines the touch interface of the iPhone and iPod Touch with a larger screen and more powerful hardware, which enables desktop-like applications to be run on the tablet. While this is not revolutionary in itself, Dan Moren of MacWorld argues that the philosophy of the iPad user interface is revolutionary for how we use computers in general. He concludes: “Regardless of how many people buy an iPad, it’s not hard to look forward a few years and imagine a world where more and more people are interacting with technology in this new way. Remember: even if it often seems to do just the opposite, the ultimate goal of technology has always been to make life easier.”

Steven F takes it one step further in his article reflecting on the iPad, and argues that “In the Old World, computers are general purpose, do-it-all machines (…) In the New World, computers are task-centric”, and concludes his introduction with the words: “Is the New World better than the Old World? Nothing’s ever simply black or white.” Steven then makes a very interesting argument about why task-centric, easy-to-use computers will slowly replace today’s multi-purpose devices.

Both of these articles are very much worth reading and bring up numerous important views on the future of computing, and I highly recommend reading them thoroughly. Even though they both use the iPad as a starting point, Apple’s new gadget is not really their subject of study. Instead, what both Steven and Dan are reflecting on is the future of computer user interfaces, which is a much more important subject than the success of the iPad, or whether it is a good product.

The Old World Paradigm

To use Steven’s terminology, we have for the last 20-25 years resided in the Old World Paradigm of computing. This paradigm is what people usually think of as the normal graphical interface of a computer. It may be Windows, Mac OS X or any common Linux distribution – they are all essentially used in more or less exactly the same way: the way the original Macintosh, introduced in 1984, was used. Not much has changed really. Using my MacBook today is essentially the same as using my Mac Plus in 1990. It’s just easier to carry with me, slightly faster, and (importantly) has access to the internet. But I interact with it using basically the same metaphors: windows, menus, desktop icons, a pointer, buttons etc.

Before the Mac, most computing was done using command-line tools in a DOS or Unix environment. While this was convenient for many purposes, it was a huge abstraction to the new computer user, and scared people off. I still have access to the Unix shell on my MacBook, through the Terminal application, which I use frequently in my bioinformatics work. But as computers became more common, this kind of abstraction was becoming a great wall. A crisis in computer interface development started to surface. And according to Kuhn, a crisis feeds … a new paradigm. In 1984, Apple started off this paradigm, using techniques partially taken from research done at the Xerox labs. Five to ten years later, most computer users were taking part in this graphical paradigm (using Windows 3.1, Windows 95, or Mac System 6 and 7).

The things that scared people about the command line were to a large extent solved by the point-and-click metaphors. However, a lot of people still find computers hard to use. Computers need virus cleaning and re-installation. They get bogged down by running too many applications, and having them on for too long results in memory fragmentation. Just keeping the computer running is a hard task for many people. This creates another distraction, further fuelled by the addition of extra buttons and extra functions directed at power users. Such extra features are just confusing for new computer users. And while Mac OS X and Linux are not as haunted by viruses and malware as Windows is, they are still very complex. Desktops get cluttered quickly by documents, and the screen is too small to handle the windows of five applications running simultaneously. With increasing computing power, a lot of extra functionality is added, which many times just obscures the main task of the computer. A new crisis is emerging.

The New World Paradigm: An era of simplicity

Apple sees this crisis, and also has a solution for it. They call it: simplicity. For Apple, the most important thing is not whether we can get access to the latest technology and play around with its internals. Apple wants its average users to never worry about the internals of the device. It should just work. Steven nails it like this: “In the New World, computers are task-centric. We are reading email, browsing the web, playing a game, but not all at once. Applications are sandboxed, then moats dug around the sandboxes, and then barbed wire placed around the moats. As a direct result, New World computers do not need virus scanners, their batteries last longer, and they rarely crash, but their users have lost a degree of freedom. New World computers have unprecedented ease of use, and benefit from decades of research into human-computer interaction.” This means that we computer-savvy guys of the old paradigm will lose something. We lose our freedom to tinker. But this loss comes with great gains.

Kuhn would say that personal computing has reached a new crisis, which opens the way for a new paradigm. Apple is among the first companies to try to create a device that defines this paradigm, but they are not alone. Google’s Chrome OS aims at the same thing – to define the next paradigm in computing. And both Apple and Google are willing to bet that the kids born today, who never saw the Old World Paradigm of computing, will never miss it. They will never ask what happened to the file system metaphors, the window metaphors, and the multi-tasking of today’s computers, because they will never have seen them. Instead, they will ask how we could stand using the buggy and unstable computers we have today.

Apple redefined the smart phone three years ago. They definitely have the potential to redefine the experience of computing. Not that this new paradigm would mean no more Unix command-line tools. Not that it will mean that the current desktop computers will immediately die out. But what we first think of as a computer in ten years might very likely be a much more task-centric device than the laptops we use today. And even though this is a loss of freedom, it will surely be a great gain in usability. Until the next paradigm comes along…

Solving problems in seconds

Sometimes the solution to a problem lies much closer at hand than you expect. In my work I usually do the same task repeatedly with between 6 and 50 files. Even though Unix is very efficient in many ways, this still takes time to do by hand. I have thought of various ways around that problem, including using wildcards (*), but was never fully satisfied. But this week, I finally came up with the simplest solution so far. And it took only a minute or two to implement. I don’t know why I didn’t think about this a year ago. Maybe I thought that I would only do these repetitive tasks a couple of times. I was wrong, but thanks to Perl I can now be much more efficient (and write this instead of typing Unix commands…). The good thing about my implementation (in my opinion) is that it’s so flexible. Here’s my code; please comment if you feel that there are more efficient ways. Every “{}” in the command is replaced by the current number, once for each number in the range:


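#!/usr/bin/perl
use strict;
use warnings;

## LOOP COMMAND
my $versionID = "1.0";
print "LoopCommand\n";
print "Version $versionID\n";
print "Written by Johan Bengtsson, October 2009\n";
print "-----------------------------------------\n";

## GET USER INPUT
print 'Execute command: ';
chomp(my $command = <STDIN>);
print 'From number: ';
chomp(my $start = <STDIN>);
print 'To number: ';
chomp(my $end = <STDIN>);

## EXECUTE
# Run the command once for each number from $start to $end
for (my $i = $start ; $i <= $end ; $i++) {
    # Replace every "{}" in the command with the current number
    my $exec = $command;
    $exec =~ s/\{\}/$i/g;
    # Run the command in the shell and print whatever it wrote to standard output
    my $result = `$exec`;
    print $result if defined $result;
}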
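
To make the substitution step concrete, here is a minimal sketch of what happens to the command for each number. The helper name expand_command and the file names are made up for illustration and are not part of the script above. The sketch only prints the expanded commands instead of running them, which can be handy as a dry run before letting the loop loose on real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): replace every "{}"
# in the command template with the given number.
sub expand_command {
    my ($template, $number) = @_;
    (my $expanded = $template) =~ s/\{\}/$number/g;
    return $expanded;
}

# Dry run: print the commands for files 1 to 3 instead of executing them.
# The file names are assumptions chosen for illustration.
for my $i (1 .. 3) {
    print expand_command('sort input{}.txt > sorted{}.txt', $i), "\n";
}
```

Once the printed commands look right, the same template can be typed at the “Execute command:” prompt of the script above.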

LogoMat-M, or how I started to hate source code and opted for precompiled binaries

LogoMat-M and its uses
I have recently struggled to install a bioinformatics program called LogoMat-M. LogoMat-M is a command-line program that creates visual representations of HMM profiles. An excellent example of the program in action can be viewed at Sanger’s LogoMat-M website. It creates images that look a bit like this:

The resulting images make it easy to interpret how common a given amino acid is at each position of a sequence alignment, where the alignment usually represents a protein family. So far, so good.

The problem is that the web service was not designed to work with large numbers of sequences, and thus returns nada when such sequence alignments are used. To solve this problem, I thought I would try to install the program locally, on my own computer, at least to receive a proper error message. This was a big mistake.

The “install” process
I started by downloading the LogoMat-M package (i.e. the source code – this is open source software, which often means that there are no pre-compiled binaries). However, the build files for the program complained that my computer was missing certain libraries and programs required for the LogoMat software to compile. Well, alright, I went out to find the pieces of missing software. Quite quickly I tracked down the two missing components and downloaded them. Once again, these were open source programs – meaning no pre-compiled binaries. I tried to compile the first of those and rapidly got the answer that a component called PDL was required and could be obtained via a service called cpan.

I started to get a bit frustrated, since I didn’t want to spend the whole day installing software – I wanted to construct images like the one above. However, I did as the instructions said and text started flashing down my screen. Suddenly, cpan exited and said “Could not compile. Compiler returned bad status.” Wow. How informative! How do you expect me to know what caused that?! So, now I was stuck. I could not compile LogoMat because I was missing another program that was required, and I couldn’t install that program because I lacked a component that wasn’t, for some unknown reason, able to compile.

Now, the big problem here is that there is no way for me to get around this, because the documentation does not mention this kind of situation. I could, of course, contact the developers, but I was on a tight time schedule, and needed this to work. It was possible, if not likely, that it would take days for the developers (who do not get paid for this software, i.e. there is no official support channel) to sort out my problem.

Again, a mentality problem
Many times, open source software is praised for being open, but what people tend to forget is that a lot of this software is not at all easy to use. Or, in this case, even install. On Windows or Mac OS X, I would have fired up an installer, which would have installed a working pre-compiled binary on my system, with all its required libraries. It would work out-of-the-box. And if it didn’t, there would be someone to call.

Now, I don’t want to call for open source developers to set up call centres for supporting their programs; that would just be ridiculous. But I beg you to please make pre-compiled, working versions, including required libraries, and supply these for at least the most common platforms. Depending on the kind of software, that could be Windows, Mac OS X, Ubuntu and Red Hat Linux, for example. Don’t bother with pre-compiled software for strange and uncommon architectures; people running these things probably know how to compile their software anyway. But please, supply some easy-to-use, pre-compiled program for the rest of us. Because otherwise we will never be able to get our work done using open source alternatives, and that does not benefit either our work or the open source community in general. The situation described above only benefits big corporations selling overpriced software. And that is really, really sad.

Microsoft WORD format is not a sequence format

I found this on a bioinformatics info site related to the EMBOSS package. I find the tone of it rather amusing, especially as people usually refer to Word files simply as “text”:

Sequences

Before reading the rest of this document, please note:
Microsoft WORD format is not a sequence format.

Sequences can be read and written in a variety of formats. These can be very confusing for users, but EMBOSS aims to make life easier by automatically recognising the sequence format on input.

That means that if you are converting from using another sequencing package to EMBOSS and you have your existing sequences in a format that is specific for that package, for example GCG format, you will have no problem reading them in.

If you don’t hold your sequence in a recognised standard format, you will not be able to analyse your sequence easily.

What a sequence format is NOT

When we talk about ‘sequence format’ we are NOT talking about any sort of program-specific format like a word processor format or text formatting language , so we are not talking about things like: ‘NOTEPAD’, ‘WORD’, ‘WORDPAD’, ‘PostScript’, ‘PDF’, ‘RTF’, ‘TeX’, ‘HTML’

If you have somehow managed to type a sequence into a word-processor (!) you should:

  • Save the sequence to a file as ASCII text (try selecting: File, SaveAs, Text)
  • Stop using word-processors to write sequences.
  • Investigate a sequence editor, such as mse
  • Investigate using simple text editors, such as pico, nedit or, at a pinch, wordpad

Now, repeat after me:
Microsoft WORD format is not a sequence format

EMBOSS programs will not read in anything which is held in Microsoft WORD files.

So, remember that Word format is not a sequence format, and be careful with your bioinformatics research! Original text found at: http://emboss.sourceforge.net/docs/themes/SequenceFormats.html
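
If you are unsure whether a file is plain text or has sneaked through a word processor, a quick check of its first bytes can tell. The following is a rough sketch of my own, not something EMBOSS provides: the helper name guess_file_kind and the script name in the usage comment are made up, and the check only relies on the well-known file signatures of legacy Word files (OLE compound files) and ZIP archives (which is what .docx files are).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: guess from the first bytes of a file whether it is a
# word-processor document rather than plain ASCII text.
sub guess_file_kind {
    my ($bytes) = @_;
    # Legacy Word (.doc) files are OLE compound files with this 8-byte signature.
    return 'word (.doc)' if $bytes =~ /\A\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1/;
    # .docx files are ZIP archives, which start with "PK\x03\x04".
    return 'zip archive (possibly .docx)' if $bytes =~ /\APK\x03\x04/;
    # Anything other than tab, newline, carriage return or printable ASCII is suspicious.
    return 'binary or non-ASCII' if $bytes =~ /[^\t\n\r\x20-\x7E]/;
    return 'plain ASCII text';
}

# Usage (script name is an assumption): perl check_format.pl my_sequences.fasta
if (@ARGV) {
    my $file = shift @ARGV;
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    read($fh, my $head, 4096);    # the first 4 kB are enough for a guess
    close($fh);
    print "$file: ", guess_file_kind($head // ''), "\n";
}
```

A result of “plain ASCII text” does not prove that the file is in a valid sequence format, only that it is not obviously a word-processor file.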