
FOSDEM 2011 Highlights: LLVM & Clang

12 February, 2011

It’s great to be back. I’ve been so busy these last couple of weeks that it might have been a good idea to put up a notice of absence.

Well, after passing some of the nastiest exams (or so I’ve been told), I’ve been in Brussels with the ROSEdu crew. Specifically, for FOSDEM, but everyone knows that was just a pretext to visit Belgium (for me at least).

Well, in trying to keep a professional tone, let me talk a bit about my favorite talks from this year’s FOSDEM. The keynotes are not available online as of today (February 12th), but I’ll try adding a link as soon as possible.

LLVM and Clang

LLVM, although it initially stood for Low Level Virtual Machine, is nowadays an umbrella project for an improved compiler infrastructure. Essentially, tools like gcc and gdb share significant portions of functionality (parsing C/C++ code, for instance), and that work is currently done twice, using two different engines.

The talk, by Chris Lattner (the mastermind of LLVM) himself, was introductory and, although it’s not online yet, here’s a link to an interview he gave for FOSDEM. The amusing part is that you won’t find an actual interview, only a compilation of information he provided himself (because of his employer’s [Apple] policy, Chris couldn’t be interviewed…).

The core libraries are built around the LLVM intermediate representation, LLVM IR, which various compilers (Clang being one of them) can target. So, what happens is that code is compiled to LLVM IR, which is then optimized, and finally CPU-specific code is generated.

What a compiler does (very high level overview) is:

  1. parses the source language into an intermediate representation (like an abstract syntax tree)
  2. transforms the intermediate representation (possibly into other, progressively lower-level intermediate representations, as the code is optimized and simplified to make code generation easier)
  3. generates machine code from the intermediate representation

The first two steps are performed by the front-end and the last step by the back-end. The intermediate representation is the crux of the matter: which one to choose so that parsing and machine code generation can be cleanly split? LLVM offers an intermediate representation that is well tested and quite mature by now (in development for over 10 years) and that can be turned into high-quality machine code. So, with the back-end in place, all that’s needed is a front-end for each supported language.
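
To make this concrete, here’s a tiny example of my own (not from the talk). Assuming clang and the LLVM tools are installed, the two commands in the comment should expose the two halves of the pipeline on any trivial C file:

/* example.c - a minimal function for poking at the pipeline.
 *
 *   clang -S -emit-llvm example.c -o example.ll   # front-end: C -> LLVM IR
 *   llc example.ll -o example.s                   # back-end: LLVM IR -> assembly
 *
 * example.ll holds the optimizable intermediate representation,
 * example.s the CPU-specific code generated from it. */
int add(int a, int b)
{
    return a + b;
}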

This is where Clang comes in. Clang is a C/C++/Objective-C compiler that is faster than GCC, generates faster code and provides more meaningful error messages. As wonderful as all that sounds (and I will try using it from now on; it’s not like my C/C++ code is that complex/exotic), there are cases where GCC and Clang behave differently.

For example, the keyword inline is treated differently by Clang than by GCC, but that’s because, by default, GCC doesn’t fully adhere to the C99 standard. Dealing with inline correctly is surprisingly tricky, apparently… although I had never noticed before; I just thought it was a better way of doing macros. Have fun reading the rules here.
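
To get a feel for why, here’s a small sketch of mine (not from the talk) of the C99 rules: a definition marked only inline is an "inline definition" and does not by itself provide an externally visible function, which is why the extra extern declaration below matters. GCC’s historical "gnu89" default handles these cases differently, which is one of the spots where gcc and clang can disagree; compiling with -std=c99 makes both follow the rules linked above.

inline int square(int x) { return x * x; }  /* inline definition only             */
extern int square(int x);                   /* also provide an external definition
                                               in this translation unit           */

int main(void)
{
    return square(3) - 9;                   /* exits with 0 if all went well      */
}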

At any rate, there are lots of projects that want to compile to LLVM IR, like LDC (a compiler for the D language with an LLVM back-end) or a GHC back-end. The GHC back-end is the one I find most interesting. You can read the thesis I linked to, by David Terei, who did all the work of translating the Cmm representation GHC uses into LLVM IR. At least have a look at the details of GHC’s pipeline: starting from Haskell code, the program goes through HS (an in-memory representation of Haskell on which type checking is performed), Core (a version of lambda calculus with just enough extensions to express Haskell), STG (the Spineless Tagless G-Machine, an abstract machine representation of Haskell which I don’t understand, but which has an awesome name :) and finally Cmm (a variation of the C-- language that represents a Haskell program in procedural form). This last representation is the one converted into LLVM IR.

In addition, LLDB uses libraries provided by LLVM and Clang to implement a more powerful debugger (faster and more memory-efficient than GDB at loading symbols, according to the hype). Also, a new C++ standard library, libc++, is coming down the pipeline: a faster, less memory-hungry one, apparently.

Now, if this project is so dramatically better and everything is so much faster, why isn’t everyone hard at work switching to it? Why don’t you hear about Ubuntu preparing to use it?

Well, as far as I can tell, although the LLVM project is under a less restrictive license than the GPL (i.e. it allows proprietary, binary-only extensions, which should be good, right?), it is still very much a work in progress. For one, C++0x support is unfinished, and only recently (on October 26th, 2010, to be specific) did Clang manage to build a working Linux kernel. Other projects like llvm-gcc (a GCC 4.2 front-end) apparently work, whereas dragonegg (GCC 4.5, GPL3) is still buggy. And projects like libc++ exist (only?) because “Mainline libstdc++ has switched to GPL3, a license which the developers of libc++ cannot use”.

So, since Apple is the main sponsor of this project, it’s pretty clear that they don’t really appreciate the GPL3 (although I’m not sure whether or not it actually affects them; possibly it would force them to open-source parts of Xcode?). As a consequence, all LLVM projects obviously work on Mac OS X, and LLVM has already been used successfully to convert some of the more advanced OpenGL functions not supported by Macs with Intel GMA chipsets into simpler subroutines, to ensure correct operation. (Of course, one can also ask what in the world a shitty Intel GPU is doing inside a Mac; maybe next time they’ll skip the junk hardware?)

LLVM features most prominently in Xcode, Apple’s IDE for Mac OS X and iOS, and it’s quite clear that future versions of both OSes will no longer use GCC. Xcode 4 ships with support for LLVM 2.0, Fix-It (which basically detects possible errors as you edit, thanks to some LLVM magic) and the LLDB debugger.

BSDs will probably follow, pouncing at the opportunity to no longer rely on a GPL3 compiler.

Whether or not this is good in the long-term is still debatable. Some may view Apple’s involvement in the LLVM project with suspicion and there might be some hesitation to switch entire Linux distributions to Clang. Whatever happens, LLVM is clearly here to stay. Hopefully cross-platform support gets better, although right now, it’s pretty clear that Mac OS X is definitely the priority. Here’s to better compilers for everyone!

Oh, and Brussels was a lot of fun!

[Photo: Team ROSEdu @ De Brouckère, Bruxelles]

Balancing Partitions

21 January, 2011

I never liked Dynamic Programming.

In fact, in programming contests, where the pressure is always on winning, on getting it, I would feel immensely stupid for not getting it. Anyway, after reading up a bit on how it works in the CLR, I ended up understanding it a bit better.

This post is about a problem given in this month’s USACO contest (Silver Division), called divgold, where you are tasked with solving the Balanced Partition problem. It’s a fairly well-known example, but I was unfamiliar with it. You can read all the problems by viewing the January contest on the USACO Contestgate; you need to register, though.

Balanced-Partition

You are given N numbers, let’s call them a_1, a_2, \ldots, a_N, and you must find the number of ways to partition them into two groups a_{11}, a_{12}, \ldots, a_{1k} and a_{21}, a_{22}, \ldots, a_{2l} such that:

  • k + l = N;
  • |S_1 - S_2| is minimized, where S_1 = \sum_{i=1}^{k}a_{1i} and S_2 = \sum_{j=1}^{l}a_{2j}.

You must also find the minimum difference itself, m = |S_1 - S_2|.

Let’s make some notes before actually talking about the solution (which, surprise, involves dynamic programming!). First of all, the problem is NP-Complete. This can of course be proven quite easily, and we will actually do it, since the proof contains the key idea that helps solve the problem.

Warning! Theoretical stuff ahead. If you just want to learn how to solve the problem, read the official analysis!

So, when proving that Balanced-Partitions \in NP-Complete we’ll first prove that Balanced-Partitions \in NP by devising a non-deterministic algorithm that solves it.

Note, however, that the problem asks for the number of ways to partition the numbers; it is not a decision problem. We’ll instead redefine it to ask whether you can get a difference of exactly m between the two sets. So we’re completely ignoring the number of ways to get that difference; we’re only interested in whether getting it is possible at all. We’ll see how to actually compute the number of ways later.

Balanced-Partitions \in NP

We’ll use choice, success and fail to describe it. I think the simplest way to build the actual partition is to generate all possible 0/1 assignments and take the sums of the two resulting sets of numbers (those that were assigned a 0, and those that were assigned a 1).


M-Balanced-Partition(A, N, m):
  S1 = S2 = 0
  for i = 1 .. N:
    if choice(0, 1) == 1:
      S1 += A[i]
    else:
      S2 += A[i]
  if abs(S1 - S2) == m:
    success
  fail

So, what this simple algorithm does is assign each element of A either a 0 or a 1: if an element is assigned a 1, it goes into the first set, otherwise it goes into the second one. By generating all of the possibilities using choice, we will clearly determine whether or not obtaining the difference we need is possible.

The complexity of the algorithm itself is \Theta(N), since we partition the array A in one go (lines 3-7). Since \Theta(N) is polynomial, we conclude that Balanced-Partitions \in NP.

Balanced-Partitions \in NP-Hard

Let’s suppose that we want to know whether we can get a certain difference, m. First note that if the two sets are S_1, S_2 (we’ll also use S_1, S_2 when referring to the sums themselves!), and we call S_1 the one with the smaller sum, then S_2 - S_1 = m and S_1 + S_2 = S, therefore S_2 = \frac{S+m}2 and S_1 = \frac{S-m}2. In fact, no matter what we call the two sets, if we can form a set of sum \frac{S+m}2, we’ve clearly obtained the other set as well (it’s this set’s complement!).

So, in fact, deciding whether Balanced-Partitions has a solution is essentially equivalent to deciding whether we can pick a subset with a certain sum, let’s call it Q. This, however, is the Q-Sums problem, which is known to be NP-Complete. So, we’ll try reducing Q-Sums to Balanced-Partitions. If we manage that, then Balanced-Partitions \in NP-Hard: since Q-Sums is NP-Hard, every problem K \in NP satisfies K \le_p Q-Sums, and by the transitivity of the polynomial reduction relation (that’s what the funny \le_p symbol is) it follows that K \le_p Balanced-Partitions as well, i.e. Balanced-Partitions \in NP-Hard.

Warning! Possible rambling ahead. Feel free to skip the following paragraph (Also, the picture is only chuckle-worthy if visiting from Facebook!) : )

Well, these people balance stuff all the time!

It’s important to note that the key element is reducing a problem that is already known to be NP-Hard to our problem. Intuitively, that means our problem is at least as hard as the original one. If we did it in the opposite direction, it would be meaningless! Why? Well, consider the problem (not a decision problem, I know, but I can’t think of a better example; maybe a helpful comment, someone?) of sorting an array of N integers. This clearly has lots of great polynomial-time algorithms that solve it (my favorite being quicksort with a randomized pivot). But we could equally well sort an array of numbers by generating all possible permutations of the numbers and outputting the first one we generate that is sorted. This would run in exponential time though, since generating all the permutations of a set takes exponential time (to be taken with a grain of salt, since sorting is not actually a decision problem). So, reducing a problem we know nothing about to a difficult problem achieves nothing.

Ok, so, if you’re still following at this point, let’s get on with it and do the reduction itself. We must show that we can solve Q-Sums using Balanced-Partitions and that there are no false positives, i.e. for an input i \in I, where I is the set of inputs for Q-Sums, and a polynomial-time function F:I\rightarrow I', where I' is the set of inputs for Balanced-Partitions, we have Q-Sums(i)=1 \Leftrightarrow Balanced-Partitions(F(i))=1.

If we want to find a set of sum Q, we’ve seen that obtaining a difference of m is equivalent to there being a set of sum \frac{S-m}2. So, we’ll just set Q = \frac{S-m}2, i.e. m = S-2Q, and solve Balanced-Partitions for that value of m. The two problems are pretty clearly equivalent, I would say; to prove it, we just go back and forth through the relationship between Q and m.
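
For a concrete (made-up) example: take A = \{3, 1, 1, 2, 2, 1\}, so S = 10. Asking Q-Sums whether a subset of sum Q = 4 exists is the same as asking Balanced-Partitions whether a difference of m = S - 2Q = 2 is achievable: the subset \{3, 1\} sums to 4, its complement \{1, 2, 2, 1\} sums to 6, and the difference is exactly 2.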

Now we’re content: Balanced-Partition is both in NP and in NP-Hard, so we can finally conclude that it is NP-Complete.

Back to the initial problem

Why go through all the trouble of proving that it’s NP-Complete anyway? Well, as a nifty exercise, for one, but it also provides the essential insight that helps solve the initial problem from the USACO contest, divgold. There, we were tasked with finding the minimum difference as well as the number of ways to obtain it. This illustrates the connection between this kind of optimization problem and its associated decision problem pretty clearly: to find the minimum difference, call it d, we essentially start with d = 0 and keep increasing it until we find that Balanced-Partition(A, n, d) = 1.

How far are we supposed to go anyway? Well, the difference between the two sets can be at most S, which is a very loose upper bound (I really think it should be something much tighter, perhaps the maximum value in A, but I didn’t manage to prove this, not at this late hour anyway; I’m not even sure whether it’s true, to be honest). Anyway, the point is that we’ll stop at some point, after at most S+1 candidate differences.

How do we find out whether we can get a particular difference, though? Well, here the insight from the NP-Hardness proof comes in handy: the transformation between Q and m is bijective, which intuitively means we can also use it in the other direction and solve Balanced-Partition using Q-Sums. In English: we want to know whether it’s possible to obtain a sum Q = \frac{S-d}2. What happens if S-d is odd? Well, we certainly can’t get a non-integer sum out of integers, so that value of d is impossible. That doesn’t bother us one bit though, as we’ll soon see.

This is where dynamic programming comes in. Q-Sums has a pseudo-polynomial time algorithm. That simply means there exists an algorithm whose running time is polynomial in the numeric value of the input (here, roughly O(N \cdot S) operations); that numeric value is, however, exponential in the number of bits used to represent the input.

(Neat fact: NP-Complete problems that can be solved by such algorithms are called weakly NP-Complete. As far as I understand, these are the only NP-Complete problems where dynamic programming can be applied. If the numbers in the array A had been real numbers, such a solution would not have been possible. And by real numbers I mean actual real numbers, not IEEE 754 floating point values, which are in fact rational.)

To use dynamic programming, we need to define the structure of an optimal sub-solution. This structure will be used to determine the larger sub-solutions, until we finally have the whole solution! If I mess up the explanation, feel free to use the extremely helpful tutorials of Brian Dean; he has many more dynamic programming examples there that you might find interesting.

So, we’re going to solve Q-Sums. We need the following structure (usually a matrix whose indices adequately describe what exactly a sub-problem is):

N[i][j] = the number of ways to get the sum j using only the first i numbers. To actually solve it, we need a recurrence relation, so we have to think about how to get from one state to the next, either top-down or bottom-up (i.e. either forward or backward). Let’s look “back”, at N[i-1][j]. We’re at the i-th number and we can either choose to add it to the sum or not.

If we don’t add it, then we need to make the same sum j without the current number A[i], so we add N[i-1][j]. Otherwise, to get j=(j-A[i])+A[i] in total, we need N[i-1][j-A[i]]:

Recursion formula – N[i][j]=N[i-1][j]+N[i-1][j - A[i]]
The base case being – N[0][0]=1
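
As a quick sanity check, here’s a tiny, self-contained C sketch of mine that fills the table exactly as described (with a hard-coded toy array rather than the contest input):

#include <stdio.h>

int main(void)
{
    int A[] = {0, 3, 1, 1, 2, 2, 1};           /* 1-indexed to match the formulas */
    int n = 6, s = 10;                         /* s = total sum of A[1..n]        */
    long long N[7][11] = {{0}};

    N[0][0] = 1;                               /* the base case                   */
    for (int i = 1; i <= n; ++i)
        for (int j = 0; j <= s; ++j) {
            N[i][j] = N[i - 1][j];             /* don't use A[i]                  */
            if (j >= A[i])
                N[i][j] += N[i - 1][j - A[i]]; /* use A[i]                        */
        }

    printf("%lld\n", N[n][4]);                 /* number of ways to reach sum 4   */
    return 0;
}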

Note that here we actually count the number of ways to get the desired sum. It could have been turned into the decision problem we talked about if we had only tracked whether each entry is non-zero (e.g. used a Boolean data type and interpreted the ‘+’ operation as logical OR).

At any rate, for an array A of length n, the answer to Q-Sums will be in N[n][Q]. We want to check \frac{S-d}2 in a loop starting from d=0. Also, as noted above, S-d needs to be even.

for (j = 0; j <= s; ++j)
    if ((s - j) % 2 == 0 && N[n][(s - j)/2] > 0)
        break;

And so, the minimum difference will be j and the number of ways to get it will be N[n][(s - j)/2].

Final remarks

You can in fact optimize this solution space-wise quite a lot (although this completely escaped my mind during the contest and I didn’t get full marks for this problem 😦). Notice that we never use all of the rows of the matrix at once, only the last two; and really, we only need one row, since we’re only interested in N[n][\ldots] and all of the updates can be done in place.

So, we can finally write a much better recurrence:

N[j] = N[j] + N[j - A[i]], where N[0] = 1

where N[j] is the number of ways to get a subset of sum j. The only catch is that, for each i, j must be iterated from high to low, so that A[i] is not used more than once.
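
Putting it all together, here’s a minimal sketch in C of the single-row version (my own; the array bounds and the plain stdin/stdout format are assumptions, not the contest’s exact I/O, and the counts are not reduced modulo anything, so they could overflow for large inputs):

#include <stdio.h>

#define MAXN   1000                        /* assumed limits, not the real ones      */
#define MAXSUM 100000

static long long ways[MAXSUM + 1];         /* ways[j] = number of subsets of sum j   */

int main(void)
{
    int a[MAXN], n, s = 0;

    scanf("%d", &n);
    for (int i = 0; i < n; ++i) {
        scanf("%d", &a[i]);
        s += a[i];
    }

    ways[0] = 1;                           /* one way to make the empty sum          */
    for (int i = 0; i < n; ++i)
        for (int j = s; j >= a[i]; --j)    /* downward, so a[i] is used at most once */
            ways[j] += ways[j - a[i]];

    for (int d = 0; d <= s; ++d)           /* smallest achievable difference         */
        if ((s - d) % 2 == 0 && ways[(s - d) / 2] > 0) {
            printf("%d %lld\n", d, ways[(s - d) / 2]);
            break;
        }
    return 0;
}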

You can find all of the implementations here.

Also, dear reader, I need your help with a couple of questions:

  1. What happens (in the NP-Hardness proof) when Balanced-Partition is taken to require partitioning into two subsets whose sums have an absolute difference of at most m? Basically, replace the equality with an inequality.
    I was thinking of calling Balanced-Partition twice, with m and m-1, to see whether you get different answers. If you do, then you might not get the Q in the Q-Sum you were looking for. But does that mean it doesn’t exist?
  2. Is it true that the maximum difference between the two partitions in the optimum case can be no larger than the maximum number in the array A?
    I feel that this is true, but can’t think of any good reason right now. Assistance appreciated 😉

Land of Lisp

15 January, 2011

Today, after a long absence, I’d like to write a small post about an awesome book I found out about from Mihai.

It’s called the Land of Lisp and is about… well… Lisp! What makes it stand out from those dreadfully boring Structure and Interpretation of Computer Programs books they use(d) at MIT or Berkeley is that this one has tons of drawings and is lots of fun to read, even if only as a comic.

So, have a look at his website, which features the most hilarious promotional video ever. And an epic comic as well!

Oh, and be sure to also check out lisperati.com!

How can you resist something that is… made with secret alien technology?

It’s really such a shame that I’ll have my first final exam this Wednesday 😐 … I won’t be able to start reading the book.

I also promise to:

Write about the FDC [Free Development Course]… or Victor will have my head 😀

To Boot a Computer

4 January, 2011

Ah… the 7 days off you get between Christmas and New Year’s Eve every year… ’tis the season to be jolly and all that. And what could possibly be more jolly than learning how booting a computer really works?

But first of all, let me just wish everyone a great 2011! And all bets are now officially open as to how many strange cults commit mass suicide thinking the world will end in 2012 – so get ready!

Anyway, let’s start with the disclaimer, otherwise you’ll be sorely disappointed. I’d like to make it absolutely clear what this is about: a project given to students in an Operating Systems course called COS 318 at Princeton. I’ve linked to the 2004 edition, but there has been one every year and the first project has always been the same: write a basic bootloader.

This bootloader is supposed to load a small kernel, written in assembly language, that runs in real mode and prints “Hello World!” a couple of times on the screen, then halts. Well, not quite: ours prints a couple of numbers and whatnot and then loops.

Still, it’s a neat proof of concept. You can check the code out here and here. The really great part about this small project is that I worked on it with fellow hacker Vlad Bagrin (who doesn’t have a blog yet, since the name vladb.wordpress.com is taken ; ). So, we worked together to patch things up and got it working (mostly).

The course has 6 assignments that some of you might enjoy taking a look at (I lack the time and motivation to bother). We decided to work on the bootloader because we have a Computer Architecture and Assembly Language course this semester and thought a practical assembly language project would be helpful.

So, here’s what we’ve learned about booting a system:

  1. After the POST test and the selection of a boot device, the BIOS loads the first sector of that device at the physical address 0x07c00. The processor starts off in real mode, so this 20-bit physical address can also be written as the segment:offset pair 0x07c0:0x0000 (among other combinations).
  2. At this point, the bootloader takes over; it is responsible for loading the kernel from the boot device (in our case, the kernel always starts at the second sector of the boot device) to 0x01000. It should also set up the kernel’s stack and ideally enter protected mode (that, however, is considerably more complicated, as it requires setting up the Global Descriptor Table and delaying the setup of the stack segment until after that is done).
  3. Finally, the bootloader runs the kernel by performing a long jump to 0x01000.

Without any further ado, here’s the bootblock.s code we’ve written, which works for the basic requirements of this project: loading a small kernel in real mode (that is, a kernel at most 128 sectors large, this being a limitation of the BIOS interrupt we used to copy the kernel, interrupt 0x13, function 2).

Before digging into the code itself, we need to explain how the bootable image is made: the initial project calls for the creation of a binary called createimage, which is responsible for extracting the object code from the ELF binaries that gcc outputs.
This entails reading the ELF files generated by assembling and linking the code. The bootable image doesn’t need the ELF header that tells the OS how to start the program or anything like that; we’re only interested in the program’s segments, i.e. code and data.
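
To give an idea of what such a tool does, here’s a rough C sketch of my own for the segment-extraction part (not the course’s createimage: it skips padding the bootblock to a full 512-byte sector and writing the 0xAA55 boot signature, and it assumes 32-bit ELF input):

#include <elf.h>
#include <stdio.h>
#include <stdlib.h>

/* Append the raw bytes of every loadable segment of an ELF file to `out`. */
static void copy_segments(const char *path, FILE *out)
{
    FILE *in = fopen(path, "rb");
    if (!in) { perror(path); exit(1); }

    Elf32_Ehdr eh;
    fread(&eh, sizeof(eh), 1, in);                  /* the ELF file header           */

    for (int i = 0; i < eh.e_phnum; ++i) {          /* walk the program headers      */
        Elf32_Phdr ph;
        fseek(in, eh.e_phoff + i * eh.e_phentsize, SEEK_SET);
        fread(&ph, sizeof(ph), 1, in);
        if (ph.p_type != PT_LOAD)
            continue;                               /* only loadable segments matter */

        char *buf = malloc(ph.p_filesz);
        fseek(in, ph.p_offset, SEEK_SET);
        fread(buf, 1, ph.p_filesz, in);             /* raw code + data, no headers   */
        fwrite(buf, 1, ph.p_filesz, out);
        free(buf);
    }
    fclose(in);
}

int main(int argc, char **argv)
{
    FILE *out = fopen("image", "wb");
    for (int i = 1; i < argc; ++i)                  /* e.g. ./createimage bootblock kernel */
        copy_segments(argv[i], out);
    fclose(out);
    return 0;
}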

The supplied Makefile (modified for our purposes, though) actually uses gcc for both steps, rather than calling as and ld directly, presumably because the list of parameters would be way too long… it’s either that, or maybe they want to keep the same basic Makefile layout for all subsequent projects, where the kernel is written in C rather than in assembly.
At any rate, the actually interesting bit, as far as I’m concerned, is the following LDOPTS line:

LDOPTS = -nostartfiles -nostdlib --section-start=.text

This tells the linker to: not use the standard system startup files when linking (can anyone perhaps explain what these are exactly?), not use the standard system libraries and only link the files specified explicitly, and, finally, at which address in memory the .text (code) section should start.

There are also some specific options for the compiler, and you’re free to check them out yourself 😉

Moving on… after the ELF binaries kernel and bootblock are created, createimage.given extracts the actual segments (there will in fact only be one per file, since everything is stored in the code segment!), concatenates them and marks the resulting image as bootable.
Afterwards, it’s simply a matter of dd-ing the image to a USB stick, or using Bochs to test it out directly (there’s a bochsrc file in the same repo; you just need to run bochs in that directory and it will take care of everything).

Finally, let’s talk about what the code does, piece by piece, and we’ll finish with a link to the reference implementation, i.e. the one the instructors over at Princeton use for the next project:

	.equ    BOOT_SEGMENT,0x07c0
	.equ    DISPLAY_SEGMENT,0xb800

.text               
	# print_char is defined externally
	.globl    _start    # The entry point must be global
	.code16             # Real mode

_start:
	jmp     over
os_size:
	# Area reserved for createimage to write the OS size
	.word   0
	.word   0
message_testing:
	.asciz	"Testing Bootblock..."
	#.asciz	"" <- not really needed, the 'z' stands for zero-terminated
message_bootfrom:
	.asciz	"Booting from %dl: "
message_jumpready:
	.asciz	"Ready for Long Jump into Kernel..."
message_error:
	.asciz	"Unexpected error occured at interrupt 0x13 : (\r\n"

Oh, and yes, the syntax is AT&T, because we simply could not get the code working with NASM. It seems gcc does some actual magic behind the scenes, and there’s simply no way I know of to instruct gcc to assemble with NASM instead of AS.

There are only a couple of things of interest here:

  1. BOOT_SEGMENT is 0x07c0 instead of 0x7c00, the physical address we mentioned earlier. That is because this value is used as a segment: in real mode a physical address is computed as segment * 16 + offset, so segment 0x07c0 with offset 0x0000 is exactly 0x7c00.
  2. .code16 explicitly instructs AS to generate 16-bit code. After switching to protected mode, 32-bit code is mandatory; for now, though, we’re stuck with 16-bit code.
  3. jmp over jumps over the small part of the code segment dedicated to data. Since at boot time no segment registers are initialized except %cs, everything we need has to live there.

Next, we set up the Stack and Data segments so that we can call functions like print_char and pass pointers to print_string:

	# Allocating Stack Segment of 0x100 [256] bytes
	movw	$0x0a00, %ax
	movw	%ax, %ss
	movw	$0x100, %sp

	# Setting up Data Segment [the boot segment] 0x07c0
	movw	$BOOT_SEGMENT, %ax
	movw	%ax, %ds

and then we fool around with these a bit:

# Print greeting  message
	pushw	$message_testing
	call	print_string
	addw	$2, %sp

	call	print_endl
	addw	$2, %sp

	# Testing print_int
	pushw	$42
	call	print_int
	addw	$2, %sp

	call	print_endl
	addw	$2, %sp

The actual printing functions all depend on print_char, which in turn calls interrupt 0x10, function 0x0e. You can have a look at them here.

Now comes the magic of interrupt 0x13; you’re supposed to set up the registers accordingly.

# Set up the registers for BIOS interrupt 0x13 (the disk-read voodoo)
	movb	$0, %dh		# %dh = head number 0
	movb	$0, %ch		# %ch = cylinder number 0
	movb	$2, %cl		# %cl = start from sector 2 (sector 1 is the bootblock itself)
	movb	$1, %al		# %al = number of sectors to read; this should really be
				# the kernel's size, i.e. something like: movb $os_size, %al
	movb	$0x02, %ah	# %ah = 0x02, the "read sectors into memory at %es:%bx" function

	clc			# clear the carry flag before the call
	int	$0x13		# call the BIOS disk services
	jc	error		# carry set means the read failed

So, it turns out that: %ah = 2 (the function number), %al = the number of sectors to read, %ch = the cylinder number, %cl = the starting sector number, %dh = the head number, %dl = the drive number and %es:%bx = a pointer to the buffer where the data should be placed.
The function sets the carry flag if an error occurred, so we can check whether the kernel was loaded properly and, if everything went well, jump to the new address and run the code.

Although this bootblock works for our kernel (which we also tried writing in C, just to see whether it would work), the fact that it:

  1. doesn’t use os_size (and I’m too lazy to make the change… : )
  2. doesn’t load a kernel larger than 128 sectors (i.e. 1 segment)
  3. doesn’t set up protected mode

makes it pretty useless.

For a better academic example, check the one they supply for the second project, found in the archive start2.zip here.
And, for more on the differences between Intel and AT&T syntax, here’s a great article on IBM developerWorks.

In the end we had fun making it and I hope you had fun reading about it. This probably puts a stop to my OS ambitions… for now at least 😉

Happy New 2011 everyone!

Dan

Higher Order Haskell

28 December, 2010

After lots of fun drawings and examples, I finished the Higher Order Functions chapter of the Learn You a Haskell for Great Good tutorial.

So, you might recall the functions I wrote in Haskell last post; the original ones were written in Scheme. Well, now let’s be even more advanced, and what better way to do that than to use… *cue suspense*… higher order functions!

append :: [a] -> [a] -> [a]
append l1 l2 = foldr (:) l2 l1

reverse' :: [a] -> [a]
reverse' = foldl (flip (:)) []

member :: (Eq a) => a -> [a] -> Bool
member key = foldl (\truth lhead ->  key == lhead || truth) False

size :: [a] -> Integer
size = sum . map (\_ -> 1)

maxelem :: (Ord a) => [a] -> a
maxelem = foldl1 max

minelem :: (Ord a) => [a] -> a
minelem = foldl1 min

set :: (Eq a) => [a] -> [a]
set = foldr (\lhead cset -> if (member lhead cset) == True
                               then cset
                               else lhead : cset) []

nbrelems :: (Eq a) => [a] ->Integer
nbrelems = size . set

remove :: (Eq a) => a -> [a] -> [a]
remove key = filter (/= key)

double :: (Eq a) => a -> [a] -> [a]
double key = foldl (\accum lhead -> if key == lhead
                                       then lhead : lhead : accum
                                       else lhead : accum) []

A couple of conclusions:

  • it is way shorter to use higher order functions – the previous functions totalled 90 lines of code (with comments and all) whereas this code is 34 lines long (in Emacs : ) So, it’s a 2.64x improvement in length;
  • clarity has also improved – the new remove in particular is much better, as are minelem and maxelem;
  • they are kind of unreadable to someone unfamiliar with what these higher-order functions do exactly. But that’s a story for another time (article)…

Haskell Land

26 December, 2010

After playing around a bit with Haskell (I’ve finished the first few chapters in Learn You a Haskell For Great Good which is awesome, I highly recommend it, even if just for the fun drawings), I’ve finally reached the point where I can write the code for the lists I previously played with in To Curse and Recurse.

So, without further ado, here’s the code (and some other functions as well – these were all part of a Theory of Algorithms homework):

append :: [a] -> [a] -> [a]
append (x:l1) l2 = x : (append l1 l2)
append [] l2 = l2

-- this would take O(n^2) time : (
reverse' :: [a] -> [a]
reverse' (x:l) = append l' [x] -- <- the culprit is here, *append*
        where l' = reverse' l
reverse' [] = []

-- this only takes O(n)
reverse'' :: [a] -> [a]
reverse'' (x:l) = reverse_a l [x]
reverse'' [] = []

reverse_a :: [a] -> [a] -> [a]
reverse_a (x:l1) l2 = reverse_a l1 (x:l2)
reverse_a [] l2 = l2

--
member :: (Eq a) => a -> [a] -> Bool
member a (x:l) = if x == a
                 then True
                 else (member a l)
member a [] = False

--
size :: [a] -> Integer
size (x:l) = 1 + (size l)
size [] = 0

-- maxelem [] is not defined! it does not make sense
maxelem :: (Ord a) => [a] -> a
maxelem [x] = x
maxelem (x:l) = max x (maxelem l)

-- different 'tail' version
maxelem' :: (Ord a) => [a] -> a
maxelem' (x:l) = maxelem_a l x

maxelem_a :: (Ord a) => [a] -> a -> a
maxelem_a (x:l) m = maxelem_a l (max x m)
maxelem_a [] m = m

--
minelem :: (Ord a) => [a] -> a
minelem [x] = x
minelem (x:l) = min x (minelem l)

--
set :: (Eq a) => [a] -> [a]
set (x:l)
    | member x s = s
    | otherwise = x:s
    where s = set l
set [] = []

set' :: (Eq a) => [a] -> [a]
set' (x:l) =
     let s = set l
     in if (member x s)
        then s
        else x:s
set' [] = []

--
nbrelems :: (Eq a) => [a] -> Integer
nbrelems l = size (set l)

-- from a different set of exercises
-- but really, these comments are worthless : D
remove :: (Eq a) => a -> [a] -> [a]
remove a (x:l)
       | a == x = l'
       | otherwise = x:l'
       where l' = remove a l
remove a [] = []

--
double :: (Eq a) => a -> [a] -> [a]
double a (x:l) =
       let l' = double a l
       in
        case a == x of
             True -> a:x:l'
             False -> x:l'
double a [] = []

Yes, so congrats to me for reimplementing functions that were already in the Prelude! I feel so very useful 😉 Anyway, this just might be the beginning of a beautiful friendship…

Also, who else is annoyed that Haskell syntax is not supported by WordPress? But, well I’m just thrilled they have syntax highlighting for much more useful and interesting languages like… ColdFusion or ActionScript.

Computability

25 December, 2010

Let’s start with the basics of Computer Science: everything rests on what’s called the Church-Turing Thesis. This thesis describes the fundamental nature of computable functions: what does an effectively calculable procedure look like? Effective calculability for a procedure M (as defined in the Stanford Encyclopaedia of Philosophy entry on the Church-Turing thesis) requires that:

  1. M is set out in terms of a finite number of exact instructions (each instruction being expressed by means of a finite number of symbols);
  2. M will, if carried out without error, produce the desired result in a finite number of steps;
  3. M can (in practice or in principle) be carried out by a human being unaided by any machinery save paper and pencil;
  4. M demands no insight or ingenuity on the part of the human being carrying it out.

Points 1-3 are pretty rigorously defined (a finite number of instructions expressed by a finite number of symbols), but the kicker is (4): no insight or ingenuity on the part of the human calculator. What does insight mean mathematically? And for that matter, what do instructions mean? The Church-Turing thesis relates this notion, which is intrinsically informal, to the formal notion of computability. Computability, however, exists only within a mathematical framework of what a computation actually means, and there are numerous formal models of computation, like the Turing Machine, the Deterministic Finite State Automaton, Lambda Calculus or the Register Machine. All of these models have been proven to be equivalent. Church’s thesis uses lambda calculus, whereas Turing’s thesis uses Turing Machines.

Church’s Thesis

A function of positive integers is effectively calculable only if recursive.

Turing’s Thesis

Logical computing machines [Turing’s expression for Turing machines] can do anything that could be described as “rule of thumb” or “purely mechanical”.

After the development of so many other models of computation, Gandy reformulated the thesis as follows:

Every effectively calculable function is a computable function.

So now, effectively calculable, the informal concept we first described, is asserted to be equivalent to the formal concept of a computable function. Although this cannot actually be proved, Wilfried Sieg lists some arguments for accepting the Church-Turing thesis in a very interesting essay called Church Without Dogma: Axioms for Computability, where he argues that:

  1. All known [effectively] calculable functions are general recursive (general recursiveness is a concept belonging to Gödel, proved by Church and Kleene to be equivalent with lambda-definability, which is itself equivalent to other definitions of computability like Turing-computability).
  2. [Confluence] The variety of mathematical notions, that are quite different in character and for which there are independent reasons to believe that they capture the informal concept of effective calculability turn out to be equivalent.

The most interesting difference between Church’s and Turing’s approaches is that while Church was interested in schemes for calculating the values of number-theoretic functions, Turing looked at the basic symbolic processes that are the building blocks of calculations. Post, a contemporary of Turing, specified such symbolic processes as those that can be carried out by a human worker who operates with an alphabet of symbols (the two digits 0 and 1, for example) and who carries out exactly the same operations a Turing machine does. Turing’s novel perspective comes from arguing that all human mechanical calculations are reducible to a Logical Computing Machine. He does this by determining two essential constraints on the capacity of the human computing agent, call it, as Sieg does, a computor. Indeed, even Wittgenstein observed that in fact:

these machines [Turing Machines] are humans who calculate.

The constraints imposed on general symbolic processes, so that they don’t need to be further subdivided (i.e. so that they represent the most basic steps of computations), stem from the requirement that configurations need to be immediately recognizable by the computor. This essential limitation is motivated by the limited capacity of the computor’s sensory apparatus and yields the following two conditions:

  • Boundedness – a computor can immediately recognize only a bounded number of configurations;
  2. Locality – a computor can change only immediately recognizable configurations.

However, the precise definitions of symbolic configurations and of the changes effected by mechanical operations are not given rigorously. Sieg gives some ideas that can be formulated for computors:

  1. They operate deterministically on finite configurations (whatever those are);
  2. They recognize only a bounded number of different kinds of patterns (configurations);
  3. They operate locally on exactly one of the patterns;
  4. They assemble the next configuration from the original one and the result of the local operation.

We’ll see exactly how this works for a Turing Machine in a future post, but first let’s note that although Turing envisioned his machines as simulating human mechanical calculation, boundedness and locality can apply to an “artificial” machine as well, again due to physical considerations (a very interesting observation by Sieg):

  1. Quantum mechanics’ uncertainty principle ensures a limit on the size of distinguishable components;
  2. The theory of special relativity gives an upper bound for signal propagation.

These two bounds justify boundedness and locality for machines in the same way sensory limitations do for humans.

Now that we’ve talked a bit about the basics of what computation actually is, stay tuned for a future post about some real models. Obviously, this only scratches the surface of such a vast and interesting topic. Google is your friend, and you might enjoy some of the papers I’ve linked to; Sieg’s in particular are of interest (if somewhat dry). Also try the Stanford Encyclopaedia of Philosophy for a more top-level overview and dig into some of the articles mentioned over there. Finally, most of the actual papers Turing and Church wrote are available online, so you might also want to take a look at those.

In closing, happy holidays everyone, and I hope to see you all soon!