So, I’m taking a class this year on high performance computing, and I figured I might as well kill two birds with one stone: write some blog posts, and also get some studying done. Let’s get to it!
What Is OpenMP?
OpenMP is an API for working with shared memory parallel computers. Essentially everyone now owns one of these machines, as any multi-core machine is a shared memory parallel machine. What it isn’t is a tool for GPU programming or programming on distributed memory systems (like a Beowulf cluster).
OpenMP is one of the fastest and easiest ways to squeeze extra performance out of modern multicore CPUs.
How to Set Up OpenMP?
Unlike some parallel tools (I’m looking at you, CUDA of 2 years ago), OpenMP is ridiculously easy to set up. If you are running a Debian-like system, it is just:
apt-get install libgomp1
And that’s it! All you need to do now is compile your code, as you normally would, with gcc and the -fopenmp flag.
How easy is that?
In the next post, we will write some simple C code using OpenMP.
One of the projects in my list of stuff I’ll get around to is making a 3D unprinter: a machine that can melt a thermoplastic object down and extrude it back into filament. McMaster has this cool course called Sustainable Future, and part of the course is for the students to do a real world project involving sustainability. I pitched the idea to the class, and I’ve got a team of 4 students now working with me to build one! We’re blogging here, and we’ve set up a GitHub repo here. Watch our progress; we should have a good prototype by December.
I have always had a problem with the concept of intellectual property. The great western tradition of post-enlightenment values has always placed the free flow of art and ideas on a pedestal, as a sacrosanct cornerstone of a just society. That the ideas living in our heads and flowing from our lips were the domain of no king, pope, or policeman is one of the most important cultural norms that has emerged from the enlightenment into modern liberal democracies. The legal constructs associated with intellectual property, in my evaluation, cannot be reconciled with this. A corpuscle of information cannot be at once free to be spoken or expressed and also be the property of some individual or corporation. Information Theory, the fantastic work pioneered by Claude Shannon, only swells my distaste for intellectual property. We know now that with simple coding, all information is reducible to a common binary form. Film, print, music, photography: all is merely a collection of ordered bits. This makes the idea of owning information all the more ridiculous, as the process can be just as easily reversed: a song can be represented by a string of Shakespeare quotations, and a movie can be rendered as a musical score. As an illustration of this, I’ve written a short program that takes any file and converts it to a long, rambling nonsense-poem. Poetry as Piracy.
Making the Wordlists
The first step is generating a set of words to use to generate our poems, categorized by their grammatical type. To do this, I downloaded the English Wiktionary. I then used grep, sed, and awk to split it into plain lists of words: nouns, past tense verbs, present participle verbs, and adjectives. I then shuffled these lists, and trimmed them down so that their length was a power of 2. I didn’t need to do this, but it simplified the work slightly, since a list of 2^n words stores exactly n bits per word. In the end, I was left with 17 bits worth of information stored in each noun (131,072 words), 13 bits in each past-tense verb (8192), 13 bits in each present-participle verb (8192), and 15 bits for each adjective (32,768).
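Here is a minimal sketch of that shuffle-and-trim step in Python. The function name and the seed parameter are my own inventions for illustration, not the original scripts; it just shows how trimming to a power of 2 gives a whole number of bits per word.

```python
import random


def trim_to_power_of_two(words, seed=0):
    """Shuffle a word list and keep the largest power-of-two prefix.

    Returns (kept_words, bits), where len(kept_words) == 2 ** bits.
    The seed is a hypothetical parameter so the shuffle is repeatable:
    encoder and decoder must agree on the exact same word order.
    """
    words = list(words)
    if not words:
        raise ValueError("need at least one word")
    random.Random(seed).shuffle(words)
    # bit_length() - 1 is floor(log2(n)) without floating point error.
    bits = len(words).bit_length() - 1
    return words[: 1 << bits], bits


if __name__ == "__main__":
    sample = [f"word{i}" for i in range(10)]
    kept, bits = trim_to_power_of_two(sample)
    print(len(kept), "words kept,", bits, "bits per word")
```

With a real list you would read one word per line from the file produced by the grep/sed/awk step and write the trimmed list back out, so both directions of the conversion use the same ordering.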
I then decided on two rough sentence skeletons:
The ADJECTIVE NOUN PAST-VERBED the ADJECTIVE NOUN.
ADJECTIVE NOUN is PRESENT-VERBING the ADJECTIVE NOUN.
Each of those sentences can store 77 bits of information: two adjectives (2 × 15), two nouns (2 × 17), and one verb (13). A 1 Mb file (one megabit, or 1,000,000 bits), for example, will require roughly 13,000 sentences, or about a novel’s worth of words. If that 1 Mb file was a copyrighted song, you would not in fact have the freedom to print and distribute your nice new novel (not that you would want to, as it would be random nonsense).
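The bit budget is easy to check with a few lines of Python. This is only a sketch: the dictionary and function names are made up for illustration, and the per-word bit counts are the ones from the word lists above.

```python
import math

# Bits carried by one word of each type (from the trimmed word lists).
WORD_BITS = {"adjective": 15, "noun": 17, "verb": 13}


def sentence_bits():
    """Bits stored in one sentence: two adjectives, two nouns, one verb."""
    return 2 * WORD_BITS["adjective"] + 2 * WORD_BITS["noun"] + WORD_BITS["verb"]


def sentences_needed(file_bits, bits_per_sentence=None):
    """Number of sentences needed to hold file_bits bits, rounded up."""
    if bits_per_sentence is None:
        bits_per_sentence = sentence_bits()
    return math.ceil(file_bits / bits_per_sentence)


if __name__ == "__main__":
    print("bits per sentence:", sentence_bits())
    print("sentences for one megabit:", sentences_needed(1_000_000))
```

Both skeletons use one verb from a 13-bit list, so the total is the same whichever one is chosen.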
Encoding the File
Now, 77 bits is a bit awkward. Just choosing between the two sentence types gives me 1 extra bit of information. I also get punctuation at the end. If I end each sentence with either a period, exclamation mark, two exclamation marks, or three exclamation marks, that gets me an extra two bits of information. This gets me up to 80 bits per sentence, or 10 bytes. I can now easily encode my data as nonsense poetry! I use the first bit to select which tense of verb, the next two decide if I get a period or exclamation series, and the rest determine the sentence itself. If my file’s length isn’t a multiple of 10 bytes, I simply add an additional line at the end:
All that remains are NUM memories and NUM regrets.
Here the first NUM is the base-10 value of the remaining bytes, and the second NUM is the number of bytes remaining. The count is needed because any leading zero bytes are lost when the value is converted to decimal.
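To make the layout concrete, here is a rough Python sketch of the encoder. Several things in it are assumptions on my part rather than the original program: the placeholder word lists (adj0, noun0, verbed0, and so on, standing in for the real Wiktionary lists), the order of the word fields inside the 77 bits, the big-endian byte order, and the function names.

```python
ADJ_BITS, NOUN_BITS, VERB_BITS = 15, 17, 13
PUNCTUATION = [".", "!", "!!", "!!!"]  # 2 bits

# Placeholder word lists (assumption): real lists come from Wiktionary.
ADJECTIVES = [f"adj{i}" for i in range(1 << ADJ_BITS)]
NOUNS = [f"noun{i}" for i in range(1 << NOUN_BITS)]
PAST_VERBS = [f"verbed{i}" for i in range(1 << VERB_BITS)]
PRESENT_VERBS = [f"verbing{i}" for i in range(1 << VERB_BITS)]


def encode_chunk(chunk):
    """Turn exactly 10 bytes (80 bits) into one sentence."""
    if len(chunk) != 10:
        raise ValueError("a sentence holds exactly 10 bytes")
    value = int.from_bytes(chunk, "big")
    # Peel fields off the low end, so the list is in reverse layout order.
    fields = []
    for nbits in (NOUN_BITS, ADJ_BITS, VERB_BITS, NOUN_BITS, ADJ_BITS, 2, 1):
        fields.append(value & ((1 << nbits) - 1))
        value >>= nbits
    noun2, adj2, verb, noun1, adj1, punct, kind = fields
    subject = f"{ADJECTIVES[adj1]} {NOUNS[noun1]}"
    obj = f"{ADJECTIVES[adj2]} {NOUNS[noun2]}"
    if kind == 0:  # first bit 0: past-tense skeleton
        body = f"The {subject} {PAST_VERBS[verb]} the {obj}"
    else:          # first bit 1: present-participle skeleton
        body = f"{subject} is {PRESENT_VERBS[verb]} the {obj}"
    return body + PUNCTUATION[punct]


def encode_file(data):
    """Encode bytes as sentences, plus a closing line for leftover bytes."""
    full = len(data) - len(data) % 10
    lines = [encode_chunk(data[i:i + 10]) for i in range(0, full, 10)]
    tail = data[full:]
    if tail:
        value = int.from_bytes(tail, "big")
        lines.append(
            f"All that remains are {value} memories and {len(tail)} regrets."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(encode_file(b"Poetry as Piracy"))
```

The sketch does not capitalize the first word of the second skeleton, so that every word can be looked up again exactly as it appears in its list.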
Decoding the File
Decoding the file is as simple as just reading in each line, checking what sentence type it is, and what the punctuation at the end is, and returning it to the original binary form!
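A matching decoder might look something like the sketch below. It makes the same assumptions as any encoder it is paired with and none of them come from the original program: placeholder words (adj0, noun0, verbed0, verbing0, and so on) in place of the real lists, single-token words with no spaces, a guessed field order inside the 77 bits, and big-endian byte order.

```python
import re

ADJ_BITS, NOUN_BITS, VERB_BITS = 15, 17, 13
PUNCTUATION = [".", "!", "!!", "!!!"]

# Placeholder reverse lookups (assumption): word -> index in its list.
ADJECTIVES = {f"adj{i}": i for i in range(1 << ADJ_BITS)}
NOUNS = {f"noun{i}": i for i in range(1 << NOUN_BITS)}
PAST_VERBS = {f"verbed{i}": i for i in range(1 << VERB_BITS)}
PRESENT_VERBS = {f"verbing{i}": i for i in range(1 << VERB_BITS)}

TAIL = re.compile(r"All that remains are (\d+) memories and (\d+) regrets\.")


def decode_line(line):
    """Recover the bytes stored in one line of the poem."""
    line = line.strip()
    tail = TAIL.fullmatch(line)
    if tail:
        # Closing line: decimal value, then how many bytes it stands for.
        value, count = int(tail.group(1)), int(tail.group(2))
        return value.to_bytes(count, "big")
    body = line.rstrip(".!")
    punct = PUNCTUATION.index(line[len(body):])
    tokens = body.split()
    if len(tokens) != 7:
        raise ValueError(f"unrecognized sentence: {line!r}")
    if tokens[0] == "The":   # The ADJ NOUN VERBED the ADJ NOUN
        kind = 0
        adj1, noun1 = ADJECTIVES[tokens[1]], NOUNS[tokens[2]]
        verb = PAST_VERBS[tokens[3]]
    else:                    # ADJ NOUN is VERBING the ADJ NOUN
        kind = 1
        adj1, noun1 = ADJECTIVES[tokens[0]], NOUNS[tokens[1]]
        verb = PRESENT_VERBS[tokens[3]]
    adj2, noun2 = ADJECTIVES[tokens[5]], NOUNS[tokens[6]]
    # Repack: 1 bit of sentence type, 2 of punctuation, then 77 of words.
    value = kind
    for field, nbits in ((punct, 2), (adj1, ADJ_BITS), (noun1, NOUN_BITS),
                         (verb, VERB_BITS), (adj2, ADJ_BITS), (noun2, NOUN_BITS)):
        value = (value << nbits) | field
    return value.to_bytes(10, "big")


def decode_poem(text):
    """Decode a whole poem, one sentence per line, back into bytes."""
    return b"".join(decode_line(l) for l in text.splitlines() if l.strip())
```

With real word lists the sentence-type check would need more care, since an adjective could itself be "The" and some Wiktionary entries contain spaces or trailing punctuation.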