Saturday, 12 December 2009

Fixing tabs and spaces in emacs

Tabs to spaces

Ever since I upgraded to Ubuntu Karmic I've had problems with tabs and spaces in emacs23. I have indent-tabs-mode set to nil, but somehow a set of four spaces will still be saved as a tab. This is a pain when writing in languages like Python, where whitespace is semantically meaningful and mixing tabs and spaces can cause a syntax error.

To deal with this I wrote a small emacs-lisp function to convert all tabs in the active region (i.e. the currently selected text) to four spaces:
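The Emacs Lisp listing is not reproduced in this copy of the post. As a rough stand-in, the same transformation can be sketched in Python; this is not the original function, and the name tabs_to_spaces is hypothetical:

```python
def tabs_to_spaces(text, width=4):
    """Replace every tab character in `text` with `width` spaces.

    This is a plain substitution, matching the description above.
    It does not align text to tab stops the way str.expandtabs does.
    """
    return text.replace("\t", " " * width)


if __name__ == "__main__":
    # repr() makes the whitespace visible in the output.
    print(repr(tabs_to_spaces("\tx = 1")))
```

The width parameter is an assumption added for illustration; the post only mentions four spaces.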

Spaces to tabs

Similarly, I have a set of generic Makefiles for LaTeX papers, OCaml programs and other things. These are all saved on GitHub or on wikis and I copy them when starting a new project. Makefiles are very sensitive to changes in whitespace, and characters which should be tabs must not be spaces. If you copy and paste this sort of code from a webpage you will usually get spaces rather than tabs and so need to replace them. The following two functions convert each group of four spaces (or eight spaces) in the active region to a tab:
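The two Emacs Lisp functions are likewise missing from this copy. A minimal Python sketch of the reverse transformation, under the assumption that every run of exactly `width` spaces should become one tab (the name spaces_to_tabs is hypothetical, not the author's):

```python
def spaces_to_tabs(text, width=4):
    """Replace each group of `width` consecutive spaces with one tab.

    Groups are matched left to right, so eight spaces with width=4
    become two tabs, and with width=8 become a single tab.
    """
    return text.replace(" " * width, "\t")


if __name__ == "__main__":
    # A Makefile recipe line pasted from a web page, indented with spaces.
    print(repr(spaces_to_tabs("    gcc -o prog prog.c")))
```

Note that this replaces groups of spaces anywhere in the text, not only at the start of a line, which is what the description above suggests the original functions did within the active region.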

Using these functions

If you have been using emacs for a while you will probably already know how to customise it. If not, start by finding (or creating) your initialisation file. This will be either ~/.emacs or ~/.emacs.d/init.el. Add the functions you want to use to that file, then make sure that Emacs re-evaluates the file by typing M-x eval-buffer. Then move to the buffer where you want to convert your tabs or spaces and use any of the above functions by typing M-x <function-name> just as if you were using a built-in function. That's all there is to extending emacs :-)

Posted via web from snim2's posterous

Saturday, 28 November 2009

Python, communicating processes and Pygame

Pygame is a popular and excellent library for writing 2D arcade games and animations. It takes an approach that has become very popular in the Python world: it wraps the low-level SDL library, giving programmers the best of both worlds -- efficiency from the underlying C code and a simple, productive scripting environment from the high-level Python wrappers. Pygame is so simple that it is used in a number of introductory textbooks, including our own book, Python for Rookies. One reason for the simplicity of Pygame is its approach to events: events exist, but Pygame is not event driven. The programmer has to write their own event loop and explicitly process the events which are relevant to the application. This makes for a model of interaction that is very easy for beginners to understand and very easy for experts to get right.

For nearly a year now I have been working on python-csp, which adds Hoare's Communicating Sequential Processes (CSP) to Python and is nearing a full release. You can read more about CSP on the WoTUG site, but briefly it is a neat way of implementing "message-passing" concurrency to construct concurrent and parallel programs. CSP eliminates several classes of well-known bugs in this sort of software, including race conditions, and makes it much easier to avoid deadlocks. In a CSP program there are no "locks", which cuts out a lot of difficult boilerplate code. Instead, a CSP program typically consists of a number of CSP processes (which may be reified as threads, processes, coroutines, or anything else) running in parallel. These communicate by sending data along synchronous "channels", which you can think of as being similar to UNIX pipes. Wherever you might use shared data or fire an event in another style of concurrency, in CSP you would send and receive data down a channel. Because the communication between processes is synchronous, the flow of data between processes can only happen in the order in which it appears statically in your code (a big advantage compared to event-driven systems).

The details of python-csp will keep for another post, but I wanted to document here a pattern for fixing a very irritating problem that occurs when writing python-csp code which uses Pygame: it's very difficult to kill the application in the way you normally would, by pressing the "Close" button on the application window, pressing Alt+F4, or whatever. In CSP programs, the usual way of terminating running processes is to "poison" the channels which they use, which causes all processes which read / write to those channels to propagate the poisoning on any channels they know about and terminate themselves. Neil Brown has a very nice post on poisoning here if you want more details and nice graphics. In Pygame, to quit the application there's a handy pygame.quit() function. Mixing these two requires a bit of alchemy, so it's worth knowing a pattern that works.

For simple programs, the following is enough: just place all Pygame-related code in a single process which draws to the application window, and have one or more channels to pass information from the rest of the running program to the drawing process. When a pygame.QUIT event is received, drop out of the main animation loop and then (only then) poison any channels and quit the graphical application. Just like this:

@process
def Drawme(channel, _process=None):
    import pygame
    # Constants
    width, height = 512, 256
    # Open window
    pygame.init()
    screen = pygame.display.set_mode((width, height), 0)
    quit = False
    while not quit:
        data = ...  # right-hand side missing in the source; presumably a read from channel
        # Drawing code goes here...
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                quit = True
            # Process other events here ...
    # Only after leaving the loop: poison the channel, then close Pygame
    channel.poison()
    pygame.quit()
    return


If you try this out, make sure to have a separate terminal open to watch the number of Python processes running in your OS and check that quitting really does work. On UNIX systems you can use the 'watch' utility for this: watch -n 0.5 'ps h -C python -o pid'
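The poisoning idea described earlier can be imitated with nothing but the standard library. The sketch below is not python-csp: it uses threads and queue.Queue (which is buffered, not a synchronous CSP channel), and the POISON sentinel, stage and run_pipeline names are all hypothetical. It only illustrates how a process that receives poison passes it on to the channels it knows about and then terminates:

```python
import queue
import threading

# Hypothetical sentinel standing in for python-csp's channel poisoning.
POISON = object()


def stage(inbox, outbox, seen):
    """Forward items from inbox to outbox until poisoned."""
    while True:
        item = inbox.get()
        if item is POISON:
            # Propagate the poison downstream, then terminate.
            if outbox is not None:
                outbox.put(POISON)
            return
        seen.append(item)
        if outbox is not None:
            outbox.put(item)


def run_pipeline(items):
    """Run two chained stages, poison the first channel, wait for both."""
    first, second = queue.Queue(), queue.Queue()
    seen_a, seen_b = [], []
    workers = [
        threading.Thread(target=stage, args=(first, second, seen_a)),
        threading.Thread(target=stage, args=(second, None, seen_b)),
    ]
    for worker in workers:
        worker.start()
    for item in items:
        first.put(item)
    # Poisoning only the first channel is enough to stop every stage.
    first.put(POISON)
    for worker in workers:
        worker.join()
    return seen_a, seen_b, [worker.is_alive() for worker in workers]


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

In the Pygame pattern above, the drawing process plays the role of the code that sends the poison: it does so only after it has left its own event loop.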

Here's an example Pygame / python-csp application, a simple demonstration of Reynolds' "flocking" algorithm, which simulates a flock of birds or other wildlife moving in unison:


Sunday, 25 October 2009

Silently killing multiple Linux processes

In the last few months I've been working with a lot of concurrent and parallel programs. One of the problems with this sort of programming is that when you get it wrong, you can end up with a whole bunch of child processes running detached from their parent, like this:

24827 pts/3 00:00:00 python
24828 pts/3 00:00:00 python
24829 pts/3 00:00:00 python
24830 pts/3 00:00:00 python
24831 pts/3 00:00:00 python
24832 pts/3 00:00:00 python
24833 pts/3 00:00:00 python
24834 pts/3 00:00:00 python
24835 pts/3 00:00:00 python
24836 pts/3 00:00:00 python

Right now I have 103 of these orphaned Python processes. Using killall doesn't seem to work, so what we need to do is pass their PIDs directly to 'kill -9'. By default ps prints a whole bunch of extra information, so we need to coerce it into printing just the PIDs, like this:

$ ps h -C python -o pid
...
24814
24819
24820
24821
24822
...

Then we can pass that to kill -9, like this:

$ kill -9 `ps h -C python -o pid`  

Now we can find out how many Python processes are still running:

$ ps h -C python -o pid | wc -l
0

zero -- just what we want :-)
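The same steps can be sketched from Python itself. This is only an illustration, assuming a Linux system with the procps version of ps used above; the function names parse_pids, python_pids and kill_all are hypothetical. The sketch skips its own PID so that it does not kill itself if it happens to be running under a binary called python:

```python
import os
import signal
import subprocess


def parse_pids(ps_output):
    """Turn the whitespace-separated output of 'ps ... -o pid' into ints."""
    return [int(token) for token in ps_output.split()]


def python_pids():
    """Return the PIDs of running 'python' processes, excluding this one."""
    result = subprocess.run(
        ["ps", "h", "-C", "python", "-o", "pid"],
        capture_output=True,
        text=True,
    )
    # ps exits with a non-zero status when nothing matches; stdout is then empty.
    return [pid for pid in parse_pids(result.stdout) if pid != os.getpid()]


def kill_all(pids):
    """Send SIGKILL (the signal behind 'kill -9') to every PID given."""
    for pid in pids:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            # The process exited between listing and killing.
            pass


# Usage (destructive, so left commented out):
# kill_all(python_pids())
```

As with the shell version, this kills every process named python that you own, so check the list returned by python_pids() before passing it to kill_all().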


Saturday, 17 October 2009

How to write a literature review for your final year thesis project

A long time ago I wrote an article on how to pass your final year thesis project, which several students found helpful. In the same vein, this post deals with a particular aspect of the final year project: the literature review.

Every year I supervise projects I find students tend to ask the same questions about their literature review. Most common are:

  • How many papers should I read?
  • How long should the literature review be?
  • Should I read books, articles, or ...?
  • Is it OK to reference websites such as Wikipedia?
  • Who will read my literature review and what can I assume about their knowledge of the area?
  • When should I start the literature review and when should it be finished?

These questions crop up frequently and will be familiar to any readers who are starting their own project. However, when you fully understand the purpose of the literature review and how to go about writing one, you begin to realise that these questions are actually not that important. This post is designed to help students make that transition, from not yet understanding what the literature review is for, to having a thorough understanding of its purpose and a clear idea of how to write it up.

The meaning of a final year project

A lot of students starting off their projects talk about writing a "report" at the end of their "project" and doing some "research" as part of their literature review. This is where the subtleties of the English language tend to cause a lot of confusion. "Research" can have many different meanings, and to complete a really successful final year project the first thing to do is to understand fully what is expected of you in an academic context. Even if your final year at University is the last experience you have of academic work, remember that your work will be marked according to academic criteria, and academia has quite different aims to industry.

So, to be clear: research, in an academic context, means adding something new to the body of knowledge that humans have gathered in your area of interest. Your project as a whole will be a piece of research because you will be creating something new that has not been created before. What that is exactly will depend on the field you are studying. It might be a new perspective on a piece of literature, a new proof of a theorem, a new application of a particular technology, or something else. Since you are still an undergraduate it is likely (although not necessary) that your work will be a small step forward. It is unlikely that you will produce something completely groundbreaking, so don't be intimidated by the fact that your work has to be novel. That said, it may be that you produce an excellent piece of work and your supervisor may want to turn that into a technical report or conference paper with you, which would be great for your CV (or resume).

A thesis is a statement of belief that is central to your research. Your dissertation will be a piece of writing that defends your thesis, based on your research. So, for example, if your thesis is "regular, online tests help University students to learn new material" then you will need to implement some sort of online tests for new material, design and run an experiment to test your thesis, and write it up in your dissertation. Equally, if your thesis is "water causes cancer in mice" then you will need to plan and run an experiment to determine whether or not this is true and write it up. Notice that you may disprove your thesis in your work. It may be that online tests do not help students learn, or that water doesn't cause cancer in mice. This is absolutely fine: so long as your experiments give a clear answer to the question and you can show that they were performed fairly, it doesn't matter whether your thesis turns out to be correct or incorrect (in so far as you have tested it). It may also be that your evaluation is inconclusive, which is also acceptable: so long as your experimental method is good and you can say exactly what further work is necessary to produce a definite result, you will be fine.

Alternatively, you might phrase your thesis as a research question. In that case, instead of having a thesis such as "water causes cancer in mice" you would ask the question "does water cause cancer in mice?" and your dissertation would describe your efforts to answer that question.

The shape of your dissertation and where the literature fits in

Every dissertation is slightly different, but good dissertations will all contain the same elements. I should say that the advice given in this section is likely only relevant to science-based projects. If you are working in the arts or some areas in the humanities then the expectations of you may well be very different. Still, a good dissertation in the sciences will contain roughly the elements listed below. I say "roughly" because, depending on the exact nature of your work, it may be sensible to expand some sections into two chapters rather than one, or to coalesce some elements into a single chapter. Your supervisor can give you more specific advice on this.

  • Introduction: should introduce the reader to the broad context of the research and explain why this is an interesting area to work in. So, if your thesis is something to do with mobile computing, you might say something here about why mobile phones are important, why mobile computing is an interesting and important area, and broadly what other researchers are working on. At the end of the chapter you will want to introduce your specific research question, having said why the area you are working in (and therefore your question) is important.
  • Literature review: Now you have introduced the reader (who will likely not be an expert in your exact area) to the broad research agenda in the field, and your research question, you can start writing more specifically about your own project. In this chapter you will survey the work that other researchers have done to answer your research question, or related questions. At the end of the chapter you should briefly explain how your own work builds on and differs from the work that has gone before it.
  • Method: this chapter should describe what you did to answer your research question (or to support your thesis, if you think of it that way), and how you went about it. You should describe your work in sufficient detail that another researcher could recreate your work to check your results.
  • Evaluation: here, you should evaluate what you have done, and say what answer (to your research question) you have arrived at. It may be that in your method you describe some experiments, and this section records your results and analysis of those results. This is an important section -- most students gain or lose marks in either their literature review or evaluation. Key to producing a convincing evaluation is to plan very early in the project what information you will need to write this section. More on that in another blog post.
  • Conclusions: should summarise what you have done and how you answered the research question. It may be that your work produced a very clear answer to the question, or it may be that your work points to a need for further research to clarify or confirm your answer. You should refer back to the literature review and summarise how your research differs from (hopefully improves on) the work described in the literature. Make sure you also say what research you would do if you were to continue working on your project.
  • References: a list of publications cited in the main text, in Harvard style or similar format.

It is likely that most chapters will be roughly the same size, although the introductory chapter and conclusions are usually slightly shorter than the others. Try to let the length of each chapter be guided by the amount of useful and important information you have to convey to the reader; don't impose artificial word limits on yourself.

Summaries and synthesis: what should go in your literature review

Poor literature reviews often take the same form -- they tend to be a (usually short) list of papers that the student has read, briefly summarised. This is not really what is expected and will not gain high marks. Another common mistake is to review literature that has been used to inform some part of the practical work of the project, rather than to review work that has answered the same or related research questions. To do better, your writing needs to not only summarise the prior art in your area, but also synthesise what is in the literature. In fact, synthesis is one of the key skills that we expect to see from final year students.

So, what is synthesis? The main idea is that you should have understood the literature you have read and, more importantly, you should show that you understand the relationships between items of literature. That means what came first in your field, how it influenced later work, how each step forward in the research improved upon what came before it, and so on. Ideally, you will present your own view of the work you are describing. This partly means that you should be critical of the literature you read, and say where the shortcomings of the work are, and how the work could be improved upon (in particular how your work will improve on the prior art). Also, you might have your own view on where your area of research is likely to go in the future.

Critiquing the work of others is something that is often new to students in their final year. One of the most frequent mistakes I see from students is to criticise the style of the papers they read, rather than the research that those papers describe. Avoid writing things like "this paper is not well written" or "this paper is hard to understand". A literature review should really be a review of the research work that has gone before you, not a literary criticism of the style that other authors adopt.

Practical matters: how to start, how to finish and how to do the bit in the middle

Reading and understanding the work of others is a lifetime's work for professional researchers; it is not something that starts and stops on particular dates, according to a Gantt chart. Your final year project will have a hand-in deadline, so you need to be a little more circumscribed about how you work.

Ideally, you should be reading some literature in your field very early on in your project, to help you choose a good topic and write an initial research proposal. I would suggest that you consider this to be the starting point for your literature review and keep on reading and adding to your writing all the way through your project until you hand in your final dissertation. To do this, I suggest you do two things. Firstly, keep a careful log of what you read in a format you find easy to work with. This might be a log book on paper, or it might be a file on your computer, whatever works best for you. Each entry should include the title of the work you have read and enough information about the authors and so on that you can find the publication again. You should then summarise the work in the paper, including the research question answered by the work, the nature of the answer and the methodology of the research (i.e. what the authors actually did). This will give you several advantages -- you won't forget anything you read because you have a record of it, your literature review can be written from your notes, and if you are asked any awkward questions in a viva you can refer back to your log.

Secondly, as soon as you feel you have read enough to understand the broad context of the literature, start writing it up formally as your second dissertation chapter. This should really be within two or three months of your starting date. Then, as the project progresses and you read more, you can integrate your new reading into your already drafted chapter. This might seem like a lot of effort early on, but when you come to write up your work, you will be incredibly grateful that your most time-consuming chapter has already been written, when it was fresh in your mind, leaving you free to write up the later sections of your work.

An example of good writing

Now you know what a literature review is for, how it fits into your dissertation and how to go about writing your own, it would probably be useful to see (part of) an example review. The paragraphs below give such an example and the text [in italics] is some commentary to explain how each part of the writing contributes towards the start of a good thesis chapter. When you read this, don't worry too much about the subject matter, just try to concentrate on the style of writing and the structure of the text.

The area of pervasive, or ubiquitous, computing was founded by Weiser (1991) [referenced] who predicted that computers would one day be integrated into everyday objects and interact with people seamlessly. Although few such products are available today, Weiser’s work has led to the creation of a number of research areas, including ambient intelligence (Eli and Epstein 1998), smart dust (Khan et al, 1999) and the Internet of Things (Brickley et al, 2001). [Sets the historical context of the area and defines related areas.]

An early application of pervasive computing was the active badge location system, described by Want et al (1992), in which users and objects were tagged with an "active" badge which could locate and identify them. This system was based on infrared location sensing, whereas later systems might use RFID technology to achieve the same effect. [Describes how the field has changed over time.] Uses of the active badge system included routing phone calls, email alerts and so on to the physical location of the receiver. [Contextualises the fundamental research.]

Posted via email from snim2's posterous

Thursday, 6 August 2009

Understanding Python Error Messages

Understanding runtime errors and (uncaught) exceptions in any programming language can be a pain, especially if your code is complex or the error message is obscure. The usual way to deal with this situation is either to use a full-blown debugger to step through the code, or to add as many print statements as necessary to uncover the source of the error. However, Python provides a third solution which is pretty neat -- use a disassembler. The dis module takes a Python bytecode object (as generated by the builtin compile function or the py_compile module) and prints out a listing of the bytecode instructions "in" that object. However, dis also has another use -- calling the disassembler with no arguments prints out the bytecode instructions of the code that raised the last traceback.

For example, if you import dis in the interactive interpreter and generate a traceback, like this:

>>> 'foobar' * 2.5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't multiply sequence by non-int of type 'float'

You can then run the dis.dis() function to examine the error:

>>> dis.dis()
 1           0 LOAD_CONST               0 ('foobar')
             3 LOAD_CONST               1 (2.5)
   -->       6 BINARY_MULTIPLY
             7 PRINT_EXPR
             8 LOAD_CONST               2 (None)
            11 RETURN_VALUE

The arrow on the left (-->) points to the bytecode instruction which caused the TypeError. The number 1 in the leftmost column is the line number of the source which generated these bytecode instructions, and the number before each instruction name (0, 3, 6, ...) is the offset of that instruction within the bytecode. On the right-hand side in brackets are the constants loaded into the interpreter.
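Outside the interactive interpreter there is no "last traceback" to fall back on, so a script has to hand the traceback over explicitly. The sketch below does this with dis.distb, assuming Python 3; the helper name disassemble_failure and the "<example>" filename are made up for illustration. The exact instruction names in the listing depend on the Python version (recent versions show BINARY_OP where the listing above shows BINARY_MULTIPLY):

```python
import dis
import io


def disassemble_failure(source, names):
    """Evaluate `source`; if it raises TypeError, return the bytecode
    listing of the failing code with the faulty instruction marked."""
    code = compile(source, "<example>", "eval")
    try:
        eval(code, dict(names))
    except TypeError as exc:
        listing = io.StringIO()
        # distb follows the traceback to its innermost frame and marks
        # the instruction being executed there with an arrow (-->).
        dis.distb(exc.__traceback__, file=listing)
        return listing.getvalue()
    return None


if __name__ == "__main__":
    print(disassemble_failure("text * 2.5", {"text": "foobar"}))
```

The expression uses a variable rather than two literals only so that the multiplication is certain to happen at run time rather than at compile time.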

Sunday, 12 July 2009

Shift Life

Sam Moore, Eugene Ch'ng, Dew Harrison, Mat Murray and I have been working on a pervasive interface for an artificial life simulation. At the recent Shift-Time festival in Shrewsbury we exhibited an artificial life simulation of a fictional ecosystem, projected onto a sand pit. People could change the behaviour of the creatures in the ecosystem by changing the environmental conditions of the system. They could make the sun shine more or less by playing with a lamp, increase the humidity or change the pH by pouring in water, vinegar or soda mix from watering cans, or cause an earthquake by hitting the side of the sand pit with a toy hammer. We had really good feedback from the people who came to see us. One family came back on the second day because their three children were talking about it "all night". Typical comments from kids were "I think it's cool" and one kid left saying "well, you've got to be impressed with that", which made us laugh. More info and an interview with Dew can be found here: [Event listing] [Interview]