How do you go about processing an endless amount of data about the DNA material of micro-organisms? When Marjet Oosterkamp was researching industrial wastewater treatment, she turned for help to the national supercomputer: it takes over when the human brain and standard computers have to throw in the towel.
You feed the supercomputer a concoction of data, it starts whirring and, hey presto, a few days later it spits out ready-to-use data chunks which will help you on your way. It sounds simple enough but according to Oosterkamp, handling the supercomputer demanded quite a bit of time and effort.
‘It’s a complex process because you are dealing with a variety of programmes and you’re on a limited “time budget”. You also have to indicate exactly what it is you want the computer to work on and how much memory that requires. Fortunately we had help from our faculty ICT manager, a bioinformatician from Mexico and, of course, the experts from SURFsara, the organisation which facilitates the use of the supercomputer.’
Optimal circumstances for wastewater treatment
Oosterkamp is part of a team of scientists who are researching a method to purify industrial wastewater by means of micro-organisms such as bacteria and archaea in a project called BioXtreme. In four membrane bioreactors at the Delft lab, specially designed wastewater is used to create circumstances in which micro-organisms remove compounds that are difficult to break down, such as phenols.
‘The water we use is a mixture of components commonly used in industry,’ Oosterkamp says. ‘We know exactly what’s in it and that means we can analyse everything properly. We will eventually use real wastewater produced by industry as a next step in the project.’
The aim of the project is to find out which circumstances (temperature, the amount of salt and lactic acid) will make the micro-organisms do the very best job they can. According to Oosterkamp, there is much left to find out about the role of high temperatures in wastewater treatment using micro-organisms. What remains after the treatment is water that can be re-used in industry, as well as biogases which can be used as a source of energy. The result is effectively a circular system.
From samples to processed data
DNA material of the micro-organisms is isolated from samples taken from the wastewater, Oosterkamp explains. ‘This is sent to a company in Leiden called BaseClear which breaks up the material into very small pieces at its lab. It then returns this huge volume of data to us.’
From her computer in Delft she then sends it all to SURFsara’s data center, located in a 72-metre tall building at the Science Park in Amsterdam. Covering two floors, it is home to network cables stretching to 100 kilometres, 25 petabytes (25,000 terabytes) of online storage and 50 petabytes of tape storage. The processors on SURFsara’s server system each have a designated task. They relay the results to a central server which reassembles the parts to form a complete package of processed data.
An impressive sight, Oosterkamp thought when she visited the data center earlier this year. ‘The servers are stored in huge racks and there are some very clever air conditioning systems in place. It’s the latest technology and they don’t allow just anybody in there.’
Completing the puzzle
The researchers needed SURFsara to provide them with a capacity of 1 terabyte of working memory for their project. The computer then took 400 hours (almost 17 days) to do the computing. Now that the work at SURFsara is done, the team can continue their research using ‘ordinary’ computers. ‘We are in the final phase of the project. What we have to do now is put together the pieces of the puzzle and figure out which DNA is linked to which micro-organisms so we will know exactly which properties they have.’
The help of the supercomputer was essential, says Oosterkamp. Without it, the research would have been impossible to carry out. ‘Standard computers are simply not equipped to handle the amount of data, or would have taken months to come up with a few results. That is far too long, especially since you can’t know in advance if the findings correspond to what you had been expecting. And that could mean starting from scratch.’
The results of this project will form the basis for another project which could see the same tests carried out on real industrial wastewater. ‘We will definitely need the services of SURFsara’s supercomputer again. It was both challenging and interesting to work with SURFsara and I’m looking forward to our next “cooperation”.’