The $1000 myth

The internet is awash with completely unfounded speculation (read: utter crap) about the $1000 genome, and recently two things have made me a little “upset”.

The first is this paper’s assertion that “with high-throughput DNA sequencing costs dropping <$1000 for human genomes, data storage, retrieval and analysis are the major bottlenecks in biological studies”.  I’m sorry for being a traditionalist, but I thought statements in papers were supposed to be based on facts, referenced and backed up by evidence?  I know, what a dinosaur!

Update 22/06/2013: apologies to PeerJ, who apparently never used the first claim officially; it seems to be a mutated meme that appeared on Twitter.

The second, possibly 10 times worse, is PeerJ’s marketing, which used to begin “If we can sequence a human genome for $100….”.  Possibly under pressure, PeerJ now state “If we can set a goal to sequence the Human Genome for $99…”, but quite frankly the damage has already been done.

The generally accepted standard is that a human genome should be sequenced to 30X coverage, so that is what I will talk about below.
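To put "30X" in concrete terms, here is a minimal sketch of the arithmetic. It assumes a haploid human genome of roughly 3.1 billion bases (an approximation that is not stated in this post) and treats coverage as a simple average depth; the function name is purely illustrative.

```python
# Assumption: approximate haploid human genome size in base pairs.
HUMAN_GENOME_BP = 3_100_000_000


def bases_required(coverage: int, genome_size_bp: int = HUMAN_GENOME_BP) -> int:
    """Raw bases needed so that each position is covered `coverage` times on average."""
    return coverage * genome_size_bp


# 30X of a ~3.1 Gb genome is about 93 billion bases of raw sequence.
gigabases = bases_required(30) / 1e9
print(f"30X coverage needs roughly {gigabases:.0f} Gb of sequence")
```

Whatever the technology, that volume of raw sequence is what has to be paid for before any of the other costs are counted.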

I’m going to try and lay this out in a completely technology-neutral way, though I will have to mention different sequencing technologies at some point.  However, I am pretty convinced of this one fact: there is not a single sequencing technology out today that can deliver 30X of a human genome for anywhere near $1000.

Quite frankly, they struggle to get near twice that.  Feel free to disagree with me in the comments, but provide evidence please.

How sequencing is costed

This is pretty simple, but there are five facets to how much sequencing a human genome costs:

  1. Reagents and consumables.  We need to buy chips, flowcells, reagents etc to actually put on the machine.  These are bought directly from the relevant sequencing company.
  2. Staff time.  There is no magic machine where you put DNA in and get sequence out.  It takes time to prepare DNA and make sequencing libraries and run the machine.  This costs money, in salary, pensions etc.
  3. Equipment depreciation.  Basically, if I run a sequencing machine for 3 years, at the end of that 3 years I will probably need to buy a new one.  So the cost of the purchase of the new machine gets spread over the projects I run on the old one.  This is the only sustainable business model, unless you assume an investor will continually give you money, or that you have a rich benefactor who subsidises your business model.
  4. Bioinformatics/data storage.  The data need to be QC-ed and at the very least aligned.  The raw and aligned data need to be stored somewhere.
  5. Overheads.  We need to pay the rent, pay electricity and water bills etc.  I know, it’s boring, but they cost money and the money has to come from somewhere.
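The five facets above can be sketched as a simple additive model. This is only an illustration: every number below is a made-up placeholder in arbitrary units, not a real price from any company or facility, and the class and function names are hypothetical. The only figure taken from the text is the 3-year machine lifetime.

```python
from dataclasses import dataclass


@dataclass
class GenomeCost:
    """Per-genome cost split into the five facets (all values are placeholders)."""
    reagents: float        # 1. chips, flowcells, reagents
    staff_time: float      # 2. library prep and running the machine
    depreciation: float    # 3. share of the replacement machine
    bioinformatics: float  # 4. QC, alignment, data storage
    overheads: float       # 5. rent, electricity, water

    def total(self) -> float:
        return (self.reagents + self.staff_time + self.depreciation
                + self.bioinformatics + self.overheads)


def depreciation_per_genome(machine_price: float, lifetime_years: float,
                            genomes_per_year: float) -> float:
    """Spread the price of the replacement machine over every genome run on the old one."""
    return machine_price / (lifetime_years * genomes_per_year)


# Placeholder inputs: a machine replaced after 3 years, running 200 genomes a year.
dep = depreciation_per_genome(machine_price=600_000, lifetime_years=3,
                              genomes_per_year=200)
example = GenomeCost(reagents=1500, staff_time=300, depreciation=dep,
                     bioinformatics=200, overheads=250)
print(f"depreciation per genome: {dep:.0f}")
print(f"full cost per genome:    {example.total():.0f}")
```

The point of the sketch is structural rather than numerical: a quoted reagent price is only the first of five terms, so the full cost is always higher than the reagent figure alone.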

What things actually cost

I’m not going to list each company and give you costs, but what I am going to say is this:

None of the current sequencing companies can deliver 30x of a human genome for less than $1000 in reagent costs (using list prices).

Yes, that’s right – even ignoring points 2-5, even just buying reagents, the cost is greater than $1000 for a 30x human genome.

Now, it’s possible Broad, BGI, Sanger etc can get below $1000 for the reagents due to sheer economies of scale and special deals they have with sequencing companies – but then remember they have to add in those extra charges (2-5) above.

Obviously, Illumina don’t charge themselves list price for reagents, and nor do LifeTech, so it’s possible that they themselves can sequence 30x human genomes and just pay whatever it costs to make the reagents and build the machines; but this is not reality and it’s not really how sequencing is done today.  These guys want to sell machines and reagents, they don’t want to be sequencing facilities, plus they still have to pay the staff, pay the bills, make a profit and return money to investors.

Myths in the press

You may come across articles like this, which have blithe statements such as “Complete Genomics now routinely sequenced human genomes at 30x coverage for less than $1,000 in reagent costs”.  

Well, let’s not forget that Complete Genomics’ business model completely failed, they never made money and had to be bought by BGI in order to survive.

This kind of article/statement is basically marketing for the company involved, because they want to be the one to reach the $1000 genome first.  Scratch beneath the surface though, and it’s all smoke and mirrors.

…, the $1,000,000 analysis

Utter crap.  Utter, utter crap.  EDIT 14:10 18/06/2013. There is a real question about why we compare detailed research data analysis costs to sequencing costs – we’ve always had to analyse the data and write papers, sequencing data is no different. Do we compare analysis costs to qPCR costs? Microarray costs? Why all of a sudden are we comparing the very expensive activity of “doing research” with sequencing costs?

Obviously I recognise that in some circumstances, the analysis can cost way more than the sequencing, but it’s really not as common as it’s made out to be.

Economies of scale

When I mentioned reagent costs above, I said “list price”.  Of course, you can achieve huge discounts if you buy lots of reagents, and so if you are sequencing say 10,000 human genomes then you will get a massive reduction on those reagent prices.  Huge projects such as this probably include the sequencing company as a partner, and in such arrangements of course it may be possible to do 30x human genomes at less than $1000.  But this would represent a completely unique scenario, a one-off, and wouldn’t affect the price the rest of us have to pay for human genomes.

Can we have some truth please?

My problem is, every time an article is published saying that it’s possible to do $1000 human genomes, we get collaborators who expect that price.  Your bullshit affects my life, and I get upset by that.  Why doesn’t everyone just tell the truth?  We know what it is.  We all know which company comes closest to delivering the $1000 genome, and we know which companies simply aspire to it.  We know that none of the companies have yet achieved it.  We know that they all want to, and hell, I want to too – I would love to deliver $1000 genomes, $100 genomes etc.  But we’re not there yet, and if you say we are, then you’re going to get struck off my Christmas list!

Update: 22/06/2013

This rather amusing piece turned up on the 19th: $1000 genome a mirage, in which Craig Venter and Eric Topol both completely agree with me :)  Well, they say I am wrong, but when you read what they have to say, they actually agree with me.  I commented on the article, and I reproduce that comment below:

The first thing worth noting is that the cost of sequencing is actually starting to go up:

http://genomebiology.com/2013/14/5/115

and the rate of change of the price reduction has been following an upwards trend for some time:

A pedantic look at the cost of sequencing

And my point about the Illumina genomes, which may very well cost you $2500, is that they are SUBSIDIZED to make them that cheap. I can do you a $1 genome if I subsidize the costs, and we’re not talking about the $1 genome, are we?

14 thoughts on “The $1000 myth”

  1. Donald Dunbar

    Good post Mick. I agree with most of that and your main point (let’s just all tell the truth) is well made. One additional thing may be that when people talk about the analysis, they also sometimes are talking about all the work done with the sequence data after you guys have generated it: all the way down to generating hypotheses, writing papers and grant apps. And of course all these things have their points 2 and 5 at least.

    1. biomickwatson Post author

      Hi Donald. Yes, I realise I slightly underplayed the cost of analysis, it’s just that I don’t think we should confound the issues. We’ve always had to spend time generating hypotheses, testing them and writing papers, why do we now roll these into (or next to) the cost of sequencing?

      Ultimately, the data generation aspect of sequencing remains relatively cheap. Knowledge generation may cost more, but we’ve always had to do that :)

  2. Kevin Davies (@KevinADavies)

    Spot on, Mick.
    In my recent talk at the NIH symposium to mark the 10th anniversary of the HGP:
    http://bit.ly/KDHGP10
    … I quoted a personal communication from Illumina CSO David Bentley, who says that in batch mode, the HiSeq can currently sequence five human genomes (presumably to 30x or higher) for a reagents list price of $25,000 — or $5,000/genome. With negotiated discounts (or if you want to estimate the wholesale cost), take 1/3 or 1/2 of that figure. So for what it’s worth, we might be edging close to the $2,500 genome, but that’s as good as it gets for now.

    It’s worth noting that three years ago, David Dooling did a similar back-of-the-envelope calculation of the *full* cost of genome sequencing above and beyond the reagents on his blog, PolITigenomics. That post is here, and is worth a read:
    http://www.politigenomics.com/2010/06/the-cost-of-doing-sequencing.html

    Finally, on Bruce Korf’s infamous quip about the “$1 million interpretation,” which I quoted in my book “The $1000 Genome.” It bears repeating that this was not intended as an actual assessment of the cost of clinical genome interpretation, but a more whimsical attempt to point out the significant challenges (financial and otherwise) that must be met to fully integrate clinical genomics into the clinic — health-IT, counseling, monitoring, reimbursement, etc. — particularly as sequencing per se becomes more of a commodity.

    1. biomickwatson Post author

      Thanks Kevin, I appreciate you commenting on the blog, and you obviously have a great knowledge of the area :) I met David Bentley recently, he’s a very smart guy and I am sure under his stewardship, and others’, Illumina will be the first company to truly crack $1000 genomes.

      I’m certainly getting feedback that I missed the mark when tackling the “$1m analysis”, and I can only apologize to everyone for doing that. I just hate clichés in biology, and that’s certainly becoming one of them!

  3. homolog.us

    Biomickwatson, A small, yet revolutionary, company based in UK solved all problems mentioned by you two years back. We routinely sequence genomes in USB-sized sequencers, store them in USB drives and then throw away the drives.

    Please ask your customers to try Oxford Nanopore before sending papers to PeerJ :)

  4. Robert Lanfear (@RobLanfear)

    A naive question: Let’s say I want one human genome sequence. What’s a ballpark figure for the cost today?

    More specifically, I want to send someone some tissue (e.g. a cheek swab) and get back an assembled genome where most of the interesting stuff is sequenced and mapped to ~30x coverage. I don’t want any downstream analysis, I don’t have any hypotheses to test. What would it cost just to get the assembled genome data if the whole thing is outsourced?

    If we assume that some company will wait until they have another four to sequence, and they can get the ~$5k per genome reagents cost from Illumina, is the total cost still close to that, or does rent, staffing, bioinformatics, data storage, depreciation etc. increase it significantly? If so, can anyone estimate the final cost?

    1. biomickwatson Post author

      What you have to understand is that we sequencing facilities are very reluctant to give out this information publicly, and here is why:

      If I tell you that sequencing a human genome costs £N, then that will become irrefutable “FACT” in the eyes of our collaborators, and it becomes known that we do human genomes for £N.

      In reality, depending on the size and nature of the project, costs can vary between N/2 and 2N per genome.

      I’m not being difficult, it’s just that genuinely, every single project is different, and I don’t want to give you a figure for a human genome so that someone else can come along and say “FacilityX can do it for half that price”, when in fact we could also do it at half the price as the FacilityX figure comes from a completely different type of project.

  5. Jason Hoyt (@jasonHoyt)

    Hi Mick,

    Just to set the record straight – Since before it launched, PeerJ has always stated “If we can set a goal …”. There are tweets going out that truncate the sentence and are leaving off “goal.”

    This is the earliest record from the Internet Archive from Feb 20, 2012. That is four months before we officially launched in June of 2012 showing that we’ve always used “goal” in the challenge statement: http://web.archive.org/web/20120220073241/http://peerj.com/

    Best,
    Jason

  6. Pingback: Links 6/23/13 | Mike the Mad Biologist

  7. xsd

    “Life Technologies can sequence the exome — the 1 percent of the genome we know
    how to interpret — for $500. “In three months, we’ll be able to do one entire
    human genome for $1,000,” predicts Rothberg, whose first company, 454 Life
    Sciences, was the one that sequenced James Watson’s genome.”

  8. Pingback: Bacterial genomes – 2nd and 3rd generation | opiniomics
