Use biomaRt to translate HUGO to Entrez gene IDs.

We can use the R package biomaRt to conveniently convert between different types of gene ids. In this example we will convert official HUGO gene names to Entrez gene ids.

First we load biomaRt in R and connect to the current Ensembl database for human:
library(biomaRt)
ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")

Now get the mapping. Here myHugoGeneNames is a vector of HUGO gene names. Because the mapping is not guaranteed to be one-to-one, make sure "uniqueRows=FALSE" is included in the query so that every returned row is kept.
mapTab <- getBM(attributes = c("hgnc_symbol", "entrezgene"), filters = "hgnc_symbol", values = myHugoGeneNames, mart = ensembl, uniqueRows=FALSE)

This query will return a table with HUGO gene names in the first column and corresponding Entrez gene id in the 2nd column. As the mapping isn't unique, you may wish to remove duplicates, which can be done as follows:

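# Keep only the first row seen for each HUGO name and each Entrez id.
# A logical mask is used because negative indexing with an empty index would return no rows at all.
keep <- !(duplicated(mapTab[,1]) | duplicated(mapTab[,2]))
entrezIds <- mapTab[keep, 2]
names(entrezIds) <- mapTab[keep, 1]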

The object "entrezIds" is now a named vector holding a one-to-one mapping: the names are HUGO gene names and the values are Entrez gene ids.
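
To see what this kind of de-duplication does, here is a small offline sketch on a made-up table. The gene names and ids are placeholders, not real identifiers, and the table only assumes the same two-column layout as mapTab. The stricter variant at the end is not part of the recipe above; it is one possible alternative if you would rather drop every ambiguous gene than keep its first row.

```r
# Toy stand-in for the getBM() result; names and ids are hypothetical placeholders.
mapTab <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  entrezgene  = c(101L, 102L, 103L, 103L, 104L),
  stringsAsFactors = FALSE
)

# First-occurrence rule: keep a row unless its name or its id was already seen.
keep <- !(duplicated(mapTab[, 1]) | duplicated(mapTab[, 2]))
entrezIds <- mapTab[keep, 2]
names(entrezIds) <- mapTab[keep, 1]

# Stricter variant: drop every row whose name or id occurs more than once.
ambiguous <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)
strict <- !(ambiguous(mapTab[, 1]) | ambiguous(mapTab[, 2]))
strictIds <- mapTab[strict, 2]
names(strictIds) <- mapTab[strict, 1]

print(entrezIds)
print(strictIds)
```

With the first-occurrence rule, GENE_A keeps whichever id happened to come first, which is arbitrary, and GENE_C is dropped because its id was already taken by GENE_B. The stricter variant keeps only GENE_D. Using a logical mask also sidesteps a pitfall of negative indexing in R: x[-which(cond)] returns an empty result when nothing matches.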

If you wish to map different kinds of ids, these functions will be of use in identifying which identifiers are available in biomaRt:
listFilters(ensembl)
listAttributes(ensembl)
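
Both tables are long, so it can help to search them by keyword. Below is a minimal sketch of that idea. The helper findIds and the mock table are made up for illustration and are not part of biomaRt; the helper only assumes a data frame with "name" and "description" columns, which is the layout listAttributes() and listFilters() return. As far as I know, recent Ensembl releases list the Entrez attribute as entrezgene_id rather than entrezgene, so a search like this is a quick way to check which name your mart uses.

```r
# Hypothetical helper: search an attribute or filter table by keyword.
# Expects a data frame with 'name' and 'description' columns.
findIds <- function(tab, pattern) {
  hit <- grepl(pattern, tab$name, ignore.case = TRUE) |
         grepl(pattern, tab$description, ignore.case = TRUE)
  tab[hit, c("name", "description"), drop = FALSE]
}

# Small mock table so the sketch runs offline; with a live connection use
# findIds(listAttributes(ensembl), "entrez") instead.
mockAttrs <- data.frame(
  name = c("ensembl_gene_id", "hgnc_symbol", "entrezgene_id"),
  description = c("Gene stable ID", "HGNC symbol", "NCBI gene ID"),
  stringsAsFactors = FALSE
)

hits <- findIds(mockAttrs, "entrez")
print(hits)
```

The same call works on listFilters(ensembl) when you are looking for a filter rather than an attribute.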

Boom!

2 Responses to “Use biomaRt to translate HUGO to Entrez gene IDs.”

  1. How might I get biomaRt to return ALL gene IDs for humans? Have been using it for conversion of transcripts to gene ID, but can’t figure out how to just get a list of all features. Thanks=)

  2. Hey Kid, try asking here: http://biostars.org/
