Frequently asked questions:

  1. What is ENCODE?
  2. What does the ENCODEdb portal allow me to do?
  3. Which public databases and data types does the ENCODEdb portal currently query?
  4. What happens if I don’t make a selection on one of the Consortium Data query pages?
  5. Which browser should I use so that ENCODEdb pages display correctly?
  6. I made a selection on one of the Consortium Data query pages, and the remaining pull-down menus on the page have now changed. I need to go back and change my previous selection to actually get the kind of data I was looking for. How do I do that?
  7. Why do some of the pages "expand" when I make a selection?
  8. What is a BED-formatted file?
  9. What is Galaxy?
  10. I have just issued a query on the GEO Components page and asked for my data to be returned in BED format. This produced a new Web page asking me to select a single column from the GEO hybridization tables. Why can’t I select more than one column of data from the GEO hybridization tables?
  11. How can I do analyses on multiple sets of data that I have generated through the ENCODEdb portal?
  12. How do I view GEO Component data using the UCSC Genome Browser?
  13. How do I specify chromosome ranges?
  14. How often is ENCODEdb updated?

  1. What is ENCODE?

    The National Human Genome Research Institute (NHGRI) launched a public research consortium named ENCODE, the Encyclopedia Of DNA Elements, in September 2003, to carry out a project to identify all functional elements in the human genome sequence. The project is being conducted in three phases: a pilot project phase, a technology development phase, and a planned production phase.

    Information about the ENCODE Project, consortium membership, data release policies, and additional background can be found by visiting the ENCODE Project page.

  2. What does the ENCODEdb portal allow me to do?

    Data generated by members of the ENCODE Consortium is housed in a number of public databases, such as the UCSC Genome Browser and NCBI’s Gene Expression Omnibus (GEO). Since issuing queries to these databases is often not intuitive, the ENCODEdb portal was developed to allow biologists to more easily query and retrieve data generated by the ENCODE Consortium. The ENCODEdb portal provides users with a single, unified point of access to data generated by the ENCODE Consortium, regardless of which public database houses the primary data.

  3. Which public databases and data types does the ENCODEdb portal currently query?

    The ENCODEdb portal currently provides unified access to the following resources:

    More information about NCBI GEO data types can be found in ENCODEdb’s Summary of GEO Terminology.

  4. What happens if I don’t make a selection on one of the Consortium Data query pages?

    If you make no selection in a pull-down menu, the query simply returns all possible values for that particular field.

    Making a selection will help narrow down the results returned to those that are actually of interest to the user. Making a selection will also narrow down the choices in any remaining pull-down menus; this is done to prevent combinations that either produce no result or are contradictory. Any selections made may also change any data ranges that are displayed under the Value options, reflecting the valid range of values for the now-selected experiments.

    The only place where a selection is required is on the Consortium Data query pages for GEO Datasets, GEO Profiles, and GEO Components. Here, an Experimental Group must be selected in order for ENCODEdb to issue a query against the GEO database.

  5. Which browser should I use so that ENCODEdb pages display correctly?

    ENCODEdb has been tested using Internet Explorer, Safari, and Firefox. Most pages require JavaScript; please check your browser’s preferences to ensure that JavaScript is enabled.

  6. I made a selection on one of the Consortium Data query pages, and the remaining pull-down menus on the page have now changed. I need to go back and change my previous selection to actually get the kind of data I was looking for. How do I do that?

    Click the Clear button at the bottom of the page. This resets all the pull-down menus, allowing the query to be re-issued from scratch.

  7. Why do some of the pages "expand" when I make a selection?

    In some instances, making a selection in one of the pull-down menus will produce additional pull-down menus or generate new fields. ENCODEdb will only display fields that directly pertain to the user’s previous choices, keeping the user interface as simple as possible.

  8. What is a BED-formatted file?

    BED stands for "browser-extensible data." The BED file format is used by UCSC to provide a flexible way of capturing disparate types of data that are to be displayed in an annotation track in a UCSC Genome Browser display. While many kinds of data can be displayed in an annotation track, the data must be specified in a rigid, consistent way.

    A complete description of the BED file format, specifying what type of data is in each column of the file, can be found on the UCSC Web site.
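
    As a rough illustration of the column layout, the Python sketch below parses a single BED line. It assumes the standard UCSC convention that the first three columns (chromosome, start, end) are required, with a zero-based start and an exclusive end. The names `BedRecord` and `parse_bed_line` and the sample line are invented for illustration and are not part of ENCODEdb.

```python
from typing import NamedTuple, Optional


class BedRecord(NamedTuple):
    chrom: str
    start: int            # zero-based, inclusive
    end: int              # exclusive
    name: Optional[str]   # optional 4th column
    score: Optional[str]  # optional 5th column, kept as text


def parse_bed_line(line: str) -> BedRecord:
    """Parse the first five columns of a whitespace-delimited BED line."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start, and end")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid interval: {start}-{end}")
    name = fields[3] if len(fields) > 3 else None
    score = fields[4] if len(fields) > 4 else None
    return BedRecord(chrom, start, end, name, score)


# Hypothetical example line; the name and score are not taken from ENCODEdb.
record = parse_bed_line("chr7\t116147882\t116418462\texample_region\t500")
print(record.chrom, record.end - record.start)
```

    Because the end coordinate is exclusive, the length of a region is simply end minus start.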

  9. What is Galaxy?

    Galaxy is a platform for interactive large-scale genome analysis. Developed at Penn State, Galaxy provides a simple Web portal that enables users to search remote resources, combine data from independent queries, and visualize the results. More information on Galaxy can be found on the Galaxy Web site at Penn State.

  10. I have just issued a query on the GEO Components page and asked for my data to be returned in BED format. This produced a new Web page asking me to select a single column from the GEO hybridization tables. Why can’t I select more than one column of data from the GEO hybridization tables?

    The BED format requires that only one data value be associated with each region (each row in the table), so only one column can be selected from the GEO hybridization tables. The same will occur if Send Query to Galaxy is selected, since Galaxy also uses the BED format.

    If you wish to retrieve multiple columns, select Download Selected Columns from Data File instead. This will produce a file (in tabular text format) with all of the selected data that can be used for further analysis.

  11. How can I do analyses on multiple sets of data that I have generated through the ENCODEdb portal?

    The Galaxy server at Penn State has a flexible history system that stores the queries from each user; performs operations such as intersections, unions, and subtractions; and links to other computational tools. For any data type of interest, choose Send Query to Galaxy as the output option. You will then be able to use Galaxy’s interface to combine queries as needed.
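
    To make the idea of an intersection concrete, here is a minimal Python sketch that intersects two lists of regions on a single chromosome. It only illustrates the concept and is not Galaxy's implementation; the function name `intersect_regions` and the coordinates are invented, and regions are assumed to be half-open (start, end) pairs that do not overlap within each list.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end), half-open, same chromosome


def intersect_regions(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the stretches covered by both a and b.

    Assumes each input list is internally non-overlapping.
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    out: List[Region] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever region finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Invented coordinates for illustration.
first_query = [(100, 200), (300, 400)]
second_query = [(150, 350)]
print(intersect_regions(first_query, second_query))
```

    With these inputs, the overlap consists of the two stretches 150-200 and 300-350.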

  12. How do I view GEO Component data using the UCSC Genome Browser?

    There are two possible ways to view the GEO Component data using the UCSC Genome Browser:

    1. Select the "Download BED File" output option. This file can then be uploaded to the UCSC Genome Browser as a "custom track". Instructions on how to upload and visualize custom tracks can be found on the UCSC Web site.
    2. Select the "BED format custom track to UCSC Browser" or the "WIGGLE format custom track to UCSC Browser" output option. The system will generate a BED/WIG file of the results from the GEO Components query and send it to the UCSC browser automatically. A new browser window will open with the UCSC site showing the data.

  13. How do I specify chromosome ranges?

    To perform a query using a specific chromosome coordinate range, please use the following format:

    chr#:begin-end

    replacing # with the chromosome number of interest and the two values after the colon with the beginning and end points for the chromosome coordinates of interest.

    For example, to specify the region on chromosome 7 between position 116147883 and 116418462, write the query as follows:

    chr7:116147883-116418462
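
    If you build such range strings in a script, a small check like the following Python sketch can catch formatting mistakes before you paste them into a query. It is an illustration only: the function name `parse_range` is invented, and accepting X and Y as chromosome labels and requiring begin to be no greater than end are assumptions, not documented ENCODEdb rules.

```python
import re
from typing import Tuple

# "chr", a chromosome label (a number, or X / Y as an assumption),
# a colon, then two integers separated by a hyphen.
RANGE_PATTERN = re.compile(r"^chr([0-9]+|X|Y):([0-9]+)-([0-9]+)$")


def parse_range(text: str) -> Tuple[str, int, int]:
    """Split a chr#:begin-end string into (chromosome, begin, end)."""
    match = RANGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"expected chr#:begin-end, got {text!r}")
    chrom = "chr" + match.group(1)
    begin, end = int(match.group(2)), int(match.group(3))
    if begin > end:
        raise ValueError("begin must not exceed end")
    return chrom, begin, end


print(parse_range("chr7:116147883-116418462"))
```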

  14. How often is ENCODEdb updated?

    ENCODEdb Consortium Data is checked every week for available updates.

    ENCODEdb Genomic Context Data is updated every month.
