Best Practices for Creating Recipes

A recipe is a short, standalone workflow for completing a complex bioinformatic analysis. Recipes answer questions such as "How do I make a network out of a list of genes?", "How do I create a consensus signature from several gene expression signatures?", and "How do I remove adapter sequences from my RNA-Seq data?"

In addition, recipes use GenomeSpace-enabled tools, and limit the amount of external scripting or other computational processes that are necessary to complete a simple bioinformatic analysis. The goal of recipes is to make common tasks easily accomplishable by biologists, clinicians, or other users who are starting out in bioinformatics, and may not have extensive computational skills.

If you have a workflow that you believe will benefit the genomics community, and which uses or can be adapted to use GenomeSpace tools, we welcome you to submit your recipe to the Recipe Resource!

 

Recipe Resource

GenomeSpace users can create recipes and submit them to the Recipe Resource. Only logged-in GenomeSpace users can create or edit recipes. User-contributed recipes will first be reviewed by the GenomeSpace team to prevent misuse, and once approved will be made available for public use. Please read carefully through the documentation on this page to understand how to use the Recipe Resource creation interface and to learn the best practices for recipe creation.

 

F1000 Channel

The GenomeSpace team has created an F1000 channel to host published recipes. Once a recipe has been submitted to the Recipe Resource and reviewed by the GenomeSpace team, the author can publish the recipe in the F1000 channel. The GenomeSpace team can help coordinate submissions of recipes to the F1000 channel. The recipes will then undergo peer review according to the F1000 publication guidelines. Recipe creators will be credited with authorship of that article, and once reviewed and published, the resulting recipe and workflows can be cited normally. If you are interested in submitting a recipe to the F1000 channel, please contact us.

 

Recipe Components

Certain components are necessary to build a successful recipe that adequately describes the motivation and steps for the analysis. Here, we describe necessary components of recipes, and some standard guidelines for creating recipes. Components that are required for submission of a recipe to the Recipe Resource are marked by a red asterisk (*).

 

Title

The recipe should have a descriptive but short title, such that users who may be browsing the Recipe Resource can quickly determine if the recipe suits their needs. Example title: Identify biological functions for genes in copy number variation regions.

 

Video URL

Many GenomeSpace recipes have video components to help users understand a specific aspect of the analysis. We encourage recipe authors to create their own videos to demonstrate or walk through a particular recipe. If you have created your own videos to accompany your recipe, please copy and paste the link into the Video URL box. Please note that we only accept video URLs from publicly available video hosts, such as YouTube or Vimeo. Instead of a video, you can also link to a presentation that is hosted on SlideShare.

Recipe Title Image

The recipe title image is a small image that describes, at a glance, the tools being used in the specific recipe. For example, the following recipe title image describes a recipe that uses the UCSC Table Browser, GenePattern, and IGV:

You must upload this icon separately from the rest of your recipe screenshots.

 

Tags

Recipes should be tagged with some basic information to help identify the core components of a recipe. Tags should not include tool-based tags, which are attached to recipes in a different manner. Tags should describe the input data and/or the analyses used in the recipe, but without providing unnecessary details. For example, DNA microarray is a good tag; however, two-color Illumina Bead Chip microarray is too specific, since most analyses can also be completed on other, non-Illumina microarrays, assuming different pre-processing steps are taken. Similarly, hierarchical clustering is a good tag; however, UPGMA is too specific, as it refers to a specific algorithm, which a user may want to modify using various parameters.

 

Summary

This is a short paragraph (3-4 sentences) explaining the motivation for the recipe. Example questions the motivation section could answer: "What are we trying to accomplish from this analysis?", "Why would we choose to connect these recipe components?", "What would we learn from this recipe?"

If the recipe recapitulates published research, it is appropriate to include PubMed IDs or links to the publications. Recipe authors can also briefly summarize the basic motivation, methods, and conclusions of the paper. In addition, authors can mention ways in which the recipe deviates from the workflow in the published paper, for example, if a recipe substitutes certain tools or algorithms because the originals have not been incorporated into a GenomeSpace tool.

 

Inputs

The input section lists the files needed for analysis, along with brief (~1 sentence) explanations of what the data is. For example, authors may list a specific example file to be used, "gene_list.txt", and provide a description, "A list of interacting genes in a two-column format". If applicable, authors can link to descriptions of file formats for clarity, such as the GenePattern GCT file format guide.

This is also an appropriate place to clarify how input datasets may be generated, if it is not intuitive, and if it was not explained in the Summary section. For example, the file "GISTIC.bed" was generated using the GenePattern GISTIC module, and the resulting output was re-named to have a '.bed' extension instead of the original output.

In GenomeSpace we frequently host the example files used in recipes, to make them available to users. Authors of recipes are encouraged to use existing files that are available in the GenomeSpace example recipe data repository, which is under the username RecipeData in the GenomeSpace public directory. If authors want to host their own example data in GenomeSpace, they are welcome to do so. The GenomeSpace RecipeData repository will consider hosting some example input data files for authors of recipes, on a case-by-case basis.

 

Outputs

The output section describes the basic outputs that can be expected if a user follows along with the recipe, using the example recipe data. This section can either describe a file that is the result of running the workflow, or it can describe the overall outcome. For example, several recipes end in 'visualization' steps in which there is no output file per se, but the user can visualize the results of an analysis as, for example, a network. In this case, the output section would describe that the end result is a visualization of a network. Additional details, such as the biological meaning behind the resulting network, would be described in the Results Interpretation section (see below).

 

Recipe Steps

Step title

This is a short, descriptive title of the individual recipe step. This short title should focus broadly on the particular outcome that the step is trying to accomplish. For example, Loading data into Galaxy, or Slice gene list out of expression file.

 

Video URL

As in the case of the entire recipe, if you have created a video for a specific step, please include it in this section of the recipe. For many GenomeSpace recipes, we create several individual videos which are very short (~1-2 minutes), guiding a user through one step of the recipe. We link these videos at each recipe step. At the top of the page, we link the entire playlist of compiled videos for the readers to view.

Tools used in this step

This section provides a drop-down menu which lists GenomeSpace-enabled tools. Please select the tool or tools used in this individual step of the recipe. The tools you click will become tool-based tags for the recipe. If you want to select more than one tool, use ctrl+click, or click and drag to select the tools.

 

Description

The description of the recipe step is a brief (3-4 sentences) overview of what that particular step is trying to accomplish. The description section will be viewable to readers at all times; in contrast, the content or video sections of the recipe steps can be collapsed or expanded to view them. Therefore, the description section should provide only the very basic information about the step, without overwhelming the reader with too many details. An example description: We will use the GeneMANIA plug-in to find the network of interacting proteins associated with our gene list.

 

Content

This section of a recipe step provides the step-by-step, fine-grained details of how to accomplish the analysis. This is the section which contains details, such as:

  • Menu navigation instructions (e.g. "Click on File, then Print...")
  • Where to move or upload data (e.g. "Click and drag data.gct to the Input box...")
  • What parameters to adjust (e.g. "Select the checkbox next to protein-protein interactions...")
  • Which buttons to click (e.g. "Click the Run button to start the job...")

Individual recipe steps can have content organized as lists, or contain labeled instructions (e.g. "Step 1..."). The recipe steps should also incorporate screenshots and videos, if applicable. Recipe authors should feel free to be as detailed as possible when writing the recipe steps, as these are the components necessary for another user to repeat the analysis described in the recipe.

 

Alternative steps

The alternative steps section provides a place for a recipe author to describe alternative tools or parameters that can be used to accomplish the recipe step. For example, an author who chooses to use the TopHat module in GenePattern in a step can clarify that the TopHat tool could also be used in Galaxy.

 

Results Interpretation

The results interpretation section is one or two paragraphs that explain the overall process followed in the recipe, what was learned at each step, and what new biological insight comes from the overall recipe. The results interpretation section should include screenshots, if applicable. The interpretation can be based on the method, e.g. "The removal of adapter sequences results in higher quality data...", or based more on biological results, e.g. "The signature is associated with zinc-ion binding..."

 

Status

Recipes will be reviewed by GenomeSpace curators prior to publication just to ensure they are not spam or offensive. We will not otherwise judge or restrict the recipe content you have created.

Recipes proceed through the following states:

Created - Recipe has been started but is incomplete.  It is visible only to the author and the GenomeSpace team.

Pending Review - Recipe is awaiting publication by the GenomeSpace team. It is visible only to the author and the GenomeSpace team.

Published - Recipe is visible to all visitors of the GenomeSpace Recipe Resource.

Rejected - Recipe is not and will not be published due to a violation of the community guidelines.

 

 

Recipe Standards

To make it easier for recipe authors to create high-quality recipes, we have provided some guidelines for formatting the text, screenshots, and videos that accompany recipes:

  • The word "recipe" should be lowercase, unless at the start of a sentence or in reference to the Recipe Resource.
  • Appropriate terminology should be used for components of tools (GenePattern modules, Galaxy tools, Cytoscape plugins, etc.)
  • Define terminology that has specific meaning in a recipe, e.g.: cluster, subnetwork, module, edge, differentially expressed gene, etc.
  • Define all acronyms before use.
  • Use the same voice in all titles:
       YES: Load Data into GenomeSpace; Identify a Subnetwork; Download a Dataset, etc.
       NO: Load Data into GenomeSpace; Identifying a Subnetwork, etc.
  • Use the same case in all titles:
       YES: Load Data into GenomeSpace; Identify a Subnetwork; Download a Dataset, etc.
       NO: Load data into GenomeSpace; Identify a list of Genes, etc.
  • Label screenshots clearly and simply.
  • Label videos clearly and simply. If a video is a playlist of other videos, make sure to note it.
  • Text should be colored based on its context:
    • Parameters, Buttons and Menu Items: Change the parameter input filename to your chosen file. Select Save As... in the drop-down menu. Click OK to save your file.
      Use the HTML code:
      <code class="parameter">Your Text Here</code>
    • Filenames: Click and drag the test.gct file to your directory.
      Use the HTML code:
       <code class="filename">Your Text Here</code>
    • Subtools and Modules: Search for ComparativeMarkerSelection in GenePattern.
      Use the HTML code:
       <code class="subtool">Your Text Here</code>
    • Directories: Look for your data in the Public > SharedData folder.
      Use the HTML code:
       <code class="directory">Your Text Here</code>
    • Notes or Warnings: Run the program. This may take several minutes.
      Use the HTML code:
       <code class="note">Your Text Here</code>
    • Keyboard controls: Use ctrl + S to save your output.
      Use the HTML code:
       <kbd>Your Text Here</kbd>
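
As a minimal sketch of how these styles might be combined in the Content section of a single recipe step, the fragment below reuses only the class names listed above and the example phrases from this page. The surrounding <ol> and <li> markup is an assumption for illustration; the recipe editor may structure lists differently.

```html
<!-- Illustrative only: the <ol>/<li> wrapper is an assumption, not a documented requirement. -->
<ol>
  <!-- Subtools and modules -->
  <li>Search for <code class="subtool">ComparativeMarkerSelection</code> in GenePattern.</li>
  <!-- Directories, filenames, and parameters in one instruction -->
  <li>Look for your data in the <code class="directory">Public &gt; SharedData</code> folder, then click and drag the <code class="filename">test.gct</code> file to the <code class="parameter">input filename</code> parameter.</li>
  <!-- Buttons use the parameter class; warnings use the note class -->
  <li>Click <code class="parameter">Run</code> to start the job. <code class="note">This may take several minutes.</code></li>
  <!-- Keyboard controls use the kbd element rather than a code class -->
  <li>Use <kbd>ctrl + S</kbd> to save your output.</li>
</ol>
```

Note that the ">" in the directory path is written as &gt; so that it is not mistaken for markup.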