Merging data frames

In R, there is often a need to merge two data.frame objects (say, one with individual samples and the other with population coordinates). The merge() function is pretty awesome, though it may take a little getting used to.
Here are some things to remember:

  1. You need to have two data.frame objects to merge
  2. The first one in the function call is the one that gets merged onto; the second one is added to the first.
  3. Each will need a column to use as an index: a column that will be used to match rows of data. If the two share column names, the function will match on all of the shared columns automagically. If no common names are found in the names() of the two data.frame objects, you can specify the columns using the optional by.x= and by.y= function arguments.
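
To make the third point concrete, here is a minimal sketch using two small made-up data frames (the names samples, coords, and coords2, the Site column, and all values are invented for illustration). It shows merge() matching on a shared column name, and then the same merge when the key columns are named differently:

```r
# Hypothetical toy data: sample-level records and population coordinates.
samples <- data.frame(
  Population = c("A", "A", "B", "C"),
  Height     = c(1.2, 1.5, 0.9, 2.1)
)
coords <- data.frame(
  Population = c("A", "B", "C"),
  Latitude   = c(24.1, 26.3, 29.7),
  Longitude  = c(-110.5, -111.8, -114.2)
)

# Shared column name: merge() finds "Population" on its own.
merged <- merge(samples, coords)

# Different column names: rename the key in a copy, then tell merge()
# which column to use from each data.frame.
coords2 <- coords
names(coords2)[names(coords2) == "Population"] <- "Site"
merged2 <- merge(samples, coords2, by.x = "Population", by.y = "Site")
```

In both cases each sample row picks up the coordinates of its population, and the key column in the result keeps the name it had in the first data.frame.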

Here is an example. I’m going to load in some data from the popgraph library. First, I’ll load up the library and then grab the population metadata for the Lophocereus data set we analyzed in Dyer & Nason (2004).

The graph itself has nodes that represent populations, and perhaps we are interested in plotting node size as a function of spatial location. We can grab the names and sizes from the popgraph object (it is a kind of igraph) by:
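
A minimal sketch of that kind of extraction, assuming the igraph package is installed and using a made-up three-node graph in place of the real popgraph object (the graph g and its node names and sizes are invented for illustration):

```r
library(igraph)

# Hypothetical three-node graph standing in for the popgraph object.
g <- make_ring(3)
V(g)$name <- c("PopA", "PopB", "PopC")
V(g)$size <- c(5.2, 8.1, 3.4)

# Pull the vertex attributes into a data.frame with one row per node.
df.nodes <- data.frame(
  Population = V(g)$name,
  Size       = V(g)$size
)
```

Naming the first column Population is the design choice that matters here, since it gives the node table a key column that can be matched against the population metadata.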

Now we have baja and df.nodes as two data.frames and can merge them by their common column, Population. If we merge df.nodes onto baja then we get the new data.frame:

but if we do it the other way, we get:
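
As a rough illustration of what changes with argument order, here is a sketch with two small made-up data frames standing in for baja and df.nodes (baja.toy, nodes.toy, and every value in them are invented, not the real data). The key column comes first either way, followed by the remaining columns of the first argument and then those of the second:

```r
# Hypothetical stand-ins for the population metadata and the node table.
baja.toy <- data.frame(
  Population = c("PopA", "PopB", "PopC", "PopD"),
  Latitude   = c(24.1, 26.3, 29.7, 31.0),
  Longitude  = c(-110.5, -111.8, -114.2, -115.0)
)
nodes.toy <- data.frame(
  Population = c("PopC", "PopA", "PopB"),
  Size       = c(3.4, 5.2, 8.1)
)

# Node sizes merged onto the metadata: coordinates come before Size.
onto.baja <- merge(baja.toy, nodes.toy)
names(onto.baja)   # "Population" "Latitude" "Longitude" "Size"

# The other way around: Size comes before the coordinates.
onto.nodes <- merge(nodes.toy, baja.toy)
names(onto.nodes)  # "Population" "Size" "Latitude" "Longitude"

# By default only populations present in both are kept, so PopD is dropped.
nrow(onto.baja)    # 3

# all.x = TRUE keeps every row of the first data.frame, filling Size with NA.
keep.all <- merge(baja.toy, nodes.toy, all.x = TRUE)
nrow(keep.all)     # 4
```

The rows themselves are the same in both orders; only the column order differs, unless one of the all, all.x, or all.y arguments is used to keep unmatched rows.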

Hope this helps.