Quantifying differential rhythmicity between conditions

Load the data

The expression data are in a matrix with one row per gene and one column per sample. The metadata are in a table with one row per sample. To save time and space, the expression data include only a subset of genes.

y = GSE34018$y
y[1:5, 1:5]
#>       GSM840516 GSM840517 GSM840518 GSM840519 GSM840520
#> 12686 11.962830 11.923338 11.098814 10.958933  9.256413
#> 13170  8.989743  9.132606 12.381036 12.441759 14.766070
#> 26897 11.515292 11.625519 10.579969 10.601969 11.096489
#> 11287  7.985859  7.930935  7.674688  7.899531  7.768563
#> 12046  8.024084  7.856703  7.942198  8.172695  7.981340

metadata = GSE34018$metadata
metadata
#>        sample      cond  time
#>        <char>    <fctr> <int>
#>  1: GSM840516 wild-type     0
#>  2: GSM840517 wild-type     0
#>  3: GSM840518 wild-type     4
#>  4: GSM840519 wild-type     4
#>  5: GSM840520 wild-type     8
#>  6: GSM840521 wild-type     8
#>  7: GSM840522 wild-type    12
#>  8: GSM840523 wild-type    12
#>  9: GSM840524 wild-type    16
#> 10: GSM840525 wild-type    16
#> 11: GSM840526 wild-type    20
#> 12: GSM840527 wild-type    20
#> 13: GSM840504  knockout     0
#> 14: GSM840505  knockout     0
#> 15: GSM840506  knockout     4
#> 16: GSM840507  knockout     4
#> 17: GSM840508  knockout     8
#> 18: GSM840509  knockout     8
#> 19: GSM840510  knockout    12
#> 20: GSM840511  knockout    12
#> 21: GSM840512  knockout    16
#> 22: GSM840513  knockout    16
#> 23: GSM840514  knockout    20
#> 24: GSM840515  knockout    20
#>        sample      cond  time

Fit linear models and compute posterior fits

Because the samples were acquired at relatively low temporal resolution (every 4 h), we use three knots instead of the default four, which reduces the flexibility of the spline curves. We specify condColname so getModelFit() knows to fit a differential rhythmicity model.

fit = getModelFit(y, metadata, nKnots = 3L, condColname = 'cond')
fit = getPosteriorFit(fit)

Get rhythm statistics

Next, we use the posterior fits to compute rhythm statistics for each gene in each condition.

rhyStats = getRhythmStats(fit)
print(rhyStats, nrows = 10L)
#>           cond   feature peak_phase peak_value trough_phase trough_value
#>         <fctr>    <char>      <num>      <num>        <num>        <num>
#>   1: wild-type     12686   0.000000  11.848689    10.510719     8.814508
#>   2: wild-type     13170   9.431107  15.056098    22.129887     9.039250
#>   3: wild-type     26897  18.371177  12.403175     4.623781    10.745386
#>   4: wild-type     11287  22.517494   7.903967     7.586936     7.794846
#>   5: wild-type     12046   5.694928   7.984916    20.181606     7.971871
#>  ---                                                                    
#>  96:  knockout    317750  22.569400   8.197825     7.890174     8.050101
#>  97:  knockout    329015  18.277878   9.367790    10.330503     9.236423
#>  98:  knockout    381760  19.734165   9.456695     9.938374     9.257839
#>  99:  knockout    434864  21.966042   7.912880     8.649651     7.761379
#> 100:  knockout 110599566   6.508437   8.959367    22.779614     8.945090
#>      peak_trough_amp     mesor
#>                <num>     <num>
#>   1:      3.03418170 10.368427
#>   2:      6.01684814 12.018767
#>   3:      1.65778881 11.675870
#>   4:      0.10912046  7.854743
#>   5:      0.01304465  7.979693
#>  ---                          
#>  96:      0.14772364  8.104574
#>  97:      0.13136624  9.296439
#>  98:      0.19885626  9.353978
#>  99:      0.15150048  7.829699
#> 100:      0.01427692  8.953484

Get differential rhythm statistics

We can now calculate the rhythmic differences for each gene between any two conditions, here between wild-type and knockout.

diffRhyStats = getDiffRhythmStats(fit, rhyStats)
print(diffRhyStats, nrows = 10L)
#> Key: <feature>
#>       feature     cond1    cond2 mean_mesor mean_peak_trough_amp   diff_mesor
#>        <char>    <fctr>   <fctr>      <num>                <num>        <num>
#>  1:    103266 wild-type knockout   9.094249          0.184903036  0.051687912
#>  2:    108897 wild-type knockout   7.958910          0.006831757 -0.005168699
#>  3: 110599566 wild-type knockout   8.938463          0.069308476  0.030043525
#>  4:     11287 wild-type knockout   7.867503          0.116526530  0.025519670
#>  5:     12046 wild-type knockout   8.038329          0.170680319  0.117271901
#> ---                                                                          
#> 46:     72114 wild-type knockout   7.815911          0.102562698 -0.031350508
#> 47:     74087 wild-type knockout   7.876956          0.120937382  0.031165093
#> 48:     75801 wild-type knockout   7.885490          0.009974481  0.021612144
#> 49:     78697 wild-type knockout   8.435127          0.016110453  0.030223706
#> 50:     93877 wild-type knockout   8.060970          0.134552916  0.103128679
#>     diff_peak_trough_amp diff_peak_phase diff_trough_phase diff_rhy_dist
#>                    <num>           <num>             <num>         <num>
#>  1:         -0.214572098       -4.188981         0.4565494    0.26587667
#>  2:         -0.006489454       10.576316         9.3558268    0.01348066
#>  3:         -0.110063106       -6.830498        -6.7648712    0.12818012
#>  4:          0.014812131       -6.040059        -7.8807586    0.16598258
#>  5:          0.315271340       -6.811385        11.1769819    0.33131162
#> ---                                                                     
#> 46:          0.037126061       -9.577629         6.1331939    0.19524286
#> 47:         -0.025998153       -2.082278        -2.0154710    0.06976247
#> 48:         -0.012272708        5.695024        -8.5424574    0.01626120
#> 49:         -0.001703226        8.330807         9.3559106    0.02858621
#> 50:         -0.086004439       10.791143         8.1130055    0.26608906

We can examine the distributions of the statistics in various ways, such as ranking genes by difference in peak-to-trough amplitude (no p-values necessary) or plotting difference in peak-to-trough amplitude vs. difference in mean expression.

print(diffRhyStats[order(diff_peak_trough_amp)], nrows = 10L)
#>     feature     cond1    cond2 mean_mesor mean_peak_trough_amp  diff_mesor
#>      <char>    <fctr>   <fctr>      <num>                <num>       <num>
#>  1:   13170 wild-type knockout  12.616280            4.3073908  1.19502624
#>  2:   12686 wild-type knockout  10.157657            1.7859377 -0.42154165
#>  3:   26897 wild-type knockout  10.473462            1.0551507 -2.40481581
#>  4:   14385 wild-type knockout  10.335544            0.6410604 -0.23091881
#>  5:   56209 wild-type knockout  11.382764            0.5173328 -0.29032848
#> ---                                                                       
#> 46:   13507 wild-type knockout   7.983178            0.1552713  0.15367719
#> 47:   17252 wild-type knockout   9.983190            0.2583523 -0.08320065
#> 48:   12046 wild-type knockout   8.038329            0.1706803  0.11727190
#> 49:   20775 wild-type knockout  11.830131            1.3835675  0.24859164
#> 50:   22113 wild-type knockout   8.395714            0.6561732  0.26000350
#>     diff_peak_trough_amp diff_peak_phase diff_trough_phase diff_rhy_dist
#>                    <num>           <num>             <num>         <num>
#>  1:           -3.4189147      -0.8277360       -0.01805794     3.5242228
#>  2:           -2.4964879      -6.4332587       -0.45100842     3.1408093
#>  3:           -1.2052763      -0.5625297        5.85548916     1.2119948
#>  4:           -0.9487905       5.2747430       -1.96740822     1.0962868
#>  5:           -0.5254610      -0.9897313        0.16651740     0.5379302
#> ---                                                                     
#> 46:            0.2214059       4.7481008       -0.10424266     0.2551407
#> 47:            0.2605478      -5.3279342        2.99082880     0.3873103
#> 48:            0.3152713      -6.8113848       11.17698194     0.3313116
#> 49:            0.4137977       3.5329385       -1.78226089     1.2889095
#> 50:            0.5568909      -0.2510863       -2.68931784     0.5582583

ggplot(diffRhyStats) +
  geom_point(aes(x = diff_mesor, y = diff_peak_trough_amp), alpha = 0.2) +
  labs(x = bquote(Delta * 'mesor (norm.)'), y = bquote(Delta * 'amplitude (norm.)'))

Get observed and fitted time-courses

We can compute the expected measurements for one or more genes at one or more time-points in each condition, which correspond to the fitted curves. Here we plot the posterior fits and observed expression for three genes (converting from gene id to gene symbol).

genes = data.table(
  id = c('13170', '12686', '26897'),
  symbol = c('Dbp', 'Elovl3', 'Acot1'))

measFit = getExpectedMeas(fit, times = seq(0, 24, 0.5), features = genes$id)
measFit[genes, symbol := i.symbol, on = .(feature = id)]
print(measFit, nrows = 10L)
#>       time      cond feature     value symbol
#>      <num>    <fctr>  <char>     <num> <char>
#>   1:     0 wild-type   13170  9.402594    Dbp
#>   2:     0 wild-type   12686 11.848689 Elovl3
#>   3:     0 wild-type   26897 11.551600  Acot1
#>   4:     0  knockout   13170 11.969182    Dbp
#>   5:     0  knockout   12686  9.801527 Elovl3
#>  ---                                         
#> 290:    24 wild-type   12686 11.848689 Elovl3
#> 291:    24 wild-type   26897 11.551600  Acot1
#> 292:    24  knockout   13170 11.969182    Dbp
#> 293:    24  knockout   12686  9.801527 Elovl3
#> 294:    24  knockout   26897  9.142774  Acot1

Next we combine the observed expression data and metadata. The curves show how limorhyde2 is able to fit non-sinusoidal rhythms.

measObs = mergeMeasMeta(y, metadata, features = genes$id)
measObs[genes, symbol := i.symbol, on = .(feature = id)]
print(measObs, nrows = 10L)
#> Key: <sample>
#>        sample      cond  time feature      meas symbol
#>        <char>    <fctr> <int>  <char>     <num> <char>
#>  1: GSM840504  knockout     0   13170 11.669138    Dbp
#>  2: GSM840504  knockout     0   12686  9.705361 Elovl3
#>  3: GSM840504  knockout     0   26897  8.654624  Acot1
#>  4: GSM840505  knockout     0   13170 11.877697    Dbp
#>  5: GSM840505  knockout     0   12686  9.611530 Elovl3
#> ---                                                   
#> 68: GSM840526 wild-type    20   12686 10.911935 Elovl3
#> 69: GSM840526 wild-type    20   26897 12.486105  Acot1
#> 70: GSM840527 wild-type    20   13170  9.749365    Dbp
#> 71: GSM840527 wild-type    20   12686 11.075636 Elovl3
#> 72: GSM840527 wild-type    20   26897 12.352601  Acot1

ggplot() +
  facet_wrap(vars(symbol), scales = 'free_y', nrow = 1) +
  geom_line(aes(x = time, y = value, color = cond), data = measFit) +
  geom_point(aes(x = time %% 24, y = meas, color = cond, shape = cond),
             size = 1.5, data = measObs) +
  labs(x = 'Zeitgeber time (h)', y = 'Expression (norm.)',
       color = 'Condition', shape = 'Condition') +
  scale_x_continuous(breaks = seq(0, 24, 4)) +
  scale_color_brewer(palette = 'Dark2') +
  scale_shape_manual(values = c(21, 23)) +
  theme(legend.position = 'bottom')