I use Kallisto's transcript abundances to perform DEA using Sleuth.
In the typical Sleuth's workflow, the likelihood ratio test (LRT) is applied. Briefly, the LRT models the likelihood of the data given 2 models:
- full: transcript abundance affected on one or more dependent variables (here just being treated or not)
- reduced: transcript abundance unaffected by the treatment (null hypothesis)
The LRT then estimates for each transcript the ratio of the 2 likelihoods and produces a q-value (i.e. p-value adjusted for multiple testing by means of false discovery rate, FDR) which can be used as measure of significance. Note that the LRT does not produce any metric equivalent to the fold change, which indicates the maginitude of the change in expression between the 2 conditions and is commonly reported in DEA. That's why the script below also applies the Wald test (WT), which is somewhat related to the LRT and is also used to test for differential expression. WT is used becase it generates the beta statistic, which approximates to the log2 fold change in expression between the 2 condition tested. However, LRT is considerd a better test than the WT (see here) and thus significance filtering is based on LRT's q-values.
My implementation of Sleuth (see below) is largerly inspired on the introduction to Sleuth and on this blog post explaining the usage of the WT.
When applying the WT, in order to get beta, understanding how Sleuth reads the 2 treatments/conditions tested is key to understand how expression changes between them.
Imagine that we have 3 samples for each of 2 conditions (untreated and treated):
| sample_id | condition | path |
|---|---|---|
| sample01 | untreated | path1 |
| sample02 | untreated | path2 |
| sample03 | untreated | path3 |
| sample04 | treated | path4 |
| sample05 | treated | path5 |
| sample06 | treated | path6 |
At some point we load the Kallisto-processed data and make a regression model using 'condition' as the dependent variable:
so <- sleuth_prep(tab_metadata, ~ condition)
And then we apply the WT:
so <- sleuth_wt(so, paste0('condition'))
Internally, with sleuth_prep
Sleuth will transform elements in the condition field to 0s and 1s in alphabetical order and then WT's beta values will be relative to the 0 condition; that is, positive beta values showing transcripts in which expression is greater in condition 1 than in condition 0. But this would result in counter-intuitive results in our experiment because beta values would be relative to the 'Treated' samples. So the trick is making that the reference condition ranks first alphabetically speaking. Below is my solution to avoid having to play excessively with the condition names.