This notebook walks through the initial processing and analysis for
the 16N barcodes placed at the galK locus. The structure of the amplicon
library is available at
data/DNA_maps/sequencing_libraries/galk_BC16N_orbit_seq_library.gb
The expected read length is exactly 189 bp with unique barcodes flanked
by identical genomic and pInt sequences.
The raw sequencing data is available on the SRA and is associated with this BioProject.
Library Name | Run ID and Link | Size |
---|---|---|
galK_BC16N | SRR25076308 | 123 Mb |
Setup packages and plotting for the notebook:
# Check packages
source("../../tools/package_setup.R")
# Load packages
library(tidyverse)
library(cowplot)
library(kableExtra)
library(tidysq)
library(ggridges)
# Code display options
knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60),tidy=FALSE, echo = TRUE, message=FALSE, warning=FALSE, fig.align="center", fig.retina = 2)
# Load plotting tools
source("../../tools/plotting_tools.R")
#Modify the plot theme
theme_set(theme_notebook())
The read 1 and read 2 fastq files in
/data/seq_data/galK_BC16N
were first processed using the
galaxy webserver. Reads were quality filtered using fastp on default
(html report is in data folder) and were joined into a single fasta file
using fastq-join and FASTQ to FASTA.
Let’s first read in that fasta file as a dataframe and look at the reads.
#takes a while to run
#df_full <- read_fasta(file_name = "../../../data/seq_data/galK_BC16N/galK-BC16N_trimmed_joined_reads.fasta")
# runs faster
df_full <- readRDS("../../../data/seq_data/galK_BC16N/2022_05_03_galKBC16N_reads_df.rds")
df_full %>% head() %>%
mutate(seq = as.character(sq)) %>% #need to convert from tidysq format to human readable characters
select(seq) %>%
kable() %>% kable_styling()
seq |
---|
TCCGTTAGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTGGTGGGCGAGTAGGTTCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
GCTGTGCGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATAGATTAATTAGTTTGATCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
CTCTTTGGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTAATTCTTCTTTCCTCTCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
CTGTCTTGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTTTCGGTTTTATTTTTTCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
TGTCTGTGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATAGATAAGGCAAATGTATCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
TTATTCGGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATCGCCATACCCCCTCTCTCACAAGGAGCAGGACAGTGCTGAACGAAACTCCCGCACTGGCACCCGATGGTCAGCCGTACCGACTGTTAACTTTGCGTAACAACGCAGGGATGGTA |
Note that this library was run with 10% phiX, but the library index was not read, so these raw reads include both our galKBC16N amplicon product and phiX.
Reads are expected to be 189 length. Let’s check.
df_full_len <- df_full %>%
mutate(len = get_sq_lengths(sq))
ggplot(df_full_len %>% filter(len>125), #there is one stray read below this threshold that distorts plot.
aes(x = len)) +
geom_vline(xintercept = 189, linetype = 2)+
geom_histogram(color = NA, fill = 'gray', shape = 21, binwidth = 1) +
scale_y_log10(labels = scales::label_comma()) +
theme_bw() + labs(x = 'Read length', y = 'Number of reads') + theme(axis.text = element_text(color = 'black', size = 12),axis.title = element_text(color = 'black', size = 14))
The peak read length is at the expected length (dashed line). If we take the first read of the expected length, 189 bp, we can align it to our .gb file of the amplicon library structure using a DNA editor (e.g. benchling). It matches perfectly.
## # A tibble: 6 × 1
## seq
## <chr>
## 1 TCCGTTAGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTG…
## 2 GCTGTGCGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATAG…
## 3 CTCTTTGGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTA…
## 4 CTGTCTTGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATTT…
## 5 TGTCTGTGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATAG…
## 6 TTATTCGGAACTATTACACGTCTTGAGCGATTGGTTTGTCTGGTCAACCACCGCGGTCTCCGTCGTCAGGATCATCG…
## # A tibble: 6 × 1
## seq
## <chr>
## 1 AATCACCAGAAGGCGGTTCCTGAATGAATGGGAAGCCTTCAAGAAGGTGATAAGCAGGAGAAACATACGAAGGCGCA…
## 2 GAACGTGAAAAAGCGTCCTGCGTGTAGCGAACTGCGATGGGCATACTGTAACCATAAGGCCACGTATTTTGCAAGCT…
## 3 TATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGCCGGGCAATAATGTTTATGTTGGTTTCAT…
## 4 CTTAATCCACTGTTCACCATAAACGTGACGATGAGGGACATAAAAAGTCAAAATGTCTACAGTAGAGTCAATAGCAA…
## 5 TCAAAATAATCAGCGTGACATTCAGAAGGGTAATAAGAACGAACCATAAAAAAGCCTCCAAGATTTGGAGGCATGAA…
## 6 GACGCTGACAACCGTCCTTTACTTGTCATGCGCTCTAATCTCTGGGCATCTGGCTATGATGTTGATGGAACTGACCA…
If we instead take the first read of an unexpected length, 250 bp, we see it does not align to the amplicon file. When we blast this sequence we can see it is the phiX control.
To find reads from which we want to confidently extract unique barcodes, we take a simple approach - looking for reads with perfect flanking sequences. We will look within each read for the 10 bp preceding the random barcode and following the random barcode. We will also look for the first 10 bp of read 1 and read 2 to make sure the read has the expected structure. Let’s first try to find these flanking sequences and see how many of our reads conform to the expected structure perfectly.
pre_BC_seq <- 'TCAGGATCAT'
post_BC_seq <- 'TCACAAGGAG'
read_1_seq <- 'GAACTATTAC'
read_2_seq <- 'AGGGATGGTA'
df_full_check <- df_full_len %>%
mutate(pre_BC = sq %has% pre_BC_seq)%>%
mutate(post_BC = sq %has% post_BC_seq)%>%
mutate(read_1 = sq %has% read_1_seq)%>%
mutate(read_2 = sq %has% read_2_seq) %>%
mutate(perfect = pre_BC & post_BC & read_1 & read_2)
df_full_check %>%
group_by( perfect) %>%
summarise(mean_len = mean(len), sd_len = sd(len), n = n()) %>%
mutate(frac = scales::label_percent(accuracy = 0.01)(n/ length(df_full_check$name)))
## # A tibble: 2 × 5
## perfect mean_len sd_len n frac
## <lgl> <dbl> <dbl> <int> <chr>
## 1 FALSE 190. 18.0 61927 7.82%
## 2 TRUE 189. 1.11 730107 92.18%
So, we can see that ~92% of the reads are perfect, meaning that they have all four flanking sequences. Perfect and imperfect sequences have similar mean lengths centered around 189 bp, however, the imperfect reads have a much larger standard deviation (18 vs. 1). We fully expect that the perfect sequences here match the expected read structure, but the imperfect reads will be mostly a mixture of phiX control and sequencing errors of otherwise perfect constructs.
We can see some evidence of this by looking at the length distributions of reads that have the read1 and read2 flanking sequences for our amplicon, vs. those that do not posess either and are therefore very “wrong.”
ggplot(df_full_check %>%
mutate(expected_read = ifelse(read_1 == T & read_2 == T, T, NA)) %>%
mutate(expected_read = ifelse(read_1 == F & read_2 == F, F, expected_read)) %>%
filter(!is.na(expected_read)),
aes(x = len, y = expected_read)) +geom_density_ridges() +geom_vline(xintercept = 189, linetype = 2)+
labs(x = 'Read length (bp)', y = 'Expected amplicon read structure')
You can see that the reads that do not contain the read1/2 flanking sequences have a very wide length distribution, likely stemming from the random fragmentation of the phiX control library. Meanwhile, the correct read1/2 reads show a tight peak centered on our amplicon length of 189.
Therefore, we feel confident proceeding with the 92% of our library that has roughly perfect read structure (based on flanking sequences) and we will toss away the imperfect reads.
The previous analysis of flanking sequences only looked for presence or absence of the desired sequence (code runs very fast). However, to extract barcodes we actually need to know the location of each barcode flanking sequence (code runs a bit slower). Then we will try to extract the intervening sequence.
# takes about 15 min to run
df_perfect <- df_full_check %>% filter(perfect == T)
# Code unexpectedly fails when dataframes are processed together, therefore split into 100,000 row subsets and runs fine.
pre_BC_motifs_1 <- find_motifs(df_perfect$sq[1:100000], df_perfect$name[1:100000], pre_BC_seq)
pre_BC_motifs_2 <- find_motifs(df_perfect$sq[100001:200000], df_perfect$name[100001:200000], pre_BC_seq)
pre_BC_motifs_3 <- find_motifs(df_perfect$sq[200001:300000], df_perfect$name[200001:300000], pre_BC_seq)
pre_BC_motifs_4 <- find_motifs(df_perfect$sq[300001:400000], df_perfect$name[300001:400000], pre_BC_seq)
pre_BC_motifs_5 <- find_motifs(df_perfect$sq[400001:500000], df_perfect$name[400001:500000], pre_BC_seq)
pre_BC_motifs_6 <- find_motifs(df_perfect$sq[500001:600000], df_perfect$name[500001:600000], pre_BC_seq)
pre_BC_motifs_7 <- find_motifs(df_perfect$sq[600001:700000], df_perfect$name[600001:700000], pre_BC_seq)
pre_BC_motifs_8 <- find_motifs(df_perfect$sq[700001:length(df_perfect$name)], df_perfect$name[700001:length(df_perfect$name)], pre_BC_seq)
post_BC_motifs_1 <- find_motifs(df_perfect$sq[1:100000], df_perfect$name[1:100000], post_BC_seq)
post_BC_motifs_2 <- find_motifs(df_perfect$sq[100001:200000], df_perfect$name[100001:200000], post_BC_seq)
post_BC_motifs_3 <- find_motifs(df_perfect$sq[200001:300000], df_perfect$name[200001:300000], post_BC_seq)
post_BC_motifs_4 <- find_motifs(df_perfect$sq[300001:400000], df_perfect$name[300001:400000], post_BC_seq)
post_BC_motifs_5 <- find_motifs(df_perfect$sq[400001:500000], df_perfect$name[400001:500000], post_BC_seq)
post_BC_motifs_6 <- find_motifs(df_perfect$sq[500001:600000], df_perfect$name[500001:600000], post_BC_seq)
post_BC_motifs_7 <- find_motifs(df_perfect$sq[600001:700000], df_perfect$name[600001:700000], post_BC_seq)
post_BC_motifs_8 <- find_motifs(df_perfect$sq[700001:length(df_perfect$name)], df_perfect$name[700001:length(df_perfect$name)], post_BC_seq)
pre_BC_motifs <- bind_rows(pre_BC_motifs_1, pre_BC_motifs_2, pre_BC_motifs_3, pre_BC_motifs_4, pre_BC_motifs_5, pre_BC_motifs_6, pre_BC_motifs_7, pre_BC_motifs_8)
saveRDS(pre_BC_motifs, file = 'galKBC16N_pre_BC_motifs.rds')
#write_csv(pre_BC_motifs, file = 'galKBC16N_pre_BC_motifs.csv')
post_BC_motifs <- bind_rows(post_BC_motifs_1, post_BC_motifs_2, post_BC_motifs_3, post_BC_motifs_4, post_BC_motifs_5, post_BC_motifs_6, post_BC_motifs_7, post_BC_motifs_8)
saveRDS(post_BC_motifs, file = 'galKBC16N_post_BC_motifs.rds')
#write_csv(post_BC_motifs, file = 'galKBC16N_post_BC_motifs.csv')
With the locations for each motif found, we can look and see where in each read the pre and post barcode flanking sequences were found.
pre_BC_motifs <- readRDS("galKBC16N_pre_BC_motifs.rds")
post_BC_motifs <- readRDS("galKBC16N_post_BC_motifs.rds")
ggplot(data = pre_BC_motifs %>% group_by(end) %>% summarise(n = n())%>% mutate(pos = end), aes(x = pos, y = n)) +
geom_vline(xintercept = c(75,92), linetype = 2)+
geom_path(color = 'red') +
geom_path(data = post_BC_motifs %>% group_by(start) %>% summarise(n = n()) %>% mutate(pos = start), color = 'blue')+
scale_y_log10(labels = scales::label_comma()) +
labs(x = 'Putative barcode stop / start position', y = 'Read count')
Since most reads seem to have the correct putative barcode start and stop position (dashed lines), let’s actually extract the putative barcode length for each read.
BC_motifs <- left_join(pre_BC_motifs %>% select(names, start, end), post_BC_motifs %>% select(names, start, end), by = 'names', suffix = c('_pre_BC', '_post_BC')) #match by read names / id
BC_motifs_lens <- BC_motifs %>% mutate(len = (start_post_BC -1) - (end_pre_BC + 1) + 1 ) #go 1 nt back from postBC and 1 nt forward from pre BC. Add back 1 nt to be inclusive
BC_motifs_lens %>% group_by(len) %>% summarise(n=n()) %>% arrange(desc(n))
## # A tibble: 30 × 2
## len n
## <dbl> <int>
## 1 16 666265
## 2 15 35412
## 3 14 6700
## 4 13 6165
## 5 12 3641
## 6 17 3463
## 7 11 2648
## 8 9 1544
## 9 10 1322
## 10 7 850
## # … with 20 more rows
BC_motifs_lens %>% group_by(len) %>% summarise(n=n()) %>%
ggplot(data = ., aes(x= len, y = n)) +
geom_vline(xintercept = 16, linetype = 2)+
geom_point(shape = 21, size = 2, fill ='light gray') +
scale_y_log10(labels = scales::label_comma()) +
theme_bw()+
theme(axis.text = element_text(color = 'black', size = 12),axis.title = element_text(color = 'black', size = 14))+
labs(y = "Reads", x = "Putative barcode length")
This plot shows that most barcodes are the expected 16 bp long and some are a few nt shorter (or longer). We will only consider the full length 16 bp barcodes from here on, although there are almost surely unique barcodes that have errors that make them shorter.
Although most reads with 16bp barcodes have those barcodes in the exact correct position, there is some minor variance in barcode position within the read that we should consider.
df_full_len_BCs <- left_join(BC_motifs_lens %>% filter(len==16), df_full, by = c('names' = 'name'))
df_full_len_BCs %>% group_by(end_pre_BC) %>% summarise(n=n())
## # A tibble: 14 × 2
## end_pre_BC n
## <int> <int>
## 1 68 25
## 2 69 48
## 3 70 50
## 4 71 95
## 5 72 271
## 6 73 2162
## 7 74 28140
## 8 75 635187
## 9 76 282
## 10 77 1
## 11 79 1
## 12 81 1
## 13 83 1
## 14 137 1
We can process these reads by the end position of their pre_BC flanking sequence (i.e. the position before the putative barcode). From there we can extract the following 16bp sequence before the post_BC flanking sequence.
We can check that the correct read indices are used for the dominant (pos = 75) and a shorter group of reads (pos = 73).
# group by end_pre_BC to make sure correct indices are used for bite
df_full_len_BCs_ext <- df_full_len_BCs %>%
group_by(end_pre_BC) %>%
mutate(ind = paste0((mean(end_pre_BC)+1),":",(mean(start_post_BC)-1))) %>%
mutate(BC_sq = bite(sq,indices = (mean(end_pre_BC)+1):(mean(start_post_BC)-1))) %>%
mutate(BC_seq = as.character(BC_sq)) %>%
mutate(BC_len = nchar(BC_seq))
df_full_len_BCs_ext %>% filter(end_pre_BC == 75) %>% head() %>% select(names, BC_seq, BC_len, ind)
## # A tibble: 6 × 5
## # Groups: end_pre_BC [1]
## end_pre_BC names BC_seq BC_len ind
## <int> <chr> <chr> <int> <chr>
## 1 75 1 TGGTGGGCGAGTAGGT 16 76:91
## 2 75 2 AGATTAATTAGTTTGA 16 76:91
## 3 75 3 TAATTCTTCTTTCCTC 16 76:91
## 4 75 4 TTTCGGTTTTATTTTT 16 76:91
## 5 75 5 AGATAAGGCAAATGTA 16 76:91
## 6 75 6 CGCCATACCCCCTCTC 16 76:91
## # A tibble: 6 × 5
## # Groups: end_pre_BC [1]
## end_pre_BC names BC_seq BC_len ind
## <int> <chr> <chr> <int> <chr>
## 1 73 785 ATAAAAAACAAAAAGA 16 74:89
## 2 73 1014 GTGTCTTTTTTTTTAT 16 74:89
## 3 73 1517 AAGGGTTTTTTTTTTT 16 74:89
## 4 73 1945 GGGATGGGAAGAGGGC 16 74:89
## 5 73 2251 CTCACGTTCCCTTCAC 16 74:89
## 6 73 3210 ACATGATTATTGTTTG 16 74:89
Looks correct, so now we can group all the BC sequences that are identical together and count how many unique sequences we have:
df_unique_BC_counts <- df_full_len_BCs_ext %>% group_by(BC_seq) %>% summarise(n = n()) %>% arrange(desc(n))
#write_delim(df_unique_BC_counts, file = "galKBC16N_unique_BC_counts.txt", delim = '\t',col_names = F)
#sum(df_unique_BC_counts$n)
df_unique_BC_counts
## # A tibble: 109,504 × 2
## BC_seq n
## <chr> <int>
## 1 TTTTTTTTTTTTTTTT 766
## 2 TTTTTGTTTTTTTTTT 403
## 3 TTTCTTTTTTTTTTTT 367
## 4 TTTGTTTTTTTTTTTT 351
## 5 TTTTTTTTTGTTTTTT 232
## 6 TTTTTTGTTTTTTTTT 212
## 7 AAAAAAAAAAAAAAAA 180
## 8 TTTTTCTTTTTTTTTT 178
## 9 GGGGGGGGGAGGGTGT 174
## 10 TTTATTTTTTTTTTTT 170
## # … with 109,494 more rows
Based on the length of this dataframe, we observe 109,504 unique barcode sequences. We have written this dataframe to text file for later use, but let’s look at a plot of this data:
ggplot(df_unique_BC_counts, aes(x = n)) +
geom_histogram(color = 'black', fill = 'light gray') +
scale_x_log10() + scale_y_log10() +
theme_bw() +
labs(x = 'Reads', y = 'Unique Barcodes')+
theme(axis.text= element_text(color = 'black', size = 12),axis.title = element_text(color = 'black', size = 14))
Although 109k barcode sequences is a pretty reasonable number based on the hundreds of thousands of mutant colonies that were obtained, we should be conservative and see if any of these barcodes are likely to be small sequencing errors. This does seem possible, since many unique barcodes are only observed 1 time.
We will use the program starcode to do an error correction. Essentially it calculates the hamming distance between barcodes and for low abundance barcodes that are very similar to high abundance barcodes, it collapses them into the same barcode sequence and assumes the low abundance barcode was a sequencing or PCR error.
We call starcode separately in the terminal with the following code
starcode -d 3 -s -i "input_file.txt" -o "output_file.txt"
using galKBC16N_unique_BC_counts.txt
as the input file and
galKBC16N_unique_BC_starcode_counts.txt
df_starcode <- read_delim("galKBC16N_unique_BC_starcode_counts_d3s.txt",col_names = c('seq','n'))
df_starcode
## # A tibble: 43,419 × 2
## seq n
## <chr> <dbl>
## 1 TTTTTTTTTTTTTTTT 26953
## 2 AAAAAAAAAAAAAAAA 12219
## 3 GGGGGGGGGAGGGTGT 6872
## 4 CCCCCTTCCCCCTCCC 2939
## 5 GGGGGGGTGATTGGGT 1032
## 6 TTGTTTGTTGGTTTTT 986
## 7 GGTGAGGGGGGGGTGG 904
## 8 GGGGGGGAGGGGGTAG 801
## 9 TGGTTGTTTTGTTTGT 800
## 10 TTTGGTTCGTTTTTTT 797
## # … with 43,409 more rows
## # A tibble: 29,973 × 2
## seq n
## <chr> <dbl>
## 1 TTTTTTTTTTTTTTTT 26953
## 2 AAAAAAAAAAAAAAAA 12219
## 3 GGGGGGGGGAGGGTGT 6872
## 4 CCCCCTTCCCCCTCCC 2939
## 5 GGGGGGGTGATTGGGT 1032
## 6 TTGTTTGTTGGTTTTT 986
## 7 GGTGAGGGGGGGGTGG 904
## 8 GGGGGGGAGGGGGTAG 801
## 9 TGGTTGTTTTGTTTGT 800
## 10 TTTGGTTCGTTTTTTT 797
## # … with 29,963 more rows
Following error correction we obtained ~43k unique barcodes, with ~30k barcodes that were observed at least 4 times. Expressing this data as a plot gives almost the final figure:
ggplot(df_starcode, aes(x = n)) +
geom_vline(xintercept = 4, linetype = 2)+
geom_histogram(color = 'black', fill = 'light gray', bins = 30) +
geom_vline(xintercept = 4, linetype = 2)+
scale_x_log10(limits = c(NA,1000)) + scale_y_log10() +
theme_bw() +
labs(x = 'Reads', y = 'Unique Barcodes')+
theme(axis.text= element_text(color = 'black', size = 12),axis.title = element_text(color = 'black', size = 14))
The final figure will be created in the main figure notebook for figure #7.
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur/Monterey 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggridges_0.5.4 tidysq_1.1.3 ggrastr_1.0.1 kableExtra_1.3.4
## [5] cowplot_1.1.1 viridis_0.6.2 viridisLite_0.4.1 knitr_1.41
## [9] forcats_0.5.2 stringr_1.5.0 dplyr_1.1.0 purrr_0.3.5
## [13] readr_2.1.3 tidyr_1.2.1 tibble_3.1.8 ggplot2_3.4.0
## [17] tidyverse_1.3.2
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.4 sass_0.4.4 bit64_4.0.5
## [4] vroom_1.6.0 jsonlite_1.8.3 modelr_0.1.10
## [7] bslib_0.4.1 assertthat_0.2.1 highr_0.9
## [10] vipor_0.4.5 googlesheets4_1.0.1 cellranger_1.1.0
## [13] yaml_2.3.6 pillar_1.8.1 backports_1.4.1
## [16] glue_1.6.2 digest_0.6.30 checkmate_2.1.0
## [19] rvest_1.0.3 colorspace_2.0-3 htmltools_0.5.4
## [22] pkgconfig_2.0.3 broom_1.0.1 haven_2.5.1
## [25] scales_1.2.1 webshot_0.5.4 svglite_2.1.0
## [28] tzdb_0.3.0 timechange_0.1.1 googledrive_2.0.0
## [31] farver_2.1.1 generics_0.1.3 ellipsis_0.3.2
## [34] cachem_1.0.6 withr_2.5.0 cli_3.4.1
## [37] magrittr_2.0.3 crayon_1.5.2 readxl_1.4.1
## [40] evaluate_0.18 fs_1.5.2 fansi_1.0.3
## [43] xml2_1.3.3 beeswarm_0.4.0 tools_4.2.0
## [46] hms_1.1.2 gargle_1.2.1 lifecycle_1.0.3
## [49] munsell_0.5.0 reprex_2.0.2 compiler_4.2.0
## [52] jquerylib_0.1.4 systemfonts_1.0.4 rlang_1.1.1
## [55] grid_4.2.0 rstudioapi_0.14 labeling_0.4.2
## [58] rmarkdown_2.18 gtable_0.3.1 DBI_1.1.3
## [61] R6_2.5.1 gridExtra_2.3 lubridate_1.9.0
## [64] bit_4.0.5 fastmap_1.1.0 utf8_1.2.2
## [67] stringi_1.7.8 ggbeeswarm_0.7.1 parallel_4.2.0
## [70] Rcpp_1.0.9 vctrs_0.5.2 dbplyr_2.2.1
## [73] tidyselect_1.2.0 xfun_0.35