Contributed Talks 6
Charlotte Soneson (Friedrich Miescher Institute for Biomedical Research) Research Associate
Davide Risso (University of Padova) Assistant Professor
Anthony Sonrel (University of Zürich) PhD Student
Stephanie Hicks (Johns Hopkins Bloomberg School of Public Health) Assistant Professor
3:00 PM - 3:55 PM EDT on Friday, 31 July
Preprocessing choices affect RNA velocity results for droplet scRNA-seq data
Bench pressing differential abundance methods for microbiome data
Bench pressing performant single cell RNA-seq preprocessing tools through pipeComp; a general framework for the evaluation of computational pipeline.
Bench pressing single-cell RNA-sequencing imputation methods.
Moderator: Matthew McCall, Simone Bell, Kayla Interdonato
My research interests include - Statistical Modeling of high-throughput data (such as microarray data, high-throughput sequencing) - Development of novel methods for RNA-Seq data analysis. - Normalization, meta-analysis and pathway analysis of gene expression data. - Computationally intensive statistical methods.
Bioinformatician experienced in quantitative genetics, from population genetics to single-cell analysis. Interested in method development and immunology. Creative, open-minded and independent.
I am an Assistant Professor in the Department of Biostatistics at Johns Hopkins Bloomberg School of Public Health. I received my B.S. in Mathematics from LSU and my M.A. and Ph.D. from the Department of Statistics at Rice University under the direction of Marek Kimmel and Sharon Plon. I completed my postdoctoral training with Rafael Irizarry in the Department of Biostatistics and Computational Biology at Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health. This postdoctoral research resulted in a K99/R00 grant from the National Human Genome Research Institute (NHGRI) to develop statistical methods for the normalization and quantification of single-cell RNA-Sequencing data. My broad research interests focus around developing statistical methods and tools in application for genomics, epigenomics, functional genomics and most recently, single-cell data. Specifically, my research addresses statistical challenges such as the pre-processing, normalization, analysis of raw, noisy high-throughput data (microarray and next-generation sequencing) leading to an improved quantification and understanding of biological variability.
Alright, welcome everybody to the sixth and final contributed session of Bioconductor 2020. We have four amazing talks, each approximately 10 minutes in length, followed by a 15-minute Q&A session. Please submit questions via the Pathable poll function, and make sure you signify in some way [unclear]. Our first speaker is Charlotte Soneson.

I am going to talk about a study where we looked at abundance quantification, specifically for RNA velocity. In general, RNA velocity analysis is used for trajectory reconstruction: it takes single-cell RNA-seq data with cells from different parts of the trajectory, and then it performs a joint modeling of mature mRNA and pre-mRNA abundances to infer whether the expression level of each gene is on its way up or down, or in a steady state. Combining this information across all the genes basically lets us infer where in gene expression space each cell is headed. For interpretation, RNA velocities are often visualized overlaid on top of a low-dimensional embedding of the cells, as we see here in the figure to the right, where the dots are the cells in a low-dimensional space and the velocities are shown in the form of streamlines indicating the main direction of flow.

So what did we do? Well, to fit this model to the pre-mRNA and mature mRNA abundances, we first need to estimate them, and we focused on this abundance estimation part: we wanted to compare the existing estimation approaches. We approached this by considering several public droplet single-cell RNA-seq datasets with some known dynamics, and 12 different quantification approaches, each of which gave us pre-mRNA and mature mRNA abundance estimates. These approaches included different ways of running alevin, kallisto|bustools, STARsolo and velocyto [unclear]. We then applied [unclear] to each of these abundance estimates and compared the results to each other. I will not go too deep into the results here, but I would point out a couple of interesting aspects. The first and maybe the most important takeaway, I think, is that the choice of abundance estimation method doesn't just show up as small changes in the count matrix; it also propagates to the estimated velocities and to the biological interpretation. Here, for example, we see velocity streamlines estimated based on abundances from two different approaches, displayed on the same low-dimensional representation, and the arrows indicate several places where the streamlines actually point in completely opposite directions between the two quantifications. In addition, we noticed that the correlation among velocities based on different abundances was often lower than the correlation among the abundances themselves.
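One simple way to make "streamlines pointing in opposite directions" concrete is to compare, cell by cell, the velocity vectors obtained from two quantifications. The following is a minimal sketch under stated assumptions: the vectors are invented toy numbers, the function names are hypothetical, and cosine similarity is used here only as an illustrative measure, not as the comparison actually performed in the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compare_velocities(vel_a, vel_b):
    """Per-cell cosine similarity between two lists of velocity vectors."""
    return [cosine_similarity(u, v) for u, v in zip(vel_a, vel_b)]


# Hypothetical 2-D velocity vectors for three cells, one list per quantification.
vel_method_a = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
vel_method_b = [[1.0, 0.1], [-0.5, -0.5], [0.0, 2.0]]

similarities = compare_velocities(vel_method_a, vel_method_b)

# Cells whose velocities point in roughly opposite directions (negative cosine).
opposite_cells = [i for i, s in enumerate(similarities) if s < 0]
print(similarities, opposite_cells)
```

A value near 1 means the two quantifications agree on where the cell is headed, and a negative value flags a cell for which they disagree on the direction.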
So, you know, the preprint goes into much more detail on these differences and explains some reasons for the differences between the methods. But I think this really shows that abundance estimation is an essential part of the RNA velocity analysis. One thing that turns out to be quite important, and which is to some extent independent of the software package that we use for abundance estimation, is how we deal with reads mapping to ambiguous regions, that is, regions that can be either intronic or exonic, depending on which isoform is considered. I should also say here that we often use the intronic read or UMI count as a representation of the pre-mRNA abundance, and the exonic or full-transcript read or UMI count as a representation of the mature mRNA abundance. Consider this gene here, for example, which has two partially overlapping isoforms. There are ambiguous regions, indicated here by rectangles, and for a read mapping completely within these, we cannot directly and unambiguously classify it as either exonic or intronic. The different methods deal with such reads in different ways. You could say that a read is considered exonic only if it is compatible exclusively with exons, or conversely that if there is any evidence of a read being intronic, it is considered intronic. You could include both exonic and intronic sequences in your set of reference features and quantify them jointly. Or you could decide to count the read twice, once as exonic and once as intronic, or not at all. And the result of this choice has a considerable impact on the assigned counts for genes like this.
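The counting choices for ambiguous reads described above can be sketched as a small tally function. This is a toy illustration with hypothetical labels and strategy names; it does not reproduce the behavior of any specific tool, and it does not model the joint quantification option, where exonic and intronic features compete for reads inside the quantifier.

```python
def count_reads(read_classes, strategy):
    """Tally spliced and unspliced counts for one gene.

    read_classes: list of "exonic", "intronic" or "ambiguous" labels.
    strategy: how ambiguous reads are treated (hypothetical names):
      "exonic"   -> count them as exonic (spliced)
      "intronic" -> count them as intronic (unspliced)
      "both"     -> count them once for each (double counting)
      "discard"  -> drop them
    """
    if strategy not in {"exonic", "intronic", "both", "discard"}:
        raise ValueError(f"unknown strategy: {strategy}")
    counts = {"spliced": 0, "unspliced": 0}
    for label in read_classes:
        if label == "exonic":
            counts["spliced"] += 1
        elif label == "intronic":
            counts["unspliced"] += 1
        elif label == "ambiguous":
            if strategy in ("exonic", "both"):
                counts["spliced"] += 1
            if strategy in ("intronic", "both"):
                counts["unspliced"] += 1
        else:
            raise ValueError(f"unknown read class: {label}")
    return counts


# Toy gene: 5 clearly exonic reads, 2 clearly intronic reads, 3 ambiguous reads.
reads = ["exonic"] * 5 + ["intronic"] * 2 + ["ambiguous"] * 3

for strategy in ("exonic", "intronic", "both", "discard"):
    print(strategy, count_reads(reads, strategy))
```

Even in this tiny example the spliced-to-unspliced ratio changes substantially with the strategy, which is the quantity the velocity model depends on.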
And generally we got the most reasonable results when quantifying the exons and introns jointly, and conversely the worst when we double counted the ambiguous reads as both exonic and intronic. Now, the question that undoubtedly arises is: so, which method is the best one? I have to say that is a difficult question, and part of the reason for that is that we don't really have an established way of simulating realistic exonic and intronic reads in a way that actually captures the characteristics of experimental data.
And that means that at the moment, much of the evaluation actually has to be done on experimental data, for which the truth may not be fully known. We're also expecting the optimal choice to depend on [unclear]; here we're just considering 10x data, and the optimal approach may very well be different for full-length protocols. So I think there's still room for innovation and protocol-specific investigation across datasets. That said, we generally got the most consistently reasonable results with alevin, when quantifying the exons and introns jointly. kallisto|bustools and STARsolo also performed well. For several of the tools there are different ways of running the same tool, for example whether or not ambiguous reads are counted, [unclear]. Here, for example, we see the results of running alevin and STARsolo, and the results look pretty similar in this dataset. On the other hand, we show the results of running alevin in two different ways, quantifying the exons and introns jointly or [unclear], with a larger difference, especially with double counting.

So, [unclear] based on observations from our evaluation. [unclear] introns [unclear], which is particularly handy in the setting of this particular work; so is BUSpaRse, or using functions from GenomicFeatures. [unclear] specifically developed for quantification of single-cell RNA-seq, which you may have already heard about in [unclear] workshop yesterday, and [unclear] for use with a software package from [unclear]. And then I wanted to highlight here the velociraptor R package, which is currently under development and which Aaron already mentioned in his keynote today. It takes SingleCellExperiment objects as input, so basically you don't have to leave R. Lastly, I would like to round up and mention that there is a preprint available if you're interested in reading in more detail about what we did. Here are the collaborators, and thank you very much.
The next talk is by Davide Risso.

So I'm very excited to share some of the results that we have for our benchmarking of differential abundance methods for microbiome data, and I'd like to start by thanking my coauthors on this project: Matteo Calgaro, [unclear] from the University of Verona, [unclear] School of Public Health, and especially I would like to thank Matteo for all his work with this analysis. So what is differential abundance? When we're studying the microbiome, which is the collection of microbes that live in and on our bodies, one of the main analyses is differential abundance. As the name suggests, we are looking for differences in the microbiome composition of two groups of people, for example healthy versus diseased, resembling differential expression in RNA-seq. And in fact, many of the methods that were developed for RNA-seq are used for microbiome data. However, microbiome data are highly sparse and compositional, and so this is not a straightforward application of differential expression methods, and bespoke methods specifically designed for differential abundance have been proposed. On top of all this, since one of the challenges of microbiome data is its sparsity, we were asking ourselves whether methods that were developed for single-cell RNA-seq, which is also characterized by sparsity, could be useful in this context. We decided to benchmark a lot of methods that have been used in this type of analysis, borrowing them from bulk RNA-seq; some methods developed for single-cell RNA-seq, such as using zinbwave weights in conjunction with [unclear] methods, or using MAST, Seurat and scde, which are all methods developed for single-cell RNA-seq; and metagenomics-specific methods such as ALDEx2, metagenomeSeq, corncob, songbird, and mixMC, which is part of the mixOmics package. I'm also happy to report that most of these methods are actually implemented as R packages, which made our lives much easier in this benchmarking.

So how did we actually perform this assessment? First of all, I want to say that benchmarking is challenging per se, especially in genomics, because we often lack a gold standard. We could rely on synthetic data, but that's not always ideal, because it's very hard to simulate realistic data. So what we decided to do, for the most part, is to rely on real, curated data. In particular, Levi's group has these two wonderful Bioconductor packages, HMP16SData and curatedMetagenomicData, and overall our benchmark comprises 100 datasets from a variety of projects and different body sites. So this is a very diverse set of data on which to benchmark our methods [unclear]. Of course, these 10 minutes are not enough to go through everything that we did in this study, but I'll just give you a one-slide overview. These were our objectives. We first looked at the goodness of fit of the underlying statistical models; we looked at the concordance both within each method, on random splits of the data, to look at the stability of the results of each method, and also the concordance between methods, to look at similarities between
methods. And then we looked at power through parametric simulations, but more importantly using an enrichment analysis, which I will show in more detail in the next slides.

This is the type I error control figure in our paper; this one is for the 16S data, and there is a similar one for shotgun whole-metagenome sequencing. Here on the left side of the slide you can see, for each method, its ability to control type I error at three usual thresholds. So here is what we did. We used mock comparisons in a 16S dataset: we took one dataset, we randomly split it in two, and we compared the two halves. Because we're comparing two halves of the same dataset, we assume that there are no differences, and any differences that we find are actually false discoveries. The proportion of false discoveries is an estimate of the type I error. And I should say that we repeated this random split 1,000 times, so that's why we have boxplots here [unclear]. You can see that we sort of replicate what was already observed in some other papers, which is that most methods do not control type I error. We also see that some methods are very conservative, like for example ALDEx2, MAST and scde, and there are some methods, for example metagenomeSeq, that are very liberal. I should say that, while many methods do not technically control type I error, they have an observed alpha that is only slightly higher than the nominal alpha, suggesting that, even though they technically do not control type I error, they are probably going to do fine in practice. And this includes in particular DESeq2, this one here, which is the one that probably most closely controls the type I error without being conservative, which seems to be very hard. [unclear]

On the right side we have a different aspect of this analysis, where we look at the distance, in terms of the Kolmogorov-Smirnov statistic, between the empirical distribution of the p-values and the uniform distribution, which is the theoretical distribution in the case that there are no differences. You can see that some methods, such as corncob, do very well and are very close to the theoretical uniform distribution of the p-values, while others [unclear] are on opposite sides of the spectrum: one is very conservative and the other one is very liberal. This is only part of the picture, because it tells us about false discoveries. But what about the true
discoveries? We sort of got at that with an enrichment analysis. [unclear] I won't go into details, but we leveraged a comparison that we had in one of the datasets, of supragingival versus subgingival plaque. That is, there are some microbes that are aerobic, and they need oxygen to survive. So they can survive in the supragingival plaque, which is exposed to oxygen, but not in the subgingival plaque. There are other microbes that are anaerobic, and they thrive in the absence of oxygen. So one expects to see an enrichment of aerobic taxa in the supragingival plaque, and anaerobic taxa overabundant in the subgingival plaque. And so we essentially assessed the methods by their ability to find this sort of enrichment. The third category that we have is the facultative anaerobic, and we should see no enrichment there because, as the name suggests, these microbes are able to switch between aerobic and anaerobic. Here you see that pretty much we confirm that some methods are too conservative, in particular ALDEx2, scde and MAST: they don't find any enrichment, and they actually find barely any differentially abundant taxa. And there are some methods, such as metagenomeSeq, that find the correct enrichment but also find many discoveries [unclear]. [unclear] is actually quite good: it detects the correct enrichment without finding too many false positives. And the good news is that when we look at the top differentially abundant taxa, we see that for most of them the majority of the methods find them differentially abundant, and find them to be on the same side, with the notable exception of metagenomeSeq, which has many [unclear].

So, just to conclude: we performed many more analyses, and here is sort of our attempt to summarize all that we have learned. We find that in general limma-voom, corncob and DESeq2 are the most stable methods across a variety of comparisons, but of course the perfect method does not exist. We need to look at the data and
do careful exploratory data analysis to decide which method would work best. Another thing that we noticed is that count distributions fit the data better than the alternatives and that, while they are appealing in theory, we did not find evidence that compositional methods outperform simpler methods [unclear]. I'm just dropping here the link to a preprint that has many more details, and also to the code, and I will be happy to take questions.

Just a quick reminder: if you have questions, please post them to the Pathable poll [unclear]. Our third speaker is Anthony Sonrel.

My name is Anthony, I am a PhD student in the group of Mark Robinson, and today I will present pipeComp, which comes from an idea of Pierre-Luc Germain in our group. I will present pipeComp for the evaluation of pipelines, applied to single-cell RNA-seq preprocessing. So, I guess that most of you have a rough idea of how benchmarking works. [unclear] And that's why most of you are here today, I guess: you are interested in developing new methods, and at one point I guess that you will develop a new function and would like to evaluate it. What may come to your mind would be: okay, I have my function, and I would like to see how it behaves in the whole pipeline. I would like to see how it behaves using different datasets with different characteristics for the evaluation. I would like to run it with different parameters, and I would like to evaluate how the combinations of the different parameters affect the results [unclear]. And among all the different combinations, I would like to know which function yields the best results, and whether my function yields a good performance [unclear]. We need some evaluation metrics, computed not only at the end of the pipeline but also at the steps that interest you, and I would like to aggregate these metrics in a way that lets me draw conclusions about the performance of the pipeline. We ran into the same problem many times, each time rewriting the benchmarking framework from scratch [unclear]. So then we had the idea of wrapping this into pipeComp, for the comparison of pipelines. It is an R package [unclear] in any
framework. It is built around the PipelineDefinition, which is a series of steps. You can add evaluations at any step that you want [unclear], and then these evaluation metrics you can aggregate as you want; by default we aggregate them in a matrix, for every combination of parameters and every metric. And you can apply it to any field [unclear]. So this is the structure, and then you need to specify which alternative functions and parameters you would like to test. For this, you have to give a list of alternatives, which specifies which functions you want to test and which parameter values you want to try. It will then be recognized by the PipelineDefinition, and it will run all combinations of pipelines that are possible [unclear]. Here is a PipelineDefinition where you have two steps of analysis, [unclear] which is done here. The second step is to get the number of cells, [unclear] SingleCellExperiment. And you need datasets: here we have two little SingleCellExperiments that you would like to test on, and the alternatives that you would like to test. Here, [unclear]. And then we set [unclear] if we want to test.

And then we run the pipeline with runPipeline, which will run all possible combinations of the alternatives that we gave on the datasets, and we can access the results, which are typically given as a matrix of the evaluation metrics. With more alternatives and more datasets, it gets really cumbersome to look at these evaluation matrices, so we provide the package with evaluation functions which produce heatmaps to quickly aggregate all of the results in an easy way. So here [unclear] for your two functions of interest, for the different datasets, and you see that one of the functions [unclear] more than the other one. This was a very simple example, but the power of pipeComp is that it can scale to more datasets and to bigger pipelines. This is because it is parallelized: what it does is to create combinations of pipelines and apply them to each dataset, each one in a different thread. And you see here that we have a sort of branching of the steps, because pipeComp does not run the same step twice. It applies a sort of branching pipeline, so that it is easier and faster to run. Then, if you want to look at the steps, you can clearly identify which steps are driving the results, because with the evaluation functions that we have, you can expand all of the steps that were evaluated, or you can aggregate them if you want, so that you can identify which functions and which parameters are affecting your end results. Then you can customize your pipeline: you can add, modify, or delete any function [unclear] in your PipelineDefinition [unclear]. But don't worry, you don't need to do it all from scratch: we provide two PipelineDefinition templates, built around SingleCellExperiment and Seurat, which do a lot of different things and that you can use as a base to answer your
evaluation question. So we had the framework, and we wanted to apply it to a field where we felt there were [unclear], which is single-cell RNA-seq; [unclear], so this would be a good field to begin with for pipeComp. What we did was to adapt the PipelineDefinition to our needs; that is, we included [unclear] steps from doublet removal and filtering up to clustering. [unclear] And then we built mini wrappers from the most common tools for each one of the steps, and we evaluated them using different evaluations. Mainly, we wanted to see how all of these affect the clustering accuracy, that is, how they affect the ability to retrieve the cell populations in your data. [unclear] Combining all of the alternatives, we tested more than a thousand pipelines that we could evaluate. For the [unclear], we used the branching fashion, that is, [unclear] more than a thousand pipelines but practically far fewer than a thousand [unclear]. These are the results that we get. So we have different [unclear], which are very popular in the field, and what you have here is [unclear] the true number of clusters, that is, which combination of parameters yields the right number of clusters. [unclear] With pipeComp, you can aggregate more than a thousand pipelines and their evaluations into one view, and so we could identify the top methods [unclear] from filtering up to clustering, so that we could give key recommendations of which functions and which parameters to use to get clear clusters in your data [unclear].
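The branching evaluation mentioned in this talk, where every combination of alternatives is run but a step shared by several pipelines is computed only once, can be sketched in a few lines. This is a Python illustration of the general idea only: pipeComp itself is an R package, and the function `run_all_pipelines`, the step names and the toy data below are all hypothetical and do not correspond to its API.

```python
from itertools import product


def run_all_pipelines(data, steps):
    """Run every combination of step alternatives, reusing shared prefixes.

    steps: list of (step_name, {alternative_name: function}) pairs.
    Returns (results, n_calls): results maps each tuple of alternative names
    to its final output; n_calls counts how many step functions were executed.
    """
    cache = {(): data}  # chosen alternatives so far -> intermediate output
    n_calls = 0
    names = [list(alternatives) for _, alternatives in steps]
    results = {}
    for combo in product(*names):
        for depth in range(1, len(combo) + 1):
            prefix = combo[:depth]
            if prefix not in cache:
                func = steps[depth - 1][1][combo[depth - 1]]
                cache[prefix] = func(cache[prefix[:-1]])
                n_calls += 1
        results[combo] = cache[combo]
    return results, n_calls


# Toy three-step pipeline with two alternatives per step.
toy_steps = [
    ("filter", {"keep_all": lambda xs: list(xs),
                "drop_zeros": lambda xs: [x for x in xs if x != 0]}),
    ("normalize", {"identity": lambda xs: list(xs),
                   "scale": lambda xs: [x / max(xs) for x in xs]}),
    ("summarize", {"total": sum, "largest": max}),
]

results, n_calls = run_all_pipelines([0, 1, 2, 3], toy_steps)
print(len(results), n_calls)
```

With two alternatives at each of three steps there are 8 pipelines; running each one independently would take 24 step executions, while sharing prefixes takes 2 + 4 + 8 = 14. The saving grows quickly as steps and alternatives are added.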
And for more discussion about the recommendations that you can see here, I invite you to look at the preprint of pipeComp. [unclear] the link to the GitHub [unclear]. I would like to thank the Swiss National Science Foundation for the funding, [unclear], and thanks to you for your time and attention.

And finally, the last speaker is Stephanie Hicks.

All right, thank you to the organizers for their heroic effort in making BioC 2020 happen. It's been amazing to be able to participate in this [unclear] conference. I will talk about bench pressing single-cell RNA-seq imputation methods, but first I thought I would explain a little bit about the phrase "bench pressing". This phrase came from an article in February 2020 by Vivien Marx. She interviewed a lot of researchers who produced some great papers where they compare and contrast the performance of different algorithms used in bioinformatics analyses. One of those individuals was [unclear], and she gave a shout out to her fellow single-cell [unclear], including me, and this is because to date I have been fortunate enough to be able to work on two benchmark papers. The first was a paper benchmarking methods for controlling false discoveries in computational biology, and we had a case study in there on single-cell RNA-seq. And we also have the paper that I'm going to talk about today, on bench pressing single-cell imputation methods. As I was preparing, trying to decide if I wanted to submit a talk for BioC 2020, I was talking with my friend Davide [unclear] about whether or not attendees of this conference might be interested in such a talk. I decided to go for it and submit an abstract, but I thought it would be fun to include the phrase "bench pressing" in the title, to make it more fun. And then I was happy that I was able to convince both Charlotte and Davide to [unclear].

So, about single-cell RNA-seq data. Compared to bulk RNA-seq, single-cell data has been shown to be more sparse, where sparsity means the fraction of observed zeros, where a zero is no UMIs or reads mapping to a given gene in a cell. Historically, there are two types of zeros that people argue about. One is a biological zero, that is, a zero where a gene is not being expressed; the other is a technical zero, where you just have challenges in quantifying small amounts of mRNA, such as from [unclear] or variation from just sampling lowly expressed genes. This increased sparsity has led to the development of imputation methods, in a similar spirit to imputing genotype data for genotypes that are not observed. Here I'm going to show you a typical application scenario where we're imputing SNPs using large reference [unclear]. However, I just want to make one thing clear. The difference between these imputation methods for SNPs and for single-cell RNA-seq is that, to date, almost all of the imputation methods for single-cell data do not rely on an external reference. They really depend on the data themselves.

Let's talk a little bit about what imputation is doing. Let's assume we have some kind of true biological expression, and here we've got a matrix with genes along the rows and cells along the columns, and imagine we were
able to measure the true biological gene expression. Next, let's put it through a single-cell experiment; it could be any type of protocol that you're thinking of. Essentially we're sampling mRNA from each individual cell and we get a set of counts; these could be UMI counts, they could be read counts. And what imputation attempts to do is take this matrix [unclear] and try to estimate a function that allows us to recover the true biological expression of each cell. So the goal is to try and recover what the true biological expression is.

To date, there are three broad approaches for single-cell imputation methods. One is model-based: here we are directly modeling sparsity using probabilistic models. These may or may not distinguish between biological and technical zeros; they typically impute only technical zeros. The second is smoothing-based: here we're adjusting usually all values, both zero and non-zero, by smoothing using the values of cells with similar expression profiles, using something like neighbors on a graph. The third is data reconstruction methods: here we're identifying a latent space and then we're reconstructing the observed expression matrix, which is no longer sparse, using something like low-rank matrix-based methods, which capture linear relationships, or deep-learning methods, which can capture nonlinear relationships.

To date there are around three or four studies that benchmark single-cell imputation methods, but they really only compare a subset of the available imputation methods that are out there, like 3 to 6. There are actually like 18 or 20 of them that have been published or preprinted, and we were interested in using some of these for our own analyses. So we found it frustrating that there wasn't really a comparison of all of them. So we started to just explore the different approaches in a very simple setting. We simulated just null data, Poisson counts, and the only difference is that the cells varied by library size; we expect no biological difference. And then we applied imputation to the null simulated data, and then applied principal components analysis on top of the imputed data. Each panel is one imputation method, except the panel in the middle, which represents no imputation: we did not impute there. We're showing you PC1 and PC2 for each one of the results, and we found that there was quite a bit of unexpected structure in the data after applying these imputation methods in this null setting, and this motivated us to explore these methods further. So we performed a large evaluation benchmarking 18 single-cell RNA-seq imputation methods. [unclear] we show the different methods that we considered, we talk about the different
data that we considered. We use the Salvage data from that Richie's group, for example, We process the data, if the amputation method required, it using screen on lock 2 transforms replied. So and Jean at quality control metrics, we evaluated, imputation methods and two ways. Why was just an evaluation of the impudent values themselves. Comparing it to a book RAC profile and a homogeneous sell population and then the other settings are Downstream Dallas. He's so different to expression, class, training and trajectory analysis. We have a variety of different
metrics that we considered for performance, and then we have a set of recommendations at the end. I'm going to focus on just trying to highlight one of them briefly here today, the cluster A1 When did cassette we took was a set of 10x data from pbmcs. So here there are around 60,000 cells that were purified in two different cell populations. And we're showing you that you might representation of the data on the left with no invitation and on the right is amputation with magic and spells are colored by different cell
types. Here, this is the true cell type so we can apply the that we can fire and each one of her apply each one of the imputation method. And then we considered for different metrics for evaluation a metric, accuracy Purity, adjusted by an index and medium silhouette. We're still alive is used as a measure of consistency within cluster. So how similar. Salvation is to its own class. Compared to others. We found it there. The top set of methods that performed well here are sedi saver foxy
and SCA Laden is actually using the late in space that seti provides as opposed to using the reconstructed datamatrix. No indication is here highlighted and bread and then there are a set of methods that we labeled invitations fail. So we are criteria was to let a method reputation as a run for 72 hours and if we got no results at the end of the 72 hours to call Dan and Shay station sale, These were the results for using louvain clustering. We also applied this using k-means clustering, and we found a consistent set of results.
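Two of the clustering metrics named in the talk, purity and the adjusted Rand index, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names and the toy labels are hypothetical, the textbook definitions are used, and the study's own implementation (including its exact definitions of accuracy and purity) may differ.

```python
from collections import Counter
from math import comb


def purity(true_labels, cluster_labels):
    """Fraction of cells that carry the majority cell type of their cluster."""
    by_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        by_cluster.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def adjusted_rand_index(true_labels, cluster_labels):
    """Pair-counting agreement between two labelings, corrected for chance.

    Assumes at least two cells.
    """
    n = len(true_labels)
    contingency = Counter(zip(true_labels, cluster_labels))
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_clust = sum(comb(c, 2) for c in Counter(cluster_labels).values())
    expected = sum_true * sum_clust / comb(n, 2)
    max_index = (sum_true + sum_clust) / 2
    if max_index == expected:  # degenerate case: both labelings are trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy example: six cells, two true cell types.
true_types = ["B", "B", "B", "T", "T", "T"]
perfect_clusters = [0, 0, 0, 1, 1, 1]
noisy_clusters = [0, 0, 1, 1, 1, 1]  # one B cell ends up in the T cluster

print(purity(true_types, perfect_clusters), adjusted_rand_index(true_types, perfect_clusters))
print(purity(true_types, noisy_clusters), adjusted_rand_index(true_types, noisy_clusters))
```

In a benchmark like the one described, the same functions would be applied to the cluster labels obtained after each imputation method and compared against the labels obtained with no imputation.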
One interesting aspect of the results: you'll see that in the median silhouette column, which measures that consistency within clusters, there are some methods that really are much brighter compared to others. What we found was that, using the imputed values, you can calculate the gene-specific standard deviation for each method, and methods like MAGIC, for example, make the gene-specific standard deviation very small. So they shrink the imputed values, drawing them in together. This is potentially useful for something like clustering, but it can also be not very useful for something like differential expression.

So, some key takeaways. Most methods can recover expression from a bulk RNA sequencing experiment in a homogeneous set of cells. However, many single-cell imputation methods to date do not improve performance in downstream analyses compared to no imputation and should be used with caution.
The performance of single-cell imputation methods depends on the experimental protocol as well as on the number of cells. If you decide to impute, there are some methods that outperform the other methods most consistently, including SAVER, scVI, kNN-smoothing, and MAGIC.

And I just want to give a shout-out to my collaborators on this project. The majority of the work was done by Wenpin Hou, and she's amazing. If you have any comments, feel free to reach out; there's a preprint online. All right, thank you, everyone.
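The gene-specific standard deviation check mentioned in the talk can be illustrated with a small sketch. The smoothing function below is a hypothetical toy stand-in, not MAGIC or any other method from the study: it simply pulls each value toward its gene mean, which is enough to show how smoothing shrinks the per-gene spread. All names and the toy counts are assumptions made for the example.

```python
from statistics import mean, pstdev


def gene_sd(matrix):
    """Per-gene standard deviation; rows are genes, columns are cells."""
    return [pstdev(row) for row in matrix]


def smooth_toward_mean(matrix, weight):
    """Toy smoothing: move every value a fraction `weight` toward its gene mean."""
    smoothed = []
    for row in matrix:
        mu = mean(row)
        smoothed.append([(1 - weight) * x + weight * mu for x in row])
    return smoothed


# Hypothetical counts for two genes across six cells.
counts = [
    [0, 3, 0, 5, 1, 0],  # gene A
    [2, 2, 4, 0, 6, 1],  # gene B
]
imputed = smooth_toward_mean(counts, weight=0.8)

sd_raw = gene_sd(counts)
sd_imputed = gene_sd(imputed)
# Ratio of standard deviations after versus before smoothing, per gene.
ratios = [after / before for before, after in zip(sd_raw, sd_imputed)]
print(ratios)
```

With this toy smoother the gene means are unchanged while every gene's standard deviation is scaled by `1 - weight`. That is the kind of shrinkage that can tighten clusters while leaving less gene-level variability for a differential expression test to work with.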
We'll take questions now and get to as many as we can. For the questions that we don't get to, please follow up with the speakers on Slack while they're still available.

So the first question is for Charlotte: how do you quantify exons and introns? Presumably the reference transcriptome file just contains each transcript twice? That is a possible representation, and we also included it in the evaluation and comparison. It turns out that for 10x data it doesn't work so well, because since it's 3'-biased, most of the reads, in many of the genes, will actually be in the last exon. So it's kind of hard in that case to say whether those reads actually come from a spliced transcript or an unspliced transcript, because the exon is in both of them. You don't know what to do if the reads map to both, so you would get half of the reads in the spliced and half of the reads in the unspliced. So it may work better with the transcripts and the introns themselves.

Davide, this isn't really a question, but could you please post the link that you shared at the end of your presentation? So yeah, I can do that. I'll post the link to the site as well.

Why not include LEfSe in the benchmark? It is likely the most popular microbiome differential abundance method. Yeah, I mean, the simple reason is that we couldn't benchmark all the methods, because there were too many methods and some of them are quite demanding computationally. So we left out a couple that are very popular; the other one is ANCOM, which is another popular method. [unclear] we decided to go for the newest and most user-friendly methods. But one of the reasons we wanted to put the code out there, and we're working on making it a Bioconductor package, is that it should in principle be easy for you to add your favorite method to our benchmark, and especially to update it in the future with additional methods.

Anthony, what are the input data file formats for pipeComp? Is it scalable for large datasets? Does it need a SummarizedExperiment object or a Seurat object? Is it also modular for each of the steps? I'm looking at the questions; I'll begin with the first one. It is scalable for large datasets because it doesn't rerun the same step twice. That means that you don't have an explosion of run time or memory from doing the same first steps over and over. Because of this branching scheme that we have, [unclear], it's a bit more convenient for large datasets, also because we don't keep the intermediate results that much. So it works better for large datasets. And the second question was: does it need a SummarizedExperiment object or a Seurat object? At the moment it accepts SingleCellExperiment and Seurat objects, and the whole pipeline is run on these two formats. But as I said, you can tailor it to others. So that part depends on the wrappers that you are using during the evaluation.
When benchmarking, how do you select parameters that need to be tuned? Default parameters? Default parameters. That's a limitation of the study: just because there were so many different methods across different evaluations, we just decided to use the defaults. But I know that there are methods such as molecular cross-validation, I believe from Josh Batson, that can potentially improve the results. If the methods were tuned, they could result in better performance.
Is the agreement across quantification methods influenced by the sequencing depth or the median cellular UMI count? Yeah, so that's an interesting question, which we did not explicitly look at. I mean, we're not using all the genes in the end. We're selecting the most highly variable genes, so we select two thousand genes from each quantification, and then only maybe half or so of those actually have a good enough velocity fit to be used for the evaluations. We did also look at just the genes that were selected by all the methods. So some of the difference comes from them selecting different genes, but there was still quite a bit of difference left. So there is still something that doesn't depend on the gene selection, and it's still there if we only consider really the most highly variable genes.

Why are single-cell RNA-seq methods for differential expression used to compute differential abundance for microbiome data? Is it because of sparsity? How comparable are they? So yeah, the main reason we wanted to include those methods in the benchmark is that both metagenomics data and single-cell data are quite sparse. And so we thought that maybe some of the strategies that are used to down-weight the influence of zeros in single-cell RNA-seq differential expression could be useful for differential abundance. The data are quite comparable: the fraction of zeros is kind of similar to what you see in 10x data, so you can go from 50 to 80% of zeros in the data. But I should say that there's a lot of difference between when you look at 16S data and when you look at shotgun metagenomics, whole-metagenome sequencing data. You can look at the preprint for all the details, but the single-cell methods are more useful for whole-metagenome data than for 16S.

Do you think that the shrinkage that you saw was something that [unclear]? So we actually did benchmark the performance of trajectory inference using imputed values, and we saw consistent results. You don't quite see it in terms of, like, a median silhouette, in terms of how dense the clusters are or how similar [unclear].

I thank all four speakers again for a very interesting session. Up next are the closing remarks by Davide and [unclear]. So hope to see you all there.