Duration: 51:14

Aaron Lun, Making the infrastructure sausage: tales of Bioconductor package development

Aaron Lun
Scientist at Genentech
BioC2020
31 July 2020, Online, USA

About the talk

Keynote: Making the infrastructure sausage: tales of Bioconductor package development

Aaron Lun, PhD (Genentech)

2:00 PM - 2:55 PM EDT on Friday, 31 July

TALK

A key asset of the Bioconductor project is its software infrastructure that nourishes an ecosystem of interoperable packages from a diverse community of contributors. By adhering to common concepts like the SummarizedExperiment data structure, users and developers can be confident that many packages can work together to address relevant scientific questions. In this talk, I will reveal how the infrastructure sausage is made, focusing on new pieces in interactive visualization via iSEE, reliable Python integration via basilisk, and the efficient representation of large datasets via DelayedArray backends. These developments aim to empower the Bioconductor community to easily create custom software tools for their specific research problems.

Moderators: Vincent Carey, Aedin Culhane, Lauren Hsu

About the speaker

Aaron Lun
Scientist at Genentech

Just a regular guy who likes developing software to solve biological problems. Also lives vicariously through collaborations with actual biologists.


All right. Well, it's a real pleasure to introduce Aaron Lun here for the keynote address. If you weren't at the awards ceremony, Aaron is a Bioconductor award winner for this year. He's a scientist at Genentech and was a research associate in John Marioni's group at Cancer Research UK. He completed his PhD with Gordon Smyth at the Walter and Eliza Hall Institute in Australia, and he's a prolific contributor to the Bioconductor project, currently especially in the area of

single-cell RNA-seq. I've had the pleasure of working with Aaron in some very late-night sessions where we were debugging packages, on Windows specifically, and we've done some work on ontology together that we really need to wrap up. He has put tremendous, tremendous effort into the Bioconductor Python interface. The basilisk package is a very exciting development, and he spearheaded the engagement of the Bioconductor project with the Human Cell Atlas. So I'm looking forward to this talk, and we will have questions afterwards. It's a 32-minute video that I will now start

for you. Hi everyone. My name is Aaron, and it's a pleasure to be here for the Bioconductor 2020 conference. But I'm not really here, and this is pre-recorded. I didn't really feel like lip-syncing through the talk again, so instead I just recorded a random hour of my day to keep you company. Anyway, I'll be talking about some of the work I've been doing, or rather some of the adventures I've been having, with new pieces of software infrastructure for the Bioconductor project. If you are a package developer, I think you'll find this

talk to be right up your alley, but users can also enjoy the process of seeing how the sausage gets made, hence the title. Right, let's get started. Software infrastructure is one of Bioconductor's biggest assets. Here I'm talking about stuff like SummarizedExperiment, packages that handle the nitty-gritty of data and resource management so that the rest of us don't have to. These infrastructure packages make it a lot easier to develop software for interesting scientific questions because we don't have to reinvent the wheel

every time. And whenever these packages get performance improvements or bug fixes or more functionality, it benefits all of us, like a tide that lifts all boats. And let's not forget interoperability, where packages developed by different people can function together in a single analysis workflow because they share the same concepts and data structures. And of course, this makes life easier for end users because they don't have to learn a new programming paradigm for each package. In a sense, our software infrastructure is the fertile soil that

nourishes the ecosystem of community contributions by supporting more domain-specific infrastructure pieces and, ultimately, the fancy analysis and visualization packages that people get most excited about. So I'll talk about three things today: jailbreaking iSEE for custom visualizations, integrating Python into Bioconductor packages with basilisk, and some fun and games with the DelayedArray framework for large datasets. Now, getting each of these things to work was like making sausage: the final product looks pretty nice, but it was blood and guts and hard

work all the way through. However, it was also pretty fun, and, you know, that's really how coding should be. Okay, the first topic is going to be about interactive visualization. Some of you may be familiar with the iSEE application for interactive exploration of a SummarizedExperiment. Well, I mean, iSEE. Anyway, this is a Shiny app that provides a multi-panel layout to visualize multiple aspects of a SummarizedExperiment. In the screenshot here, we have one panel to visualize the reduced dimensions,

another panel to look at the row metadata as a table, and yet another panel to look at the expression of a gene across samples. iSEE has a system for flexible inter-panel communication where you can, for example, brush on a reduced dimension plot and see the response in multiple other panels based on the samples that were selected. We also have features such as code tracking, reproducible states and, for some reason, voice control. It's a project I'm proud of because it's fully

community-driven. There are no grants, no paid positions, just sweat and tears from passionate volunteers. Now, iSEE is pretty useful, but there are still some pain points when people try to use it for real. For end users, that is, analysts and scientists who just want to look at their data, iSEE is too general, requiring some work to set it up to get the exact visualization that they want. For developers, it doesn't have quite enough entry points into the interface for advanced customization, for example to make entirely new panel types or mechanics. And also, as a

side note, for us it was a real pain to maintain [unclear] lines of Shiny code. And if you've never had to do that, it's for the best. To illustrate these difficulties, let's imagine what we would need to do to get iSEE to show a standard volcano plot for visualizing the results of a differential expression analysis. We would have to open up iSEE with a row data plot, and then we would have to manually set the x and y axes to the appropriate fields in the row data. Okay, that's a bit annoying to have to do, and it also assumes that I had the foresight to

log-transform my p-values before starting iSEE. And then finally, we would have to manually count the number of genes that are significantly up or down to show those to an end user. That's just impractical. There must be a better way. To handle this, our solution was to make it easy to create extensions to iSEE in the form of new panel types. This allows developers to write custom panels while iSEE takes care of all the boring stuff related to managing the data and

passing information between panels. For end users, the result is that application setup becomes very straightforward, as you can see from this simple one-liner that instantly creates an iSEE instance. The goal here is to convert iSEE from a single application into an ecosystem of applications built on the same framework. This gives developers the freedom to create their custom panels in separate Bioconductor packages, and users can then mix and match the panels they like to set up that perfect iSEE instance. Our strategy is to

allow custom panel types into the iSEE framework as first-class citizens. A custom panel can receive information from other panels, and it has full control over its user interface elements and its visualizations. These custom panels will have the same privileges, but also follow the same rules, as the standard panels that we provide in the core iSEE package. The only difference is that these are written by someone else, in keeping with the theme of community contributions building on top of common infrastructure. Okay, so that's a nice dream. But how do we make it happen?

With S4 classes, of course. In iSEE version 2, we completely overhauled the internal structure so that each panel is represented by its own S4 class. This allows users to control the setup of the app just by creating S4 objects corresponding to the panels that they want. For example, here we have an instance of iSEE with one reduced dimension plot in which each point is colored by its assigned cluster in the column metadata. iSEE relies on the class identity to determine the interface. For example, we have an

S4 generic function called .generateOutput that takes a panel object x as its first argument. Now, when x is a reduced dimension plot, the function will produce a plot of those reduced dimensions. When x is a complex heatmap panel, the same function will create a heatmap. So the same line of code in the core iSEE application can support a variety of different behaviors, and this makes it very easy to extend: all we have to do is write a class for a new

panel type, specify how that class should behave when .generateOutput and related functions are called, and then it'll work inside iSEE. Another nice thing about using classes is that we can use inheritance to organize our panels. Everything starts from the base Panel class, which is about as general as it gets. And then we specialize into subclasses that do more specific things, for example a column dot plot that shows each column of a SummarizedExperiment as a dot, or as a point, in a plot. This keeps going until we get to the leaves of this tree, which are occupied

by the classes that end users will actually end up using, for example the reduced dimension plot. We use inheritance to reuse code that is common to multiple classes. For example, both the reduced dimension plot and the column data plot show the columns of a SummarizedExperiment as points in the plot. This means that the column dot plot parent class can implement a lot of the common methods across these plots, for example to create user interface elements to color points by a field of the column metadata.
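The two ideas just described, one generic whose behavior is chosen by the class of the panel and a class tree in which parents hold the shared code, can be sketched outside R as well. The following is a minimal Python analogue using `functools.singledispatch`; it is not iSEE's code, S4 dispatch is richer than this, and every name in it (`generate_output`, `color_choices`, `HexPlot`, `render_app`) is a hypothetical stand-in chosen for illustration.

```python
from functools import singledispatch


class Panel:
    """Base class: about as general as it gets."""


class ColumnDotPlot(Panel):
    """Parent for panels that draw each column (sample) as a point."""

    def color_choices(self, column_metadata):
        # Shared by every subclass: any metadata field can color the points.
        return sorted(column_metadata)


class ReducedDimensionPlot(ColumnDotPlot):
    """A leaf class that end users would actually instantiate."""


class HexPlot(ReducedDimensionPlot):
    """A hypothetical extension panel written by a third party."""


@singledispatch
def generate_output(panel):
    # Fallback when no method is registered for the panel's class.
    raise NotImplementedError(type(panel).__name__)


@generate_output.register
def _(panel: ReducedDimensionPlot):
    return "scatter plot of reduced dimensions"


@generate_output.register
def _(panel: HexPlot):
    return "hexbin plot of reduced dimensions"


def render_app(panels):
    # The "core application" calls one generic; the class picks the behavior.
    return [generate_output(p) for p in panels]
```

In this sketch `HexPlot` overrides only the output method and inherits `color_choices` unchanged from `ColumnDotPlot`, which mirrors the reuse argument made in the talk.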

Similarly, the Table parent class implements a lot of common methods for table-related panels, so that we can avoid the need to implement them separately in each of the subclasses at the leaves of the tree. This strategy makes it really easy to write new panels because we can just reuse a lot of the existing code in the parent classes while customizing some of the visualizations or interface elements to get the desired effect. To illustrate, I'll walk through some practical examples. I will start with the hex plot, which replaces the conventional scatter plot with these hexbins

that are colored by the density of points. The idea is to improve plotting speed for large datasets because we don't have to draw every point. We simply made a new class that inherits from the standard reduced dimension panel, overrode the plotting function so that it would use geom_hex instead of geom_point, and added an interface element to control the bin size. End users can then use it just like they would the usual reduced dimension plot. Now on to our old friend, the volcano plot. We start with the general row data plot,

but we add constraints to ensure that we can only specify row data columns containing p-values for the y-axis and log-fold changes for the x-axis. We modify the plotting function so that we get some nice colors on the DE genes in each direction, and we get automatic log-transformation of the p-values on the y-axis. We also report the number of differentially expressed genes in each direction in the legend. And finally, we add some interface elements to allow users to change the log-fold change and FDR thresholds used to define DE genes. This gives us a nice-looking volcano

plot that people can use straight away without any setup. A slightly more exotic panel is used for dynamic marker detection. This is based on the row table parent class, which provides utilities for managing a table where each row corresponds to a row of the SummarizedExperiment. Here, the content of the table is dynamically generated by performing an on-the-fly differential expression analysis between groups of samples selected in other panels. In this particular example, we perform a differential expression analysis between two selections in the reduced dimension plot

to identify marker genes in one of the selections compared to the other. By simply overriding the function that creates the data frame to be visualized in the DataTable widget, we don't have to write any of the actual rendering code ourselves. And we certainly don't have to write the code to transmit information between panels, because iSEE gives us that for free. So I hope you can see how iSEE version 2 really enables deep and robust customization via its S4 framework. If you're interested in interactive visualization, this is a

great time to contribute, either in the form of ideas for new panels or, even better, pull requests or, even better than that, new packages that implement their own panels. Some examples are given in the links below, and all of us on the iSEE development team will be happy to help you out if you're interested in giving it a go. Right, on to the next topic: efforts to improve the integration of Python into Bioconductor. So why do we want to do this? Well, the Python ecosystem contains a lot of functionality that complements what we have in Bioconductor, and

here I've listed a few examples that I'm personally interested in, like image analysis. Python also has more support for non-scientific programming utilities, as companies working on new software technologies are more inclined to write in Python. And the general philosophy is: let's not reinvent the wheel when we can call Python functions as if they were R packages. For those of you who aren't familiar with reticulate, I've included a little chunk of code here to show exactly how easy

it is to use reticulate in an R session, here just to do a PCA via a Python package. Now, Bioconductor already has multi-language support in its packages. I regularly include C++ code in my packages, and I've occasionally fiddled with C and Fortran code as well. I've also seen people incorporating Java code into their packages. Given that we already have access to reticulate, why can't we do the same with Python? Well, it turns out there are a bunch of reasons. The most pressing

one is that we need to assume that the end user has an appropriate version of Python and all of the Python packages installed. This means that if our package depends on a particular Python package, we need to include the dreaded SystemRequirements field in the package DESCRIPTION. And, you know, whenever I see that, I wince, because it means that the user needs to do extra work in order to install the package. I've always found Python particularly troublesome in its configuration: where everything has to be installed, virtual

environments, conda environments, where my Python path points to, and so on. Relying on the end user to manually do the installation means that there are all sorts of things that can go wrong, particularly with respect to installing the wrong version of a Python package, even after you've explicitly told them what version you need. Another problem is that different Python packages can have incompatible dependency trees. Fundamentally, this is caused by the fact that a Python package can specify pinned version numbers rather than just greater than or equal to, which is what we do in

R. If your packages require mutually exclusive versions of the same dependency then, well, you won't be able to use them in the same environment. And this is particularly problematic for reticulate, which can only support one Python environment per R session. And by this, I don't mean one environment at a time; I mean one environment for the entirety of the R session. So once you load one environment, you're stuck with it until you restart R. This is basically a giant pain point: if I have two packages that need different

environments, I can't use both of them in the same R session. So to get around these problems, we created the basilisk package. This is named after the snake in Harry Potter, with the idea being that we would be able to freeze Python dependencies for use inside Bioconductor packages. Most of the heavy lifting is done by conda, which ensures that we have a consistent Python installation on the user's machine. Each Bioconductor package then creates one or more conda environments containing the Python packages that it needs.

All of this is done automatically, either upon R package installation or by using a lazy mechanism where conda and its environments are created as needed. As developers, we just have to write a few lines of R code to instruct basilisk to set up a conda environment with the necessary dependencies. Once this is provided, basilisk handles all of the conda stuff so that the end user doesn't even have to worry about it. basilisk also keeps the new conda environments

separate from the system installation, so they won't interfere with the end user's other activities. This was a lot of trouble: even with conda doing most of the work, there were so many things that went wrong, especially with these alternative operating systems, which I think they call Windows. Anyway, that's all done, so end users don't have to deal with these problems anymore. As for the other problem, basilisk allows us to use multiple Python

environments in the same R session. Okay, let's say that package A loads its Python environment into the R session. With reticulate alone, that blocks package B from being able to use a different environment. In contrast, we can have a look at the situation on the right, where both packages A and B use basilisk. When package B tries to load its environment, basilisk detects that the R session's Python slot has already been filled by A's environment, so basilisk will then spin off a new process

for package B that does its work with its environment and, once finished, transfers the results back to the parent session. This allows us to use both packages A and B within the same R session. basilisk is greedy in the sense that it will try to fill the global Python slot if it's available, as this is more efficient than starting a new process. If the slot isn't available, then basilisk is smart enough to start a new process and avoid the errors that would otherwise occur. This capability ensures that different Bioconductor packages that use Python

can work together without burdening the end user with the details of the R-Python interaction. I'll now walk through a handful of examples of basilisk in action. The first is the previously mentioned BiocSklearn package from Vince Carey, which wraps the scikit-learn functionality into R functions. Switching to basilisk was an easy two-step process: first we provision the conda environment with the BasiliskEnvironment class, and then we wrap the existing reticulate code in the basiliskRun function. Then the end user can just call

the R functions like they would for any other R package, without requiring the management of any Python dependencies. Even better, other R packages can call BiocSklearn to access scikit-learn functions in a reliable and portable manner. There's no need to reimplement the algorithms and reinvent that particular wheel. The next example concerns RNA velocity calculations for single-cell data. Right now, the tools for this are mostly written in Python, but here we've created a velociraptor package that uses basilisk to pull down and use the scVelo Python

package. End users just have to pass a SingleCellExperiment object to the wrapper function, and they get all of the velocity calculations back. Nice and easy. There's no need to reimplement anything, which gives me more time to do other things like, you know, watching YouTube. Finally, I should point out that we aren't limited to using basilisk inside R packages. We can also use basilisk inside analysis scripts that use Python for some steps. In this example, basilisk is used to set up a conda environment with the necessary dependencies so that later Python steps will run properly. The cool

thing is that the script will automatically detect whether the environment exists: if it does, it will use it, and if it doesn't, it will create it. This means that even your analysis scripts can be reproducibly run on another user's machine without requiring them to manually set up the same Python environment. I even use basilisk just to set up conda environments for myself, because I always forget how to do it with conda on the command line. To sum up, the idea behind basilisk is to improve the reliability of Python integration inside Bioconductor packages. This

unlocks the Python package ecosystem for Bioconductor by taking full advantage of reticulate's capabilities. I'm hoping that we'll reach the same level of reliability enjoyed by C and C++ code in our packages. Of course, we don't even have to stop there, because conda is more general than Python. For example, we could use conda to install other interpreters for languages like Julia and get some integration going. R packages relying on difficult system dependencies could also use conda via basilisk to simplify their installation.
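The fallback rule described above for basilisk (claim the session's single Python slot if it is free, otherwise do the work in a separate process and send the result back) can be sketched as a small decision function. This is a toy simulation in Python under stated assumptions, not basilisk's implementation: `Session`, `run_in_env` and the returned labels are hypothetical, and the "separate process" branch only labels the call instead of starting a real worker.

```python
class Session:
    """Stand-in for an R session that has one global Python slot."""

    def __init__(self):
        self.loaded_env = None  # name of the environment occupying the slot


def run_in_env(session, env, task):
    """Run task() on behalf of `env`, reusing the session's slot if possible."""
    if session.loaded_env is None:
        # Greedy step: the slot is free, so claim it for this environment.
        session.loaded_env = env
    if session.loaded_env == env:
        # The right environment is already loaded: run in the same process.
        return ("same process", task())
    # The slot is held by a different environment. A real implementation
    # would start a worker process, load `env` there, run the task and
    # transfer the result back; this sketch only records that decision.
    return ("separate process", task())
```

The first package to call `run_in_env` gets the cheap in-process path, and any later package with a different environment still gets an answer instead of an error, which is the property the talk emphasizes.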

So there are quite a few interesting opportunities for further synergies across the Bioconductor project. Okay, on to the final stretch. For the last part of this presentation, I'll talk about some of the fun I've had with DelayedArray to represent large datasets. The DelayedArray framework was developed by Hervé Pagès from the Bioconductor core team, and it's pretty cool. The idea here is to provide an array interface to datasets that are too large to store in memory. The most obvious application is for file-backed datasets, where most of the data is

stored on the hard drive and pulled into the R session on an as-needed basis. This allows us to perform analyses on very large datasets even on a machine with limited memory. Now, it's called a DelayedArray because operations on the array aren't immediately executed. Consider the example below. Let's say we have a backend that holds our count matrix, for example in an HDF5 file, and we want to compute log-transformed normalized expression values. We start by creating a DelayedArray to wrap our backend. Our first step is to divide the counts by the size factors to account for

differences in sequencing depth. However, when we perform the division on the DelayedArray, it will not actually do the division right away. Think about it: if we did the division right away, we would need a place to put the normalized values, and these can't be stored in RAM because the matrix was too large to put into memory in the first place. So if we wanted to do the division right away, we would have to write the output back to another file, which would be inefficient and use more disk space. So instead, the DelayedArray will just remember that we requested a division and then move on. This approach

is applied to the next steps, such as the addition of a pseudo-count and the log-transformation, and all of these operations are delayed rather than being executed straight away. Now, let's say we have a function where we need the actual values of our log-normalized matrix. In this example, I'm using the rowMeans function. When the DelayedArray encounters that, it will extract a block of the data into memory and apply all of the operations that had previously been delayed. This block is then used by the rowMeans function to, well, compute the row means. In this manner, we avoid loading the

whole dataset, so we can process very large datasets in limited memory. So hopefully it's pretty clear why this might be useful. For example, I can load the 10x Genomics 1.3 million brain cell dataset as an HDF5-backed DelayedMatrix on my laptop, even though it would have taken 146 gigabytes of RAM to store it as an ordinary matrix. From a programmer's perspective, the DelayedMatrix supports standard matrix operations like subsetting, multiplication, calculation of various statistics, arithmetic and so on, and this means that a DelayedMatrix can mimic an ordinary

matrix to the point that we can use it interchangeably in our analysis code without much modification. Now, how far can we push this? I've already shown you an example of an HDF5-backed DelayedMatrix, but what other interesting things can we do? As exhibit A, we have the DeferredMatrix, which admittedly is not a very good name, but whatever. This particular matrix uses the DelayedArray machinery to delay the centering and scaling that is commonly performed on a matrix prior to PCA. The rationale is that if we have a sparse matrix and we center

it, it's not going to be sparse anymore. We might not even have enough RAM to hold the centered matrix. In this example, if we centered this sparse matrix in the usual manner, the resulting matrix would require 12 gigabytes of RAM to represent. But because we're delaying the centering, our DeferredMatrix is very lightweight and memory-efficient. Another bonus of delaying the centering is that we can take advantage of the highly efficient algorithms that are available for sparse matrix multiplication. Specifically, we can rearrange the multiplications so that we can directly

operate on the sparse matrix prior to centering, as highlighted in this expression by this XY product. This is important for approximate PCA algorithms like randomized PCA and IRLBA, which need to do a lot of matrix multiplications. Just to give you an idea, an IRLBA run on our DeferredMatrix takes just seconds, which is pretty nice. Our next example is pretty similar. If we want to regress out some uninteresting factors of variation, we can delay the calculation of the residuals with this ResidualMatrix. Again,

this uses the DelayedArray machinery to avoid actually creating the matrix of residuals. In this example, that matrix would need [unclear] of RAM if we were to compute it explicitly, but instead we get this lightweight ResidualMatrix object that requires very little memory. We use the same trick for matrix multiplication, where we can rearrange the sequence of matrix products to get access to efficient sparse matrix multiplication algorithms. In this expression, we just reorder the application of the hat matrix so that we can directly

compute the efficient XY product instead of having to use dense multiplication algorithms. This gives us a pretty fast IRLBA run on the ResidualMatrix, which is basically a free speed boost if we want to do PCA on the residuals after some kind of regression. A slightly different use case is the LowRankMatrix. Let's say we've done our PCA and we want to make a low-rank reconstruction of our input matrix. This is easy enough to do by taking the product of the top principal components and the corresponding columns of the rotation matrix. However, the reconstruction is dense and might

not fit in memory. So we use the DelayedArray framework to delay the matrix multiplication itself, so that the matrix product is only computed for particular rows or columns upon request, allowing us to mimic a 12 GB matrix [unclear] 16 GB of memory. In fact, this is the data structure that I use to report the batch-corrected expression values in the output of the fastMNN method for single-cell batch correction. It's pretty cool. The final example I'll talk about here is some work in progress with TileDB. In its simplest form, this is a file-backed data store much like HDF5.
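The low-rank idea mentioned above, delaying the matrix product itself and only computing the requested rows or columns, can be illustrated with a short sketch. This is written in plain Python for illustration and is not the Bioconductor implementation; the class name is borrowed from the talk, but the `row` and `column` methods and the argument names `scores` and `loadings` are assumptions made here.

```python
class LowRankMatrix:
    """Represents scores x loadings^T without ever storing the product."""

    def __init__(self, scores, loadings):
        self.scores = scores      # n_rows x k (e.g. principal components)
        self.loadings = loadings  # n_cols x k (e.g. rotation vectors)
        self.shape = (len(scores), len(loadings))

    def row(self, i):
        # Only this one row of the dense reconstruction is computed.
        s = self.scores[i]
        return [sum(a * b for a, b in zip(s, l)) for l in self.loadings]

    def column(self, j):
        # Likewise for a single column.
        l = self.loadings[j]
        return [sum(a * b for a, b in zip(s, l)) for s in self.scores]
```

The object stores (n_rows + n_cols) x k numbers instead of n_rows x n_cols, which is where the memory saving comes from when k is small.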

However, one of TileDB's selling points is that you can store sparse matrices natively. This has the potential to be much better than our HDF5 approach, where we have to expand the sparse matrix into a dense matrix because HDF5 doesn't have native sparse storage. With TileDB, we don't have to read and write all the zeros, which saves a lot of time when transferring data to and from disk. To test it out, we put together a DelayedArray wrapper around TileDB that we called TileDBArray, and some of the early tests

are promising. For example, the reads are several times faster with TileDBArrays compared to HDF5Arrays, culminating in a PCA step that's almost three times faster. This is very nice for large datasets, where the PCA alone can take almost an hour to run. So I think we're just waiting for the base TileDB package to get onto CRAN, and then we'll make a submission for this. The really nice thing about the DelayedArray framework is how flexible it is for everyone. If you're a developer of analysis software, all you have to do is make sure that your software works

with the DelayedArray machinery, and you get to use all of these cool toys that I've just described without any extra effort. On the other side of the fence, if you want to write a new DelayedArray class with special behaviors, then you can be pretty confident that it will work with any of the Bioconductor packages that are compatible with the DelayedArray framework. So this is an easy way to start making your packages scalable. For example, my single-cell software lives and breathes DelayedArray, so it's pretty easy to churn through an analysis without requiring much RAM. I give a few

examples here, and you might be wondering why [unclear] on a machine with more RAM. Well, that's because I need some memory left over to watch YouTube at the same time. Anyway, one particularly cool thing would be to see if we can use the DelayedArray framework to push more computation to the cloud, which would allow most of the heavy lifting to be done on a server that's close to the data. Right, so that's the end of the tour. Hopefully some of this infrastructure will help you improve your existing Bioconductor

packages or give you some inspiration to write new ones. I've provided some links if you want to see more examples. And really, the aim of all of this is to promote synergies between packages that really make Bioconductor greater than the sum of its parts. Of course, all of this work involved contributions from many people: the other members of the iSEE development team, a lot of volunteers who tested basilisk to get it to work on different systems, and a lot of development effort in the DelayedArray space, as well as my colleagues at Genentech. And to finish, I'll just say: now that we have all this infrastructure, it's a

great time to start writing your own Bioconductor packages to do some interesting science. So what are you waiting for? Get out there and start writing some software. All right. Well, thank you very much, Aaron. We can watch that again; you know, there's interest in many of the things that went on in there. Let's take a look at some questions. Does the creation of new R sessions for multiple Python environments adversely affect multi-session parallel

programming? It depends. In some contexts it may well do so. In contexts where you have a particularly aggressive out-of-memory killer, it can be a problem, but that would be a problem anyway if you spawn too many sessions. I haven't really tested it yet, because most of the time I'm running the basilisk functions within the parallel segments of my code, and I'm relying on whatever the parallelization framework does to do

that. Yeah, I would agree that it may not be unique to battle this but yes we will find out. Next question, do can basilisk environment share python package installations if the version requirements are the same What was the first fight? Like two different packages that have the same installation requirements for their python infrastructure. Do they are they able to share those or do they need to install them? Separate, I think I provided methods that allow people to do that, but that's a

different environment that you want to use in in the in the past and it will use it was it has to be one of your dependency. Yes. That's that's, that's why and Great. The Basilisk. Tendency of reticulate. Even if it is mentioned in the vignette as a package dependency, for some byassee package. So, I don't think so. I think I can basilisk, basilisks, are always really as a, as a test ruling on top of particulate. So if you use a siphon, you stupid using employ, you know, the various are tied to our and go all the time

you need to do that. It's just not messing around that to make it easier for you to do. You deploy that packet? How is a deferred Matrix different from Adelaide array? Yes. Yes, I heard that sucks. I had to put Matrix is Adelaide Matrix, basically said, it's not a good choice of any. I thought you say, you know, something is that a true Delight alterations, which is censoring and the scale, right. That's that's basic. That's basically. The reason why is because I want, I need to add the,

I need to override the matrix multiplication operate after that. If he can intelligently operate on the back, that's on the centering is delay, right? Because normally, if you do make it to motivation, with the process that is used, is a process where each block is is, is realizing to an ordinary Matrix and then that is multiplied. I'm sad with the different metrics because because, you know, we know that the operation is that allows us to do things with respect.
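As a rough illustration of the answer above, here is a minimal pure-Python sketch of multiplying by a matrix whose column centering and scaling are deferred. It is not the Bioconductor implementation, which is written in R; the class name `DeferredScaledMatrix` and its methods are hypothetical and exist only for this example. The sketch relies on the identity ((X - 1c^T)D)v = X(Dv) - 1(c^T Dv), with D = diag(1/scale), so the centered and scaled matrix is never materialized.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(x: Matrix, v: Sequence[float]) -> List[float]:
    """Ordinary matrix-vector product for a row-major list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in x]


class DeferredScaledMatrix:
    """Hypothetical stand-in for a matrix with deferred centering and scaling.

    Represents the matrix whose (i, j) entry is (x[i][j] - center[j]) / scale[j]
    without storing those centered and scaled values.
    """

    def __init__(self, x: Matrix, center: Sequence[float], scale: Sequence[float]):
        self.x = x
        self.center = list(center)
        self.scale = list(scale)

    def realize(self) -> Matrix:
        """Naive route: materialize the centered and scaled matrix in full."""
        return [
            [(value - c) / s for value, c, s in zip(row, self.center, self.scale)]
            for row in self.x
        ]

    def matvec(self, v: Sequence[float]) -> List[float]:
        """Deferred route: multiply using only the original matrix.

        Uses ((X - 1 c^T) D) v = X (D v) - 1 * (c . (D v)), D = diag(1 / scale).
        """
        scaled_v = [vi / s for vi, s in zip(v, self.scale)]
        shift = sum(c * w for c, w in zip(self.center, scaled_v))
        return [y - shift for y in matvec(self.x, scaled_v)]


# Small illustrative input; the values are arbitrary.
x = [[1.0, 2.0, 0.0], [0.0, 4.0, 3.0], [5.0, 0.0, 6.0]]
center = [sum(col) / len(x) for col in zip(*x)]  # column means
scale = [2.0, 4.0, 3.0]  # arbitrary scaling factors for the example
v = [1.0, -1.0, 2.0]

deferred = DeferredScaledMatrix(x, center, scale)
naive_result = matvec(deferred.realize(), v)
deferred_result = deferred.matvec(v)
```

Both routes give the same product up to floating-point error. The deferred route only ever touches the original matrix, which matters when that matrix is sparse or stored on disk, because explicit centering would generally turn it into a dense one.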

That's very interesting. Can basilisk use packages installed manually, for example when you are working on less common architectures or operating systems?

Yeah, that's a good question. I know people have done that; it would definitely be feasible for local installations. I don't know how easy it would be to make that into a deployable Bioconductor package. It works if you want to deploy only for yourself, but for a package you have to pull the code down from somewhere. Right now, we pull the code down from conda or something. If it's just some random code that you wanted to include, then yeah, you can definitely do that locally. I don't know how you would do it in a package.

I guess it's kind of buyer beware, right? I mean, you don't do anything to see whether your Python dependency is one that has been deprecated, or needed an update, or whatever. Any thoughts about how one could have some sensitivity to whether they're using the state-of-the-practice Python infrastructure at some point in time?

At some point, early on, I tried to do something like that, to make sure that the versions of the Python dependencies agreed with each other, for example, and were up to date with the latest distribution from Anaconda. But at some point I stopped fighting the system, and I was like, well, I'll just let conda handle the dependency hell for that.

That's more of a decision that you make: you say, okay, I'm going to pin these packages, and then you are responsible for that. I didn't have any clear idea of how I could suggest updates for you, because a lot of the time you need to pin the versions down to make sure you're getting consistent behavior.

I wouldn't want to make more work for you there, but maybe something like a "valid" check for basilisk would be something that somebody could contribute, potentially as another package or as a pull request to basilisk. Okay, let's see.

So, will all these updates be pushed into the OSCA book?

The book uses quite a few of them, though it doesn't directly use some of them, right? You never see us really talk about it in the book, but under the hood, for example, [unclear] function uses the low-rank matrix, [unclear], and some of the various functions use the residual matrix. Again, that's something that they just do under the hood; you don't even see it happen. And in the trajectory chapter [unclear], you don't even notice that you're using it. The idea is that you won't even notice that you're using basilisk, just like you don't notice everything else at work under the hood: the infrastructure that makes the magic happen.

Okay, we have two more questions. I think the first is: where is the best resource, simple for beginners, to use DelayedArray?

To use DelayedArray, right? Okay. So if you're using it as an end user, you don't have to do too much. I'd say the simplest resource is a workshop: go to Pete's workshop, listen to the music that he plays, and try some of the exercises, because I think you can get information from that.

But as you say, you rack your brains and say, where's the documentation? Maybe the vignette is good enough, maybe you need to [unclear]. I don't know. But someone should write a book about it. Going to the high level and thinking of "ten simple rules": can you contribute a couple of rules for new developers?

New developers... I haven't been one for a long time. Let me think. One of them is to not reinvent the wheel, right? There's a crazy amount of wheel reinvention that goes on [unclear]. When you actually want to get some work done, whatever you've done, there's probably someone who's done it much better, with more eyes and more tests. [unclear] it took me two years of taking it apart and eventually replacing it [unclear]. So it's about thinking about what you should be doing and looking for other tools, for solutions that are already there.

And I had another piece of advice.

I'll throw one in: don't go it alone. Talk to somebody. I don't know how often you do that.

I'm on Slack so much [unclear]. You should definitely talk to people. Yes, they're pretty nice out there.

And then the other one would be to read a book called The Mythical Man-Month. Are you familiar with that?

I've heard about it. I've never read it myself.

So it's basically going way back to software architecture for the VM 370, and the idea is that your ability to predict just how much work it is going to take to do something is probably not what you think it is. It's going to take a lot longer.

Yep, that's definitely true. [unclear] time on my hands, because it doesn't really matter to me; I just have to spend more time with it.

You need the long view, and you've taken it, and you've made some key contributions, but it certainly is a lot of work. We thank you for that, and the video should be watched again by everybody, just to get it all under your belt. I don't see any more questions in the poll. There was one in the chat, but it was more of a comment than a question, from [name unclear]: "I think making it hard to use random Python packages not on conda or pip in Bioconductor packages makes sense, just like Bioconductor packages can't have R dependencies that are not on CRAN or Bioconductor."

I also agree. Sometimes the user is responsible for what's going on, [unclear] there is no way I could fit that into production, right?

All right, folks. Well, I think we are at the end of the session. We need to get ready for the next one. Thank you all for participating, and thank you very much, Aaron, for everything you've done.
