Transcript for:
Exploring Epic Cosmos for Research

It is my immense privilege to introduce Dr. Lindsay Knaik. We checked last night, and you do say the K out loud. I've known Lindsay for five years and I was never sure whether it was more like K-night or K-night. And so it's more like K-night. So Lindsay is visiting us from the University of Iowa, where she is a professor of pediatrics and associate CMIO at the Stead Family Children's Hospital. Lindsay is an Iowan. She went to college in Iowa, in biomedical engineering, and then medical school also in Iowa, right? And then she had the idea that she wanted to go on an adventure, so she went to Baylor for residency in pediatrics, and then she had the incredibly good sense to do a neonatology fellowship, and to do it at Vanderbilt, home, best place in the universe for a neonatology fellowship. She had the even better sense to pair her neonatology fellowship with an informatics degree, so that's how I met Dr. Knaik, while she was doing her degree here. She did a very interesting study that I'm not sure she'll be talking about, but it was about volume-targeted ventilation in the unit. When we're trying to pump up small babies, we do it based on the size of the balloon, not how hard you blow. And so, that can be the next talk. Okay, exactly. So it turns out that's better for the lungs and doesn't stress them as much. So that was pretty cool. And then we tried to get her to come here, but she said she wanted to go to Iowa because there's a lot of small babies there. These are exceptionally small babies. That's actually what took her to Iowa. She is an associate CMIO there. She's in charge of lots of powerful and important things, including Secure Chat. So I have a questionable hypothesis that someday we're going to abandon Mobile Heartbeat and use Secure Chat instead. I don't know for sure. Everyone else I know has done that. I think it's probably going to be a big deal. We'll probably ask her about it. Yes, so good. So we'll ask her for some tips.
We invited her here partly because she's our friend and we like her, but partly because she is a power user of Epic Cosmos. We're talking about Cosmos. Cosmos is a research-related data sharing network. And I know we're friends at least with everyone in the room, and I'm assuming we like all the participants on Teams. But I'll tell you guys that right now we don't have Cosmos at Vanderbilt. My perception, as the director of the Clinical Informatics Center, is that this is an oversight and that we are missing out on Cosmos. So I believe that we need a groundswell of support and enthusiasm for Cosmos. And so one of my goals is to create such a groundswell. And so if, after hearing Lindsay's talk, you say to yourself, man, we need Cosmos, email me and tell me why. And if you're a powerful person, email other powerful people and tell them. And hopefully we'll get access to Cosmos. He does everything I love to MC. Anything else I should have said? Okay. Doctor today. Don't see anymore.
Well, thank you guys for inviting me here. I don't have to go too much into my personal background. Adam did a great job of doing that. But some of what I wanted to talk about with Cosmos is just giving you an idea of what it is and how we're using it at the University of Iowa right now. I will say we have a lot of small baby examples, since those are the babies I take care of and what we're going to focus on, but hopefully it can relate to other populations as well. And then more about how we interact with the Epic Research team and the Epic Cosmos team as well, since it is a product that is new and still continuing to improve, and where I think the future of Cosmos is going. And there should be some time at the end for more questions about other populations we can talk about as well. So my disclosure is I've only ever worked at hospitals that have used Epic. I've taken a lot of Epic training classes, mostly because Adam told me I had to when I graduated.
But sadly, Epic is still not sponsoring this talk. Maybe someday. So for the trainees in the room, Adam did a great job of going over my background, but just a little bit more in depth. I did do biomedical engineering at Iowa, which is why I think I became informatics focused. But in the research that I had to do to get into medical school, I actually did prediction modeling for a digital human in engineering, which is funny because now that's what I'm potentially writing my K award on: prediction modeling. So sometimes things come back full circle to what you did in undergrad to get into medical school. So I'm trying to use some of those skills now in my continued career. But at the University of Iowa in medical school, one of the other research projects I did was to look at using EHRs to create tiny baby databases for prematurity, which is something that now directly relates to Cosmos: how can we use EHR data to continue to create those databases we needed to create? Even though it was 10 years ago that I did that research, we still haven't improved it yet, but I think someday hopefully we'll be able to. And then the rest is history. I came down south, went to Texas and then Nashville and then back home to Iowa. But I'm happy to visit here and talk with you guys more. So the big question is, what is Cosmos? Since I started giving talks about Cosmos six months ago, I gave a talk at UGM. At that point, they had their de-identified database from the organizations on Epic that signed up to submit data to Cosmos. It had 203 million patients, billions of encounters, billions of face-to-face visits, and now, since I pulled this a couple days ago, the numbers have only continued to increase as more hospital systems sign up for Cosmos and come on board. So it's definitely a large, de-identified, very powerful database that Epic is trying to champion from all of their institutions.
The benefits, and I will say this is from the Epic website, so a disclosure there. Their benefits are that it's one of the largest databases that you can get of clinical EHR information. There are lots of other institutes or areas that are trying to create similar databases, but I do think that with Epic being at so many large institutions, they're definitely going to outpace any other EHR databases quickly in the future. And it is fairly fast to query millions of patients. It can sometimes take minutes because you're going through so many patients, so you just have to be patient when you run a query. It might take a little bit before you can get your data, but it's still overall pretty quick in the scheme of things. I'll show on the next slide that the data is pretty similar to the U.S. Census. Since they are getting sites from all over the U.S., they do have a demographic spread that is close to the U.S. Census. And then there are other big powerful pieces. They say that you can have longitudinal charts of patients. However, I will say that the caveat to that is, remember, like Vanderbilt, not all organizations that have Epic are on Cosmos right now. So for patients that may move in and out of Epic organizations, or in and out of non-Epic organizations, you just have to take a caveat about what longitudinal data is there. You may be missing some of that. However, with the numbers in Cosmos, maybe it doesn't matter as much if you drop off 10 or 20 percent of your patients, because the overall numbers are so big. But I think the important thing to understand about Cosmos is that your question is really important, because what really matters is whether the data is there and whether it is going to be reliable in the Cosmos database, especially while it's new and up and coming. It's something to think about as you're thinking of a research question. And then they do have a very diverse data set.
They have labs, meds, some social determinants of health things in there, which I'll show a little bit, and even some patient-entered data. And Epic is actually very good at taking feedback about what things they don't have in Cosmos yet and what things they might be willing to get. I will say, because I have an ICU background, one of the first questions I asked them was, can I get all of my flowsheet data, my labs, my vital signs, all that sort of thing? Right now, they have hourly flowsheet data in there because they couldn't handle doing much more. So just know that for things like vital signs, they only have hourly data. And you might need to get to the granularity of talking to them about, if you took three blood pressure measurements in a row, which one are you actually putting in? And so I think they're deciding, do we put in the max, the median, that sort of thing. So it's things like that where you actually have to get in touch with Epic and ask them those questions if you want to use this for research and really understand the data set available. So this is again from the Epic website, showing how representative it is compared to the U.S. Census. The purple is Cosmos and the blue is the U.S. Census. You can see that they do a pretty good job with age and race and ethnicity, all getting pretty close to the U.S. Census. And then, while they don't give you exactly which hospital and where these patients live, because it's de-identified, they give you additional information like the social vulnerability index. So you can get to some more social determinants of health to potentially ask some of the questions you want to ask of Cosmos. So how do you get access to Cosmos? Number one, belong to an Epic institution. You guys have that. I think what Adam was alluding to is you're stuck in the number two step, which I will say, full disclosure, I was not involved in at my institution.
I had people higher and more powerful than me that pushed that along. But I think that's why I was invited here, to help Adam in getting that grassroots support to push this along. But it is up to different hospitals to decide if they're going to be okay with the legal terms of Cosmos. But then after you get through that, step three is for those of us who are users. Of course, you have to take Epic classes, because anybody who's listened to Adam and Allison knows there's lots of Epic classes you can sign up for and take and become additionally trained in. And I'll show some of them a little bit, but you can do just the super user level in Cosmos, where you learn more about the SlicerDicer side of things. Of course, with anything Epic, they like to have a lot of SlicerDicer access. Or if you want to get more to the line-level data, you can do the data modeling training, where you have to jump through their SQL tools to make sure you've taken those prerequisites. But then you can get access to some of their additional information in the line-level data as well. So once you get access and you jump through the classes, you get access to the Cosmos dashboard, which is actually a virtual machine that you log into that's hosted at Epic. That's how they're able to say that the de-identified data is secure. It's not something I'm ever downloading on my system. It always stays on their virtual machine at Epic. But once you log in, you can see the Cosmos dashboard, where they're constantly updating the patient population and what kind of patients they have here. And then you can explore with SlicerDicer, which I'll show some examples of. You can go into the data science side of things where, all in their virtual machine, you'll have access to R and Python and other things that you might need for the data science side of things.
And then, of course, their data dictionary as well, which is something that's very important for figuring out what data is in there and how they define it, and learning more about that. And then they do have other things to be helpful, like the publication checklist. So before you publish your data, you think about, have I thought about these steps that they know are some of the limitations of Cosmos, to make sure you're publishing data that's valid. And they do actually recommend that other peers in your institution help review the data you've pulled out of Cosmos before publishing it, because as any of us know when we work with EHR data, there's lots of different ways that you can ask the question and potentially show something that maybe isn't as accurate. So they're trying to be mindful of making sure you follow those steps. So, the data sets we have available. Their limited data set is what they have available in the SlicerDicer version. And so this is where they do actually include things like zip code and age. And you can see state-level data, because this is all queried in aggregate, because it's only in the SlicerDicer version. So you can see a little bit more about the state-level data there. However, if you want to get down to more of the line-level data, like flowsheet information or more specific granularity, that's where you'll need to go to the data science side of things. And then they do make it more de-identified. They remove the state and zip code information, and then they shift a lot of the dates, similar to what you guys do here with the SD and the RD and that sort of thing. So you just have to know that the caveats are a little bit different on age and birth date, which matters to me. But you're able to get more to the line-level data. So now we're going to go into a few examples to show you how I've used this in the real world. We're going to talk about neonatal hypertension first. This is some of the talk that I gave at UGM.
And so I gave it with my ACMIO counterpart, Dr. Misrak. So he helped me with this. He's a pediatric nephrologist, thus the neonatal hypertension. We used our combined interest as an example. So first we're starting in SlicerDicer. If anyone's used SlicerDicer, you know you always have a lot of models you can choose from. And similarly for Cosmos, there are models that you can choose from. So at the time, we could choose to use the patient model, which contains millions of patients. But I, as the neonatologist, was actually more interested in the patients with birthing parent information model, so the babies that are connected to the mothers and have some of the maternal information, which is around 6 million patients, which is a huge database compared to anything that we have in neonatology right now. Or you could do the encounters model, which in neonates is maybe looking at that birth encounter, that first encounter of their life, or you could go into their admissions model. So when we were first looking at how we wanted to do this neonatal hypertension project, I, of course, chose patients with birthing parent information. As the neonatologist, I thought that would be the best information to go to first. So I went into SlicerDicer and I had my six million patients and I decided, okay, let me start creating my model. And so the first thing I did, because this is what I do in SlicerDicer at my institution, is I tried to slice by the department specialty. I tried to ask which of these babies, because there could be newborn nursery babies, were admitted to the NICU. And I tried. They had neonatal critical care, level three, level two. And I tried to slice by that. And then I was like, wow, why are five million in none of the above?
So my first query kind of failed, and I thought, gosh, why can't I get more of that information in there, like what I'm used to doing in SlicerDicer in my own institution? And so I started asking Epic, and they said, well, actually, the mapping comes from the Care Everywhere mapping that they already have set up. And the way that they were mapping neonates is they actually mapped them to the pediatric intensive care unit. They had outpatient neonatology that mapped to neonatology. So that's why there were really no neonatal departments mapped in Cosmos. And if you should learn anything from this talk, you should learn that the NICU is not equal to the PICU. This is one of the babies I take care of. And when patients have teeth, I get very nervous. So it was disappointing to me that Epic didn't realize it. And it's Care Everywhere, right? It's been around for years, and they just had lumped all the pediatric patients together. So that was one of the first limitations I found in Cosmos. And so I was like, okay, I have to think about this differently than the way I use SlicerDicer at my own institution. So we ended up deciding to use the admissions model because that seemed to be a broader-level model: you were admitted to the hospital. And then the way we filtered out newborn nursery babies from premature babies is we actually ended up using ICD codes, and everybody knows there are caveats to ICD codes, but it's the way we decided to do it. And Cosmos actually had a pre-built grouper on prematurity. So that was really nice. We didn't have to make our own, but you could expand their grouper and see what they included in prematurity, and they included a lot of stuff, like 27 to 28 weeks gestation and all the premature babies. But they also included some things I probably wouldn't have, like at risk for aspiration in premature newborn. I mean, I guess maybe they were premature, but so there are just
caveats to using the pre-built groupers. You can always copy and make your own groupers and uncheck some of these things to get them out, or add some things in, if you really want to be super specific about which ICD codes you use. But as our proof of concept, we just used them to see how well they work. Then the other thing we did is we made slices, choosing out of those different ICD codes to split it up. This was manual work. They hadn't done this for us, but we split up by weeks of gestation, how premature the baby was, because for neonatal care that really matters. You have a lot of different risk profiles depending on how premature those babies are. And for the most part, we chose these groupers so that we got rid of the at risk for aspiration of premature babies codes and really used the more specific groupers for these slices. And then with the measures, we were looking at patients with hypertension. They did have a grouper for hypertension, which seemed fairly reasonable. There are lots of different types of hypertension in here that are maybe not always diagnosed in the neonatal period, but that's okay. It's a big catch-all for getting hypertension. So hopefully most of the babies for whom at least somebody thinks they have enough hypertension to give them that ICD code diagnosis would get into this cohort. And then, using some of the visualization options in SlicerDicer, we were able to see, does it look like this data pans out? Clinically, a lot of times what we see in the NICU is that the earlier you were born, the younger you are, your kidneys are probably more immature, your vasculature is more immature, and you're going to be at higher risk of having hypertension, which is really what we saw here. The babies who are just slightly premature generally don't have hypertension, but the younger ones who have more comorbidities do. So it seems like this data was panning out in the model we chose.
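The gradient check described above (more premature babies should show a higher rate of hypertension codes) can be sketched in a few lines. This is only an illustration: the slice labels, the counts, and the function names are hypothetical and are not Cosmos results or part of any Epic tool.

```python
# Hypothetical aggregate counts per gestational-age slice (illustrative only,
# not Cosmos results): (slice label, admitted babies, babies with a hypertension code).
# Slices are ordered from most premature to least premature.
counts_by_slice = [
    ("<28 weeks", 1000, 60),
    ("28-31 weeks", 3000, 90),
    ("32-36 weeks", 10000, 100),
]

def rates_by_slice(rows):
    """Return (label, rate) pairs, where rate = coded cases / admitted babies."""
    return [(label, cases / admitted) for label, admitted, cases in rows]

def gradient_holds(rates):
    """True if the rate never increases going from the most to the least premature slice."""
    values = [rate for _, rate in rates]
    return all(earlier >= later for earlier, later in zip(values, values[1:]))

rates = rates_by_slice(counts_by_slice)
for label, rate in rates:
    print(f"{label}: {rate:.1%}")
print("gradient holds:", gradient_holds(rates))
```

If the gradient did not hold, that would not prove the cohort is wrong, but it would be a reason to revisit the grouper or the model before trusting the numbers.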
And so then we started to ask a few more questions to see if it matched what we're seeing in the literature. Hypertension here is only diagnosed via ICD code, right? We're not looking at the actual blood pressures right now. It was diagnosed 1.7% of the time, which seemed to match what we saw in the literature. And we could look by year, 2018 to 2019 to 2022, and the diagnosis rate seemed pretty similar. So at least it's not showing that hypertension is increasing in our population. So we could look at that. And then we could look to see what meds people are using to treat these babies with hypertension. And for the most part, they were using ACE inhibitors and calcium channel blockers, such as amlodipine, to treat these babies. So it was a really interesting way to look and see across the U.S., not just at the academic centers who are publishing papers. We can actually see what people are really doing in diagnosing and treating neonatal hypertension. We haven't published this data as of yet because it was more of an exploratory analysis, but this is just showing the power of using Cosmos to really see what community hospitals as well as academic centers across the nation are doing to treat these patients.
Lindsay, can I just give you a plug for this? So what Lindsay did is something I call a sanity check. She made sure the data wasn't totally bonkers. It would not be expected that this year there were three times as many hypertensive babies as there were last year. That would be a sign that something changed in the data. I love you all, but I occasionally see people, I think, very confidently using data that someone else got for them without doing a good sanity check on it. And I think that could lead to false results. And so I think Lindsay used a little bit of her domain knowledge: what medicines treat hypertension. There's a gradient with age. It hasn't tripled in the last year.
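The "it hasn't tripled in the last year" check can be written down as a small helper. This is a minimal sketch under stated assumptions: the yearly rates are made-up values chosen to resemble the roughly 1.7% figure mentioned in the talk, and `flag_yearly_jumps` and its `max_ratio` threshold are hypothetical, not part of Cosmos or SlicerDicer.

```python
def flag_yearly_jumps(rate_by_year, max_ratio=2.0):
    """Return (previous_year, year) pairs whose rate changed by more than
    max_ratio in either direction. max_ratio is an arbitrary illustrative threshold."""
    years = sorted(rate_by_year)
    flagged = []
    for prev_year, year in zip(years, years[1:]):
        previous, current = rate_by_year[prev_year], rate_by_year[year]
        if previous == 0:
            # Avoid dividing by zero: any move away from zero counts as a jump.
            ratio = 1.0 if current == 0 else float("inf")
        else:
            ratio = current / previous
        if ratio > max_ratio or ratio < 1 / max_ratio:
            flagged.append((prev_year, year))
    return flagged

# Hypothetical diagnosis rates by year (illustrative only, not Cosmos results).
stable = {2018: 0.017, 2019: 0.016, 2020: 0.018, 2021: 0.017, 2022: 0.017}
tripled = {**stable, 2022: 0.051}

print(flag_yearly_jumps(stable))   # no year pairs flagged
print(flag_yearly_jumps(tripled))  # the 2021 -> 2022 jump is flagged
```

A flagged pair is a prompt to ask what changed in the data (a new contributing site, a mapping change), not evidence of a clinical trend.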
And she convinced us that her data was, it may not be perfect, but at least valid enough that some well-understood physiologic assumptions held true. And I would encourage you guys all to do that, whether you're using Cosmos, Clarity, SD or RD. At least half the data people have given me in my life has been just, like, wildly incorrect. And so you have to check it if you want to have truth in your research.
And even when we were trying to get started, which model should we start with in SlicerDicer? We found there were difficulties in different models. So we needed this sanity check to make sure, was this admissions model the best model that we could use to start with?
Best practice. Good job.
Good. I can keep going now. So the other thing that I wanted to point out is, while we were doing some of this, because you can imagine we went through multiple iterations, the other thing that Cosmos has, since they build on all the things that Epic has, is a dashboard. For these different models and components that you're building in SlicerDicer, you can add them to the dashboard as you're changing things in and out, which is a nice way to put these in and remind yourself, okay, what have we done? What's all under the hypertension bullets? The sad thing about the dashboard, at least in the current state, is that it wasn't easily shareable between people and organizations, which I'm hoping Epic will eventually change, because since I was working on this with my colleague, it would be nice if he did something to the dashboard and I could see it in the future. But what's also nice about these dashboards is, like I said, the queries can sometimes take a couple of minutes to run because they're working on millions of patients. And so if you have these components saved in the dashboard, once you've run the query once, your cache is saved.
So it should load faster the next time. It's just always the first time that you run these queries that can take a long time. And then Cosmos actually updates every two weeks. They upload more patients, so then your cache is cleared out. And so if you run the same query again after that, it might take a little bit longer to run again. So it's good to have multiple screens to let Cosmos run.
Someone in the chat had a question about what Cosmos treats as the unit of analysis, and then he specifically said, is a patient seen at one institution who then moves and comes to another Epic-using institution counted twice?
So back to Adam's point, I think it depends on how you make your patient cohort. Right. So it's very important. And I don't think I have a slide in here, but it's very important. In the work that we've done, when we started making mother-baby databases, we tried to deduplicate and make sure that we didn't have a lot of duplicates. Cosmos does have unique identifiers for each of the patients. So if they hop in and out of Epic Cosmos institutions, they do try to link them. But you just have to be mindful of, again, how you're creating that cohort. And especially in the example of moms and babies, moms can have multiple babies. So depending on how you do that, depending on the question, we have to think about, do we want a single mom, or do we want all the moms and their birth encounters in there? So it's always similar questions.
Are there pseudonymous, I don't know how to say that word, identifiers of which institution a record came from?
It's not too about it. It's nothing different to me as a clinician. Sorry, I can read. So there are unique identifiers you have in Cosmos to know your patients. It's just that you can't chart review them as easily, right? Because you're not in your own institution where you can go back and really look at the patient.
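The cohort question discussed above (one row per baby even if the baby shows up at two institutions, and one mother versus all of her births) can be illustrated with a small sketch. Everything here is hypothetical: the row layout, the identifiers, and the function names are invented for illustration and do not reflect the actual Cosmos data model.

```python
# Hypothetical birth-encounter rows: (mother_id, baby_id, institution_code, birth_date).
# All identifiers and field names are invented for illustration.
birth_rows = [
    ("M1", "B1", "inst_A", "2019-03-01"),
    ("M1", "B1", "inst_B", "2019-03-01"),  # same baby also recorded at a second institution
    ("M1", "B2", "inst_A", "2021-07-15"),  # second baby, same mother
    ("M2", "B3", "inst_C", "2020-01-20"),
]

def unique_babies(rows):
    """Keep one row per baby_id (the first seen), so a baby is not counted twice."""
    seen = {}
    for row in rows:
        seen.setdefault(row[1], row)
    return list(seen.values())

def first_birth_per_mother(rows):
    """Keep only each mother's earliest birth (ISO dates compare correctly as strings)."""
    earliest = {}
    for row in unique_babies(rows):
        mother = row[0]
        if mother not in earliest or row[3] < earliest[mother][3]:
            earliest[mother] = row
    return list(earliest.values())

print(len(birth_rows), "rows")
print(len(unique_babies(birth_rows)), "unique babies")
print(len(first_birth_per_mother(birth_rows)), "mothers, first birth only")
```

Which of the two cohorts is right depends on the research question: counting babies calls for `unique_babies`, while a per-mother analysis may call for one birth per mother.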
So you have to use some of the age and other things to look at the patients that way. And they are linked to an institution. The institutions are just de-identified to all of us. They have an institution code as well. So we don't know which institution that is, but you can see they are linked to institutions.
But you do know the state.
In SlicerDicer you know the state, not on the data science side, because that is de-identified. You're stealing my thunder from the end. But I think Epic's legal team has just decided that they can potentially release state-level information on the data science side in the future. Right now they don't, because that was part of their HIPAA de-identification regulation. But now that they have enough hospitals in each state, I think they feel that they can release the state-level information on the data science side. And they wouldn't have zip code. They have zip code in SlicerDicer because SlicerDicer is all aggregated data, but if you want to get to the line-level data, they get rid of the zip code. But they give you other things like the social vulnerability index and some other social markers, just not exactly the zip code.
So Lindsay works in a hospital that's renowned for the quality of care it provides to very, very small premature babies. And one of the worries was that her hospital might calculate some outcomes and then publish them and say, we have much better outcomes than the hospital across town or something like that. And so Epic has done a lot to try to prevent you from doing hospital versus hospital comparisons for marketing reasons or safety or quality reasons. And I suspect if Brad Malin was here, he'd say there may be ways that you could try to re-identify this data, but you're not supposed to. And they try to resist you doing that.
They want you to treat this as an aggregate data set, and it's not really designed to identify specific hospitals. You're not supposed to try.
Right. You're not supposed to try to identify the specific hospitals, but you can, and I don't think I showed this, get your own data in SlicerDicer and compare it to all of Cosmos. So you can see how am I doing in comparison to all the other hospitals in Cosmos, to do a little bit of quality improvement, QI, to see, are we keeping up?
But they wouldn't want me to say, look, Vanderbilt's better at heart attack care than, say, Saint Thomas or something. They would resist me doing that. But it would be easy to figure out, right, that Vanderbilt's going to have way more heart transplants than any other hospital in the state.
So I think that's why they were a little bit resistant to releasing some of the state-level data. And so, especially on the data science side, it's still yet to be revealed how they feel that they can start releasing that. But you're right. You could, especially if you're looking at heart transplants or something like that. Even in the NICU, you know how many beds are in the NICUs of each hospital. So you could probably reverse engineer regional hospitals.
As part of the Cosmos rules, though, don't you promise not to try to do that?
That's what I was going to say. This is Mike Williams from UVA Health. I remember delving into the Cosmos governance rules, and they explicitly had to work this out years ago because there was a big brouhaha between two Cosmos institutions over just that exact issue. One institution was publishing data and results saying operationally they were much better.
Yes, I think we lost you, but I think we got your sentiment in that, yes, when you sign up for Cosmos, you're supposed to agree to do all these things. And that's part of why you let them know before you publish, like, this is what I'm going to publish on.
And they can double check that you're following the rules.
And they can, and they probably do, audit queries that you run and stuff like that.
Yes. In one of the slides I had, you could see that this is all on a virtual machine, right? So they're recording everything you're doing and they know. I mean, I don't think they're really watching me and everything I'm doing, but they could if they needed to go back and audit and say, are you following the rules? Yeah. Okay. So great questions. So to continue, all of what I did was on the SlicerDicer side, so the aggregate data side. So then the next step is we were like, okay, can we take what we did and move that into the data science side? I validated this cohort, like Adam said, I think I have a cohort I like, so can I now move that to data science and get the line-level data? So in Cosmos SlicerDicer, you can use the troubleshoot button to see a lot of machine-generated SQL code behind the scenes. However, they are really optimizing this code. So it says here eight out of 12 optimizations attempted. So for someone like me, when I try to go read the SQL query, it's a nested query in a nested query in a nested query. And it is really hard to follow because they're trying to optimize everything to make it go fast. So even when I tried to copy this code exactly and move it into the data science side, for a very simple small table right here, I was unable to get it to run exactly. So I was trying to replicate this table here. And the reason why I couldn't get it to run is we use age at ICU stay start to say, I only want babies who were admitted to the ICU at less than 60 days of life, because otherwise they might have been readmitted for something else. And this
is an example of the data dictionary, where they said this is actually only available in SlicerDicer because it contains the age of the patient, and for the line-level data they have removed it, so you can't actually get it. So after we made all of those queries, we couldn't directly take what we did and create our cohort on the data science side of things. And I say that because if you're thinking, well, I'm going to be a physician who validates everything in SlicerDicer and then hands it off to my data scientist, it's not quite as easy to do it that way. To make SQL code that humans can more easily understand, it's probably better to start fresh from the beginning than to just go from the SlicerDicer code, because that is really just trying to aggregate everything. And you have to look at the data dictionary. This is the little data science icon, SlicerDicer versus data science, to know what's available in each environment that you're in. I think some people have been successful in taking the machine-generated code and moving it into the data science side. But then if you want to edit it and add to it, it's not as easy to understand as if you or I had written it. So, to say what would be the next step of this project on the data science side: we could get that hourly blood pressure value. So if we wanted to not just use ICD codes for hypertension and instead look at what the actual blood pressures are, we could do that. And then we could access all the other tools like Python, R Notebook, Excel, MS SQL, and other things to work in that data science environment as well. So this is the virtual machine of the data science environment. And here it is: this session is being recorded for auditing purposes. But these are some of the different tools you have available.
And what they recommend you do is you pull data out of Cosmos, and then you put it in your projects folder. Because, like I said, it can take, you know, depending on how complicated your query is, it can take minutes to hours to potentially run. So if you pull it into your projects folder and kind of have a more locally hosted projects folder to run things on, it's going to go a lot faster. But then every time you want to add new patients because it's been updated in Cosmos, there's new patients now. It just might take a while for your query to run. But what's nice about this virtual machine is other people who have Cosmos, you can actually share your. project folders to other people. So you can work and collaborate with people at different institutions and say, we've created this cohort that we validated and we believe we trust. So you don't all have to reinvent the wheel in creating your patient cohort. And you can actually share that more easily across institutions. That is exciting. Okay. So moving into another cool tool in Cosmos that I got to trial is the sidekick. So If you've been to XGM or UGM, you've probably heard that Epic is trying to get some of the large language models into some of what they have. So in their slicer dicer, they have this new thing called Sidekick, which they haven't released to everyone who has Cosmos yet. I got to be one of the beta testers. Sorry, I like to show off. But I'm showing you how I trialed it. So in the Sidekick, I first started and said, how many preterm infants are in this database? And then it tries to pull the number. preterm infants in the database. And I didn't really like the number. I don't think it filtered it appropriately. So I tried again in my large language model. I said, how many preterm infants with a gestational age less than 37 weeks? So I gave them my definition of preterm are in this cohort. And then I gave them the thumbs down because I really actually wanted it split by gestational age. 
So then I said, split the groups by gestational age. And I finally got to something that I wanted. Now they only do this in the patients model of SlicerDicer. So I have a lot of patients with no values because it's adults or people who don't have gestational age, so don't have preterm information in there. But it did split it by gestational age for me. So I thought, okay, that's cool. I finally got there. So a faster way than me making the SlicerDicer query, that maybe the LLM could help me. So then I asked the question, include only patients who are intubated. So include those patients who got a breathing tube in. And then they did that. And I can look to see how are they making this. So it looked like they used the billed procedures for intubations of babies, is how they did that. But then when I tried to validate it, so babies less than 24 weeks gestation are usually immature enough that a lot of them need to be intubated. I looked at what they gave me. They gave me like 44,000 babies and only around 2,000 of them got intubated. And the numbers didn't quite seem to make sense to me. And so again, digging into the SlicerDicer, because they used that billed procedures. If I had done it myself, I wouldn't have necessarily used the billed procedures because I know as a clinician, a lot of times it's in a bundle. We get paid for prematurity and the gestational age of the baby, and it's just assumed they might get intubated. So a lot of times we don't submit a separate bill for this. So this billed procedures might've been the premature babies who came back to the ED and got intubated for another reason because they were sick when they were older, because this is just the patients model. The patient, any time in the patient's Epic record, could have been intubated. It wasn't necessarily tied to just like their birth history or their birth encounter.
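The plausibility check described above can be sketched in a few lines of Python. This is only an illustration, not anything Cosmos or Sidekick provides: the counts are the rounded figures quoted in the talk, and `min_plausible_rate` is a hypothetical threshold that a clinical expert would have to supply.

```python
def observed_rate(numerator: int, denominator: int) -> float:
    """Share of the cohort that the query flagged (e.g. as intubated)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def looks_implausible(numerator: int, denominator: int,
                      min_plausible_rate: float) -> bool:
    """True when the observed rate falls below what a clinician would expect."""
    return observed_rate(numerator, denominator) < min_plausible_rate


# Rounded counts quoted in the talk for babies under 24 weeks gestation.
cohort_size = 44_000
flagged_intubated = 2_000

rate = observed_rate(flagged_intubated, cohort_size)
print(f"Observed intubation rate: {rate:.1%}")

# Hypothetical threshold: the talk only says "a lot of them" need intubation,
# so 0.5 is an assumption chosen purely for illustration.
print(looks_implausible(flagged_intubated, cohort_size, min_plausible_rate=0.5))
```

A rate this far below clinical expectation is a cue to inspect how the criterion was built (here, billed procedures in the patients model) before trusting the count.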
So just again, where large language models inside can be helpful, but if you're not a clinical expert and you don't understand the data and how it's made, you have to be very careful in how you're using this because it could give you data that's not really answering the question that you want. So the way I would have done this is I went back to the data dictionary and I found the LDA event facts. and could find that they actually had the airway start and stop time in there in the slicer-dicer version and in the data model and the data science side. So if I were going to do this myself, that's probably the route that I would recommend going. But it was just interesting to see that the sidekick recommended an alternative route, which could be useful because sometimes maybe the sidekick would recommend something that I didn't think of that might be better. But this was an example of where I didn't think it was necessarily a better route to go. Could you say, please use the LDA event, like, when you're... That's a good question. I didn't actually try that, because then I tried to type in the LDA event myself, and I didn't like the response that I got. But, I mean, that's the point of the large language learning model, like, right, can it learn from me? And you could see it had the thumbs up and thumbs down, where I can get, right now, I'm the beta tester, so I can get the feedback to the model. But it'll just be interesting to see as this evolves and where it's going to go. Are they planning on introducing the sidekick to everybody? So I think they're planning on doing it for like all slicer dicer, not just Cosmos, like all slicer dicer. But I don't know how long they're going to be in the beta testing. Because like I said, and I told this to them because then they scheduled a message for me about like feedback. And I was like, I just worry that, you know, clinicians, it's already a big enough leap to get clinicians used to using slicer dicer. 
If you use the Sidekick and it gives data that clinicians don't think is accurate, they're going to be like, SlicerDicer sucks. But I used it, and I think there's the hope that someday the Sidekick will be better. I just don't know when they're planning on releasing it. I think they'll love it. I've been with them once. Yeah, yeah. Probably at XGM and UGM they'll be talking about their timeline on this, I would guess. Okay, so a little bit more about working with the Cosmos team, because as you've probably heard, I've already given them a lot of feedback. So what's nice is being kind of like one of the early adopters of Cosmos. They have been excited about setting up extra meetings and talking to me about what can we improve? What do we need? So I've talked to them about the Sidekick. I've talked to them about, hey, I tried to find the neonatology department and I didn't. And they understand the limitations about that and how other users might want to use that. And I do have an example of where they have added some prematurity subcategories for me. That's been really nice since I talked to them about that. And they keep enhancing like the birth history table, the birth fact table, and getting more information in that mother baby link of what they can add into Cosmos as well. So they have been very receptive, knowing that this is a new tool. How can we continue to improve this for researchers? But this is an example of when I was using the patient model, that I just pulled up the WHO preterm pregnancy subcategories. So they had created separate places for me. I still complain to them because Epic has gestational age in days. And I say, nobody thinks of gestational age in days. We only think of weeks. They haven't changed that yet, but I'm hoping. They also record birth weight in ounces and not grams, which all researchers use. So there's those caveats, right? Like I always have to
do the conversion in my head because that's how Epic is storing the data in the back end and they haven't been able to change it yet. So are adult weights in ounces too? Yeah, yeah. Like that was just their, I don't know, they were on the English system and it's like, why? Maybe someday. The other thing they do in Cosmos, and I don't know if people have seen this, is they have a whole Epic Research kind of blog or website where you can sign up to get their newsletters, and they send out what new research the Epic team is finding out of Cosmos. It's very interesting. So the way it works, here's two examples of some neonatal and maternal data they found. They have a team A and a team B. They don't try to publish this. They don't try to do peer review. Epic's version of peer review is they have two data science teams who try to answer the same question in Cosmos. And if they get to the same results or use similar methods, they feel like it's probably valid. And then they publish this data. The feedback I've given to them is, hey, you guys are publishing in the space that I'm doing research in. And if you want us as researchers to publish in like peer reviewed journals or get grants about Cosmos data, it might look kind of bad if you're publishing this before me, because you have a team of data scientists with unlimited time to work on this, where those of us in academics are like slowly chipping away. So I've given them that feedback and I've also told them you have the researchers doing this. So maybe before you publish some of this stuff, you should run like your definitions by us, because in this example of neonatal admission to the NICU, some of their definitions for admission to the NICU are different than what I would have chosen as a clinician. They do have a physician on their team, but, I actually know him, he's a newborn hospitalist.
So it's very different than a neonatologist in the academic realm who thinks about like EHR data and ICD coding and how to cohort this a lot. the few clinicians, which is sometimes one MD or one RN that they have on their team. So I worry about this slightly, that it could decrease the validity of Cosmos if they're publishing on this. But I think it's them trying to show the value of what I can get out of Cosmos and get some information out there. But they are interested. One of the times the University of Utah was one of their team Bs. So they are willing to partner with other institutions. to be one of the data science teams and then they would give it to the University of Utah and say we're not going to put this in a peer-reviewed journal so if you want to publish it in a journal like keep going and go ahead. So I think they're still figuring out what's the best way to collaborate with researchers. Is there a way to search for example like a project directory I'm interested in this kind of research because I feel like research for all the projects that have been done it's on there like a epic.research.com or whatever and it has a list of all the ones that they've published so you could searchable yeah yeah it's just on the website so you don't even have to be a cosmos user and you can read all the things that they've published and sign up for the newsletter to get what are they continuing to put out there and i mean even right now you could see what they have their cohort definitions so you can look and see how did they define it do i want to define it this way in my ehr system as well yeah And do you know, like, the majority of them, are they using the slice of dice? No, they're using the data science, because most of these are, like, data science people who are on this. 
So how the maternal child kind of group got started is actually when Cosmos started going live in 2020, before they had a bunch of research institutions up and running, the Epic Research team actually did collaborate with some OB and neonatologist research colleagues that I know, and they published a number of different papers on, you know, COVID and neonatal and maternal health outcomes. And this is where the Epic Research team, because they didn't have all the data science and all the tools up and running, was doing most of the work and partnering with institutions. However, now the model has shifted to, we've given you guys all the tools, you can do it, you know, Epic's not going to do your research for you for free. However, like I said, we're still having open talks and collaborations about how potentially you could work together. But because of that work, this is how a multidisciplinary team from multiple institutions has gotten together on maternal child data and information. And so that's where I was lucky to jump on this later because I was like, hey, I'm a neonatologist. I like what you guys are doing. And they've done a lot of work in creating a validated maternal baby cohort in Cosmos that we can then all share and use together to ask the different questions, because the maternal questions might be slightly different than the neonatal questions but can start from that same database, and that sort of thing. So this has been a great group to continue to collaborate with, and what we've been trying to tell Epic is this is how we think there should be spin-off subgroups like this in everything, right? Adult hypertension or, you know, whatever the question might be, it would be good to have collaboration of researchers so people aren't recreating the same cohorts in Cosmos over and over and over again. There are some issues for a large multidisciplinary team.
So teams from all over, like the good news is we have OBGYN, we have neonatology, we have a lot of different expertise, but say you want to get funding, how do you fund all these different institutions and get people to have time to do this? And, you know, how are we going to, you know, share all of our data sets and eventually make it open source, especially if we do get funding someday, because that is some of the requirements. So stuff that we're still working through. So that's why I tell people that, you know, Cosmos is like my hobby, because it's not like I'm funded or I've dedicated time for this, but something that I like to do as we see what this tool is going to be able to do for us. So speaking of funding, when you start a new project in Cosmos, they actually ask you that question. So as long as you don't have funding, you get to access Cosmos and do everything for free. But if you do get funding, and part of the rules of the road when you sign up for Cosmos is they do say that we would like to take a percentage of that funding, which I haven't checked on this recently. But I think at one point that percentage was like 10%. They would like to give some contribution back to. Cosmos, which if you get a small $10,000 grant and give them $1,000, that seems reasonable. But if you get millions of dollars from the NIH, a 10% fee is a lot. So some people have chosen not to use Cosmos for their federally funded research because of that reason. And again, this is feedback we've given to Cosmos of a lot of databases have a flat fee and not like a percentage of your grant. So hopefully something that will be changing and something that I think you could have those conversations with Epic. But I know that that has been a deterrent for some some people. So on the research team side of things, this is a slide that I got from Kevin Dysart, who is part of the neonatology group that I showed you before. 
Part of the reason that I think having these multidisciplinary teams is really useful is I know at Vanderbilt, there's lots of us who strive to become unicorns and the experts at everything. But the reality is, can you be a subject matter expert and know all of the clinical things as well as all the machine learning, math and science? It's really hard. So that's why creating these multidisciplinary groups where you have people that bring in that different expertise can be really useful, and making sure you have your subject matter experts at the beginning of creating your databases, like Adam said, making sure you're having a validated EHR database, is really important as well. So something to think about when you're using this large data set. Okay, I have time for a final tiny baby example. I'll go fast since there's probably going to be more questions. So at the University of Iowa, some of my mentors, Dr. Klein and Dr. Daigle, have been creating the tiny baby program for years. And I show this slide because at 22 weeks gestation it is very arguable whether the babies can survive or not. Is that, you know, the cutoff of viability? Not that many years ago, 24 weeks gestation was the cutoff of where a premature baby could survive. But you can see as time has gone on, this is the University of Iowa data, we've gotten better and better at helping these babies survive. So our limit of when we will resuscitate keeps dropping lower and lower. So I wanted to use this question of, we have a great program at the University of Iowa, how is the rest of Cosmos doing? How's the rest of the data in the US doing? This is just a quick slide showing we've even gone down to 21 and three weeks, and this baby at one year old has survived and done a pretty good job. So we like to push the limits at the University of Iowa.
Lots of reasons for that that I won't get into since this isn't quite a clinical audience, but we have a lot of dedicated teams taking care of these small babies. Oh, and one of them is the standardized order sets, which as a physician builder is a great way to have clinical decision support and standardized care for your patients. But a lot of other neonatology databases have tried to answer the same question, how are 22-weekers surviving across the U.S.? And these databases were started back in the 1980s. And so they are manually curated databases where research nurses spend hours looking through my admission and discharge summaries and all these things to submit the data to these databases. Even a more recent one established in 2010 at the children's hospitals, the CHNC, still does a similar thing. However, for those of us who use EHR data, I wonder what's the future of some of these databases and can we automate some of them? So that's kind of what I'm trying to test with Cosmos. So this was from the Neonatal Research Network database. This was most recently published looking at mortality and morbidity in these tiny babies. And this was 2013 to 2018. And you can see it was published in 2020. So COVID happened in those years. But knowing that the data ended in 2018 and it then took like four years to finally get this data published just shows the lag of manually curated databases and of the ones who finally can get this data to be published. So I decided to try to replicate this paper that was published in JAMA showing their survival. But I also decided to replicate it not only in the 2013 to 2018 period, but hey, in Cosmos data, I have the right now data. I can look at the 2019 to 2023 period to see if I can replicate that database. So starting with just female, male, and birth weight, can we get some of the basic demographics correct?
In that 2013 to 2018 cohort, you can see most of the time we have 40% females, which is about what they found here, and 52, 55% males, which is about what we found here. So I gave that a checkmark that that seems pretty reputable for what we found. And then what's interesting is the median birth weight was 480 for 22-weekers, and we got almost the exact same medians across the gestational ages there. So again, birth weight is something that as neonatologists, we want to know immediately and is very readily put into Epic and very reliable there. And you can see the numbers, 22-weekers. I have 1,422-weekers in Cosmos versus 550 in this database, which is only 15 academic institutions. But you can just see the power and the difference of Cosmos there. So going further along, the big question is survival to discharge. Do these babies survive to discharge home? So in this database, the 22-weekers had 10% survival to discharge. And in my Cosmos database, I got 7% survival, which I thought was pretty good because remember Cosmos is community sites, academic sites. It's a bigger number, versus this is 15 high performing academic areas. So I think that's pretty similar. But the other thing I put in was transferred, right? Because some of these babies might be transferred to higher levels of institutions. So we might have, you know, not the exact same numbers because of that. But then what's interesting in this more recent cohort is we can see that the survival has improved from 7% to 12%. So meaning parents and families are wanting these babies resuscitated more, and more hospitals are trying and succeeding at getting them to survive to discharge. So it's just very cool to see the trend. And then of course, as the babies get older, the trend goes up. You can see that 26-weekers get to like 80%, 90%. It's not quite as high in my database. Again, probably because of the limitations of are these community hospitals, are they getting transferred out?
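One way to judge whether two survival percentages like these are "pretty similar" is to put a confidence interval around each. The sketch below computes a Wilson score interval using only the Python standard library. It is an illustration rather than the analysis done in the talk, and the counts are hypothetical: 55 of 550 is simply 10% of the comparison cohort size quoted above, not a figure taken from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration only: 10% survival in a cohort of 550.
low, high = wilson_interval(successes=55, n=550)
print(f"10% of 550 -> 95% interval roughly {low:.3f} to {high:.3f}")
```

If the interval around one cohort's rate comfortably contains the other cohort's rate, the difference may be within sampling noise. Differences in case mix, such as community versus academic sites or transfers, still need separate consideration.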
So maybe I don't have as full of a data set there as this one does. But then the other question I thought, medication use. EHR data is really good at medications. Can I look at what medications are being used in this tiny baby population? So I pulled the most common neonatal medications. And what I found is a lot of X's because of the size. In the small 22-weekers, ampicillin and gentamicin are antibiotics we use pretty frequently, 90% of the time. I only found 8% and 30% in these infants. And after we asked Cosmos, the reason was is the way that institutions were submitting their medication data was a little bit messy and probably not quite as accurate. Because some of these antibiotics come in vials that then have to be mixed. And so like... Similar to like the care everywhere problem, just the way the data was getting submitted in Cosmos wasn't quite as clean. So we've given this feedback to Epic and hopefully they'll work with the institutions or work with how can we get this data better. But I was just kind of surprised because I thought the medication data would be spot on in Cosmos, but it wasn't quite yet. However, for caffeine, it seemed to work pretty well. So caffeine in the small babies is 40 to 60 percent. And we were getting some of that in our database. So that one worked. pretty well. And surfactant, I gave that a plus minus. This is a medication we give to babies right after they're born to help them breathe better. And in here, it's the third most common, up to 80%. And I got it up to about 60% in my database, which I think is pretty reasonable, considering that surfactant can be given on transport at a different hospital before it's brought in. So something that's probably a reasonable number for that. Real quickly, going to maternal characteristics. 
So I'm not going to go through all of this, but I will say that the ones that we found maybe weren't quite as reliable with some of the public medical insurance compared to the paper I was comparing to, because I think maybe Epic Cosmos has slightly different definitions of what they call public medical insurance versus the others, as well as prenatal care was not. quite as reliable. Prenatal care in the paper got up to 90% and that got it to be 50%. And in a manually curated database, you could ask the mother, did you have prenatal care? Whereas in COSMOS, you can only rely on, did they have prenatal care in my Epic COSMOS system? So, you know, there's some limitations to that there. And then the big question that everybody in neonatology wants to know is actively treated at birth. Are we trying to resuscitate these 22-weekers or are they just going to need comfort care and we don't think we can actually try to resuscitate them? So in the NRN paper, they were able to chart review and look at the resuscitation information. Did you try to give chest compressions? Did you try to intubate the baby? What did you try to do in the delivery room for that baby? We won't have. the notes and that kind of level of data necessarily in Cosmos. So we chose the definition of did you get any antibiotic or any medication on that list I showed you, knowing that some of them are better than others. But if somebody gave you a medication in Cosmos, probably they were trying to keep you alive. So we use that as our active definition. I threw in the 21 weekers here because not many places are resuscitating 21 weekers to kind of show a comparison. But you can see that here they got up to 36% of active resuscitation. And we had about 10%, but then it kept increasing as time went on, which it does with gestational age. It does increase. And it got up 24 weeks to 89%, which, you know, gets up to 90 in this paper. 
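The proxy definition of active treatment described above (any tracked medication recorded for the baby) can be sketched as follows. This is a minimal illustration in plain Python, not the query actually run in Cosmos: the record layout is hypothetical, and `TRACKED_MEDICATIONS` contains only the four medications named in the talk, since the full list was on a slide that is not in the transcript.

```python
from collections import defaultdict

# Only the medications named in the talk; the full slide list is not available.
TRACKED_MEDICATIONS = {"ampicillin", "gentamicin", "caffeine", "surfactant"}


def actively_treated(meds_given: list[str]) -> bool:
    """Proxy: a baby counts as actively treated if any tracked medication was given."""
    return any(med.lower() in TRACKED_MEDICATIONS for med in meds_given)


def active_rate_by_gestational_age(patients):
    """patients: iterable of (gestational_age_weeks, meds_given) pairs (hypothetical layout)."""
    totals = defaultdict(int)
    treated = defaultdict(int)
    for weeks, meds in patients:
        totals[weeks] += 1
        if actively_treated(meds):
            treated[weeks] += 1
    return {weeks: treated[weeks] / totals[weeks] for weeks in sorted(totals)}


# Made-up records purely to show the shape of the calculation.
example = [
    (22, []),
    (22, ["Ampicillin"]),
    (24, ["caffeine", "surfactant"]),
    (24, ["Gentamicin"]),
]
rates = active_rate_by_gestational_age(example)
print(rates)
```

Because the proxy depends on medication data being submitted cleanly, it inherits the medication mapping issues mentioned earlier in the talk and will tend to undercount active treatment at sites with poor mapping.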
So I think, you know, just any sort of medication was actually a pretty good definition that we could use. And then looking at our more recent cohort, the 2019 to 2023, you can again see in the 22-weekers, it was only 10%, and it's now gone up to 20% of the time people are trying to actively resuscitate. And again, this is in academic and community centers, so like a bigger catchment area in Cosmos. But this led me to believe that, you know, we could confidently use this definition in Cosmos to give us some information on trends over time. And then the question also is time of death in Cosmos. So knowing that dates and times are slightly date shifted, how reliable was time of death? So at 21 or 22 weeks, a lot of times if they don't survive, the trend is going to be that they die within the first 12 hours of life, which is what we see. They're too immature, they're just not going to survive. And you can see those are the biggest bar graphs there. But then as they get older and get less premature, the red is the survived to one year of age, and you can see that that's getting better and better. So based on the numbers, a fairly reliable definition that we can even use for time of death as well. Now, I don't know that I would use that all the way down to the minute or second and trust it, but at least in the hours. Because it depends on when you hit, in Epic, discharged as deceased, right? Like the timestamp might be slightly off, but at least it kind of gives you a general idea that it seems to be trending in the right direction. So real quick on the future. So this is where I already kind of talked about how in SlicerDicer, I can see the state-level data. So I can see that Tennessee for tiny babies is only submitting 91 patients because Vanderbilt's not on it. But in Iowa, I have 833 patients. So I can see the state-level data. There's more data coming out of Iowa.
But this potentially might change in that data science side in the future. With the aggregated data, you can still see a little bit of state-level data. And you can see here the shading, this is the patients with birthing parent information. You can see which states have contributed more data in that data model in SlicerDicer. Ohio is beating everyone, apparently. But Epic is also trying to, you know, help with the data quality. So they're putting out data quality dashboards, things like the deceased patients, like I talked about, do they have death dates? Do they have a status of deceased? So you know when you're choosing these areas in Cosmos, how good is your data quality? So they are trying to put out dashboards so you can more educatedly make that decision of, do I want to use this criteria in my queries? So summary, well, we had questions at the beginning, but I guess we ran out at the end. Cosmos is not perfect, but they're open to user feedback. And they're, you know, I think continuing to improve. Epic Research is out there. I think our researchers should just be mindful of that and look and see what they're doing. But I do think that this has the real potential of helping with some of those large consortium databases and answering some of the questions that we want to answer. And I hope I showed you guys with the small babies that we are validating some of what we're seeing in Cosmos so that hopefully we can ask bigger and better questions using Cosmos data in the future. So that's it. We're at the hour or a little past. We probably should officially stop, but Lindsay, would you be happy if people emailed you if they had a question, or stuck around and asked a question? So Lindsay is incredibly helpful. She's been really impressive, I think, both in the set of research you were doing, but then also the power of the tool that you demonstrated. We're going to keep trying to get it here at the end of the month.
I promise this is one of our goals and then get that 86 number up to 500 or so. Right. Yeah. Awesome. Thanks, Lindsay. Sure. Oh, yeah. Are we talking to people? Wow. Sorry. Further questions in the chat? There actually weren't. Hopefully I answered them all. Yes. Yeah. Can I ask you a question? I thought it was interesting when you were talking about that situation where it turned out that, like with the Sidekick, only 5% of the 23-week babies were intubated or something. So you were a neonatologist, and that rang wrong to you. If somebody is not a doctor or is a doctor but not a neonatologist, how do you develop that sense or what could they do? Well, so I think that's part of why they haven't rolled it out to everyone yet. Okay. And they're like, Lindsay, you use this and you give us feedback. But that is some of my fear of these large language learning models. I do think it'll help us build models faster. And people who aren't maybe as savvy at SlicerDicer, it can give them ideas. But I think it's, you know, especially as a data scientist, you have to really be even more mindful about fact checking and saying, I might not have the content expertise. But maybe I can pull up a paper or see like, what are the percentages I should be thinking? Yeah, I love that. I mean, you could Google what percentage of the babies are intubated. Outside of my office, I would shout questions about neonatology to her. So you'd say, Lindsay, 5% of 23-weekers are getting intubated. Does that seem reasonable to you? I try to spend as much time as I can at the hospital. I used to go walk around the NICU with Chris Lieben, and I probably started to realize that a lot of little babies had like tubes sticking out of their mouths, and more than 5%. So that might feel wrong to me just from having hung out at the hospital. So if you were not...
If you're like a PhD informatician, like some of you guys are or will be, or if you're a doctor but not that kind of doctor, spend time in that setting and you can develop the same spidey sense as Dr. Knaik. Question: what's the recommendation for surfactant in, like, what age of premature baby? Yeah, good question. So it's more clinically based on the baby. So the other thing I didn't get into is if the mom is given steroids for delivery, a lot of times their lungs will mature faster and maybe the babies don't need surfactant. But as a lot of moms go into preterm labor really quickly and the babies just fall out, that doesn't always happen. So it's really based on how the baby looks. I will say 22-weekers at the edge of viability, like a lot of them end up getting surfactant. So that's why I was kind of using this as, are people actively treating them. But in neonatology, there's a whole ... There's different ways to give surfactant. So if you intubate the baby, you can give the surfactant through the breathing tube. But there's also more non-invasive ways of giving it as well, where you, you know, you try to like spray it in and hopefully it gets mostly into their lungs and doesn't go into the stomach. But so it's an evolving area of how people are giving it as well. There's not like a... There's not a like, everyone less than 24 weeks is going to get... There's not... for sure that way, but I would say, again, clinically practicing, most babies in the 22 to 24 week range are going to be pretty immature, so a lot of them are going to get surfactant. And the paper that I pulled up was like, you know, surfactant was the number three most common med given in 22 and 23 week. So you just know it's very commonly given in those babies. Do you have a strategy to find the data quality by site? For example, if I want, it's going to be a really bad practice, but if I want to study and I know that, for example, a site is a community center.
They don't do the mapping right yet. Yeah. Can I exclude that? Can I find the sites that have, maybe, like, problematic medication mapping so I can exclude them in this current study?
So Epic can. And part of what Epic is doing is they're actually planning on giving that feedback to those sites. Like, we've noticed on our data quality monitoring that you're much lower than everyone else in this area. Can you look into that? Is there a reason? And actually, they kind of asked me on the neonatal side. They were like, oh, we were planning on telling people that if babies stay in the NICU for over two months, maybe their data quality is wrong. And I was like, no, we have babies who stay in the NICU for a year. Don't use that as your cutoff. So they're asking content experts, like, what should be the cutoff for my data quality question? I think in Cosmos you'll be able to see your own site, to see how your data compares to other people's. But I'm not sure. It might be a question for Epic, of whether they will have that data and can give it to you, depending on your research question.
When you join Cosmos, there's, like, a burn-in period where your data goes to Epic, but it's segregated. It's not mixed in with the rest of the data. You see the dashboard and they say, your lab mapping is only 82%; we require you to be at 95%. Yeah. To join Cosmos, you have to sit there and map labs until it exceeds a certain threshold. So there is a minimum data quality threshold to join Cosmos.
Well, I was surprised the medication did not make that. Maybe they're working on it. I think it's a grouper issue. So the problem was, like, ampicillin. Like he said, there's little vials and ampules and all kinds of different ways to represent the same medication, and maybe the neonatal vial is different than the adult vial with the percentage. So they process the map. Yeah, but if it's half and half, then I think I could do it. Let's do it. Exactly.
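As a rough illustration of the site-exclusion idea raised in this exchange, here is a minimal Python sketch that splits sites by a minimum mapping percentage. Everything in it is an assumption made for illustration: the `SiteQuality` record, the `split_sites` helper, the site names, and the numbers are hypothetical, not a Cosmos schema or API. The 95% lab figure echoes the onboarding threshold mentioned in the discussion; the medication threshold is invented, and, as noted above, whether researchers can see per-site quality at all is a question for Epic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteQuality:
    """Hypothetical per-site data quality summary (not a Cosmos schema)."""
    site_id: str
    lab_mapping_pct: float  # percent of lab results mapped to a standard
    med_mapping_pct: float  # percent of medication records mapped


def split_sites(sites, min_lab_pct=95.0, min_med_pct=95.0):
    """Return (kept, excluded) site IDs based on minimum mapping percentages.

    The default thresholds are assumptions for this sketch only.
    """
    kept, excluded = [], []
    for site in sites:
        meets_both = (
            site.lab_mapping_pct >= min_lab_pct
            and site.med_mapping_pct >= min_med_pct
        )
        (kept if meets_both else excluded).append(site.site_id)
    return kept, excluded


# Made-up example values, for illustration only.
example_sites = [
    SiteQuality("site_a", lab_mapping_pct=97.0, med_mapping_pct=96.5),
    SiteQuality("site_b", lab_mapping_pct=82.0, med_mapping_pct=99.0),
    SiteQuality("site_c", lab_mapping_pct=98.0, med_mapping_pct=50.0),
]

kept, excluded = split_sites(example_sites)
print("kept:", kept)
print("excluded:", excluded)
```

Excluding sites this way changes who is in the study population, so any real analysis would want to report which sites were dropped and why.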
We were talking about the fact that some customers use First Databank and some use Medi-Span. And so there's got to be some mapping they're doing. And it seems like maybe that's leading to trouble for certain meds, especially when it's silicate or mixtures are diluted. Yeah.
It's a general question. So I guess there's pros and cons to Cosmos, right? So I'm guessing the way you think about whether you use your own data, which is more like identified, versus using Cosmos, is it just based on how common, you know, whether you're looking for, like, a really rare outcome or a small patient population? Is that the one thing that goes into how you think about it? Or is there something else that helps guide you, like, I should try to do this in Cosmos?
Yeah, yeah. So the way that I first think about it is, would it be useful for me to answer this question, like, multi-center, on a large scale, right? But then you also have to think about, for example, rare diseases, which could be a great example. But what if it's a rare disease where no one ever gets the ICD code correct? You know, potentially that's where some of those rare disease networks might have better data, because they're like, oh, this rheumatologist always submits their patient data to the rare disease network because they know the patients have truly been diagnosed with this disease, versus how good is the ICD code going to be in Cosmos? So you have to be mindful of that. But I think that's where a lot of what I've done is just exploratory at first, so I can start learning what the pros and cons of the Cosmos database are. And I'm especially exploring in the neonatal space so that I can learn about it and know what
questions are going to be the most useful for me to answer with Cosmos.
Um, they're mostly big, like, epidemiological kinds of questions, right? Because you can't get data of, like, this population, then this intervention happened, then, you know, like, a before and after, necessarily. Or unless, eventually, you can sequence it?
Yeah. Oh, all right. Yeah, yeah, yeah. And it's just, you have the caveat that you might have some patients that don't have fully complete before-and-after data, but you can. Yeah, it's cool because you can do a lot of insights here. It actually has sequencing rules, but whenever you can't accomplish insights, you can try to accomplish it in Python or something like that. I think it'll be really cool to see how this tool evolves and how different researchers use it. But I think, I mean, Vanderbilt getting on is great, because you need people that understand both the clinical informatics side, how data gets into Epic, to know the caveats of that, but also the data science, machine learning, every other piece as well.
What is the training like?
It's pretty easy. I mean, the super user training is just online, right? The online training. And then the data science training, at first it was in-person only, but I think now they're doing a lot of virtual options. It's, like, two days of data science training. And as long as you've done the SQL prerequisites, it's pretty easy. And they just show you some examples of why sometimes death date isn't the most accurate, and, you know, just thinking about stuff like that.
IRB review, right now, since we're doing everything retrospective, right, it's easy to just put in a retrospective, like, Cosmos umbrella IRB sort of a thing, for, like, if you want to publish it.
I mean, I think every IRB is different in how lax and easy it is, but I haven't had too many problems with the IRB yet, because it is all retrospective, right? And you're just saying, I'm using this software that's already been vetted by our institution, and it's not like we're ever going to consent.
So we tried to educate the IRB on Cosmos before we got it here. At University of Texas, somebody was telling me that they have just a single IRB for all of their Cosmos studies. It's, like, pre-approved. And I don't know if you get that here, but I hope it's going to be good.
Yeah, it would be an easier way than people having to put in, like, separate IRBs.
Someone asked if there's laboratory data on Cosmos and if it has the ability to develop or extract structured data.
There is laboratory data on Cosmos. I don't know that it has, like, every component of a urinalysis or something, or, you know, it might not have every single lab component that's available in your hospital system, but they have a good amount of lab data in Cosmos. I don't think they've started doing structured note data yet, because that would just be a lot of additional data to upload into Cosmos. Potentially in the future, but it's not there yet. But, so, like, our gestational age and birth weight is kind of like a form that's filled out a lot of times, and then that pulls into our notes, right? As long as that form gets filed to a flowsheet, right, then that's structured data that can get out. So if there's common things that all Epic institutions are using, you could potentially get those data. And I think they're starting to think about, you know, some patient forms like the PHQ-9 or depression screening, right? Some forms that probably everybody has a similar data structure on, that they could get into Cosmos.
And Lindsay mentioned this, but the foundation of Cosmos is Care Everywhere, which is Epic's health information exchange tool.
And so people have already done a lot of work to standardize lab tests and mapping, some of you have done the mapping that talk to you about them, uh, for the purpose of interoperability, and Cosmos takes advantage of that. But they may ask you to do additional mapping, instead of tillings point. So your hospital may already have a head start on the mapping. It's just like Care Everywhere.
Oh, I got multiple questions from people who said, like, how can you access Cosmos at Vanderbilt? I want to use Cosmos. They weren't completely paying attention to my introduction, but, um, I'm going to use them as examples when we send our memo. Okay.
It sounds like typically it's, like, working out the legal issues and then also just the mapping. Is that what you need to do?
Yeah. No one here is mad about the mapping. The legal issues are part of it. I just used it on the phone.