[Lecture 5] Understanding Rowhammer Vulnerability and Mitigation

Are we ready? Good. Hello everyone, let's get started. Today we do not have Professor Mutlu, so I'll try my best to substitute. I'm Giray, a researcher and lecturer in the SAFARI group. I recently got my PhD on memory robustness, so I'll talk about the rest of memory robustness today. Professor Mutlu already covered the essentials last week, but we will elaborate more on the story of memory robustness, today's concerns, and tomorrow's concerns.

Our main focus will be on RowHammer, but we can extend the problems we discuss for RowHammer to other vulnerabilities in memory devices. RowHammer is something that is used by a lot of security exploits. You can use RowHammer to take over a system, read data that you don't have read permission to, break out of virtual machine sandboxes, corrupt important data, change critical workload behavior, or steal secret, sensitive data like your SSH keys, for example.

The RowHammer story pretty much starts with this particular paper from ISCA 2014. It was from SAFARI: "Flipping Bits in Memory Without Accessing Them." We will talk about this paper in detail later in this lecture, but I'll first talk about the immediate impact this paper had in the field, which is a work from Google folks that exploits the inherent RowHammer vulnerability that DRAM chips have to gain kernel privileges in Linux systems. It's from the Project Zero team at Google.

This is an assembly of our slides merged with their slides, so this is what that work talks about; you can see the reference at the bottom here. RowHammer is a problem with some recent DRAM devices in which repeatedly accessing a row of memory can cause bit flips in adjacent rows. This is what we showed in
the ISCA 2014 paper, right? So they already cite that paper. They then test a selection of laptops that use some modern memory devices of that time, and they found that the problem exists in some of them. They built two working privilege escalation exploits using this effect, and they basically end up exploiting RowHammer to gain kernel privileges and take over a system.

How they do it is that they use the RowHammer-induced bit flips to alter some critical bits in a Linux system, which gives them access to regions that an unprivileged process should not have. When they run this on a machine that is vulnerable to RowHammer, they can actually flip bits in page table entries (I will talk about this in detail), and this allows them to gain read-write access to the whole physical memory.

The following slides are borrowed from the Black Hat presentation of Google Project Zero. They define this as a page-table-targeted kernel exploit. Page table entries are dense and trusted at the software level, but when we exploit physics, software-level trusted data structures might not be that trusted. So they aim to get access to a page table, which gives access to all of physical memory. They maximize their chances by spraying physical memory with page tables, and they check for useful and repeatable locations in memory that are vulnerable to RowHammer.

Okay, this is a recap of your computer architecture, virtual memory, or OS knowledge. We have what we call a page table. It's often a 4-kilobyte page containing an array of 512 entries, and this is what we have inside every entry: a physical page base address and then some control bits. So, for example, we have
a read/write permission bit here. If we want to flip this bit so that we get write permission through a page table entry, then, assuming that the chance of a bit flip is uniformly distributed over the bit locations in this page table entry, flipping this permission bit has about a 2% chance, right? So they do not do that. What they do is flip bits in the physical page base address, so that you have about a 31% chance of hitting one of those bits, and your page table entry then points to a different location, which you shouldn't be able to do.

Let's see how it works. You have a virtual address space in your machine; from software's point of view, you have an infinite memory, right? And you have a limited physical memory. When you allocate pages in the virtual address space, they point to some physical memory locations. To access data, you access a virtual address; for that virtual address, the operating system, on your behalf, accesses some page table entries, and with the redirection from those page table entries you reach the correct physical memory location.

What happens if you allocate different places in your virtual address space that point to the same physical memory? You have multiple of these page tables, and they keep pointing to the same physical address. This happens when you use shared libraries, for example. What they do as part of the attack is keep allocating a lot of pages in the virtual address space that point to one physical page. If you do this aggressively, you can fill up pretty much all of the physical memory with page table entries.

Let's say you access this virtual address, it goes to this page table entry, and it points to here, right? Now let's say you have bit flips in this page table entry, in the physical page number in this
page table entry. Then it goes to a different location, and if you filled up all your memory with page table entries, the new location that this points to holds some other page table entries. Since you allocated this memory with read-write permission already, you now have write access to that page table. Now you can change the data inside it, and the funny thing is that you can make it point to any arbitrary location in physical memory, right? What you do later is access this other virtual page. It points to this page table, because the system thinks the original page table is there, and since you altered the information inside it, you can access an arbitrary location. You can write the whole memory, and by doing that you can actually take over the system.

This is just the summary of what they did. They allocate a large chunk of memory and search for locations prone to flipping. Go ahead, yes.

Okay, maybe I need to repeat the question, right? The question is: can we limit the number of page table entries so that we can mitigate this kind of attack? You can do it to some extent. You still need virtual memory support, so you cannot completely eliminate page tables, but limiting their number effectively limits the number of physical cells you can exploit in physical memory, right? And they actually have some improvements on top of the basic attack. You can say that you cannot create more than N page tables. Of course it has performance overheads and all that, but I'm not getting into that. You can still find the vulnerable bits inside physical memory, and then you can target your attack there using a smaller number of page tables. They actually show that in this presentation. I'm not sure if I have those slides here, but if you go
to the original presentation, they explain all of that. So they check whether the bit flips fall into the right spot, meaning: do we have the exploitable bits in that page table entry? Then they return that particular piece of memory to the operating system, so that they force the operating system to reuse that location for page table entries. This is actually sort of an answer to the previous question as well. They cause bit flips in the page table entries, they change the address, and then they abuse read-write access to the whole physical memory.

Okay, we have some of these fancy images throughout the slides. RowHammer has been quite popular in the security community, and there are all sorts of different RowHammer images; you can see them here and there. This one is more of a popular-science description of RowHammer. It's an analogy, basically: it's like breaking into an apartment by repeatedly slamming the neighbor's door until the vibrations propagate through the wall and open the door that you're after.

After this first exploit that Google Project Zero showed, of course the RowHammer vulnerability inside modern memory chips is something very useful to have, essentially an asset or a tool. It provided the necessary functionality for security folks to exploit it in many ways, right? The first exploit used some C++ code. Then security experts figured out a way to craft JavaScript code that does the same thing. What does that mean? You just visit a web page, it runs JavaScript code, and your computer is compromised, right? So it's quite interesting. They have ways of hammering and rooting Android devices, because those also use the same technology, right? There are many works in the field that use different components in the system to make
your attacks more effective, faster, and more efficient. There are works that show that you can use GPUs to access main memory and accelerate the way you induce the bit flips, so you can conduct your attacks in a much smaller time window.

If you remember the code that we showed last week, it's basically: you access two memory locations, and then you have two clflush instructions that flush those data from the cache to main memory. For a while, people were thinking that if you disable clflush in the system, then there's no vulnerability. But there are papers that show that you can use memory accesses that are not cached at all; you can directly access memory. For example, these papers use network packets. In your system you have a network chip that connects to the processor, and it goes through the direct memory access engine to go directly to memory, right? You don't put those data into the cache, so you don't need clflush for this. So you can use network packets to attack even a remote machine. These works show that, and here is another network-based RowHammer attack.

This one is quite interesting. The experimental characterization data of the RowHammer vulnerability so far showed that the bit flips are repeatable in certain locations and that they are also quite sensitive to the data pattern you put inside those DRAM cells. What data, one or zero, you have in a DRAM cell and what data you have in the neighboring DRAM cells matter so much that you can craft the data pattern to focus your attack on particular cells. This work is called RAMBleed. They read bits in memory locations that they don't have access permission to, without accessing them. What they do is initialize their own memory space with some particular
data pattern, and then they keep hammering the row that holds the sensitive data by accessing other columns in that row that they are allowed to access, and they observe the bit flips in their own data. Since we know that the data pattern sensitivity is quite significant in RowHammer, they can actually read your RSA key, basically the key that you use to SSH into some machines. So it's quite scary, and it was published in 2020.

There are two works, actually, that came out in 2019 and 2020, which show that you can induce bit flips, again in a guided way, in the weights of neural networks. It was the top of the hype, or maybe not hype, of deep neural networks, right? They are quite popular; we use deep neural networks in many workflows. And then they come and say: I can flip bits inside the weights of your DNN and reduce your DNN's accuracy from 90% to 10%, so that it's pretty much useless. If you're using this DNN to self-drive your car, for example, then you can fail to detect a pedestrian, as if the pedestrian is not there, and you can hit them. There are significant safety issues that you can guess here.

There are works that again use different components. This particular attack uses the FPGA. In many systems, processors are connected to an FPGA for acceleration purposes, right? You can use those components to accelerate your attack and induce bit flips much faster.

In 2021, Google again came up with another attack, called Half-Double. It was a security blog post, and then it became a paper at a top security venue, USENIX Security, in 2022. It basically shows that you can induce bit flips even more easily by hammering rows that are not directly adjacent to the victim but at a distance of two rows. I just wanted to mention that.

Okay, here's another story from the
field. If you go and watch the presentation of Google Project Zero that I mentioned in the beginning, at the end of the presentation they talk about three mitigation methods. One of them is error-correcting codes. How many people here have heard about error-correcting codes? Oh, okay, quite some. Nice, so you are familiar. If there's a bit flip in your data for some reason, you can use error-correcting codes to correct it, right? They come with some parity bits, and they have limited capability. Let's say, in a 64-bit data word, some ECC schemes can correct one bit, for example, or detect two bits. More sophisticated ones add more parity bits, so you keep consuming your system's resources to be able to correct more bits. There's a whole trade-off space there. A quite popular one is the single-error-correction, double-error-detection (SECDED) Hamming code that is used in many systems.

Before this paper, there was a general understanding that, if you have ECC, you can correct RowHammer bit flips, because they happen here and there and are not very concentrated. So ECC solves RowHammer. This paper in 2019 showed that ECC does not solve RowHammer. In an ECC-protected system, you can reverse engineer the ECC and guide the bit flip locations in your DRAM cells in a way that causes silent data corruption. Silent data corruption means that your ECC does not even know that there's a bit flip there.

Later on, there has been some work from DRAM manufacturers: they implemented some mechanisms inside their DRAM chips, and at some point they claimed that their DRAM chips are RowHammer-free. And then of course, as researchers, we investigated those. I'm just showing this paper here,
but I will talk about it in detail later on. I just wanted to say that a key result from this paper is worth mentioning here: in modern DDR4 chips with RowHammer protection, you can do the reverse engineering, you can craft access patterns that induce bit flips, and you can induce up to seven RowHammer bit flips in an 8-byte data word. So in 64 bits you have seven bit flips. To protect against this kind of error rate, you really need to spend a lot of parity bits. So today's ECCs are practically ineffective against RowHammer.

There are many security implications; I tried to cover some of them. They reach a point where you actually hate your DRAM chip and you want to smash it with a huge hammer.

Okay, there are some works from Professor Mutlu that give a retrospective survey and a forward-looking perspective: papers that discuss the current and future problems and which research problems to work on. This is one of them, a retrospective paper from 2019. There is an updated version of this that we published in 2023, and another very short retrospective paper that was invited for the 50th year of the ISCA conference. ISCA is one of the top conferences in the computer architecture field. You can recognize the title of the paper; this is a recognition of the 2014 paper because of the impact it had in the field. It is a short summary of that paper, the paper's impact, and what happened until the 50th year of ISCA.

I also wanted to mention here some recent advancements in the RowHammer field. This is just foreshadowing; we will talk about all of these later on. Samsung in 2023 published a paper on their in-DRAM
mitigation mechanisms, where they explain those mechanisms, and about one year after that there's a Google paper that breaks that as well. I want to show this paper as well, as it is very relevant. It was published just two months ago at the USENIX Security conference, one of the top security venues. This is actually from ETH, as you can see from the names, and it shows that you can even get bit flips in DDR5 modules, which are the state of the art now. So the problem is still there. And I want to promote my thesis again, which has a summary of these and also includes some cutting-edge research projects.

Okay, I'll continue with understanding RowHammer. We did a lot of experimental characterization, and I'm going to cover all of that now. But before going to that, are there any questions? Yes.

All right. So the question is: does the attacker need to run code on the victim system to induce bit flips, right? Yes and no. Isolation is a good solution for security in general, and for RowHammer as well, but it comes with some practical costs. Can you point to a system that never connects to the internet, for example? There are systems like that, of course, but for other functionality you usually want everything to be connected. Yes? Yeah, that's where I was going. Let's say we are talking about a computer: you open a web page that runs JavaScript code, and you have code running on your system. If you have a network connection to some system, you can send network packets and do network-based attacks. Of course, if you isolate everything, then no malicious user can run any program on your system, and security-wise you might be safe.

I have to mention, though, that the RowHammer vulnerability is getting worse. We will see later on that fewer and fewer row activations cause bit flips nowadays.
So it actually constitutes a reliability issue as well. When we look at benign workloads in the field, they already, unintentionally, keep hammering DRAM rows. In a refresh window, for example, you can see some benign workloads that reach activation counts of several hundred, even close to 1,000 for some particular workloads. At the same time, you can now induce bit flips with several hundred activations, so you might have unintended bit flips popping up in your system as well. That's another issue. Other questions?

Okay, let's go on with the understanding-RowHammer papers. The first one is "Flipping Bits," the paper I referred to that was published in 2014, which ran a lot of tests on real DRAM chips. This is a picture of the infrastructure that we used for that set of experiments. As you can see, in a plexiglass container there are a bunch of FPGA boards, with DDR3 DRAM modules connected to them. It's DDR3 because these tests were conducted between 2012 and 2014, and that was the DRAM of that time. There is a host machine connected to these FPGA boards that runs the tests, collects the experimental data, and processes it. We also have a commercial temperature controller here: we set the temperature, and it drives a heater in these plexiglass containers, which are essentially heat chambers, so that we can test our chips at different temperatures in a stable way.

In this work we tested 129 DRAM modules manufactured from 2008 to 2014, from three major manufacturers, and you have all the information about them in the full paper. To summarize the characterization results: essentially, most modules are at risk, meaning that we can induce bit flips in these modules. Based on that data you have this trade-off: you either need to use
vintage DRAM modules, essentially older DRAM chips, or you have to deal with errors. In newer DRAM chips, because of technology scaling, DRAM cells are smaller, they hold a smaller charge, and they are closer to each other, so this interference, this disturbance effect, is exacerbated. A key point is adjacency: your aggressor row and the victim row should be physically close to each other, if possible adjacent, because the disturbance gets stronger as they get closer. We have some sensitivity studies and results in the paper, and we also discuss what kinds of solutions can be used to mitigate such bit flips.

This is a nice plot. On the x-axis we look at the distance between two rows, the aggressor row and the victim row. A distance of zero means the aggressor row itself, so we keep hammering the rows at location zero here. On the y-axis is the bit flip count, so you see how many bit flips appear in each row. As you see, most of the bit flips are concentrated in the immediately adjacent rows, and you can also have some bit flips in nearby non-adjacent rows. So most aggressors and victims are adjacent to each other.

Next we look at the effect of the access interval of the aggressor, basically how frequently you need to hammer. The highest frequency at which you can activate a row is one activation every 55 nanoseconds. This is a timing constraint imposed by the DRAM communication protocol, DDR3 at that time: you cannot activate a row more frequently than every 55 nanoseconds. If you make the activations very frequent, you get the highest number of bit flips; if you reduce the frequency, you get fewer and fewer bit flips, and the count reaches zero at around 500 nanoseconds in this plot. Of course, this is data from DRAM chips from 2008 to 2014.
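To put the 55-nanosecond limit in perspective, here is a rough back-of-the-envelope sketch in Python of how many activations fit into one refresh window at a given access interval. It assumes the nominal 64-millisecond DDR3 refresh window and ignores refresh commands and all other timing overheads, so it gives an upper bound rather than a measured number. The function and constant names are made up for this illustration.

```python
# Back-of-the-envelope sketch; all names here are hypothetical.
T_RC_NS = 55.0      # minimum interval between two activations, as quoted for DDR3
T_REFW_MS = 64.0    # assumed nominal refresh window (an assumption, not a measurement)


def max_activations(access_interval_ns: float,
                    refresh_window_ms: float = T_REFW_MS) -> int:
    """Upper bound on row activations that fit into one refresh window."""
    if access_interval_ns < T_RC_NS:
        # The DRAM protocol does not allow activations closer together than tRC.
        raise ValueError("access interval is below the minimum activation interval")
    window_ns = refresh_window_ms * 1_000_000  # 1 ms = 1,000,000 ns
    return int(window_ns // access_interval_ns)


if __name__ == "__main__":
    for interval_ns in (55, 100, 250, 500):
        count = max_activations(interval_ns)
        print(f"{interval_ns:>4} ns -> at most {count:,} activations per window")
```

Under these assumptions, the 55 ns limit allows a bit over 1.1 million activations per refresh window, while a 500 ns interval allows 128,000.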
Right. So nowadays, if we regenerate this plot with modern DRAM chips, these slopes will be much flatter, because at larger access intervals we still have a lot of bit flips. Okay, go ahead.

Ah, because basically you send activations to the aggressor row, right? Here we look at the sensitivity: how frequently we can send activations and how many bit flips we get by doing that. There is an upper limit on the frequency because, once you send an activation command, the memory controller does not allow you to... well, in this case we don't have a memory controller, actually, but the DRAM communication protocols mandate some timing constraints. After you activate a row, there's a time window for which you need to keep that row active, and then you can close the row. After sending the closing command, which is precharge, you need to wait for the latency of that precharge operation as well before you send the second activation. So the minimum time window between two consecutive activations is around 55 nanoseconds. We are looking at realistic cases here: you basically cannot send activations less than 55 nanoseconds apart. If you go to any system, it has its own memory controller, and all these timing constraints are enforced in that memory controller. So it's not relevant to look at points below 55 nanoseconds. In this work, that is; later on we looked at that as well.

Yes. In DDR4, if I remember correctly, the timing constraint is around 46 nanoseconds. Maybe Ismail can confirm: tRC for DDR4? I think I also saw 46.2, something like that. Yeah, so it basically varies from chip to chip as well, or based on the generation of the chip, but it's in the same ballpark.

Not necessarily. Yes, you can send activations more frequently now, but the reason that it gets worse is that the physics
gets worse: the DRAM cells are closer to each other, so one activation creates more disturbance than before.

We also looked at the refresh interval. This is the nominal refresh rate. Since the DRAM chip is volatile, you have to refresh the data inside the DRAM chip every 64 milliseconds, right? Based on this experimental data, you can see that if we refresh our chips more frequently, we have fewer bit flips, and if we refresh them very frequently, maybe we can get rid of these bit flips. But that immediately turns out to be quite impractical, because refreshing a row also takes some time, and you have many rows to refresh. When you add up those times, the total needs to be smaller than your refresh window, and you quickly reach that point. Based on this data, you need to increase your refresh rate by seven times, which is not practical, and you need to refresh even more frequently in today's DRAM chips.

The data pattern also plays a significant role here. If we initialize the whole DRAM row with ones or with zeros, we have some bit error rate, of course, but when we go for an alternating data pattern, we increase the bit error rate quite significantly.

We didn't see any correlation between the victim cells of data retention bit flips and RowHammer bit flips, so it seems that these are different error mechanisms. RowHammer errors are also repeatable: if you observe a bit flip at a particular location, it's very likely that you can observe a bit flip at the same location in the future as well. So you can exploit those bit flips and craft your attacks accordingly.

In this work, the observation is as many as four errors per cache line; in 64 bytes, let's say, you have four bit flips. But later on, as I said, in newer chips we have seven bit flips in 64 bits. And for a given
cell, you can see that it's affected by the two aggressors, one on each side of the victim cell. This was the 2014 paper, and in 2020 we repeated... sorry, okay.

So the question is: what is the reason behind the data pattern dependency? So far, based on this, I can just say that it's an empirical observation, right? But you can simulate the physics in your mind a little bit. Once you initialize a row with ones or zeros (you can do the encoding either way), in one case you have a high voltage in one capacitor and a low voltage in the neighboring capacitor, and that creates some tension, basically some electric fields. This whole thing is based on electrons moving around and getting trapped in certain locations, exacerbating leakage and reducing the data retention time. So far, I guess I can explain it like this.

Okay, so this was the 2014 paper. What slide is this? Okay, I'm slower than I was planning to be, but I'm going to cover this quickly as well. I'm hoping to reach slide 75 before we take a break.

RowHammer is getting much worse. We repeated this experimental study in 2020 and extended it to some different sensitivities as well, and what we observe is that it's getting much worse. That's not very surprising at this point, because I keep saying it. There are some works from industry as well. This is a paper on which we collaborated with some Microsoft folks, which looked at the RowHammer vulnerability of the machines in their servers. It turns out that when you ask the question "Can we guarantee that our systems are RowHammer-free?", it's very hard to answer, because you really need to conduct rigorous experimental characterization.

We also looked into different aspects of the sensitivities of the RowHammer vulnerability. In this work in particular, we looked at temperature, memory access patterns, and the spatial variation
within the DRAM chip. It turns out that RowHammer has multiple sensitivities, and it is not straightforward to find the worst case. We also looked at the effect of voltage scaling on RowHammer. We will cover all these works in time.

We also tested some HBM DRAM chips; hopefully we will talk about this in detail tomorrow. We show that it's not only DDRx, like DDR3, DDR4, and DDR5: the LPDDR series is also vulnerable to RowHammer, and HBM chips are also vulnerable to RowHammer. Understanding the vulnerability of HBM chips is quite relevant today, because, as you know, we have all these advanced AI models and workloads that run on GPUs and similar devices, and all of them essentially use high-bandwidth memory chips, the HBM series. There was not much characterization data in the field, so we did that, and HBM chips are not much better than DDR. So your LLMs can be at risk as well.

We will talk about pretty much all of these papers in this lecture and the following lectures. This HBM work is quite recent; it was published at DSN, which is a top conference for dependability and reliability, last summer. The spatial variation study that I will talk about tomorrow was also published earlier this year.

There are some RowHammer solutions, so we are not hopeless in that sense. The ISCA 2014 paper defines some immediate and long-term solutions. The immediate solution is basically that you update your firmware somehow to mitigate those bit flips, right? You aim to protect vulnerable DRAM chips in the field with minimal changes to hardware. In the longer term, we focus on improving our hardware inside the DRAM chip and inside the processor, so
it has a wider range of protection mechanisms, but it's not immediately applicable to the chips in the field. The ISCA 2014 paper proposes seven solutions that I will cover. Later on, when we classify the solutions in the field, we can see the following. You can have more robust DRAM chips, basically; you can improve the physics, which is quite expensive and time-consuming going forward. This is one way of going there. The second way is that you can increase the refresh rate. Basically, the voltage in your cell goes lower over time because of charge leakage, because DRAM is volatile. If you refresh it more frequently, it has less time to leak, so by doing so you can hope to avoid RowHammer bit flips. But when you do the math, you can see that it's not very practical today. You can introduce some physical isolation between the physical DRAM regions of a potential attacker and a victim. It's a somewhat practical way, but it requires a lot of isolation rows in between, so you sacrifice your memory capacity. You can observe the memory access patterns and detect that one particular row is rapidly and repeatedly being activated, so it's being hammered; then you can go and proactively or reactively refresh the potential victim rows around it, which mitigates the bit flips. Or you can even throttle memory accesses, so that you do not access memory at a high enough frequency to induce those bit flips. Of course, all these methods have their own cost, power, performance, and complexity. It's a huge trade-off space, and you need to find your solution based on the trade-offs across all of these.

Following our 2014 paper, Apple released a security patch that essentially increases the refresh rate, because that's pretty much what you can do in the devices that you have already shipped. I guess Lenovo also had some
security update, but I don't have the slide for that here. Our solution in that ISCA 2014 paper, which is pretty much the only solution that survived until today, is PARA, Probabilistic Adjacent Row Activation. It's a very low-cost mechanism, and the key idea is as follows: when you close a DRAM row, you refresh its physical neighbors with a low probability. This is the probability threshold over there. You can play with this number, of course, to make it more aggressive or less aggressive. Based on this probability and the characterization data in that paper, the probability of your getting bit flips in a year of hammering is this much, and by adjusting this value you can play with its aggressiveness. No, it's for hammering, yeah.

So the advantages of PARA are that it refreshes rows quite infrequently, so it has low power and performance overheads, and it's stateless, so you don't have to maintain a lot of metadata about your memory access patterns. It's a very low-cost mechanism, you can implement it very quickly, and it basically doesn't consume your resources. So it's an effective and low-overhead solution to prevent disturbance errors. If you implement it inside the DRAM chip, which is done today, you basically need enough time slack to perform those refresh operations, and there's plenty of slack today, as we showed in several other papers, some of which we might cover in the following weeks. If you implement PARA inside the memory controller, which was also done by Intel at some point (I have a slide about that), you essentially need some coordination between the memory
controller and the DRAM chip, because you need to know which row is next to which row in the physical layout of the DRAM chip, so that once you find an aggressor row, you have an idea about which rows can be potential victim rows.

This is a screenshot of a BIOS screen of an Intel system. They call this the Row Hammer solution, yeah, "Row Hammer P"; I don't remember what the P stands for. Basically, you can either double the refresh rate as a RowHammer solution in your BIOS settings, or you can choose one of these RowHammer activation probabilities to perform those victim row refreshes in those systems. I think newer BIOSes did not have this option after some point, because the DRAM manufacturers came up with a new set of DRAM chips in the DDR4 era and said that the new DRAM modules are completely RowHammer-free: you don't have to worry about that, you don't have to do any refreshes. So that functionality was discontinued in some BIOSes. But as we will see later on, that's not the case. You shouldn't blindly believe what the industry is saying, basically.

A takeaway from our study is that main memory needs intelligent controllers for security, safety, and reliability. That's not only true for DRAM modules; it's also true for NAND flash devices and other memory technologies. Especially in flash, we keep observing lots of bit flips, because it's not a very reliable memory technology by itself, and there are a lot of mechanisms that people implement around NAND flash arrays to get rid of all those bit flips. There's a lot of redundancy implemented there. We also have some characterization works on NAND flash that we might cover later on in some lectures.

So, as I said, we have two major RowHammer directions. One of them is understanding RowHammer. We already looked at a lot of effects and sensitivities of RowHammer, but we have other things
to do. So if you're curious, I encourage you to come talk to us; we can plan some characterization-based works to understand RowHammer better. We also need to find better solutions. We need flexible and efficient solutions. We want our solutions to be patchable in the field, so that when you figure out a new vulnerability, or when you change your DRAM chip and the new DRAM chip is more vulnerable than the previous one, you can easily change the configuration of your solution. And we also need works on co-architecting the system and memory.

So this is a good logical break. I ran over a little bit; it's 2:05 now, so I think it's a good point to take a break until 2:15, if 10 minutes is okay. Yeah, let's meet at 2:15. By the way, before we go, do you guys have any questions? No? Ah, yeah, go ahead.

Why do we implement PARA in the memory controller? Yes, so it wasn't implemented in the DRAM chips before. They implemented it in the memory controller so that you can also protect the chips in the field, right? But then they also implemented it inside the DRAM chips as well; that came later. In architecture, it's hard to say that one is better than the other; usually there are different trade-offs. Inside the DRAM chip there are also some additional constraints and challenges that we will talk about later.

Hello, welcome back. Can you still hear me on YouTube? Okay, good. So let's get to more up-to-date times. We talked about what happened in the field of RowHammer before 2020, so now let's look at what has happened since 2020. The first paper, or set of experimental characterizations, that we will go through is Revisiting RowHammer. This paper was published in ISCA 2020, which is one of the top conferences for computer architecture. In this study we essentially test more than 1.5K chips, and we show that newer DRAM chips are more vulnerable. This is the main
key takeaway from this work. And there's something striking: until this point, everybody was assuming that you need tens of thousands of activations to induce a bit flip, but this work shows that you can have bit flips at only about 4.8K hammers per aggressor. Even though this number is striking, you will see towards the end of the lecture that we can actually induce bit flips at much lower hammer counts in today's DRAM chips, so the trend of getting worse is still there. Chips of newer technology nodes can also exhibit RowHammer bit flips in more rows and farther away from the victim row, and existing mitigations are not effective at future technology nodes. We evaluated the existing mitigation mechanisms of that time as well, and showed that they incur too high a performance overhead, so you end up either having bit flips in your system or having quite a slow system.

This is the set of DRAM chips that we tested in this work, from three major DRAM manufacturers. In this work we didn't say their names, we anonymized them, but today, in more recent works, we actually say that they are Samsung, Micron, and SK Hynix. This paper covers DRAM modules that implement different DRAM communication protocols, such as DDR3, DDR4, and LPDDR4, and it classifies each of these as old or new DRAM chips based on their manufacturing date, their die revision, and the density they have.

Here is a plot that I'm not going to go through in detail, but it shows how the hammer count affects the number of bit flips you have. Each curve belongs to a different DRAM type. I can use my pointer here: this blue curve belongs to DDR4-old DRAM modules, for example, and this yellow one is for DDR4-new. You see the hammer count on the x-axis and the bit flip rate on the y-axis, and you can see that the curve just shifts
up, which means that you have more bit flips in newer DRAM chips, and it's true across all these different standards: RowHammer bit flip rates increase with technology node generation.

Here's another key result or takeaway from this paper. Here we implement different RowHammer mitigation mechanisms. You're familiar with PARA now; PARA is the probabilistic one that we proposed in the ISCA 2014 paper, which I explained before the break. There were some other mechanisms proposed later on as well. ProHIT and MRLoc are other probabilistic mechanisms that improve on top of PARA, and there is TWiCe, which was the state-of-the-art counter-based deterministic solution of that time. It has an inherent problem in its design: it cannot scale to thresholds below 32,000. So we just assume that those inherent problems are resolved and scale it as if it works; that's the curve of TWiCe-ideal. As we go from left to right, we reduce the RowHammer threshold, which is essentially this HCfirst metric, the number of hammers required to induce the first RowHammer bit flip. So as you go from left to right, we go from older chips to newer chips, and you can see the experimental characterization data marked there: DDR3-old is here, where you need almost 100,000 activations to induce a bit flip; DDR3-new is here, where you need a few tens of thousands of activations. On the y-axis we look at the performance impact of each mitigation mechanism, and as you can see, all the curves go down as we go from left to right, which means that your system performance is degraded and you have some significant performance overheads. We also have a curve here that is called Ideal. Ideal is a hypothetical mechanism that does not exist. It essentially detects
the RowHammer attacks with 100% accuracy, and it performs refresh operations on the potential victim rows only when needed. So it has a relatively lower performance overhead than the other mechanisms, and the gap between the best mechanism at each HCfirst value and Ideal keeps increasing, as you can see from these red arrows getting longer and longer. This shows us that the existing mitigation mechanisms of the time, which is 2020, were not scaling as expected or as wanted with the worsening RowHammer vulnerability, and there is a significant gap between the best mechanism we have and the ideal mechanism. It also shows a significant opportunity, which we partially seized later on, and we now have better mechanisms that fall in between. So RowHammer is a technology scaling problem, as we show with experimental characterization data in this 2020 paper, and finding a good solution to RowHammer is difficult, and it will be even more difficult going forward.

Here is a cartoonish comparison of different vulnerability levels. Again, I'm going to talk about this metric, HCfirst, which is essentially the hammer count at which you see the first bit flip. On the extreme left side we have HCfirst as infinite, so no matter how much you hammer, you do not incur any bit flips in your DRAM chip. On the right-hand side, HCfirst is one, which means DRAM is doomed: every access results in a bit flip. This is the spectrum we have, and this data shows which set of DRAM chips falls into which category. Based on our characterization results in ISCA 2014, we observed around 140K for DDR3, right? Then in ISCA 2020, this DDR3 result went down to 24K, and the only difference in between is technology node scaling: now we have DRAM chips with more gigabytes in a given chip. And you see
different results for LPDDR4, DDR4, and HBM2 as well. None of them are really good.

So we showed this in 2020, and starting from 2016, DRAM manufacturers have been claiming, "Our new DRAM modules do not have the RowHammer problem; we solved that internally; you don't have to worry about it." When you look at their documentation of how they solved it, they talk about a mysterious mechanism called TRR, Target Row Refresh. Target Row Refresh essentially observes your memory access patterns, finds potential victim rows, and refreshes those rows in a targeted way. But this is a very high-level explanation, and you do not see any further description or details of these mechanisms, because it's considered proprietary knowledge, a corporate secret; they do not reveal it. So you have two options: either you trust them, do not implement anything in your system, and stop the research, assuming that DRAM is fixed now; or you get those new DRAM chips and do characterization on top of them.

This is the work that does that characterization. It was published in the IEEE Security and Privacy conference in 2020, and it's a collaboration between our group and Vrije Universiteit Amsterdam. Kaveh was, I guess, there at that time and then came here, so it's an ETH paper; you can be proud of that. It shows that what DRAM manufacturers implemented inside their DRAM chips is a poor RowHammer solution. There's a solution there, but you can easily work around it. This is the first work to show that TRR-protected DRAM chips are vulnerable to RowHammer in the field, and that the mitigations advertised as secure are not actually secure. It introduces a new memory access pattern, called the many-sided RowHammer attack, that you can use to bypass these mechanisms. The idea is that you hammer many rows, so it
overwhelms the internal counter logic that they implemented inside the DRAM chip. The implemented TRR mechanism then cannot figure out which row you're attacking, so it goes and refreshes the wrong rows, and as a result you get bit flips in the row where you actually want them. This TRRespass work partially reverse-engineers the TRR and pTRR (the p stands for pseudo) mitigation mechanisms implemented in DRAM chips and memory controllers, and it provides an automated tool that can effectively create many-sided RowHammer attacks for DDR4 and LPDDR4 DRAM chips, which you can use to induce bit flips.

Here I can just talk about this four-sided one, probably. This is a cartoonish picture of what your DRAM array looks like. These red rows are aggressor rows and the blue rows are your victim rows. You allocate your rows in such a way that they alternate between aggressors and victims. If you increase the number of aggressors, the DRAM chip needs to keep track of the activation counts of more and more rows, and they have only a limited number of counters inside the DRAM chip, which is a dumb design, but that design decision is there because of cost concerns. As a result, you overwhelm it. These three plots belong to different modules. On the x-axis we sweep the number of aggressor rows in this access pattern; on the y-axis we have the number of bit flips. As you can see, up to four aggressor rows for this plot, you do not observe any bit flips. The reason is that the counter mechanism they implemented inside that DRAM module can keep track of those four aggressor rows and refreshes their victims, so you don't have bit flips. But the moment you introduce a fifth aggressor row that you concurrently hammer, they cannot keep track of the fifth one, and then you have a lot of bit flips. And as you
increase the number of aggressors, since you have a limited time window, you have smaller hammer counts per aggressor, and as a result you have somewhat smaller bit flip counts. This behavior is different from chip to chip. Yes? Yeah, it's a good question; actually, I don't remember the reason. The behavior I explained is more like here: from nine onwards, you have a generally decreasing pattern. I'm not sure why we have a second peak here. I need to check the paper, actually; I don't remember that. Good question. Maybe we should put the answer on Moodle.

Okay, so this work tests many DRAM modules from three major manufacturers, and you see the best many-sided RowHammer access pattern in this column. For example, for some of them it doesn't work; for some of them you need to hammer 19 rows, for some of them 10 rows, and you get a lot of bit flips as a result. They also tested several commodity devices, like phones and tablets, that use these DRAM chips, and these are the manufacturing dates. You see that some of them were already vulnerable to this attack. So you can mount system-level attacks that perform this kind of memory access pattern, and you get bit flips on these devices, so they're compromised. You can use these memory access patterns to attack page table entries, to steal RSA keys, and to gain sudo privileges on these devices. And these are the attack times that you need: you may need a few minutes of attack, or around one minute of attack for this module, for example, gives you the SSH key that you need.

Okay, so they find that 13 out of 42 tested DRAM modules are vulnerable and five out of 13 mobile phones are vulnerable. These results are just scratching the surface, in the sense that the characterization study here is not super rigorous.
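
The many-sided idea described above (more concurrently hammered aggressors than the chip has counters) can be illustrated with a small toy model. This is only a sketch under loud assumptions: real TRR implementations are proprietary, so the tracker policy here (a fixed table of `num_counters` entries filled first come, first kept, with one targeted refresh every `acts_per_trr` activations) is hypothetical, as are the function name and all parameter values. The 4,800 threshold only reuses the HCfirst figure quoted earlier in the lecture as an illustrative bit flip threshold.

```python
def peak_victim_disturbance(num_aggressors, num_counters=4,
                            total_acts=40_000, acts_per_trr=100):
    """Toy model of a counter-based tracker under a many-sided pattern.

    Aggressors sit on even rows (0, 2, 4, ...) and victims on the odd rows
    between them. Returns the largest number of neighbor activations that
    any victim row accumulates without being refreshed.
    """
    aggressors = [2 * i for i in range(num_aggressors)]
    counts = {}        # tracked aggressor row -> activations since it was last served
    disturbance = {}   # victim row -> neighbor activations since its last refresh
    peak = 0
    for t in range(total_acts):
        row = aggressors[t % num_aggressors]      # round-robin hammering
        if row in counts:
            counts[row] += 1
        elif len(counts) < num_counters:          # free counter: start tracking
            counts[row] = 1
        # else: no counter left, so this aggressor is invisible to the tracker
        for victim in (row - 1, row + 1):
            disturbance[victim] = disturbance.get(victim, 0) + 1
            peak = max(peak, disturbance[victim])
        if (t + 1) % acts_per_trr == 0 and counts:
            hottest = max(counts, key=counts.get)  # most-activated tracked row
            for victim in (hottest - 1, hottest + 1):
                disturbance[victim] = 0            # targeted refresh of its neighbors
            counts[hottest] = 0
    return peak


if __name__ == "__main__":
    HC_FIRST = 4_800  # illustrative threshold, taken from the 4.8K figure above
    for n in (2, 4, 5, 6, 10, 20):
        peak = peak_victim_disturbance(n)
        status = "at or above threshold" if peak >= HC_FIRST else "below threshold"
        print(f"{n:2d} aggressors: peak disturbance {peak:6d} ({status})")
```

In this model, with four counters, the peak stays far below the threshold for up to four aggressors and jumps above it as soon as a fifth aggressor is added, because the fifth row never gets a counter. With many more aggressors the fixed activation budget is split more thinly, so the peak per victim shrinks again, which mirrors the generally decreasing pattern mentioned above. It does not attempt to explain the second peak in the plot.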
Later on we did a follow-up to this work that delved into more detail and did a way more rigorous reverse-engineering study, which found that there's more room for uncovering more vulnerabilities. So this TRRespass work in 2020 essentially showed that RowHammer is still an open problem and that security by obscurity is likely not a good solution. What the DRAM manufacturers did was just say, "Our chips are safe, there's no RowHammer, but we do not tell you what security countermeasure we take here." This is what we call security by obscurity. It's likely not a good solution because there are many smart security people here and there, and they can find ways around these not very secure mechanisms.

After 2020, we also had some improvements in JEDEC documents. They released near-term DRAM-level RowHammer mitigation and system-level RowHammer mitigation documents, and they introduced new commands called refresh management and a new mechanism called PRAC, which we will talk about in detail tomorrow. My colleague, who is in the back, will be here tomorrow talking about the details of these mechanisms, but I just wanted to mention it here now because this is an exciting improvement from the industry side. If you look at this JEDEC consortium and its task forces, it has many bodies: there are system manufacturers, some software companies, and DRAM manufacturers, and to make even a simple change you need to convince all these parties and they should reach a consensus, so that there's some new requirement in the following DRAM communication protocol. Let's say you move from DDR4 to DDR5 and you want to add a new functionality or a new timing constraint: it's quite hard to do, because you need to convince many people. These works, especially in 2020, played a huge role in actually convincing people, and then
they actually took some solid steps to solve this issue.

The next work that I mentioned is Uncovering In-DRAM RowHammer Protection Mechanisms. This is the work that followed up on TRRespass, where we do a rigorous reverse-engineering study. Basically, we use data retention failures as a side channel to understand whether a DRAM row is refreshed or not. Let's say you hammer a row, and as a result of this hammering, the mechanism inside the DRAM chip triggers a refresh and refreshes the victim row. You can see whether this row was refreshed or not by characterizing the retention failures. This is a full-blown reverse-engineering methodology that resulted in completely reverse-engineering the TRR mechanisms of different DRAM modules, and it crafted targeted access patterns for those modules. They test 45 different modules using this methodology, and they show that 99.9% of the rows in a DRAM bank experience at least one RowHammer bit flip. So at this point you basically reliably work around the deployed mechanism; it's essentially ineffective. ECC is also ineffective, because when they look at the distribution of the bit flips, they see up to seven RowHammer bit flips in an 8-byte dataword. That is quite a high density of bit flips, which you cannot protect against with practical ECC schemes. So basically you bypass ECC with these new RowHammer access patterns; conventional ECC cannot protect against new RowHammer access patterns. This also put an end to the discussion about whether ECC is enough for RowHammer protection. Today you can fairly confidently say that ECC is not enough; we need additional mechanisms on top of ECC. And if you think about it, ECC is there for bit flips that are not very predictable, that happen randomly, where the bit flip probability is uniformly distributed across cells. So in those kinds of
cases, and if your bit flips happen rarely, that is, you have a very low bit error rate, ECC is quite effective, right? Those are error mechanisms like soft errors: you have a particle strike, one bit flips, and then you correct it. But in the RowHammer case, you can actually craft memory access patterns, you can target some DRAM cells, and you can predictably and repeatedly induce bit flips in some cells more frequently than in others. For example, you can change the data pattern and redirect your bit flips to different cells. All these things make the nature of RowHammer bit flips different from the other bit flips that we protect against using ECC. So at a high level, this chain of thought also suggests that ECC is not a very good fit for RowHammer bit flips, but now we also have some experimental data backing this up.

Next I will talk about new RowHammer characteristics that we found after 2020. But before doing that, are there any questions about these two papers or about the in-DRAM RowHammer mitigation mechanisms? No? Okay, let's continue with the new RowHammer characteristics, then.

This paper was published in MICRO in 2021. It's also one of the chapters in my thesis, so I love this paper. What we do differently here is that we look at RowHammer's sensitivity to three different properties: temperature; aggressor row active time, which is a very low-level definition (at a high level, I would just say memory access pattern); and the victim DRAM cell's location inside the DRAM chip. We do that to find effective and efficient attacks and defenses. We test 272 DRAM chips from four major manufacturers in this study (we add Nanya on top of Samsung, Micron, and SK Hynix), and we come up with six major takeaways from all our observations. We summarize those by saying that RowHammer bit flips are more likely to occur in a bounded range of temperature. It's quite interesting, actually: it's
not like you increase the temperature and everything gets worse. Sometimes increasing the temperature reduces the probability of having a bit flip in a DRAM cell. Also, if you keep aggressor rows open, or active, for longer than the minimum amount of time, you can reduce the number of hammers you need. I will talk about all of these, and tomorrow we will also have a more elaborate lecture on this. You can also find certain physical regions of memory that are more vulnerable to RowHammer than others. And we show that our observations can inspire and aid more effective attacks and more effective and efficient defenses as well. So it's better to know, right? If you know all these sensitivities, you can improve your attacks or defenses. We do not want attackers to improve their attacks; we want to achieve better defenses, but you also need to study this to do that.

Okay, so here's an example that we will talk about in detail tomorrow. We're comparing two access patterns here. Assume that these black vertical lines are the times at which we have a row activation for the aggressor row. We activate the aggressor row here, and in this yellow window row A is active, and then we close the row. You have to do this because when you activate a row, there's a minimum timing constraint that you need to wait out before you can close it. Then here you close it, and you activate the other row. Let's say we are doing a double-sided attack with rows A and B: you activate A, precharge and close A, then you hammer B, close B, activate A again, and so on. If you do this as frequently as possible under the minimum timing constraints, this is known as the most effective RowHammer attack based on the prior experimental data. If you
remember the activation interval plot that I showed from the 2014 paper, as we increase the time between two activations, the bit flip rate goes down. But what we showed in this study is that if you use this time to keep your aggressor row active for longer, you can significantly reduce the number of hammers that you need to perform on these rows to induce a bit flip. As a result, your HCfirst, the minimum activation count you need to induce a bit flip, drops significantly. Let's say you have this information and this characterization data (one second), and the mitigation mechanism deployed in your system does not account for this effect. Then you can easily bypass the defense by reducing the activation count, because it might be configured for a higher activation count.

I can take your question now. Okay, so are you asking about the experimental methodology in this paper, or about a more realistic scenario in a real system? Okay. So the question is: how can we keep a row active longer in a real system? If you remember from last week's lecture, you have the DRAM array and you have a row buffer, right? When you activate, you fetch all of the row's data into the row buffer, and then as you have column accesses, reads and writes, you serve them from the row buffer. Once you want to close the row and activate another row, you need to wait for the closing and then open the other row, and all of these add latency. I'm just talking about the general background of DRAM here. We do not want to pay these latencies. Just forget about RowHammer and think about a benign workload: we do not want to wait for these latencies, so that we can serve our memory requests faster, our application can keep going, and we can achieve high performance. To ensure that we don't have to activate and precharge rows very frequently, memory controllers exploit the locality in the row buffer. If you have two requests that target the same row, even if they are in a different order in your queue (let's say a first-come-first-served queue), if the row is open in the row buffer you keep servicing these memory accesses, so that later on, when you want to access the same data, you don't have to open the same row again. This is called row buffer locality, and to leverage it, memory controllers tend to keep rows open for some time in case new requests come to these rows. In real systems, what we do is access a column in a row. A row is as large as 8 kilobytes, right, and a column access is at the granularity of 64 bytes, so you have many of these columns inside the row. We access different columns with some time delay in between, so that the memory controller thinks, "This row is quite hot now, it keeps getting accessed, so we need to keep it open for longer." So it's a performance optimization that you can actually exploit for your attack. Does that answer the question?

Okay, yeah. So as long as the row is open, it's connected. It's actually not decoupled in current architectures: if the row buffer has your row's data, your row also stays open. Basically your wordline is asserted, so your DRAM cells are connected to your sense amplifiers, and whatever you write to the sense amplifiers propagates back to the DRAM cells themselves. That's the reason everything is on and the wordline remains asserted, and that's the main thing that creates this effect: we keep a high voltage on the wordline, so it keeps attracting electrons or causing them to move around.

That's a very good mitigation, yes. Maybe I should repeat it: a potential mitigation is to decouple the DRAM array from the row buffer, so that you can keep
accessing the row buffer but you can close your DRAM row. It has trade-offs as well. What if you decouple it: you de-assert the wordline, you close the row, you have your data in the row buffer, and then you receive a write? Now you need to open the row again and propagate the write back to the DRAM cells, right? So there is this trade-off. Also, for decoupling you need to implement new isolation transistors between the row buffer and the DRAM array, so it also has some area cost. If you're curious, I can refer you to a very recent paper called HiFi-DRAM, from Michele Marazzi, a PhD student at ETH in Kaveh's group; he actually graduated recently. They did some investigation of how DRAM chips are laid out, and they show that even adding a wire increases cost a lot, let alone putting a lot of transistors there. So there are some cost-driven concerns and also performance concerns about decoupling, but for addressing this particular issue, yes, it's a good solution.

Okay, so another aspect of this paper was looking at the spatial variation across different DRAM cells in a DRAM chip. Let's say you look at the 10% most vulnerable DRAM rows, and you need some activation count, let's parameterize it as HCfirst, to induce bit flips in these most vulnerable 10% of the DRAM rows. We observe that for the remaining 90% of the rows you need at least double that activation count. So you don't have to protect all parts of the system or DRAM chip with the same aggressiveness, right? If you configure your mechanism for the worst case among your DRAM rows, you end up overprotecting a lot of rows, and that comes at a significant performance cost. For some mechanisms that implement sophisticated metadata management structures, it also increases their area
overhead. So you can get rid of those by leveraging this kind of circuit-level observation. Another thing that I mentioned about temperature is that DRAM cells are vulnerable in a bounded temperature range. As we increase temperature from left to right, there are two bounds, a lower bound and an upper bound. In between, we observe bit flips in a DRAM cell, but outside of this range we do not observe bit flips at a particular hammer count. So basically we can optionally, dynamically retire some rows based on our system's temperature level, for example. There are these kinds of improvements that we can come up with. I can take some questions; you are first. Yeah, so the question is: why do we have this vulnerable temperature range? At high temperatures, does the memory controller refresh DRAM rows more frequently, and is that mitigating bit flips? The answer is no, but this is a very good question. Wherever I present this observation, I keep getting this question from DRAM experts, so very good question. That is not the reason, and we know that because in these experiments we do not have DRAM refresh, we do not have a memory controller. We test DRAM chips in an FPGA-based infrastructure, so we have full control over everything: we know exactly when a row is refreshed and exactly when it is activated, and that is not the case here. I think I don't have backup slides for this, but this goes back to a circuit-level phenomenon called trap-assisted charge leakage. Basically, inside your DRAM cell, in particular locations, you have some traps that electrons can get into and get trapped, and once that happens, you exacerbate the charge leakage of the cell, so you lose the charge in your capacitor faster, and that causes bit flips. When you do the physics-level simulation of this particular error mechanism, it actually has this behavior. So there are some circuit-level
works based on some TCAD simulations that show similar behavior as well. Based on your trap's characteristics, at high temperatures the electron does not get into the trap, it just moves across, and it causes this kind of weird behavior. And when we look at this vulnerable temperature range across the DRAM cells, it also varies a lot. There are cells that are only vulnerable at, for example, 50 degrees Celsius, and then there are cells that are only vulnerable at 70 degrees Celsius. So it's very hard to generalize, and it makes everything much more complicated in terms of profiling your DRAM cells, understanding their vulnerability, limitations, and all that. So it's still quite a curious phenomenon to me. I hope this gives a sufficient answer to your question. There was another question. Yeah, it can be very narrow. The question is how large the temperature range is, from DRAM cell to DRAM cell. It can be just one or two degrees Celsius, or it can be on the order of 50 degrees Celsius wide, let's say vulnerable from 50 to 90 degrees, for example. There are all sorts of behaviors across the DRAM cells that we can observe. Okay, let's move on. There's a more detailed talk on this, and I also gave a full one-hour lecture on it; I guess I can add that slide later on. So we have more RowHammer analysis. This is another paper that's part of my thesis, so I also like this work. Here, what we do is change the voltage level inside the DRAM cell. Here is a picture of our infrastructure: we have an FPGA-based infrastructure, and we program this FPGA with a design that is a highly modified version of SoftMC; now it's called DRAM Bender. Tomorrow we will talk about that as well. We actually have many instances of this infrastructure in the ETZ building, and you can get hands-on if you're curious, if you want to test different access patterns or change
different things and want to observe what other behaviors we have in the DRAM cells. There are a lot of undiscovered things, we believe, so you can just get hands-on if you're curious. So here we have the DRAM module, we have a temperature controller that keeps the temperature stable for the DRAM chip, and we have an external power supply that allows us to play with the voltage level on some power rails. In this particular case, we change the voltage on the wordline of the DRAM cells, and we show that when we reduce the voltage on the wordline, you have a smaller voltage swing, or toggling, on your wordline as you activate and precharge rows. To give an example, under nominal DDR4 parameters you have to provide 2.5 volts to your wordlines, so when you assert and de-assert, the difference is 2.5 volts, right? If you scale down this power supply to, let's say, 1.7 volts, then you have a smaller voltage toggle on your wordline, and this voltage toggle is actually the thing that triggers RowHammer bit flips. So what we expected and observed is that when we reduce the magnitude of this voltage toggle, we can reduce the RowHammer vulnerability. As a result, we observe that we can reduce the number of bit flips in a test, and we can also increase the necessary hammer count to induce a bit flip quite significantly. This comes at the cost of increasing the row activation latency, or memory access latency, and reducing the data retention time, which requires you to refresh the DRAM cells more frequently. But it turns out that the current timing constraints that we get from datasheets already have a very large safety margin, called the guardband, so if we just use the nominal timing parameters, everything works fine for most of the chips. There's a full talk about this as well. Okay, I think the rest we will cover tomorrow; not the rest of all the slides, just the other characterization works.
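To put a number on the wordline example from the voltage-scaling discussion, here is a minimal Python sketch of the swing arithmetic. It uses only the two voltages quoted in the lecture (2.5 V nominal, 1.7 V scaled down). The function name is hypothetical, and the sketch assumes the de-asserted wordline sits at 0 V, so that the swing equals the supply voltage. It says nothing about how many bit flips are avoided, which is an empirical result from the paper.

```python
# Illustrative arithmetic only. The two voltages are the examples quoted in
# the lecture; the function name is made up for this sketch.


def wordline_swing_reduction(nominal_v: float, reduced_v: float) -> float:
    """Return the fractional reduction in wordline voltage swing.

    Assumption: the de-asserted wordline sits at 0 V, so the swing on each
    activate/precharge toggle equals the wordline supply voltage.
    """
    if not 0 < reduced_v <= nominal_v:
        raise ValueError("expected 0 < reduced_v <= nominal_v")
    return (nominal_v - reduced_v) / nominal_v


NOMINAL_VPP = 2.5  # volts, nominal DDR4 wordline supply quoted in the lecture
REDUCED_VPP = 1.7  # volts, scaled-down example quoted in the lecture

reduction = wordline_swing_reduction(NOMINAL_VPP, REDUCED_VPP)
print(f"swing: {NOMINAL_VPP} V -> {REDUCED_VPP} V ({reduction:.0%} smaller)")
```

With these two example values the toggle on the wordline shrinks by roughly a third, which is the quantity the experiment is turning down.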
So, as I said earlier, we also tested some HBM DRAM chips. By the way, is there any question about the voltage scaling work, or are we already done with that? No? Okay. So we also tested some HBM chips recently. As I said, it was published as a short paper in 2023 in a DSN workshop, and then it was published again in 2024 as a full paper, an extended version with more data and, of course, more rigor. We observe that HBM2 DRAM chips are also quite vulnerable to RowHammer, unfortunately. The current industry standard is HBM3 or HBM4, but in the FPGAs that you can get today you can only get HBM2, so that's why we tested HBM2 in this work. Going forward, we will also test newer HBM chips, and I guess you can also guess at this point that the RowHammer vulnerability will turn out to be even worse. There is another paper, published in HPCA, that shows the spatial variation, which I will cover tomorrow. With that, we are done with the understanding part of the post-2020 works, and we will move to some new RowHammer attacks and solutions. Half-Double is a new memory access pattern that Google found. Okay, I think this gives me a nice pictorial representation, so I can talk about this. In classic RowHammer, let's say we have a single-sided attack, meaning that we keep hammering a single DRAM row. This is our aggressor row, and the two adjacent rows above and below are our potential victim rows, so we induce bit flips in these, right? This is how classical RowHammer works. Once you have the on-die TRR mechanisms implemented inside the DRAM chip, what happens is that, since you hammer this aggressor row quite frequently, they keep refreshing these victim rows, right? So you do not observe bit flips in these victim rows. So the question is: if I keep accessing this aggressor row very frequently and I trigger some
victim row refreshes targeting these immediately adjacent rows, can I hammer these immediately adjacent rows to induce bit flips in the rows at a distance of two? The answer is yes, but it's not very straightforward. You hammer this one a lot, and you hammer this one a little bit, and, to help you, actually, the on-die TRR mechanism also hammers this row B, thinking that it's protecting B because you're attacking A. So you have some activation count accumulated in B, and A also induces some disturbance on C here, at a distance of two rows, and when you accumulate these disturbance effects, you can induce bit flips at C. So this is a new attack pattern that the deployed solutions did not account for, so it was able to bypass them quite reliably. This was published as a blog post in 2021 and then as a full paper in USENIX Security in 2022. Another very interesting work, and I really appreciate this work, actually, is called MOESI-prime. In this work, the authors basically went to Microsoft servers, looked at the workloads' access patterns to memory, and analyzed them. This is a huge, rigorous analysis of snooping on what is going on in the DRAM communication, right? And they find that even if you do not have any attack in your system, you might end up hammering some rows and inducing bit flips. When they started tearing apart the system to see why this happens, it turns out that in some chips the MOESI protocol... I need to pause here, actually, and ask whether you're familiar with cache coherency. Okay, so MOESI is a cache coherence protocol that many of you are familiar with. You can also check the paper, it explains it, and we have lectures about MOESI for sure, so you can learn from there as well. So the MOESI protocol in some chips is not really implemented in the best way possible. Basically, for sustaining coherency, you might end up invalidating
some cache blocks and then writing back some cache blocks very frequently to the DRAM chip, and with certain accesses to caches you can make this even worse. It causes some DRAM rows to be accessed a lot of times, so you end up hammering some rows unintentionally. And this is not even a workload that a user runs on the system, right? It's just the hardware components themselves doing this. So this is a very good example of why, when we design new systems, we need to look at all aspects and avoid this kind of unwanted behavior. To me, it has some parallels with other attacks that we keep suffering from, for example Spectre and Meltdown. Have you heard about Spectre and Meltdown? Basically, for performance improvement you have speculative execution in your microarchitecture, and that causes some observable intermediate states in your architecture, so a malicious person can understand what you're doing, and it can create some side channels, right? In this case, you don't even need a malicious user doing anything wrong. Basically, some improvement, some implementation you did in your architecture, causes this hammering access pattern. So it's fascinating. We really need to design systems by considering microarchitecture, memory subsystem, and software stack all together, so the new and better architectures should come with a holistic approach. Another paper, from 2021, is also part of my thesis, but I think I'm not going to go into it very deeply. It was published in HPCA. It's a RowHammer mitigation mechanism that basically selectively throttles the memory accesses that are performing a hammering access pattern. Before going into that, do you have any questions so far? No? Okay. So we identify two key challenges. Okay, maybe we should take another break, but let me quickly check something. Okay, maybe I
can cover this and then we can take the break. Am I still sharing the screen on Zoom? Okay, that's good. So we looked at the existing mitigation mechanisms of that time and we identified two key problems. One of them is scalability with the worsening RowHammer vulnerability that we keep talking about. The other one is compatibility with commodity DRAM chips, and I will explain it here. Okay, this slide is not explaining the problem, but I can explain it verbally. Basically, when you hammer an aggressor row, you need to refresh the rows that are physically adjacent to the aggressor row, right? But inside the DRAM chips there's a row address mapping that can scramble the logical row addresses in the physical layout, so row A+1 might not be next to row A, for example. This is proprietary information; you don't have access to it. There have been a lot of RowHammer mitigation mechanisms proposed to be implemented inside the memory controller that perform victim row refreshes, which is not really practical without having that information. So in this work we tackle that problem; that's the challenge of compatibility with commodity DRAM chips. When a RowHammer attack hammers row A here, it might induce bit flips in the immediately adjacent rows, but BlockHammer, our mechanism, detects this RowHammer attack using an area-efficient structure called Bloom filters, and it selectively throttles only the accesses targeting row A. All other accesses are serviced as usual, nothing changes for them, but the accesses that target row A are slowed down. As a result, in an extended time window you don't have many accesses to row A, and bit flips do not occur in the victim rows. With this scheme, you don't have to know which row is adjacent to which row. We also do a performance evaluation and show that BlockHammer can mitigate RowHammer bit flips
with low overhead. So a takeaway so far, and it's just reiterating it, right, is that main memory needs intelligent controllers for security, safety, reliability, and scaling. I didn't talk about the whole solution space; I can refer you to other papers as well going forward. But basically we have several solutions in the field, and they have their own tradeoff space in terms of cost, power, performance, and complexity. Some exciting news: in 2023, SK Hynix published a very short paper, just two pages long if I remember correctly, that talks about both probabilistic and deterministic RowHammer mitigation mechanisms that they propose to implement inside the DRAM chips, so you can assume that they are implementing those. I'm just going to skip this and show here that they even show a die picture where you have an array of normal cells, some RowHammer counting cells here, and some logic here that keeps accessing these activation counters inside the DRAM chip and performing RowHammer refreshes. There's another paper, from Samsung, that proposes in-DRAM stochastic and approximate counting algorithms, and this paper was actually bypassed recently by the Google paper that I showed at the beginning of the previous session. And there's another paper, from 2021, published in a workshop called DRAMSec. The workshop itself has pretty much been showing the state of the art in RowHammer for the past several years; it's co-located with the ISCA conference, which is one of the top conferences in architecture. In this paper, they propose a mechanism where you implement a row activation counter for each DRAM row inside the DRAM chip, and you exploit the DRAM cells' high density to implement all these counters at low cost. And that turns out to
be the new addition to the DDR5 standard, actually. DDR5 is a few years old; if you go to the first documents, I think they are from 2020. But in April 2024, less than a year ago, just this year, it got an update, and this update suggests that newer DRAM chips, DDR5, DDR6, going forward, will implement a RowHammer mitigation mechanism where you have a row activation counter associated with each DRAM row. Internally, you keep track of the activation counts, hopefully in a better way than the previous implementations of TRR, and you perform these victim row refresh operations based on the activation count, on the spot, when needed. Tomorrow Oğuzhan will talk about this whole PRAC mechanism, its tradeoffs, pros, cons, and limitations. We already analyzed it in a short paper that was published in this year's DRAMSec; as you can see, Oğuzhan is the first author. We also already implemented this in our simulation infrastructure, Ramulator. It's on GitHub, so you can even download it and evaluate this new addition to DDR5 on your systems today. Of course it's not using your own hardware, it's just a simulation environment, but you can understand the performance benefits, the security limitations, or the performance overheads. So, okay, let's ask the question: are we now RowHammer-free in 2024 and beyond? I see some faces doing this. Okay, yeah, I think you're right. It seems like we are not really RowHammer-free or read-disturbance-free. In 2023, last year, at ISCA, we published another paper called RowPress, which I will talk about in detail after the break. It basically asks the question: what if there's another phenomenon that is based on DRAM read disturbance, different from RowHammer? We have all these mitigation mechanisms out there mitigating RowHammer bit flips. What if we have another error mechanism? What happens then? And we actually found that
error mechanism. Tomorrow Haocong will talk about it in detail, and I will also briefly explain it in the last part of this session. Are there questions so far? Okay, let's take another 10-minute break, if you don't mind, and then we can cover the rest of the slides quickly. Hello, can you still hear me? Test, can you still hear me? Is there a problem? It's good, okay. Okay, let's close the door. So thank you a lot for coming back on time. I'll sort of speedrun the last part of this slide deck, but I'm doing this on purpose, actually, because we will have more elaborate lectures on these topics, or at least most of these topics, tomorrow and in the following lectures. Now I'm going to talk about RowPress. This is something similar to RowHammer, but it's a different error mechanism that we have in DRAM chips, another DRAM read-disturbance phenomenon. This paper was published in ISCA 2023, so it's quite recent, and not everyone is aware of it yet, but it's getting there. It's quite significant and quite important going forward. I'll just go over these slides at a high level, and tomorrow we will have a more elaborate lecture. The first author, Haocong, will be here, and he will talk about RowPress, about a follow-up work to RowPress, and about some circuit-level simulation studies of RowPress as well. So this is the high-level summary of this new read-disturbance phenomenon. In this paper we demonstrate and analyze RowPress. It's a new read-disturbance phenomenon that causes bit flips in real DRAM chips. It exacerbates DRAM's RowHammer vulnerability, but it is different from the RowHammer vulnerability. We also demonstrate RowPress using a user-level program on a real Intel system, so you can even exploit RowPress with system-level attacks. And we also discuss some effective solutions to
RowPress in our paper, and there are some follow-up works as well. So basically, what is RowPress? RowPress is a memory access pattern, a read-disturbance phenomenon that you exploit by keeping a DRAM row open for a long time. You can see the parallel, actually: in the MICRO'21 paper I mentioned that if you keep a row open for a long time, you can reduce the HCfirst value, the minimum hammer count you need to induce the first bit flip. That was a preliminary observation, if you will, in that paper, and in this one we basically do a more rigorous analysis of that empirical observation and try to understand what its root cause is and how it behaves under different conditions. We find that these bit flips do not require many row activations: you can achieve them by activating rows only a few times, and in some extreme cases you can achieve bit flips with only one row activation. You activate a row and wait for a very long time, which is, I would say, not practical in systems today, but if this phenomenon exists, then it can become quite practical in the future. Okay, let's look into this. The question we asked in this paper is: are there other read-disturbance issues in DRAM, other than RowHammer, I mean? So far there's been a lot of focus on RowHammer, right? If there are some other read-disturbance mechanisms, then the tradeoffs might change. Mitigations work by detecting a high row activation count for RowHammer; can we use them for other potential read-disturbance problems as well? What if there is another read-disturbance problem that we cannot actually mitigate by just looking for a high activation count? Such a potential, unknown read-disturbance phenomenon can make our very sophisticated and provably secure mitigation mechanisms vulnerable or bypassable, because of their basic assumption of requiring a high
activation count. So what we do here is that, instead of using a high activation count, we increase the time that the aggressor row stays open. We already went through this in the MICRO paper, right? Here you can see that if you keep a row open for 36 nanoseconds, you need to activate the row 47,000 times to induce a bit flip. Now we look at the same bit flip, and we keep the row open for 7.8 microseconds instead of 36 nanoseconds every time we open it. By doing that, we can reduce the necessary activation count from 47,000 to 5,000, so it's about an order of magnitude reduction, right? And if you actually keep the row open for 30 milliseconds, then only one activation is enough. Keeping a row open for 30 milliseconds is not very realistic, but it shows where this is going, right? In this work we use an FPGA-based testing infrastructure again. We have an FPGA board here that is programmed with DRAM Bender, which we will talk about in detail tomorrow, and we have a DRAM module pressed against two heater pads, one on each side. The heater pads are controlled by a temperature controller, so we keep the temperature stable as well, and we have a host machine that runs the tests and analyzes the experimental data. So we have full control over what kind of DRAM commands we issue to these DRAM modules, and we have fine-grained control over their timings: every 1.5 nanoseconds we can issue another DRAM command. We test 164 DDR4 DRAM chips from three major manufacturers that cover different die densities and revisions, so different generations, if you will, and we observe that they are vulnerable to this new read-disturbance phenomenon, RowPress. RowPress significantly amplifies DRAM's vulnerability to read disturbance, and it has a different underlying error mechanism as well. When we look at the change in the minimum activation count to induce the first bit flip, the abbreviation is a little bit different here, ACmin, we observe that we can
reduce this necessary activation count by one to two orders of magnitude in a practical way. Here on the x-axis we have the time that we keep a row open. In the first column we keep the row open for 36 nanoseconds, which is the minimum you can do, and then here 7.8 microseconds, 70.2 microseconds, and 30 milliseconds; these are just the points that we selected. It's an educated choice, actually, because in these DDR4 modules 7.8 microseconds is the time interval between two refresh commands. When a refresh comes, you need to close the row so that you can perform the refresh operation; that's why 7.8 microseconds is an important point. At 7.8 microseconds we can reduce the activation count from tens of thousands to just single-digit thousands, right? There's also an extended refresh mode in modern DRAM chips, where you can postpone refresh operations, nine times here, so you can practically keep a row open for about 70 microseconds in between two refresh commands; you can postpone refresh commands that much. And that results in a few hundred activations to induce the first bit flip. So it's quite a significant reduction in a very practical way, and it keeps getting lower and lower. Yes, go ahead. Which timing parameters? Yes, correct. Yes, the fourth one doesn't, because you need to perform refresh. So basically you cannot keep a row open for more than 70.2 microseconds, because at that point you need to close the row and refresh. I'm not sure if this is the right place to get into that discussion, but I can briefly mention that there are also some past works showing that not all rows need to be refreshed every 64 milliseconds. There are some rows that can retain their data for more than two seconds, for example; you don't have to refresh those rows as frequently. Prior work already shows that you can get significant performance benefits if you leverage this heterogeneity in your system. And if you imagine that kind
of system, for example, maybe you don't have to perform a refresh operation after 70.2 microseconds, so maybe the maximum realistic time window can be something in between, maybe even 30 milliseconds. So basically, with today's commercial devices it's not possible, but it's important to look into. I hope that you're familiar with this cartoonish picture by now, because I already showed it before. On the left side everything is good, and on the right side DRAM is doomed. You can see that with the new DDR4 DRAM chips we can induce bit flips with 380 row activations if we keep rows active while abiding by the nominal refresh interval, and in HBM2 it's even worse: 335 row activations are enough to induce bit flips. If you exploit the extended refresh mode, where you can postpone the refreshes, the story becomes like this: 51 activations are enough to induce a bit flip in a new DDR4 DRAM module, and 123 activations are enough to induce a bit flip in HBM2. These are quite low, and you can easily observe benign workloads reaching these activation counts. Of course they do not keep rows open for 70.2 microseconds, but just having those numbers in the same ballpark is significant; it shows that we need to immediately take some precautions, some countermeasures, against this. So essentially it amplifies read disturbance in DRAM and reduces the thresholds quite significantly; in extreme cases only one activation is enough to induce bit flips. We also observe that as you increase temperature, RowPress gets worse. So it's different from RowHammer: in RowHammer we had a bounded temperature range, right? Here it gets worse with increasing temperature, so it has a different behavior with temperature. And we observe that it's actually different from RowHammer because of this phenomenon
and also because of some other behaviors that Haocong will talk about in detail. It affects a different set of cells compared to RowHammer and retention failures: the same cells are not the most vulnerable ones for all three mechanisms, and it behaves differently. We also showed that we can exploit it on a real system, an i5 machine with a DDR4 module connected to it, and what we do is exactly what I described in answering the previous question, actually. We basically keep accessing different columns inside the same row, so that there is some incentive for the memory controller to keep the row open longer. On the x-axis here, you see that we access a different number of cache blocks in the same row as you go from left to right. In the leftmost one we only access one cache block, so it's a RowHammer attack, right? You activate the row, read one cache block, then precharge the row and activate again, because you want to maximize the activation count. And here we keep accessing a different number of columns. On the y-axis is the number of bit flips, and until we access 32 different columns in a row, we don't observe any bit flips in this test. So in this particular test we show that there is this set of DRAM cells, or DRAM rows, in which we cannot induce bit flips by using RowHammer alone, but if we use RowPress to exacerbate the read disturbance, then we can get a significantly large number of bit flips. Okay, so in this paper we propose some adaptations to the existing mitigation mechanisms to address RowPress as well. We basically limit the maximum row-open time through the memory controller's scheduling policy, and we also reduce the thresholds that we use accordingly, so we just modify the existing RowHammer mitigation mechanisms to behave a little more aggressively, and we close
rows at some point, when it reaches some threshold, and we can mitigate it. We show that we can do that at very low performance overhead. There are many more results and analyses in this paper, and there are more things that we need to discover as well. So if you're curious, please refer to this paper, read it through, and please attend the lecture tomorrow as well, so that you can get a better understanding and ask your questions directly to the first author. We have lots of empirical observations, we have suggestions for mitigations, and the experimental methodology, the simulation infrastructure, everything is open source on GitHub, so you can dig into that as well. This is the full reference to the paper, and we have more to come. Again, there are two major directions, understanding read disturbance and solving read disturbance, and we need to invest in both. As I referred to earlier, in 2023 we published this paper, Fundamentally Understanding and Solving RowHammer. It's the invited retrospective paper that summarizes the recent advancements. But after 2023, as I showed earlier, we have more advancements in the field. It's quite a hot field, with a lot of papers coming out; you can see a lot of papers just in 2024. This is one of them, which we'll talk about tomorrow as well, along with RowPress. In this work we combine the RowHammer and RowPress memory access patterns to achieve a more effective attack than either alone. We have some other papers that try mitigating... it's actually a bit abrupt here, but I will talk about this later on, I guess, in the memory access latency and refresh lecture. So basically there are more RowHammer mitigation, RowHammer attack, and RowHammer characterization papers in the field in recent years. This paper in particular talks about the limitations of the communication between the memory controller and the DRAM
chip in today's setup, and it's important to actually revisit that. I talked about, for example, the internal row address layout inside the DRAM chip, which we do not know from the memory controller side; that's one limitation. Another limitation is that inside the DRAM chip you implement some TRR mechanisms and you need to perform a lot of refresh operations, but the DRAM chip does not control its own timing. The memory controller controls the timing in a very fine-grained manner, and it doesn't have the same information as the DRAM chip. This causes some difficulty in fitting all those operations into a time window, basically. These scheduling decisions get way more complicated when you have this poor communication between the DRAM chip and the memory controller. This work talks about all of those, and about ECC as well, so I will refer you to that. I guess going forward we will talk more about these kinds of things. This is another paper in the same direction. Okay, I guess I have to explain this paper, because it's important at this point. As you see, going forward, DRAM chips cannot be just the usual old DRAM chips where we control everything from the memory controller. They need to be smarter, they need to be more robust, they need to be more reliable, and that requires, as we go forward, implementing better maintenance mechanisms inside the DRAM chips. This is important because, as I said earlier, making even a simple change in the communication protocol between DRAM and the memory controller is a very painful and high-latency process, because you have to convince all the different parties in the consortium. As a result, we have some maintenance mechanisms that are implemented with a huge delay. We discovered RowHammer as a widespread phenomenon in 2014, and the first promising solution came out in 2024, so after 10 years. It took 10 years for
the communication protocol to adapt to, or address, this kind of need, and this paper proposes a solution to this whole problem in the industry. It is called Self-Managing DRAM, and it enables autonomous in-DRAM maintenance operations. The key idea is that we just need to let the DRAM take some time off when it needs to perform maintenance operations. It prevents the memory controller from accessing the DRAM regions that are under maintenance by rejecting the row activation commands to those regions. It's a simple addition to the DRAM communication protocol: basically, we add a signal from the DRAM chip back to the memory controller, called the ACT_NACK pin. I think you will find some similarity between this pin and the signal that Osam will mention tomorrow when talking about the Rowhammer solution.

So basically, when the DRAM chip decides to do some maintenance operation (it can be a refresh operation, Rowhammer mitigation, memory scrubbing, anything), it blocks a memory region. If the memory controller tries to activate a row in that memory region, the chip sends a signal called ACT_NACK, so the memory controller understands that the activation was not performed. It does not try activating for some time, and then it tries again. By doing that, we can enable a lot of maintenance mechanisms. In this paper we talk about three of them: refresh, Rowhammer protection, and memory scrubbing, and you can implement a whole variety of such maintenance mechanisms, actually. Our goal here is to ease the process of enabling new DRAM maintenance operations, and we show that we can easily implement those without any additional changes to the memory communication protocol. By doing that, we can even improve performance while enabling all these maintenance operations. Hasan is the first author of this paper, and he is an alumnus of
SAFARI, and he has a full talk on YouTube explaining all the details. I want to encourage you to go there and listen to it, and also to refer to the paper to understand the details. It actually talks about the underlying and overlooked problem of the DRAM industry, in a sense. It's quite important, and it's really interesting going forward for enabling better, more robust DRAM chips. I'm not going to get into this other paper. Before that, are there any questions about Self-Managing DRAM or the other things? Yes, go ahead.

Yeah, so I didn't get into the details. Basically, when you do maintenance operations inside the DRAM module or DRAM chip, you have way more freedom to do things in the way that works for you than when you do them in the memory controller. One example is that you can enable subarray-level parallelism. Inside a DRAM bank you have hundreds of subarrays, and in the current communication protocol, if you refresh one row, 8 kilobytes of data, you actually block megabytes of data, because the whole bank is blocked during that refresh. But internally you don't have to do that. You can leverage subarray-level parallelism: you can access one subarray while doing the maintenance operation in other subarrays. So you can overlap latencies, you can hide things, and you can skip refreshing some rows if they do not need to be refreshed. You can track activation counts with much higher accuracy compared to the memory controller, so you can reduce the number of maintenance operations, and you can actually perform them at a lower latency as well when you do them in DRAM. Other questions?

Okay, so we have more papers that were published this year. ABACuS is a paper on Rowhammer mitigation published at USENIX Security in August, and CoMeT is an HPCA paper that is also about Rowhammer mitigation. Nisa, in the front row, will talk about both of them tomorrow, so please attend the lecture. Spatial Variation-Aware Read Disturbance
Defenses is my paper that was published earlier this year; I'll also talk about it tomorrow. And this is the reference to my thesis, for a nice summary of the related work. The projects that I talked about today are also included in the thesis, so you can read about them there as well.

One thing to mention on top of all this is the effect of aging on Rowhammer bit flips, which we do not know. This is an ongoing problem: nobody knows the effect of aging on Rowhammer bit flips, so we urgently need more characterization work to understand how DRAM aging affects Rowhammer bit flips. We have some preliminary data, published in the paper that I mentioned, where aging can lead Rowhammer bit flips to occur at smaller hammer counts. What we did was keep hammering some rows for many days, and then we observed in a small fraction of rows that the activation count threshold can be reduced a little bit, or quite significantly in some cases. This is some motivational data. I'm not claiming that aging has a huge effect, but the data hints that there's something there, and if you can find what it is, it can be really impactful work. So I want to encourage you to do such studies. We have all the infrastructure if you want to use it.

We have an upcoming paper called BreakHammer that reduces the performance overhead of existing mitigation mechanisms by throttling the suspect threads. It will appear at MICRO next month, so it's not published yet, but I can quickly summarize it. Our key observation is that Rowhammer mitigation has some significant performance overheads, and it's a new attack surface for exploits like denial of service. You can keep triggering these mechanisms so that they keep the DRAM chips busy and the DRAM chip cannot service your memory requests. As a result, you end up denying the memory service, and it causes a lot of
performance overhead. It's a very charming attack for malicious people. Our key idea is to throttle the memory accesses of the hardware threads that keep triggering these maintenance operations. To do so, we propose a new mechanism called BreakHammer that detects the threads that repeatedly trigger these maintenance operations and limits their in-flight memory requests by limiting the cache miss buffers that they use. We can implement it at near-zero overhead with no additional memory access latency, and we can improve system performance quite a lot by avoiding, or mitigating the effect of, denial-of-service attacks. Probably after MICRO, in a future session, we will talk about it in detail as well.

We have more solutions in industry as well that we will talk about. When you look at the programs of the major conferences in 2024, you can see a lot of papers about Rowhammer. You have workshops fully about Rowhammer: all the papers in this DRAM security workshop are about Rowhammer. You have circuit-level simulation studies, like this one, which Haocong will talk about tomorrow. And in the upcoming MICRO conference we have a session dedicated to Rowhammer, with three Rowhammer papers.

Going forward, we have more robustness challenges, because as the DRAM cells get smaller with technology scaling, we observe that the minimum hammer count needed to induce a bit flip has dropped by more than two orders of magnitude, and that affects memory robustness quite a lot. So memory is becoming less reliable and more vulnerable, and we need better mechanisms. There can also be other error mechanisms in the circuit that we do not know about today, and we need to discover them so that we can mitigate their effects whenever they come. For future main memory robustness we need to look at DRAM, but we shouldn't stop with DRAM. We should look at flash memory, which is also a very
unreliable memory technology. We have smart mechanisms around it that make it work, but it also gets worse, so we need to look at that. We need to look at emerging memory technologies too. There are some futuristic papers that were published; this one was published in 2009, quite early, and it looks at the trade-offs of using phase-change memory as main memory, so I want to refer you to that as well.

So the takeaway is that, going forward, we need intelligent memory controllers to enhance robustness and enable better scaling. We need to understand, architect, and design better testing methodologies for memory systems, and we need better infrastructures. These are some pictures of our infrastructures for DRAM and flash memory. And I'm going to just show you a lot of bridges that have collapsed. These bridges were not robust enough, and this is a disaster, and we do not want similar disasters in our computing systems. So we need intelligent memory controllers to avoid such failures. Main memory needs intelligent controllers for security, safety, reliability, and scalability, and we need fundamentally robust computer architectures. We don't want to have just temporary patches.

Final thoughts. This is a sort of Byzantine failure, where you can have undetected erroneous computation. It's not something where you can fail fast and get a direct error message. You need to do a lot of characterization, you need to understand your circuitry really well, and erroneous behavior of the circuitry can be exploited and used maliciously, so we need to avoid all of these. This is a reference to the Byzantine Generals Problem; you can take a look at that if you're curious.

And this is not something new. Last thing, I promise. Before Rowhammer we also had some attacks. Basically, you put a light bulb next to the memory, it heats up the memory, and then you have a
lot of data retention errors, and you can attack systems like this. This is a paper that was published at Security and Privacy in 2003. With Rowhammer, such attacks are much simpler and much more exploitable, and they can be induced by software. There are a lot of news articles about that as well. So Rowhammer set up a new mindset. It renewed the interest in hardware security attack research, and it shows us, repeatedly actually, that hardware security is important and that we need to understand our systems better. We have many Rowhammer and bit-flip attacks, tons of papers come out every year, and many new Rowhammer solutions as well, with more to come. This shifted the mindset of mainstream security researchers, and now they're looking at the hardware as well. It's also not a coincidence that the two groups that discovered Meltdown and Spectre also worked on Rowhammer before. All of these are related to each other, so there is more to come.

In conclusion, memory reliability is reducing. Rowhammer is a prime example of that, and it's the first example of how a simple hardware failure mechanism can be exploited and become a widespread system security problem. The bad news is that Rowhammer is getting worse, and the good news is that we have a lot to do. So if you're curious, we have a lot to do, we have a lot of infrastructure available, and you can come work with us.

Okay, I'm just going to skip this because we're running out of time, and I'm going to show you some SAFARI pictures. This is our group. We have newsletters, we have a lot of updates on DRAM disturbance and also on other topics that we're working on, and you can find our lectures. Thank you to the funders. And you can access essentially all the papers at this link. Please don't hesitate to read our papers. I encourage you to read them and also to get hands-on with our tools; we open-source everything on GitHub. We have this new infrastructure for simulation, Ramulator 2, and DRAM Bender,
which I keep talking about, the FPGA-based infrastructure. Haocong will talk about them tomorrow. We have a P&S course about this, and we have some other lectures already on YouTube that you can follow. This is the story of Rowhammer only on the surface. For deeper knowledge, please come fresh to our lecture tomorrow, so don't get hammered tonight, and let's see each other tomorrow and get into more details. I can stay and answer your questions, but we are out of time and they're catching another lecture now, I guess. So I'll stay a little longer to answer your questions, but we should wrap up the session here, I guess. Thank you very much.
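
The ACT_NACK handshake described in the lecture (the DRAM chip locks a region for maintenance, rejects activations to it, and the memory controller waits and retries) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the mechanism from the Self-Managing DRAM paper: the class and function names, the notion of a numbered "region", the cycle counts, and the fixed retry interval are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DramChip:
    """Toy DRAM chip that can lock regions for in-DRAM maintenance."""
    # region id -> cycle at which the ongoing maintenance finishes (hypothetical model)
    locked_until: dict = field(default_factory=dict)

    def start_maintenance(self, region, now, duration):
        """Lock a region (e.g. for refresh, Rowhammer mitigation, or scrubbing)."""
        self.locked_until[region] = now + duration

    def activate(self, region, now):
        """Return True if the ACT is accepted, False to signal ACT_NACK."""
        end = self.locked_until.get(region)
        if end is not None and now < end:
            return False  # region is under maintenance: reject the activation
        self.locked_until.pop(region, None)  # maintenance (if any) is over
        return True


def controller_activate(chip, region, now, retry_interval=4, max_retries=100):
    """Issue an ACT; on each ACT_NACK, wait retry_interval cycles and retry.

    Returns (cycle at which the ACT was accepted, number of NACKs seen).
    """
    nacks = 0
    while not chip.activate(region, now):
        nacks += 1
        if nacks > max_retries:
            raise RuntimeError("region stayed locked for too long")
        now += retry_interval
    return now, nacks


chip = DramChip()
chip.start_maintenance(region=3, now=0, duration=10)
print(controller_activate(chip, region=3, now=2))  # (10, 2): two NACKs, then accepted
print(controller_activate(chip, region=5, now=2))  # (2, 0): other regions stay accessible
```

The second call shows the point made in the lecture: only the region under maintenance is blocked, so accesses to other regions proceed without any extra delay.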

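The throttling idea summarized for BreakHammer (find the hardware threads that keep triggering mitigation actions and limit the cache miss buffers they can use) can be sketched in the same spirit. The paper was not yet published at the time of the lecture, so this is not its algorithm: the outlier rule based on the mean, the quota values, and the function name `mshr_quotas` are assumptions made purely for illustration.

```python
def mshr_quotas(trigger_counts, full_quota=16, min_quota=1, outlier_factor=2.0):
    """Assign each hardware thread a cache-miss-buffer quota.

    trigger_counts maps a thread id to the number of mitigation actions that
    thread triggered in the current observation window. A thread is treated
    as a suspect if it triggered noticeably more actions than the average
    thread (a hypothetical rule), and suspects get a reduced quota so they
    can have fewer memory requests in flight.
    """
    if not trigger_counts:
        return {}
    mean = sum(trigger_counts.values()) / len(trigger_counts)
    quotas = {}
    for thread_id, count in trigger_counts.items():
        suspect = count > 0 and count > outlier_factor * mean
        quotas[thread_id] = min_quota if suspect else full_quota
    return quotas


# Thread 2 triggers far more mitigation actions than the others.
counts = {0: 2, 1: 1, 2: 40, 3: 0}
print(mshr_quotas(counts))  # {0: 16, 1: 16, 2: 1, 3: 16}
```

Benign threads keep their full quota, which matches the goal stated in the lecture of not adding memory access latency for threads that do not keep triggering the mitigation.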