Transcript for:
Understanding the Impact of Statistics

The world we live in is awash with data that comes pouring in from everywhere around us. On its own, this data is just noise and confusion. To make sense of data, to find the meaning in it, we need the powerful branch of science: statistics. Believe me, there's nothing boring about statistics, especially not today, when we can make the data sing. With statistics we can really make sense of the world. And there's more: with statistics, the data deluge, as it's being called, is leading us to an ever greater understanding of life on Earth and the universe beyond. And thanks to the incredible power of today's computers, it may fundamentally transform the process of scientific discovery. I kid you not, statistics is now the sexiest subject around. [Music]

Did you know that there are one million boats in Sweden? That's one boat for every nine people. It's the highest number of boats per person in Europe. [Music]

Being a statistician, you don't like telling your profession at dinner parties. But really, statisticians shouldn't be shy, because everyone wants to understand what's going on, and statistics gives us a perspective on the world we live in that we can't get in any other way. [Music]

Statistics tells us whether the things we think and believe are actually true. And statistics are far more useful than we usually like to admit. In the last recession there was this famous call-in to a talk radio station. The man complained: "In times like this, when unemployment rates are up to 13%, and income has fallen by 5%, and suicide rates are climbing, I get so angry that the government is wasting money on things like collection of statistics."

I'm not officially a statistician. Strictly speaking, my field is global health. But I got really obsessed with stats when I realized how much people in Sweden just don't know about the rest of the world. I started in our medical university, Karolinska Institute, an undergraduate course called Global Health. These students coming to us actually have the highest grade you can get in the Swedish
college system, so I thought maybe they know everything I'm going to teach them about. So I did a pre-test when they came, and one of the questions from which I learned a lot was this one: which country has the highest child mortality of these five pairs? I won't put you to a test here, but it's Turkey which is highest there, Poland, Russia, Pakistan and South Africa. And these were the results of the Swedish students: 1.8 right answers out of five possible. That means that there was a place for a professor of international health, and for my course. But one late night when I was compiling the report, I really realized my discovery: I have shown that Swedish top students know statistically significantly less about the world than the chimpanzees. Because the chimpanzees would score half right: if I gave them two bananas with Sri Lanka and Turkey, they would be right half of the cases. But the students are not there. I did also an unethical study of the professors of the Karolinska Institute, that hands out the Nobel Prize in Medicine, and they are on par with the chimpanzees there.

Today there's more information accessible than ever before, and I work with my team at the Gapminder Foundation using new tools that help everyone make sense of the changing world. We draw on the masses of data that's now freely available from international institutions like the UN and the World Bank, and it's become my mission to share the insights from this data with anyone who will listen, and to reveal how statistics is nothing to be frightened of. [Music]

I'm going to provide you a view of the global health situation across mankind, and I'm going to do that in a hopefully enjoyable way, so relax. So we did the software which displays it like this. Every bubble here is a country: this is China, this is India. The size of the bubble is the population. And I'm going to stage a race here between this sort of yellowish Ford here, and the red Toyota down there, and the brownish Volvo. The Toyota has a very bad start down
here, and the United States Ford is going off-road then, and the Volvo is doing quite fine. This is the war: the Toyota got off track. And now Toyota is coming on the healthier side of Sweden. That's about where I sold the Volvo and bought the Toyota. This was the Great Leap Forward, when China fell down. It was central planning by Mao Zedong. China recovered, and they said, "Never more stupid central planning," but they went up here. No, there was one more inequity. Look there, United States. Oh, they broke my frame! Washington, D.C. is so rich over there, but they are not as healthy as Kerala in India. It's quite interesting, isn't it?

Welcome to the USA, world leaders in big cars and free data. There are many here who share my vision of making public data accessible and useful for everyone. The city of San Francisco is in the lead, opening up its data on everything. Even the police department is releasing all its crime reports. This official crime data has been turned into a wonderful interactive map by two of the city's computer whizzes. It's community statistics in action. [Music]

Crimespotting is a map of crime reports from the San Francisco Police Department, showing, you know, dots on maps for citizens to be able to see patterns of crime around their neighborhoods in San Francisco. The map is not just about individual crimes but about broader patterns that show you where crime is clustered around the city, which areas have high crime and which areas have relatively low crime. [Music]

We are here at the top of Jones Street on Nob Hill, quite a nice neighborhood. What the crime maps show us is the relationship between topography and crime: basically, the higher up the hill, the less crime there is. Cross over the border into the flats. [Music] Essentially, as soon as you get into the kind of lower-lying areas of Jones Street, the crime just skyrockets. [Music]

So we're here in the Uptown Tenderloin district. It's one of the oldest and densest neighborhoods in San Francisco. This is where you go to buy drugs, right around
here. [Music] We see lots of aggravated assaults, lots of auto thefts. Basically, the huge part of the crime that happens in the city happens just right in this five- or six-block radius.

If you've been hearing police sirens in your neighborhood, you can use the map to find out why. If you're out at night in an unfamiliar part of town, you can check the map for streets to avoid. If a neighbor gets burgled, you can see: is it a one-off, or has there been a spike in local crime?

If you commute through a neighborhood and you're worried about its safety, the fact that we have the ability to turn off all the nighttime and middle-of-the-day crimes and show you just the things that are happening during the commute is a statistical operation. But I think to people that are interacting with the thing, it feels very much more like they're just sort of browsing a website or, you know, shopping on Amazon. They're looking at data and they don't realize they're doing statistics.

What's most exciting for me is that public statistics is making citizens more powerful and the authorities more accountable. [Music]

We have community meetings that the police attend, and what citizens are now doing is bringing printouts of the maps that show where crimes are taking place, and they're demanding services from the police department. And the police department is now having to change how they police, how they provide policing services, because the data is showing what is working and what is not.

People in San Francisco are also using public data to map social inequalities and see how to improve society, and the possibilities are endless.

I think our dream government data analysis project would really be focused on live information, on stuff that was being reported and pushed out to the world over the internet as it was happening: you know, trash pickups, traffic accidents, buses. And I think through the kind of stats-gathering power of the internet, it's possible to really begin to see the workings of the city
displayed as a unified interface.

So that's where we are heading: towards a world of free data, with all the statistical insights that come from it accessible to everyone, empowering us as citizens and letting us hold our rulers to account. It's a long way from where statistics began. Statistics are essential to us to monitor our governments and our societies, but it was our rulers up there who started the collection of statistics in the first place, in order to monitor us. [Applause]

In fact, the word statistics comes from the state. Modern statistics began two centuries ago. Once it got going, it spread and never stopped. And guess who was first? The Chinese have Confucius, the Italians have da Vinci, and the British have Shakespeare. And we have the Tabellverket, the first ever systematic collection of statistics. Since the year 1749 we have collected data on every birth, marriage and death, and we are proud of it. [Music]

The Tabellverket recorded information from every parish in Sweden. It was a huge quantity of data, and it was the first time any government could get an accurate picture of its people. [Music]

Sweden had been the greatest military power in northern Europe, but by 1749 our star was really fading and other countries were growing stronger. At least, though, we were a large power, thought to have 20 million people, enough to rival Britain and France. [Music] But we were in for a nasty surprise. [Music]

The first analysis of the Tabellverket revealed that Sweden only had 2 million inhabitants. Sweden was not only a power in decline, it also had a very small population. The government was horrified by this finding: what if the enemy found out? But the Tabellverket also showed that many women died in childbirth and many children died young, so government took action to improve the health of the people. This was the beginning of modern Sweden.

It took more than 50 years before the Austrians, Belgians, Danes, Dutch, French, Germans, Italians and finally the British caught up with Sweden in collecting
and using statistics.

It was called political arithmetic. That was a lovely phrase that was used for statistics. Governments could have much more control and understanding of this society, how it was working, how it was developing, and essentially so they could control it better. [Music]

It wasn't just governments who woke up to the power of statistics. Right across Europe, 19th-century society went mad for facts, and despite its late start, Britain, with its Royal Statistical Society in London, was soon a statistician's nirvana. [Music]

I love looking at old copies of the Royal Statistical Society journal, because it's full of such odd stuff. There's a wonderful paper from the 1840s which shows a map of England and the rates of bastardy in each county, and so you can identify very quickly the areas with high rates of bastardy. Big in East Anglia. It always makes me slightly laugh that Norfolk seems to top the bastardy league in the 1840s.

One of the founders of the Royal Statistical Society was the great Victorian mathematician and inventor Charles Babbage. In 1842 he read the latest poem by an equally great Victorian, Alfred Tennyson. "The Vision of Sin" contained the lines: "Fill the cup, and fill the can, have a rouse before the morn. Every moment dies a man, every moment one is born." So keen a statistician was Babbage that he could not contain himself. He dashed off a letter to Tennyson explaining that, because of population growth, the line should read: "Every moment dies a man, and one and a sixteenth is born. I may add that the exact figure is 1.067, but something must be conceded to the laws of meter."

In the 19th century, scholars all over Europe did amazing work in measuring their societies. They were hoovering up data on almost everything. But numbers alone don't tell you anything. You have to analyze them, and that's what makes statistics.

When the first statisticians began to get to grips with analyzing their data, they seized upon the average, and they took the average of everything. [Music]

What's so
great about an average is that you can take a whole mass of data and reduce it to a single number. [Music] And though each of us is unique, our collective lives produce averages that can characterize whole populations.

I looked at my local newspaper one week and saw a pensioner had accidentally put her foot on the accelerator and crushed her friend against a wall. Devastating, hideous, horrible thing to happen. And then there was a second one, about a young man who didn't have a driving license, was driving a car under the influence of drugs and alcohol, and he bashed into a pedestrian, killed him. What's remarkable, absolutely remarkable: if you look at the number of people who die each year in traffic crashes, it's nearly a constant. What? All these individual events, somehow when you sum them all up, there's the same number every year. And every year two and a half times as many men die in traffic crashes as women, and it's a constant. And every year the rate in Belgium is double the rate in England. There are these remarkable regularities, so that these individual, particular events sum up into a social phenomenon.

Let's see what Sweden has done. We used to boast about fast social progress. That's where we were.

In my lectures, to tell stories about the changing world, I use the averages from entire countries, whether the average of income, child mortality, family size or carbon output.

Okay, I give you Singapore. The year I was born, Singapore had twice the child mortality of Sweden. It's the most tropical country in the world, a marshland on the equator. And here we go. It took a little time for them to get independent, but then they started to grow their economy, and they made the social investment. They got away malaria. They got a magnificent health system that beat both the U.S. and Sweden. We never thought it would happen, that they would win over Sweden.

But useful as averages are, they don't tell you the whole story. [Music]

On average, Swedish people have slightly less than two legs. This is because
a few people only have one leg, or no legs, and no one has three legs. So almost everybody in Sweden has more than the average number of legs. The variation in data is just as important as the average. [Music]

But how do you get a handle on variation? For this, you transform numbers into shapes. [Music] Let's look again at the number of adult women in Sweden for different heights. Plotting the data as a shape shows how much their heights vary from the average and how wide that variation is. The shape a set of data makes is called its distribution.

This is the income distribution of China, 1970. This is the income distribution of the United States, 1970. Almost no overlap. And what has happened? China is growing, it's not so equal any longer, and it's appearing here, overlooking the United States, almost like a ghost, isn't it? It's pretty scary. [Music]

The statisticians who first explored distributions discovered one shape that turned up again and again. The Victorian scholar Francis Galton was so fascinated he built a machine that could reproduce it. And he found it fitted so many different sets of measurements that he named it the normal distribution. Whether it was people's arm spans, lung capacities or even their exam results, the normal distribution shape recurred time and time again.

Other statisticians soon found many other regular shapes, each produced by a particular kind of natural or social process. And every statistician has their favorite.

The Poisson distribution, the Poisson shape, is my favorite distribution. I think it's an absolute cracker. [Music]

The Poisson shape describes how likely it is that out-of-the-ordinary things will happen. Imagine a London bus stop where we know that on average we'll get three buses an hour. We won't always get three buses, of course. Amazingly, the Poisson shape will show us the probability that in any given hour we will get four, five or six buses, or no buses at all. The exact shape changes with the average, but whether it's how many people will win the
lottery jackpot each week, or how many people will phone a call center each minute, the Poisson shape will give the probabilities.

The wonderful example where this was applied, in the late 19th century, was to count each year the number of Prussian officers, cavalry officers, who were kicked to death by their horses. Now, some years there were none, some years there was one, some years there were two, up to seven, I think, one particularly bad year. But this distribution of how many years there were with nought, one, two, three, four Prussian cavalry officers kicked to death by their horses beautifully obeyed the Poisson distribution. [Applause] [Music]

So statisticians use shapes to reveal the patterns in the data. But we also use images of all kinds to communicate statistics to a wider public, because if the story in the numbers is told by a beautiful and clever image, then everyone understands. Of the pioneers of statistical graphics, my favorite is Florence Nightingale. [Music]

There are not many people who realize that actually she was known as a passionate statistician, and not just the Lady of the Lamp. She said that to understand God's thoughts we must study statistics, for these are the measure of his purpose. Statistics was for her a religious duty and moral imperative. [Music]

When Florence was nine years old, she started collecting data. Her data was different fruits and vegetables she found. She put them into different tables, trying to organize them in some standard form. And so we have one of Nightingale's first statistical tables, at the age of nine.

In the mid-1850s Florence Nightingale went to the Crimea to care for British casualties of war. She was horrified by what she discovered. For all the soldiers being blown to bits on the battlefield, there were many, many more soldiers dying from diseases they caught in the army's filthy hospitals. So Florence Nightingale began counting the dead. For two years she recorded mortality data in meticulous detail. When the war was over, she
persuaded the government to set up a royal commission of inquiry, and gathered her data in a devastating report. What has cemented her place in the statistical history books are the graphics she used, and one in particular: the polar area graph. For each month of the war, a huge blue wedge represented the soldiers who had died from preventable diseases. The much smaller red wedges were deaths from wounds, and the black wedges deaths from accidents and other causes. Nightingale's graphics were so clear they were impossible to ignore.

The usual thing around Florence Nightingale's time was just to produce tables and tables of figures, absolutely really tedious stuff, and unless you're an absolutely dedicated statistician, it's really quite difficult to spot the patterns quite naturally. But visualizations, they tell a story. They tell a story immediately, and the use of color and the use of shape, you know, can really tell a powerful story. And nowadays, of course, we can make things move as well. Florence Nightingale would have loved to have played with that. She would have produced wonderful animations, I'm absolutely certain of it.

Today, 150 years on, Nightingale's graphics are rightly regarded as a classic. They led to a revolution in nursing, healthcare and hygiene in hospitals worldwide, which saved innumerable lives. And statistical graphics has become an art form of its very own, led by designers who are passionate about visualizing data. [Music]

This is the Billion Pound-o-Gram. This image arose out of frustration with the reporting of billion-pound amounts in the media. 500 billion pounds for this war, 50 billion pounds for this oil spill: it doesn't kind of make sense. These numbers are too enormous to get your mind around. So I scraped all this data from various news sources and created this diagram. So the squares here are scaled according to the billion-pound amounts. When you see numbers visualized like this, you start to have a different kind of relationship with them. You can start to see
patterns. You can see the scale of them. Here in the corner, this little square, 37 billion: that was the predicted cost of the Iraq war in 2003. As you can see, it's grown exponentially over the last few years, so the total cost now is round about 2,500 billion. It's funny, because when you visualize statistics like this, you understand them, and when you understand them, you can really start to put things into perspective.

Visualization is right at the heart of my own work too. I teach global health, and I know having the data is not enough. I have to show it in ways people both enjoy and understand. Now I'm going to try something I've never done before: animating the data in real space, with a bit of technical assistance from the crew. So here we go.

First, an axis for health: life expectancy from 25 years to 75 years. And down here, an axis for wealth: income per person, 400, 4,000 and 40,000 dollars. So down here is poor and sick, and up here is rich and healthy.

Now I'm going to show you the world 200 years ago, in 1810. Here come all the countries: Europe brown, Asia red, Middle East green, Africa south of Sahara blue, and the Americas yellow. And the size of the country bubble shows the size of the population. And in 1810 it was pretty crowded down there, wasn't it? All countries were sick and poor. Life expectancy was below 40 in all countries, and only the UK and the Netherlands were slightly better off, but not much.

And now I start the world. The Industrial Revolution makes countries in Europe and elsewhere move away from the rest, but the colonized countries in Asia and Africa, they are stuck down there. And eventually the western countries get healthier and healthier. And now we slow down to show the impact of the First World War and the Spanish flu epidemic. What a catastrophe! And now I speed up through the 1920s and the 1930s, and in spite of the Great Depression, western countries forge on towards greater wealth and health. Japan and some others try to follow, but most countries stay down here. Now, after
the tragedies of the Second World War, we stop a bit to look at the world in 1948. 1948 was a great year: the war was over, Sweden topped the medal table at the Winter Olympics, and I was born. But the differences between the countries of the world were wider than ever. The United States was in the front, Japan was catching up, Brazil was way behind, Iran was getting a little richer from oil but still had short lives, and the Asian giants, China, India, Pakistan, Bangladesh and Indonesia, they were still poor and sick down here. But look what is about to happen. Here we go again.

In my lifetime, former colonies gained independence, and then finally they started to get healthier and healthier and healthier. And in the 1970s, then, countries in Asia and Latin America started to catch up with the western countries. They became the emerging economies. Some in Africa follow, some Africans were stuck in civil war, and others hit by HIV. And now we can see the world today, in the most up-to-date statistics.

Most people today live in the middle. But there is a huge difference at the same time between the best-off countries and the worst-off countries, and there are also huge inequalities within countries. These bubbles show country averages, but I can split them. Take China: I can split it into provinces. There goes Shanghai. It has the same wealth and health as Italy today. And there is the poor inland province Guizhou. It is like Pakistan. And if I split it further, the rural parts are like Ghana in Africa.

And yet, despite the enormous disparities today, we have seen 200 years of remarkable progress. That huge historical gap between the West and the rest is now closing. We have become an entirely new converging world, and I see a clear trend into the future: with aid, trade, green technology and peace, it's fully possible that everyone can make it to the healthy, wealthy corner. [Music]

Well, what you have seen in the last few minutes is a story of 200 countries shown over 200 years and beyond. It involves plotting 120,000
numbers. Pretty neat. [Music]

So with statistics we can begin to see things as they really are. From tables of data to averages, distributions and visualizations, statistics gives us a clear description of the world. But with statistics we can not only discover what is happening but also explore why, by using the powerful analytical method: correlation.

Just looking at one thing at a time doesn't tell you very much. You've got to look at the relationships between things, how they change, how they vary together. And that's what correlation is about, and that's how you start trying to understand the processes that are really going on in the world and in society.

Most of us today would recognize that crime correlates to poverty, that infection correlates to poor sanitation, and that knowledge of statistics correlates to being great at dancing.

Correlations can be very tricky. I've got a joke about silly correlations. There was this American who was afraid of heart attack, and he found out that the Japanese ate very little fat and almost didn't drink wine, but they had much fewer heart attacks than the Americans. But on the other hand, he also found out that the French eat as much fat as the Americans and they drink much more wine, but they also have fewer heart attacks. So he concluded that what kills you is speaking English. [Music]

The best example of a really groundbreaking correlation is the link that was established in the 1950s between smoking and lung cancer. Not long after the Second World War, a British doctor, Richard Doll, investigated lung cancer patients in 20 London hospitals, and he became certain that the only thing they had in common was smoking. So certain that he stopped smoking himself. But other people weren't so sure. [Music]

A lot of the discussion of the early data linking smoking to lung cancer said, well, it's not the smoking, surely, that thing that we've done all our lives, that can't be bad for you. Maybe it's genes. Maybe people who
are genetically predisposed to get lung cancer are also genetically predisposed to smoke. Maybe it's not the smoking, maybe it's air pollution, that smokers are somehow more exposed to air pollution than non-smokers. Maybe it's not smoking, maybe it's poverty. So now we've got three alternate explanations, apart from chance.

To verify his correlation did imply cause and effect, Richard Doll created the biggest statistical study of smoking yet. He began tracking the lives of 40,000 British doctors, some of whom smoked and some of whom didn't, and gathered enough data to correlate the amount the doctors smoked with their likelihood of getting cancer. Eventually he not only showed a correlation between smoking and lung cancer, but also a correlation between stopping smoking and reducing the risk. This was science at its best.

What correlations do not replace is human thought. You could think about what it means. I mean, what a good scientist does, if he comes up with a correlation, is try as hard as she or he possibly can to disprove it, to break it down, to get rid of it, to try and refute it. And if it withstands all those efforts at demolishing it and it's still standing up, then cautiously you say, we really might have something here. [Music] [Applause]

However brilliant the scientist, data is still the oxygen of science. The good news is that the more we have, the more correlations we'll find, the more theories we'll test, and the more discoveries we are likely to make. And history shows how our total sum of information grows in huge leaps as we develop new technologies.

The invention of the printing press kicked off the first data and information explosion. If you piled up all the books that had been printed by the year 1700, they would make 60 stacks, each as high as Mount Everest. Then, starting in the 19th century, there came a second information revolution, with the telegraph, gramophone and camera, and later radio and TV. The total amount of information exploded, and by the 1950s the information available to
us all had multiplied six thousand times. Then, thanks to the computer and later the internet, we went digital, and the amount of data we have now is unimaginably vast. [Music]

A single letter printed in a book is equivalent to a byte of data. A printed page equals a kilobyte or two. [Music] 5 megabytes is enough for the complete works of Shakespeare. 10 gigabytes, that's a DVD movie. Two terabytes is the tens of millions of photos added to Facebook every day. [Music] Ten petabytes is the data recorded every second by the world's largest particle accelerator, so much that only a tiny fraction is kept. Six exabytes is what you'd have if you sequenced the genomes of every single person on Earth. [Music] But really, that's nothing. In 2009 the internet added up to 500 exabytes, and in 2010, in just one year, that will double to more than one zettabyte. [Music]

Back in the real world, if we turned all this data into print, it would make 90 stacks of books, each reaching from here all the way to the sun. The data deluge is staggering, but with today's computers and statistics, I'm confident we can handle it. [Music]

When it comes to all the data on the internet, the powerhouse of statistical analysis is the Silicon Valley giant Google.

The average person over their lifetime is exposed to about 100 million words of conversation. And so if you multiply that by the six billion people on the planet, that amount of words is about equal to the number of words that Google has available at any one instant in time.

Google's computers hoover up and file away every document, web page and image they can find. They then hunt for patterns and correlations in all this data, doing statistics on a massive scale. And for me, Google has one project that's particularly exciting: statistical language translation.

We wanted to provide access to all the web's information, no matter what language you spoke. There's just so much information on the internet, you couldn't hope to translate it all by hand into every possible language. We figured
we'd have to be able to do machine translation.

In the past, programmers tried to teach their computers to see each language as a set of grammatical rules, much like the way languages are taught at school. But this didn't work, because no set of rules could capture language in all its subtlety and ambiguity.

"Having eaten our lunch, the coach departed." Well, that's obviously incorrect. Written like that, it would imply that the coach has eaten the lunch. It would be far better to say, "Having eaten our lunch, we departed in the coach."

Those rules are helpful and they are useful most of the time, but they don't turn out to be true all the time. And the insight of using statistical machine translation is saying, well, if you've got to have all these exceptions anyways, maybe you can get by without having any of the rules. Maybe you can treat everything as an exception. And that's essentially what we've done.

What the computer is doing when it's learning how to translate is to learn correlations between words and correlations between phrases. So we feed the system very large amounts of data, and then the system is seeing that a certain word or a certain phrase correlates very often to the other language.

Google's website currently offers translation between any of 57 different languages. It does this purely statistically, having correlated a huge collection of multilingual texts.

The people that build the system don't need to know Chinese in order to build a Chinese-English system. They don't need to know Arabic. But the expertise that's needed is basically knowledge of statistics, knowledge of computer science, knowledge of infrastructure to build those very large computational systems that we are building for doing that.

Okay, so then I'm going to invite...

I hooked up with Google from my office in Stockholm to try the translator for myself. Okay. [Music]

Okay, so it says, "Sweden's finance minister has a ponytail and a gold ring in your ear." So I guess it probably means "in his ear." Correct, it's
amazing he comes from the conservative party that's the kind of sweden we have today and i will type one more sentence exceed some [Music] in his same-sex partnerships has stockholm's new bishop and his partners are three years almost perfect unusual it's one important thing it's her it's a lesbian partnership okay so that's uh those kinds of words his and her are one of the challenges in translation to get really those right in empty when it comes to bishops one can excuse it right right so i guess more often than not it would probably be a his i would write one more sentence okay when sweden is taking part in the olympic goal is not to win but to beat norway yes this is what it is but they are very good in winter olympics so we can't make it but we are trying ah very good very good this is absolutely amazing you know and i was especially uh impressed that it picked up words like same-sex partnerships which are very new to the language the translator is good but if they succeed with what's next that'll be remarkable one of the exciting possibilities is combining the machine translation technology with the speech recognition technology now both of these are statistical in nature the machine translation relies on the statistics of mapping from one language to another and similarly speech recognition relies on the statistics of mapping from a sound form to the words when we put them together now we have the capability of having instant conversation between two people that don't speak a common language that uh i can talk to you in my language you hear me in your language and you can answer back and in real time we can make that translation we can bring people together and allow them to speak [Music] the internet is just one of many technologies created to gather massive amounts of data scientists studying our earth and our environment now use an incredible range of instruments to measure the processes of our planet all around us are sensors
continuously measuring temperature water flow and ocean currents and high in orbit are satellites busy imaging cloud formations forest growth and snow cover scientists speak of instrumenting the earth and pointing up to the skies above are powerful new telescopes mapping the universe what's happening in astronomy is typical of how profoundly this new torrent of data is transforming science [Music] astronomers are now addressing many enduring mysteries of the cosmos by applying statistical methods to all this new data [Music] the galaxy's a very big place and it's got billions of stars in it and so to put together a coherent picture of the whole galaxy requires having an enormous amount of data and before you could do a large sky survey with sensitive digital detectors that meant that you could map many many stars all at once it was very difficult to build up enough data on enough of the galaxy in the past large surveys of the night sky had to be done by exposing thousands of large photographic plates but these surveys could take 25 years or more to complete then in the 1990s came digital astronomy and a huge increase in both the amount and the accessibility of data the sloan sky survey is the world's biggest yet using a massive digital sensor mounted on the back of a custom-built telescope in new mexico it's scanned the sky night after night for eight years building up a composite picture in unprecedented resolution the sloan is some of the best deepest survey data that we have in astronomy both on our own galaxy and on galaxies further away from ours [Music] all the sloan data is on the internet and with it astronomers have identified millions of hitherto unknown stars and galaxies they also comb the database for statistical patterns which will prove disprove or even suggest new theories so we have this idea that galaxies grow they become large galaxies like the one we live in the milky way uh not all at once or not smoothly but by continuously incorporating
basically cannibalizing smaller galaxies they dissolve them and they become part of the bigger galaxy as it grows it's a startling idea and in the sloan data is the evidence to support it groups of stars that came from cannibalized galaxies stand out in the sloan data as statistically different from other stars because they move at a different velocity each big spike on one of these distribution graphs means professor rockosi has found a group of stars all traveling in a different way to the rest they are the telltale patterns she's looking for the evidence is accumulating that in fact this really is how galaxies grow or an important way in which galaxies grow and so this is an important part of understanding how galaxies form not only ours but every galaxy the more data there is the more discoveries can be made and the technology is getting better all the time the next big survey telescope starts its work in 2015. it will leave sloan in the dust sloan has taken eight years to cover one quarter of the night sky the new telescope will scan the entire sky in even greater resolution every three days [Music] the vast amounts of data we have today allow researchers in all sorts of fields to test their theories on a previously unimaginable scale but more than this it may even change the fundamental way science is done with the power of today's computers applied to all this data the machines might even be able to guide the researchers [Music] we're at a potentially profoundly important and potentially one of the most significant points in science and certainly one of the most exciting where the potential to transform not just how scientists do science but even what science is possible and what will power that transformation of both how science is done and even what science is possible is going to be computation many of the dynamics of the natural world like the interplay between the rainforest and the atmosphere are so complex that we don't as yet really
understand them but now computers are generating literally tens of thousands of different simulations of how these biological systems might work it's like creating thousands of hypothetical parallel worlds each and every one of these simulations is analyzed with statistics to see if any are a good match for what is observed in nature the computers can now automatically generate test and discard hypotheses with scarcely a human in sight this new application of statistics will become absolutely vital for the future of science it's creating a new paradigm if you like in science in the way in which we can do science which is increasingly which one might characterize as data-centric or data-driven rather than being hypothesis-driven or experimentally driven so it's exciting times in terms of the science in terms of the computation and in terms of the statistics [Music] now if all that sounds a bit abstract and theoretical to you how about one final frontier could statistics even make sense of your feelings in california where else one computer scientist is harvesting the internet to try to define the patterns of our innermost thoughts and emotions [Music] well this is the madness movement the madness movement represents a skyscraper view of the world each of these brightly colored dots is an individual feeling expressed by someone out there in a blog or a tweet and when you click on a dot it explodes to reveal the underlying feeling of that person this is what people say they're feeling today better safe crappy well pretty special sorry alone [Music] so every minute we feel fine crawls the world's blogs takes all the sentences that start with the words i feel or i am feeling and puts them in a database we collect all the feelings and we count the most common they are better bad good right guilty sick the same like [ __ ] sorry well and so on and we can take a look at any one feeling and analyze it right now a lot of people are feeling happy we can take a look at all the
people who are happy and break down by age gender or location since bloggers have public profiles we have that information and so we can ask questions like are women happier than men or is england happier than the united states we find that as people get older they get happier and moreover we find that for younger people they associate happiness more with excitement and as people get older they associate happiness more with peacefulness and we also find that women feel loved more often than men but also more guilty while men feel good more often than women but also more alone as people lead more and more of their lives online they leave behind digital traces and with these digital traces we can begin to statistically analyze what it means to be human [Music] [Applause] so where does all of this leave us we generate unimaginable quantities of data about everything you can think of and we analyze it to reveal the patterns and now not only experts but all of us can understand the stories in the numbers [Music] instead of being led astray by prejudice with statistics at our fingertips our eyes can be open for a fact-based view of the world so more than ever before we can become authors of our own destiny and that's pretty exciting isn't it