Vets, pets, data sets and beyond.

From the 10th to the 14th of April 2017, researchers from the UK’s flagship project on companion animal surveillance, the Small Animal Veterinary Surveillance Network (SAVSNET), set up shop at the Edinburgh International Science Festival.

SAVSNET* uses big data to survey animal disease across the UK and ultimately aims to improve animal care through identification of trends in diseases observed by veterinary practitioners.

This work offers huge benefits for companion animals, meaning that interventions can be targeted towards those most at risk and risk factors for disease can be identified across the population.

There is also significant crossover between this work and that of human health data science. Indeed, lessons learned from the processing and analysis of big data from vets may be used to inform aspects of human data analysis, while work on shared and zoonotic diseases, antibacterial use and resistance also offers significant benefit to human health.

So, for this week, we took our science to the public to engage, inspire, raise awareness and stimulate discussion about our work.

SAVSNET mascots Alan, Phil and PJ

The SAVSNET Liverpool team worked hard to develop a wide range of activities designed to bring data science to life and to raise awareness of their work, while Dr Sarah Fox, from HeRC’s PPI team, joined the fun to expand discussions beyond pets and into the realms of human health.

Our stall was designed to take the public on a data journey, a journey which began with our resident mascots Alan, Phil and PJ, who were suffering from a parasitic problem. Hidden in our fluffy friends’ fur were a host of unwanted passengers – ticks (not the real thing but small sticky models we used to represent real ticks). Visitors helped us to remove these pests from our mascots and learned that every time this process is performed by a vet, a medical record is created for that procedure. Indeed, vets across the country are regularly called upon to remove such pests and, assuming the practice is signed up to the SAVSNET system, information on these procedures is transferred to data scientists.

The next stage of our data journey is one health researchers are very familiar with but which may remain a mystery amongst the general public – sorting and analysing these records.

Interactive sticker-wall showing seasonal tick prevalence.

Our stall was equipped with a large touch-screen PC, linked to the SAVSNET database and programmed to pull out and de-identify all vet records which made reference to the word ‘tick’. It was explained that, in order to perform a complete analysis of the prevalence of ticks across the UK, data scientists needed to manually sort through these selected records and confirm the presence or absence of a tick at the time of the recorded consultation. Now visitors to our stall could take part in their own citizen science project as they helped us to sort through these records, uncovering ticks and adding their findings to our maps of regional and seasonal tick prevalence. Dogs came up trumps as the pet most likely to visit their local vet to have ticks removed, while the ticks themselves seemed to pop up indiscriminately all around the UK (even in the centre of London), with a preference for outings during the warmer summer months.

In the final stage of our data journey, visitors had the chance to get hands-on with some data science theory.

A few beautifully coloured ticks alongside our wooden data blocks.

Dr Alan Radford, a reader in infection biology from the University of Liverpool, developed a novel way of exploring sample theory and odds ratios using wooden building blocks.

This activity consisted of hundreds of wooden blocks sporting either cat or dog stickers, a subsection of which also housed a smaller tick sticker (on their rear). Visitors were told that these blocks represented all the information available on cats and dogs in the UK. After conceding that they would not be able to count all of these blocks independently, visitors were encouraged to form groups and choose a smaller sub-sample of ten blocks each. Visitors counted how many of their chosen ten blocks showed cat stickers and how many showed dog stickers. As a rule, most groups of ten contained more dogs than cats – since overall there were more dog blocks in the total population. However, inevitably we also saw variability and some individuals chose more cat blocks than dogs. This tactile and visual example of sample theory allowed a discussion regarding sample bias and how increasing the number or size of samples taken would bring you closer to the correct population value. Finally, visitors were asked to turn their blocks around and count how many of their dogs and cats also had ticks. In our example cats were more likely to house a resident parasite but, with fewer cats to sample from, this was not always immediately obvious. Specifically, assuming a visitor chose 7 dog blocks and 3 cat blocks, then found that 4 of their dogs had ticks while only 2 of their cats did, they might be forgiven for thinking that within our sample dogs were more prone to ticks. However, from this data our older visitors were taught how to calculate an odds ratio, which could show that our cats were actually more likely to house ticks than dogs. It was also noted that similar calculations are often used to calculate risk in medical studies and that it is often these values which are reported in the media.
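For anyone who would like to try the sum at home, here is a minimal sketch of the odds ratio calculation using the example numbers above (7 dogs with 4 carrying ticks, 3 cats with 2 carrying ticks). The function names are made up for illustration and this is not the exact material used on the stall.

```python
def odds(with_ticks, total):
    """Odds of having a tick: animals with ticks divided by animals without."""
    without_ticks = total - with_ticks
    return with_ticks / without_ticks


def odds_ratio(a_with, a_total, b_with, b_total):
    """Odds of ticks in group A relative to the odds of ticks in group B."""
    return odds(a_with, a_total) / odds(b_with, b_total)


# Example sample from the text: 2 of 3 cats and 4 of 7 dogs have ticks.
cat_odds = odds(2, 3)   # 2 with ticks : 1 without
dog_odds = odds(4, 7)   # 4 with ticks : 3 without
cats_vs_dogs = odds_ratio(2, 3, 4, 7)

print(f"Cat odds: {cat_odds:.2f}")
print(f"Dog odds: {dog_odds:.2f}")
print(f"Odds ratio (cats vs dogs): {cats_vs_dogs:.2f}")
```

An odds ratio above 1 means the first group (here, cats) has higher odds of carrying a tick, even though more dogs than cats had ticks in the raw count. With only ten blocks the estimate is very uncertain, which is exactly the point of the sampling discussion.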

The view down our microscope of our preserved pests.

Alongside our data blocks, younger visitors also had the chance to get up close and personal with real life ticks, through both a colouring exercise and by peeking down our microscope at a range of preserved specimens.

Finally, we discussed how tick data and similar veterinary information could be used to improve the health of companion animals and to better understand disease outbreaks across the country. It was at this point we introduced the idea that similar methods could also be applied to human health data in order to streamline and improve our healthcare services. Our discussions centred around the successes already shown in The Farr Institute of Health Informatics’ 100 Ways case studies and HeRC’s work, including improvements in surgical practice and regional health improvements from HeRC’s Born in Bradford study – whilst also engaging in a frank discussion around data privacy and research transparency. Visitors were encouraged to document their views on these uses of big data on our post-it note wall, garnering comments on the questions: “What do you think of big data?” and “Should we use human data?” A majority of visitors chose to comment on our second question, generally expressing positive feelings concerning this topic but with many also noting the need for tight data privacy controls. Comments of note include:

Should we use human data?
Yes, but with controls and limited personal info
We need to get better at persuading people to change behaviour and ask the right questions to collect the right data.
Yes, it’s towards a good cause and can help people.
Using data is a good idea if it helps to make people better.
Yes, as long as there are sufficient controls in place.
Yes, but don’t sell it.
Yes, if you are careful not to breach privacy.

The data detectives.

Overall we had a great time at the festival and hope everyone who visited our stall took away a little bit of our enthusiasm and a bit more knowledge of health data science.

* co-funded by the BBSRC and in collaboration with the British Small Animal Veterinary Association (BSAVA) and the University of Liverpool.

Post by: Sarah Fox

Digital technologies: a new era in healthcare

Our NHS provides world-class care to millions of people every year. But, due to funding cuts and the challenges of providing care to an ageing population with complex health needs, this vital service is unsurprisingly under strain. At the same time, with the mobile internet at our fingertips, we have become accustomed to quick, on-demand services. Whether it’s browsing the internet, staying connected on social media or using mobile banking, our smartphones play important roles in nearly every aspect of our lives. It is therefore not surprising to find that over 70% of the UK population are now going online for their healthcare information.

This raises a question: could digital health (in particular mobile health apps) play a role in bolstering our faltering health service?

Unfortunately, to date, healthcare has been lagging behind other services in the digital revolution. When most other sectors jumped onto the digital train, healthcare remained reluctant. Nevertheless, the potential for mobile technology to track, manage and improve patient health is being increasingly recognised.

ClinTouch, for instance, is a mobile health intervention co-created by a team of Manchester-based health researchers at The Farr Institute of Health Informatics’ Health eResearch Centre. ClinTouch is a psychiatric-symptom assessment tool developed to aid management of psychosis (a condition affecting 1 in 100 people). The app was co-designed by health professionals and patients, ensuring that the final output reflected the needs of both patients and clinicians. It combines a patient-focussed front end, which allows users to record and monitor their symptoms, with a feed of this information back to clinicians to provide an early warning of possible relapse. The project has the potential to empower patients and improve relationships between the user and their physician. Moreover, if ClinTouch can prevent 5% of relapse cases, it will save the average NHS trust £250,000 to £500,000 per year (equating to a possible saving of £119 million to the NHS over three years!).

Adopting disruptive technologies such as ClinTouch can have meaningful benefits for patients and the NHS. And there are signs that the healthcare sector is warming up to the idea. Earlier this year the National Institute for Health and Care Excellence (NICE) announced that they are planning to apply health technology assessments to mobile health apps and only this week, the NHS announced a £35 million investment in digital health services.

On Thursday 27th April, the North West Biotech Initiative will be hosting an interactive panel discussion on the future of digital health. We will be joined by a fantastic line-up of speakers providing a range of perspectives on the topic, including:

Professor Shôn Lewis: Principal Investigator of the ClinTouch project and professor of Adult Psychiatry at The University of Manchester who will be speaking about the development of and the potential impact of the ClinTouch app. 

Tom Higham: former Executive Director at FutureEverything and a freelance digital art curator and producer, interested in the enabling power of digital technology. Tom is also diagnosed with type 1 diabetes, has worked with diabetes charity JDRF UK and has written about the benefits of and the need for improvements in mobile apps for diabetes care.

Anna Beukenhorst: a PhD candidate currently working on the Cloudy with a Chance of Pain project, a nationwide smartphone study investigating the association between the weather and chronic pain in more than 13,000 participants.

Reina Yaidoo: founder of Bassajamba, a social enterprise whose main aim is to increase participation of underrepresented groups in science and technology. Bassajamba are currently working with several diabetes support groups to develop self-management apps, which incorporate an aspect of gamification.

Professor Tjeerd Van Staa: professor of eHealth Research in the Division of Informatics, Imaging and Data Sciences, at The University of Manchester. He is currently leading the CityVerve case on the use of Internet of Things technology to help people manage their Chronic Obstructive Pulmonary Disease (COPD).

Dr Mariam Bibi: Senior Director of Real World Evidence at Consulting for McCann Health, External advisor for Quality and Productivity at NICE and an Associate Lecturer at Manchester Metropolitan University. She will be talking about the regulatory aspect of bringing digital technology to healthcare.

The event is open to anyone with an interest in digital health, including the general public, students and academics. It is free to attend and will be a great opportunity to understand the potential role of digital technology in healthcare and to network with local business leaders, academics and students working at the forefront of digital healthcare.

Date: 27th April 2017
Venue: Moseley Theatre, Schuster Building, The University of Manchester
Time: 3.30pm – 6.00pm
Register to attend! http://bit.ly/2o4fzd7
Questions about the event? Please get in touch with us at: [email protected]

Guest post by: Fatima Chunara

Neural coding 2: Measuring information within the brain

In my previous neuroscience post, I talked about the spike-triggered averaging method scientists use to find what small part of a stimulus a neuron is capturing. This tells us what a neuron is interested in, such as the peak or trough of a sound wave, but it tells us nothing about how much information a neuron is transmitting about a stimulus to other neurons. We know from my last neuroscience post that a neuron will change its spike firing when it senses a stimulus it is tuned to. Unfortunately, neurons are not perfect and they make mistakes, sometimes staying quiet when they should fire or firing when they should be quiet. Therefore, when the neuron fires, listening neurons cannot be fully sure a stimulus has actually occurred. These mistakes lead to a loss of information as signals get sent from neuron to neuron, like Chinese whispers.

Figure 1: Chinese whispers is an example of information loss during communication. Source: jasonthomas92.blogspot.com

It is very important for neuroscientists to ascertain information flow within the brain because this underlies all other computational processes that happen. After all, to process information within the brain you must first transmit it correctly! To understand and quantify information flow, neuroscientists use a branch of mathematics known as Information Theory. Information theory centers around the idea of a sender and a receiver. In the brain, both the sender and receiver are normally neurons. The sender neuron encodes a message about a stimulus in a sequence of spikes. The receiving neuron/neurons try to decode this spike sequence and ascertain what the stimulus was. Before the receiving neuron gets the signal, it has little idea what the stimulus was. We say this neuron has a high uncertainty about the stimulus. By receiving a signal from the sending neuron, this uncertainty is reduced; the extent of this reduction in uncertainty depends on the amount of information carried in the signal. Just in case that is not clear, let’s use an analogy…so imagine you are a lost hiker with a map.

Figure 2: A map from one of my favourite games. Knowing where you are requires information. Source: pac-miam.deviantart.com

You have a map with 8 equally sized sectors and all you know is that you could be in any of them. You then receive a phone call telling you that you are definitely within 2 sectors on the map. This phone call actually contains a measurable amount of information. If we assume the probability of being in any part of the map prior to receiving the phone call is equal then you have a 1/8 chance of being in each part of the map. We need to calculate a measure of uncertainty and for this we use something called Shannon entropy. This measurement is related to the number of different possible areas there are in the map, so a map with 2000 different areas will have greater entropy than a map with 10 sectors. When every sector is equally likely, the entropy in bits is simply the base-2 logarithm of the number of sectors, so in our example we have an entropy of log2(8) = 3 bits. After receiving the message, the entropy drops to log2(2) = 1 bit because there are now only two map sectors you could be in. So the telephone call caused our uncertainty about our location to drop from 3 bits to 1 bit of entropy. The information within the phone call is equal to this drop in uncertainty, which is 3 – 1 = 2 bits of information. Notice how we didn’t need to know anything about the map itself or the exact words in the telephone call, only what the probabilities of your location were before and after the call.
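If you would like to check the hiker example yourself, here is a small sketch in Python. The function name `entropy_bits` is my own choice for illustration; the formula is the standard Shannon entropy, which sums -p × log2(p) over every possible location.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits of a list of probabilities that sum to 1."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Before the call: 8 equally likely sectors.
before = entropy_bits([1 / 8] * 8)

# After the call: only 2 equally likely sectors remain.
after = entropy_bits([1 / 2] * 2)

# Information in the phone call is the drop in uncertainty.
information = before - after

print(f"Entropy before: {before:.1f} bits")
print(f"Entropy after: {after:.1f} bits")
print(f"Information in call: {information:.1f} bits")
```

The same function also works when the sectors are not equally likely, in which case the entropy comes out lower than the simple log2 of the number of sectors.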

In neurons, we can calculate information without knowing the details of the stimulus a neuron is responding to. The trick is to stimulate a neuron in the same way over many repeated trials using a highly varying, white-noise stimulus (see the bottom trace in Figure 3).

Figure 3: Diagram showing a neuron’s response to 5 repeats of identical neuron input (bottom). The responses are shown as voltage traces (labelled neuron response). The spike times can be represented as points in time in a ‘raster plot’ (top).

So how does information theory apply to this? Well, recall how Shannon entropy is linked with the number of possible areas contained within a map. In a neuron’s response, entropy is related to the number of different spike sequences a neuron can produce. A neuron producing many different spike sequences has a greater entropy.

In the raster plots below (Figure 4) are the responses of three simulated neurons using computer models that closely approximate real neuron behaviour. They are responding to a noisy stimulus (not shown) similar to the one shown at the bottom of Figure 3. Each dot is a spike fired at a certain time on a particular trial.

Figure 4: Raster plots show three neuron responses transmitting different amounts of information. The first (top) transmits about 9 bits per second of response, the second (middle) transmits 2 bits/s and the third (bottom) transmits 0.7 bits/s.

In all responses, the neuron is generating different spike sequences: some spikes are packed close together in time, while at other times these spikes are spaced apart. This variation gives rise to entropy.

In the response of the first neuron (top) the spike sequences change in time but do not change at all across trials. This is an unrealistically perfect neuron. All the variable spike sequences follow the stimulus with 100% accuracy. When the stimulus repeats in the next trial the neuron simply fires the same spikes as before, producing vertical lines in the raster plot. Therefore, all of the entropy in the neuron’s response is caused by the stimulus and so all of it carries information. This neuron is highly informative; despite firing relatively few spikes it transmits about 9 bits/second…pretty good for a single neuron.

The second neuron (Figure 4, middle) also shows varying spike sequences across time, but now these sequences vary slightly across trials. We can think of this response as having two types of entropy, a total entropy which measures the total amount of variation a neuron can produce in its response, and a noise entropy. This second entropy is caused by the neuron changing its response to unrelated influences, such as other neuron inputs, electrical/chemical interferences and random fluctuations in signaling mechanisms within the neuron. The noise entropy causes the variability across trials in the raster plot and reduces the information transmitted by the neuron. To be more precise, the information carried in this neuron’s response is whatever remains from the total entropy when the noise entropy is subtracted from it…about 2 bits/s in this case.
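To make the subtraction concrete, here is a toy sketch in Python. It is not the analysis behind Figure 4: the spike “words” (short strings of 0s and 1s, one per time bin), the function names and the tiny made-up responses are all assumptions for illustration. Total entropy is estimated from all words pooled together, noise entropy from how much the word in each time bin varies across trials, and the result is in bits per word rather than bits per second.

```python
import math
from collections import Counter


def entropy_bits(items):
    """Shannon entropy in bits of the observed frequencies of the items."""
    counts = Counter(items)
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_bits(trials):
    """Total entropy minus noise entropy for repeated trials.

    trials: list of trials, each a list of spike words (one per time bin).
    """
    # Total entropy: variability of words across all time bins and trials.
    all_words = [word for trial in trials for word in trial]
    total = entropy_bits(all_words)

    # Noise entropy: variability across trials at the same time bin,
    # averaged over time bins.
    n_bins = len(trials[0])
    noise = sum(
        entropy_bits([trial[t] for trial in trials]) for t in range(n_bins)
    ) / n_bins

    return total - noise


# A "perfect" toy neuron: identical response on every trial.
perfect = [["00", "01", "10", "11"] for _ in range(4)]

# A toy neuron whose words are unrelated to the time bin.
noisy = [
    ["00", "01", "10", "11"],
    ["01", "10", "11", "00"],
    ["10", "11", "00", "01"],
    ["11", "00", "01", "10"],
]

print(f"Perfect neuron: {information_bits(perfect):.1f} bits per word")
print(f"Noisy neuron: {information_bits(noisy):.1f} bits per word")
```

Both toy neurons produce the same variety of words, so they share the same total entropy, but only the first one uses that variety to follow the stimulus. Real recordings need far more trials than this, because entropy estimates from small samples are biased.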

In the final response (bottom), the spikes from the neuron only weakly follow the stimulus and are highly variable across trials. Interestingly, it shows the most complex spike sequences of all three examples. It therefore has a very large total entropy, which means it has the capacity to transmit a great deal of information. Unfortunately, much of this entropy is wasted because the neuron spends most of its time varying its spike patterns randomly instead of with the stimulus. This makes its noise entropy very high and the useful information low: it transmits a measly 0.7 bits/s.

So, what should you take away from this post? Firstly, that neuroscientists can accurately measure the amount of information a neuron can transmit. Second, that neurons are not perfect and cannot respond the same way even to repeated identical stimuli. This leads to the final point that this noise within neurons limits the amount of information they can communicate to each other.

Of course, I have only shown a simple view of things in this post. In reality, neurons work together to communicate information and overcome the noise they contain. Perhaps in the future, I will elaborate on this further…

Post by: Dan Elijah.

To share or not to share: delving into health data research.

In January this year I made a bold move, well at least bold for someone who is often accused of being painfully risk averse. I waved a fond farewell to life in the lab to take on a new role where I have been able to combine my training as a researcher with my passion for science engagement. In this role I work closely with health researchers and the public, building the scaffolding needed for the two to work together and co-produce research which may improve healthcare for millions of patients across the UK. The group I work alongside are collectively known as the Health eResearch Centre (part of the world-leading Farr Institute of Health Informatics) and are proud of their mission of using de-identified electronic patient data* to improve public health.

For me, taking on this role has felt particularly poignant and has led me to think deeply about the implications and risks of sharing such personal information. This is because, like many of you, my health records contain details which I’m scared to share with a wider audience. So, with this in mind, I want to invite you inside my head to explore the reasons why I believe that, despite my concerns, sharing such data with researchers is crucial for the future of public health and the NHS.

It’s no secret that any information stored in a digital form is at risk from security breaches, theft or damage and that this risk increases when information is shared. But, it’s also important to recognise that these risks can be significantly reduced if the correct structures are put in place to protect this information. Not only this but, when weighing up these risks, I also think that it is immensely important to know the benefits sharing data can provide.

With this in mind, I was really impressed that, within the first few weeks of starting this role, I was expected to complete some very thorough data security training (which, considering I won’t actually be working directly with patient data, almost seemed like overkill). I was also introduced to the catchily titled ISO 27001 which, if my understanding is correct, certifies that an organisation is running a ‘gold standard’ framework of policies and procedures for data protection – this being something we as a group hope to obtain before the year is out. This all left me with the distinct feeling that security is a major concern for our group and that it is considered to be of paramount importance to our work. I also learned about data governance within the NHS and how each NHS organisation has an assigned data guardian who is tasked with protecting the confidentiality of patient and service-user information. So, I’m quite sure information security is taken exceedingly seriously at every step of the data sharing chain.

But what will the public gain from sharing their health data?

We all know that, in this cyber age, most of us have quite an extensive digital-data footprint. It’s no accident that my Facebook feed is peppered with pictures of sad dogs encouraging me to donate money to animal charities while Google proudly presents me with adverts for ‘Geek gear’ and fantasy inspired jewellery. I don’t make too much effort to ensure that my internet searches are private, so marketers probably see me as easy prey. This type of data mining happens all the time, with little benefit to you or me and, although we may install ad-blocking software, few of us make a considered effort to stop this from happening. Health data, on the other hand, is not only shared in a measured and secure manner but could offer enormous benefits to the UK’s health service and to us as individual patients.

Our NHS is being placed under increasing financial strain, with the added pressure of providing care to a growing, ageing population with complex health needs, meaning that it has never been more important to find innovative ways of streamlining and improving our care system. This is where health data researchers can offer a helping hand. Work using patient data can identify ‘at risk’ populations, allowing health workers to target interventions at these groups before they develop health problems. New drugs and surgical procedures can also be monitored to ensure better outcomes and fewer complications.

And this is already happening across the UK – the Farr Institute are currently putting together a list of 100 projects which have already improved patient health – you can find these here. Also, in 2014 the #datasaveslives campaign was launched. This highlights the positive impact health-data research is having in the UK by building a digital library of this work – type #datasaveslives into Google and explore this library or join the conversation on Twitter.

One example is work on a procedure to unblock arteries and improve outcomes for patients suffering from coronary heart disease:

In the UK this procedure is carried out in one of two ways: Stents (a special type of scaffolding used to open up arteries and improve blood flow) can be inserted either through a patient’s leg (the transfemoral route) or via the wrist (the transradial route). Insertion through the wrist is a more modern technique which is believed to be safer and less invasive – however both methods are routinely performed across the UK.
Farr Institute researchers working between The University of Manchester’s Health eResearch Centre and Keele University used de-identified health records (with all personal information removed) to analyse the outcomes of 448,853 surgical stent insertion procedures across the UK between 2005 and 2012.

This study allowed researchers to calculate, for the first time, the true benefits of the transradial method. They showed that the use of transradial surgery increased from 14% in 2005 to 58% in 2012 – a change which is thought to have saved an estimated 450 lives. They also discovered that the South East of England had the lowest uptake of surgery via the wrist.

This work shows one example of how research use of existing health records can highlight ways of improving patient care across the country – thanks to this research the transradial route is now the dominant surgical practice adopted across the UK (leading to an estimated 30% reduction in the risk of mortality in high risk patients undergoing this procedure).

Reading through all these studies and imagining the potential for future research does convince me that, even with my concerns, the benefits of sharing my data far outweigh the risks. But, I also recognise that it is of paramount importance for patients and the public to be aware of how this process works and to play an active role in shaping research. It seems that when the public have the opportunity to question health data scientists and are fully informed about policy and privacy many feel comfortable with sharing their data. This suggests that we need to strive towards transparency and to keep an active dialogue with the public to ensure we are really addressing their needs and concerns.

This is an amazingly complex and interesting field of study, combining policy, academic research, public priority setting and oodles of engagement and involvement – so I hope over the next year to be publishing more posts covering aspects of this work in more detail.

Post by: Sarah Fox

*The kind of data which is routinely collected during doctor and hospital appointments but with all personal identifiable information removed.

 
