Tell us about yourself...
I have been a Data Scientist at Intechnica since July 2017. Previously I was an EPSRC Doctoral Prize Fellow in the Computer Science department at the University of Manchester, researching next-generation non-volatile memory technology. I wrote my PhD thesis on mathematically modelling thermal fluctuations in magnetic storage technology, also at the University of Manchester, graduating in 2015.
What initially drew you to a career in IT?
I studied physics as an undergraduate, and physics and IT have a big overlap. When I went into research I decided I wanted to be researching future computational technology, although I was initially focused on devices and physical hardware. I spent the first year of my PhD building high-tech microscopy equipment and writing the control and analysis software. I ended up building a working model atomic force microscope out of Lego for university outreach demonstrations. I was more drawn to the programming and mathematical modelling side of things, though, and that became the most successful area of my research. I eventually decided that it wasn’t particularly the physics that interested me, but rather using the analytical skills I had developed, solving problems and working in cutting-edge research and technology development, all of which were more than applicable to the IT sector.
What appealed to you about specialising in Data Science?
A lot of my work in academia involved data analysis and it was something I found both challenging and enjoyable. Towards the end of my PhD I became much more involved in data analysis and writing mathematical models and I discovered that I really enjoyed it. It was a bonus when I discovered that those skills were also very much in demand. I came to Data Science as it was a career that offered me opportunities to use and improve the skills I already had and develop new ones in the same area, while working on things that interested me.
What do you like about being a Data Scientist?
I think analysing and seeing patterns in data is the best part of what I do. It’s a challenge from start to end, but when I’ve got everything together and anyone can look at the visualisation and understand exactly what’s going on without any further explanation from me, it’s really satisfying and I feel like I’ve achieved something. I have presented a lot of data to a variety of audiences, and there is a real talent in making sure what you’re showing is understood quickly by everyone in the room. I’m always trying to get better at it too.
Your current role sees you as a Data Scientist at Intechnica in Manchester. Can you tell us more about Intechnica and what they do?
Intechnica are world-leading digital performance and scalability experts, originally best known for providing digital strategy and infrastructure consultancy for eCommerce applications worldwide. However, around 18 months ago they released a web traffic insight and management solution based on advanced machine learning, which is where I fit in as a Data Scientist working on our machine learning.
The product is called Traffic Defender. It ensures websites can serve an unlimited number of customers and blocks out all the nasty traffic on the web. You can see it as having two parts. The first protects against peaks in traffic consuming more resources than are available.
It ensures uptime and prevents sites crashing, and provides a queuing system to let all visitors access the site in a fair manner.
The second is the really clever bit, where the data science and machine learning come in. Businesses are plagued by automated non-human traffic (known as bots). Some of this is good (like Googlebot); however, a lot of bots are doing all sorts of nasty things on their sites, such as stealing content, price scraping, launching attacks, creating fake accounts to get sign-up offers, or trying to brute-force customer accounts. In the middle, there’s a whole set of bots that, depending on your business, you may or may not want on your site. Some businesses want them in order to promote their products with third-party sites; other businesses will regard it as price scraping and content theft.
We’ve built a system that enables businesses to manage their web traffic, identifying what is not human traffic and challenging it. The machine learning enables us to understand the intent of web traffic rather than just looking at access requests. From here we can provide insight into the nature of the traffic, categorise it and give customers the choice of whether to manage and optimise it or block it.
What’s a typical day like for you at Intechnica?
I work on both real-time data analytics and retrospective insights into historic data. I don’t think I really have typical days, and everything moves very quickly. One day I might be putting together an in-depth analysis of web traffic and the behaviour of non-human traffic, pulling out highlights of good and bad bot behaviour, and the next I’ll be improving our real-time behavioural analysis, updating the model to find new behaviours and stop a new type of security threat.
How would you describe the tech scene in Manchester?
There is a huge and very lively tech scene in Manchester. Coming from the University of Manchester, I was continually exposed to a lot of the start-ups that sprang out of various research topics, and a lot of my friends went on to work in tech start-ups or, in a few cases, to start their own successful ones. There’s a huge amount going on in the tech scene generally, though, with a lot of meetups, hackathons and talks.
What top three tips would you give to anyone wanting to pursue a career in data science?
There are a few good online courses for data science (the Udacity Machine Learning Nanodegree is good), and I would absolutely recommend doing a course on Machine Learning or Artificial Intelligence to anyone interested in developing their data science skills, as both areas are becoming more important to data analysis. I would also encourage anyone looking to pursue data science to make sure they are up to date on the latest technologies being used and to be constantly on the lookout for opportunities to learn more. Finally, I would say that data science is a pretty broad area and it’s not all programming. Being comfortable with maths and statistics is a must, and knowing what algorithms are doing and how they work will save a lot of time and effort.
What’s your favourite piece of technology at the moment?
Obviously I’m really excited about the advances being made in artificial intelligence and neural networks, and that’s an area I’m personally interested in learning more about. I’ve been dying to find an excuse to play around with Google’s TensorFlow for a project. I also like to keep up to date on advances in biologically inspired computing, and I’ve read a few exciting research papers recently about using nanoscopic magnetic devices to store data by copying the way neurons behave in the brain.