#TechJobs - Data whisperers: learning more about data scientists

Ahhh, big data. This concept has revolutionised the business world and now shapes the strategies of many companies, which must follow the trend as best they can. But big data has also created a whole new range of experts: a series of new jobs has emerged, including data analysts, data scientists, and data engineers.

Though the Harvard Business Review called data scientist “the Sexiest Job of the 21st Century” (and data analysts are not the only ones mentioned, Antoine), the job still remains mysterious. So what does being a data scientist mean today? And what can data scientists bring to a business?

First of all, what purpose does data science serve? Today, collecting and analysing data can be applied to various domains, which creates multiple challenges for professionals across industries. Online merchants might wish to create personalised discount coupons or product recommendations, while energy providers, insurers, and telecom firms might want to identify clients likely to terminate their contracts (or “churn”). Data science can help, though practitioners must keep a critical view of the limitations of the models that can be built. In practice, data exploitation raises ethical questions from the general public and professionals alike, particularly on subjects like automation or even artificial intelligence. In the United States, for example, a pair of Stanford researchers declared that they had managed to determine whether or not a person is homosexual using an algorithm that analyses facial images with “deep neural networks”. This example shows that certain data, especially personal data, can be used for any purpose, for better or for worse.
When algorithms are used to help with decision-making, we must thus be careful about possible interpretations and uses.

Where does the data scientist come in? Data scientists can be seen as “data whisperers”. Once the raw data has been gathered, it must be processed and analysed to see what it “says” in order to guide decision-making. A data scientist therefore does more than just apply a statistical model: he or she must learn from the data. The skills of a data scientist are a savvy mix of maths (statistics or even machine learning) and computer science, together with a good understanding of the business and some notions of marketing.

I’ve been a data scientist at fifty-five for over two years now, and have worked on a wide range of issues. Each new project has its own specificities, but the methodology remains pretty similar. First of all, we have to understand what our client needs, defining the stakes and the limitations of the subject with business teams and marketers. Next comes the feasibility study for the project, with a team of engineers in charge of collecting the data and verifying that it is reliable. If data isn’t collected properly, or if the volume is insufficient, it will be impossible to draw any conclusions. Incidentally, this is where some companies are not yet mature enough. Our job is to show them that data quality is paramount to getting a data project off the ground.

Once the project has been approved, a descriptive analysis phase allows us to identify outliers and to get an initial idea of the major trends. For example, if a user has browsed 1000 pages of an online store in one day, even though the average is 10 pages per day per user, it could indicate that this user was a robot, and the related data can be excluded from our dataset. It is essential that we are in close contact with the company during this step, as they have the best knowledge of their own industry. Our knowledge of banking, insurance, or retail cannot come close to the expertise of our clients. It is by working together that we can identify the best paths to explore.
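To make the robot example concrete, here is a minimal sketch of how such a filter might look. The function name, the user identifiers, and the cut-off ratio of 20 are all assumptions chosen for illustration, and the sketch compares each user to the median rather than the average, since a single 1000-page user would pull the average up.

```python
from statistics import median

def split_suspected_bots(pages_per_user, ratio=20.0):
    """Separate users whose daily page count is far above the typical value.

    `ratio` is a hypothetical cut-off: a user is flagged when their count
    exceeds `ratio` times the median count across all users.
    """
    typical = median(pages_per_user.values())
    threshold = ratio * typical
    kept = {user: n for user, n in pages_per_user.items() if n <= threshold}
    flagged = {user: n for user, n in pages_per_user.items() if n > threshold}
    return kept, flagged

# Made-up daily page counts: most users browse about 10 pages, one browses 1000.
daily_pages = {"u1": 8, "u2": 12, "u3": 10, "u4": 9, "u5": 1000, "u6": 11}

kept, flagged = split_suspected_bots(daily_pages)
print("Flagged for review:", sorted(flagged))
print("Kept in the dataset:", sorted(kept))
```

A flagged user is only a candidate for exclusion. Whether the traffic really comes from a robot is exactly the kind of question to check with the client.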

Lastly, once the data has been analysed and cleaned up, the modelling can begin. Modelling consists of extracting decision rules from our database and automating them, using machine learning algorithms.
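As a toy illustration of what “extracting a rule” can mean, the sketch below learns a single threshold from past observations and then applies it automatically to new cases. It is a deliberately simple one-variable rule (sometimes called a decision stump), not the method used on real projects, and the data and names are invented for the example.

```python
def learn_threshold_rule(values, labels):
    """Find the threshold t for which the rule 'value >= t' makes the fewest mistakes."""
    best_threshold, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        # Count observations where the rule's prediction disagrees with the label.
        errors = sum((v >= candidate) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold, best_errors

# Hypothetical history: pages viewed per visit, and whether the visit ended in a purchase.
pages_viewed = [1, 2, 3, 4, 6, 7, 8, 9]
purchased = [False, False, False, True, False, True, True, True]

threshold, errors = learn_threshold_rule(pages_viewed, purchased)

def will_purchase(pages):
    """Apply the learned rule to a new visit."""
    return pages >= threshold

print(f"Learned rule: predict a purchase when pages viewed >= {threshold} "
      f"({errors} mistake(s) on the history)")
```

Real models combine many variables and are evaluated on data they have not seen, but the principle is the same: the rule comes from the data rather than being written by hand.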

Let’s look at a real example to clarify. An advertiser can come to fifty-five if it wants to qualify its audience, that is to say, if it wants to enrich the quantity and quality of the information it has about this audience. For example, the advertiser may want to learn the engagement level of each user, or find socio-demographic patterns. To do this, the data scientist might use clustering, which creates homogeneous user groups based on defined variables, pre-selected with the advertiser. Once these groups have been created, we can define targeted activation strategies. Using browsing data, an online merchant could for instance group its users into “families” based on their profiles and their interest in a certain product type, in order to offer a customised user experience.
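The grouping described above can be sketched with k-means, one common clustering algorithm. This is a bare-bones version written for illustration, not necessarily the technique used on a given project. The two variables (visits per week and the share of page views spent on one product category) and the user values are assumptions made up for the example.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters by alternating assignment and centroid updates."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observed points
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid (squared distance).
        assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of the points attached to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical users: (visits per week, share of page views in one product category).
users = [(1, 0.1), (2, 0.2), (1, 0.15), (9, 0.8), (10, 0.9), (8, 0.85)]

assignments, centroids = kmeans(users, k=2)
for user, cluster in zip(users, assignments):
    print(user, "-> family", cluster)
```

On this small dataset the occasional visitors and the frequent, category-focused visitors end up in two separate “families”. In practice the variables would usually be put on a comparable scale first, so that one of them does not dominate the distance.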

Each client request has more than one solution! Analysing results helps data scientists and their teams to establish different areas to exploit or not, depending on client objectives and strategy, which makes each project unique. So, do you think being a data scientist is sexy? You’re the judge!

Translated from French by Niamh Cloughley.

Want to learn more? Get in touch!

13-03-2018
