Data Careers: Analyst vs Scientist vs Engineer

Every time you send a text message, type a tweet, post a Facebook photo, click a link, or buy something online, you’re generating data. And considering there are more than 3 billion Internet users in the world (a quantity that’s tripled in the last 9 years) and 1.75 billion cell phone users, that’s a heck of a lot of data.
All that data crunching requires an army of data masters. Translation: there’s never been a better time to pursue a career in data. The McKinsey Global Institute predicted that by 2018 the U.S. could face a shortage of 1.5 million people who know how to leverage data analysis to make effective decisions. Enter: you.
The first step on your path to professional data whiz? Taking stock of your three main career options: data analyst, data scientist, and data engineer.
Data/Business Analyst
Data Analysts perform a variety of tasks around collecting, organizing, and interpreting statistical information. They’re mainly responsible for using data to identify efficiencies, problem areas, and possible improvements.
Think of it as data science lite. While Data Analysts may not have the mathematical chops to invent new algorithms, they have a strong understanding of how to use existing tools to solve problems. They need a baseline understanding of five core competencies: programming, statistics, machine learning, data munging, and data visualization.
These are often the people making charts and reports for management, as well as the ones conducting primary research (like surveys). That part of the job makes communication skills essential: they need to take complex ideas and present them in a way that non-technical people can understand.
The line between Business Analysts and Data Analysts has become so blurred that they’re essentially the same thing. Both use their reports and analyses to help management make decisions and set goals.
While they possess some technical skills, your traditional Data Analyst is far less technical than the average Data Scientist. Instead of R and Python, they deal in Microsoft Excel, Microsoft Access, SharePoint, and SQL databases.
With the simpler skill set comes a lower pay scale: the average Data Analyst earns around $54,000/year. Data Analysts come from a diverse set of backgrounds, including technology, information management, relational database design and development, business intelligence, data mining, and statistics.
Data Scientist
The actual role of the Data Scientist is one of the most debated, probably because it varies considerably from company to company.
In all data-related jobs there’s a certain amount of skills overlap. The best way to differentiate the roles is to think of each one’s skills as a T: a generalist across a variety of areas, with deep experience in one particular area. For a Data Scientist, that deep experience is probably in statistics and machine learning.
Statistical and machine learning knowledge is the domain expertise required to acquire data from different sources, create a model, optimize its accuracy, validate its purpose and confirm its significance. At minimum, Data Scientists need to know how to take some data, munge it, clean it, filter it, mine it, visualize it and then validate it.
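To make that munge-clean-filter-mine sequence a little more concrete, here’s a minimal sketch using only Python’s standard library. The records, field names, and the 90.0 threshold are all made up for illustration, and a real project would more likely lean on a dedicated data library. Treat it as one possible shape for the workflow, not a standard recipe; it covers the cleaning, filtering, and summarizing steps only.

```python
import statistics

# Hypothetical raw records, as they might arrive from a messy source.
raw_rows = [
    {"user": "ana", "spend": "120.50"},
    {"user": "ben", "spend": ""},       # missing value
    {"user": "cy", "spend": "80"},
    {"user": "dee", "spend": "n/a"},    # unparseable value
    {"user": "eli", "spend": "99.50"},
]


def clean(rows):
    """Munge and clean: parse spend as a float, dropping rows that cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"user": row["user"], "spend": float(row["spend"])})
        except ValueError:
            continue  # skip rows with missing or malformed spend
    return cleaned


def summarize(rows, minimum=0.0):
    """Filter and mine: keep rows at or above `minimum`, then compute summary stats."""
    values = [r["spend"] for r in rows if r["spend"] >= minimum]
    return {"count": len(values), "mean": statistics.mean(values), "max": max(values)}


cleaned_rows = clean(raw_rows)
summary = summarize(cleaned_rows, minimum=90.0)
print(summary)  # {'count': 2, 'mean': 110.0, 'max': 120.5}
```

Two of the five rows are dropped during cleaning, and one more falls below the threshold, which is why the summary only covers two values. Deciding what to do with rows like those (drop, fix, or flag them) is a big part of the job.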
In addition to all that statistical modeling, Data Scientists also need to know how to explain their findings to business decision makers, understand the business and product model, be good at problem-solving, and know some basic engineering. The most popular Data Scientist languages are R and Python, but they may also know Scala, Java, or Clojure.
So, to become a data scientist you need a solid foundation in computer science, modeling, statistics, analytics, and math.
Their role varies from sector to sector, but in general, they sift through all the incoming data streams (both internal and external) with the goal of discovering new insights and solving business problems. Then they communicate their findings and recommendations to the organization’s leadership.
There are thousands of tools a Data Scientist might use to do the job, from import.io (for data collection) to Tableau (for data visualization) to RJMetrics (for data analysis).
The technical nature of the job (and the shortage of good candidates) means that Data Scientists earn good money. According to Glassdoor, Data Science is currently the 15th highest paid job in America, averaging $91,000/year nationally and $110,000/year in Silicon Valley.
Data Engineer
On the other side of the technical spectrum, you’ll find the Data Engineer.
Typically software engineers by trade, Data Engineers are the designers, builders, and managers of the data infrastructure. They are responsible for compiling and installing database systems, writing complex queries, scaling to multiple machines, and putting disaster recovery systems into place. They also make sure those systems are performing smoothly.
The core job of the Data Engineer is to make sure data is flowing smoothly from source to destination so it can be processed and analyzed. To do that they need to know complex Hadoop-based technologies (MapReduce, Hive, Pig, Spark), SQL technologies (PostgreSQL and MySQL), NoSQL technologies (Cassandra and MongoDB) and data warehousing solutions. In addition, they should also be familiar with a variety of coding languages such as Python, C/C++, Java, Scala, R and more.
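Here’s a rough sketch of what “flowing from source to destination” can look like at toy scale: a small extract-transform-load script in Python. It uses the standard library’s in-memory SQLite database as a stand-in for PostgreSQL or MySQL. The CSV contents, the orders table, and the function names are all hypothetical, and a production pipeline would add scheduling, error handling, and monitoring on top.

```python
import csv
import io
import sqlite3

# Hypothetical source: CSV text standing in for an export from an upstream system.
SOURCE_CSV = """order_id,region,amount
1,west,20.00
2,east,35.50
3,west,14.50
"""


def extract(text):
    """Read CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Convert string fields into typed tuples ready for insertion."""
    return [(int(r["order_id"]), r["region"], float(r["amount"])) for r in rows]


def load(conn, records):
    """Create the destination table if needed and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, discarded on exit
load(conn, transform(extract(SOURCE_CSV)))

# Once the data has landed, analysts and scientists can query it.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('east', 35.5), ('west', 34.5)]
```

Keeping extract, transform, and load as separate functions is just one common way to organize this kind of script; it makes each stage easier to test and swap out when the source or destination changes.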
Data Engineers may work largely behind the scenes, but they are an essential part of the data ecosystem in your business. As such, they get paid quite well: an average of $91,000/year.
The Bottom Line
Collecting, storing, analyzing and presenting data takes a team of people. No one data job is any more important than any other. Each role has a unique and important part to play in making sure management has all the information they need to make decisions.