What Junior ML Engineers Actually Need To Know To Get Hired? - KDnuggets

Republished By Plato

As a seasoned ML developer who has hired many junior engineers across different projects, I have come to realize that there are certain skills essential for a junior developer to be considered for a job in the field. These skills vary depending on the project and the company, but there are some fundamental skills that are universally required.

In this article, we will discuss the key skills that junior ML developers should have in order to be successful in their job search. By the end of this article, you will have a better understanding of what skills are necessary for junior ML developers to land their first job.

What skills do most junior developers who apply for a job have?

Junior developers looking to land their first job often come from other fields, having completed some ML courses. They have learned basic ML but do NOT have a deep background in engineering, computer science, or mathematics. While a math degree is not required to become a programmer, in ML it is highly recommended. Machine learning and data science are fields that require experimenting with and fine-tuning existing algorithms, or even creating your own, and that is hard to do without some knowledge of math.

College students with a good degree are at an advantage here. However, while they might have deeper technical knowledge than an average junior without a specialized education, they often lack the practical skills and experience that are vital for a job. College education is designed to give students fundamental knowledge, and it often pays little attention to marketable skills.

Most applicants for junior ML engineer positions have no problems with SQL, vector embeddings, and some basic time series analysis algorithms. They have also used basic Python libraries such as scikit-learn and applied basic problem-solving and algorithms (clustering, regression, random forests). But it’s not enough.

What skills do popular courses not provide?

As you now understand, most educational programs are unable to give hands-on experience and a deeper understanding of the subject matter. If you are determined to build a career in the field of ML, there are things you will need to learn on your own to make yourself more marketable. Because if you aren’t willing to learn, and I say that with care, don’t bother: the days when anybody could land a career in IT are gone. Today it’s a pretty competitive market.

One topic that popular courses often do not cover in enough depth is random forests: how to prune trees or limit their depth, and how to select the number of trees and the number of features considered at each split. While courses may cover the basics of how random forests work and how to implement them, they may not delve into these details or talk about more advanced ensembling methods. Yet these details are crucial for building effective models and optimizing performance.
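As a rough illustration of choosing the number of trees and features, the sketch below compares a few random forest settings by their out-of-bag accuracy in scikit-learn. The synthetic dataset, the candidate values, and the helper name `oob_scores` are assumptions made for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)


def oob_scores(X, y, tree_counts=(25, 100), feature_options=("sqrt", None)):
    """Return {(n_estimators, max_features): out-of-bag accuracy}."""
    scores = {}
    for n_trees in tree_counts:
        for max_features in feature_options:
            forest = RandomForestClassifier(
                n_estimators=n_trees,
                max_features=max_features,  # features considered at each split
                oob_score=True,  # score each sample on trees that never saw it
                random_state=0,
            )
            forest.fit(X, y)
            scores[(n_trees, max_features)] = forest.oob_score_
    return scores


scores = oob_scores(X, y)
best = max(scores, key=scores.get)
print("best (n_estimators, max_features):", best)
print("out-of-bag accuracy:", round(scores[best], 3))
```

Out-of-bag scoring is used here because it gives a validation-style estimate without a separate hold-out split. Cross-validation would be a reasonable alternative.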

Another skill that is often overlooked is web scraping. Collecting data from the web is a common task in many ML projects, but it requires knowledge of tools and techniques for scraping data from websites. Popular courses may touch on this topic briefly, but they may not provide enough hands-on experience to truly master this skill.

In addition to technical skills, junior ML developers also need to know how to present their solutions effectively. This includes creating user-friendly interfaces and deploying models to production environments. For example, Flask combined with ngrok lets you put a web interface in front of an ML model and share it through a public URL, but many courses do not cover these tools at all.
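A minimal sketch of the Flask side of such a demo might look like the following. The `/predict` route, the JSON shape, and the `predict` function (a stand-in that just averages its inputs) are all hypothetical. A real project would load a trained model instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


@app.route("/predict", methods=["POST"])
def predict_endpoint():
    payload = request.get_json(silent=True)
    features = payload.get("features") if isinstance(payload, dict) else None
    # Reject anything that is not a non-empty list of numbers.
    if (
        not isinstance(features, list)
        or not features
        or not all(isinstance(x, (int, float)) for x in features)
    ):
        return jsonify(error="expected JSON with a non-empty 'features' list"), 400
    return jsonify(prediction=predict(features))
```

Assuming the file is saved as `app.py`, recent Flask versions can serve it locally with `flask --app app run`, and running `ngrok http 5000` in a second terminal exposes that local port through a temporary public URL.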

Another important skill that is often overlooked is Docker. Docker is a containerization tool that allows developers to easily package and deploy applications. Understanding how to use Docker can be valuable for deploying ML models to production environments and scaling applications.

Virtual environments are another important tool for managing dependencies and isolating projects. While many courses may cover virtual environments briefly, they may not provide enough hands-on experience for junior developers to truly understand their importance.
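To make the idea concrete, here is a small sketch that creates a throwaway virtual environment with Python's standard `venv` module and prints its configuration file. The directory name `demo-env` and the helper `create_env` are illustrative assumptions.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create a virtual environment at env_dir and return the path to its config file."""
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"


# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = create_env(Path(tmp) / "demo-env")
    cfg_exists = cfg_path.is_file()
    cfg_text = cfg_path.read_text()

print(cfg_text)
```

In day-to-day work the command-line equivalent, `python -m venv .venv`, is more common. Packages installed after activating the environment stay inside that directory instead of leaking into other projects.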

Git and GitHub are essential tools for version control and collaboration in software development, including ML projects. However, many junior developers have only a surface-level understanding of them and may not know how to use them effectively for managing ML projects.

Finally, ML tracking systems such as Weights & Biases or MLflow can help developers keep track of model performance and experiment results. These systems can be valuable for optimizing models and improving performance, but they may not be covered in depth in many courses.

By mastering these skills, junior developers can set themselves apart from the competition and become valuable assets to any ML team.

What do you need to get an ML engineering job?

Young professionals often face a problem: to get a job, they need experience. But how can they get the experience if nobody wants to hire them? Luckily, in ML and in programming in general, you can resolve this problem by creating pet projects. They allow you to demonstrate your programming skills, your knowledge of ML, and your motivation to a potential employer.

Here are some ideas for pet projects that I, honestly, would like to see more often from people who apply for jobs in my department:

Web scraping project

The goal of this project is to scrape data from a specific website and store it in a database. The data can be used for various purposes, such as analysis or machine learning. The project can involve the use of libraries like BeautifulSoup or Scrapy for web scraping and SQLite or MySQL for database storage. Additionally, the project can include integration with Google Drive or other cloud services for backup and easy access to the data.
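The core of such a project can be sketched with the standard library alone. The example below parses links out of an inline HTML sample with `html.parser` (standing in for BeautifulSoup or Scrapy) and stores them in SQLite. The sample page, the `links` table, and the names `LinkParser` and `scrape_to_db` are made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Inline sample page so the sketch runs offline; a real project would download
# the HTML first (for example with urllib.request or the requests library).
SAMPLE_HTML = """
<ul>
  <li class="item"><a href="/posts/1">First post</a></li>
  <li class="item"><a href="/posts/2">Second post</a></li>
</ul>
"""


class LinkParser(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_to_db(html, conn):
    """Parse links from html, upsert them into the links table, return the count."""
    parser = LinkParser()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links (href TEXT PRIMARY KEY, title TEXT)"
    )
    # INSERT OR REPLACE keeps repeated runs from duplicating rows.
    conn.executemany("INSERT OR REPLACE INTO links VALUES (?, ?)", parser.links)
    conn.commit()
    return len(parser.links)


conn = sqlite3.connect(":memory:")  # swap for a file path to persist the data
count = scrape_to_db(SAMPLE_HTML, conn)
rows = conn.execute("SELECT href, title FROM links ORDER BY href").fetchall()
print(count, rows)
```

Using the link as the primary key is one simple way to make the scraper safe to re-run on a schedule.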

NLP project

Here you need to build a chatbot that can understand and respond to natural language queries. The chatbot can be integrated with additional functionality, such as maps integration, to provide more useful responses. You can also use libraries like NLTK or spaCy for natural language processing and TensorFlow or PyTorch for building the model.

CV project

The objective of this project is to build a computer vision model that can detect objects in images. There is no need to use the most sophisticated models; just use models that show your grasp of the basics of deep learning, like U-Net or YOLO. The project can include a website, exposed through ngrok or a similar tool, where a user uploads an image and gets it back with the detected objects highlighted by bounding boxes.

Sound project

You can build a speech-to-text model that converts recorded audio into text. The model can be trained using recurrent architectures like LSTM or GRU. The project can involve the use of libraries like PyDub or librosa for audio processing and TensorFlow or PyTorch for building the model.

Time series prediction project

The objective of this project is to build a model that can predict future values based on past data. The project can involve the use of libraries like Pandas or NumPy for data manipulation and scikit-learn or TensorFlow for building the model. The data can be sourced from various places, such as stock market data or weather data, and can be integrated with web scraping tools to automate data collection.
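Before training anything elaborate, it can help to measure simple baselines that a real model would have to beat. The sketch below evaluates a last-value forecast and a moving-average forecast with walk-forward validation, so each prediction uses only earlier data. The series values and function names are invented for illustration.

```python
from statistics import mean


def moving_average_forecast(history, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    return mean(history[-window:])


def walk_forward_mae(series, forecast, min_history=3):
    """Forecast each point from only the points before it; return mean absolute error."""
    errors = []
    for t in range(min_history, len(series)):
        prediction = forecast(series[:t])
        errors.append(abs(series[t] - prediction))
    return mean(errors)


# Made-up daily values, purely illustrative.
series = [10, 12, 13, 12, 15, 16, 18, 17, 19, 21]

naive_mae = walk_forward_mae(series, lambda history: history[-1])
ma_mae = walk_forward_mae(series, moving_average_forecast)
print(f"naive MAE: {naive_mae:.2f}, moving-average MAE: {ma_mae:.2f}")
```

The same `walk_forward_mae` function can score any model wrapped as a function from history to next value, which keeps the comparison with a scikit-learn or TensorFlow model fair.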

What else?

Having a good portfolio that showcases your skills is as valuable as (or maybe even more valuable than) a degree from a renowned university. However, there are other skills that are important for anyone these days: soft skills.

Developing soft skills is important for an ML engineer because it helps them communicate complex technical concepts to non-technical stakeholders, collaborate effectively with team members, and build strong relationships with clients and customers. Some ways to develop soft skills include:

Creating a blog. While writing is a solitary practice, it can be quite effective at helping you become better at communication. Writing about technical concepts in a clear and concise manner can help you structure your thoughts better and grasp how to explain complex tasks to different audiences.
Speaking at conferences and meetups. Presenting at conferences can help ML engineers improve their public speaking skills and learn how to tailor their message to different audiences.
Practicing explaining concepts to your grandma. Explaining technical concepts in simple terms can help ML engineers improve their ability to communicate with non-technical stakeholders.

Overall, developing both your technical skills and communication skills can help you get your first job in the ML field.

Ivan Mishanin is the co-founder and COO of Brainify.ai, an AI/ML biomarker platform for novel treatment development aimed at psychiatry. His previous tech company, Bright Box, was sold to Zurich Insurance Group for $75M.