
Hi, my name is Hironori Sakai. I am a machine learning engineer living in Munich, Germany.
I have a passion for advanced analysis and MLOps. By applying advanced analysis methods, we can gain deep insights from data. By applying MLOps methods, we can convert those insights into business value in a scalable fashion. I see these two fields as one large, connected field and propose end-to-end solutions for data products.
I was formerly a researcher. After receiving my PhD in Mathematics in 2009, I did research in mathematics in Taiwan (NCKU) and Germany (Max Planck Institute for Mathematics and Uni Muenster). I then chose a career in data science, because I can apply my expertise in mathematics to real problems and because I like software development.
I am currently working as a Data Engineer/Machine Learning Engineer at gutefrage.net GmbH.
Because I have experience in both the front end (analysis, modeling, reporting) and the back end (data engineering, software development) of data science, I describe myself as a full-stack data scientist.
I do descriptive analysis, which is the most important type of analysis. Beyond that, I tackle complicated problems that require statistical methods, machine learning or deep learning. My main programming language is Python, and I also have hands-on experience in R. I use scikit-learn and pandas in my daily work. I also build deep learning models with PyTorch (or TensorFlow), especially for text classification. My favorite visualization library is Altair, and personally I use Streamlit for dashboards.
An RDBMS is a comfortable data store for me, but a Spark cluster is one of my favorite environments, because it allows me to do complicated analysis of big data. For data processing I use PySpark as well as Spark with Scala.
MLOps is DevOps for data science: a set of practices that combines data science and IT operations. It is a key concept for effective and efficient data science projects. Git, Poetry, DVC, Docker and CI/CD pipelines (Jenkins, AWS CodeBuild) are in my MLOps toolbox.