Hironori Sakai–a mathematician in a broader sense

Hi, my name is Hironori Sakai. I am a machine learning engineer living in Munich, Germany.

About

About me

I have a passion for advanced analysis and MLOps. By applying advanced analysis methods, we can gain deep insights into data. By applying MLOps methods, we can convert those insights into business value in a scalable fashion. I see these two fields as one large connected field and propose end-to-end solutions for data products.

I used to be a researcher. After I received my PhD in Mathematics in 2009, I worked as a mathematics researcher in Taiwan (NCKU) and Germany (Max Planck Institute for Mathematics and Uni Muenster). After that I chose a career in data science, because it lets me apply my expertise in mathematics to real problems and because I like software development.

I am currently working as a Data Engineer/Machine Learning Engineer at gutefrage.net GmbH.

Where you can find me

💼 Professional SNS: LinkedIn and XING
💻 GitHub: Hotel data set analysis (Dashboard), Simple ML pipeline for GoEmotions dataset
🦋 Bluesky: Miscellaneous posts about data science and coffee.

Skills

Because I have experience in both the front end (analysis, modeling, reporting) and the back end (data engineering, software development) of data science, I describe myself as a full-stack data scientist.

Advanced analysis

I do descriptive analysis, which is the most important type of analysis. Beyond that, I tackle complicated problems that require statistical methods, machine learning, or deep learning. My main programming language is Python, and I also have hands-on experience with R. I use scikit-learn and pandas in my daily work. I also build deep learning models with PyTorch (or TensorFlow), especially for text classification. My favorite visualization library is Altair, and I use Streamlit for dashboards in my personal projects.

Big Data

I am comfortable with an RDBMS as data storage, but a Spark cluster is one of my favorite environments, because it lets me run complicated analyses of big data. For data processing I use PySpark as well as Spark with Scala.

MLOps

MLOps is DevOps for data science: a set of practices that combines data science and IT operations. It is key to an effective and efficient data science project. Git, Poetry, DVC, Docker, and CI/CD pipelines (Jenkins, AWS CodeBuild) are in my MLOps toolbox.