DAVID ELVIS KOMBO




David Elvis Kombo

Email: komboelvis08@gmail.com

A machine learning engineer with experience building data pipelines with DVC, Apache Spark, and Pachyderm that feed machine learning and deep learning models (GANs, RNNs, CNNs, LSTMs); running orchestration and workflow pipelines with Kubeflow, Prefect, and Kedro; deploying with CircleCI and Jenkins; monitoring with Fiddler and Neptune; and maintaining solutions in production.

Welcome to my website!




- Data science - Machine Learning Engineer at Kunumi, Belo Horizonte, Brazil, P.O. Box 7302-00300
- Machine learning - Lead Machine Learning Engineer at Omdena, Nairobi, Kenya
- Artificial Intelligence - Machine Learning Engineer at Omdena

Contact

Mail


Oct, 2019 - Oct, 2020

I am currently a Machine Intelligence student at the African Masters in Machine Intelligence (AMMI) in Ghana, a Master’s program founded by Google and Facebook that provides me with state-of-the-art training in machine learning and its applications.

June, 2019 - Aug, 2019

I received an AIMS ESMT-IIP scholarship to undertake business school management and skills development training at AIMS South Africa. The program is designed as a mixture of academic study, practical learning, and skills development, which allowed me to transition from a scientific environment to an applied industry setting.

Aug, 2018 - Jun, 2019

I was awarded a MasterCard Foundation scholarship to pursue studies in Mathematical Sciences at Stellenbosch University and the African Institute for Mathematical Sciences (AIMS-SA).

Master's thesis: Pricing American Options Using the Least Squares Monte Carlo Method.

Supervisor: Prof Philip Mashele
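
The Least Squares Monte Carlo idea named in the thesis title can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes geometric Brownian motion for the underlying, a quadratic polynomial regression basis, and illustrative parameter values, and every function name here is hypothetical.

```python
import math
import random


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ b0 + b1*x + b2*x^2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_3x3(a, t)


def american_put_lsmc(s0, strike, rate, sigma, maturity, steps, paths, seed=0):
    """Estimate an American put price (assumed GBM dynamics, quadratic basis)."""
    rng = random.Random(seed)
    dt = maturity / steps
    disc = math.exp(-rate * dt)
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)

    # Simulate price paths under the risk-neutral measure.
    prices = []
    for _ in range(paths):
        path = [s0]
        for _ in range(steps):
            path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
        prices.append(path)

    # Cash flow per path, initialised with the payoff at maturity.
    cash = [max(strike - p[-1], 0.0) for p in prices]

    # Step backwards, deciding on each in-the-money path whether to exercise.
    for t in range(steps - 1, 0, -1):
        cash = [c * disc for c in cash]  # discount future cash flows to time t
        itm = [i for i in range(paths) if strike - prices[i][t] > 0.0]
        if len(itm) < 3:
            continue  # too few points to fit three coefficients
        xs = [prices[i][t] / strike for i in itm]  # scaled for conditioning
        ys = [cash[i] for i in itm]
        b0, b1, b2 = fit_quadratic(xs, ys)
        for i, x in zip(itm, xs):
            continuation = b0 + b1 * x + b2 * x * x
            exercise = strike - prices[i][t]
            if exercise > continuation:
                cash[i] = exercise

    # Discount from the first exercise date back to time 0.
    # Immediate exercise at time 0 is not considered in this sketch.
    return disc * sum(cash) / paths
```

For example, `american_put_lsmc(36.0, 40.0, 0.06, 0.2, 1.0, 50, 4000)` returns a Monte Carlo estimate that varies with the seed and the number of paths. The regression is fitted on in-the-money paths only, since those are the only paths where the exercise decision matters.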


Oct, 2010 - Jul, 2016

I hold a Bachelor's degree in Applied Mathematics, summa cum laude, from the University of Kinshasa in the DRC.

Final project: "Système Bonus-Malus applicable par les compagnies d'assurance automobile en RDC" (A bonus-malus system applicable by motor insurance companies in the DRC).

Supervisor: Prof. Leonard Manya(+).


Work Experience



Machine learning Engineer

March, 2021 - Present
  • Developing a process-oriented data platform that behaves like a traditional database combined with an automatic prediction engine, intended to help businesses save billions by answering questions from data.
  • Helped a client company save 10% of its expenses by using reinforcement learning to predict, allocate, and direct the tugboats used for towing cargo ships.
  • Made project delivery by internal data scientists 3x faster by building template pipelines that were then fine-tuned to specific projects.

Lead Machine Learning Engineer

Nov, 2021 - August, 2022
  • Created a hybrid model combining computer vision and NLP techniques to perform near-accurate data extraction.
  • Captured KPI values and extracted information using an NER-based approach with spaCy.
  • Led a team in creating the DagsHub and DVC data pipelines, and designed the deployment and packaging of the project.

Machine Learning Engineer

November, 2016 - June, 2020
A project with the World Resources Institute, Code For Africa, and collaborators from Stanford University to leverage AI to map land ownership and boost Kenya’s efforts to restore degraded land in an equitable way.
  • Built a knowledge graph of land ownership in Kenya, which boosted the efforts to restore degraded land.
  • Built a Named Entity Recognition model (spaCy, transformer) for entity identification from gazettes and court documents.
  • Geo-tagged network data and made it easily accessible to social activists and journalists.
CERTIFICATE
BLOG ON THE PROJECT

Education


AI Center of Excellence

Machine Learning and Artificial Intelligence

December, 2020 - April, 2021

Enrolled in AICE, an esteemed AI centre focused on intensive online training in the full range of competencies required to become a job-ready ML/AI engineer.

Google Developer program

Cloud computing

April, 2020 - Oct, 2020

Earned a scholarship to the program's associate cloud engineering track. The program gave us free access to select courses, projects, embedded labs (Qwiklabs), and skills assessments, plus developer support.

CERTIFICATE

10 Academy

Data science

July, 2020 - Oct, 2020

Graduated from 10 Academy, an esteemed academy focused on intensive online training in the full range of competencies required to become a job-ready data scientist. Worked on different projects and sharpened my skills in statistical machine learning modelling, data extraction, data acquisition and data cleaning, exploratory analysis, value extraction, and data visualization.

CERTIFICATE

Jomo Kenyatta University of Agriculture and Technology

BSc Applied Mathematics

September, 2017 - September, 2020

Mathematical content combined with principles of computer science, computation, systems design, and software engineering to give a computer-aided approach to mathematics. Learning how to express and present mathematical and computing knowledge, techniques and tools in a logical and precise manner.

Programming Languages

  • Python
  • Java

Modelling

  • Inference
  • Statistics
  • Linear Algebra

Dashboards

  • Tableau
  • Power BI
  • Python libraries

Deployment

  • Docker and Kubernetes
  • AWS
  • Flask
  • Streamlit

Career Skills

  • Reporting to clients
  • Making presentations
  • Team work
  • Career development

Libraries

  • Deep Learning: PyTorch, TensorFlow, Keras, OpenCV
  • Time-series Analysis: Statsmodels
  • Scientific Computation: NumPy
  • Tabular data: Pandas, Vaex
  • Big data: Apache Spark, Apache Kafka, Apache Airflow, Snowflake, Kubeflow, Vaex

Online Certificates

  • "Python for data science and AI", an online course authorized by IBM and offered through Coursera. My certificate here

  • "What is Data Science", an online course authorized by IBM and offered through Coursera. My certificate here

  • "Tools for Data Science", an online course authorized by IBM and offered through Coursera. My certificate here

  • "Data Science Methodology", an online course authorized by IBM and offered through Coursera. My certificate here

  • "Sentimental Analysis", an online course offered through Coursera. My certificate here

  • "Intermediate Machine Learning", an online course offered through Coursera. My certificate here

  • "Feature Engineering", an online course offered through Coursera. My certificate here

  • "Data Science and Machine Learning with Python", an online course offered through Coursera. My certificate here

  • Zindi Competition: "Expresso Churn Prediction Challenge". Certificate of participation.

  • Zindi Competition: "Zimnat Insurance Recommendation". Certificate of participation.

  • Kaggle Competition: "Future Sales Prediction". Certificate of participation.

Written Data science Articles

  • "Value of Machine learning pipelines" Link
  • "Exploring Users’ Data in twitter using Twitter API" Link
  • "Bank Term Deposit Marketing Strategy" Link

Written and Spoken Languages

  • English
  • Kiswahili

Portfolio

A selection of cool stuff I've worked on.

Featured


Bank Term Deposit Prediction

There is stiff competition among financial institutions and banks to grow the customer base of their retail banking segment. Along with offering innovative products to the public, they spend a huge amount of money on marketing those products. The term deposit is one of the most important of the diverse products and services banks offer in the retail banking segment. With advances in data science and machine learning and the availability of data, most banks are adopting data-driven decision making.

In this project, I:

  • Used data visualization tools such as Tableau, Matplotlib, and Seaborn.
  • Built an end-to-end machine learning pipeline: exploratory data analysis, data wrangling, and building and fine-tuning models using grid search.
  • Improved model interpretation and performance by using label encoding and one-hot encoding, fine-tuning hyperparameters, and handling class imbalance.
  • Used GitHub for version control with a recommended folder structure.
  • Documented results for both technical and non-technical audiences.

User Analytics In Telecom Industry

Investment is a path that many entities use to generate revenue. Before an investor purchases an asset, a rich analysis of the data that underlies the business is needed to understand the fundamentals of the business and, especially, to identify opportunities to drive profitability by changing which products or services are offered. In this project, an investor is interested in purchasing TellCo, an existing mobile service provider in the Republic of Pefkakia.

In this project, I:

  • Applied data wrangling techniques to the telecommunication data.
  • Performed univariate (graphical and non-graphical) and bivariate analysis of the user data, clustered customers using the k-means algorithm, and reduced dimensionality using PCA.
  • Produced self-explanatory visualizations with tools such as Plotly and Matplotlib to gain insights for improving customer experience and reducing the churn rate.
  • Provided a comprehensive report on the analysis to management for decision making.
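
The customer clustering step listed above can be illustrated with a minimal k-means sketch. It is not the project's code: it uses only the Python standard library, the two features (data volume and session duration) and their values are hypothetical, and the PCA step is left out.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical users: (data volume in GB, average session duration in minutes).
users = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
centroids, labels = kmeans(users, k=2)
```

On real telecom data the features would normally be standardized first, since k-means relies on Euclidean distance and is sensitive to feature scale.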

Pharmaceutical Stores Sales Prediction

Precise sales prediction is an essential and inexpensive way for a company to increase its profits, decrease its costs, and respond more flexibly to change. In other words, accurate sales forecasting captures the trade-off between satisfying customer demand and controlling inventory costs. For the pharmaceutical industry in particular, successful sales forecasting systems can be very beneficial because of the short shelf life of many pharmaceutical products and the importance of product quality, which is closely related to human health.


Brand Impact Evaluation

SmartAd is a mobile-first advertising agency. It designs intuitive, touch-enabled advertising and provides brands with an automated advertising experience through machine learning and creative excellence. The company is based on the principle of voluntary participation, which is proven to increase brand engagement and memorability 10x more than static alternatives. SmartAd provides an additional service called Brand Impact Optimiser (BIO), a lightweight questionnaire served with every campaign to determine the impact of the creative (the ad they design) on various upper-funnel metrics, including memorability and brand sentiment. This project takes the perspective of a data scientist at SmartAd.

More

  • Senegal Covid-19 Inference and forecast:
    Github repo

    Change point analysis to quantify the impact of Senegal government policy interventions to slow the spread of COVID-19.

    Skills: Python, machine learning, MCMC, PyMC3, Bayesian inference, statistics

  • Twitter influencers analysis
    Github repo

    Identifies influencers' rank positions from Twitter data for online viral marketing strategies.

    Skills: Python, NLP, Scikit-learn

  • Moodle Database Educational Data Log Analysis
    Github repo

    Explored the 10 Academy Moodle logs stored in the database, together with many other relevant tables, and built a Tableau dashboard that illustrates the progress of students over time.

    Skills: Python, databases, Tableau, PostgreSQL, SQL

  • Tableau Dashboard for Covid-19 in Africa
    Dashboard

    Coronavirus disease 2019 (COVID-19) is a communicable respiratory disease caused by a new strain of coronavirus that causes illness in humans. This dashboard shows the data for Africa.

    Skills: Tableau, Data synthesis



I am motivated by these quotes:

No matter the challenge. Get up, dress up, show up and NEVER GIVE UP!!

Great success comes from great personal sacrifices

Pi Day 2020

Mathematics is everywhere in the organization of civilization: it helps design electoral systems that better represent the people's will.