
Wesley Tao

Machine Learning Engineer


About Me

I believe in data and the logic behind it.

I am eager to thrive in any data environment with my two powerful friends, Python and SQL.

I am the one who seeks to study complex data and understand the challenges of accessing it.

MongoDB or traditional databases make no difference to me.

I enjoy looking at and solving big-picture problems and crafting detailed solutions wholeheartedly. I like to ask questions and devise a complete solution.

I want to understand the data (not only the pipes), and I can perform statistical and machine learning analytics and build dashboards because I like doing it. Yes, really, because I do.

I know that I don’t know enough, and it bothers me that there isn’t enough time in the day to learn about the next topic.

I don’t sleep well at night when I leave work with a question unanswered.

I feel accountable for everything I do, and that sense of urgency has been driving me my entire life.

I wish to work in a team where I have my team's back, and the team has mine.

Experience

Ushur Inc

Data Scientist

Santa Clara, CA

Developed an NLP (natural language processing) pipeline to automate insurance underwriters' decision process.

Integrated a Dockerized rule-based expert system, UMLS (Unified Medical Language System), for feature extraction, which improved disease-detection coverage from 40% to 80%.

Implemented a TF-IDF and SVM (support vector machine) model for email classification that beats the production model in overall accuracy and robustness, and visualized the distribution of confidence scores for unseen categories.

Designed a statistical brand proximity metric that evaluates the product's user engagement and the efficiency of the system response; a nonprovisional patent application is in progress.

Pactera Oneconnect AI Lab

Machine Learning Researcher Intern

New York, US

Built an end-to-end chatbot assistant to facilitate the company's hiring process, in collaboration with other engineers.

Implemented LSTM, SVM, and tree-based models to upgrade hard-coded dialogue, and created a user simulator to generate simulation data for testing.

Created and maintained the SQL databases on AWS that the chatbot uses to access and retrieve relevant information.

Adatos.ai

Data Scientist Intern

Implemented a deep-learning-powered model for predicting palm tree yield (tree detection, tree counting/density estimation, leaf and soil nutrient analysis, fertilization analysis, age estimation, and weather/disaster analysis) from satellite imagery, so that the signals could be incorporated into a palm oil commodity futures trading strategy.

Institute of Data Science, Fudan University

Research Assistant

Independently completed a research project on electricity users' behavior, using the Hausman-Taylor model to test the effectiveness of the electricity pricing policy in Shanghai.

Tested and proved the hypothesis that even at a price difference as low as +/- $0.03/kWh, people under the non-flat rate policy tend to use 12% less electricity after 22:00 (peak hour).

Used the Hausman-Taylor model to exclude the unobserved individual effect and successfully measured price elasticities.

Education

Columbia University in the City of New York

Sept 2017 - Dec 2018

Master of Arts in Statistics, GPA 3.92

Fudan University

Sept 2012 - Sept 2017

Bachelor of Arts in International Finance, GPA 3.53

Projects

Air Pollution in France

Air pollution causes 48,000 deaths in France per year. Over 47 million French people are exposed to levels of air pollution particles that the WHO considers unsafe.


Palm Tree Detection and Counting for High-Resolution Satellite Images

In agriculture, palm tree cultivation is one of the big sectors, with a huge market value. Palm trees are used to produce a variety of products such as vegetable oil, biofuel, paper, furniture, decorations, and cattle fodder. Palm oil is also the most widely used vegetable oil in the world.


Top 1 solution for 24-hour Indeed hackathon

In this sprint project, we had only 24 hours to present a data solution for Indeed.com. We performed an in-depth analysis of its job-posting data and found some interesting insights. Based on our findings, we proposed a marketing strategy for Indeed and won the Best Insight Award.


Sentiment Analysis on Sino-US Trade War Twitter Comments

Donald Trump and his trade war: During his election campaign, President Donald Trump threatened to impose 35% to 45% tariffs on Chinese imports to force China into renegotiating its trade balance with the U.S. The immediate result of that would be a fierce trade war.


Skills

Blog and Project Reports

Shared Slides and Worth-Spreading Ideas

This notebook is my place to organize my thoughts, to share insights, to connect with real-world problems and, most important of all, to grow with the data science community.

Other Notes

Get in Touch