"Don't let what you think you can’t do interfere with what you can do."
Oct 2024 - now
- Finance Transformation Analyst, Finance & Controlling - Haventree Bank - Toronto, Ontario, Canada 🍁🇨🇦
- Data Automation & Workflow Optimization: Designed and deployed Alteryx workflows to automate bank account reconciliation and consolidate mortgage & funding transactions, integrating data from SharePoint, Snowflake, and SQL databases; improved reconciliation accuracy and efficiency by 90% and reduced manual reconciliation effort by 70%.
- Collaboration & Governance: Worked closely with the accounting and finance teams to define data standards and establish best practices for financial data automation and reconciliation in Alteryx Designer workflows deployed to Alteryx Server.
- Advanced Analytics & Reporting: Supported the risk analytics and credit risk teams by developing logic, KPIs, and code in Alteryx, Python, and Tableau to transform data, plot geospatial visuals, analyze fire-weather climate risk, and translate OSFI metrics.
Apr 2024 - Oct 2024
- Analyst, Business Insights, Accounting, Tax & Finance - Hudson's Bay Company (HBC: Hudson’s Bay, The Bay, Saks Fifth Avenue, Saks Off Fifth) - Toronto, Ontario, Canada 🍁🇨🇦 🇺🇸
- SQL Pipelines & Data Engineering: Built SQL pipelines and automated Alteryx workflows to streamline financial reporting, reducing manual reconciliation effort by 70-95%; integrated multiple data sources (Sales Audit, Oracle, Snowflake) to enhance data visibility and reporting accuracy for tax compliance teams.
- Machine Learning & AI: Implemented ML, NLP, and Computer Vision models in Python to classify tax codes for SKU items in both the US & Canada and to identify features from PDF invoices, increasing tax compliance and reducing manual effort by 60%.
- Data Architecture & Analytics: Collaborated with the Data Engineer and Architect to design and develop a Snowflake-based Data Hub that centralizes tax data for later reporting and analytics, optimizing ETL workflows and reducing reporting errors by 70%.
- Data Visualization & Business Intelligence: Designed Tableau and Power BI dashboards with advanced LOD, DAX, and MDX measures, improving stakeholder engagement and decision-making insights.
Jan 2024 - Dec 2027
- Master of Liberal Arts (ALM), Extension Studies, Harvard University, Admitted Candidate
- Harvard Extension School, Harvard University (online part-time) - Cambridge, Massachusetts, USA 🇺🇸
Jan 2023 - Apr 2024
- Alteryx Administrator, AWS Cloud Ops Data Migration - Billennium IT Inc for Roche (Swiss BioTech), Data Engineering - Integration, Data Services & Insights Foundational Domain - Mississauga, Ontario, Canada 🍁🇨🇦
- Data Governance & Log Analysis: Monitored and analyzed IT log data to track user activities, workflow hubs, and server performance across Roche’s North America & Europe operations.
- Automation & Performance Optimization: Collaborated with the team leader to develop Alteryx-based automation flows that enhance users' workflow performance and identify bottlenecks in data processing.
- System Monitoring & Security: Evaluated user authentication, server logs, and data access patterns to ensure compliance with Roche’s data protection standards and global security policies.
- Alteryx Server Administration: Optimized server configurations, managed workflow execution, and collaborated with IT teams to troubleshoot high-performance computing issues.
Jan 2021 - Aug 2022
- Business Insights & Analytics Post-Graduate Program
- Humber College - Toronto, Ontario, Canada 🍁🇨🇦
May 2022 - Aug 2022
- Data Science Intern (remote) - Cohost AI (founded in San Francisco, USA, based in Ha Noi, Viet Nam) - Toronto, Ontario, Canada 🍁🇨🇦
- Data Pipeline Automation: Automated data ingestion from multiple APIs & databases using Python and SQL, enabling real-time financial reporting and, combined with domain expertise, improving analytics accuracy by 40-60%.
- Visualization & Reporting: Created domain-based KPIs and embedded them in interactive Power BI dashboards built with advanced DAX to support revenue decision-making, boosting user engagement by 10-25%.
Jan 2022 - Apr 2022
- Product Data Analyst Intern
- iRestify Inc. (based in Toronto, Canada) - Toronto, Ontario, Canada 🍁🇨🇦
- Geospatial & Business Intelligence Analytics: Built GIS-based Power BI dashboards to analyze revenue performance by location, optimizing territory-based pricing and operations.
- Data Cleansing & Feature Engineering: Applied Python & SQL for data wrangling, increasing accuracy by 20% for key KPIs.
- Data Automation: Designed workflow automations using DAX and MDX in Power BI, reducing manual reporting effort by 30%.
Aug 2021 - Dec 2021
- Data Engineering & Analytics Intern (remote) - Center of Talent in AI (CoTAI, based in Ho Chi Minh City, Viet Nam) - Toronto, Ontario, Canada 🍁🇨🇦
- Big Data Engineering: Managed 4M+ data records, optimizing ETL pipelines between Vietnam & North America for sentiment analysis & behavior detection.
- Sentiment Analysis & Target Detection: Developed NLP-based classification models in Python to detect sentiment and reaction from customer feedback on e-commerce platforms.
- Visualization & Predictive Insights: Designed Tableau dashboards to track consumer sentiment trends and signals.
- Imbalanced Classification: Compiled Machine & Deep Learning classifiers that handle imbalanced datasets to detect target customers for the Banking Marketing Targets dataset.
Jun 2017 - Jun 2019
- Sales Executive & Sales Coordinator
- Sofitel Saigon Plaza - Ho Chi Minh City, Viet Nam
- Revenue Forecasting: Prepared and consolidated financial reports in Excel & Power BI to track sales performance and forecast departmental revenue targets, supporting executive decision-making and driving quarterly sales growth of 1-10% per account.
- Revenue Generation: Managed key accounts, segments, and markets, consistently meeting and exceeding team & personal revenue targets for approximately 16 months, contributing to 65% of sales duration while consulting with the Revenue team on target settings.
Topic | more projects available on GitHub & Tableau Public |
---|---|
Brain Tumor MRI Image Segmentation & Detection (Computer Vision) | - Designed deep learning pipelines (Keras, PyTorch) for MRI image segmentation, leveraging CNNs and U-Net for high-precision tumor detection. |
Scalable Cloud-Based NLP for Feedback Analysis | - Built a real-time NLP feedback-processing platform using Python, PySpark, and SQL, integrated with AWS Redshift, AWS Glue, and GCP BigQuery, to identify customer sentiment trends. |
Housing Affordability Statistical Inferences | - Applied Bayesian models (pooled, unpooled, hierarchical), Linear Regression, and Maximum Likelihood Estimation (MLE) to analyze key housing affordability indicators and posterior distributions. |
Hotel Daily Room Rate & Booking Cancellation Prediction | - Implemented XGBoost, Random Forest, and Deep Neural Networks (DNNs) to predict ADR (Average Daily Rate) and booking cancellation probability; applied logistic regression, hypothesis testing, and ensemble models, improving revenue forecast accuracy with grid-search hyperparameter tuning. |
IEEE-CIS Fraud Detection (Capstone, Humber College) | - Preprocessed data in Python, designed the solution architecture, and compared ML classifiers to determine the best performers on the imbalanced dataset: Balanced Random Forest (ROC AUC around 0.9) and Random Forest (ROC AUC and Precision around 0.9). |
Safe Roads 2022 Competition - Toronto Police Service | - Used Power BI, Python, and Azure Machine Learning to analyze and interpret geospatial datasets, conduct A/B testing, identify contributing factors, and make recommendations on road conditions, awareness, and the top fatal intersections to enhance traffic safety and prevent fatal accidents; the Random Forest model achieved ROC AUC and Precision around 0.8. |
Sentiment Analysis | - Conducted sentiment analysis on customers' comments and analyzed data generated by an NLP system, accessed through an API, from fan-page dialogs about diet products; participated in data operations and ETL in Python and SQL (MySQL, Azure), with visualization in Tableau, to determine top customers, the most efficient fan pages, the most crucial intentions & demand entities, peak effective contact hours, peak confirmation periods, and common complaints. |
Banking Dataset – Marketing Targets | - Used ML and DL classification methods in Python to predict target customers more accurately while avoiding overfitting on an imbalanced dataset; - RUS Boost had the highest Balanced Accuracy, Geometric Mean, and F1 scores & the best Confusion Matrix among classifiers |
SQL Murder Mystery | - Identified the exact murderer and the planner of the killing with the shortest possible SQL queries, using basic to intermediate querying approaches: INNER/LEFT JOIN, GROUP BY, WITH, WHERE, and subqueries |
Porto Seguro’s Safe Driver Prediction | - Used ML and DL classification methods in Python to predict more accurately the probability of auto insurance policy holders filing a claim while avoiding overfitting on an imbalanced dataset; - RUS Boost had the highest Balanced Accuracy, Geometric Mean, and F1 scores & the best Confusion Matrix among classifiers |
Acquisition & Merger Analysis | - Compared two loading approaches (Python's SQLAlchemy to MySQL versus SQL to Hadoop); investigated & identified organizations for the most profitable merger and acquisition by examining accumulated datasets on Sales, Revenue, and Product Line in SQL on Zeppelin; visualized charts in Tableau and Power BI |
Pharma Portfolio Predictive Analysis | - Coded in Python and AzureML to analyze time-series pharmaceutical sales data, forecast the key pharma product, and predict future patterns |
Annual Sales Analysis & Visualization | - Applied EDA in Python and visualized 200K data points to answer revenue questions - Visualized & compared charts in Tableau & Power BI to determine the factors behind the highest sales value: December, San Francisco, peak order hours, top-selling products, and the correlation between prices & volumes |
Income Analysis & Classification | - Preprocessed and analyzed the income background of all records in Python and SQL & visualized key variables in Tableau / Power BI to determine highlights, trends & predictions of income types with ML and DL classifiers |
Eden Hotels & Resorts Group | - Created a Sales Incentive Plan in Java: input, check password, calculate Salespersons, Revenues & export reports; calculated Hotel Revenue’s metrics in Excel to analyze and visualize different types of KPIs - Designed a database and inserted sample data into tables of hotels, guests, employees & bookings using SQL queries |
University Admission | - Led a team & built a Java program (< 150 coding lines) to store information of the newly admitted students, prompted user to enter the student name & high school grades, calculated GPA & assigned to the University’s schools |
Investment Analysis of Shopify and Lightspeed in Canada | - Managerial Finance & Accounting Report |
Governance & Ethics in Data | - Gained the highest grade of 95% in all Professor's classes analyzing ethics & governance models about data manipulated in Cybersecurity, COVID-19, Vaccination, etc. - Analyzed 3 aspects of the ethics model, data governance to mitigate potential challenges in the chosen context |
TD Bank's Porter’s Value Chain Analysis (available for being shown only in a section) | - Conducted an analysis of TD Bank over history, vision, mission, strategic and financial objectives, External environment based on PESTEL and Five Forces analysis, Internal environment based on SWOT-analysis, resource and capability analysis, and a value chain analysis, the current strategic approach and its various strategic actions, the staffing practices and strategy execution, Organizational structure. |
Better Working World - EY, NASA, Microsoft | - Used Python, Machine Learning, Azure Studio, and Azure Machine Learning in 3 challenges over 3 months to help locate and protect the biodiversity of frogs by discovering and counting local and global frogs from weather data sampled over space and time (spatiotemporal sampling), with a given preliminary F1 score. |
US Medicaid Pharmacy Pricing Analysis | - Established tables as nodes and graphs on Neo4j in Cypher, and on Azure in SQL, to predict future prices/quantities and important pharmaceutical products in US Medicaid datasets using Python and AzureML |
Home Credit Default Risk | - Connected and transformed datasets and conducted EDA in SQL and Scala on Hive and Zeppelin, using customized datasets to analyze the loan applicants' background and help expand lending to those unable to access financial services - Determined on Zeppelin / Tableau / Power BI the most significant background factors of applicants who received the most loan approvals |
Courses | Details |
---|---|
Data Analytics Tools ✅ | SAS, SPSS Modeler, SPSS, Excel, Cognos |
Managerial Finance & Accounting ✅ | Excel (Investment Analysis of Shopify and Lightspeed in Canada) |
Big Data ✅ | Hadoop, R, Neo4j, Cypher, Graph |
Quantitative Research Methods I & II ✅ | Descriptive & Inferential Statistics, Probability, Normal Distribution, Estimation, Hypothesis Testing |
Database & SQL ✅ | SQL, ERD, Normalization |
Governance & Ethics in Data ✅ | Reflection & Integration of Knowledge: Governance & Ethics of Analytics in Data, AI & Technology - only available from hyperlink in my Resume - (graded 95/100, with feedback from Professor Kathleen Mcginn 😧 : "My goodness Phuong, Thank you for sharing this with me. It is indeed a very deep, intelligent and meaningful piece of writing that deserves an excellent grade - 95 (!) - the highest grade I have given so far. Congratulations - you have truly earned it." ) |
Canadian Business & Strategy ✅ | TD Bank's Porter’s Value Chain Analysis & Nucor Corporation Analysis |
Marketing ✅ | |
Predictive Analytics ✅ | linear and multiple regression, decision trees, linear programming, factor analysis, cluster analysis, modelling |
Machine Learning and Programming 1 & 2 ✅ | Python: Data Mining, Data Science, Data Visualization, Dimension Reduction, CRM, Evaluating Predictive Performance, Multiple Linear Regression, K-NN, Naive Bayes Classifier, Classification, Regression Trees, Logistic Regression, Cluster Analysis |
Communication & Data Visualization ✅ | Excel, Tableau |
Business Intelligence ✅ | Power BI |
Machine Learning and Programming 2 ✅ | Python: Time Series Forecasting, Market Basket Analysis, Natural Language Processing |
Capstone Course ✅ | IEEE-CIS Fraud Detection (Capstone, Humber College) |
Project Management ✅ | Boeing Aviation Case Report of Sales and Supply Boost |
Criteria | Details |
---|---|
Programming | Certified SQL, Python (Pandas, NumPy, Matplotlib, Keras, scikit-learn), TensorFlow Developer (in progress), T-SQL, PL/pgSQL, Java, Scala, R, HTML |
Viz & ETL | Certified Power BI, Tableau Desktop, Alteryx Advanced Designer, Alteryx Designer Cloud Advanced, Alteryx Machine Learning Fundamentals, Tableau Prep, SPSS (Modeler, Statistics), SAS (Studio, Enterprise Miner), Cognos, Qlik |
Big Data | Certified Azure Data Fundamentals, Azure AI Fundamentals, Alteryx Server Administration, Databricks Accredited Lakehouse Fundamentals, AWS (ML & Data Analytics), Azure (ML, Synapse), MySQL, MongoDB, MS SQL, Oracle, PostgreSQL, Hadoop (Hive, Zeppelin), Neo4j, Splunk |
Collaboration wiki | Atlassian Confluence, Jira, Trello |
Languages | English 🇺🇲 (fluent), Vietnamese (native), French 🇨🇦🇨🇵 (basic overall, intermediate reading), German 🇩🇪 (basic overall, intermediate reading) |
Others | Certified Six Sigma White Belt, Excel (Solver, GoalSeek, Macros), GDPR, ServiceNow, Confluence, Jira, Trello, Machine & Deep Learning, AI, Teamwork, Statistics, Probability, Sales, Accounting, Finance, Project Management, Hospitality, Presentation, Communication, Marketing |
Earned 🏅 | Details |
---|---|
ProtonX | TensorFlow Developer (Statistics, Probability, Algebra, Machine Learning, Deep Learning, AI) |
Center of Talent in AI | Python, Machine Learning, Deep Learning, AI, Reinforcement Learning |
Nordic Coder | Python, Tableau |
DataCamp | SQL Intermediate |
Microsoft Office Specialist | Word, Excel, PowerPoint |
Udemy | Power BI for Business Intelligence |