IQBAL Fresher 19


MD IQBAL BAJMI

+919701922101 | [email protected] | Linkedin | Github | HackerEarth | Medium

Education
Maulana Azad National Urdu University Hyderabad, India
Master of Computer Applications, aggregate CGPA of 8.2 Aug. 2016 – May 2019
MITMI College, Sabzibagh, Patna, India
Bachelor of Computer Applications, aggregate of 75 percent Aug. 2013 – May 2016

Projects
Sexual Harassment Stories Classification | Python, Tensorflow, Sklearn, Streamlit April 2020 – Present
• Sexual Harassment Stories Classification using Machine Learning and Deep Learning.

• Interpreted using LIME and deployed on Heroku.

• An RNN (Recurrent Neural Network) using pretrained 300-dimensional Word2Vec embeddings worked well.

• Model-deployment : Github

• Live Demo: Heroku

Quora Question pair similarity | Python, scikit-learn May 2020 – June 2020
• Trained Machine Learning algorithms to detect whether two given questions are duplicates.

• Used many hand-engineered features.

• Logistic Regression performed best among the models tried.

• Metrics: Log-loss, Binary Confusion Matrix

Amazon Fine Food Reviews | Python, scikit-learn, NLTK May 2020 – June 2020
• Trained Machine Learning algorithms to find the polarity of reviews, i.e. whether a review is positive or negative.
• Used BOW, bi-gram, n-grams, TF-IDF and Word2Vec featurization techniques.

• Logistic Regression performed best among the models tried.

• Metrics: accuracy score, Binary Confusion Matrix

Personalized Medicine: Redefining Cancer Treatment | Python, Scikit-learn, NLTK June 2020 – July 2020
• Trained Machine Learning Algorithms to classify genetic mutations based on clinical evidence(text).

• It is a multi-class classification problem with nine classes into which a genetic mutation can be classified.

• Interpretability is important and errors can be very costly, so the model must output a probability for each class for every data point.

• Used BOW, bi-gram, n-gram, TF-IDF and Word2Vec featurization techniques for text. Also used response coding for categorical features.


• Used a few hand-engineered features.

• Logistic Regression performed best among the models tried.

• Metrics: Multi-class Log-loss, Multi-class Confusion Matrix

Social Network Graph Link Prediction | Python, Scikit-learn, networkx July 2020 – August 2020
• It is a Kaggle competition launched by Facebook as a recruitment test.

• Trained many Machine Learning Algorithms to predict the missing link.

• The data is given as a graph of source and destination node pairs, where each pair forms an edge.

• Posed the problem as binary classification. Created negative samples randomly from the given graph data set.

• Used several hand-engineered features such as is-followed-back, PageRank, Katz score, SVD, etc.
• The model outputs a probability for each candidate link, so the highest-probability links can be recommended.
• Random Forest Classifier worked well.
• Metrics: f1-score, Confusion Matrix.
New York City Taxi demand prediction | Python, Scikit-learn July 2020 – August 2020
• Problem Statement: Predict the pick-up density given a time and a region.

• It is a regression problem.

• We used Yellow Taxi data of 2015 to train the model.


• Used Fourier transform features because the data is a time series.
• Used some feature engineering and pre-processing to remove outliers.
• Used XGBoost Regressor.
• Metric: MAPE (Mean Absolute Percentage Error).
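For reference, the MAPE metric named above can be sketched in plain Python as follows. This is a minimal illustration of the standard definition, not the project's actual evaluation code; the function name `mape` and the sample numbers are assumptions made for the example.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned as a percentage.

    Standard definition: mean of |actual - predicted| / |actual|, times 100.
    Undefined when any actual value is zero, so that case raises an error.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical pick-up counts for two time bins (illustrative values only).
example_actual = [100, 200]
example_predicted = [110, 180]
example_score = mape(example_actual, example_predicted)
```

Because the metric divides by the actual value, time bins with zero pick-ups need separate handling (for example smoothing or exclusion) before it can be applied.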

Stack Overflow tag prediction | Python, Scikit-learn, skmultilearn July 2020 – August 2020
• Problem Statement: Suggest tags based on the content of a question posted on Stack Overflow.
• ML problem: A question may have more than one tag or no tag at all, so it is a multi-label classification problem.

• Constraints: Incorrect tags could impact customer experience on Stack Overflow. No strict latency requirements.

• Feature Engineering: Removed the code part of each question because coding keywords are very hard to encode.

• ML Model: OneVsRest with Logistic Regression.

• Metrics: Micro f1-score and Hamming Loss.
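The two multi-label metrics named above can be sketched in plain Python as follows. This is an illustrative sketch of the standard definitions, not the project's code; the function names, the tag sets, and the four-tag vocabulary are hypothetical.

```python
def micro_f1(true_tags, pred_tags):
    """Micro-averaged F1: pool true/false positives and false negatives over all questions."""
    tp = fp = fn = 0
    for truth, pred in zip(true_tags, pred_tags):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # tags predicted and correct
        fp += len(pred - truth)   # tags predicted but wrong
        fn += len(truth - pred)   # tags missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_loss(true_tags, pred_tags, all_tags):
    """Fraction of (question, tag) decisions that are wrong."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(true_tags, pred_tags))
    return wrong / (len(true_tags) * len(all_tags))


# Hypothetical data: three questions, one of them with no tags at all.
vocabulary = {"python", "pandas", "java", "spring"}
truth = [{"python", "pandas"}, {"java"}, set()]
predicted = [{"python"}, {"java", "spring"}, set()]

f1 = micro_f1(truth, predicted)
loss = hamming_loss(truth, predicted, vocabulary)
```

Micro-averaging weights every tag decision equally, so frequent tags dominate the score; Hamming loss also counts a wrongly added tag and a missed tag as one error each.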

Image Document Classification using Transfer Learning | Keras July 2020 – August 2020
• Problem Statement: Classify a given document image into one of 16 classes.

• ML problem: It is a multi-class classification problem.

• Data set is balanced.

• Used VGG as the pretrained model.

• Used 3 types of CNN models.

• Metric: accuracy(because data set is balanced).

ML Assignments at Applied AI Course | Python, Sklearn, tensorflow, Keras July 2020 – January 2021
• 1. Optional Python Programming.

• 2. Python Mandatory Programming.

• 3. Pandas Optional assignment.

• 4. Haberman Data Analysis.

• 5. Performance metrics implementation in core python.

• 6. TF-IDF Vectorizer implementation in core python.

• 7. RandomSearchCV in core python.

• 8. Naive Bayes on Donors Choose Data set.

• 9. Logistic Regression implementation using SGD (Stochastic Gradient Descent) in core Python.

• 10. Behaviour of Linear Models.

• 11. Decision Tree on Donors Choose Data set.

• 12. Bootstrap sampling for Random Forest regression in core Python without using Scikit-learn; Linear Regression is used as a base model.


• 13. GBDT(Gradient Boosting Decision Tree) on Donors Choose Data set.

• 14. Clustering similar movies and actors on a graph data set using the KMeans algorithm. I used a custom metric to judge the algorithm.


• 15. Used SGD Algorithm to predict movie ratings.

• 17. Adding some new features in Graph Link Prediction-A Facebook Challenge.

• 18. SQL(Structured Query Language) assignments.

• 19. Back propagation in core python.

• 20. Implemented custom and predefined Callbacks in Tensorflow.

Achievements
• Qualified GATE-CS 2019 with score 389
• Secured 1st position in "Debugging C Programming Code" at University level.
• Earned the Data Science Certificate from Workera.ai (by Andrew Ng)
• Earned the AI Literacy Certificate from Workera.ai (by Andrew Ng)
• Earned the Data Analyst Role Certificate from Workera.ai (by Andrew Ng)
Courses
• Machine Learning and AI Foundations: Recommendations.
• Python for Machine Learning and Data Science by Jose Portilla
• Machine Learning A-Z: Hands-On Python & R in Data Science
• Siamese Networks 101 by PyImageSearch.
• OpenCV 101- OpenCV Basics by PyImageSearch.
• OpenCV 102- Basic Image Processing Operations by PyImageSearch.
• Object Detection 101–Easy Object Detection by PyImageSearch.
• Object Detection 201–Fundamentals of Deep Learning Object Detection by PyImageSearch.
• Object Detection 202–Bounding Box Regression by PyImageSearch
• Image Adversaries 101–Intro to Image Adversaries by PyImageSearch.
• Face Recognition 101–Fundamentals of Facial Recognition by PyImageSearch.
• Face Applications 101–Face Detection by PyImageSearch.
• Face Applications 102–Fundamentals of Facial Landmarks by PyImageSearch.
• Autoencoders 101–Intro to Autoencoders by PyImageSearch.
• Deep Learning 101–Neural Networks and Parameterized Learning by PyImageSearch.
• Deep Learning 102–Optimization Methods and Regularization by PyImageSearch.
• Deep Learning 103–Neural Network Fundamentals by PyImageSearch.
• Deep Learning 120–Regression with CNNs by PyImageSearch.
• Deep Learning 130–Hyperparameter Tuning by PyImageSearch.

Skills
• Front-end: HTML, CSS, JavaScript
• Programming Languages: C, C++, Java, Python, SQL
• Pre-processing: Numpy, Pandas
• Visualization: Matplotlib, Seaborn, Plotly
• Machine Learning Libraries: Scikit-learn, skmultilearn
• Deep Learning Libraries: Tensorflow, Keras
• Computer Vision libraries: OpenCV, scikit-image, PIL
• NLP libraries: NLTK, Spacy
• IDE: Jupyter Notebook
• Deployment: Heroku
